Re: [PATCH v2] staging: android: ion: Allocate from heap ID directly without mask

2019-02-15 Thread John Stultz
On Fri, Feb 15, 2019 at 11:22 AM Andrew F. Davis  wrote:
>
> On 2/15/19 1:01 PM, John Stultz wrote:
> > On Fri, Feb 15, 2019 at 2:51 AM Brian Starkey  wrote:
> >> On Thu, Feb 14, 2019 at 09:38:29AM -0800, John Stultz wrote:
> >>> 2) For patches that cause ABI breaks, it might be good to make it
> >>> clear in the commit what the userland impact looks like in userspace,
> >>> possibly with an example, so the poor folks who bisect down the change
> >>> as breaking their system in a year or so have a clear example as to
> >>> what they need to change in their code.
> >>>
> >>> 3) Also, its not clear how a given userland should distinguish between
> >>> the different ABIs.  We already have logic in libion to distinguish
> >>> between pre-4.12 legacy and post-4.12 implementations (using implicit
> >>> ion_free() behavior). I don't see any such check we can make with this
> >>> code. Adding another ABI version may require we provide an actual
> >>> interface version ioctl.
> >>>
> >>
> >> A slightly fragile/ugly approach might be to attempt a small
> >> allocation with a heap_mask of 0x. On an "old" implementation,
> >> you'd expect that to succeed, whereas it would/could be made to fail
> >> in the "new" one.
> >
> > Yea I think having a proper ION_IOC_VERSION is going to be necessary.
> >
>
> I think that will be helpful to have ready the future just looking at
> the way libdrm does things, but not right now as backwards compatibility
> with staging code is not a reasonable thing to do.

I'm not sure I'm following what you mean here?  While we don't have
any commitment to userland for interfaces in staging, the reality is
that there are a fair number of users affected, and we probably should
avoid causing any needless pain if possible.

Further, as part of my work, I try to keep the hikey boards with an
array of kernels (4.4, 4.9, 4.14, 4.19 and mainline) running with AOSP
master. Having hard build breaks so AOSP has to have build time
dependencies on newer or older kernels is a big pain, and the 4.12 ABI
break was not easy.

So yea, I don't think we should tie our hands in reworking the
interfaces, but it would be nice to avoid having subtle ABI changes
that don't have clear ways for userland to detect which interface
version its using.

> > I'm hoping to send out an ugly first stab at the kernel side for
> > switching to per-heap devices (with a config for keeping /dev/ion for
> > legacy compat), which I hope will address the core issue this patch
> > does (moving away from heap masks to specifically requested heaps).
> >
>
> Yes, that would remove the need for what this patch does.
> Question though, what does the user side look like for this? With the
> old /dev/ion we would:
>
> ion_fd = open("/dev/ion")
> ask for a list of heaps (ioctl on ion_fd)
> iterate over the details of each heap
> pick the best heap for the job
> request allocation from that heap (ioctl on ion_fd)
>
> with per-heap devs we need some way to iterate all over heap devices in
> a system, and extract details from each heap device. Maybe we leave
> /dev/ion but it's only job is to service ION_IOC_HEAP_QUERY requests but
> instead of heap numbers it returns heap names, then device files just
> match those names. Then we go allocate() from those.
>


So my initial thought is we simply use a /dev/ion_heaps/ dir which has
a list of heap devicenodes. /dev/ion goes away.

Currently ION_IOC_HEAP_QUERY returns:
  char name[MAX_HEAP_NAME];
  __u32 type;
  __u32 heap_id;

The names are discoverable via "ls /dev/ion_heaps/"

The heap_id is really only useful as a handle, and after opening the
heap device, we'll have the fd to use.

The only detail we're missing is the type. I'm a little skeptical how
useful type is, but worse case we provide a QUERY ioctl on the heap
device to provide type info.

Most likely, I expect users to:
1) Open "/dev/ion_heaps/" for the heap they want (since they
probably already know)
2) make a HEAP_ALLOCATE ioctl on the fd to allocate

But to match the use case you describe:
1) ls /dev/ion_heaps/ for a list of heaps
2) For each heap name, open the heap and make a QUERY ioctl to get
type info (for what its worth)
3) Pick best heap for the job (*handwaving magic!*)
4) Do an ALLOC ioctl on the heap's fd to allocate


Does that sound reasonable?  And I don't really mean to dismiss the
dynamic picking of the best heap part, and having something like a
opaque constraints bitfield that each device and each heap export so
userland can match things up would be great.  But since we don't have
any real solutions there yet(still!), it seems like most gralloc
implementations are likely to be fully knowing which heap they want at
allocation time.

thanks
-john


Re: [PATCH v2] staging: android: ion: Allocate from heap ID directly without mask

2019-02-15 Thread John Stultz
On Fri, Feb 15, 2019 at 2:51 AM Brian Starkey  wrote:
>
> Hi John,
>
> On Thu, Feb 14, 2019 at 09:38:29AM -0800, John Stultz wrote:
> >
> [snip]
>
> > Some thoughts, as this ABI break has the potential to be pretty painful.
> >
> > 1) Unfortunately, this ABI is exposed *through* libion via
> > ion_alloc/ion_alloc_fd out to gralloc implementations. Which means it
> > will have a wider impact to vendor userland code.
>
> I figured libion could fairly easily loop through all the set bits in
> heap_mask and call the ioctl for each until it succeeds. That
> preserves the old behaviour from the libion clients' perspective.

Potentially, though that implicitly still caps the heaps to 32.  So
I'm not sure what the net benefit would be.


> > 2) For patches that cause ABI breaks, it might be good to make it
> > clear in the commit what the userland impact looks like in userspace,
> > possibly with an example, so the poor folks who bisect down the change
> > as breaking their system in a year or so have a clear example as to
> > what they need to change in their code.
> >
> > 3) Also, its not clear how a given userland should distinguish between
> > the different ABIs.  We already have logic in libion to distinguish
> > between pre-4.12 legacy and post-4.12 implementations (using implicit
> > ion_free() behavior). I don't see any such check we can make with this
> > code. Adding another ABI version may require we provide an actual
> > interface version ioctl.
> >
>
> A slightly fragile/ugly approach might be to attempt a small
> allocation with a heap_mask of 0x. On an "old" implementation,
> you'd expect that to succeed, whereas it would/could be made to fail
> in the "new" one.

Yea I think having a proper ION_IOC_VERSION is going to be necessary.

I'm hoping to send out an ugly first stab at the kernel side for
switching to per-heap devices (with a config for keeping /dev/ion for
legacy compat), which I hope will address the core issue this patch
does (moving away from heap masks to specifically requested heaps).

thanks
-john


Re: [PATCH v2] staging: android: ion: Allocate from heap ID directly without mask

2019-02-14 Thread John Stultz
On Mon, Jan 28, 2019 at 1:44 PM Andrew F. Davis  wrote:
>
> Previously the heap to allocate from was selected by a mask of allowed
> heap types. This may have been done as a primitive form of constraint
> solving, the first heap type that matched any set bit of the heap mask
> was allocated from, unless that heap was excluded by having its heap
> ID bit not set in the separate passed in heap ID mask.
>
> The heap type does not really represent constraints that should be
> matched against to begin with. So the patch [0] removed the the heap type
> mask matching but unfortunately left the heap ID mask check (possibly by
> mistake or to preserve API). Therefor we now only have a mask of heap
> IDs, but heap IDs are unique identifiers and have nothing to do with the
> underlying heap, so mask matching is not useful here. This also limits us
> to 32 heaps total in a system.
>
> With the heap query API users can find the right heap based on type or
> name themselves then just supply the ID for that heap. Remove heap ID
> mask and allow users to specify heap ID directly by its number.

Sorry for the very late reply. I just dug this out of my spam box.

While this seems like a reasonable cleanup (ABI pain aside), I'm
curious as to other then the 32-heap limitation, what other benefits
this brings?
My hesitancy is that we still have fixed number to heap mapping.

Curious how this fits in (or if it would still be necessary) if we
finally moved to previously discussed per-heap device-nodes idea,
which is interesting to me as it avoids a fixed enumeration of heaps
in the kernel (and also allows for per heap permissions).


> I know this is an ABI break, but we are in staging so lets get this over
> with now rather than limit ourselves later.
>
> [0] commit 2bb9f5034ec7 ("gpu: ion: Remove heapmask from client")
>
> Signed-off-by: Andrew F. Davis 
> ---
>
> This also means we don't need to store the available heaps in a plist,
> we only operation we care about is lookup, so a better data structure
> should be chosen at some point, regular list or xarray maybe?
>
> This is base on -next as to be on top of the other taken Ion patches.
>
> Changes from v1:
>  - Fix spelling in commit message
>  - Cleanup logic per Brian's suggestion
>

Some thoughts, as this ABI break has the potential to be pretty painful.

1) Unfortunately, this ABI is exposed *through* libion via
ion_alloc/ion_alloc_fd out to gralloc implementations. Which means it
will have a wider impact to vendor userland code.

2) For patches that cause ABI breaks, it might be good to make it
clear in the commit what the userland impact looks like in userspace,
possibly with an example, so the poor folks who bisect down the change
as breaking their system in a year or so have a clear example as to
what they need to change in their code.

3) Also, its not clear how a given userland should distinguish between
the different ABIs.  We already have logic in libion to distinguish
between pre-4.12 legacy and post-4.12 implementations (using implicit
ion_free() behavior). I don't see any such check we can make with this
code. Adding another ABI version may require we provide an actual
interface version ioctl.


thanks
-john


[RFC][PATCH] usb: f_fs: Avoid crash due to out-of-scope stack ptr access

2019-02-05 Thread John Stultz
Since the 5.0 merge window opened, I've been seeing frequent
crashes on suspend and reboot with the trace:

[   36.911170] Unable to handle kernel paging request at virtual address 
ff801153d660
[   36.912769] Unable to handle kernel paging request at virtual address 
ff84b564
...
[   36.950666] Call trace:
[   36.950670]  queued_spin_lock_slowpath+0x1cc/0x2c8
[   36.950681]  _raw_spin_lock_irqsave+0x64/0x78
[   36.950692]  complete+0x28/0x70
[   36.950703]  ffs_epfile_io_complete+0x3c/0x50
[   36.950713]  usb_gadget_giveback_request+0x34/0x108
[   36.950721]  dwc3_gadget_giveback+0x50/0x68
[   36.950723]  dwc3_thread_interrupt+0x358/0x1488
[   36.950731]  irq_thread_fn+0x30/0x88
[   36.950734]  irq_thread+0x114/0x1b0
[   36.950739]  kthread+0x104/0x130
[   36.950747]  ret_from_fork+0x10/0x1c

I isolated this down to in ffs_epfile_io():
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/gadget/function/f_fs.c#n1065

Where the completion done is setup on the stack:
  DECLARE_COMPLETION_ONSTACK(done);

Then later we setup a request and queue it, and wait for it:
  if (unlikely(wait_for_completion_interruptible())) {
/*
* To avoid race condition with ffs_epfile_io_complete,
* dequeue the request first then check
* status. usb_ep_dequeue API should guarantee no race
* condition with req->complete callback.
*/
usb_ep_dequeue(ep->ep, req);
interrupted = ep->status < 0;
  }

The problem is, that we end up being interrupted, dequeue the
request, and exit.

But then the irq triggers and we try calling complete() on the
context pointer which points to now random stack space, which
results in the panic.

Alan Stern pointed out there is a bug here, in that the snippet
above "assumes that usb_ep_dequeue() waits until the request has
been completed." And that:

wait_for_completion();

Is needed right after the usb_ep_dequeue().

Thus this patch implements that change. With it I no longer see
the crashes on suspend or reboot.

This issue seems to have been uncovered by behavioral changes in
the dwc3 driver in commit fec9095bdef4e ("usb: dwc3: gadget:
remove wait_end_transfer").

Cc: Alan Stern 
Cc: Felipe Balbi 
Cc: Zeng Tao 
Cc: Jack Pham 
Cc: Thinh Nguyen 
Cc: Chen Yu 
Cc: Jerry Zhang 
Cc: Lars-Peter Clausen 
Cc: Vincent Pelletier 
Cc: Andrzej Pietrasiewicz 
Cc: Greg Kroah-Hartman 
Cc: Linux USB List 
Suggested-by: Alan Stern 
Signed-off-by: John Stultz 
---
 drivers/usb/gadget/function/f_fs.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/usb/gadget/function/f_fs.c 
b/drivers/usb/gadget/function/f_fs.c
index 1e54304..0f8d16d 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -1082,6 +1082,7 @@ static ssize_t ffs_epfile_io(struct file *file, struct 
ffs_io_data *io_data)
 * condition with req->complete callback.
 */
usb_ep_dequeue(ep->ep, req);
+   wait_for_completion();
interrupted = ep->status < 0;
}
 
-- 
2.7.4



Re: Frequent dwc3 crashes on suspend or reboot since 5.0-rc1

2019-02-04 Thread John Stultz
On Sat, Feb 2, 2019 at 9:00 AM Alan Stern  wrote:
>
> On Fri, 1 Feb 2019, John Stultz wrote:
>
> > Hey all,
> >   Since the 5.0 merge window opened, I've been tripping on frequent
> > dwc3 crashes on reboot and suspend, which I've added an example to the
> > bottom of this mail.
> >
> > I've dug in a little bit and sort of have a sense of whats going on.
> >
> > In ffs_epfile_io():
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/gadget/function/f_fs.c#n1065
> >
> > The completion done is setup on the stack:
> >   DECLARE_COMPLETION_ONSTACK(done);
> >
> > Then later we setup a request and queue it:
> >   req->context  = 
> >   ...
> >   ret = usb_ep_queue(ep->ep, req, GFP_ATOMIC);
> >
> > Then wait for it:
> >   if (unlikely(wait_for_completion_interruptible())) {
> > /*
> > * To avoid race condition with ffs_epfile_io_complete,
> > * dequeue the request first then check
> > * status. usb_ep_dequeue API should guarantee no race
> > * condition with req->complete callback.
> > */
> > usb_ep_dequeue(ep->ep, req);
>
> This code contains a bug: It assumes that usb_ep_dequeue() waits until
> the request has been completed.  You should insert
>
> wait_for_completion();
>
> right here.
>
> > interrupted = ep->status < 0;
> >   }
>

Thanks! This does seem to resolve the issue I'm seeing.

Are you planning to send in such a patch, or would it help if I sent it out?

thanks
-john


Re: [PATCH 0/8 v5] k3dma patches to add support for hi3660/HiKey960

2019-02-04 Thread John Stultz
On Mon, Feb 4, 2019 at 1:03 AM Vinod Koul  wrote:
>
> On 24-01-19, 12:24, John Stultz wrote:
> > This patch series is based on recent work by Tanglei Han, and
> > adds support for hi3660 SoCs as found on the HiKey960 board,
> > along with a few patches I've been carrying.
>
> Applied and fixed the minor style issues in patch 5, and retagged 3 thru
> 5 to dmaengine: xxx, thanks

My apologies for the style mistake. Your help is very much appreciated
here! Thanks so much again!

Just to clarify, are you planning on taking the reviewed binding
documents (patches 1&2) via your tree?

thanks
-john


Re: Frequent dwc3 crashes on suspend or reboot since 5.0-rc1

2019-02-01 Thread John Stultz
On Fri, Feb 1, 2019 at 4:46 PM Thinh Nguyen  wrote:
> John Stultz wrote:
> > On Fri, Feb 1, 2019 at 4:18 PM John Stultz  wrote:
> > Bisecting the changes down, it seems like its due to commit
> > fec9095bdef4e ("usb: dwc3: gadget: remove wait_end_transfer").
> >
> > It doesn't happen all the time, so I'll need to run some more testing,
> > but so far I've not been able to trigger it backing out the patches to
> > that point.
>
> Yeah, it sounds like the same issue. You can review the discussion here:
> https://www.spinics.net/lists/linux-usb/msg176110.html

Unfortunately, merging in
https://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git
testing/next seems to trigger a different issue:

[   38.585141] OOM killer enabled.
[   38.585143] Restarting tasks ...
[   38.585874] [ cut here ]
[   38.585882] ep1out: request  already in flight
[   38.585944] WARNING: CPU: 7 PID: 2545 at
drivers/usb/dwc3/gadget.c:1430 dwc3_gadget_ep_queue+0x1d4/0x200
[   38.585946] Modules linked in:
[   38.585960] CPU: 7 PID: 2545 Comm: adbd Tainted: G S
5.0.0-rc4-00110-gad0e691 #509
[   38.585963] Hardware name: HiKey960 (DT)
[   38.585968] pstate: 80400085 (Nzcv daIf +PAN -UAO)
[   38.585972] pc : dwc3_gadget_ep_queue+0x1d4/0x200
[   38.585976] lr : dwc3_gadget_ep_queue+0x1d4/0x200
[   38.585978] sp : ff80149fbb40
[   38.585981] x29: ff80149fbb40 x28: ffc20c72a200
[   38.585985] x27: ffc211376d00 x26: 
[   38.585990] x25: ff80149fbc30 x24: 0018
[   38.585994] x23: 0080 x22: ffc218c0a1b0
[   38.585997] x21: ff94 x20: ffc21930f600
[   38.586001] x19: ffc211376d00 x18: ff801161da88
[   38.586005] x17:  x16: 
[   38.586008] x15: ff80117f8468 x14: 0008
[   38.586012] x13: ff80117f8088 x12: ff80116436b8
[   38.586016] x11: ff8011643000 x10: ff8011775000
[   38.586020] x9 :  x8 : ff801178b33c
[   38.586023] x7 :  x6 : 007dd811
[   38.586027] x5 :  x4 : 
[   38.586030] x3 : 0003 x2 : 0003
[   38.586034] x1 : e8959d459c94b400 x0 : 
[   38.586039] Call trace:
[   38.586043]  dwc3_gadget_ep_queue+0x1d4/0x200
[   38.586053]  usb_ep_queue+0x5c/0x120
[   38.586059]  ffs_epfile_io.isra.12+0x3dc/0x6c0
[   38.586063]  ffs_epfile_read_iter+0xbc/0x190
[   38.586073]  __vfs_read+0x10c/0x168
[   38.586077]  vfs_read+0x8c/0x148
[   38.586081]  ksys_read+0x5c/0xc8
[   38.586085]  __arm64_sys_read+0x14/0x20
[   38.586094]  el0_svc_common+0xb4/0x118
[   38.586099]  el0_svc_handler+0x2c/0x80
[   38.586105]  el0_svc+0x8/0xc
[   38.586107] ---[ end trace 65a9814dc2b21238 ]---
[   38.587826] done.
[   38.587922] PM: suspend exit
[   38.589426] [ cut here ]
[   38.589433] ep1out: request  already in flight
[   38.589489] WARNING: CPU: 7 PID: 4087 at
drivers/usb/dwc3/gadget.c:1430 dwc3_gadget_ep_queue+0x1d4/0x200
[   38.589493] Modules linked in:
[   38.589506] CPU: 7 PID: 4087 Comm: adbd Tainted: G S  W
5.0.0-rc4-00110-gad0e691 #509
[   38.589508] Hardware name: HiKey960 (DT)
[   38.589514] pstate: 80400085 (Nzcv daIf +PAN -UAO)
[   38.589518] pc : dwc3_gadget_ep_queue+0x1d4/0x200
[   38.589521] lr : dwc3_gadget_ep_queue+0x1d4/0x200
[   38.589523] sp : ff80149fbb40
[   38.589526] x29: ff80149fbb40 x28: ffc20c72a200
[   38.589531] x27: ffc211376d00 x26: 
[   38.589535] x25: ff80149fbc30 x24: 0018
[   38.589539] x23: 0080 x22: ffc218c0a1b0
[   38.589543] x21: ff94 x20: ffc21930f600
[   38.589546] x19: ffc211376d00 x18: ff801161da88
[   38.589550] x17:  x16: 
[   38.589554] x15: ff80117f8468 x14: 0008
[   38.589557] x13: ff80117f8088 x12: ff80116436b8
[   38.589561] x11: ff8011643000 x10: ff8011775000
[   38.589565] x9 :  x8 : ff801178bb88
[   38.589568] x7 :  x6 : 007dd811
[   38.589572] x5 :  x4 : 
[   38.589576] x3 : 0003 x2 : 0003
[   38.589579] x1 : e8959d459c94b400 x0 : 
[   38.589584] Call trace:
[   38.589588]  dwc3_gadget_ep_queue+0x1d4/0x200
[   38.589597]  usb_ep_queue+0x5c/0x120
[   38.589603]  ffs_epfile_io.isra.12+0x3dc/0x6c0
[   38.589606]  ffs_epfile_read_iter+0xbc/0x190
[   38.589617]  __vfs_read+0x10c/0x168
[   38.589621]  vfs_read+0x8c/0x148
[   38.589625]  ksys_read+0x5c/0xc8
[   38.589630]  __arm64_sys_read+0x14/0x20
[   38.589638]  el0_svc_common+0xb4/0x118
[   38.589642]  el0_svc_handler+0x2c/0x80
[   38.589648]  el0_svc+0x8/0xc
[   38.589651] ---[ end trace 65a9814dc2b21239 ]---

Let me know if you have other ideas to try!

thanks
-john


Re: Frequent dwc3 crashes on suspend or reboot since 5.0-rc1

2019-02-01 Thread John Stultz
On Fri, Feb 1, 2019 at 4:18 PM John Stultz  wrote:
>
> Hey all,
>   Since the 5.0 merge window opened, I've been tripping on frequent
> dwc3 crashes on reboot and suspend, which I've added an example to the
> bottom of this mail.
>
> I've dug in a little bit and sort of have a sense of whats going on.
>
> In ffs_epfile_io():
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/gadget/function/f_fs.c#n1065
>
> The completion done is setup on the stack:
>   DECLARE_COMPLETION_ONSTACK(done);
>
> Then later we setup a request and queue it:
>   req->context  = 
>   ...
>   ret = usb_ep_queue(ep->ep, req, GFP_ATOMIC);
>
> Then wait for it:
>   if (unlikely(wait_for_completion_interruptible())) {
> /*
> * To avoid race condition with ffs_epfile_io_complete,
> * dequeue the request first then check
> * status. usb_ep_dequeue API should guarantee no race
> * condition with req->complete callback.
> */
> usb_ep_dequeue(ep->ep, req);
> interrupted = ep->status < 0;
>   }
>
> The problem is, that we end up being interrupted, supposedly dequeue
> the request, and exit.
>
> But then (or in parallel) the irq triggers and we try calling
> complete() on the context pointer which points to now random stack
> space, which results in the panic.
>
> It seems like something is wrong with usb_ep_dequeue not really
> stopping the irq from happening?
>
> If I revert all the changes to dwc3 back to 4.20, I don't see the issue.
>
> I'll do some bisection to try to narrow things down, but I wanted to
> see if this was a known issue or if anyone had immediate ideas as to
> what might be wrong.

Bisecting the changes down, it seems like its due to commit
fec9095bdef4e ("usb: dwc3: gadget: remove wait_end_transfer").

It doesn't happen all the time, so I'll need to run some more testing,
but so far I've not been able to trigger it backing out the patches to
that point.

thanks
-john


Re: Frequent dwc3 crashes on suspend or reboot since 5.0-rc1

2019-02-01 Thread John Stultz
On Fri, Feb 1, 2019 at 4:31 PM Thinh Nguyen  wrote:
>
> Hi John,
>
> John Stultz wrote:
> > Hey all,
> >   Since the 5.0 merge window opened, I've been tripping on frequent
> > dwc3 crashes on reboot and suspend, which I've added an example to the
> > bottom of this mail.
> >
> > I've dug in a little bit and sort of have a sense of whats going on.
> >
> > In ffs_epfile_io():
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__git.kernel.org_pub_scm_linux_kernel_git_torvalds_linux.git_tree_drivers_usb_gadget_function_f-5Ffs.c-23n1065=DwIBaQ=DPL6_X_6JkXFx7AXWqB0tg=u9FYoxKtyhjrGFcyixFYqTjw1ZX0VsG2d8FCmzkTY-w=a8TU-itM8GBG_EARYf2yM-kVfCzmaPkKDNAUFQHTe3Q=BQiVAFiViSlxVg5_LemED0x_47FLVUD43M7R6h6T8qk=
> >
> > The completion done is setup on the stack:
> >   DECLARE_COMPLETION_ONSTACK(done);
> >
> > Then later we setup a request and queue it:
> >   req->context  = 
> >   ...
> >   ret = usb_ep_queue(ep->ep, req, GFP_ATOMIC);
> >
> > Then wait for it:
> >   if (unlikely(wait_for_completion_interruptible())) {
> > /*
> > * To avoid race condition with ffs_epfile_io_complete,
> > * dequeue the request first then check
> > * status. usb_ep_dequeue API should guarantee no race
> > * condition with req->complete callback.
> > */
> > usb_ep_dequeue(ep->ep, req);
> > interrupted = ep->status < 0;
> >   }
> >
> > The problem is, that we end up being interrupted, supposedly dequeue
> > the request, and exit.
> >
> > But then (or in parallel) the irq triggers and we try calling
> > complete() on the context pointer which points to now random stack
> > space, which results in the panic.
> >
> > It seems like something is wrong with usb_ep_dequeue not really
> > stopping the irq from happening?
> >
> > If I revert all the changes to dwc3 back to 4.20, I don't see the issue.
> >
> > I'll do some bisection to try to narrow things down, but I wanted to
> > see if this was a known issue or if anyone had immediate ideas as to
> > what might be wrong.
> >
>
> I'm not sure if this is related, but can you try to test using Felipe's
> testing/next branch? There is a fix to a race condition when the gadget
> driver tries to dequeue requests.
>
> See if you run into this issue again.

I'll check that out! Thanks so much for the pointer!
-john


Frequent dwc3 crashes on suspend or reboot since 5.0-rc1

2019-02-01 Thread John Stultz
Hey all,
  Since the 5.0 merge window opened, I've been tripping on frequent
dwc3 crashes on reboot and suspend, which I've added an example to the
bottom of this mail.

I've dug in a little bit and sort of have a sense of whats going on.

In ffs_epfile_io():
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/gadget/function/f_fs.c#n1065

The completion done is setup on the stack:
  DECLARE_COMPLETION_ONSTACK(done);

Then later we setup a request and queue it:
  req->context  = 
  ...
  ret = usb_ep_queue(ep->ep, req, GFP_ATOMIC);

Then wait for it:
  if (unlikely(wait_for_completion_interruptible())) {
/*
* To avoid race condition with ffs_epfile_io_complete,
* dequeue the request first then check
* status. usb_ep_dequeue API should guarantee no race
* condition with req->complete callback.
*/
usb_ep_dequeue(ep->ep, req);
interrupted = ep->status < 0;
  }

The problem is, that we end up being interrupted, supposedly dequeue
the request, and exit.

But then (or in parallel) the irq triggers and we try calling
complete() on the context pointer which points to now random stack
space, which results in the panic.

It seems like something is wrong with usb_ep_dequeue not really
stopping the irq from happening?

If I revert all the changes to dwc3 back to 4.20, I don't see the issue.

I'll do some bisection to try to narrow things down, but I wanted to
see if this was a known issue or if anyone had immediate ideas as to
what might be wrong.

thanks
-john

[   36.911170] Unable to handle kernel paging request at virtual
address ff801153d660
[   36.912769] Unable to handle kernel paging request at virtual
address ff84b564
[   36.919881] Mem abort info:
[   36.919884]   ESR = 0x9647
[   36.919888]   Exception class = DABT (current EL), IL = 32 bits
[   36.919890]   SET = 0, FnV = 0
[   36.919895]   EA = 0, S1PTW = 0
[   36.927875] Mem abort info:
[   36.935718] Data abort info:
[   36.935721]   ISV = 0, ISS = 0x0047
[   36.935723]   CM = 0, WnR = 1
[   36.935730] swapper pgtable: 4k pages, 39-bit VAs, pgdp = f1b819ef
[   36.935733] [ff801153d660] pgd=00021803,
pud=00021803, pmd=00021fffb803, pte=
[   36.935744] Internal error: Oops: 9647 [#1] PREEMPT SMP
[   36.935748] Modules linked in:
[   36.938552]   ESR = 0x8606
[   36.941601] CPU: 0 PID: 2656 Comm: irq/69-dwc3 Tainted: G S
   4.20.0-10778-gadc8369 #210
[   36.941603] Hardware name: HiKey960 (DT)
[   36.941610] pstate: 00400085 (nzcv daIf +PAN -UAO)
[   36.947554]   Exception class = IABT (current EL), IL = 32 bits
[   36.950594] pc : queued_spin_lock_slowpath+0x1cc/0x2c8
[   36.950601] lr : queued_spin_lock_slowpath+0xd0/0x2c8
[   36.950603] sp : ff8011e13be0
[   36.950607] x29: ff8011e13be0 x28: 
[   36.950611] x27: ff801186d000 x26: ff8010159000
[   36.950615] x25: ff801186d000 x24: ffc218be36e8
[   36.950619] x23: ff801186e910 x22: 0004
[   36.950622] x21: ffc21f71b640 x20: ff801153d000
[   36.950626] x19: ff8011e1bbe8 x18: 
[   36.950629] x17: 100eb564 x16: 100eb564
[   36.950633] x15:  x14: ff801187cf80
[   36.950636] x13: 00420e1de000 x12: 34d4d91d
[   36.950640] x11:  x10: 0a20
[   36.950643] x9 : ff8011e13d10 x8 : ffc218874c00
[   36.950646] x7 :  x6 : ff801186db08
[   36.950650] x5 :  x4 : 
[   36.950653] x3 : ffc21f71b640 x2 : 
[   36.950656] x1 : ff801153d660 x0 : ffc21f71b648
[   36.950663] Process irq/69-dwc3 (pid: 2656, stack limit = 0xb627af93)
[   36.950666] Call trace:
[   36.950670]  queued_spin_lock_slowpath+0x1cc/0x2c8
[   36.950681]  _raw_spin_lock_irqsave+0x64/0x78
[   36.950692]  complete+0x28/0x70
[   36.950703]  ffs_epfile_io_complete+0x3c/0x50
[   36.950713]  usb_gadget_giveback_request+0x34/0x108
[   36.950721]  dwc3_gadget_giveback+0x50/0x68
[   36.950723]  dwc3_thread_interrupt+0x358/0x1488
[   36.950731]  irq_thread_fn+0x30/0x88
[   36.950734]  irq_thread+0x114/0x1b0
[   36.950739]  kthread+0x104/0x130
[   36.950747]  ret_from_fork+0x10/0x1c
[   36.950755] Code: 91190281 8b021021 f860dae2 91002060 (f8226823)
[   36.953901]   SET = 0, FnV = 0
[   36.956685] ---[ end trace 3d13dc405c1e8aa7 ]---
[   36.965704] Kernel panic - not syncing: Fatal exception
[   36.966372]   EA = 0, S1PTW = 0
[   36.973246] SMP: stopping secondary CPUs
[   36.983855] Kernel Offset: disabled
[   36.983860] CPU features: 0x002,21882004
[   36.983861] Memory Limit: none
[   37.210976] Rebooting in 5 seconds..


Re: [PATCH] arm64: dts: hikey: Revert "Enable HS200 mode on eMMC"

2019-01-25 Thread John Stultz
On Thu, Jan 24, 2019 at 3:18 AM Robin Murphy  wrote:
>
> Hi John,
>
> On 23/01/2019 20:06, John Stultz wrote:
> > From: Alistair Strachan 
> >
> > This reverts commit abd7d0972a192ee653efc7b151a6af69db58f2bb. This
> > change was already partially reverted by John Stultz in commit
> > 9c6d26df1fae ("arm64: dts: hikey: Fix eMMC corruption regression").
> >
> > This change appears to cause controller resets and block read failures
> > which prevents successful booting on some hikey boards.
>
> FWIW, you might want to have a play with the pinctrl settings - I've
> seen various flakiness with HS200 eMMCs on Rockchip boards which could
> be solved by bumping up the drive strength.
>

Hm. Thanks for the tip. I'll have to dig around to see if that can
help. In the meantime though, I think the revert is the right
solution, as this is currently blocking some boards from booting
mainline.  We can then look to re-enable HS200 mode if we can sort out
the issues we saw with boards using hynix mmc chips.

thanks
-john


[PATCH 7/8 v5] arm64: dts: hi3660: Add hisi asp dma device

2019-01-24 Thread John Stultz
From: Youlin Wang 

Add asp-dma device to hi3660 dts

Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: Wei Xu 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: linux-arm-ker...@lists.infradead.org
Cc: devicet...@vger.kernel.org
Acked-by: Manivannan Sadhasivam 
Signed-off-by: Youlin Wang 
Signed-off-by: Tanglei Han 
Signed-off-by: John Stultz 
---
v2: Removed undocumented bindings
---
 arch/arm64/boot/dts/hisilicon/hi3660.dtsi | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi 
b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
index 4c8d682..77a7135 100644
--- a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
+++ b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
@@ -567,6 +567,16 @@
dma-type = "hi3660_dma";
};
 
+   asp_dmac: dma-controller@e804b000 {
+   compatible = "hisilicon,hisi-pcm-asp-dma-1.0";
+   reg = <0x0 0xe804b000 0x0 0x1000>;
+   #dma-cells = <1>;
+   dma-channels = <16>;
+   dma-requests = <32>;
+   interrupts = ;
+   interrupt-names = "asp_dma_irq";
+   };
+
rtc0: rtc@fff04000 {
compatible = "arm,pl031", "arm,primecell";
reg = <0x0 0Xfff04000 0x0 0x1000>;
-- 
2.7.4



[PATCH 3/8 v5] dma: k3dma: Upgrade k3dma driver to support hisi_asp_dma hardware

2019-01-24 Thread John Stultz
From: Youlin Wang 

On the hi3660 hardware there are two (at least) DMA controllers,
the DMA-P (Peripheral DMA) and the DMA-A (Audio DMA). The
two blocks are similar, but have some slight differences. This
resulted in the vendor implementing two separate drivers, which
after review, they have been able to condense and re-use the
existing k3dma driver.

Thus, this patch adds support for the new "hisi-pcm-asp-dma-1.0"
compatible string in the binding.

One difference with the DMA-A controller, is that it does not
need to initialize a clock. So we skip this by adding and using
soc data flags.

After above this driver will support both k3 and hisi_asp dma
hardware.

Cc: Dan Williams 
Cc: Vinod Koul 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: dmaeng...@vger.kernel.org
Acked-by: Manivannan Sadhasivam 
Signed-off-by: Youlin Wang 
Signed-off-by: Tanglei Han 
[jstultz: Reworked to use of_match_data, commit msg improvements]
Signed-off-by: John Stultz 
---
v2:
* Reworked to use of_match_data
v3:
* Further rework of the commit message
v5:
* Typo, whitespace fixes. Use BIT() macro.
---
 drivers/dma/k3dma.c | 38 +-
 1 file changed, 33 insertions(+), 5 deletions(-)

diff --git a/drivers/dma/k3dma.c b/drivers/dma/k3dma.c
index fdec2b6..4dce532 100644
--- a/drivers/dma/k3dma.c
+++ b/drivers/dma/k3dma.c
@@ -116,6 +116,14 @@ struct k3_dma_dev {
unsigned intirq;
 };
 
+
+#define K3_FLAG_NOCLK  BIT(1)
+
+struct k3dma_soc_data {
+   unsigned long flags;
+};
+
+
 #define to_k3_dma(dmadev) container_of(dmadev, struct k3_dma_dev, slave)
 
 static int k3_dma_config_write(struct dma_chan *chan,
@@ -790,8 +798,21 @@ static int k3_dma_transfer_resume(struct dma_chan *chan)
return 0;
 }
 
+static const struct k3dma_soc_data k3_v1_dma_data = {
+   .flags = 0,
+};
+
+static const struct k3dma_soc_data asp_v1_dma_data = {
+   .flags = K3_FLAG_NOCLK,
+};
+
 static const struct of_device_id k3_pdma_dt_ids[] = {
-   { .compatible = "hisilicon,k3-dma-1.0", },
+   { .compatible = "hisilicon,k3-dma-1.0",
+ .data = _v1_dma_data
+   },
+   { .compatible = "hisilicon,hisi-pcm-asp-dma-1.0",
+ .data = _v1_dma_data
+   },
{}
 };
 MODULE_DEVICE_TABLE(of, k3_pdma_dt_ids);
@@ -810,6 +831,7 @@ static struct dma_chan *k3_of_dma_simple_xlate(struct 
of_phandle_args *dma_spec,
 
 static int k3_dma_probe(struct platform_device *op)
 {
+   const struct k3dma_soc_data *soc_data;
struct k3_dma_dev *d;
const struct of_device_id *of_id;
struct resource *iores;
@@ -823,6 +845,10 @@ static int k3_dma_probe(struct platform_device *op)
if (!d)
return -ENOMEM;
 
+   soc_data = device_get_match_data(>dev);
+   if (!soc_data)
+   return -EINVAL;
+
d->base = devm_ioremap_resource(>dev, iores);
if (IS_ERR(d->base))
return PTR_ERR(d->base);
@@ -835,10 +861,12 @@ static int k3_dma_probe(struct platform_device *op)
"dma-requests", >dma_requests);
}
 
-   d->clk = devm_clk_get(>dev, NULL);
-   if (IS_ERR(d->clk)) {
-   dev_err(>dev, "no dma clk\n");
-   return PTR_ERR(d->clk);
+   if (!(soc_data->flags & K3_FLAG_NOCLK)) {
+   d->clk = devm_clk_get(>dev, NULL);
+   if (IS_ERR(d->clk)) {
+   dev_err(>dev, "no dma clk\n");
+   return PTR_ERR(d->clk);
+   }
}
 
irq = platform_get_irq(op, 0);
-- 
2.7.4



[PATCH 8/8 v5] arm64: dts: hi3660: Fixup unofficial dma-min-chan to dma-channel-mask

2019-01-24 Thread John Stultz
A undocumented and unimplemented binding got into the hi3660
dtsi, and this switches that binding to the now documented one.

Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: Wei Xu 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: linux-arm-ker...@lists.infradead.org
Cc: devicet...@vger.kernel.org
Signed-off-by: John Stultz 
---
v3: Renamed to hisi-dma-avail-chan
v4: Renamed to dma-channel-mask
---
 arch/arm64/boot/dts/hisilicon/hi3660.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi 
b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
index 77a7135..8e48f42 100644
--- a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
+++ b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
@@ -560,7 +560,7 @@
#dma-cells = <1>;
dma-channels = <16>;
dma-requests = <32>;
-   dma-min-chan = <1>;
+   dma-channel-mask = <0xfffe>;
interrupts = ;
clocks = <_ctrl HI3660_CLK_GATE_DMAC>;
dma-no-cci;
-- 
2.7.4



[PATCH 0/8 v5] k3dma patches to add support for hi3660/HiKey960

2019-01-24 Thread John Stultz
This patch series is based on recent work by Tanglei Han, and
adds support for hi3660 SoCs as found on the HiKey960 board,
along with a few patches I've been carrying.

thanks
-john

New in v5:
* Minor typo fixes, minor rework to use BIT() macros 

Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Dan Williams 
Cc: Vinod Koul 
Cc: Wei Xu 
Cc: Mark Rutland 
Cc: Rob Herring 
Cc: Guodong Xu 
Cc: Manivannan Sadhasivam 
Cc: Ryan Grachek 
CC: devicet...@vger.kernel.org
Cc: dmaeng...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org

John Stultz (3):
  Documentation: bindings: dma: Add binding for dma-channel-mask
  arm64: dts: hi3660: Add dma to uart nodes
  arm64: dts: hi3660: Fixup unofficial dma-min-chan to dma-channel-mask

Li Yu (2):
  dma: k3dma: Delete axi_config
  dma: k3dma: Add support for dma-channel-mask

Youlin Wang (3):
  Documentation: bindings: k3dma: Extend the k3dma driver binding to
support hisi-asp
  dma: k3dma: Upgrade k3dma driver to support hisi_asp_dma hardware
  arm64: dts: hi3660: Add hisi asp dma device

 Documentation/devicetree/bindings/dma/dma.txt   |  4 ++
 Documentation/devicetree/bindings/dma/k3dma.txt |  4 +-
 arch/arm64/boot/dts/hisilicon/hi3660.dtsi   | 20 +++-
 drivers/dma/k3dma.c | 61 +
 4 files changed, 78 insertions(+), 11 deletions(-)

-- 
2.7.4



[PATCH 5/8 v5] dma: k3dma: Add support for dma-channel-mask

2019-01-24 Thread John Stultz
From: Li Yu 

Add dma-channel-mask as a property for k3dma, it defines
available dma channels which a non-secure mode driver can use.

One sample usage of this is in Hi3660 SoC. DMA channel 0 is
reserved to lpm3, which is a coprocessor for power management. So
as a result, any request in kernel (which runs on main processor
and in non-secure mode) should start from at least channel 1.

Cc: Dan Williams 
Cc: Vinod Koul 
Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: Guodong Xu 
Cc: dmaeng...@vger.kernel.org
Signed-off-by: Li Yu 
[jstultz: Reworked to use a channel mask]
Signed-off-by: John Stultz 
---
v3: Rename to hisi-dma-avail-chan
v4: Rename to dma-channel-mask
v5: Use BIT(i) instead of (1<chan_pending */
spin_lock_irq(>lock);
for (pch = 0; pch < d->dma_channels; pch++) {
+   if (!(d->dma_channel_mask & (1<phy[pch];
 
if (p->vchan == NULL && !list_empty(>chan_pending)) {
@@ -336,6 +340,9 @@ static void k3_dma_tasklet(unsigned long arg)
spin_unlock_irq(>lock);
 
for (pch = 0; pch < d->dma_channels; pch++) {
+   if (!(d->dma_channel_mask & (1<phy[pch];
c = p->vchan;
@@ -856,6 +863,13 @@ static int k3_dma_probe(struct platform_device *op)
"dma-channels", >dma_channels);
of_property_read_u32((>dev)->of_node,
"dma-requests", >dma_requests);
+   ret = of_property_read_u32((>dev)->of_node,
+   "dma-channel-mask", >dma_channel_mask);
+   if (ret) {
+   dev_warn(>dev,
+"dma-channel-mask doesn't exist, considering 
all as available.\n");
+   d->dma_channel_mask = (u32)~0UL;
+   }
}
 
if (!(soc_data->flags & K3_FLAG_NOCLK)) {
@@ -887,8 +901,12 @@ static int k3_dma_probe(struct platform_device *op)
return -ENOMEM;
 
for (i = 0; i < d->dma_channels; i++) {
-   struct k3_dma_phy *p = >phy[i];
+   struct k3_dma_phy *p;
+
+   if (!(d->dma_channel_mask & BIT(i)))
+   continue;
 
+   p = >phy[i];
p->idx = i;
p->base = d->base + i * 0x40;
}
-- 
2.7.4



[PATCH 1/8 v5] Documentation: bindings: k3dma: Extend the k3dma driver binding to support hisi-asp

2019-01-24 Thread John Stultz
From: Youlin Wang 

Extend the k3dma driver binding to support hisi-asp hardware
variants.

Cc: Vinod Koul 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: Zhuangluan Su 
Cc: Tanglei Han 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: dmaeng...@vger.kernel.org
Cc: devicet...@vger.kernel.org
Reviewed-by: Rob Herring 
Signed-off-by: Youlin Wang 
Signed-off-by: Tanglei Han 
Signed-off-by: John Stultz 
---
v2: Simplify patch, removing extranious examples
---
 Documentation/devicetree/bindings/dma/k3dma.txt | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/dma/k3dma.txt 
b/Documentation/devicetree/bindings/dma/k3dma.txt
index 4945aea..10a2f15 100644
--- a/Documentation/devicetree/bindings/dma/k3dma.txt
+++ b/Documentation/devicetree/bindings/dma/k3dma.txt
@@ -3,7 +3,9 @@
 See dma.txt first
 
 Required properties:
-- compatible: Should be "hisilicon,k3-dma-1.0"
+- compatible: Must be one of
+-  "hisilicon,k3-dma-1.0"
+-  "hisilicon,hisi-pcm-asp-dma-1.0"
 - reg: Should contain DMA registers location and length.
 - interrupts: Should contain one interrupt shared by all channel
 - #dma-cells: see dma.txt, should be 1, para number
-- 
2.7.4



[PATCH 4/8 v5] dma: k3dma: Delete axi_config

2019-01-24 Thread John Stultz
From: Li Yu 

Axi_config controls whether DMA resources can be accessed in non-secure
mode, such as linux kernel. The register should be set by the bootloader
stage and depends on the device.

Thus, this patch removes axi_config from k3dma driver.

Cc: Dan Williams 
Cc: Vinod Koul 
Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: dmaeng...@vger.kernel.org
Acked-by: Manivannan Sadhasivam 
Signed-off-by: Li Yu 
Signed-off-by: Guodong Xu 
[jstultz: Minor tweaks to commit message]
Signed-off-by: John Stultz 
---
 drivers/dma/k3dma.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/dma/k3dma.c b/drivers/dma/k3dma.c
index 4dce532..e415c85 100644
--- a/drivers/dma/k3dma.c
+++ b/drivers/dma/k3dma.c
@@ -52,8 +52,6 @@
 #define CX_SRC 0x814
 #define CX_DST 0x818
 #define CX_CFG 0x81c
-#define AXI_CFG0x820
-#define AXI_CFG_DEFAULT0x201201
 
 #define CX_LLI_CHAIN_EN0x2
 #define CX_CFG_EN  0x1
@@ -169,7 +167,6 @@ static void k3_dma_set_desc(struct k3_dma_phy *phy, struct 
k3_desc_hw *hw)
writel_relaxed(hw->count, phy->base + CX_CNT0);
writel_relaxed(hw->saddr, phy->base + CX_SRC);
writel_relaxed(hw->daddr, phy->base + CX_DST);
-   writel_relaxed(AXI_CFG_DEFAULT, phy->base + AXI_CFG);
writel_relaxed(hw->config, phy->base + CX_CFG);
 }
 
-- 
2.7.4



[PATCH 6/8 v5] arm64: dts: hi3660: Add dma to uart nodes

2019-01-24 Thread John Stultz
Try to add DMA support to the uart nodes following
the assignments made in the dts from the victoria vendor kernel
here:
https://consumer.huawei.com/en/opensource/detail/?siteCode=worldwide=p10=openSourceSoftware=10=1

Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: Wei Xu 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: linux-arm-ker...@lists.infradead.org
Cc: devicet...@vger.kernel.org
Acked-by: Manivannan Sadhasivam 
Signed-off-by: John Stultz 
---
v3:
* Remove dma enablment on uart0 which would use reserved channel 0
---
 arch/arm64/boot/dts/hisilicon/hi3660.dtsi | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi 
b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
index 20ae40d..4c8d682 100644
--- a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
+++ b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
@@ -478,6 +478,8 @@
compatible = "arm,pl011", "arm,primecell";
reg = <0x0 0xfdf0 0x0 0x1000>;
interrupts = ;
+   dma-names = "rx", "tx";
+   dmas =  < 2  3>;
clocks = <_ctrl HI3660_CLK_GATE_UART1>,
 <_ctrl HI3660_CLK_GATE_UART1>;
clock-names = "uartclk", "apb_pclk";
@@ -490,6 +492,8 @@
compatible = "arm,pl011", "arm,primecell";
reg = <0x0 0xfdf03000 0x0 0x1000>;
interrupts = ;
+   dma-names = "rx", "tx";
+   dmas =  < 4  5>;
clocks = <_ctrl HI3660_CLK_GATE_UART2>,
 <_ctrl HI3660_PCLK>;
clock-names = "uartclk", "apb_pclk";
@@ -514,6 +518,8 @@
compatible = "arm,pl011", "arm,primecell";
reg = <0x0 0xfdf01000 0x0 0x1000>;
interrupts = ;
+   dma-names = "rx", "tx";
+   dmas =  < 6  7>;
clocks = <_ctrl HI3660_CLK_GATE_UART4>,
 <_ctrl HI3660_CLK_GATE_UART4>;
clock-names = "uartclk", "apb_pclk";
@@ -526,6 +532,8 @@
compatible = "arm,pl011", "arm,primecell";
reg = <0x0 0xfdf05000 0x0 0x1000>;
interrupts = ;
+   dma-names = "rx", "tx";
+   dmas =  < 8  9>;
clocks = <_ctrl HI3660_CLK_GATE_UART5>,
 <_ctrl HI3660_CLK_GATE_UART5>;
clock-names = "uartclk", "apb_pclk";
-- 
2.7.4



[PATCH 2/8 v5] Documentation: bindings: dma: Add binding for dma-channel-mask

2019-01-24 Thread John Stultz
Some dma channels can be reserved for secure mode or other
hardware on the SoC, so provide a binding for a bitmask
listing the available channels for the kernel to use.

This follows the pre-existing bcm,dma-channel-mask binding.

Cc: Vinod Koul 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: dmaeng...@vger.kernel.org
Cc: devicet...@vger.kernel.org
Reviewed-by: Rob Herring 
Signed-off-by: John Stultz 
---
v3: Renamed to hisi-dma-avail-chan
v4: Reworked to generic dma-channel-mask
---
 Documentation/devicetree/bindings/dma/dma.txt | 4 
 1 file changed, 4 insertions(+)

diff --git a/Documentation/devicetree/bindings/dma/dma.txt 
b/Documentation/devicetree/bindings/dma/dma.txt
index 6312fb0..eeb4e4d 100644
--- a/Documentation/devicetree/bindings/dma/dma.txt
+++ b/Documentation/devicetree/bindings/dma/dma.txt
@@ -16,6 +16,9 @@ Optional properties:
 - dma-channels:Number of DMA channels supported by the controller.
 - dma-requests:Number of DMA request signals supported by the
controller.
+- dma-channel-mask:Bitmask of available DMA channels in ascending order
+   that are not reserved by firmware and are available to
+   the kernel. i.e. first channel corresponds to LSB.
 
 Example:
 
@@ -29,6 +32,7 @@ Example:
#dma-cells = <1>;
dma-channels = <32>;
dma-requests = <127>;
+   dma-channel-mask = <0xfffe>
};
 
 * DMA router
-- 
2.7.4



Re: [PATCH 3/8 v4] dma: k3dma: Upgrade k3dma driver to support hisi_asp_dma hardware

2019-01-23 Thread John Stultz
On Wed, Jan 23, 2019 at 4:57 AM Vinod Koul  wrote:
> On 22-01-19, 15:48, John Stultz wrote:
> > Do let me know if there's an example you'd rather I follow.
>
> To elaborate I was thinking of alternate scheme with:
>
> compatible = "hisilicon,k3-dma-1.0", NULL
> compatible = "hisilicon,hisi-pcm-asp-dma-1.0", .data = 
> _v1_dma_data
>
> and
>
> soc_data = device_get_match_data(>dev);
> if (!soc_data) {
> /* no data so flags are null */
> dev_warn(... "no driver data specified, assuming  no flags\n"
> k3_dma->flags = 0;
> }

Thanks for the clarification! Hmm. I can do this, though the trouble
is all the soc_data-> references have to switch to a struct
k3dma_soc_data on the stack, which we have to initialize/copy over
depending on the soc_data return, which I'm not sure really simplifies
much (other then saving a ulong off the heap). But I'm ok with either.

thanks
-john


[PATCH][RESEND] arm64: dts: hikey: Add DMA entries for Bluetooth UART

2019-01-23 Thread John Stultz
Add dma0 references for bluetooth uart to enable dma
for bt transfers.

Cc: Manivannan Sadhasivam 
Cc: Ryan Grachek 
Cc: Wei Xu 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: linux-arm-ker...@lists.infradead.org
Cc: devicet...@vger.kernel.org
Signed-off-by: John Stultz 
---
 arch/arm64/boot/dts/hisilicon/hi6220.dtsi | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/boot/dts/hisilicon/hi6220.dtsi 
b/arch/arm64/boot/dts/hisilicon/hi6220.dtsi
index aec9e37..fa15a08 100644
--- a/arch/arm64/boot/dts/hisilicon/hi6220.dtsi
+++ b/arch/arm64/boot/dts/hisilicon/hi6220.dtsi
@@ -319,6 +319,8 @@
clock-names = "uartclk", "apb_pclk";
pinctrl-names = "default";
pinctrl-0 = <_pmx_func _cfg_func1 
_cfg_func2>;
+   dmas = < 8  9>;
+   dma-names = "rx", "tx";
status = "disabled";
};
 
-- 
2.7.4



[PATCH] arm64: dts: hikey: Revert "Enable HS200 mode on eMMC"

2019-01-23 Thread John Stultz
From: Alistair Strachan 

This reverts commit abd7d0972a192ee653efc7b151a6af69db58f2bb. This
change was already partially reverted by John Stultz in commit
9c6d26df1fae ("arm64: dts: hikey: Fix eMMC corruption regression").

This change appears to cause controller resets and block read failures
which prevents successful booting on some hikey boards.

Cc: Ryan Grachek 
Cc: Wei Xu 
Cc: Manivannan Sadhasivam 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: linux-arm-ker...@lists.infradead.org
Cc: devicet...@vger.kernel.org
Cc: stable  #4.17+
Signed-off-by: Alistair Strachan 
Signed-off-by: John Stultz 
---
 arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts 
b/arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts
index 6102350..7092460 100644
--- a/arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts
+++ b/arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts
@@ -300,7 +300,6 @@
 
dwmmc_0: dwmmc0@f723d000 {
cap-mmc-highspeed;
-   mmc-hs200-1_8v;
non-removable;
bus-width = <0x8>;
vmmc-supply = <>;
-- 
2.7.4



Re: [PATCH 5/8 v4] dma: k3dma: Add support for dma-channel-mask

2019-01-22 Thread John Stultz
On Thu, Jan 17, 2019 at 9:14 AM Manivannan Sadhasivam
 wrote:
>
> /* Skip the channels which are masked */
> if ((d->dma_channel_mask) & BIT(pch))
> continue;

Per the discussion w/ Vinod and Rob, I think I'll leave this bit be,
so we use the channels in the bitmask.

> PS: Use BIT() macro where applicable.

But this suggestions I've integrated for v5.

Thanks so much for the review!
-john


Re: [PATCH 3/8 v4] dma: k3dma: Upgrade k3dma driver to support hisi_asp_dma hardware

2019-01-22 Thread John Stultz
On Sun, Jan 20, 2019 at 3:12 AM Vinod Koul  wrote:
>
> On 16-01-19, 09:10, John Stultz wrote:
> > From: Youlin Wang 
> >
> > On the hi3660 hardware there are two (at least) DMA controllers,
> > the DMA-P (Peripherial DMA) and the DMA-A (Audio DMA). The
> ^^^
> typo

Thanks so much for the review!

Fixed!

> > +
> > +#define K3_FLAG_NOCLK  (1<<0)
>
> why not use BIT()
>
> space between two please

Fixed.

> > +static const struct k3dma_soc_data k3_v1_dma_data = {
> > + .flags = 0,
> > +};
>
> So basically this is default right, do we need to set dma_data and not
> assume default..

I'm not sure I fully understand you here. Basically I'm just creating
a per-variant data structure. The k3_v1_dma_data isn't really the
default, but is the first variant supported. There may be future
variants that cause some new flag that the k3_v3_dma_data may need to
set. But for now that variant doesn't have any flags set.


> > +
> > +static const struct k3dma_soc_data asp_v1_dma_data = {
> > + .flags = K3_FLAG_NOCLK,
> > +};
> > +
> >  static const struct of_device_id k3_pdma_dt_ids[] = {
> > - { .compatible = "hisilicon,k3-dma-1.0", },
> > + { .compatible = "hisilicon,k3-dma-1.0",
> > +   .data = _v1_dma_data
> > + },
> > + { .compatible = "hisilicon,hisi-pcm-asp-dma-1.0",
> > +   .data = _v1_dma_data
> > + },
> >   {}
> >  };
> >  MODULE_DEVICE_TABLE(of, k3_pdma_dt_ids);
> > @@ -810,6 +830,7 @@ static struct dma_chan *k3_of_dma_simple_xlate(struct 
> > of_phandle_args *dma_spec,
> >
> >  static int k3_dma_probe(struct platform_device *op)
> >  {
> > + const struct k3dma_soc_data *soc_data;
> >   struct k3_dma_dev *d;
> >   const struct of_device_id *of_id;
> >   struct resource *iores;
> > @@ -823,6 +844,10 @@ static int k3_dma_probe(struct platform_device *op)
> >   if (!d)
> >   return -ENOMEM;
> >
> > + soc_data = device_get_match_data(>dev);
> > + if (!soc_data)
> > + return -EINVAL;
>
> So this is not optional, either ways fine by me :)

I think this way makes sense, but maybe I'm missing a better alternative?

Do let me know if there's an example you'd rather I follow.

Thanks so much again for the review!
-john


Re: [PATCH 2/8 v4] Documentation: bindings: dma: Add binding for dma-channel-mask

2019-01-17 Thread John Stultz
On Thu, Jan 17, 2019 at 9:08 AM Manivannan Sadhasivam
 wrote:
>
> On Wed, Jan 16, 2019 at 09:10:23AM -0800, John Stultz wrote:
> > Some dma channels can be reserved for secure mode or other
> > hardware on the SoC, so provide a binding for a bitmask
> > listing the available channels for the kernel to use.
> >
> > This follows the pre-existing bcm,dma-channel-mask binding.
> >
> > Cc: Vinod Koul 
> > Cc: Rob Herring 
> > Cc: Mark Rutland 
> > Cc: Tanglei Han 
> > Cc: Zhuangluan Su 
> > Cc: Ryan Grachek 
> > Cc: Manivannan Sadhasivam 
> > Cc: dmaeng...@vger.kernel.org
> > Cc: devicet...@vger.kernel.org
> > Signed-off-by: John Stultz 
> > ---
> > v3: Renamed to hisi-dma-avail-chan
> > v4: Reworked to generic dma-channel-mask
> > ---
> >  Documentation/devicetree/bindings/dma/dma.txt | 4 
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/dma/dma.txt 
> > b/Documentation/devicetree/bindings/dma/dma.txt
> > index 6312fb0..eeb4e4d 100644
> > --- a/Documentation/devicetree/bindings/dma/dma.txt
> > +++ b/Documentation/devicetree/bindings/dma/dma.txt
> > @@ -16,6 +16,9 @@ Optional properties:
> >  - dma-channels:  Number of DMA channels supported by the controller.
> >  - dma-requests:  Number of DMA request signals supported by the
> >   controller.
> > +- dma-channel-mask:  Bitmask of available DMA channels in ascending order
> > + that are not reserved by firmware and are available to
> > + the kernel. i.e. first channel corresponds to LSB.
>
> A general assumption is, "dma-channel-mask" refers to the bit fields of
> the channels which needs to be masked. But here, it refers to the channels
> which are available. Doesn't it contradict?

Hrm. So while I can sort of understand the common usage of "mask" as
to "hide", thus the desire to have a bitfield mean "the channels we
hide" or "don't use", but in my experience bitmasking is more commonly
used to keep only a portion of the the bits, so from that perspective
its more intuitive that a mask be the channels we keep to use. So I'm
not sure if your suggestion makes it more clear.

But I'm not very particular here, so I'll defer to others on this.

thanks
-john


Re: ufshcd_queuecommand() triggering after ufshcd_suspend()?

2019-01-16 Thread John Stultz
On Sun, Jan 13, 2019 at 7:25 PM Zang Leigang  wrote:
>   I think there are two different issues:
>
>   1. clk_gating's state(including state's trace event) and is_suspended is not
>  wrapped by ufshcd_is_clkgating_allowed which Hisilicon's kirin platoform
>  soc does not need but is set and checked in a regular path.
>   2. I think SCSI is necessary to block queue to stop sending to ufs after
>  system suspend or, add a new state for ufs like UFSHCD_STATE_SUSPENDING
>  or what else. hba->is_sys_suspended is too late to stop 
> ufshcd_queuecommand
>

Hey Zang,
  Thanks for the suggestions. I'm not sure I have enough context to
try to come up with a patch. I'll try to take a stab at it, but I'd
appreciate it if  you have any more specific suggestions or patches
for me to try. :)

thanks
-john


[PATCH 5/8 v4] dma: k3dma: Add support for dma-channel-mask

2019-01-16 Thread John Stultz
From: Li Yu 

Add dma-channel-mask as a property for k3dma, it defines
available dma channels which a non-secure mode driver can use.

One sample usage of this is in Hi3660 SoC. DMA channel 0 is
reserved to lpm3, which is a coprocessor for power management. So
as a result, any request in kernel (which runs on main processor
and in non-secure mode) should start from at least channel 1.

Cc: Dan Williams 
Cc: Vinod Koul 
Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: Guodong Xu 
Cc: dmaeng...@vger.kernel.org
Signed-off-by: Li Yu 
[jstultz: Reworked to use a channel mask]
Signed-off-by: John Stultz 
---
v3: Rename to hisi-dma-avail-chan
v4: Rename to dma-channel-mask
---
 drivers/dma/k3dma.c | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/dma/k3dma.c b/drivers/dma/k3dma.c
index b2060bf..ed19b1f 100644
--- a/drivers/dma/k3dma.c
+++ b/drivers/dma/k3dma.c
@@ -111,6 +111,7 @@ struct k3_dma_dev {
struct dma_pool *pool;
u32 dma_channels;
u32 dma_requests;
+   u32 dma_channel_mask;
unsigned intirq;
 };
 
@@ -318,6 +319,9 @@ static void k3_dma_tasklet(unsigned long arg)
/* check new channel request in d->chan_pending */
spin_lock_irq(>lock);
for (pch = 0; pch < d->dma_channels; pch++) {
+   if (!(d->dma_channel_mask & (1<phy[pch];
 
if (p->vchan == NULL && !list_empty(>chan_pending)) {
@@ -335,6 +339,9 @@ static void k3_dma_tasklet(unsigned long arg)
spin_unlock_irq(>lock);
 
for (pch = 0; pch < d->dma_channels; pch++) {
+   if (!(d->dma_channel_mask & (1<phy[pch];
c = p->vchan;
@@ -855,6 +862,13 @@ static int k3_dma_probe(struct platform_device *op)
"dma-channels", >dma_channels);
of_property_read_u32((>dev)->of_node,
"dma-requests", >dma_requests);
+   ret = of_property_read_u32((>dev)->of_node,
+   "dma-channel-mask", >dma_channel_mask);
+   if (ret) {
+   dev_warn(>dev,
+"dma-channel-mask doesn't exist, considering 
all as available.\n");
+   d->dma_channel_mask = (u32)~0UL;
+   }
}
 
if (!(soc_data->flags & K3_FLAG_NOCLK)) {
@@ -886,8 +900,12 @@ static int k3_dma_probe(struct platform_device *op)
return -ENOMEM;
 
for (i = 0; i < d->dma_channels; i++) {
-   struct k3_dma_phy *p = >phy[i];
+   struct k3_dma_phy *p;
+
+   if (!(d->dma_channel_mask & (1<phy[i];
p->idx = i;
p->base = d->base + i * 0x40;
}
-- 
2.7.4



[PATCH 1/8 v4] Documentation: bindings: k3dma: Extend the k3dma driver binding to support hisi-asp

2019-01-16 Thread John Stultz
From: Youlin Wang 

Extend the k3dma driver binding to support hisi-asp hardware
variants.

Cc: Vinod Koul 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: Zhuangluan Su 
Cc: Tanglei Han 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: dmaeng...@vger.kernel.org
Cc: devicet...@vger.kernel.org
Reviewed-by: Rob Herring 
Signed-off-by: Youlin Wang 
Signed-off-by: Tanglei Han 
Signed-off-by: John Stultz 
---
v2: Simplify patch, removing extranious examples
---
 Documentation/devicetree/bindings/dma/k3dma.txt | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/dma/k3dma.txt 
b/Documentation/devicetree/bindings/dma/k3dma.txt
index 4945aea..10a2f15 100644
--- a/Documentation/devicetree/bindings/dma/k3dma.txt
+++ b/Documentation/devicetree/bindings/dma/k3dma.txt
@@ -3,7 +3,9 @@
 See dma.txt first
 
 Required properties:
-- compatible: Should be "hisilicon,k3-dma-1.0"
+- compatible: Must be one of
+-  "hisilicon,k3-dma-1.0"
+-  "hisilicon,hisi-pcm-asp-dma-1.0"
 - reg: Should contain DMA registers location and length.
 - interrupts: Should contain one interrupt shared by all channel
 - #dma-cells: see dma.txt, should be 1, para number
-- 
2.7.4



[PATCH 7/8 v4] arm64: dts: hi3660: Add hisi asp dma device

2019-01-16 Thread John Stultz
From: Youlin Wang 

Add asp-dma device to hi3660 dts

Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: Wei Xu 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: linux-arm-ker...@lists.infradead.org
Cc: devicet...@vger.kernel.org
Acked-by: Manivannan Sadhasivam 
Signed-off-by: Youlin Wang 
Signed-off-by: Tanglei Han 
Signed-off-by: John Stultz 
---
v2: Removed undocumented bindings
---
 arch/arm64/boot/dts/hisilicon/hi3660.dtsi | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi 
b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
index 4c8d682..77a7135 100644
--- a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
+++ b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
@@ -567,6 +567,16 @@
dma-type = "hi3660_dma";
};
 
+   asp_dmac: dma-controller@e804b000 {
+   compatible = "hisilicon,hisi-pcm-asp-dma-1.0";
+   reg = <0x0 0xe804b000 0x0 0x1000>;
+   #dma-cells = <1>;
+   dma-channels = <16>;
+   dma-requests = <32>;
+   interrupts = ;
+   interrupt-names = "asp_dma_irq";
+   };
+
rtc0: rtc@fff04000 {
compatible = "arm,pl031", "arm,primecell";
reg = <0x0 0Xfff04000 0x0 0x1000>;
-- 
2.7.4



[PATCH 2/8 v4] Documentation: bindings: dma: Add binding for dma-channel-mask

2019-01-16 Thread John Stultz
Some dma channels can be reserved for secure mode or other
hardware on the SoC, so provide a binding for a bitmask
listing the available channels for the kernel to use.

This follows the pre-existing bcm,dma-channel-mask binding.

Cc: Vinod Koul 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: dmaeng...@vger.kernel.org
Cc: devicet...@vger.kernel.org
Signed-off-by: John Stultz 
---
v3: Renamed to hisi-dma-avail-chan
v4: Reworked to generic dma-channel-mask
---
 Documentation/devicetree/bindings/dma/dma.txt | 4 
 1 file changed, 4 insertions(+)

diff --git a/Documentation/devicetree/bindings/dma/dma.txt 
b/Documentation/devicetree/bindings/dma/dma.txt
index 6312fb0..eeb4e4d 100644
--- a/Documentation/devicetree/bindings/dma/dma.txt
+++ b/Documentation/devicetree/bindings/dma/dma.txt
@@ -16,6 +16,9 @@ Optional properties:
 - dma-channels:Number of DMA channels supported by the controller.
 - dma-requests:Number of DMA request signals supported by the
controller.
+- dma-channel-mask:Bitmask of available DMA channels in ascending order
+   that are not reserved by firmware and are available to
+   the kernel. i.e. first channel corresponds to LSB.
 
 Example:
 
@@ -29,6 +32,7 @@ Example:
#dma-cells = <1>;
dma-channels = <32>;
dma-requests = <127>;
+   dma-channel-mask = <0xfffe>
};
 
 * DMA router
-- 
2.7.4



[PATCH 6/8 v4] arm64: dts: hi3660: Add dma to uart nodes

2019-01-16 Thread John Stultz
Try to add DMA support to the uart nodes following
the assignments made in the dts from the victoria vendor kernel
here:
https://consumer.huawei.com/en/opensource/detail/?siteCode=worldwide=p10=openSourceSoftware=10=1

Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: Wei Xu 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: linux-arm-ker...@lists.infradead.org
Cc: devicet...@vger.kernel.org
Signed-off-by: John Stultz 
---
v3:
* Remove dma enablment on uart0 which would use reserved channel 0
---
 arch/arm64/boot/dts/hisilicon/hi3660.dtsi | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi 
b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
index 20ae40d..4c8d682 100644
--- a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
+++ b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
@@ -478,6 +478,8 @@
compatible = "arm,pl011", "arm,primecell";
reg = <0x0 0xfdf0 0x0 0x1000>;
interrupts = ;
+   dma-names = "rx", "tx";
+   dmas =  < 2  3>;
clocks = <_ctrl HI3660_CLK_GATE_UART1>,
 <_ctrl HI3660_CLK_GATE_UART1>;
clock-names = "uartclk", "apb_pclk";
@@ -490,6 +492,8 @@
compatible = "arm,pl011", "arm,primecell";
reg = <0x0 0xfdf03000 0x0 0x1000>;
interrupts = ;
+   dma-names = "rx", "tx";
+   dmas =  < 4  5>;
clocks = <_ctrl HI3660_CLK_GATE_UART2>,
 <_ctrl HI3660_PCLK>;
clock-names = "uartclk", "apb_pclk";
@@ -514,6 +518,8 @@
compatible = "arm,pl011", "arm,primecell";
reg = <0x0 0xfdf01000 0x0 0x1000>;
interrupts = ;
+   dma-names = "rx", "tx";
+   dmas =  < 6  7>;
clocks = <_ctrl HI3660_CLK_GATE_UART4>,
 <_ctrl HI3660_CLK_GATE_UART4>;
clock-names = "uartclk", "apb_pclk";
@@ -526,6 +532,8 @@
compatible = "arm,pl011", "arm,primecell";
reg = <0x0 0xfdf05000 0x0 0x1000>;
interrupts = ;
+   dma-names = "rx", "tx";
+   dmas =  < 8  9>;
clocks = <_ctrl HI3660_CLK_GATE_UART5>,
 <_ctrl HI3660_CLK_GATE_UART5>;
clock-names = "uartclk", "apb_pclk";
-- 
2.7.4



[PATCH 4/8 v4] dma: k3dma: Delete axi_config

2019-01-16 Thread John Stultz
From: Li Yu 

Axi_config controls whether DMA resources can be accessed in non-secure
mode, such as linux kernel. The register should be set by the bootloader
stage and depends on the device.

Thus, this patch removes axi_config from k3dma driver.

Cc: Dan Williams 
Cc: Vinod Koul 
Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: dmaeng...@vger.kernel.org
Acked-by: Manivannan Sadhasivam 
Signed-off-by: Li Yu 
Signed-off-by: Guodong Xu 
[jstultz: Minor tweaks to commit message]
Signed-off-by: John Stultz 
---
 drivers/dma/k3dma.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/dma/k3dma.c b/drivers/dma/k3dma.c
index df61406..b2060bf 100644
--- a/drivers/dma/k3dma.c
+++ b/drivers/dma/k3dma.c
@@ -52,8 +52,6 @@
 #define CX_SRC 0x814
 #define CX_DST 0x818
 #define CX_CFG 0x81c
-#define AXI_CFG0x820
-#define AXI_CFG_DEFAULT0x201201
 
 #define CX_LLI_CHAIN_EN0x2
 #define CX_CFG_EN  0x1
@@ -168,7 +166,6 @@ static void k3_dma_set_desc(struct k3_dma_phy *phy, struct 
k3_desc_hw *hw)
writel_relaxed(hw->count, phy->base + CX_CNT0);
writel_relaxed(hw->saddr, phy->base + CX_SRC);
writel_relaxed(hw->daddr, phy->base + CX_DST);
-   writel_relaxed(AXI_CFG_DEFAULT, phy->base + AXI_CFG);
writel_relaxed(hw->config, phy->base + CX_CFG);
 }
 
-- 
2.7.4



[PATCH 8/8 v4] arm64: dts: hi3660: Fixup unofficial dma-min-chan to dma-channel-mask

2019-01-16 Thread John Stultz
A undocumented and unimplemented binding got into the hi3660
dtsi, and this switches that binding to the now documented one.

Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: Wei Xu 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: linux-arm-ker...@lists.infradead.org
Cc: devicet...@vger.kernel.org
Signed-off-by: John Stultz 
---
v3: Renamed to hisi-dma-avail-chan
v4: Renamed to dma-channel-mask
---
 arch/arm64/boot/dts/hisilicon/hi3660.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi 
b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
index 77a7135..8e48f42 100644
--- a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
+++ b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
@@ -560,7 +560,7 @@
#dma-cells = <1>;
dma-channels = <16>;
dma-requests = <32>;
-   dma-min-chan = <1>;
+   dma-channel-mask = <0xfffe>;
interrupts = ;
clocks = <_ctrl HI3660_CLK_GATE_DMAC>;
dma-no-cci;
-- 
2.7.4



[PATCH 3/8 v4] dma: k3dma: Upgrade k3dma driver to support hisi_asp_dma hardware

2019-01-16 Thread John Stultz
From: Youlin Wang 

On the hi3660 hardware there are two (at least) DMA controllers,
the DMA-P (Peripherial DMA) and the DMA-A (Audio DMA). The
two blocks are similar, but have some slight differences. This
resulted in the vendor implementing two separate drivers, which
after review, they have been able to condense and re-use the
existing k3dma driver.

Thus, this patch adds support for the new "hisi-pcm-asp-dma-1.0"
compatible string in the binding.

One difference with the DMA-A controller, is that it does not
need to initialize a clock. So we skip this by adding and using
soc data flags.

After above this driver will support both k3 and hisi_asp dma
hardware.

Cc: Dan Williams 
Cc: Vinod Koul 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: dmaeng...@vger.kernel.org
Acked-by: Manivannan Sadhasivam 
Signed-off-by: Youlin Wang 
Signed-off-by: Tanglei Han 
[jstultz: Reworked to use of_match_data, commit msg improvements]
Signed-off-by: John Stultz 
---
v2:
* Reworked to use of_match_data
v3:
* Further rework of the commit message
---
 drivers/dma/k3dma.c | 37 -
 1 file changed, 32 insertions(+), 5 deletions(-)

diff --git a/drivers/dma/k3dma.c b/drivers/dma/k3dma.c
index fdec2b6..df61406 100644
--- a/drivers/dma/k3dma.c
+++ b/drivers/dma/k3dma.c
@@ -116,6 +116,13 @@ struct k3_dma_dev {
unsigned intirq;
 };
 
+
+#define K3_FLAG_NOCLK  (1<<0)
+struct k3dma_soc_data {
+   unsigned long flags;
+};
+
+
 #define to_k3_dma(dmadev) container_of(dmadev, struct k3_dma_dev, slave)
 
 static int k3_dma_config_write(struct dma_chan *chan,
@@ -790,8 +797,21 @@ static int k3_dma_transfer_resume(struct dma_chan *chan)
return 0;
 }
 
+static const struct k3dma_soc_data k3_v1_dma_data = {
+   .flags = 0,
+};
+
+static const struct k3dma_soc_data asp_v1_dma_data = {
+   .flags = K3_FLAG_NOCLK,
+};
+
 static const struct of_device_id k3_pdma_dt_ids[] = {
-   { .compatible = "hisilicon,k3-dma-1.0", },
+   { .compatible = "hisilicon,k3-dma-1.0",
+ .data = _v1_dma_data
+   },
+   { .compatible = "hisilicon,hisi-pcm-asp-dma-1.0",
+ .data = _v1_dma_data
+   },
{}
 };
 MODULE_DEVICE_TABLE(of, k3_pdma_dt_ids);
@@ -810,6 +830,7 @@ static struct dma_chan *k3_of_dma_simple_xlate(struct 
of_phandle_args *dma_spec,
 
 static int k3_dma_probe(struct platform_device *op)
 {
+   const struct k3dma_soc_data *soc_data;
struct k3_dma_dev *d;
const struct of_device_id *of_id;
struct resource *iores;
@@ -823,6 +844,10 @@ static int k3_dma_probe(struct platform_device *op)
if (!d)
return -ENOMEM;
 
+   soc_data = device_get_match_data(>dev);
+   if (!soc_data)
+   return -EINVAL;
+
d->base = devm_ioremap_resource(>dev, iores);
if (IS_ERR(d->base))
return PTR_ERR(d->base);
@@ -835,10 +860,12 @@ static int k3_dma_probe(struct platform_device *op)
"dma-requests", >dma_requests);
}
 
-   d->clk = devm_clk_get(>dev, NULL);
-   if (IS_ERR(d->clk)) {
-   dev_err(>dev, "no dma clk\n");
-   return PTR_ERR(d->clk);
+   if (!(soc_data->flags & K3_FLAG_NOCLK)) {
+   d->clk = devm_clk_get(>dev, NULL);
+   if (IS_ERR(d->clk)) {
+   dev_err(>dev, "no dma clk\n");
+   return PTR_ERR(d->clk);
+   }
}
 
irq = platform_get_irq(op, 0);
-- 
2.7.4



[PATCH 0/8 v4] k3dma patches to add support for hi3660/HiKey960

2019-01-16 Thread John Stultz
This patch series is based on recent work by Tanglei Han, and
adds support for hi3660 SoCs as found on the HiKey960 board,
along with a few patches I've been carrying.

thanks
-john

New in v4:
* Rework hisi,dma-avail-chans to generic dma-channel-mask, per Rob's suggestion

Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Dan Williams 
Cc: Vinod Koul 
Cc: Wei Xu 
Cc: Mark Rutland 
Cc: Rob Herring 
Cc: Guodong Xu 
Cc: Manivannan Sadhasivam 
Cc: Ryan Grachek 
CC: devicet...@vger.kernel.org
Cc: dmaeng...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org


John Stultz (3):
  Documentation: bindings: dma: Add binding for dma-channel-mask
  arm64: dts: hi3660: Add dma to uart nodes
  arm64: dts: hi3660: Fixup unofficial dma-min-chan to dma-channel-mask

Li Yu (2):
  dma: k3dma: Delete axi_config
  dma: k3dma: Add support for dma-channel-mask

Youlin Wang (3):
  Documentation: bindings: k3dma: Extend the k3dma driver binding to
support hisi-asp
  dma: k3dma: Upgrade k3dma driver to support hisi_asp_dma hardware
  arm64: dts: hi3660: Add hisi asp dma device

 Documentation/devicetree/bindings/dma/dma.txt   |  4 ++
 Documentation/devicetree/bindings/dma/k3dma.txt |  4 +-
 arch/arm64/boot/dts/hisilicon/hi3660.dtsi   | 20 -
 drivers/dma/k3dma.c | 60 +
 4 files changed, 77 insertions(+), 11 deletions(-)

-- 
2.7.4



Re: [PATCH v4 1/2] mm/memfd: Add an F_SEAL_FUTURE_WRITE seal to memfd

2019-01-15 Thread John Stultz
On Sat, Jan 12, 2019 at 12:38 PM Joel Fernandes  wrote:
>
> From: "Joel Fernandes (Google)" 
>
> Android uses ashmem for sharing memory regions.  We are looking forward to
> migrating all usecases of ashmem to memfd so that we can possibly remove
> the ashmem driver in the future from staging while also benefiting from
> using memfd and contributing to it.  Note staging drivers are also not ABI
> and generally can be removed at anytime.
>
> One of the main usecases Android has is the ability to create a region and
> mmap it as writeable, then add protection against making any "future"
> writes while keeping the existing already mmap'ed writeable-region active.
> This allows us to implement a usecase where receivers of the shared
> memory buffer can get a read-only view, while the sender continues to
> write to the buffer.  See CursorWindow documentation in Android for more
> details:
> https://developer.android.com/reference/android/database/CursorWindow
>
> This usecase cannot be implemented with the existing F_SEAL_WRITE seal.
> To support the usecase, this patch adds a new F_SEAL_FUTURE_WRITE seal
> which prevents any future mmap and write syscalls from succeeding while
> keeping the existing mmap active.
>
> A better way to do F_SEAL_FUTURE_WRITE seal was discussed [1] last week
> where we don't need to modify core VFS structures to get the same
> behavior of the seal. This solves several side-effects pointed by Andy.
> self-tests are provided in later patch to verify the expected semantics.
>
> [1] https://lore.kernel.org/lkml/2018173650.ga256...@google.com/
>
> [Thanks a lot to Andy for suggestions to improve code]
> Cc: Andy Lutomirski 
> Signed-off-by: Joel Fernandes (Google) 
> ---
>  fs/hugetlbfs/inode.c   |  2 +-
>  include/uapi/linux/fcntl.h |  1 +
>  mm/memfd.c |  3 ++-
>  mm/shmem.c | 25 ++---
>  4 files changed, 26 insertions(+), 5 deletions(-)

Acked-by: John Stultz 


Re: [PATCH 2/8 v3] Documentation: bindings: k3dma: Add binding for hisi-dma-avail-chan

2019-01-11 Thread John Stultz
On Fri, Jan 11, 2019 at 12:04 PM Rob Herring  wrote:
>
> On Fri, Jan 11, 2019 at 1:58 PM Rob Herring  wrote:
> >
> > On Thu, Jan 10, 2019 at 11:34 AM John Stultz  wrote:
> > >
> > > Some dma channels can be reserved for secure mode or other
> > > hardware on the SoC, so provide a binding for a bitmask
> > > listing the available channels for the kernel to use.
> > >
> > > Cc: Vinod Koul 
> > > Cc: Rob Herring 
> > > Cc: Mark Rutland 
> > > Cc: Tanglei Han 
> > > Cc: Zhuangluan Su 
> > > Cc: Ryan Grachek 
> > > Cc: Manivannan Sadhasivam 
> > > Cc: dmaeng...@vger.kernel.org
> > > Cc: devicet...@vger.kernel.org
> > > Signed-off-by: John Stultz 
> > > ---
> > > v3: Renamed to hisi-dma-avail-chan
> > > ---
> > >  Documentation/devicetree/bindings/dma/k3dma.txt | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/Documentation/devicetree/bindings/dma/k3dma.txt 
> > > b/Documentation/devicetree/bindings/dma/k3dma.txt
> > > index 10a2f15..38825d4 100644
> > > --- a/Documentation/devicetree/bindings/dma/k3dma.txt
> > > +++ b/Documentation/devicetree/bindings/dma/k3dma.txt
> > > @@ -14,6 +14,9 @@ Required properties:
> > > have specific request line
> > >  - clocks: clock required
> > >
> > > +Optional properties:
> > > +- hisi-dma-avail-chan: Bitmask of available physical channels
> >
> > Not quite right. Should be: hisilicon,dma-avail-chan
>
> Actually, we already have the same case elsewhere with
> 'brcm,dma-channel-mask'. Maybe there are others. So make the property
> common (i.e. documented in dma.txt) and called 'dma-channel-mask'.

Ok. I'll rework it for that then.

> Whether or not the dmaengine handles this or not is irrelevant to
> whether the binding is common or not. I have no say over OS design
> decisions.

Ok. I'll keep it in the driver for now unless otherwise directed.

Thanks so much for the review!
-john


ufshcd_queuecommand() triggering after ufshcd_suspend()?

2019-01-10 Thread John Stultz
Hey all,
  Frequently, since support for the HiKey960's UFS code landed in
4.19, I've noticed the following warning on reboot:

[   23.086860] WARNING: CPU: 0 PID: 2507 at
drivers/scsi/ufs/ufshcd.c:2460 ufshcd_queuecommand+0x59c/0x5a8
[   23.096256] Modules linked in:
[   23.099313] CPU: 0 PID: 2507 Comm: kworker/0:1H Tainted: G S
5.0.0-rc1-00068-g3f81a19 #273
[   23.108873] Hardware name: HiKey960 (DT)
[   23.112802] Workqueue: kblockd blk_mq_requeue_work
[   23.117591] pstate: 8045 (Nzcv daif +PAN -UAO)
[   23.122378] pc : ufshcd_queuecommand+0x59c/0x5a8
[   23.126990] lr : ufshcd_queuecommand+0x58c/0x5a8
[   23.131600] sp : ff8015e1ba80
[   23.134907] x29: ff8015e1ba80 x28: ffc217f94048
[   23.140214] x27: 0010 x26: ffc217a7c8b8
[   23.145520] x25: ffc217a7c000 x24: ffc217a7ceb0
[   23.150827] x23:  x22: ffc217a7c808
[   23.156133] x21: ffc217f94120 x20: 0010
[   23.161440] x19: ff801186d000 x18: ff801186db08
[   23.166746] x17:  x16: 
[   23.172053] x15: ff8095e1b7c7 x14: 692064616574736e
[   23.177360] x13: 6928204e4f5f534b x12: 4c43203d21206574
[   23.182666] x11: 6174732e676e6974 x10: 61675f6b6c633e2d
[   23.187973] x9 : 61626820646e616d x8 : 6d6f636575657571
[   23.193280] x7 :  x6 : ff801186e000
[   23.198586] x5 : ff801186e270 x4 : ff8010096dc0
[   23.203894] x3 : 0001 x2 : 47dd99afde511d00
[   23.209201] x1 :  x0 : 
[   23.214509] Call trace:
[   23.216952]  ufshcd_queuecommand+0x59c/0x5a8
[   23.221220]  scsi_queue_rq+0x5b4/0x880
[   23.224964]  blk_mq_dispatch_rq_list+0xb0/0x510
[   23.229492]  blk_mq_sched_dispatch_requests+0xf4/0x198
[   23.234626]  __blk_mq_run_hw_queue+0xb4/0x120
[   23.238978]  __blk_mq_delay_run_hw_queue+0x110/0x200
[   23.243937]  blk_mq_run_hw_queue+0xb8/0x118
[   23.248114]  blk_mq_run_hw_queues+0x58/0x78
[   23.252291]  blk_mq_requeue_work+0x140/0x168
[   23.256560]  process_one_work+0x158/0x468
[   23.260564]  worker_thread+0x50/0x460
[   23.264222]  kthread+0x104/0x130
[   23.267447]  ret_from_fork+0x10/0x1c
[   23.271017] ---[ end trace 45f1ee04059cdf00 ]---

Since the warning is triggering from the WARN_ON(hba->clk_gating.state
!= CLKS_ON) line, I annotated the clk_gating.state changes, and am
seeing on reboot:
  vdc: Waited 0ms for vold
  sd 0:0:0:3: [sdd] Synchronizing SCSI cache
  sd 0:0:0:2: [sdc] Synchronizing SCSI cache
  sd 0:0:0:1: [sdb] Synchronizing SCSI cache
  sd 0:0:0:0: [sda] Synchronizing SCSI cache
  ufshcd_suspend: setting clk_gating.state CLKS_OFF
  ufshcd_queuecommand hba->clk_gating.state != CLKS_ON (instead its 0)


So it seems like ufshcd_suspend() is has run, but then the workqueue
(occasionally) fires afterwards triggering the issue.

Maybe should something in ufshcd_queuecommand be checking the
clk_gating.is_suspended flag before proceeding?

Other ideas?  The logic all seems to be in the generic code, but I'm
not sure if maybe the ufs-hisi.c code is mis-managing something?

thanks
-john


[PATCH 6/8 v3] arm64: dts: hi3660: Add dma to uart nodes

2019-01-10 Thread John Stultz
Try to add DMA support to the uart nodes following
the assignments made in the dts from the victoria vendor kernel
here:
https://consumer.huawei.com/en/opensource/detail/?siteCode=worldwide=p10=openSourceSoftware=10=1

Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: Wei Xu 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: linux-arm-ker...@lists.infradead.org
Cc: devicet...@vger.kernel.org
Signed-off-by: John Stultz 
---
v3:
* Remove dma enablment on uart0 which would use reserved channel 0
---
 arch/arm64/boot/dts/hisilicon/hi3660.dtsi | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi 
b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
index 20ae40d..4c8d682 100644
--- a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
+++ b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
@@ -478,6 +478,8 @@
compatible = "arm,pl011", "arm,primecell";
reg = <0x0 0xfdf0 0x0 0x1000>;
interrupts = ;
+   dma-names = "rx", "tx";
+   dmas =  < 2  3>;
clocks = <_ctrl HI3660_CLK_GATE_UART1>,
 <_ctrl HI3660_CLK_GATE_UART1>;
clock-names = "uartclk", "apb_pclk";
@@ -490,6 +492,8 @@
compatible = "arm,pl011", "arm,primecell";
reg = <0x0 0xfdf03000 0x0 0x1000>;
interrupts = ;
+   dma-names = "rx", "tx";
+   dmas =  < 4  5>;
clocks = <_ctrl HI3660_CLK_GATE_UART2>,
 <_ctrl HI3660_PCLK>;
clock-names = "uartclk", "apb_pclk";
@@ -514,6 +518,8 @@
compatible = "arm,pl011", "arm,primecell";
reg = <0x0 0xfdf01000 0x0 0x1000>;
interrupts = ;
+   dma-names = "rx", "tx";
+   dmas =  < 6  7>;
clocks = <_ctrl HI3660_CLK_GATE_UART4>,
 <_ctrl HI3660_CLK_GATE_UART4>;
clock-names = "uartclk", "apb_pclk";
@@ -526,6 +532,8 @@
compatible = "arm,pl011", "arm,primecell";
reg = <0x0 0xfdf05000 0x0 0x1000>;
interrupts = ;
+   dma-names = "rx", "tx";
+   dmas =  < 8  9>;
clocks = <_ctrl HI3660_CLK_GATE_UART5>,
 <_ctrl HI3660_CLK_GATE_UART5>;
clock-names = "uartclk", "apb_pclk";
-- 
2.7.4



[PATCH 4/8 v3] dma: k3dma: Delete axi_config

2019-01-10 Thread John Stultz
From: Li Yu 

Axi_config controls whether DMA resources can be accessed in non-secure
mode, such as linux kernel. The register should be set by the bootloader
stage and depends on the device.

Thus, this patch removes axi_config from k3dma driver.

Cc: Dan Williams 
Cc: Vinod Koul 
Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: dmaeng...@vger.kernel.org
Acked-by: Manivannan Sadhasivam 
Signed-off-by: Li Yu 
Signed-off-by: Guodong Xu 
[jstultz: Minor tweaks to commit message]
Signed-off-by: John Stultz 
---
 drivers/dma/k3dma.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/dma/k3dma.c b/drivers/dma/k3dma.c
index df61406..b2060bf 100644
--- a/drivers/dma/k3dma.c
+++ b/drivers/dma/k3dma.c
@@ -52,8 +52,6 @@
 #define CX_SRC 0x814
 #define CX_DST 0x818
 #define CX_CFG 0x81c
-#define AXI_CFG0x820
-#define AXI_CFG_DEFAULT0x201201
 
 #define CX_LLI_CHAIN_EN0x2
 #define CX_CFG_EN  0x1
@@ -168,7 +166,6 @@ static void k3_dma_set_desc(struct k3_dma_phy *phy, struct 
k3_desc_hw *hw)
writel_relaxed(hw->count, phy->base + CX_CNT0);
writel_relaxed(hw->saddr, phy->base + CX_SRC);
writel_relaxed(hw->daddr, phy->base + CX_DST);
-   writel_relaxed(AXI_CFG_DEFAULT, phy->base + AXI_CFG);
writel_relaxed(hw->config, phy->base + CX_CFG);
 }
 
-- 
2.7.4



[PATCH 1/8 v3] Documentation: bindings: k3dma: Extend the k3dma driver binding to support hisi-asp

2019-01-10 Thread John Stultz
From: Youlin Wang 

Extend the k3dma driver binding to support hisi-asp hardware
variants.

Cc: Vinod Koul 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: Zhuangluan Su 
Cc: Tanglei Han 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: dmaeng...@vger.kernel.org
Cc: devicet...@vger.kernel.org
Signed-off-by: Youlin Wang 
Signed-off-by: Tanglei Han 
Signed-off-by: John Stultz 
---
v2: Simplify patch, removing extranious examples
---
 Documentation/devicetree/bindings/dma/k3dma.txt | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/dma/k3dma.txt 
b/Documentation/devicetree/bindings/dma/k3dma.txt
index 4945aea..10a2f15 100644
--- a/Documentation/devicetree/bindings/dma/k3dma.txt
+++ b/Documentation/devicetree/bindings/dma/k3dma.txt
@@ -3,7 +3,9 @@
 See dma.txt first
 
 Required properties:
-- compatible: Should be "hisilicon,k3-dma-1.0"
+- compatible: Must be one of
+-  "hisilicon,k3-dma-1.0"
+-  "hisilicon,hisi-pcm-asp-dma-1.0"
 - reg: Should contain DMA registers location and length.
 - interrupts: Should contain one interrupt shared by all channel
 - #dma-cells: see dma.txt, should be 1, para number
-- 
2.7.4



[PATCH 3/8 v3] dma: k3dma: Upgrade k3dma driver to support hisi_asp_dma hardware

2019-01-10 Thread John Stultz
From: Youlin Wang 

On the hi3660 hardware there are two (at least) DMA controllers,
the DMA-P (Peripherial DMA) and the DMA-A (Audio DMA). The
two blocks are similar, but have some slight differences. This
resulted in the vendor implementing two separate drivers, which
after review, they have been able to condense and re-use the
existing k3dma driver.

Thus, this patch adds support for the new "hisi-pcm-asp-dma-1.0"
compatible string in the binding.

One difference with the DMA-A controller, is that it does not
need to initialize a clock. So we skip this by adding and using
soc data flags.

After above this driver will support both k3 and hisi_asp dma
hardware.

Cc: Dan Williams 
Cc: Vinod Koul 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: dmaeng...@vger.kernel.org
Acked-by: Manivannan Sadhasivam 
Signed-off-by: Youlin Wang 
Signed-off-by: Tanglei Han 
[jstultz: Reworked to use of_match_data, commit msg improvements]
Signed-off-by: John Stultz 
---
v2:
* Reworked to use of_match_data
v3:
* Further rework of the commit message
---
 drivers/dma/k3dma.c | 37 -
 1 file changed, 32 insertions(+), 5 deletions(-)

diff --git a/drivers/dma/k3dma.c b/drivers/dma/k3dma.c
index fdec2b6..df61406 100644
--- a/drivers/dma/k3dma.c
+++ b/drivers/dma/k3dma.c
@@ -116,6 +116,13 @@ struct k3_dma_dev {
unsigned intirq;
 };
 
+
+#define K3_FLAG_NOCLK  (1<<0)
+struct k3dma_soc_data {
+   unsigned long flags;
+};
+
+
 #define to_k3_dma(dmadev) container_of(dmadev, struct k3_dma_dev, slave)
 
 static int k3_dma_config_write(struct dma_chan *chan,
@@ -790,8 +797,21 @@ static int k3_dma_transfer_resume(struct dma_chan *chan)
return 0;
 }
 
+static const struct k3dma_soc_data k3_v1_dma_data = {
+   .flags = 0,
+};
+
+static const struct k3dma_soc_data asp_v1_dma_data = {
+   .flags = K3_FLAG_NOCLK,
+};
+
 static const struct of_device_id k3_pdma_dt_ids[] = {
-   { .compatible = "hisilicon,k3-dma-1.0", },
+   { .compatible = "hisilicon,k3-dma-1.0",
+ .data = _v1_dma_data
+   },
+   { .compatible = "hisilicon,hisi-pcm-asp-dma-1.0",
+ .data = _v1_dma_data
+   },
{}
 };
 MODULE_DEVICE_TABLE(of, k3_pdma_dt_ids);
@@ -810,6 +830,7 @@ static struct dma_chan *k3_of_dma_simple_xlate(struct 
of_phandle_args *dma_spec,
 
 static int k3_dma_probe(struct platform_device *op)
 {
+   const struct k3dma_soc_data *soc_data;
struct k3_dma_dev *d;
const struct of_device_id *of_id;
struct resource *iores;
@@ -823,6 +844,10 @@ static int k3_dma_probe(struct platform_device *op)
if (!d)
return -ENOMEM;
 
+   soc_data = device_get_match_data(>dev);
+   if (!soc_data)
+   return -EINVAL;
+
d->base = devm_ioremap_resource(>dev, iores);
if (IS_ERR(d->base))
return PTR_ERR(d->base);
@@ -835,10 +860,12 @@ static int k3_dma_probe(struct platform_device *op)
"dma-requests", >dma_requests);
}
 
-   d->clk = devm_clk_get(>dev, NULL);
-   if (IS_ERR(d->clk)) {
-   dev_err(>dev, "no dma clk\n");
-   return PTR_ERR(d->clk);
+   if (!(soc_data->flags & K3_FLAG_NOCLK)) {
+   d->clk = devm_clk_get(>dev, NULL);
+   if (IS_ERR(d->clk)) {
+   dev_err(>dev, "no dma clk\n");
+   return PTR_ERR(d->clk);
+   }
}
 
irq = platform_get_irq(op, 0);
-- 
2.7.4



[PATCH 8/8 v3] arm64: dts: hi3660: Fixup unofficial dma-min-chan to hisi-dma-avail-chan

2019-01-10 Thread John Stultz
A undocumented and unimplemented binding got into the hi3660
dtsi, and this switches that binding to the now documented one.

Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: Wei Xu 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: linux-arm-ker...@lists.infradead.org
Cc: devicet...@vger.kernel.org
Signed-off-by: John Stultz 
---
v3: Renamed to hisi-dma-avail-chan
---
 arch/arm64/boot/dts/hisilicon/hi3660.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi 
b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
index 77a7135..472c370a3d 100644
--- a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
+++ b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
@@ -560,7 +560,7 @@
#dma-cells = <1>;
dma-channels = <16>;
dma-requests = <32>;
-   dma-min-chan = <1>;
+   hisi-dma-avail-chan = <0xfffe>;
interrupts = ;
clocks = <_ctrl HI3660_CLK_GATE_DMAC>;
dma-no-cci;
-- 
2.7.4



[PATCH 7/8 v3] arm64: dts: hi3660: Add hisi asp dma device

2019-01-10 Thread John Stultz
From: Youlin Wang 

Add asp-dma device to hi3660 dts

Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: Wei Xu 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: linux-arm-ker...@lists.infradead.org
Cc: devicet...@vger.kernel.org
Acked-by: Manivannan Sadhasivam 
Signed-off-by: Youlin Wang 
Signed-off-by: Tanglei Han 
Signed-off-by: John Stultz 
---
v2: Removed undocumented bindings
---
 arch/arm64/boot/dts/hisilicon/hi3660.dtsi | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi 
b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
index 4c8d682..77a7135 100644
--- a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
+++ b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
@@ -567,6 +567,16 @@
dma-type = "hi3660_dma";
};
 
+   asp_dmac: dma-controller@e804b000 {
+   compatible = "hisilicon,hisi-pcm-asp-dma-1.0";
+   reg = <0x0 0xe804b000 0x0 0x1000>;
+   #dma-cells = <1>;
+   dma-channels = <16>;
+   dma-requests = <32>;
+   interrupts = ;
+   interrupt-names = "asp_dma_irq";
+   };
+
rtc0: rtc@fff04000 {
compatible = "arm,pl031", "arm,primecell";
reg = <0x0 0Xfff04000 0x0 0x1000>;
-- 
2.7.4



[PATCH 2/8 v3] Documentation: bindings: k3dma: Add binding for hisi-dma-avail-chan

2019-01-10 Thread John Stultz
Some dma channels can be reserved for secure mode or other
hardware on the SoC, so provide a binding for a bitmask
listing the available channels for the kernel to use.

Cc: Vinod Koul 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: dmaeng...@vger.kernel.org
Cc: devicet...@vger.kernel.org
Signed-off-by: John Stultz 
---
v3: Renamed to hisi-dma-avail-chan
---
 Documentation/devicetree/bindings/dma/k3dma.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/devicetree/bindings/dma/k3dma.txt 
b/Documentation/devicetree/bindings/dma/k3dma.txt
index 10a2f15..38825d4 100644
--- a/Documentation/devicetree/bindings/dma/k3dma.txt
+++ b/Documentation/devicetree/bindings/dma/k3dma.txt
@@ -14,6 +14,9 @@ Required properties:
have specific request line
 - clocks: clock required
 
+Optional properties:
+- hisi-dma-avail-chan: Bitmask of available physical channels
+
 Example:
 
 Controller:
-- 
2.7.4



[PATCH 5/8 v3] dma: k3dma: Add support for hisi-dma-avail-chan

2019-01-10 Thread John Stultz
From: Li Yu 

Add hisi-dma-avail-chan as a property for k3dma, it defines
available dma channels which a non-secure mode driver can use.

One sample usage of this is in Hi3660 SoC. DMA channel 0 is
reserved to lpm3, which is a coprocessor for power management. So
as a result, any request in kernel (which runs on main processor
and in non-secure mode) should start from at least channel 1.

Cc: Dan Williams 
Cc: Vinod Koul 
Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: Guodong Xu 
Cc: dmaeng...@vger.kernel.org
Signed-off-by: Li Yu 
[jstultz: Reworked to use a channel mask]
Signed-off-by: John Stultz 
---
v3: Rename to hisi-dma-avail-chan
---
 drivers/dma/k3dma.c | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/dma/k3dma.c b/drivers/dma/k3dma.c
index b2060bf..f4001ca 100644
--- a/drivers/dma/k3dma.c
+++ b/drivers/dma/k3dma.c
@@ -111,6 +111,7 @@ struct k3_dma_dev {
struct dma_pool *pool;
u32 dma_channels;
u32 dma_requests;
+   u32 dma_avail_chan;
unsigned intirq;
 };
 
@@ -318,6 +319,9 @@ static void k3_dma_tasklet(unsigned long arg)
/* check new channel request in d->chan_pending */
spin_lock_irq(>lock);
for (pch = 0; pch < d->dma_channels; pch++) {
+   if (!(d->dma_avail_chan & (1<phy[pch];
 
if (p->vchan == NULL && !list_empty(>chan_pending)) {
@@ -335,6 +339,9 @@ static void k3_dma_tasklet(unsigned long arg)
spin_unlock_irq(>lock);
 
for (pch = 0; pch < d->dma_channels; pch++) {
+   if (!(d->dma_avail_chan & (1<phy[pch];
c = p->vchan;
@@ -855,6 +862,13 @@ static int k3_dma_probe(struct platform_device *op)
"dma-channels", >dma_channels);
of_property_read_u32((>dev)->of_node,
"dma-requests", >dma_requests);
+   ret = of_property_read_u32((>dev)->of_node,
+   "hisi-dma-avail-chan", >dma_avail_chan);
+   if (ret) {
+   dev_warn(>dev,
+"hisi-dma-avail-chan doesn't exist, 
considering all as available.\n");
+   d->dma_avail_chan = (u32)~0UL;
+   }
}
 
if (!(soc_data->flags & K3_FLAG_NOCLK)) {
@@ -886,8 +900,12 @@ static int k3_dma_probe(struct platform_device *op)
return -ENOMEM;
 
for (i = 0; i < d->dma_channels; i++) {
-   struct k3_dma_phy *p = >phy[i];
+   struct k3_dma_phy *p;
+
+   if (!(d->dma_avail_chan & (1<phy[i];
p->idx = i;
p->base = d->base + i * 0x40;
}
-- 
2.7.4



[PATCH 0/8 v3] k3dma patches to add support for hi3660/HiKey960

2019-01-10 Thread John Stultz
This patch series is based on recent work by Tanglei Han, and
adds support for hi3660 SoCs as found on the HiKey960 board,
along with a few patches I've been carrying.

Review and feedback would be greatly appreciated!

thanks
-john

Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Dan Williams 
Cc: Vinod Koul 
Cc: Wei Xu 
Cc: Mark Rutland 
Cc: Rob Herring 
Cc: Guodong Xu 
Cc: Manivannan Sadhasivam 
Cc: Ryan Grachek 
CC: devicet...@vger.kernel.org
Cc: dmaeng...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org

John Stultz (3):
  Documentation: bindings: k3dma: Add binding for hisi-dma-avail-chan
  arm64: dts: hi3660: Add dma to uart nodes
  arm64: dts: hi3660: Fixup unofficial dma-min-chan to
hisi-dma-avail-chan

Li Yu (2):
  dma: k3dma: Delete axi_config
  dma: k3dma: Add support for hisi-dma-avail-chan

Youlin Wang (3):
  Documentation: bindings: k3dma: Extend the k3dma driver binding to
support hisi-asp
  dma: k3dma: Upgrade k3dma driver to support hisi_asp_dma hardware
  arm64: dts: hi3660: Add hisi asp dma device

 Documentation/devicetree/bindings/dma/k3dma.txt |  7 ++-
 arch/arm64/boot/dts/hisilicon/hi3660.dtsi   | 20 -
 drivers/dma/k3dma.c | 60 +
 3 files changed, 76 insertions(+), 11 deletions(-)

-- 
2.7.4



Re: [PATCH 3/8 v2] dma: k3dma: Upgrade k3dma driver to support hisi_asp_dma hardware

2019-01-04 Thread John Stultz
On Fri, Jan 4, 2019 at 9:39 PM Manivannan Sadhasivam
 wrote:
>
> On Fri, Jan 04, 2019 at 09:22:46PM -0800, John Stultz wrote:
> > On Fri, Jan 4, 2019 at 7:42 PM Manivannan Sadhasivam
> >  wrote:
> > > On Fri, Jan 04, 2019 at 12:56:23PM -0800, John Stultz wrote:
> > > > From: Youlin Wang 
> > > >
> > > > There is an new "hisi-pcm-asp-dma-1.0" device added in
> > > > "arch/arm64/boot/dts/hisilicon/hi3660.dtsi".
> > > > So we have to add a matching id in the driver file:
> > > >  .compatible = "hisilicon,hisi-pcm-asp-dma-1.0"
> > > >
> > > > And also hisi-pcm-asp dma device needs no setting to the clock.
> > > > So we skip this by adding and using soc data flags.
> > > >
> > > > After above this driver will support both k3 and hisi_asp dma hardware.
> > > >
> > >
> > > Small description about the hardware (ASP DMAC) would be really helpful.
> >
> > I've taken a shot at this (along with integrating your other feedback
> > - thanks again for the review!), though as I don't have direct
> > documentation, my knowledge is a bit second hand.
> >
> > See here:
> > https://git.linaro.org/people/john.stultz/android-dev.git/commit/?h=dev/hikey960-mainline-WIP=754a79facf1af0b59a7f8fd63050da12ebf5521e
> >
> > Let me know if you have suggestions for more specific changes!
>
> Description looks good to me. But looks like you are not protecting the
> other clk APIs like clk_prepare_enable and clk_disable_unprepare. Don't
> they fail when the relevant clk is not found?

No, those calls just return 0 if null clock is passed

int clk_prepare(struct clk *clk)
{
if (!clk)
return 0;
...

thanks
-john


Re: [PATCH 6/8 v2] arm64: dts: hi3660: Add dma to uart nodes

2019-01-04 Thread John Stultz
On Fri, Jan 4, 2019 at 8:34 PM John Stultz  wrote:
>
> On Fri, Jan 4, 2019 at 7:49 PM Manivannan Sadhasivam
>  wrote:
> >
> > Hi John,
> >
> > On Fri, Jan 04, 2019 at 12:56:26PM -0800, John Stultz wrote:
> > > Try to add DMA support to the uart nodes following
> > > the assignments made in the dts from the victoria vendor kernel
> > > here:
> > > https://consumer.huawei.com/en/opensource/detail/?siteCode=worldwide=p10=openSourceSoftware=10=1
> > >
> > > Cc: Tanglei Han 
> > > Cc: Zhuangluan Su 
> > > Cc: Ryan Grachek 
> > > Cc: Manivannan Sadhasivam 
> > > Cc: Wei Xu 
> > > Cc: Rob Herring 
> > > Cc: Mark Rutland 
> > > Cc: linux-arm-ker...@lists.infradead.org
> > > Cc: devicet...@vger.kernel.org
> > > Signed-off-by: John Stultz 
> > > ---
> > >  arch/arm64/boot/dts/hisilicon/hi3660.dtsi | 10 ++
> > >  1 file changed, 10 insertions(+)
> > >
> > > diff --git a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi 
> > > b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
> > > index 20ae40d..aaa2b04 100644
> > > --- a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
> > > +++ b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
> > > @@ -466,6 +466,8 @@
> > >   compatible = "arm,pl011", "arm,primecell";
> > >   reg = <0x0 0xfdf02000 0x0 0x1000>;
> > >   interrupts = ;
> > > + dma-names = "rx", "tx";
> > > + dmas =  < 0  1>;
> >
> > Usage of DMA channel 0 contradicts with the description provided in
> > patch, "dma: k3dma: Add support to dma_avail_chan".
>
> Hrm. Good point.  I'll double check w/ Dr Su on this, I'm not sure if
> that inconsistency is due to the the vendor kernel (where these came
> from) having different reserved channels or just something overlooked
> if the uart0 is not actually being used (as we find on hikey960 as
> well).

Hrm. So it seems like uart0 is mapped to the dma0 chan0, but the
device specific files in the vendor tree overwite the dma values:
serial0: uart@fdf02000 {
pinctrl-names = "default", "idle";
pinctrl-0 = <_pmx_func
_pmx_func _cfg_func _cfg_func>;
pinctrl-1 = <_pmx_idle
_pmx_idle _cfg_idle _cfg_idle>;
dma-names = "", "";
dmas = <>;
clock-rate = <0 1920>;
status = "ok";
};

But I went ahead and pulled the dma values on uart0.

thaks
-john


Re: [PATCH 3/8 v2] dma: k3dma: Upgrade k3dma driver to support hisi_asp_dma hardware

2019-01-04 Thread John Stultz
On Fri, Jan 4, 2019 at 7:42 PM Manivannan Sadhasivam
 wrote:
> On Fri, Jan 04, 2019 at 12:56:23PM -0800, John Stultz wrote:
> > From: Youlin Wang 
> >
> > There is an new "hisi-pcm-asp-dma-1.0" device added in
> > "arch/arm64/boot/dts/hisilicon/hi3660.dtsi".
> > So we have to add a matching id in the driver file:
> >  .compatible = "hisilicon,hisi-pcm-asp-dma-1.0"
> >
> > And also hisi-pcm-asp dma device needs no setting to the clock.
> > So we skip this by adding and using soc data flags.
> >
> > After above this driver will support both k3 and hisi_asp dma hardware.
> >
>
> Small description about the hardware (ASP DMAC) would be really helpful.

I've taken a shot at this (along with integrating your other feedback
- thanks again for the review!), though as I don't have direct
documentation, my knowledge is a bit second hand.

See here:
https://git.linaro.org/people/john.stultz/android-dev.git/commit/?h=dev/hikey960-mainline-WIP=754a79facf1af0b59a7f8fd63050da12ebf5521e

Let me know if you have suggestions for more specific changes!
thanks
-john


Re: [PATCH 2/8 v2] Documentation: bindings: k3dma: Add binding for dma-avail-chan

2019-01-04 Thread John Stultz
On Fri, Jan 4, 2019 at 8:53 PM Manivannan Sadhasivam
 wrote:
>
> On Fri, Jan 04, 2019 at 08:39:34PM -0800, John Stultz wrote:
> > On Fri, Jan 4, 2019 at 8:00 PM Manivannan Sadhasivam
> >  wrote:
> > >
> > > Hi John,
> > >
> > > On Fri, Jan 04, 2019 at 12:56:22PM -0800, John Stultz wrote:
> > > > Some dma channels can be reserved for secure mode or other
> > > > hardware on the SoC, so provide a binding for a bitmask
> > > > listing the available channels for the kernel to use.
> > > >
> > > > Cc: Vinod Koul 
> > > > Cc: Rob Herring 
> > > > Cc: Mark Rutland 
> > > > Cc: Tanglei Han 
> > > > Cc: Zhuangluan Su 
> > > > Cc: Ryan Grachek 
> > > > Cc: Manivannan Sadhasivam 
> > > > Cc: dmaeng...@vger.kernel.org
> > > > Cc: devicet...@vger.kernel.org
> > > > Signed-off-by: John Stultz 
> > > > ---
> > > >  Documentation/devicetree/bindings/dma/k3dma.txt | 3 +++
> > > >  1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/Documentation/devicetree/bindings/dma/k3dma.txt 
> > > > b/Documentation/devicetree/bindings/dma/k3dma.txt
> > > > index 10a2f15..1c466c1 100644
> > > > --- a/Documentation/devicetree/bindings/dma/k3dma.txt
> > > > +++ b/Documentation/devicetree/bindings/dma/k3dma.txt
> > > > @@ -14,6 +14,9 @@ Required properties:
> > > >   have specific request line
> > > >  - clocks: clock required
> > > >
> > > > +Optional properties:
> > > > +- dma-avail-chan: Bitmask of available physical channels
> > > > +
> > >
> > > This property looks too generic. Since this is specific to HiSi SoCs,
> > > this could be "hisi-dma-avail-chan"?
> >
> > I'm fine to change it, but I'm not sure I fully understand the
> > rational. Can you help me understand?
> > Are device node-binding names supposed to have global scope? I assumed
> > the node property names are basically scoped to the entry?
>
> IIUC properties documented in subsystem binding (dma.txt in this case)
> will have global scope. Those which are not documented in this binding
> are specific to vendor IPs and should be prefixed with the vendor
> prefix (hisi in this case).

Thanks I appreciate the explanation here. I hadn't really understood
this point, and really haven't developed much "taste" in what makes a
good or bad binding.

> > Further, having some dma channels be reserved doesn't seem to be too
> > unique a concept, so I'm not sure what we gain long term by prefixing
> > it?
> >
>
> Right, but this brings up the point of having this functionality in
> generic DMA engine so that the DMA controller drivers need not handle.
> So either we should move this available channel check to DMA Engine
> and document the property in dma.txt so that other IPs can also use it
> or keep the functionality in K3 driver and use HiSi prefix for the
> property.
>
> But I'd like to hear Vinod/Rob's opinion on this!

Sure. Though for now I'll prefix it as the logic is handled at the
driver level.

thanks
-john


Re: [PATCH 2/8 v2] Documentation: bindings: k3dma: Add binding for dma-avail-chan

2019-01-04 Thread John Stultz
On Fri, Jan 4, 2019 at 8:00 PM Manivannan Sadhasivam
 wrote:
>
> Hi John,
>
> On Fri, Jan 04, 2019 at 12:56:22PM -0800, John Stultz wrote:
> > Some dma channels can be reserved for secure mode or other
> > hardware on the SoC, so provide a binding for a bitmask
> > listing the available channels for the kernel to use.
> >
> > Cc: Vinod Koul 
> > Cc: Rob Herring 
> > Cc: Mark Rutland 
> > Cc: Tanglei Han 
> > Cc: Zhuangluan Su 
> > Cc: Ryan Grachek 
> > Cc: Manivannan Sadhasivam 
> > Cc: dmaeng...@vger.kernel.org
> > Cc: devicet...@vger.kernel.org
> > Signed-off-by: John Stultz 
> > ---
> >  Documentation/devicetree/bindings/dma/k3dma.txt | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/dma/k3dma.txt 
> > b/Documentation/devicetree/bindings/dma/k3dma.txt
> > index 10a2f15..1c466c1 100644
> > --- a/Documentation/devicetree/bindings/dma/k3dma.txt
> > +++ b/Documentation/devicetree/bindings/dma/k3dma.txt
> > @@ -14,6 +14,9 @@ Required properties:
> >   have specific request line
> >  - clocks: clock required
> >
> > +Optional properties:
> > +- dma-avail-chan: Bitmask of available physical channels
> > +
>
> This property looks too generic. Since this is specific to HiSi SoCs,
> this could be "hisi-dma-avail-chan"?

I'm fine to change it, but I'm not sure I fully understand the
rational. Can you help me understand?
Are device node-binding names supposed to have global scope? I assumed
the node property names are basically scoped to the entry?
Further, having some dma channels be reserved doesn't seem to be too
unique a concept, so I'm not sure what we gain long term by prefixing
it?

thanks
-john


Re: [PATCH 6/8 v2] arm64: dts: hi3660: Add dma to uart nodes

2019-01-04 Thread John Stultz
On Fri, Jan 4, 2019 at 7:49 PM Manivannan Sadhasivam
 wrote:
>
> Hi John,
>
> On Fri, Jan 04, 2019 at 12:56:26PM -0800, John Stultz wrote:
> > Try to add DMA support to the uart nodes following
> > the assignments made in the dts from the victoria vendor kernel
> > here:
> > https://consumer.huawei.com/en/opensource/detail/?siteCode=worldwide=p10=openSourceSoftware=10=1
> >
> > Cc: Tanglei Han 
> > Cc: Zhuangluan Su 
> > Cc: Ryan Grachek 
> > Cc: Manivannan Sadhasivam 
> > Cc: Wei Xu 
> > Cc: Rob Herring 
> > Cc: Mark Rutland 
> > Cc: linux-arm-ker...@lists.infradead.org
> > Cc: devicet...@vger.kernel.org
> > Signed-off-by: John Stultz 
> > ---
> >  arch/arm64/boot/dts/hisilicon/hi3660.dtsi | 10 ++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi 
> > b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
> > index 20ae40d..aaa2b04 100644
> > --- a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
> > +++ b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
> > @@ -466,6 +466,8 @@
> >   compatible = "arm,pl011", "arm,primecell";
> >   reg = <0x0 0xfdf02000 0x0 0x1000>;
> >   interrupts = ;
> > + dma-names = "rx", "tx";
> > + dmas =  < 0  1>;
>
> Usage of DMA channel 0 contradicts with the description provided in
> patch, "dma: k3dma: Add support to dma_avail_chan".

Hrm. Good point.  I'll double check w/ Dr Su on this, I'm not sure if
that inconsistency is due to the the vendor kernel (where these came
from) having different reserved channels or just something overlooked
if the uart0 is not actually being used (as we find on hikey960 as
well).

thanks
-john


Re: [PATCH 0/8 v2] k3dma patches to add support for hi3660/HiKey960

2019-01-04 Thread John Stultz
On Fri, Jan 4, 2019 at 7:37 PM Manivannan Sadhasivam
 wrote:
>
> Hi John,
>
> On Fri, Jan 04, 2019 at 12:56:20PM -0800, John Stultz wrote:
> > This patch series is based on recent work by Tanglei Han, and
> > adds support for hi3660 SoCs as found on the HiKey960 board,
>
> Not sure about the description/subject here! This patchset adds support
> for ASP DMAC found in HI3660 SoCs, Peripheral DMAC is already supported.
>

So yes, it does enable ASP DMAC support, but the full set also enables
the Peripheral DMAC as well in a few ways:
* Avoiding setting the AXI register, which is configured differently on hi3660.
* By adding support for the dma-avail-chan (so we don't try to use
reserved channels)

thanks
-john


[PATCH 1/8 v2] Documentation: bindings: k3dma: Extend the k3dma driver binding to support hisi-asp

2019-01-04 Thread John Stultz
From: Youlin Wang 

Extend the k3dma driver binding to support hisi-asp hardware
variants.

Cc: Vinod Koul 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: Zhuangluan Su 
Cc: Tanglei Han 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: dmaeng...@vger.kernel.org
Cc: devicet...@vger.kernel.org
Signed-off-by: Youlin Wang 
Signed-off-by: Tanglei Han 
Signed-off-by: John Stultz 
---
v2: Simplify patch, removing extranious examples
---
 Documentation/devicetree/bindings/dma/k3dma.txt | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/dma/k3dma.txt 
b/Documentation/devicetree/bindings/dma/k3dma.txt
index 4945aea..10a2f15 100644
--- a/Documentation/devicetree/bindings/dma/k3dma.txt
+++ b/Documentation/devicetree/bindings/dma/k3dma.txt
@@ -3,7 +3,9 @@
 See dma.txt first
 
 Required properties:
-- compatible: Should be "hisilicon,k3-dma-1.0"
+- compatible: Must be one of
+-  "hisilicon,k3-dma-1.0"
+-  "hisilicon,hisi-pcm-asp-dma-1.0"
 - reg: Should contain DMA registers location and length.
 - interrupts: Should contain one interrupt shared by all channel
 - #dma-cells: see dma.txt, should be 1, para number
-- 
2.7.4



[PATCH 3/8 v2] dma: k3dma: Upgrade k3dma driver to support hisi_asp_dma hardware

2019-01-04 Thread John Stultz
From: Youlin Wang 

There is an new "hisi-pcm-asp-dma-1.0" device added in
"arch/arm64/boot/dts/hisilicon/hi3660.dtsi".
So we have to add a matching id in the driver file:
 .compatible = "hisilicon,hisi-pcm-asp-dma-1.0"

And also hisi-pcm-asp dma device needs no setting to the clock.
So we skip this by adding and using soc data flags.

After above this driver will support both k3 and hisi_asp dma hardware.

Cc: Dan Williams 
Cc: Vinod Koul 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: dmaeng...@vger.kernel.org
Signed-off-by: Youlin Wang 
Signed-off-by: Tanglei Han 
[jstultz: Reworked to use of_match_data]
Signed-off-by: John Stultz 
---
v2:
* Reworked to use of_match_data
---
 drivers/dma/k3dma.c | 37 -
 1 file changed, 32 insertions(+), 5 deletions(-)

diff --git a/drivers/dma/k3dma.c b/drivers/dma/k3dma.c
index fdec2b6..df61406 100644
--- a/drivers/dma/k3dma.c
+++ b/drivers/dma/k3dma.c
@@ -116,6 +116,13 @@ struct k3_dma_dev {
unsigned intirq;
 };
 
+
+#define K3_FLAG_NOCLK  (1<<0)
+struct k3dma_soc_data {
+   unsigned long flags;
+};
+
+
 #define to_k3_dma(dmadev) container_of(dmadev, struct k3_dma_dev, slave)
 
 static int k3_dma_config_write(struct dma_chan *chan,
@@ -790,8 +797,21 @@ static int k3_dma_transfer_resume(struct dma_chan *chan)
return 0;
 }
 
+static const struct k3dma_soc_data k3_v1_dma_data = {
+   .flags = 0,
+};
+
+static const struct k3dma_soc_data asp_v1_dma_data = {
+   .flags = K3_FLAG_NOCLK,
+};
+
 static const struct of_device_id k3_pdma_dt_ids[] = {
-   { .compatible = "hisilicon,k3-dma-1.0", },
+   { .compatible = "hisilicon,k3-dma-1.0",
+ .data = _v1_dma_data
+   },
+   { .compatible = "hisilicon,hisi-pcm-asp-dma-1.0",
+ .data = _v1_dma_data
+   },
{}
 };
 MODULE_DEVICE_TABLE(of, k3_pdma_dt_ids);
@@ -810,6 +830,7 @@ static struct dma_chan *k3_of_dma_simple_xlate(struct 
of_phandle_args *dma_spec,
 
 static int k3_dma_probe(struct platform_device *op)
 {
+   const struct k3dma_soc_data *soc_data;
struct k3_dma_dev *d;
const struct of_device_id *of_id;
struct resource *iores;
@@ -823,6 +844,10 @@ static int k3_dma_probe(struct platform_device *op)
if (!d)
return -ENOMEM;
 
+   soc_data = device_get_match_data(>dev);
+   if (!soc_data)
+   return -EINVAL;
+
d->base = devm_ioremap_resource(>dev, iores);
if (IS_ERR(d->base))
return PTR_ERR(d->base);
@@ -835,10 +860,12 @@ static int k3_dma_probe(struct platform_device *op)
"dma-requests", >dma_requests);
}
 
-   d->clk = devm_clk_get(>dev, NULL);
-   if (IS_ERR(d->clk)) {
-   dev_err(>dev, "no dma clk\n");
-   return PTR_ERR(d->clk);
+   if (!(soc_data->flags & K3_FLAG_NOCLK)) {
+   d->clk = devm_clk_get(>dev, NULL);
+   if (IS_ERR(d->clk)) {
+   dev_err(>dev, "no dma clk\n");
+   return PTR_ERR(d->clk);
+   }
}
 
irq = platform_get_irq(op, 0);
-- 
2.7.4



[PATCH 6/8 v2] arm64: dts: hi3660: Add dma to uart nodes

2019-01-04 Thread John Stultz
Try to add DMA support to the uart nodes following
the assignments made in the dts from the victoria vendor kernel
here:
https://consumer.huawei.com/en/opensource/detail/?siteCode=worldwide=p10=openSourceSoftware=10=1

Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: Wei Xu 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: linux-arm-ker...@lists.infradead.org
Cc: devicet...@vger.kernel.org
Signed-off-by: John Stultz 
---
 arch/arm64/boot/dts/hisilicon/hi3660.dtsi | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi 
b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
index 20ae40d..aaa2b04 100644
--- a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
+++ b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
@@ -466,6 +466,8 @@
compatible = "arm,pl011", "arm,primecell";
reg = <0x0 0xfdf02000 0x0 0x1000>;
interrupts = ;
+   dma-names = "rx", "tx";
+   dmas =  < 0  1>;
clocks = <_ctrl HI3660_CLK_MUX_UART0>,
 <_ctrl HI3660_PCLK>;
clock-names = "uartclk", "apb_pclk";
@@ -478,6 +480,8 @@
compatible = "arm,pl011", "arm,primecell";
reg = <0x0 0xfdf0 0x0 0x1000>;
interrupts = ;
+   dma-names = "rx", "tx";
+   dmas =  < 2  3>;
clocks = <_ctrl HI3660_CLK_GATE_UART1>,
 <_ctrl HI3660_CLK_GATE_UART1>;
clock-names = "uartclk", "apb_pclk";
@@ -490,6 +494,8 @@
compatible = "arm,pl011", "arm,primecell";
reg = <0x0 0xfdf03000 0x0 0x1000>;
interrupts = ;
+   dma-names = "rx", "tx";
+   dmas =  < 4  5>;
clocks = <_ctrl HI3660_CLK_GATE_UART2>,
 <_ctrl HI3660_PCLK>;
clock-names = "uartclk", "apb_pclk";
@@ -514,6 +520,8 @@
compatible = "arm,pl011", "arm,primecell";
reg = <0x0 0xfdf01000 0x0 0x1000>;
interrupts = ;
+   dma-names = "rx", "tx";
+   dmas =  < 6  7>;
clocks = <_ctrl HI3660_CLK_GATE_UART4>,
 <_ctrl HI3660_CLK_GATE_UART4>;
clock-names = "uartclk", "apb_pclk";
@@ -526,6 +534,8 @@
compatible = "arm,pl011", "arm,primecell";
reg = <0x0 0xfdf05000 0x0 0x1000>;
interrupts = ;
+   dma-names = "rx", "tx";
+   dmas =  < 8  9>;
clocks = <_ctrl HI3660_CLK_GATE_UART5>,
 <_ctrl HI3660_CLK_GATE_UART5>;
clock-names = "uartclk", "apb_pclk";
-- 
2.7.4



[PATCH 5/8 v2] dma: k3dma: Add support to dma_avail_chan

2019-01-04 Thread John Stultz
From: Li Yu 

dma_avail_chan as a property for k3dma, it defines available dma
channels which a non-secure mode driver can use.

One sample usage of this is in Hi3660 SoC. DMA channel 0 is
reserved to lpm3, which is a coprocessor for power management. So
as a result, any request in kernel (which runs on main processor
and in non-secure mode) should start from at least channel 1.

Cc: Dan Williams 
Cc: Vinod Koul 
Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: Guodong Xu 
Cc: dmaeng...@vger.kernel.org
Signed-off-by: Li Yu 
[jstultz: Reworked to use a channel mask]
Signed-off-by: John Stultz 
---
 drivers/dma/k3dma.c | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/dma/k3dma.c b/drivers/dma/k3dma.c
index b2060bf..431094b 100644
--- a/drivers/dma/k3dma.c
+++ b/drivers/dma/k3dma.c
@@ -111,6 +111,7 @@ struct k3_dma_dev {
struct dma_pool *pool;
u32 dma_channels;
u32 dma_requests;
+   u32 dma_avail_chan;
unsigned intirq;
 };
 
@@ -318,6 +319,9 @@ static void k3_dma_tasklet(unsigned long arg)
/* check new channel request in d->chan_pending */
spin_lock_irq(>lock);
for (pch = 0; pch < d->dma_channels; pch++) {
+   if (!(d->dma_avail_chan & (1<phy[pch];
 
if (p->vchan == NULL && !list_empty(>chan_pending)) {
@@ -335,6 +339,9 @@ static void k3_dma_tasklet(unsigned long arg)
spin_unlock_irq(>lock);
 
for (pch = 0; pch < d->dma_channels; pch++) {
+   if (!(d->dma_avail_chan & (1<phy[pch];
c = p->vchan;
@@ -855,6 +862,13 @@ static int k3_dma_probe(struct platform_device *op)
"dma-channels", >dma_channels);
of_property_read_u32((>dev)->of_node,
"dma-requests", >dma_requests);
+   ret = of_property_read_u32((>dev)->of_node,
+   "dma-avail-chan", >dma_avail_chan);
+   if (ret) {
+   dev_warn(>dev,
+"dma-avail-chan doesn't exist, considering all 
as available.\n");
+   d->dma_avail_chan = (u32)~0UL;
+   }
}
 
if (!(soc_data->flags & K3_FLAG_NOCLK)) {
@@ -886,8 +900,12 @@ static int k3_dma_probe(struct platform_device *op)
return -ENOMEM;
 
for (i = 0; i < d->dma_channels; i++) {
-   struct k3_dma_phy *p = >phy[i];
+   struct k3_dma_phy *p;
+
+   if (!(d->dma_avail_chan & (1<phy[i];
p->idx = i;
p->base = d->base + i * 0x40;
}
-- 
2.7.4



[PATCH 2/8 v2] Documentation: bindings: k3dma: Add binding for dma-avail-chan

2019-01-04 Thread John Stultz
Some dma channels can be reserved for secure mode or other
hardware on the SoC, so provide a binding for a bitmask
listing the available channels for the kernel to use.

Cc: Vinod Koul 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: dmaeng...@vger.kernel.org
Cc: devicet...@vger.kernel.org
Signed-off-by: John Stultz 
---
 Documentation/devicetree/bindings/dma/k3dma.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/devicetree/bindings/dma/k3dma.txt 
b/Documentation/devicetree/bindings/dma/k3dma.txt
index 10a2f15..1c466c1 100644
--- a/Documentation/devicetree/bindings/dma/k3dma.txt
+++ b/Documentation/devicetree/bindings/dma/k3dma.txt
@@ -14,6 +14,9 @@ Required properties:
have specific request line
 - clocks: clock required
 
+Optional properties:
+- dma-avail-chan: Bitmask of available physical channels
+
 Example:
 
 Controller:
-- 
2.7.4



[PATCH 7/8 v2] arm64: dts: hi3660: Add hisi asp dma device

2019-01-04 Thread John Stultz
From: Youlin Wang 

Add asp-dma device to hi3660 dts

Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: Wei Xu 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: linux-arm-ker...@lists.infradead.org
Cc: devicet...@vger.kernel.org
Signed-off-by: Youlin Wang 
Signed-off-by: Tanglei Han 
Signed-off-by: John Stultz 
---
v2: Removed undocumented bindings
---
 arch/arm64/boot/dts/hisilicon/hi3660.dtsi | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi 
b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
index aaa2b04..df4e96d 100644
--- a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
+++ b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
@@ -569,6 +569,16 @@
dma-type = "hi3660_dma";
};
 
+   asp_dmac: dma-controller@e804b000 {
+   compatible = "hisilicon,hisi-pcm-asp-dma-1.0";
+   reg = <0x0 0xe804b000 0x0 0x1000>;
+   #dma-cells = <1>;
+   dma-channels = <16>;
+   dma-requests = <32>;
+   interrupts = ;
+   interrupt-names = "asp_dma_irq";
+   };
+
rtc0: rtc@fff04000 {
compatible = "arm,pl031", "arm,primecell";
reg = <0x0 0Xfff04000 0x0 0x1000>;
-- 
2.7.4



[PATCH 8/8 v2] arm64: dts: hi3660: Fixup unofficial dma-min-chan to dma-avail-chan

2019-01-04 Thread John Stultz
A undocumented and unimplemented binding got into the hi3660 dtsi,
and this switches that binding to the now documented one.

Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: Wei Xu 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: linux-arm-ker...@lists.infradead.org
Cc: devicet...@vger.kernel.org
Signed-off-by: John Stultz 
---
 arch/arm64/boot/dts/hisilicon/hi3660.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi 
b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
index df4e96d..654da63 100644
--- a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
+++ b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
@@ -562,7 +562,7 @@
#dma-cells = <1>;
dma-channels = <16>;
dma-requests = <32>;
-   dma-min-chan = <1>;
+   dma-avail-chan = <0xfffe>;
interrupts = ;
clocks = <_ctrl HI3660_CLK_GATE_DMAC>;
dma-no-cci;
-- 
2.7.4



[PATCH 4/8 v2] dma: k3dma: Delete axi_config

2019-01-04 Thread John Stultz
From: Li Yu 

Axi_config controls whether DMA resources can be accessed in non-secure
mode, such as linux kernel. The register should be set by the bootloader
stage and depends on the device.

Thus, this patch removes axi_config from k3dma driver.

Cc: Dan Williams 
Cc: Vinod Koul 
Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Ryan Grachek 
Cc: Manivannan Sadhasivam 
Cc: dmaeng...@vger.kernel.org
Signed-off-by: Li Yu 
Signed-off-by: Guodong Xu 
[jstultz: Minor tweaks to commit message]
Signed-off-by: John Stultz 
---
 drivers/dma/k3dma.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/dma/k3dma.c b/drivers/dma/k3dma.c
index df61406..b2060bf 100644
--- a/drivers/dma/k3dma.c
+++ b/drivers/dma/k3dma.c
@@ -52,8 +52,6 @@
 #define CX_SRC 0x814
 #define CX_DST 0x818
 #define CX_CFG 0x81c
-#define AXI_CFG0x820
-#define AXI_CFG_DEFAULT0x201201
 
 #define CX_LLI_CHAIN_EN0x2
 #define CX_CFG_EN  0x1
@@ -168,7 +166,6 @@ static void k3_dma_set_desc(struct k3_dma_phy *phy, struct 
k3_desc_hw *hw)
writel_relaxed(hw->count, phy->base + CX_CNT0);
writel_relaxed(hw->saddr, phy->base + CX_SRC);
writel_relaxed(hw->daddr, phy->base + CX_DST);
-   writel_relaxed(AXI_CFG_DEFAULT, phy->base + AXI_CFG);
writel_relaxed(hw->config, phy->base + CX_CFG);
 }
 
-- 
2.7.4



[PATCH 0/8 v2] k3dma patches to add support for hi3660/HiKey960

2019-01-04 Thread John Stultz
This patch series is based on recent work by Tanglei Han, and
adds support for hi3660 SoCs as found on the HiKey960 board,
along with a few patches I've been carrying.

Review and feedback would be greatly appreciated!

thanks
-john

Cc: Tanglei Han 
Cc: Zhuangluan Su 
Cc: Dan Williams 
Cc: Vinod Koul 
Cc: Wei Xu 
Cc: Mark Rutland 
Cc: Rob Herring 
Cc: Guodong Xu 
Cc: Manivannan Sadhasivam 
Cc: Ryan Grachek 
CC: devicet...@vger.kernel.org
Cc: dmaeng...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org

John Stultz (3):
  Documentation: bindings: k3dma: Add binding for dma-avail-chan
  arm64: dts: hi3660: Add dma to uart nodes
  arm64: dts: hi3660: Fixup unofficial dma-min-chan to dma-avail-chan

Li Yu (2):
  dma: k3dma: Delete axi_config
  dma: k3dma: Add support to dma_avail_chan

Youlin Wang (3):
  Documentation: bindings: k3dma: Extend the k3dma driver binding to
support hisi-asp
  dma: k3dma: Upgrade k3dma driver to support hisi_asp_dma hardware
  arm64: dts: hi3660: Add hisi asp dma device

 Documentation/devicetree/bindings/dma/k3dma.txt |  7 ++-
 arch/arm64/boot/dts/hisilicon/hi3660.dtsi   | 22 -
 drivers/dma/k3dma.c | 60 +
 3 files changed, 78 insertions(+), 11 deletions(-)

-- 
2.7.4



Re: [PATCH 1/3] k3dma: Upgrade k3dma drever to support hisi_asp_dma hardware

2019-01-04 Thread John Stultz
On Fri, Jan 4, 2019 at 9:25 AM Vinod Koul  wrote:
>
> On 28-12-18, 14:36, h00249924 wrote:
> > From: Youlin Wang 
> >
> > There is an new "hisi-pcm-asp-dma-1.0" device added in
> > "arch/arm64/boot/dts/hisilicon/hi3660.dtsi".
> > So we have to add a matching id in the driver file:
> > "{ .compatible = "hisilicon,hisi-pcm-asp-dma-1.0", }"
> >
> > And also hisi-pcm-asp dma device needs no setting to the clock.
> > So we skip this by "if" sentence on id string matching:
> > "if (strcasecmp((of_id->compatible), (k3_pdma_dt_ids[0].compatible)) == 0)"
> >
> > After above this driver will support both k3 and hisi_asp dma hardware.
> >
> > Signed-off-by: Youlin Wang 
> > Signed-off-by: Tanglei Han 
> > Cc: Dan Williams 
> > Cc: Vinod Koul 
> > ---
> >  drivers/dma/k3dma.c | 11 +++
> >  1 file changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/dma/k3dma.c b/drivers/dma/k3dma.c
> > index fdec2b6..10eecc2 100644
> > --- a/drivers/dma/k3dma.c
> > +++ b/drivers/dma/k3dma.c
> > @@ -792,6 +792,7 @@ static int k3_dma_transfer_resume(struct dma_chan *chan)
> >
> >  static const struct of_device_id k3_pdma_dt_ids[] = {
> >   { .compatible = "hisilicon,k3-dma-1.0", },
> > + { .compatible = "hisilicon,hisi-pcm-asp-dma-1.0", },
>
> The binding doc patch should precede this..
>
> >   {}
> >  };
> >  MODULE_DEVICE_TABLE(of, k3_pdma_dt_ids);
> > @@ -835,10 +836,12 @@ static int k3_dma_probe(struct platform_device *op)
> >   "dma-requests", >dma_requests);
> >   }
> >
> > - d->clk = devm_clk_get(>dev, NULL);
> > - if (IS_ERR(d->clk)) {
> > - dev_err(>dev, "no dma clk\n");
> > - return PTR_ERR(d->clk);
> > + if (strcasecmp((of_id->compatible), (k3_pdma_dt_ids[0].compatible)) 
> > == 0) {
> > + d->clk = devm_clk_get(>dev, NULL);
>
> who provides clk in this case? how does this scale if you have another
> compatible in future for newer version of controller?

Yea, I've pushed a few times in internal review to use match_data() here.

Tanglei Han: If its ok, I've spent some time integrating these changes
with some other pending k3dma changes. So I'll rework and integrate
some of the review feedback on this and resend it so we can get this
moving a little faster.

I'll leave the i2s feedback to you, so please continue reworking and
resubmitting those, but I'll handle upstreaming the k3dma changes
(keeping you on cc of course).

thanks
-john


[PATCH] arm64: dts: hikey: Add DMA entries for Bluetooth UART

2019-01-03 Thread John Stultz
Add dma0 references for bluetooth uart to enable dma
for bt transfers.

Cc: Manivannan Sadhasivam 
Cc: Ryan Grachek 
Cc: Wei Xu 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: linux-arm-ker...@lists.infradead.org
Cc: devicet...@vger.kernel.org
Signed-off-by: John Stultz 
Change-Id: I85d36cc492a5f392cde214b144b19469e951585a
---
 arch/arm64/boot/dts/hisilicon/hi6220.dtsi | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/boot/dts/hisilicon/hi6220.dtsi 
b/arch/arm64/boot/dts/hisilicon/hi6220.dtsi
index aec9e37..fa15a08 100644
--- a/arch/arm64/boot/dts/hisilicon/hi6220.dtsi
+++ b/arch/arm64/boot/dts/hisilicon/hi6220.dtsi
@@ -319,6 +319,8 @@
clock-names = "uartclk", "apb_pclk";
pinctrl-names = "default";
pinctrl-0 = <_pmx_func _cfg_func1 
_cfg_func2>;
+   dmas = < 8  9>;
+   dma-names = "rx", "tx";
status = "disabled";
};
 
-- 
2.7.4



Re: [PATCH] arm64: dts: hikey: Give wifi some time after power-on

2019-01-02 Thread John Stultz
Adding a few folks to cc from the thread here:
https://patchwork.kernel.org/patch/10734021/

As this sounds like a very similar issue.
thanks
-john

On Sun, Dec 30, 2018 at 3:38 AM Jan Kiszka  wrote:
>
> From: Jan Kiszka 
>
> Somewhere along recent changes to power control of the wl1835, power-on
> became very unreliable on the hikey, failing like this:
>
> wl1271_sdio: probe of mmc2:0001:1 failed with error -16
> wl1271_sdio: probe of mmc2:0001:2 failed with error -16
>
> After playing with some dt parameters and comparing to other users of
> this chip, it turned out we need some power-on delay to make things
> stable again. In contrast to those other users which define 200 ms, the
> hikey is already very happy with 1 ms.
>
> Signed-off-by: Jan Kiszka 
> ---
>
> Tested with 4.19 and linus/master.
>
>   arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts
> b/arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts
> index f4964bee6a1a..1e771bf201b9 100644
> --- a/arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts
> +++ b/arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts
> @@ -118,6 +118,7 @@
> reset-gpios = < 5 GPIO_ACTIVE_LOW>;
> clocks = <>;
> clock-names = "ext_clock";
> +   post-power-on-delay-ms = <1>;
> power-off-delay-us = <10>;
> };
>
> --
> 2.16.4


Re: [PATCH 2/3] dmaengine: Extend the k3dma driver binding

2019-01-02 Thread John Stultz
On Thu, Dec 27, 2018 at 10:39 PM h00249924  wrote:
>
> From: Youlin Wang 
>
> Extend the k3dma driver binding to support hisi-asp hardware variants.
>
> Signed-off-by: Youlin Wang 
> Signed-off-by: Tanglei Han 
> Cc: Vinod Koul 
> Cc: Rob Herring 
> Cc: Mark Rutland 
> ---
>  Documentation/devicetree/bindings/dma/k3dma.txt | 33 
> -
>  1 file changed, 32 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/devicetree/bindings/dma/k3dma.txt 
> b/Documentation/devicetree/bindings/dma/k3dma.txt
> index 4945aea..cd21b82 100644
> --- a/Documentation/devicetree/bindings/dma/k3dma.txt
> +++ b/Documentation/devicetree/bindings/dma/k3dma.txt
> @@ -3,7 +3,9 @@
>  See dma.txt first
>
>  Required properties:
> -- compatible: Should be "hisilicon,k3-dma-1.0"
> +- compatible: Must be one of
> +-  "hisilicon,k3-dma-1.0"
> +-  "hisilicon,hisi-pcm-asp-dma-1.0"
>  - reg: Should contain DMA registers location and length.
>  - interrupts: Should contain one interrupt shared by all channel
>  - #dma-cells: see dma.txt, should be 1, para number
> @@ -43,3 +45,32 @@ For example, i2c0 read channel request line is 18, while 
> write channel use 19
> dma-names = "rx", "tx";
> };
>
> +
> +
> +
> +Controller:
> +   asp_dmac: asp_dmac@E804B000 {
> +   compatible = "hisilicon,hisi-pcm-asp-dma-1.0";
> +   reg = <0x0 0xe804b000 0x0 0x1000>;
> +   #dma-cells = <1>;
> +   dma-channels = <16>;
> +   dma-requests = <32>;
> +   dma-min-chan = <0>;
> +   dma-used-chans = <0xFFFE>;
> +   dma-share;

Thanks for sending this out!

So min-chan, used-chans and dma-share aren't in the existing binding
document. So they probably should be removed from this example, or the
binding document needs to be updated first.

thanks
-john


[RFC][PATCH] arm64: dts: hikey: Add DMA entries for Bluetooth UART

2018-12-11 Thread John Stultz
Add dma0 references for bluetooth uart to enable dma
for bt transfers.

Cc: Manivannan Sadhasivam 
Cc: Ryan Grachek 
Cc: Wei Xu 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: linux-arm-ker...@lists.infradead.org
Cc: devicet...@vger.kernel.org
Signed-off-by: John Stultz 
Change-Id: I85d36cc492a5f392cde214b144b19469e951585a
---
 arch/arm64/boot/dts/hisilicon/hi6220.dtsi | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/boot/dts/hisilicon/hi6220.dtsi 
b/arch/arm64/boot/dts/hisilicon/hi6220.dtsi
index 97d5bf2..c3b17dd 100644
--- a/arch/arm64/boot/dts/hisilicon/hi6220.dtsi
+++ b/arch/arm64/boot/dts/hisilicon/hi6220.dtsi
@@ -319,6 +319,8 @@
clock-names = "uartclk", "apb_pclk";
pinctrl-names = "default";
pinctrl-0 = <_pmx_func _cfg_func1 
_cfg_func2>;
+   dmas = < 8  9>;
+   dma-names = "rx", "tx";
status = "disabled";
};
 
-- 
2.7.4



Re: [PATCH 8/8] timekeeping: remove obsolete time accessors

2018-12-07 Thread John Stultz
On Fri, Dec 7, 2018 at 5:49 AM Arnd Bergmann  wrote:
>
> There are no more remaining users of these deprecated wrappers, so
> let's remove them before new users have a chance to make it in.
>
> See Documentation/core-api/timekeeping.rst for replacements when
> porting old drivers that contain calls to this function.
>
> Signed-off-by: Arnd Bergmann 
> ---
>  include/linux/timekeeping.h   | 14 --
>  include/linux/timekeeping32.h |  9 -
>  2 files changed, 23 deletions(-)

Acked-by: John Stultz 


Re: [PATCH 5/8] timekeeping: remove unused {read,update}_persistent_clock

2018-12-07 Thread John Stultz
On Fri, Dec 7, 2018 at 5:49 AM Arnd Bergmann  wrote:
>
> After arch/sh has removed the last reference to these functions,
> we can remove them completely and just rely on the 64-bit time_t
> based versions. This cleans up a rather ugly use of __weak
> functions.
>
> Signed-off-by: Arnd Bergmann 
> ---
>  include/linux/timekeeping32.h |  6 --
>  kernel/time/ntp.c | 10 +-
>  kernel/time/timekeeping.c | 12 ++--
>  3 files changed, 3 insertions(+), 25 deletions(-)

Acked-by: John Stultz 

thanks
-john


Re: [PATCH 6/8] timekeeping: remove timespec_add/timespec_del

2018-12-07 Thread John Stultz
On Fri, Dec 7, 2018 at 5:49 AM Arnd Bergmann  wrote:
>
> The last users were removed a while ago since everyone moved to ktime_t,
> so we can remove the two unused interfaces for old timespec structures.
>
> With those two gone, set_normalized_timespec() is also unused, so
> remove that as well.
>
> Signed-off-by: Arnd Bergmann 
> ---
>  include/linux/time32.h | 25 -
>  kernel/time/time.c | 36 
>  2 files changed, 61 deletions(-)

Acked-by: John Stultz 


Re: [PATCH] timers: Make the lower-level timer function first call than higher-level

2018-11-19 Thread John Stultz
On Mon, Nov 19, 2018 at 6:10 AM, Muchun Song  wrote:
> The elements of the heads array are a linked list of timer events that
> expire at the current time. And it can contain up to LVL_DEPTH levels
> and the lower the level represents the smaller the time granularity.
>
> Now the result is that the function, which will be called when the timer
> expires, in the higher-level is called first than the lower-level function.
> I think it might be better to call the lower-level timer function first
> than the higher-level function. Because the lower-level has the smaller
> granularity and delay has less impact on higher-level. So fix it.

Interesting.

Do you have any specific examples of where this was helpful?  Maybe
data on how much this helped the case your concerned about?

thanks
-john


Re: [PATCH] timers: Make the lower-level timer function first call than higher-level

2018-11-19 Thread John Stultz
On Mon, Nov 19, 2018 at 6:10 AM, Muchun Song  wrote:
> The elements of the heads array are a linked list of timer events that
> expire at the current time. And it can contain up to LVL_DEPTH levels
> and the lower the level represents the smaller the time granularity.
>
> Now the result is that the function, which will be called when the timer
> expires, in the higher-level is called first than the lower-level function.
> I think it might be better to call the lower-level timer function first
> than the higher-level function. Because the lower-level has the smaller
> granularity and delay has less impact on higher-level. So fix it.

Interesting.

Do you have any specific examples of where this was helpful?  Maybe
data on how much this helped the case your concerned about?

thanks
-john


Re: [PATCH] softirq: don't push timer softirq handling to ksoftirqd

2018-11-15 Thread John Stultz
On Thu, Nov 15, 2018 at 9:07 AM, Michael Zhivich  wrote:
> Require TIMER_SOFTIRQ to be handled immediately instead of delaying until
> ksoftirqd runs, thus preventing problems with reading clocksources that
> wrap often (e.g. acpi_pm).
>
> If acpi_pm is used as the clocksource watchdog, and machine is under heavy
> load, the time period for the watchdog check may be significantly longer
> than the requested 0.5 seconds.  If the watchdog check is delayed by 2
> seconds (observed behavior), then acpi_pm time delta will be
>
> 2.5 sec * 3579545 ticks/sec = 8948863 = 0x888c3f
>
> which will be treated as negative (since acpi_pm is only 24-bits wide) and
> truncated to 0.  This behavior will cause tsc to be incorrectly declared
> unstable in clocksource_watchdog(), as it no longer agrees with acpi_pm.
> If the clocksource watchdog check is delayed by more than 4.7 sec, then the
> acpi_pm clocksource will wrap altogether and produce incorrect time delta.
>
> The likely cause of this delay is that timer interrupts are serviced in
> ksoftirqd when the machine is very busy.
>
> Per Linus' comment in commit 3c53776e29f8 ("Mark HI and TASKLET softirq
> synchronous"):
>...
>We should probably also consider the timer softirqs to be synchronous
>and not be delayed to ksoftirqd (since they were the issue with the
>earlier watchdog problems), but that should be done as a separate patch.
>...
>
> Signed-off-by: Michael Zhivich 
> ---
>  kernel/softirq.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index d28813306b2c..6d517ce0fba8 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -82,7 +82,8 @@ static void wakeup_softirqd(void)
>   * right now. Let ksoftirqd handle this at its own rate, to get fairness,
>   * unless we're doing some of the synchronous softirqs.
>   */
> -#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
> +#define SOFTIRQ_NOW_MASK \
> +   ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ) | (1 << TIMER_SOFTIRQ))
>  static bool ksoftirqd_running(unsigned long pending)
>  {
> struct task_struct *tsk = __this_cpu_read(ksoftirqd);

Thanks so much for sending this along! Sorry I didn't get back to your
mail earlier this week, I've been at Plumbers.

So while this does try to attack the reliability issue w/ the
clocksource watchdog being delayed, I worry this will have to many
side-effects elsewhere.

Would a more focused fix be to move the clocksource watchdog from a
normal timer to a hrtimer?

thanks
-john


Re: [PATCH] softirq: don't push timer softirq handling to ksoftirqd

2018-11-15 Thread John Stultz
On Thu, Nov 15, 2018 at 9:07 AM, Michael Zhivich  wrote:
> Require TIMER_SOFTIRQ to be handled immediately instead of delaying until
> ksoftirqd runs, thus preventing problems with reading clocksources that
> wrap often (e.g. acpi_pm).
>
> If acpi_pm is used as the clocksource watchdog, and machine is under heavy
> load, the time period for the watchdog check may be significantly longer
> than the requested 0.5 seconds.  If the watchdog check is delayed by 2
> seconds (observed behavior), then acpi_pm time delta will be
>
> 2.5 sec * 3579545 ticks/sec = 8948863 = 0x888c3f
>
> which will be treated as negative (since acpi_pm is only 24-bits wide) and
> truncated to 0.  This behavior will cause tsc to be incorrectly declared
> unstable in clocksource_watchdog(), as it no longer agrees with acpi_pm.
> If the clocksource watchdog check is delayed by more than 4.7 sec, then the
> acpi_pm clocksource will wrap altogether and produce incorrect time delta.
>
> The likely cause of this delay is that timer interrupts are serviced in
> ksoftirqd when the machine is very busy.
>
> Per Linus' comment in commit 3c53776e29f8 ("Mark HI and TASKLET softirq
> synchronous"):
>...
>We should probably also consider the timer softirqs to be synchronous
>and not be delayed to ksoftirqd (since they were the issue with the
>earlier watchdog problems), but that should be done as a separate patch.
>...
>
> Signed-off-by: Michael Zhivich 
> ---
>  kernel/softirq.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index d28813306b2c..6d517ce0fba8 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -82,7 +82,8 @@ static void wakeup_softirqd(void)
>   * right now. Let ksoftirqd handle this at its own rate, to get fairness,
>   * unless we're doing some of the synchronous softirqs.
>   */
> -#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
> +#define SOFTIRQ_NOW_MASK \
> +   ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ) | (1 << TIMER_SOFTIRQ))
>  static bool ksoftirqd_running(unsigned long pending)
>  {
> struct task_struct *tsk = __this_cpu_read(ksoftirqd);

Thanks so much for sending this along! Sorry I didn't get back to your
mail earlier this week, I've been at Plumbers.

So while this does try to attack the reliability issue w/ the
clocksource watchdog being delayed, I worry this will have to many
side-effects elsewhere.

Would a more focused fix be to move the clocksource watchdog from a
normal timer to a hrtimer?

thanks
-john


Re: [PATCH] Revert "clocksource: Make clocksource validation work for all clocksources"

2018-11-12 Thread John Stultz
On Mon, Nov 12, 2018 at 10:56 AM, Michael Zhivich  wrote:
> Revert commit 1f45f1f33c8c ("clocksource: Make clocksource validation work
> for all clocksources") to restore correct clocksource_delta() computation
> for clocksources that wrap frequently, while retaining the check for tsc
> drifting.
>
> Truncating result of clocksource_delta() to 0 causes incorrect behavior for
> clocksources that wrap frequently (e.g. acpi_pm which is only 24-bit wide).
> In particular, large time deltas (e.g. last = 0x00, now = 0x80)
> will be incorrectly computed as 0.
>
> If acpi_pm is used as the clocksource watchdog, and machine is under heavy
> load, the time period for the watchdog check may be significantly longer
> than the requested 0.5 seconds.  If the watchdog check is delayed by 2
> seconds (observed behavior), then acpi_pm time delta will be
>
> 2.5 sec * 3579545 ticks/sec = 8948863 = 0x888c3f
>
> which will be treated as negative and truncated to 0.  This behavior will
> cause tsc to be incorrectly declared unstable in clocksource_watchdog(), as
> it no longer agrees with acpi_pm.

Thanks for raising this issue and submitting the patch!

Yea, this is a concern particularly with quick wrapping clocksources.
Though I worry that if you're already blocking the watchdog from
running for 2.5 seconds, you're likely to also block the watchdog for
more then 5 seconds, which if I'm remembering would result in the same
problem?  In other words, does this really solve the problem, or does
it just push the bar a little further out?

So, I'm wondering to really fix this, do we need to find some way to
raise the priority of the clocksource watchdog, so it isn't deferred
for quite so long?

thanks
-john


Re: [PATCH] Revert "clocksource: Make clocksource validation work for all clocksources"

2018-11-12 Thread John Stultz
On Mon, Nov 12, 2018 at 10:56 AM, Michael Zhivich  wrote:
> Revert commit 1f45f1f33c8c ("clocksource: Make clocksource validation work
> for all clocksources") to restore correct clocksource_delta() computation
> for clocksources that wrap frequently, while retaining the check for tsc
> drifting.
>
> Truncating result of clocksource_delta() to 0 causes incorrect behavior for
> clocksources that wrap frequently (e.g. acpi_pm which is only 24-bit wide).
> In particular, large time deltas (e.g. last = 0x00, now = 0x80)
> will be incorrectly computed as 0.
>
> If acpi_pm is used as the clocksource watchdog, and machine is under heavy
> load, the time period for the watchdog check may be significantly longer
> than the requested 0.5 seconds.  If the watchdog check is delayed by 2
> seconds (observed behavior), then acpi_pm time delta will be
>
> 2.5 sec * 3579545 ticks/sec = 8948863 = 0x888c3f
>
> which will be treated as negative and truncated to 0.  This behavior will
> cause tsc to be incorrectly declared unstable in clocksource_watchdog(), as
> it no longer agrees with acpi_pm.

Thanks for raising this issue and submitting the patch!

Yea, this is a concern particularly with quick wrapping clocksources.
Though I worry that if you're already blocking the watchdog from
running for 2.5 seconds, you're likely to also block the watchdog for
more then 5 seconds, which if I'm remembering would result in the same
problem?  In other words, does this really solve the problem, or does
it just push the bar a little further out?

So, I'm wondering to really fix this, do we need to find some way to
raise the priority of the clocksource watchdog, so it isn't deferred
for quite so long?

thanks
-john


Re: TSC to Mono-raw Drift

2018-11-01 Thread John Stultz
On Thu, Nov 1, 2018 at 10:44 AM, Thomas Gleixner  wrote:
> On Tue, 23 Oct 2018, John Stultz wrote:
>> On Fri, Oct 19, 2018 at 3:36 PM, John Stultz  wrote:
>> I spent a little bit of time thinking this out. Unfortunately I don't
>> think its a simple matter of calculating the granularity error on the
>> raw clock and adding it in each interval. The other trouble spot is
>> that the adjusted clocks (monotonic/realtime) are adjusted off of that
>> raw clock. So they would need to have that error added as well,
>> otherwise the raw and a otherwise non-adjusted monotonic clock would
>> drift.
>>
>> However, to be correct, the ntp adjustments made would have to be made
>> to both the base interval + error, which mucks the math up a fair bit.
>
> Hmm, confused as usual. Why would you need to do anything like that?

Because the NTP adjustment is done off of what is now the raw clock.
If the raw clock is "corrected" the ppb adjustment has to be done off
of that corrected rate.

Otherwise with no correction, the raw clock and the monotonic clock
would drift apart.

thanks
-john


Re: TSC to Mono-raw Drift

2018-11-01 Thread John Stultz
On Thu, Nov 1, 2018 at 10:44 AM, Thomas Gleixner  wrote:
> On Tue, 23 Oct 2018, John Stultz wrote:
>> On Fri, Oct 19, 2018 at 3:36 PM, John Stultz  wrote:
>> I spent a little bit of time thinking this out. Unfortunately I don't
>> think its a simple matter of calculating the granularity error on the
>> raw clock and adding it in each interval. The other trouble spot is
>> that the adjusted clocks (monotonic/realtime) are adjusted off of that
>> raw clock. So they would need to have that error added as well,
>> otherwise the raw and a otherwise non-adjusted monotonic clock would
>> drift.
>>
>> However, to be correct, the ntp adjustments made would have to be made
>> to both the base interval + error, which mucks the math up a fair bit.
>
> Hmm, confused as usual. Why would you need to do anything like that?

Because the NTP adjustment is done off of what is now the raw clock.
If the raw clock is "corrected" the ppb adjustment has to be done off
of that corrected rate.

Otherwise with no correction, the raw clock and the monotonic clock
would drift apart.

thanks
-john


Re: [patch 0/9] time: Add SPDX identifiers and cleanup boilerplates

2018-10-31 Thread John Stultz
On Wed, Oct 31, 2018 at 11:21 AM, Thomas Gleixner  wrote:
> Add SPDX identifiers to all files in kernel/time and remove the license
> boiler plates.
>
> Aside of that use the chance to get rid of (stale) file references and tidy
> up the top of file comments as they are touched anyway by this work.
>
> This work is based on a script and data from Philippe Ombredanne, Kate
> Stewart and myself. The data has been created with two independent license
> scanners and manual inspection.


Looks ok by me.

Acked-by: John Stultz 

thanks
-john


Re: [patch 0/9] time: Add SPDX identifiers and cleanup boilerplates

2018-10-31 Thread John Stultz
On Wed, Oct 31, 2018 at 11:21 AM, Thomas Gleixner  wrote:
> Add SPDX identifiers to all files in kernel/time and remove the license
> boiler plates.
>
> Aside of that use the chance to get rid of (stale) file references and tidy
> up the top of file comments as they are touched anyway by this work.
>
> This work is based on a script and data from Philippe Ombredanne, Kate
> Stewart and myself. The data has been created with two independent license
> scanners and manual inspection.


Looks ok by me.

Acked-by: John Stultz 

thanks
-john


Re: [PATCH] proc: use ns_capable instead of capable for timerslack_ns

2018-10-25 Thread John Stultz
On Wed, Oct 17, 2018 at 3:47 PM,   wrote:
> From: Benjamin Gordon 
>
> Access to timerslack_ns is controlled by a process having CAP_SYS_NICE
> in its effective capability set, but the current check looks in the root
> namespace instead of the process' user namespace.  Since a process is
> allowed to do other activities controlled by CAP_SYS_NICE inside a
> namespace, it should also be able to adjust timerslack_ns.
>
> Signed-off-by: Benjamin Gordon 
> Cc: John Stultz 
> Cc: Kees Cook 
> Cc: "Serge E. Hallyn" 
> Cc: Thomas Gleixner 
> Cc: Arjan van de Ven 
> Cc: Oren Laadan 
> Cc: Ruchi Kandoi 
> Cc: Rom Lemarchand 
> Cc: Todd Kjos 
> Cc: Colin Cross 
> Cc: Nick Kralevich 
> Cc: Dmitry Shmidt 
> Cc: Elliott Hughes 
> Cc: Android Kernel Team 
> Cc: Andrew Morton 
> ---
>  fs/proc/base.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 7e9f07bf260d..4b50937dff80 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -2356,7 +2356,7 @@ static ssize_t timerslack_ns_write(struct file *file, 
> const char __user *buf,
> return -ESRCH;
>
> if (p != current) {
> -   if (!capable(CAP_SYS_NICE)) {
> +   if (!ns_capable(file->f_cred->user_ns, CAP_SYS_NICE)) {
> count = -EPERM;
> goto out;
> }
> @@ -2393,7 +2393,7 @@ static int timerslack_ns_show(struct seq_file *m, void 
> *v)
>
> if (p != current) {
>
> -   if (!capable(CAP_SYS_NICE)) {
> +   if (!ns_capable(seq_user_ns(m), CAP_SYS_NICE)) {
> err = -EPERM;
> goto out;
> }

Looks reasonable to me.
Acked-by: John Stultz 

thanks
-john


Re: [PATCH] proc: use ns_capable instead of capable for timerslack_ns

2018-10-25 Thread John Stultz
On Wed, Oct 17, 2018 at 3:47 PM,   wrote:
> From: Benjamin Gordon 
>
> Access to timerslack_ns is controlled by a process having CAP_SYS_NICE
> in its effective capability set, but the current check looks in the root
> namespace instead of the process' user namespace.  Since a process is
> allowed to do other activities controlled by CAP_SYS_NICE inside a
> namespace, it should also be able to adjust timerslack_ns.
>
> Signed-off-by: Benjamin Gordon 
> Cc: John Stultz 
> Cc: Kees Cook 
> Cc: "Serge E. Hallyn" 
> Cc: Thomas Gleixner 
> Cc: Arjan van de Ven 
> Cc: Oren Laadan 
> Cc: Ruchi Kandoi 
> Cc: Rom Lemarchand 
> Cc: Todd Kjos 
> Cc: Colin Cross 
> Cc: Nick Kralevich 
> Cc: Dmitry Shmidt 
> Cc: Elliott Hughes 
> Cc: Android Kernel Team 
> Cc: Andrew Morton 
> ---
>  fs/proc/base.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 7e9f07bf260d..4b50937dff80 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -2356,7 +2356,7 @@ static ssize_t timerslack_ns_write(struct file *file, 
> const char __user *buf,
> return -ESRCH;
>
> if (p != current) {
> -   if (!capable(CAP_SYS_NICE)) {
> +   if (!ns_capable(file->f_cred->user_ns, CAP_SYS_NICE)) {
> count = -EPERM;
> goto out;
> }
> @@ -2393,7 +2393,7 @@ static int timerslack_ns_show(struct seq_file *m, void 
> *v)
>
> if (p != current) {
>
> -   if (!capable(CAP_SYS_NICE)) {
> +   if (!ns_capable(seq_user_ns(m), CAP_SYS_NICE)) {
> err = -EPERM;
> goto out;
> }

Looks reasonable to me.
Acked-by: John Stultz 

thanks
-john


Re: Regression: OOPs on boot due to "wlcore: Add support for optional wakeirq"

2018-10-25 Thread John Stultz
On Thu, Oct 25, 2018 at 10:04 AM, John Stultz  wrote:
> Hey Tony,
>   In testing linus/master on my hikey board, I'm hitting the following
> OOPS on bootup:
>
> [1.870279] Unable to handle kernel read from unreadable memory at
> virtual address 0010
> [1.870283] Mem abort info:
> [1.870287]   ESR = 0x9605
> [1.870292]   Exception class = DABT (current EL), IL = 32 bits
> [1.870296]   SET = 0, FnV = 0
> [1.870299]   EA = 0, S1PTW = 0
> [1.870302] Data abort info:
> [1.870306]   ISV = 0, ISS = 0x0005
> [1.870309]   CM = 0, WnR = 0
> [1.870312] [0010] user address but active_mm is swapper
> [1.870318] Internal error: Oops: 9605 [#1] PREEMPT SMP
> [1.870327] CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted
> 4.19.0-05129-gb3d1e8e #48
> [1.870331] Hardware name: HiKey Development Board (DT)
> [1.870350] Workqueue: events_freezable mmc_rescan
> [1.870358] pstate: 6045 (nZCv daif +PAN -UAO)
> [1.870366] pc : wl1271_probe+0x210/0x350
> [1.870371] lr : wl1271_probe+0x210/0x350
> [1.870374] sp : ff80080739b0
> [1.870377] x29: ff80080739b0 x28: 
> [1.870384] x27:  x26: 
> [1.870391] x25: 0036 x24: ffc074ecb598
> [1.870398] x23: ffc07ffdce78 x22: ffc0744ed808
> [1.870404] x21: ffc074ecbb98 x20: ff8008ff9000
> [1.870411] x19: ffc0744ed800 x18: ff8008ff9a48
> [1.870418] x17:  x16: 
> [1.870425] x15: ffc074ecb503 x14: 
> [1.870431] x13: ffc074ecb502 x12: 0030
> [1.870438] x11: 0101010101010101 x10: 0040
> [1.870444] x9 : ffc075400248 x8 : ffc075400270
> [1.870451] x7 :  x6 : 
> [1.870457] x5 :  x4 : 
> [1.870463] x3 :  x2 : 
> [1.870469] x1 : 0028 x0 : 
> [1.870477] Process kworker/0:0 (pid: 5, stack limit = 0x(ptrval))
> [1.870480] Call trace:
> [1.870485]  wl1271_probe+0x210/0x350
> [1.870491]  sdio_bus_probe+0x100/0x128
> [1.870500]  really_probe+0x1a8/0x2b8
> [1.870506]  driver_probe_device+0x58/0x100
> [1.870511]  __device_attach_driver+0x94/0xd8
> [1.870517]  bus_for_each_drv+0x70/0xc8
> [1.870522]  __device_attach+0xe0/0x140
> [1.870527]  device_initial_probe+0x10/0x18
> [1.870532]  bus_probe_device+0x94/0xa0
> [1.870537]  device_add+0x374/0x5b8
> [1.870542]  sdio_add_func+0x60/0x88
> [1.870546]  mmc_attach_sdio+0x1b0/0x358
> [1.870551]  mmc_rescan+0x2cc/0x390
> [1.870558]  process_one_work+0x12c/0x320
> [1.870563]  worker_thread+0x48/0x458
> [1.870569]  kthread+0xf8/0x128
> [1.870575]  ret_from_fork+0x10/0x18
> [1.870583] Code: 92400c21 b2760021 a90687a2 97e95bf9 (f9400803)
> [1.870587] ---[ end trace 1e15f81d3c139ca9 ]---
>
>
> I've bisected it down to 3c83dd577c7f ("wlcore: Add support for
> optional wakeirq").
>
> It seems since we don't have a wakeirq value in the dts, the wakeirq
> value in wl1271_probe() is zero, which then causes trouble in
> irqd_get_trigger_type(irq_get_irq_data(wakeirq)).
>
> Should we check wakeirq before we set the second elements of res[]?

Here's my first swing at doing the above. It seems to work ok. Let me
know if it looks reasonable and I'll submit it as a proper patch.

https://git.linaro.org/people/john.stultz/android-dev.git/commit/?h=dev/hikey-mainline-WIP=383fb6e3abb71b8a721f196120a2e3a9da3315ac


thanks
-john


Re: Regression: OOPs on boot due to "wlcore: Add support for optional wakeirq"

2018-10-25 Thread John Stultz
On Thu, Oct 25, 2018 at 10:04 AM, John Stultz  wrote:
> Hey Tony,
>   In testing linus/master on my hikey board, I'm hitting the following
> OOPS on bootup:
>
> [1.870279] Unable to handle kernel read from unreadable memory at
> virtual address 0010
> [1.870283] Mem abort info:
> [1.870287]   ESR = 0x9605
> [1.870292]   Exception class = DABT (current EL), IL = 32 bits
> [1.870296]   SET = 0, FnV = 0
> [1.870299]   EA = 0, S1PTW = 0
> [1.870302] Data abort info:
> [1.870306]   ISV = 0, ISS = 0x0005
> [1.870309]   CM = 0, WnR = 0
> [1.870312] [0010] user address but active_mm is swapper
> [1.870318] Internal error: Oops: 9605 [#1] PREEMPT SMP
> [1.870327] CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted
> 4.19.0-05129-gb3d1e8e #48
> [1.870331] Hardware name: HiKey Development Board (DT)
> [1.870350] Workqueue: events_freezable mmc_rescan
> [1.870358] pstate: 6045 (nZCv daif +PAN -UAO)
> [1.870366] pc : wl1271_probe+0x210/0x350
> [1.870371] lr : wl1271_probe+0x210/0x350
> [1.870374] sp : ff80080739b0
> [1.870377] x29: ff80080739b0 x28: 
> [1.870384] x27:  x26: 
> [1.870391] x25: 0036 x24: ffc074ecb598
> [1.870398] x23: ffc07ffdce78 x22: ffc0744ed808
> [1.870404] x21: ffc074ecbb98 x20: ff8008ff9000
> [1.870411] x19: ffc0744ed800 x18: ff8008ff9a48
> [1.870418] x17:  x16: 
> [1.870425] x15: ffc074ecb503 x14: 
> [1.870431] x13: ffc074ecb502 x12: 0030
> [1.870438] x11: 0101010101010101 x10: 0040
> [1.870444] x9 : ffc075400248 x8 : ffc075400270
> [1.870451] x7 :  x6 : 
> [1.870457] x5 :  x4 : 
> [1.870463] x3 :  x2 : 
> [1.870469] x1 : 0028 x0 : 
> [1.870477] Process kworker/0:0 (pid: 5, stack limit = 0x(ptrval))
> [1.870480] Call trace:
> [1.870485]  wl1271_probe+0x210/0x350
> [1.870491]  sdio_bus_probe+0x100/0x128
> [1.870500]  really_probe+0x1a8/0x2b8
> [1.870506]  driver_probe_device+0x58/0x100
> [1.870511]  __device_attach_driver+0x94/0xd8
> [1.870517]  bus_for_each_drv+0x70/0xc8
> [1.870522]  __device_attach+0xe0/0x140
> [1.870527]  device_initial_probe+0x10/0x18
> [1.870532]  bus_probe_device+0x94/0xa0
> [1.870537]  device_add+0x374/0x5b8
> [1.870542]  sdio_add_func+0x60/0x88
> [1.870546]  mmc_attach_sdio+0x1b0/0x358
> [1.870551]  mmc_rescan+0x2cc/0x390
> [1.870558]  process_one_work+0x12c/0x320
> [1.870563]  worker_thread+0x48/0x458
> [1.870569]  kthread+0xf8/0x128
> [1.870575]  ret_from_fork+0x10/0x18
> [1.870583] Code: 92400c21 b2760021 a90687a2 97e95bf9 (f9400803)
> [1.870587] ---[ end trace 1e15f81d3c139ca9 ]---
>
>
> I've bisected it down to 3c83dd577c7f ("wlcore: Add support for
> optional wakeirq").
>
> It seems since we don't have a wakeirq value in the dts, the wakeirq
> value in wl1271_probe() is zero, which then causes trouble in
> irqd_get_trigger_type(irq_get_irq_data(wakeirq)).
>
> Should we check wakeirq before we set the second elements of res[]?

Here's my first swing at doing the above. It seems to work ok. Let me
know if it looks reasonable and I'll submit it as a proper patch.

https://git.linaro.org/people/john.stultz/android-dev.git/commit/?h=dev/hikey-mainline-WIP=383fb6e3abb71b8a721f196120a2e3a9da3315ac


thanks
-john


Regression: OOPs on boot due to "wlcore: Add support for optional wakeirq"

2018-10-25 Thread John Stultz
Hey Tony,
  In testing linus/master on my hikey board, I'm hitting the following
OOPS on bootup:

[1.870279] Unable to handle kernel read from unreadable memory at
virtual address 0010
[1.870283] Mem abort info:
[1.870287]   ESR = 0x9605
[1.870292]   Exception class = DABT (current EL), IL = 32 bits
[1.870296]   SET = 0, FnV = 0
[1.870299]   EA = 0, S1PTW = 0
[1.870302] Data abort info:
[1.870306]   ISV = 0, ISS = 0x0005
[1.870309]   CM = 0, WnR = 0
[1.870312] [0010] user address but active_mm is swapper
[1.870318] Internal error: Oops: 9605 [#1] PREEMPT SMP
[1.870327] CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted
4.19.0-05129-gb3d1e8e #48
[1.870331] Hardware name: HiKey Development Board (DT)
[1.870350] Workqueue: events_freezable mmc_rescan
[1.870358] pstate: 6045 (nZCv daif +PAN -UAO)
[1.870366] pc : wl1271_probe+0x210/0x350
[1.870371] lr : wl1271_probe+0x210/0x350
[1.870374] sp : ff80080739b0
[1.870377] x29: ff80080739b0 x28: 
[1.870384] x27:  x26: 
[1.870391] x25: 0036 x24: ffc074ecb598
[1.870398] x23: ffc07ffdce78 x22: ffc0744ed808
[1.870404] x21: ffc074ecbb98 x20: ff8008ff9000
[1.870411] x19: ffc0744ed800 x18: ff8008ff9a48
[1.870418] x17:  x16: 
[1.870425] x15: ffc074ecb503 x14: 
[1.870431] x13: ffc074ecb502 x12: 0030
[1.870438] x11: 0101010101010101 x10: 0040
[1.870444] x9 : ffc075400248 x8 : ffc075400270
[1.870451] x7 :  x6 : 
[1.870457] x5 :  x4 : 
[1.870463] x3 :  x2 : 
[1.870469] x1 : 0028 x0 : 
[1.870477] Process kworker/0:0 (pid: 5, stack limit = 0x(ptrval))
[1.870480] Call trace:
[1.870485]  wl1271_probe+0x210/0x350
[1.870491]  sdio_bus_probe+0x100/0x128
[1.870500]  really_probe+0x1a8/0x2b8
[1.870506]  driver_probe_device+0x58/0x100
[1.870511]  __device_attach_driver+0x94/0xd8
[1.870517]  bus_for_each_drv+0x70/0xc8
[1.870522]  __device_attach+0xe0/0x140
[1.870527]  device_initial_probe+0x10/0x18
[1.870532]  bus_probe_device+0x94/0xa0
[1.870537]  device_add+0x374/0x5b8
[1.870542]  sdio_add_func+0x60/0x88
[1.870546]  mmc_attach_sdio+0x1b0/0x358
[1.870551]  mmc_rescan+0x2cc/0x390
[1.870558]  process_one_work+0x12c/0x320
[1.870563]  worker_thread+0x48/0x458
[1.870569]  kthread+0xf8/0x128
[1.870575]  ret_from_fork+0x10/0x18
[1.870583] Code: 92400c21 b2760021 a90687a2 97e95bf9 (f9400803)
[1.870587] ---[ end trace 1e15f81d3c139ca9 ]---


I've bisected it down to 3c83dd577c7f ("wlcore: Add support for
optional wakeirq").

It seems since we don't have a wakeirq value in the dts, the wakeirq
value in wl1271_probe() is zero, which then causes trouble in
irqd_get_trigger_type(irq_get_irq_data(wakeirq)).

Should we check wakeirq before we set the second elements of res[]?

thanks
-john


Regression: OOPs on boot due to "wlcore: Add support for optional wakeirq"

2018-10-25 Thread John Stultz
Hey Tony,
  In testing linus/master on my hikey board, I'm hitting the following
OOPS on bootup:

[1.870279] Unable to handle kernel read from unreadable memory at
virtual address 0010
[1.870283] Mem abort info:
[1.870287]   ESR = 0x9605
[1.870292]   Exception class = DABT (current EL), IL = 32 bits
[1.870296]   SET = 0, FnV = 0
[1.870299]   EA = 0, S1PTW = 0
[1.870302] Data abort info:
[1.870306]   ISV = 0, ISS = 0x0005
[1.870309]   CM = 0, WnR = 0
[1.870312] [0010] user address but active_mm is swapper
[1.870318] Internal error: Oops: 9605 [#1] PREEMPT SMP
[1.870327] CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted
4.19.0-05129-gb3d1e8e #48
[1.870331] Hardware name: HiKey Development Board (DT)
[1.870350] Workqueue: events_freezable mmc_rescan
[1.870358] pstate: 6045 (nZCv daif +PAN -UAO)
[1.870366] pc : wl1271_probe+0x210/0x350
[1.870371] lr : wl1271_probe+0x210/0x350
[1.870374] sp : ff80080739b0
[1.870377] x29: ff80080739b0 x28: 
[1.870384] x27:  x26: 
[1.870391] x25: 0036 x24: ffc074ecb598
[1.870398] x23: ffc07ffdce78 x22: ffc0744ed808
[1.870404] x21: ffc074ecbb98 x20: ff8008ff9000
[1.870411] x19: ffc0744ed800 x18: ff8008ff9a48
[1.870418] x17:  x16: 
[1.870425] x15: ffc074ecb503 x14: 
[1.870431] x13: ffc074ecb502 x12: 0030
[1.870438] x11: 0101010101010101 x10: 0040
[1.870444] x9 : ffc075400248 x8 : ffc075400270
[1.870451] x7 :  x6 : 
[1.870457] x5 :  x4 : 
[1.870463] x3 :  x2 : 
[1.870469] x1 : 0028 x0 : 
[1.870477] Process kworker/0:0 (pid: 5, stack limit = 0x(ptrval))
[1.870480] Call trace:
[1.870485]  wl1271_probe+0x210/0x350
[1.870491]  sdio_bus_probe+0x100/0x128
[1.870500]  really_probe+0x1a8/0x2b8
[1.870506]  driver_probe_device+0x58/0x100
[1.870511]  __device_attach_driver+0x94/0xd8
[1.870517]  bus_for_each_drv+0x70/0xc8
[1.870522]  __device_attach+0xe0/0x140
[1.870527]  device_initial_probe+0x10/0x18
[1.870532]  bus_probe_device+0x94/0xa0
[1.870537]  device_add+0x374/0x5b8
[1.870542]  sdio_add_func+0x60/0x88
[1.870546]  mmc_attach_sdio+0x1b0/0x358
[1.870551]  mmc_rescan+0x2cc/0x390
[1.870558]  process_one_work+0x12c/0x320
[1.870563]  worker_thread+0x48/0x458
[1.870569]  kthread+0xf8/0x128
[1.870575]  ret_from_fork+0x10/0x18
[1.870583] Code: 92400c21 b2760021 a90687a2 97e95bf9 (f9400803)
[1.870587] ---[ end trace 1e15f81d3c139ca9 ]---


I've bisected it down to 3c83dd577c7f ("wlcore: Add support for
optional wakeirq").

It seems since we don't have a wakeirq value in the dts, the wakeirq
value in wl1271_probe() is zero, which then causes trouble in
irqd_get_trigger_type(irq_get_irq_data(wakeirq)).

Should we check wakeirq before we set the second elements of res[]?

thanks
-john


Re: TSC to Mono-raw Drift

2018-10-23 Thread John Stultz
On Fri, Oct 19, 2018 at 3:36 PM, John Stultz  wrote:
> On Fri, Oct 19, 2018 at 1:50 PM, Thomas Gleixner  wrote:
>> John,
>>
>> On Fri, 19 Oct 2018, John Stultz wrote:
>>> On Fri, Oct 19, 2018 at 11:57 AM, Thomas Gleixner  
>>> wrote:
>>> > I don't think you need complex oscillation for that. The error is constant
>>> > and small enough that it is a fractional nanoseconds thing with an 
>>> > interval
>>> > <= 1s. So you can just add that in a regular interval. Due to it being
>>> > small you can't observe time jumping I think.
>>>
>>> Well, from the examples the trouble is we seem to be a bit fast,
>>> rather then slow.
>>> So we'll have to reduce mult by one, and rework the calculations, but
>>> maybe something like this (correcting the raw_interval value) would
>>> work.
>>
>> Shouldn't be rocket science. It's a one off calculation of adjustment value
>> and maybe the period at which the correction happens.
>>
>>> But this also sort of breaks, fundamental argument that the raw clock
>>> is a simple mult/shift transformation of the underlying clocksource
>>> counter. Its not the accuracy of the clock but the consistency that
>>> was key.
>>>
>>> The counter argument is that the raw clock is abstracting the
>>> underlying hardware so folks who would have used the TSC directly can
>>> now use the raw clock and have a generic abstracted hardware-counter
>>> interface. So userland shouldn't really be worried about the
>>> occasional injections made since they shouldn't be trying to
>>> re-generate the abstraction from the hardware themselves.  <--
>>> Remember this point as we move to the next comment:)
>>>
>>> > The end-result is 'correct' as much correct it is in relation to real
>>> > nanoseconds. :)
>>> >
>>> >> I guess I'd want to understand more of the use here and the need to
>>> >> tie the raw clock back to the hardware counter it abstracts.
>>> >
>>> > The problem there is ART which is distributed to PCIe devices and ART time
>>> > stamps are exposed in various ways. ART has a fixed ratio vs. TSC so there
>>> > is a reasonable expectation that MONOTONIC_RAW is accurate.
>>>
>>> Which is maybe sort of my issue here. The raw clock provided a
>>> abstraction away from the hardware for generic usage, but then its
>>> being re-used with other un-abstracted hardware references. So unless
>>> they use the same method of transformation, there will be problems (of
>>> varying degree).
>>
>> OTOH. If people use the CPUID provided frequency information and the TSC
>> from userspace then they get different results which is contrary to the
>> goal of providing them an abstracted way of doing it.
>
> But that's my point. If they are pulling time values from the hardware
> directly that's unabstracted. I'm not sure its smart to be comparing
> the abstracted and unabstracted time stamps if your worried about
> precision. They are sort of two separate (though similar) time
> domains.
>
>>> We might be able to reduce the degree in this case, but I worry the
>>> extra complexity may only cause problems for others.
>>
>> Is it really that complex to add a fixed correction value periodically?
>>
>> I don't think so and it should just work for any clocksource which is
>> exposed this way. Famous last words .
>
> I'm not saying that the code is overly complex (at least compared to
> the rest of the timekeeping code :), but just how the accumulation is
> done is less-trivial. So if someone else is trying to mimic the
> abstracted time with unabstracted hardware values (again, not
> something I reccomend, but that's sort of the usage case pushing
> this), they need to use a similar method that is slightly more
> complicated (or use slower math). Its all subtle stuff, but this makes
> something that was relatively very simple (by design) a bit harder to
> explain.

Adding Mirosalv as he's always thoughtful on these sorts of issues.

I spent a little bit of time thinking this out. Unfortunately I don't
think its a simple matter of calculating the granularity error on the
raw clock and adding it in each interval. The other trouble spot is
that the adjusted clocks (monotonic/realtime) are adjusted off of that
raw clock. So they would need to have that error added as well,
otherwise the raw and a otherwise non-adjusted monotonic clock would
drift.

However, to be correct, the ntp adjustments made would have to be made
to both the base interval + error, which mucks the math up a fair bit.

Maybe Miroslav will have a better idea here, but otherwise I'll stew
on this a bit more and see what I can come up with.

thanks
-john


Re: TSC to Mono-raw Drift

2018-10-23 Thread John Stultz
On Fri, Oct 19, 2018 at 3:36 PM, John Stultz  wrote:
> On Fri, Oct 19, 2018 at 1:50 PM, Thomas Gleixner  wrote:
>> John,
>>
>> On Fri, 19 Oct 2018, John Stultz wrote:
>>> On Fri, Oct 19, 2018 at 11:57 AM, Thomas Gleixner  
>>> wrote:
>>> > I don't think you need complex oscillation for that. The error is constant
>>> > and small enough that it is a fractional nanoseconds thing with an 
>>> > interval
>>> > <= 1s. So you can just add that in a regular interval. Due to it being
>>> > small you can't observe time jumping I think.
>>>
>>> Well, from the examples the trouble is we seem to be a bit fast,
>>> rather then slow.
>>> So we'll have to reduce mult by one, and rework the calculations, but
>>> maybe something like this (correcting the raw_interval value) would
>>> work.
>>
>> Shouldn't be rocket science. It's a one off calculation of adjustment value
>> and maybe the period at which the correction happens.
>>
>>> But this also sort of breaks, fundamental argument that the raw clock
>>> is a simple mult/shift transformation of the underlying clocksource
>>> counter. Its not the accuracy of the clock but the consistency that
>>> was key.
>>>
>>> The counter argument is that the raw clock is abstracting the
>>> underlying hardware so folks who would have used the TSC directly can
>>> now use the raw clock and have a generic abstracted hardware-counter
>>> interface. So userland shouldn't really be worried about the
>>> occasional injections made since they shouldn't be trying to
>>> re-generate the abstraction from the hardware themselves.  <--
>>> Remember this point as we move to the next comment:)
>>>
>>> > The end-result is 'correct' as much correct it is in relation to real
>>> > nanoseconds. :)
>>> >
>>> >> I guess I'd want to understand more of the use here and the need to
>>> >> tie the raw clock back to the hardware counter it abstracts.
>>> >
>>> > The problem there is ART which is distributed to PCIe devices and ART time
>>> > stamps are exposed in various ways. ART has a fixed ratio vs. TSC so there
>>> > is a reasonable expectation that MONOTONIC_RAW is accurate.
>>>
>>> Which is maybe sort of my issue here. The raw clock provided a
>>> abstraction away from the hardware for generic usage, but then its
>>> being re-used with other un-abstracted hardware references. So unless
>>> they use the same method of transformation, there will be problems (of
>>> varying degree).
>>
>> OTOH. If people use the CPUID provided frequency information and the TSC
>> from userspace then they get different results which is contrary to the
>> goal of providing them an abstracted way of doing it.
>
> But that's my point. If they are pulling time values from the hardware
> directly that's unabstracted. I'm not sure its smart to be comparing
> the abstracted and unabstracted time stamps if your worried about
> precision. They are sort of two separate (though similar) time
> domains.
>
>>> We might be able to reduce the degree in this case, but I worry the
>>> extra complexity may only cause problems for others.
>>
>> Is it really that complex to add a fixed correction value periodically?
>>
>> I don't think so and it should just work for any clocksource which is
>> exposed this way. Famous last words .
>
> I'm not saying that the code is overly complex (at least compared to
> the rest of the timekeeping code :), but just how the accumulation is
> done is less-trivial. So if someone else is trying to mimic the
> abstracted time with unabstracted hardware values (again, not
> something I reccomend, but that's sort of the usage case pushing
> this), they need to use a similar method that is slightly more
> complicated (or use slower math). Its all subtle stuff, but this makes
> something that was relatively very simple (by design) a bit harder to
> explain.

Adding Mirosalv as he's always thoughtful on these sorts of issues.

I spent a little bit of time thinking this out. Unfortunately I don't
think its a simple matter of calculating the granularity error on the
raw clock and adding it in each interval. The other trouble spot is
that the adjusted clocks (monotonic/realtime) are adjusted off of that
raw clock. So they would need to have that error added as well,
otherwise the raw and a otherwise non-adjusted monotonic clock would
drift.

However, to be correct, the ntp adjustments made would have to be made
to both the base interval + error, which mucks the math up a fair bit.

Maybe Miroslav will have a better idea here, but otherwise I'll stew
on this a bit more and see what I can come up with.

thanks
-john


Re: TSC to Mono-raw Drift

2018-10-19 Thread John Stultz
On Fri, Oct 19, 2018 at 1:50 PM, Thomas Gleixner  wrote:
> John,
>
> On Fri, 19 Oct 2018, John Stultz wrote:
>> On Fri, Oct 19, 2018 at 11:57 AM, Thomas Gleixner  wrote:
>> > I don't think you need complex oscillation for that. The error is constant
>> > and small enough that it is a fractional nanoseconds thing with an interval
>> > <= 1s. So you can just add that in a regular interval. Due to it being
>> > small you can't observe time jumping I think.
>>
>> Well, from the examples the trouble is we seem to be a bit fast,
>> rather then slow.
>> So we'll have to reduce mult by one, and rework the calculations, but
>> maybe something like this (correcting the raw_interval value) would
>> work.
>
> Shouldn't be rocket science. It's a one off calculation of adjustment value
> and maybe the period at which the correction happens.
>
>> But this also sort of breaks, fundamental argument that the raw clock
>> is a simple mult/shift transformation of the underlying clocksource
>> counter. Its not the accuracy of the clock but the consistency that
>> was key.
>>
>> The counter argument is that the raw clock is abstracting the
>> underlying hardware so folks who would have used the TSC directly can
>> now use the raw clock and have a generic abstracted hardware-counter
>> interface. So userland shouldn't really be worried about the
>> occasional injections made since they shouldn't be trying to
>> re-generate the abstraction from the hardware themselves.  <--
>> Remember this point as we move to the next comment:)
>>
>> > The end-result is 'correct' as much correct it is in relation to real
>> > nanoseconds. :)
>> >
>> >> I guess I'd want to understand more of the use here and the need to
>> >> tie the raw clock back to the hardware counter it abstracts.
>> >
>> > The problem there is ART which is distributed to PCIe devices and ART time
>> > stamps are exposed in various ways. ART has a fixed ratio vs. TSC so there
>> > is a reasonable expectation that MONOTONIC_RAW is accurate.
>>
>> Which is maybe sort of my issue here. The raw clock provided a
>> abstraction away from the hardware for generic usage, but then its
>> being re-used with other un-abstracted hardware references. So unless
>> they use the same method of transformation, there will be problems (of
>> varying degree).
>
> OTOH. If people use the CPUID provided frequency information and the TSC
> from userspace then they get different results which is contrary to the
> goal of providing them an abstracted way of doing it.

But that's my point. If they are pulling time values from the hardware
directly that's unabstracted. I'm not sure its smart to be comparing
the abstracted and unabstracted time stamps if your worried about
precision. They are sort of two separate (though similar) time
domains.

>> We might be able to reduce the degree in this case, but I worry the
>> extra complexity may only cause problems for others.
>
> Is it really that complex to add a fixed correction value periodically?
>
> I don't think so and it should just work for any clocksource which is
> exposed this way. Famous last words .

I'm not saying that the code is overly complex (at least compared to
the rest of the timekeeping code :), but just how the accumulation is
done is less-trivial. So if someone else is trying to mimic the
abstracted time with unabstracted hardware values (again, not
something I reccomend, but that's sort of the usage case pushing
this), they need to use a similar method that is slightly more
complicated (or use slower math). Its all subtle stuff, but this makes
something that was relatively very simple (by design) a bit harder to
explain.

So I'm not out right objecting, I'm more wringing my hands at the
potential edge cases that improving the accuracy of the un-adjusted
clock raises. :)

thanks
-john


Re: TSC to Mono-raw Drift

2018-10-19 Thread John Stultz
On Fri, Oct 19, 2018 at 1:50 PM, Thomas Gleixner  wrote:
> John,
>
> On Fri, 19 Oct 2018, John Stultz wrote:
>> On Fri, Oct 19, 2018 at 11:57 AM, Thomas Gleixner  wrote:
>> > I don't think you need complex oscillation for that. The error is constant
>> > and small enough that it is a fractional nanoseconds thing with an interval
>> > <= 1s. So you can just add that in a regular interval. Due to it being
>> > small you can't observe time jumping I think.
>>
>> Well, from the examples the trouble is we seem to be a bit fast,
>> rather then slow.
>> So we'll have to reduce mult by one, and rework the calculations, but
>> maybe something like this (correcting the raw_interval value) would
>> work.
>
> Shouldn't be rocket science. It's a one off calculation of adjustment value
> and maybe the period at which the correction happens.
>
>> But this also sort of breaks, fundamental argument that the raw clock
>> is a simple mult/shift transformation of the underlying clocksource
>> counter. Its not the accuracy of the clock but the consistency that
>> was key.
>>
>> The counter argument is that the raw clock is abstracting the
>> underlying hardware so folks who would have used the TSC directly can
>> now use the raw clock and have a generic abstracted hardware-counter
>> interface. So userland shouldn't really be worried about the
>> occasional injections made since they shouldn't be trying to
>> re-generate the abstraction from the hardware themselves.  <--
>> Remember this point as we move to the next comment:)
>>
>> > The end-result is 'correct' as much correct it is in relation to real
>> > nanoseconds. :)
>> >
>> >> I guess I'd want to understand more of the use here and the need to
>> >> tie the raw clock back to the hardware counter it abstracts.
>> >
>> > The problem there is ART which is distributed to PCIe devices and ART time
>> > stamps are exposed in various ways. ART has a fixed ratio vs. TSC so there
>> > is a reasonable expectation that MONOTONIC_RAW is accurate.
>>
>> Which is maybe sort of my issue here. The raw clock provided a
>> abstraction away from the hardware for generic usage, but then its
>> being re-used with other un-abstracted hardware references. So unless
>> they use the same method of transformation, there will be problems (of
>> varying degree).
>
> OTOH. If people use the CPUID provided frequency information and the TSC
> from userspace then they get different results which is contrary to the
> goal of providing them an abstracted way of doing it.

But that's my point. If they are pulling time values from the hardware
directly that's unabstracted. I'm not sure its smart to be comparing
the abstracted and unabstracted time stamps if your worried about
precision. They are sort of two separate (though similar) time
domains.

>> We might be able to reduce the degree in this case, but I worry the
>> extra complexity may only cause problems for others.
>
> Is it really that complex to add a fixed correction value periodically?
>
> I don't think so and it should just work for any clocksource which is
> exposed this way. Famous last words .

I'm not saying that the code is overly complex (at least compared to
the rest of the timekeeping code :), but just how the accumulation is
done is less-trivial. So if someone else is trying to mimic the
abstracted time with unabstracted hardware values (again, not
something I reccomend, but that's sort of the usage case pushing
this), they need to use a similar method that is slightly more
complicated (or use slower math). Its all subtle stuff, but this makes
something that was relatively very simple (by design) a bit harder to
explain.

So I'm not out right objecting, I'm more wringing my hands at the
potential edge cases that improving the accuracy of the un-adjusted
clock raises. :)

thanks
-john


Re: TSC to Mono-raw Drift

2018-10-19 Thread John Stultz
On Fri, Oct 19, 2018 at 11:57 AM, Thomas Gleixner  wrote:
> On Fri, 19 Oct 2018, John Stultz wrote:
>> On Fri, Oct 19, 2018 at 11:37 AM, Thomas Gleixner  wrote:
>> > On Fri, 19 Oct 2018, Thomas Gleixner wrote:
>> >> On Mon, 15 Oct 2018, Christopher Hall wrote:
>> >> > TSC kHz used to calculate mult/shift value: 3312000
>> >>
>> >> Now the most interesting information here would be the resulting 
>> >> mult/shift
>> >> value which the kernel uses. Can you please provide that?
>> >
>> > But aside of the actual values it's pretty obvious why this happens. It's a
>> > simple rounding error. Minimal, but still. To avoid rounding errors you'd
>> > have to find mult and shift values which exactly result in:
>> >
>> >  (freq * mult) >> shift = 1e9
>> >
>> > which is impossible for freq = 3312 * 1e6.
>>
>> Right.
>>
>> > A possible fix for this is, because the error is in the low PPB range, to
>> > adjust the error once per second or in some other sensible regular
>> > interval.
>>
>> Ehh.. I mean, this is basically what all the complicated ntp_error
>> logic does for the long-term accuracy compensation.  Part of the
>> allure of the raw clock is that there is no changes made over time.
>> Its a very simple constant calculation.
>>
>> We could try to do something like pre-seed the ntpinterval value so
>> the default ntp_tick value corrects for this, but that's only going to
>> effect the monotonic/realtime clock until ntpd or anyone else sets a
>> different interval rate.
>>
>> So  I'm not particularly eager in trying to do the type of small
>> frequency oscillation we do for monotonic/realtime for long-term
>> accuracy to the raw clock as that feels a bit antithetical to the
>> point of the raw clock.
>
> I don't think you need complex oscillation for that. The error is constant
> and small enough that it is a fractional nanoseconds thing with an interval
> <= 1s. So you can just add that in a regular interval. Due to it being
> small you can't observe time jumping I think.

Well, from the examples the trouble is we seem to be a bit fast,
rather then slow.
So we'll have to reduce mult by one, and rework the calculations, but
maybe something like this (correcting the raw_interval value) would
work.

But this also sort of breaks, fundamental argument that the raw clock
is a simple mult/shift transformation of the underlying clocksource
counter. Its not the accuracy of the clock but the consistency that
was key.

The counter argument is that the raw clock is abstracting the
underlying hardware so folks who would have used the TSC directly can
now use the raw clock and have a generic abstracted hardware-counter
interface. So userland shouldn't really be worried about the
occasional injections made since they shouldn't be trying to
re-generate the abstraction from the hardware themselves.  <--
Remember this point as we move to the next comment:)

> The end-result is 'correct' as much correct it is in relation to real
> nanoseconds. :)
>
>> I guess I'd want to understand more of the use here and the need to
>> tie the raw clock back to the hardware counter it abstracts.
>
> The problem there is ART which is distributed to PCIe devices and ART time
> stamps are exposed in various ways. ART has a fixed ratio vs. TSC so there
> is a reasonable expectation that MONOTONIC_RAW is accurate.

Which is maybe sort of my issue here. The raw clock provided a
abstraction away from the hardware for generic usage, but then its
being re-used with other un-abstracted hardware references. So unless
they use the same method of transformation, there will be problems (of
varying degree).

We might be able to reduce the degree in this case, but I worry the
extra complexity may only cause problems for others.

thanks
-john


Re: TSC to Mono-raw Drift

2018-10-19 Thread John Stultz
On Fri, Oct 19, 2018 at 11:57 AM, Thomas Gleixner  wrote:
> On Fri, 19 Oct 2018, John Stultz wrote:
>> On Fri, Oct 19, 2018 at 11:37 AM, Thomas Gleixner  wrote:
>> > On Fri, 19 Oct 2018, Thomas Gleixner wrote:
>> >> On Mon, 15 Oct 2018, Christopher Hall wrote:
>> >> > TSC kHz used to calculate mult/shift value: 3312000
>> >>
>> >> Now the most interesting information here would be the resulting 
>> >> mult/shift
>> >> value which the kernel uses. Can you please provide that?
>> >
>> > But aside of the actual values it's pretty obvious why this happens. It's a
>> > simple rounding error. Minimal, but still. To avoid rounding errors you'd
>> > have to find mult and shift values which exactly result in:
>> >
>> >  (freq * mult) >> shift = 1e9
>> >
>> > which is impossible for freq = 3312 * 1e6.
>>
>> Right.
>>
>> > A possible fix for this is, because the error is in the low PPB range, to
>> > adjust the error once per second or in some other sensible regular
>> > interval.
>>
>> Ehh.. I mean, this is basically what all the complicated ntp_error
>> logic does for the long-term accuracy compensation.  Part of the
>> allure of the raw clock is that there is no changes made over time.
>> Its a very simple constant calculation.
>>
>> We could try to do something like pre-seed the ntpinterval value so
>> the default ntp_tick value corrects for this, but that's only going to
>> effect the monotonic/realtime clock until ntpd or anyone else sets a
>> different interval rate.
>>
>> So  I'm not particularly eager in trying to do the type of small
>> frequency oscillation we do for monotonic/realtime for long-term
>> accuracy to the raw clock as that feels a bit antithetical to the
>> point of the raw clock.
>
> I don't think you need complex oscillation for that. The error is constant
> and small enough that it is a fractional nanoseconds thing with an interval
> <= 1s. So you can just add that in a regular interval. Due to it being
> small you can't observe time jumping I think.

Well, from the examples the trouble is we seem to be a bit fast,
rather then slow.
So we'll have to reduce mult by one, and rework the calculations, but
maybe something like this (correcting the raw_interval value) would
work.

But this also sort of breaks, fundamental argument that the raw clock
is a simple mult/shift transformation of the underlying clocksource
counter. Its not the accuracy of the clock but the consistency that
was key.

The counter argument is that the raw clock is abstracting the
underlying hardware so folks who would have used the TSC directly can
now use the raw clock and have a generic abstracted hardware-counter
interface. So userland shouldn't really be worried about the
occasional injections made since they shouldn't be trying to
re-generate the abstraction from the hardware themselves.  <--
Remember this point as we move to the next comment:)

> The end-result is 'correct' as much correct it is in relation to real
> nanoseconds. :)
>
>> I guess I'd want to understand more of the use here and the need to
>> tie the raw clock back to the hardware counter it abstracts.
>
> The problem there is ART which is distributed to PCIe devices and ART time
> stamps are exposed in various ways. ART has a fixed ratio vs. TSC so there
> is a reasonable expectation that MONOTONIC_RAW is accurate.

Which is maybe sort of my issue here. The raw clock provided a
abstraction away from the hardware for generic usage, but then its
being re-used with other un-abstracted hardware references. So unless
they use the same method of transformation, there will be problems (of
varying degree).

We might be able to reduce the degree in this case, but I worry the
extra complexity may only cause problems for others.

thanks
-john


Re: TSC to Mono-raw Drift

2018-10-19 Thread John Stultz
On Fri, Oct 19, 2018 at 11:37 AM, Thomas Gleixner  wrote:
> On Fri, 19 Oct 2018, Thomas Gleixner wrote:
>> On Mon, 15 Oct 2018, Christopher Hall wrote:
>> > TSC kHz used to calculate mult/shift value: 3312000
>>
>> Now the most interesting information here would be the resulting mult/shift
>> value which the kernel uses. Can you please provide that?
>
> But aside of the actual values it's pretty obvious why this happens. It's a
> simple rounding error. Minimal, but still. To avoid rounding errors you'd
> have to find mult and shift values which exactly result in:
>
>  (freq * mult) >> shift = 1e9
>
> which is impossible for freq = 3312 * 1e6.

Right.

> A possible fix for this is, because the error is in the low PPB range, to
> adjust the error once per second or in some other sensible regular
> interval.

Ehh.. I mean, this is basically what all the complicated ntp_error
logic does for the long-term accuracy compensation.  Part of the
allure of the raw clock is that there is no changes made over time.
Its a very simple constant calculation.

We could try to do something like pre-seed the ntpinterval value so
the default ntp_tick value corrects for this, but that's only going to
effect the monotonic/realtime clock until ntpd or anyone else sets a
different interval rate.

So  I'm not particularly eager in trying to do the type of small
frequency oscillation we do for monotonic/realtime for long-term
accuracy to the raw clock as that feels a bit antithetical to the
point of the raw clock.

I guess I'd want to understand more of the use here and the need to
tie the raw clock back to the hardware counter it abstracts.

thanks
-john


Re: TSC to Mono-raw Drift

2018-10-19 Thread John Stultz
On Fri, Oct 19, 2018 at 11:37 AM, Thomas Gleixner  wrote:
> On Fri, 19 Oct 2018, Thomas Gleixner wrote:
>> On Mon, 15 Oct 2018, Christopher Hall wrote:
>> > TSC kHz used to calculate mult/shift value: 3312000
>>
>> Now the most interesting information here would be the resulting mult/shift
>> value which the kernel uses. Can you please provide that?
>
> But aside of the actual values it's pretty obvious why this happens. It's a
> simple rounding error. Minimal, but still. To avoid rounding errors you'd
> have to find mult and shift values which exactly result in:
>
>  (freq * mult) >> shift = 1e9
>
> which is impossible for freq = 3312 * 1e6.

Right.

> A possible fix for this is, because the error is in the low PPB range, to
> adjust the error once per second or in some other sensible regular
> interval.

Ehh.. I mean, this is basically what all the complicated ntp_error
logic does for the long-term accuracy compensation.  Part of the
allure of the raw clock is that there is no changes made over time.
Its a very simple constant calculation.

We could try to do something like pre-seed the ntpinterval value so
the default ntp_tick value corrects for this, but that's only going to
effect the monotonic/realtime clock until ntpd or anyone else sets a
different interval rate.

So  I'm not particularly eager in trying to do the type of small
frequency oscillation we do for monotonic/realtime for long-term
accuracy to the raw clock as that feels a bit antithetical to the
point of the raw clock.

I guess I'd want to understand more of the use here and the need to
tie the raw clock back to the hardware counter it abstracts.

thanks
-john


Re: TSC to Mono-raw Drift

2018-10-19 Thread John Stultz
On Fri, Oct 19, 2018 at 11:34 AM, John Stultz  wrote:
> On Fri, Oct 19, 2018 at 8:25 AM, Thomas Gleixner  wrote:
>> Christopher,
>>
>> Please Cc LKML on such issues in the future.
>>
>> On Mon, 15 Oct 2018, Christopher Hall wrote:
>>
>> Leaving context around for new readers:
>>
>>> Problem Statement:
>>>
>>> The TSC clocksource mult/shift values are derived from CPUID[15H], but the
>>> monotonic raw clock value is not equal to TSC in nominal nanoseconds, i.e.
>>> the timekeeping code is not accurately transforming TSC ticks to nominal
>>> nanoseconds based on CPUID[15H}.
>>>
>>> The included code calculates the drift between nominal TSC nanoseconds and
>>> the monotonic raw clock.
>>>
>>> Background:
>>>
>>> Starting with 6th generation Intel CPUs, the TSC is "phase locked" to the
>>> Always Running Timer (ART). The relation between TSC and ART is read from
>>> CPUID[15H]. Details of the TSC-ART relation are in the "Invariant
>>> Timekeeping" section of the SDM.
>>>
>>> CPUID[15H].ECX returns the nominal frequency of ART (or crystal frequency).
>>> CPU feature TSC_KNOWN_FREQ indicates that tsc_khz (tsc.c) is derived from
>>> CPUID[15H]. The calculation is in tsc.c:native_calibrate_tsc().
>>>
>>> When the TSC clocksource is selected, the timekeeping code uses mult/shift
>>> values to transform TSC into nanoseconds. The mult/shift value is determined
>>> using tsc_khz.
>>>
>>> Example Output:
>>>
>>> Running for 3 seconds trial 1
>>> Scaled TSC delta: 3000328845
>>> Monotonic raw delta: 3000329117
>>> Ran for 3 seconds with 272 ns skew
>>>
>>> Running for 3 seconds trial 2
>>> Scaled TSC delta: 3000295209
>>> Monotonic raw delta: 3000295482
>>> Ran for 3 seconds with 273 ns skew
>>>
>>> Running for 3 seconds trial 3
>>> Scaled TSC delta: 3000262870
>>> Monotonic raw delta: 3000263142
>>> Ran for 3 seconds with 272 ns skew
>>>
>>> Running for 300 seconds trial 4
>>> Scaled TSC delta: 30281725
>>> Monotonic raw delta: 30308905
>>> Ran for 300 seconds with 27180 ns skew
>>>
>>> The skew between tsc and monotonic raw is about 91 PPB.
>>>
>>> System Information:
>>>
>>> CPU model string: Intel(R) Core(TM) i5-6600 CPU @ 3.30GHz
>>> Kernel version tested: 4.14.71-rt44
>>>   NOTE: The skew seems to be insensitive to kernel version after
>>>   introduction of TSC_KNOWN_FREQ capability
>>>
>>> >From CPUID[15H]:
>>>   Time Stamp Counter/Core Crystal Clock Information (0x15):
>>>   TSC/clock ratio = 276/2
>>>   nominal core crystal clock = 2400 Hz (table lookup)
>>>
>>> TSC kHz used to calculate mult/shift value: 3312000
>
> So, just to understand, your saying the problem that we calculate a
> tsc_khz value before calculating the mult/shift and the intermediate
> step is losing some precision?
>
> Or is the cause from something else?

The other potential cause here might be just that when we calculate
the mult/shift pair, we select a shift small enough that avoids the
multiplication from overflowing if we have a long timerval. So there
is liklely always some granularity error converting to mult/shift
pair.

thanks
-john


Re: TSC to Mono-raw Drift

2018-10-19 Thread John Stultz
On Fri, Oct 19, 2018 at 11:34 AM, John Stultz  wrote:
> On Fri, Oct 19, 2018 at 8:25 AM, Thomas Gleixner  wrote:
>> Christopher,
>>
>> Please Cc LKML on such issues in the future.
>>
>> On Mon, 15 Oct 2018, Christopher Hall wrote:
>>
>> Leaving context around for new readers:
>>
>>> Problem Statement:
>>>
>>> The TSC clocksource mult/shift values are derived from CPUID[15H], but the
>>> monotonic raw clock value is not equal to TSC in nominal nanoseconds, i.e.
>>> the timekeeping code is not accurately transforming TSC ticks to nominal
>>> nanoseconds based on CPUID[15H}.
>>>
>>> The included code calculates the drift between nominal TSC nanoseconds and
>>> the monotonic raw clock.
>>>
>>> Background:
>>>
>>> Starting with 6th generation Intel CPUs, the TSC is "phase locked" to the
>>> Always Running Timer (ART). The relation between TSC and ART is read from
>>> CPUID[15H]. Details of the TSC-ART relation are in the "Invariant
>>> Timekeeping" section of the SDM.
>>>
>>> CPUID[15H].ECX returns the nominal frequency of ART (or crystal frequency).
>>> CPU feature TSC_KNOWN_FREQ indicates that tsc_khz (tsc.c) is derived from
>>> CPUID[15H]. The calculation is in tsc.c:native_calibrate_tsc().
>>>
>>> When the TSC clocksource is selected, the timekeeping code uses mult/shift
>>> values to transform TSC into nanoseconds. The mult/shift value is determined
>>> using tsc_khz.
>>>
>>> Example Output:
>>>
>>> Running for 3 seconds trial 1
>>> Scaled TSC delta: 3000328845
>>> Monotonic raw delta: 3000329117
>>> Ran for 3 seconds with 272 ns skew
>>>
>>> Running for 3 seconds trial 2
>>> Scaled TSC delta: 3000295209
>>> Monotonic raw delta: 3000295482
>>> Ran for 3 seconds with 273 ns skew
>>>
>>> Running for 3 seconds trial 3
>>> Scaled TSC delta: 3000262870
>>> Monotonic raw delta: 3000263142
>>> Ran for 3 seconds with 272 ns skew
>>>
>>> Running for 300 seconds trial 4
>>> Scaled TSC delta: 30281725
>>> Monotonic raw delta: 30308905
>>> Ran for 300 seconds with 27180 ns skew
>>>
>>> The skew between tsc and monotonic raw is about 91 PPB.
>>>
>>> System Information:
>>>
>>> CPU model string: Intel(R) Core(TM) i5-6600 CPU @ 3.30GHz
>>> Kernel version tested: 4.14.71-rt44
>>>   NOTE: The skew seems to be insensitive to kernel version after
>>>   introduction of TSC_KNOWN_FREQ capability
>>>
>>> >From CPUID[15H]:
>>>   Time Stamp Counter/Core Crystal Clock Information (0x15):
>>>   TSC/clock ratio = 276/2
>>>   nominal core crystal clock = 2400 Hz (table lookup)
>>>
>>> TSC kHz used to calculate mult/shift value: 3312000
>
> So, just to understand, your saying the problem that we calculate a
> tsc_khz value before calculating the mult/shift and the intermediate
> step is losing some precision?
>
> Or is the cause from something else?

The other potential cause here might be just that when we calculate
the mult/shift pair, we select a shift small enough that avoids the
multiplication from overflowing if we have a long timerval. So there
is liklely always some granularity error converting to mult/shift
pair.

thanks
-john


Re: TSC to Mono-raw Drift

2018-10-19 Thread John Stultz
On Fri, Oct 19, 2018 at 8:25 AM, Thomas Gleixner  wrote:
> Christopher,
>
> Please Cc LKML on such issues in the future.
>
> On Mon, 15 Oct 2018, Christopher Hall wrote:
>
> Leaving context around for new readers:
>
>> Problem Statement:
>>
>> The TSC clocksource mult/shift values are derived from CPUID[15H], but the
>> monotonic raw clock value is not equal to TSC in nominal nanoseconds, i.e.
>> the timekeeping code is not accurately transforming TSC ticks to nominal
>> nanoseconds based on CPUID[15H}.
>>
>> The included code calculates the drift between nominal TSC nanoseconds and
>> the monotonic raw clock.
>>
>> Background:
>>
>> Starting with 6th generation Intel CPUs, the TSC is "phase locked" to the
>> Always Running Timer (ART). The relation between TSC and ART is read from
>> CPUID[15H]. Details of the TSC-ART relation are in the "Invariant
>> Timekeeping" section of the SDM.
>>
>> CPUID[15H].ECX returns the nominal frequency of ART (or crystal frequency).
>> CPU feature TSC_KNOWN_FREQ indicates that tsc_khz (tsc.c) is derived from
>> CPUID[15H]. The calculation is in tsc.c:native_calibrate_tsc().
>>
>> When the TSC clocksource is selected, the timekeeping code uses mult/shift
>> values to transform TSC into nanoseconds. The mult/shift value is determined
>> using tsc_khz.
>>
>> Example Output:
>>
>> Running for 3 seconds trial 1
>> Scaled TSC delta: 3000328845
>> Monotonic raw delta: 3000329117
>> Ran for 3 seconds with 272 ns skew
>>
>> Running for 3 seconds trial 2
>> Scaled TSC delta: 3000295209
>> Monotonic raw delta: 3000295482
>> Ran for 3 seconds with 273 ns skew
>>
>> Running for 3 seconds trial 3
>> Scaled TSC delta: 3000262870
>> Monotonic raw delta: 3000263142
>> Ran for 3 seconds with 272 ns skew
>>
>> Running for 300 seconds trial 4
>> Scaled TSC delta: 30281725
>> Monotonic raw delta: 30308905
>> Ran for 300 seconds with 27180 ns skew
>>
>> The skew between tsc and monotonic raw is about 91 PPB.
>>
>> System Information:
>>
>> CPU model string: Intel(R) Core(TM) i5-6600 CPU @ 3.30GHz
>> Kernel version tested: 4.14.71-rt44
>>   NOTE: The skew seems to be insensitive to kernel version after
>>   introduction of TSC_KNOWN_FREQ capability
>>
>> >From CPUID[15H]:
>>   Time Stamp Counter/Core Crystal Clock Information (0x15):
>>   TSC/clock ratio = 276/2
>>   nominal core crystal clock = 2400 Hz (table lookup)
>>
>> TSC kHz used to calculate mult/shift value: 3312000

So, just to understand, your saying the problem that we calculate a
tsc_khz value before calculating the mult/shift and the intermediate
step is losing some precision?

Or is the cause from something else?

thanks
-john


Re: TSC to Mono-raw Drift

2018-10-19 Thread John Stultz
On Fri, Oct 19, 2018 at 8:25 AM, Thomas Gleixner  wrote:
> Christopher,
>
> Please Cc LKML on such issues in the future.
>
> On Mon, 15 Oct 2018, Christopher Hall wrote:
>
> Leaving context around for new readers:
>
>> Problem Statement:
>>
>> The TSC clocksource mult/shift values are derived from CPUID[15H], but the
>> monotonic raw clock value is not equal to TSC in nominal nanoseconds, i.e.
>> the timekeeping code is not accurately transforming TSC ticks to nominal
>> nanoseconds based on CPUID[15H}.
>>
>> The included code calculates the drift between nominal TSC nanoseconds and
>> the monotonic raw clock.
>>
>> Background:
>>
>> Starting with 6th generation Intel CPUs, the TSC is "phase locked" to the
>> Always Running Timer (ART). The relation between TSC and ART is read from
>> CPUID[15H]. Details of the TSC-ART relation are in the "Invariant
>> Timekeeping" section of the SDM.
>>
>> CPUID[15H].ECX returns the nominal frequency of ART (or crystal frequency).
>> CPU feature TSC_KNOWN_FREQ indicates that tsc_khz (tsc.c) is derived from
>> CPUID[15H]. The calculation is in tsc.c:native_calibrate_tsc().
>>
>> When the TSC clocksource is selected, the timekeeping code uses mult/shift
>> values to transform TSC into nanoseconds. The mult/shift value is determined
>> using tsc_khz.
>>
>> Example Output:
>>
>> Running for 3 seconds trial 1
>> Scaled TSC delta: 3000328845
>> Monotonic raw delta: 3000329117
>> Ran for 3 seconds with 272 ns skew
>>
>> Running for 3 seconds trial 2
>> Scaled TSC delta: 3000295209
>> Monotonic raw delta: 3000295482
>> Ran for 3 seconds with 273 ns skew
>>
>> Running for 3 seconds trial 3
>> Scaled TSC delta: 3000262870
>> Monotonic raw delta: 3000263142
>> Ran for 3 seconds with 272 ns skew
>>
>> Running for 300 seconds trial 4
>> Scaled TSC delta: 30281725
>> Monotonic raw delta: 30308905
>> Ran for 300 seconds with 27180 ns skew
>>
>> The skew between tsc and monotonic raw is about 91 PPB.
>>
>> System Information:
>>
>> CPU model string: Intel(R) Core(TM) i5-6600 CPU @ 3.30GHz
>> Kernel version tested: 4.14.71-rt44
>>   NOTE: The skew seems to be insensitive to kernel version after
>>   introduction of TSC_KNOWN_FREQ capability
>>
>> >From CPUID[15H]:
>>   Time Stamp Counter/Core Crystal Clock Information (0x15):
>>   TSC/clock ratio = 276/2
>>   nominal core crystal clock = 2400 Hz (table lookup)
>>
>> TSC kHz used to calculate mult/shift value: 3312000

So, just to understand, your saying the problem that we calculate a
tsc_khz value before calculating the mult/shift and the intermediate
step is losing some precision?

Or is the cause from something else?

thanks
-john


<    1   2   3   4   5   6   7   8   9   10   >