date:20190820

Re: [PATCH v8 2/3] fdt: add support for rng-seed

2019-08-20 Thread Hsin-Yi Wang

Then we'd still use add_device_randomness(), in case the bootloader
provides weak entropy.

On Tue, Aug 20, 2019 at 7:14 PM Ard Biesheuvel
 wrote:
>
> On Tue, 20 Aug 2019 at 10:43, Hsin-Yi Wang  wrote:
> >
> > Hi Ted,
> >
> > Thanks for raising this question.
> >
> > For UEFI-based systems, there is a config table that carries an RNG seed
> > which can be passed to device randomness. However, they also use
> > add_device_randomness() (not sure if it's for the same reason: that they
> > can't guarantee that _all_ bootloaders can be trusted).
>
> The config table is actually a Linux invention: it is populated by the
> EFI stub code (which is part of the kernel) based on the output of a
> call into the EFI_RNG_PROTOCOL, which is defined in the UEFI spec, but
> optional and not widely available.
>
> I have opted for add_device_randomness() since there is no way to
> establish the quality level of the output of EFI_RNG_PROTOCOL, and so
> it is currently only used to prevent the bootup state of the entropy
> pool from being too predictable, and the output does not contribute to
> the entropy estimate kept by the RNG core.
>
>
> > This patch lets DT-based systems have a similar feature, which
> > can make the initial random numbers stronger. (We only care about the
> > initial situation here, since more entropy will be added to the kernel
> > as time goes on.)
> >
> > Conservatively, we can use add_device_randomness() as well, which
> > would pass the buffer to crng_slow_load() instead of crng_fast_load().
> > But I think we should trust the bootloader here. Whoever wants to use this
> > feature should make sure their bootloader can pass valid (random
> > enough) seeds. If they are not sure, they can simply not add the
> > property to the DT.
>
> It is the firmware that adds the property to the DT, not the user.
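
The distinction discussed in this thread, between mixing a seed into the
pool without crediting it and trusting it enough to raise the entropy
estimate, can be illustrated with a small userspace model. This is only a
sketch under stated assumptions: struct entropy_pool, pool_mix(),
mix_untrusted_seed() and mix_trusted_seed() are hypothetical names, the
mixing step is a toy FNV-1a style hash, and none of it is the kernel's
actual RNG code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an entropy pool; not the kernel's data structure. */
struct entropy_pool {
	uint32_t state;             /* mixed-in bytes end up here */
	unsigned int credited_bits; /* entropy estimate kept alongside the state */
};

/* Toy mixing step (FNV-1a style); a real pool uses a cryptographic construction. */
static void pool_mix(struct entropy_pool *p, const uint8_t *buf, size_t len)
{
	assert(p != NULL && (buf != NULL || len == 0));

	for (size_t i = 0; i < len; i++)
		p->state = (p->state ^ buf[i]) * 16777619u;
}

/* Untrusted source: the bytes perturb the state, the estimate is unchanged. */
static void mix_untrusted_seed(struct entropy_pool *p, const uint8_t *buf, size_t len)
{
	pool_mix(p, buf, len);
}

/* Trusted source: same mixing, but every byte is credited as 8 bits of entropy. */
static void mix_trusted_seed(struct entropy_pool *p, const uint8_t *buf, size_t len)
{
	pool_mix(p, buf, len);
	p->credited_bits += (unsigned int)(len * 8);
}
```

In this model a weak seed passed through mix_untrusted_seed() can never
inflate the estimate, which is the conservative behaviour argued for above;
mix_trusted_seed() is only appropriate when the firmware is known to provide
good randomness.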

Re: [RFC PATCH v4 4/9] printk-rb: initialize new descriptors as invalid

2019-08-20 Thread John Ogness

On 2019-08-20, Petr Mladek  wrote:
>> Initialize never-used descriptors as permanently invalid so there
>
> The word "permanently" is confusing. It suggests that it will
> never ever be valid again. I would just remove the word.

Agreed.

>> is no risk of the descriptor unexpectedly being determined as
>> valid due to dataring head overflowing/wrapping.
>
> Please, provide more details about the solved race.

OK.

> Is it because some reader could have reference to an invalid
> (reused) descriptor?

Yes, but not because it is reused. If a writer succeeded in reserving a
descriptor, but failed to reserve a datablock, that (invalid) descriptor
is put on the committed list (see fA). By setting the lpos values to
something that could _never_ be valid, there is no risk of the
descriptor suddenly becoming valid due to head overflowing.

My RFCv2 did not account for this and instead invalid descriptors just
held on to whatever lpos values they last had. Although they are invalid
at that moment, if not set to something "permanently" invalid, those
values could become valid again. We talked about that here[0].

> Can these invalid descriptors be members of the list?

Yes (as Sergey shows in his followup post). Readers see them as invalid
and treat them as dropped records.

> Also it might be worth mentioning where the check is that might
> detect such invalid descriptors and what the consequences will be.
> Well, this might be clear from the race description.

The check itself is not special. However, readers do have to be aware of
and correctly handle the case of invalid descriptors on the list. I will
find an appropriate place to document this.

John Ogness

[0] https://lkml.kernel.org/r/20190624140948.l7ekcmz5ser3z...@pathway.suse.cz

RE: [EXT] Re: [PATCH net-next 0/1] net: fec: add C45 MDIO read/write support

2019-08-20 Thread Andy Duan

From: Andrew Lunn  Sent: Tuesday, August 20, 2019 9:04 PM
> On Tue, Aug 20, 2019 at 02:32:26AM +, Andy Duan wrote:
> > From: Andrew Lunn 
> > > On Mon, Aug 19, 2019 at 05:11:14PM +, Marco Hartmann wrote:
> > > > As of yet, the Fast Ethernet Controller (FEC) driver only supports
> > > > Clause 22 conform MDIO transactions. IEEE 802.3ae Clause 45
> > > > defines a modified MDIO protocol that uses a two staged access
> > > > model in order to increase the address space.
> > > >
> > > > This patch adds support for Clause 45 conform MDIO read and write
> > > > operations to the FEC driver.
> > >
> > > Hi Marco
> > >
> > > Do all versions of the FEC hardware support C45? Or do we need to
> > > make use of the quirk support in this driver to just enable it for some
> > > revisions of FEC?
> > >
> > > Thanks
> > > Andrew
> >
> > i.MX legacy platforms such as the i.MX6/7 series don't support Write & Read
> > Increment.
> > But the i.MX8MQ/MM series supports the full C45 feature set, including
> > Write & Read Increment.
> >
> > For the patch itself, it doesn't support Write & Read Increment, so I
> > think the patch doesn't need to add quirk support.
> 
> Hi Andy
> 
> So what happens with something older than an i.MX8MQ/MM when a C45
> transfer is attempted? This patch adds a new write. Does that write
> immediately trigger a completion interrupt? Does it never trigger an
> interrupt, and we have to wait FEC_MII_TIMEOUT?
> 
> Ideally, if the hardware does not support C45, we want it to return
> EOPNOTSUPP.
> 
> Thanks
> Andrew

It still triggers an interrupt to wake up the completion, we have to wait
FEC_MII_TIMEOUT.
Older chips support only part of the C45 feature set, just like the patch
implementation.
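
For readers unfamiliar with the two-staged Clause 45 access mentioned in the
patch description, the sketch below builds the 32-bit management frames for
the address stage and the following read or write stage. The field layout
follows the IEEE 802.3 management frame format; the macro and function names
are illustrative assumptions and are not taken from the FEC driver or from
the patch under review.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions of the 32-bit MDIO management frame (names are illustrative). */
#define MDIO_ST(v)	(((uint32_t)(v) & 0x3u) << 30)	/* start of frame */
#define MDIO_OP(v)	(((uint32_t)(v) & 0x3u) << 28)	/* operation code */
#define MDIO_PA(v)	(((uint32_t)(v) & 0x1fu) << 23)	/* port (PHY) address */
#define MDIO_RA(v)	(((uint32_t)(v) & 0x1fu) << 18)	/* C45: device address */
#define MDIO_TA		((uint32_t)0x2u << 16)		/* turnaround bits "10" */
#define MDIO_DATA(v)	((uint32_t)(v) & 0xffffu)	/* address or data */

#define MDIO_ST_C45		0x0	/* Clause 45 start pattern "00" */
#define MDIO_C45_OP_ADDR	0x0	/* stage 1: latch the register address */
#define MDIO_C45_OP_WRITE	0x1	/* stage 2: write data */
#define MDIO_C45_OP_READ	0x3	/* stage 2: read data */

/* Shared header of every Clause 45 frame for a given port/device pair. */
static uint32_t c45_header(unsigned int op, unsigned int prtad, unsigned int devad)
{
	assert(prtad < 32 && devad < 32);
	return MDIO_ST(MDIO_ST_C45) | MDIO_OP(op) | MDIO_PA(prtad) |
	       MDIO_RA(devad) | MDIO_TA;
}

/* Stage 1: tell the device which 16-bit register the next frame refers to. */
static uint32_t c45_addr_frame(unsigned int prtad, unsigned int devad, uint16_t reg)
{
	return c45_header(MDIO_C45_OP_ADDR, prtad, devad) | MDIO_DATA(reg);
}

/* Stage 2 (read): the data field is filled in by the PHY. */
static uint32_t c45_read_frame(unsigned int prtad, unsigned int devad)
{
	return c45_header(MDIO_C45_OP_READ, prtad, devad);
}

/* Stage 2 (write): the data field carries the value to store. */
static uint32_t c45_write_frame(unsigned int prtad, unsigned int devad, uint16_t val)
{
	return c45_header(MDIO_C45_OP_WRITE, prtad, devad) | MDIO_DATA(val);
}
```

A Clause 45 register read is therefore two bus transactions: the frame from
c45_addr_frame() followed by the frame from c45_read_frame(). Each stage
would need its own completion wait, which is why the timeout behaviour on
older hardware matters in the discussion above.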

[PATCH] kprobes/x86: use instruction_pointer and instruction_pointer_set

2019-08-20 Thread Jisheng Zhang

Use the arch-independent helpers to get/set the instruction pointer, so
that the x86 kprobe_ftrace_handler() becomes more generic.

Signed-off-by: Jisheng Zhang 
Acked-by: Masami Hiramatsu 
---
 arch/x86/kernel/kprobes/ftrace.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c
index 681a4b36e9bb..c2ad0b9259ca 100644
--- a/arch/x86/kernel/kprobes/ftrace.c
+++ b/arch/x86/kernel/kprobes/ftrace.c
@@ -28,9 +28,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 	if (kprobe_running()) {
 		kprobes_inc_nmissed_count(p);
 	} else {
-		unsigned long orig_ip = regs->ip;
+		unsigned long orig_ip = instruction_pointer(regs);
 		/* Kprobe handler expects regs->ip = ip + 1 as breakpoint hit */
-		regs->ip = ip + sizeof(kprobe_opcode_t);
+		instruction_pointer_set(regs, ip + sizeof(kprobe_opcode_t));
 
 		__this_cpu_write(current_kprobe, p);
 		kcb->kprobe_status = KPROBE_HIT_ACTIVE;
@@ -39,12 +39,13 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 				 * Emulate singlestep (and also recover regs->ip)
 				 * as if there is a 5byte nop
 				 */
-				regs->ip = (unsigned long)p->addr + MCOUNT_INSN_SIZE;
+				instruction_pointer_set(regs,
+					(unsigned long)p->addr + MCOUNT_INSN_SIZE);
 				if (unlikely(p->post_handler)) {
 					kcb->kprobe_status = KPROBE_HIT_SSDONE;
 					p->post_handler(p, regs, 0);
 				}
-				regs->ip = orig_ip;
+				instruction_pointer_set(regs, orig_ip);
 			}
 			/*
 			 * If pre_handler returns !0, it changes regs->ip. We have to
-- 
2.23.0.rc1
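
The save, override, restore pattern that the patch converts to accessor
calls can be modelled in userspace as follows. This is a sketch only:
struct pt_regs_model and run_with_fake_ip() are hypothetical, and the two
accessors merely mirror the shape of the kernel helpers of the same name for
a register layout that has a single "ip" field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register layout; a real struct pt_regs differs per architecture. */
struct pt_regs_model {
	unsigned long ip;
};

/* Accessors hide which field actually holds the instruction pointer. */
static inline unsigned long instruction_pointer(struct pt_regs_model *regs)
{
	return regs->ip;
}

static inline void instruction_pointer_set(struct pt_regs_model *regs,
					    unsigned long val)
{
	regs->ip = val;
}

/*
 * Generic code written only against the accessors: expose a temporary
 * instruction pointer (as a handler would see it), then restore the
 * original value. Returns the value that was visible in between.
 */
static unsigned long run_with_fake_ip(struct pt_regs_model *regs,
				      unsigned long fake_ip)
{
	unsigned long orig_ip;
	unsigned long seen;

	assert(regs != NULL);

	orig_ip = instruction_pointer(regs);
	instruction_pointer_set(regs, fake_ip);
	seen = instruction_pointer(regs);
	instruction_pointer_set(regs, orig_ip);

	return seen;
}
```

Because run_with_fake_ip() never touches the field directly, only the two
accessors would need to change for a layout where the instruction pointer
has a different name.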

Re: assign_desc() barriers: Re: [RFC PATCH v4 1/9] printk-rb: add a new printk ringbuffer implementation

2019-08-20 Thread John Ogness

On 2019-08-20, Petr Mladek  wrote:
>> > --- /dev/null
>> > +++ b/kernel/printk/ringbuffer.c
>> > +/**
>> > + * assign_desc() - Assign a descriptor to the caller.
>> > + *
>> > + * @e: The entry structure to store the assigned descriptor to.
>> > + *
>> > + * Find an available descriptor to assign to the caller. First it is checked
>> > + * if the tail descriptor from the committed list can be recycled. If not,
>> > + * perhaps a never-used descriptor is available. Otherwise, data blocks will
>> > + * be invalidated until the tail descriptor from the committed list can be
>> > + * recycled.
>> > + *
>> > + * Assigned descriptors are invalid until data has been reserved for them.
>> > + *
>> > + * Return: true if a descriptor was assigned, otherwise false.
>> > + *
>> > + * This will only fail if it was not possible to invalidate data blocks in
>> > + * order to recycle a descriptor. This can happen if a writer has reserved but
>> > + * not yet committed data and that reserved data is currently the oldest data.
>> > + */
>> > +static bool assign_desc(struct prb_reserved_entry *e)
>> > +{
>> > +	struct printk_ringbuffer *rb = e->rb;
>> > +	struct prb_desc *d;
>> > +	struct nl_node *n;
>> > +	unsigned long i;
>> > +
>> > +	for (;;) {
>> > +		/*
>> > +		 * jA:
>> > +		 *
>> > +		 * Try to recycle a descriptor on the committed list.
>> > +		 */
>> > +		n = numlist_pop(&rb->nl);
>> > +		if (n) {
>> > +			d = container_of(n, struct prb_desc, list);
>> > +			break;
>> > +		}
>> > +
>> > +		/* Fallback to static never-used descriptors. */
>> > +		if (atomic_read(&rb->desc_next_unused) < DESCS_COUNT(rb)) {
>> > +			i = atomic_fetch_inc(&rb->desc_next_unused);
>> > +			if (i < DESCS_COUNT(rb)) {
>> > +				d = &rb->descs[i];
>> > +				atomic_long_set(&d->id, i);
>> > +				break;
>> > +			}
>> > +		}
>> > +
>> > +		/*
>> > +		 * No descriptor available. Make one available for recycling
>> > +		 * by invalidating data (which some descriptor will be
>> > +		 * referencing).
>> > +		 */
>> > +		if (!dataring_pop(&rb->dr))
>> > +			return false;
>> > +	}
>> > +
>> > +	/*
>> > +	 * jB:
>> > +	 *
>> > +	 * Modify the descriptor ID so that users of the descriptor see that
>> > +	 * it has been recycled. A _release() is used so that prb_getdesc()
>> > +	 * callers can see all data ringbuffer updates after issuing a
>> > +	 * pairing smp_rmb(). See iA for details.
>> > +	 *
>> > +	 * Memory barrier involvement:
>> > +	 *
>> > +	 * If dB->iA reads from jB, then dI reads the same value as
>> > +	 * jA->cD->hA.
>> > +	 *
>> > +	 * Relies on:
>> > +	 *
>> > +	 * RELEASE from jA->cD->hA to jB
>> > +	 *    matching
>> > +	 * RMB between dB->iA and dI
>> > +	 */
>> > +	atomic_long_set_release(&d->id, atomic_long_read(&d->id) +
>> > +					DESCS_COUNT(rb));
>> 
>> atomic_long_set_release() might be a bit confusing here.
>> There is no related acquire.

As the comment states, this release is for prb_getdesc() users. The only
prb_getdesc() user is _dataring_pop(). If getdesc() returns NULL
(i.e. the descriptor's ID is not what _dataring_pop() was expecting),
then the tail must have moved and _dataring_pop() needs to see
that. Since there are no data dependencies between descriptor ID and
tail_pos, an explicit memory barrier is used. More on this below.

>> In fact, d->id manipulation has barriers from both sides:
>> 
>>   + smp_rmb() before so that all reads are finished before
>> the id is updated (release)
>
> Uh, this statement does not make sense. The read barrier is not
> needed here. Instead the readers need it.
>
> Well, we might need a write barrier before d->id manipulation.
> It should be in numlist_pop() after successfully updating nl->tail_id.
> It will allow readers to detect that the descriptor is being reused
> (not in valid tail_id..head_id range) before we start manipulating it.
>
>>   + smp_wmb() after so that the new ID is written before other
>> related values are modified (acquire).
>> 
>> The smp_wmb() barrier is in prb_reserve(). I would move it here.
>
> This still makes sense. I would move the write barrier from
> prb_reserve() here.

The issue is that _dataring_pop() needs to see a moved dataring tail if
prb_getdesc() fails. Just because numlist_pop() succeeded, doesn't mean
that this was the task that changed the dataring tail. I.e. another CPU
could observe that this task changed the ID but _not_ yet see that
another task changed the dataring tail.

Issuing an smp_mb() before setting the new ID would also suffice,
but that is a pretty big hammer for something that a set_release can
take care of.
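
The ID recycling step itself (new ID = old ID + descriptor count, stored
with release semantics) can be sketched with C11 atomics. Everything here is
a simplified assumption for illustration: struct desc_model, descs_init(),
recycle_desc() and getdesc() are hypothetical, and the reader side uses an
acquire load where the patch pairs the release with an smp_rmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define DESC_COUNT_BITS	2
#define DESCS_COUNT	(1UL << DESC_COUNT_BITS)

/* Hypothetical descriptor: only the ID matters for this model. */
struct desc_model {
	atomic_ulong id;
};

static struct desc_model descs[DESCS_COUNT];

/* Never-used descriptors start with an ID equal to their array index. */
static void descs_init(void)
{
	for (unsigned long i = 0; i < DESCS_COUNT; i++)
		atomic_init(&descs[i].id, i);
}

/*
 * Recycle a descriptor: the new ID maps to the same array slot but no
 * longer matches the old ID. The release store orders all earlier
 * writes by this task before the ID change becomes visible.
 */
static unsigned long recycle_desc(unsigned long old_id)
{
	struct desc_model *d = &descs[old_id % DESCS_COUNT];
	unsigned long new_id = old_id + DESCS_COUNT;

	assert(atomic_load_explicit(&d->id, memory_order_relaxed) == old_id);
	atomic_store_explicit(&d->id, new_id, memory_order_release);

	return new_id;
}

/* Look up a descriptor by ID; NULL means it has been recycled meanwhile. */
static struct desc_model *getdesc(unsigned long id)
{
	struct desc_model *d = &descs[id % DESCS_COUNT];

	if (atomic_load_explicit(&d->id, memory_order_acquire) != id)
		return NULL;

	return d;
}
```

A caller holding a stale ID gets NULL from getdesc() and, thanks to the
release/acquire pairing in this model, also observes whatever the recycling
task had seen before it changed the ID.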

> Sigh, I have to admit that I am not familiar with the _acquire(),
>

Re: [linux-sunxi] [PATCH v5 09/15] clk: sunxi-ng: h6: Allow I2S to change parent rate

2019-08-20 Thread Code Kipper

Thanks. I've added it to my next patch series, but if you could add it
when applying that would be great.
BR,
CK

On Wed, 21 Aug 2019 at 06:07, Chen-Yu Tsai  wrote:
>
> On Wed, Aug 14, 2019 at 2:09 PM  wrote:
> >
> > From: Jernej Skrabec 
> >
> > I2S doesn't work if the parent rate can't be changed. The difference
> > between the wanted and actual rate is too big.
> >
> > Fix this by adding CLK_SET_RATE_PARENT flag to I2S clocks.
> >
> > Signed-off-by: Jernej Skrabec 
>
> This lacks your SoB. Please reply and I can add it when applying.
>
> ChenYu

Re: [RFC PATCH v4 8/9] printk-rb: new functionality to support printk

2019-08-20 Thread John Ogness

On 2019-08-20, Sergey Senozhatsky  wrote:
> [..]
>> +void prb_init(struct printk_ringbuffer *rb, char *data, int data_size_bits,
>> +	      struct prb_desc *descs, int desc_count_bits,
>> +	      struct wait_queue_head *waitq)
>> +{
>> +	struct dataring *dr = &rb->dr;
>> +	struct numlist *nl = &rb->nl;
>> +
>> +	rb->desc_count_bits = desc_count_bits;
>> +	rb->descs = descs;
>> +	atomic_long_set(&descs[0].id, 0);
>> +	descs[0].desc.begin_lpos = 1;
>> +	descs[0].desc.next_lpos = 1;
>
> dataring_desc_init(), perhaps?

Agreed.

Re: comments style: Re: [RFC PATCH v4 1/9] printk-rb: add a new printk ringbuffer implementation

2019-08-20 Thread John Ogness

On 2019-08-20, Sergey Senozhatsky  wrote:
> [..]
>> > +   *
>> > +   * Memory barrier involvement:
>> > +   *
>> > +   * If dB reads from gA, then dC reads from fG.
>> > +   * If dB reads from gA, then dD reads from fH.
>> > +   * If dB reads from gA, then dE reads from fE.
>> > +   *
>> > +   * Note that if dB reads from gA, then dC cannot read from fC.
>> > +   * Note that if dB reads from gA, then dD cannot read from fD.
>> > +   *
>> > +   * Relies on:
>> > +   *
>> > +   * RELEASE from fG to gA
>> > +   *matching
>> > +   * ADDRESS DEP. from dB to dC
>> > +   *
>> > +   * RELEASE from fH to gA
>> > +   *matching
>> > +   * ADDRESS DEP. from dB to dD
>> > +   *
>> > +   * RELEASE from fE to gA
>> > +   *matching
>> > +   * ACQUIRE from dB to dE
>> > +   */
>> 
>> But I am not sure how much this is useful. It would take ages to decrypt
>> all these shortcuts (signs) and translate them into something
>> human readable. Also it might get outdated easily.
>> 
>> That said, I haven't found yet if there was a system in all
>> the shortcuts. I mean if they can be decrypted easily
>> out of head. Also I am not familiar with the notation
>> of the dependencies.
>
> Does not appear to be systematic to me, but maybe I'm missing something
> obvious. For chains like
>
>   jA->cD->hA to jB
>
> I haven't found anything better than just git grep jA kernel/printk/
> so far.

I really struggled to find a way to label the code in order to document
the memory barriers. By grepping on "jA:" you will land at the exact
location.

> But once you'll grep for label cD, for instance, you'd see
> that it's not defined. It's mentioned but not defined
>
>   kernel/printk/ringbuffer.c:  * jA->cD->hA.
>   kernel/printk/ringbuffer.c:  * RELEASE from jA->cD->hA to jB

I tried to be very careful about the labeling, but you just found an
error. cD is supposed to be cC. (I probably refactored the labels and
missed this one.) I was particularly unhappy with referencing labels
from other files (which is the case with cC). This is one area where I
think it would be really helpful if the kernel guidelines had some
format.

The labels are necessary for the technical documentation of the
barriers. And, after spending much time in this, I find them very
useful. But I agree that there needs to be a better way to assign label
names.

FWIW, I chose a lowercase letter for each function and an uppercase
letter for each label within that function. The camel case (followed by
the colon) created a pair that was unique for grepping.

Petr, in case you missed it, this comment language came from my
discussion[0] with AndreaP.

> I was thinking about renaming labels. E.g.
>
>   dataring_desc_init()
>   {
>   /* di1 */
>   WRITE_ONCE(desc->begin_lpos, 1);
>   /* di2 */
>   WRITE_ONCE(desc->next_lpos, 1);
>   }
>
> Where di stands for descriptor init.
>
>   dataring_push()
>   {
>   /* dp1 */
>   ret = get_new_lpos(dr, size, &begin_lpos, &next_lpos);
>   ...
>   /* dp2 */
>   smp_mb();
>   ...
>   }
>
> Where dp stands for descriptor push. For dataring we can add a 'dr'
> prefix, to avoid confusion with desc barriers, which have 'd' prefix.
> And so on. Dunno.

Yeah, I spent a lot of time going in circles on this one.

I hope that we can agree that the labels are important. And that a
formal documentation of the barriers is also important. Yes, they are a
lot of work, but I find it makes it a lot easier to go back to the code
after I've been away for a while. Even now, as I go through your
feedback on code that I wrote over a month ago, I find the formal
comments critical to quickly understand _exactly_ why the memory
barriers exist.

Perhaps we should choose labels that are more clear, like:

dataring_push:A
dataring_push:B

Then we would see comments like:

Memory barrier involvement:

If _dataring_pop:B reads from dataring_datablock_setid:A, then
_dataring_pop:C reads from dataring_push:G.

If _dataring_pop:B reads from dataring_datablock_setid:A, then
_dataring_pop:D reads from dataring_push:H.

If _dataring_pop:B reads from dataring_datablock_setid:A, then
_dataring_pop:E reads from dataring_push:E.

Note that if _dataring_pop:B reads from dataring_datablock_setid:A, then
_dataring_pop:C cannot read from dataring_push:C->dataring_desc_init:A.

Note that if _dataring_pop:B reads from dataring_datablock_setid:A, then
_dataring_pop:D cannot read from dataring_push:C->dataring_desc_init:B.

Relies on:

RELEASE from dataring_push:G to dataring_datablock_setid:A
   matching
ADDRESS DEP. from _dataring_pop:B to _dataring_pop:C

RELEASE from dataring_push:H to dataring_datablock_setid:A
   matching
ADDRESS DEP. from _dataring_pop:B to _dataring_pop:D

RELEASE from dataring_push:E to

Re: ##freemail## Re: [PATCH v2] mm: hwpoison: disable memory error handling on 1GB hugepage

2019-08-20 Thread Naoya Horiguchi

On Tue, Aug 20, 2019 at 03:03:55PM +0800, Wanpeng Li wrote:
> Cc Mel Gorman, Kirill, Dave Hansen,
> On Tue, 11 Jun 2019 at 07:51, Naoya Horiguchi  
> wrote:
> >
> > On Wed, May 29, 2019 at 04:31:01PM -0700, Mike Kravetz wrote:
> > > On 5/28/19 2:49 AM, Wanpeng Li wrote:
> > > > Cc Paolo,
> > > > Hi all,
> > > > On Wed, 14 Feb 2018 at 06:34, Mike Kravetz  
> > > > wrote:
> > > >>
> > > >> On 02/12/2018 06:48 PM, Michael Ellerman wrote:
> > > >>> Andrew Morton  writes:
> > > >>>
> > >  On Thu, 08 Feb 2018 12:30:45 + Punit Agrawal 
> > >   wrote:
> > > 
> > > >>
> > > >> So I don't think that the above test result means that errors are 
> > > >> properly
> > > >> handled, and the proposed patch should help for arm64.
> > > >
> > > > Although, the deviation of pud_huge() avoids a kernel crash the code
> > > > would be easier to maintain and reason about if arm64 helpers are
> > > > consistent with expectations by core code.
> > > >
> > > > I'll look to update the arm64 helpers once this patch gets merged. 
> > > > But
> > > > it would be helpful if there was a clear expression of semantics for
> > > > pud_huge() for various cases. Is there any version that can be used 
> > > > as
> > > > reference?
> > > 
> > >  Is that an ack or tested-by?
> > > 
> > >  Mike keeps plaintively asking the powerpc developers to take a look,
> > >  but they remain steadfastly in hiding.
> > > >>>
> > > >>> Cc'ing linuxppc-dev is always a good idea :)
> > > >>>
> > > >>
> > > >> Thanks Michael,
> > > >>
> > > >> I was mostly concerned about use cases for soft/hard offline of huge pages
> > > >> larger than PMD_SIZE on powerpc.  I know that powerpc supports PGD_SIZE
> > > >> huge pages, and soft/hard offline support was specifically added for this.
> > > >> See, 94310cbcaa3c "mm/madvise: enable (soft|hard) offline of HugeTLB pages
> > > >> at PGD level"
> > > >>
> > > >> This patch will disable that functionality.  So, at a minimum this is a
> > > >> 'heads up'.  If there are actual use cases that depend on this, then more
> > > >> work/discussions will need to happen.  From the e-mail thread on PGD_SIZE
> > > >> support, I can not tell if there is a real use case or this is just a
> > > >> 'nice to have'.
> > > >
> > > > > 1GB hugetlbfs pages are used by DPDK and VMs in cloud deployments; we
> > > > > have hit a gup_pud_range() panic several times in our production
> > > > > environment. Is there any plan to re-enable this and fix the arch code?
> > >
> > > > I too am aware of slightly more interest in 1G huge pages.  Suspect that as
> > > > Intel MMU capacity increases to handle more TLB entries there will be more
> > > > and more interest.
> > > >
> > > > Personally, I am not looking at this issue.  Perhaps Naoya will comment as
> > > > he knows the most about this code.
> >
> > Thanks for forwarding this to me. I feel that memory error handling
> > on 1GB hugepages is demanded by real use cases.
> >
> > >
> > > > In addition, 
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kvm/mmu.c#n3213
> > > > The memory in guest can be 1GB/2MB/4K, though the host-backed memory
> > > > are 1GB hugetlbfs pages, after above PUD panic is fixed,
> > > > try_to_unmap() which is called in MCA recovery path will mark the PUD
> > > > hwpoison entry. The guest will vmexit and retry endlessly when
> > > > accessing any memory in the guest which is backed by this 1GB poisoned
> > > > > hugetlbfs page. We have a plan to split this 1GB hugetlbfs page by 2MB
> > > > hugetlbfs pages/4KB pages, maybe file remap to a virtual address range
> > > > which is 2MB/4KB page granularity, also split the KVM MMU 1GB SPTE
> > > > into 2MB/4KB and mark the offensive SPTE w/ a hwpoison flag, a sigbus
> > > > will be delivered to VM at page fault next time for the offensive
> > > > SPTE. Is this proposal acceptable?
> > >
> > > > I am not sure of the error handling design, but this does sound reasonable.
> >
> > I agree that that's better.
> >
> > > That block of code which potentially dissolves a huge page on memory error
> > > is hard to understand and I'm not sure if that is even the 'normal'
> > > functionality.  Certainly, we would hate to waste/poison an entire 1G page
> > > for an error on a small subsection.
> >
> > Yes, that's not practical, so we first need to establish the code base for
> > 2MB hugetlb splitting and then extend it to 1GB.
> 
> I found it is not easy to split. There is a unique hugetlb page size
> that is associated with a mounted hugetlbfs filesystem, file remap to
> 2MB/4KB will break this. How about hard offline 1GB hugetlb page as
> what has already done in soft offline, replace the corrupted 1GB page
> by new 1GB page through page migration, the offending/corrupted area
> in the original 1GB page doesn't need to be copied into the new page,
> the offending/corrupted area in

Re: comments style: Re: [RFC PATCH v4 1/9] printk-rb: add a new printk ringbuffer implementation

2019-08-20 Thread John Ogness

On 2019-08-20, Petr Mladek  wrote:
>> --- /dev/null
>> +++ b/kernel/printk/dataring.c
>> +/**
>> + * _datablock_valid() - Check if given positions yield a valid data block.
>> + *
>> + * @dr: The associated data ringbuffer.
>> + *
>> + * @head_lpos:  The newest data logical position.
>> + *
>> + * @tail_lpos:  The oldest data logical position.
>> + *
>> + * @begin_lpos: The beginning logical position of the data block to check.
>> + *
>> + * @next_lpos:  The logical position of the next adjacent data block.
>> + *  This value is used to identify the end of the data block.
>> + *
>
> Please remove the empty lines between arguments description. They make
> the comments too scattered.

Your feedback is contradicting what PeterZ requested[0]. Particularly
when multiple lines are involved with a description, I find the spacing
helpful. I've grown to like the spacing, but I won't fight for it.

>> +/*
>> + * dB:
>> + *
>> + * When a writer has completed accessing its data block, it sets the
>> + * @id thus making the data block available for invalidation. This
>> + * _acquire() ensures that this task sees all data ringbuffer and
>> + * descriptor values seen by the writer as @id was set. This is
>> + * necessary to ensure that the data block can be correctly identified
>> + * as valid (i.e. @begin_lpos, @next_lpos, @head_lpos are at least the
>> + * values seen by that writer, which yielded a valid data block at
>> + * that time). It is not enough to rely on the address dependency of
>> + * @desc to @id because @head_lpos is not dependent on @id. This pairs
>> + * with the _release() in dataring_datablock_setid().
>
> This human readable description is really useful.
>
>> + *
>> + * Memory barrier involvement:
>> + *
>> + * If dB reads from gA, then dC reads from fG.
>> + * If dB reads from gA, then dD reads from fH.
>> + * If dB reads from gA, then dE reads from fE.
>> + *
>> + * Note that if dB reads from gA, then dC cannot read from fC.
>> + * Note that if dB reads from gA, then dD cannot read from fD.
>> + *
>> + * Relies on:
>> + *
>> + * RELEASE from fG to gA
>> + *matching
>> + * ADDRESS DEP. from dB to dC
>> + *
>> + * RELEASE from fH to gA
>> + *matching
>> + * ADDRESS DEP. from dB to dD
>> + *
>> + * RELEASE from fE to gA
>> + *matching
>> + * ACQUIRE from dB to dE
>> + */
>
> But I am not sure how much this is useful.

When I was first implementing RFCv3, the "human-readable" text version
was very useful for me. However, now it is the formal descriptions that
I find more useful. They provide the proof and a far more detailed
description.

> It would take ages to decrypt all these shortcuts (signs) and
> translate them into something human readable. Also it might get
> outdated easily.
>
> That said, I haven't found yet if there was a system in all
> the shortcuts. I mean if they can be decrypted easily
> out of head. Also I am not familiar with the notation
> of the dependencies.

I'll respond to this part in Sergey's followup post.

> If this is really needed then I am really scared of some barriers
> that guard too many things. This one is a good example.
>
>> +	desc = dr->getdesc(smp_load_acquire(&db->id), dr->getdesc_arg);

The variable's value (in this case db->id) is doing the guarding. The
barriers ensure that db->id is read first (and set last).

>> +
>> +/* dD: */
>
> It would be great if all these shortcuts (signs) are followed with
> something human readable. Few words might be enough.

I'll respond to this part in Sergey's followup post.

>> +	next_lpos = READ_ONCE(desc->next_lpos);
>> +
>> +	if (!_datablock_valid(dr,
>> +			      /* dE: */
>> +			      atomic_long_read(&dr->head_lpos),
>> +			      tail_lpos, begin_lpos, next_lpos)) {
>> +		/* Another task has already invalidated the data block. */
>> +		goto out;
>> +	}
>> +
>> +
>> +++ b/kernel/printk/numlist.c
>> +bool numlist_read(struct numlist *nl, unsigned long id, unsigned long *seq,
>> +  unsigned long *next_id)
>> +
>> +struct nl_node *n;
>> +
>> +n = nl->node(id, nl->node_arg);
>> +if (!n)
>> +return false;
>> +
>> +if (seq) {
>> +/*
>> + * aA:
>> + *
>> + * Address dependency on @id.
>> + */
>
> This is too scattered. If we really need so many shortcuts (signs)
> then we should find a better style. The following looks perfectly
> fine to me:
>
>   /* aA: Address dependency on @id. */

I'll respond to this part in Sergey's followup post.

>> +*seq = READ_ONCE(n->seq);
>> +}
>> +
>> +if (next_id) {
>> +/*
>> + * aB:
>> + *
>> + * Address dependency on @id.
>> + */
>> +

Re: numlist_pop(): Re: [RFC PATCH v4 1/9] printk-rb: add a new printk ringbuffer implementation

2019-08-20 Thread John Ogness

On 2019-08-20, Petr Mladek  wrote:
>> --- /dev/null
>> +++ b/kernel/printk/numlist.c
>> +/**
>> + * numlist_pop() - Remove the oldest node from the list.
>> + *
>> + * @nl: The numbered list from which to remove the tail node.
>> + *
>> + * The tail node can only be removed if two conditions are satisfied:
>> + *
>> + * * The node is not the only node on the list.
>> + * * The node is not busy.
>> + *
>> + * If, during this function, another task removes the tail, this function
>> + * will try again with the new tail.
>> + *
>> + * Return: The removed node or NULL if the tail node cannot be removed.
>> + */
>> +struct nl_node *numlist_pop(struct numlist *nl)
>> +{
>> +	unsigned long tail_id;
>> +	unsigned long next_id;
>> +	unsigned long r;
>> +
>> +	/* cA: #1 */
>> +	tail_id = atomic_long_read(&nl->tail_id);
>> +
>> +	for (;;) {
>> +		/* cB */
>> +		while (!numlist_read(nl, tail_id, NULL, &next_id)) {
>> +			/*
>> +			 * @tail_id is invalid. Try again with an
>> +			 * updated value.
>> +			 */
>> +
>> +			cpu_relax();
>> +
>> +			/* cA: #2 */
>> +			tail_id = atomic_long_read(&nl->tail_id);
>> +		}
>
> The above while-cycle basically does the same as the upper for-cycle.
> It tries again with freshly loaded nl->tail_id. The following code
> looks easier to follow:
>
> 	do {
> 		tail_id = atomic_long_read(&nl->tail_id);
>
> 		/*
> 		 * Read might fail when the tail node has been removed
> 		 * and reused in parallel.
> 		 */
> 		if (!numlist_read(nl, tail_id, NULL, &next_id))
> 			continue;
>
> 		/* Make sure the node is not the only node on the list. */
> 		if (next_id == tail_id)
> 			return NULL;
>
> 		/* cC: Make sure the node is not busy. */
> 		if (nl->busy(tail_id, nl->busy_arg))
> 			return NULL;
>
> 	} while (atomic_long_cmpxchg_relaxed(&nl->tail_id, tail_id, next_id) !=
> 			tail_id);
>
> 	/* This should never fail. The node is ours. */
> 	return nl->node(tail_id, nl->node_arg);

You will see that pattern in several cmpxchg() loops. The reason I chose
to do it that way was so that I could make use of the return value of
the failed cmpxchg(). This avoids an unnecessary LOAD and establishes a
data dependency between the failed cmpxchg() and the following
numlist_read(). I suppose none of that matters since we only care about
the case where cmpxchg() is successful.

I agree that your variation is easier to read.

>> +		/* Make sure the node is not the only node on the list. */
>> +		if (next_id == tail_id)
>> +			return NULL;
>> +
>> +		/*
>> +		 * cC:
>> +		 *
>> +		 * Make sure the node is not busy.
>> +		 */
>> +		if (nl->busy(tail_id, nl->busy_arg))
>> +			return NULL;
>> +
>> +		r = atomic_long_cmpxchg_relaxed(&nl->tail_id,
>> +						tail_id, next_id);
>> +		if (r == tail_id)
>> +			break;
>> +
>> +		/* cA: #3 */
>> +		tail_id = r;
>> +	}
>> +
>> +	return nl->node(tail_id, nl->node_arg);
>
> If I get it correctly, the above nl->node() call should never fail.
> The node has been removed from the list and nobody else could
> touch it. It is pretty useful information and it might be worth
> mentioning it in a comment.

You are correct and I will add a comment.

> PS: I am scratching my head around the patchset. I'll try Peter's
> approach and comment on independent things in separate mails.

I think it is an excellent approach. Especially when discussing the
memory barriers.

John Ogness
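
The point about reusing the return value of a failed cmpxchg() maps directly
onto C11 atomics, where a failed compare-exchange writes the observed value
back into the "expected" argument. The following is a userspace sketch under
simplifying assumptions: tail_id, next_of[] and pop_tail() are hypothetical,
the list is a fixed array of next IDs, and the busy check of the real
numlist_pop() is omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NODE_COUNT 4

/* Hypothetical list: node i is followed by next_of[i]; the last node points to itself. */
static atomic_ulong tail_id = 0;
static const unsigned long next_of[NODE_COUNT] = { 1, 2, 3, 3 };

/* Remove the tail node. Returns false if it is the only node left. */
static bool pop_tail(unsigned long *popped)
{
	unsigned long tail;

	assert(popped != NULL);

	tail = atomic_load_explicit(&tail_id, memory_order_relaxed);

	for (;;) {
		unsigned long next = next_of[tail];

		/* Make sure the node is not the only node on the list. */
		if (next == tail)
			return false;

		/*
		 * On failure the currently stored tail is written into
		 * @tail, so the retry needs no separate load.
		 */
		if (atomic_compare_exchange_strong_explicit(&tail_id, &tail, next,
							     memory_order_relaxed,
							     memory_order_relaxed)) {
			*popped = tail;
			return true;
		}
	}
}
```

Calling pop_tail() repeatedly on this four-node list hands out IDs 0, 1 and
2 and then reports failure, because node 3 is the only node left.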

Re: [PATCH v2 2/2] phy: intel-lgm-emmc: Add support for eMMC PHY

2019-08-20 Thread Ramuthevar, Vadivel MuruganX


On 20/8/2019 9:59 PM, Andy Shevchenko wrote:

On Tue, Aug 20, 2019 at 04:56:02PM +0300, Andy Shevchenko wrote:

On Tue, Aug 20, 2019 at 06:31:33PM +0800, Ramuthevar,Vadivel MuruganX wrote:

+#define DR_TY_50OHM(x) ((~(x) << 28) & DR_TY_MASK)

For consistency it should be

#define DR_TY_SHIFT(x)  (((x) << 28) & DR_TY_MASK)

with explanation about 50 Ohm in the code below.

+   /* Drive impedance: 50 Ohm */

Nice, you have already a comment here. Just use DR_TY_SHIFT(1)

It should be DR_TY_SHIFT(6) now since I dropped the negation.


Thanks Andy, will update the review comments.

Best Regards
Vadivel
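
The remark that the argument becomes 6 once the negation is dropped can be
checked with a small userspace program. This is a sketch: GENMASK32() is an
assumed stand-in for the kernel's GENMASK() restricted to 32-bit values, and
update_bits() only models the read-modify-write that a regmap update
performs on a single register value.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for GENMASK() on 32-bit values (bits h..l set). */
#define GENMASK32(h, l)	((~UINT32_C(0) << (l)) & (~UINT32_C(0) >> (31 - (h))))

#define DR_TY_MASK	GENMASK32(30, 28)
/* Form used in the patch: the argument is negated before shifting. */
#define DR_TY_50OHM(x)	((~(uint32_t)(x) << 28) & DR_TY_MASK)
/* Form suggested in review: plain shift-and-mask. */
#define DR_TY_SHIFT(x)	(((uint32_t)(x) << 28) & DR_TY_MASK)

/* Model of a masked register update: only bits inside @mask change. */
static uint32_t update_bits(uint32_t reg, uint32_t mask, uint32_t val)
{
	assert((val & ~mask) == 0);	/* value must fit inside the field */
	return (reg & ~mask) | val;
}
```

In the 3-bit field, negating 1 (binary 001) gives 6 (binary 110), so
DR_TY_50OHM(1) and DR_TY_SHIFT(6) produce the same register bits, while
DR_TY_SHIFT(1) would select a different setting.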

Re: [PATCH v2 2/2] phy: intel-lgm-emmc: Add support for eMMC PHY

2019-08-20 Thread Ramuthevar, Vadivel MuruganX


On 20/8/2019 9:56 PM, Andy Shevchenko wrote:

On Tue, Aug 20, 2019 at 06:31:33PM +0800, Ramuthevar,Vadivel MuruganX wrote:

From: Ramuthevar Vadivel Murugan 

Add support for eMMC PHY on Intel's Lightning Mountain SoC.

Thanks for an update.
Looks better though several minor comments below.


Thanks a lot, Andy, for the review comments.


+/* eMMC phy register definitions */
+#define EMMC_PHYCTRL0_REG  0xa8
+#define DR_TY_MASK GENMASK(30, 28)
+#define DR_TY_50OHM(x) ((~(x) << 28) & DR_TY_MASK)

For consistency it should be

#define DR_TY_SHIFT(x)  (((x) << 28) & DR_TY_MASK)

with explanation about 50 Ohm in the code below.


+#define OTAPDLYENA BIT(14)
+#define OTAPDLYSEL_MASK    GENMASK(13, 10)
+#define OTAPDLYSEL_SHIFT(x)    (((x) << 10) & OTAPDLYSEL_MASK)
+
+#define EMMC_PHYCTRL1_REG  0xac
+#define PDB_MASK   BIT(0)
+#define ENDLL_MASK BIT(7)
+#define ENDLL_VAL  BIT(7)

Again, inconsistency here,

#define ENDLL_SHIFT(x)  (((x) << 7) & ENDLL_MASK)

Agreed

+#define EMMC_PHYCTRL2_REG  0xb0
+#define FRQSEL_25M 0
+#define FRQSEL_150M    3
+#define FRQSEL_MASK    GENMASK(24, 22)
+#define FRQSEL_SHIFT(x)    ((x) << 22)

And here

#define FRQSEL_SHIFT(x) (((x) << 22) & FRQSEL_MASK)

Agreed

+   /*
+* According to the user manual, calpad calibration
+* cycle takes more than 2us without the minimal recommended
+* value, so we may need a little margin here
+*/
+   usleep_range(3, 6);

Actually, for such low values it's recommended to use udelay() regardless of
context.

udelay(5);

Agreed

+   regmap_update_bits(priv->syscfg, EMMC_PHYCTRL1_REG, PDB_MASK, 1);

1 looks like a magic number that should be changed in the same way as the rest,
i.e.

#define PDB_SHIFT(x)    (((x) << 0) & PDB_MASK)

..., PDB_MASK, PDB_SHIFT(1)...

Agreed

+static int intel_emmc_phy_power_on(struct phy *phy)
+{
+   struct intel_emmc_phy *priv = phy_get_drvdata(phy);
+   int ret;
+
+   /* Drive impedance: 50 Ohm */

Nice, you have already a comment here. Just use DR_TY_SHIFT(1)


+   ret = regmap_update_bits(priv->syscfg, EMMC_PHYCTRL0_REG, DR_TY_MASK,
+DR_TY_50OHM(1));
+   ret = regmap_update_bits(priv->syscfg, EMMC_PHYCTRL0_REG, OTAPDLYENA,
+0x0);

0x0 -> 0

Noted

+static int intel_emmc_phy_probe(struct platform_device *pdev)
+{
+   struct device *dev = &pdev->dev;
+   struct intel_emmc_phy *priv;
+   struct phy *generic_phy;
+   struct phy_provider *phy_provider;
+
+   priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
+   if (!priv)
+   return -ENOMEM;
+
+   /* Get eMMC phy (accessed via chiptop) regmap */
+   priv->syscfg = syscon_regmap_lookup_by_phandle(dev->of_node,
+  "intel,syscon");

Perhaps

struct device_node *np = dev->of_node;
...
priv->syscfg = syscon_regmap_lookup_by_phandle(np, "intel,syscon");


+   generic_phy = devm_phy_create(dev, dev->of_node, );

And here.


Noted, will update

With Best Regards
Vadivel

Re: [PATCH v2 0/2] Simplify mtty driver and mdev core

2019-08-20 Thread Alex Williamson

On Wed, 21 Aug 2019 05:01:52 +
Parav Pandit  wrote:

> > -Original Message-
> > From: Alex Williamson 
> > Sent: Wednesday, August 21, 2019 10:27 AM
> > To: Parav Pandit 
> > Cc: Jiri Pirko ; David S . Miller ;
> > Kirti Wankhede ; Cornelia Huck
> > ; k...@vger.kernel.org; linux-kernel@vger.kernel.org;
> > cjia ; net...@vger.kernel.org
> > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > 
> > On Wed, 21 Aug 2019 04:40:15 +
> > Parav Pandit  wrote:
> >   
> > > > -Original Message-
> > > > From: Alex Williamson 
> > > > Sent: Wednesday, August 21, 2019 9:51 AM
> > > > To: Parav Pandit 
> > > > Cc: Jiri Pirko ; David S . Miller
> > > > ; Kirti Wankhede ;
> > > > Cornelia Huck ; k...@vger.kernel.org;
> > > > linux-kernel@vger.kernel.org; cjia ;
> > > > net...@vger.kernel.org
> > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > > >
> > > > On Wed, 21 Aug 2019 03:42:25 +
> > > > Parav Pandit  wrote:
> > > >  
> > > > > > -Original Message-
> > > > > > From: Alex Williamson 
> > > > > > Sent: Tuesday, August 20, 2019 10:49 PM
> > > > > > To: Parav Pandit 
> > > > > > Cc: Jiri Pirko ; David S . Miller
> > > > > > ; Kirti Wankhede ;
> > > > > > Cornelia Huck ; k...@vger.kernel.org;
> > > > > > linux-kernel@vger.kernel.org; cjia ;
> > > > > > net...@vger.kernel.org
> > > > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > > > > >
> > > > > > On Tue, 20 Aug 2019 08:58:02 + Parav Pandit
> > > > > >  wrote:
> > > > > >  
> > > > > > > + Dave.
> > > > > > >
> > > > > > > Hi Jiri, Dave, Alex, Kirti, Cornelia,
> > > > > > >
> > > > > > > Please provide your feedback on it, how shall we proceed?
> > > > > > >
> > > > > > > Short summary of requirements.
> > > > > > > For a given mdev (mediated device [1]), there is one
> > > > > > > representor netdevice and devlink port in switchdev mode
> > > > > > > (similar to SR-IOV VF), And there is one netdevice for the actual 
> > > > > > > mdev  
> > when mdev is probed.  
> > > > > > >
> > > > > > > (a) representor netdev and devlink port should be able derive
> > > > > > > phys_port_name(). So that representor netdev name can be built
> > > > > > > deterministically across reboots.
> > > > > > >
> > > > > > > (b) for mdev's netdevice, mdev's device should have an attribute.
> > > > > > > This attribute can be used by udev rules/systemd or something
> > > > > > > else to rename netdev name deterministically.
> > > > > > >
> > > > > > > (c) IFNAMSIZ of 16 bytes is too small to fit whole UUID.
> > > > > > > A simple grep IFNAMSIZ in stack hints hundreds of users of
> > > > > > > IFNAMSIZ in drivers, uapi, netlink, boot config area and more.
> > > > > > > Changing IFNAMSIZ for a mdev bus doesn't really look
> > > > > > > reasonable option  
> > > > to me.  
> > > > > >
> > > > > > How many characters do we really have to work with?  Your
> > > > > > examples below prepend various characters, ex. option-1 results
> > > > > > in ens2f0_m10 or enm10.  Do the extra 8 or 3 characters in these 
> > > > > > count  
> > against IFNAMSIZ?  
> > > > > >  
> > > > > Maximum 15. Last is null termination.
> > > > > Some udev rules setting by user prefix the PF netdev interface. I
> > > > > took such  
> > > > example below where ens2f0 netdev named is prefixed.  
> > > > > Some prefer not to prefix.
> > > > >  
> > > > > > > Hence, I would like to discuss below options.
> > > > > > >
> > > > > > > Option-1: mdev index
> > > > > > > Introduce an optional mdev index/handle as u32 during mdev
> > > > > > > create time. User passes mdev index/handle as input.
> > > > > > >
> > > > > > > phys_port_name=mIndex=m%u
> > > > > > > mdev_index will be available in sysfs as mdev attribute for
> > > > > > > udev to name the mdev's netdev.
> > > > > > >
> > > > > > > example mdev create command:
> > > > > > > UUID=$(uuidgen)
> > > > > > > echo $UUID index=10  
> > > > > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/cr
> > > > > > > > eate  
> > > > > >
> > > > > > Nit, IIRC previous discussions of additional parameters used
> > > > > > comma separators, ex. echo $UUID,index=10 >...
> > > > > >  
> > > > > Yes, ok.
> > > > >  
> > > > > > > > example netdevs:  
> > > > > > > repnetdev=ens2f0_m10  /*ens2f0 is parent PF's netdevice */  
> > > > > >
> > > > > > Is the parent really relevant in the name?  
> > > > > No. I just picked one udev example who prefixed the parent netdev 
> > > > > name.
> > > > > But there are users who do not prefix it.
> > > > >  
> > > > > > Tools like mdevctl are meant to
> > > > > > provide persistence, creating the same mdev devices on the same
> > > > > > parent, but that's simply the easiest policy decision.  We can
> > > > > > also imagine that multiple parent devices might support a
> > > > > > specified mdev type and policies factoring in proximity,
> > > > > > load-balancing, power consumption, etc might be weighed such
> > > > > > that we really don't

Re: [PATCH v5 2/3] OPP: Add support for bandwidth OPP tables

2019-08-20 Thread Viresh Kumar

On 20-08-19, 15:36, Saravana Kannan wrote:
> On Tue, Aug 20, 2019 at 3:27 PM Saravana Kannan  wrote:
> >
> > On Mon, Aug 19, 2019 at 11:13 PM Viresh Kumar  
> > wrote:
> > >
> > > On 07-08-19, 15:31, Saravana Kannan wrote:
> 
> > > > + ret = of_property_read_u32(np, "opp-peak-kBps", &bw);
> > > > + if (ret)
> > > > + return ret;
> > > > + new_opp->rate = (unsigned long) bw;
> > > > +
> > > > + ret = of_property_read_u32(np, "opp-avg-kBps", &bw);
> > > > + if (!ret)
> > > > + new_opp->avg_bw = (unsigned long) bw;

Why is this cast required? If you really want a 64-bit value for bw, then
make it 64-bit in the bindings as well, like opp-hz. And then you can simply do:

of_property_read_u32(np, "opp-avg-kBps", &new_opp->avg_bw);


> > >
> > > If none of opp-hz/level/peak-kBps are available, print error message here
> > > itself..
> >
> > But you don't print any error for opp-level today. Seems like it's optional?
> >
> > >
> > > > +
> > > > + return 0;
> > >
> > > You are returning 0 on failure as well here.
> >
> > Thanks.
> 
> Wait, no. This is not actually a failure. opp-avg-kBps is optional. So
> returning 0 is the right thing to do. If the mandatory properties
> aren't present, an error is returned before you get to the end.
> 
> -Saravana

-- 
viresh

Re: [PATCH v2 1/2] dt-bindings: phy: intel-emmc-phy: Add YAML schema for LGM eMMC PHY

2019-08-20 Thread Ramuthevar, Vadivel MuruganX




On 20/8/2019 11:54 PM, Rob Herring wrote:

On Tue, Aug 20, 2019 at 5:31 AM Ramuthevar,Vadivel MuruganX
 wrote:

From: Ramuthevar Vadivel Murugan 

Add a YAML schema to use the host controller driver with the
eMMC PHY on Intel's Lightning Mountain SoC.

Signed-off-by: Ramuthevar Vadivel Murugan 

---
changes in v2:
   As per Rob Herring review comments, the following updates
  - change GPL-2.0 -> (GPL-2.0-only OR BSD-2-Clause)
  - filename is the compatible string plus .yaml
  - LGM: Lightning Mountain
  - update maintainer
  - add intel,syscon under property list
  - keep one example instead of two
---
  .../bindings/phy/intel,lgm-emmc-phy.yaml   | 72 ++
  1 file changed, 72 insertions(+)
  create mode 100644 Documentation/devicetree/bindings/phy/intel,lgm-emmc-phy.yaml

diff --git a/Documentation/devicetree/bindings/phy/intel,lgm-emmc-phy.yaml b/Documentation/devicetree/bindings/phy/intel,lgm-emmc-phy.yaml
new file mode 100644
index ..ec177573aca6
--- /dev/null
+++ b/Documentation/devicetree/bindings/phy/intel,lgm-emmc-phy.yaml
@@ -0,0 +1,72 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/phy/intel,lgm-emmc-phy.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Intel Lightning Mountain(LGM) eMMC PHY Device Tree Bindings
+
+maintainers:
+  - Ramuthevar Vadivel Murugan 
+
+
+description:
+  -  Add a new compatible to use the host controller driver with the
+ eMMC PHY on Intel's Lightning Mountain SoC.
+
+$ref: /schemas/types.yaml#definitions/phandle
+  description:
+- It also requires a "syscon" node with compatible = "intel,lgm-chiptop",
+  "syscon" to access the eMMC PHY register.

Not valid schema. Please build 'make dt_binding_check' and fix any warnings.

Hi Rob,

Thank you very much for the review comments; I will check and update.

With Best Regards
Vadivel

+
+properties:
+  "#phy-cells":
+const: 0
+
+  compatible:
+const: intel,lgm-emmc-phy
+
+  reg:
+maxItems: 1
+
+  intel,syscon:
+items:
+  - description:
+ - |
+   e-MMC phy module should include the following properties
+   * reg, Access the e-MMC, get the base address from syscon.
+   * reset, reset the e-MMC module.
+
+  clocks:
+items:
+  - description: e-MMC phy module clock
+
+  clock-names:
+items:
+  - const: emmcclk
+
+  resets:
+maxItems: 1
+
+required:
+  - "#phy-cells"
+  - compatible
+  - reg
+  - clocks
+  - clock-names
+  - resets
+
+additionalProperties: false
+
+examples:
+  - |
+emmc_phy: emmc_phy {
+compatible = "intel,lgm-emmc-phy";
+reg = <0xe002 0x100>;
+intel,syscon = <>;
+clocks = <>;
+clock-names = "emmcclk";
+#phy-cells = <0>;
+};
+
+...
--
2.11.0

Re: [PATCH v5 2/3] OPP: Add support for bandwidth OPP tables

2019-08-20 Thread Viresh Kumar

On 20-08-19, 15:36, Saravana Kannan wrote:
> On Tue, Aug 20, 2019 at 3:27 PM Saravana Kannan  wrote:
> >
> > On Mon, Aug 19, 2019 at 11:13 PM Viresh Kumar  
> > wrote:
> > >
> > > On 07-08-19, 15:31, Saravana Kannan wrote:
> 
> > > > + ret = of_property_read_u32(np, "opp-peak-kBps", &bw);
> > > > + if (ret)
> > > > + return ret;
> > > > + new_opp->rate = (unsigned long) bw;
> > > > +
> > > > + ret = of_property_read_u32(np, "opp-avg-kBps", &bw);
> > > > + if (!ret)
> > > > + new_opp->avg_bw = (unsigned long) bw;
> > >
> > > If none of opp-hz/level/peak-kBps are available, print error message here
> > > itself..
> >
> > But you don't print any error for opp-level today. Seems like it's optional?
> >
> > >
> > > > +
> > > > + return 0;
> > >
> > > You are returning 0 on failure as well here.
> >
> > Thanks.
> 
> Wait, no. This is not actually a failure. opp-avg-kBps is optional. So
> returning 0 is the right thing to do. If the mandatory properties
> aren't present, an error is returned before you get to the end.

You are right :)

-- 
viresh
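
The mandatory-versus-optional handling agreed on above can be illustrated outside the kernel. This is a minimal sketch under stated assumptions: struct mock_node, struct mock_opp, mock_read_u32() and mock_read_bw_key() are hypothetical userspace stand-ins, not the OPP core or the devicetree API. Only the property names and the control flow come from the quoted patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for a device node and an OPP entry. */
struct mock_prop { const char *name; uint32_t value; };
struct mock_node { const struct mock_prop *props; size_t nprops; };
struct mock_opp  { unsigned long rate; unsigned long avg_bw; };

/* Returns 0 and stores the value if the property exists, -EINVAL otherwise. */
static int mock_read_u32(const struct mock_node *np, const char *name,
			 uint32_t *out)
{
	size_t i;

	assert(np != NULL && name != NULL && out != NULL);
	for (i = 0; i < np->nprops; i++) {
		if (strcmp(np->props[i].name, name) == 0) {
			*out = np->props[i].value;
			return 0;
		}
	}
	return -EINVAL;
}

/* Mirrors the bandwidth half of the quoted _read_opp_key(). */
static int mock_read_bw_key(struct mock_opp *opp, const struct mock_node *np)
{
	uint32_t bw;
	int ret;

	/* Mandatory: a missing peak bandwidth is an error for the caller. */
	ret = mock_read_u32(np, "opp-peak-kBps", &bw);
	if (ret)
		return ret;
	opp->rate = bw;

	/* Optional: a missing average bandwidth is silently skipped. */
	if (!mock_read_u32(np, "opp-avg-kBps", &bw))
		opp->avg_bw = bw;

	/* Reaching this point means every mandatory property was found. */
	return 0;
}
```

A node with only opp-avg-kBps fails at the first read, so the final return 0 is reached only when the mandatory key is present.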

Re: [PATCH v5 2/3] OPP: Add support for bandwidth OPP tables

2019-08-20 Thread Viresh Kumar

On 20-08-19, 15:27, Saravana Kannan wrote:
> On Mon, Aug 19, 2019 at 11:13 PM Viresh Kumar  wrote:
> >
> > On 07-08-19, 15:31, Saravana Kannan wrote:
> > > Not all devices quantify their performance points in terms of frequency.
> > > Devices like interconnects quantify their performance points in terms of
> > > bandwidth. We need a way to represent these bandwidth levels in OPP. So,
> > > add support for parsing bandwidth OPPs from DT.
> > >
> > > Signed-off-by: Saravana Kannan 
> > > ---
> > >  drivers/opp/of.c  | 41 -
> > >  drivers/opp/opp.h |  4 +++-
> > >  2 files changed, 35 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/drivers/opp/of.c b/drivers/opp/of.c
> > > index 1813f5ad5fa2..e1750033fef9 100644
> > > --- a/drivers/opp/of.c
> > > +++ b/drivers/opp/of.c
> > > @@ -523,6 +523,35 @@ void dev_pm_opp_of_remove_table(struct device *dev)
> > >  }
> > >  EXPORT_SYMBOL_GPL(dev_pm_opp_of_remove_table);
> > >
> > > +static int _read_opp_key(struct dev_pm_opp *new_opp, struct device_node 
> > > *np)
> > > +{
> > > + int ret;
> > > + u64 rate;
> > > + u32 bw;
> > > +
> > > + ret = of_property_read_u64(np, "opp-hz", &rate);
> > > + if (!ret) {
> > > + /*
> > > +  * Rate is defined as an unsigned long in clk API, and so
> > > +  * casting explicitly to its type. Must be fixed once rate 
> > > is 64
> > > +  * bit guaranteed in clk API.
> > > +  */
> > > + new_opp->rate = (unsigned long)rate;
> > > + return 0;
> > > + }
> > > +
> >
> > Please read opp-level also here and do error handling.
> 
> Can you please explain what's the reasoning? opp-level doesn't seem to
> be a "key" based on looking at the code.

Because opp-level is the thing that distinguishes OPPs for power domains, those
nodes don't have opp-hz or bw.

> > > + ret = of_property_read_u32(np, "opp-peak-kBps", &bw);
> > > + if (ret)
> > > + return ret;
> > > + new_opp->rate = (unsigned long) bw;
> > > +
> > > + ret = of_property_read_u32(np, "opp-avg-kBps", &bw);
> > > + if (!ret)
> > > + new_opp->avg_bw = (unsigned long) bw;
> >
> > If none of opp-hz/level/peak-kBps are available, print error message here
> > itself..
> 
> But you don't print any error for opp-level today. Seems like it's optional?

Yeah, probably it should have been there. It will be better to do it now as we
are creating a separate routine for that.

> >
> > > +
> > > + return 0;
> >
> > You are returning 0 on failure as well here.
> 
> Thanks.
> 
> > > +}
> > > +
> > >  /**
> > >   * _opp_add_static_v2() - Allocate static OPPs (As per 'v2' DT bindings)
> > >   * @opp_table:   OPP table
> > > @@ -560,22 +589,16 @@ static struct dev_pm_opp *_opp_add_static_v2(struct 
> > > opp_table *opp_table,
> > >   if (!new_opp)
> > >   return ERR_PTR(-ENOMEM);
> > >
> > > - ret = of_property_read_u64(np, "opp-hz", &rate);
> > > + ret = _read_opp_key(new_opp, np);
> > >   if (ret < 0) {
> > >   /* "opp-hz" is optional for devices like power domains. */
> > >   if (!opp_table->is_genpd) {
> > > - dev_err(dev, "%s: opp-hz not found\n", __func__);
> > > + dev_err(dev, "%s: opp-hz or opp-peak-kBps not 
> > > found\n",
> > > + __func__);
> > >   goto free_opp;
> > >   }
> > >
> > >   rate_not_available = true;
> >
> > Move all above as well to read_opp_key().
> 
> Ok. I didn't want to print an error at the API level and instead print
> at the caller level. But if that's what you want, that's fine by me.

That would be fine, you can keep the print message here (but a generic one, like
key missing).

> > > - } else {
> > > - /*
> > > -  * Rate is defined as an unsigned long in clk API, and so
> > > -  * casting explicitly to its type. Must be fixed once rate 
> > > is 64
> > > -  * bit guaranteed in clk API.
> > > -  */
> > > - new_opp->rate = (unsigned long)rate;
> > >   }
> > >
> > >   of_property_read_u32(np, "opp-level", &new_opp->level);
> > > diff --git a/drivers/opp/opp.h b/drivers/opp/opp.h
> > > index 01a500e2c40a..6bb238af9cac 100644
> > > --- a/drivers/opp/opp.h
> > > +++ b/drivers/opp/opp.h
> > > @@ -56,7 +56,8 @@ extern struct list_head opp_tables;
> > >   * @turbo:   true if turbo (boost) OPP
> > >   * @suspend: true if suspend OPP
> > >   * @pstate: Device's power domain's performance state.
> > > - * @rate:Frequency in hertz
> > > + * @rate:Frequency in hertz OR Peak bandwidth in kilobytes per second
> > > + * @avg_bw:  Average bandwidth in kilobytes per second
> >
> > Please add separate entry for peak_bw here.
> >
> > I know you reused rate because you don't want to reimplement the helpers we
> > have. Maybe we can just update them to return

[rcu:dev.2019.08.17a 8/36] ERROR: "tick_nohz_full_running" [kernel/rcu/rcutorture.ko] undefined!

2019-08-20 Thread kbuild test robot

Hi Paul,

FYI, the error/warning still remains.

tree:   https://kernel.googlesource.com/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev.2019.08.17a
head:   9120323ada960bc9f1427e546772a983ad036b9a
commit: 14569aa16daa1cd7610624a500ed2750fe341351 [8/36] rcutorture: Force on tick for readers and callback flooders
config: x86_64-rhel (attached as .config)
compiler: gcc-7 (Debian 7.4.0-10) 7.4.0
reproduce:
git checkout 14569aa16daa1cd7610624a500ed2750fe341351
# save the attached .config to linux build tree
make ARCH=x86_64 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot 

All errors (new ones prefixed by >>):

   ERROR: "tick_nohz_dep_clear_task" [kernel/rcu/rcutorture.ko] undefined!
   ERROR: "tick_nohz_dep_set_task" [kernel/rcu/rcutorture.ko] undefined!
>> ERROR: "tick_nohz_full_running" [kernel/rcu/rcutorture.ko] undefined!

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation



RE: [PATCH v2] qed: Add cleanup in qed_slowpath_start()

2019-08-20 Thread Sudarsana Reddy Kalluru



> -Original Message-
> From: Wenwen Wang 
> Sent: Wednesday, August 21, 2019 10:17 AM
> To: Wenwen Wang 
> Cc: Sudarsana Reddy Kalluru ; Ariel Elior
> ; GR-everest-linux-l2  l...@marvell.com>; David S. Miller ; open
> list:QLOGIC QL4xxx ETHERNET DRIVER ; open list
> 
> Subject: [PATCH v2] qed: Add cleanup in qed_slowpath_start()
> 
> If qed_mcp_send_drv_version() fails, no cleanup is executed, leading to
> memory leaks. To fix this issue, introduce the label 'err4' to perform the
> cleanup work before returning the error.
> 
> Signed-off-by: Wenwen Wang 
> ---
>  drivers/net/ethernet/qlogic/qed/qed_main.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/qlogic/qed/qed_main.c
> b/drivers/net/ethernet/qlogic/qed/qed_main.c
> index 829dd60..1efff7f 100644
> --- a/drivers/net/ethernet/qlogic/qed/qed_main.c
> +++ b/drivers/net/ethernet/qlogic/qed/qed_main.c
> @@ -1325,7 +1325,7 @@ static int qed_slowpath_start(struct qed_dev
> *cdev,
> _version);
>   if (rc) {
>   DP_NOTICE(cdev, "Failed sending drv version
> command\n");
> - return rc;
> + goto err4;
>   }
>   }
> 
> @@ -1333,6 +1333,8 @@ static int qed_slowpath_start(struct qed_dev
> *cdev,
> 
>   return 0;
> 
> +err4:
> + qed_ll2_dealloc_if(cdev);
>  err3:
>   qed_hw_stop(cdev);
>  err2:
> --
> 2.7.4

Acked-by: Sudarsana Reddy Kalluru

Re: PROBLEM: 5.3.0-rc* causes iwlwifi failure

2019-08-20 Thread Luciano Coelho

On Tue, 2019-08-20 at 19:37 -0400, Stuart Little wrote:
> On Tue, Aug 20, 2019 at 01:45:37PM +0300, Luciano Coelho wrote:
> > I'll have to look into all NIC/FW-version combinations that we have and
> > update the iwl_mvm_sar_geo_support() function accordingly, which is,
> > BTW, the easier place for you to change if you want to workaround the
> > issue.
> 
> Thanks!
> 
> I didn't quite know how to interpret this suggestion (i.e. what the
> change should be), so I was poking around in there out of curiosity.
> One simple-minded thing that worked was to just pretend that that
> function always returns false:
> 
> --- cut here ---
> 
> diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/fw.c
> b/drivers/net/wireless/intel/iwlwifi/mvm/fw.c
> index 5de54d1559dd..8c0160e5588f 100644
> --- a/drivers/net/wireless/intel/iwlwifi/mvm/fw.c
> +++ b/drivers/net/wireless/intel/iwlwifi/mvm/fw.c
> @@ -925,7 +925,7 @@ int iwl_mvm_get_sar_geo_profile(struct iwl_mvm
> *mvm)
> .data = { data },
> };
>  
> -   if (!iwl_mvm_sar_geo_support(mvm))
> +   /*if (!iwl_mvm_sar_geo_support(mvm))*/
> return -EOPNOTSUPP;
>  
> ret = iwl_mvm_send_cmd(mvm, );
> @@ -953,7 +953,7 @@ static int iwl_mvm_sar_geo_init(struct iwl_mvm
> *mvm)
> int ret, i, j;
> u16 cmd_wide_id =  WIDE_ID(PHY_OPS_GROUP,
> GEO_TX_POWER_LIMIT);
>  
> -   if (!iwl_mvm_sar_geo_support(mvm))
> +   /*if (!iwl_mvm_sar_geo_support(mvm))*/
> return 0;
>  
> ret = iwl_mvm_sar_get_wgds_table(mvm);
> 
> --- cut here ---

Yeah, I meant more or less to return false for your NIC.  You could
have just forced that function to return false.

--
Cheers,
Luca.

RE: [PATCH v2 0/2] Simplify mtty driver and mdev core

2019-08-20 Thread Parav Pandit




> -Original Message-
> From: Alex Williamson 
> Sent: Wednesday, August 21, 2019 10:27 AM
> To: Parav Pandit 
> Cc: Jiri Pirko ; David S . Miller ;
> Kirti Wankhede ; Cornelia Huck
> ; k...@vger.kernel.org; linux-kernel@vger.kernel.org;
> cjia ; net...@vger.kernel.org
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> 
> On Wed, 21 Aug 2019 04:40:15 +
> Parav Pandit  wrote:
> 
> > > -Original Message-
> > > From: Alex Williamson 
> > > Sent: Wednesday, August 21, 2019 9:51 AM
> > > To: Parav Pandit 
> > > Cc: Jiri Pirko ; David S . Miller
> > > ; Kirti Wankhede ;
> > > Cornelia Huck ; k...@vger.kernel.org;
> > > linux-kernel@vger.kernel.org; cjia ;
> > > net...@vger.kernel.org
> > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > >
> > > On Wed, 21 Aug 2019 03:42:25 +
> > > Parav Pandit  wrote:
> > >
> > > > > -Original Message-
> > > > > From: Alex Williamson 
> > > > > Sent: Tuesday, August 20, 2019 10:49 PM
> > > > > To: Parav Pandit 
> > > > > Cc: Jiri Pirko ; David S . Miller
> > > > > ; Kirti Wankhede ;
> > > > > Cornelia Huck ; k...@vger.kernel.org;
> > > > > linux-kernel@vger.kernel.org; cjia ;
> > > > > net...@vger.kernel.org
> > > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > > > >
> > > > > On Tue, 20 Aug 2019 08:58:02 + Parav Pandit
> > > > >  wrote:
> > > > >
> > > > > > + Dave.
> > > > > >
> > > > > > Hi Jiri, Dave, Alex, Kirti, Cornelia,
> > > > > >
> > > > > > Please provide your feedback on it, how shall we proceed?
> > > > > >
> > > > > > Short summary of requirements.
> > > > > > For a given mdev (mediated device [1]), there is one
> > > > > > representor netdevice and devlink port in switchdev mode
> > > > > > (similar to SR-IOV VF), And there is one netdevice for the actual 
> > > > > > mdev
> when mdev is probed.
> > > > > >
> > > > > > (a) representor netdev and devlink port should be able derive
> > > > > > phys_port_name(). So that representor netdev name can be built
> > > > > > deterministically across reboots.
> > > > > >
> > > > > > (b) for mdev's netdevice, mdev's device should have an attribute.
> > > > > > This attribute can be used by udev rules/systemd or something
> > > > > > else to rename netdev name deterministically.
> > > > > >
> > > > > > (c) IFNAMSIZ of 16 bytes is too small to fit whole UUID.
> > > > > > A simple grep IFNAMSIZ in stack hints hundreds of users of
> > > > > > IFNAMSIZ in drivers, uapi, netlink, boot config area and more.
> > > > > > Changing IFNAMSIZ for a mdev bus doesn't really look
> > > > > > reasonable option
> > > to me.
> > > > >
> > > > > How many characters do we really have to work with?  Your
> > > > > examples below prepend various characters, ex. option-1 results
> > > > > in ens2f0_m10 or enm10.  Do the extra 8 or 3 characters in these count
> against IFNAMSIZ?
> > > > >
> > > > Maximum 15. Last is null termination.
> > > > Some udev rules setting by user prefix the PF netdev interface. I
> > > > took such
> > > example below where ens2f0 netdev named is prefixed.
> > > > Some prefer not to prefix.
> > > >
> > > > > > Hence, I would like to discuss below options.
> > > > > >
> > > > > > Option-1: mdev index
> > > > > > Introduce an optional mdev index/handle as u32 during mdev
> > > > > > create time. User passes mdev index/handle as input.
> > > > > >
> > > > > > phys_port_name=mIndex=m%u
> > > > > > mdev_index will be available in sysfs as mdev attribute for
> > > > > > udev to name the mdev's netdev.
> > > > > >
> > > > > > example mdev create command:
> > > > > > UUID=$(uuidgen)
> > > > > > echo $UUID index=10
> > > > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/cr
> > > > > > > eate
> > > > >
> > > > > Nit, IIRC previous discussions of additional parameters used
> > > > > comma separators, ex. echo $UUID,index=10 >...
> > > > >
> > > > Yes, ok.
> > > >
> > > > > > > example netdevs:
> > > > > > repnetdev=ens2f0_m10/*ens2f0 is parent PF's netdevice */
> > > > >
> > > > > Is the parent really relevant in the name?
> > > > No. I just picked one udev example who prefixed the parent netdev name.
> > > > But there are users who do not prefix it.
> > > >
> > > > > Tools like mdevctl are meant to
> > > > > provide persistence, creating the same mdev devices on the same
> > > > > parent, but that's simply the easiest policy decision.  We can
> > > > > also imagine that multiple parent devices might support a
> > > > > specified mdev type and policies factoring in proximity,
> > > > > load-balancing, power consumption, etc might be weighed such
> > > > > that we really don't want to promote userspace creating dependencies
> on the parent association.
> > > > >
> > > > > > mdev_netdev=enm10
> > > > > >
> > > > > > Pros:
> > > > > > 1. mdevctl and any other existing tools are unaffected.
> > > > > > 2. netdev stack, ovs and other switching platforms are unaffected.
> > > > > > 3. achieves unique phys_port_name for

Re: [PATCH v2 0/2] Simplify mtty driver and mdev core

2019-08-20 Thread Alex Williamson

On Wed, 21 Aug 2019 04:40:15 +
Parav Pandit  wrote:

> > -Original Message-
> > From: Alex Williamson 
> > Sent: Wednesday, August 21, 2019 9:51 AM
> > To: Parav Pandit 
> > Cc: Jiri Pirko ; David S . Miller ;
> > Kirti Wankhede ; Cornelia Huck
> > ; k...@vger.kernel.org; linux-kernel@vger.kernel.org;
> > cjia ; net...@vger.kernel.org
> > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > 
> > On Wed, 21 Aug 2019 03:42:25 +
> > Parav Pandit  wrote:
> >   
> > > > -Original Message-
> > > > From: Alex Williamson 
> > > > Sent: Tuesday, August 20, 2019 10:49 PM
> > > > To: Parav Pandit 
> > > > Cc: Jiri Pirko ; David S . Miller
> > > > ; Kirti Wankhede ;
> > > > Cornelia Huck ; k...@vger.kernel.org;
> > > > linux-kernel@vger.kernel.org; cjia ;
> > > > net...@vger.kernel.org
> > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > > >
> > > > On Tue, 20 Aug 2019 08:58:02 +
> > > > Parav Pandit  wrote:
> > > >  
> > > > > + Dave.
> > > > >
> > > > > Hi Jiri, Dave, Alex, Kirti, Cornelia,
> > > > >
> > > > > Please provide your feedback on it, how shall we proceed?
> > > > >
> > > > > Short summary of requirements.
> > > > > For a given mdev (mediated device [1]), there is one representor
> > > > > netdevice and devlink port in switchdev mode (similar to SR-IOV
> > > > > VF), And there is one netdevice for the actual mdev when mdev is 
> > > > > probed.
> > > > >
> > > > > (a) representor netdev and devlink port should be able derive
> > > > > phys_port_name(). So that representor netdev name can be built
> > > > > deterministically across reboots.
> > > > >
> > > > > (b) for mdev's netdevice, mdev's device should have an attribute.
> > > > > This attribute can be used by udev rules/systemd or something else
> > > > > to rename netdev name deterministically.
> > > > >
> > > > > (c) IFNAMSIZ of 16 bytes is too small to fit whole UUID.
> > > > > A simple grep IFNAMSIZ in stack hints hundreds of users of
> > > > > IFNAMSIZ in drivers, uapi, netlink, boot config area and more.
> > > > > Changing IFNAMSIZ for a mdev bus doesn't really look reasonable 
> > > > > option  
> > to me.  
> > > >
> > > > How many characters do we really have to work with?  Your examples
> > > > below prepend various characters, ex. option-1 results in ens2f0_m10
> > > > or enm10.  Do the extra 8 or 3 characters in these count against 
> > > > IFNAMSIZ?
> > > >  
> > > Maximum 15. Last is null termination.
> > > Some udev rules setting by user prefix the PF netdev interface. I took 
> > > such  
> > example below where ens2f0 netdev named is prefixed.  
> > > Some prefer not to prefix.
> > >  
> > > > > Hence, I would like to discuss below options.
> > > > >
> > > > > Option-1: mdev index
> > > > > Introduce an optional mdev index/handle as u32 during mdev create
> > > > > time. User passes mdev index/handle as input.
> > > > >
> > > > > phys_port_name=mIndex=m%u
> > > > > mdev_index will be available in sysfs as mdev attribute for udev
> > > > > to name the mdev's netdev.
> > > > >
> > > > > example mdev create command:
> > > > > UUID=$(uuidgen)
> > > > > echo $UUID index=10  
> > > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create  
> > > >
> > > > Nit, IIRC previous discussions of additional parameters used comma
> > > > separators, ex. echo $UUID,index=10 >...
> > > >  
> > > Yes, ok.
> > >  
> > > > > > example netdevs:  
> > > > > repnetdev=ens2f0_m10  /*ens2f0 is parent PF's netdevice */  
> > > >
> > > > Is the parent really relevant in the name?  
> > > No. I just picked one udev example who prefixed the parent netdev name.
> > > But there are users who do not prefix it.
> > >  
> > > > Tools like mdevctl are meant to
> > > > provide persistence, creating the same mdev devices on the same
> > > > parent, but that's simply the easiest policy decision.  We can also
> > > > imagine that multiple parent devices might support a specified mdev
> > > > type and policies factoring in proximity, load-balancing, power
> > > > consumption, etc might be weighed such that we really don't want to
> > > > promote userspace creating dependencies on the parent association.
> > > >  
> > > > > mdev_netdev=enm10
> > > > >
> > > > > Pros:
> > > > > 1. mdevctl and any other existing tools are unaffected.
> > > > > 2. netdev stack, ovs and other switching platforms are unaffected.
> > > > > 3. achieves unique phys_port_name for representor netdev 4.
> > > > > achieves unique mdev eth netdev name for the mdev using udev/systemd  
> > extension.  
> > > > > 5. Aligns well with mdev and netdev subsystem and similar to
> > > > > existing sriov bdf's.  
> > > >
> > > > A user provided index seems strange to me.  It's not really an
> > > > index, just a user specified instance number.  Presumably you have
> > > > the user providing this because if it really were an index, then the
> > > > value depends on the creation order and persistence is lost.  Now
> > > > the user needs

[PATCH v2] qed: Add cleanup in qed_slowpath_start()

2019-08-20 Thread Wenwen Wang

If qed_mcp_send_drv_version() fails, no cleanup is executed, leading to
memory leaks. To fix this issue, introduce the label 'err4' to perform the
cleanup work before returning the error.

Signed-off-by: Wenwen Wang 
---
 drivers/net/ethernet/qlogic/qed/qed_main.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_main.c 
b/drivers/net/ethernet/qlogic/qed/qed_main.c
index 829dd60..1efff7f 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_main.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_main.c
@@ -1325,7 +1325,7 @@ static int qed_slowpath_start(struct qed_dev *cdev,
  _version);
if (rc) {
DP_NOTICE(cdev, "Failed sending drv version command\n");
-   return rc;
+   goto err4;
}
}
 
@@ -1333,6 +1333,8 @@ static int qed_slowpath_start(struct qed_dev *cdev,
 
return 0;
 
+err4:
+   qed_ll2_dealloc_if(cdev);
 err3:
qed_hw_stop(cdev);
 err2:
-- 
2.7.4
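
The unwind order that the new 'err4' label relies on can be sketched in
plain userspace C. This is only an illustration under stated assumptions:
demo_dev, demo_send_version() and demo_slowpath_start() are hypothetical
stand-ins, not the qed driver's API, and malloc()/free() stand in for the
driver's own allocation and teardown calls.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's resources; not the real qed API. */
struct demo_dev {
	void *hw;	/* acquired first, released at err3 */
	void *ll2;	/* acquired second, released at err4 */
	int fail_send;	/* set to force the final step to fail */
};

/* Stand-in for the last step, which may fail after everything is set up. */
static int demo_send_version(const struct demo_dev *dev)
{
	return dev->fail_send ? -1 : 0;
}

static int demo_slowpath_start(struct demo_dev *dev)
{
	int rc;

	dev->hw = malloc(16);
	if (!dev->hw)
		return -1;

	dev->ll2 = malloc(16);
	if (!dev->ll2) {
		rc = -1;
		goto err3;
	}

	rc = demo_send_version(dev);
	if (rc)
		goto err4;	/* a bare "return rc" here would leak both buffers */

	return 0;

	/* Labels fall through, releasing resources in reverse order. */
err4:
	free(dev->ll2);
	dev->ll2 = NULL;
err3:
	free(dev->hw);
	dev->hw = NULL;
	return rc;
}
```

Because the labels fall through, jumping to the label that matches the
most recently acquired resource releases it and everything acquired
before it.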

linux-next: manual merge of the irqchip tree with the pci tree

2019-08-20 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the irqchip tree got a conflict in:

  drivers/pci/controller/pci-hyperv.c

between commit:

  44b1ece783ff ("PCI: hv: Detect and fix Hyper-V PCI domain number collision")

from the pci tree and commit:

  467a3bb97432 ("PCI: hv: Allocate a named fwnode instead of an address-based one")

from the irqchip tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/pci/controller/pci-hyperv.c
index 3a56de6b2ec2,97056f3dd317..
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@@ -2563,7 -2521,7 +2563,8 @@@ static int hv_pci_probe(struct hv_devic
const struct hv_vmbus_device_id *dev_id)
  {
struct hv_pcibus_device *hbus;
 +  u16 dom_req, dom;
+   char *name;
int ret;
  
/*



RE: [PATCH v2 0/2] Simplify mtty driver and mdev core

2019-08-20 Thread Parav Pandit




> -Original Message-
> From: Alex Williamson 
> Sent: Wednesday, August 21, 2019 9:51 AM
> To: Parav Pandit 
> Cc: Jiri Pirko ; David S . Miller ;
> Kirti Wankhede ; Cornelia Huck
> ; k...@vger.kernel.org; linux-kernel@vger.kernel.org;
> cjia ; net...@vger.kernel.org
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> 
> On Wed, 21 Aug 2019 03:42:25 +
> Parav Pandit  wrote:
> 
> > > -Original Message-
> > > From: Alex Williamson 
> > > Sent: Tuesday, August 20, 2019 10:49 PM
> > > To: Parav Pandit 
> > > Cc: Jiri Pirko ; David S . Miller
> > > ; Kirti Wankhede ;
> > > Cornelia Huck ; k...@vger.kernel.org;
> > > linux-kernel@vger.kernel.org; cjia ;
> > > net...@vger.kernel.org
> > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > >
> > > On Tue, 20 Aug 2019 08:58:02 +
> > > Parav Pandit  wrote:
> > >
> > > > + Dave.
> > > >
> > > > Hi Jiri, Dave, Alex, Kirti, Cornelia,
> > > >
> > > > Please provide your feedback on it, how shall we proceed?
> > > >
> > > > Short summary of requirements.
> > > > For a given mdev (mediated device [1]), there is one representor
> > > > netdevice and devlink port in switchdev mode (similar to SR-IOV
> > > > VF), And there is one netdevice for the actual mdev when mdev is probed.
> > > >
> > > > (a) representor netdev and devlink port should be able derive
> > > > phys_port_name(). So that representor netdev name can be built
> > > > deterministically across reboots.
> > > >
> > > > (b) for mdev's netdevice, mdev's device should have an attribute.
> > > > This attribute can be used by udev rules/systemd or something else
> > > > to rename netdev name deterministically.
> > > >
> > > > (c) IFNAMSIZ of 16 bytes is too small to fit whole UUID.
> > > > A simple grep IFNAMSIZ in stack hints hundreds of users of
> > > > IFNAMSIZ in drivers, uapi, netlink, boot config area and more.
> > > > Changing IFNAMSIZ for a mdev bus doesn't really look reasonable option
> to me.
> > >
> > > How many characters do we really have to work with?  Your examples
> > > below prepend various characters, ex. option-1 results in ens2f0_m10
> > > or enm10.  Do the extra 8 or 3 characters in these count against IFNAMSIZ?
> > >
> > Maximum 15. Last is null termination.
> > Some udev rules setting by user prefix the PF netdev interface. I took such
> example below where ens2f0 netdev named is prefixed.
> > Some prefer not to prefix.
> >
> > > > Hence, I would like to discuss below options.
> > > >
> > > > Option-1: mdev index
> > > > Introduce an optional mdev index/handle as u32 during mdev create
> > > > time. User passes mdev index/handle as input.
> > > >
> > > > phys_port_name=mIndex=m%u
> > > > mdev_index will be available in sysfs as mdev attribute for udev
> > > > to name the mdev's netdev.
> > > >
> > > > example mdev create command:
> > > > UUID=$(uuidgen)
> > > > echo $UUID index=10
> > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create
> > >
> > > Nit, IIRC previous discussions of additional parameters used comma
> > > separators, ex. echo $UUID,index=10 >...
> > >
> > Yes, ok.
> >
> > > > > example netdevs:
> > > > repnetdev=ens2f0_m10/*ens2f0 is parent PF's netdevice */
> > >
> > > Is the parent really relevant in the name?
> > No. I just picked one udev example who prefixed the parent netdev name.
> > But there are users who do not prefix it.
> >
> > > Tools like mdevctl are meant to
> > > provide persistence, creating the same mdev devices on the same
> > > parent, but that's simply the easiest policy decision.  We can also
> > > imagine that multiple parent devices might support a specified mdev
> > > type and policies factoring in proximity, load-balancing, power
> > > consumption, etc might be weighed such that we really don't want to
> > > promote userspace creating dependencies on the parent association.
> > >
> > > > mdev_netdev=enm10
> > > >
> > > > Pros:
> > > > 1. mdevctl and any other existing tools are unaffected.
> > > > 2. netdev stack, ovs and other switching platforms are unaffected.
> > > > 3. achieves unique phys_port_name for representor netdev 4.
> > > > achieves unique mdev eth netdev name for the mdev using udev/systemd
> extension.
> > > > 5. Aligns well with mdev and netdev subsystem and similar to
> > > > existing sriov bdf's.
> > >
> > > A user provided index seems strange to me.  It's not really an
> > > index, just a user specified instance number.  Presumably you have
> > > the user providing this because if it really were an index, then the
> > > value depends on the creation order and persistence is lost.  Now
> > > the user needs to both avoid uuid collision as well as "index"
> > > number collision.  The uuid namespace is large enough to mostly ignore
> this, but this is not.  This seems like a burden.
> > >
> > I liked the term 'instance number', which is lot better way to say than
> index/handle.
> > Yes, user needs to avoid both the collision.
> > UUID collision should not

Re: [PATCH] powerpc/vdso32: inline __get_datapage()

2019-08-20 Thread Benjamin Herrenschmidt

On Fri, 2019-08-16 at 14:48 +, Christophe Leroy wrote:
> __get_datapage() is only a few instructions to retrieve the
> address of the page where the kernel stores data to the VDSO.
> 
> By inlining this function into its users, a bl/blr pair and
> a mflr/mtlr pair is avoided, plus a few reg moves.
> 
> The improvement is noticeable (about 55 nsec/call on an 8xx)

Interesting... would be worth doing the same on vdso64 no ?

> vdsotest before the patch:
> gettimeofday:vdso: 731 nsec/call
> clock-gettime-realtime-coarse:vdso: 668 nsec/call
> clock-gettime-monotonic-coarse:vdso: 745 nsec/call
> 
> vdsotest after the patch:
> gettimeofday:vdso: 677 nsec/call
> clock-gettime-realtime-coarse:vdso: 613 nsec/call
> clock-gettime-monotonic-coarse:vdso: 690 nsec/call
> 
> Signed-off-by: Christophe Leroy 
> ---
>  arch/powerpc/kernel/vdso32/cacheflush.S   | 10 +-
>  arch/powerpc/kernel/vdso32/datapage.S | 29 -
> 
>  arch/powerpc/kernel/vdso32/datapage.h | 12 
>  arch/powerpc/kernel/vdso32/gettimeofday.S | 11 +--
>  4 files changed, 26 insertions(+), 36 deletions(-)
>  create mode 100644 arch/powerpc/kernel/vdso32/datapage.h
> 
> diff --git a/arch/powerpc/kernel/vdso32/cacheflush.S
> b/arch/powerpc/kernel/vdso32/cacheflush.S
> index 7f882e7b9f43..e9453837e4ee 100644
> --- a/arch/powerpc/kernel/vdso32/cacheflush.S
> +++ b/arch/powerpc/kernel/vdso32/cacheflush.S
> @@ -10,6 +10,8 @@
>  #include 
>  #include 
>  
> +#include "datapage.h"
> +
>   .text
>  
>  /*
> @@ -24,14 +26,12 @@ V_FUNCTION_BEGIN(__kernel_sync_dicache)
>.cfi_startproc
>   mflr r12
>    .cfi_register lr,r12
> - mr  r11,r3
> - bl  __get_datapage@local
> + get_datapage r10, r0
>   mtlr r12
> - mr  r10,r3
>  
>   lwz r7,CFG_DCACHE_BLOCKSZ(r10)
>   addi r5,r7,-1
> - andc r6,r11,r5   /* round low to line bdy */
> + andc r6,r3,r5    /* round low to line bdy */
>   subf r8,r6,r4    /* compute length */
>   add r8,r8,r5     /* ensure we get enough */
>   lwz r9,CFG_DCACHE_LOGBLOCKSZ(r10)
> @@ -48,7 +48,7 @@ V_FUNCTION_BEGIN(__kernel_sync_dicache)
>  
>   lwz r7,CFG_ICACHE_BLOCKSZ(r10)
>   addi r5,r7,-1
> - andc r6,r11,r5   /* round low to line bdy */
> + andc r6,r3,r5    /* round low to line bdy */
>   subf r8,r6,r4    /* compute length */
>   add r8,r8,r5
>   lwz r9,CFG_ICACHE_LOGBLOCKSZ(r10)
> diff --git a/arch/powerpc/kernel/vdso32/datapage.S
> b/arch/powerpc/kernel/vdso32/datapage.S
> index 6984125b9fc0..d480d2d4a3fe 100644
> --- a/arch/powerpc/kernel/vdso32/datapage.S
> +++ b/arch/powerpc/kernel/vdso32/datapage.S
> @@ -11,34 +11,13 @@
>  #include 
>  #include 
>  
> +#include "datapage.h"
> +
>   .text
>   .global __kernel_datapage_offset;
>  __kernel_datapage_offset:
>   .long   0
>  
> -V_FUNCTION_BEGIN(__get_datapage)
> -  .cfi_startproc
> - /* We don't want that exposed or overridable as we want other objects
> -  * to be able to bl directly to here
> -  */
> - .protected __get_datapage
> - .hidden __get_datapage
> -
> - mflr r0
> -  .cfi_register lr,r0
> -
> - bcl 20,31,data_page_branch
> -data_page_branch:
> - mflr r3
> - mtlr r0
> - addi r3, r3, __kernel_datapage_offset-data_page_branch
> - lwz r0,0(r3)
> -  .cfi_restore lr
> - add r3,r0,r3
> - blr
> -  .cfi_endproc
> -V_FUNCTION_END(__get_datapage)
> -
>  /*
>   * void *__kernel_get_syscall_map(unsigned int *syscall_count) ;
>   *
> @@ -53,7 +32,7 @@ V_FUNCTION_BEGIN(__kernel_get_syscall_map)
>   mflr r12
>    .cfi_register lr,r12
>   mr  r4,r3
> - bl  __get_datapage@local
> + get_datapage r3, r0
>   mtlr r12
>   addi r3,r3,CFG_SYSCALL_MAP32
>   cmpli   cr0,r4,0
> @@ -74,7 +53,7 @@ V_FUNCTION_BEGIN(__kernel_get_tbfreq)
>.cfi_startproc
>   mflr r12
>    .cfi_register lr,r12
> - bl  __get_datapage@local
> + get_datapage r3, r0
>   lwz r4,(CFG_TB_TICKS_PER_SEC + 4)(r3)
>   lwz r3,CFG_TB_TICKS_PER_SEC(r3)
>   mtlr r12
> diff --git a/arch/powerpc/kernel/vdso32/datapage.h
> b/arch/powerpc/kernel/vdso32/datapage.h
> new file mode 100644
> index ..ad96256be090
> --- /dev/null
> +++ b/arch/powerpc/kernel/vdso32/datapage.h
> @@ -0,0 +1,12 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +
> +.macro get_datapage ptr, tmp
> + bcl 20,31,888f
> +888:
> + mflr \ptr
> + addi \ptr, \ptr, __kernel_datapage_offset - 888b
> + lwz \tmp, 0(\ptr)
> + add \ptr, \tmp, \ptr
> +.endm
> +
> +
> diff --git a/arch/powerpc/kernel/vdso32/gettimeofday.S
> b/arch/powerpc/kernel/vdso32/gettimeofday.S
> index e10098cde89c..91a58f01dcd5 100644
> ---

Re: [PATCH] fsi: scom: Don't abort operations for minor errors

2019-08-20 Thread Benjamin Herrenschmidt

On Thu, 2019-08-15 at 14:08 -0500, Eddie James wrote:
> The scom driver currently fails out of operations if certain system
> errors are flagged in the status register; system checkstop, special
> attention, or recoverable error. These errors won't impact the ability
> of the scom engine to perform operations, so the driver should continue
> under these conditions.
> Also, don't do a PIB reset for these conditions, since it won't help.
> 
> Signed-off-by: Eddie James 

Acked-by: Benjamin Herrenschmidt 

> ---
>  drivers/fsi/fsi-scom.c | 8 +---
>  1 file changed, 1 insertion(+), 7 deletions(-)
> 
> diff --git a/drivers/fsi/fsi-scom.c b/drivers/fsi/fsi-scom.c
> index 343153d..004dc03 100644
> --- a/drivers/fsi/fsi-scom.c
> +++ b/drivers/fsi/fsi-scom.c
> @@ -38,8 +38,7 @@
>  #define SCOM_STATUS_PIB_RESP_MASK0x7000
>  #define SCOM_STATUS_PIB_RESP_SHIFT   12
>  
> -#define SCOM_STATUS_ANY_ERR  (SCOM_STATUS_ERR_SUMMARY | \
> -  SCOM_STATUS_PROTECTION | \
> +#define SCOM_STATUS_ANY_ERR  (SCOM_STATUS_PROTECTION | \
>SCOM_STATUS_PARITY | \
>SCOM_STATUS_PIB_ABORT | \
>SCOM_STATUS_PIB_RESP_MASK)
> @@ -251,11 +250,6 @@ static int handle_fsi2pib_status(struct scom_device 
> *scom, uint32_t status)
>   /* Return -EBUSY on PIB abort to force a retry */
>   if (status & SCOM_STATUS_PIB_ABORT)
>   return -EBUSY;
> - if (status & SCOM_STATUS_ERR_SUMMARY) {
> - fsi_device_write(scom->fsi_dev, SCOM_FSI2PIB_RESET_REG, ,
> -  sizeof(uint32_t));
> - return -EIO;
> - }
>   return 0;
>  }
>
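
The effect of dropping the error-summary bit from the "any error" mask
can be sketched as below. This is a userspace illustration, not the
driver code: the DEMO_* bit positions are placeholders invented for the
sketch (only the 0x7000 response mask appears in the quoted patch), and
returning -EIO for the remaining error bits is an assumption made to
keep the example short.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Placeholder bit positions, chosen only for this sketch. The PIB
 * response mask is the one value visible in the quoted patch.
 */
#define DEMO_STATUS_ERR_SUMMARY		0x0001u
#define DEMO_STATUS_PROTECTION		0x0002u
#define DEMO_STATUS_PARITY		0x0004u
#define DEMO_STATUS_PIB_ABORT		0x0008u
#define DEMO_STATUS_PIB_RESP_MASK	0x7000u

/* After the patch: the summary bit is no longer part of the mask. */
#define DEMO_STATUS_ANY_ERR	(DEMO_STATUS_PROTECTION | \
				 DEMO_STATUS_PARITY | \
				 DEMO_STATUS_PIB_ABORT | \
				 DEMO_STATUS_PIB_RESP_MASK)

static int demo_handle_status(uint32_t status)
{
	/* A status with only the summary bit set is treated as success. */
	if (!(status & DEMO_STATUS_ANY_ERR))
		return 0;

	/* PIB abort asks the caller to retry, as in the quoted hunk. */
	if (status & DEMO_STATUS_PIB_ABORT)
		return -EBUSY;

	/* Assumption: every other error bit maps to a generic I/O error. */
	return -EIO;
}
```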

Re: [EXT] [PATCH] qed: Add cleanup in qed_slowpath_start()

2019-08-20 Thread Wenwen Wang

On Tue, Aug 13, 2019 at 6:46 AM Sudarsana Reddy Kalluru
 wrote:
>
> > -Original Message-
> > From: Wenwen Wang 
> > Sent: Tuesday, August 13, 2019 3:35 PM
> > To: Wenwen Wang 
> > Cc: Ariel Elior ; GR-everest-linux-l2  > l...@marvell.com>; David S. Miller ; open
> > list:QLOGIC QL4xxx ETHERNET DRIVER ; open list
> > 
> > Subject: [EXT] [PATCH] qed: Add cleanup in qed_slowpath_start()
> >
> > External Email
> >
> > --
> > If qed_mcp_send_drv_version() fails, no cleanup is executed, leading to
> > memory leaks. To fix this issue, redirect the execution to the label 'err3'
> > before returning the error.
> >
> > Signed-off-by: Wenwen Wang 
> > ---
> >  drivers/net/ethernet/qlogic/qed/qed_main.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/ethernet/qlogic/qed/qed_main.c
> > b/drivers/net/ethernet/qlogic/qed/qed_main.c
> > index 829dd60..d16a251 100644
> > --- a/drivers/net/ethernet/qlogic/qed/qed_main.c
> > +++ b/drivers/net/ethernet/qlogic/qed/qed_main.c
> > @@ -1325,7 +1325,7 @@ static int qed_slowpath_start(struct qed_dev
> > *cdev,
> > _version);
> >   if (rc) {
> >   DP_NOTICE(cdev, "Failed sending drv version
> > command\n");
> > - return rc;
> > + goto err3;
>
> In this case, we might need to free the ll2-buf allocated at the below path 
> (?),
> 1312 /* Allocate LL2 interface if needed */
> 1313 if (QED_LEADING_HWFN(cdev)->using_ll2) {
> 1314 rc = qed_ll2_alloc_if(cdev);
> May be by adding a new goto label 'err4'.

Thanks for your suggestion! I will rework the patch.

Wenwen

>
> >   }
> >   }
> >
> > --
> > 2.7.4
>

Re: [PATCH v7 1/7] driver core: Add support for linking devices during device addition

2019-08-20 Thread Frank Rowand

On 8/20/19 7:01 PM, Saravana Kannan wrote:
> 
> 
> On Tue, Aug 20, 2019, 6:56 PM Greg Kroah-Hartman  > wrote:
> 
> On Tue, Aug 20, 2019 at 06:06:55PM -0700, Frank Rowand wrote:
> > On 8/20/19 3:10 PM, Saravana Kannan wrote:
> > > On Mon, Aug 19, 2019 at 9:25 PM Frank Rowand  > wrote:
> > >>
> > >> On 8/19/19 5:00 PM, Saravana Kannan wrote:
> > >>> On Sun, Aug 18, 2019 at 8:38 PM Frank Rowand 
> mailto:frowand.l...@gmail.com>> wrote:
> > 
> >  On 8/15/19 6:50 PM, Saravana Kannan wrote:
> > > On Wed, Aug 7, 2019 at 7:04 PM Frank Rowand 
> mailto:frowand.l...@gmail.com>> wrote:
> > >>
> > >>> Date: Tue, 23 Jul 2019 17:10:54 -0700
> > >>> Subject: [PATCH v7 1/7] driver core: Add support for linking 
> devices during
> > >>>  device addition
> > >>> From: Saravana Kannan  >
> 
> This is a "fun" thread :(
> 
> You two should get together in person this week and talk.  I think you
> both will be at ELC, can we do this tomorrow or Thursday so we can hash
> it out in a way that doesn't end up talking past each other, like I feel
> is happening here right now?
> 
> 
> That would be great. Wednesday would be better for me. I might not make it to 
> ELC on Thursday. Let us know Frank.
> 
> Thanks,
> Saravana

I am really glad that you are here at ELC.  It should be very productive to
sit down together and figure some things out.

I'll send a separate reply with my phone number off list.

-Frank

Re: [PATCH v2 0/2] Simplify mtty driver and mdev core

2019-08-20 Thread Alex Williamson

On Wed, 21 Aug 2019 03:42:25 +
Parav Pandit  wrote:

> > -Original Message-
> > From: Alex Williamson 
> > Sent: Tuesday, August 20, 2019 10:49 PM
> > To: Parav Pandit 
> > Cc: Jiri Pirko ; David S . Miller ;
> > Kirti Wankhede ; Cornelia Huck
> > ; k...@vger.kernel.org; linux-kernel@vger.kernel.org;
> > cjia ; net...@vger.kernel.org
> > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > 
> > On Tue, 20 Aug 2019 08:58:02 +
> > Parav Pandit  wrote:
> >   
> > > + Dave.
> > >
> > > Hi Jiri, Dave, Alex, Kirti, Cornelia,
> > >
> > > Please provide your feedback on it, how shall we proceed?
> > >
> > > Short summary of requirements.
> > > For a given mdev (mediated device [1]), there is one representor
> > > netdevice and devlink port in switchdev mode (similar to SR-IOV VF),
> > > And there is one netdevice for the actual mdev when mdev is probed.
> > >
> > > (a) representor netdev and devlink port should be able derive
> > > phys_port_name(). So that representor netdev name can be built
> > > deterministically across reboots.
> > >
> > > (b) for mdev's netdevice, mdev's device should have an attribute.
> > > This attribute can be used by udev rules/systemd or something else to
> > > rename netdev name deterministically.
> > >
> > > (c) IFNAMSIZ of 16 bytes is too small to fit whole UUID.
> > > A simple grep IFNAMSIZ in stack hints hundreds of users of IFNAMSIZ in
> > > drivers, uapi, netlink, boot config area and more. Changing IFNAMSIZ
> > > for a mdev bus doesn't really look reasonable option to me.  
> > 
> > How many characters do we really have to work with?  Your examples below
> > prepend various characters, ex. option-1 results in ens2f0_m10 or enm10.  Do
> > the extra 8 or 3 characters in these count against IFNAMSIZ?
> >   
> Maximum 15. Last is null termination.
> Some user-configured udev rules prefix the PF netdev interface name. I took such an
> example below, where the ens2f0 netdev name is prefixed.
> Some prefer not to prefix.
> 
> > > Hence, I would like to discuss below options.
> > >
> > > Option-1: mdev index
> > > Introduce an optional mdev index/handle as u32 during mdev create
> > > time. User passes mdev index/handle as input.
> > >
> > > phys_port_name=mIndex=m%u
> > > mdev_index will be available in sysfs as mdev attribute for udev to
> > > name the mdev's netdev.
> > >
> > > example mdev create command:
> > > UUID=$(uuidgen)
> > > echo $UUID index=10  
> > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create  
> > 
> > Nit, IIRC previous discussions of additional parameters used comma 
> > separators,
> > ex. echo $UUID,index=10 >...
> >   
> Yes, ok.
> 
> > > > example netdevs:  
> > > repnetdev=ens2f0_m10  /*ens2f0 is parent PF's netdevice */  
> > 
> > Is the parent really relevant in the name?
> No. I just picked one udev example who prefixed the parent netdev name.
> But there are users who do not prefix it.
> 
> > Tools like mdevctl are meant to
> > provide persistence, creating the same mdev devices on the same parent, but
> > that's simply the easiest policy decision.  We can also imagine that 
> > multiple
> > parent devices might support a specified mdev type and policies factoring in
> > proximity, load-balancing, power consumption, etc might be weighed such that
> > we really don't want to promote userspace creating dependencies on the
> > parent association.
> >   
> > > mdev_netdev=enm10
> > >
> > > Pros:
> > > 1. mdevctl and any other existing tools are unaffected.
> > > 2. netdev stack, ovs and other switching platforms are unaffected.
> > > 3. achieves unique phys_port_name for representor netdev 4. achieves
> > > unique mdev eth netdev name for the mdev using udev/systemd extension.
> > > 5. Aligns well with mdev and netdev subsystem and similar to existing
> > > sriov bdf's.  
> > 
> > A user provided index seems strange to me.  It's not really an index, just 
> > a user
> > specified instance number.  Presumably you have the user providing this
> > because if it really were an index, then the value depends on the creation 
> > order
> > and persistence is lost.  Now the user needs to both avoid uuid collision 
> > as well
> > as "index" number collision.  The uuid namespace is large enough to mostly
> > ignore this, but this is not.  This seems like a burden.
> >   
> I liked the term 'instance number', which is a much better way to say it than
> index/handle.
> Yes, the user needs to avoid both kinds of collision.
> UUID collisions should not occur in most cases, given the way UUIDs are generated.
> So practically the user needs to pick a unique 'instance number', similar to how
> they pick unique netdev names.
> 
> Burden to user comes from the requirement to get uniqueness.
> 
> > > Option-2: shorter mdev name
> > > Extend mdev to have shorter mdev device name in addition to UUID.
> > > such as 'foo', 'bar'.
> > > Mdev will continue to have UUID.
> > > phys_port_name=mdev_name
> > >
> > > Pros:
> > > 1. All same as option-1, except mdevctl
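
The IFNAMSIZ constraint discussed in this thread (16 bytes, so at most
15 characters plus the terminating NUL) can be checked with a small
userspace sketch. The helper names and the "<prefix>m<instance>" layout
are assumptions modelled on the ens2f0_m10 and enm10 examples above;
this is not kernel or udev code.

```c
#include <stdio.h>
#include <string.h>

#define DEMO_IFNAMSIZ 16	/* 16 bytes, as stated in the thread */

/* Returns 1 if the string fits in an interface name buffer, else 0. */
static int demo_fits(const char *s)
{
	return strlen(s) < DEMO_IFNAMSIZ;
}

/*
 * Builds "<prefix>m<instance>" into out. Returns 0 on success and -1 if
 * the name would be truncated, i.e. longer than 15 characters.
 */
static int demo_build_netdev_name(char out[DEMO_IFNAMSIZ],
				  const char *prefix, unsigned int instance)
{
	int n = snprintf(out, DEMO_IFNAMSIZ, "%sm%u", prefix, instance);

	return (n < 0 || n >= DEMO_IFNAMSIZ) ? -1 : 0;
}
```

With a parent prefix such as "ens2f0_", small instance numbers fit, but
the largest u32 value does not; without the prefix, every u32 value
fits. A 36-character UUID string never fits, which is requirement (c)
in the summary above.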

[PATCH v2] net: pch_gbe: Fix memory leaks

2019-08-20 Thread Wenwen Wang

In pch_gbe_set_ringparam(), if netif_running() returns false, 'tx_old' and
'rx_old' are not deallocated, leading to memory leaks. To fix this issue,
move the free statements to the outside of the if() statement.

Signed-off-by: Wenwen Wang 
---
 drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_ethtool.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_ethtool.c 
b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_ethtool.c
index 1a3008e..cb43919 100644
--- a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_ethtool.c
+++ b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_ethtool.c
@@ -340,12 +340,10 @@ static int pch_gbe_set_ringparam(struct net_device 
*netdev,
goto err_setup_tx;
pch_gbe_free_rx_resources(adapter, rx_old);
pch_gbe_free_tx_resources(adapter, tx_old);
-   kfree(tx_old);
-   kfree(rx_old);
-   adapter->rx_ring = rxdr;
-   adapter->tx_ring = txdr;
err = pch_gbe_up(adapter);
}
+   kfree(tx_old);
+   kfree(rx_old);
return err;
 
 err_setup_tx:
-- 
2.7.4
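
The leak this patch closes can be reproduced in miniature. The sketch
below is a userspace illustration under stated assumptions: demo_ring,
demo_adapter and the helpers are hypothetical and do not mirror the
pch_gbe structures, and demo_live_rings exists only to make a leak
observable.

```c
#include <stdlib.h>

struct demo_ring {
	void *desc;	/* per-ring resources, present only while running */
};

struct demo_adapter {
	struct demo_ring *tx_ring;
	int running;
};

static int demo_live_rings;	/* live ring containers, for leak checks */

static struct demo_ring *demo_ring_alloc(void)
{
	struct demo_ring *r = calloc(1, sizeof(*r));

	if (r)
		demo_live_rings++;
	return r;
}

static void demo_ring_free(struct demo_ring *r)
{
	if (!r)
		return;
	free(r->desc);
	free(r);
	demo_live_rings--;
}

static int demo_set_ringparam(struct demo_adapter *ad)
{
	struct demo_ring *old = ad->tx_ring;
	struct demo_ring *fresh = demo_ring_alloc();
	int err = 0;

	if (!fresh)
		return -1;

	/* The swap happens whether or not the device is running. */
	ad->tx_ring = fresh;

	if (ad->running) {
		/* Only a running device has per-ring resources to rebuild. */
		if (old) {
			free(old->desc);
			old->desc = NULL;
		}
		fresh->desc = malloc(64);
		if (!fresh->desc)
			err = -1;
	}

	/* Freed on both paths; freeing only inside the if() leaks 'old'. */
	demo_ring_free(old);
	return err;
}
```

Since the adapter already points at the new ring before the running
check, the old container has no remaining owner on either path, which
is why the free belongs outside the if() statement.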

Re: [PATCH v7 0/2] KVM: LAPIC: Implement Exitless Timer

2019-08-20 Thread Wanpeng Li

On Sat, 6 Jul 2019 at 09:26, Wanpeng Li  wrote:
>
> Dedicated instances are currently disturbed by unnecessary jitter because
> the emulated lapic timers fire on the same pCPUs on which the vCPUs reside.
> Unlike ARM, Intel has no hardware virtual timer for guests. Both
> programming the timer in the guest and the firing of the emulated timer
> incur vmexits. This patchset tries to avoid the vmexit incurred when the
> emulated timer fires in the dedicated instance scenario.
>
> When nohz_full is enabled in the dedicated instance scenario, unpinned
> timers are moved to the nearest busy housekeepers after commit
> 9642d18eee2cd (nohz: Affine unpinned timers to housekeepers) and commit
> 444969223c8 ("sched/nohz: Fix affine unpinned timers mess"). However,
> KVM always pins the lapic timer to the pCPU on which the vCPU resides; the
> reason is explained by commit 61abdbe0 (kvm: x86: make lapic hrtimer
> pinned). Actually, these emulated timers can be offloaded to the housekeeping
> cpus, since APICv has become really common in recent years. Once the emulated
> timer fires, the guest timer interrupt is injected via a posted interrupt
> delivered by the housekeeping cpu.
>
> The host admin should tune the setup carefully, e.g. in the dedicated instance
> scenario with nohz_full covering the pCPUs on which the vCPUs reside, several
> surplus pCPUs for busy housekeeping, and mwait/hlt/pause vmexits disabled to
> stay in non-root mode, a ~3% redis performance benefit can be observed on a
> Skylake server.
>
> w/o patchset:
>
> VM-EXIT             Samples  Samples%  Time%   Min Time  Max Time   Avg time
>
> EXTERNAL_INTERRUPT  42916    49.43%    39.30%  0.47us    106.09us   0.71us ( +-   1.09% )
>
> w/ patchset:
>
> VM-EXIT             Samples  Samples%  Time%   Min Time  Max Time   Avg time
>
> EXTERNAL_INTERRUPT  6871     9.29%     2.96%   0.44us    57.88us    0.72us ( +-   4.02% )
>
> Cc: Paolo Bonzini 
> Cc: Radim Krčmář 
> Cc: Marcelo Tosatti 
>
> v6 -> v7:
>  * remove bool argument
>
> v5 -> v6:
>  * don't overwrites whatever the user specified
>  * introduce kvm_can_post_timer_interrupt and kvm_use_posted_timer_interrupt
>  * remove kvm_hlt_in_guest() condition
>  * squash all of 2/3/4 together
>
> v4 -> v5:
>  * update patch description in patch 1/4
>  * feed latest apic->lapic_timer.expired_tscdeadline to 
> kvm_wait_lapic_expire()
>  * squash advance timer handling to patch 2/4
>
> v3 -> v4:
>  * drop the HRTIMER_MODE_ABS_PINNED, add kick after set pending timer
>  * don't posted inject already-expired timer
>
> v2 -> v3:
>  * disarming the vmx preemption timer when 
> posted_interrupt_inject_timer_enabled()
>  * check kvm_hlt_in_guest instead
>
> v1 -> v2:
>  * check vcpu_halt_in_guest
>  * move module parameter from kvm-intel to kvm
>  * add housekeeping_enabled
>  * rename apic_timer_expired_pi to kvm_apic_inject_pending_timer_irqs
>
>
> Wanpeng Li (2):
>   KVM: LAPIC: Make lapic timer unpinned
>   KVM: LAPIC: Inject timer interrupt via posted interrupt

There is a further optimization for this feature in the housekeeping/hrtimer
subsystem.

[1] https://lkml.org/lkml/2019/7/25/963
[2] https://lkml.org/lkml/2019/6/28/231

Patch [2] tries to optimize the worst case. However, it will not be
merged by the maintainers; I got offline confirmation that Thomas will
refactor this code to avoid predicting the future on every timer enqueue.
Anyway, it should still be considered for backporting to production
environments as long as get_nohz_timer_target() is in use.

Regards,
Wanpeng Li

Re: [PATCH 15/15] riscv: disable the EFI PECOFF header for M-mode

2019-08-20 Thread Troy Benjegerdes




> On Aug 13, 2019, at 8:47 AM, Christoph Hellwig  wrote:
> 
> No point in bloating the kernel image with a bootloader header if
> we run bare metal.

I would say the same for S-mode. EFI booting should be an option, not
a requirement. I have M-mode U-Boot working with bootelf to start BBL,
and at some point I'm hoping we can have an M-mode Linux kernel be
the SBI provider for S-mode kernels. To me, the most logical way to
start those would be from the vmlinux ELF binaries, using something
like kexec().
> 
> Signed-off-by: Christoph Hellwig 
> ---
> arch/riscv/kernel/head.S | 2 ++
> 1 file changed, 2 insertions(+)
> 
> diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
> index 670e5cacb24e..09fcf3d000c0 100644
> --- a/arch/riscv/kernel/head.S
> +++ b/arch/riscv/kernel/head.S
> @@ -16,6 +16,7 @@
> 
> __INIT
> ENTRY(_start)
> +#ifndef CONFIG_M_MODE
>   /*
>* Image header expected by Linux boot-loaders. The image header data
>* structure is described in asm/image.h.
> @@ -47,6 +48,7 @@ ENTRY(_start)
> 
> .global _start_kernel
> _start_kernel:
> +#endif /* CONFIG_M_MODE */
>   /* Mask all interrupts */
>   csrw CSR_XIE, zero
>   csrw CSR_XIP, zero
> -- 
> 2.20.1
> 
> 

Re: [PATCH] net: pch_gbe: Fix memory leaks

2019-08-20 Thread Wenwen Wang

On Thu, Aug 15, 2019 at 4:51 PM David Miller  wrote:
>
> From: Wenwen Wang 
> Date: Thu, 15 Aug 2019 16:46:05 -0400
>
> > On Thu, Aug 15, 2019 at 4:42 PM David Miller  wrote:
> >>
> >> From: Wenwen Wang 
> >> Date: Thu, 15 Aug 2019 16:03:39 -0400
> >>
> >> > On Thu, Aug 15, 2019 at 3:34 PM David Miller  wrote:
> >> >>
> >> >> From: Wenwen Wang 
> >> >> Date: Tue, 13 Aug 2019 20:33:45 -0500
> >> >>
> >> >> > In pch_gbe_set_ringparam(), if netif_running() returns false, 
> >> >> > 'tx_old' and
> >> >> > 'rx_old' are not deallocated, leading to memory leaks. To fix this 
> >> >> > issue,
> >> >> > move the free statements after the if branch.
> >> >> >
> >> >> > Signed-off-by: Wenwen Wang 
> >> >>
> >> >> Why would they be "deallocated"?  They are still assigned to
> >> >> adapter->tx_ring and adapter->rx_ring.
> >> >
> >> > 'adapter->tx_ring' and 'adapter->rx_ring' has been covered by newly
> >> > allocated 'txdr' and 'rxdr' respectively before this if statement.
> >>
> >> That only happens inside of the if() statement, that's why rx_old and
> >> tx_old are only freed in that code path.
> >
> > That happens not only inside of the if statement, but also before the
> > if statement, just after 'txdr' and 'rxdr' are allocated.
>
> Then the assignments inside of the if() statement are redundant.
>
> Something doesn't add up here, please make the code consistent.

Thanks for your suggestion! I will remove the assignments inside of
the if() statement.

Wenwen

Re: [linux-sunxi] [PATCH v5 09/15] clk: sunxi-ng: h6: Allow I2S to change parent rate

2019-08-20 Thread Chen-Yu Tsai

On Wed, Aug 14, 2019 at 2:09 PM  wrote:
>
> From: Jernej Skrabec 
>
> I2S doesn't work if the parent rate can't be changed: the difference
> between the wanted and actual rates is too big.
>
> Fix this by adding CLK_SET_RATE_PARENT flag to I2S clocks.
>
> Signed-off-by: Jernej Skrabec 

This lacks your SoB. Please reply and I can add it when applying.

ChenYu

[PATCH v2 2/3] mm/gup: introduce FOLL_PIN flag for get_user_pages()

2019-08-20 Thread John Hubbard

As explained in the newly added documentation for FOLL_PIN and
FOLL_LONGTERM, in every case where vaddr_pin_pages() is required,
FOLL_PIN must be set. That reason, plus a desire to keep FOLL_PIN
an internal (to get_user_pages() and follow_page()) detail, is why
vaddr_pin_pages() sets FOLL_PIN.

FOLL_LONGTERM, on the other hand, is only set in *some* cases, but
not all. For that reason, this patch moves the setting of FOLL_LONGTERM
out to the caller.

Also add fairly extensive documentation of the meaning and use
of both FOLL_PIN and FOLL_LONGTERM.

Thanks to Jan Kara and Vlastimil Babka for explaining the 4 cases
in this documentation. (I've reworded it and expanded on it slightly.)

The motivation behind moving away from "bare" get_user_pages() calls
is described in more detail in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Cc: Vlastimil Babka 
Cc: Jan Kara 
Cc: Michal Hocko 
Cc: Ira Weiny 
Signed-off-by: John Hubbard 
---
 drivers/infiniband/core/umem.c |  1 +
 include/linux/mm.h | 56 ++
 mm/gup.c   |  2 +-
 3 files changed, 52 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index e69eecb0023f..d84f1bfb8d21 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -300,6 +300,7 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, 
unsigned long addr,
 
while (npages) {
down_read(>mmap_sem);
+   gup_flags |= FOLL_LONGTERM;
ret = vaddr_pin_pages(cur_base,
 min_t(unsigned long, npages,
   PAGE_SIZE / sizeof (struct page *)),
diff --git a/include/linux/mm.h b/include/linux/mm.h
index bc675e94ddf8..6e7de424bf5e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2644,6 +2644,8 @@ static inline vm_fault_t vmf_error(int err)
 struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
 unsigned int foll_flags);
 
+/* Flags for follow_page(), get_user_pages ("GUP"), and vaddr_pin_pages(): */
+
 #define FOLL_WRITE 0x01/* check pte is writable */
 #define FOLL_TOUCH 0x02/* mark page accessed */
 #define FOLL_GET   0x04/* do get_page on page */
@@ -2663,13 +2665,15 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
 #define FOLL_ANON  0x8000  /* don't do file mappings */
 #define FOLL_LONGTERM  0x10000 /* mapping lifetime is indefinite: see below */
 #define FOLL_SPLIT_PMD 0x20000 /* split huge pmd before returning */
+#define FOLL_PIN   0x40000 /* pages must be released via put_user_page() */
 
 /*
- * NOTE on FOLL_LONGTERM:
+ * FOLL_PIN and FOLL_LONGTERM may be used in various combinations with each
+ * other. Here is what they mean, and how to use them:
  *
  * FOLL_LONGTERM indicates that the page will be held for an indefinite time
- * period _often_ under userspace control.  This is contrasted with
- * iov_iter_get_pages() where usages which are transient.
+ * period _often_ under userspace control.  This is in contrast to
+ * iov_iter_get_pages(), where usages which are transient.
  *
  * FIXME: For pages which are part of a filesystem, mappings are subject to the
  * lifetime enforced by the filesystem and we need guarantees that longterm
@@ -2684,11 +2688,51 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
  * Currently only get_user_pages() and get_user_pages_fast() support this flag
  * and calls to get_user_pages_[un]locked are specifically not allowed.  This
  * is due to an incompatibility with the FS DAX check and
- * FAULT_FLAG_ALLOW_RETRY
+ * FAULT_FLAG_ALLOW_RETRY.
  *
- * In the CMA case: longterm pins in a CMA region would unnecessarily fragment
- * that region.  And so CMA attempts to migrate the page before pinning when
+ * In the CMA case: long term pins in a CMA region would unnecessarily fragment
+ * that region.  And so, CMA attempts to migrate the page before pinning, when
  * FOLL_LONGTERM is specified.
+ *
+ * FOLL_PIN indicates that a special kind of tracking (not just page->_refcount,
+ * but an additional pin counting system) will be invoked. This is intended for
+ * anything that gets a page reference and then touches page data (for example,
+ * Direct IO). This lets the filesystem know that some non-file-system entity is
+ * potentially changing the pages' data. FOLL_PIN pages must be released,
+ * ultimately, by a call to put_user_page(). Typically that will be via one of
+ * the vaddr_unpin_pages() variants.
+ *
+ * FIXME: note that this special tracking is not in place yet. However, the
+ * pages should still be released by put_user_page().
+ *
+ * When and where to use each flag:
+ *
+ * CASE 1: Direct IO (DIO). There are GUP references to pages that are serving
+ * as DIO buffers. These buffers are needed for a relatively

[PATCH v2 3/3] mm/gup: introduce vaddr_pin_pages_remote(), and invoke it

2019-08-20 Thread John Hubbard

vaddr_pin_user_pages_remote() is the "vaddr_pin_pages" corresponding
variant to get_user_pages_remote(), except that:
   a) it sets FOLL_PIN, and
   b) it can handle FOLL_LONGTERM (and the associated vaddr_pin arg).

Change process_vm_rw_single_vec() to invoke the new function.

Signed-off-by: John Hubbard 
Cc: Ira Weiny 
---
 include/linux/mm.h |  5 +
 mm/gup.c   | 34 ++
 mm/process_vm_access.c | 23 +--
 3 files changed, 52 insertions(+), 10 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6e7de424bf5e..849b509e9f89 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1606,6 +1606,11 @@ int __account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc,
 long vaddr_pin_pages(unsigned long addr, unsigned long nr_pages,
 unsigned int gup_flags, struct page **pages,
 struct vaddr_pin *vaddr_pin);
+long vaddr_pin_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
+unsigned long start, unsigned long nr_pages,
+unsigned int gup_flags, struct page **pages,
+struct vm_area_struct **vmas, int *locked,
+struct vaddr_pin *vaddr_pin);
 void vaddr_unpin_pages(struct page **pages, unsigned long nr_pages,
   struct vaddr_pin *vaddr_pin, bool make_dirty);
 bool mapping_inode_has_layout(struct vaddr_pin *vaddr_pin, struct page *page);
diff --git a/mm/gup.c b/mm/gup.c
index ba316d960d7a..d713ed9d4b9a 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2522,3 +2522,37 @@ void vaddr_unpin_pages(struct page **pages, unsigned long nr_pages,
__put_user_pages_dirty_lock(pages, nr_pages, make_dirty, vaddr_pin);
 }
 EXPORT_SYMBOL(vaddr_unpin_pages);
+
+/**
+ * vaddr_pin_user_pages_remote() - pin pages by virtual address and return the
+ * pages to the user.
+ *
+ * @tsk:   the task_struct to use for page fault accounting, or
+ * NULL if faults are not to be recorded.
+ * @mm:mm_struct of target mm
+ * @addr:  start address
+ * @nr_pages:  number of pages to pin
+ * @gup_flags: flags to use for the pin. Please see FOLL_* documentation in
+ * mm.h.
+ * @pages: array of pages returned
+ * @vaddr_pin:  If FOLL_LONGTERM is set, then vaddr_pin should point to an
+ * initialized struct that contains the owning mm and file. Otherwise, vaddr_pin
+ * should be set to NULL.
+ *
+ * This is the "vaddr_pin_pages" corresponding variant to
+ * get_user_pages_remote(), except that:
+ *a) it sets FOLL_PIN, and
+ *b) it can handle FOLL_LONGTERM (and the associated vaddr_pin arg).
+ */
+long vaddr_pin_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
+unsigned long start, unsigned long nr_pages,
+unsigned int gup_flags, struct page **pages,
+struct vm_area_struct **vmas, int *locked,
+struct vaddr_pin *vaddr_pin)
+{
+   gup_flags |= FOLL_TOUCH | FOLL_REMOTE | FOLL_PIN;
+
+   return __get_user_pages_locked(tsk, mm, start, nr_pages, pages, vmas,
+  locked, gup_flags, vaddr_pin);
+}
+EXPORT_SYMBOL(vaddr_pin_user_pages_remote);
diff --git a/mm/process_vm_access.c b/mm/process_vm_access.c
index 357aa7bef6c0..28e0a17b6080 100644
--- a/mm/process_vm_access.c
+++ b/mm/process_vm_access.c
@@ -44,7 +44,6 @@ static int process_vm_rw_pages(struct page **pages,
 
if (vm_write) {
copied = copy_page_from_iter(page, offset, copy, iter);
-   set_page_dirty_lock(page);
} else {
copied = copy_page_to_iter(page, offset, copy, iter);
}
@@ -96,7 +95,7 @@ static int process_vm_rw_single_vec(unsigned long addr,
flags |= FOLL_WRITE;
 
while (!rc && nr_pages && iov_iter_count(iter)) {
-   int pages = min(nr_pages, max_pages_per_loop);
+   int pinned_pages = min(nr_pages, max_pages_per_loop);
int locked = 1;
size_t bytes;
 
@@ -106,14 +105,17 @@ static int process_vm_rw_single_vec(unsigned long addr,
 * current/current->mm
 */
down_read(&mm->mmap_sem);
-   pages = get_user_pages_remote(task, mm, pa, pages, flags,
- process_pages, NULL, &locked);
+
+   pinned_pages = vaddr_pin_user_pages_remote(task, mm, pa,
+  pinned_pages, flags,
+  process_pages, NULL,
+  &locked, NULL);
if (locked)
up_read(&mm->mmap_sem);
-   if (pages

[PATCH v2 1/3] For Ira: tiny formatting tweak to kerneldoc

2019-08-20 Thread John Hubbard

For your vaddr_pin_pages() and vaddr_unpin_pages().
Just merge it into wherever it goes please. Didn't want to
cause merge problems so it's a separate patch-let.

Signed-off-by: John Hubbard 
---
 mm/gup.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 56421b880325..e49096d012ea 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2465,7 +2465,7 @@ int get_user_pages_fast(unsigned long start, int nr_pages,
 EXPORT_SYMBOL_GPL(get_user_pages_fast);
 
 /**
- * vaddr_pin_pages pin pages by virtual address and return the pages to the
+ * vaddr_pin_pages() - pin pages by virtual address and return the pages to the
  * user.
  *
  * @addr: start address
@@ -2505,7 +2505,7 @@ long vaddr_pin_pages(unsigned long addr, unsigned long nr_pages,
 EXPORT_SYMBOL(vaddr_pin_pages);
 
 /**
- * vaddr_unpin_pages - counterpart to vaddr_pin_pages
+ * vaddr_unpin_pages() - counterpart to vaddr_pin_pages
  *
  * @pages: array of pages returned
  * @nr_pages: number of pages in pages
-- 
2.22.1

disregard: [PATCH 1/4] checkpatch: revert broken NOTIFIER_HEAD check

2019-08-20 Thread John Hubbard


On 8/20/19 9:03 PM, John Hubbard wrote:

commit 1a47005dd5aa ("checkpatch: add *_NOTIFIER_HEAD as var
definition") causes the following warning when run on some
patches:



Please disregard this series. It's stale.

thanks,
--
John Hubbard
NVIDIA

[PATCH v2 0/3] mm/gup: introduce vaddr_pin_pages_remote(), FOLL_PIN

2019-08-20 Thread John Hubbard

Hi Ira,

This is for your tree. I'm dropping the RFC because this aspect is
starting to firm up pretty well.

I've moved FOLL_PIN inside the vaddr_pin_*() routines, and moved
FOLL_LONGTERM outside, based on our recent discussions. This is
documented pretty well within the patches.

Note that there are a lot of references in comments and commit
logs, to vaddr_pin_pages(). We'll want to catch all of those if
we rename that. I am pushing pretty hard to rename it to
vaddr_pin_user_pages().

v1 of this may be found here:
https://lore.kernel.org/r/20190812015044.26176-1-jhubb...@nvidia.com

John Hubbard (3):
  For Ira: tiny formatting tweak to kerneldoc
  mm/gup: introduce FOLL_PIN flag for get_user_pages()
  mm/gup: introduce vaddr_pin_pages_remote(), and invoke it

 drivers/infiniband/core/umem.c |  1 +
 include/linux/mm.h | 61 ++
 mm/gup.c   | 40 --
 mm/process_vm_access.c | 23 +++--
 4 files changed, 106 insertions(+), 19 deletions(-)

-- 
2.22.1

[PATCH 3/4] mm/gup: introduce FOLL_PIN flag for get_user_pages()

2019-08-20 Thread John Hubbard

FOLL_PIN is set by callers of vaddr_pin_pages(). This is different
than FOLL_LONGTERM, because even short term page pins need a new kind
of tracking, if those pinned pages' data is going to potentially
be modified.

This situation is described in more detail in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

FOLL_PIN is added now, rather than waiting until there is code that
takes action based on FOLL_PIN. That's because having FOLL_PIN in
the code helps to highlight the differences between:

a) get_user_pages(): soon to be deprecated. Used to pin pages,
   but without awareness of file systems that might use those
   pages,

b) The original vaddr_pin_pages(): intended only for
   FOLL_LONGTERM and DAX use cases. This assumes direct IO
   and therefore is not applicable to most of the other
   callers of get_user_pages(), and

Also add fairly extensive documentation of the meaning and use
of both FOLL_PIN and FOLL_LONGTERM.

Thanks to Jan Kara and Vlastimil Babka for explaining the 4 cases
in this documentation. (I've reworded it and expanded on it slightly.)

Cc: Vlastimil Babka 
Cc: Jan Kara 
Cc: Michal Hocko 
Cc: Ira Weiny 
Signed-off-by: John Hubbard 
---
 include/linux/mm.h | 56 +-
 1 file changed, 50 insertions(+), 6 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index bc675e94ddf8..6e7de424bf5e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2644,6 +2644,8 @@ static inline vm_fault_t vmf_error(int err)
 struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
 unsigned int foll_flags);
 
+/* Flags for follow_page(), get_user_pages ("GUP"), and vaddr_pin_pages(): */
+
 #define FOLL_WRITE 0x01/* check pte is writable */
 #define FOLL_TOUCH 0x02/* mark page accessed */
 #define FOLL_GET   0x04/* do get_page on page */
@@ -2663,13 +2665,15 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
 #define FOLL_ANON  0x8000  /* don't do file mappings */
 #define FOLL_LONGTERM  0x10000 /* mapping lifetime is indefinite: see below */
 #define FOLL_SPLIT_PMD 0x20000 /* split huge pmd before returning */
+#define FOLL_PIN   0x40000 /* pages must be released via put_user_page() */
 
 /*
- * NOTE on FOLL_LONGTERM:
+ * FOLL_PIN and FOLL_LONGTERM may be used in various combinations with each
+ * other. Here is what they mean, and how to use them:
  *
  * FOLL_LONGTERM indicates that the page will be held for an indefinite time
- * period _often_ under userspace control.  This is contrasted with
- * iov_iter_get_pages() where usages which are transient.
+ * period _often_ under userspace control.  This is in contrast to
+ * iov_iter_get_pages(), where usages which are transient.
  *
  * FIXME: For pages which are part of a filesystem, mappings are subject to the
  * lifetime enforced by the filesystem and we need guarantees that longterm
@@ -2684,11 +2688,51 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
  * Currently only get_user_pages() and get_user_pages_fast() support this flag
  * and calls to get_user_pages_[un]locked are specifically not allowed.  This
  * is due to an incompatibility with the FS DAX check and
- * FAULT_FLAG_ALLOW_RETRY
+ * FAULT_FLAG_ALLOW_RETRY.
  *
- * In the CMA case: longterm pins in a CMA region would unnecessarily fragment
- * that region.  And so CMA attempts to migrate the page before pinning when
+ * In the CMA case: long term pins in a CMA region would unnecessarily fragment
+ * that region.  And so, CMA attempts to migrate the page before pinning, when
  * FOLL_LONGTERM is specified.
+ *
+ * FOLL_PIN indicates that a special kind of tracking (not just page->_refcount,
+ * but an additional pin counting system) will be invoked. This is intended for
+ * anything that gets a page reference and then touches page data (for example,
+ * Direct IO). This lets the filesystem know that some non-file-system entity is
+ * potentially changing the pages' data. FOLL_PIN pages must be released,
+ * ultimately, by a call to put_user_page(). Typically that will be via one of
+ * the vaddr_unpin_pages() variants.
+ *
+ * FIXME: note that this special tracking is not in place yet. However, the
+ * pages should still be released by put_user_page().
+ *
+ * When and where to use each flag:
+ *
+ * CASE 1: Direct IO (DIO). There are GUP references to pages that are serving
+ * as DIO buffers. These buffers are needed for a relatively short time (so they
+ * are not "long term"). No special synchronization with page_mkclean() or
+ * munmap() is provided. Therefore, flags to set at the call site are:
+ *
+ * FOLL_PIN
+ *
+ * CASE 2: RDMA. There are GUP references to pages that are serving as DMA
+ * buffers. These buffers are needed for a long time ("long term"). No special
+ * synchronization with page_mkclean() or munmap() is

[PATCH 1/4] checkpatch: revert broken NOTIFIER_HEAD check

2019-08-20 Thread John Hubbard

commit 1a47005dd5aa ("checkpatch: add *_NOTIFIER_HEAD as var
definition") causes the following warning when run on some
patches:

Unescaped left brace in regex is passed through in regex;
marked by < --HERE in m/(?:
...
   [238 lines of appalling perl output, mercifully not included]
...
)/ at ./scripts/checkpatch.pl line 3889.

This is broken, so revert it until a better solution is found.

Fixes: 1a47005dd5aa ("checkpatch: add *_NOTIFIER_HEAD as var
definition")

Cc: Andy Whitcroft 
Cc: Joe Perches 
Cc: Gilad Ben-Yossef 
Cc: Ofir Drang 
Cc: Andrew Morton 
Signed-off-by: John Hubbard 
---
 scripts/checkpatch.pl | 1 -
 1 file changed, 1 deletion(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 5c00151cdee8..284eb4bd84aa 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3891,7 +3891,6 @@ sub process {
^.DEFINE_$Ident\(\Q$name\E\)|
^.DECLARE_$Ident\(\Q$name\E\)|
^.LIST_HEAD\(\Q$name\E\)|
-   ^.{$Ident}_NOTIFIER_HEAD\(\Q$name\E\)|

^.(?:$Storage\s+)?$Type\s*\(\s*\*\s*\Q$name\E\s*\)\s*\(|
\b\Q$name\E(?:\s+$Attribute)*\s*(?:;|=|\[|\()
)/x) {
-- 
2.22.1

[PATCH 4/4] mm/gup: introduce vaddr_pin_pages_remote(), and invoke it

2019-08-20 Thread John Hubbard

vaddr_pin_user_pages_remote() is the "vaddr_pin_pages" corresponding
variant to get_user_pages_remote(): it adds the ability to handle
FOLL_PIN, FOLL_LONGTERM, or both.

Note that the put_user_page*() requirement won't be truly required until
all of the call sites have been converted, and the tracking of pages is
activated.

Also, change process_vm_rw_single_vec() to invoke the new function.

Signed-off-by: John Hubbard 
---
 include/linux/mm.h |  5 +
 mm/gup.c   | 33 +
 mm/process_vm_access.c | 23 ++-
 3 files changed, 52 insertions(+), 9 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6e7de424bf5e..849b509e9f89 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1606,6 +1606,11 @@ int __account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc,
 long vaddr_pin_pages(unsigned long addr, unsigned long nr_pages,
 unsigned int gup_flags, struct page **pages,
 struct vaddr_pin *vaddr_pin);
+long vaddr_pin_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
+unsigned long start, unsigned long nr_pages,
+unsigned int gup_flags, struct page **pages,
+struct vm_area_struct **vmas, int *locked,
+struct vaddr_pin *vaddr_pin);
 void vaddr_unpin_pages(struct page **pages, unsigned long nr_pages,
   struct vaddr_pin *vaddr_pin, bool make_dirty);
 bool mapping_inode_has_layout(struct vaddr_pin *vaddr_pin, struct page *page);
diff --git a/mm/gup.c b/mm/gup.c
index e49096d012ea..d7ce9b38178f 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2522,3 +2522,36 @@ void vaddr_unpin_pages(struct page **pages, unsigned long nr_pages,
__put_user_pages_dirty_lock(pages, nr_pages, make_dirty, vaddr_pin);
 }
 EXPORT_SYMBOL(vaddr_unpin_pages);
+
+/**
+ * vaddr_pin_user_pages_remote() - pin pages by virtual address and return the
+ * pages to the user.
+ *
+ * @tsk:   the task_struct to use for page fault accounting, or
+ * NULL if faults are not to be recorded.
+ * @mm:mm_struct of target mm
+ * @addr:  start address
+ * @nr_pages:  number of pages to pin
+ * @gup_flags: flags to use for the pin. Please see FOLL_* documentation in
+ * mm.h.
+ * @pages: array of pages returned
+ * @vaddr_pin:  If FOLL_LONGTERM is set, then vaddr_pin should point to an
+ * initialized struct that contains the owning mm and file. Otherwise, vaddr_pin
+ * should be set to NULL.
+ *
+ * This is the "vaddr_pin_pages" corresponding variant to
+ * get_user_pages_remote(), but with the ability to handle FOLL_PIN,
+ * FOLL_LONGTERM, or both.
+ */
+long vaddr_pin_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
+unsigned long start, unsigned long nr_pages,
+unsigned int gup_flags, struct page **pages,
+struct vm_area_struct **vmas, int *locked,
+struct vaddr_pin *vaddr_pin)
+{
+   gup_flags |= FOLL_TOUCH | FOLL_REMOTE;
+
+   return __get_user_pages_locked(tsk, mm, start, nr_pages, pages, vmas,
+  locked, gup_flags, vaddr_pin);
+}
+EXPORT_SYMBOL(vaddr_pin_user_pages_remote);
diff --git a/mm/process_vm_access.c b/mm/process_vm_access.c
index 357aa7bef6c0..e08c1f760ad4 100644
--- a/mm/process_vm_access.c
+++ b/mm/process_vm_access.c
@@ -96,7 +96,7 @@ static int process_vm_rw_single_vec(unsigned long addr,
flags |= FOLL_WRITE;
 
while (!rc && nr_pages && iov_iter_count(iter)) {
-   int pages = min(nr_pages, max_pages_per_loop);
+   int pinned_pages = min(nr_pages, max_pages_per_loop);
int locked = 1;
size_t bytes;
 
@@ -106,14 +106,18 @@ static int process_vm_rw_single_vec(unsigned long addr,
 * current/current->mm
 */
down_read(&mm->mmap_sem);
-   pages = get_user_pages_remote(task, mm, pa, pages, flags,
- process_pages, NULL, &locked);
+
+   flags |= FOLL_PIN;
+   pinned_pages = vaddr_pin_user_pages_remote(task, mm, pa,
+  pinned_pages, flags,
+  process_pages, NULL,
+  &locked, NULL);
if (locked)
up_read(&mm->mmap_sem);
-   if (pages <= 0)
+   if (pinned_pages <= 0)
return -EFAULT;
 
-   bytes = pages * PAGE_SIZE - start_offset;
+   bytes = pinned_pages * PAGE_SIZE - start_offset;
if (bytes > len)
bytes = len;
 
@@

[PATCH 2/4] For Ira: tiny formatting tweak to kerneldoc

2019-08-20 Thread John Hubbard

For your vaddr_pin_pages() and vaddr_unpin_pages().
Just merge it into wherever it goes please. Didn't want to
cause merge problems so it's a separate patch-let.

Signed-off-by: John Hubbard 
---
 mm/gup.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 56421b880325..e49096d012ea 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2465,7 +2465,7 @@ int get_user_pages_fast(unsigned long start, int nr_pages,
 EXPORT_SYMBOL_GPL(get_user_pages_fast);
 
 /**
- * vaddr_pin_pages pin pages by virtual address and return the pages to the
+ * vaddr_pin_pages() - pin pages by virtual address and return the pages to the
  * user.
  *
  * @addr: start address
@@ -2505,7 +2505,7 @@ long vaddr_pin_pages(unsigned long addr, unsigned long nr_pages,
 EXPORT_SYMBOL(vaddr_pin_pages);
 
 /**
- * vaddr_unpin_pages - counterpart to vaddr_pin_pages
+ * vaddr_unpin_pages() - counterpart to vaddr_pin_pages
  *
  * @pages: array of pages returned
  * @nr_pages: number of pages in pages
-- 
2.22.1

Re: rfc: treewide scripted patch mechanism? (was: Re: [PATCH] Makefile: Convert -Wimplicit-fallthrough=3 to just -Wimplicit-fallthrough for clang)

2019-08-20 Thread Willy Tarreau

On Tue, Aug 20, 2019 at 05:43:27PM -0700, Linus Torvalds wrote:
> I would seriously suggest doing something like
> 
>copy_string( dst, dstsize, src, srcsize, FLAGS );
> 
> where FLAGS might be "pad" or whatever. Make it return the size of the
> resulting string, because while it can be convenient to pass 'dst' on,
> it's not useful.

I actually like this a lot. FLAGS could also indicate whether or not a
zero before srcsize ends the copy or not, allowing to copy substrings
of known length or known valid strings of unknown length by passing ~0
in srcsize. And it could also indicate whether the returned value should
indicate how much was copied or how much would have been needed for the
copy to work (so that testing (result <= dstsize) indicates truncation).
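
To make the idea concrete, here is a rough userspace sketch. Everything about it is an assumption made for illustration: the flag names CS_PAD and CS_NEEDED are invented, and the exact return-value semantics are just one possible reading of the proposal. Nothing like this exists in the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flags, invented for this sketch. */
#define CS_PAD    0x1	/* zero-fill the unused tail of dst */
#define CS_NEEDED 0x2	/* return the bytes needed (incl. NUL), not the bytes copied */

/*
 * Copy at most srcsize bytes of src, stopping at the first NUL, into dst.
 * dst is always NUL-terminated. Without CS_NEEDED the return value is the
 * length of the resulting string; with CS_NEEDED it is the number of bytes
 * dst would need, so a result greater than dstsize means truncation.
 */
static size_t copy_string(char *dst, size_t dstsize,
			  const char *src, size_t srcsize, unsigned int flags)
{
	size_t srclen = 0;
	size_t n;

	assert(dst && src && dstsize > 0);

	/* Bounded length: works for substrings and for srcsize == ~0. */
	while (srclen < srcsize && src[srclen] != '\0')
		srclen++;

	n = srclen < dstsize ? srclen : dstsize - 1;
	memcpy(dst, src, n);
	dst[n] = '\0';

	if (flags & CS_PAD)
		memset(dst + n + 1, 0, dstsize - n - 1);

	return (flags & CS_NEEDED) ? srclen + 1 : n;
}
```

With a 4-byte destination and the source "abcdef", the plain call leaves "abc" in dst and returns 3, while the CS_NEEDED variant returns 7, which is larger than dstsize and so signals truncation.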

> And then maybe just add the helper macro that turns an array into a
> "pointer, size" combination, rather than yet another letter jumble.

I did exactly this in some of my projects including haproxy, I called
the lib "ist" for "indirect string", and found it extremely convenient
to use because many functions now return an ist or take an ist as an
argument. Passing a structure of only two elements results in passing
only two registers, and that's the same for the return value. Moreover,
the compiler is smart enough to eliminate a *lot* of manipulations, and
to replace pointer dereferences with direct register manipulations. I
do have a lot of ist("foo") spread everywhere in the code, which makes
a struct ist from the string and its length, and when you look at the
code, the compiler used immediate values for both the string and its
length. It's also extremely convenient for string comparisons and
lookups because you start by checking the length and can eliminate
lookups and dereferences, making keyword parsers very efficient. It
also allows us to have an istcat() function doing like strncat() but
with the output size always known so that there's no risk of appending
past the end when the starting point doesn't match the beginning of a
string.
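
For readers who do not want to open the header, a much simplified sketch of the concept follows. It is not copied from ist.h: the struct layout and helper signatures here are assumptions for illustration, and sketch_istcat() is a hypothetical helper that only mimics the bounded-append behaviour described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified "indirect string": a pointer plus a length, no NUL required. */
struct ist {
	char *ptr;
	size_t len;
};

/* Build an ist from a NUL-terminated string. */
static inline struct ist ist(char *s)
{
	struct ist r = { s, strlen(s) };
	return r;
}

/* Compare lengths first, so mismatching sizes never touch the bytes. */
static inline int isteq(struct ist a, struct ist b)
{
	return a.len == b.len && memcmp(a.ptr, b.ptr, a.len) == 0;
}

/*
 * Hypothetical bounded append: copy as much of src as fits after the
 * dst->len bytes already present, given a total capacity of cap bytes.
 * Returns the new length; never writes past dst->ptr + cap.
 */
static inline size_t sketch_istcat(struct ist *dst, struct ist src, size_t cap)
{
	size_t room, n;

	assert(dst->len <= cap);
	room = cap - dst->len;
	n = src.len < room ? src.len : room;
	memcpy(dst->ptr + dst->len, src.ptr, n);
	dst->len += n;
	return dst->len;
}
```

Because the struct holds only two words, it can be passed and returned by value, which is what lets the compiler keep both fields in registers.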

I must confess that I became quite addicted to using this because it's
so convenient not to have to care about string length or zero
termination anymore, without the overhead of calling strlen() on
resulting values!

To illustrate the simplicity, the code is here:
http://git.haproxy.org/?p=haproxy.git;a=blob_plain;f=include/common/ist.h

And here are a few examples of usage:
  - declaration in arrays:

http://git.haproxy.org/?p=haproxy.git;a=blob;f=contrib/prometheus-exporter/service-prometheus.c;h=9b9ef2ea8e2e8ee0cc63364500d39fc08009fb8d;hb=HEAD#l644
  - appending to a string:

http://git.haproxy.org/?p=haproxy.git;a=blob;f=contrib/prometheus-exporter/service-prometheus.c;h=9b9ef2ea8e2e8ee0cc63364500d39fc08009fb8d;hb=HEAD#l1112
  - passing as function arguments:

http://git.haproxy.org/?p=haproxy.git;a=blob;f=src/http_ana.c;h=b2069e3ead59e7bcde45ac76a1c6b0b6b5fb3882;hb=HEAD#l2468

http://git.haproxy.org/?p=haproxy.git;a=blob;f=src/http_ana.c;h=b2069e3ead59e7bcde45ac76a1c6b0b6b5fb3882;hb=HEAD#l2602
  - checking for known values:

http://git.haproxy.org/?p=haproxy.git;a=blob;f=src/h2.c;h=c41da8e5ee116e75e4719709527511c299a3657c;hb=HEAD#l295

I'm personally totally convinced by this approach and am slowly improving
this interface to progressively use it everywhere, and quite frankly I
can only strongly recommend going in the same direction for ease of
use, safety, and efficiency.

Willy

RE: [PATCH v2 0/2] Simplify mtty driver and mdev core

2019-08-20 Thread Parav Pandit




> -Original Message-
> From: Cornelia Huck 
> Sent: Tuesday, August 20, 2019 11:25 PM
> To: Alex Williamson 
> Cc: Parav Pandit ; Jiri Pirko ;
> David S . Miller ; Kirti Wankhede
> ; k...@vger.kernel.org; linux-
> ker...@vger.kernel.org; cjia ; net...@vger.kernel.org
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> 
> On Tue, 20 Aug 2019 11:19:04 -0600
> Alex Williamson  wrote:
> 
> > What about an alias based on the uuid?  For example, we use 160-bit
> > sha1s daily with git (uuids are only 128-bit), but we generally don't
> > reference git commits with the full 20 character string.  Generally 12
> > characters is recommended to avoid ambiguity.  Could mdev
> > automatically create an abbreviated sha1 alias for the device?  If so,
> > how many characters should we use and what do we do on collision?  The
> > colliding device could add enough alias characters to disambiguate (we
> > likely couldn't re-alias the existing device to disambiguate, but I'm
> > not sure it matters, userspace has sysfs to associate aliases).  Ex.
> >
> > UUID=$(uuidgen)
> > ALIAS=$(echo $UUID | sha1sum | colrm 13)
> >
> > Since there seems to be some prefix overhead, as I ask about above in
> > how many characters we actually have to work with in IFNAMESZ, maybe
> > we start with 8 characters (matching your "index" namespace) and
> > expand as necessary for disambiguation.  If we can eliminate overhead
> > in IFNAMESZ, let's start with 12.  Thanks,
> >
> > Alex
> 
> I really like that idea, and it seems the best option proposed yet, as we
> don't need to create a secondary identifier.

Having the user set this alias at mdev creation time, and exposing it via
sysfs as a read-only attribute, works.
It would be exposed to vendor drivers as
const char *mdev_alias(struct mdev_device *dev).

[PATCH 4/4] video/logo: move pnmtologo tool to drivers/video/logo/ from scripts/

2019-08-20 Thread Masahiro Yamada

This tool is only used by drivers/video/logo/Makefile. No reason to
keep it in scripts/.

Signed-off-by: Masahiro Yamada 
---

 drivers/video/logo/.gitignore   |  1 +
 drivers/video/logo/Makefile | 10 +-
 {scripts => drivers/video/logo}/pnmtologo.c |  0
 scripts/.gitignore  |  1 -
 scripts/Makefile|  2 --
 5 files changed, 6 insertions(+), 8 deletions(-)
 rename {scripts => drivers/video/logo}/pnmtologo.c (100%)

diff --git a/drivers/video/logo/.gitignore b/drivers/video/logo/.gitignore
index e48355f538fa..9dda1b26b2e4 100644
--- a/drivers/video/logo/.gitignore
+++ b/drivers/video/logo/.gitignore
@@ -5,3 +5,4 @@
 *_vga16.c
 *_clut224.c
 *_gray256.c
+pnmtologo
diff --git a/drivers/video/logo/Makefile b/drivers/video/logo/Makefile
index 7d672d40bf01..bcda657493a4 100644
--- a/drivers/video/logo/Makefile
+++ b/drivers/video/logo/Makefile
@@ -18,19 +18,19 @@ obj-$(CONFIG_SPU_BASE)  += logo_spe_clut224.o
 
 # How to generate logo's
 
-pnmtologo := scripts/pnmtologo
+hostprogs-y := pnmtologo
 
 # Create commands like "pnmtologo -t mono -n logo_mac_mono -o ..."
 quiet_cmd_logo = LOGO$@
-  cmd_logo = $(pnmtologo) -t $(lastword $(subst _, ,$*)) -n $* -o $@ $<
+  cmd_logo = $(obj)/pnmtologo -t $(lastword $(subst _, ,$*)) -n $* -o $@ $<
 
-$(obj)/%.c: $(src)/%.pbm $(pnmtologo) FORCE
+$(obj)/%.c: $(src)/%.pbm $(obj)/pnmtologo FORCE
$(call if_changed,logo)
 
-$(obj)/%.c: $(src)/%.ppm $(pnmtologo) FORCE
+$(obj)/%.c: $(src)/%.ppm $(obj)/pnmtologo FORCE
$(call if_changed,logo)
 
-$(obj)/%.c: $(src)/%.pgm $(pnmtologo) FORCE
+$(obj)/%.c: $(src)/%.pgm $(obj)/pnmtologo FORCE
$(call if_changed,logo)
 
 # generated C files
diff --git a/scripts/pnmtologo.c b/drivers/video/logo/pnmtologo.c
similarity index 100%
rename from scripts/pnmtologo.c
rename to drivers/video/logo/pnmtologo.c
diff --git a/scripts/.gitignore b/scripts/.gitignore
index 17f8cef88fa8..4aa1806c59c2 100644
--- a/scripts/.gitignore
+++ b/scripts/.gitignore
@@ -4,7 +4,6 @@
 bin2c
 conmakehash
 kallsyms
-pnmtologo
 unifdef
 recordmcount
 sortextable
diff --git a/scripts/Makefile b/scripts/Makefile
index 16bcb8087899..709df809f892 100644
--- a/scripts/Makefile
+++ b/scripts/Makefile
@@ -4,7 +4,6 @@
 # the kernel for the build process.
 # ---
 # kallsyms:  Find all symbols in vmlinux
-# pnmttologo:Convert pnm files to logo files
 # conmakehash:   Create chartable
 # conmakehash:  Create arrays for initializing the kernel console tables
 
@@ -12,7 +11,6 @@ HOST_EXTRACFLAGS += -I$(srctree)/tools/include
 
 hostprogs-$(CONFIG_BUILD_BIN2C)  += bin2c
 hostprogs-$(CONFIG_KALLSYMS) += kallsyms
-hostprogs-$(CONFIG_LOGO) += pnmtologo
 hostprogs-$(CONFIG_VT)   += conmakehash
 hostprogs-$(BUILD_C_RECORDMCOUNT) += recordmcount
 hostprogs-$(CONFIG_BUILDTIME_EXTABLE_SORT) += sortextable
-- 
2.17.1

[PATCH] aio: Fix io_pgetevents() struct __compat_aio_sigset layout

2019-08-20 Thread Guillem Jover

This type is used to pass the sigset_t from userland to the kernel,
but it was using the kernel native pointer type for the member
representing the compat userland pointer to the userland sigset_t.

This messes up the layout, and makes the kernel eat up both the
userland pointer and the size members into the kernel pointer, and
then reads garbage into the kernel sigsetsize. Which makes the sigset_t
size consistency check fail, and consequently the syscall always
returns -EINVAL.

This breaks both libaio and strace on 32-bit userland running on 64-bit
kernels. And there are apparently no users in the wild of the current
broken layout (at least according to codesearch.debian.org and a brief
check over github.com search). So it looks safe to fix this directly
in the kernel, instead of either letting userland deal with this
permanently with the additional overhead or trying to make the syscall
infer what layout userland used, even though this is also being worked
around in libaio to temporarily cope with kernels that have not yet
been fixed.

We use a proper compat_uptr_t instead of a compat_sigset_t pointer.
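
The layout problem can be reproduced in userspace with a small sketch. This is illustrative only: the sketch_* names are hypothetical stand-ins, and uint32_t is assumed as the width of the kernel's compat_uptr_t and compat_size_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the kernel compat types, assumed to be 32 bits wide. */
typedef uint32_t sketch_compat_uptr_t;
typedef uint32_t sketch_compat_size_t;

/* What 32-bit userland passes: two adjacent 32-bit fields. */
struct sketch_compat_aio_sigset_fixed {
	sketch_compat_uptr_t sigmask;
	sketch_compat_size_t sigsetsize;
};

/* The broken variant: a native pointer where the 32-bit value belongs. */
struct sketch_compat_aio_sigset_broken {
	void *sigmask;
	sketch_compat_size_t sigsetsize;
};

/* The fixed layout is 8 bytes regardless of the host pointer size. */
static_assert(sizeof(struct sketch_compat_aio_sigset_fixed) == 8,
	      "compat layout must be two 32-bit fields");

/*
 * Returns nonzero only if the broken struct happens to match what 32-bit
 * userland passes. On a 64-bit build the pointer member covers both
 * userland fields, so sigsetsize is read from beyond them.
 */
static int sketch_layouts_match(void)
{
	return sizeof(struct sketch_compat_aio_sigset_broken) ==
	       sizeof(struct sketch_compat_aio_sigset_fixed) &&
	       offsetof(struct sketch_compat_aio_sigset_broken, sigsetsize) ==
	       offsetof(struct sketch_compat_aio_sigset_fixed, sigsetsize);
}
```

On a 64-bit build sketch_layouts_match() returns 0, which corresponds to the garbage sigsetsize described above.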

Fixes: 7a074e96 ("aio: implement io_pgetevents")
Signed-off-by: Guillem Jover 
---
 fs/aio.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 01e0fb9ae45a..056f291bc66f 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -2179,7 +2179,7 @@ SYSCALL_DEFINE5(io_getevents_time32, __u32, ctx_id,
 #ifdef CONFIG_COMPAT
 
 struct __compat_aio_sigset {
-   compat_sigset_t __user  *sigmask;
+   compat_uptr_t   sigmask;
compat_size_t   sigsetsize;
 };
 
@@ -2204,7 +2204,7 @@ COMPAT_SYSCALL_DEFINE6(io_pgetevents,
if (usig && copy_from_user(&ksig, usig, sizeof(ksig)))
return -EFAULT;
 
-   ret = set_compat_user_sigmask(ksig.sigmask, ksig.sigsetsize);
+   ret = set_compat_user_sigmask(compat_ptr(ksig.sigmask), ksig.sigsetsize);
if (ret)
return ret;
 
@@ -2239,7 +2239,7 @@ COMPAT_SYSCALL_DEFINE6(io_pgetevents_time64,
if (usig && copy_from_user(&ksig, usig, sizeof(ksig)))
return -EFAULT;
 
-   ret = set_compat_user_sigmask(ksig.sigmask, ksig.sigsetsize);
+   ret = set_compat_user_sigmask(compat_ptr(ksig.sigmask), ksig.sigsetsize);
if (ret)
return ret;
 
-- 
2.23.0

Re: [v2 PATCH] RISC-V: Optimize tlb flush path.

2019-08-20 Thread Anup Patel

On Wed, Aug 21, 2019 at 7:10 AM h...@infradead.org  wrote:
>
> On Wed, Aug 21, 2019 at 09:29:22AM +0800, Alan Kao wrote:
> > IMHO, this approach should be avoided because CLINT is compatible with but
> > not mandatory in the privileged spec.  In other words, it is possible that
> > a Linux-capable RISC-V platform does not contain a CLINT component but
> > relies on some other mechanism to deal with SW/timer interrupts.
>
> Hi Alan,
>
> at this point the above is just a prototype showing the performance
> improvement if we can inject IPIs and timer interrupts directly from
> S-mode and delivered directly to S-mode.  It is based on a copy of
> the clint IPI block currently used by SiFive, qemu, Ariane and Kendryte.
>
> If the experiment works out (which I think it does), I'd like to
> define interfaces for the unix platform spec to make something like
> this available.  My current plan for that is to have one DT node
> each for the IPI registers, timer cmp and time val register each
> as MMIO regions.  This would fit the current clint block but also
> allow other register layouts.  Is that something you'd be fine with?
> If not do you have another proposal?  (note that eventually the
> discussion should move to the unix platform spec list, but now that
> I have you here we can at least brainstorm a bit).

I agree that the IPI mechanism should be standardized for RISC-V, but I
don't support the idea of mandating CLINT as part of the UNIX
platform spec. For example, the AndesTech SoC does not use CLINT;
instead it has PLMT for the per-HART timer and PLICSW for per-HART
IPIs.

IMHO, we can also think of:
RISC-V Timer Extension - For per-HART timer access to M-mode
and S-mode
RISC-V IPI Extension - HART IPI injection

Regards,
Anup

Re: kernel panic in 5.3-rc5, nfsd_reply_cache_stats_show+0x11

2019-08-20 Thread Dan Williams

On Tue, Aug 20, 2019 at 6:39 PM  wrote:
>
> Hi,
>
> Apology if there is a better channel reporting the issue, if so, please
> let me know.
>
> I just saw below regression in 5.3-rc5 kernel, but not in 5.2-rc7 or
> earlier kernels.

Is the error stable enough to bisect?

[PATCH] VMCI: Release resource if the work is already queued

2019-08-20 Thread Nadav Amit

Francois reported that VMware balloon gets stuck after a balloon reset,
when the VMCI doorbell is removed. A similar error can occur when the
balloon driver is removed with the following splat:

[ 1088.622000] INFO: task modprobe:3565 blocked for more than 120 seconds.
[ 1088.622035]   Tainted: GW 5.2.0 #4
[ 1088.622087] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1088.622205] modprobeD0  3565   1450 0x
[ 1088.622210] Call Trace:
[ 1088.622246]  __schedule+0x2a8/0x690
[ 1088.622248]  schedule+0x2d/0x90
[ 1088.622250]  schedule_timeout+0x1d3/0x2f0
[ 1088.622252]  wait_for_completion+0xba/0x140
[ 1088.622320]  ? wake_up_q+0x80/0x80
[ 1088.622370]  vmci_resource_remove+0xb9/0xc0 [vmw_vmci]
[ 1088.622373]  vmci_doorbell_destroy+0x9e/0xd0 [vmw_vmci]
[ 1088.622379]  vmballoon_vmci_cleanup+0x6e/0xf0 [vmw_balloon]
[ 1088.622381]  vmballoon_exit+0x18/0xcc8 [vmw_balloon]
[ 1088.622394]  __x64_sys_delete_module+0x146/0x280
[ 1088.622408]  do_syscall_64+0x5a/0x130
[ 1088.622410]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1088.622415] RIP: 0033:0x7f54f62791b7
[ 1088.622421] Code: Bad RIP value.
[ 1088.622421] RSP: 002b:7fff2a949008 EFLAGS: 0206 ORIG_RAX: 00b0
[ 1088.622426] RAX: ffda RBX: 55dff8b55d00 RCX: 7f54f62791b7
[ 1088.622426] RDX:  RSI: 0800 RDI: 55dff8b55d68
[ 1088.622427] RBP: 55dff8b55d00 R08: 7fff2a947fb1 R09: 
[ 1088.622427] R10: 7f54f62f5cc0 R11: 0206 R12: 55dff8b55d68
[ 1088.622428] R13: 0001 R14: 55dff8b55d68 R15: 7fff2a94a3f0

The cause for the bug is that when the "delayed" doorbell is invoked, it
takes a reference on the doorbell entry and schedules work that is
supposed to run the appropriate code and drop the doorbell entry
reference. The code ignores the fact that if the work is already queued,
it will not be scheduled to run one more time. As a result one of the
references would not be dropped. When the code waits for the reference
to get to zero, during balloon reset or module removal, it gets stuck.

Fix it. Drop the reference if schedule_work() indicates that the work is
already queued.

Note that this bug got more apparent (or apparent at all) due to
commit ce664331b248 ("vmw_balloon: VMCI_DOORBELL_SET does not check status").

Fixes: 83e2ec765be03 ("VMCI: doorbell implementation.")
Reported-by: Francois Rigault 
Cc: Jorgen Hansen 
Cc: Adit Ranadive 
Cc: Alexios Zavras 
Cc: Vishnu DASA 
Cc: sta...@vger.kernel.org
Signed-off-by: Nadav Amit 
---
 drivers/misc/vmw_vmci/vmci_doorbell.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/vmw_vmci/vmci_doorbell.c b/drivers/misc/vmw_vmci/vmci_doorbell.c
index bad89b6e0802..345addd9306d 100644
--- a/drivers/misc/vmw_vmci/vmci_doorbell.c
+++ b/drivers/misc/vmw_vmci/vmci_doorbell.c
@@ -310,7 +310,8 @@ int vmci_dbell_host_context_notify(u32 src_cid, struct vmci_handle handle)
 
 	entry = container_of(resource, struct dbell_entry, resource);
 	if (entry->run_delayed) {
-		schedule_work(&entry->work);
+		if (!schedule_work(&entry->work))
+			vmci_resource_put(resource);
 	} else {
 		entry->notify_cb(entry->client_data);
 		vmci_resource_put(resource);
@@ -361,7 +362,8 @@ static void dbell_fire_entries(u32 notify_idx)
 		    atomic_read(&dbell->active) == 1) {
 			if (dbell->run_delayed) {
 				vmci_resource_get(&dbell->resource);
-				schedule_work(&dbell->work);
+				if (!schedule_work(&dbell->work))
+					vmci_resource_put(&dbell->resource);
 			} else {
 				dbell->notify_cb(dbell->client_data);
 			}
-- 
2.19.1
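
The reference leak described in the commit message can be modelled in a few lines of single-threaded userspace C. This is only a sketch: `sketch_entry`, `sketch_schedule_work()` and the other names are hypothetical, and the only behaviour borrowed from the kernel is that scheduling work returns false when the work item is already queued.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a refcounted doorbell entry with one work item. */
struct sketch_entry {
	int refs;          /* reference count */
	bool work_pending; /* true while the work item sits in the queue */
};

/* Mimics schedule_work(): returns false if the work was already queued. */
static bool sketch_schedule_work(struct sketch_entry *e)
{
	if (e->work_pending)
		return false;
	e->work_pending = true;
	return true;
}

/* The queued work runs once and drops exactly one reference. */
static void sketch_run_work(struct sketch_entry *e)
{
	assert(e->work_pending);
	e->work_pending = false;
	e->refs--;
}

/* Doorbell fires: take a reference, queue the work, undo the get if not queued. */
static void sketch_fire(struct sketch_entry *e, bool with_fix)
{
	e->refs++;
	if (!sketch_schedule_work(e) && with_fix)
		e->refs--;
}

/* Fire twice before the work runs; return the references left beyond the base one. */
static int sketch_leaked_refs(bool with_fix)
{
	struct sketch_entry e = { .refs = 1, .work_pending = false };

	sketch_fire(&e, with_fix);
	sketch_fire(&e, with_fix);
	sketch_run_work(&e);
	return e.refs - 1;
}
```

In this model a waiter that blocks until only the base reference remains would never wake up in the unfixed case, which matches the hang in the splat above.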

Re: [PATCH 1/3] csky: Fixup arch_get_unmapped_area() implementation

2019-08-20 Thread Guo Ren

Thx Christoph,

On Wed, Aug 21, 2019 at 10:16 AM Christoph Hellwig  wrote:
>
> > +/*
> > + * We need to ensure that shared mappings are correctly aligned to
> > + * avoid aliasing issues with VIPT caches.  We need to ensure that
> > + * a specific page of an object is always mapped at a multiple of
> > + * SHMLBA bytes.
> > + *
> > + * We unconditionally provide this function for all cases.
> > + */
>
> On something unrelated: If csky has virtually indexed caches you also
> need to implement the flush_kernel_vmap_range and
> invalidate_kernel_vmap_range functions to avoid data corruption when
> doing I/O on vmalloc/vmap ranges.

I'll give another patch for this issue

-- 
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/

[PATCH v2] ACPI / PCI: fix acpi_pci_irq_enable() memory leak

2019-08-20 Thread Wenwen Wang

In acpi_pci_irq_enable(), 'entry' is allocated by kzalloc() in
acpi_pci_irq_check_entry() (invoked from acpi_pci_irq_lookup()). However,
it is not deallocated if acpi_pci_irq_valid() returns false, leading to a
memory leak. To fix this issue, free 'entry' before returning 0.

Fixes: e237a5518425 ("x86/ACPI/PCI: Recognize that Interrupt Line 255 means "not connected"")

Signed-off-by: Wenwen Wang 
---
 drivers/acpi/pci_irq.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
index d2549ae..dea8a60 100644
--- a/drivers/acpi/pci_irq.c
+++ b/drivers/acpi/pci_irq.c
@@ -449,8 +449,10 @@ int acpi_pci_irq_enable(struct pci_dev *dev)
 * No IRQ known to the ACPI subsystem - maybe the BIOS /
 * driver reported one, then use it. Exit in any case.
 */
-   if (!acpi_pci_irq_valid(dev, pin))
+   if (!acpi_pci_irq_valid(dev, pin)) {
+   kfree(entry);
return 0;
+   }
 
if (acpi_isa_register_gsi(dev))
dev_warn(&dev->dev, "PCI INT %c: no GSI\n",
-- 
2.7.4

Re: [PATCH 3/3] csky: Support kernel non-aligned access

2019-08-20 Thread Guo Ren

Thx Christoph

On Wed, Aug 21, 2019 at 10:17 AM Christoph Hellwig  wrote:
>
> On Tue, Aug 20, 2019 at 08:34:29PM +0800, guo...@kernel.org wrote:
> > From: Guo Ren 
> >
> > We prohibit non-aligned access in kernel mode, but some special NIC
> > driver needs to support kernel-state unaligned access. For example,
> > when the bus does not support unaligned access, IP header parsing
> > will cause non-aligned access and driver does not recopy the skb
> > buffer to dma for performance reasons.
> >
> > Added kernel_enable & user_enable to control unaligned access and
> > added kernel_count  & user_count for statistical unaligned access.
>
> If the NIC drivers requires this it is buggy.
Yes, you are right, but I have no control over their non-upstreamed
drivers. Every time the kernel version is updated I need to take care of
that issue for them. So I just give them a back door in arch/csky, and they
can disable it manually.

> Kernel code must
> use the get_unaligned* / put_unaligned* helpers for that.
Most Ethernet drivers use netdev_alloc_skb_ip_align() to let the
hardware deal with unaligned access, but some NICs can't. Should we
then modify the kernel's skb_ip_header parsing code to use
get_unaligned*/put_unaligned*?
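
For readers unfamiliar with the helpers under discussion, here is a userspace sketch of what an unaligned 32-bit accessor typically boils down to. The `sketch_` functions are hypothetical and are not the kernel's get_unaligned*/put_unaligned* implementation; they only show the common memcpy-based idea, which lets the compiler emit whatever access sequence is safe on the target.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Load a 32-bit value from an address that may not be 4-byte aligned. */
static uint32_t sketch_get_unaligned_u32(const void *p)
{
	uint32_t v;

	assert(p != NULL);
	memcpy(&v, p, sizeof(v));
	return v;
}

/* Store a 32-bit value to an address that may not be 4-byte aligned. */
static void sketch_put_unaligned_u32(uint32_t v, void *p)
{
	assert(p != NULL);
	memcpy(p, &v, sizeof(v));
}

/* Round-trip a value through an odd offset in a byte buffer. */
static uint32_t sketch_roundtrip(uint32_t v)
{
	unsigned char buf[8] = { 0 };

	sketch_put_unaligned_u32(v, buf + 1);
	return sketch_get_unaligned_u32(buf + 1);
}
```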

-- 
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/

RE: [PATCH v2 0/2] Simplify mtty driver and mdev core

2019-08-20 Thread Parav Pandit




> -Original Message-
> From: Alex Williamson 
> Sent: Tuesday, August 20, 2019 10:49 PM
> To: Parav Pandit 
> Cc: Jiri Pirko ; David S . Miller ;
> Kirti Wankhede ; Cornelia Huck
> ; k...@vger.kernel.org; linux-kernel@vger.kernel.org;
> cjia ; net...@vger.kernel.org
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> 
> On Tue, 20 Aug 2019 08:58:02 +
> Parav Pandit  wrote:
> 
> > + Dave.
> >
> > Hi Jiri, Dave, Alex, Kirti, Cornelia,
> >
> > Please provide your feedback on it, how shall we proceed?
> >
> > Short summary of requirements.
> > For a given mdev (mediated device [1]), there is one representor
> > netdevice and devlink port in switchdev mode (similar to SR-IOV VF),
> > And there is one netdevice for the actual mdev when mdev is probed.
> >
> > (a) representor netdev and devlink port should be able derive
> > phys_port_name(). So that representor netdev name can be built
> > deterministically across reboots.
> >
> > (b) for mdev's netdevice, mdev's device should have an attribute.
> > This attribute can be used by udev rules/systemd or something else to
> > rename netdev name deterministically.
> >
> > (c) IFNAMSIZ of 16 bytes is too small to fit whole UUID.
> > A simple grep IFNAMSIZ in stack hints hundreds of users of IFNAMSIZ in
> > drivers, uapi, netlink, boot config area and more. Changing IFNAMSIZ
> > for a mdev bus doesn't really look reasonable option to me.
> 
> How many characters do we really have to work with?  Your examples below
> prepend various characters, ex. option-1 results in ens2f0_m10 or enm10.  Do
> the extra 8 or 3 characters in these count against IFNAMSIZ?
> 
Maximum 15. The last byte is the null terminator.
Some user-configured udev rules prefix the name with the PF netdev interface. I took
such an example below, where the ens2f0 netdev name is used as a prefix.
Some prefer not to prefix.
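
As a quick illustration of the 15-character limit, the check below tests the example names from this thread against a 16-byte buffer. `SKETCH_IFNAMSIZ` and `sketch_ifname_fits()` are hypothetical names for this sketch, and the UUID used in the test is a made-up value.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_IFNAMSIZ 16 /* buffer size including the NUL terminator */

/* True if name fits in an interface-name buffer with room for the NUL. */
static bool sketch_ifname_fits(const char *name)
{
	assert(name != NULL);
	return strlen(name) < SKETCH_IFNAMSIZ;
}
```

Both "ens2f0_m10" and "enm10" fit, while a 36-character UUID string does not, which is the motivation for requirement (c).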

> > Hence, I would like to discuss below options.
> >
> > Option-1: mdev index
> > Introduce an optional mdev index/handle as u32 during mdev create
> > time. User passes mdev index/handle as input.
> >
> > phys_port_name=mIndex=m%u
> > mdev_index will be available in sysfs as mdev attribute for udev to
> > name the mdev's netdev.
> >
> > example mdev create command:
> > UUID=$(uuidgen)
> > echo $UUID index=10
> > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create
> 
> Nit, IIRC previous discussions of additional parameters used comma separators,
> ex. echo $UUID,index=10 >...
> 
Yes, ok.

> > > example netdevs:
> > repnetdev=ens2f0_m10/*ens2f0 is parent PF's netdevice */
> 
> Is the parent really relevant in the name?  
No. I just picked one udev example that prefixes the parent netdev name.
But there are users who do not prefix it.

> Tools like mdevctl are meant to
> provide persistence, creating the same mdev devices on the same parent, but
> that's simply the easiest policy decision.  We can also imagine that multiple
> parent devices might support a specified mdev type and policies factoring in
> proximity, load-balancing, power consumption, etc might be weighed such that
> we really don't want to promote userspace creating dependencies on the
> parent association.
> 
> > mdev_netdev=enm10
> >
> > Pros:
> > 1. mdevctl and any other existing tools are unaffected.
> > 2. netdev stack, ovs and other switching platforms are unaffected.
> > 3. achieves unique phys_port_name for representor netdev 4. achieves
> > unique mdev eth netdev name for the mdev using udev/systemd extension.
> > 5. Aligns well with mdev and netdev subsystem and similar to existing
> > sriov bdf's.
> 
> A user provided index seems strange to me.  It's not really an index, just a
> user specified instance number.  Presumably you have the user providing this
> because if it really were an index, then the value depends on the creation
> order and persistence is lost.  Now the user needs to both avoid uuid
> collision as well as "index" number collision.  The uuid namespace is large
> enough to mostly ignore this, but this is not.  This seems like a burden.
> 
I like the term 'instance number'; it is a much better way to say it than
index/handle.
Yes, the user needs to avoid both collisions.
UUID collisions should not occur in most cases, given the way UUIDs are generated.
So in practice the user needs to pick a unique 'instance number', similar to how
unique netdev names are picked.

The burden on the user comes from the uniqueness requirement.

> > Option-2: shorter mdev name
> > Extend mdev to have shorter mdev device name in addition to UUID.
> > such as 'foo', 'bar'.
> > Mdev will continue to have UUID.
> > phys_port_name=mdev_name
> >
> > Pros:
> > 1. All same as option-1, except mdevctl needs upgrade for newer usage.
> > It is common practice to upgrade iproute2 package along with the
> > kernel. Similar practice to be done with mdevctl.
> > 2. Newer users of mdevctl who wants to work with non_UUID names, will
> > use newer mdevctl/tools. Cons:
> > 1. Dual naming scheme of mdev might affect some of the

Re: [PATCH] ACPI / PCI: fix a memory leak bug

2019-08-20 Thread Wenwen Wang

On Mon, Aug 19, 2019 at 5:23 PM Bjorn Helgaas  wrote:
>
> The subject line should give a clue about where the leak is, e.g.,
>
>   ACPI / PCI: fix acpi_pci_irq_enable() memory leak
>
> On Thu, Aug 15, 2019 at 11:33:22PM -0500, Wenwen Wang wrote:
> > In acpi_pci_irq_enable(), 'entry' is allocated by invoking
> > acpi_pci_irq_lookup(). However, it is not deallocated if
> > acpi_pci_irq_valid() returns false, leading to a memory leak. To fix this
> > issue, free 'entry' before returning 0.
>
> I think the corresponding kzalloc() is the one in
> acpi_pci_irq_check_entry().
>
> > Signed-off-by: Wenwen Wang 
> > ---
> >  drivers/acpi/pci_irq.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
> > index d2549ae..dea8a60 100644
> > --- a/drivers/acpi/pci_irq.c
> > +++ b/drivers/acpi/pci_irq.c
> > @@ -449,8 +449,10 @@ int acpi_pci_irq_enable(struct pci_dev *dev)
> >* No IRQ known to the ACPI subsystem - maybe the BIOS /
> >* driver reported one, then use it. Exit in any case.
> >*/
> > - if (!acpi_pci_irq_valid(dev, pin))
> > + if (!acpi_pci_irq_valid(dev, pin)) {
> > + kfree(entry);
> >   return 0;
> > + }
>
> Looks like we missed this when e237a5518425 ("x86/ACPI/PCI: Recognize
> that Interrupt Line 255 means "not connected"") was merged.
>
> You could add:
>
> Fixes: e237a5518425 ("x86/ACPI/PCI: Recognize that Interrupt Line 255 means 
> "not connected"")
>
> >   if (acpi_isa_register_gsi(dev))
> >   dev_warn(>dev, "PCI INT %c: no GSI\n",
> > --
> > 2.7.4
> >

Thanks for your comments and suggestions! I will rework the patch.

Wenwen

linux-next: manual merge of the integrity tree with the security tree

2019-08-20 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the integrity tree got a conflict in:

  arch/s390/kernel/machine_kexec_file.c

between commit:

  99d5cadfde2b ("kexec_file: split KEXEC_VERIFY_SIG into KEXEC_SIG and KEXEC_SIG_FORCE")

from the security tree and commit:

  c8424e776b09 ("MODSIGN: Export module signature definitions")

from the integrity tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc arch/s390/kernel/machine_kexec_file.c
index c0f33ba49a9a,1ac9fbc6e01e..
--- a/arch/s390/kernel/machine_kexec_file.c
+++ b/arch/s390/kernel/machine_kexec_file.c
@@@ -22,29 -22,7 +22,7 @@@ const struct kexec_file_ops * const kex
NULL,
  };
  
 -#ifdef CONFIG_KEXEC_VERIFY_SIG
 +#ifdef CONFIG_KEXEC_SIG
- /*
-  * Module signature information block.
-  *
-  * The constituents of the signature section are, in order:
-  *
-  *- Signer's name
-  *- Key identifier
-  *- Signature data
-  *- Information block
-  */
- struct module_signature {
-   u8  algo;   /* Public-key crypto algorithm [0] */
-   u8  hash;   /* Digest algorithm [0] */
-   u8  id_type;/* Key identifier type [PKEY_ID_PKCS7] */
-   u8  signer_len; /* Length of signer's name [0] */
-   u8  key_id_len; /* Length of key identifier [0] */
-   u8  __pad[3];
-   __be32  sig_len;/* Length of signature data */
- };
- 
- #define PKEY_ID_PKCS7 2
- 
  int s390_verify_sig(const char *kernel, unsigned long kernel_len)
  {
const unsigned long marker_len = sizeof(MODULE_SIG_STRING) - 1;



[PATCH v3] psi: get poll_work to run when calling poll syscall next time

2019-08-20 Thread Joseph Qi

From: Jason Xing 

A user receives POLLPRI correctly only on the first call to the poll
syscall. After that, the user always fails to acquire the event signal.

Reproduce case:
1. Get the monitor code in Documentation/accounting/psi.txt
2. Run it, and wait for the event to be triggered.
3. Kill and restart the process.

The question is why we can end up with poll_scheduled = 1 but the work
not running (which would reset it to 0). And the answer is because the
scheduling side sees group->poll_kworker under RCU protection and then
schedules it, but here we cancel the work and destroy the worker. The
cancel needs to pair with resetting the poll_scheduled flag.

Signed-off-by: Jason Xing 
Reviewed-by: Caspar Zhang 
Reviewed-by: Suren Baghdasaryan 
Acked-by: Johannes Weiner 
Signed-off-by: Joseph Qi 
---
v3: Change the description as Johannes Weiner suggested.

 kernel/sched/psi.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 23fbbcc..6e52b67 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -1131,7 +1131,15 @@ static void psi_trigger_destroy(struct kref *ref)
 * deadlock while waiting for psi_poll_work to acquire trigger_lock
 */
if (kworker_to_destroy) {
+   /*
+* After the RCU grace period has expired, the worker
+* can no longer be found through group->poll_kworker.
+* But it might have been already scheduled before
+* that - deschedule it cleanly before destroying it.
+*/
kthread_cancel_delayed_work_sync(&group->poll_work);
+   atomic_set(&group->poll_scheduled, 0);
+
kthread_destroy_worker(kworker_to_destroy);
}
kfree(t);
-- 
1.8.3.1
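
The stuck-flag behaviour described in the commit message can be modelled in single-threaded userspace C. This is a sketch under assumptions: the `sketch_` names are hypothetical, and the scheduling side is assumed to refuse to queue work while `poll_scheduled` is already set, which is what the description of the bug implies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-group polling state. */
struct sketch_group {
	int poll_scheduled; /* 1 while poll work is believed to be queued */
	bool work_queued;   /* true while the work item is actually queued */
};

/* Scheduling side: only queue the work if the flag was clear. */
static bool sketch_schedule_poll(struct sketch_group *g)
{
	if (g->poll_scheduled)
		return false;
	g->poll_scheduled = 1;
	g->work_queued = true;
	return true;
}

/* Trigger teardown: cancel the pending work; the fix also clears the flag. */
static void sketch_destroy_trigger(struct sketch_group *g, bool with_fix)
{
	g->work_queued = false;
	if (with_fix)
		g->poll_scheduled = 0;
	assert(!g->work_queued);
}

/* Schedule, tear down, then report whether a new poller can schedule again. */
static bool sketch_next_poll_can_schedule(bool with_fix)
{
	struct sketch_group g = { 0, false };

	sketch_schedule_poll(&g);
	sketch_destroy_trigger(&g, with_fix);
	return sketch_schedule_poll(&g);
}
```

Without the reset, the flag stays at 1 with no work queued to clear it, so the restarted monitor in the reproduce case never gets its work scheduled.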

Re: [PATCH] mm/oom: Add oom_score_adj value to oom Killed process message

2019-08-20 Thread David Rientjes

On Tue, 20 Aug 2019, Edward Chron wrote:

> For an OOM event: print oom_score_adj value for the OOM Killed process to
> document what the oom score adjust value was at the time the process was
> OOM Killed. The adjustment value can be set by user code and it affects
> the resulting oom_score so it is used to influence kill process selection.
> 
> When eligible tasks are not printed (sysctl oom_dump_tasks = 0) printing
> this value is the only documentation of the value for the process being
> killed. Having this value on the Killed process message documents if a
> misconfiguration occurred or it can confirm that the oom_score_adj
> value applies as expected.
> 
> An example which illustrates both misconfiguration and validation that
> the oom_score_adj was applied as expected is:
> 
> Aug 14 23:00:02 testserver kernel: Out of memory: Killed process 2692
>  (systemd-udevd) total-vm:1056800kB, anon-rss:1052760kB, file-rss:4kB,
>  shmem-rss:0kB oom_score_adj:1000
> 
> The systemd-udevd is a critical system application that should have an
> oom_score_adj of -1000. Here it was misconfigured to have an adjustment
> of 1000 making it a highly favored OOM kill target process. The output
> documents both the misconfiguration and the fact that the process
> was correctly targeted by OOM due to the misconfiguration. Having
> the oom_score_adj on the Killed message ensures that it is documented.
> 
> Signed-off-by: Edward Chron 
> Acked-by: Michal Hocko 

Acked-by: David Rientjes 

vm.oom_dump_tasks is pretty useful, however, so it's curious why you 
haven't left it enabled :/

> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index eda2e2a0bdc6..c781f73b6cd6 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -884,12 +884,13 @@ static void __oom_kill_process(struct task_struct 
> *victim, const char *message)
>*/
>   do_send_sig_info(SIGKILL, SEND_SIG_PRIV, victim, PIDTYPE_TGID);
>   mark_oom_victim(victim);
> - pr_err("%s: Killed process %d (%s) total-vm:%lukB, anon-rss:%lukB, 
> file-rss:%lukB, shmem-rss:%lukB\n",
> + pr_err("%s: Killed process %d (%s) total-vm:%lukB, anon-rss:%lukB, 
> file-rss:%lukB, shmem-rss:%lukB oom_score_adj:%ld\n",
>   message, task_pid_nr(victim), victim->comm,
>   K(victim->mm->total_vm),
>   K(get_mm_counter(victim->mm, MM_ANONPAGES)),
>   K(get_mm_counter(victim->mm, MM_FILEPAGES)),
> - K(get_mm_counter(victim->mm, MM_SHMEMPAGES)));
> + K(get_mm_counter(victim->mm, MM_SHMEMPAGES)),
> + (long)victim->signal->oom_score_adj);
>   task_unlock(victim);
>  
>   /*

Nit: why not just use %hd and avoid the cast to long?
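
A small userspace sketch of the nit, assuming the field is a short as the %hd suggestion implies: both format choices produce the same text, so the cast only adds noise. The `sketch_` helpers are hypothetical and use snprintf() rather than pr_err().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a short either with %hd (no cast) or with %ld (cast to long). */
static int sketch_format_adj(char *out, size_t len, short adj, int use_hd)
{
	assert(out != NULL);
	if (use_hd)
		return snprintf(out, len, "oom_score_adj:%hd", adj);
	return snprintf(out, len, "oom_score_adj:%ld", (long)adj);
}

/* Returns 1 when both format strings yield identical output for adj. */
static int sketch_formats_agree(short adj)
{
	char with_hd[32], with_ld[32];

	sketch_format_adj(with_hd, sizeof(with_hd), adj, 1);
	sketch_format_adj(with_ld, sizeof(with_ld), adj, 0);
	return strcmp(with_hd, with_ld) == 0;
}
```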

RE: [PATCH v6 3/3] soc: fsl: add RCPM driver

2019-08-20 Thread Ran Wang

Hi Pavel,

On Wednesday, August 21, 2019 11:16, Ran Wang wrote:
> 
> The NXP's QorIQ Processors based on ARM Core have RCPM module (Run
> Control and Power Management), which performs system level tasks associated
> with power management such as wakeup source control.
> 
> This driver depends on PM wakeup source framework which help to collect wake
> information.
> 
> Signed-off-by: Ran Wang 
> ---
> Change in v6:
>   - Adjust related API usage to meet wakeup.c's update in patch 1/3.
> Change in v5:
>   - Fix v4 regression of the return value of wakeup_source_get_next()
>   didn't pass to ws in while loop.
>   - Rename wakeup_source member 'attached_dev' to 'dev'.
>   - Rename property 'fsl,#rcpm-wakeup-cells' to '#fsl,rcpm-wakeup-cells'.
>   please see
> https://lore.kernel.org/patchwork/patch/1101022/
> 
> Change in v4:
>   - Remove extra ',' in author line of rcpm.c
>   - Update usage of wakeup_source_get_next() to be less confusing to
> the reader, code logic remain the same.
> 
> Change in v3:
>   - Some whitespace adjustment.
> 
> Change in v2:
>   - Rebase Kconfig and Makefile update to latest mainline.
> 
>  drivers/soc/fsl/Kconfig  |   8 +++
>  drivers/soc/fsl/Makefile |   2 +
>  drivers/soc/fsl/rcpm.c   | 128
> +++
>  3 files changed, 138 insertions(+)
>  create mode 100644 drivers/soc/fsl/rcpm.c
> 
> diff --git a/drivers/soc/fsl/Kconfig b/drivers/soc/fsl/Kconfig
> index f9ad8ad..4918856 100644
> --- a/drivers/soc/fsl/Kconfig
> +++ b/drivers/soc/fsl/Kconfig
> @@ -40,4 +40,12 @@ config DPAA2_CONSOLE
> /dev/dpaa2_mc_console and /dev/dpaa2_aiop_console,
> which can be used to dump the Management Complex and AIOP
> firmware logs.
> +
> +config FSL_RCPM
> + bool "Freescale RCPM support"
> + depends on PM_SLEEP
> + help
> +   The NXP QorIQ Processors based on ARM Core have RCPM module
> +   (Run Control and Power Management), which performs all device-level
> +   tasks associated with power management, such as wakeup source
> control.
>  endmenu
> diff --git a/drivers/soc/fsl/Makefile b/drivers/soc/fsl/Makefile
> index 71dee8d..28c6dac 100644
> --- a/drivers/soc/fsl/Makefile
> +++ b/drivers/soc/fsl/Makefile
> @@ -6,6 +6,8 @@
>  obj-$(CONFIG_FSL_DPAA) += qbman/
>  obj-$(CONFIG_QUICC_ENGINE)   += qe/
>  obj-$(CONFIG_CPM)+= qe/
> +obj-$(CONFIG_FSL_RCPM)   += rcpm.o
>  obj-$(CONFIG_FSL_GUTS)   += guts.o
>  obj-$(CONFIG_FSL_MC_DPIO)+= dpio/
>  obj-$(CONFIG_DPAA2_CONSOLE)  += dpaa2-console.o
> +obj-y += ftm_alarm.o
> diff --git a/drivers/soc/fsl/rcpm.c b/drivers/soc/fsl/rcpm.c
> new file mode 100644
> index 000..82c0ad5
> --- /dev/null
> +++ b/drivers/soc/fsl/rcpm.c
> @@ -0,0 +1,128 @@
> +// SPDX-License-Identifier: GPL-2.0
> +//
> +// rcpm.c - Freescale QorIQ RCPM driver
> +//
> +// Copyright 2019 NXP
> +//
> +// Author: Ran Wang 
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define RCPM_WAKEUP_CELL_MAX_SIZE7
> +
> +struct rcpm {
> + unsigned intwakeup_cells;
> + void __iomem*ippdexpcr_base;
> + boollittle_endian;
> +};
> +
> +static int rcpm_pm_prepare(struct device *dev) {
> + struct device_node  *np = dev->of_node;
> + struct wakeup_source*ws;
> + struct rcpm *rcpm;
> + u32 value[RCPM_WAKEUP_CELL_MAX_SIZE + 1], tmp;
> + int i, ret, idx;
> +
> + rcpm = dev_get_drvdata(dev);
> + if (!rcpm)
> + return -EINVAL;
> +
> + /* Begin with first registered wakeup source */
> + ws = wakeup_source_get_start();

Since I have made some changes in this version, could you please take a look at
this? If it's OK with you, I would re-add 'Acked-by: Pavel Machek'

> + do {
> + /* skip object which is not attached to device */
> + if (!ws->dev)
> + continue;
> +
> + ret = device_property_read_u32_array(ws->dev,
> + "fsl,rcpm-wakeup", value, rcpm->wakeup_cells
> + 1);
> +
> + /*  Wakeup source should refer to current rcpm device */
> + if (ret || (np->phandle != value[0])) {
> + dev_info(dev, "%s doesn't refer to this rcpm\n",
> + ws->name);
> + continue;
> + }
> +
> + for (i = 0; i < rcpm->wakeup_cells; i++) {
> + /* We can only OR related bits */
> + if (value[i + 1]) {
> + if

[PATCH v2] NFSv4: Fix a memory leak bug

2019-08-20 Thread Wenwen Wang

In nfs4_try_migration(), if nfs4_begin_drain_session() fails, the
previously allocated 'page' and 'locations' are not deallocated, leading to
memory leaks. To fix this issue, go to the 'out' label to free 'page' and
'locations' before returning the error.

Signed-off-by: Wenwen Wang 
---
 fs/nfs/nfs4state.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index cad4e06..e916aba 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -2095,8 +2095,10 @@ static int nfs4_try_migration(struct nfs_server *server, const struct cred *cred
}
 
status = nfs4_begin_drain_session(clp);
-   if (status != 0)
-   return status;
+   if (status != 0) {
+   result = status;
+   goto out;
+   }
 
status = nfs4_replace_transport(server, locations);
if (status != 0) {
-- 
2.7.4
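
The early-return leak and the goto-based fix follow a common cleanup pattern, sketched here in userspace C. Everything in the block is hypothetical: a counter stands in for real allocations, `fail_drain` simulates nfs4_begin_drain_session() returning an error, and `with_fix` selects between the old early return and the new jump to the common exit.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

static int sketch_live_allocs; /* number of outstanding fake allocations */
static char sketch_pool[2];    /* backing storage for the two fake buffers */

/* Fake allocator: hands out a slot and counts it as live. */
static void *sketch_alloc(int slot)
{
	assert(slot == 0 || slot == 1);
	sketch_live_allocs++;
	return &sketch_pool[slot];
}

/* Fake free: only bookkeeping, mirrors the NULL-tolerant style of kfree(). */
static void sketch_free(void *p)
{
	if (p)
		sketch_live_allocs--;
}

static int sketch_try_migration(int fail_drain, int with_fix)
{
	int result = -ENOMEM;
	void *page = sketch_alloc(0);
	void *locations = sketch_alloc(1);

	if (!page || !locations)
		goto out;

	if (fail_drain) {
		if (!with_fix)
			return -EIO; /* old path: both buffers stay allocated */
		result = -EIO;
		goto out;
	}
	result = 0;
out:
	sketch_free(page);
	sketch_free(locations);
	return result;
}
```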

Re: [PATCH v2 2/3] x86/cpu: Add new Intel Atom CPU model name

2019-08-20 Thread Tanwar, Rahul




On 20/8/2019 10:57 PM, Peter Zijlstra wrote:

On Tue, Aug 20, 2019 at 12:48:05PM +, Luck, Tony wrote:

+#define INTEL_FAM6_ATOM_AIRMONT_NP0x75 /* Lightning Mountain */

What's _NP ?

Network Processor. But that is too narrow a descriptor. This is going to be 
used in
other areas besides networking.

I’m contemplating calling it AIRMONT2

What would describe the special sauce that warranted a new SOC? If this
thing is marketed as 'Network Processor' then I suppose we can actually
use it, esp. if we're going to see this more, like the MID thing -- that
lived for a while over multiple uarchs.

Note that for the big cores we added the NNPI thing, which was for
Neural Network Processing something.



INTEL_FAM6_ATOM_AIRMONT_NP was chosen with the recommended symbol naming
form in mind, i.e. INTEL_FAM6{OPTFAMILY}_{MICROARCH}{OPTDIFF}, where OPTDIFF
is supposed to be the market segment.

This SoC uses AMT (Admantium/Airmont) configuration which is supposed to be
a higher configuration. Looking at other existing examples, it seems that
INTEL_FAM6_ATOM_AIRMONT_PLUS is most appropriate. Would you have any
concerns with _PLUS name ? Thanks.

[PATCH 1/2] nvmem: imx: scu: support hole region check

2019-08-20 Thread Peng Fan

From: Peng Fan 

Introduce the HOLE_REGION/ECC_REGION flags and an in_hole() helper to simplify
the hole region check. ECC_REGION is introduced here in preparation for
programming support: an ECC region can only be programmed once, so writes
to it need extra care.

Signed-off-by: Peng Fan 
---
 drivers/nvmem/imx-ocotp-scu.c | 42 +-
 1 file changed, 37 insertions(+), 5 deletions(-)

diff --git a/drivers/nvmem/imx-ocotp-scu.c b/drivers/nvmem/imx-ocotp-scu.c
index d9dc482ecb2f..2f339d7432e6 100644
--- a/drivers/nvmem/imx-ocotp-scu.c
+++ b/drivers/nvmem/imx-ocotp-scu.c
@@ -18,9 +18,20 @@ enum ocotp_devtype {
IMX8QXP,
 };
 
+#define ECC_REGION BIT(0)
+#define HOLE_REGION BIT(1)
+
+struct ocotp_region {
+   u32 start;
+   u32 end;
+   u32 flag;
+};
+
 struct ocotp_devtype_data {
int devtype;
int nregs;
+   u32 num_region;
+   struct ocotp_region region[];
 };
 
 struct ocotp_priv {
@@ -37,8 +48,31 @@ struct imx_sc_msg_misc_fuse_read {
 static struct ocotp_devtype_data imx8qxp_data = {
.devtype = IMX8QXP,
.nregs = 800,
+   .num_region = 3,
+   .region = {
+   {0x10, 0x10f, ECC_REGION},
+   {0x110, 0x21F, HOLE_REGION},
+   {0x220, 0x31F, ECC_REGION},
+   },
 };
 
+static bool in_hole(void *context, u32 index)
+{
+   struct ocotp_priv *priv = context;
+   const struct ocotp_devtype_data *data = priv->data;
+   int i;
+
+   for (i = 0; i < data->num_region; i++) {
+   if (data->region[i].flag & HOLE_REGION) {
+   if ((index >= data->region[i].start) &&
+   (index <= data->region[i].end))
+   return true;
+   }
+   }
+
+   return false;
+}
+
 static int imx_sc_misc_otp_fuse_read(struct imx_sc_ipc *ipc, u32 word,
 u32 *val)
 {
@@ -85,11 +119,9 @@ static int imx_scu_ocotp_read(void *context, unsigned int offset,
buf = p;
 
for (i = index; i < (index + count); i++) {
-   if (priv->data->devtype == IMX8QXP) {
-   if ((i > 271) && (i < 544)) {
-   *buf++ = 0;
-   continue;
-   }
+   if (in_hole(context, i)) {
+   *buf++ = 0;
+   continue;
}
 
ret = imx_sc_misc_otp_fuse_read(priv->nvmem_ipc, i, buf);
-- 
2.16.4
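
The table-driven check can be tried out in userspace with the sketch below. The region values are copied from the imx8qxp_data initializer in the patch; the `sketch_` names and the standalone table are hypothetical. The last helper reproduces the open-coded range test that the patch removes, so the two can be compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ECC_REGION  (1u << 0)
#define SKETCH_HOLE_REGION (1u << 1)

struct sketch_region {
	uint32_t start;
	uint32_t end;  /* inclusive */
	uint32_t flag;
};

/* Region table with the same values as the patch's imx8qxp_data. */
static const struct sketch_region sketch_regions[] = {
	{ 0x10,  0x10f, SKETCH_ECC_REGION  },
	{ 0x110, 0x21f, SKETCH_HOLE_REGION },
	{ 0x220, 0x31f, SKETCH_ECC_REGION  },
};

/* True if index lies in any region carrying the given flag. */
static bool sketch_in_region(uint32_t index, uint32_t flag)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_regions) / sizeof(sketch_regions[0]); i++) {
		const struct sketch_region *r = &sketch_regions[i];

		assert(r->start <= r->end);
		if ((r->flag & flag) && index >= r->start && index <= r->end)
			return true;
	}
	return false;
}

/* The open-coded hole test that the patch replaces. */
static bool sketch_old_hole_check(uint32_t i)
{
	return i > 271 && i < 544;
}
```

The hole bounds 0x110 and 0x21f are 272 and 543 in decimal, so the table covers exactly the same words as the old `(i > 271) && (i < 544)` test.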

[PATCH 2/2] nvmem: imx: scu: support write

2019-08-20 Thread Peng Fan

From: Peng Fan 

Fuse programming from the non-secure world is blocked, so we can
only use an Arm Trusted Firmware SIP call to let ATF program the fuse.

Because an ECC region can only be programmed once, add a helper,
in_ecc(), to check for the ECC region.

Signed-off-by: Peng Fan 
---

The ATF patch will soon be posted to ATF community.

 drivers/nvmem/imx-ocotp-scu.c | 73 ++-
 1 file changed, 72 insertions(+), 1 deletion(-)

diff --git a/drivers/nvmem/imx-ocotp-scu.c b/drivers/nvmem/imx-ocotp-scu.c
index 2f339d7432e6..0f064f2e74a8 100644
--- a/drivers/nvmem/imx-ocotp-scu.c
+++ b/drivers/nvmem/imx-ocotp-scu.c
@@ -7,6 +7,7 @@
  * Peng Fan 
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -14,6 +15,9 @@
 #include 
 #include 
 
+#define IMX_SIP_OTP0xC20A
+#define IMX_SIP_OTP_WRITE  0x2
+
 enum ocotp_devtype {
IMX8QXP,
 };
@@ -45,6 +49,8 @@ struct imx_sc_msg_misc_fuse_read {
u32 word;
 } __packed;
 
+static DEFINE_MUTEX(scu_ocotp_mutex);
+
 static struct ocotp_devtype_data imx8qxp_data = {
.devtype = IMX8QXP,
.nregs = 800,
@@ -73,6 +79,23 @@ static bool in_hole(void *context, u32 index)
return false;
 }
 
+static bool in_ecc(void *context, u32 index)
+{
+   struct ocotp_priv *priv = context;
+   const struct ocotp_devtype_data *data = priv->data;
+   int i;
+
+   for (i = 0; i < data->num_region; i++) {
+   if (data->region[i].flag & ECC_REGION) {
+   if ((index >= data->region[i].start) &&
+   (index <= data->region[i].end))
+   return true;
+   }
+   }
+
+   return false;
+}
+
 static int imx_sc_misc_otp_fuse_read(struct imx_sc_ipc *ipc, u32 word,
 u32 *val)
 {
@@ -116,6 +139,8 @@ static int imx_scu_ocotp_read(void *context, unsigned int 
offset,
if (!p)
return -ENOMEM;
 
+   mutex_lock(&scu_ocotp_mutex);
+
buf = p;
 
for (i = index; i < (index + count); i++) {
@@ -126,6 +151,7 @@ static int imx_scu_ocotp_read(void *context, unsigned int offset,
 
ret = imx_sc_misc_otp_fuse_read(priv->nvmem_ipc, i, buf);
if (ret) {
+   mutex_unlock(&scu_ocotp_mutex);
kfree(p);
return ret;
}
@@ -134,18 +160,63 @@ static int imx_scu_ocotp_read(void *context, unsigned int offset,
 
memcpy(val, (u8 *)p + offset % 4, bytes);
 
+   mutex_unlock(&scu_ocotp_mutex);
+
kfree(p);
 
return 0;
 }
 
+static int imx_scu_ocotp_write(void *context, unsigned int offset,
+  void *val, size_t bytes)
+{
+   struct ocotp_priv *priv = context;
+   struct arm_smccc_res res;
+   u32 *buf = val;
+   u32 tmp;
+   u32 index;
+   int ret;
+
+   /* allow only writing one complete OTP word at a time */
+   if ((bytes != 4) || (offset % 4))
+   return -EINVAL;
+
+   index = offset >> 2;
+
+   if (in_hole(context, index))
+   return -EINVAL;
+
+   if (in_ecc(context, index)) {
+   pr_warn("ECC region, only program once\n");
+   mutex_lock(&scu_ocotp_mutex);
+   ret = imx_sc_misc_otp_fuse_read(priv->nvmem_ipc, index, &tmp);
+   mutex_unlock(&scu_ocotp_mutex);
+   if (ret)
+   return ret;
+   if (tmp) {
+   pr_warn("ECC region, already has value: %x\n", tmp);
+   return -EIO;
+   }
+   }
+
+   mutex_lock(&scu_ocotp_mutex);
+
+   arm_smccc_smc(IMX_SIP_OTP, IMX_SIP_OTP_WRITE, index, *buf,
+ 0, 0, 0, 0, &res);
+
+   mutex_unlock(&scu_ocotp_mutex);
+
+   return res.a0;
+}
+
 static struct nvmem_config imx_scu_ocotp_nvmem_config = {
.name = "imx-scu-ocotp",
-   .read_only = true,
+   .read_only = false,
.word_size = 4,
.stride = 1,
.owner = THIS_MODULE,
.reg_read = imx_scu_ocotp_read,
+   .reg_write = imx_scu_ocotp_write,
 };
 
 static const struct of_device_id imx_scu_ocotp_dt_ids[] = {
-- 
2.16.4
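
For readers following the write path in the patch above, here is a minimal user-space sketch of the two checks it performs before programming a fuse: the one-word alignment test and the ECC-region lookup. The struct layout, the ECC_REGION value, and the helper name write_args_ok() are assumptions made for illustration only; they are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value; the driver defines its own ECC_REGION bit. */
#define ECC_REGION 0x1u

/* Hypothetical descriptor: an inclusive [start, end] range of fuse words. */
struct region {
	uint32_t start;
	uint32_t end;
	uint32_t flag;
};

/* Return true if the fuse word index lies inside any region flagged ECC. */
static bool in_ecc(const struct region *regions, size_t num, uint32_t index)
{
	assert(regions != NULL || num == 0);

	for (size_t i = 0; i < num; i++) {
		if (!(regions[i].flag & ECC_REGION))
			continue;
		if (index >= regions[i].start && index <= regions[i].end)
			return true;
	}
	return false;
}

/*
 * Mirror the argument check in the write callback: only one complete,
 * 4-byte-aligned OTP word may be written per call.
 */
static bool write_args_ok(unsigned int offset, size_t bytes)
{
	return bytes == 4 && (offset % 4) == 0;
}
```

In the patch, a word that falls in an ECC region is read back first and the write is refused with -EIO if it already holds a non-zero value; the sketch covers only the lookup and the argument check.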

Re: [PATCH v2] (submitted) input: misc: soc_button_array: use platform_device_register_resndata()

2019-08-20 Thread Dmitry Torokhov

Hi Enrico,

On Tue, Aug 20, 2019 at 02:25:44PM +0200, Enrico Weigelt, metux IT consult wrote:
> From: Enrico Weigelt 
> 
> The registration of gpio-keys device can be written much shorter
> by using the platform_device_register_resndata() helper.
> 
> v2:
> * pass &pdev->dev to platform_device_register_resndata()
> * fixed errval on failed platform_device_register_resndata()
> 
> Signed-off-by: Enrico Weigelt 
> ---
>  drivers/input/misc/soc_button_array.c | 27 +--
>  1 file changed, 13 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/input/misc/soc_button_array.c b/drivers/input/misc/soc_button_array.c
> index 5e59f8e5..27550f9 100644
> --- a/drivers/input/misc/soc_button_array.c
> +++ b/drivers/input/misc/soc_button_array.c
> @@ -110,25 +110,24 @@ static int soc_button_lookup_gpio(struct device *dev, int acpi_index)
>   gpio_keys_pdata->nbuttons = n_buttons;
>   gpio_keys_pdata->rep = autorepeat;
>  
> - pd = platform_device_alloc("gpio-keys", PLATFORM_DEVID_AUTO);
> - if (!pd) {
> - error = -ENOMEM;
> + pd = platform_device_register_resndata(
> + &pdev->dev,
> + "gpio-keys",
> + PLATFORM_DEVID_AUTO,
> + NULL,
> + 0,
> + gpio_keys_pdata,
> + sizeof(*gpio_keys_pdata));
> +
> + error = PTR_ERR_OR_ZERO(pd);
> +
> + if (IS_ERR(pd)) {

I changed this and the PTR_ERR() to simply "error" and applied.

> + dev_err(&pdev->dev, "failed registering gpio-keys: %ld\n", PTR_ERR(pd));
>   goto err_free_mem;
>   }
>  
> - error = platform_device_add_data(pd, gpio_keys_pdata,
> -  sizeof(*gpio_keys_pdata));
> - if (error)
> - goto err_free_pdev;
> -
> - error = platform_device_add(pd);
> - if (error)
> - goto err_free_pdev;
> -
>   return pd;
>  
> -err_free_pdev:
> - platform_device_put(pd);
>  err_free_mem:
>   devm_kfree(&pdev->dev, gpio_keys_pdata);
>   return ERR_PTR(error);
> -- 
> 1.9.1
> 

Thanks.

-- 
Dmitry
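
The review comment above replaces the IS_ERR()/PTR_ERR() pair with a single "error" value computed once. A minimal user-space sketch of that pattern follows. The helpers here are simplified stand-ins written for this example, not the kernel's own definitions, and register_thing() and try_register() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified user-space stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }

/* Error pointers occupy the last MAX_ERRNO values of the address space. */
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline int PTR_ERR_OR_ZERO(const void *ptr)
{
	return IS_ERR(ptr) ? (int)PTR_ERR(ptr) : 0;
}

/* Hypothetical registration function: fails with -ENOMEM when asked to. */
static void *register_thing(bool fail)
{
	static int thing;

	return fail ? ERR_PTR(-ENOMEM) : &thing;
}

/* Compute the error code once, then branch on it and reuse it. */
static int try_register(bool fail)
{
	void *pd = register_thing(fail);
	int error = PTR_ERR_OR_ZERO(pd);

	if (error)
		return error;	/* a driver would log and unwind here */
	return 0;
}
```

Branching on the integer keeps the test and the logged value in sync, since both come from the same variable.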

[PATCH] net: Add the same IP detection for duplicate address.

2019-08-20 Thread Dongxu Liu

The network sends an ARP REQUEST packet to determine
whether there is a host with the same IP.
Windows and some other hosts may send the source IP
address instead of 0.
When IN_DEV_ORCONF(in_dev, DROP_GRATUITOUS_ARP) is enabled,
the REQUEST will be dropped.
When IN_DEV_ORCONF(in_dev, DROP_GRATUITOUS_ARP) is disabled,
this case should be added to the IP conflict handling process.

Signed-off-by: Dongxu Liu 
---
 net/ipv4/arp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c
index 05eb42f..a51c921 100644
--- a/net/ipv4/arp.c
+++ b/net/ipv4/arp.c
@@ -801,7 +801,7 @@ static int arp_process(struct net *net, struct sock *sk, struct sk_buff *skb)
GFP_ATOMIC);
 
/* Special case: IPv4 duplicate address detection packet (RFC2131) */
-   if (sip == 0) {
+   if (sip == 0 || sip == tip) {
if (arp->ar_op == htons(ARPOP_REQUEST) &&
inet_addr_type_dev_table(net, dev, tip) == RTN_LOCAL &&
!arp_ignore(in_dev, sip, tip))
-- 
2.12.3
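
To make the changed condition concrete, here is a small stand-alone sketch of the test the patch proposes: an ARP request is treated as a duplicate address probe when the sender IP is zero or equals the target IP. The function name is hypothetical, and the local-address and arp_ignore() checks that follow in arp_process() are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ARP opcode for a request, as defined by the ARP protocol. */
#define ARPOP_REQUEST 1

/*
 * Hypothetical helper. sip and tip are the sender and target IPv4
 * addresses; only equality is tested, so byte order does not matter
 * as long as both use the same one.
 */
static bool looks_like_dad_probe(uint16_t op, uint32_t sip, uint32_t tip)
{
	if (op != ARPOP_REQUEST)
		return false;

	/* Classic probe: sender 0. Variant from the patch: sender == target. */
	return sip == 0 || sip == tip;
}
```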

Re: linux-next: manual merge of the security tree with Linus' tree

2019-08-20 Thread Stephen Rothwell

Hi all,

Just adding a couple more Cc's

On Wed, 21 Aug 2019 13:01:06 +1000 Stephen Rothwell wrote:
> 
> Today's linux-next merge of the security tree got conflicts in:
> 
>   arch/s390/configs/debug_defconfig
>   arch/s390/configs/defconfig
> 
> between commit:
> 
>   3361f3193c74 ("s390: update configs")
> 
> from Linus' tree and commit:
> 
>   99d5cadfde2b ("kexec_file: split KEXEC_VERIFY_SIG into KEXEC_SIG and KEXEC_SIG_FORCE")
> 
> from the security tree.
> 
> I fixed it up (the former removed the CONFIG option updated by the latter)
> and can carry the fix as necessary. This is now fixed as far as linux-next
> is concerned, but any non trivial conflicts should be mentioned to your
> upstream maintainer when your tree is submitted for merging.  You may
> also want to consider cooperating with the maintainer of the conflicting
> tree to minimise any particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell



[PATCH v6 1/3] PM: wakeup: Add routine to help fetch wakeup source object.

2019-08-20 Thread Ran Wang

Some users might want to walk through all registered wakeup sources
and act on them. For example, an SoC PM driver might need to program
the hardware so that a specific IP block that a wakeup source depends on
is not powered down. So add this API to help walk through all registered
wakeup source objects on that list and return them one by one.

Signed-off-by: Ran Wang 
---
Change in v6:
- Add wakeup_source_get_start() and wakeup_source_get_stop() to align
with wakeup_sources_stats_seq_start/next/stop.

Change in v5:
- Update commit message, add description of walk through all wakeup
source objects.
- Add SRCU protection in function wakeup_source_get_next().
- Rename wakeup_source member 'attached_dev' to 'dev' and move it up
(before wakeirq).

Change in v4:
- None.

Change in v3:
- Adjust indentation of *attached_dev;.

Change in v2:
- None.

 drivers/base/power/wakeup.c | 39 +++
 include/linux/pm_wakeup.h   |  5 +
 2 files changed, 44 insertions(+)

diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
index ee31d4f..61bc16b 100644
--- a/drivers/base/power/wakeup.c
+++ b/drivers/base/power/wakeup.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -228,6 +229,43 @@ void wakeup_source_unregister(struct wakeup_source *ws)
 EXPORT_SYMBOL_GPL(wakeup_source_unregister);
 
 /**
+ * wakeup_source_get_start - Begin a walk on wakeup source list
+ * @srcuidx: Lock index allocated for this caller.
+ */
+struct wakeup_source *wakeup_source_get_start(int *srcuidx)
+{
+   struct list_head *ws_head = &wakeup_sources;
+
+   *srcuidx = srcu_read_lock(&wakeup_srcu);
+
+   return list_entry_rcu(ws_head->next, struct wakeup_source, entry);
+}
+EXPORT_SYMBOL_GPL(wakeup_source_get_start);
+
+/**
+ * wakeup_source_get_next - Get next wakeup source from the list
+ * @ws: Previous wakeup source object
+ */
+struct wakeup_source *wakeup_source_get_next(struct wakeup_source *ws)
+{
+   struct list_head *ws_head = &wakeup_sources;
+
+   return list_next_or_null_rcu(ws_head, &ws->entry,
+   struct wakeup_source, entry);
+}
+EXPORT_SYMBOL_GPL(wakeup_source_get_next);
+
+/**
+ * wakeup_source_get_stop - Stop a walk on wakeup source list
+ * @idx: Dedicated lock index of this caller.
+ */
+void wakeup_source_get_stop(int idx)
+{
+   srcu_read_unlock(&wakeup_srcu, idx);
+}
+EXPORT_SYMBOL_GPL(wakeup_source_get_stop);
+
+/**
  * device_wakeup_attach - Attach a wakeup source object to a device object.
  * @dev: Device to handle.
  * @ws: Wakeup source object to attach to @dev.
@@ -242,6 +280,7 @@ static int device_wakeup_attach(struct device *dev, struct wakeup_source *ws)
return -EEXIST;
}
dev->power.wakeup = ws;
+   ws->dev = dev;
if (dev->power.wakeirq)
device_wakeup_attach_irq(dev, dev->power.wakeirq);
spin_unlock_irq(&dev->power.lock);
diff --git a/include/linux/pm_wakeup.h b/include/linux/pm_wakeup.h
index 9102760..e6b47b6 100644
--- a/include/linux/pm_wakeup.h
+++ b/include/linux/pm_wakeup.h
@@ -23,6 +23,7 @@ struct wake_irq;
  * @name: Name of the wakeup source
  * @entry: Wakeup source list entry
  * @lock: Wakeup source lock
+ * @dev: The device it attached to
  * @wakeirq: Optional device specific wakeirq
  * @timer: Wakeup timer list
  * @timer_expires: Wakeup timer expiration
@@ -42,6 +43,7 @@ struct wakeup_source {
const char  *name;
struct list_head    entry;
spinlock_t  lock;
+   struct device   *dev;
struct wake_irq *wakeirq;
struct timer_list   timer;
unsigned long   timer_expires;
@@ -88,6 +90,9 @@ extern void wakeup_source_add(struct wakeup_source *ws);
 extern void wakeup_source_remove(struct wakeup_source *ws);
 extern struct wakeup_source *wakeup_source_register(const char *name);
 extern void wakeup_source_unregister(struct wakeup_source *ws);
+extern struct wakeup_source *wakeup_source_get_start(int *srcuidx);
+extern struct wakeup_source *wakeup_source_get_next(struct wakeup_source *ws);
+extern void wakeup_source_get_stop(int idx);
 extern int device_wakeup_enable(struct device *dev);
 extern int device_wakeup_disable(struct device *dev);
 extern void device_set_wakeup_capable(struct device *dev, bool capable);
-- 
2.7.4
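
The start/next/stop calling convention introduced by the patch above can be sketched in user space as follows. This is a mock under stated assumptions: struct source, the singly linked list, and the reader counter are hypothetical stand-ins for struct wakeup_source, the kernel list, and the SRCU read-side section; none of it is kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct wakeup_source. */
struct source {
	const char *name;
	const void *dev;	/* NULL when not attached to a device */
	struct source *next;
};

static struct source *source_list;	/* head of the mock list */
static int readers;			/* stands in for the SRCU read section */

/* Enter the read section and return the first entry (or NULL). */
static struct source *source_get_start(int *idx)
{
	*idx = readers++;
	return source_list;
}

/* Return the entry after s, or NULL at the end of the list. */
static struct source *source_get_next(struct source *s)
{
	return s ? s->next : NULL;
}

/* Leave the read section entered by source_get_start(). */
static void source_get_stop(int idx)
{
	(void)idx;
	readers--;
}

/* Example walker: count sources attached to a device, skipping the rest. */
static int count_attached(void)
{
	int idx, n = 0;
	struct source *s;

	for (s = source_get_start(&idx); s; s = source_get_next(s)) {
		if (!s->dev)
			continue;
		n++;
	}
	source_get_stop(idx);
	return n;
}
```

The point of the three-call shape is that every walk is bracketed by a matching start and stop, so the caller holds the read-side protection for the whole iteration.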

[PATCH v6 3/3] soc: fsl: add RCPM driver

2019-08-20 Thread Ran Wang

NXP's QorIQ processors based on ARM cores have an RCPM module
(Run Control and Power Management), which performs system-level
tasks associated with power management, such as wakeup source control.

This driver depends on the PM wakeup source framework, which helps to
collect wakeup information.

Signed-off-by: Ran Wang 
---
Change in v6:
- Adjust related API usage to meet wakeup.c's update in patch 1/3.

Change in v5:
- Fix v4 regression where the return value of wakeup_source_get_next()
was not assigned to ws in the while loop.
- Rename wakeup_source member 'attached_dev' to 'dev'.
- Rename property 'fsl,#rcpm-wakeup-cells' to '#fsl,rcpm-wakeup-cells'.
please see https://lore.kernel.org/patchwork/patch/1101022/

Change in v4:
- Remove extra ',' in author line of rcpm.c
- Update usage of wakeup_source_get_next() to be less confusing to the
reader, code logic remain the same.

Change in v3:
- Some whitespace adjustment.

Change in v2:
- Rebase Kconfig and Makefile update to latest mainline.

 drivers/soc/fsl/Kconfig  |   8 +++
 drivers/soc/fsl/Makefile |   2 +
 drivers/soc/fsl/rcpm.c   | 128 +++
 3 files changed, 138 insertions(+)
 create mode 100644 drivers/soc/fsl/rcpm.c

diff --git a/drivers/soc/fsl/Kconfig b/drivers/soc/fsl/Kconfig
index f9ad8ad..4918856 100644
--- a/drivers/soc/fsl/Kconfig
+++ b/drivers/soc/fsl/Kconfig
@@ -40,4 +40,12 @@ config DPAA2_CONSOLE
  /dev/dpaa2_mc_console and /dev/dpaa2_aiop_console,
  which can be used to dump the Management Complex and AIOP
  firmware logs.
+
+config FSL_RCPM
+   bool "Freescale RCPM support"
+   depends on PM_SLEEP
+   help
+ The NXP QorIQ Processors based on ARM Core have RCPM module
+ (Run Control and Power Management), which performs all device-level
+ tasks associated with power management, such as wakeup source control.
 endmenu
diff --git a/drivers/soc/fsl/Makefile b/drivers/soc/fsl/Makefile
index 71dee8d..28c6dac 100644
--- a/drivers/soc/fsl/Makefile
+++ b/drivers/soc/fsl/Makefile
@@ -6,6 +6,8 @@
 obj-$(CONFIG_FSL_DPAA) += qbman/
 obj-$(CONFIG_QUICC_ENGINE) += qe/
 obj-$(CONFIG_CPM)  += qe/
+obj-$(CONFIG_FSL_RCPM) += rcpm.o
 obj-$(CONFIG_FSL_GUTS) += guts.o
 obj-$(CONFIG_FSL_MC_DPIO)  += dpio/
 obj-$(CONFIG_DPAA2_CONSOLE)+= dpaa2-console.o
+obj-y += ftm_alarm.o
diff --git a/drivers/soc/fsl/rcpm.c b/drivers/soc/fsl/rcpm.c
new file mode 100644
index 000..82c0ad5
--- /dev/null
+++ b/drivers/soc/fsl/rcpm.c
@@ -0,0 +1,128 @@
+// SPDX-License-Identifier: GPL-2.0
+//
+// rcpm.c - Freescale QorIQ RCPM driver
+//
+// Copyright 2019 NXP
+//
+// Author: Ran Wang 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define RCPM_WAKEUP_CELL_MAX_SIZE  7
+
+struct rcpm {
+   unsigned int    wakeup_cells;
+   void __iomem    *ippdexpcr_base;
+   bool            little_endian;
+};
+
+static int rcpm_pm_prepare(struct device *dev)
+{
+   struct device_node  *np = dev->of_node;
+   struct wakeup_source    *ws;
+   struct rcpm *rcpm;
+   u32 value[RCPM_WAKEUP_CELL_MAX_SIZE + 1], tmp;
+   int i, ret, idx;
+
+   rcpm = dev_get_drvdata(dev);
+   if (!rcpm)
+   return -EINVAL;
+
+   /* Begin with first registered wakeup source */
+   ws = wakeup_source_get_start(&idx);
+   do {
+   /* skip object which is not attached to device */
+   if (!ws->dev)
+   continue;
+
+   ret = device_property_read_u32_array(ws->dev,
+   "fsl,rcpm-wakeup", value, rcpm->wakeup_cells + 1);
+
+   /*  Wakeup source should refer to current rcpm device */
+   if (ret || (np->phandle != value[0])) {
+   dev_info(dev, "%s doesn't refer to this rcpm\n",
+   ws->name);
+   continue;
+   }
+
+   for (i = 0; i < rcpm->wakeup_cells; i++) {
+   /* We can only OR related bits */
+   if (value[i + 1]) {
+   if (rcpm->little_endian) {
+   tmp = ioread32(rcpm->ippdexpcr_base + i * 4);
+   tmp |= value[i + 1];
+   iowrite32(tmp, rcpm->ippdexpcr_base + i * 4);
+   } else {
+   tmp = ioread32be(rcpm->ippdexpcr_base + i * 4);
+   tmp |= value[i + 1];
+   iowrite32be(tmp, rcpm->ippdexpcr_base + i * 4);
+   }
+   }
+   }
+   } while (ws =

[PATCH v6 2/3] Documentation: dt: binding: fsl: Add 'little-endian' and update Chassis define

2019-08-20 Thread Ran Wang

By default, the RCPM register block on QorIQ SoCs is big-endian. But
there are some exceptions, such as LS1088A and LS2088A, which are
little-endian. So add this optional property to help identify
them.

Actually LS2021A and other Layerscapes won't totally follow Chassis
2.1, so separate them from powerpc SoC.

Signed-off-by: Ran Wang 
Reviewed-by: Rob Herring 
---
Change in v6:
- None.

Change in v5:
- Add 'Reviewed-by: Rob Herring ' to commit message.
- Rename property 'fsl,#rcpm-wakeup-cells' to '#fsl,rcpm-wakeup-cells'.
please see https://lore.kernel.org/patchwork/patch/1101022/

Change in v4:
- Adjust indentation of 'ls1021a, ls1012a, ls1043a, ls1046a'.

Change in v3:
- None.

Change in v2:
- None.

 Documentation/devicetree/bindings/soc/fsl/rcpm.txt | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/Documentation/devicetree/bindings/soc/fsl/rcpm.txt b/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
index e284e4e..5a33619 100644
--- a/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
+++ b/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
@@ -5,7 +5,7 @@ and power management.
 
 Required properites:
   - reg : Offset and length of the register set of the RCPM block.
-  - fsl,#rcpm-wakeup-cells : The number of IPPDEXPCR register cells in the
+  - #fsl,rcpm-wakeup-cells : The number of IPPDEXPCR register cells in the
fsl,rcpm-wakeup property.
   - compatible : Must contain a chip-specific RCPM block compatible string
and (if applicable) may contain a chassis-version RCPM compatible
@@ -20,6 +20,7 @@ Required properites:
* "fsl,qoriq-rcpm-1.0": for chassis 1.0 rcpm
* "fsl,qoriq-rcpm-2.0": for chassis 2.0 rcpm
* "fsl,qoriq-rcpm-2.1": for chassis 2.1 rcpm
+   * "fsl,qoriq-rcpm-2.1+": for chassis 2.1+ rcpm
 
 All references to "1.0" and "2.0" refer to the QorIQ chassis version to
 which the chip complies.
@@ -27,14 +28,19 @@ Chassis Version Example Chips
 ------
 1.0             p4080, p5020, p5040, p2041, p3041
 2.0             t4240, b4860, b4420
-2.1             t1040, ls1021
+2.1             t1040,
+2.1+            ls1021a, ls1012a, ls1043a, ls1046a
+
+Optional properties:
+ - little-endian : RCPM register block is Little Endian. Without it RCPM
+   will be Big Endian (default case).
 
 Example:
 The RCPM node for T4240:
rcpm: global-utilities@e2000 {
compatible = "fsl,t4240-rcpm", "fsl,qoriq-rcpm-2.0";
reg = <0xe2000 0x1000>;
-   fsl,#rcpm-wakeup-cells = <2>;
+   #fsl,rcpm-wakeup-cells = <2>;
};
 
 * Freescale RCPM Wakeup Source Device Tree Bindings
@@ -44,7 +50,7 @@ can be used as a wakeup source.
 
   - fsl,rcpm-wakeup: Consists of a phandle to the rcpm node and the IPPDEXPCR
register cells. The number of IPPDEXPCR register cells is defined in
-   "fsl,#rcpm-wakeup-cells" in the rcpm node. The first register cell is
+   "#fsl,rcpm-wakeup-cells" in the rcpm node. The first register cell is
the bit mask that should be set in IPPDEXPCR0, and the second register
cell is for IPPDEXPCR1, and so on.
 
-- 
2.7.4
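
The binding text above describes "fsl,rcpm-wakeup" as a phandle followed by one bit mask per IPPDEXPCR register. A minimal sketch of how a consumer could apply such a property is given here. The function name and the plain in-memory register array are assumptions for illustration; a real driver would use MMIO accessors and honour the little-endian property, which this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * OR the wakeup bit masks from one "fsl,rcpm-wakeup" property into a mock
 * IPPDEXPCR register array. cells[0] is the phandle of the rcpm node;
 * cells[1..ncells] are the masks for IPPDEXPCR0, IPPDEXPCR1, and so on.
 * Returns 0 on success, -1 if the phandle refers to a different rcpm node.
 */
static int apply_wakeup_cells(uint32_t *ippdexpcr, uint32_t rcpm_phandle,
			      const uint32_t *cells, size_t ncells)
{
	assert(ippdexpcr != NULL && cells != NULL);

	if (cells[0] != rcpm_phandle)
		return -1;

	for (size_t i = 0; i < ncells; i++) {
		/* Bits are only ever set here, never cleared. */
		if (cells[i + 1])
			ippdexpcr[i] |= cells[i + 1];
	}
	return 0;
}
```

With "#fsl,rcpm-wakeup-cells = <2>", ncells would be 2 and the property would carry three 32-bit values in total.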

[PATCH v2 2/2] dt-bindings: arm: rockchip: remove reference to fennec board

2019-08-20 Thread Kever Yang

The rk3288 fennec board has been removed, so remove the binding document
at the same time.

Signed-off-by: Kever Yang 
---

Changes in v2: None

 Documentation/devicetree/bindings/arm/rockchip.yaml | 5 -
 1 file changed, 5 deletions(-)

diff --git a/Documentation/devicetree/bindings/arm/rockchip.yaml b/Documentation/devicetree/bindings/arm/rockchip.yaml
index 34865042f4e4..cc2f1c2d0cd0 100644
--- a/Documentation/devicetree/bindings/arm/rockchip.yaml
+++ b/Documentation/devicetree/bindings/arm/rockchip.yaml
@@ -424,11 +424,6 @@ properties:
   - rockchip,rk3288-evb-rk808
   - const: rockchip,rk3288
 
-  - description: Rockchip RK3288 Fennec
-items:
-  - const: rockchip,rk3288-fennec
-  - const: rockchip,rk3288
-
   - description: Rockchip RK3328 Evaluation board
 items:
   - const: rockchip,rk3328-evb
-- 
2.17.1

[PATCH v2] mt76: fix some checkpatch warnings

2019-08-20 Thread Ryder Lee

This fixes the following checkpatch warnings:
CHECK: Alignment should match open parenthesis
CHECK: No space is necessary after a cast

Signed-off-by: Ryder Lee 
---
Changes since v2: remove false positive checkpatch warnings.
Changes since v1: none.
---
 drivers/net/wireless/mediatek/mt76/agg-rx.c   | 23 +++---
 drivers/net/wireless/mediatek/mt76/dma.c  |  2 +-
 drivers/net/wireless/mediatek/mt76/mac80211.c | 30 +-
 drivers/net/wireless/mediatek/mt76/mt76.h |  2 +-
 .../net/wireless/mediatek/mt76/mt7615/mac.c   |  3 +-
 drivers/net/wireless/mediatek/mt76/trace.h|  9 +++---
 drivers/net/wireless/mediatek/mt76/tx.c   | 18 +--
 drivers/net/wireless/mediatek/mt76/usb.c  | 31 ++-
 .../net/wireless/mediatek/mt76/usb_trace.h| 11 ---
 drivers/net/wireless/mediatek/mt76/util.h |  4 +--
 10 files changed, 70 insertions(+), 63 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/agg-rx.c b/drivers/net/wireless/mediatek/mt76/agg-rx.c
index 2ba157f73bf1..8f3d36a15e17 100644
--- a/drivers/net/wireless/mediatek/mt76/agg-rx.c
+++ b/drivers/net/wireless/mediatek/mt76/agg-rx.c
@@ -23,8 +23,9 @@ mt76_aggr_release(struct mt76_rx_tid *tid, struct sk_buff_head *frames, int idx)
 }
 
 static void
-mt76_rx_aggr_release_frames(struct mt76_rx_tid *tid, struct sk_buff_head *frames,
-u16 head)
+mt76_rx_aggr_release_frames(struct mt76_rx_tid *tid,
+   struct sk_buff_head *frames,
+   u16 head)
 {
int idx;
 
@@ -63,15 +64,14 @@ mt76_rx_aggr_check_release(struct mt76_rx_tid *tid, struct sk_buff_head *frames)
for (idx = (tid->head + 1) % tid->size;
 idx != start && nframes;
 idx = (idx + 1) % tid->size) {
-
skb = tid->reorder_buf[idx];
if (!skb)
continue;
 
nframes--;
-   status = (struct mt76_rx_status *) skb->cb;
-   if (!time_after(jiffies, status->reorder_time +
-REORDER_TIMEOUT))
+   status = (struct mt76_rx_status *)skb->cb;
+   if (!time_after(jiffies,
+   status->reorder_time + REORDER_TIMEOUT))
continue;
 
mt76_rx_aggr_release_frames(tid, frames, status->seqno);
@@ -111,8 +111,8 @@ mt76_rx_aggr_reorder_work(struct work_struct *work)
 static void
 mt76_rx_aggr_check_ctl(struct sk_buff *skb, struct sk_buff_head *frames)
 {
-   struct mt76_rx_status *status = (struct mt76_rx_status *) skb->cb;
-   struct ieee80211_bar *bar = (struct ieee80211_bar *) skb->data;
+   struct mt76_rx_status *status = (struct mt76_rx_status *)skb->cb;
+   struct ieee80211_bar *bar = (struct ieee80211_bar *)skb->data;
struct mt76_wcid *wcid = status->wcid;
struct mt76_rx_tid *tid;
u16 seqno;
@@ -137,8 +137,8 @@ mt76_rx_aggr_check_ctl(struct sk_buff *skb, struct sk_buff_head *frames)
 
 void mt76_rx_aggr_reorder(struct sk_buff *skb, struct sk_buff_head *frames)
 {
-   struct mt76_rx_status *status = (struct mt76_rx_status *) skb->cb;
-   struct ieee80211_hdr *hdr = (struct ieee80211_hdr *) skb->data;
+   struct mt76_rx_status *status = (struct mt76_rx_status *)skb->cb;
+   struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)skb->data;
struct mt76_wcid *wcid = status->wcid;
struct ieee80211_sta *sta;
struct mt76_rx_tid *tid;
@@ -222,7 +222,8 @@ void mt76_rx_aggr_reorder(struct sk_buff *skb, struct sk_buff_head *frames)
tid->nframes++;
mt76_rx_aggr_release_head(tid, frames);
 
-   ieee80211_queue_delayed_work(tid->dev->hw, &tid->reorder_work, REORDER_TIMEOUT);
+   ieee80211_queue_delayed_work(tid->dev->hw, &tid->reorder_work,
+REORDER_TIMEOUT);
 
 out:
spin_unlock_bh(&tid->lock);
diff --git a/drivers/net/wireless/mediatek/mt76/dma.c b/drivers/net/wireless/mediatek/mt76/dma.c
index dbfd15e861e9..46f5223b4d89 100644
--- a/drivers/net/wireless/mediatek/mt76/dma.c
+++ b/drivers/net/wireless/mediatek/mt76/dma.c
@@ -493,7 +493,7 @@ mt76_dma_rx_process(struct mt76_dev *dev, struct mt76_queue *q, int budget)
skb_reserve(skb, q->buf_offset);
 
if (q == &dev->q_rx[MT_RXQ_MCU]) {
-   u32 *rxfce = (u32 *) skb->cb;
+   u32 *rxfce = (u32 *)skb->cb;
*rxfce = info;
}
 
diff --git a/drivers/net/wireless/mediatek/mt76/mac80211.c b/drivers/net/wireless/mediatek/mt76/mac80211.c
index 581415425cd6..d1075e13ecf7 100644
--- a/drivers/net/wireless/mediatek/mt76/mac80211.c
+++ b/drivers/net/wireless/mediatek/mt76/mac80211.c
@@ -491,7 +491,7 @@ struct ieee80211_sta *mt76_rx_convert(struct sk_buff *skb)
struct ieee80211_rx_status *status = IEEE80211_SKB_RXCB(skb);
struct mt76_rx_status mstat;
 
-

[PATCH v2 1/2] ARM: dts: rockchip: remove rk3288 fennec board support

2019-08-20 Thread Kever Yang

Since there is no one using this board, remove it.

Signed-off-by: Kever Yang 
---

Changes in v2:
- update document at the same time

 arch/arm/boot/dts/Makefile  |   1 -
 arch/arm/boot/dts/rk3288-fennec.dts | 347 
 2 files changed, 348 deletions(-)
 delete mode 100644 arch/arm/boot/dts/rk3288-fennec.dts

diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
index 9159fa2cea90..1437ff8fe727 100644
--- a/arch/arm/boot/dts/Makefile
+++ b/arch/arm/boot/dts/Makefile
@@ -907,7 +907,6 @@ dtb-$(CONFIG_ARCH_ROCKCHIP) += \
rk3229-evb.dtb \
rk3288-evb-act8846.dtb \
rk3288-evb-rk808.dtb \
-   rk3288-fennec.dtb \
rk3288-firefly-beta.dtb \
rk3288-firefly.dtb \
rk3288-firefly-reload.dtb \
diff --git a/arch/arm/boot/dts/rk3288-fennec.dts b/arch/arm/boot/dts/rk3288-fennec.dts
deleted file mode 100644
index 4847cf902a15..
--- a/arch/arm/boot/dts/rk3288-fennec.dts
+++ /dev/null
@@ -1,347 +0,0 @@
-// SPDX-License-Identifier: (GPL-2.0+ OR MIT)
-
-/dts-v1/;
-
-#include "rk3288.dtsi"
-
-/ {
-   model = "Rockchip RK3288 Fennec Board";
-   compatible = "rockchip,rk3288-fennec", "rockchip,rk3288";
-
-   memory@0 {
-   reg = <0x0 0x0 0x0 0x8000>;
-   device_type = "memory";
-   };
-
-   ext_gmac: external-gmac-clock {
-   compatible = "fixed-clock";
-   #clock-cells = <0>;
-   clock-frequency = <12500>;
-   clock-output-names = "ext_gmac";
-   };
-
-   vcc_sys: vsys-regulator {
-   compatible = "regulator-fixed";
-   regulator-name = "vcc_sys";
-   regulator-min-microvolt = <500>;
-   regulator-max-microvolt = <500>;
-   regulator-always-on;
-   regulator-boot-on;
-   };
-};
-
- {
-   cpu0-supply = <_cpu>;
-};
-
- {
-   bus-width = <8>;
-   cap-mmc-highspeed;
-   non-removable;
-   pinctrl-names = "default";
-   pinctrl-0 = <_clk _cmd _pwr _bus8>;
-   status = "okay";
-};
-
- {
-   assigned-clocks = < SCLK_MAC>;
-   assigned-clock-parents = <_gmac>;
-   clock_in_out = "input";
-   pinctrl-names = "default";
-   pinctrl-0 = <_pins>, <_rst>, <_pmeb>, <_int>;
-   phy-supply = <_lan>;
-   phy-mode = "rgmii";
-   snps,reset-active-low;
-   snps,reset-delays-us = <0 1 100>;
-   snps,reset-gpio = < RK_PB0 GPIO_ACTIVE_LOW>;
-   tx_delay = <0x30>;
-   rx_delay = <0x10>;
-   status = "okay";
-};
-
- {
-   mali-supply = <_gpu>;
-   status = "okay";
-};
-
- {
-   status = "okay";
-};
-
- {
-   status = "okay";
-   clock-frequency = <40>;
-
-   rk808: pmic@1b {
-   compatible = "rockchip,rk808";
-   reg = <0x1b>;
-   interrupt-parent = <>;
-   interrupts = ;
-   #clock-cells = <1>;
-   clock-output-names = "xin32k", "rk808-clkout2";
-   pinctrl-names = "default";
-   pinctrl-0 = <_int _pwroff>;
-   rockchip,system-power-controller;
-   wakeup-source;
-
-   vcc1-supply = <_sys>;
-   vcc2-supply = <_sys>;
-   vcc3-supply = <_sys>;
-   vcc4-supply = <_sys>;
-   vcc6-supply = <_sys>;
-   vcc7-supply = <_sys>;
-   vcc8-supply = <_io>;
-   vcc9-supply = <_io>;
-   vcc10-supply = <_io>;
-   vcc11-supply = <_io>;
-   vcc12-supply = <_io>;
-   vddio-supply = <_io>;
-
-   regulators {
-   vdd_cpu: DCDC_REG1 {
-   regulator-always-on;
-   regulator-boot-on;
-   regulator-min-microvolt = <75>;
-   regulator-max-microvolt = <135>;
-   regulator-name = "vdd_arm";
-   regulator-state-mem {
-   regulator-off-in-suspend;
-   };
-   };
-
-   vdd_gpu: DCDC_REG2 {
-   regulator-always-on;
-   regulator-boot-on;
-   regulator-min-microvolt = <85>;
-   regulator-max-microvolt = <125>;
-   regulator-name = "vdd_gpu";
-   regulator-state-mem {
-   regulator-on-in-suspend;
-   regulator-suspend-microvolt = <100>;
-   };
-   };
-
-   vcc_ddr: DCDC_REG3 {
-   regulator-always-on;
-   regulator-boot-on;
-

[PATCH] scsi: qla4xxx: Fix a typo in ql4_os.c

2019-08-20 Thread Masanari Iida

This patch fixes a spelling typo in a printk message.

Signed-off-by: Masanari Iida 
---
 drivers/scsi/qla4xxx/ql4_os.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c
index 8c674eca09f1..1ac18f93cf9a 100644
--- a/drivers/scsi/qla4xxx/ql4_os.c
+++ b/drivers/scsi/qla4xxx/ql4_os.c
@@ -6191,7 +6191,7 @@ static int qla4xxx_setup_boot_info(struct scsi_qla_host *ha)
 
if (ql4xdisablesysfsboot) {
ql4_printk(KERN_INFO, ha,
-  "%s: syfsboot disabled - driver will trigger login "
+  "%s: sysfsboot disabled - driver will trigger login "
   "and publish session for discovery .\n", __func__);
return QLA_SUCCESS;
}
-- 
2.23.0

[PATCH v2 4/6] staging: erofs: avoid loop in submit chains

2019-08-20 Thread Gao Xiang

As reported by erofs-utils fuzzer, 2 conditions
can happen in corrupted images, which can cause
unexpected behaviors.
 - access the same pcluster one more time;
 - access the tail end pcluster again, e.g.
            _ access again (will trigger tail merging)
           |
     1 2 3 1 2             ->   1 2 3 1
     |_ tail end of the chain    \___/ (unexpected behavior)
Let's detect and avoid them now.

Reviewed-by: Chao Yu 
Signed-off-by: Gao Xiang 
---
Hi Greg,

 It seems that you picked up [PATCH 4/6], could you replace it
 with this v2? It seems that I missed a condition here, which
 can be observed after a much longer fuzzing on corrupted
 compressed images. Or you could just drop this [PATCH 4/6]
 patch when you apply to staging-next since those patches are
 independent.

Thank you very much,
Gao Xiang

 drivers/staging/erofs/zdata.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/erofs/zdata.c b/drivers/staging/erofs/zdata.c
index 2d7aaf98f7de..5f8d3ac0e813 100644
--- a/drivers/staging/erofs/zdata.c
+++ b/drivers/staging/erofs/zdata.c
@@ -132,7 +132,7 @@ enum z_erofs_collectmode {
 struct z_erofs_collector {
struct z_erofs_pagevec_ctor vector;
 
-   struct z_erofs_pcluster *pcl;
+   struct z_erofs_pcluster *pcl, *tailpcl;
struct z_erofs_collection *cl;
struct page **compressedpages;
z_erofs_next_pcluster_t owned_head;
@@ -353,6 +353,11 @@ static struct z_erofs_collection *cllookup(struct z_erofs_collector *clt,
return NULL;
 
pcl = container_of(grp, struct z_erofs_pcluster, obj);
+   if (clt->owned_head == &pcl->next || pcl == clt->tailpcl) {
+   DBG_BUGON(1);
+   erofs_workgroup_put(grp);
+   return ERR_PTR(-EFSCORRUPTED);
+   }
 
cl = z_erofs_primarycollection(pcl);
if (unlikely(cl->pageofs != (map->m_la & ~PAGE_MASK))) {
@@ -379,7 +384,13 @@ static struct z_erofs_collection *cllookup(struct z_erofs_collector *clt,
}
}
mutex_lock(&cl->lock);
+   /* used to check tail merging loop due to corrupted images */
+   if (clt->owned_head == Z_EROFS_PCLUSTER_TAIL)
+   clt->tailpcl = pcl;
clt->mode = try_to_claim_pcluster(pcl, &clt->owned_head);
+   /* clean tailpcl if the current owned_head is Z_EROFS_PCLUSTER_TAIL */
+   if (clt->owned_head == Z_EROFS_PCLUSTER_TAIL)
+   clt->tailpcl = NULL;
clt->pcl = pcl;
clt->cl = cl;
return cl;
@@ -432,6 +443,9 @@ static struct z_erofs_collection *clregister(struct z_erofs_collector *clt,
kmem_cache_free(pcluster_cachep, pcl);
return ERR_PTR(-EAGAIN);
}
+   /* used to check tail merging loop due to corrupted images */
+   if (clt->owned_head == Z_EROFS_PCLUSTER_TAIL)
+   clt->tailpcl = pcl;
clt->owned_head = &pcl->next;
clt->pcl = pcl;
clt->cl = cl;
-- 
2.17.1

linux-next: manual merge of the security tree with Linus' tree

2019-08-20 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the security tree got a conflict in:

  arch/s390/configs/performance_defconfig

between commit:

  d1523a8f4b8b ("s390: replace defconfig with performance_defconfig")

from Linus' tree and commit:

  99d5cadfde2b ("kexec_file: split KEXEC_VERIFY_SIG into KEXEC_SIG and KEXEC_SIG_FORCE")

from the security tree.

I fixed it up (the former removed this file) and can carry the fix as
necessary. This is now fixed as far as linux-next is concerned, but any
non trivial conflicts should be mentioned to your upstream maintainer
when your tree is submitted for merging.  You may also want to consider
cooperating with the maintainer of the conflicting tree to minimise any
particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell



[PATCH] __div64_const32(): improve the generic C version

2019-08-20 Thread Nicolas Pitre

Let's rework that code to avoid large immediate values and convert some
64-bit variables to 32-bit ones when possible. This allows gcc to
produce smaller and better code. This even produces optimal code on
RISC-V.

Signed-off-by: Nicolas Pitre 

diff --git a/include/asm-generic/div64.h b/include/asm-generic/div64.h
index dc9726fdac..33358245b4 100644
--- a/include/asm-generic/div64.h
+++ b/include/asm-generic/div64.h
@@ -178,7 +178,8 @@ static inline uint64_t __arch_xprod_64(const uint64_t m, uint64_t n, bool bias)
uint32_t m_hi = m >> 32;
uint32_t n_lo = n;
uint32_t n_hi = n >> 32;
-   uint64_t res, tmp;
+   uint64_t res;
+   uint32_t res_lo, res_hi, tmp;
 
if (!bias) {
res = ((uint64_t)m_lo * n_lo) >> 32;
@@ -187,8 +188,9 @@ static inline uint64_t __arch_xprod_64(const uint64_t m, uint64_t n, bool bias)
res = (m + (uint64_t)m_lo * n_lo) >> 32;
} else {
res = m + (uint64_t)m_lo * n_lo;
-   tmp = (res < m) ? (1ULL << 32) : 0;
-   res = (res >> 32) + tmp;
+   res_lo = res >> 32;
+   res_hi = (res_lo < m_hi);
+   res = res_lo | ((uint64_t)res_hi << 32);
}
 
if (!(m & ((1ULL << 63) | (1ULL << 31)))) {
@@ -197,10 +199,12 @@ static inline uint64_t __arch_xprod_64(const uint64_t m, uint64_t n, bool bias)
res += (uint64_t)m_hi * n_lo;
res >>= 32;
} else {
-   tmp = res += (uint64_t)m_lo * n_hi;
+   res += (uint64_t)m_lo * n_hi;
+   tmp = res >> 32;
res += (uint64_t)m_hi * n_lo;
-   tmp = (res < tmp) ? (1ULL << 32) : 0;
-   res = (res >> 32) + tmp;
+   res_lo = res >> 32;
+   res_hi = (res_lo < tmp);
+   res = res_lo | ((uint64_t)res_hi << 32);
}
 
res += (uint64_t)m_hi * n_hi;
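
The carry handling changed by the hunks above can be sketched in isolation. The following is a minimal user-space sketch, not the kernel code itself: the function names (fold_carry_old, fold_carry_new, forms_agree) are hypothetical, and it assumes, as in the patched function, that m_lo is the low 32 bits of m. It only illustrates why comparing the upper 32-bit halves detects the same wraparound as the full 64-bit comparison: the product of two 32-bit values is at most (2^32 - 1)^2, so when the sum wraps, it falls below m by more than 2^32.

```c
#include <assert.h>
#include <stdint.h>

/* Old form: detect wraparound with a full 64-bit comparison. */
static uint64_t fold_carry_old(uint64_t m, uint32_t n_lo)
{
	uint32_t m_lo = m;
	uint64_t res = m + (uint64_t)m_lo * n_lo;
	uint64_t tmp = (res < m) ? (1ULL << 32) : 0;

	return (res >> 32) + tmp;
}

/* New form: compare only the upper 32-bit halves, no large immediate. */
static uint64_t fold_carry_new(uint64_t m, uint32_t n_lo)
{
	uint32_t m_lo = m;
	uint32_t m_hi = m >> 32;
	uint64_t res = m + (uint64_t)m_lo * n_lo;
	uint32_t res_lo = res >> 32;
	uint32_t res_hi = (res_lo < m_hi);

	return res_lo | ((uint64_t)res_hi << 32);
}

/* Returns 1 if both forms agree on a handful of edge-case inputs. */
static int forms_agree(void)
{
	static const uint64_t ms[] = {
		0, 1, 0xffffffffULL, 0x100000000ULL,
		0x8000000080000000ULL, 0xffffffff00000001ULL,
		0xffffffffffffffffULL,
	};
	static const uint32_t ns[] = { 0, 1, 0x80000000u, 0xffffffffu };
	unsigned int i, j;

	for (i = 0; i < sizeof(ms) / sizeof(ms[0]); i++)
		for (j = 0; j < sizeof(ns) / sizeof(ns[0]); j++)
			if (fold_carry_old(ms[i], ns[j]) !=
			    fold_carry_new(ms[i], ns[j]))
				return 0;
	return 1;
}
```

Calling forms_agree() from a small test program is enough to exercise both variants; it is a spot check on edge cases, not a proof.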

linux-next: manual merge of the security tree with Linus' tree

2019-08-20 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the security tree got a conflict in:

  security/integrity/ima/Kconfig

between commit:

  9e1e5d4372d6 ("x86/ima: fix the Kconfig dependency for IMA_ARCH_POLICY")

from Linus' tree and commit:

  99d5cadfde2b ("kexec_file: split KEXEC_VERIFY_SIG into KEXEC_SIG and 
KEXEC_SIG_FORCE")

from the security tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc security/integrity/ima/Kconfig
index 2ced99dde694,32cd25fa44a5..
--- a/security/integrity/ima/Kconfig
+++ b/security/integrity/ima/Kconfig
@@@ -160,8 -160,7 +160,8 @@@ config IMA_APPRAIS
  
  config IMA_ARCH_POLICY
  bool "Enable loading an IMA architecture specific policy"
- depends on (KEXEC_VERIFY_SIG && IMA) || IMA_APPRAISE \
 -depends on KEXEC_SIG || IMA_APPRAISE && INTEGRITY_ASYMMETRIC_KEYS
++depends on (KEXEC_SIG && IMA) || IMA_APPRAISE \
 + && INTEGRITY_ASYMMETRIC_KEYS
  default n
  help
This option enables loading an IMA architecture specific policy



Re: [PATCH] ARM: dts: rockchip: remove rk3288 fennec board support

2019-08-20 Thread Kever Yang




On 2019/8/20 at 9:56 PM, Heiko Stuebner wrote:

Hi Kever,

On Tuesday, 20 August 2019, 12:03:52 CEST, Kever Yang wrote:

Since there is no one using this board, remove it.

so just to elaborate a bit, I guess this board was internal to Rockchip,
never went to the market and therefore is obsolete without any users,
right?



Yes. Even if someone is using this board, they are not using the upstream
source code: as you can see, there is only one commit related to the board
itself, and it has never been updated. So I would like to remove it from
both the kernel and U-Boot upstream.



Also we should remove the binding  from
Documentation/devicetree/bindings/arm/rockchip.yaml as well


Will update.


Thanks,

- Kever




Heiko



Signed-off-by: Kever Yang 
---

  arch/arm/boot/dts/rk3288-fennec.dts | 347 
  1 file changed, 347 deletions(-)
  delete mode 100644 arch/arm/boot/dts/rk3288-fennec.dts

diff --git a/arch/arm/boot/dts/rk3288-fennec.dts 
b/arch/arm/boot/dts/rk3288-fennec.dts
deleted file mode 100644
index 4847cf902a15..
--- a/arch/arm/boot/dts/rk3288-fennec.dts
+++ /dev/null
@@ -1,347 +0,0 @@
-// SPDX-License-Identifier: (GPL-2.0+ OR MIT)
-
-/dts-v1/;
-
-#include "rk3288.dtsi"
-
-/ {
-   model = "Rockchip RK3288 Fennec Board";
-   compatible = "rockchip,rk3288-fennec", "rockchip,rk3288";
-
-   memory@0 {
-   reg = <0x0 0x0 0x0 0x8000>;
-   device_type = "memory";
-   };
-
-   ext_gmac: external-gmac-clock {
-   compatible = "fixed-clock";
-   #clock-cells = <0>;
-   clock-frequency = <12500>;
-   clock-output-names = "ext_gmac";
-   };
-
-   vcc_sys: vsys-regulator {
-   compatible = "regulator-fixed";
-   regulator-name = "vcc_sys";
-   regulator-min-microvolt = <500>;
-   regulator-max-microvolt = <500>;
-   regulator-always-on;
-   regulator-boot-on;
-   };
-};
-
- {
-   cpu0-supply = <_cpu>;
-};
-
- {
-   bus-width = <8>;
-   cap-mmc-highspeed;
-   non-removable;
-   pinctrl-names = "default";
-   pinctrl-0 = <_clk _cmd _pwr _bus8>;
-   status = "okay";
-};
-
- {
-   assigned-clocks = < SCLK_MAC>;
-   assigned-clock-parents = <_gmac>;
-   clock_in_out = "input";
-   pinctrl-names = "default";
-   pinctrl-0 = <_pins>, <_rst>, <_pmeb>, <_int>;
-   phy-supply = <_lan>;
-   phy-mode = "rgmii";
-   snps,reset-active-low;
-   snps,reset-delays-us = <0 1 100>;
-   snps,reset-gpio = < RK_PB0 GPIO_ACTIVE_LOW>;
-   tx_delay = <0x30>;
-   rx_delay = <0x10>;
-   status = "okay";
-};
-
- {
-   mali-supply = <_gpu>;
-   status = "okay";
-};
-
- {
-   status = "okay";
-};
-
- {
-   status = "okay";
-   clock-frequency = <40>;
-
-   rk808: pmic@1b {
-   compatible = "rockchip,rk808";
-   reg = <0x1b>;
-   interrupt-parent = <>;
-   interrupts = ;
-   #clock-cells = <1>;
-   clock-output-names = "xin32k", "rk808-clkout2";
-   pinctrl-names = "default";
-   pinctrl-0 = <_int _pwroff>;
-   rockchip,system-power-controller;
-   wakeup-source;
-
-   vcc1-supply = <_sys>;
-   vcc2-supply = <_sys>;
-   vcc3-supply = <_sys>;
-   vcc4-supply = <_sys>;
-   vcc6-supply = <_sys>;
-   vcc7-supply = <_sys>;
-   vcc8-supply = <_io>;
-   vcc9-supply = <_io>;
-   vcc10-supply = <_io>;
-   vcc11-supply = <_io>;
-   vcc12-supply = <_io>;
-   vddio-supply = <_io>;
-
-   regulators {
-   vdd_cpu: DCDC_REG1 {
-   regulator-always-on;
-   regulator-boot-on;
-   regulator-min-microvolt = <75>;
-   regulator-max-microvolt = <135>;
-   regulator-name = "vdd_arm";
-   regulator-state-mem {
-   regulator-off-in-suspend;
-   };
-   };
-
-   vdd_gpu: DCDC_REG2 {
-   regulator-always-on;
-   regulator-boot-on;
-   regulator-min-microvolt = <85>;
-   regulator-max-microvolt = <125>;
-   regulator-name = "vdd_gpu";
-   regulator-state-mem {
-   regulator-on-in-suspend;
-   regulator-suspend-microvolt = <100>;
-   };
-   };
-
-   vcc_ddr:

linux-next: manual merge of the security tree with Linus' tree

2019-08-20 Thread Stephen Rothwell

Hi all,

FIXME: Add owner of second tree to To:
   Add author(s)/SOB of conflicting commits.

Today's linux-next merge of the security tree got conflicts in:

  arch/s390/configs/debug_defconfig
  arch/s390/configs/defconfig

between commit:

  3361f3193c74 ("s390: update configs")

from Linus' tree and commit:

  99d5cadfde2b ("kexec_file: split KEXEC_VERIFY_SIG into KEXEC_SIG and 
KEXEC_SIG_FORCE")

from the security tree.

I fixed it up (the former removed the CONFIG option updated by the latter)
and can carry the fix as necessary. This is now fixed as far as linux-next
is concerned, but any non trivial conflicts should be mentioned to your
upstream maintainer when your tree is submitted for merging.  You may
also want to consider cooperating with the maintainer of the conflicting
tree to minimise any particularly complex conflicts.



-- 
Cheers,
Stephen Rothwell



Re: [PATCH 2/4] memremap: remove the dev field in struct dev_pagemap

2019-08-20 Thread Dan Williams

On Tue, Aug 20, 2019 at 6:27 AM Jason Gunthorpe  wrote:
>
> On Mon, Aug 19, 2019 at 06:44:02PM -0700, Dan Williams wrote:
> > On Sun, Aug 18, 2019 at 2:12 AM Christoph Hellwig  wrote:
> > >
> > > The dev field in struct dev_pagemap is only used to print dev_name in
> > > two places, which are at best nice to have.  Just remove the field
> > > and thus the name in those two messages.
> > >
> > > Signed-off-by: Christoph Hellwig 
> > > Reviewed-by: Ira Weiny 
> >
> > Needs the below as well.
> >
> > /me goes to check if he ever merged the fix to make the unit test
> > stuff get built by default with COMPILE_TEST [1]. Argh! Nope, didn't
> > submit it for 5.3-rc1, sorry for the thrash.
> >
> > You can otherwise add:
> >
> > Reviewed-by: Dan Williams 
> >
> > [1]: 
> > https://lore.kernel.org/lkml/156097224232.1086847.9463861924683372741.st...@dwillia2-desk3.amr.corp.intel.com/
>
> Can you get this merged? Do you want it to go with this series?

Yeah, makes some sense to let you merge it so that you can get
kbuild-robot reports about any follow-on memremap_pages() work that
may trip up the build. Otherwise let me know and I'll get it queued
with the other v5.4 libnvdimm pending bits.

[PATCH v3 5/7] mmc: Add Actions Semi Owl SoCs SD/MMC driver

2019-08-20 Thread Manivannan Sadhasivam

Add SD/MMC driver for Actions Semi Owl SoCs. This driver currently
supports standard, high speed, SDR12, SDR25 and SDR50. DDR50 mode is
supported but it is untested. There is no SDIO support for now.

Signed-off-by: Manivannan Sadhasivam 
---
 drivers/mmc/host/Kconfig   |   8 +
 drivers/mmc/host/Makefile  |   1 +
 drivers/mmc/host/owl-mmc.c | 696 +
 3 files changed, 705 insertions(+)
 create mode 100644 drivers/mmc/host/owl-mmc.c

diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
index 14d89a108edd..2c38e36953af 100644
--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -1006,3 +1006,11 @@ config MMC_SDHCI_AM654
  If you have a controller with this interface, say Y or M here.
 
  If unsure, say N.
+
+config MMC_OWL
+   tristate "Actions Semi Owl SD/MMC Host Controller support"
+   depends on HAS_DMA
+   depends on ARCH_ACTIONS || COMPILE_TEST
+   help
+ This selects support for the SD/MMC Host Controller on
+ Actions Semi Owl SoCs.
diff --git a/drivers/mmc/host/Makefile b/drivers/mmc/host/Makefile
index 73578718f119..41a0b1728389 100644
--- a/drivers/mmc/host/Makefile
+++ b/drivers/mmc/host/Makefile
@@ -73,6 +73,7 @@ obj-$(CONFIG_MMC_SUNXI)   += sunxi-mmc.o
 obj-$(CONFIG_MMC_USDHI6ROL0)   += usdhi6rol0.o
 obj-$(CONFIG_MMC_TOSHIBA_PCI)  += toshsd.o
 obj-$(CONFIG_MMC_BCM2835)  += bcm2835.o
+obj-$(CONFIG_MMC_OWL)  += owl-mmc.o
 
 obj-$(CONFIG_MMC_REALTEK_PCI)  += rtsx_pci_sdmmc.o
 obj-$(CONFIG_MMC_REALTEK_USB)  += rtsx_usb_sdmmc.o
diff --git a/drivers/mmc/host/owl-mmc.c b/drivers/mmc/host/owl-mmc.c
new file mode 100644
index ..771e3d00f1bb
--- /dev/null
+++ b/drivers/mmc/host/owl-mmc.c
@@ -0,0 +1,696 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Actions Semi Owl SoCs SD/MMC driver
+ *
+ * Copyright (c) 2014 Actions Semi Inc.
+ * Copyright (c) 2019 Manivannan Sadhasivam 
+ *
+ * TODO: SDIO support
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * SDC registers
+ */
+#define OWL_REG_SD_EN  0x
+#define OWL_REG_SD_CTL 0x0004
+#define OWL_REG_SD_STATE   0x0008
+#define OWL_REG_SD_CMD 0x000c
+#define OWL_REG_SD_ARG 0x0010
+#define OWL_REG_SD_RSPBUF0 0x0014
+#define OWL_REG_SD_RSPBUF1 0x0018
+#define OWL_REG_SD_RSPBUF2 0x001c
+#define OWL_REG_SD_RSPBUF3 0x0020
+#define OWL_REG_SD_RSPBUF4 0x0024
+#define OWL_REG_SD_DAT 0x0028
+#define OWL_REG_SD_BLK_SIZE0x002c
+#define OWL_REG_SD_BLK_NUM 0x0030
+#define OWL_REG_SD_BUF_SIZE0x0034
+
+/* SD_EN Bits */
+#define OWL_SD_EN_RANE BIT(31)
+#define OWL_SD_EN_RAN_SEED(x)  (((x) & 0x3f) << 24)
+#define OWL_SD_EN_S18ENBIT(12)
+#define OWL_SD_EN_RESE BIT(10)
+#define OWL_SD_EN_DAT1_S   BIT(9)
+#define OWL_SD_EN_CLK_SBIT(8)
+#define OWL_SD_ENABLE  BIT(7)
+#define OWL_SD_EN_BSEL BIT(6)
+#define OWL_SD_EN_SDIOEN   BIT(3)
+#define OWL_SD_EN_DDRENBIT(2)
+#define OWL_SD_EN_DATAWID(x)   (((x) & 0x3) << 0)
+
+/* SD_CTL Bits */
+#define OWL_SD_CTL_TOUTEN  BIT(31)
+#define OWL_SD_CTL_TOUTCNT(x)  (((x) & 0x7f) << 24)
+#define OWL_SD_CTL_DELAY_MSK   GENMASK(23, 16)
+#define OWL_SD_CTL_RDELAY(x)   (((x) & 0xf) << 20)
+#define OWL_SD_CTL_WDELAY(x)   (((x) & 0xf) << 16)
+#define OWL_SD_CTL_CMDLEN  BIT(13)
+#define OWL_SD_CTL_SCC BIT(12)
+#define OWL_SD_CTL_TCN(x)  (((x) & 0xf) << 8)
+#define OWL_SD_CTL_TS  BIT(7)
+#define OWL_SD_CTL_LBE BIT(6)
+#define OWL_SD_CTL_C7ENBIT(5)
+#define OWL_SD_CTL_TM(x)   (((x) & 0xf) << 0)
+
+#define OWL_SD_DELAY_LOW_CLK   0x0f
+#define OWL_SD_DELAY_MID_CLK   0x0a
+#define OWL_SD_DELAY_HIGH_CLK  0x09
+#define OWL_SD_RDELAY_DDR500x0a
+#define OWL_SD_WDELAY_DDR500x08
+
+/* SD_STATE Bits */
+#define OWL_SD_STATE_DAT1BSBIT(18)
+#define OWL_SD_STATE_SDIOB_P   BIT(17)
+#define OWL_SD_STATE_SDIOB_EN  BIT(16)
+#define OWL_SD_STATE_TOUTE BIT(15)
+#define OWL_SD_STATE_BAEP  BIT(14)
+#define OWL_SD_STATE_MEMRDYBIT(12)
+#define OWL_SD_STATE_CMDS  BIT(11)
+#define OWL_SD_STATE_DAT1ASBIT(10)
+#define OWL_SD_STATE_SDIOA_P   BIT(9)
+#define OWL_SD_STATE_SDIOA_EN  BIT(8)
+#define OWL_SD_STATE_DAT0S BIT(7)
+#define OWL_SD_STATE_TEIE  BIT(6)
+#define OWL_SD_STATE_TEI   BIT(5)
+#define OWL_SD_STATE_CLNR  BIT(4)
+#define

[PATCH v3 6/7] MAINTAINERS: Add entry for Actions Semi SD/MMC driver and binding

2019-08-20 Thread Manivannan Sadhasivam

Add MAINTAINERS entry for Actions Semi SD/MMC driver with its binding.

Signed-off-by: Manivannan Sadhasivam 
---
 MAINTAINERS | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index c31e6492b601..d13138330b97 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1375,6 +1375,7 @@ F:drivers/clk/actions/
 F: drivers/clocksource/timer-owl*
 F: drivers/dma/owl-dma.c
 F: drivers/i2c/busses/i2c-owl.c
+F: drivers/mmc/host/owl-mmc.c
 F: drivers/pinctrl/actions/*
 F: drivers/soc/actions/
 F: include/dt-bindings/power/owl-*
@@ -1383,6 +1384,7 @@ F:
Documentation/devicetree/bindings/arm/actions.yaml
 F: Documentation/devicetree/bindings/clock/actions,owl-cmu.txt
 F: Documentation/devicetree/bindings/dma/owl-dma.txt
 F: Documentation/devicetree/bindings/i2c/i2c-owl.txt
+F: Documentation/devicetree/bindings/mmc/owl-mmc.yaml
 F: Documentation/devicetree/bindings/pinctrl/actions,s900-pinctrl.txt
 F: Documentation/devicetree/bindings/power/actions,owl-sps.txt
 F: Documentation/devicetree/bindings/timer/actions,owl-timer.txt
-- 
2.17.1

[PATCH v3 7/7] arm64: configs: Enable Actions Semi platform in defconfig

2019-08-20 Thread Manivannan Sadhasivam

Since the Actions Semi platform can now boot a distro, enable it in
ARM64 defconfig.

Signed-off-by: Manivannan Sadhasivam 
---
 arch/arm64/configs/defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 0e58ef02880c..8e2d6687 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -29,6 +29,7 @@ CONFIG_BLK_DEV_INITRD=y
 CONFIG_KALLSYMS_ALL=y
 # CONFIG_COMPAT_BRK is not set
 CONFIG_PROFILING=y
+CONFIG_ARCH_ACTIONS=y
 CONFIG_ARCH_AGILEX=y
 CONFIG_ARCH_SUNXI=y
 CONFIG_ARCH_ALPINE=y
-- 
2.17.1

[PATCH v3 4/7] arm64: dts: actions: Add uSD and eMMC support for Bubblegum96

2019-08-20 Thread Manivannan Sadhasivam

Add uSD and eMMC support for Bubblegum96 board based on Actions Semi
S900 SoC. SD0 is connected to uSD slot and SD2 is connected to eMMC.
Since there is no PMIC support added yet, fixed regulator has been
used as a regulator node.

Signed-off-by: Manivannan Sadhasivam 
---
 .../boot/dts/actions/s900-bubblegum-96.dts| 62 +++
 1 file changed, 62 insertions(+)

diff --git a/arch/arm64/boot/dts/actions/s900-bubblegum-96.dts 
b/arch/arm64/boot/dts/actions/s900-bubblegum-96.dts
index 732daaa6e9d3..59291e0ea1ee 100644
--- a/arch/arm64/boot/dts/actions/s900-bubblegum-96.dts
+++ b/arch/arm64/boot/dts/actions/s900-bubblegum-96.dts
@@ -12,6 +12,9 @@
model = "Bubblegum-96";
 
aliases {
+   mmc0 = 
+   mmc1 = 
+   mmc2 = 
serial5 = 
};
 
@@ -23,6 +26,24 @@
device_type = "memory";
reg = <0x0 0x0 0x0 0x8000>;
};
+
+   /* Fixed regulator used in the absence of PMIC */
+   vcc_3v1: vcc-3v1 {
+   compatible = "regulator-fixed";
+   regulator-name = "fixed-3.1V";
+   regulator-min-microvolt = <310>;
+   regulator-max-microvolt = <310>;
+   regulator-always-on;
+   };
+
+   /* Fixed regulator used in the absence of PMIC */
+   sd_vcc: sd-vcc {
+   compatible = "regulator-fixed";
+   regulator-name = "fixed-3.1V";
+   regulator-min-microvolt = <310>;
+   regulator-max-microvolt = <310>;
+   regulator-always-on;
+   };
 };
 
  {
@@ -241,6 +262,47 @@
bias-pull-up;
};
};
+
+   mmc0_default: mmc0_default {
+   pinmux {
+   groups = "sd0_d0_mfp", "sd0_d1_mfp", "sd0_d2_d3_mfp",
+"sd0_cmd_mfp", "sd0_clk_mfp";
+   function = "sd0";
+   };
+   };
+
+   mmc2_default: mmc2_default {
+   pinmux {
+   groups = "nand0_d0_ceb3_mfp";
+   function = "sd2";
+   };
+   };
+};
+
+/* uSD */
+ {
+   status = "okay";
+   pinctrl-names = "default";
+   pinctrl-0 = <_default>;
+   no-sdio;
+   no-mmc;
+   no-1-8-v;
+   cd-gpios = < 120 GPIO_ACTIVE_LOW>;
+   bus-width = <4>;
+   vmmc-supply = <_vcc>;
+   vqmmc-supply = <_vcc>;
+};
+
+/* eMMC */
+ {
+   status = "okay";
+   pinctrl-names = "default";
+   pinctrl-0 = <_default>;
+   no-sdio;
+   no-sd;
+   non-removable;
+   bus-width = <8>;
+   vmmc-supply = <_3v1>;
 };
 
  {
-- 
2.17.1

[PATCH v3 1/7] clk: actions: Fix factor clk struct member access

2019-08-20 Thread Manivannan Sadhasivam

Since the helper "owl_factor_helper_round_rate" is shared between factor
and composite clocks, using the factor clk specific helper function
like "hw_to_owl_factor" to access its members will create issues when
called from composite clk specific code. Hence, pass the "factor_hw"
struct pointer directly instead of fetching it using factor clk specific
helpers.

This issue has been observed when a composite clock like "sd0_clk" tried
to call "owl_factor_helper_round_rate", resulting in a pointer dereference
error.

While we are at it, let's rename the "clk_val_best" function to
"owl_clk_val_best" since this is an owl SoCs specific helper.

Fixes: 4bb78fc9744a ("clk: actions: Add factor clock support")
Signed-off-by: Manivannan Sadhasivam 
Reviewed-by: Stephen Boyd 
---
 drivers/clk/actions/owl-factor.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/clk/actions/owl-factor.c b/drivers/clk/actions/owl-factor.c
index 317d4a9e112e..f15e2621fa18 100644
--- a/drivers/clk/actions/owl-factor.c
+++ b/drivers/clk/actions/owl-factor.c
@@ -64,11 +64,10 @@ static unsigned int _get_table_val(const struct 
clk_factor_table *table,
return val;
 }
 
-static int clk_val_best(struct clk_hw *hw, unsigned long rate,
+static int owl_clk_val_best(const struct owl_factor_hw *factor_hw,
+   struct clk_hw *hw, unsigned long rate,
unsigned long *best_parent_rate)
 {
-   struct owl_factor *factor = hw_to_owl_factor(hw);
-   struct owl_factor_hw *factor_hw = >factor_hw;
const struct clk_factor_table *clkt = factor_hw->table;
unsigned long parent_rate, try_parent_rate, best = 0, cur_rate;
unsigned long parent_rate_saved = *best_parent_rate;
@@ -126,7 +125,7 @@ long owl_factor_helper_round_rate(struct owl_clk_common 
*common,
const struct clk_factor_table *clkt = factor_hw->table;
unsigned int val, mul = 0, div = 1;
 
-   val = clk_val_best(>hw, rate, parent_rate);
+   val = owl_clk_val_best(factor_hw, >hw, rate, parent_rate);
_get_table_div_mul(clkt, val, , );
 
return *parent_rate * mul / div;
-- 
2.17.1
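
The problem described in the commit message above can be illustrated with a simplified user-space sketch. All names here (the demo_ structures and functions) are hypothetical stand-ins, not the driver's real definitions; the sketch only assumes that a factor clock and a composite clock embed the same hardware handle at different offsets. A container_of-style lookup is then valid for one wrapper type only, whereas a shared helper that receives the factor data pointer directly works for both.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's structures. */
struct demo_clk_hw { int id; };
struct demo_factor_hw { int table_val; };

struct demo_factor {
	struct demo_factor_hw factor_hw;
	struct demo_clk_hw hw;
};

struct demo_composite {
	int mux;			/* extra members shift the layout */
	int gate;
	struct demo_factor_hw factor_hw;
	struct demo_clk_hw hw;
};

#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Valid only when hw is really embedded in a struct demo_factor. */
static struct demo_factor *demo_hw_to_factor(struct demo_clk_hw *hw)
{
	return demo_container_of(hw, struct demo_factor, hw);
}

/*
 * Shared helper in the style of the fix: the caller passes the factor
 * data it owns, so the helper makes no assumption about the wrapper type.
 */
static int demo_val_best(const struct demo_factor_hw *factor_hw)
{
	return factor_hw->table_val;
}
```

With this shape, a factor clock would call demo_val_best(&factor->factor_hw) and a composite clock demo_val_best(&composite->factor_hw); neither path needs demo_hw_to_factor(), which would compute a wrong base address if handed a handle embedded in a composite.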

[PATCH v3 0/7] Add SD/MMC driver for Actions Semi S900 SoC

2019-08-20 Thread Manivannan Sadhasivam

Hello,

This patchset adds an SD/MMC driver for the Actions Semi S900 SoC from the
Owl family of SoCs. There are 4 SD/MMC controllers present in this SoC, but
only 2 are currently enabled for the Bubblegum96 board, to access the uSD
slot and the onboard eMMC. SDIO support is not currently implemented in
this driver.

Note: Currently, the driver uses 2 completion mechanisms to maintain
coherency between the SDC and DMA interrupts, and I know that this is not
efficient. Hence, I'd like to hear any suggestions anyone has for
reimplementing the logic.

With this driver, this patchset also fixes one clk driver issue and enables
the Actions Semi platform in ARM64 defconfig.

Thanks,
Mani

Changes in v3:

* Incorporated a review comment from Andreas on board dts patch
* Modified the MAINTAINERS entry for devicetree YAML binding

Changes in v2:

* Converted the devicetree bindings to YAML
* Misc changes to bubblegum devicetree as per the review from Andreas
* Dropped the read/write wrappers and renamed all functions to use owl-
  prefix as per the review from Ulf
* Renamed clk_val_best to owl_clk_val_best and added Reviewed-by tag
  from Stephen

Manivannan Sadhasivam (7):
  clk: actions: Fix factor clk struct member access
  dt-bindings: mmc: Add Actions Semi SD/MMC/SDIO controller binding
  arm64: dts: actions: Add MMC controller support for S900
  arm64: dts: actions: Add uSD and eMMC support for Bubblegum96
  mmc: Add Actions Semi Owl SoCs SD/MMC driver
  MAINTAINERS: Add entry for Actions Semi SD/MMC driver and binding
  arm64: configs: Enable Actions Semi platform in defconfig

 .../devicetree/bindings/mmc/owl-mmc.yaml  |  62 ++
 MAINTAINERS   |   2 +
 .../boot/dts/actions/s900-bubblegum-96.dts|  62 ++
 arch/arm64/boot/dts/actions/s900.dtsi |  45 ++
 arch/arm64/configs/defconfig  |   1 +
 drivers/clk/actions/owl-factor.c  |   7 +-
 drivers/mmc/host/Kconfig  |   8 +
 drivers/mmc/host/Makefile |   1 +
 drivers/mmc/host/owl-mmc.c| 696 ++
 9 files changed, 880 insertions(+), 4 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/mmc/owl-mmc.yaml
 create mode 100644 drivers/mmc/host/owl-mmc.c

-- 
2.17.1

[PATCH v3 2/7] dt-bindings: mmc: Add Actions Semi SD/MMC/SDIO controller binding

2019-08-20 Thread Manivannan Sadhasivam

Add devicetree YAML binding for Actions Semi Owl SoC's SD/MMC/SDIO
controller.

Signed-off-by: Manivannan Sadhasivam 
---
 .../devicetree/bindings/mmc/owl-mmc.yaml  | 62 +++
 1 file changed, 62 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/mmc/owl-mmc.yaml

diff --git a/Documentation/devicetree/bindings/mmc/owl-mmc.yaml 
b/Documentation/devicetree/bindings/mmc/owl-mmc.yaml
new file mode 100644
index ..f7eff4c43017
--- /dev/null
+++ b/Documentation/devicetree/bindings/mmc/owl-mmc.yaml
@@ -0,0 +1,62 @@
+# SPDX-License-Identifier: GPL-2.0
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/mmc/owl-mmc.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Actions Semi Owl SoCs SD/MMC/SDIO controller
+
+allOf:
+  - $ref: "mmc-controller.yaml"
+
+maintainers:
+  - Manivannan Sadhasivam 
+
+properties:
+  "#address-cells": true
+  "#size-cells": true
+
+  compatible:
+const: actions,owl-mmc
+
+  reg:
+maxItems: 1
+
+  interrupts:
+maxItems: 1
+
+  clocks:
+minItems: 1
+
+  resets:
+maxItems: 1
+
+  dmas:
+maxItems: 1
+
+  dma-names:
+const: mmc
+
+required:
+  - compatible
+  - reg
+  - interrupts
+  - clocks
+  - resets
+  - dmas
+  - dma-names
+
+examples:
+  - |
+mmc0: mmc@e033 {
+compatible = "actions,owl-mmc";
+reg = <0x0 0xe033 0x0 0x4000>;
+interrupts = <0 42 4>;
+clocks = < 56>;
+resets = < 23>;
+dmas = < 2>;
+dma-names = "mmc";
+bus-width = <4>;
+};
+
+...
-- 
2.17.1

[PATCH v3 3/7] arm64: dts: actions: Add MMC controller support for S900

2019-08-20 Thread Manivannan Sadhasivam

Add MMC controller support for Actions Semi S900 SoC. There are 4 MMC
controllers in this SoC which can be used for accessing SD/MMC/SDIO cards.

Signed-off-by: Manivannan Sadhasivam 
---
 arch/arm64/boot/dts/actions/s900.dtsi | 45 +++
 1 file changed, 45 insertions(+)

diff --git a/arch/arm64/boot/dts/actions/s900.dtsi 
b/arch/arm64/boot/dts/actions/s900.dtsi
index df3a68a3ac97..eb35cf78ab73 100644
--- a/arch/arm64/boot/dts/actions/s900.dtsi
+++ b/arch/arm64/boot/dts/actions/s900.dtsi
@@ -4,6 +4,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 
@@ -284,5 +285,49 @@
dma-requests = <46>;
clocks = < CLK_DMAC>;
};
+
+   mmc0: mmc@e033 {
+   compatible = "actions,owl-mmc";
+   reg = <0x0 0xe033 0x0 0x4000>;
+   interrupts = ;
+   clocks = < CLK_SD0>;
+   resets = < RESET_SD0>;
+   dmas = < 2>;
+   dma-names = "mmc";
+   status = "disabled";
+   };
+
+   mmc1: mmc@e0334000 {
+   compatible = "actions,owl-mmc";
+   reg = <0x0 0xe0334000 0x0 0x4000>;
+   interrupts = ;
+   clocks = < CLK_SD1>;
+   resets = < RESET_SD1>;
+   dmas = < 3>;
+   dma-names = "mmc";
+   status = "disabled";
+   };
+
+   mmc2: mmc@e0338000 {
+   compatible = "actions,owl-mmc";
+   reg = <0x0 0xe0338000 0x0 0x4000>;
+   interrupts = ;
+   clocks = < CLK_SD2>;
+   resets = < RESET_SD2>;
+   dmas = < 4>;
+   dma-names = "mmc";
+   status = "disabled";
+   };
+
+   mmc3: mmc@e033c000 {
+   compatible = "actions,owl-mmc";
+   reg = <0x0 0xe033c000 0x0 0x4000>;
+   interrupts = ;
+   clocks = < CLK_SD3>;
+   resets = < RESET_SD3>;
+   dmas = < 46>;
+   dma-names = "mmc";
+   status = "disabled";
+   };
};
 };
-- 
2.17.1

Re: [PATCH v2 4/7] arm64: dts: actions: Add uSD and eMMC support for Bubblegum96

2019-08-20 Thread Manivannan Sadhasivam

Hi Andreas,

On Wed, Aug 21, 2019 at 08:10:11AM +0530, Manivannan Sadhasivam wrote:
> Add uSD and eMMC support for Bubblegum96 board based on Actions Semi
> Owl SoC. SD0 is connected to uSD slot and SD2 is connected to eMMC.
> Since there is no PMIC support added yet, fixed regulator has been
> used as a regulator node.
> 

Just realised that I missed your review on the patch description here.
Will either modify in next iteration (if needed) or modify it while
applying.

Sorry for that!

Thanks,
Mani

> Signed-off-by: Manivannan Sadhasivam 
> ---
>  .../boot/dts/actions/s900-bubblegum-96.dts| 60 +++
>  1 file changed, 60 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/actions/s900-bubblegum-96.dts 
> b/arch/arm64/boot/dts/actions/s900-bubblegum-96.dts
> index 732daaa6e9d3..92376b71cb8f 100644
> --- a/arch/arm64/boot/dts/actions/s900-bubblegum-96.dts
> +++ b/arch/arm64/boot/dts/actions/s900-bubblegum-96.dts
> @@ -12,6 +12,9 @@
>   model = "Bubblegum-96";
>  
>   aliases {
> + mmc0 = 
> + mmc1 = 
> + mmc2 = 
>   serial5 = 
>   };
>  
> @@ -23,6 +26,22 @@
>   device_type = "memory";
>   reg = <0x0 0x0 0x0 0x8000>;
>   };
> +
> + vcc_3v1: vcc-3v1 {
> + compatible = "regulator-fixed";
> + regulator-name = "fixed-3.1V";
> + regulator-min-microvolt = <310>;
> + regulator-max-microvolt = <310>;
> + regulator-always-on;
> + };
> +
> + sd_vcc: sd-vcc {
> + compatible = "regulator-fixed";
> + regulator-name = "fixed-3.1V";
> + regulator-min-microvolt = <310>;
> + regulator-max-microvolt = <310>;
> + regulator-always-on;
> + };
>  };
>  
>   {
> @@ -241,6 +260,47 @@
>   bias-pull-up;
>   };
>   };
> +
> + mmc0_default: mmc0_default {
> + pinmux {
> + groups = "sd0_d0_mfp", "sd0_d1_mfp", "sd0_d2_d3_mfp",
> +  "sd0_cmd_mfp", "sd0_clk_mfp";
> + function = "sd0";
> + };
> + };
> +
> + mmc2_default: mmc2_default {
> + pinmux {
> + groups = "nand0_d0_ceb3_mfp";
> + function = "sd2";
> + };
> + };
> +};
> +
> +/* uSD */
> + {
> + status = "okay";
> + pinctrl-names = "default";
> + pinctrl-0 = <_default>;
> + no-sdio;
> + no-mmc;
> + no-1-8-v;
> + cd-gpios = < 120 GPIO_ACTIVE_LOW>;
> + bus-width = <4>;
> + vmmc-supply = <_vcc>;
> + vqmmc-supply = <_vcc>;
> +};
> +
> +/* eMMC */
> + {
> + status = "okay";
> + pinctrl-names = "default";
> + pinctrl-0 = <_default>;
> + no-sdio;
> + no-sd;
> + non-removable;
> + bus-width = <8>;
> + vmmc-supply = <_3v1>;
>  };
>  
>   {
> -- 
> 2.17.1
>

RE: [PATCH v2 0/2] Simplify mtty driver and mdev core

2019-08-20 Thread Parav Pandit




> -Original Message-
> From: Cornelia Huck 
> Sent: Tuesday, August 20, 2019 10:01 PM
> > > > Option-1: mdev index
> > > > Introduce an optional mdev index/handle as u32 during mdev create
> time.
> > > > User passes mdev index/handle as input.
> > > >
> > > > phys_port_name=mIndex=m%u
> > > > mdev_index will be available in sysfs as mdev attribute for udev
> > > > to name the
> > > mdev's netdev.
> > > >
> > > > example mdev create command:
> > > > UUID=$(uuidgen)
> > > > echo $UUID index=10 >
> > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create
> > > > example netdevs:
> > > > repnetdev=ens2f0_m10/*ens2f0 is parent PF's netdevice */
> > > > mdev_netdev=enm10
> > > >
> > > > Pros:
> > > > 1. mdevctl and any other existing tools are unaffected.
> > > > 2. netdev stack, ovs and other switching platforms are unaffected.
> > > > 3. achieves unique phys_port_name for representor netdev
> > > > 4. achieves unique mdev eth netdev name for the mdev using
> > > > udev/systemd extension.
> > > > 5. Aligns well with mdev and netdev subsystem and similar to
> > > > existing sriov bdf's.
> > > >
> > > > Option-2: shorter mdev name
> > > > Extend mdev to have shorter mdev device name in addition to UUID.
> > > > such as 'foo', 'bar'.
> > > > Mdev will continue to have UUID.
> 
> I fail to understand how 'uses uuid' and 'allow shorter device name'
> are supposed to play together?
> 
Each mdev will still have a UUID, as today. Instead of naming the device based
on the UUID, name it based on an explicit name given by the user.
Again, I want to repeat: this name parameter is optional.

> > > > phys_port_name=mdev_name
> > > >
> > > > Pros:
> > > > 1. All same as option-1, except mdevctl needs upgrade for newer usage.
> > > > It is common practice to upgrade iproute2 package along with the kernel.
> > > > Similar practice to be done with mdevctl.
> > > > 2. Newer users of mdevctl who wants to work with non_UUID names,
> > > > will use
> > > newer mdevctl/tools.
> > > > Cons:
> > > > 1. Dual naming scheme of mdev might affect some of the existing tools.
> > > > It's unclear how/if it actually affects.
> > > > mdevctl [2] is very recently developed and can be enhanced for
> > > > dual naming
> > > scheme.
> 
> The main problem is not tools we know about (i.e. mdevctl), but those we don't
> know about.
> 
Well, if it is not part of the distros, there is very little the kernel can do
about it.
I tried mdevctl with mdevs named using non-UUID names, and it was able to list
them.

> IOW, this (and the IFNAMESIZ change, which seems even worse) are the
> options I would not want at all.
> 
Ok.

> > > >
> > > > Option-3: mdev uuid alias
> > > > Instead of shorter mdev name or mdev index, have alpha-numeric
> > > > name
> > > alias.
> > > > Alias is an optional mdev sysfs attribute such as 'foo', 'bar'.
> > > > example mdev create command:
> > > > UUID=$(uuidgen)
> > > > echo $UUID alias=foo >
> > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create
> > > > example netdevs:
> > > > repnetdev = ens2f0_mfoo
> > > > mdev_netdev=enmfoo
> > > >
> > > > Pros:
> > > > 1. All same as option-1.
> > > > 2. Doesn't affect existing mdev naming scheme.
> > > > Cons:
> > > > 1. Index scheme of option-1 is better which can number large
> > > > number of
> > > mdevs with fewer characters, simplifying the management tool.
> > >
> > > I believe that Alex pointed out another "Cons" to all three options,
> > > which is that it forces user-space to resolve potential race
> > > conditions when creating an index or short name or alias.
> > >
> > This race condition exists for at least two subsystems that I know of, i.e.
> netdev and rdma.
> > If a device with a given name exists, subsystem returns error.
> > When user space gets error code EEXIST, and it can picks up different
> identifier(s).
> 
> If you decouple device creation and setting the alias/index, you make the 
> issue
> visible and thus much more manageable.
> 
I thought about it. It has two issues.
1. The user should be able to set this only once. Setting it repeatedly
requires changing it and notifying about the change.
2. Having the setting of an alias translate into creating a devlink port
doesn't sound correct, because if the user attempts to reset it to a different
value, that requires unregistration and re-registration.
Handling all such race conditions is not worth it.
So setting the index (I like Alex's term 'instance number' more) at instance
creation time is a lot simpler.

> >
> > > Also, what happens if `index=10` is not provided on the command-line?
> > > Does that make the device unusable for your purpose?
> > Yes, it is unusable to an extent.
> > Currently we have DEVLINK_PORT_FLAVOUR_PCI_VF in
> > include/uapi/linux/devlink.h Similar to it, we need to have
> DEVLINK_PORT_FLAVOUR_MDEV for mdev eswitch ports.
> > This port flavour needs to generate phys_port_name(). This should be user
> parameter driven.
> > Because representor netdevice name is generated based on this parameter.
>

[PATCH v2 6/7] MAINTAINERS: Add entry for Actions Semi SD/MMC driver and binding

2019-08-20 Thread Manivannan Sadhasivam

Add MAINTAINERS entry for Actions Semi SD/MMC driver with its binding.

Signed-off-by: Manivannan Sadhasivam 
---
 MAINTAINERS | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index c31e6492b601..247d5332f7b7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1375,6 +1375,7 @@ F:drivers/clk/actions/
 F: drivers/clocksource/timer-owl*
 F: drivers/dma/owl-dma.c
 F: drivers/i2c/busses/i2c-owl.c
+F: drivers/mmc/host/owl-mmc.c
 F: drivers/pinctrl/actions/*
 F: drivers/soc/actions/
 F: include/dt-bindings/power/owl-*
@@ -1383,6 +1384,7 @@ F: Documentation/devicetree/bindings/arm/actions.yaml
 F: Documentation/devicetree/bindings/clock/actions,owl-cmu.txt
 F: Documentation/devicetree/bindings/dma/owl-dma.txt
 F: Documentation/devicetree/bindings/i2c/i2c-owl.txt
+F: Documentation/devicetree/bindings/mmc/owl-mmc.yaml
 F: Documentation/devicetree/bindings/pinctrl/actions,s900-pinctrl.txt
 F: Documentation/devicetree/bindings/power/actions,owl-sps.txt
 F: Documentation/devicetree/bindings/timer/actions,owl-timer.txt
-- 
2.17.1

[PATCH v2 3/7] arm64: dts: actions: Add MMC controller support for S900

2019-08-20 Thread Manivannan Sadhasivam

Add MMC controller support for Actions Semi S900 SoC. There are 4 MMC
controllers in this SoC which can be used for accessing SD/MMC/SDIO cards.

Signed-off-by: Manivannan Sadhasivam 
---
 arch/arm64/boot/dts/actions/s900.dtsi | 45 +++
 1 file changed, 45 insertions(+)

diff --git a/arch/arm64/boot/dts/actions/s900.dtsi b/arch/arm64/boot/dts/actions/s900.dtsi
index df3a68a3ac97..eb35cf78ab73 100644
--- a/arch/arm64/boot/dts/actions/s900.dtsi
+++ b/arch/arm64/boot/dts/actions/s900.dtsi
@@ -4,6 +4,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 
@@ -284,5 +285,49 @@
dma-requests = <46>;
clocks = < CLK_DMAC>;
};
+
+   mmc0: mmc@e0330000 {
+   compatible = "actions,owl-mmc";
+   reg = <0x0 0xe0330000 0x0 0x4000>;
+   interrupts = ;
+   clocks = < CLK_SD0>;
+   resets = < RESET_SD0>;
+   dmas = < 2>;
+   dma-names = "mmc";
+   status = "disabled";
+   };
+
+   mmc1: mmc@e0334000 {
+   compatible = "actions,owl-mmc";
+   reg = <0x0 0xe0334000 0x0 0x4000>;
+   interrupts = ;
+   clocks = < CLK_SD1>;
+   resets = < RESET_SD1>;
+   dmas = < 3>;
+   dma-names = "mmc";
+   status = "disabled";
+   };
+
+   mmc2: mmc@e0338000 {
+   compatible = "actions,owl-mmc";
+   reg = <0x0 0xe0338000 0x0 0x4000>;
+   interrupts = ;
+   clocks = < CLK_SD2>;
+   resets = < RESET_SD2>;
+   dmas = < 4>;
+   dma-names = "mmc";
+   status = "disabled";
+   };
+
+   mmc3: mmc@e033c000 {
+   compatible = "actions,owl-mmc";
+   reg = <0x0 0xe033c000 0x0 0x4000>;
+   interrupts = ;
+   clocks = < CLK_SD3>;
+   resets = < RESET_SD3>;
+   dmas = < 46>;
+   dma-names = "mmc";
+   status = "disabled";
+   };
};
 };
-- 
2.17.1

[PATCH v2 7/7] arm64: configs: Enable Actions Semi platform in defconfig

2019-08-20 Thread Manivannan Sadhasivam

Since the Actions Semi platform can now boot a distro, enable it in
ARM64 defconfig.

Signed-off-by: Manivannan Sadhasivam 
---
 arch/arm64/configs/defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 0e58ef02880c..8e2d6687 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -29,6 +29,7 @@ CONFIG_BLK_DEV_INITRD=y
 CONFIG_KALLSYMS_ALL=y
 # CONFIG_COMPAT_BRK is not set
 CONFIG_PROFILING=y
+CONFIG_ARCH_ACTIONS=y
 CONFIG_ARCH_AGILEX=y
 CONFIG_ARCH_SUNXI=y
 CONFIG_ARCH_ALPINE=y
-- 
2.17.1

[PATCH v2 5/7] mmc: Add Actions Semi Owl SoCs SD/MMC driver

2019-08-20 Thread Manivannan Sadhasivam

Add SD/MMC driver for Actions Semi Owl SoCs. This driver currently
supports standard, high speed, SDR12, SDR25 and SDR50. DDR50 mode is
supported but it is untested. There is no SDIO support for now.

Signed-off-by: Manivannan Sadhasivam 
---
 drivers/mmc/host/Kconfig   |   8 +
 drivers/mmc/host/Makefile  |   1 +
 drivers/mmc/host/owl-mmc.c | 696 +
 3 files changed, 705 insertions(+)
 create mode 100644 drivers/mmc/host/owl-mmc.c

diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
index 14d89a108edd..2c38e36953af 100644
--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -1006,3 +1006,11 @@ config MMC_SDHCI_AM654
  If you have a controller with this interface, say Y or M here.
 
  If unsure, say N.
+
+config MMC_OWL
+   tristate "Actions Semi Owl SD/MMC Host Controller support"
+   depends on HAS_DMA
+   depends on ARCH_ACTIONS || COMPILE_TEST
+   help
+ This selects support for the SD/MMC Host Controller on
+ Actions Semi Owl SoCs.
diff --git a/drivers/mmc/host/Makefile b/drivers/mmc/host/Makefile
index 73578718f119..41a0b1728389 100644
--- a/drivers/mmc/host/Makefile
+++ b/drivers/mmc/host/Makefile
@@ -73,6 +73,7 @@ obj-$(CONFIG_MMC_SUNXI)   += sunxi-mmc.o
 obj-$(CONFIG_MMC_USDHI6ROL0)   += usdhi6rol0.o
 obj-$(CONFIG_MMC_TOSHIBA_PCI)  += toshsd.o
 obj-$(CONFIG_MMC_BCM2835)  += bcm2835.o
+obj-$(CONFIG_MMC_OWL)  += owl-mmc.o
 
 obj-$(CONFIG_MMC_REALTEK_PCI)  += rtsx_pci_sdmmc.o
 obj-$(CONFIG_MMC_REALTEK_USB)  += rtsx_usb_sdmmc.o
diff --git a/drivers/mmc/host/owl-mmc.c b/drivers/mmc/host/owl-mmc.c
new file mode 100644
index ..771e3d00f1bb
--- /dev/null
+++ b/drivers/mmc/host/owl-mmc.c
@@ -0,0 +1,696 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Actions Semi Owl SoCs SD/MMC driver
+ *
+ * Copyright (c) 2014 Actions Semi Inc.
+ * Copyright (c) 2019 Manivannan Sadhasivam 
+ *
+ * TODO: SDIO support
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * SDC registers
+ */
+#define OWL_REG_SD_EN          0x0000
+#define OWL_REG_SD_CTL         0x0004
+#define OWL_REG_SD_STATE       0x0008
+#define OWL_REG_SD_CMD         0x000c
+#define OWL_REG_SD_ARG         0x0010
+#define OWL_REG_SD_RSPBUF0     0x0014
+#define OWL_REG_SD_RSPBUF1     0x0018
+#define OWL_REG_SD_RSPBUF2     0x001c
+#define OWL_REG_SD_RSPBUF3     0x0020
+#define OWL_REG_SD_RSPBUF4     0x0024
+#define OWL_REG_SD_DAT         0x0028
+#define OWL_REG_SD_BLK_SIZE    0x002c
+#define OWL_REG_SD_BLK_NUM     0x0030
+#define OWL_REG_SD_BUF_SIZE    0x0034
+
+/* SD_EN Bits */
+#define OWL_SD_EN_RANE         BIT(31)
+#define OWL_SD_EN_RAN_SEED(x)  (((x) & 0x3f) << 24)
+#define OWL_SD_EN_S18EN        BIT(12)
+#define OWL_SD_EN_RESE         BIT(10)
+#define OWL_SD_EN_DAT1_S       BIT(9)
+#define OWL_SD_EN_CLK_S        BIT(8)
+#define OWL_SD_ENABLE          BIT(7)
+#define OWL_SD_EN_BSEL         BIT(6)
+#define OWL_SD_EN_SDIOEN       BIT(3)
+#define OWL_SD_EN_DDREN        BIT(2)
+#define OWL_SD_EN_DATAWID(x)   (((x) & 0x3) << 0)
+
+/* SD_CTL Bits */
+#define OWL_SD_CTL_TOUTEN      BIT(31)
+#define OWL_SD_CTL_TOUTCNT(x)  (((x) & 0x7f) << 24)
+#define OWL_SD_CTL_DELAY_MSK   GENMASK(23, 16)
+#define OWL_SD_CTL_RDELAY(x)   (((x) & 0xf) << 20)
+#define OWL_SD_CTL_WDELAY(x)   (((x) & 0xf) << 16)
+#define OWL_SD_CTL_CMDLEN      BIT(13)
+#define OWL_SD_CTL_SCC         BIT(12)
+#define OWL_SD_CTL_TCN(x)      (((x) & 0xf) << 8)
+#define OWL_SD_CTL_TS          BIT(7)
+#define OWL_SD_CTL_LBE         BIT(6)
+#define OWL_SD_CTL_C7EN        BIT(5)
+#define OWL_SD_CTL_TM(x)       (((x) & 0xf) << 0)
+
+#define OWL_SD_DELAY_LOW_CLK   0x0f
+#define OWL_SD_DELAY_MID_CLK   0x0a
+#define OWL_SD_DELAY_HIGH_CLK  0x09
+#define OWL_SD_RDELAY_DDR50    0x0a
+#define OWL_SD_WDELAY_DDR50    0x08
+
+/* SD_STATE Bits */
+#define OWL_SD_STATE_DAT1BS    BIT(18)
+#define OWL_SD_STATE_SDIOB_P   BIT(17)
+#define OWL_SD_STATE_SDIOB_EN  BIT(16)
+#define OWL_SD_STATE_TOUTE     BIT(15)
+#define OWL_SD_STATE_BAEP      BIT(14)
+#define OWL_SD_STATE_MEMRDY    BIT(12)
+#define OWL_SD_STATE_CMDS      BIT(11)
+#define OWL_SD_STATE_DAT1AS    BIT(10)
+#define OWL_SD_STATE_SDIOA_P   BIT(9)
+#define OWL_SD_STATE_SDIOA_EN  BIT(8)
+#define OWL_SD_STATE_DAT0S     BIT(7)
+#define OWL_SD_STATE_TEIE      BIT(6)
+#define OWL_SD_STATE_TEI       BIT(5)
+#define OWL_SD_STATE_CLNR      BIT(4)
+#define

[PATCH v2 2/7] dt-bindings: mmc: Add Actions Semi SD/MMC/SDIO controller binding

2019-08-20 Thread Manivannan Sadhasivam

Add devicetree YAML binding for Actions Semi Owl SoC's SD/MMC/SDIO
controller.

Signed-off-by: Manivannan Sadhasivam 
---
 .../devicetree/bindings/mmc/owl-mmc.yaml  | 62 +++
 1 file changed, 62 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/mmc/owl-mmc.yaml

diff --git a/Documentation/devicetree/bindings/mmc/owl-mmc.yaml b/Documentation/devicetree/bindings/mmc/owl-mmc.yaml
new file mode 100644
index ..f7eff4c43017
--- /dev/null
+++ b/Documentation/devicetree/bindings/mmc/owl-mmc.yaml
@@ -0,0 +1,62 @@
+# SPDX-License-Identifier: GPL-2.0
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/mmc/owl-mmc.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Actions Semi Owl SoCs SD/MMC/SDIO controller
+
+allOf:
+  - $ref: "mmc-controller.yaml"
+
+maintainers:
+  - Manivannan Sadhasivam 
+
+properties:
+  "#address-cells": true
+  "#size-cells": true
+
+  compatible:
+    const: actions,owl-mmc
+
+  reg:
+    maxItems: 1
+
+  interrupts:
+    maxItems: 1
+
+  clocks:
+    minItems: 1
+
+  resets:
+    maxItems: 1
+
+  dmas:
+    maxItems: 1
+
+  dma-names:
+    const: mmc
+
+required:
+  - compatible
+  - reg
+  - interrupts
+  - clocks
+  - resets
+  - dmas
+  - dma-names
+
+examples:
+  - |
+    mmc0: mmc@e0330000 {
+        compatible = "actions,owl-mmc";
+        reg = <0x0 0xe0330000 0x0 0x4000>;
+        interrupts = <0 42 4>;
+        clocks = < 56>;
+        resets = < 23>;
+        dmas = < 2>;
+        dma-names = "mmc";
+        bus-width = <4>;
+    };
+
+...
-- 
2.17.1

[PATCH v2 4/7] arm64: dts: actions: Add uSD and eMMC support for Bubblegum96

2019-08-20 Thread Manivannan Sadhasivam

Add uSD and eMMC support for Bubblegum96 board based on Actions Semi
Owl SoC. SD0 is connected to uSD slot and SD2 is connected to eMMC.
Since there is no PMIC support added yet, fixed regulator has been
used as a regulator node.

Signed-off-by: Manivannan Sadhasivam 
---
 .../boot/dts/actions/s900-bubblegum-96.dts| 60 +++
 1 file changed, 60 insertions(+)

diff --git a/arch/arm64/boot/dts/actions/s900-bubblegum-96.dts b/arch/arm64/boot/dts/actions/s900-bubblegum-96.dts
index 732daaa6e9d3..92376b71cb8f 100644
--- a/arch/arm64/boot/dts/actions/s900-bubblegum-96.dts
+++ b/arch/arm64/boot/dts/actions/s900-bubblegum-96.dts
@@ -12,6 +12,9 @@
model = "Bubblegum-96";
 
aliases {
+   mmc0 = 
+   mmc1 = 
+   mmc2 = 
serial5 = 
};
 
@@ -23,6 +26,22 @@
device_type = "memory";
reg = <0x0 0x0 0x0 0x8000>;
};
+
+   vcc_3v1: vcc-3v1 {
+   compatible = "regulator-fixed";
+   regulator-name = "fixed-3.1V";
+   regulator-min-microvolt = <3100000>;
+   regulator-max-microvolt = <3100000>;
+   regulator-always-on;
+   };
+
+   sd_vcc: sd-vcc {
+   compatible = "regulator-fixed";
+   regulator-name = "fixed-3.1V";
+   regulator-min-microvolt = <3100000>;
+   regulator-max-microvolt = <3100000>;
+   regulator-always-on;
+   };
 };
 
  {
@@ -241,6 +260,47 @@
bias-pull-up;
};
};
+
+   mmc0_default: mmc0_default {
+   pinmux {
+   groups = "sd0_d0_mfp", "sd0_d1_mfp", "sd0_d2_d3_mfp",
+"sd0_cmd_mfp", "sd0_clk_mfp";
+   function = "sd0";
+   };
+   };
+
+   mmc2_default: mmc2_default {
+   pinmux {
+   groups = "nand0_d0_ceb3_mfp";
+   function = "sd2";
+   };
+   };
+};
+
+/* uSD */
+ {
+   status = "okay";
+   pinctrl-names = "default";
+   pinctrl-0 = <_default>;
+   no-sdio;
+   no-mmc;
+   no-1-8-v;
+   cd-gpios = < 120 GPIO_ACTIVE_LOW>;
+   bus-width = <4>;
+   vmmc-supply = <_vcc>;
+   vqmmc-supply = <_vcc>;
+};
+
+/* eMMC */
+ {
+   status = "okay";
+   pinctrl-names = "default";
+   pinctrl-0 = <_default>;
+   no-sdio;
+   no-sd;
+   non-removable;
+   bus-width = <8>;
+   vmmc-supply = <_3v1>;
 };
 
  {
-- 
2.17.1

[PATCH v2 1/7] clk: actions: Fix factor clk struct member access

2019-08-20 Thread Manivannan Sadhasivam

Since the helper "owl_factor_helper_round_rate" is shared between factor
and composite clocks, using the factor clk specific helper function
like "hw_to_owl_factor" to access its members will create issues when
called from composite clk specific code. Hence, pass the "factor_hw"
struct pointer directly instead of fetching it using factor clk specific
helpers.

This issue was observed when a composite clock like "sd0_clk" called
"owl_factor_helper_round_rate", resulting in an invalid pointer
dereference.

While we are at it, let's rename the "clk_val_best" function to
"owl_clk_val_best" since this is an Owl SoC specific helper.
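
For illustration only, the following userspace sketch shows why a
container-style lookup keyed on the factor clock layout goes wrong for a
composite clock. The struct layouts and the names factor_hw_from_hw(),
lookup_ok_for_factor() and lookup_ok_for_composite() are simplified
assumptions made for this example and are not the driver's definitions; only
offsets are compared, nothing is dereferenced.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; the real kernel structs have more members. */
struct clk_hw { int placeholder; };
struct owl_clk_common { struct clk_hw hw; };
struct owl_factor_hw { int table_id; };

/* Factor clock: "common" directly follows "factor_hw". */
struct owl_factor {
	struct owl_factor_hw factor_hw;
	struct owl_clk_common common;
};

/* Composite clock (hypothetical layout): extra members sit in between. */
struct owl_composite {
	struct owl_factor_hw factor_hw;
	int other_hw[4];
	struct owl_clk_common common;
};

#define HW_OFFSET_IN_FACTOR \
	(offsetof(struct owl_factor, common) + offsetof(struct owl_clk_common, hw))

/* Old pattern: derive factor_hw from hw, assuming hw lives in an owl_factor. */
const void *factor_hw_from_hw(const struct clk_hw *hw)
{
	const char *base = (const char *)hw - HW_OFFSET_IN_FACTOR;

	return base + offsetof(struct owl_factor, factor_hw);
}

static struct owl_factor demo_factor;
static struct owl_composite demo_composite;

/* Returns 1: for a real factor clock the lookup finds factor_hw. */
int lookup_ok_for_factor(void)
{
	return factor_hw_from_hw(&demo_factor.common.hw) ==
	       (const void *)&demo_factor.factor_hw;
}

/* Returns 0: for the composite layout the lookup lands on other_hw[]. */
int lookup_ok_for_composite(void)
{
	assert(offsetof(struct owl_composite, common) !=
	       offsetof(struct owl_factor, common));
	return factor_hw_from_hw(&demo_composite.common.hw) ==
	       (const void *)&demo_composite.factor_hw;
}
```

Passing the "factor_hw" pointer in from the caller, as this patch does, avoids
the layout assumption altogether.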

Fixes: 4bb78fc9744a ("clk: actions: Add factor clock support")
Signed-off-by: Manivannan Sadhasivam 
Reviewed-by: Stephen Boyd 
---
 drivers/clk/actions/owl-factor.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/clk/actions/owl-factor.c b/drivers/clk/actions/owl-factor.c
index 317d4a9e112e..f15e2621fa18 100644
--- a/drivers/clk/actions/owl-factor.c
+++ b/drivers/clk/actions/owl-factor.c
@@ -64,11 +64,10 @@ static unsigned int _get_table_val(const struct clk_factor_table *table,
return val;
 }
 
-static int clk_val_best(struct clk_hw *hw, unsigned long rate,
+static int owl_clk_val_best(const struct owl_factor_hw *factor_hw,
+   struct clk_hw *hw, unsigned long rate,
unsigned long *best_parent_rate)
 {
-   struct owl_factor *factor = hw_to_owl_factor(hw);
-   struct owl_factor_hw *factor_hw = >factor_hw;
const struct clk_factor_table *clkt = factor_hw->table;
unsigned long parent_rate, try_parent_rate, best = 0, cur_rate;
unsigned long parent_rate_saved = *best_parent_rate;
@@ -126,7 +125,7 @@ long owl_factor_helper_round_rate(struct owl_clk_common *common,
const struct clk_factor_table *clkt = factor_hw->table;
unsigned int val, mul = 0, div = 1;
 
-   val = clk_val_best(>hw, rate, parent_rate);
+   val = owl_clk_val_best(factor_hw, >hw, rate, parent_rate);
_get_table_div_mul(clkt, val, , );
 
return *parent_rate * mul / div;
-- 
2.17.1

[PATCH v2 0/7] Add SD/MMC driver for Actions Semi S900 SoC

2019-08-20 Thread Manivannan Sadhasivam

Hello,

This patchset adds an SD/MMC driver for the Actions Semi S900 SoC from the
Owl family of SoCs. There are 4 SD/MMC controllers present in this SoC, but
only 2 are currently enabled on the Bubblegum96 board, to access the uSD slot
and the onboard eMMC. SDIO support is not yet implemented in this driver.

Note: Currently, the driver uses 2 completion mechanisms to maintain
coherency between the SDC and DMA interrupts, and I know that this is not
efficient. Hence, I'd like to hear any suggestions for reimplementing
the logic.

Along with the driver, this patchset also fixes one clk driver issue and
enables the Actions Semi platform in the ARM64 defconfig.

Thanks,
Mani

Changes in v2:

* Converted the devicetree bindings to YAML
* Misc changes to bubblegum devicetree as per the review from Andreas
* Dropped the read/write wrappers and renamed all functions to use owl-
  prefix as per the review from Ulf
* Renamed clk_val_best to owl_clk_val_best and added Reviewed-by tag
  from Stephen

Manivannan Sadhasivam (7):
  clk: actions: Fix factor clk struct member access
  dt-bindings: mmc: Add Actions Semi SD/MMC/SDIO controller binding
  arm64: dts: actions: Add MMC controller support for S900
  arm64: dts: actions: Add uSD and eMMC support for Bubblegum96
  mmc: Add Actions Semi Owl SoCs SD/MMC driver
  MAINTAINERS: Add entry for Actions Semi SD/MMC driver and binding
  arm64: configs: Enable Actions Semi platform in defconfig

 .../devicetree/bindings/mmc/owl-mmc.yaml  |  62 ++
 MAINTAINERS   |   2 +
 .../boot/dts/actions/s900-bubblegum-96.dts|  60 ++
 arch/arm64/boot/dts/actions/s900.dtsi |  45 ++
 arch/arm64/configs/defconfig  |   1 +
 drivers/clk/actions/owl-factor.c  |   7 +-
 drivers/mmc/host/Kconfig  |   8 +
 drivers/mmc/host/Makefile |   1 +
 drivers/mmc/host/owl-mmc.c| 696 ++
 9 files changed, 878 insertions(+), 4 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/mmc/owl-mmc.yaml
 create mode 100644 drivers/mmc/host/owl-mmc.c

-- 
2.17.1
