Re: [PATCH 00/30] DMA: Mundane typo fixes

2021-03-29 Thread Greg KH
On Mon, Mar 29, 2021 at 11:25:11AM +0530, Bhaskar Chowdhury wrote:
> On 07:29 Mon 29 Mar 2021, Christoph Hellwig wrote:
> > I really don't think these typo patchbomb are that useful.  I'm all
> > for fixing typos when working with a subsystem, but I'm not sure these
> > patchbombs help anything.
> > 
> I am sure you are holding the wrong end of the wand and grossly failing to
> understand.

Please stop statements like this, it is not helpful and is doing nothing
but ensure that your patches will not be looked at in the future.

> Anyway, I hope I give a heads up ...find "your way" to fix those damn
> thing...it's glaring

There is no requirement that anyone accept patches that are sent to
them.  When you complain when receiving comments on them, that
shows you do not wish to work with others.

Sorry, but you are now on my local blacklist for a while, and I
encourage other maintainers to just ignore these patches as well.

thanks,

greg k-h


Re: [PATCH] [backport for 5.10] powerpc/603: Fix protection of user pages mapped with PROT_NONE

2021-03-12 Thread Greg KH
On Thu, Mar 11, 2021 at 06:24:30PM +, Christophe Leroy wrote:
> (cherry picked from commit c119565a15a628efdfa51352f9f6c5186e506a1c)
> 
> On book3s/32, page protection is defined by the PP bits in the PTE
> which provide the following protection depending on the access
> keys defined in the matching segment register:
> - PP 00 means RW with key 0 and N/A with key 1.
> - PP 01 means RW with key 0 and RO with key 1.
> - PP 10 means RW with both key 0 and key 1.
> - PP 11 means RO with both key 0 and key 1.
> 
> Since the implementation of kernel userspace access protection,
> PP bits have been set as follows:
> - PP00 for pages without _PAGE_USER
> - PP01 for pages with _PAGE_USER and _PAGE_RW
> - PP11 for pages with _PAGE_USER and without _PAGE_RW
> 
> For kernelspace segments, kernel accesses are performed with key 0
> and user accesses are performed with key 1. As PP00 is used for
> non _PAGE_USER pages, user can't access kernel pages not flagged
> _PAGE_USER while kernel can.
> 
> For userspace segments, both kernel and user accesses are performed
> with key 0, therefore pages not flagged _PAGE_USER are still
> accessible to the user.
> 
> This shouldn't be an issue, because userspace is expected to be
> accessible to the user. But unlike most other architectures, powerpc
> implements PROT_NONE protection by removing _PAGE_USER flag instead of
> flagging the page as not valid. This means that pages in userspace
> that are not flagged _PAGE_USER shall remain inaccessible.
> 
> To get the expected behaviour, just mimic other architectures in the
> TLB miss handler by checking _PAGE_USER permission on userspace
> accesses as if it was the _PAGE_PRESENT bit.
> 
> Note that this problem only is only for 603 cores. The 604+ have
> an hash table, and hash_page() function already implement the
> verification of _PAGE_USER permission on userspace pages.
> 
> Fixes: f342adca3afc ("powerpc/32s: Prepare Kernel Userspace Access 
> Protection")
> Change-Id: I68bc5e5ff4542bdfcdcd12923fa96a5811707475
> Cc: sta...@vger.kernel.org # v5.2+
> Reported-by: Christoph Plattner 
> Signed-off-by: Christophe Leroy 
> Signed-off-by: Michael Ellerman 
> Link: 
> https://lore.kernel.org/r/4a0c6e3bb8f0c162457bf54d9bc6fd8d7b55129f.1612160907.git.christophe.le...@csgroup.eu
> ---
>  arch/powerpc/kernel/head_book3s_32.S | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)

Both backports applied, thanks.

greg k-h


Re: linux-next: manual merge of the tty tree with the powerpc-fixes tree

2021-03-04 Thread Greg KH
On Fri, Mar 05, 2021 at 12:05:23PM +1100, Stephen Rothwell wrote:
> Hi all,
> 
> Today's linux-next merge of the tty tree got a conflict in:
> 
>   drivers/tty/hvc/hvcs.c
> 
> between commit:
> 
>   386a966f5ce7 ("vio: make remove callback return void")
> 
> from the powerpc-fixes tree and commit:
> 
>   fb8d350c291c ("tty: hvc, drop unneeded forward declarations")
> 
> from the tty tree.
> 
> I fixed it up (they both removed the forward decalrartion of
> hvcs_remove(), but the latter removed more) and can carry the fix as
> necessary. This is now fixed as far as linux-next is concerned, but any
> non trivial conflicts should be mentioned to your upstream maintainer
> when your tree is submitted for merging.  You may also want to consider
> cooperating with the maintainer of the conflicting tree to minimise any
> particularly complex conflicts.

Thanks, that's the correct fix, we knew this was going to happen :(

greg k-h


Re: [PATCH for 5.10] powerpc/32: Preserve cr1 in exception prolog stack check to fix build error

2021-03-01 Thread Greg KH
On Tue, Feb 23, 2021 at 03:39:20PM +0100, Christophe Leroy wrote:
> 
> 
> Le 15/02/2021 à 15:30, Greg KH a écrit :
> > On Fri, Feb 12, 2021 at 08:57:14AM +, Christophe Leroy wrote:
> > > This is backport of 3642eb21256a ("powerpc/32: Preserve cr1 in
> > > exception prolog stack check to fix build error") for kernel 5.10
> > > 
> > > It fixes the build failure on v5.10 reported by kernel test robot
> > > and by David Michael.
> > > 
> > > This fix is not in Linux tree yet, it is in next branch in powerpc tree.
> > 
> > Then there's nothing I can do about it until that happens :(
> > 
> 
> It now is in Linus' tree, see 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3642eb21256a317ac14e9ed560242c6d20cf06d9

Now queued up, thanks.

greg k-h


Re: [PATCH for 5.10] powerpc/32: Preserve cr1 in exception prolog stack check to fix build error

2021-02-15 Thread Greg KH
On Fri, Feb 12, 2021 at 08:57:14AM +, Christophe Leroy wrote:
> This is backport of 3642eb21256a ("powerpc/32: Preserve cr1 in
> exception prolog stack check to fix build error") for kernel 5.10
> 
> It fixes the build failure on v5.10 reported by kernel test robot
> and by David Michael.
> 
> This fix is not in Linux tree yet, it is in next branch in powerpc tree.

Then there's nothing I can do about it until that happens :(



Re: [PATCH v2 2/2] ima: Free IMA measurement buffer after kexec syscall

2021-02-05 Thread Greg KH
On Thu, Feb 04, 2021 at 09:49:51AM -0800, Lakshmi Ramasubramanian wrote:
> IMA allocates kernel virtual memory to carry forward the measurement
> list, from the current kernel to the next kernel on kexec system call,
> in ima_add_kexec_buffer() function.  This buffer is not freed before
> completing the kexec system call resulting in memory leak.
> 
> Add ima_buffer field in "struct kimage" to store the virtual address
> of the buffer allocated for the IMA measurement list.
> Free the memory allocated for the IMA measurement list in
> kimage_file_post_load_cleanup() function.
> 
> Signed-off-by: Lakshmi Ramasubramanian 
> Suggested-by: Tyler Hicks 
> Reviewed-by: Thiago Jung Bauermann 
> Reviewed-by: Tyler Hicks 
> Fixes: 7b8589cc29e7 ("ima: on soft reboot, save the measurement list")
> ---
>  include/linux/kexec.h  | 5 +
>  kernel/kexec_file.c| 5 +
>  security/integrity/ima/ima_kexec.c | 2 ++
>  3 files changed, 12 insertions(+)



This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.




Re: [PATCH v2 1/2] ima: Free IMA measurement buffer on error

2021-02-05 Thread Greg KH
On Thu, Feb 04, 2021 at 09:49:50AM -0800, Lakshmi Ramasubramanian wrote:
> IMA allocates kernel virtual memory to carry forward the measurement
> list, from the current kernel to the next kernel on kexec system call,
> in ima_add_kexec_buffer() function.  In error code paths this memory
> is not freed resulting in memory leak.
> 
> Free the memory allocated for the IMA measurement list in
> the error code paths in ima_add_kexec_buffer() function.
> 
> Signed-off-by: Lakshmi Ramasubramanian 
> Suggested-by: Tyler Hicks 
> Fixes: 7b8589cc29e7 ("ima: on soft reboot, save the measurement list")
> ---
>  security/integrity/ima/ima_kexec.c | 1 +
>  1 file changed, 1 insertion(+)



This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.




Re: [RFC PATCH v3 2/6] swiotlb: Add restricted DMA pool

2021-01-13 Thread Greg KH
On Wed, Jan 13, 2021 at 12:51:26PM +0100, Christoph Hellwig wrote:
> On Wed, Jan 06, 2021 at 08:50:03AM +0100, Greg KH wrote:
> > > --- a/include/linux/device.h
> > > +++ b/include/linux/device.h
> > > @@ -413,6 +413,7 @@ struct dev_links_info {
> > >   * @dma_pools:   Dma pools (if dma'ble device).
> > >   * @dma_mem: Internal for coherent mem override.
> > >   * @cma_area:Contiguous memory area for dma allocations
> > > + * @dma_io_tlb_mem: Internal for swiotlb io_tlb_mem override.
> > 
> > Why does this have to be added here?  Shouldn't the platform-specific
> > code handle it instead?
> 
> The whole code added here is pretty generic.  What we need to eventually
> do, though is to add a separate dma_device instead of adding more and more
> bloat to struct device.

I have no objections for that happening!


Re: [RFC PATCH v3 2/6] swiotlb: Add restricted DMA pool

2021-01-05 Thread Greg KH
On Wed, Jan 06, 2021 at 11:41:20AM +0800, Claire Chang wrote:
> Add the initialization function to create restricted DMA pools from
> matching reserved-memory nodes in the device tree.
> 
> Signed-off-by: Claire Chang 
> ---
>  include/linux/device.h  |   4 ++
>  include/linux/swiotlb.h |   7 +-
>  kernel/dma/Kconfig  |   1 +
>  kernel/dma/swiotlb.c| 144 ++--
>  4 files changed, 131 insertions(+), 25 deletions(-)
> 
> diff --git a/include/linux/device.h b/include/linux/device.h
> index 89bb8b84173e..ca6f71ec8871 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -413,6 +413,7 @@ struct dev_links_info {
>   * @dma_pools:   Dma pools (if dma'ble device).
>   * @dma_mem: Internal for coherent mem override.
>   * @cma_area:Contiguous memory area for dma allocations
> + * @dma_io_tlb_mem: Internal for swiotlb io_tlb_mem override.

Why does this have to be added here?  Shouldn't the platform-specific
code handle it instead?

thanks,

greg k-h


Re: [PATCH v3 5/6] mm/mmu_gather: invalidate TLB correctly on batch allocation failure and flush

2021-01-05 Thread Greg KH
On Thu, Mar 12, 2020 at 06:57:39PM +0530, Santosh Sivaraj wrote:
> From: Peter Zijlstra 
> 
> commit 0ed1325967ab5f7a4549a2641c6ebe115f76e228 upstream.
> 
> Architectures for which we have hardware walkers of Linux page table
> should flush TLB on mmu gather batch allocation failures and batch flush.
> Some architectures like POWER supports multiple translation modes (hash
> and radix) and in the case of POWER only radix translation mode needs the
> above TLBI.  This is because for hash translation mode kernel wants to
> avoid this extra flush since there are no hardware walkers of linux page
> table.  With radix translation, the hardware also walks linux page table
> and with that, kernel needs to make sure to TLB invalidate page walk cache
> before page table pages are freed.
> 
> More details in commit d86564a2f085 ("mm/tlb, x86/mm: Support invalidating
> TLB caches for RCU_TABLE_FREE")
> 
> The changes to sparc are to make sure we keep the old behavior since we
> are now removing HAVE_RCU_TABLE_NO_INVALIDATE.  The default value for
> tlb_needs_table_invalidate is to always force an invalidate and sparc can
> avoid the table invalidate.  Hence we define tlb_needs_table_invalidate to
> false for sparc architecture.
> 
> Link: 
> http://lkml.kernel.org/r/20200116064531.483522-3-aneesh.ku...@linux.ibm.com
> Fixes: a46cc7a90fd8 ("powerpc/mm/radix: Improve TLB/PWC flushes")
> Signed-off-by: Peter Zijlstra (Intel)  Signed-off-by: Aneesh Kumar K.V 
> Cc:   # 4.19
> Signed-off-by: Santosh Sivaraj 
> [santosh: backported to 4.19 stable]
> ---
>  arch/Kconfig|  3 ---
>  arch/powerpc/Kconfig|  1 -
>  arch/powerpc/include/asm/tlb.h  | 11 +++
>  arch/sparc/Kconfig  |  1 -
>  arch/sparc/include/asm/tlb_64.h |  9 +
>  include/asm-generic/tlb.h   | 15 +++
>  mm/memory.c | 16 
>  7 files changed, 43 insertions(+), 13 deletions(-)

As the testing pointed out, this breaks the build on lots of arches:

https://lore.kernel.org/r/caeuse78+f1q9lfjpf8sqzqa6+ak4wcpiincuvxecv+kpdry...@mail.gmail.com

https://lore.kernel.org/r/cff87cd2-8cd5-241e-3a05-a817b1a56...@roeck-us.net

so I'm going to drop this whole series and do a -rc2.

If you still want/need this series in 4.19, please make sure it really
works for everyone :)

thanks,

greg k-h


Re: [PATCH v3 0/6] Memory corruption may occur due to incorrent tlb flush

2021-01-04 Thread Greg KH
On Thu, Mar 12, 2020 at 06:57:34PM +0530, Santosh Sivaraj wrote:
> The TLB flush optimisation (a46cc7a90f: powerpc/mm/radix: Improve TLB/PWC
> flushes) may result in random memory corruption. Any concurrent page-table 
> walk
> could end up with a Use-after-Free. Even on UP this might give issues, since
> mmu_gather is preemptible these days. An interrupt or preempted task accessing
> user pages might stumble into the free page if the hardware caches page
> directories.
> 
> The series is a backport of the fix sent by Peter [1].
> 
> The first three patches are dependencies for the last patch (avoid potential
> double flush). If the performance impact due to double flush is considered
> trivial then the first three patches and last patch may be dropped.
> 
> This is only for v4.19 stable.
> 
> [1] https://patchwork.kernel.org/cover/11284843/

Sorry for the delay, now queued up, let's see what the test-builders say
about it...

thanks,

greg k-h


Re: [PATCH] powerpc: Stop exporting __clear_user which is now inlined.

2020-12-06 Thread Greg KH
On Sat, Dec 05, 2020 at 09:58:23PM +1100, Michael Ellerman wrote:
> Michal Suchanek  writes:
> > Stable commit 452e2a83ea23 ("powerpc: Fix __clear_user() with KUAP
> > enabled") redefines __clear_user as inline function but does not remove
> > the export.
> >
> > Fixes: 452e2a83ea23 ("powerpc: Fix __clear_user() with KUAP enabled")
> >
> > Signed-off-by: Michal Suchanek 
> > ---
> >  arch/powerpc/lib/ppc_ksyms.c | 1 -
> >  1 file changed, 1 deletion(-)
> 
> Acked-by: Michael Ellerman 

Now applied, thanks.

greg k-h


Re: [PATCH] powerpc/hotplug: assign hot added LMB to the right node

2020-12-03 Thread Greg KH
On Thu, Dec 03, 2020 at 11:15:14AM +0100, Laurent Dufour wrote:
> This patch applies to 5.9 and earlier kernels only.
> 
> Since 5.10, this has been fortunately fixed by the commit
> e5e179aa3a39 ("pseries/drmem: don't cache node id in drmem_lmb struct").

Why can't we just backport that patch instead?  It's almost always
better to do that than to have a one-off patch, as almost always those
have bugs in them.

thanks,

greg k-h


Re: [PATCH for 5.4] powerpc/8xx: Always fault when _PAGE_ACCESSED is not set

2020-11-20 Thread Greg KH
On Thu, Nov 19, 2020 at 08:47:54AM +, Christophe Leroy wrote:
> [This is backport for 5.4 of 29daf869cbab69088fe1755d9dd224e99ba78b56]
> 
> The kernel expects pte_young() to work regardless of CONFIG_SWAP.

All backports now queued up, thanks.

greg k-h


Re: [PATCH for 5.4] powerpc/603: Always fault when _PAGE_ACCESSED is not set

2020-11-17 Thread Greg KH
On Mon, Nov 09, 2020 at 05:40:52PM +, Christophe Leroy wrote:
> [That is backport of 11522448e641e8f1690c9db06e01985e8e19b401 to linux 5.4]
> 
> The kernel expects pte_young() to work regardless of CONFIG_SWAP.
> 
> Make sure a minor fault is taken to set _PAGE_ACCESSED when it
> is not already set, regardless of the selection of CONFIG_SWAP.
> 
> Fixes: 84de6ab0e904 ("powerpc/603: don't handle PAGE_ACCESSED in TLB miss 
> handlers.")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Christophe Leroy 
> Signed-off-by: Michael Ellerman 
> Link: 
> https://lore.kernel.org/r/a44367744de54e2315b2f1a8cbbd7f88488072e0.1602342806.git.christophe.le...@csgroup.eu
> ---
>  arch/powerpc/kernel/head_32.S | 12 
>  1 file changed, 12 deletions(-)

Both backports now queued up, thanks.

greg k-h


Re: [PATCH 4.19] mm: fix exec activate_mm vs TLB shootdown and lazy tlb switching race

2020-11-04 Thread Greg KH
On Wed, Nov 04, 2020 at 12:14:06PM +1100, Michael Ellerman wrote:
> From: Nicholas Piggin 
> 
> commit d53c3dfb23c45f7d4f910c3a3ca84bf0a99c6143 upstream.
> 
> Reading and modifying current->mm and current->active_mm and switching
> mm should be done with irqs off, to prevent races seeing an intermediate
> state.
> 
> This is similar to commit 38cf307c1f20 ("mm: fix kthread_use_mm() vs TLB
> invalidate"). At exec-time when the new mm is activated, the old one
> should usually be single-threaded and no longer used, unless something
> else is holding an mm_users reference (which may be possible).
> 
> Absent other mm_users, there is also a race with preemption and lazy tlb
> switching. Consider the kernel_execve case where the current thread is
> using a lazy tlb active mm:
> 
>   call_usermodehelper()
> kernel_execve()
>   old_mm = current->mm;
>   active_mm = current->active_mm;
>   *** preempt *** >  schedule()
>prev->active_mm = NULL;
>mmdrop(prev active_mm);
>  ...
>   <  schedule()
>   current->mm = mm;
>   current->active_mm = mm;
>   if (!old_mm)
>   mmdrop(active_mm);
> 
> If we switch back to the kernel thread from a different mm, there is a
> double free of the old active_mm, and a missing free of the new one.
> 
> Closing this race only requires interrupts to be disabled while ->mm
> and ->active_mm are being switched, but the TLB problem requires also
> holding interrupts off over activate_mm. Unfortunately not all archs
> can do that yet, e.g., arm defers the switch if irqs are disabled and
> expects finish_arch_post_lock_switch() to be called to complete the
> flush; um takes a blocking lock in activate_mm().
> 
> So as a first step, disable interrupts across the mm/active_mm updates
> to close the lazy tlb preempt race, and provide an arch option to
> extend that to activate_mm which allows architectures doing IPI based
> TLB shootdowns to close the second race.
> 
> This is a bit ugly, but in the interest of fixing the bug and backporting
> before all architectures are converted this is a compromise.
> 
> Signed-off-by: Nicholas Piggin 
> Acked-by: Peter Zijlstra (Intel) 
> [mpe: Manual backport to 4.19 due to membarrier_exec_mmap(mm) changes]
> Signed-off-by: Michael Ellerman 
> Link: https://lore.kernel.org/r/20200914045219.3736466-2-npig...@gmail.com
> ---
>  arch/Kconfig |  7 +++
>  fs/exec.c| 15 ++-
>  2 files changed, 21 insertions(+), 1 deletion(-)

Now queued up, thanks!

greg k-h


Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"

2020-11-02 Thread 'Greg KH'
On Mon, Nov 02, 2020 at 09:06:38AM +, David Laight wrote:
> From: 'Greg KH'
> > Sent: 23 October 2020 15:47
> > 
> > On Fri, Oct 23, 2020 at 02:39:24PM +, David Laight wrote:
> > > From: David Hildenbrand
> > > > Sent: 23 October 2020 15:33
> > > ...
> > > > I just checked against upstream code generated by clang 10 and it
> > > > properly discards the upper 32bit via a mov w23 w2.
> > > >
> > > > So at least clang 10 indeed properly assumes we could have garbage and
> > > > masks it off.
> > > >
> > > > Maybe the issue is somewhere else, unrelated to nr_pages ... or clang 11
> > > > behaves differently.
> > >
> > > We'll need the disassembly from a failing kernel image.
> > > It isn't that big to hand annotate.
> > 
> > I've worked around the merge at the moment in the android tree, but it
> > is still quite reproducable, and will try to get a .o file to
> > disassemble on Monday or so...
> 
> Did this get properly resolved?

For some reason, 5.10-rc2 fixed all of this up.  I backed out all of the
patches I had to revert to get 5.10-rc1 to work properly, and then did
the merge and all is well.

It must have been something to do with the compat changes in this same
area that went in after 5.10-rc1, and something got reorganized in the
files somehow.  I really do not know, and at the moment, don't have the
time to track it down anymore.  So for now, I'd say it's all good, sorry
for the noise.

greg k-h


Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"

2020-10-23 Thread 'Greg KH'
On Fri, Oct 23, 2020 at 02:39:24PM +, David Laight wrote:
> From: David Hildenbrand
> > Sent: 23 October 2020 15:33
> ...
> > I just checked against upstream code generated by clang 10 and it
> > properly discards the upper 32bit via a mov w23 w2.
> > 
> > So at least clang 10 indeed properly assumes we could have garbage and
> > masks it off.
> > 
> > Maybe the issue is somewhere else, unrelated to nr_pages ... or clang 11
> > behaves differently.
> 
> We'll need the disassembly from a failing kernel image.
> It isn't that big to hand annotate.

I've worked around the merge at the moment in the android tree, but it
is still quite reproducable, and will try to get a .o file to
disassemble on Monday or so...

thanks,

greg k-h


Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"

2020-10-22 Thread Greg KH
On Thu, Oct 22, 2020 at 04:28:20PM +0200, Arnd Bergmann wrote:
> On Thu, Oct 22, 2020 at 3:50 PM Greg KH  wrote:
> > On Thu, Oct 22, 2020 at 02:57:59PM +0200, Greg KH wrote:
> > > On Thu, Oct 22, 2020 at 02:42:24PM +0200, David Hildenbrand wrote:
> 
> > > >  struct iovec *iovec_from_user(const struct iovec __user *uvec,
> > > > -   unsigned long nr_segs, unsigned long fast_segs,
> > > > +   unsigned nr_segs, unsigned fast_segs,
> > > > struct iovec *fast_iov, bool compat)
> > > >  {
> > > > struct iovec *iov = fast_iov;
> > > > @@ -1738,7 +1738,7 @@ ssize_t __import_iovec(int type, const struct
> > > > iovec __user *uvec,
> > > >  struct iov_iter *i, bool compat)
> > > >  {
> > > > ssize_t total_len = 0;
> > > > -   unsigned long seg;
> > > > +   unsigned seg;
> > > > struct iovec *iov;
> > > >
> > > > iov = iovec_from_user(uvec, nr_segs, fast_segs, *iovp, compat);
> > > >
> > >
> > > Ah, I tested the other way around, making everything "unsigned long"
> > > instead.  Will go try this too, as other tests are still running...
> >
> > Ok, no, this didn't work either.
> >
> > Nick, I think I need some compiler help here.  Any ideas?
> 
> I don't think the patch above would reliably clear the upper bits if they
> contain garbage.
> 
> If the integer extension is the problem, the way I'd try it is to make the
> function take an 'unsigned long' and then explictly mask the upper
> bits with
> 
>  seg = lower_32_bits(seg);
> 
> Can you attach the iov_iter.s files from the broken build, plus the
> one with 'noinline' for comparison? Maybe something can be seen
> in there.

I don't know how to extract the .s files easily from the AOSP build
system, I'll look into that.  I'm also now testing by downgrading to an
older version of clang (10 instead of 11), to see if that matters at all
or not...

thanks,

greg k-h


Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"

2020-10-22 Thread Greg KH
On Thu, Oct 22, 2020 at 02:57:59PM +0200, Greg KH wrote:
> On Thu, Oct 22, 2020 at 02:42:24PM +0200, David Hildenbrand wrote:
> > On 22.10.20 14:18, Greg KH wrote:
> > > On Thu, Oct 22, 2020 at 12:48:05PM +0200, Greg KH wrote:
> > >> On Thu, Oct 22, 2020 at 11:36:40AM +0200, David Hildenbrand wrote:
> > >>> On 22.10.20 11:32, David Laight wrote:
> > >>>> From: David Hildenbrand
> > >>>>> Sent: 22 October 2020 10:25
> > >>>> ...
> > >>>>> ... especially because I recall that clang and gcc behave slightly
> > >>>>> differently:
> > >>>>>
> > >>>>> https://github.com/hjl-tools/x86-psABI/issues/2
> > >>>>>
> > >>>>> "Function args are different: narrow types are sign or zero extended 
> > >>>>> to
> > >>>>> 32 bits, depending on their type. clang depends on this for incoming
> > >>>>> args, but gcc doesn't make that assumption. But both compilers do it
> > >>>>> when calling, so gcc code can call clang code.
> > >>>>
> > >>>> It really is best to use 'int' (or even 'long') for all numeric
> > >>>> arguments (and results) regardless of the domain of the value.
> > >>>>
> > >>>> Related, I've always worried about 'bool'
> > >>>>
> > >>>>> The upper 32 bits of registers are always undefined garbage for types
> > >>>>> smaller than 64 bits."
> > >>>>
> > >>>> On x86-64 the high bits are zeroed by all 32bit loads.
> > >>>
> > >>> Yeah, but does not help here.
> > >>>
> > >>>
> > >>> My thinking: if the compiler that calls import_iovec() has garbage in
> > >>> the upper 32 bit
> > >>>
> > >>> a) gcc will zero it out and not rely on it being zero.
> > >>> b) clang will not zero it out, assuming it is zero.
> > >>>
> > >>> But
> > >>>
> > >>> a) will zero it out when calling the !inlined variant
> > >>> b) clang will zero it out when calling the !inlined variant
> > >>>
> > >>> When inlining, b) strikes. We access garbage. That would mean that we
> > >>> have calling code that's not generated by clang/gcc IIUC.
> > >>>
> > >>> We can test easily by changing the parameters instead of adding an 
> > >>> "inline".
> > >>
> > >> Let me try that as well, as I seem to have a good reproducer, but it
> > >> takes a while to run...
> > > 
> > > Ok, that didn't work.
> > > 
> > > And I can't seem to "fix" this by adding noinline to patches further
> > > along in the patch series (because this commit's function is no longer
> > > present due to later patches.)
> > 
> > We might have the same issues with iovec_from_user() and friends now.
> > 
> > > 
> > > Will keep digging...
> > > 
> > > greg k-h
> > > 
> > 
> > 
> > Might be worth to give this a try, just to see if it's related to
> > garbage in upper 32 bit and the way clang is handling it (might be a BUG
> > in clang, though):
> > 
> > 
> > diff --git a/include/linux/uio.h b/include/linux/uio.h
> > index 72d88566694e..7527298c6b56 100644
> > --- a/include/linux/uio.h
> > +++ b/include/linux/uio.h
> > @@ -267,7 +267,7 @@ size_t hash_and_copy_to_iter(const void *addr,
> > size_t bytes, void *hashp,
> > struct iov_iter *i);
> > 
> >  struct iovec *iovec_from_user(const struct iovec __user *uvector,
> > -   unsigned long nr_segs, unsigned long fast_segs,
> > +   unsigned nr_segs, unsigned fast_segs,
> > struct iovec *fast_iov, bool compat);
> >  ssize_t import_iovec(int type, const struct iovec __user *uvec,
> >  unsigned nr_segs, unsigned fast_segs, struct iovec **iovp,
> > diff --git a/lib/iov_iter.c b/lib/iov_iter.c
> > index 1635111c5bd2..58417f1916dc 100644
> > --- a/lib/iov_iter.c
> > +++ b/lib/iov_iter.c
> > @@ -1652,7 +1652,7 @@ const void *dup_iter(struct iov_iter *new, struct
> > iov_iter *old, gfp_t flags)
> >  EXPORT_SYMBOL(dup_iter);
> > 
> >  static int copy_compat_iovec_from_user(struct iovec *iov,
> > -   const s

Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"

2020-10-22 Thread Greg KH
On Thu, Oct 22, 2020 at 02:42:24PM +0200, David Hildenbrand wrote:
> On 22.10.20 14:18, Greg KH wrote:
> > On Thu, Oct 22, 2020 at 12:48:05PM +0200, Greg KH wrote:
> >> On Thu, Oct 22, 2020 at 11:36:40AM +0200, David Hildenbrand wrote:
> >>> On 22.10.20 11:32, David Laight wrote:
> >>>> From: David Hildenbrand
> >>>>> Sent: 22 October 2020 10:25
> >>>> ...
> >>>>> ... especially because I recall that clang and gcc behave slightly
> >>>>> differently:
> >>>>>
> >>>>> https://github.com/hjl-tools/x86-psABI/issues/2
> >>>>>
> >>>>> "Function args are different: narrow types are sign or zero extended to
> >>>>> 32 bits, depending on their type. clang depends on this for incoming
> >>>>> args, but gcc doesn't make that assumption. But both compilers do it
> >>>>> when calling, so gcc code can call clang code.
> >>>>
> >>>> It really is best to use 'int' (or even 'long') for all numeric
> >>>> arguments (and results) regardless of the domain of the value.
> >>>>
> >>>> Related, I've always worried about 'bool'
> >>>>
> >>>>> The upper 32 bits of registers are always undefined garbage for types
> >>>>> smaller than 64 bits."
> >>>>
> >>>> On x86-64 the high bits are zeroed by all 32bit loads.
> >>>
> >>> Yeah, but does not help here.
> >>>
> >>>
> >>> My thinking: if the compiler that calls import_iovec() has garbage in
> >>> the upper 32 bit
> >>>
> >>> a) gcc will zero it out and not rely on it being zero.
> >>> b) clang will not zero it out, assuming it is zero.
> >>>
> >>> But
> >>>
> >>> a) will zero it out when calling the !inlined variant
> >>> b) clang will zero it out when calling the !inlined variant
> >>>
> >>> When inlining, b) strikes. We access garbage. That would mean that we
> >>> have calling code that's not generated by clang/gcc IIUC.
> >>>
> >>> We can test easily by changing the parameters instead of adding an 
> >>> "inline".
> >>
> >> Let me try that as well, as I seem to have a good reproducer, but it
> >> takes a while to run...
> > 
> > Ok, that didn't work.
> > 
> > And I can't seem to "fix" this by adding noinline to patches further
> > along in the patch series (because this commit's function is no longer
> > present due to later patches.)
> 
> We might have the same issues with iovec_from_user() and friends now.
> 
> > 
> > Will keep digging...
> > 
> > greg k-h
> > 
> 
> 
> Might be worth to give this a try, just to see if it's related to
> garbage in upper 32 bit and the way clang is handling it (might be a BUG
> in clang, though):
> 
> 
> diff --git a/include/linux/uio.h b/include/linux/uio.h
> index 72d88566694e..7527298c6b56 100644
> --- a/include/linux/uio.h
> +++ b/include/linux/uio.h
> @@ -267,7 +267,7 @@ size_t hash_and_copy_to_iter(const void *addr,
> size_t bytes, void *hashp,
> struct iov_iter *i);
> 
>  struct iovec *iovec_from_user(const struct iovec __user *uvector,
> -   unsigned long nr_segs, unsigned long fast_segs,
> +   unsigned nr_segs, unsigned fast_segs,
> struct iovec *fast_iov, bool compat);
>  ssize_t import_iovec(int type, const struct iovec __user *uvec,
>  unsigned nr_segs, unsigned fast_segs, struct iovec **iovp,
> diff --git a/lib/iov_iter.c b/lib/iov_iter.c
> index 1635111c5bd2..58417f1916dc 100644
> --- a/lib/iov_iter.c
> +++ b/lib/iov_iter.c
> @@ -1652,7 +1652,7 @@ const void *dup_iter(struct iov_iter *new, struct
> iov_iter *old, gfp_t flags)
>  EXPORT_SYMBOL(dup_iter);
> 
>  static int copy_compat_iovec_from_user(struct iovec *iov,
> -   const struct iovec __user *uvec, unsigned long nr_segs)
> +   const struct iovec __user *uvec, unsigned nr_segs)
>  {
> const struct compat_iovec __user *uiov =
> (const struct compat_iovec __user *)uvec;
> @@ -1684,7 +1684,7 @@ static int copy_compat_iovec_from_user(struct
> iovec *iov,
>  }
> 
>  static int copy_iovec_from_user(struct iovec *iov,
> -   const struct iovec __user *uvec, unsigned long nr_segs)
> +   const struct iovec __user *uvec, unsigned nr_segs)
>  {
> unsigned long seg;
> 
> @@ -1699,7 +1699,7 @@ static int copy_iovec_from_user(struct iovec *iov,
>  }
> 
>  struct iovec *iovec_from_user(const struct iovec __user *uvec,
> -   unsigned long nr_segs, unsigned long fast_segs,
> +   unsigned nr_segs, unsigned fast_segs,
> struct iovec *fast_iov, bool compat)
>  {
> struct iovec *iov = fast_iov;
> @@ -1738,7 +1738,7 @@ ssize_t __import_iovec(int type, const struct
> iovec __user *uvec,
>  struct iov_iter *i, bool compat)
>  {
> ssize_t total_len = 0;
> -   unsigned long seg;
> +   unsigned seg;
> struct iovec *iov;
> 
> iov = iovec_from_user(uvec, nr_segs, fast_segs, *iovp, compat);
> 

Ah, I tested the other way around, making everything "unsigned long"
instead.  Will go try this too, as other tests are still running...

thanks,

greg k-h


Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"

2020-10-22 Thread Greg KH
On Thu, Oct 22, 2020 at 12:48:05PM +0200, Greg KH wrote:
> On Thu, Oct 22, 2020 at 11:36:40AM +0200, David Hildenbrand wrote:
> > On 22.10.20 11:32, David Laight wrote:
> > > From: David Hildenbrand
> > >> Sent: 22 October 2020 10:25
> > > ...
> > >> ... especially because I recall that clang and gcc behave slightly
> > >> differently:
> > >>
> > >> https://github.com/hjl-tools/x86-psABI/issues/2
> > >>
> > >> "Function args are different: narrow types are sign or zero extended to
> > >> 32 bits, depending on their type. clang depends on this for incoming
> > >> args, but gcc doesn't make that assumption. But both compilers do it
> > >> when calling, so gcc code can call clang code.
> > > 
> > > It really is best to use 'int' (or even 'long') for all numeric
> > > arguments (and results) regardless of the domain of the value.
> > > 
> > > Related, I've always worried about 'bool'
> > > 
> > >> The upper 32 bits of registers are always undefined garbage for types
> > >> smaller than 64 bits."
> > > 
> > > On x86-64 the high bits are zeroed by all 32bit loads.
> > 
> > Yeah, but does not help here.
> > 
> > 
> > My thinking: if the compiler that calls import_iovec() has garbage in
> > the upper 32 bit
> > 
> > a) gcc will zero it out and not rely on it being zero.
> > b) clang will not zero it out, assuming it is zero.
> > 
> > But
> > 
> > a) will zero it out when calling the !inlined variant
> > b) clang will zero it out when calling the !inlined variant
> > 
> > When inlining, b) strikes. We access garbage. That would mean that we
> > have calling code that's not generated by clang/gcc IIUC.
> > 
> > We can test easily by changing the parameters instead of adding an "inline".
> 
> Let me try that as well, as I seem to have a good reproducer, but it
> takes a while to run...

Ok, that didn't work.

And I can't seem to "fix" this by adding noinline to patches further
along in the patch series (because this commit's function is no longer
present due to later patches.)

Will keep digging...

greg k-h


Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"

2020-10-22 Thread Greg KH
On Thu, Oct 22, 2020 at 11:36:40AM +0200, David Hildenbrand wrote:
> On 22.10.20 11:32, David Laight wrote:
> > From: David Hildenbrand
> >> Sent: 22 October 2020 10:25
> > ...
> >> ... especially because I recall that clang and gcc behave slightly
> >> differently:
> >>
> >> https://github.com/hjl-tools/x86-psABI/issues/2
> >>
> >> "Function args are different: narrow types are sign or zero extended to
> >> 32 bits, depending on their type. clang depends on this for incoming
> >> args, but gcc doesn't make that assumption. But both compilers do it
> >> when calling, so gcc code can call clang code.
> > 
> > It really is best to use 'int' (or even 'long') for all numeric
> > arguments (and results) regardless of the domain of the value.
> > 
> > Related, I've always worried about 'bool'
> > 
> >> The upper 32 bits of registers are always undefined garbage for types
> >> smaller than 64 bits."
> > 
> > On x86-64 the high bits are zeroed by all 32bit loads.
> 
> Yeah, but does not help here.
> 
> 
> My thinking: if the compiler that calls import_iovec() has garbage in
> the upper 32 bit
> 
> a) gcc will zero it out and not rely on it being zero.
> b) clang will not zero it out, assuming it is zero.
> 
> But
> 
> a) will zero it out when calling the !inlined variant
> b) clang will zero it out when calling the !inlined variant
> 
> When inlining, b) strikes. We access garbage. That would mean that we
> have calling code that's not generated by clang/gcc IIUC.
> 
> We can test easily by changing the parameters instead of adding an "inline".

Let me try that as well, as I seem to have a good reproducer, but it
takes a while to run...

greg k-h


Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"

2020-10-22 Thread Greg KH
On Thu, Oct 22, 2020 at 10:48:59AM +0200, David Hildenbrand wrote:
> On 22.10.20 10:40, David Laight wrote:
> > From: David Hildenbrand
> >> Sent: 22 October 2020 09:35
> >>
> >> On 22.10.20 10:26, Greg KH wrote:
> >>> On Thu, Oct 22, 2020 at 12:39:14AM +0100, Al Viro wrote:
> >>>> On Wed, Oct 21, 2020 at 06:13:01PM +0200, Greg KH wrote:
> >>>>> On Fri, Sep 25, 2020 at 06:51:39AM +0200, Christoph Hellwig wrote:
> >>>>>> From: David Laight 
> >>>>>>
> >>>>>> This lets the compiler inline it into import_iovec() generating
> >>>>>> much better code.
> >>>>>>
> >>>>>> Signed-off-by: David Laight 
> >>>>>> Signed-off-by: Christoph Hellwig 
> >>>>>> ---
> >>>>>>  fs/read_write.c | 179 
> >>>>>>  lib/iov_iter.c  | 176 +++
> >>>>>>  2 files changed, 176 insertions(+), 179 deletions(-)
> >>>>>
> >>>>> Strangely, this commit causes a regression in Linus's tree right now.
> >>>>>
> >>>>> I can't really figure out what the regression is, only that this commit
> >>>>> triggers a "large Android system binary" from working properly.  There's
> >>>>> no kernel log messages anywhere, and I don't have any way to strace the
> >>>>> thing in the testing framework, so any hints that people can provide
> >>>>> would be most appreciated.
> >>>>
> >>>> It's a pure move - modulo changed line breaks in the argument lists
> >>>> the functions involved are identical before and after that (just checked
> >>>> that directly, by checking out the trees before and after, extracting two
> >>>> functions in question from fs/read_write.c and lib/iov_iter.c (before and
> >>>> after, resp.) and checking the diff between those.
> >>>>
> >>>> How certain is your bisection?
> >>>
> >>> The bisection is very reproducable.
> >>>
> >>> But, this looks now to be a compiler bug.  I'm using the latest version
> >>> of clang and if I put "noinline" at the front of the function,
> >>> everything works.
> >>
> >> Well, the compiler can do more invasive optimizations when inlining. If
> >> you have buggy code that relies on some unspecified behavior, inlining
> >> can change the behavior ... but going over that code, there isn't too
> >> much action going on. At least nothing screamed at me.
> > 
> > Apart from all the optimisations that get rid off the 'pass be reference'
> > parameters and strange conditional tests.
> > Plenty of scope for the compiler getting it wrong.
> > But nothing even vaguely illegal.
> 
> Not the first time that people blame the compiler to then figure out
> that something else is wrong ... but maybe this time is different :)

I agree, I hate to blame the compiler, that's almost never the real
problem, but this one sure "feels" like it.

I'm running some more tests, trying to narrow things down as just adding
a "noinline" to the function that got moved here doesn't work on Linus's
tree at the moment because the function was split into multiple
functions.

Give me a few hours...

thanks,

greg k-h


Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"

2020-10-22 Thread Greg KH
On Thu, Oct 22, 2020 at 12:39:14AM +0100, Al Viro wrote:
> On Wed, Oct 21, 2020 at 06:13:01PM +0200, Greg KH wrote:
> > On Fri, Sep 25, 2020 at 06:51:39AM +0200, Christoph Hellwig wrote:
> > > From: David Laight 
> > > 
> > > This lets the compiler inline it into import_iovec() generating
> > > much better code.
> > > 
> > > Signed-off-by: David Laight 
> > > Signed-off-by: Christoph Hellwig 
> > > ---
> > >  fs/read_write.c | 179 
> > >  lib/iov_iter.c  | 176 +++
> > >  2 files changed, 176 insertions(+), 179 deletions(-)
> > 
> > Strangely, this commit causes a regression in Linus's tree right now.
> > 
> > I can't really figure out what the regression is, only that this commit
> > triggers a "large Android system binary" from working properly.  There's
> > no kernel log messages anywhere, and I don't have any way to strace the
> > thing in the testing framework, so any hints that people can provide
> > would be most appreciated.
> 
> It's a pure move - modulo changed line breaks in the argument lists
> the functions involved are identical before and after that (just checked
> that directly, by checking out the trees before and after, extracting two
> functions in question from fs/read_write.c and lib/iov_iter.c (before and
> after, resp.) and checking the diff between those.
> 
> How certain is your bisection?

The bisection is very reproducable.

But, this looks now to be a compiler bug.  I'm using the latest version
of clang and if I put "noinline" at the front of the function,
everything works.

Nick, any ideas here as to who I should report this to?

I'll work on a fixup patch for the Android kernel tree to see if I can
work around it there, but others will hit this in Linus's tree sooner or
later...

thanks,

greg k-h


Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"

2020-10-21 Thread Greg KH
On Fri, Sep 25, 2020 at 06:51:39AM +0200, Christoph Hellwig wrote:
> From: David Laight 
> 
> This lets the compiler inline it into import_iovec() generating
> much better code.
> 
> Signed-off-by: David Laight 
> Signed-off-by: Christoph Hellwig 
> ---
>  fs/read_write.c | 179 
>  lib/iov_iter.c  | 176 +++
>  2 files changed, 176 insertions(+), 179 deletions(-)

Strangely, this commit causes a regression in Linus's tree right now.

I can't really figure out what the regression is, only that this commit
triggers a "large Android system binary" from working properly.  There's
no kernel log messages anywhere, and I don't have any way to strace the
thing in the testing framework, so any hints that people can provide
would be most appreciated.

thanks,

greg k-h


Re: [PATCH] serial: pmac_zilog: don't init if zilog is not available

2020-10-20 Thread Greg KH
On Tue, Oct 20, 2020 at 08:19:26PM +0200, Laurent Vivier wrote:
> Le 20/10/2020 à 19:37, Greg KH a écrit :
> > On Tue, Oct 20, 2020 at 06:37:41PM +0200, Laurent Vivier wrote:
> >> Le 20/10/2020 à 18:28, Greg KH a écrit :
> >>> On Tue, Oct 20, 2020 at 06:23:03PM +0200, Laurent Vivier wrote:
> >>>> We can avoid to probe for the Zilog device (and generate ugly kernel 
> >>>> warning)
> >>>> if kernel is built for Mac but not on a Mac.
> >>>>
> >>>> Signed-off-by: Laurent Vivier 
> >>>> ---
> >>>>  drivers/tty/serial/pmac_zilog.c | 11 +++
> >>>>  1 file changed, 11 insertions(+)
> >>>>
> >>>> diff --git a/drivers/tty/serial/pmac_zilog.c 
> >>>> b/drivers/tty/serial/pmac_zilog.c
> >>>> index 063484b22523..d1d2e55983c3 100644
> >>>> --- a/drivers/tty/serial/pmac_zilog.c
> >>>> +++ b/drivers/tty/serial/pmac_zilog.c
> >>>> @@ -1867,6 +1867,12 @@ static struct platform_driver pmz_driver = {
> >>>>  static int __init init_pmz(void)
> >>>>  {
> >>>>  int rc, i;
> >>>> +
> >>>> +#ifdef CONFIG_MAC
> >>>> +if (!MACH_IS_MAC)
> >>>> +return -ENODEV;
> >>>> +#endif
> >>>
> >>> Why is the #ifdef needed?
> >>>
> >>> We don't like putting #ifdef in .c files for good reasons.  Can you make
> >>> the api check for this work with and without that #ifdef needed?
> >>
> >> The #ifdef is needed because this file can be compiled for PowerMac and
> >> m68k Mac. For PowerMac, the MACH_IS_MAC is not defined, so we need the
> >> #ifdef.
> >>
> >> We need the MAC_IS_MAC because the same kernel can be used with several
> >> m68k machines, so the init_pmz can be called on a m68k machine without
> >> the zilog device (it's a multi-targets kernel).
> >>
> >> You can check it's the good way to do by looking inside:
> >>
> >> drivers/video/fbdev/valkyriefb.c +317
> >> drivers/macintosh/adb.c +316
> >>
> >> That are two files used by both, mac and pmac.
> > 
> > Why not fix it to work properly like other arch checks are done
> I would be happy to do the same.
> 
> > Put it in a .h file and do the #ifdef there.  Why is this "special"?
> 
> I don't know.
> 
> Do you mean something like:
> 
> drivers/tty/serial/pmac_zilog.h
> ...
> #ifndef MACH_IS_MAC
> #define MACH_IS_MAC (0)
> #endif
> ...
> 
> drivers/tty/serial/pmac_zilog.c
> ...
> static int __init pmz_console_init(void)
> {
> if (!MACH_IS_MAC)
> return -ENODEV;
> ...

Yup, that would be a good start, but why is the pmac_zilog.h file
responsible for this?  Shouldn't this be in some arch-specific file
somewhere?

thanks,

greg k-h


Re: [PATCH] serial: pmac_zilog: don't init if zilog is not available

2020-10-20 Thread Greg KH
On Tue, Oct 20, 2020 at 06:37:41PM +0200, Laurent Vivier wrote:
> Le 20/10/2020 à 18:28, Greg KH a écrit :
> > On Tue, Oct 20, 2020 at 06:23:03PM +0200, Laurent Vivier wrote:
> >> We can avoid to probe for the Zilog device (and generate ugly kernel 
> >> warning)
> >> if kernel is built for Mac but not on a Mac.
> >>
> >> Signed-off-by: Laurent Vivier 
> >> ---
> >>  drivers/tty/serial/pmac_zilog.c | 11 +++
> >>  1 file changed, 11 insertions(+)
> >>
> >> diff --git a/drivers/tty/serial/pmac_zilog.c 
> >> b/drivers/tty/serial/pmac_zilog.c
> >> index 063484b22523..d1d2e55983c3 100644
> >> --- a/drivers/tty/serial/pmac_zilog.c
> >> +++ b/drivers/tty/serial/pmac_zilog.c
> >> @@ -1867,6 +1867,12 @@ static struct platform_driver pmz_driver = {
> >>  static int __init init_pmz(void)
> >>  {
> >>int rc, i;
> >> +
> >> +#ifdef CONFIG_MAC
> >> +  if (!MACH_IS_MAC)
> >> +  return -ENODEV;
> >> +#endif
> > 
> > Why is the #ifdef needed?
> > 
> > We don't like putting #ifdef in .c files for good reasons.  Can you make
> > the api check for this work with and without that #ifdef needed?
> 
> The #ifdef is needed because this file can be compiled for PowerMac and
> m68k Mac. For PowerMac, the MACH_IS_MAC is not defined, so we need the
> #ifdef.
> 
> We need the MAC_IS_MAC because the same kernel can be used with several
> m68k machines, so the init_pmz can be called on a m68k machine without
> the zilog device (it's a multi-targets kernel).
> 
> You can check it's the good way to do by looking inside:
> 
> drivers/video/fbdev/valkyriefb.c +317
> drivers/macintosh/adb.c +316
> 
> That are two files used by both, mac and pmac.

Why not fix it to work properly like other arch checks are done?

Put it in a .h file and do the #ifdef there.  Why is this "special"?

thanks,

greg k-h


Re: [PATCH] serial: pmac_zilog: don't init if zilog is not available

2020-10-20 Thread Greg KH
On Tue, Oct 20, 2020 at 06:23:03PM +0200, Laurent Vivier wrote:
> We can avoid to probe for the Zilog device (and generate ugly kernel warning)
> if kernel is built for Mac but not on a Mac.
> 
> Signed-off-by: Laurent Vivier 
> ---
>  drivers/tty/serial/pmac_zilog.c | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/tty/serial/pmac_zilog.c b/drivers/tty/serial/pmac_zilog.c
> index 063484b22523..d1d2e55983c3 100644
> --- a/drivers/tty/serial/pmac_zilog.c
> +++ b/drivers/tty/serial/pmac_zilog.c
> @@ -1867,6 +1867,12 @@ static struct platform_driver pmz_driver = {
>  static int __init init_pmz(void)
>  {
>   int rc, i;
> +
> +#ifdef CONFIG_MAC
> + if (!MACH_IS_MAC)
> + return -ENODEV;
> +#endif

Why is the #ifdef needed?

We don't like putting #ifdef in .c files for good reasons.  Can you make
the api check for this work with and without that #ifdef needed?

thanks,

greg k-h


Re: linux-next: manual merge of the char-misc tree with the powerpc tree

2020-10-06 Thread Greg KH
On Tue, Oct 06, 2020 at 06:35:06PM +1100, Stephen Rothwell wrote:
> Hi all,
> 
> Today's linux-next merge of the char-misc tree got a conflict in:
> 
>   drivers/misc/ocxl/Kconfig
> 
> between commit:
> 
>   dde6f18a8779 ("ocxl: Don't return trigger page when allocating an 
> interrupt")
> 
> from the powerpc tree and commit:
> 
>   4b53a3c72116 ("ocxl: fix kconfig dependency warning for OCXL")
> 
> from the char-misc tree.
> 
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging.  You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.
> 
> -- 
> Cheers,
> Stephen Rothwell
> 
> diff --cc drivers/misc/ocxl/Kconfig
> index 0d815b2a40b3,947294f6d7f4..
> --- a/drivers/misc/ocxl/Kconfig
> +++ b/drivers/misc/ocxl/Kconfig
> @@@ -9,9 -9,8 +9,9 @@@ config OCXL_BAS
>   
>   config OCXL
>   tristate "OpenCAPI coherent accelerator support"
>  -depends on PPC_POWERNV && PCI && EEH && HOTPLUG_PCI_POWERNV
>  +depends on PPC_POWERNV && PCI && EEH && PPC_XIVE_NATIVE
> ++depends on HOTPLUG_PCI_POWERNV
>   select OCXL_BASE
> - select HOTPLUG_PCI_POWERNV
>   default m
>   help
> Select this option to enable the ocxl driver for Open

Looks good, thanks!

greg k-h


Re: [PATCH -next] tty: hvc: fix link error with CONFIG_SERIAL_CORE_CONSOLE=n

2020-09-18 Thread Greg KH
On Sat, Sep 19, 2020 at 10:48:41AM +0800, Yang Yingliang wrote:
> 
> On 2020/9/18 19:17, Greg KH wrote:
> > On Fri, Sep 18, 2020 at 05:20:30PM +0800, Yang Yingliang wrote:
> > > Fix the link error by selecting SERIAL_CORE_CONSOLE.
> > > 
> > > aarch64-linux-gnu-ld: drivers/tty/hvc/hvc_dcc.o: in function 
> > > `dcc_early_write':
> > > hvc_dcc.c:(.text+0x164): undefined reference to `uart_console_write'
> > > 
> > > Reported-by: Hulk Robot 
> > > Signed-off-by: Yang Yingliang 
> > > ---
> > >   drivers/tty/hvc/Kconfig | 1 +
> > >   1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/drivers/tty/hvc/Kconfig b/drivers/tty/hvc/Kconfig
> > > index d1b27b0522a3..8d60e0ff67b4 100644
> > > --- a/drivers/tty/hvc/Kconfig
> > > +++ b/drivers/tty/hvc/Kconfig
> > > @@ -81,6 +81,7 @@ config HVC_DCC
> > >   bool "ARM JTAG DCC console"
> > >   depends on ARM || ARM64
> > >   select HVC_DRIVER
> > > + select SERIAL_CORE_CONSOLE
> > >   help
> > > This console uses the JTAG DCC on ARM to create a console 
> > > under the HVC
> > > driver. This console is used through a JTAG only on ARM. If 
> > > you don't have
> > > -- 
> > > 2.25.1
> > > 
> > Same question here, what caused this problem to happen?
> Fixes: d1a1af2cdf19 ("hvc: dcc: Add earlycon support")

Great, can you resend with that added please?

thanks,

greg k-h


Re: [PATCH -next] tty: hvc: fix link error with CONFIG_SERIAL_CORE_CONSOLE=n

2020-09-18 Thread Greg KH
On Fri, Sep 18, 2020 at 05:20:30PM +0800, Yang Yingliang wrote:
> Fix the link error by selecting SERIAL_CORE_CONSOLE.
> 
> aarch64-linux-gnu-ld: drivers/tty/hvc/hvc_dcc.o: in function 
> `dcc_early_write':
> hvc_dcc.c:(.text+0x164): undefined reference to `uart_console_write'
> 
> Reported-by: Hulk Robot 
> Signed-off-by: Yang Yingliang 
> ---
>  drivers/tty/hvc/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/tty/hvc/Kconfig b/drivers/tty/hvc/Kconfig
> index d1b27b0522a3..8d60e0ff67b4 100644
> --- a/drivers/tty/hvc/Kconfig
> +++ b/drivers/tty/hvc/Kconfig
> @@ -81,6 +81,7 @@ config HVC_DCC
>   bool "ARM JTAG DCC console"
>   depends on ARM || ARM64
>   select HVC_DRIVER
> + select SERIAL_CORE_CONSOLE
>   help
> This console uses the JTAG DCC on ARM to create a console under the 
> HVC
> driver. This console is used through a JTAG only on ARM. If you don't 
> have
> -- 
> 2.25.1
> 

Same question here, what caused this problem to happen?

thanks,

greg k-h


Re: Please apply commit 0828137e8f16 ("powerpc/64s: Don't init FSCR_DSCR in __init_FSCR()") to v4.14.y, v4.19.y, v5.4.y, v5.7.y

2020-08-26 Thread Greg KH
On Tue, Aug 25, 2020 at 07:44:08PM -0300, Thadeu Lima de Souza Cascardo wrote:
> After commit 912c0a7f2b5daa3cbb2bc10f303981e493de73bd ("powerpc/64s: Save FSCR
> to init_task.thread.fscr after feature init"), which has been applied to the
> referred branches, when userspace sets the user DSCR MSR, it won't be 
> inherited
> or restored during context switch, because the facility unavailable interrupt
> won't trigger.
> 
> Applying 0828137e8f16721842468e33df0460044a0c588b ("powerpc/64s: Don't init
> FSCR_DSCR in __init_FSCR()") will fix it.

Now queued up, thanks.

greg k-h


Re: [PATCH] usb: gadget: fsl: Fix unsigned expression compared with zero in fsl_udc_probe

2020-08-24 Thread Greg KH
On Mon, Aug 24, 2020 at 04:04:37PM +0800, Ye Bin wrote:
> Signed-off-by: Ye Bin 

I can't take patches without any changelog text, sorry.

greg k-h


Re: [PATCH v2] powerpc/vio: drop bus_type from parent device

2020-07-29 Thread Greg KH
On Thu, Jul 30, 2020 at 11:28:38AM +1000, Michael Ellerman wrote:
> [ Added Peter & Greg to Cc ]
> 
> Thadeu Lima de Souza Cascardo  writes:
> > Commit df44b479654f62b478c18ee4d8bc4e9f897a9844 ("kobject: return error
> > code if writing /sys/.../uevent fails") started returning failure when
> > writing to /sys/devices/vio/uevent.
> >
> > This causes an early udevadm trigger to fail. On some installer versions of
> > Ubuntu, this will cause init to exit, thus panicing the system very early
> > during boot.
> >
> > Removing the bus_type from the parent device will remove some of the extra
> > empty files from /sys/devices/vio/, but will keep the rest of the layout
> > for vio devices, keeping them under /sys/devices/vio/.
> 
> What exactly does it change?
> 
> I'm finding it hard to evaluate if this change is going to cause a
> regression somehow.
> 
> I'm also not clear on why removing the bus type is correct, apart from
> whether it fixes the bug you're seeing.
> 
> > It has been tested that uevents for vio devices don't change after this
> > fix, they still contain MODALIAS.
> >
> > Signed-off-by: Thadeu Lima de Souza Cascardo 
> > Fixes: df44b479654f ("kobject: return error code if writing /sys/.../uevent 
> > fails")
> 
> AFAICS there haven't been any other fixes for that commit. Do we know
> why it is only vio that was affected? (possibly because it's a fake bus
> to begin with?)

So there was an error previously, the core was ignoring it, and now it
isn't and to fix that you want to remove describing what bus a device is
on?

Huh???

> 
> cheers
> 
> > diff --git a/arch/powerpc/platforms/pseries/vio.c 
> > b/arch/powerpc/platforms/pseries/vio.c
> > index 37f1f25ba804..a94dab3972a0 100644
> > --- a/arch/powerpc/platforms/pseries/vio.c
> > +++ b/arch/powerpc/platforms/pseries/vio.c
> > @@ -36,7 +36,6 @@ static struct vio_dev vio_bus_device  = { /* fake 
> > "parent" device */
> > .name = "vio",
> > .type = "",
> > .dev.init_name = "vio",
> > -   .dev.bus = _bus_type,
> >  };

Wait, a static 'struct device'?  You all are playing with fire there.
That's a reference counted object, and should never be declared like
that at all.

I see you register it, but never unregister it, why?  Why is it even
needed?

And if you remove the bus type of it, it will show up in a different
part of sysfs, so I think this patch will show a user-visable change,
right?

thanks,

greg k-h


Re: [PATCH v2 30/30] misc: cxl: flash: Remove unused pointer

2020-07-01 Thread Greg KH
On Wed, Jul 01, 2020 at 09:31:18AM +0100, Lee Jones wrote:
> The DRC index pointer us updated on an OPCODE_ADD, but never
> actually read.  Remove the used pointer and shift up OPCODE_ADD
> to group with OPCODE_DELETE which also provides a noop.
> 
> Fixes the following W=1 kernel build warning:
> 
>  drivers/misc/cxl/flash.c: In function ‘update_devicetree’:
>  drivers/misc/cxl/flash.c:178:16: warning: variable ‘drc_index’ set but not 
> used [-Wunused-but-set-variable]
>  178 | __be32 *data, drc_index, phandle;
>  | ^
> 
> Cc: Frederic Barrat 
> Cc: Andrew Donnellan 
> Cc: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/misc/cxl/flash.c | 7 ++-
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/misc/cxl/flash.c b/drivers/misc/cxl/flash.c
> index cb9cca35a2263..24e3dfcc91a74 100644
> --- a/drivers/misc/cxl/flash.c
> +++ b/drivers/misc/cxl/flash.c
> @@ -175,7 +175,7 @@ static int update_devicetree(struct cxl *adapter, s32 
> scope)
>   struct update_nodes_workarea *unwa;
>   u32 action, node_count;
>   int token, rc, i;
> - __be32 *data, drc_index, phandle;
> + __be32 *data, phandle;
>   char *buf;
>  
>   token = rtas_token("ibm,update-nodes");
> @@ -206,15 +206,12 @@ static int update_devicetree(struct cxl *adapter, s32 
> scope)
>  
>   switch (action) {
>   case OPCODE_DELETE:
> + case OPCODE_ADD:
>   /* nothing to do */
>   break;
>   case OPCODE_UPDATE:
>   update_node(phandle, scope);
>   break;
> - case OPCODE_ADD:
> - /* nothing to do, just move pointer */
> - drc_index = *data++;
> - break;

I think this is not correct, as *data++ changes the value there, and so
this changes the logic of the code.

Dropping this one from my queue.

thanks,

greg k-h


Re: [PATCH 11/13] random: simplify sysctl declaration with register_sysctl_subdir()

2020-05-29 Thread Greg KH
On Fri, May 29, 2020 at 07:41:06AM +, Luis Chamberlain wrote:
> From: Xiaoming Ni 
> 
> Move random_table sysctl from kernel/sysctl.c to drivers/char/random.c
> and use register_sysctl_subdir() to help remove the clutter out of
> kernel/sysctl.c.
> 
> Signed-off-by: Xiaoming Ni 
> Signed-off-by: Luis Chamberlain 
> ---
>  drivers/char/random.c  | 14 --
>  include/linux/sysctl.h |  1 -
>  kernel/sysctl.c|  5 -
>  3 files changed, 12 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/char/random.c b/drivers/char/random.c
> index a7cf6aa65908..73fd4b6e9c18 100644
> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -2101,8 +2101,7 @@ static int proc_do_entropy(struct ctl_table *table, int 
> write,
>  }
>  
>  static int sysctl_poolsize = INPUT_POOL_WORDS * 32;
> -extern struct ctl_table random_table[];
> -struct ctl_table random_table[] = {
> +static struct ctl_table random_table[] = {
>   {
>   .procname   = "poolsize",
>   .data   = _poolsize,
> @@ -2164,6 +2163,17 @@ struct ctl_table random_table[] = {
>  #endif
>   { }
>  };
> +
> +/*
> + * rand_initialize() is called before sysctl_init(),
> + * so we cannot call register_sysctl_init() in rand_initialize()
> + */
> +static int __init random_sysctls_init(void)
> +{
> + register_sysctl_subdir("kernel", "random", random_table);

No error checking?

:(



Re: [PATCH 09/13] firmware_loader: simplify sysctl declaration with register_sysctl_subdir()

2020-05-29 Thread Greg KH
On Fri, May 29, 2020 at 07:41:04AM +, Luis Chamberlain wrote:
> From: Xiaoming Ni 
> 
> Move the firmware config sysctl table to fallback_table.c and use the
> new register_sysctl_subdir() helper. This removes the clutter from
> kernel/sysctl.c.
> 
> Signed-off-by: Xiaoming Ni 
> Signed-off-by: Luis Chamberlain 
> ---
>  drivers/base/firmware_loader/fallback.c   |  4 
>  drivers/base/firmware_loader/fallback.h   | 11 ++
>  drivers/base/firmware_loader/fallback_table.c | 22 +--
>  include/linux/sysctl.h|  1 -
>  kernel/sysctl.c   |  7 --
>  5 files changed, 35 insertions(+), 10 deletions(-)

So it now takes more lines than the old stuff?  :(

> 
> diff --git a/drivers/base/firmware_loader/fallback.c 
> b/drivers/base/firmware_loader/fallback.c
> index d9ac7296205e..8190653ae9a3 100644
> --- a/drivers/base/firmware_loader/fallback.c
> +++ b/drivers/base/firmware_loader/fallback.c
> @@ -200,12 +200,16 @@ static struct class firmware_class = {
>  
>  int register_sysfs_loader(void)
>  {
> + int ret = register_firmware_config_sysctl();
> + if (ret != 0)
> + return ret;

checkpatch :(

>   return class_register(_class);

And if that fails?

>  }
>  
>  void unregister_sysfs_loader(void)
>  {
>   class_unregister(_class);
> + unregister_firmware_config_sysctl();
>  }
>  
>  static ssize_t firmware_loading_show(struct device *dev,
> diff --git a/drivers/base/firmware_loader/fallback.h 
> b/drivers/base/firmware_loader/fallback.h
> index 06f4577733a8..7d2cb5f6ceb8 100644
> --- a/drivers/base/firmware_loader/fallback.h
> +++ b/drivers/base/firmware_loader/fallback.h
> @@ -42,6 +42,17 @@ void fw_fallback_set_default_timeout(void);
>  
>  int register_sysfs_loader(void);
>  void unregister_sysfs_loader(void);
> +#ifdef CONFIG_SYSCTL
> +extern int register_firmware_config_sysctl(void);
> +extern void unregister_firmware_config_sysctl(void);
> +#else
> +static inline int register_firmware_config_sysctl(void)
> +{
> + return 0;
> +}
> +static inline void unregister_firmware_config_sysctl(void) { }
> +#endif /* CONFIG_SYSCTL */
> +
>  #else /* CONFIG_FW_LOADER_USER_HELPER */
>  static inline int firmware_fallback_sysfs(struct firmware *fw, const char 
> *name,
> struct device *device,
> diff --git a/drivers/base/firmware_loader/fallback_table.c 
> b/drivers/base/firmware_loader/fallback_table.c
> index 46a731dede6f..4234aa5ee5df 100644
> --- a/drivers/base/firmware_loader/fallback_table.c
> +++ b/drivers/base/firmware_loader/fallback_table.c
> @@ -24,7 +24,7 @@ struct firmware_fallback_config fw_fallback_config = {
>  EXPORT_SYMBOL_NS_GPL(fw_fallback_config, FIRMWARE_LOADER_PRIVATE);
>  
>  #ifdef CONFIG_SYSCTL
> -struct ctl_table firmware_config_table[] = {
> +static struct ctl_table firmware_config_table[] = {
>   {
>   .procname   = "force_sysfs_fallback",
>   .data   = _fallback_config.force_sysfs_fallback,
> @@ -45,4 +45,22 @@ struct ctl_table firmware_config_table[] = {
>   },
>   { }
>  };
> -#endif
> +
> +static struct ctl_table_header *hdr;
> +int register_firmware_config_sysctl(void)
> +{
> + if (hdr)
> + return -EEXIST;

How can hdr be set?

> + hdr = register_sysctl_subdir("kernel", "firmware_config",
> +  firmware_config_table);
> + if (!hdr)
> + return -ENOMEM;
> + return 0;
> +}
> +
> +void unregister_firmware_config_sysctl(void)
> +{
> + if (hdr)
> + unregister_sysctl_table(hdr);

Why can't unregister_sysctl_table() take a null pointer value?

And what sets 'hdr' (worst name for a static variable) to NULL so that
it knows not to be unregistered again as it looks like
register_firmware_config_sysctl() could be called multiple times.

thanks,

greg k-h


Re: [PATCH v4 3/6] asm-generic/tlb, arch: Invert CONFIG_HAVE_RCU_TABLE_INVALIDATE

2020-05-26 Thread Greg KH
On Wed, May 20, 2020 at 02:00:22PM +0530, Santosh Sivaraj wrote:
> From: Peter Zijlstra 
> 
> commit 96bc9567cbe112e9320250f01b9c060c882e8619 upstream
> 
> Make issuing a TLB invalidate for page-table pages the normal case.
> 
> The reason is twofold:
> 
>  - too many invalidates is safer than too few,
>  - most architectures use the linux page-tables natively
>and would thus require this.
> 
> Make it an opt-out, instead of an opt-in.
> 
> No change in behavior intended.
> 
> Signed-off-by: Peter Zijlstra (Intel) 
> Cc:  # 4.19
> Signed-off-by: Santosh Sivaraj 
> [santosh: prerequisite for upcoming tlbflush backports]
> ---
>  arch/Kconfig | 2 +-
>  arch/powerpc/Kconfig | 1 +
>  arch/sparc/Kconfig   | 1 +
>  arch/x86/Kconfig | 1 -
>  mm/memory.c  | 2 +-
>  5 files changed, 4 insertions(+), 3 deletions(-)

Why did you not also change arch/arm64/Kconfig and
include/asm-generic/tlb.h like the original patch changed?

Why can those files be ignored/left out?  You need to explain that in
the backport section, all you said was "prerequisite..." and did not say
why you changed this patch.

Please fix up, and make sure you do the same for all of the other
patches in this series for when you resend it.

thanks,

greg k-h


Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open

2020-05-15 Thread Greg KH
On Thu, May 14, 2020 at 04:22:10PM -0700, rana...@codeaurora.org wrote:
> On 2020-05-13 00:04, Greg KH wrote:
> > On Tue, May 12, 2020 at 02:39:50PM -0700, rana...@codeaurora.org wrote:
> > > On 2020-05-12 01:25, Greg KH wrote:
> > > > On Tue, May 12, 2020 at 09:22:15AM +0200, Jiri Slaby wrote:
> > > > > On 11. 05. 20, 9:39, Greg KH wrote:
> > > > > > On Mon, May 11, 2020 at 12:23:58AM -0700, rana...@codeaurora.org 
> > > > > > wrote:
> > > > > >> On 2020-05-09 23:48, Greg KH wrote:
> > > > > >>> On Sat, May 09, 2020 at 06:30:56PM -0700, rana...@codeaurora.org 
> > > > > >>> wrote:
> > > > > >>>> On 2020-05-06 02:48, Greg KH wrote:
> > > > > >>>>> On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao 
> > > > > >>>>> Ananta wrote:
> > > > > >>>>>> Potentially, hvc_open() can be called in parallel when two 
> > > > > >>>>>> tasks calls
> > > > > >>>>>> open() on /dev/hvcX. In such a scenario, if the
> > > > > >>>>>> hp->ops->notifier_add()
> > > > > >>>>>> callback in the function fails, where it sets the 
> > > > > >>>>>> tty->driver_data to
> > > > > >>>>>> NULL, the parallel hvc_open() can see this NULL and cause a 
> > > > > >>>>>> memory
> > > > > >>>>>> abort.
> > > > > >>>>>> Hence, serialize hvc_open and check if tty->private_data is 
> > > > > >>>>>> NULL
> > > > > >>>>>> before
> > > > > >>>>>> proceeding ahead.
> > > > > >>>>>>
> > > > > >>>>>> The issue can be easily reproduced by launching two tasks
> > > > > >>>>>> simultaneously
> > > > > >>>>>> that does nothing but open() and close() on /dev/hvcX.
> > > > > >>>>>> For example:
> > > > > >>>>>> $ ./simple_open_close /dev/hvc0 & ./simple_open_close 
> > > > > >>>>>> /dev/hvc0 &
> > > > > >>>>>>
> > > > > >>>>>> Signed-off-by: Raghavendra Rao Ananta 
> > > > > >>>>>> ---
> > > > > >>>>>>  drivers/tty/hvc/hvc_console.c | 16 ++--
> > > > > >>>>>>  1 file changed, 14 insertions(+), 2 deletions(-)
> > > > > >>>>>>
> > > > > >>>>>> diff --git a/drivers/tty/hvc/hvc_console.c
> > > > > >>>>>> b/drivers/tty/hvc/hvc_console.c
> > > > > >>>>>> index 436cc51c92c3..ebe26fe5ac09 100644
> > > > > >>>>>> --- a/drivers/tty/hvc/hvc_console.c
> > > > > >>>>>> +++ b/drivers/tty/hvc/hvc_console.c
> > > > > >>>>>> @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
> > > > > >>>>>>   */
> > > > > >>>>>>  static DEFINE_MUTEX(hvc_structs_mutex);
> > > > > >>>>>>
> > > > > >>>>>> +/* Mutex to serialize hvc_open */
> > > > > >>>>>> +static DEFINE_MUTEX(hvc_open_mutex);
> > > > > >>>>>>  /*
> > > > > >>>>>>   * This value is used to assign a tty->index value to a 
> > > > > >>>>>> hvc_struct
> > > > > >>>>>> based
> > > > > >>>>>>   * upon order of exposure via hvc_probe(), when we can not 
> > > > > >>>>>> match it
> > > > > >>>>>> to
> > > > > >>>>>> @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver
> > > > > >>>>>> *driver, struct tty_struct *tty)
> > > > > >>>>>>   */
> > > > > >>>>>>  static int hvc_open(struct tty_struct *tty, struct file * 
> > > > > >>>>>> filp)
> > > > > >>>>>>  {
> > > > > >>>>>> -  struct hvc_struct *hp = 

Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open

2020-05-13 Thread Greg KH
On Tue, May 12, 2020 at 02:39:50PM -0700, rana...@codeaurora.org wrote:
> On 2020-05-12 01:25, Greg KH wrote:
> > On Tue, May 12, 2020 at 09:22:15AM +0200, Jiri Slaby wrote:
> > > On 11. 05. 20, 9:39, Greg KH wrote:
> > > > On Mon, May 11, 2020 at 12:23:58AM -0700, rana...@codeaurora.org wrote:
> > > >> On 2020-05-09 23:48, Greg KH wrote:
> > > >>> On Sat, May 09, 2020 at 06:30:56PM -0700, rana...@codeaurora.org 
> > > >>> wrote:
> > > >>>> On 2020-05-06 02:48, Greg KH wrote:
> > > >>>>> On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta 
> > > >>>>> wrote:
> > > >>>>>> Potentially, hvc_open() can be called in parallel when two tasks 
> > > >>>>>> calls
> > > >>>>>> open() on /dev/hvcX. In such a scenario, if the
> > > >>>>>> hp->ops->notifier_add()
> > > >>>>>> callback in the function fails, where it sets the tty->driver_data 
> > > >>>>>> to
> > > >>>>>> NULL, the parallel hvc_open() can see this NULL and cause a memory
> > > >>>>>> abort.
> > > >>>>>> Hence, serialize hvc_open and check if tty->private_data is NULL
> > > >>>>>> before
> > > >>>>>> proceeding ahead.
> > > >>>>>>
> > > >>>>>> The issue can be easily reproduced by launching two tasks
> > > >>>>>> simultaneously
> > > >>>>>> that does nothing but open() and close() on /dev/hvcX.
> > > >>>>>> For example:
> > > >>>>>> $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
> > > >>>>>>
> > > >>>>>> Signed-off-by: Raghavendra Rao Ananta 
> > > >>>>>> ---
> > > >>>>>>  drivers/tty/hvc/hvc_console.c | 16 ++--
> > > >>>>>>  1 file changed, 14 insertions(+), 2 deletions(-)
> > > >>>>>>
> > > >>>>>> diff --git a/drivers/tty/hvc/hvc_console.c
> > > >>>>>> b/drivers/tty/hvc/hvc_console.c
> > > >>>>>> index 436cc51c92c3..ebe26fe5ac09 100644
> > > >>>>>> --- a/drivers/tty/hvc/hvc_console.c
> > > >>>>>> +++ b/drivers/tty/hvc/hvc_console.c
> > > >>>>>> @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
> > > >>>>>>   */
> > > >>>>>>  static DEFINE_MUTEX(hvc_structs_mutex);
> > > >>>>>>
> > > >>>>>> +/* Mutex to serialize hvc_open */
> > > >>>>>> +static DEFINE_MUTEX(hvc_open_mutex);
> > > >>>>>>  /*
> > > >>>>>>   * This value is used to assign a tty->index value to a hvc_struct
> > > >>>>>> based
> > > >>>>>>   * upon order of exposure via hvc_probe(), when we can not match 
> > > >>>>>> it
> > > >>>>>> to
> > > >>>>>> @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver
> > > >>>>>> *driver, struct tty_struct *tty)
> > > >>>>>>   */
> > > >>>>>>  static int hvc_open(struct tty_struct *tty, struct file * filp)
> > > >>>>>>  {
> > > >>>>>> -  struct hvc_struct *hp = tty->driver_data;
> > > >>>>>> +  struct hvc_struct *hp;
> > > >>>>>>unsigned long flags;
> > > >>>>>>int rc = 0;
> > > >>>>>>
> > > >>>>>> +  mutex_lock(_open_mutex);
> > > >>>>>> +
> > > >>>>>> +  hp = tty->driver_data;
> > > >>>>>> +  if (!hp) {
> > > >>>>>> +  rc = -EIO;
> > > >>>>>> +  goto out;
> > > >>>>>> +  }
> > > >>>>>> +
> > > >>>>>>spin_lock_irqsave(>port.lock, flags);
> > > >>>>>>/* Check and then increment for fast path open. */
> > > >>>>>>if (hp->port.count++ > 0) {
> > >

Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open

2020-05-12 Thread Greg KH
On Tue, May 12, 2020 at 09:22:15AM +0200, Jiri Slaby wrote:
> On 11. 05. 20, 9:39, Greg KH wrote:
> > On Mon, May 11, 2020 at 12:23:58AM -0700, rana...@codeaurora.org wrote:
> >> On 2020-05-09 23:48, Greg KH wrote:
> >>> On Sat, May 09, 2020 at 06:30:56PM -0700, rana...@codeaurora.org wrote:
> >>>> On 2020-05-06 02:48, Greg KH wrote:
> >>>>> On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta wrote:
> >>>>>> Potentially, hvc_open() can be called in parallel when two tasks calls
> >>>>>> open() on /dev/hvcX. In such a scenario, if the
> >>>>>> hp->ops->notifier_add()
> >>>>>> callback in the function fails, where it sets the tty->driver_data to
> >>>>>> NULL, the parallel hvc_open() can see this NULL and cause a memory
> >>>>>> abort.
> >>>>>> Hence, serialize hvc_open and check if tty->private_data is NULL
> >>>>>> before
> >>>>>> proceeding ahead.
> >>>>>>
> >>>>>> The issue can be easily reproduced by launching two tasks
> >>>>>> simultaneously
> >>>>>> that does nothing but open() and close() on /dev/hvcX.
> >>>>>> For example:
> >>>>>> $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
> >>>>>>
> >>>>>> Signed-off-by: Raghavendra Rao Ananta 
> >>>>>> ---
> >>>>>>  drivers/tty/hvc/hvc_console.c | 16 ++--
> >>>>>>  1 file changed, 14 insertions(+), 2 deletions(-)
> >>>>>>
> >>>>>> diff --git a/drivers/tty/hvc/hvc_console.c
> >>>>>> b/drivers/tty/hvc/hvc_console.c
> >>>>>> index 436cc51c92c3..ebe26fe5ac09 100644
> >>>>>> --- a/drivers/tty/hvc/hvc_console.c
> >>>>>> +++ b/drivers/tty/hvc/hvc_console.c
> >>>>>> @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
> >>>>>>   */
> >>>>>>  static DEFINE_MUTEX(hvc_structs_mutex);
> >>>>>>
> >>>>>> +/* Mutex to serialize hvc_open */
> >>>>>> +static DEFINE_MUTEX(hvc_open_mutex);
> >>>>>>  /*
> >>>>>>   * This value is used to assign a tty->index value to a hvc_struct
> >>>>>> based
> >>>>>>   * upon order of exposure via hvc_probe(), when we can not match it
> >>>>>> to
> >>>>>> @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver
> >>>>>> *driver, struct tty_struct *tty)
> >>>>>>   */
> >>>>>>  static int hvc_open(struct tty_struct *tty, struct file * filp)
> >>>>>>  {
> >>>>>> -  struct hvc_struct *hp = tty->driver_data;
> >>>>>> +  struct hvc_struct *hp;
> >>>>>>unsigned long flags;
> >>>>>>int rc = 0;
> >>>>>>
> >>>>>> +  mutex_lock(_open_mutex);
> >>>>>> +
> >>>>>> +  hp = tty->driver_data;
> >>>>>> +  if (!hp) {
> >>>>>> +  rc = -EIO;
> >>>>>> +  goto out;
> >>>>>> +  }
> >>>>>> +
> >>>>>>spin_lock_irqsave(>port.lock, flags);
> >>>>>>/* Check and then increment for fast path open. */
> >>>>>>if (hp->port.count++ > 0) {
> >>>>>>spin_unlock_irqrestore(>port.lock, flags);
> >>>>>>hvc_kick();
> >>>>>> -  return 0;
> >>>>>> +  goto out;
> >>>>>>} /* else count == 0 */
> >>>>>>spin_unlock_irqrestore(>port.lock, flags);
> >>>>>
> >>>>> Wait, why isn't this driver just calling tty_port_open() instead of
> >>>>> trying to open-code all of this?
> >>>>>
> >>>>> Keeping a single mutext for open will not protect it from close, it will
> >>>>> just slow things down a bit.  There should already be a tty lock held by
> >>>>> the tty core for open() to keep it from racing things, right?
> >>>> The tty l

Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open

2020-05-11 Thread Greg KH
On Mon, May 11, 2020 at 12:34:44AM -0700, rana...@codeaurora.org wrote:
> On 2020-05-11 00:23, rana...@codeaurora.org wrote:
> > On 2020-05-09 23:48, Greg KH wrote:
> > > On Sat, May 09, 2020 at 06:30:56PM -0700, rana...@codeaurora.org
> > > wrote:
> > > > On 2020-05-06 02:48, Greg KH wrote:
> > > > > On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta 
> > > > > wrote:
> > > > > > Potentially, hvc_open() can be called in parallel when two tasks 
> > > > > > calls
> > > > > > open() on /dev/hvcX. In such a scenario, if the
> > > > > > hp->ops->notifier_add()
> > > > > > callback in the function fails, where it sets the tty->driver_data 
> > > > > > to
> > > > > > NULL, the parallel hvc_open() can see this NULL and cause a memory
> > > > > > abort.
> > > > > > Hence, serialize hvc_open and check if tty->private_data is NULL
> > > > > > before
> > > > > > proceeding ahead.
> > > > > >
> > > > > > The issue can be easily reproduced by launching two tasks
> > > > > > simultaneously
> > > > > > that does nothing but open() and close() on /dev/hvcX.
> > > > > > For example:
> > > > > > $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
> > > > > >
> > > > > > Signed-off-by: Raghavendra Rao Ananta 
> > > > > > ---
> > > > > >  drivers/tty/hvc/hvc_console.c | 16 ++--
> > > > > >  1 file changed, 14 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/tty/hvc/hvc_console.c
> > > > > > b/drivers/tty/hvc/hvc_console.c
> > > > > > index 436cc51c92c3..ebe26fe5ac09 100644
> > > > > > --- a/drivers/tty/hvc/hvc_console.c
> > > > > > +++ b/drivers/tty/hvc/hvc_console.c
> > > > > > @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
> > > > > >   */
> > > > > >  static DEFINE_MUTEX(hvc_structs_mutex);
> > > > > >
> > > > > > +/* Mutex to serialize hvc_open */
> > > > > > +static DEFINE_MUTEX(hvc_open_mutex);
> > > > > >  /*
> > > > > >   * This value is used to assign a tty->index value to a hvc_struct
> > > > > > based
> > > > > >   * upon order of exposure via hvc_probe(), when we can not match it
> > > > > > to
> > > > > > @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver
> > > > > > *driver, struct tty_struct *tty)
> > > > > >   */
> > > > > >  static int hvc_open(struct tty_struct *tty, struct file * filp)
> > > > > >  {
> > > > > > -   struct hvc_struct *hp = tty->driver_data;
> > > > > > +   struct hvc_struct *hp;
> > > > > > unsigned long flags;
> > > > > > int rc = 0;
> > > > > >
> > > > > > +   mutex_lock(_open_mutex);
> > > > > > +
> > > > > > +   hp = tty->driver_data;
> > > > > > +   if (!hp) {
> > > > > > +   rc = -EIO;
> > > > > > +   goto out;
> > > > > > +   }
> > > > > > +
> > > > > > spin_lock_irqsave(>port.lock, flags);
> > > > > > /* Check and then increment for fast path open. */
> > > > > > if (hp->port.count++ > 0) {
> > > > > > spin_unlock_irqrestore(>port.lock, flags);
> > > > > > hvc_kick();
> > > > > > -   return 0;
> > > > > > +   goto out;
> > > > > > } /* else count == 0 */
> > > > > > spin_unlock_irqrestore(>port.lock, flags);
> > > > >
> > > > > Wait, why isn't this driver just calling tty_port_open() instead of
> > > > > trying to open-code all of this?
> > > > >
> > > > > Keeping a single mutext for open will not protect it from close, it 
> > > > > will
> > > > > just slow things down a bit.  There should already be a tty lock held 
> > > > > by
> > > > > the tty core for open() to keep it from racing things, right?
> > > > The tty lock should have been held, but not likely across
> > > > ->install() and
> > > > ->open() callbacks, thus resulting in a race between
> > > > hvc_install() and
> > > > hvc_open(),
> > > 
> > > How?  The tty lock is held in install, and should not conflict with
> > > open(), otherwise, we would be seeing this happen in all tty drivers,
> > > right?
> > > 
> > Well, I was expecting the same, but IIRC, I see that the open() was
> > being
> > called in parallel for the same device node.
> > 
> > Is it expected that the tty core would allow only one thread to
> > access the dev-node, while blocking the other, or is it the client
> > driver's responsibility to handle the exclusiveness?
> Or is there any optimization going on where the second call doesn't go
> through
> install(), but calls open() directly as the file was already opened by the
> first
> thread?

Yes, it should only happen once, look at the logic in tty_kopen().

greg k-h


Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open

2020-05-11 Thread Greg KH
On Mon, May 11, 2020 at 12:23:58AM -0700, rana...@codeaurora.org wrote:
> On 2020-05-09 23:48, Greg KH wrote:
> > On Sat, May 09, 2020 at 06:30:56PM -0700, rana...@codeaurora.org wrote:
> > > On 2020-05-06 02:48, Greg KH wrote:
> > > > On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta wrote:
> > > > > Potentially, hvc_open() can be called in parallel when two tasks calls
> > > > > open() on /dev/hvcX. In such a scenario, if the
> > > > > hp->ops->notifier_add()
> > > > > callback in the function fails, where it sets the tty->driver_data to
> > > > > NULL, the parallel hvc_open() can see this NULL and cause a memory
> > > > > abort.
> > > > > Hence, serialize hvc_open and check if tty->private_data is NULL
> > > > > before
> > > > > proceeding ahead.
> > > > >
> > > > > The issue can be easily reproduced by launching two tasks
> > > > > simultaneously
> > > > > that does nothing but open() and close() on /dev/hvcX.
> > > > > For example:
> > > > > $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
> > > > >
> > > > > Signed-off-by: Raghavendra Rao Ananta 
> > > > > ---
> > > > >  drivers/tty/hvc/hvc_console.c | 16 ++--
> > > > >  1 file changed, 14 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/drivers/tty/hvc/hvc_console.c
> > > > > b/drivers/tty/hvc/hvc_console.c
> > > > > index 436cc51c92c3..ebe26fe5ac09 100644
> > > > > --- a/drivers/tty/hvc/hvc_console.c
> > > > > +++ b/drivers/tty/hvc/hvc_console.c
> > > > > @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
> > > > >   */
> > > > >  static DEFINE_MUTEX(hvc_structs_mutex);
> > > > >
> > > > > +/* Mutex to serialize hvc_open */
> > > > > +static DEFINE_MUTEX(hvc_open_mutex);
> > > > >  /*
> > > > >   * This value is used to assign a tty->index value to a hvc_struct
> > > > > based
> > > > >   * upon order of exposure via hvc_probe(), when we can not match it
> > > > > to
> > > > > @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver
> > > > > *driver, struct tty_struct *tty)
> > > > >   */
> > > > >  static int hvc_open(struct tty_struct *tty, struct file * filp)
> > > > >  {
> > > > > - struct hvc_struct *hp = tty->driver_data;
> > > > > + struct hvc_struct *hp;
> > > > >   unsigned long flags;
> > > > >   int rc = 0;
> > > > >
> > > > > + mutex_lock(_open_mutex);
> > > > > +
> > > > > + hp = tty->driver_data;
> > > > > + if (!hp) {
> > > > > + rc = -EIO;
> > > > > + goto out;
> > > > > + }
> > > > > +
> > > > >   spin_lock_irqsave(>port.lock, flags);
> > > > >   /* Check and then increment for fast path open. */
> > > > >   if (hp->port.count++ > 0) {
> > > > >   spin_unlock_irqrestore(>port.lock, flags);
> > > > >   hvc_kick();
> > > > > - return 0;
> > > > > + goto out;
> > > > >   } /* else count == 0 */
> > > > >   spin_unlock_irqrestore(>port.lock, flags);
> > > >
> > > > Wait, why isn't this driver just calling tty_port_open() instead of
> > > > trying to open-code all of this?
> > > >
> > > > Keeping a single mutext for open will not protect it from close, it will
> > > > just slow things down a bit.  There should already be a tty lock held by
> > > > the tty core for open() to keep it from racing things, right?
> > > The tty lock should have been held, but not likely across
> > > ->install() and
> > > ->open() callbacks, thus resulting in a race between hvc_install() and
> > > hvc_open(),
> > 
> > How?  The tty lock is held in install, and should not conflict with
> > open(), otherwise, we would be seeing this happen in all tty drivers,
> > right?
> > 
> Well, I was expecting the same, but IIRC, I see that the open() was being
> called in parallel for the same device node.

So open 

Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open

2020-05-10 Thread Greg KH
On Sat, May 09, 2020 at 06:30:56PM -0700, rana...@codeaurora.org wrote:
> On 2020-05-06 02:48, Greg KH wrote:
> > On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta wrote:
> > > Potentially, hvc_open() can be called in parallel when two tasks calls
> > > open() on /dev/hvcX. In such a scenario, if the
> > > hp->ops->notifier_add()
> > > callback in the function fails, where it sets the tty->driver_data to
> > > NULL, the parallel hvc_open() can see this NULL and cause a memory
> > > abort.
> > > Hence, serialize hvc_open and check if tty->private_data is NULL
> > > before
> > > proceeding ahead.
> > > 
> > > The issue can be easily reproduced by launching two tasks
> > > simultaneously
> > > that does nothing but open() and close() on /dev/hvcX.
> > > For example:
> > > $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
> > > 
> > > Signed-off-by: Raghavendra Rao Ananta 
> > > ---
> > >  drivers/tty/hvc/hvc_console.c | 16 ++--
> > >  1 file changed, 14 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/tty/hvc/hvc_console.c
> > > b/drivers/tty/hvc/hvc_console.c
> > > index 436cc51c92c3..ebe26fe5ac09 100644
> > > --- a/drivers/tty/hvc/hvc_console.c
> > > +++ b/drivers/tty/hvc/hvc_console.c
> > > @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
> > >   */
> > >  static DEFINE_MUTEX(hvc_structs_mutex);
> > > 
> > > +/* Mutex to serialize hvc_open */
> > > +static DEFINE_MUTEX(hvc_open_mutex);
> > >  /*
> > >   * This value is used to assign a tty->index value to a hvc_struct
> > > based
> > >   * upon order of exposure via hvc_probe(), when we can not match it
> > > to
> > > @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver
> > > *driver, struct tty_struct *tty)
> > >   */
> > >  static int hvc_open(struct tty_struct *tty, struct file * filp)
> > >  {
> > > - struct hvc_struct *hp = tty->driver_data;
> > > + struct hvc_struct *hp;
> > >   unsigned long flags;
> > >   int rc = 0;
> > > 
> > > + mutex_lock(_open_mutex);
> > > +
> > > + hp = tty->driver_data;
> > > + if (!hp) {
> > > + rc = -EIO;
> > > + goto out;
> > > + }
> > > +
> > >   spin_lock_irqsave(>port.lock, flags);
> > >   /* Check and then increment for fast path open. */
> > >   if (hp->port.count++ > 0) {
> > >   spin_unlock_irqrestore(>port.lock, flags);
> > >   hvc_kick();
> > > - return 0;
> > > + goto out;
> > >   } /* else count == 0 */
> > >   spin_unlock_irqrestore(>port.lock, flags);
> > 
> > Wait, why isn't this driver just calling tty_port_open() instead of
> > trying to open-code all of this?
> > 
> > Keeping a single mutext for open will not protect it from close, it will
> > just slow things down a bit.  There should already be a tty lock held by
> > the tty core for open() to keep it from racing things, right?
> The tty lock should have been held, but not likely across ->install() and
> ->open() callbacks, thus resulting in a race between hvc_install() and
> hvc_open(),

How?  The tty lock is held in install, and should not conflict with
open(), otherwise we would be seeing this happen in all tty drivers,
right?

> where hvc_install() sets a data and the hvc_open() clears it. hvc_open()
> doesn't
> check if the data was set to NULL and proceeds.

What data is being set that hvc_open is checking?

And you are not grabbing a lock in your install callback, you are only
serializing your open call here, I don't see how this is fixing anything
other than perhaps slowing down your codepaths.

As an arument why this isn't correct, can you answer why this same type
of change wouldn't be required for all tty drivers in the tree?

thanks,

greg k-h


Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open

2020-05-06 Thread Greg KH
On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta wrote:
> Potentially, hvc_open() can be called in parallel when two tasks calls
> open() on /dev/hvcX. In such a scenario, if the hp->ops->notifier_add()
> callback in the function fails, where it sets the tty->driver_data to
> NULL, the parallel hvc_open() can see this NULL and cause a memory abort.
> Hence, serialize hvc_open and check if tty->private_data is NULL before
> proceeding ahead.
> 
> The issue can be easily reproduced by launching two tasks simultaneously
> that does nothing but open() and close() on /dev/hvcX.
> For example:
> $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
> 
> Signed-off-by: Raghavendra Rao Ananta 
> ---
>  drivers/tty/hvc/hvc_console.c | 16 ++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
> index 436cc51c92c3..ebe26fe5ac09 100644
> --- a/drivers/tty/hvc/hvc_console.c
> +++ b/drivers/tty/hvc/hvc_console.c
> @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
>   */
>  static DEFINE_MUTEX(hvc_structs_mutex);
>  
> +/* Mutex to serialize hvc_open */
> +static DEFINE_MUTEX(hvc_open_mutex);
>  /*
>   * This value is used to assign a tty->index value to a hvc_struct based
>   * upon order of exposure via hvc_probe(), when we can not match it to
> @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver *driver, 
> struct tty_struct *tty)
>   */
>  static int hvc_open(struct tty_struct *tty, struct file * filp)
>  {
> - struct hvc_struct *hp = tty->driver_data;
> + struct hvc_struct *hp;
>   unsigned long flags;
>   int rc = 0;
>  
> + mutex_lock(_open_mutex);
> +
> + hp = tty->driver_data;
> + if (!hp) {
> + rc = -EIO;
> + goto out;
> + }
> +
>   spin_lock_irqsave(>port.lock, flags);
>   /* Check and then increment for fast path open. */
>   if (hp->port.count++ > 0) {
>   spin_unlock_irqrestore(>port.lock, flags);
>   hvc_kick();
> - return 0;
> + goto out;
>   } /* else count == 0 */
>   spin_unlock_irqrestore(>port.lock, flags);

Wait, why isn't this driver just calling tty_port_open() instead of
trying to open-code all of this?

Keeping a single mutext for open will not protect it from close, it will
just slow things down a bit.  There should already be a tty lock held by
the tty core for open() to keep it from racing things, right?

Try just removing all of this logic and replacing it with a call to
tty_port_open() and see if that fixes this issue.

As "proof" of this, I don't see other serial drivers needing a single
mutex for their open calls, do you?

thanks,

greg k-h


Re: [PATCH v2, RESEND] misc: new driver sram_uapi for user level SRAM access

2020-04-21 Thread Greg KH
On Tue, Apr 21, 2020 at 05:09:47PM +0800, 王文虎 wrote:
> Hi, Greg, Arnd,
> 
> Thank you for your comments first, and then really very very very sorry
> for driving Greg to sigh and I hope there would be chance to share Moutai
> (rather than whisky, we drink it much, a kind of Baijiu), after the virus.
> 
> Back to the comments, I'd like to do a bit of documentation or explanation 
> first,
> which should have been done early or else there would not be so much to 
> explain:
> 1. What I have been trying to do is to access the Freescale Cache-SRAM device 
> form
> user level;
> 2. I implemented it using UIO, which was thought of non-proper;

I still think that using uio is the best way to do this, and never said
it was "non-proper".  All we got bogged down in was the DT
representation of stuff from what I remember.  That should be worked
through.

thanks,

greg k-h


Re: [PATCH v2, RESEND] misc: new driver sram_uapi for user level SRAM access

2020-04-20 Thread Greg KH
On Sun, Apr 19, 2020 at 08:05:38PM -0700, Wang Wenhu wrote:
> A generic User-Kernel interface that allows a misc device created
> by it to support file-operations of ioctl and mmap to access SRAM
> memory from user level. Different kinds of SRAM alloction and free
> APIs could be registered by specific SRAM hardware level driver to
> the available list and then be chosen by users to allocate and map
> SRAM memory from user level.
> 
> It is extremely helpful for the user space applications that require
> high performance memory accesses, such as embedded networking devices
> that would process data in user space, and PowerPC e500 is a case.
> 
> Signed-off-by: Wang Wenhu 
> Cc: Greg Kroah-Hartman 
> Cc: Arnd Bergmann 
> Cc: Christophe Leroy 
> Cc: Scott Wood 
> Cc: Michael Ellerman 
> Cc: Randy Dunlap 
> Cc: linuxppc-dev@lists.ozlabs.org
> ---
> Changes since v1: addressed comments from Arnd
>  * Changed the ioctl cmd definitions using _IO micros
>  * Export interfaces for HW-SRAM drivers to register apis to available list
>  * Modified allocation alignment to PAGE_SIZE
>  * Use phys_addr_t as type of SRAM resource size and offset
>  * Support compat_ioctl
>  * Misc device name:sram
> 
> Note: From this on, the SRAM_UAPI driver is independent to any hardware
> drivers, so I would only commit the patch itself as v2, while the v1 of
> it was wrapped together with patches for Freescale L2-Cache-SRAM device.
> Then after, I'd create patches for Freescale L2-Cache-SRAM device as
> another series.
> 
> Link for v1:
>  * https://lore.kernel.org/lkml/20200418162157.50428-5-wenhu.w...@vivo.com/

Why was this a RESEND?  What was wrong with the original v2 version?

Anyway, minor review comments:

> ---
>  drivers/misc/Kconfig  |  11 ++
>  drivers/misc/Makefile |   1 +
>  drivers/misc/sram_uapi.c  | 352 ++
>  include/linux/sram_uapi.h |  28 +++
>  4 files changed, 392 insertions(+)
>  create mode 100644 drivers/misc/sram_uapi.c
>  create mode 100644 include/linux/sram_uapi.h
> 
> diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
> index 99e151475d8f..b19c8b24f18e 100644
> --- a/drivers/misc/Kconfig
> +++ b/drivers/misc/Kconfig
> @@ -465,6 +465,17 @@ config PVPANIC
> a paravirtualized device provided by QEMU; it lets a virtual machine
> (guest) communicate panic events to the host.
>  
> +config SRAM_UAPI
> + bool "Generic SRAM User Level API driver"
> + help
> +   This driver allows you to create a misc device which could be used
> +   as an interface to allocate SRAM memory from user level.
> +
> +   It is extremely helpful for some user space applications that require
> +   high performance memory accesses.
> +
> +   If unsure, say N.

Naming is hard, but shouldn't this just be "sram" and drop the whole
"UAPI" stuff everywhere?  That doesn't make much sense as drivers are
just interfaces to userspace...


> +
>  source "drivers/misc/c2port/Kconfig"
>  source "drivers/misc/eeprom/Kconfig"
>  source "drivers/misc/cb710/Kconfig"
> diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
> index 9abf2923d831..794447ca07ca 100644
> --- a/drivers/misc/Makefile
> +++ b/drivers/misc/Makefile
> @@ -46,6 +46,7 @@ obj-$(CONFIG_VMWARE_VMCI)   += vmw_vmci/
>  obj-$(CONFIG_LATTICE_ECP3_CONFIG)+= lattice-ecp3-config.o
>  obj-$(CONFIG_SRAM)   += sram.o
>  obj-$(CONFIG_SRAM_EXEC)  += sram-exec.o
> +obj-$(CONFIG_SRAM_UAPI)  += sram_uapi.o
>  obj-y+= mic/
>  obj-$(CONFIG_GENWQE) += genwqe/
>  obj-$(CONFIG_ECHO)   += echo/
> diff --git a/drivers/misc/sram_uapi.c b/drivers/misc/sram_uapi.c
> new file mode 100644
> index ..66c7b56b635f
> --- /dev/null
> +++ b/drivers/misc/sram_uapi.c
> @@ -0,0 +1,352 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2020 Vivo Communication Technology Co. Ltd.
> + * Copyright (C) 2020 Wang Wenhu 
> + * All rights reserved.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define DRIVER_NAME  "sram_uapi"

If you really need this, just use KBUILD_MODNAME instead.  But I don't
think you need it, as most drivers do not.


> +
> +struct res_info {
> + phys_addr_t offset;
> + phys_addr_t size;
> +};
> +
> +struct sram_resource {
> + struct list_headlist;
> + struct res_info info;
> + phys_addr_t phys;
> + void*virt;
> + struct vm_area_struct   *vma;
> + struct sram_uapi*parent;
> +};
> +
> +struct sram_uapi {
> + struct list_headres_list;
> + struct sram_api *sa;
> +};
> +
> +static DEFINE_MUTEX(sram_api_list_lock);
> +static LIST_HEAD(sram_api_list);
> +
> +long sram_api_register(struct sram_api *sa)

Why are all of these functions global?

And shouldn't this return an 'int'?

> +{
> + struct sram_api *cur;
> +
> +  

Re: [PATCH v4,4/4] drivers: uio: new driver for fsl_85xx_cache_sram

2020-04-17 Thread Greg KH
On Thu, Apr 16, 2020 at 11:58:29PM -0500, Scott Wood wrote:
> On Fri, 2020-04-17 at 10:31 +0800, 王文虎 wrote:
> > > > On Thu, 2020-04-16 at 08:35 -0700, Wang Wenhu wrote:
> > > > > +#define UIO_INFO_VER "devicetree,pseudo"
> > > > 
> > > > What does this mean?  Changing a number into a non-obvious string (Why
> > > > "pseudo"?  Why does the UIO user care that the config came from the
> > > > device
> > > > tree?) just to avoid setting off Greg's version number autoresponse
> > > > isn't
> > > > really helping anything.
> > > > 
> > > > > +static const struct of_device_id uio_mpc85xx_l2ctlr_of_match[] = {
> > > > > + {   .compatible = "uio,mpc85xx-cache-sram", },
> > > 
> > > Form is , and "uio" is not a vendor (and never will be).
> > > 
> > 
> > Should have been something like "fsl,mpc85xx-cache-sram-uio", and if it is
> > to be defined with module parameters, this would be user defined.
> > Anyway, , should always be used.
> > 
> > > > > + {},
> > > > > +};
> > > > > +
> > > > > +static struct platform_driver uio_fsl_85xx_cache_sram = {
> > > > > + .probe = uio_fsl_85xx_cache_sram_probe,
> > > > > + .remove = uio_fsl_85xx_cache_sram_remove,
> > > > > + .driver = {
> > > > > + .name = DRIVER_NAME,
> > > > > + .owner = THIS_MODULE,
> > > > > + .of_match_table = uio_mpc85xx_l2ctlr_of_match,
> > > > > + },
> > > > > +};
> > > > 
> > > > Greg's comment notwithstanding, I really don't think this belongs in the
> > > > device tree (and if I do get overruled on that point, it at least needs
> > > > a
> > > > binding document).  Let me try to come up with a patch for dynamic
> > > > allocation.
> > > 
> > > Agreed. "UIO" bindings have long been rejected.
> > > 
> > 
> > Sounds it is. And does the modification below fit well?
> > ---
> > -static const struct of_device_id uio_mpc85xx_l2ctlr_of_match[] = {
> > -   {   .compatible = "uio,mpc85xx-cache-sram", },
> > -   {},
> > +#ifdef CONFIG_OF
> > +static struct of_device_id uio_fsl_85xx_cache_sram_of_match[] = {
> > +   { /* This is filled with module_parm */ },
> > +   { /* Sentinel */ },
> >  };
> > +MODULE_DEVICE_TABLE(of, uio_fsl_85xx_cache_sram_of_match);
> > +module_param_string(of_id, uio_fsl_85xx_cache_sram_of_match[0].compatible,
> > +   sizeof(uio_fsl_85xx_cache_sram_of_match[0].compa
> > tible), 0);
> > +MODULE_PARM_DESC(of_id, "platform device id to be handled by cache-sram-
> > uio");
> > +#endif
> 
> No.  The point is that you wouldn't be configuring this with the device tree
> at all.

Wait, why not?  Don't force people to use module parameters, that is
crazy.  DT describes the hardware involved, if someone wants to bind to
a specific range of memory, as described by DT, why can't they do so?

I can understand not liking the name "uio" in a dt tree, but there's no
reason that DT can not describe what a driver binds to here.

Remember, module parameters are NEVER the answer, this isn't the 1990's
anymore.

thanks,

greg k-h


Re: [PATCH v2,5/5] drivers: uio: new driver for fsl_85xx_cache_sram

2020-04-16 Thread Greg KH
On Wed, Apr 15, 2020 at 02:27:51PM -0500, Scott Wood wrote:
> > > + dev_err(>dev, "error no valid uio-map configured\n");
> > > + ret = -EINVAL;
> > > + goto err_info_free_internel;
> > > + }
> > > +
> > > + info->version = "0.1.0";
> > 
> > Could you define some DRIVER_VERSION in the top of the file next to 
> > DRIVER_NAME instead of hard coding in the middle on a function ?
> 
> That's what v1 had, and Greg KH said to remove it.  I'm guessing that he
> thought it was the common-but-pointless practice of having the driver print a
> version number that never gets updated, rather than something the UIO API
> (unfortunately, compared to a feature query interface) expects.  That said,
> I'm not sure what the value is of making it a macro since it should only be
> used once, that use is self documenting, it isn't tunable, etc.  Though if
> this isn't a macro, UIO_NAME also shouldn't be (and if it is made a macro
> again, it should be UIO_VERSION, not DRIVER_VERSION).
> 
> Does this really need a three-part version scheme?  What's wrong with a
> version of "1", to be changed to "2" in the hopefully-unlikely event that the
> userspace API changes?  Assuming UIO is used for this at all, which doesn't
> seem like a great fit to me.

No driver version numbers at all please, they do not make any sense when
the driver is included in the kernel tree.

greg k-h


Re: [PATCH v2,5/5] drivers: uio: new driver for fsl_85xx_cache_sram

2020-04-16 Thread Greg KH
On Wed, Apr 15, 2020 at 02:26:55PM -0500, Scott Wood wrote:
> Instead, have module parameters that take the sizes and alignments you'd like
> to allocate and expose to userspace.  Better still would be some sort of
> dynamic allocation (e.g. open a fd, ioctl to set the requested size/alignment,
> if it succeeds you can mmap it, and when the fd is closed the region is
> freed).

No module parameters please, this is not the 1990's.

Use device tree, that is what it is there for.

thanks,

greg k-h


Re: [PATCH 4.19] powerpc/powernv/idle: Restore AMR/UAMOR/AMOR after idle

2020-04-15 Thread Greg KH
On Wed, Apr 15, 2020 at 10:40:05PM +1000, Andrew Donnellan wrote:
> From: Michael Ellerman 
> 
> commit 53a712bae5dd919521a58d7bad773b949358add0 upstream.
> 
> In order to implement KUAP (Kernel Userspace Access Protection) on
> Power9 we will be using the AMR, and therefore indirectly the
> UAMOR/AMOR.
> 
> So save/restore these regs in the idle code.
> 
> Signed-off-by: Michael Ellerman 
> [ajd: Backport to 4.19 tree, CVE-2020-11669]
> Signed-off-by: Andrew Donnellan 
> ---
>  arch/powerpc/kernel/idle_book3s.S | 27 +++
>  1 file changed, 23 insertions(+), 4 deletions(-)

This and the 4.14 patch now queued up, thanks.

greg k-h


Re: [PATCH 5/5] drivers: uio: new driver for fsl_85xx_cache_sram

2020-04-15 Thread Greg KH
On Wed, Apr 15, 2020 at 05:33:46AM -0700, Wang Wenhu wrote:
> A driver for freescale 85xx platforms to access the Cache-Sram form
> user level. This is extremely helpful for some user-space applications
> that require high performance memory accesses.
> 
> Cc: Greg Kroah-Hartman 
> Cc: Christophe Leroy 
> Cc: Scott Wood 
> Cc: Michael Ellerman 
> Cc: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Wang Wenhu 
> ---
>  drivers/uio/Kconfig   |   8 ++
>  drivers/uio/Makefile  |   1 +
>  drivers/uio/uio_fsl_85xx_cache_sram.c | 195 ++
>  3 files changed, 204 insertions(+)
>  create mode 100644 drivers/uio/uio_fsl_85xx_cache_sram.c
> 
> diff --git a/drivers/uio/Kconfig b/drivers/uio/Kconfig
> index 202ee81cfc2b..afd38ec13de0 100644
> --- a/drivers/uio/Kconfig
> +++ b/drivers/uio/Kconfig
> @@ -105,6 +105,14 @@ config UIO_NETX
> To compile this driver as a module, choose M here; the module
> will be called uio_netx.
>  
> +config UIO_FSL_85XX_CACHE_SRAM
> + tristate "Freescale 85xx Cache-Sram driver"
> + depends on FSL_85XX_CACHE_SRAM
> + help
> +   Generic driver for accessing the Cache-Sram form user level. This
> +   is extremely helpful for some user-space applications that require
> +   high performance memory accesses.
> +
>  config UIO_FSL_ELBC_GPCM
>   tristate "eLBC/GPCM driver"
>   depends on FSL_LBC
> diff --git a/drivers/uio/Makefile b/drivers/uio/Makefile
> index c285dd2a4539..be2056cffc21 100644
> --- a/drivers/uio/Makefile
> +++ b/drivers/uio/Makefile
> @@ -10,4 +10,5 @@ obj-$(CONFIG_UIO_NETX)  += uio_netx.o
>  obj-$(CONFIG_UIO_PRUSS) += uio_pruss.o
>  obj-$(CONFIG_UIO_MF624) += uio_mf624.o
>  obj-$(CONFIG_UIO_FSL_ELBC_GPCM)  += uio_fsl_elbc_gpcm.o
> +obj-$(CONFIG_UIO_FSL_85XX_CACHE_SRAM)+= uio_fsl_85xx_cache_sram.o
>  obj-$(CONFIG_UIO_HV_GENERIC) += uio_hv_generic.o
> diff --git a/drivers/uio/uio_fsl_85xx_cache_sram.c 
> b/drivers/uio/uio_fsl_85xx_cache_sram.c
> new file mode 100644
> index ..e11202dd5b93
> --- /dev/null
> +++ b/drivers/uio/uio_fsl_85xx_cache_sram.c
> @@ -0,0 +1,195 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2020 Vivo Communication Technology Co. Ltd.
> + * Copyright (C) 2020 Wang Wenhu 
> + * All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation.

Nit, you don't need this sentance anymore now that you have the SPDX
line above

> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define DRIVER_VERSION   "0.1.0"

Don't do DRIVER_VERSIONs, they never work once the code is in the kernel
tree.

> +#define DRIVER_NAME  "uio_fsl_85xx_cache_sram"

KBUILD_MODNAME?

> +#define UIO_NAME "uio_cache_sram"
> +
> +static const struct of_device_id uio_mpc85xx_l2ctlr_of_match[] = {
> + {   .compatible = "uio,fsl,p2020-l2-cache-controller",  },
> + {   .compatible = "uio,fsl,p2010-l2-cache-controller",  },
> + {   .compatible = "uio,fsl,p1020-l2-cache-controller",  },
> + {   .compatible = "uio,fsl,p1011-l2-cache-controller",  },
> + {   .compatible = "uio,fsl,p1013-l2-cache-controller",  },
> + {   .compatible = "uio,fsl,p1022-l2-cache-controller",  },
> + {   .compatible = "uio,fsl,mpc8548-l2-cache-controller",},
> + {   .compatible = "uio,fsl,mpc8544-l2-cache-controller",},
> + {   .compatible = "uio,fsl,mpc8572-l2-cache-controller",},
> + {   .compatible = "uio,fsl,mpc8536-l2-cache-controller",},
> + {   .compatible = "uio,fsl,p1021-l2-cache-controller",  },
> + {   .compatible = "uio,fsl,p1012-l2-cache-controller",  },
> + {   .compatible = "uio,fsl,p1025-l2-cache-controller",  },
> + {   .compatible = "uio,fsl,p1016-l2-cache-controller",  },
> + {   .compatible = "uio,fsl,p1024-l2-cache-controller",  },
> + {   .compatible = "uio,fsl,p1015-l2-cache-controller",  },
> + {   .compatible = "uio,fsl,p1010-l2-cache-controller",  },
> + {   .compatible = "uio,fsl,bsc9131-l2-cache-controller",},
> + {},
> +};
> +
> +static void uio_info_free_internal(struct uio_info *info)
> +{
> + struct uio_mem *uiomem = >mem[0];
> +
> + while (uiomem < >mem[MAX_UIO_MAPS]) {
> + if (uiomem->size) {
> + mpc85xx_cache_sram_free(uiomem->internal_addr);
> + kfree(uiomem->name);
> + }
> + uiomem++;
> + }
> +}
> +
> +static int uio_fsl_85xx_cache_sram_probe(struct platform_device *pdev)
> +{
> + struct device_node *parent = pdev->dev.of_node;
> + struct device_node *node = NULL;
> + struct uio_info *info;
> + struct 

Re: [PATCH v2] Fix: buffer overflow during hvc_alloc().

2020-04-15 Thread Greg KH
On Tue, Apr 14, 2020 at 10:15:03PM +0300, and...@daynix.com wrote:
> From: Andrew Melnychenko 
> 
> If there is a lot(more then 16) of virtio-console devices
> or virtio_console module is reloaded
> - buffers 'vtermnos' and 'cons_ops' are overflowed.
> In older kernels it overruns spinlock which leads to kernel freezing:
> https://bugzilla.redhat.com/show_bug.cgi?id=1786239
> 
> To reproduce the issue, you can try simple script that
> loads/unloads module. Something like this:
> while [ 1 ]
> do
>   modprobe virtio_console
>   sleep 2
>   modprobe -r virtio_console
>   sleep 2
> done
> 
> Description of problem:
> Guest get 'Call Trace' when loading module "virtio_console"
> and unloading it frequently - clearly reproduced on kernel-4.18.0:
> 
> [   81.498208] [ cut here ]
> [   81.499263] pvqspinlock: lock 0x92080020 has corrupted value 
> 0xc0774ca0!
> [   81.501000] WARNING: CPU: 0 PID: 785 at 
> kernel/locking/qspinlock_paravirt.h:500 
> __pv_queued_spin_unlock_slowpath+0xc0/0xd0
> [   81.503173] Modules linked in: virtio_console fuse xt_CHECKSUM 
> ipt_MASQUERADE xt_conntrack ipt_REJECT nft_counter nf_nat_tftp nft_objref 
> nf_conntrack_tftp tun bridge stp llc nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 
> nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct 
> nf_tables_set nft_chain_nat_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 
> nft_chain_route_ipv6 nft_chain_nat_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 
> nf_nat_ipv4 nf_nat nf_conntrack nft_chain_route_ipv4 ip6_tables nft_compat 
> ip_set nf_tables nfnetlink sunrpc bochs_drm drm_vram_helper ttm 
> drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm i2c_piix4 
> pcspkr crct10dif_pclmul crc32_pclmul joydev ghash_clmulni_intel ip_tables xfs 
> libcrc32c sd_mod sg ata_generic ata_piix virtio_net libata crc32c_intel 
> net_failover failover serio_raw virtio_scsi dm_mirror dm_region_hash dm_log 
> dm_mod [last unloaded: virtio_console]
> [   81.517019] CPU: 0 PID: 785 Comm: kworker/0:2 Kdump: loaded Not tainted 
> 4.18.0-167.el8.x86_64 #1
> [   81.518639] Hardware name: Red Hat KVM, BIOS 
> 1.12.0-5.scrmod+el8.2.0+5159+d8aa4d83 04/01/2014
> [   81.520205] Workqueue: events control_work_handler [virtio_console]
> [   81.521354] RIP: 0010:__pv_queued_spin_unlock_slowpath+0xc0/0xd0
> [   81.522450] Code: 07 00 48 63 7a 10 e8 bf 64 f5 ff 66 90 c3 8b 05 e6 cf d6 
> 01 85 c0 74 01 c3 8b 17 48 89 fe 48 c7 c7 38 4b 29 91 e8 3a 6c fa ff <0f> 0b 
> c3 0f 0b 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48
> [   81.525830] RSP: 0018:b51a01ffbd70 EFLAGS: 00010282
> [   81.526798] RAX:  RBX: 0010 RCX: 
> 
> [   81.528110] RDX: 9e66f1826480 RSI: 9e66f1816a08 RDI: 
> 9e66f1816a08
> [   81.529437] RBP: 9153ff10 R08: 026c R09: 
> 0053
> [   81.530732] R10:  R11: b51a01ffbc18 R12: 
> 9e66cd682200
> [   81.532133] R13: 9153ff10 R14: 9e6685569500 R15: 
> 9e66cd682000
> [   81.533442] FS:  () GS:9e66f180() 
> knlGS:
> [   81.534914] CS:  0010 DS:  ES:  CR0: 80050033
> [   81.535971] CR2: 5624c55b14d0 CR3: 0003a023c000 CR4: 
> 003406f0
> [   81.537283] Call Trace:
> [   81.537763]  __raw_callee_save___pv_queued_spin_unlock_slowpath+0x11/0x20
> [   81.539011]  .slowpath+0x9/0xe
> [   81.539585]  hvc_alloc+0x25e/0x300
> [   81.540237]  init_port_console+0x28/0x100 [virtio_console]
> [   81.541251]  handle_control_message.constprop.27+0x1c4/0x310 
> [virtio_console]
> [   81.542546]  control_work_handler+0x70/0x10c [virtio_console]
> [   81.543601]  process_one_work+0x1a7/0x3b0
> [   81.544356]  worker_thread+0x30/0x390
> [   81.545025]  ? create_worker+0x1a0/0x1a0
> [   81.545749]  kthread+0x112/0x130
> [   81.546358]  ? kthread_flush_work_fn+0x10/0x10
> [   81.547183]  ret_from_fork+0x22/0x40
> [   81.547842] ---[ end trace aa97649bd16c8655 ]---
> [   83.546539] general protection fault:  [#1] SMP NOPTI
> [   83.547422] CPU: 5 PID: 3225 Comm: modprobe Kdump: loaded Tainted: G   
>  W- -  - 4.18.0-167.el8.x86_64 #1
> [   83.549191] Hardware name: Red Hat KVM, BIOS 
> 1.12.0-5.scrmod+el8.2.0+5159+d8aa4d83 04/01/2014
> [   83.550544] RIP: 0010:__pv_queued_spin_lock_slowpath+0x19a/0x2a0
> [   83.551504] Code: c4 c1 ea 12 41 be 01 00 00 00 4c 8d 6d 14 41 83 e4 03 8d 
> 42 ff 49 c1 e4 05 48 98 49 81 c4 40 a5 02 00 4c 03 24 c5 60 48 34 91 <49> 89 
> 2c 24 b8 00 80 00 00 eb 15 84 c0 75 0a 41 0f b6 54 24 14 84
> [   83.554449] RSP: 0018:b51a0323fdb0 EFLAGS: 00010202
> [   83.555290] RAX: 301c RBX: 92080020 RCX: 
> 0001
> [   83.556426] RDX: 301d RSI:  RDI: 
> 
> [   83.557556] RBP: 9e66f196a540 R08: 028a R09: 
> 9e66d2757788
> [   83.558688] R10:  R11:  

Re: [PATCH 02/28] staging: android: ion: use vmap instead of vm_map_ram

2020-04-08 Thread Greg KH
On Wed, Apr 08, 2020 at 01:59:00PM +0200, Christoph Hellwig wrote:
> vm_map_ram can keep mappings around after the vm_unmap_ram.  Using that
> with non-PAGE_KERNEL mappings can lead to all kinds of aliasing issues.
> 
> Signed-off-by: Christoph Hellwig 

Acked-by: Greg Kroah-Hartman 


Re: [PATCH 0/6] Memory corruption may occur due to incorrent tlb flush

2020-02-19 Thread Greg KH
On Thu, Feb 20, 2020 at 11:04:51AM +0530, Santosh Sivaraj wrote:
> The TLB flush optimisation (a46cc7a90f: powerpc/mm/radix: Improve TLB/PWC
> flushes) may result in random memory corruption. Any concurrent page-table 
> walk
> could end up with a Use-after-Free. Even on UP this might give issues, since
> mmu_gather is preemptible these days. An interrupt or preempted task accessing
> user pages might stumble into the free page if the hardware caches page
> directories.
> 
> The series is a backport of the fix sent by Peter [1].
> 
> The first three patches are dependencies for the last patch (avoid potential
> double flush). If the performance impact due to double flush is considered
> trivial then the first three patches and last patch may be dropped.
> 
> [1] https://patchwork.kernel.org/cover/11284843/

Can you resend these with the git commit ids of the upstream patches in
them, and say what stable tree(s) you wish to have them applied to?

thanks,

greg k-h


Re: [PATCH v5 3/6] powerpc/fadump: reorganize /sys/kernel/fadump_* sysfs files

2019-12-09 Thread Greg KH
On Mon, Dec 09, 2019 at 10:28:23AM +0530, Sourabh Jain wrote:
> +#define CREATE_SYMLINK(target, symlink_name) do {\
> + rc = compat_only_sysfs_link_entry_to_kobj(kernel_kobj, fadump_kobj, \
> +   target, symlink_name); \
> + if (rc) \
> + pr_err("unable to create %s symlink (%d)", symlink_name, rc); \
> +} while (0)


No need for a macro, just spell it all out.  And properly clean up if an
error happens, you are just printing it out and moving on, which is
probably NOT what you want to do, right?

> +static struct attribute_group fadump_group = {
> + .attrs = fadump_attrs,
> +};

ATTRIBUTE_GROUPS()?

thanks,

greg k-h


Re: [PATCH v4 2/6] sysfs: wrap __compat_only_sysfs_link_entry_to_kobj function to change the symlink name

2019-12-06 Thread Greg KH
On Fri, Dec 06, 2019 at 11:57:53PM +0530, Sourabh Jain wrote:
> 
> 
> On 12/6/19 6:16 PM, Greg KH wrote:
> > On Fri, Dec 06, 2019 at 05:54:30PM +0530, Sourabh Jain wrote:
> >> The __compat_only_sysfs_link_entry_to_kobj function creates a symlink to a
> >> kobject but doesn't provide an option to change the symlink file name.
> >>
> >> This patch adds a wrapper function create_sysfs_symlink_entry_to_kobj that
> >> extends the __compat_only_sysfs_link_entry_to_kobj functionality which
> >> allows function caller to customize the symlink name.
> >>
> >> Signed-off-by: Sourabh Jain 
> >> ---
> >>  fs/sysfs/group.c  | 28 +---
> >>  include/linux/sysfs.h | 12 
> >>  2 files changed, 37 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/fs/sysfs/group.c b/fs/sysfs/group.c
> >> index d41c21fef138..5eb38145b957 100644
> >> --- a/fs/sysfs/group.c
> >> +++ b/fs/sysfs/group.c
> >> @@ -424,6 +424,25 @@ EXPORT_SYMBOL_GPL(sysfs_remove_link_from_group);
> >>  int __compat_only_sysfs_link_entry_to_kobj(struct kobject *kobj,
> >>  struct kobject *target_kobj,
> >>  const char *target_name)
> >> +{
> >> +  return create_sysfs_symlink_entry_to_kobj(kobj, target_kobj,
> >> +  target_name, NULL);
> >> +}
> >> +EXPORT_SYMBOL_GPL(__compat_only_sysfs_link_entry_to_kobj);
> >> +
> >> +/**
> >> + * create_sysfs_symlink_entry_to_kobj - add a symlink to a kobject 
> >> pointing
> >> + * to a group or an attribute
> >> + * @kobj: The kobject containing the group.
> >> + * @target_kobj:  The target kobject.
> >> + * @target_name:  The name of the target group or attribute.
> >> + * @symlink_name: The name of the symlink file (target_name will be
> >> + *considered if symlink_name is NULL).
> >> + */
> >> +int create_sysfs_symlink_entry_to_kobj(struct kobject *kobj,
> >> + struct kobject *target_kobj,
> >> + const char *target_name,
> >> + const char *symlink_name)
> >>  {
> >>struct kernfs_node *target;
> >>struct kernfs_node *entry;
> >> @@ -448,12 +467,15 @@ int __compat_only_sysfs_link_entry_to_kobj(struct 
> >> kobject *kobj,
> >>return -ENOENT;
> >>}
> >>  
> >> -  link = kernfs_create_link(kobj->sd, target_name, entry);
> >> +  if (!symlink_name)
> >> +  symlink_name = target_name;
> >> +
> >> +  link = kernfs_create_link(kobj->sd, symlink_name, entry);
> >>if (IS_ERR(link) && PTR_ERR(link) == -EEXIST)
> >> -  sysfs_warn_dup(kobj->sd, target_name);
> >> +  sysfs_warn_dup(kobj->sd, symlink_name);
> >>  
> >>kernfs_put(entry);
> >>kernfs_put(target);
> >>return PTR_ERR_OR_ZERO(link);
> >>  }
> >> -EXPORT_SYMBOL_GPL(__compat_only_sysfs_link_entry_to_kobj);
> >> +EXPORT_SYMBOL_GPL(create_sysfs_symlink_entry_to_kobj);
> >> diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
> >> index 5420817ed317..123c6f10333a 100644
> >> --- a/include/linux/sysfs.h
> >> +++ b/include/linux/sysfs.h
> >> @@ -300,6 +300,10 @@ void sysfs_remove_link_from_group(struct kobject 
> >> *kobj, const char *group_name,
> >>  int __compat_only_sysfs_link_entry_to_kobj(struct kobject *kobj,
> >>  struct kobject *target_kobj,
> >>  const char *target_name);
> >> +int create_sysfs_symlink_entry_to_kobj(struct kobject *kobj,
> >> + struct kobject *target_kobj,
> >> + const char *target_name,
> >> + const char *symlink_name);
> > 
> > sysfs_create_symlink_entry_to_kobj()?
> > 
> > I can't remember why we put __compat_only there, perhaps because we do
> > not want people to really use this unless you really really have to?
> 
> We don't have much option here. I tried replicating the sysfs files
> in older patch series but creating symlink at old location is much
> better approach.
> 
> The __compat_only_sysfs_link_entry_to_kobj function is pretty generic,
> unable to understand the reason behind restricting its usage.
> 
> > 
> > So then keep compat_only here as well?
> 
> Sure, I will rename the wrapper function.
> 
> But how about changing the function signature instead of creating
> a wrapper function?
> 
> Considering the fact that there are only two places this function
> has called.
> 
> > 
> > What breaks if you remove those undocumented sysfs files?  What
> > userspace tool do you have that will even notice?
> 
> The scripts used in kdump service need those sysfs files to control
> the dump collection. So we can't just move the sysfs files to the
> new location.

If you can not change them, then just document them and live with it.
Why do this extra work to create a symlink for something you will never
use?

greg k-h


Re: [PATCH v4 6/6] powerpc/fadump: sysfs for fadump memory reservation

2019-12-06 Thread Greg KH
On Fri, Dec 06, 2019 at 05:54:34PM +0530, Sourabh Jain wrote:
> Add a sys interface to allow querying the memory reserved by FADump for
> saving the crash dump.
> 
> Also added Documentation/ABI for the new sysfs file.
> 
> Signed-off-by: Sourabh Jain 
> ---
>  Documentation/ABI/testing/sysfs-kernel-fadump|  7 +++
>  Documentation/powerpc/firmware-assisted-dump.rst |  5 +
>  arch/powerpc/kernel/fadump.c | 15 +++
>  3 files changed, 27 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-kernel-fadump 
> b/Documentation/ABI/testing/sysfs-kernel-fadump
> index 5d988b919e81..8f7a64a81783 100644
> --- a/Documentation/ABI/testing/sysfs-kernel-fadump
> +++ b/Documentation/ABI/testing/sysfs-kernel-fadump
> @@ -31,3 +31,10 @@ Description:   write only
>   the system is booted to capture the vmcore using FADump.
>   It is used to release the memory reserved by FADump to
>   save the crash dump.
> +
> +What:/sys/kernel/fadump/mem_reserved
> +Date:Dec 2019
> +Contact: linuxppc-dev@lists.ozlabs.org
> +Description: read only
> + Provide information about the amount of memory reserved by
> + FADump to save the crash dump in bytes.
> diff --git a/Documentation/powerpc/firmware-assisted-dump.rst 
> b/Documentation/powerpc/firmware-assisted-dump.rst
> index 365c10209ef3..04993eaf3113 100644
> --- a/Documentation/powerpc/firmware-assisted-dump.rst
> +++ b/Documentation/powerpc/firmware-assisted-dump.rst
> @@ -268,6 +268,11 @@ Here is the list of files under kernel sysfs:
>  be handled and vmcore will not be captured. This interface can be
>  easily integrated with kdump service start/stop.
>  
> + /sys/kernel/fadump/mem_reserved
> +
> +   This is used to display the memory reserved by FADump for saving the
> +   crash dump.
> +
>   /sys/kernel/fadump_release_mem
>  This file is available only when FADump is active during
>  second kernel. This is used to release the reserved memory
> diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
> index 41a3cda81791..b2af51b7c750 100644
> --- a/arch/powerpc/kernel/fadump.c
> +++ b/arch/powerpc/kernel/fadump.c
> @@ -1357,6 +1357,13 @@ static ssize_t fadump_enabled_show(struct kobject 
> *kobj,
>   return sprintf(buf, "%d\n", fw_dump.fadump_enabled);
>  }
>  
> +static ssize_t fadump_mem_reserved_show(struct kobject *kobj,
> + struct kobj_attribute *attr,
> + char *buf)
> +{
> + return sprintf(buf, "%ld\n", fw_dump.reserve_dump_area_size);
> +}
> +
>  static ssize_t fadump_register_show(struct kobject *kobj,
>   struct kobj_attribute *attr,
>   char *buf)
> @@ -1430,6 +1437,10 @@ static struct kobj_attribute enable_attr = 
> __ATTR(enabled,
>  static struct kobj_attribute register_attr = __ATTR(registered,
>   0644, fadump_register_show,
>   fadump_register_store);
> +static struct kobj_attribute mem_reserved_attr = __ATTR(mem_reserved,
> + 0444, fadump_mem_reserved_show,
> + NULL);

__ATTRI_RO()?

> +
>  
>  DEFINE_SHOW_ATTRIBUTE(fadump_region);
>  
> @@ -1464,6 +1475,10 @@ static void fadump_init_files(void)
>   pr_err("unable to create release_mem sysfs file (%d)\n",
>  rc);
>   }
> + rc = sysfs_create_file(fadump_kobj, _reserved_attr.attr);
> + if (rc)
> + pr_err("unable to create mem_reserved sysfs file (%d)\n",
> +rc);

Again, put it in an attribute group, that would have only required one
line, and not this mess of not cleaning up if something went wrong.

thanks,

greg k-h


Re: [PATCH v4 4/6] powerpc/powernv: move core and fadump_release_opalcore under new kobject

2019-12-06 Thread Greg KH
On Fri, Dec 06, 2019 at 05:54:32PM +0530, Sourabh Jain wrote:
> The /sys/firmware/opal/core and /sys/kernel/fadump_release_opalcore sysfs
> files are used to export and release the OPAL memory on PowerNV platform.
> let's organize them into a new kobject under /sys/firmware/opal/mpipl/
> directory.
> 
> A symlink is added to maintain the backward compatibility for
> /sys/firmware/opal/core sysfs file.
> 
> Signed-off-by: Sourabh Jain 
> ---
>  .../sysfs-kernel-fadump_release_opalcore  |  2 ++
>  .../powerpc/firmware-assisted-dump.rst| 15 +
>  arch/powerpc/platforms/powernv/opal-core.c| 31 ++-
>  3 files changed, 34 insertions(+), 14 deletions(-)
>  rename Documentation/ABI/{testing => 
> removed}/sysfs-kernel-fadump_release_opalcore (82%)
> 
> diff --git a/Documentation/ABI/testing/sysfs-kernel-fadump_release_opalcore 
> b/Documentation/ABI/removed/sysfs-kernel-fadump_release_opalcore
> similarity index 82%
> rename from Documentation/ABI/testing/sysfs-kernel-fadump_release_opalcore
> rename to Documentation/ABI/removed/sysfs-kernel-fadump_release_opalcore
> index 53313c1d4e7a..a8d46cd0f4e6 100644
> --- a/Documentation/ABI/testing/sysfs-kernel-fadump_release_opalcore
> +++ b/Documentation/ABI/removed/sysfs-kernel-fadump_release_opalcore
> @@ -1,3 +1,5 @@
> +This ABI is moved to /sys/firmware/opal/mpipl/release_core.
> +
>  What:/sys/kernel/fadump_release_opalcore
>  Date:Sep 2019
>  Contact: linuxppc-dev@lists.ozlabs.org
> diff --git a/Documentation/powerpc/firmware-assisted-dump.rst 
> b/Documentation/powerpc/firmware-assisted-dump.rst
> index 0455a78486d5..345a3405206e 100644
> --- a/Documentation/powerpc/firmware-assisted-dump.rst
> +++ b/Documentation/powerpc/firmware-assisted-dump.rst
> @@ -112,13 +112,13 @@ to ensure that crash data is preserved to process later.
>  
>  -- On OPAL based machines (PowerNV), if the kernel is build with
> CONFIG_OPAL_CORE=y, OPAL memory at the time of crash is also
> -   exported as /sys/firmware/opal/core file. This procfs file is
> +   exported as /sys/firmware/opal/mpipl/core file. This procfs file is
> helpful in debugging OPAL crashes with GDB. The kernel memory
> used for exporting this procfs file can be released by echo'ing
> -   '1' to /sys/kernel/fadump_release_opalcore node.
> +   '1' to /sys/firmware/opal/mpipl/release_core node.
>  
> e.g.
> - # echo 1 > /sys/kernel/fadump_release_opalcore
> + # echo 1 > /sys/firmware/opal/mpipl/release_core
>  
>  Implementation details:
>  ---
> @@ -283,14 +283,17 @@ Here is the list of files under kernel sysfs:
>  enhanced to use this interface to release the memory reserved for
>  dump and continue without 2nd reboot.
>  
> - /sys/kernel/fadump_release_opalcore
> +Note: /sys/kernel/fadump_release_opalcore sysfs has moved to
> +  /sys/firmware/opal/mpipl/release_core
> +
> + /sys/firmware/opal/mpipl/release_core
>  
>  This file is available only on OPAL based machines when FADump is
>  active during capture kernel. This is used to release the memory
> -used by the kernel to export /sys/firmware/opal/core file. To
> +used by the kernel to export /sys/firmware/opal/mpipl/core file. To
>  release this memory, echo '1' to it:
>  
> -echo 1  > /sys/kernel/fadump_release_opalcore
> +echo 1  > /sys/firmware/opal/mpipl/release_core
>  
>  Here is the list of files under powerpc debugfs:
>  (Assuming debugfs is mounted on /sys/kernel/debug directory.)
> diff --git a/arch/powerpc/platforms/powernv/opal-core.c 
> b/arch/powerpc/platforms/powernv/opal-core.c
> index ed895d82c048..7fcc092d065e 100644
> --- a/arch/powerpc/platforms/powernv/opal-core.c
> +++ b/arch/powerpc/platforms/powernv/opal-core.c
> @@ -589,7 +589,8 @@ static ssize_t fadump_release_opalcore_store(struct 
> kobject *kobj,
>   return count;
>  }
>  
> -static struct kobj_attribute opalcore_rel_attr = 
> __ATTR(fadump_release_opalcore,
> +struct kobject *mpipl_kobj;
> +static struct kobj_attribute opalcore_rel_attr = __ATTR(release_core,
>   0200, NULL,
>   fadump_release_opalcore_store);

__ATTR_WO()?

>  
> @@ -609,7 +610,7 @@ static int __init opalcore_init(void)
>* then capture the dump.
>*/
>   if (!(is_opalcore_usable())) {
> - pr_err("Failed to export /sys/firmware/opal/core\n");
> + pr_err("Failed to export /sys/firmware/opal/mpipl/core\n");
>   opalcore_cleanup();
>   return rc;
>   }
> @@ -617,18 +618,32 @@ static int __init opalcore_init(void)
>   /* Set OPAL core file size */
>   opal_core_attr.size = oc_conf->opalcore_size;
>  
> + mpipl_kobj = kobject_create_and_add("mpipl", opal_kobj);
> + if (!mpipl_kobj) {
> + pr_err("unable to create mpipl kobject\n");
> + return -ENOMEM;
> + }
> +
>

Re: [PATCH v4 1/6] Documentation/ABI: add ABI documentation for /sys/kernel/fadump_*

2019-12-06 Thread Greg KH
On Fri, Dec 06, 2019 at 05:54:29PM +0530, Sourabh Jain wrote:
> Add missing ABI documentation for existing FADump sysfs files.
> 
> Signed-off-by: Sourabh Jain 
> ---
>  Documentation/ABI/testing/sysfs-kernel-fadump_enabled | 7 +++
>  Documentation/ABI/testing/sysfs-kernel-fadump_registered  | 8 
>  Documentation/ABI/testing/sysfs-kernel-fadump_release_mem | 8 
>  .../ABI/testing/sysfs-kernel-fadump_release_opalcore  | 7 +++
>  4 files changed, 30 insertions(+)
>  create mode 100644 Documentation/ABI/testing/sysfs-kernel-fadump_enabled
>  create mode 100644 Documentation/ABI/testing/sysfs-kernel-fadump_registered
>  create mode 100644 Documentation/ABI/testing/sysfs-kernel-fadump_release_mem
>  create mode 100644 
> Documentation/ABI/testing/sysfs-kernel-fadump_release_opalcore
> 
> diff --git a/Documentation/ABI/testing/sysfs-kernel-fadump_enabled 
> b/Documentation/ABI/testing/sysfs-kernel-fadump_enabled
> new file mode 100644
> index ..f73632b1c006
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-kernel-fadump_enabled
> @@ -0,0 +1,7 @@
> +What:/sys/kernel/fadump_enabled

Ugh, no wonder no one wanted to document these, that would have been
noticed right away :(

greg k-h


Re: [PATCH v4 2/6] sysfs: wrap __compat_only_sysfs_link_entry_to_kobj function to change the symlink name

2019-12-06 Thread Greg KH
On Fri, Dec 06, 2019 at 05:54:30PM +0530, Sourabh Jain wrote:
> The __compat_only_sysfs_link_entry_to_kobj function creates a symlink to a
> kobject but doesn't provide an option to change the symlink file name.
> 
> This patch adds a wrapper function create_sysfs_symlink_entry_to_kobj that
> extends the __compat_only_sysfs_link_entry_to_kobj functionality which
> allows function caller to customize the symlink name.
> 
> Signed-off-by: Sourabh Jain 
> ---
>  fs/sysfs/group.c  | 28 +---
>  include/linux/sysfs.h | 12 
>  2 files changed, 37 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/sysfs/group.c b/fs/sysfs/group.c
> index d41c21fef138..5eb38145b957 100644
> --- a/fs/sysfs/group.c
> +++ b/fs/sysfs/group.c
> @@ -424,6 +424,25 @@ EXPORT_SYMBOL_GPL(sysfs_remove_link_from_group);
>  int __compat_only_sysfs_link_entry_to_kobj(struct kobject *kobj,
> struct kobject *target_kobj,
> const char *target_name)
> +{
> + return create_sysfs_symlink_entry_to_kobj(kobj, target_kobj,
> + target_name, NULL);
> +}
> +EXPORT_SYMBOL_GPL(__compat_only_sysfs_link_entry_to_kobj);
> +
> +/**
> + * create_sysfs_symlink_entry_to_kobj - add a symlink to a kobject pointing
> + * to a group or an attribute
> + * @kobj:The kobject containing the group.
> + * @target_kobj: The target kobject.
> + * @target_name: The name of the target group or attribute.
> + * @symlink_name:The name of the symlink file (target_name will be
> + *   considered if symlink_name is NULL).
> + */
> +int create_sysfs_symlink_entry_to_kobj(struct kobject *kobj,
> +struct kobject *target_kobj,
> +const char *target_name,
> +const char *symlink_name)
>  {
>   struct kernfs_node *target;
>   struct kernfs_node *entry;
> @@ -448,12 +467,15 @@ int __compat_only_sysfs_link_entry_to_kobj(struct 
> kobject *kobj,
>   return -ENOENT;
>   }
>  
> - link = kernfs_create_link(kobj->sd, target_name, entry);
> + if (!symlink_name)
> + symlink_name = target_name;
> +
> + link = kernfs_create_link(kobj->sd, symlink_name, entry);
>   if (IS_ERR(link) && PTR_ERR(link) == -EEXIST)
> - sysfs_warn_dup(kobj->sd, target_name);
> + sysfs_warn_dup(kobj->sd, symlink_name);
>  
>   kernfs_put(entry);
>   kernfs_put(target);
>   return PTR_ERR_OR_ZERO(link);
>  }
> -EXPORT_SYMBOL_GPL(__compat_only_sysfs_link_entry_to_kobj);
> +EXPORT_SYMBOL_GPL(create_sysfs_symlink_entry_to_kobj);
> diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
> index 5420817ed317..123c6f10333a 100644
> --- a/include/linux/sysfs.h
> +++ b/include/linux/sysfs.h
> @@ -300,6 +300,10 @@ void sysfs_remove_link_from_group(struct kobject *kobj, 
> const char *group_name,
>  int __compat_only_sysfs_link_entry_to_kobj(struct kobject *kobj,
> struct kobject *target_kobj,
> const char *target_name);
> +int create_sysfs_symlink_entry_to_kobj(struct kobject *kobj,
> +struct kobject *target_kobj,
> +const char *target_name,
> +const char *symlink_name);

sysfs_create_symlink_entry_to_kobj()?

I can't remember why we put __compat_only there, perhaps because we do
not want people to really use this unless you really really have to?

So then keep compat_only here as well?

What breaks if you remove those undocumented sysfs files?  What
userspace tool do you have that will even notice?

thanks,

greg k-h


Re: [PATCH v4 3/6] powerpc/fadump: reorganize /sys/kernel/fadump_* sysfs files

2019-12-06 Thread Greg KH
On Fri, Dec 06, 2019 at 05:54:31PM +0530, Sourabh Jain wrote:
> +static struct kobj_attribute release_attr = __ATTR(release_mem,
>   0200, NULL,
>   fadump_release_memory_store);
> -static struct kobj_attribute fadump_attr = __ATTR(fadump_enabled,
> +static struct kobj_attribute enable_attr = __ATTR(enabled,
>   0444, fadump_enabled_show,
>   NULL);

__ATTR_RO()?

> -static struct kobj_attribute fadump_register_attr = __ATTR(fadump_registered,
> +static struct kobj_attribute register_attr = __ATTR(registered,
>   0644, fadump_register_show,
>   fadump_register_store);

__ATTR_RW()?

And then use an ATTRIBUTE_GROUP() macro to create a group so that you
then can do:

> @@ -1452,11 +1450,47 @@ static void fadump_init_files(void)
>   printk(KERN_ERR "fadump: unable to create debugfs file"
>   " fadump_region\n");
>  
> + rc = sysfs_create_file(fadump_kobj, _attr.attr);
> + if (rc)
> + pr_err("unable to create enabled sysfs file (%d)\n",
> +rc);
> + rc = sysfs_create_file(fadump_kobj, _attr.attr);
> + if (rc)
> + pr_err("unable to create registered sysfs file (%d)\n",
> +rc);
> + if (fw_dump.dump_active) {
> + rc = sysfs_create_file(fadump_kobj, _attr.attr);
> + if (rc)
> + pr_err("unable to create release_mem sysfs file (%d)\n",
> +rc);
> + }

a single call to sysfs_create_groups() here instead of trying to unwind
the mess if something went wrong.

thanks,

greg k-h


Re: [4.14] Backport request: powerpc/perf: Fix IMC_MAX_PMU macro

2019-11-14 Thread Greg KH
On Fri, Nov 15, 2019 at 04:43:05PM +1100, Andrew Donnellan wrote:
> On 15/11/19 4:37 pm, Andrew Donnellan wrote:
> > Dear stable team
> > 
> > Please backport the following patch.
> > 
> > Commit: 7029d1eb0c2c7ee093dc625c679fc277c8eb623b
> >  ("powerpc/perf: Fix IMC_MAX_PMU macro")
> 
> Whoops, this was a local SHA - I meant
> 73ce9aec65b17433e18163d07eb5cb6bf114bd6c.
> 
> There's also 110df8bd3e418b3476cae80babe8add48a8ea523 which is an additional
> fix.

Both now queued up, thanks.

greg k-h


Re: [PATCH stable 4.4] powerpc/boot: Request no dynamic linker for boot wrapper

2019-11-13 Thread Greg KH
On Tue, Nov 12, 2019 at 05:59:41PM +1100, Andrew Donnellan wrote:
> From: Nicholas Piggin 
> 
> Commit ff45000fcb56b5b0f1a14a865d3541746d838a0a upstream.
> 
> The boot wrapper performs its own relocations and does not require
> PT_INTERP segment. However currently we don't tell the linker that.
> 
> Prior to binutils 2.28 that works OK. But since binutils commit
> 1a9ccd70f9a7 ("Fix the linker so that it will not silently generate ELF
> binaries with invalid program headers. Fix readelf to report such
> invalid binaries.") binutils tries to create a program header segment
> due to PT_INTERP, and the link fails because there is no space for it:
> 
>   ld: arch/powerpc/boot/zImage.pseries: Not enough room for program headers, 
> try linking with -N
>   ld: final link failed: Bad value
> 
> So tell the linker not to do that, by passing --no-dynamic-linker.
> 
> Cc: sta...@vger.kernel.org
> Reported-by: Anton Blanchard 
> Signed-off-by: Nicholas Piggin 
> [mpe: Drop dependency on ld-version.sh and massage change log]
> Signed-off-by: Michael Ellerman 
> [ajd: backport to v4.4 (resolve conflict with a comment line)]
> Signed-off-by: Andrew Donnellan 
> ---
>  arch/powerpc/boot/wrapper | 24 +++-
>  1 file changed, 23 insertions(+), 1 deletion(-)

Now queud up, thanks.

greg k-h


Re: [4.4] Backport request: powerpc: Fix compiling a BE kernel with a powerpc64le toolchain

2019-11-13 Thread Greg KH
On Tue, Nov 12, 2019 at 07:52:24PM +1100, Andrew Donnellan wrote:
> Dear stable team
> 
> Please backport the following patches.
> 
> Commits:
> 
> - 164af597ce945751e2dcd53d0a86e84203a6d117
>   ("powerpc/Makefile: Use cflags-y/aflags-y for setting endian options")
> 
> - 4dc831aa88132f835cefe876aa0206977c4d7710
>   ("powerpc: Fix compiling a BE kernel with a powerpc64le toolchain")
> 
> Stable tree targeted: 4.4 (applies cleanly)
> 
> Justification: This fixes the build when attempting to compile a BE powerpc
> kernel using a bi-endian toolchain that defaults to LE, which is a common
> setup.
> 
> I have tested that these patches apply cleanly and appear to rectify the
> build failure on my machine.

Now queued up, thanks.

greg k-h


Re: [PATCH v6] numa: make node_to_cpumask_map() NUMA_NO_NODE aware

2019-10-15 Thread Greg KH
On Tue, Oct 15, 2019 at 06:40:29PM +0800, Yunsheng Lin wrote:
> On 2019/10/14 17:25, Greg KH wrote:
> > On Mon, Oct 14, 2019 at 04:00:46PM +0800, Yunsheng Lin wrote:
> >> On 2019/10/12 18:47, Greg KH wrote:
> >>> On Sat, Oct 12, 2019 at 12:40:01PM +0200, Greg KH wrote:
> >>>> On Sat, Oct 12, 2019 at 05:47:56PM +0800, Yunsheng Lin wrote:
> >>>>> On 2019/10/12 15:40, Greg KH wrote:
> >>>>>> On Sat, Oct 12, 2019 at 02:17:26PM +0800, Yunsheng Lin wrote:
> >>>>>>> add pci and acpi maintainer
> >>>>>>> cc linux-...@vger.kernel.org and linux-a...@vger.kernel.org
> >>>>>>>
> >>>>>>> On 2019/10/11 19:15, Peter Zijlstra wrote:
> >>>>>>>> On Fri, Oct 11, 2019 at 11:27:54AM +0800, Yunsheng Lin wrote:
> >>>>>>>>> But I failed to see why the above is related to making 
> >>>>>>>>> node_to_cpumask_map()
> >>>>>>>>> NUMA_NO_NODE aware?
> >>>>>>>>
> >>>>>>>> Your initial bug is for hns3, which is a PCI device, which really 
> >>>>>>>> _MUST_
> >>>>>>>> have a node assigned.
> >>>>>>>>
> >>>>>>>> It not having one, is a straight up bug. We must not silently accept
> >>>>>>>> NO_NODE there, ever.
> >>>>>>>>
> >>>>>>>
> >>>>>>> I suppose you mean reporting a lack of affinity when the node of a 
> >>>>>>> pcie
> >>>>>>> device is not set by "not silently accept NO_NODE".
> >>>>>>
> >>>>>> If the firmware of a pci device does not provide the node information,
> >>>>>> then yes, warn about that.
> >>>>>>
> >>>>>>> As Greg has asked about in [1]:
> >>>>>>> what is a user to do when the user sees the kernel reporting that?
> >>>>>>>
> >>>>>>> We may tell user to contact their vendor for info or updates about
> >>>>>>> that when they do not know about their system well enough, but their
> >>>>>>> vendor may get away with this by quoting ACPI spec as the spec
> >>>>>>> considering this optional. Should the user believe this is indeed a
> >>>>>>> fw bug or a misreport from the kernel?
> >>>>>>
> >>>>>> Say it is a firmware bug, if it is a firmware bug, that's simple.
> >>>>>>
> >>>>>>> If this kind of reporting is common pratice and will not cause any
> >>>>>>> misunderstanding, then maybe we can report that.
> >>>>>>
> >>>>>> Yes, please do so, that's the only way those boxes are ever going to 
> >>>>>> get
> >>>>>> fixed.  And go add the test to the "firmware testing" tool that is 
> >>>>>> based
> >>>>>> on Linux that Intel has somewhere, to give vendors a chance to fix this
> >>>>>> before they ship hardware.
> >>>>>>
> >>>>>> This shouldn't be a big deal, we warn of other hardware bugs all the
> >>>>>> time.
> >>>>>
> >>>>> Ok, thanks for clarifying.
> >>>>>
> >>>>> Will send a patch to catch the case when a pcie device without numa node
> >>>>> being set and warn about it.
> >>>>>
> >>>>> Maybe use dev->bus to verify if it is a pci device?
> >>>>
> >>>> No, do that in the pci bus core code itself, when creating the devices
> >>>> as that is when you know, or do not know, the numa node, right?
> >>>>
> >>>> This can't be in the driver core only, as each bus type will have a
> >>>> different way of determining what the node the device is on.  For some
> >>>> reason, I thought the PCI core code already does this, right?
> >>>
> >>> Yes, pci_irq_get_node(), which NO ONE CALLS!  I should go delete that
> >>> thing...
> >>>
> >>> Anyway, it looks like the pci core code does call set_dev_node() based
> >>> on the PCI bridge, so if that is set up properly, all should be fine.
> >>>
> >>> If not, well, you 

Re: [PATCH v6] numa: make node_to_cpumask_map() NUMA_NO_NODE aware

2019-10-14 Thread Greg KH
On Mon, Oct 14, 2019 at 11:49:12AM +0200, Peter Zijlstra wrote:
> On Mon, Oct 14, 2019 at 11:25:09AM +0200, Greg KH wrote:
> > Good luck, I don't really think that most, if any, of this is needed,
> > but hey, it's nice to clean it up where it can be :)
> 
> Some of the virtual devices we have (that use devm) really ought to set
> the node too, like drivers/base/cpu.c and driver/base/node.c and
> arguably the cooling devices too (they create a device per cpu).
> 
> The patch I had here:
> 
>   
> https://lkml.kernel.org/r/20190925214526.ga4...@worktop.programming.kicks-ass.net
> 
> takes the more radical approach of requiring a node, except when
> explicitly marked not (the fake devices that don't use devm for
> example).

I like that patch :)

> But yes, PCI and other physical busses really should be having a node
> set, no excuses.

Agreed, at least just warning on the bus creation will make it a bit
less "noisy", in contrast to your patch.  But the messages in your patch
show people just how broken their bioses really are.  Which is always
fun...

> Anyway, I don't think non-physical devices actually use
> cpumask_of_node() much, a quick grep didn't show any.

That's good.

thanks,

greg k-h


Re: [PATCH v6] numa: make node_to_cpumask_map() NUMA_NO_NODE aware

2019-10-14 Thread Greg KH
On Mon, Oct 14, 2019 at 04:00:46PM +0800, Yunsheng Lin wrote:
> On 2019/10/12 18:47, Greg KH wrote:
> > On Sat, Oct 12, 2019 at 12:40:01PM +0200, Greg KH wrote:
> >> On Sat, Oct 12, 2019 at 05:47:56PM +0800, Yunsheng Lin wrote:
> >>> On 2019/10/12 15:40, Greg KH wrote:
> >>>> On Sat, Oct 12, 2019 at 02:17:26PM +0800, Yunsheng Lin wrote:
> >>>>> add pci and acpi maintainer
> >>>>> cc linux-...@vger.kernel.org and linux-a...@vger.kernel.org
> >>>>>
> >>>>> On 2019/10/11 19:15, Peter Zijlstra wrote:
> >>>>>> On Fri, Oct 11, 2019 at 11:27:54AM +0800, Yunsheng Lin wrote:
> >>>>>>> But I failed to see why the above is related to making 
> >>>>>>> node_to_cpumask_map()
> >>>>>>> NUMA_NO_NODE aware?
> >>>>>>
> >>>>>> Your initial bug is for hns3, which is a PCI device, which really 
> >>>>>> _MUST_
> >>>>>> have a node assigned.
> >>>>>>
> >>>>>> It not having one, is a straight up bug. We must not silently accept
> >>>>>> NO_NODE there, ever.
> >>>>>>
> >>>>>
> >>>>> I suppose you mean reporting a lack of affinity when the node of a pcie
> >>>>> device is not set by "not silently accept NO_NODE".
> >>>>
> >>>> If the firmware of a pci device does not provide the node information,
> >>>> then yes, warn about that.
> >>>>
> >>>>> As Greg has asked about in [1]:
> >>>>> what is a user to do when the user sees the kernel reporting that?
> >>>>>
> >>>>> We may tell user to contact their vendor for info or updates about
> >>>>> that when they do not know about their system well enough, but their
> >>>>> vendor may get away with this by quoting ACPI spec as the spec
> >>>>> considering this optional. Should the user believe this is indeed a
> >>>>> fw bug or a misreport from the kernel?
> >>>>
> >>>> Say it is a firmware bug, if it is a firmware bug, that's simple.
> >>>>
> >>>>> If this kind of reporting is common pratice and will not cause any
> >>>>> misunderstanding, then maybe we can report that.
> >>>>
> >>>> Yes, please do so, that's the only way those boxes are ever going to get
> >>>> fixed.  And go add the test to the "firmware testing" tool that is based
> >>>> on Linux that Intel has somewhere, to give vendors a chance to fix this
> >>>> before they ship hardware.
> >>>>
> >>>> This shouldn't be a big deal, we warn of other hardware bugs all the
> >>>> time.
> >>>
> >>> Ok, thanks for clarifying.
> >>>
> >>> Will send a patch to catch the case when a pcie device without numa node
> >>> being set and warn about it.
> >>>
> >>> Maybe use dev->bus to verify if it is a pci device?
> >>
> >> No, do that in the pci bus core code itself, when creating the devices
> >> as that is when you know, or do not know, the numa node, right?
> >>
> >> This can't be in the driver core only, as each bus type will have a
> >> different way of determining what the node the device is on.  For some
> >> reason, I thought the PCI core code already does this, right?
> > 
> > Yes, pci_irq_get_node(), which NO ONE CALLS!  I should go delete that
> > thing...
> > 
> > Anyway, it looks like the pci core code does call set_dev_node() based
> > on the PCI bridge, so if that is set up properly, all should be fine.
> > 
> > If not, well, you have buggy firmware and you need to warn about that at
> > the time you are creating the bridge.  Look at the call to
> > pcibus_to_node() in pci_register_host_bridge().
> 
> Thanks for pointing out the specific function.
> Maybe we do not need to warn about the case when the device has a parent,
> because we must have warned about the parent if the device has a parent
> and the parent also has a node of NO_NODE, so do not need to warn the child
> device anymore? like blew:
> 
> @@ -932,6 +932,10 @@ static int pci_register_host_bridge(struct 
> pci_host_bridge *bridge)
> list_add_tail(>node, _root_buses);
> up_write(_bus_sem);
> 
> +   if (nr_node_ids > 1 && !parent &&

Why

Re: [PATCH v6] numa: make node_to_cpumask_map() NUMA_NO_NODE aware

2019-10-12 Thread Greg KH
On Sat, Oct 12, 2019 at 12:40:01PM +0200, Greg KH wrote:
> On Sat, Oct 12, 2019 at 05:47:56PM +0800, Yunsheng Lin wrote:
> > On 2019/10/12 15:40, Greg KH wrote:
> > > On Sat, Oct 12, 2019 at 02:17:26PM +0800, Yunsheng Lin wrote:
> > >> add pci and acpi maintainer
> > >> cc linux-...@vger.kernel.org and linux-a...@vger.kernel.org
> > >>
> > >> On 2019/10/11 19:15, Peter Zijlstra wrote:
> > >>> On Fri, Oct 11, 2019 at 11:27:54AM +0800, Yunsheng Lin wrote:
> > >>>> But I failed to see why the above is related to making 
> > >>>> node_to_cpumask_map()
> > >>>> NUMA_NO_NODE aware?
> > >>>
> > >>> Your initial bug is for hns3, which is a PCI device, which really _MUST_
> > >>> have a node assigned.
> > >>>
> > >>> It not having one, is a straight up bug. We must not silently accept
> > >>> NO_NODE there, ever.
> > >>>
> > >>
> > >> I suppose you mean reporting a lack of affinity when the node of a pcie
> > >> device is not set by "not silently accept NO_NODE".
> > > 
> > > If the firmware of a pci device does not provide the node information,
> > > then yes, warn about that.
> > > 
> > >> As Greg has asked about in [1]:
> > >> what is a user to do when the user sees the kernel reporting that?
> > >>
> > >> We may tell user to contact their vendor for info or updates about
> > >> that when they do not know about their system well enough, but their
> > >> vendor may get away with this by quoting ACPI spec as the spec
> > >> considering this optional. Should the user believe this is indeed a
> > >> fw bug or a misreport from the kernel?
> > > 
> > > Say it is a firmware bug, if it is a firmware bug, that's simple.
> > > 
> > >> If this kind of reporting is common pratice and will not cause any
> > >> misunderstanding, then maybe we can report that.
> > > 
> > > Yes, please do so, that's the only way those boxes are ever going to get
> > > fixed.  And go add the test to the "firmware testing" tool that is based
> > > on Linux that Intel has somewhere, to give vendors a chance to fix this
> > > before they ship hardware.
> > > 
> > > This shouldn't be a big deal, we warn of other hardware bugs all the
> > > time.
> > 
> > Ok, thanks for clarifying.
> > 
> > Will send a patch to catch the case when a pcie device without numa node
> > being set and warn about it.
> > 
> > Maybe use dev->bus to verify if it is a pci device?
> 
> No, do that in the pci bus core code itself, when creating the devices
> as that is when you know, or do not know, the numa node, right?
> 
> This can't be in the driver core only, as each bus type will have a
> different way of determining what the node the device is on.  For some
> reason, I thought the PCI core code already does this, right?

Yes, pci_irq_get_node(), which NO ONE CALLS!  I should go delete that
thing...

Anyway, it looks like the pci core code does call set_dev_node() based
on the PCI bridge, so if that is set up properly, all should be fine.

If not, well, you have buggy firmware and you need to warn about that at
the time you are creating the bridge.  Look at the call to
pcibus_to_node() in pci_register_host_bridge().

And yes, you need to do this all on a per-bus-type basis, as has been
pointed out.  It's up to the bus to create the device and set this up
properly.

thanks,

greg k-h


Re: [PATCH v6] numa: make node_to_cpumask_map() NUMA_NO_NODE aware

2019-10-12 Thread Greg KH
On Sat, Oct 12, 2019 at 05:47:56PM +0800, Yunsheng Lin wrote:
> On 2019/10/12 15:40, Greg KH wrote:
> > On Sat, Oct 12, 2019 at 02:17:26PM +0800, Yunsheng Lin wrote:
> >> add pci and acpi maintainer
> >> cc linux-...@vger.kernel.org and linux-a...@vger.kernel.org
> >>
> >> On 2019/10/11 19:15, Peter Zijlstra wrote:
> >>> On Fri, Oct 11, 2019 at 11:27:54AM +0800, Yunsheng Lin wrote:
> >>>> But I failed to see why the above is related to making 
> >>>> node_to_cpumask_map()
> >>>> NUMA_NO_NODE aware?
> >>>
> >>> Your initial bug is for hns3, which is a PCI device, which really _MUST_
> >>> have a node assigned.
> >>>
> >>> It not having one, is a straight up bug. We must not silently accept
> >>> NO_NODE there, ever.
> >>>
> >>
> >> I suppose you mean reporting a lack of affinity when the node of a pcie
> >> device is not set by "not silently accept NO_NODE".
> > 
> > If the firmware of a pci device does not provide the node information,
> > then yes, warn about that.
> > 
> >> As Greg has asked about in [1]:
> >> what is a user to do when the user sees the kernel reporting that?
> >>
> >> We may tell user to contact their vendor for info or updates about
> >> that when they do not know about their system well enough, but their
> >> vendor may get away with this by quoting ACPI spec as the spec
> >> considering this optional. Should the user believe this is indeed a
> >> fw bug or a misreport from the kernel?
> > 
> > Say it is a firmware bug, if it is a firmware bug, that's simple.
> > 
> >> If this kind of reporting is common pratice and will not cause any
> >> misunderstanding, then maybe we can report that.
> > 
> > Yes, please do so, that's the only way those boxes are ever going to get
> > fixed.  And go add the test to the "firmware testing" tool that is based
> > on Linux that Intel has somewhere, to give vendors a chance to fix this
> > before they ship hardware.
> > 
> > This shouldn't be a big deal, we warn of other hardware bugs all the
> > time.
> 
> Ok, thanks for clarifying.
> 
> Will send a patch to catch the case when a pcie device without numa node
> being set and warn about it.
> 
> Maybe use dev->bus to verify if it is a pci device?

No, do that in the pci bus core code itself, when creating the devices
as that is when you know, or do not know, the numa node, right?

This can't be in the driver core only, as each bus type will have a
different way of determining what the node the device is on.  For some
reason, I thought the PCI core code already does this, right?

thanks,

greg k-h


Re: [PATCH v6] numa: make node_to_cpumask_map() NUMA_NO_NODE aware

2019-10-12 Thread Greg KH
On Sat, Oct 12, 2019 at 02:17:26PM +0800, Yunsheng Lin wrote:
> add pci and acpi maintainer
> cc linux-...@vger.kernel.org and linux-a...@vger.kernel.org
> 
> On 2019/10/11 19:15, Peter Zijlstra wrote:
> > On Fri, Oct 11, 2019 at 11:27:54AM +0800, Yunsheng Lin wrote:
> >> But I failed to see why the above is related to making 
> >> node_to_cpumask_map()
> >> NUMA_NO_NODE aware?
> > 
> > Your initial bug is for hns3, which is a PCI device, which really _MUST_
> > have a node assigned.
> > 
> > It not having one, is a straight up bug. We must not silently accept
> > NO_NODE there, ever.
> > 
> 
> I suppose you mean reporting a lack of affinity when the node of a pcie
> device is not set by "not silently accept NO_NODE".

If the firmware of a pci device does not provide the node information,
then yes, warn about that.

> As Greg has asked about in [1]:
> what is a user to do when the user sees the kernel reporting that?
> 
> We may tell user to contact their vendor for info or updates about
> that when they do not know about their system well enough, but their
> vendor may get away with this by quoting ACPI spec as the spec
> considering this optional. Should the user believe this is indeed a
> fw bug or a misreport from the kernel?

Say it is a firmware bug, if it is a firmware bug, that's simple.

> If this kind of reporting is common pratice and will not cause any
> misunderstanding, then maybe we can report that.

Yes, please do so, that's the only way those boxes are ever going to get
fixed.  And go add the test to the "firmware testing" tool that is based
on Linux that Intel has somewhere, to give vendors a chance to fix this
before they ship hardware.

This shouldn't be a big deal, we warn of other hardware bugs all the
time.

thanks,

greg k-h


Re: [PATCH] powerpc/xive: Fix bogus error code returned by OPAL

2019-09-23 Thread Greg KH
On Mon, Sep 23, 2019 at 08:29:40AM +0200, Greg Kurz wrote:
> There's a bug in skiboot that causes the OPAL_XIVE_ALLOCATE_IRQ call
> to return the 32-bit value 0x when OPAL has run out of IRQs.
> Unfortunatelty, OPAL return values are signed 64-bit entities and
> errors are supposed to be negative. If that happens, the linux code
> confusingly treats 0x as a valid IRQ number and panics at some
> point.
> 
> A fix was recently merged in skiboot:
> 
> e97391ae2bb5 ("xive: fix return value of opal_xive_allocate_irq()")
> 
> but we need a workaround anyway to support older skiboots already
> in the field.
> 
> Internally convert 0x to OPAL_RESOURCE which is the usual error
> returned upon resource exhaustion.
> 
> Cc: sta...@vger.kernel.org # v4.12+
> Signed-off-by: Greg Kurz 
> Reviewed-by: Cédric Le Goater 
> Signed-off-by: Michael Ellerman 
> Link: 
> https://lore.kernel.org/r/156821713818.1985334.14123187368108582810.st...@bahia.lan
> (cherry picked from commit 6ccb4ac2bf8a35c694ead92f8ac5530a16e8f2c8,
>  groug: fix arch/powerpc/platforms/powernv/opal-wrappers.S instead of
> non-existing arch/powerpc/platforms/powernv/opal-call.c)
> Signed-off-by: Greg Kurz 
> ---
> 
> This is for 4.14 and 4.19.

Thanks for the backport, now queued up.

greg k-h


Re: [WIP RFC PATCH 0/6] Generic Firmware Variable Filesystem

2019-06-05 Thread Greg KH
On Tue, Jun 04, 2019 at 01:05:45PM -0700, Matthew Garrett wrote:
> On Tue, Jun 4, 2019 at 1:01 PM Nayna  wrote:
> > It seems efivars were first implemented in sysfs and then later
> > separated out as efivarfs.
> > Refer - Documentation/filesystems/efivarfs.txt.
> >
> > So, the reason wasn't that sysfs should not be used for exposing
> > firmware variables,
> > but for the size limitations which seems to come from UEFI Specification.
> >
> > Is this limitation valid for the new requirement of secure variables ?
> 
> I don't think the size restriction is an issue now, but there's a lot
> of complex semantics around variable deletion and immutability that
> need to be represented somehow.

Ah, yeah, that's the reason it would not work in sysfs, forgot all about
that, thanks.

greg k-h


Re: [WIP RFC PATCH 0/6] Generic Firmware Variable Filesystem

2019-06-05 Thread Greg KH
On Tue, Jun 04, 2019 at 04:33:14PM -0400, Nayna wrote:
> 
> 
> On 06/03/2019 03:29 AM, Greg KH wrote:
> > On Mon, Jun 03, 2019 at 04:04:32PM +1000, Daniel Axtens wrote:
> > > Hi Nayna,
> > > 
> > > > > As PowerNV moves towards secure boot, we need a place to put secure
> > > > > variables. One option that has been canvassed is to make our secure
> > > > > variables look like EFI variables. This is an early sketch of another
> > > > > approach where we create a generic firmware variable file system,
> > > > > fwvarfs, and an OPAL Secure Variable backend for it.
> > > > Is there a need of new filesystem ? I am wondering why can't these be
> > > > exposed via sysfs / securityfs ?
> > > > Probably, something like... /sys/firmware/secureboot or
> > > > /sys/kernel/security/secureboot/  ?
> > > I suppose we could put secure variables in sysfs, but I'm not sure
> > > that's what sysfs was intended for. I understand sysfs as "a
> > > filesystem-based view of kernel objects" (from
> > > Documentation/filesystems/configfs/configfs.txt), and I don't think a
> > > secure variable is really a kernel object in the same way most other
> > > things in sysfs are... but I'm open to being convinced.
> > What makes them more "secure" than anything else that is in sysfs today?
> > I didn't see anything in this patchset that provided "additional
> > security", did I miss it?
> > 
> > > securityfs seems to be reserved for LSMs, I don't think we can put
> > > things there.
> > Yeah, I wouldn't mess with that.
> 
> Thanks Greg for clarifying!! I am curious, the TPM exposes the BIOS
> event log to userspace via securityfs. Is there a reason for not
> exposing these security variables to userspace via securityfs as well?

securityfs is for LSMs to use.  If the TPM drivers also use it, well,
that's between those authors and the securityfs developers.

BIOS/firmware variables are a much different thing than a TPM log.

thanks,

greg k-h


Re: [PATCH BACKPORTv2 4.19, 5.0, 5.1] crypto: vmx - ghash: do nosimd fallback manually

2019-06-03 Thread Greg KH
On Mon, Jun 03, 2019 at 12:08:48PM +1000, Daniel Axtens wrote:
> commit 357d065a44cdd77ed5ff35155a989f2a763e96ef upstream.
> [backported: the VMX driver did not use crypto_simd_usable() until
>  after 5.1]
> 
> VMX ghash was using a fallback that did not support interleaving simd
> and nosimd operations, leading to failures in the extended test suite.
> 
> If I understood correctly, Eric's suggestion was to use the same
> data format that the generic code uses, allowing us to call into it
> with the same contexts. I wasn't able to get that to work - I think
> there's a very different key structure and data layout being used.
> 
> So instead steal the arm64 approach and perform the fallback
> operations directly if required.
> 
> Fixes: cc333cd68dfa ("crypto: vmx - Adding GHASH routines for VMX module")
> Cc: sta...@vger.kernel.org # v4.1+
> Reported-by: Eric Biggers 
> Signed-off-by: Daniel Axtens 
> Acked-by: Ard Biesheuvel 
> Tested-by: Michael Ellerman 
> Signed-off-by: Herbert Xu 
> Signed-off-by: Daniel Axtens 
> ---
> 
> v2: do stable backport form correctly.

Thanks for all of these, now queued up.

greg k-h


Re: [WIP RFC PATCH 0/6] Generic Firmware Variable Filesystem

2019-06-03 Thread Greg KH
On Mon, Jun 03, 2019 at 04:04:32PM +1000, Daniel Axtens wrote:
> Hi Nayna,
> 
> >> As PowerNV moves towards secure boot, we need a place to put secure
> >> variables. One option that has been canvassed is to make our secure
> >> variables look like EFI variables. This is an early sketch of another
> >> approach where we create a generic firmware variable file system,
> >> fwvarfs, and an OPAL Secure Variable backend for it.
> >
> > Is there a need of new filesystem ? I am wondering why can't these be 
> > exposed via sysfs / securityfs ?
> > Probably, something like... /sys/firmware/secureboot or 
> > /sys/kernel/security/secureboot/  ?
> 
> I suppose we could put secure variables in sysfs, but I'm not sure
> that's what sysfs was intended for. I understand sysfs as "a
> filesystem-based view of kernel objects" (from
> Documentation/filesystems/configfs/configfs.txt), and I don't think a
> secure variable is really a kernel object in the same way most other
> things in sysfs are... but I'm open to being convinced.

What makes them more "secure" than anything else that is in sysfs today?
I didn't see anything in this patchset that provided "additional
security", did I miss it?

> securityfs seems to be reserved for LSMs, I don't think we can put
> things there.

Yeah, I wouldn't mess with that.

I would just recommend putting this in sysfs.  Make a new subsystem
(i.e. class) and away you go.

> My hope with fwvarfs is to provide a generic place for firmware
> variables so that we don't need to expand the list of firmware-specific
> filesystems beyond efivarfs. I am also aiming to make things simple to
> use so that people familiar with firmware don't also have to become
> familiar with filesystem code in order to expose firmware variables to
> userspace.

Why would anyone need to be writing new code to firmware variables that
makes it any different from any other kernel change?

> > Also, it sounds like this is needed only for secure firmware variables 
> > and does not include
> > other firmware variables which are not security relevant ? Is that 
> > correct understanding ?
> 
> The primary use case at the moment - OPAL secure variables - is security
> focused because the current OPAL secure variable design stores and
> manipulates secure variables separately from the rest of nvram. This
> isn't an inherent feature of fwvarfs.

Again, why not just put it in sysfs please?

> fwvarfs can also be used for variables that are not security relevant as
> well. For example, with the EFI backend (patch 3), both secure and
> insecure variables can be read.

I don't remember why efi variables were not put in sysfs, I think there
was some reasoning behind it originally.  Perhaps look in the linux-efi
archives.

thanks,

greg k-h


Re: Linux 5.1-rc5

2019-05-20 Thread Greg KH
On Thu, May 02, 2019 at 05:10:55PM +0200, Martin Schwidefsky wrote:
> On Thu, 2 May 2019 16:31:10 +0200
> Greg KH  wrote:
> 
> > On Thu, May 02, 2019 at 04:17:58PM +0200, Martin Schwidefsky wrote:
> > > On Thu, 2 May 2019 14:21:28 +0200
> > > Greg KH  wrote:
> > >   
> > > > On Mon, Apr 15, 2019 at 09:17:10AM -0700, Linus Torvalds wrote:  
> > > > > On Sun, Apr 14, 2019 at 10:19 PM Christoph Hellwig 
> > > > >  wrote:
> > > > > >
> > > > > > Can we please have the page refcount overflow fixes out on the list
> > > > > > for review, even if it is after the fact?
> > > > > 
> > > > > They were actually on a list for review long before the fact, but it
> > > > > was the security mailing list. The issue actually got discussed back
> > > > > in January along with early versions of the patches, but then we
> > > > > dropped the ball because it just wasn't on anybody's radar and it got
> > > > > resurrected late March. Willy wrote a rather bigger patch-series, and
> > > > > review of that is what then resulted in those commits. So they may
> > > > > look recent, but that's just because the original patches got
> > > > > seriously edited down and rewritten.
> > > > > 
> > > > > That said, powerpc and s390 should at least look at maybe adding a
> > > > > check for the page ref in their gup paths too. Powerpc has the special
> > > > > gup_hugepte() case, and s390 has its own version of gup entirely. I
> > > > > was actually hoping the s390 guys would look at using the generic gup
> > > > > code.
> > > > > 
> > > > > I ruthlessly also entirely ignored MIPS, SH and sparc, since they seem
> > > > > largely irrelevant, partly since even theoretically this whole issue
> > > > > needs a _lot_ of memory.
> > > > > 
> > > > > Michael, Martin, see commit 6b3a70773630 ("Merge branch 'page-refs'
> > > > > (page ref overflow)"). You may or may not really care.
> > > > 
> > > > I've now queued these patches up for the next round of stable releases,
> > > > as some people seem to care about these.
> > > > 
> > > > I didn't see any follow-on patches for s390 or ppc64 hit the tree for
> > > > these changes, am I just missing them and should also queue up a few
> > > > more to handle this issue on those platforms?  
> > > 
> > > I fixed that with a different approach. The following two patches are
> > > queued for the next merge window:
> > > 
> > > d1874a0c2805 "s390/mm: make the pxd_offset functions more robust"
> > > 1a42010cdc26 "s390/mm: convert to the generic get_user_pages_fast code"
> > > 
> > > With these two s390 now uses the generic gup code in mm/gup.c  
> > 
> > Nice!  Do you want me to queue those up for the stable backports once
> > they hit a public -rc release?
> 
> Yes please!

Now queued up to 5.0 and 5.1, but did not apply to 4.19 or older :(

thanks,

greg k-h


Re: [PATCH] kbuild: do not check name uniqueness of builtin modules

2019-05-20 Thread Greg KH
On Mon, May 20, 2019 at 11:54:37AM +0900, Masahiro Yamada wrote:
> I just thought it was a good idea to scan builtin.modules in the name
> uniqueness checking, but Stephen reported a false positive.
> 
> ppc64_defconfig produces:
> 
>   warning: same basename if the following are built as modules:
> arch/powerpc/platforms/powermac/nvram.ko
> drivers/char/nvram.ko
> 
> ..., which is a false positive because the former is never built as
> a module as you see in arch/powerpc/platforms/powermac/Makefile:
> 
>   # CONFIG_NVRAM is an arch. independent tristate symbol, for pmac32 we really
>   # need this to be a bool.  Cheat here and pretend CONFIG_NVRAM=m is really
>   # CONFIG_NVRAM=y
>   obj-$(CONFIG_NVRAM:m=y) += nvram.o
> 
> Since we cannot predict how tricky Makefiles are written in wild,
> builtin.modules may potentially contain false positives. I do not
> think it is a big deal as far as kmod is concerned, but false positive
> warnings in the kernel build makes people upset. It is better to not
> do it.
> 
> Even without checking builtin.modules, we have enough (and more solid)
> test coverage with allmodconfig.
> 
> While I touched this part, I replaced the sed code with neater one
> provided by Stephen.
> 
> Link: https://lkml.org/lkml/2019/5/19/120
> Link: https://lkml.org/lkml/2019/5/19/123
> Fixes: 3a48a91901c5 ("kbuild: check uniqueness of module names")
> Reported-by: Stephen Rothwell 
> Signed-off-by: Masahiro Yamada 
> ---

Reviewed-by: Greg Kroah-Hartman 


Re: [PATCH stable 4.4] powerpc/lib: fix book3s/32 boot failure due to code patching

2019-05-15 Thread Greg KH
On Wed, May 15, 2019 at 01:30:42PM +, Christophe Leroy wrote:
> [Backport of upstream commit b45ba4a51cde29b2939365ef0c07ad34c8321789]
> 
> On powerpc32, patch_instruction() is called by apply_feature_fixups()
> which is called from early_init()
> 
> There is the following note in front of early_init():
>  * Note that the kernel may be running at an address which is different
>  * from the address that it was linked at, so we must use RELOC/PTRRELOC
>  * to access static data (including strings).  -- paulus
> 
> Therefore init_mem_is_free must be accessed with PTRRELOC()
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=203597
> Signed-off-by: Christophe Leroy 
> 
> ---
> Can't apply the upstream commit as such due to several other unrelated stuff
> like for instance STRICT_KERNEL_RWX which are missing.
> So instead, using same approach as for commit 
> 252eb55816a6f69ef9464cad303cdb3326cdc61d
> 
> Removed the Fixes: tag as I don't know yet the commit Id of the fixed commit 
> on 4.4 branch.
> ---
>  arch/powerpc/lib/code-patching.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Now added, thanks.

greg k-h


Re: [PATCH stable 4.9] powerpc/lib: fix book3s/32 boot failure due to code patching

2019-05-15 Thread Greg KH
On Wed, May 15, 2019 at 02:35:36PM +0200, Christophe Leroy wrote:
> 
> 
> Le 15/05/2019 à 10:29, Greg KH a écrit :
> > On Wed, May 15, 2019 at 06:40:47AM +, Christophe Leroy wrote:
> > > [Backport of upstream commit b45ba4a51cde29b2939365ef0c07ad34c8321789]
> > > 
> > > On powerpc32, patch_instruction() is called by apply_feature_fixups()
> > > which is called from early_init()
> > > 
> > > There is the following note in front of early_init():
> > >   * Note that the kernel may be running at an address which is different
> > >   * from the address that it was linked at, so we must use RELOC/PTRRELOC
> > >   * to access static data (including strings).  -- paulus
> > > 
> > > Therefore init_mem_is_free must be accessed with PTRRELOC()
> > > 
> > > Fixes: 1c38a84d4586 ("powerpc: Avoid code patching freed init sections")
> > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=203597
> > > Signed-off-by: Christophe Leroy 
> > > 
> > > ---
> > > Can't apply the upstream commit as such due to several other unrelated 
> > > stuff
> > > like for instance STRICT_KERNEL_RWX which are missing.
> > > So instead, using same approach as for commit 
> > > 252eb55816a6f69ef9464cad303cdb3326cdc61d
> > 
> > Now queued up, thanks.
> > 
> 
> Should go to 4.4 as well since the commit it fixes is now queued for 4.4
> ([PATCH 4.4 056/266] powerpc: Avoid code patching freed init sections)

Ok, can someone send me a backport that actually applies there?

thanks,

greg k-h


Re: [PATCH stable 4.9] powerpc/lib: fix book3s/32 boot failure due to code patching

2019-05-15 Thread Greg KH
On Wed, May 15, 2019 at 06:40:47AM +, Christophe Leroy wrote:
> [Backport of upstream commit b45ba4a51cde29b2939365ef0c07ad34c8321789]
> 
> On powerpc32, patch_instruction() is called by apply_feature_fixups()
> which is called from early_init()
> 
> There is the following note in front of early_init():
>  * Note that the kernel may be running at an address which is different
>  * from the address that it was linked at, so we must use RELOC/PTRRELOC
>  * to access static data (including strings).  -- paulus
> 
> Therefore init_mem_is_free must be accessed with PTRRELOC()
> 
> Fixes: 1c38a84d4586 ("powerpc: Avoid code patching freed init sections")
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=203597
> Signed-off-by: Christophe Leroy 
> 
> ---
> Can't apply the upstream commit as such due to several other unrelated stuff
> like for instance STRICT_KERNEL_RWX which are missing.
> So instead, using same approach as for commit 
> 252eb55816a6f69ef9464cad303cdb3326cdc61d

Now queued up, thanks.

greg k-h


Re: [Bug 203597] New: kernel 4.9.175 fails to boot on a PowerMac G4 3,6 at early stage

2019-05-14 Thread Greg KH
On Tue, May 14, 2019 at 06:56:03AM +0200, Christophe Leroy wrote:
> Hi Greg,
> 
> Could you please apply b45ba4a51cde29b2939365ef0c07ad34c8321789 to 4.9 since
> 51c3c62b58b357e8d35e4cc32f7b4ec907426fe3 was applied allthought marked for
> #4.13+

It does not apply there (nor to the 4.4.y queue, which will need it as
well), so can you provide a working backport so that I can queue it up?

thanks,

greg k-h


Re: [PATCH 03/16] lib,treewide: add new match_string() helper/macro

2019-05-08 Thread Greg KH
On Wed, May 08, 2019 at 04:11:28PM +0300, Andy Shevchenko wrote:
> On Wed, May 08, 2019 at 02:28:29PM +0300, Alexandru Ardelean wrote:
> > This change re-introduces `match_string()` as a macro that uses
> > ARRAY_SIZE() to compute the size of the array.
> > The macro is added in all the places that do
> > `match_string(_a, ARRAY_SIZE(_a), s)`, since the change is pretty
> > straightforward.
> 
> Can you split include/linux/ change from the rest?

That would break the build, why do you want it split out?  This makes
sense all as a single patch to me.

thanks,

greg k-h


Re: [PATCH v2] powerpc/powernv: Restrict OPAL symbol map to only be readable by root

2019-05-03 Thread Greg KH
On Fri, May 03, 2019 at 06:27:18PM +1000, Andrew Donnellan wrote:
> On 3/5/19 5:59 pm, Greg KH wrote:>> -static BIN_ATTR_RO(symbol_map, 0);
> > > +static struct bin_attribute symbol_map_attr = {
> > > + .attr = {.name = "symbol_map", .mode = 0400},
> > > + .read = symbol_map_read
> > > +};
> > 
> > There's no real need to rename the structure, right?  Why not just keep
> > the bin_attr_symbol_map name?  That would make this patch even smaller.
> 
> No real need but it's locally more consistent with the rest of the PPC code.
> (Though perhaps the other cases should use the BIN_ATTR macro...)
> 
> Given this is for stable I'm happy to change that if the smaller patch is
> more acceptable.

stable doesn't care, and if this is more consistent, that's fine with
me, I didn't see the larger picture here, just providing unsolicited
patch review :)

> > >   static void opal_export_symmap(void)
> > >   {
> > > @@ -698,10 +701,10 @@ static void opal_export_symmap(void)
> > >   return;
> > >   /* Setup attributes */
> > > - bin_attr_symbol_map.private = __va(be64_to_cpu(syms[0]));
> > > - bin_attr_symbol_map.size = be64_to_cpu(syms[1]);
> > > + symbol_map_attr.private = __va(be64_to_cpu(syms[0]));
> > > + symbol_map_attr.size = be64_to_cpu(syms[1]);
> > > - rc = sysfs_create_bin_file(opal_kobj, _attr_symbol_map);
> > > + rc = sysfs_create_bin_file(opal_kobj, _map_attr);
> > 
> > Meta-comment, odds are you are racing userspace when you create this
> > sysfs file, why not add it to the device's default attributes so the
> > driver core creates it for you at the correct time?
> 
> I was not previously aware of default attributes...
> 
> Are we actually racing against userspace in a subsys initcall?

You can be, if you subsys is a module :)

thanks,

greg k-h


Re: [PATCH v2] powerpc/powernv: Restrict OPAL symbol map to only be readable by root

2019-05-03 Thread Greg KH
On Fri, May 03, 2019 at 05:52:53PM +1000, Andrew Donnellan wrote:
> Currently the OPAL symbol map is globally readable, which seems bad as it
> contains physical addresses.
> 
> Restrict it to root.
> 
> Suggested-by: Michael Ellerman 
> Cc: Jordan Niethe 
> Cc: Stewart Smith 
> Fixes: c8742f85125d ("powerpc/powernv: Expose OPAL firmware symbol map")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Andrew Donnellan 
> 
> ---
> 
> v1->v2:
> 
> - fix tabs vs spaces (Greg)
> ---
>  arch/powerpc/platforms/powernv/opal.c | 11 +++
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/opal.c 
> b/arch/powerpc/platforms/powernv/opal.c
> index 2b0eca104f86..0582a02623d0 100644
> --- a/arch/powerpc/platforms/powernv/opal.c
> +++ b/arch/powerpc/platforms/powernv/opal.c
> @@ -681,7 +681,10 @@ static ssize_t symbol_map_read(struct file *fp, struct 
> kobject *kobj,
>  bin_attr->size);
>  }
>  
> -static BIN_ATTR_RO(symbol_map, 0);
> +static struct bin_attribute symbol_map_attr = {
> + .attr = {.name = "symbol_map", .mode = 0400},
> + .read = symbol_map_read
> +};

There's no real need to rename the structure, right?  Why not just keep
the bin_attr_symbol_map name?  That would make this patch even smaller.

>  static void opal_export_symmap(void)
>  {
> @@ -698,10 +701,10 @@ static void opal_export_symmap(void)
>   return;
>  
>   /* Setup attributes */
> - bin_attr_symbol_map.private = __va(be64_to_cpu(syms[0]));
> - bin_attr_symbol_map.size = be64_to_cpu(syms[1]);
> + symbol_map_attr.private = __va(be64_to_cpu(syms[0]));
> + symbol_map_attr.size = be64_to_cpu(syms[1]);
>  
> - rc = sysfs_create_bin_file(opal_kobj, _attr_symbol_map);
> + rc = sysfs_create_bin_file(opal_kobj, _map_attr);

Meta-comment, odds are you are racing userspace when you create this
sysfs file, why not add it to the device's default attributes so the
driver core creates it for you at the correct time?

thanks,

greg k-h


Re: [PATCH] powerpc/powernv: Restrict OPAL symbol map to only be readable by root

2019-05-03 Thread Greg KH
On Fri, May 03, 2019 at 05:44:05PM +1000, Andrew Donnellan wrote:
> Currently the OPAL symbol map is globally readable, which seems bad as it
> contains physical addresses.
> 
> Restrict it to root.
> 
> Suggested-by: Michael Ellerman 
> Cc: Jordan Niethe 
> Cc: Stewart Smith 
> Fixes: c8742f85125d ("powerpc/powernv: Expose OPAL firmware symbol map")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Andrew Donnellan 
> ---
>  arch/powerpc/platforms/powernv/opal.c | 11 +++
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/opal.c 
> b/arch/powerpc/platforms/powernv/opal.c
> index 2b0eca104f86..505460a72052 100644
> --- a/arch/powerpc/platforms/powernv/opal.c
> +++ b/arch/powerpc/platforms/powernv/opal.c
> @@ -681,7 +681,10 @@ static ssize_t symbol_map_read(struct file *fp, struct 
> kobject *kobj,
>  bin_attr->size);
>  }
>  
> -static BIN_ATTR_RO(symbol_map, 0);
> +static struct bin_attribute symbol_map_attr = {
> + .attr = {.name = "symbol_map", .mode = 0400},
> + .read = symbol_map_read
> +};
>  
>  static void opal_export_symmap(void)
>  {
> @@ -698,10 +701,10 @@ static void opal_export_symmap(void)
>   return;
>  
>   /* Setup attributes */
> - bin_attr_symbol_map.private = __va(be64_to_cpu(syms[0]));
> - bin_attr_symbol_map.size = be64_to_cpu(syms[1]);
> +symbol_map_attr.private = __va(be64_to_cpu(syms[0]));

no tab?

checkpatch.pl is your friend :)



Re: Linux 5.1-rc5

2019-05-02 Thread Greg KH
On Thu, May 02, 2019 at 04:17:58PM +0200, Martin Schwidefsky wrote:
> On Thu, 2 May 2019 14:21:28 +0200
> Greg KH  wrote:
> 
> > On Mon, Apr 15, 2019 at 09:17:10AM -0700, Linus Torvalds wrote:
> > > On Sun, Apr 14, 2019 at 10:19 PM Christoph Hellwig  
> > > wrote:  
> > > >
> > > > Can we please have the page refcount overflow fixes out on the list
> > > > for review, even if it is after the fact?  
> > > 
> > > They were actually on a list for review long before the fact, but it
> > > was the security mailing list. The issue actually got discussed back
> > > in January along with early versions of the patches, but then we
> > > dropped the ball because it just wasn't on anybody's radar and it got
> > > resurrected late March. Willy wrote a rather bigger patch-series, and
> > > review of that is what then resulted in those commits. So they may
> > > look recent, but that's just because the original patches got
> > > seriously edited down and rewritten.
> > > 
> > > That said, powerpc and s390 should at least look at maybe adding a
> > > check for the page ref in their gup paths too. Powerpc has the special
> > > gup_hugepte() case, and s390 has its own version of gup entirely. I
> > > was actually hoping the s390 guys would look at using the generic gup
> > > code.
> > > 
> > > I ruthlessly also entirely ignored MIPS, SH and sparc, since they seem
> > > largely irrelevant, partly since even theoretically this whole issue
> > > needs a _lot_ of memory.
> > > 
> > > Michael, Martin, see commit 6b3a70773630 ("Merge branch 'page-refs'
> > > (page ref overflow)"). You may or may not really care.  
> > 
> > I've now queued these patches up for the next round of stable releases,
> > as some people seem to care about these.
> > 
> > I didn't see any follow-on patches for s390 or ppc64 hit the tree for
> > these changes, am I just missing them and should also queue up a few
> > more to handle this issue on those platforms?
> 
> I fixed that with a different approach. The following two patches are
> queued for the next merge window:
> 
> d1874a0c2805 "s390/mm: make the pxd_offset functions more robust"
> 1a42010cdc26 "s390/mm: convert to the generic get_user_pages_fast code"
> 
> With these two s390 now uses the generic gup code in mm/gup.c

Nice!  Do you want me to queue those up for the stable backports once
they hit a public -rc release?

thanks,

greg k-h


Re: Linux 5.1-rc5

2019-05-02 Thread Greg KH
On Mon, Apr 15, 2019 at 09:17:10AM -0700, Linus Torvalds wrote:
> On Sun, Apr 14, 2019 at 10:19 PM Christoph Hellwig  wrote:
> >
> > Can we please have the page refcount overflow fixes out on the list
> > for review, even if it is after the fact?
> 
> They were actually on a list for review long before the fact, but it
> was the security mailing list. The issue actually got discussed back
> in January along with early versions of the patches, but then we
> dropped the ball because it just wasn't on anybody's radar and it got
> resurrected late March. Willy wrote a rather bigger patch-series, and
> review of that is what then resulted in those commits. So they may
> look recent, but that's just because the original patches got
> seriously edited down and rewritten.
> 
> That said, powerpc and s390 should at least look at maybe adding a
> check for the page ref in their gup paths too. Powerpc has the special
> gup_hugepte() case, and s390 has its own version of gup entirely. I
> was actually hoping the s390 guys would look at using the generic gup
> code.
> 
> I ruthlessly also entirely ignored MIPS, SH and sparc, since they seem
> largely irrelevant, partly since even theoretically this whole issue
> needs a _lot_ of memory.
> 
> Michael, Martin, see commit 6b3a70773630 ("Merge branch 'page-refs'
> (page ref overflow)"). You may or may not really care.

I've now queued these patches up for the next round of stable releases,
as some people seem to care about these.

I didn't see any follow-on patches for s390 or ppc64 hit the tree for
these changes, am I just missing them and should also queue up a few
more to handle this issue on those platforms?

thanks,

greg k-h


Re: [PATCH v2 stable v4.4 2/2] Documentation: Add nospectre_v1 parameter

2019-04-30 Thread Greg KH
On Tue, Apr 30, 2019 at 03:42:27PM +0300, Diana Craciun wrote:
> commit 26cb1f36c43ee6e89d2a9f48a5a7500d5248f836 upstream.
> 
> Currently only supported on powerpc.
> 
> Signed-off-by: Diana Craciun 
> Signed-off-by: Michael Ellerman 
> ---
>  Documentation/kernel-parameters.txt | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/Documentation/kernel-parameters.txt 
> b/Documentation/kernel-parameters.txt
> index f0bdf78420a0..3ff87d5d6fea 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -2449,6 +2449,10 @@ bytes respectively. Such letter suffixes can also be 
> entirely omitted.
>   legacy floating-point registers on task switch.
>  
>   nohugeiomap [KNL,x86] Disable kernel huge I/O mappings.
> + 
> + nospectre_v1[PPC] Disable mitigations for Spectre Variant 1 (bounds
> + check bypass). With this option data leaks are possible
> + in the system.
>  
>   nospectre_v2[X86,PPC_FSL_BOOK3E] Disable all mitigations for the 
> Spectre variant 2
>   (indirect branch prediction) vulnerability. System may
> -- 
> 2.17.1
>

Both of these patches needed to be added to a bunch of the stable trees,
so I've now done that.

thanks,

greg k-h


Re: [PATCH v2 stable v4.4 2/2] Documentation: Add nospectre_v1 parameter

2019-04-30 Thread Greg KH
On Tue, Apr 30, 2019 at 03:42:27PM +0300, Diana Craciun wrote:
> commit 26cb1f36c43ee6e89d2a9f48a5a7500d5248f836 upstream.
> 
> Currently only supported on powerpc.
> 
> Signed-off-by: Diana Craciun 
> Signed-off-by: Michael Ellerman 
> ---
>  Documentation/kernel-parameters.txt | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/Documentation/kernel-parameters.txt 
> b/Documentation/kernel-parameters.txt
> index f0bdf78420a0..3ff87d5d6fea 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -2449,6 +2449,10 @@ bytes respectively. Such letter suffixes can also be 
> entirely omitted.
>   legacy floating-point registers on task switch.
>  
>   nohugeiomap [KNL,x86] Disable kernel huge I/O mappings.
> + 

Trailing whitespace :(

Fix up your editor to flag this as RED or something.  I'll go fix it
up...



Re: [PATCH stable v4.4 0/8] missing powerpc spectre backports for 4.4

2019-04-30 Thread Greg KH
On Mon, Apr 29, 2019 at 06:49:00PM +0300, Diana Craciun wrote:
> Hi Greg,
> 
> These are missing patches from the initial powerpc spectre backports for 4.4.
> Please queue them as well if you don't have any objections.

I applied the first 6 of these now.  If you could fix up the last two
and resend them, that would be wonderful.

thanks,

greg k-h


Re: [PATCH stable v4.4 8/8] Documentation: Add nospectre_v1 parameter

2019-04-30 Thread Greg KH
On Mon, Apr 29, 2019 at 06:49:08PM +0300, Diana Craciun wrote:
> Currently only supported on powerpc.

No upstream git commit id for this one?

thanks,

greg k-h


Re: [PATCH stable v4.4 7/8] powerpc/fsl: Add FSL_PPC_BOOK3E as supported arch for nospectre_v2 boot arg

2019-04-30 Thread Greg KH
On Mon, Apr 29, 2019 at 06:49:07PM +0300, Diana Craciun wrote:
> commit f633a8ad636efb5d4bba1a047d4a0f1ef719aa06 upstream.

No, the patch below is not that git commit :(

I'll stop here in applying these patches.

thanks,

greg k-h


Re: [PATCH 01/41] drivers: tty: serial: dz: use dev_err() instead of printk()

2019-04-29 Thread Greg KH
On Mon, Apr 29, 2019 at 04:11:15PM +0200, Enrico Weigelt, metux IT consult 
wrote:
> On 27.04.19 15:29, Greg KH wrote:
> > On Sat, Apr 27, 2019 at 02:51:42PM +0200, Enrico Weigelt, metux IT consult 
> > wrote:
> >> Using dev_err() instead of printk() for more consistent output.
> >> (prints device name, etc).
> >>
> >> Signed-off-by: Enrico Weigelt 
> > 
> > Your "From:" line does not match the signed-off-by line, so I can't take
> > any of these if I wanted to :(
> 
> Grmpf. I've manually changed it, as you isisted in having my company
> name remove from it 

Yes, that's fine, but the lines have to match.  See the documentation
for how to have a "From:" in the changelog text to override whatever
your email client happens to pollute the email with :)

thanks,

greg k-h


Re: [PATCH 01/41] drivers: tty: serial: dz: use dev_err() instead of printk()

2019-04-29 Thread Greg KH
On Mon, Apr 29, 2019 at 02:37:04PM +0200, Enrico Weigelt, metux IT consult 
wrote:
> On 27.04.19 15:31, Greg KH wrote:
> > On Sat, Apr 27, 2019 at 02:51:42PM +0200, Enrico Weigelt, metux IT consult 
> > wrote:
> >> Using dev_err() instead of printk() for more consistent output.
> >> (prints device name, etc).
> >>
> >> Signed-off-by: Enrico Weigelt 
> >> ---
> >>  drivers/tty/serial/dz.c | 8 
> > 
> > Do you have this hardware to test any of these changes with?
> 
> Unfortunately not :(

Then I can take the "basic" types of patches for the driver (like this
one), but not any others, sorry.

thanks,

greg k-h


Re: [PATCH stable v4.4 00/52] powerpc spectre backports for 4.4

2019-04-29 Thread Greg KH
On Mon, Apr 22, 2019 at 12:19:45AM +1000, Michael Ellerman wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
> 
> Hi Greg/Sasha,
> 
> Please queue up these powerpc patches for 4.4 if you have no objections.

All now queued up, thanks.

greg k-h


Re: [PATCH stable v4.4 00/52] powerpc spectre backports for 4.4

2019-04-29 Thread Greg KH
On Mon, Apr 29, 2019 at 04:26:45PM +1000, Michael Ellerman wrote:
> Michael Ellerman  writes:
> > Greg KH  writes:
> >> On Mon, Apr 22, 2019 at 12:19:45AM +1000, Michael Ellerman wrote:
> >>> -BEGIN PGP SIGNED MESSAGE-
> >>> Hash: SHA1
> >>> 
> >>> Hi Greg/Sasha,
> >>> 
> >>> Please queue up these powerpc patches for 4.4 if you have no objections.
> >>
> >> why?  Do you, or someone else, really care about spectre issues in 4.4?
> >> Who is using ppc for 4.4 becides a specific enterprise distro (and they
> >> don't seem to be pulling in my stable updates anyway...)?
> >
> > Someone asked for it, but TBH I can't remember who it was. I can chase
> > it up if you like.
> 
> Yeah it was a request from one of the distros. They plan to take it once
> it lands in 4.4 stable.

Ok, thanks for confirming, I'll work on this this afternoon.

greg k-h


Re: [PATCH 10/41] drivers: tty: serial: sb1250-duart: fix missing parentheses

2019-04-27 Thread Greg KH
On Sat, Apr 27, 2019 at 02:51:51PM +0200, Enrico Weigelt, metux IT consult 
wrote:
> Fix checkpatch warning:
> 
> ERROR: Macros with complex values should be enclosed in parentheses
> #911: FILE: drivers/tty/serial/sb1250-duart.c:911:
> +#define SERIAL_SB1250_DUART_CONSOLE  _console
> 
> Signed-off-by: Enrico Weigelt 
> ---
>  drivers/tty/serial/sb1250-duart.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/tty/serial/sb1250-duart.c 
> b/drivers/tty/serial/sb1250-duart.c
> index 1184226..ec74f09 100644
> --- a/drivers/tty/serial/sb1250-duart.c
> +++ b/drivers/tty/serial/sb1250-duart.c
> @@ -908,7 +908,7 @@ static int __init sbd_serial_console_init(void)
>  
>  console_initcall(sbd_serial_console_init);
>  
> -#define SERIAL_SB1250_DUART_CONSOLE  _console
> +#define SERIAL_SB1250_DUART_CONSOLE  (_console)

No, that's foolish.

checkpatch is a hint, it's not always right.

Also, checkpatch cleanups for really old drivers is not generally a good
idea, especially if you do not have the hardware for them.  Please don't
cause unneeded churn for this type of thing in this subsystem, unless
you have the hardware.

thanks,

greg k-h


Re: [PATCH 01/41] drivers: tty: serial: dz: use dev_err() instead of printk()

2019-04-27 Thread Greg KH
On Sat, Apr 27, 2019 at 02:51:42PM +0200, Enrico Weigelt, metux IT consult 
wrote:
> Using dev_err() instead of printk() for more consistent output.
> (prints device name, etc).
> 
> Signed-off-by: Enrico Weigelt 
> ---
>  drivers/tty/serial/dz.c | 8 

Do you have this hardware to test any of these changes with?

thanks,

greg k-h


  1   2   3   4   5   >