Re: bootx_init.c:88: undefined reference to `__stack_chk_fail_local'

2017-01-11 Thread Christophe LEROY



On 11/01/2017 at 23:54, Segher Boessenkool wrote:

On Tue, Jan 10, 2017 at 07:26:15AM +0100, Christophe LEROY wrote:

Maybe ppc32 is not supposed to be built with CC_STACKPROTECTOR ?


Indeed, the latest versions of GCC no longer use the global variable
__stack_chk_guard as the canary value, but instead a value stored at -0x7008(r2).
This is not compatible with the current implementation of the kernel,
which uses r2 as a pointer to the current task struct.
So until we fix that, I don't think CC_STACKPROTECTOR is usable on PPC
with modern versions of GCC.


I still wonder what changed.  Nothing relevant has changed for ten years
or so as far as I can see; unless it is just -fstack-protector-strong
that makes it fail now.  Curious.



Yes, it looks like it was changed from a global to TLS in 2005 on powerpc.
Indeed, when I implemented STACKPROTECTOR in the kernel on ppc I
copied/pasted it from ARM, which is (still?) using the global
__stack_chk_guard, and at first it worked quite well on my powerpc.


x86 has the following option in GCC. Couldn't we have it on powerpc too?

-mstack-protector-guard=guard
    Generate stack protection code using the canary at guard.
    Supported locations are 'global' for a global canary or 'tls'
    for a per-thread canary in the TLS block (the default). This
    option has effect only when '-fstack-protector' or
    '-fstack-protector-all' is specified.
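For reference, the global-canary scheme that the ARM implementation (and the ppc copy of it) relies on can be modelled in user space. This is an illustrative sketch only: the names stack_chk_guard/stack_chk_failed stand in for the real __stack_chk_guard symbol and __stack_chk_fail() handler, and in reality the compiler emits these loads and compares automatically.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative user-space model of the "global canary" scheme
 * (-mstack-protector-guard=global). Names are stand-ins, not the
 * real compiler ABI. */

uintptr_t stack_chk_guard = 0xdeadbeef;  /* stand-in for __stack_chk_guard */
int stack_chk_failed;                    /* set instead of calling panic() */

/* What a protected function's prologue conceptually does: copy the
 * guard into a slot just below the saved return address. */
uintptr_t prologue_save_canary(void)
{
    return stack_chk_guard;
}

/* What the epilogue conceptually does: re-compare that slot against
 * the guard; any mismatch means the stack frame was overwritten. */
void epilogue_check_canary(uintptr_t saved)
{
    if (saved != stack_chk_guard)
        stack_chk_failed = 1;  /* the kernel would call __stack_chk_fail() */
}
```

With -mstack-protector-guard=tls the prologue load would instead come from a fixed offset off the thread pointer, which is exactly what collides with the kernel's use of r2 as the current task pointer.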


Christophe


Re: [PATCH 2/9] Move dma_ops from archdata into struct device

2017-01-11 Thread gre...@linuxfoundation.org
On Wed, Jan 11, 2017 at 10:28:05PM +, Bart Van Assche wrote:
> On Wed, 2017-01-11 at 21:31 +0100, gre...@linuxfoundation.org wrote:
> > That's a big sign that your patch series needs work.  Break it up into
> > smaller pieces, it should be possible, which will make merges easier
> > (well, different in a way.)
> 
> Hello Greg,
> 
> Can you have a look at the attached patches? These three patches are a
> splitup of the single patch at the start of this e-mail thread.

Please send them in the proper format (i.e. one patch per email), and I
will be glad to review them.  Otherwise it's really hard to do so; would
you want to review attachments?

thanks,

greg k-h


Re: [PATCH kernel v2 05/11] KVM: PPC: Use preregistered memory API to access TCE list

2017-01-11 Thread David Gibson
On Wed, Jan 11, 2017 at 05:35:21PM +1100, Alexey Kardashevskiy wrote:
> On 21/12/16 19:57, Alexey Kardashevskiy wrote:
> > On 21/12/16 15:08, David Gibson wrote:
> >> On Sun, Dec 18, 2016 at 12:28:54PM +1100, Alexey Kardashevskiy wrote:
> >>> VFIO on sPAPR already implements guest memory pre-registration
> >>> when the entire guest RAM gets pinned. This can be used to translate
> >>> the physical address of a guest page containing the TCE list
> >>> from H_PUT_TCE_INDIRECT.
> >>>
> >>> This makes use of the pre-registered memory API to access TCE list
> >>> pages in order to avoid unnecessary locking on the KVM memory
> >>> reverse map as we know that all of guest memory is pinned and
> >>> we have a flat array mapping GPA to HPA which makes it simpler and
> >>> quicker to index into that array (even with looking up the
> >>> kernel page tables in vmalloc_to_phys) than it is to find the memslot,
> >>> lock the rmap entry, look up the user page tables, and unlock the rmap
> >>> entry. Note that the rmap pointer is initialized to NULL
> >>> where declared (not in this patch).
> >>>
> >>> If a requested chunk of memory has not been preregistered,
> >>> this will fail with H_TOO_HARD so the virtual mode handler can
> >>> handle the request.
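(An aside on why the flat-array translation described above is quicker than the memslot/rmap path: a minimal sketch of a pre-registered-region lookup, with hypothetical struct and function names — this is just the idea, not the real mm_iommu_* API.)

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12  /* 4K pages, illustrative */

/* Hypothetical pre-registered region: all of guest RAM pinned, with a
 * flat array of host physical addresses indexed by guest page number. */
struct preg_mem {
    uint64_t gpa_base;   /* start of the registered guest range */
    size_t npages;
    uint64_t *hpas;      /* hpas[i] = host physical addr of page i */
};

/* O(1) translation; returns -1 on a miss so the caller can fall back
 * (the real code returns H_TOO_HARD to defer to virtual mode). */
static int64_t preg_gpa_to_hpa(const struct preg_mem *mem, uint64_t gpa)
{
    uint64_t off = gpa - mem->gpa_base;

    if (gpa < mem->gpa_base || (off >> PAGE_SHIFT) >= mem->npages)
        return -1;
    return (int64_t)(mem->hpas[off >> PAGE_SHIFT] |
                     (off & ((1u << PAGE_SHIFT) - 1)));
}
```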
> >>>
> >>> Signed-off-by: Alexey Kardashevskiy 
> >>> ---
> >>> Changes:
> >>> v2:
> >>> * updated the commit log with David's comment
> >>> ---
> >>>  arch/powerpc/kvm/book3s_64_vio_hv.c | 65 
> >>> -
> >>>  1 file changed, 49 insertions(+), 16 deletions(-)
> >>>
> >>> diff --git a/arch/powerpc/kvm/book3s_64_vio_hv.c 
> >>> b/arch/powerpc/kvm/book3s_64_vio_hv.c
> >>> index d461c440889a..a3be4bd6188f 100644
> >>> --- a/arch/powerpc/kvm/book3s_64_vio_hv.c
> >>> +++ b/arch/powerpc/kvm/book3s_64_vio_hv.c
> >>> @@ -180,6 +180,17 @@ long kvmppc_gpa_to_ua(struct kvm *kvm, unsigned long 
> >>> gpa,
> >>>  EXPORT_SYMBOL_GPL(kvmppc_gpa_to_ua);
> >>>  
> >>>  #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
> >>> +static inline bool kvmppc_preregistered(struct kvm_vcpu *vcpu)
> >>> +{
> >>> + return mm_iommu_preregistered(vcpu->kvm->mm);
> >>> +}
> >>> +
> >>> +static struct mm_iommu_table_group_mem_t *kvmppc_rm_iommu_lookup(
> >>> + struct kvm_vcpu *vcpu, unsigned long ua, unsigned long size)
> >>> +{
> >>> + return mm_iommu_lookup_rm(vcpu->kvm->mm, ua, size);
> >>> +}
> >>
> >> I don't see that there's much point to these inlines.
> >>
> >>>  long kvmppc_rm_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
> >>>   unsigned long ioba, unsigned long tce)
> >>>  {
> >>> @@ -260,23 +271,44 @@ long kvmppc_rm_h_put_tce_indirect(struct kvm_vcpu 
> >>> *vcpu,
> >>>   if (ret != H_SUCCESS)
> >>>   return ret;
> >>>  
> >>> - if (kvmppc_gpa_to_ua(vcpu->kvm, tce_list, &ua, &rmap))
> >>> - return H_TOO_HARD;
> >>> + if (kvmppc_preregistered(vcpu)) {
> >>> + /*
> >>> +  * We get here if guest memory was pre-registered which
> >>> +  * is normally VFIO case and gpa->hpa translation does not
> >>> +  * depend on hpt.
> >>> +  */
> >>> + struct mm_iommu_table_group_mem_t *mem;
> >>>  
> >>> - rmap = (void *) vmalloc_to_phys(rmap);
> >>> + if (kvmppc_gpa_to_ua(vcpu->kvm, tce_list, &ua, NULL))
> >>> + return H_TOO_HARD;
> >>>  
> >>> - /*
> >>> -  * Synchronize with the MMU notifier callbacks in
> >>> -  * book3s_64_mmu_hv.c (kvm_unmap_hva_hv etc.).
> >>> -  * While we have the rmap lock, code running on other CPUs
> >>> -  * cannot finish unmapping the host real page that backs
> >>> -  * this guest real page, so we are OK to access the host
> >>> -  * real page.
> >>> -  */
> >>> - lock_rmap(rmap);
> >>> - if (kvmppc_rm_ua_to_hpa(vcpu, ua, &tces)) {
> >>> - ret = H_TOO_HARD;
> >>> - goto unlock_exit;
> >>> + mem = kvmppc_rm_iommu_lookup(vcpu, ua, IOMMU_PAGE_SIZE_4K);
> >>> + if (!mem || mm_iommu_ua_to_hpa_rm(mem, ua, &tces))
> >>> + return H_TOO_HARD;
> >>> + } else {
> >>> + /*
> >>> +  * This is emulated devices case.
> >>> +  * We do not require memory to be preregistered in this case
> >>> +  * so lock rmap and do __find_linux_pte_or_hugepte().
> >>> +  */
> >>
> >> Hmm.  So this isn't wrong as such, but the logic and comments are
> >> both misleading.  The 'if' here isn't really about VFIO vs. emulated -
> >> it's about whether the mm has *any* preregistered chunks, without any
> >> regard to which particular device you're talking about.  For example
> >> if your guest has two PHBs, one with VFIO devices and the other with
> >> emulated devices, then the emulated devices will still go through the
> >> "VFIO" case here.
> > 
> > kvmppc_preregistered() checks for a single pointer, kvmppc_rm_ua_to_hpa()
> > goes through __find_linux_pte_or_hugepte() which is unnecessary
> > complication here.

Except that you're going to call kvmppc_rm_ua_to_hpa() eventually anyway.

> > s/emulated 

Re: [PATCH kernel v3] KVM: PPC: Add in-kernel acceleration for VFIO

2017-01-11 Thread David Gibson
On Tue, Dec 20, 2016 at 05:52:29PM +1100, Alexey Kardashevskiy wrote:
> This allows the host kernel to handle H_PUT_TCE, H_PUT_TCE_INDIRECT
> and H_STUFF_TCE requests targeted at an IOMMU TCE table used for VFIO
> without passing them to user space, which saves the cost of switching
> to user space and back.
> 
> This adds H_PUT_TCE/H_PUT_TCE_INDIRECT/H_STUFF_TCE handlers to KVM.
> KVM tries to handle a TCE request in real mode; if that fails,
> it passes the request to virtual mode to complete the operation.
> If the virtual mode handler also fails, the request is passed to
> user space; this is not expected to happen though.
> 
> To avoid dealing with page use counters (which is tricky in real mode),
> this only accelerates SPAPR TCE IOMMU v2 clients which are required
> to pre-register the userspace memory. The very first TCE request will
> be handled in the VFIO SPAPR TCE driver anyway as the userspace view
> of the TCE table (iommu_table::it_userspace) is not allocated till
> the very first mapping happens and we cannot call vmalloc in real mode.
> 
> This adds a new attribute - KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE - to
> the VFIO KVM device. It takes a VFIO group fd and SPAPR TCE table fd
> and associates a physical IOMMU table with the SPAPR TCE table (which
> is a guest view of the hardware IOMMU table). The iommu_table object
> is cached and referenced so we do not have to look it up in real mode.
> 
> This does not implement the UNSET counterpart as there is no use for it -
> once the acceleration is enabled, the existing userspace won't
> disable it unless a VFIO container is destroyed; this adds necessary
> cleanup to the KVM_DEV_VFIO_GROUP_DEL handler.
> 
> As this creates a descriptor per IOMMU table-LIOBN pair (called
> kvmppc_spapr_tce_iommu_table), it is possible to have several
> descriptors with the same iommu_table (hardware IOMMU table) attached
> to the same LIOBN; this is done to simplify the cleanup and can be
> improved later.
> 
> This advertises the new KVM_CAP_SPAPR_TCE_VFIO capability to the user
> space.
> 
> This finally makes use of vfio_external_user_iommu_id() which was
> introduced quite some time ago and was considered for removal.
> 
> Tests show that this patch increases transmission speed from 220MB/s
> to 750..1020MB/s on a 10Gb network (Chelsio CXGB3 10Gb ethernet card).
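For context, the userspace side of the new attribute might be filled in roughly as below. The struct follows the definition in the patch, while the helper name and fd values are purely illustrative; the real call goes through ioctl(KVM_SET_DEVICE_ATTR) on the VFIO KVM device.

```c
#include <assert.h>
#include <stdint.h>

/* Layout from the patch's Documentation update. */
struct kvm_vfio_spapr_tce {
    uint32_t argsz;
    uint32_t flags;   /* must be zero for now */
    int32_t  groupfd; /* VFIO group fd */
    int32_t  tablefd; /* TCE table fd from KVM_CREATE_SPAPR_TCE */
};

/* Hypothetical helper building the attribute payload; a real caller
 * would pass its address via struct kvm_device_attr.addr. */
static struct kvm_vfio_spapr_tce make_spapr_tce_attr(int groupfd, int tablefd)
{
    struct kvm_vfio_spapr_tce param = {
        .argsz   = sizeof(param),
        .flags   = 0,
        .groupfd = groupfd,
        .tablefd = tablefd,
    };
    return param;
}
```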
> 
> Signed-off-by: Alexey Kardashevskiy 
> ---
> Changes:
> v3:
> * simplified not to use VFIO group notifiers
> * reworked cleanup, should be cleaner/simpler now
> 
> v2:
> * reworked to use new VFIO notifiers
> * now same iommu_table may appear in the list several times, to be fixed later
> ---
> 
> This obsoletes:
> 
> [PATCH kernel v2 08/11] KVM: PPC: Pass kvm* to kvmppc_find_table()
> [PATCH kernel v2 09/11] vfio iommu: Add helpers to (un)register blocking 
> notifiers per group
> [PATCH kernel v2 11/11] KVM: PPC: Add in-kernel acceleration for VFIO
> 
> 
> So I have not reposted the whole thing, should I have?
> 
> 
> btw "F: virt/kvm/vfio.*" is missing from MAINTAINERS.
> 
> 
> ---
>  Documentation/virtual/kvm/devices/vfio.txt |  22 ++-
>  arch/powerpc/include/asm/kvm_host.h|   8 +
>  arch/powerpc/include/asm/kvm_ppc.h |   4 +
>  include/uapi/linux/kvm.h   |   8 +
>  arch/powerpc/kvm/book3s_64_vio.c   | 286 
> +
>  arch/powerpc/kvm/book3s_64_vio_hv.c| 178 ++
>  arch/powerpc/kvm/powerpc.c |   2 +
>  virt/kvm/vfio.c|  88 +
>  8 files changed, 594 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/virtual/kvm/devices/vfio.txt 
> b/Documentation/virtual/kvm/devices/vfio.txt
> index ef51740c67ca..f95d867168ea 100644
> --- a/Documentation/virtual/kvm/devices/vfio.txt
> +++ b/Documentation/virtual/kvm/devices/vfio.txt
> @@ -16,7 +16,25 @@ Groups:
>  
>  KVM_DEV_VFIO_GROUP attributes:
>KVM_DEV_VFIO_GROUP_ADD: Add a VFIO group to VFIO-KVM device tracking
> + kvm_device_attr.addr points to an int32_t file descriptor
> + for the VFIO group.
>KVM_DEV_VFIO_GROUP_DEL: Remove a VFIO group from VFIO-KVM device tracking
> + kvm_device_attr.addr points to an int32_t file descriptor
> + for the VFIO group.
> +  KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: attaches a guest visible TCE table
> + allocated by sPAPR KVM.
> + kvm_device_attr.addr points to a struct:
>  
> -For each, kvm_device_attr.addr points to an int32_t file descriptor
> -for the VFIO group.
> + struct kvm_vfio_spapr_tce {
> + __u32   argsz;
> + __u32   flags;
> + __s32   groupfd;
> + __s32   tablefd;
> + };
> +
> + where
> + @argsz is the size of kvm_vfio_spapr_tce_liobn;
> + @flags are not supported now, must be zero;
> + @groupfd is a file descriptor for a VFIO group;
> + @tablefd is a file descriptor for a TCE table allocated via
> + KVM_CREATE_SPAPR_TCE.
> diff --git 

Re: [PATCH v2 0/4] cxlflash: Enhancements, cleanup and fixes

2017-01-11 Thread Martin K. Petersen
> "Uma" == Uma Krishnan  writes:

Uma> This patch series includes an enhancement to support a new command
Uma> queuing model and also cleans up prints throughout the driver. The
Uma> last patch in the series fixes a racing issue.

Applied to 4.11/scsi-queue.

-- 
Martin K. Petersen  Oracle Linux Engineering


[PATCH] powerpc: Use octal numbers for file permissions

2017-01-11 Thread Russell Currey
Symbolic macros are unintuitive and hard to read, whereas octal constants
are much easier to interpret.  Replace macros for the basic permission
flags (user/group/other read/write/execute) with numeric constants
instead, across the whole powerpc tree.

Introducing a significant number of changes across the tree for no runtime
benefit isn't exactly desirable, but so long as these macros are still
used in the tree people will keep sending patches that add them.  Not only
are they hard to parse at a glance, there are multiple ways of coming to
the same value (as you can see with 0444 and 0644 in this patch) which
hurts readability.

Signed-off-by: Russell Currey 
---
I wondered what "S_IRUGO" was and subsequently found the following:
https://lwn.net/Articles/696229/
so I figured making numeric constants standard across the tree was a good
idea instead of the mix we currently have.

For new patches that come in, checkpatch warns when using something like
S_IRUGO and tells you to use something like 0444 instead.
---
 arch/powerpc/kernel/eeh_sysfs.c|  2 +-
 arch/powerpc/kernel/iommu.c|  2 +-
 arch/powerpc/kernel/proc_powerpc.c |  2 +-
 arch/powerpc/kernel/rtas-proc.c| 14 +++---
 arch/powerpc/kernel/rtas_flash.c   |  2 +-
 arch/powerpc/kernel/rtasd.c|  2 +-
 arch/powerpc/kernel/traps.c|  4 ++--
 arch/powerpc/kvm/book3s_hv.c   | 10 --
 arch/powerpc/kvm/book3s_xics.c |  2 +-
 arch/powerpc/platforms/83xx/mcu_mpc8349emitx.c |  2 +-
 arch/powerpc/platforms/cell/spufs/inode.c  |  4 ++--
 arch/powerpc/platforms/powernv/opal-dump.c |  4 ++--
 arch/powerpc/platforms/powernv/opal-elog.c |  4 ++--
 arch/powerpc/platforms/powernv/opal-sysparam.c |  6 +++---
 arch/powerpc/platforms/pseries/cmm.c   | 16 
 arch/powerpc/platforms/pseries/dlpar.c |  2 +-
 arch/powerpc/platforms/pseries/hvCall_inst.c   |  2 +-
 arch/powerpc/platforms/pseries/ibmebus.c   |  4 ++--
 arch/powerpc/platforms/pseries/lparcfg.c   |  4 ++--
 arch/powerpc/platforms/pseries/mobility.c  |  4 ++--
 arch/powerpc/platforms/pseries/reconfig.c  |  2 +-
 arch/powerpc/platforms/pseries/scanlog.c   |  2 +-
 arch/powerpc/platforms/pseries/suspend.c   |  3 +--
 arch/powerpc/platforms/pseries/vio.c   |  8 
 arch/powerpc/sysdev/axonram.c  |  2 +-
 arch/powerpc/sysdev/mv64x60_pci.c  |  2 +-
 26 files changed, 54 insertions(+), 57 deletions(-)

diff --git a/arch/powerpc/kernel/eeh_sysfs.c b/arch/powerpc/kernel/eeh_sysfs.c
index 1ceecdda810b..f6d2d5b66907 100644
--- a/arch/powerpc/kernel/eeh_sysfs.c
+++ b/arch/powerpc/kernel/eeh_sysfs.c
@@ -48,7 +48,7 @@ static ssize_t eeh_show_##_name(struct device *dev,  \
  \
return sprintf(buf, _format "\n", edev->_memb);   \
 }\
-static DEVICE_ATTR(_name, S_IRUGO, eeh_show_##_name, NULL);
+static DEVICE_ATTR(_name, 0444, eeh_show_##_name, NULL);
 
 EEH_SHOW_ATTR(eeh_mode,mode,"0x%x");
 EEH_SHOW_ATTR(eeh_config_addr, config_addr, "0x%x");
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index 5f202a566ec5..7ef279a0888e 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -127,7 +127,7 @@ static ssize_t fail_iommu_store(struct device *dev,
return count;
 }
 
-static DEVICE_ATTR(fail_iommu, S_IRUGO|S_IWUSR, fail_iommu_show,
+static DEVICE_ATTR(fail_iommu, 0644, fail_iommu_show,
   fail_iommu_store);
 
 static int fail_iommu_bus_notify(struct notifier_block *nb,
diff --git a/arch/powerpc/kernel/proc_powerpc.c 
b/arch/powerpc/kernel/proc_powerpc.c
index 56548bf6231f..9bfbd800d32f 100644
--- a/arch/powerpc/kernel/proc_powerpc.c
+++ b/arch/powerpc/kernel/proc_powerpc.c
@@ -63,7 +63,7 @@ static int __init proc_ppc64_init(void)
 {
struct proc_dir_entry *pde;
 
-   pde = proc_create_data("powerpc/systemcfg", S_IFREG|S_IRUGO, NULL,
+   pde = proc_create_data("powerpc/systemcfg", S_IFREG | 0444, NULL,
   &page_map_fops, vdso_data);
if (!pde)
return 1;
diff --git a/arch/powerpc/kernel/rtas-proc.c b/arch/powerpc/kernel/rtas-proc.c
index df56dfc4b681..bb5e8cbcc553 100644
--- a/arch/powerpc/kernel/rtas-proc.c
+++ b/arch/powerpc/kernel/rtas-proc.c
@@ -260,19 +260,19 @@ static int __init proc_rtas_init(void)
if (rtas_node == NULL)
return -ENODEV;
 
-   proc_create("powerpc/rtas/progress", S_IRUGO|S_IWUSR, NULL,
+   proc_create("powerpc/rtas/progress", 0644, NULL,
&ppc_rtas_progress_operations);
-   proc_create("powerpc/rtas/clock", S_IRUGO|S_IWUSR, NULL,
+   proc_create("powerpc/rtas/clock", 0644, NULL,
  

Re: [PATCH v4 2/2] KVM: PPC: Exit guest upon MCE when FWNMI capability is enabled

2017-01-11 Thread David Gibson
On Mon, Jan 09, 2017 at 05:10:45PM +0530, Aravinda Prasad wrote:
> Enhance KVM to cause a guest exit with KVM_EXIT_NMI
> exit reason upon a machine check exception (MCE) in
> the guest address space if the KVM_CAP_PPC_FWNMI
> capability is enabled (instead of delivering a 0x200
> interrupt to the guest). This enables QEMU to build the error
> log and deliver the machine check exception to the guest via the
> guest-registered machine check handler.
> 
> This approach simplifies the delivery of machine
> check exception to guest OS compared to the earlier
> approach of KVM directly invoking 0x200 guest interrupt
> vector.
> 
> This design/approach is based on the feedback for the
> QEMU patches to handle machine check exception. Details
> of earlier approach of handling machine check exception
> in QEMU and related discussions can be found at:
> 
> https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html
> 
> Note:
> 
> This patch introduces a hook which is invoked at the time
> of guest exit to facilitate the host-side handling of
> machine check exception before the exception is passed
> on to the guest. Hence, the host-side handling which was
> performed earlier via machine_check_fwnmi is removed.
> 
> The reasons for this approach are: (i) it is not possible
> to distinguish whether the exception occurred in the
> guest or the host from the pt_regs passed on the
> machine_check_exception(). Hence machine_check_exception()
> calls panic, instead of passing on the exception to
> the guest, if the machine check exception is not
> recoverable. (ii) the approach introduced in this
> patch gives opportunity to the host kernel to perform
> actions in virtual mode before passing on the exception
> to the guest. This approach does not require complex
> tweaks to machine_check_fwnmi and friends.
> 
> Signed-off-by: Aravinda Prasad 

Reviewed-by: David Gibson 

> ---
>  arch/powerpc/kvm/book3s_hv.c|   27 +-
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S |   47 
> ---
>  arch/powerpc/platforms/powernv/opal.c   |   10 +++
>  3 files changed, 54 insertions(+), 30 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 3686471..cae4921 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -123,6 +123,7 @@ MODULE_PARM_DESC(halt_poll_ns_shrink, "Factor halt poll 
> time is shrunk by");
>  
>  static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
>  static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
> +static void kvmppc_machine_check_hook(void);
>  
>  static inline struct kvm_vcpu *next_runnable_thread(struct kvmppc_vcore *vc,
>   int *ip)
> @@ -954,15 +955,14 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, 
> struct kvm_vcpu *vcpu,
>   r = RESUME_GUEST;
>   break;
>   case BOOK3S_INTERRUPT_MACHINE_CHECK:
> + /* Exit to guest with KVM_EXIT_NMI as exit reason */
> + run->exit_reason = KVM_EXIT_NMI;
> + r = RESUME_HOST;
>   /*
> -  * Deliver a machine check interrupt to the guest.
> -  * We have to do this, even if the host has handled the
> -  * machine check, because machine checks use SRR0/1 and
> -  * the interrupt might have trashed guest state in them.
> +  * Invoke host-kernel handler to perform any host-side
> +  * handling before exiting the guest.
>*/
> - kvmppc_book3s_queue_irqprio(vcpu,
> - BOOK3S_INTERRUPT_MACHINE_CHECK);
> - r = RESUME_GUEST;
> + kvmppc_machine_check_hook();
>   break;
>   case BOOK3S_INTERRUPT_PROGRAM:
>   {
> @@ -3491,6 +3491,19 @@ static void kvmppc_irq_bypass_del_producer_hv(struct 
> irq_bypass_consumer *cons,
>  }
>  #endif
>  
> +/*
> + * Hook to handle machine check exceptions occurred inside a guest.
> + * This hook is invoked from host virtual mode from KVM before exiting
> + * the guest with KVM_EXIT_NMI exit reason. This gives an opportunity
> + * for the host to take action (if any) before passing on the machine
> + * check exception to the guest kernel.
> + */
> +static void kvmppc_machine_check_hook(void)
> +{
> + if (ppc_md.machine_check_exception)
> + ppc_md.machine_check_exception(NULL);
> +}
> +
>  static long kvm_arch_vm_ioctl_hv(struct file *filp,
>unsigned int ioctl, unsigned long arg)
>  {
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
> b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index c3c1d1b..9b41390 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -134,21 +134,18 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
>   stb r0, HSTATE_HWTHREAD_REQ(r13)
>  
>   /*
> -  * For external 

[PATCH v2 4/4] cxlflash: Cancel scheduled workers before stopping AFU

2017-01-11 Thread Uma Krishnan
When processing an AFU asynchronous interrupt, if the action results in an
operation that requires off level processing (a link reset for example),
the worker thread is scheduled. In the meantime a reset event (i.e.: EEH)
could unmap the AFU to recover. This results in an Oops when the worker
thread tries to access the AFU mapping.

[c00f17e03b90] d7cd5978 cxlflash_worker_thread+0x268/0x550
[c00f17e03c40] c011883c process_one_work+0x1dc/0x680
[c00f17e03ce0] c0118e80 worker_thread+0x1a0/0x520
[c00f17e03d80] c0126174 kthread+0xf4/0x100
[c00f17e03e30] c000a47c ret_from_kernel_thread+0x5c/0xe0

In an effort to avoid this, a mapcount was introduced in
commit b45cdbaf9f7f ("cxlflash: Resolve oops in wait_port_offline")
but due to the race condition described above, this solution is incomplete.

In order to fully resolve this problem and to simplify things, this commit
removes the mapcount solution. Instead, the scheduled worker thread is
cancelled after interrupts have been disabled and prior to the mapping
being freed.
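The ordering the fix establishes can be modelled with plain pthreads: stop the worker first, and only then free the mapping it dereferences. Names are illustrative and the model is simplified; pthread_join() stands in for cancel_work_sync().

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

static atomic_int stop_requested;
static atomic_uintptr_t mapping;   /* stands in for afu->afu_map */

/* Worker repeatedly touches the mapping, like the driver's worker
 * thread touching the AFU MMIO map. */
static void *worker(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop_requested)) {
        volatile int *map = (int *)atomic_load(&mapping);
        if (map)
            (void)*map;            /* only safe while the mapping is live */
    }
    return NULL;
}

/* The fixed teardown order: stop the worker, THEN free the mapping.
 * Freeing first would reintroduce the use-after-free the oops shows. */
static void stop_and_unmap(pthread_t t)
{
    atomic_store(&stop_requested, 1);
    pthread_join(t, NULL);         /* worker can no longer touch the map */
    free((void *)atomic_load(&mapping));
    atomic_store(&mapping, 0);
}
```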

Fixes: b45cdbaf9f7f ("cxlflash: Resolve oops in wait_port_offline")
Signed-off-by: Uma Krishnan 
Acked-by: Matthew R. Ochs 
---
 drivers/scsi/cxlflash/common.h |  2 --
 drivers/scsi/cxlflash/main.c   | 34 ++
 2 files changed, 6 insertions(+), 30 deletions(-)

diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h
index dee8657..d11dcc5 100644
--- a/drivers/scsi/cxlflash/common.h
+++ b/drivers/scsi/cxlflash/common.h
@@ -174,8 +174,6 @@ struct afu {
struct sisl_host_map __iomem *host_map; /* MC host map */
struct sisl_ctrl_map __iomem *ctrl_map; /* MC control map */
 
-   struct kref mapcount;
-
ctx_hndl_t ctx_hndl;/* master's context handle */
 
atomic_t hsq_credits;
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index ab38bca..7069639 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -419,16 +419,6 @@ static int send_tmf(struct afu *afu, struct scsi_cmnd 
*scp, u64 tmfcmd)
return rc;
 }
 
-static void afu_unmap(struct kref *ref)
-{
-   struct afu *afu = container_of(ref, struct afu, mapcount);
-
-   if (likely(afu->afu_map)) {
-   cxl_psa_unmap((void __iomem *)afu->afu_map);
-   afu->afu_map = NULL;
-   }
-}
-
 /**
  * cxlflash_driver_info() - information handler for this host driver
  * @host:  SCSI host associated with device.
@@ -459,7 +449,6 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, 
struct scsi_cmnd *scp)
ulong lock_flags;
int nseg = 0;
int rc = 0;
-   int kref_got = 0;
 
dev_dbg_ratelimited(dev, "%s: (scp=%p) %d/%d/%d/%llu "
"cdb=(%08x-%08x-%08x-%08x)\n",
@@ -497,9 +486,6 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, 
struct scsi_cmnd *scp)
break;
}
 
-   kref_get(&cfg->afu->mapcount);
-   kref_got = 1;
-
if (likely(sg)) {
nseg = scsi_dma_map(scp);
if (unlikely(nseg < 0)) {
@@ -530,8 +516,6 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, 
struct scsi_cmnd *scp)
if (unlikely(rc))
scsi_dma_unmap(scp);
 out:
-   if (kref_got)
-   kref_put(&afu->mapcount, afu_unmap);
return rc;
 }
 
@@ -569,13 +553,15 @@ static void free_mem(struct cxlflash_cfg *cfg)
  *
  * Safe to call with AFU in a partially allocated/initialized state.
  *
- * Waits for any active internal AFU commands to timeout and then unmaps
- * the MMIO space.
+ * Cancels scheduled worker threads, waits for any active internal AFU
+ * commands to timeout and then unmaps the MMIO space.
  */
 static void stop_afu(struct cxlflash_cfg *cfg)
 {
struct afu *afu = cfg->afu;
 
+   cancel_work_sync(&cfg->work_q);
+
if (likely(afu)) {
while (atomic_read(&afu->cmds_active))
ssleep(1);
@@ -583,7 +569,6 @@ static void stop_afu(struct cxlflash_cfg *cfg)
cxl_psa_unmap((void __iomem *)afu->afu_map);
afu->afu_map = NULL;
}
-   kref_put(&afu->mapcount, afu_unmap);
}
 }
 
@@ -767,7 +752,6 @@ static void cxlflash_remove(struct pci_dev *pdev)
scsi_remove_host(cfg->host);
/* fall through */
case INIT_STATE_AFU:
-   cancel_work_sync(&cfg->work_q);
term_afu(cfg);
case INIT_STATE_PCI:
pci_disable_device(pdev);
@@ -1277,7 +1261,6 @@ static irqreturn_t cxlflash_async_err_irq(int irq, void 
*data)
__func__, port);
cfg->lr_state = LINK_RESET_REQUIRED;
cfg->lr_port = port;
-   kref_get(&cfg->afu->mapcount);
  

[PATCH v2 3/4] cxlflash: Cleanup prints

2017-01-11 Thread Uma Krishnan
From: "Matthew R. Ochs" 

The usage of prints within the cxlflash driver is inconsistent. This
hinders debug and makes the driver source and log output appear sloppy.

The following cleanups help unify the prints within cxlflash:
 - move all prints to dev-* where possible
 - transition all hex prints to lowercase
 - standardize variable prints in debug output
 - derive pointers in a consistent manner
 - change int to bool where appropriate
 - remove superfluous data from prints and print statements that do not
   make sense

Signed-off-by: Matthew R. Ochs 
Signed-off-by: Uma Krishnan 
Reviewed-by: Andrew Donnellan 
---
 drivers/scsi/cxlflash/lunmgt.c|  31 ++--
 drivers/scsi/cxlflash/main.c  | 319 ++
 drivers/scsi/cxlflash/superpipe.c | 165 ++--
 drivers/scsi/cxlflash/vlun.c  | 169 ++--
 4 files changed, 346 insertions(+), 338 deletions(-)

diff --git a/drivers/scsi/cxlflash/lunmgt.c b/drivers/scsi/cxlflash/lunmgt.c
index 6c318db9..0efed17 100644
--- a/drivers/scsi/cxlflash/lunmgt.c
+++ b/drivers/scsi/cxlflash/lunmgt.c
@@ -32,11 +32,13 @@
  */
 static struct llun_info *create_local(struct scsi_device *sdev, u8 *wwid)
 {
+   struct cxlflash_cfg *cfg = shost_priv(sdev->host);
+   struct device *dev = &cfg->dev->dev;
struct llun_info *lli = NULL;
 
lli = kzalloc(sizeof(*lli), GFP_KERNEL);
if (unlikely(!lli)) {
-   pr_err("%s: could not allocate lli\n", __func__);
+   dev_err(dev, "%s: could not allocate lli\n", __func__);
goto out;
}
 
@@ -58,11 +60,13 @@ static struct llun_info *create_local(struct scsi_device 
*sdev, u8 *wwid)
  */
 static struct glun_info *create_global(struct scsi_device *sdev, u8 *wwid)
 {
+   struct cxlflash_cfg *cfg = shost_priv(sdev->host);
+   struct device *dev = &cfg->dev->dev;
struct glun_info *gli = NULL;
 
gli = kzalloc(sizeof(*gli), GFP_KERNEL);
if (unlikely(!gli)) {
-   pr_err("%s: could not allocate gli\n", __func__);
+   dev_err(dev, "%s: could not allocate gli\n", __func__);
goto out;
}
 
@@ -129,10 +133,10 @@ static struct glun_info *lookup_global(u8 *wwid)
  */
 static struct llun_info *find_and_create_lun(struct scsi_device *sdev, u8 
*wwid)
 {
+   struct cxlflash_cfg *cfg = shost_priv(sdev->host);
+   struct device *dev = &cfg->dev->dev;
struct llun_info *lli = NULL;
struct glun_info *gli = NULL;
-   struct Scsi_Host *shost = sdev->host;
-   struct cxlflash_cfg *cfg = shost_priv(shost);
 
if (unlikely(!wwid))
goto out;
@@ -165,7 +169,7 @@ static struct llun_info *find_and_create_lun(struct 
scsi_device *sdev, u8 *wwid)
list_add(>list, );
 
 out:
-   pr_debug("%s: returning %p\n", __func__, lli);
+   dev_dbg(dev, "%s: returning lli=%p, gli=%p\n", __func__, lli, gli);
return lli;
 }
 
@@ -225,17 +229,18 @@ void cxlflash_term_global_luns(void)
 int cxlflash_manage_lun(struct scsi_device *sdev,
struct dk_cxlflash_manage_lun *manage)
 {
-   int rc = 0;
+   struct cxlflash_cfg *cfg = shost_priv(sdev->host);
+   struct device *dev = &cfg->dev->dev;
struct llun_info *lli = NULL;
+   int rc = 0;
u64 flags = manage->hdr.flags;
u32 chan = sdev->channel;
 
mutex_lock(&global.mutex);
lli = find_and_create_lun(sdev, manage->wwid);
-   pr_debug("%s: ENTER: WWID = %016llX%016llX, flags = %016llX li = %p\n",
-__func__, get_unaligned_be64(&manage->wwid[0]),
-get_unaligned_be64(&manage->wwid[8]),
-manage->hdr.flags, lli);
+   dev_dbg(dev, "%s: WWID=%016llx%016llx, flags=%016llx lli=%p\n",
+   __func__, get_unaligned_be64(&manage->wwid[0]),
+   get_unaligned_be64(&manage->wwid[8]), manage->hdr.flags, lli);
if (unlikely(!lli)) {
rc = -ENOMEM;
goto out;
@@ -265,11 +270,11 @@ int cxlflash_manage_lun(struct scsi_device *sdev,
}
}
 
-   pr_debug("%s: port_sel = %08X chan = %u lun_id = %016llX\n", __func__,
-lli->port_sel, chan, lli->lun_id[chan]);
+   dev_dbg(dev, "%s: port_sel=%08x chan=%u lun_id=%016llx\n",
+   __func__, lli->port_sel, chan, lli->lun_id[chan]);
 
 out:
mutex_unlock(&global.mutex);
-   pr_debug("%s: returning rc=%d\n", __func__, rc);
+   dev_dbg(dev, "%s: returning rc=%d\n", __func__, rc);
return rc;
 }
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index d2bac4b..ab38bca 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -43,6 +43,9 @@ MODULE_LICENSE("GPL");
  */
 static void process_cmd_err(struct afu_cmd *cmd, struct scsi_cmnd *scp)
 {
+   struct afu *afu = cmd->parent;
+   struct 

[PATCH v2 2/4] cxlflash: Support SQ Command Mode

2017-01-11 Thread Uma Krishnan
From: "Matthew R. Ochs" 

The SISLite specification outlines a new queuing model to improve
over the MMIO-based IOARRIN model that exists today. This new model
uses a submission queue that exists in host memory and is shared with
the device. Each entry in the queue is an IOARCB that describes a
transfer request. When requests are submitted, IOARCBs ('current'
position tracked in host software) are populated and the submission
queue tail pointer is then updated via MMIO to make the device aware
of the requests.
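A minimal sketch of this queuing model, with a hypothetical entry layout and ring size (not the actual SISLite structures):

```c
#include <assert.h>
#include <stdint.h>

#define SQ_ENTRIES 4                /* illustrative; driver uses NUM_SQ_ENTRY */

struct ioarcb { uint64_t data; };   /* stand-in for the real IOARCB */

/* Host-memory submission queue shared with the device: a ring of
 * request descriptors, a 'current' position tracked in host software,
 * and a credit count of free slots (returned on completion). */
struct sq {
    struct ioarcb ring[SQ_ENTRIES];
    unsigned curr;
    int credits;
};

/* Returns the slot index used, or -1 when the queue is full (the
 * driver would return SCSI_MLQUEUE_HOST_BUSY). */
static int sq_submit(struct sq *q, const struct ioarcb *req)
{
    unsigned slot;

    if (q->credits <= 0)
        return -1;
    q->credits--;
    slot = q->curr;
    q->ring[slot] = *req;                    /* populate the IOARCB */
    q->curr = (q->curr + 1) % SQ_ENTRIES;    /* wrap, like hsq_curr/hsq_end */
    /* here the driver would write the new tail pointer via MMIO to
     * make the device aware of the request */
    return (int)slot;
}
```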

Signed-off-by: Matthew R. Ochs 
Signed-off-by: Uma Krishnan 
---
 drivers/scsi/cxlflash/common.h | 30 +++-
 drivers/scsi/cxlflash/main.c   | 98 --
 drivers/scsi/cxlflash/sislite.h| 19 +++-
 drivers/scsi/cxlflash/superpipe.c  | 18 +--
 include/uapi/scsi/cxlflash_ioctl.h |  1 +
 5 files changed, 153 insertions(+), 13 deletions(-)

diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h
index 0e9de5d..dee8657 100644
--- a/drivers/scsi/cxlflash/common.h
+++ b/drivers/scsi/cxlflash/common.h
@@ -54,6 +54,9 @@ extern const struct file_operations cxlflash_cxl_fops;
 /* RRQ for master issued cmds */
 #define NUM_RRQ_ENTRY   CXLFLASH_MAX_CMDS
 
+/* SQ for master issued cmds */
+#define NUM_SQ_ENTRY   CXLFLASH_MAX_CMDS
+
 
 static inline void check_sizes(void)
 {
@@ -155,8 +158,8 @@ static inline struct afu_cmd *sc_to_afucz(struct scsi_cmnd 
*sc)
 
 struct afu {
/* Stuff requiring alignment go first. */
-
-   u64 rrq_entry[NUM_RRQ_ENTRY];   /* 2K RRQ */
+   struct sisl_ioarcb sq[NUM_SQ_ENTRY];/* 16K SQ */
+   u64 rrq_entry[NUM_RRQ_ENTRY];   /* 2K RRQ */
 
/* Beware of alignment till here. Preferably introduce new
 * fields after this point
@@ -174,6 +177,12 @@ struct afu {
struct kref mapcount;
 
ctx_hndl_t ctx_hndl;/* master's context handle */
+
+   atomic_t hsq_credits;
+   spinlock_t hsq_slock;
+   struct sisl_ioarcb *hsq_start;
+   struct sisl_ioarcb *hsq_end;
+   struct sisl_ioarcb *hsq_curr;
u64 *hrrq_start;
u64 *hrrq_end;
u64 *hrrq_curr;
@@ -191,6 +200,23 @@ struct afu {
 
 };
 
+static inline bool afu_is_cmd_mode(struct afu *afu, u64 cmd_mode)
+{
+   u64 afu_cap = afu->interface_version >> SISL_INTVER_CAP_SHIFT;
+
+   return afu_cap & cmd_mode;
+}
+
+static inline bool afu_is_sq_cmd_mode(struct afu *afu)
+{
+   return afu_is_cmd_mode(afu, SISL_INTVER_CAP_SQ_CMD_MODE);
+}
+
+static inline bool afu_is_ioarrin_cmd_mode(struct afu *afu)
+{
+   return afu_is_cmd_mode(afu, SISL_INTVER_CAP_IOARRIN_CMD_MODE);
+}
+
 static inline u64 lun_to_lunid(u64 lun)
 {
__be64 lun_id;
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index a990efb..d2bac4b 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -227,6 +227,17 @@ static void context_reset_ioarrin(struct afu_cmd *cmd)
 }
 
 /**
+ * context_reset_sq() - reset command owner context w/ SQ Context Reset register
+ * @cmd:   AFU command that timed out.
+ */
+static void context_reset_sq(struct afu_cmd *cmd)
+{
+   struct afu *afu = cmd->parent;
+
+   context_reset(cmd, &afu->host_map->sq_ctx_reset);
+}
+
+/**
  * send_cmd_ioarrin() - sends an AFU command via IOARRIN register
  * @afu:   AFU associated with the host.
  * @cmd:   AFU command to send.
@@ -269,6 +280,49 @@ static int send_cmd_ioarrin(struct afu *afu, struct afu_cmd *cmd)
 }
 
 /**
+ * send_cmd_sq() - sends an AFU command via SQ ring
+ * @afu:   AFU associated with the host.
+ * @cmd:   AFU command to send.
+ *
+ * Return:
+ * 0 on success, SCSI_MLQUEUE_HOST_BUSY on failure
+ */
+static int send_cmd_sq(struct afu *afu, struct afu_cmd *cmd)
+{
+   struct cxlflash_cfg *cfg = afu->parent;
+   struct device *dev = &cfg->dev->dev;
+   int rc = 0;
+   int newval;
+   ulong lock_flags;
+
+   newval = atomic_dec_if_positive(&afu->hsq_credits);
+   if (newval <= 0) {
+   rc = SCSI_MLQUEUE_HOST_BUSY;
+   goto out;
+   }
+
+   cmd->rcb.ioasa = &cmd->sa;
+
+   spin_lock_irqsave(&afu->hsq_slock, lock_flags);
+
+   *afu->hsq_curr = cmd->rcb;
+   if (afu->hsq_curr < afu->hsq_end)
+   afu->hsq_curr++;
+   else
+   afu->hsq_curr = afu->hsq_start;
+   writeq_be((u64)afu->hsq_curr, &afu->host_map->sq_tail);
+
+   spin_unlock_irqrestore(&afu->hsq_slock, lock_flags);
+out:
+   dev_dbg(dev, "%s: cmd=%p len=%d ea=%p ioasa=%p rc=%d curr=%p "
+  "head=%016llX tail=%016llX\n", __func__, cmd, cmd->rcb.data_len,
+  (void *)cmd->rcb.data_ea, cmd->rcb.ioasa, rc, afu->hsq_curr,
+  readq_be(&afu->host_map->sq_head),
+  readq_be(&afu->host_map->sq_tail));
+   return rc;
+}
+
+/**
  * 

[PATCH v2 1/4] cxlflash: Refactor context reset to share reset logic

2017-01-11 Thread Uma Krishnan
From: "Matthew R. Ochs" 

As staging for supporting hardware with different context reset
registers but a similar reset procedure, refactor the existing context
reset routine to move the reset logic to a common routine. This will
allow hardware with a different reset register to leverage existing
code.
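The refactoring pattern being described — one common routine that takes the reset register as a parameter and polls it with a doubling delay, with thin per-register wrappers — can be sketched as follows. The MMIO register is simulated with a plain variable and the names (`read_reg`, `reset_via_ioarrin`, the retry bound) are illustrative, not the driver's actual symbols:

```c
#include <assert.h>

#define RETRY_CNT 10            /* illustrative retry bound */

/* Simulated device: clears the reset bit after a few register reads. */
static unsigned long long fake_reg;
static int reads_until_done;

static unsigned long long read_reg(unsigned long long *reg)
{
    if (--reads_until_done <= 0)
        *reg = 0;               /* device acknowledges the reset */
    return *reg;
}

/*
 * Common reset logic: write the reset bit, then poll the same register
 * until the device clears it, doubling the delay each iteration
 * (the udelay() is elided in this sketch).
 */
static int context_reset(unsigned long long *reset_reg)
{
    int nretry = 0;
    unsigned long long rrin = 0x1;

    *reset_reg = rrin;
    do {
        rrin = read_reg(reset_reg);
        if (rrin != 0x1)
            return 0;           /* reset completed */
        /* udelay(1 << nretry);  -- double the delay each time */
    } while (nretry++ < RETRY_CNT);
    return -1;                  /* timed out */
}

/* Per-register wrapper, mirroring context_reset_ioarrin()/context_reset_sq():
 * hardware with a different reset register only needs another wrapper. */
static int reset_via_ioarrin(void)
{
    return context_reset(&fake_reg);
}
```

The point of the refactor is visible here: the polling loop is written once, and each hardware variant contributes only a one-line wrapper naming its register.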

Signed-off-by: Matthew R. Ochs 
Signed-off-by: Uma Krishnan 
---
 drivers/scsi/cxlflash/main.c | 20 
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index b17ebf6..a990efb 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -188,10 +188,11 @@ static void cmd_complete(struct afu_cmd *cmd)
 }
 
 /**
- * context_reset_ioarrin() - reset command owner context via IOARRIN register
+ * context_reset() - reset command owner context via specified register
  * @cmd:   AFU command that timed out.
+ * @reset_reg: MMIO register to perform reset.
  */
-static void context_reset_ioarrin(struct afu_cmd *cmd)
+static void context_reset(struct afu_cmd *cmd, __be64 __iomem *reset_reg)
 {
int nretry = 0;
u64 rrin = 0x1;
@@ -201,9 +202,9 @@ static void context_reset_ioarrin(struct afu_cmd *cmd)
 
pr_debug("%s: cmd=%p\n", __func__, cmd);
 
-   writeq_be(rrin, &afu->host_map->ioarrin);
+   writeq_be(rrin, reset_reg);
do {
-   rrin = readq_be(&afu->host_map->ioarrin);
+   rrin = readq_be(reset_reg);
if (rrin != 0x1)
break;
/* Double delay each time */
@@ -215,6 +216,17 @@ static void context_reset_ioarrin(struct afu_cmd *cmd)
 }
 
 /**
+ * context_reset_ioarrin() - reset command owner context via IOARRIN register
+ * @cmd:   AFU command that timed out.
+ */
+static void context_reset_ioarrin(struct afu_cmd *cmd)
+{
+   struct afu *afu = cmd->parent;
+
+   context_reset(cmd, &afu->host_map->ioarrin);
+}
+
+/**
  * send_cmd_ioarrin() - sends an AFU command via IOARRIN register
  * @afu:   AFU associated with the host.
  * @cmd:   AFU command to send.
-- 
2.1.0



[PATCH v2 0/4] cxlflash: Enhancements, cleanup and fixes

2017-01-11 Thread Uma Krishnan
This patch series includes an enhancement to support a new command queuing
model and also cleans up prints throughout the driver. The last patch in
the series fixes a race condition.

The series is based upon v4.10-rc2, intended for 4.11 and is bisectable.

v2 Changes:
- Fixed SOBs for all the submitted patches

Matthew R. Ochs (3):
  cxlflash: Refactor context reset to share reset logic
  cxlflash: Support SQ Command Mode
  cxlflash: Cleanup prints

Uma Krishnan (1):
  cxlflash: Cancel scheduled workers before stopping AFU

 drivers/scsi/cxlflash/common.h |  32 ++-
 drivers/scsi/cxlflash/lunmgt.c |  31 +--
 drivers/scsi/cxlflash/main.c   | 465 +
 drivers/scsi/cxlflash/sislite.h|  19 +-
 drivers/scsi/cxlflash/superpipe.c  | 183 ---
 drivers/scsi/cxlflash/vlun.c   | 169 +++---
 include/uapi/scsi/cxlflash_ioctl.h |   1 +
 7 files changed, 518 insertions(+), 382 deletions(-)

-- 
2.1.0



Re: bootx_init.c:88: undefined reference to `__stack_chk_fail_local'

2017-01-11 Thread Segher Boessenkool
On Tue, Jan 10, 2017 at 07:26:15AM +0100, Christophe LEROY wrote:
> >Maybe ppc32 is not supposed to be built with CC_STACKPROTECTOR ?
> 
> Indeed, the latest versions of GCC no longer use the global variable
> __stack_chk_guard as the canary value, but a value stored at -0x7008(r2).
> This is not compatible with the current implementation of the kernel,
> which uses r2 as a pointer to the current task struct.
> So until we fix it, I don't think CC_STACKPROTECTOR is usable on PPC
> with modern versions of GCC.

I still wonder what changed.  Nothing relevant has changed for ten years
or whatever as far as I see; unless it is just the -fstack-protector-strong
that makes it fail now.  Curious.


Segher
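For reference, the mechanism under discussion — a guard word placed in the frame by the prologue and verified in the epilogue, with the guard either a global (ARM's __stack_chk_guard) or a per-thread TLS value (newer powerpc GCC, at -0x7008(r2)) — can be illustrated with a hand-rolled sketch. The names (`stack_guard`, `report_stack_smash`, `frame_enter`/`frame_leave`) are invented for the illustration; the real check is emitted by the compiler, not written by hand:

```c
#include <assert.h>

/* Global guard word, as with ARM's __stack_chk_guard; newer powerpc GCC
 * instead fetches the guard from thread-local storage at -0x7008(r2). */
static unsigned long stack_guard = 0xdeadbeefUL;

static int guard_failed;
static void report_stack_smash(void)
{
    guard_failed = 1;            /* stand-in for __stack_chk_fail() */
}

/* A frame with an explicit canary slot, mimicking what the compiler
 * inserts between local buffers and the saved return address. */
struct frame {
    char buf[16];
    unsigned long canary;
};

static void frame_enter(struct frame *f)
{
    f->canary = stack_guard;     /* prologue: place the canary */
}

static void frame_leave(struct frame *f)
{
    if (f->canary != stack_guard) /* epilogue: verify before returning */
        report_stack_smash();
}
```

A buffer overrun past `buf` would clobber `canary`, so the epilogue check fires before a corrupted return address can be used — which is also why the guard's location (global symbol vs. TLS offset off r2) must match between the compiler and the kernel's runtime.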


Re: [PATCH 2/9] Move dma_ops from archdata into struct device

2017-01-11 Thread Bart Van Assche
On Wed, 2017-01-11 at 21:31 +0100, gre...@linuxfoundation.org wrote:
> That's a big sign that your patch series needs work.  Break it up into
> smaller pieces, it should be possible, which will make merges easier
> (well, different in a way.)

Hello Greg,

Can you have a look at the attached patches? These three patches are a
splitup of the single patch at the start of this e-mail thread.

Thanks,

Bart.

From a6fe3a6db80f2bc359e049b72e13aa171fff6ffa Mon Sep 17 00:00:00 2001
From: Bart Van Assche 
Date: Wed, 11 Jan 2017 13:31:42 -0800
Subject: [PATCH 1/3] treewide: Move dma_ops from struct dev_archdata into
 struct device

This change is necessary to make the dma_ops pointer configurable
per device on architectures that do not yet implement set_dma_ops().
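The pattern this series moves into common code — a per-device ops pointer consulted first, falling back to an architecture-wide default — boils down to the following sketch. The structs and the `id` field are simplified stand-ins, not the kernel's actual types:

```c
#include <assert.h>
#include <stddef.h>

struct dma_map_ops { int id; };

static const struct dma_map_ops default_dma_ops = { .id = 0 };
static const struct dma_map_ops rdma_dma_ops    = { .id = 1 };  /* e.g. a qib-style override */

struct device {
    const struct dma_map_ops *dma_ops;   /* lived in dev->archdata before this series */
};

static const struct dma_map_ops *get_dma_ops(struct device *dev)
{
    if (dev && dev->dma_ops)
        return dev->dma_ops;             /* per-device override wins */
    return &default_dma_ops;             /* arch-wide default otherwise */
}

static void set_dma_ops(struct device *dev, const struct dma_map_ops *ops)
{
    dev->dma_ops = ops;
}
```

Moving the pointer into `struct device` is what lets `set_dma_ops()` be implemented once in architecture-independent code instead of per-arch in `dev_archdata`.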

Signed-off-by: Bart Van Assche 
---
 arch/arm/include/asm/device.h| 1 -
 arch/arm/include/asm/dma-mapping.h   | 6 +++---
 arch/arm64/include/asm/device.h  | 1 -
 arch/arm64/include/asm/dma-mapping.h | 4 ++--
 arch/arm64/mm/dma-mapping.c  | 8 
 arch/m32r/include/asm/device.h   | 1 -
 arch/m32r/include/asm/dma-mapping.h  | 4 ++--
 arch/mips/include/asm/device.h   | 5 -
 arch/mips/include/asm/dma-mapping.h  | 4 ++--
 arch/mips/pci/pci-octeon.c   | 2 +-
 arch/powerpc/include/asm/device.h| 4 
 arch/powerpc/include/asm/dma-mapping.h   | 4 ++--
 arch/powerpc/kernel/dma.c| 2 +-
 arch/powerpc/platforms/cell/iommu.c  | 2 +-
 arch/powerpc/platforms/pasemi/iommu.c| 2 +-
 arch/powerpc/platforms/pasemi/setup.c| 2 +-
 arch/powerpc/platforms/ps3/system-bus.c  | 4 ++--
 arch/powerpc/platforms/pseries/ibmebus.c | 2 +-
 arch/s390/include/asm/device.h   | 1 -
 arch/s390/include/asm/dma-mapping.h  | 4 ++--
 arch/s390/pci/pci.c  | 2 +-
 arch/tile/include/asm/device.h   | 3 ---
 arch/tile/include/asm/dma-mapping.h  | 6 +++---
 arch/x86/include/asm/device.h| 3 ---
 arch/x86/include/asm/dma-mapping.h   | 4 ++--
 arch/x86/kernel/pci-calgary_64.c | 4 ++--
 arch/x86/pci/common.c| 2 +-
 arch/x86/pci/sta2x11-fixup.c | 8 
 arch/xtensa/include/asm/device.h | 4 
 arch/xtensa/include/asm/dma-mapping.h| 4 ++--
 drivers/infiniband/ulp/srpt/ib_srpt.c| 2 +-
 drivers/iommu/amd_iommu.c| 6 +++---
 drivers/misc/mic/bus/mic_bus.c   | 2 +-
 drivers/misc/mic/bus/scif_bus.c  | 2 +-
 include/linux/device.h   | 1 +
 35 files changed, 47 insertions(+), 69 deletions(-)

diff --git a/arch/arm/include/asm/device.h b/arch/arm/include/asm/device.h
index d8a572f9c187..220ba207be91 100644
--- a/arch/arm/include/asm/device.h
+++ b/arch/arm/include/asm/device.h
@@ -7,7 +7,6 @@
 #define ASMARM_DEVICE_H
 
 struct dev_archdata {
-	const struct dma_map_ops	*dma_ops;
 #ifdef CONFIG_DMABOUNCE
 	struct dmabounce_device_info *dmabounce;
 #endif
diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index 1aabd781306f..312f4d0564d6 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -18,8 +18,8 @@ extern const struct dma_map_ops arm_coherent_dma_ops;
 
 static inline const struct dma_map_ops *__generic_dma_ops(struct device *dev)
 {
-	if (dev && dev->archdata.dma_ops)
-		return dev->archdata.dma_ops;
+	if (dev && dev->dma_ops)
+		return dev->dma_ops;
	return &arm_dma_ops;
 }
 
@@ -34,7 +34,7 @@ static inline const struct dma_map_ops *get_dma_ops(struct device *dev)
 static inline void set_dma_ops(struct device *dev, const struct dma_map_ops *ops)
 {
 	BUG_ON(!dev);
-	dev->archdata.dma_ops = ops;
+	dev->dma_ops = ops;
 }
 
 #define HAVE_ARCH_DMA_SUPPORTED 1
diff --git a/arch/arm64/include/asm/device.h b/arch/arm64/include/asm/device.h
index 00c678cc31e1..73d5bab015eb 100644
--- a/arch/arm64/include/asm/device.h
+++ b/arch/arm64/include/asm/device.h
@@ -17,7 +17,6 @@
 #define __ASM_DEVICE_H
 
 struct dev_archdata {
-	const struct dma_map_ops *dma_ops;
 #ifdef CONFIG_IOMMU_API
 	void *iommu;			/* private IOMMU data */
 #endif
diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h
index 1fedb43be712..58ae36cc3b60 100644
--- a/arch/arm64/include/asm/dma-mapping.h
+++ b/arch/arm64/include/asm/dma-mapping.h
@@ -29,8 +29,8 @@ extern const struct dma_map_ops dummy_dma_ops;
 
 static inline const struct dma_map_ops *__generic_dma_ops(struct device *dev)
 {
-	if (dev && dev->archdata.dma_ops)
-		return dev->archdata.dma_ops;
+	if (dev && dev->dma_ops)
+		return dev->dma_ops;
 
 	/*
 	 * We expect no ISA devices, and all other DMA masters are expected to
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index bcef6368d48f..dbab4c6c084b 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -837,7 +837,7 @@ static bool 

Re: [PATCH v2] pci: hotplug: This patch removes unnecessary return statement using spatch tool

2017-01-11 Thread Bjorn Helgaas
On Sat, Dec 24, 2016 at 03:08:00PM +0530, Rahul Krishnan wrote:
> 
> This patch removes unnecessary return statement using spatch tool
> 
> Signed-off-by: Rahul Krishnan 

Applied to pci/hotplug for v4.11 with Tyrel's Reviewed-by, thanks!

Are there other similar instances elsewhere in drivers/pci?  If so,
can you fix them all at once?

> ---
>  drivers/pci/hotplug/rpadlpar_core.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/hotplug/rpadlpar_core.c b/drivers/pci/hotplug/rpadlpar_core.c
> index dc67f39..78ce2c7 100644
> --- a/drivers/pci/hotplug/rpadlpar_core.c
> +++ b/drivers/pci/hotplug/rpadlpar_core.c
> @@ -455,7 +455,6 @@ static inline int is_dlpar_capable(void)
>  
>  int __init rpadlpar_io_init(void)
>  {
> - int rc = 0;
>  
>   if (!is_dlpar_capable()) {
>   printk(KERN_WARNING "%s: partition not DLPAR capable\n",
> @@ -463,8 +462,7 @@ int __init rpadlpar_io_init(void)
>   return -EPERM;
>   }
>  
> - rc = dlpar_sysfs_init();
> - return rc;
> + return dlpar_sysfs_init();
>  }
>  
>  void rpadlpar_io_exit(void)
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/9] Move dma_ops from archdata into struct device

2017-01-11 Thread gre...@linuxfoundation.org
On Wed, Jan 11, 2017 at 06:17:03PM +, Bart Van Assche wrote:
> On Wed, 2017-01-11 at 07:48 +0100, Greg Kroah-Hartman wrote:
> > On Tue, Jan 10, 2017 at 04:56:41PM -0800, Bart Van Assche wrote:
> > > Several RDMA drivers, e.g. drivers/infiniband/hw/qib, use the CPU to
> > > transfer data between memory and PCIe adapter. Because of performance
> > > reasons it is important that the CPU cache is not flushed when such
> > > drivers transfer data. Make this possible by allowing these drivers to
> > > override the dma_map_ops pointer. Additionally, introduce the function
> > > set_dma_ops() that will be used by a later patch in this series.
> > 
> > When you say things like "additionally", that's a huge flag that this
> > needs to be split up into multiple patches.  No need to add
> > set_dma_ops() here in this patch.
> 
> Hello Greg,
> 
> Some architectures already define a set_dma_ops() function. So what this
> patch does is to move both the dma_ops pointer and the set_dma_ops()
> function from architecture-specific to architecture independent code. I
> don't think that it is possible to separate these two changes. But I
> understand that how I formulated the patch description caused confusion. I
> will rewrite the patch description to make it more clear before I repost
> this patch series.

I think you should separate it out into multiple patches, this is a
mess, as you say below:

> > And I'd argue that it should be dma_ops_set(), and dma_ops_get(), just
> > to keep the namespace sane, but that's probably a different set of
> > patches...
> 
> Every time I rebase and retest this patch series on top of a new kernel
> version I have to modify some of the patches to compensate for changes in
> the architecture code. So I expect that once Linus merges these patches that
> he will have to resolve one or more merge conflicts. Including a rename of
> the functions that query and set the dma_ops pointer in this patch series
> would increase the number of merge conflicts triggered by this patch series
> and would make Linus' job harder. So I hope that you will allow me to
> postpone that rename until a later time ...

That's a big sign that your patch series needs work.  Break it up into
smaller pieces, it should be possible, which will make merges easier
(well, different in a way.)

Good luck, tree-wide changes are not simple.

greg k-h


Re: [PATCH 2/9] Move dma_ops from archdata into struct device

2017-01-11 Thread gre...@linuxfoundation.org
On Wed, Jan 11, 2017 at 06:03:15PM +, Bart Van Assche wrote:
> On Wed, 2017-01-11 at 07:46 +0100, Greg Kroah-Hartman wrote:
> > On Tue, Jan 10, 2017 at 04:56:41PM -0800, Bart Van Assche wrote:
> > > Several RDMA drivers, e.g. drivers/infiniband/hw/qib, use the CPU to
> > > transfer data between memory and PCIe adapter. Because of performance
> > > reasons it is important that the CPU cache is not flushed when such
> > > drivers transfer data. Make this possible by allowing these drivers to
> > > override the dma_map_ops pointer. Additionally, introduce the function
> > > set_dma_ops() that will be used by a later patch in this series.
> > > 
> > > Signed-off-by: Bart Van Assche 
> > > Cc: [ ... ]
> > 
> > That's a crazy cc: list, you should break this up into smaller pieces,
> > otherwise it's going to bounce...
> 
> That's a subset of what scripts/get_maintainer.pl came up with. Suggestions
> for a more appropriate cc-list for a patch like this that touches all
> architectures would be welcome.

You need to break this patch up into a series that can be applied in
sequence, don't change everything all at once.  That's a mess to merge,
as you are finding out.

> > > diff --git a/include/linux/device.h b/include/linux/device.h
> > > index 491b4c0ca633..c7cb225d36b0 100644
> > > --- a/include/linux/device.h
> > > +++ b/include/linux/device.h
> > > @@ -885,6 +885,8 @@ struct dev_links_info {
> > >   * a higher-level representation of the device.
> > >   */
> > >  struct device {
> > > + const struct dma_map_ops *dma_ops; /* See also get_dma_ops() */
> > > +
> > >   struct device   *parent;
> > >  
> > >   struct device_private   *p;
> > 
> > Why not put this new pointer down with the other dma fields in this
> > structure?  Any specific reason it needs to be first?
> 
> Are there CPU architectures for which access to the first member of a
> structure can be encoded and/or executed more efficiently than access to
> other members of a structure? If not, I'm fine with moving the new pointer
> further down.

Why do you think that your pointer is the one that gets to be "most
efficient"?  :)

Seriously, no, it doesn't matter at all, it's all just pointer math
which is very fast.  Put it with the other stuff please, don't try to
optimize something without ever measuring it.

thanks,

greg k-h


Re: [PATCH 2/9] Move dma_ops from archdata into struct device

2017-01-11 Thread Bart Van Assche
On Wed, 2017-01-11 at 07:48 +0100, Greg Kroah-Hartman wrote:
> On Tue, Jan 10, 2017 at 04:56:41PM -0800, Bart Van Assche wrote:
> > Several RDMA drivers, e.g. drivers/infiniband/hw/qib, use the CPU to
> > transfer data between memory and PCIe adapter. Because of performance
> > reasons it is important that the CPU cache is not flushed when such
> > drivers transfer data. Make this possible by allowing these drivers to
> > override the dma_map_ops pointer. Additionally, introduce the function
> > set_dma_ops() that will be used by a later patch in this series.
> 
> When you say things like "additionally", that's a huge flag that this
> needs to be split up into multiple patches.  No need to add
> set_dma_ops() here in this patch.

Hello Greg,

Some architectures already define a set_dma_ops() function. So what this
patch does is to move both the dma_ops pointer and the set_dma_ops()
function from architecture-specific to architecture independent code. I
don't think that it is possible to separate these two changes. But I
understand that how I formulated the patch description caused confusion. I
will rewrite the patch description to make it more clear before I repost
this patch series.

> And I'd argue that it should be dma_ops_set(), and dma_ops_get(), just
> to keep the namespace sane, but that's probably a different set of
> patches...

Every time I rebase and retest this patch series on top of a new kernel
version I have to modify some of the patches to compensate for changes in
the architecture code. So I expect that once Linus merges these patches that
he will have to resolve one or more merge conflicts. Including a rename of
the functions that query and set the dma_ops pointer in this patch series
would increase the number of merge conflicts triggered by this patch series
and would make Linus' job harder. So I hope that you will allow me to
postpone that rename until a later time ...

Bart.

Re: [Linux-c6x-dev] [PATCH v2 7/7] uapi: export all headers under uapi directories

2017-01-11 Thread Mark Salter
On Fri, 2017-01-06 at 10:43 +0100, Nicolas Dichtel wrote:
> Regularly, when a new header is created in include/uapi/, the developer
> forgets to add it in the corresponding Kbuild file. This error is usually
> detected after the release is out.
> 
> In fact, all headers under uapi directories should be exported, thus it's
> useless to have an exhaustive list.
> 
> After this patch, the following files, which were not exported, are now
> exported (with make headers_install_all):
> asm-unicore32/shmparam.h
> asm-unicore32/ucontext.h
> asm-hexagon/shmparam.h
> asm-mips/ucontext.h
> asm-mips/hwcap.h
> asm-mips/reg.h
> drm/vgem_drm.h
> drm/armada_drm.h
> drm/omap_drm.h
> drm/etnaviv_drm.h
> asm-tile/shmparam.h
> asm-blackfin/shmparam.h
> asm-blackfin/ucontext.h
> asm-powerpc/perf_regs.h
> rdma/qedr-abi.h
> asm-parisc/kvm_para.h
> asm-openrisc/shmparam.h
> asm-nios2/kvm_para.h
> asm-nios2/ucontext.h
> asm-sh/kvm_para.h
> asm-sh/ucontext.h
> asm-xtensa/kvm_para.h
> asm-avr32/kvm_para.h
> asm-m32r/kvm_para.h
> asm-h8300/shmparam.h
> asm-h8300/ucontext.h
> asm-metag/kvm_para.h
> asm-metag/shmparam.h
> asm-metag/ucontext.h
> asm-m68k/kvm_para.h
> asm-m68k/shmparam.h
> linux/bcache.h
> linux/kvm.h
> linux/kvm_para.h
> linux/kfd_ioctl.h
> linux/cryptouser.h
> linux/kcm.h
> linux/kcov.h
> linux/seg6_iptunnel.h
> linux/stm.h
> linux/genwqe
> linux/genwqe/.install
> linux/genwqe/genwqe_card.h
> linux/genwqe/..install.cmd
> linux/seg6.h
> linux/cifs
> linux/cifs/.install
> linux/cifs/cifs_mount.h
> linux/cifs/..install.cmd
> linux/auto_dev-ioctl.h
> 
> Thanks to Julien Floret  for the tip to get all
> subdirs with a pure makefile command.
> 
> Signed-off-by: Nicolas Dichtel 
> ---
>  Documentation/kbuild/makefiles.txt  |  41 ++-
>  arch/alpha/include/uapi/asm/Kbuild  |  41 ---
>  arch/arc/include/uapi/asm/Kbuild|   3 -
>  arch/arm/include/uapi/asm/Kbuild|  17 -
>  arch/arm64/include/uapi/asm/Kbuild  |  18 --
>  arch/avr32/include/uapi/asm/Kbuild  |  20 --
>  arch/blackfin/include/uapi/asm/Kbuild   |  17 -
>  arch/c6x/include/uapi/asm/Kbuild|   8 -
>  arch/cris/include/uapi/arch-v10/arch/Kbuild |   5 -
>  arch/cris/include/uapi/arch-v32/arch/Kbuild |   3 -
>  arch/cris/include/uapi/asm/Kbuild   |  43 +--
>  arch/frv/include/uapi/asm/Kbuild|  33 --
>  arch/h8300/include/uapi/asm/Kbuild  |  28 --
>  arch/hexagon/include/asm/Kbuild |   3 -
>  arch/hexagon/include/uapi/asm/Kbuild|  13 -
>  arch/ia64/include/uapi/asm/Kbuild   |  45 ---
>  arch/m32r/include/uapi/asm/Kbuild   |  31 --
>  arch/m68k/include/uapi/asm/Kbuild   |  24 --
>  arch/metag/include/uapi/asm/Kbuild  |   8 -
>  arch/microblaze/include/uapi/asm/Kbuild |  32 --
>  arch/mips/include/uapi/asm/Kbuild   |  37 ---
>  arch/mn10300/include/uapi/asm/Kbuild|  32 --
>  arch/nios2/include/uapi/asm/Kbuild  |   4 +-
>  arch/openrisc/include/asm/Kbuild|   3 -
>  arch/openrisc/include/uapi/asm/Kbuild   |   8 -
>  arch/parisc/include/uapi/asm/Kbuild |  28 --
>  arch/powerpc/include/uapi/asm/Kbuild|  45 ---
>  arch/s390/include/uapi/asm/Kbuild   |  52 ---
>  arch/score/include/asm/Kbuild   |   4 -
>  arch/score/include/uapi/asm/Kbuild  |  32 --
>  arch/sh/include/uapi/asm/Kbuild |  23 --
>  arch/sparc/include/uapi/asm/Kbuild  |  48 ---
>  arch/tile/include/asm/Kbuild|   3 -
>  arch/tile/include/uapi/arch/Kbuild  |  17 -
>  arch/tile/include/uapi/asm/Kbuild   |  19 +-
>  arch/unicore32/include/uapi/asm/Kbuild  |   6 -
>  arch/x86/include/uapi/asm/Kbuild|  59 
>  arch/xtensa/include/uapi/asm/Kbuild |  23 --
>  include/Kbuild  |   2 -
>  include/asm-generic/Kbuild.asm  |   1 -
>  include/scsi/fc/Kbuild  |   0
>  include/uapi/Kbuild |  15 -
>  include/uapi/asm-generic/Kbuild |  36 ---
>  include/uapi/asm-generic/Kbuild.asm |  62 ++--
>  include/uapi/drm/Kbuild |  22 --
>  include/uapi/linux/Kbuild   | 482 
> 
>  include/uapi/linux/android/Kbuild   |   2 -
>  include/uapi/linux/byteorder/Kbuild |   3 -
>  include/uapi/linux/caif/Kbuild  |   3 -
>  include/uapi/linux/can/Kbuild   |   6 -
>  include/uapi/linux/dvb/Kbuild   |   9 -
>  include/uapi/linux/hdlc/Kbuild  |   2 -
>  include/uapi/linux/hsi/Kbuild   |   2 -
>  include/uapi/linux/iio/Kbuild   |   3 -
>  include/uapi/linux/isdn/Kbuild  |   2 -
>  include/uapi/linux/mmc/Kbuild   |   2 -
>  include/uapi/linux/netfilter/Kbuild |  89 -
>  include/uapi/linux/netfilter/ipset/Kbuild   

Re: [PATCH 2/9] Move dma_ops from archdata into struct device

2017-01-11 Thread Bart Van Assche
On Wed, 2017-01-11 at 07:46 +0100, Greg Kroah-Hartman wrote:
> On Tue, Jan 10, 2017 at 04:56:41PM -0800, Bart Van Assche wrote:
> > Several RDMA drivers, e.g. drivers/infiniband/hw/qib, use the CPU to
> > transfer data between memory and PCIe adapter. Because of performance
> > reasons it is important that the CPU cache is not flushed when such
> > drivers transfer data. Make this possible by allowing these drivers to
> > override the dma_map_ops pointer. Additionally, introduce the function
> > set_dma_ops() that will be used by a later patch in this series.
> > 
> > Signed-off-by: Bart Van Assche 
> > Cc: [ ... ]
> 
> That's a crazy cc: list, you should break this up into smaller pieces,
> otherwise it's going to bounce...

That's a subset of what scripts/get_maintainer.pl came up with. Suggestions
for a more appropriate cc-list for a patch like this that touches all
architectures would be welcome.

> > diff --git a/include/linux/device.h b/include/linux/device.h
> > index 491b4c0ca633..c7cb225d36b0 100644
> > --- a/include/linux/device.h
> > +++ b/include/linux/device.h
> > @@ -885,6 +885,8 @@ struct dev_links_info {
> >   * a higher-level representation of the device.
> >   */
> >  struct device {
> > +   const struct dma_map_ops *dma_ops; /* See also get_dma_ops() */
> > +
> >     struct device   *parent;
> >  
> >     struct device_private   *p;
> 
> Why not put this new pointer down with the other dma fields in this
> structure?  Any specific reason it needs to be first?

Are there CPU architectures for which access to the first member of a
structure can be encoded and/or executed more efficiently than access to
other members of a structure? If not, I'm fine with moving the new pointer
further down.

Bart.

[PATCH] powerpc/pseries: Report DLPAR capabilities

2017-01-11 Thread Nathan Fontenot
As we add the ability to do DLPAR of additional devices through
the sysfs interface we need to know which devices are supported.
This adds reporting of the supported devices as a comma-separated
list in the existing /sys/kernel/dlpar file.
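A userspace consumer of the new attribute would read /sys/kernel/dlpar and split the result on commas. A minimal sketch of that parsing (the helper name is invented; "memory,cpu" is the string the patch's dlpar_show() reports):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Check whether a capability appears in a comma-separated list such as
 * the "memory,cpu" string read from /sys/kernel/dlpar. */
static int dlpar_supports(const char *caps, const char *want)
{
    char buf[64];
    char *tok;

    snprintf(buf, sizeof(buf), "%s", caps);   /* strtok() modifies its input */
    for (tok = strtok(buf, ",\n"); tok; tok = strtok(NULL, ",\n"))
        if (strcmp(tok, want) == 0)
            return 1;
    return 0;
}
```

Tools driving DLPAR through sysfs could probe this way before attempting an add/remove of a device type the kernel does not support.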

Signed-off-by: Nathan Fontenot 
---
 arch/powerpc/platforms/pseries/dlpar.c |8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
index 76caa4a..1152590 100644
--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -551,7 +551,13 @@ static ssize_t dlpar_store(struct class *class, struct class_attribute *attr,
return rc ? rc : count;
 }
 
-static CLASS_ATTR(dlpar, S_IWUSR, NULL, dlpar_store);
+static ssize_t dlpar_show(struct class *class, struct class_attribute *attr,
+ char *buf)
+{
+   return sprintf(buf, "%s\n", "memory,cpu");
+}
+
+static CLASS_ATTR(dlpar, S_IWUSR | S_IRUSR, dlpar_show, dlpar_store);
 
 static int __init pseries_dlpar_init(void)
 {



Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2017-01-11 Thread Josh Poimboeuf
On Wed, Jan 11, 2017 at 04:18:28PM +0100, Petr Mladek wrote:
> On Tue 2017-01-10 14:46:46, Josh Poimboeuf wrote:
> > On Tue, Jan 10, 2017 at 02:00:58PM +0100, Petr Mladek wrote:
> > > On Thu 2016-12-22 12:31:37, Josh Poimboeuf wrote:
> > > > On Thu, Dec 22, 2016 at 03:34:52PM +0100, Petr Mladek wrote:
> > > > > On Wed 2016-12-21 15:25:05, Josh Poimboeuf wrote:
> > > > > > On Tue, Dec 20, 2016 at 06:32:46PM +0100, Petr Mladek wrote:
> > > > > > > On Thu 2016-12-08 12:08:38, Josh Poimboeuf wrote:
> > > > > > > > +   read_unlock(&tasklist_lock);
> > > > > > > > +
> > > > > > > > +   /*
> > > > > > > > +* Ditto for the idle "swapper" tasks, though they 
> > > > > > > > never cross the
> > > > > > > > +* syscall barrier.  Instead they switch over in 
> > > > > > > > cpu_idle_loop().
> > > > > > > > +*/
> > > > > > > > +   get_online_cpus();
> > > > > > > > +   for_each_online_cpu(cpu)
> > > > > > > > +   set_tsk_thread_flag(idle_task(cpu), 
> > > > > > > > TIF_PATCH_PENDING);
> > > > > > > > +   put_online_cpus();
> > > > > > > 
> > > > > > > Also this stage need to be somehow handled by CPU coming/going
> > > > > > > handlers.
> > > > > > 
> > > > > > Here I think we could automatically switch any offline CPUs' idle 
> > > > > > tasks.
> > > > > > And something similar in klp_try_complete_transition().
> > > > > 
> > > > > We still need to make sure to do not race with the cpu_up()/cpu_down()
> > > > > calls.
> > > > 
> > > > Hm, maybe we'd need to call cpu_hotplug_disable() before switching the
> > > > offline idle tasks?
> > > > 
> > > > > I would use here the trick with for_each_possible_cpu() and let
> > > > > the migration for the stack check.
> > > > 
> > > > There are a few issues with that:
> > > > 
> > > > 1) The idle task of a missing CPU doesn't *have* a stack, so it doesn't
> > > >make much sense to try to check it.
> > > > 
> > > > 2) We can't rely *only* on the stack check, because not all arches have
> > > >it.  The other way to migrate idle tasks is from the idle loop switch
> > > >point.  But if the task's CPU is down, its idle loop isn't running so
> > > >it can't migrate.
> > > > 
> > > >(Note this is currently a theoretical point: we currently don't allow
> > > >such arches to use the consistency model anyway because there's no
> > > >way for them to migrate kthreads.)
> > > 
> > > Good points. My only concern is that the transaction might take a long
> > > time or even forever. I am not sure if it is wise to disable cpu_hotplug
> > > for the entire transaction.
> > > 
> > > A compromise might be to disable cpu hotplug only when the task
> > > state is manipulated a more complex way. Hmm, cpu_hotplug_disable()
> > > looks like a rather costly function. We should not call it in
> > > klp_try_complete_transition(). But we could do:
> > > 
> > >   1. When the patch is being enabled, disable cpu hotplug,
> > >  go through each_possible_cpu and setup the transaction
> > >  only for CPUs that are online. Then we could enable
> > >  the hotplug again.
> > > 
> > >   2. Check only each_online_cpu in klp_try_complete_transition().
> > >  If all tasks are migrated, disable cpu hotplug and re-check
> > >  idle tasks on online CPUs. If any is not migrated, enable
> > >  hotplug and return failure. Othewise, continue with
> > >  completion of the transaction. [*]
> > > 
> > >   3. In klp_complete_transition, update all tasks including
> > >  the offline CPUs and enable cpu hotplug again.
> > > 
> > > If the re-check in the 2nd step looks ugly, we could add some hotplug
> > > notifiers to make sure that enabled/disabled CPUs are in a reasonable
> > > state. We still should disable the hotplug in the 1st and 3rd step.
> > > 
> > > BTW: There is a new API for the cpu hotplug callbacks. I was involved
> > > in one conversion. You might take inspiration in
> > > drivers/thermal/intel_powerclamp.c. See cpuhp_setup_state_nocalls()
> > > there.
> > 
> > Backing up a bit, although I brought up cpu_hotplug_disable(), I think I
> > misunderstood the race you mentioned.  I actually don't think
> > cpu_hotplug_disable() is necessary.
> 
> Great backing! You made me study the difference. If I get it
> correctly:
> 
>   + cpu_hotplug_disable() works like a writer lock. It gets
> exclusive access via cpu_hotplug_begin(). A side effect
> is that do_cpu_up() and do_cpu_down() do not wait. They
> return -EBUSY if hotplug is disabled.
> 
>   + get_online_cpus() is kind of reader lock. It makes sure
> that all the hotplug operations are finished and "softly"
> blocks other further operation. By "softly" I mean that
> the operations wait for the exclusive (write) access
> in cpu_hotplug_begin().
> 
> IMHO, we really have to use get_online_cpus() and avoid the
> "hard" blocking.
> 
> 
> > What do you think about something like the following:
>  
> > In klp_start_transition:
> > 
> > 

Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2017-01-11 Thread Petr Mladek
On Tue 2017-01-10 14:46:46, Josh Poimboeuf wrote:
> On Tue, Jan 10, 2017 at 02:00:58PM +0100, Petr Mladek wrote:
> > On Thu 2016-12-22 12:31:37, Josh Poimboeuf wrote:
> > > On Thu, Dec 22, 2016 at 03:34:52PM +0100, Petr Mladek wrote:
> > > > On Wed 2016-12-21 15:25:05, Josh Poimboeuf wrote:
> > > > > On Tue, Dec 20, 2016 at 06:32:46PM +0100, Petr Mladek wrote:
> > > > > > On Thu 2016-12-08 12:08:38, Josh Poimboeuf wrote:
> > > > > > > + read_unlock(_lock);
> > > > > > > +
> > > > > > > + /*
> > > > > > > +  * Ditto for the idle "swapper" tasks, though they never cross the
> > > > > > > +  * syscall barrier.  Instead they switch over in cpu_idle_loop().
> > > > > > > +  */
> > > > > > > + get_online_cpus();
> > > > > > > + for_each_online_cpu(cpu)
> > > > > > > + set_tsk_thread_flag(idle_task(cpu), TIF_PATCH_PENDING);
> > > > > > > + put_online_cpus();
> > > > > > 
> > > > > > Also this stage need to be somehow handled by CPU coming/going
> > > > > > handlers.
> > > > > 
> > > > > Here I think we could automatically switch any offline CPUs' idle 
> > > > > tasks.
> > > > > And something similar in klp_try_complete_transition().
> > > > 
> > > > We still need to make sure to do not race with the cpu_up()/cpu_down()
> > > > calls.
> > > 
> > > Hm, maybe we'd need to call cpu_hotplug_disable() before switching the
> > > offline idle tasks?
> > > 
> > > > I would use here the trick with for_each_possible_cpu() and let
> > > > the migration for the stack check.
> > > 
> > > There are a few issues with that:
> > > 
> > > 1) The idle task of a missing CPU doesn't *have* a stack, so it doesn't
> > >make much sense to try to check it.
> > > 
> > > 2) We can't rely *only* on the stack check, because not all arches have
> > >it.  The other way to migrate idle tasks is from the idle loop switch
> > >point.  But if the task's CPU is down, its idle loop isn't running so
> > >it can't migrate.
> > > 
> > >(Note this is currently a theoretical point: we currently don't allow
> > >such arches to use the consistency model anyway because there's no
> > >way for them to migrate kthreads.)
> > 
> > Good points. My only concern is that the transaction might take a long
> > time or even forever. I am not sure if it is wise to disable cpu_hotplug
> > for the entire transaction.
> > 
> > A compromise might be to disable cpu hotplug only when the task
> > state is manipulated a more complex way. Hmm, cpu_hotplug_disable()
> > looks like a rather costly function. We should not call it in
> > klp_try_complete_transition(). But we could do:
> > 
> >   1. When the patch is being enabled, disable cpu hotplug,
> >  go through each_possible_cpu and setup the transaction
> >  only for CPUs that are online. Then we could enable
> >  the hotplug again.
> > 
> >   2. Check only each_online_cpu in klp_try_complete_transition().
> >  If all tasks are migrated, disable cpu hotplug and re-check
> >  idle tasks on online CPUs. If any is not migrated, enable
> >  hotplug and return failure. Otherwise, continue with
> >  completion of the transaction. [*]
> > 
> >   3. In klp_complete_transition, update all tasks including
> >  the offline CPUs and enable cpu hotplug again.
> > 
> > If the re-check in the 2nd step looks ugly, we could add some hotplug
> > notifiers to make sure that enabled/disabled CPUs are in a reasonable
> > state. We still should disable the hotplug in the 1st and 3rd step.
> > 
> > BTW: There is a new API for the cpu hotplug callbacks. I was involved
> > in one conversion. You might take inspiration from
> > drivers/thermal/intel_powerclamp.c. See cpuhp_setup_state_nocalls()
> > there.
> 
> Backing up a bit, although I brought up cpu_hotplug_disable(), I think I
> misunderstood the race you mentioned.  I actually don't think
> cpu_hotplug_disable() is necessary.

Great backing! You made me study the difference. If I get it
correctly:

  + cpu_hotplug_disable() works like a writer lock. It gets
exclusive access via cpu_hotplug_begin(). A side effect
is that do_cpu_up() and do_cpu_down() do not wait. They
return -EBUSY if hotplug is disabled.

  + get_online_cpus() is a kind of reader lock. It makes sure
that all the hotplug operations are finished and "softly"
blocks further operations. By "softly" I mean that
the operations wait for the exclusive (write) access
in cpu_hotplug_begin().

IMHO, we really have to use get_online_cpus() and avoid
the "hard" blocking.


> What do you think about something like the following:
 
> In klp_start_transition:
> 
>   get_online_cpus();
>   for_each_possible_cpu(cpu)
>   set_tsk_thread_flag(idle_task(cpu), TIF_PATCH_PENDING);
>   put_online_cpus();
>
> In klp_try_complete_transition:
> 
>   get_online_cpus();
>   for_each_possible_cpu(cpu) {
>   task = idle_task(cpu);
>   if (cpu_online(cpu)) {
>  

Re: [PATCH v2 0/7] uapi: export all headers under uapi directories

2017-01-11 Thread Jesper Nilsson
On Mon, Jan 09, 2017 at 12:33:58PM +0100, Arnd Bergmann wrote:
> On Friday, January 6, 2017 10:43:52 AM CET Nicolas Dichtel wrote:
> > Here is the v2 of this series. The first 5 patches are just cleanup: some
> > exported headers were still under a non-uapi directory.
> 
> Since this is meant as a cleanup, I commented on this to point out a cleaner
> way to do the same.
> 
> > The patch 6 was spotted by code review: there is no in-tree user of this
> > functionality.
> > The last patch remove the use of header-y. Now all files under an uapi
> > directory are exported.
> 
> Very nice!
> 
> > asm is a bit special, most architectures export
> > asm//include/uapi/asm
> > only, but there are two exceptions:
> >  - cris which exports arch/cris/include/uapi/arch-v[10|32];
> 
> This is interesting, though not your problem. Maybe someone who understands
> cris better can comment on this: How is the decision made about which of
> the arch/user.h headers gets used? I couldn't find that in the sources,
> but it appears to be based on kernel compile-time settings, which is
> wrong for user space header files that should be independent of the kernel
> config.

I believe it's because CRISv10 and CRISv32 are very different beasts,
and that is selected via kernel config...

This part of the CRIS port has been transformed a couple of times from
the original layout without uapi, and there's still some legacy silliness,
where some files might have been exported but never used from userspace
except for some corner cases.

> >  - tile which exports arch/tile/include/uapi/arch.
> > Because I don't know if the output of 'make headers_install_all' can be
> > changed, I introduce subdir-y in the Kbuild file. The headers_install_all
> > target copies all asm//include/uapi/asm to usr/include/asm- but
> > arch/cris/include/uapi/arch-v[10|32] and arch/tile/include/uapi/arch are not
> > prefixed (they are put as-is in usr/include/). If it's acceptable to modify
> > the output of 'make headers_install_all' to export asm headers in
> > usr/include/asm-/asm, then I could remove this new subdir-y and
> > export everything under arch//include/uapi/.
> 
> I don't know if anyone still uses "make headers_install_all", I suspect
> distros these days all use "make headers_install", so it probably
> doesn't matter much.
> 
> In case of cris, it should be easy enough to move all the contents of the
> uapi/arch-*/*.h headers into the respective uapi/asm/*.h headers, they
> only seem to be referenced from there.

This would seem to be a reasonable change.
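As a toy illustration of the suggested move — folding an arch-specific uapi header into the corresponding asm header so only uapi/asm/ needs exporting — here is a scratch-directory sketch (the paths and header contents are hypothetical stand-ins, not the actual cris headers):

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/uapi/arch-v10" "$tmp/uapi/asm"

# Hypothetical starting point: an arch-v10 header referenced from asm/.
echo '#define V10_THING 1' > "$tmp/uapi/arch-v10/user.h"
echo '#include "../arch-v10/user.h"' > "$tmp/uapi/asm/user.h"

# Fold the arch-v10 body into the asm header and drop the
# cross-directory include.
sed '/arch-v10/d' "$tmp/uapi/asm/user.h" > "$tmp/uapi/asm/user.h.new"
cat "$tmp/uapi/arch-v10/user.h" >> "$tmp/uapi/asm/user.h.new"
mv "$tmp/uapi/asm/user.h.new" "$tmp/uapi/asm/user.h"

# The definition now lives in the asm header.
merged=$(grep -c 'V10_THING' "$tmp/uapi/asm/user.h")
rm -r "$tmp"
echo "merged=$merged"   # prints merged=1
```

In the real tree the fold would of course be done by hand in the source files, not by a script; this only shows the shape of the change.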

> For tile, I suspect that would not work as the arch/*.h headers are
> apparently defined as interfaces for both user space and kernel.
> 
> > Note also that exported files for asm are a mix of files listed by:
> >  - include/uapi/asm-generic/Kbuild.asm;
> >  - arch/x86/include/uapi/asm/Kbuild;
> >  - arch/x86/include/asm/Kbuild.
> > This complicates the processing a lot (arch/x86/include/asm/Kbuild is also
> > used by scripts/Makefile.asm-generic).
> > 
> > This series has been tested with a 'make headers_install' on x86 and a
> > 'make headers_install_all'. I've checked the result of both commands.
> > 
> > This patch is built against Linus' tree. I don't know if it should be
> > made against another tree.
> 
> The series should probably get merged through the kbuild tree, but testing
> it on mainline is fine here.
> 
>   Arnd

/^JN - Jesper Nilsson
-- 
   Jesper Nilsson -- jesper.nils...@axis.com


Re: [PATCH v3 12/15] livepatch: store function sizes

2017-01-11 Thread Kamalesh Babulal

On Thursday 08 December 2016 11:38 PM, Josh Poimboeuf wrote:

For the consistency model we'll need to know the sizes of the old and
new functions to determine if they're on the stacks of any tasks.

Signed-off-by: Josh Poimboeuf 


Reviewed-by: Kamalesh Babulal 

--
cheers,
Kamalesh.



Re: WARNING at fs/sysfs/group.c:237 .sysfs_remove_group+0xc4/0xd0 on Linus mainline

2017-01-11 Thread seeteena
Hi, let me know if the community fix below has been merged into upstream
code.


https://www.mail-archive.com/linux-scsi@vger.kernel.org/msg39187.html

Thanks

Seeteena


On 01/11/2017 12:14 PM, abdul wrote:

Hi

Today's mainline shows a warning while loading/unloading the lpfc module on PowerPC

Machine : Power8 PowerVM LPAR
Kernel version : 4.10.0-rc3

Steps to recreate:
modprobe lpfc
modprobe -r lpfc

[ cut here ]
WARNING: CPU: 3 PID: 3819 at fs/sysfs/group.c:237 .sysfs_remove_group+0xc4/0xd0
Modules linked in: dm_snapshot(E) dm_bufio(E) ext4(E) mbcache(E) 
jbd2(E) ip6t_rpfilter(E) ipt_REJECT(E) nf_reject_ipv4(E) 
ip6t_REJECT(E) nf_reject_ipv6(E) xt_conntrack(E) ip_set(E) 
nfnetlink(E) ebtable_nat(E) ebtable_broute(E) bridge(E) stp(E) llc(E) 
ip6table_nat(E) nf_conntrack_ipv6(E) nf_defrag_ipv6(E) nf_nat_ipv6(E) 
ip6table_mangle(E) ip6table_security(E) ip6table_raw(E) iptable_nat(E) 
nf_conntrack_ipv4(E) nf_defrag_ipv4(E) nf_nat_ipv4(E) nf_nat(E) 
nf_conntrack(E) iptable_mangle(E) iptable_security(E) iptable_raw(E) 
ebtable_filter(E) ebtables(E) ip6table_filter(E) ip6_tables(E) 
iptable_filter(E) rpcrdma(E) sunrpc(E) ib_isert(E) iscsi_target_mod(E) 
ib_iser(E) libiscsi(E) scsi_transport_iscsi(E) ib_srpt(E) 
target_core_mod(E) ib_srp(E) ib_ipoib(E) rdma_ucm(E) ib_ucm(E) 
ib_uverbs(E) ib_umad(E)
rdma_cm(E) ib_cm(E) iw_cm(E) iw_cxgb4(E) ib_core(E) libcxgb(E) 
nls_utf8(E) isofs(E) loop(E) dm_mirror(E) dm_region_hash(E) dm_log(E) 
sg(E) pseries_rng(E) ghash_generic(E) shpchp(E) xts(E) gf128mul(E) 
vmx_crypto(E) dm_multipath(E) dm_mod(E) ip_tables(E) xfs(E) 
libcrc32c(E) sr_mod(E) cdrom(E) sd_mod(E) ibmvscsi(E) 
scsi_transport_srp(E) ibmveth(E) lpfc(E-) cxgb4(E) scsi_transport_fc(E)
CPU: 3 PID: 3819 Comm: rmmod Tainted: GE 4.10.0-rc3-autotest #1

task: c2404200 task.stack: c00158868000
NIP: c038f074 LR: c038f070 CTR: 006338e4
REGS: c0015886b1c0 TRAP: 0700   Tainted: G E (4.10.0-rc3-autotest)
MSR: 8282b032  CR: 22008822  XER: 0004
CFAR: c08a9510 SOFTE: 1
GPR00: c038f070 c0015886b440 c11faa00 0033
GPR04: c0028fbcada0 c0028fbdf628 02ab5b98
GPR08:  c0c9f41c 00028ef3 026d
GPR12: 22008844 ce7f1b00
GPR16:   01000e7e01d0 1001f8e0
GPR20: 1001f898 1001f880 1001f8c0 1001f8f8
GPR24:  c2a011d8 c001890270b0 c110f1f0
GPR28: c000fe81b188 c00189027010 c1128c50

NIP [c038f074] .sysfs_remove_group+0xc4/0xd0
LR [c038f070] .sysfs_remove_group+0xc0/0xd0
Call Trace:
[c0015886b440] [c038f070] .sysfs_remove_group+0xc0/0xd0 (unreliable)
[c0015886b4d0] [c0592950] .dpm_sysfs_remove+0x70/0x90
[c0015886b560] [c057ea6c] .device_del+0x12c/0x3b0
[c0015886b610] [c057ed10] .device_unregister+0x20/0x90
[c0015886b690] [c0452a28] .bsg_unregister_queue+0x78/0x110
[c0015886b720] [c05d130c] .__scsi_remove_device+0xfc/0x130
[c0015886b7a0] [c05ce934] .scsi_forget_host+0x94/0xa0
[c0015886b820] [c05beac4] .scsi_remove_host+0x94/0x1a0
[c0015886b8b0] [d31e2f14] .lpfc_sli4_queue_destroy+0xb94/0x1000 [lpfc]
[c0015886b980] [c04d1b70] .pci_device_remove+0x60/0x100
[c0015886ba10] [c05859f8] .device_release_driver_internal+0x1e8/0x2b0
[c0015886bab0] [c0585b3c] .driver_detach+0x6c/0xf0
[c0015886bb40] [c05840e4] .bus_remove_driver+0x74/0x130
[c0015886bbc0] [c05867d8] .driver_unregister+0x38/0x70
[c0015886bc40] [c04cf4a4] .pci_unregister_driver+0x34/0x120
[c0015886bce0] [d3215d64] .cleanup_module+0x34/0x6fa0 [lpfc]
[c0015886bd60] [c01784f4] .SyS_delete_module+0x1e4/0x280
[c0015886be30] [c000b184] system_call+0x38/0xe0
Instruction dump:
4e800020 6000 6000 4bff93b1 6000 4ba0 e89e e8bd
3c62ff93 3863d598 4851a445 6000 <0fe0> 4bb4 6000 7c0802a6
---[ end trace 8eb607cc8c5cea78 ]---
sysfs group 'power' not found for kobject '2:0:0:0'

Thanks
Abdul