Re: [PATCH 2/6] powerpc: kvm: no need to check return value of debugfs_create functions

2020-03-02 Thread Michael Ellerman
Greg Kroah-Hartman  writes:
> When calling debugfs functions, there is no need to ever check the
> return value.  The function can work or not, but the code logic should
> never do something different based on this.

Except it does need to do something different: if the file was created,
it needs to be removed in the remove path.

> diff --git a/arch/powerpc/kvm/timing.c b/arch/powerpc/kvm/timing.c
> index bfe4f106cffc..8e4791c6f2af 100644
> --- a/arch/powerpc/kvm/timing.c
> +++ b/arch/powerpc/kvm/timing.c
> @@ -207,19 +207,12 @@ static const struct file_operations 
> kvmppc_exit_timing_fops = {
>  void kvmppc_create_vcpu_debugfs(struct kvm_vcpu *vcpu, unsigned int id)
>  {
>   static char dbg_fname[50];
> - struct dentry *debugfs_file;
>  
>   snprintf(dbg_fname, sizeof(dbg_fname), "vm%u_vcpu%u_timing",
>current->pid, id);
> - debugfs_file = debugfs_create_file(dbg_fname, 0666,
> - kvm_debugfs_dir, vcpu,
> - &kvmppc_exit_timing_fops);
> -
> - if (!debugfs_file) {
> - printk(KERN_ERR"%s: error creating debugfs file %s\n",
> - __func__, dbg_fname);
> - return;
> - }
> + debugfs_create_file(dbg_fname, 0666, kvm_debugfs_dir, vcpu,
> + &kvmppc_exit_timing_fops);
> +
>  
>   vcpu->arch.debugfs_exit_timing = debugfs_file;
>  }

This doesn't build:

arch/powerpc/kvm/timing.c:217:35: error: 'debugfs_file' undeclared (first 
use in this function); did you mean 'debugfs_file_put'?

We can't just drop the assignment, we need the dentry to do the removal:

void kvmppc_remove_vcpu_debugfs(struct kvm_vcpu *vcpu)
{
if (vcpu->arch.debugfs_exit_timing) {
debugfs_remove(vcpu->arch.debugfs_exit_timing);
vcpu->arch.debugfs_exit_timing = NULL;
}
}


I squashed this in, which seems to work:

diff --git a/arch/powerpc/kvm/timing.c b/arch/powerpc/kvm/timing.c
index 8e4791c6f2af..5b7a66f86bd5 100644
--- a/arch/powerpc/kvm/timing.c
+++ b/arch/powerpc/kvm/timing.c
@@ -207,19 +207,19 @@ static const struct file_operations 
kvmppc_exit_timing_fops = {
 void kvmppc_create_vcpu_debugfs(struct kvm_vcpu *vcpu, unsigned int id)
 {
static char dbg_fname[50];
+   struct dentry *debugfs_file;
 
snprintf(dbg_fname, sizeof(dbg_fname), "vm%u_vcpu%u_timing",
 current->pid, id);
-   debugfs_create_file(dbg_fname, 0666, kvm_debugfs_dir, vcpu,
-   &kvmppc_exit_timing_fops);
-
+   debugfs_file = debugfs_create_file(dbg_fname, 0666, kvm_debugfs_dir,
+  vcpu, &kvmppc_exit_timing_fops);
 
vcpu->arch.debugfs_exit_timing = debugfs_file;
 }
 
 void kvmppc_remove_vcpu_debugfs(struct kvm_vcpu *vcpu)
 {
-   if (vcpu->arch.debugfs_exit_timing) {
+   if (!IS_ERR_OR_NULL(vcpu->arch.debugfs_exit_timing)) {
debugfs_remove(vcpu->arch.debugfs_exit_timing);
vcpu->arch.debugfs_exit_timing = NULL;
}
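
For context on why the check in the remove path changes: as far as I
know, debugfs_create_file() in current kernels reports failure with an
ERR_PTR()-encoded pointer rather than NULL, so a plain "if (ptr)" test
would treat an error value as a valid dentry. Below is a minimal
userspace sketch of that encoding. The demo_* names and DEMO_MAX_ERRNO
are made up for illustration and only approximate the helpers in
<linux/err.h>; this is not the kernel implementation.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel helpers in <linux/err.h>. */
#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value in the top page of the address space. */
static inline void *demo_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True if the pointer is really an encoded errno value. */
static inline bool demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-DEMO_MAX_ERRNO);
}

static inline bool demo_is_err_or_null(const void *ptr)
{
	return ptr == NULL || demo_is_err(ptr);
}

/* Hypothetical remove-path check: only a real dentry should be removed. */
static inline bool demo_should_remove(const void *dentry)
{
	return !demo_is_err_or_null(dentry);
}
```

An error pointer is non-NULL, so the old NULL-only test would have let
it through to the removal call; the combined check rejects both cases.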


cheers


Re: [PATCH v4 1/8] ASoC: dt-bindings: fsl_asrc: Change asrc-width to asrc-format

2020-03-02 Thread Nicolin Chen
On Tue, Mar 03, 2020 at 11:59:30AM +0800, Shengjiu Wang wrote:
> Hi
> 
> On Tue, Mar 3, 2020 at 9:43 AM Rob Herring  wrote:
> >
> > On Sun, Mar 01, 2020 at 01:24:12PM +0800, Shengjiu Wang wrote:
> > > > asrc_format is more intelligent, which is aligned with the alsa
> > > definition snd_pcm_format_t, we don't need to convert it to
> > > format in driver, and it can distinguish S24_LE & S24_3LE.
> > >
> > > Signed-off-by: Shengjiu Wang 
> > > ---
> > >  Documentation/devicetree/bindings/sound/fsl,asrc.txt | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/Documentation/devicetree/bindings/sound/fsl,asrc.txt 
> > > b/Documentation/devicetree/bindings/sound/fsl,asrc.txt
> > > index cb9a25165503..0cbb86c026d5 100644
> > > --- a/Documentation/devicetree/bindings/sound/fsl,asrc.txt
> > > +++ b/Documentation/devicetree/bindings/sound/fsl,asrc.txt
> > > @@ -38,7 +38,9 @@ Required properties:
> > >
> > > - fsl,asrc-rate   : Defines a mutual sample rate used by DPCM Back 
> > > Ends.
> > >
> > > -   - fsl,asrc-width  : Defines a mutual sample width used by DPCM Back 
> > > Ends.
> > > +   - fsl,asrc-format : Defines a mutual sample format used by DPCM Back
> > > +   Ends. The value is one of SNDRV_PCM_FORMAT_XX in
> > > +   "include/uapi/sound/asound.h"
> >
> > You can't just change properties. They are an ABI.
> 
> I have updated all the things related with this ABI in this patch series.
> What else should I do?

You probably should add one beside the old one. And all
the existing drivers would have to continue to support
"fsl,asrc-width", even if they start to support the new
"fsl,asrc-format". The ground rule here is that a newer
kernel should be able to work with an old DTB, IIRC.
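
A rough userspace sketch of what "support both" could look like on the
driver side: try the new property first and fall back to the old one.
Everything named demo_* here is hypothetical (a mock property table
instead of the real OF helpers), and the format codes are placeholders,
not the actual SNDRV_PCM_FORMAT_* values.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a device tree node: a flat list of u32 properties. */
struct demo_prop {
	const char *name;
	uint32_t value;
};

struct demo_node {
	const struct demo_prop *props;
	size_t nprops;
};

/* Mock of a "read u32 property" helper; returns 0 on success, -1 if absent. */
static int demo_read_u32(const struct demo_node *np, const char *name,
			 uint32_t *out)
{
	for (size_t i = 0; i < np->nprops; i++) {
		if (strcmp(np->props[i].name, name) == 0) {
			*out = np->props[i].value;
			return 0;
		}
	}
	return -1;
}

/* Illustrative format codes; NOT the real SNDRV_PCM_FORMAT_* values. */
enum demo_format { DEMO_FMT_S16 = 100, DEMO_FMT_S24 = 101 };

/*
 * Prefer the new "fsl,asrc-format" property, but keep accepting the old
 * "fsl,asrc-width" so that a DTB written for an older kernel still works.
 */
static int demo_get_format(const struct demo_node *np, uint32_t *format)
{
	uint32_t width;

	if (demo_read_u32(np, "fsl,asrc-format", format) == 0)
		return 0;

	if (demo_read_u32(np, "fsl,asrc-width", &width) != 0)
		return -1;

	switch (width) {
	case 16:
		*format = DEMO_FMT_S16;
		return 0;
	case 24:
		*format = DEMO_FMT_S24;
		return 0;
	default:
		return -1;
	}
}
```

The point of the ordering is that a new DTB carrying both properties
gets the more precise format, while an old DTB with only the width is
still accepted.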

One more concern here is about the format value. Though
I don't think those values, defined in asound.h, would
be changed, yet I am not sure if it's legit to align DT
bindings to a subsystem header file -- I only know that
usually we keep shared macros under include/dt-bindings
folder. I won't have any problem, if either Rob or Mark
has no objection.


[Bug 206733] i2c i2c-3: i2c-powermac: modalias failure on /uni-n@f8000000/i2c@f8001000/cereal@1c0

2020-03-02 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=206733

--- Comment #2 from Mathieu Malaterre (mathieu.malate...@gmail.com) ---
>  i2c i2c-3: i2c-powermac: modalias failure on
>  /uni-n@f8000000/i2c@f8001000/cereal@1c0

Ben,

Can you confirm this warning is harmless?


RE: [RFC PATCH v1] powerpc/prom_init: disable XIVE in Secure VM.

2020-03-02 Thread Cédric Le Goater
On 3/3/20 12:32 AM, David Gibson wrote:
> On Fri, Feb 28, 2020 at 11:54:04PM -0800, Ram Pai wrote:
>> XIVE is not correctly enabled for Secure VM in the KVM Hypervisor yet.
>>
>> Hence Secure VM, must always default to XICS interrupt controller.
>>
>> If XIVE is requested through kernel command line option "xive=on",
>> override and turn it off.
>>
>> If XIVE is the only supported platform interrupt controller; specified
>> through qemu option "ic-mode=xive", simply abort. Otherwise default to
>> XICS.
> 
> Uh... the discussion thread here seems to have gotten oddly off
> track.  

There seem to be multiple issues. It is difficult to have a clear status.

> So, to try to clean up some misunderstandings on both sides:
> 
>   1) The guest is the main thing that knows that it will be in secure
>  mode, so it's reasonable for it to conditionally use XIVE based
>  on that

FW support is required AFAIUI.

>   2) The mechanism by which we do it here isn't quite right.  Here the
>  guest is checking itself that the host only allows XIVE, but we
>  can't do XIVE and is panic()ing.  Instead, in the SVM case we
>  should force support->xive to false, and send that in the CAS
>  request to the host.  We expect the host to just terminate
>  us because of the mismatch, but this will interact better with
>  host side options setting policy for panic states and the like.
>  Essentially an SVM kernel should behave like an old kernel with
>  no XIVE support at all, at least w.r.t. the CAS irq mode flags.

Yes. XIVE shouldn't be requested by the guest. This is the last option 
I proposed but I thought there was some negotiation with the hypervisor
which is not the case. 

>   3) Although there are means by which the hypervisor can kind of know
>  a guest is in secure mode, there's not really an "svm=on" option
>  on the host side.  For the most part secure mode is based on
>  discussion directly between the guest and the ultravisor with
>  almost no hypervisor intervention.

Is there a negotiation with the ultravisor?

>   4) I'm guessing the problem with XIVE in SVM mode is that XIVE needs
>  to write to event queues in guest memory, which would have to be
>  explicitly shared for secure mode.  That's true whether it's KVM
>  or qemu accessing the guest memory, so kernel_irqchip=on/off is
>  entirely irrelevant.

This problem should be already fixed. The XIVE event queues are shared 
and the remaining problem with XIVE is the KVM page fault handler 
populating the TIMA and ESB pages. Ultravisor doesn't seem to support
this feature and this breaks interrupt management in the guest. 

But, kernel_irqchip=off should work out of the box. It seems it doesn't. 
Something to investigate.

> 
>   5) All the above said, having to use XICS is pretty crappy.  You
>  should really get working on XIVE support for secure VMs.

Yes. 

Thanks,

C.



Re: [PATCH v3 20/27] powerpc/powernv/pmem: Forward events to userspace

2020-03-02 Thread Andrew Donnellan
On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> @@ -938,6 +955,51 @@ static int ioctl_controller_stats(struct ocxlpmem *ocxlpmem,

return rc;
  }
  
+static int ioctl_eventfd(struct ocxlpmem *ocxlpmem,

+struct ioctl_ocxl_pmem_eventfd __user *uarg)
+{
+   struct ioctl_ocxl_pmem_eventfd args;
+
+   if (copy_from_user(&args, uarg, sizeof(args)))
+   return -EFAULT;
+
+   if (ocxlpmem->ev_ctx)
+   return -EINVAL;


I think EBUSY is more appropriate here.


+
+   ocxlpmem->ev_ctx = eventfd_ctx_fdget(args.eventfd);
+   if (!ocxlpmem->ev_ctx)
+   return -EFAULT;
+
+   return 0;
+}
+
+static int ioctl_event_check(struct ocxlpmem *ocxlpmem, u64 __user *uarg)
+{
+   u64 val = 0;
+   int rc;
+   u64 chi = 0;
+
+   rc = ocxlpmem_chi(ocxlpmem, &chi);
+   if (rc < 0)
+   return rc;
+
+   if (chi & GLOBAL_MMIO_CHI_ELA)
+   val |= IOCTL_OCXL_PMEM_EVENT_ERROR_LOG_AVAILABLE;
+
+   if (chi & GLOBAL_MMIO_CHI_CDA)
+   val |= IOCTL_OCXL_PMEM_EVENT_CONTROLLER_DUMP_AVAILABLE;
+
+   if (chi & GLOBAL_MMIO_CHI_CFFS)
+   val |= IOCTL_OCXL_PMEM_EVENT_FIRMWARE_FATAL;
+
+   if (chi & GLOBAL_MMIO_CHI_CHFS)
+   val |= IOCTL_OCXL_PMEM_EVENT_HARDWARE_FATAL;
+
+   rc = copy_to_user((u64 __user *) uarg, &val, sizeof(val));
+
+   return rc;
+}
+
  static long file_ioctl(struct file *file, unsigned int cmd, unsigned long 
args)
  {
struct ocxlpmem *ocxlpmem = file->private_data;
@@ -966,6 +1028,15 @@ static long file_ioctl(struct file *file, unsigned int 
cmd, unsigned long args)
rc = ioctl_controller_stats(ocxlpmem,
(struct 
ioctl_ocxl_pmem_controller_stats __user *)args);
break;
+
+   case IOCTL_OCXL_PMEM_EVENTFD:
+   rc = ioctl_eventfd(ocxlpmem,
+  (struct ioctl_ocxl_pmem_eventfd __user 
*)args);
+   break;
+
+   case IOCTL_OCXL_PMEM_EVENT_CHECK:
+   rc = ioctl_event_check(ocxlpmem, (u64 __user *)args);
+   break;
}
  
  	return rc;

@@ -1107,6 +1178,146 @@ static void dump_error_log(struct ocxlpmem *ocxlpmem)
kfree(buf);
  }
  
+static irqreturn_t imn0_handler(void *private)

+{
+   struct ocxlpmem *ocxlpmem = private;
+   u64 chi = 0;
+
+   (void)ocxlpmem_chi(ocxlpmem, &chi);
+
+   if (chi & GLOBAL_MMIO_CHI_ELA) {
+   dev_warn(&ocxlpmem->dev, "Error log is available\n");
+
+   if (ocxlpmem->ev_ctx)
+   eventfd_signal(ocxlpmem->ev_ctx, 1);
+   }
+
+   if (chi & GLOBAL_MMIO_CHI_CDA) {
+   dev_warn(&ocxlpmem->dev, "Controller dump is available\n");
+
+   if (ocxlpmem->ev_ctx)
+   eventfd_signal(ocxlpmem->ev_ctx, 1);
+   }
+
+
+   return IRQ_HANDLED;
+}
+
+static irqreturn_t imn1_handler(void *private)
+{
+   struct ocxlpmem *ocxlpmem = private;
+   u64 chi = 0;
+
+   (void)ocxlpmem_chi(ocxlpmem, &chi);
+
+   if (chi & (GLOBAL_MMIO_CHI_CFFS | GLOBAL_MMIO_CHI_CHFS)) {
+   dev_err(&ocxlpmem->dev,
+   "Controller status is fatal, chi=0x%llx, going 
offline\n", chi);
+
+   if (ocxlpmem->nvdimm_bus) {
+   nvdimm_bus_unregister(ocxlpmem->nvdimm_bus);
+   ocxlpmem->nvdimm_bus = NULL;
+   }
+
+   if (ocxlpmem->ev_ctx)
+   eventfd_signal(ocxlpmem->ev_ctx, 1);
+   }
+
+   return IRQ_HANDLED;
+}
+
+
+/**
+ * ocxlpmem_setup_irq() - Set up the IRQs for the OpenCAPI Persistent Memory 
device
+ * @ocxlpmem: the device metadata
+ * Return: 0 on success, negative on failure
+ */
+static int ocxlpmem_setup_irq(struct ocxlpmem *ocxlpmem)
+{
+   int rc;
+   u64 irq_addr;
+
+   rc = ocxl_afu_irq_alloc(ocxlpmem->ocxl_context, &ocxlpmem->irq_id[0]);
+   if (rc)
+   return rc;
+
+   rc = ocxl_irq_set_handler(ocxlpmem->ocxl_context, ocxlpmem->irq_id[0],
+ imn0_handler, NULL, ocxlpmem);
+
+   irq_addr = ocxl_afu_irq_get_addr(ocxlpmem->ocxl_context, 
ocxlpmem->irq_id[0]);
+   if (!irq_addr)
+   return -EINVAL;
+
+   ocxlpmem->irq_addr[0] = ioremap(irq_addr, PAGE_SIZE);
+   if (!ocxlpmem->irq_addr[0])
+   return -EINVAL;


Something other than EINVAL for these two


+
+   rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_IMA0_OHP,
+ OCXL_LITTLE_ENDIAN,
+ (u64)ocxlpmem->irq_addr[0]);
+   if (rc)
+   goto out_irq0;
+
+   rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_IMA0_CFP,
+ OCXL_LITTLE_ENDIAN, 0);
+   if (rc)
+   goto out_irq0;
+
+   rc = ocxl_afu_irq_alloc(ocxlpmem->ocxl_context, &ocxlpmem->irq_id[1]);
+   if (rc)
+ 

Re: [PATCH v3 03/27] powerpc: Map & release OpenCAPI LPC memory

2020-03-02 Thread Andrew Donnellan
On 21/2/20 2:26 pm, Alastair D'Silva wrote:
> +#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE

+u64 pnv_ocxl_platform_lpc_setup(struct pci_dev *pdev, u64 size)
+{
+   struct pci_controller *hose = pci_bus_to_host(pdev->bus);
+   struct pnv_phb *phb = hose->private_data;
+   u32 bdfn = pci_dev_id(pdev);
+   __be64 base_addr_be64;
+   u64 base_addr;
+   int rc;
+
+   rc = opal_npu_mem_alloc(phb->opal_id, bdfn, size, &base_addr_be64);


Sparse warning:

https://openpower.xyz/job/snowpatch/job/snowpatch-linux-sparse/15776//artifact/linux/report.txt

I think in patch 1 we need to change a uint64_t to a __be64.
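
For readers less familiar with this class of warning: sparse complains
when a value annotated as big-endian (__be64) is mixed with a plain
integer type, and the usual kernel pattern is to keep the
firmware-filled value in a __be64 and convert it with be64_to_cpu()
before use. The following is only a userspace sketch of that
conversion; demo_be64 and the demo_* helpers are hypothetical names,
not the kernel's.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace analogue of the kernel's __be64 annotation. */
typedef uint64_t demo_be64;

/* Decode a big-endian 64-bit value byte by byte, independent of host endianness. */
static uint64_t demo_be64_to_cpu(demo_be64 be)
{
	unsigned char bytes[sizeof(be)];
	uint64_t cpu = 0;

	memcpy(bytes, &be, sizeof(bytes));
	for (size_t i = 0; i < sizeof(bytes); i++)
		cpu = (cpu << 8) | bytes[i];

	return cpu;
}

/* Encode a CPU-order value as big-endian, the inverse of the helper above. */
static demo_be64 demo_cpu_to_be64(uint64_t cpu)
{
	unsigned char bytes[sizeof(cpu)];
	demo_be64 be;

	for (size_t i = 0; i < sizeof(bytes); i++)
		bytes[i] = (unsigned char)(cpu >> (8 * (sizeof(bytes) - 1 - i)));
	memcpy(&be, bytes, sizeof(be));

	return be;
}
```

Keeping the two representations in distinct types is what lets a
checker catch a missing conversion at build time.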

--
Andrew Donnellan  OzLabs, ADL Canberra
a...@linux.ibm.com IBM Australia Limited



Re: [PATCH V14] mm/debug: Add tests validating architecture page table helpers

2020-03-02 Thread Christophe Leroy




On 02/03/2020 at 20:40, Qian Cai wrote:
> On Wed, 2020-02-26 at 10:51 -0500, Qian Cai wrote:
>> On Wed, 2020-02-26 at 15:45 +0100, Christophe Leroy wrote:
>>>
>>> On 26/02/2020 at 15:09, Qian Cai wrote:
>>>> On Mon, 2020-02-17 at 08:47 +0530, Anshuman Khandual wrote:

>>>>> This adds tests which will validate architecture page table helpers and
>>>>> other accessors in their compliance with expected generic MM semantics.
>>>>> This will help various architectures in validating changes to existing
>>>>> page table helpers or addition of new ones.
>>>>>
>>>>> This test covers basic page table entry transformations including but not
>>>>> limited to old, young, dirty, clean, write, write protect etc at various
>>>>> level along with populating intermediate entries with next page table page
>>>>> and validating them.
>>>>>
>>>>> Test page table pages are allocated from system memory with required size
>>>>> and alignments. The mapped pfns at page table levels are derived from a
>>>>> real pfn representing a valid kernel text symbol. This test gets called
>>>>> inside kernel_init() right after async_synchronize_full().
>>>>>
>>>>> This test gets built and run when CONFIG_DEBUG_VM_PGTABLE is selected. Any
>>>>> architecture, which is willing to subscribe this test will need to select
>>>>> ARCH_HAS_DEBUG_VM_PGTABLE. For now this is limited to arc, arm64, x86, s390
>>>>> and ppc32 platforms where the test is known to build and run successfully.
>>>>> Going forward, other architectures too can subscribe the test after fixing
>>>>> any build or runtime problems with their page table helpers. Meanwhile for
>>>>> better platform coverage, the test can also be enabled with CONFIG_EXPERT
>>>>> even without ARCH_HAS_DEBUG_VM_PGTABLE.
>>>>>
>>>>> Folks interested in making sure that a given platform's page table helpers
>>>>> conform to expected generic MM semantics should enable the above config
>>>>> which will just trigger this test during boot. Any non conformity here will
>>>>> be reported as an warning which would need to be fixed. This test will help
>>>>> catch any changes to the agreed upon semantics expected from generic MM and
>>>>> enable platforms to accommodate it thereafter.


>>>> How useful is this that straightly crash the powerpc?
>>>>
>>>> [   23.263425][T1] debug_vm_pgtable: debug_vm_pgtable: Validating
>>>> architecture page table helpers
>>>> [   23.263625][T1] ------------[ cut here ]------------
>>>> [   23.263649][T1] kernel BUG at arch/powerpc/mm/pgtable.c:274!
>>>
>>> The problem on PPC64 is known and has to be investigated and fixed.
>>
>> It might be interesting to hear what powerpc64 maintainers would say about it
>> and if it is actually worth "fixing" in the arch code, but that BUG_ON() was
>> there since 2009 and had not been exposed until this patch comes along?


> This patch below makes it work on powerpc64 in order to dodge the BUG_ON()s in
> assert_pte_locked() triggered by pte_clear_tests().
>
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index 96dd7d574cef..50b385233971 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -55,6 +55,8 @@
>  #define RANDOM_ORVALUE   GENMASK(BITS_PER_LONG - 1, S390_MASK_BITS)
>  #define RANDOM_NZVALUE   GENMASK(7, 0)
>  
> +unsigned long vaddr;
> +

Can we avoid a global var?


>  static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
>  {
>   pte_t pte = pfn_pte(pfn, prot);
> @@ -256,7 +258,7 @@ static void __init pte_clear_tests(struct mm_struct *mm, pte_t *ptep)
>  
>   pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
>   WRITE_ONCE(*ptep, pte);
> - pte_clear(mm, 0, ptep);
> + pte_clear(mm, vaddr, ptep);
>   pte = READ_ONCE(*ptep);
>   WARN_ON(!pte_none(pte));
>  }
> @@ -310,8 +312,9 @@ void __init debug_vm_pgtable(void)
>   pgtable_t saved_ptep;
>   pgprot_t prot;
>   phys_addr_t paddr;
> - unsigned long vaddr, pte_aligned, pmd_aligned;


Can we pass a local vaddr to pte_clear_tests() instead of making it a
global var?
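
To make the suggestion concrete, here is a small userspace mock of the
shape I have in mind, with the address handed in as an argument. The
demo_* types and functions are hypothetical stand-ins (the real test
works on kernel page tables and takes an mm as well); this only shows
the parameter passing, not the actual change.

```c
#include <stdint.h>

/* Hypothetical stand-in for a page table entry; records where it was cleared. */
typedef struct {
	uint64_t val;
	unsigned long cleared_at;
} demo_pte_t;

/* Mock of pte_clear(): clears the entry and notes the address it was given. */
static void demo_pte_clear(unsigned long addr, demo_pte_t *ptep)
{
	ptep->cleared_at = addr;
	ptep->val = 0;
}

/* The address arrives as an argument instead of through a file-scope global. */
static int demo_pte_clear_tests(demo_pte_t *ptep, unsigned long vaddr)
{
	ptep->val |= 0xffUL;	/* dirty the entry first, as the real test does */
	demo_pte_clear(vaddr, ptep);
	return ptep->val == 0;
}
```

The caller keeps vaddr as a local and passes it down, so nothing
outside the test function can see or modify it.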



> + unsigned long pte_aligned, pmd_aligned;
>   unsigned long pud_aligned, p4d_aligned, pgd_aligned;
> + spinlock_t *ptl;
>  
>   pr_info("Validating architecture page table helpers\n");
>   prot = vm_get_page_prot(VMFLAGS);
> @@ -344,7 +347,7 @@ void __init debug_vm_pgtable(void)
>   p4dp = p4d_alloc(mm, pgdp, vaddr);
>   pudp = pud_alloc(mm, p4dp, vaddr);
>   pmdp = pmd_alloc(mm, pudp, vaddr);
> - ptep = pte_alloc_map(mm, pmdp, vaddr);
> + ptep = pte_alloc_map_lock(mm, pmdp, vaddr, &ptl);
>  
>   /*
>    * Save all the page table page addresses as the page table
> @@ -370,7 +373,7 @@ void __init debug_vm_pgtable(void)
>   p4d_clear_tests(mm, p4dp);
>   pgd_clear_tests(mm, pgdp);
>  
> - pte_unmap(ptep);
> + pte_unmap_unlock(ptep, ptl);
>  
>   pmd_populate_tests(mm, pmdp, saved_ptep);
>   pud_populate_tests(mm, pudp, saved_pmdp);



Christophe


Re: [PATCH V14] mm/debug: Add tests validating architecture page table helpers

2020-03-02 Thread Anshuman Khandual



On 03/03/2020 01:10 AM, Qian Cai wrote:
> On Wed, 2020-02-26 at 10:51 -0500, Qian Cai wrote:
>> On Wed, 2020-02-26 at 15:45 +0100, Christophe Leroy wrote:
>>>
>>> On 26/02/2020 at 15:09, Qian Cai wrote:
>>>> On Mon, 2020-02-17 at 08:47 +0530, Anshuman Khandual wrote:
>>>>> This adds tests which will validate architecture page table helpers and
>>>>> other accessors in their compliance with expected generic MM semantics.
>>>>> This will help various architectures in validating changes to existing
>>>>> page table helpers or addition of new ones.
>>>>>
>>>>> This test covers basic page table entry transformations including but not
>>>>> limited to old, young, dirty, clean, write, write protect etc at various
>>>>> level along with populating intermediate entries with next page table page
>>>>> and validating them.
>>>>>
>>>>> Test page table pages are allocated from system memory with required size
>>>>> and alignments. The mapped pfns at page table levels are derived from a
>>>>> real pfn representing a valid kernel text symbol. This test gets called
>>>>> inside kernel_init() right after async_synchronize_full().
>>>>>
>>>>> This test gets built and run when CONFIG_DEBUG_VM_PGTABLE is selected. Any
>>>>> architecture, which is willing to subscribe this test will need to select
>>>>> ARCH_HAS_DEBUG_VM_PGTABLE. For now this is limited to arc, arm64, x86, s390
>>>>> and ppc32 platforms where the test is known to build and run successfully.
>>>>> Going forward, other architectures too can subscribe the test after fixing
>>>>> any build or runtime problems with their page table helpers. Meanwhile for
>>>>> better platform coverage, the test can also be enabled with CONFIG_EXPERT
>>>>> even without ARCH_HAS_DEBUG_VM_PGTABLE.
>>>>>
>>>>> Folks interested in making sure that a given platform's page table helpers
>>>>> conform to expected generic MM semantics should enable the above config
>>>>> which will just trigger this test during boot. Any non conformity here will
>>>>> be reported as an warning which would need to be fixed. This test will help
>>>>> catch any changes to the agreed upon semantics expected from generic MM and
>>>>> enable platforms to accommodate it thereafter.
>>>>
>>>> How useful is this that straightly crash the powerpc?
>>>>
>>>> [   23.263425][T1] debug_vm_pgtable: debug_vm_pgtable: Validating
>>>> architecture page table helpers
>>>> [   23.263625][T1] ------------[ cut here ]------------
>>>> [   23.263649][T1] kernel BUG at arch/powerpc/mm/pgtable.c:274!
>>>
>>> The problem on PPC64 is known and has to be investigated and fixed.
>>
>> It might be interesting to hear what powerpc64 maintainers would say about it
>> and if it is actually worth "fixing" in the arch code, but that BUG_ON() was
>> there since 2009 and had not been exposed until this patch comes along?
> 
> This patch below makes it work on powerpc64 in order to dodge the BUG_ON()s in
> assert_pte_locked() triggered by pte_clear_tests().
> 
> 
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index 96dd7d574cef..50b385233971 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -55,6 +55,8 @@
>  #define RANDOM_ORVALUE   GENMASK(BITS_PER_LONG - 1, S390_MASK_BITS)
>  #define RANDOM_NZVALUE   GENMASK(7, 0)
>  
> +unsigned long vaddr;
> +
>  static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
>  {
>   pte_t pte = pfn_pte(pfn, prot);
> @@ -256,7 +258,7 @@ static void __init pte_clear_tests(struct mm_struct *mm,
> pte_t *ptep)
>  
>   pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
>   WRITE_ONCE(*ptep, pte);
> - pte_clear(mm, 0, ptep);
> + pte_clear(mm, vaddr, ptep);
>   pte = READ_ONCE(*ptep);
>   WARN_ON(!pte_none(pte));
>  }
> @@ -310,8 +312,9 @@ void __init debug_vm_pgtable(void)
>   pgtable_t saved_ptep;
>   pgprot_t prot;
>   phys_addr_t paddr;
> - unsigned long vaddr, pte_aligned, pmd_aligned;
> + unsigned long pte_aligned, pmd_aligned;
>   unsigned long pud_aligned, p4d_aligned, pgd_aligned;
> + spinlock_t *ptl;
>  
>   pr_info("Validating architecture page table helpers\n");
>   prot = vm_get_page_prot(VMFLAGS);
> @@ -344,7 +347,7 @@ void __init debug_vm_pgtable(void)
>   p4dp = p4d_alloc(mm, pgdp, vaddr);
>   pudp = pud_alloc(mm, p4dp, vaddr);
>   pmdp = pmd_alloc(mm, pudp, vaddr);
> - ptep = pte_alloc_map(mm, pmdp, vaddr);
> + ptep = pte_alloc_map_lock(mm, pmdp, vaddr, &ptl);
>  
>   /*
>    * Save all the page table page addresses as the page table
> @@ -370,7 +373,7 @@ void __init debug_vm_pgtable(void)
>   p4d_clear_tests(mm, p4dp);
>   pgd_clear_tests(mm, pgdp);
>  
> - pte_unmap(ptep);
> + pte_unmap_unlock(ptep, ptl);
>  
>   pmd_populate_tests(mm, pmdp, saved_ptep);
>   pud_populate_tests(mm, pudp, saved_pmdp);
> 

Below is slightly modified version of your change above and should still
prevent the bug 

Re: [PATCH v4 1/8] ASoC: dt-bindings: fsl_asrc: Change asrc-width to asrc-format

2020-03-02 Thread Shengjiu Wang
Hi

On Tue, Mar 3, 2020 at 9:43 AM Rob Herring  wrote:
>
> On Sun, Mar 01, 2020 at 01:24:12PM +0800, Shengjiu Wang wrote:
> > asrc_format is more intelligent, which is aligned with the alsa
> > definition snd_pcm_format_t, we don't need to convert it to
> > format in driver, and it can distinguish S24_LE & S24_3LE.
> >
> > Signed-off-by: Shengjiu Wang 
> > ---
> >  Documentation/devicetree/bindings/sound/fsl,asrc.txt | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/devicetree/bindings/sound/fsl,asrc.txt 
> > b/Documentation/devicetree/bindings/sound/fsl,asrc.txt
> > index cb9a25165503..0cbb86c026d5 100644
> > --- a/Documentation/devicetree/bindings/sound/fsl,asrc.txt
> > +++ b/Documentation/devicetree/bindings/sound/fsl,asrc.txt
> > @@ -38,7 +38,9 @@ Required properties:
> >
> > - fsl,asrc-rate   : Defines a mutual sample rate used by DPCM Back Ends.
> >
> > -   - fsl,asrc-width  : Defines a mutual sample width used by DPCM Back 
> > Ends.
> > +   - fsl,asrc-format : Defines a mutual sample format used by DPCM Back
> > +   Ends. The value is one of SNDRV_PCM_FORMAT_XX in
> > +   "include/uapi/sound/asound.h"
>
> You can't just change properties. They are an ABI.

I have updated all the things related with this ABI in this patch series.
What else should I do?

Best regards
Wang Shengjiu


Re: [RFC 00/11] perf: Enhancing perf to export processor hazard information

2020-03-02 Thread Kim Phillips
On 3/2/20 2:21 PM, Stephane Eranian wrote:
> On Mon, Mar 2, 2020 at 2:13 AM Peter Zijlstra  wrote:
>>
>> On Mon, Mar 02, 2020 at 10:53:44AM +0530, Ravi Bangoria wrote:
>>> Modern processors export such hazard data in Performance
>>> Monitoring Unit (PMU) registers. Ex, 'Sampled Instruction Event
>>> Register' on IBM PowerPC[1][2] and 'Instruction-Based Sampling' on
>>> AMD[3] provides similar information.
>>>
>>> Implementation detail:
>>>
>>> A new sample_type called PERF_SAMPLE_PIPELINE_HAZ is introduced.
>>> If it's set, kernel converts arch specific hazard information
>>> into generic format:
>>>
>>>   struct perf_pipeline_haz_data {
>>>          /* Instruction/Opcode type: Load, Store, Branch  */
>>>          __u8    itype;
>>>          /* Instruction Cache source */
>>>          __u8    icache;
>>>          /* Instruction suffered hazard in pipeline stage */
>>>          __u8    hazard_stage;
>>>          /* Hazard reason */
>>>          __u8    hazard_reason;
>>>          /* Instruction suffered stall in pipeline stage */
>>>          __u8    stall_stage;
>>>          /* Stall reason */
>>>          __u8    stall_reason;
>>>          __u16   pad;
>>>   };
>>
>> Kim, does this format indeed work for AMD IBS?

It's not really 1:1: we don't have these separations of stages
and reasons. For example, we have 'missed in L2 cache'.
So IBS output is flatter, with more cycle latency figures than
IBM's, AFAICT.

> Personally, I don't like the term hazard. This is too IBM Power
> specific. We need to find a better term, maybe stall or penalty.

Right, IBS doesn't have a filter to only count stalled or otherwise
bad events.  IBS' PPR descriptions have one occurrence of the
word stall, and none of penalty.  The way I read IBS is it's just
reporting more sample data than just the precise IP: things like
hits, misses, cycle latencies, addresses, types, etc., so words
like 'extended', or even the 'auxiliary' already used today,
are more appropriate for IBS, although I'm the last person to
bikeshed.

> Also worth considering is the support of ARM SPE (Statistical
> Profiling Extension) which is their version of IBS.
> Whatever gets added need to cover all three with no limitations.

I thought Intel's various LBR, PEBS, and PT supported providing
similar sample data in perf already, like with perf mem/c2c?

Kim


Re: [PATCH v4 7/8] ASoC: dt-bindings: fsl_easrc: Add document for EASRC

2020-03-02 Thread Rob Herring
On Sun, Mar 01, 2020 at 01:24:18PM +0800, Shengjiu Wang wrote:
> EASRC (Enhanced Asynchronous Sample Rate Converter) is a new
> IP module found on i.MX8MN.
> 
> Signed-off-by: Shengjiu Wang 
> ---
>  .../devicetree/bindings/sound/fsl,easrc.yaml  | 96 +++
>  1 file changed, 96 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/sound/fsl,easrc.yaml
> 
> diff --git a/Documentation/devicetree/bindings/sound/fsl,easrc.yaml 
> b/Documentation/devicetree/bindings/sound/fsl,easrc.yaml
> new file mode 100644
> index 000000000000..500af8f0c8f0
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/sound/fsl,easrc.yaml
> @@ -0,0 +1,96 @@
> +# SPDX-License-Identifier: GPL-2.0

Dual license new bindings:

(GPL-2.0-only OR BSD-2-Clause)

> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/sound/fsl,easrc.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: NXP Asynchronous Sample Rate Converter (ASRC) Controller
> +
> +maintainers:
> +  - Shengjiu Wang 
> +
> +properties:
> +  $nodename:
> +pattern: "^easrc@.*"
> +
> +  compatible:
> +oneOf:
> +  - items:

You can drop oneOf and items here.

> +- enum:
> +  - fsl,imx8mn-easrc

Blank line between properties please.

> +  reg:
> +maxItems: 1
> +
> +  interrupts:
> +maxItems: 1
> +
> +  clocks:
> +items:
> +  - description: Peripheral clock
> +
> +  clock-names:
> +items:
> +  - const: mem
> +
> +  dmas:
> +maxItems: 8
> +
> +  dma-names:
> +oneOf:

Drop oneOf as there is only one.

> +  - items:
> +  - const: ctx0_rx
> +  - const: ctx0_tx
> +  - const: ctx1_rx
> +  - const: ctx1_tx
> +  - const: ctx2_rx
> +  - const: ctx2_tx
> +  - const: ctx3_rx
> +  - const: ctx3_tx
> +
> +  fsl,easrc-ram-script-name:
> +$ref: /schemas/types.yaml#/definitions/string
> +description: The coefficient table for the filters

Need to define the exact string(s).

> +
> +  fsl,asrc-rate:
> +$ref: /schemas/types.yaml#/definitions/uint32
> +description: Defines a mutual sample rate used by DPCM Back Ends

Constraints?

> +
> +  fsl,asrc-format:
> +$ref: /schemas/types.yaml#/definitions/uint32
> +description: Defines a mutual sample format used by DPCM Back Ends

Constraints?

> +
> +required:
> +  - compatible
> +  - reg
> +  - interrupts
> +  - clocks
> +  - clock-names
> +  - dmas
> +  - dma-name

dma-names

> +  - fsl,easrc-ram-script-name
> +  - fsl,asrc-rate
> +  - fsl,asrc-format
> +
> +examples:
> +  - |
> +#include 
> +
> +easrc: easrc@300C {
> +   compatible = "fsl,imx8mn-easrc";
> +   reg = <0x0 0x300C 0x0 0x1>;
> +   interrupts = <0x0 122 0x4>;
> +   clocks = < IMX8MN_CLK_ASRC_ROOT>;
> +   clock-names = "mem";
> +   dmas = < 16 23 0> , < 17 23 0>,
> +  < 18 23 0> , < 19 23 0>,
> +  < 20 23 0> , < 21 23 0>,
> +  < 22 23 0> , < 23 23 0>;
> +   dma-names = "ctx0_rx", "ctx0_tx",
> +   "ctx1_rx", "ctx1_tx",
> +   "ctx2_rx", "ctx2_tx",
> +   "ctx3_rx", "ctx3_tx";
> +   fsl,easrc-ram-script-name = "imx/easrc/easrc-imx8mn.bin";
> +   fsl,asrc-rate  = <8000>;
> +   fsl,asrc-format = <2>;
> +   status = "disabled";

Don't show status in examples.

> +};
> -- 
> 2.21.0
> 


[PATCH] powerpc/64: BE option to use ELFv2 ABI for big endian kernels

2020-03-02 Thread Nicholas Piggin
Provide an option to use ELFv2 ABI for big endian builds. This works on
GCC and clang (since 2014). It is not officially supported by the GNU
toolchain, but it can give some useful advantages of the ELFv2 ABI for
BE (e.g., less stack usage). Some distros build BE ELFv2 userspace.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/Kconfig| 19 +++
 arch/powerpc/Makefile   | 15 ++-
 arch/powerpc/boot/Makefile  |  4 
 drivers/crypto/vmx/Makefile |  4 
 drivers/crypto/vmx/aesp8-ppc.pl |  2 +-
 drivers/crypto/vmx/ppc-xlate.pl | 11 +++
 6 files changed, 45 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 497b7d0b2d7e..31dd921a5145 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -146,6 +146,7 @@ config PPC
select ARCH_WEAK_RELEASE_ACQUIRE
select BINFMT_ELF
select BUILDTIME_TABLE_SORT
+   select BUILD_ELF_V2 if PPC64 && CPU_LITTLE_ENDIAN
select CLONE_BACKWARDS
select DCACHE_WORD_ACCESS   if PPC64 && CPU_LITTLE_ENDIAN
select DYNAMIC_FTRACE   if FUNCTION_TRACER
@@ -538,6 +539,24 @@ config KEXEC_FILE
 config ARCH_HAS_KEXEC_PURGATORY
def_bool KEXEC_FILE
 
+config BUILD_ELF_V2
+   bool
+
+config BUILD_BIG_ENDIAN_ELF_V2
+   bool "Build big-endian kernel using ELFv2 ABI (EXPERIMENTAL)"
+   depends on PPC64 && CPU_BIG_ENDIAN && EXPERT
+   default n
+   select BUILD_ELF_V2
+   help
+ This builds the kernel image using the ELFv2 ABI, which has a
+ reduced stack overhead and faster function calls. This does not
+ affect the userspace ABIs.
+
+ ELFv2 is the standard ABI for little-endian, but for big-endian
+ this is an experimental option that is not well tested (kernel and
+ toolchain). This requires gcc 4.9 or newer and binutils 2.24 or
+ newer.
+
 config RELOCATABLE
bool "Build a relocatable kernel"
depends on PPC64 || (FLATMEM && (44x || FSL_BOOKE))
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index f35730548e42..ae8036a0b169 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -92,10 +92,14 @@ endif
 
 ifdef CONFIG_PPC64
 ifndef CONFIG_CC_IS_CLANG
-cflags-$(CONFIG_CPU_BIG_ENDIAN)+= $(call cc-option,-mabi=elfv1)
-cflags-$(CONFIG_CPU_BIG_ENDIAN)+= $(call cc-option,-mcall-aixdesc)
-aflags-$(CONFIG_CPU_BIG_ENDIAN)+= $(call cc-option,-mabi=elfv1)
-aflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -mabi=elfv2
+ifdef CONFIG_BUILD_ELF_V2
+cflags-y   += $(call cc-option,-mabi=elfv2,$(call cc-option,-mcall-aixdesc))
+aflags-y   += $(call cc-option,-mabi=elfv2)
+else
+cflags-y   += $(call cc-option,-mabi=elfv1)
+cflags-y   += $(call cc-option,-mcall-aixdesc)
+aflags-y   += $(call cc-option,-mabi=elfv1)
+endif
 endif
 endif
 
@@ -144,7 +148,7 @@ endif
 
 CFLAGS-$(CONFIG_PPC64) := $(call cc-option,-mtraceback=no)
 ifndef CONFIG_CC_IS_CLANG
-ifdef CONFIG_CPU_LITTLE_ENDIAN
+ifdef CONFIG_BUILD_ELF_V2
 CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mabi=elfv2,$(call cc-option,-mcall-aixdesc))
 AFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mabi=elfv2)
 else
@@ -153,6 +157,7 @@ CFLAGS-$(CONFIG_PPC64)  += $(call cc-option,-mcall-aixdesc)
 AFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mabi=elfv1)
 endif
 endif
+
 CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mcmodel=medium,$(call cc-option,-mminimal-toc))
 CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mno-pointers-to-nested-functions)
 
diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
index 0556bf4fc9e9..137ff20b13f8 100644
--- a/arch/powerpc/boot/Makefile
+++ b/arch/powerpc/boot/Makefile
@@ -41,6 +41,10 @@ endif
 
 BOOTCFLAGS += -isystem $(shell $(BOOTCC) -print-file-name=include)
 
+ifdef CONFIG_BUILD_ELF_V2
+BOOTCFLAGS += $(call cc-option,-mabi=elfv2)
+endif
+
 ifdef CONFIG_CPU_BIG_ENDIAN
 BOOTCFLAGS += -mbig-endian
 else
diff --git a/drivers/crypto/vmx/Makefile b/drivers/crypto/vmx/Makefile
index 709670d2b553..8d79514eb474 100644
--- a/drivers/crypto/vmx/Makefile
+++ b/drivers/crypto/vmx/Makefile
@@ -5,8 +5,12 @@ vmx-crypto-objs := vmx.o aesp8-ppc.o ghashp8-ppc.o aes.o aes_cbc.o aes_ctr.o aes
 ifeq ($(CONFIG_CPU_LITTLE_ENDIAN),y)
 override flavour := linux-ppc64le
 else
+ifdef CONFIG_BUILD_ELF_V2
+override flavour := linux-ppc64v2
+else
 override flavour := linux-ppc64
 endif
+endif
 
 quiet_cmd_perl = PERL $@
   cmd_perl = $(PERL) $(<) $(flavour) > $(@)
diff --git a/drivers/crypto/vmx/aesp8-ppc.pl b/drivers/crypto/vmx/aesp8-ppc.pl
index db874367b602..6733a68f12ed 100644
--- a/drivers/crypto/vmx/aesp8-ppc.pl
+++ b/drivers/crypto/vmx/aesp8-ppc.pl
@@ -100,7 +100,7 @@ if ($flavour =~ /64/) {
$SHL="slwi";
 

Re: [PATCH v4 1/8] ASoC: dt-bindings: fsl_asrc: Change asrc-width to asrc-format

2020-03-02 Thread Rob Herring
On Sun, Mar 01, 2020 at 01:24:12PM +0800, Shengjiu Wang wrote:
> asrc_format is more intelligent, as it aligns with the ALSA
> definition snd_pcm_format_t; we don't need to convert it to a
> format in the driver, and it can distinguish S24_LE & S24_3LE.
> 
> Signed-off-by: Shengjiu Wang 
> ---
>  Documentation/devicetree/bindings/sound/fsl,asrc.txt | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/devicetree/bindings/sound/fsl,asrc.txt b/Documentation/devicetree/bindings/sound/fsl,asrc.txt
> index cb9a25165503..0cbb86c026d5 100644
> --- a/Documentation/devicetree/bindings/sound/fsl,asrc.txt
> +++ b/Documentation/devicetree/bindings/sound/fsl,asrc.txt
> @@ -38,7 +38,9 @@ Required properties:
>  
> - fsl,asrc-rate   : Defines a mutual sample rate used by DPCM Back Ends.
>  
> -   - fsl,asrc-width  : Defines a mutual sample width used by DPCM Back Ends.
> +   - fsl,asrc-format : Defines a mutual sample format used by DPCM Back
> +   Ends. The value is one of SNDRV_PCM_FORMAT_XX in
> +   "include/uapi/sound/asound.h"

You can't just change properties. They are an ABI.

>  
> - fsl,asrc-clk-map   : Defines clock map used in driver. which is required
> by imx8qm/imx8qxp platform
> -- 
> 2.21.0
> 


Re: [RFC 00/11] perf: Enhancing perf to export processor hazard information

2020-03-02 Thread Andi Kleen
On Mon, Mar 02, 2020 at 11:13:32AM +0100, Peter Zijlstra wrote:
> On Mon, Mar 02, 2020 at 10:53:44AM +0530, Ravi Bangoria wrote:
> > Modern processors export such hazard data in Performance
> > Monitoring Unit (PMU) registers. Ex, 'Sampled Instruction Event
> > Register' on IBM PowerPC[1][2] and 'Instruction-Based Sampling' on
> > AMD[3] provides similar information.
> > 
> > Implementation detail:
> > 
> > A new sample_type called PERF_SAMPLE_PIPELINE_HAZ is introduced.
> > If it's set, kernel converts arch specific hazard information
> > into generic format:
> > 
> >   struct perf_pipeline_haz_data {
> >  /* Instruction/Opcode type: Load, Store, Branch  */
> >  __u8itype;
> >  /* Instruction Cache source */
> >  __u8icache;
> >  /* Instruction suffered hazard in pipeline stage */
> >  __u8hazard_stage;
> >  /* Hazard reason */
> >  __u8hazard_reason;
> >  /* Instruction suffered stall in pipeline stage */
> >  __u8stall_stage;
> >  /* Stall reason */
> >  __u8stall_reason;
> >  __u16   pad;
> >   };
> 
> Kim, does this format indeed work for AMD IBS?

Intel PEBS has a similar concept for annotation of memory accesses,
which is already exported through perf_mem_data_src. This is essentially
an extension. It would be better to have something unified here. 
Right now it seems to duplicate at least part of the PEBS facility.

-Andi



[PATCH] powerpc/build: vdso linker warning for orphan sections

2020-03-02 Thread Nicholas Piggin
Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/kernel/vdso32/Makefile | 2 +-
 arch/powerpc/kernel/vdso32/vdso32.lds.S | 1 +
 arch/powerpc/kernel/vdso64/Makefile | 2 +-
 arch/powerpc/kernel/vdso64/vdso64.lds.S | 3 ++-
 4 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/vdso32/Makefile b/arch/powerpc/kernel/vdso32/Makefile
index e147bbdc12cd..87ab1152d5ce 100644
--- a/arch/powerpc/kernel/vdso32/Makefile
+++ b/arch/powerpc/kernel/vdso32/Makefile
@@ -50,7 +50,7 @@ $(obj-vdso32): %.o: %.S FORCE
 
 # actual build commands
 quiet_cmd_vdso32ld = VDSO32L $@
-  cmd_vdso32ld = $(VDSOCC) $(c_flags) $(CC32FLAGS) -o $@ -Wl,-T$(filter %.lds,$^) $(filter %.o,$^)
+  cmd_vdso32ld = $(VDSOCC) $(c_flags) $(CC32FLAGS) -o $@ $(call cc-ldoption, -Wl$(comma)--orphan-handling=warn) -Wl,-T$(filter %.lds,$^) $(filter %.o,$^)
 quiet_cmd_vdso32as = VDSO32A $@
   cmd_vdso32as = $(VDSOCC) $(a_flags) $(CC32FLAGS) -c -o $@ $<
 
diff --git a/arch/powerpc/kernel/vdso32/vdso32.lds.S b/arch/powerpc/kernel/vdso32/vdso32.lds.S
index 5206c2eb2a1d..4c985467a668 100644
--- a/arch/powerpc/kernel/vdso32/vdso32.lds.S
+++ b/arch/powerpc/kernel/vdso32/vdso32.lds.S
@@ -111,6 +111,7 @@ SECTIONS
*(.note.GNU-stack)
*(.data .data.* .gnu.linkonce.d.* .sdata*)
*(.bss .sbss .dynbss .dynsbss)
+   *(.glink .iplt .plt .rela*)
}
 }
 
diff --git a/arch/powerpc/kernel/vdso64/Makefile b/arch/powerpc/kernel/vdso64/Makefile
index 32ebb3522ea1..38c317f25141 100644
--- a/arch/powerpc/kernel/vdso64/Makefile
+++ b/arch/powerpc/kernel/vdso64/Makefile
@@ -34,7 +34,7 @@ $(obj)/%.so: $(obj)/%.so.dbg FORCE
 
 # actual build commands
 quiet_cmd_vdso64ld = VDSO64L $@
-  cmd_vdso64ld = $(CC) $(c_flags) -o $@ -Wl,-T$(filter %.lds,$^) $(filter %.o,$^)
+  cmd_vdso64ld = $(CC) $(c_flags) -o $@ -Wl,-T$(filter %.lds,$^) $(filter %.o,$^) $(call cc-ldoption, -Wl$(comma)--orphan-handling=warn)
 
 # install commands for the unstripped file
 quiet_cmd_vdso_install = INSTALL $@
diff --git a/arch/powerpc/kernel/vdso64/vdso64.lds.S b/arch/powerpc/kernel/vdso64/vdso64.lds.S
index 256fb9720298..4e3a8d4ee614 100644
--- a/arch/powerpc/kernel/vdso64/vdso64.lds.S
+++ b/arch/powerpc/kernel/vdso64/vdso64.lds.S
@@ -30,7 +30,7 @@ SECTIONS
. = ALIGN(16);
.text   : {
*(.text .stub .text.* .gnu.linkonce.t.* __ftr_alt_*)
-   *(.sfpr .glink)
+   *(.sfpr)
}   :text
PROVIDE(__etext = .);
PROVIDE(_etext = .);
@@ -111,6 +111,7 @@ SECTIONS
*(.branch_lt)
*(.data .data.* .gnu.linkonce.d.* .sdata*)
*(.bss .sbss .dynbss .dynsbss)
+   *(.glink .iplt .plt .rela*)
}
 }
 
-- 
2.23.0



Re: [RFC PATCH v1] powerpc/prom_init: disable XIVE in Secure VM.

2020-03-02 Thread David Gibson
On Fri, Feb 28, 2020 at 11:54:04PM -0800, Ram Pai wrote:
> XIVE is not correctly enabled for Secure VM in the KVM Hypervisor yet.
> 
> Hence Secure VM, must always default to XICS interrupt controller.
> 
> If XIVE is requested through kernel command line option "xive=on",
> override and turn it off.
> 
> If XIVE is the only supported platform interrupt controller; specified
> through qemu option "ic-mode=xive", simply abort. Otherwise default to
> XICS.

Uh... the discussion thread here seems to have gotten oddly off
track.  So, to try to clean up some misunderstandings on both sides:

  1) The guest is the main thing that knows that it will be in secure
 mode, so it's reasonable for it to conditionally use XIVE based
 on that.

  2) The mechanism by which we do it here isn't quite right.  Here the
 guest is checking itself that the host only allows XIVE, but we
 can't do XIVE and is panic()ing.  Instead, in the SVM case we
 should force support->xive to false, and send that in the CAS
 request to the host.  We expect the host to just terminate
 us because of the mismatch, but this will interact better with
 host side options setting policy for panic states and the like.
 Essentially an SVM kernel should behave like an old kernel with
 no XIVE support at all, at least w.r.t. the CAS irq mode flags.

  3) Although there are means by which the hypervisor can kind of know
 a guest is in secure mode, there's not really an "svm=on" option
 on the host side.  For the most part secure mode is based on
 discussion directly between the guest and the ultravisor with
 almost no hypervisor intervention.

  4) I'm guessing the problem with XIVE in SVM mode is that XIVE needs
 to write to event queues in guest memory, which would have to be
 explicitly shared for secure mode.  That's true whether it's KVM
 or qemu accessing the guest memory, so kernel_irqchip=on/off is
 entirely irrelevant.

  5) All the above said, having to use XICS is pretty crappy.  You
 should really get working on XIVE support for secure VMs.
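
As a rough illustration of point 2, the sketch below models the negotiation
in plain userspace C. It is a simplified assumption-laden model, not the
prom_init.c or QEMU code: the types guest_support and host_caps and the
functions guest_fill_support() and host_negotiate() are hypothetical names
invented for this example. The idea shown is only that a secure guest
advertises no XIVE support and leaves the mismatch decision to the host.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the CAS interrupt-mode negotiation. */
enum irq_mode { IRQ_MODE_XICS, IRQ_MODE_XIVE };

struct guest_support {
	bool xive;	/* what the guest advertises in its CAS request */
};

struct host_caps {
	bool xics;	/* host can provide XICS */
	bool xive;	/* host can provide XIVE */
};

/*
 * Guest side: a secure VM behaves like an old kernel with no XIVE
 * support, so it never advertises XIVE and never panics on its own.
 */
static void guest_fill_support(struct guest_support *s,
			       bool kernel_has_xive, bool secure_vm)
{
	s->xive = kernel_has_xive && !secure_vm;
}

/*
 * Host side (simplified): pick XIVE when both ends support it, fall
 * back to XICS when the host offers it, otherwise report a mismatch.
 * Returns 0 and sets *mode on success, -1 when the host would
 * terminate the guest.
 */
static int host_negotiate(const struct host_caps *h,
			  const struct guest_support *g,
			  enum irq_mode *mode)
{
	if (g->xive && h->xive) {
		*mode = IRQ_MODE_XIVE;
		return 0;
	}
	if (h->xics) {
		*mode = IRQ_MODE_XICS;
		return 0;
	}
	return -1;
}
```

With this shape the policy for the failure case (error message, exit code,
panic handling) stays on the host side, which is the interaction argued for
above.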

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [PATCH] ima: add a new CONFIG for loading arch-specific policies

2020-03-02 Thread Michael Ellerman
Mimi Zohar  writes:
> On Mon, 2020-03-02 at 15:52 +0100, Ard Biesheuvel wrote:
>> On Mon, 2 Mar 2020 at 15:48, Mimi Zohar  wrote:
>> >
>> > On Wed, 2020-02-26 at 14:10 -0500, Nayna Jain wrote:
>> > > Every time a new architecture defines the IMA architecture specific
>> > > functions - arch_ima_get_secureboot() and arch_ima_get_policy(), the IMA
>> > > include file needs to be updated. To avoid this "noise", this patch
>> > > defines a new IMA Kconfig IMA_SECURE_AND_OR_TRUSTED_BOOT option, allowing
>> > > the different architectures to select it.
>> > >
>> > > Suggested-by: Linus Torvalds 
>> > > Signed-off-by: Nayna Jain 
>> > > Cc: Ard Biesheuvel 
>> > > Cc: Martin Schwidefsky 
>> > > Cc: Philipp Rudo 
>> > > Cc: Michael Ellerman 
>> > > ---
>> > >  arch/powerpc/Kconfig   | 2 +-
>> > >  arch/s390/Kconfig  | 1 +
>> > >  arch/x86/Kconfig   | 1 +
>> > >  include/linux/ima.h| 3 +--
>> > >  security/integrity/ima/Kconfig | 9 +
>> > >  5 files changed, 13 insertions(+), 3 deletions(-)
>> > >
>> > > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>> > > index 497b7d0b2d7e..b8ce1b995633 100644
>> > > --- a/arch/powerpc/Kconfig
>> > > +++ b/arch/powerpc/Kconfig
>> > > @@ -246,6 +246,7 @@ config PPC
>> > >   select SYSCTL_EXCEPTION_TRACE
>> > >   select THREAD_INFO_IN_TASK
>> > >   select VIRT_TO_BUS  if !PPC64
>> > > + select IMA_SECURE_AND_OR_TRUSTED_BOOT   if PPC_SECURE_BOOT
>> > >   #
>> > >   # Please keep this list sorted alphabetically.
>> > >   #
>> > > @@ -978,7 +979,6 @@ config PPC_SECURE_BOOT
>> > >   prompt "Enable secure boot support"
>> > >   bool
>> > >   depends on PPC_POWERNV
>> > > - depends on IMA_ARCH_POLICY
>> > >   help
>> > > Systems with firmware secure boot enabled need to define security
>> > > policies to extend secure boot to the OS. This config allows a user
>> > > diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
>> > > index 8abe77536d9d..90ff3633ade6 100644
>> > > --- a/arch/s390/Kconfig
>> > > +++ b/arch/s390/Kconfig
>> > > @@ -195,6 +195,7 @@ config S390
>> > >   select ARCH_HAS_FORCE_DMA_UNENCRYPTED
>> > >   select SWIOTLB
>> > >   select GENERIC_ALLOCATOR
>> > > + select IMA_SECURE_AND_OR_TRUSTED_BOOT
>> > >
>> > >
>> > >  config SCHED_OMIT_FRAME_POINTER
>> > > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>> > > index beea77046f9b..cafa66313fe2 100644
>> > > --- a/arch/x86/Kconfig
>> > > +++ b/arch/x86/Kconfig
>> > > @@ -230,6 +230,7 @@ config X86
>> > >   select VIRT_TO_BUS
>> > >   select X86_FEATURE_NAMESif PROC_FS
>> > >   select PROC_PID_ARCH_STATUS if PROC_FS
>> > > + select IMA_SECURE_AND_OR_TRUSTED_BOOT   if EFI
>> >
>> > Not everyone is interested in enabling IMA or requiring IMA runtime
>> > policies.  With this patch, enabling IMA_ARCH_POLICY is therefore
>> > still left up to the person building the kernel.  As a result, I'm
>> > seeing the following warning, which is kind of cool.
>> >
>> > WARNING: unmet direct dependencies detected for
>> > IMA_SECURE_AND_OR_TRUSTED_BOOT
>> >   Depends on [n]: INTEGRITY [=y] && IMA [=y] && IMA_ARCH_POLICY [=n]
>> >   Selected by [y]:
>> >   - X86 [=y] && EFI [=y]
>> >
>> > Ard, Michael, Martin, just making sure this type of warning is
>> > acceptable before upstreaming this patch.  I would appreciate your
>> > tags.
>> >
>> 
>> Ehm, no, warnings like these are not really acceptable. It means there
>> is an inconsistency in the way the Kconfig dependencies are defined.
>> 
>> Does this help:
>> 
>>   select IMA_SECURE_AND_OR_TRUSTED_BOOT   if EFI && IMA_ARCH_POLICY
>> 
>> ?
>
> Yes, that's fine for x86.  Michael, Martin, do you want something
> similar or would you prefer actually selecting IMA_ARCH_POLICY?

For powerpc this should be all we need:

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 497b7d0b2d7e..a5cfde432983 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -976,12 +976,13 @@ config PPC_MEM_KEYS
 
 config PPC_SECURE_BOOT
prompt "Enable secure boot support"
bool
depends on PPC_POWERNV
depends on IMA_ARCH_POLICY
+   select IMA_SECURE_AND_OR_TRUSTED_BOOT
help
  Systems with firmware secure boot enabled need to define security
  policies to extend secure boot to the OS. This config allows a user
  to enable OS secure boot on systems that have firmware support for
  it. If in doubt say N.
 

cheers


Re: [RFC PATCH v1] powerpc/prom_init: disable XIVE in Secure VM.

2020-03-02 Thread Greg Kurz
On Fri, 28 Feb 2020 23:54:04 -0800
Ram Pai  wrote:

> XIVE is not correctly enabled for Secure VM in the KVM Hypervisor yet.
> 

What exactly is "not correctly enabled" ?

> Hence Secure VM, must always default to XICS interrupt controller.
> 

So this is a temporary workaround until whatever isn't working with
XIVE and the Secure VM gets fixed. Maybe worth mentioning this in
some comment.

> If XIVE is requested through kernel command line option "xive=on",
> override and turn it off.
> 

There's no such thing as requesting XIVE with "xive=on". XIVE is
on by default if the platform and CPU support it BUT it can be
disabled with "xive=off" in which case the guest won't request
XIVE except if it's the only available mode.

> If XIVE is the only supported platform interrupt controller; specified
> through qemu option "ic-mode=xive", simply abort. Otherwise default to
> XICS.
> 

If XIVE is the only option and the guest requests XICS anyway, QEMU is
supposed to print an error message and terminate:

if (!spapr->irq->xics) {
error_report(
"Guest requested unavailable interrupt mode (XICS), either don't set the ic-mode machine property or try ic-mode=xics or ic-mode=dual");
exit(EXIT_FAILURE);
}

I think it would be better to end up there rather than aborting.

> Cc: kvm-...@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: Michael Ellerman 
> Cc: Thiago Jung Bauermann 
> Cc: Michael Anderson 
> Cc: Sukadev Bhattiprolu 
> Cc: Alexey Kardashevskiy 
> Cc: Paul Mackerras 
> Cc: Greg Kurz 
> Cc: Cedric Le Goater 
> Cc: David Gibson 
> Signed-off-by: Ram Pai 
> ---
>  arch/powerpc/kernel/prom_init.c | 43 
> -
>  1 file changed, 30 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
> index 5773453..dd96c82 100644
> --- a/arch/powerpc/kernel/prom_init.c
> +++ b/arch/powerpc/kernel/prom_init.c
> @@ -805,6 +805,18 @@ static void __init early_cmdline_parse(void)
>  #endif
>   }
>  
> +#ifdef CONFIG_PPC_SVM
> + opt = prom_strstr(prom_cmd_line, "svm=");
> + if (opt) {
> + bool val;
> +
> + opt += sizeof("svm=") - 1;
> + if (!prom_strtobool(opt, &val))
> + prom_svm_enable = val;
> + prom_printf("svm =%d\n", prom_svm_enable);
> + }
> +#endif /* CONFIG_PPC_SVM */
> +
>  #ifdef CONFIG_PPC_PSERIES
>   prom_radix_disable = !IS_ENABLED(CONFIG_PPC_RADIX_MMU_DEFAULT);
>   opt = prom_strstr(prom_cmd_line, "disable_radix");
> @@ -823,23 +835,22 @@ static void __init early_cmdline_parse(void)
>   if (prom_radix_disable)
>   prom_debug("Radix disabled from cmdline\n");
>  
> - opt = prom_strstr(prom_cmd_line, "xive=off");
> - if (opt) {

A comment to explain why we currently need to limit ourselves to using
XICS would be appreciated.

> +#ifdef CONFIG_PPC_SVM
> + if (prom_svm_enable) {
>   prom_xive_disable = true;
> - prom_debug("XIVE disabled from cmdline\n");
> + prom_debug("XIVE disabled in Secure VM\n");
>   }
> -#endif /* CONFIG_PPC_PSERIES */
> -
> -#ifdef CONFIG_PPC_SVM
> - opt = prom_strstr(prom_cmd_line, "svm=");
> - if (opt) {
> - bool val;
> +#endif /* CONFIG_PPC_SVM */
>  
> - opt += sizeof("svm=") - 1;
> - if (!prom_strtobool(opt, &val))
> - prom_svm_enable = val;
> + if (!prom_xive_disable) {
> + opt = prom_strstr(prom_cmd_line, "xive=off");
> + if (opt) {
> + prom_xive_disable = true;
> + prom_debug("XIVE disabled from cmdline\n");
> + }
>   }
> -#endif /* CONFIG_PPC_SVM */
> +
> +#endif /* CONFIG_PPC_PSERIES */
>  }
>  
>  #ifdef CONFIG_PPC_PSERIES
> @@ -1251,6 +1262,12 @@ static void __init prom_parse_xive_model(u8 val,
>   break;
>   case OV5_FEAT(OV5_XIVE_EXPLOIT): /* Only Exploitation mode */
>   prom_debug("XIVE - exploitation mode supported\n");
> +
> +#ifdef CONFIG_PPC_SVM
> + if (prom_svm_enable)
> + prom_panic("WARNING: xive unsupported in Secure VM\n");

Change the prom_panic() line into a break. The guest will ask XICS and QEMU
will terminate nicely. Maybe still print out a warning since QEMU won't mention
the Secure VM aspect of things.

> +#endif /* CONFIG_PPC_SVM */
> +
>   if (prom_xive_disable) {
>   /*
>* If we __have__ to do XIVE, we're better off ignoring



Re: [PATCH] mm/debug: Add tests validating arch page table helpers for core features

2020-03-02 Thread Christophe Leroy

Anshuman Khandual wrote:

On 02/27/2020 04:59 PM, Christophe Leroy wrote:

On 27/02/2020 at 11:33, Anshuman Khandual wrote:

This adds new tests validating arch page table helpers for these following
core memory features. These tests create and test specific mapping types at
various page table levels.

* SPECIAL mapping
* PROTNONE mapping
* DEVMAP mapping
* SOFTDIRTY mapping
* SWAP mapping
* MIGRATION mapping
* HUGETLB mapping
* THP mapping

Cc: Andrew Morton 
Cc: Mike Rapoport 
Cc: Vineet Gupta 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: Heiko Carstens 
Cc: Vasily Gorbik 
Cc: Christian Borntraeger 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: "H. Peter Anvin" 
Cc: Kirill A. Shutemov 
Cc: Paul Walmsley 
Cc: Palmer Dabbelt 
Cc: linux-snps-...@lists.infradead.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s...@vger.kernel.org
Cc: linux-ri...@lists.infradead.org
Cc: x...@kernel.org
Cc: linux-a...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Suggested-by: Catalin Marinas 
Signed-off-by: Anshuman Khandual 
---
Tested on arm64 and x86 platforms without any test failures. But this has
only been built tested on several other platforms. Individual tests need
to be verified on all current enabling platforms for the test i.e s390,
ppc32, arc etc.

This patch must be applied on v5.6-rc3 after these patches

1. https://patchwork.kernel.org/patch/11385057/
2. https://patchwork.kernel.org/patch/11407715/

OR

This patch must be applied on linux-next (next-20200227) after this patch

2. https://patchwork.kernel.org/patch/11407715/

  mm/debug_vm_pgtable.c | 310 +-
  1 file changed, 309 insertions(+), 1 deletion(-)

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 96dd7d574cef..3fb90d5b604e 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -41,6 +41,44 @@
   * wrprotect(entry)    = A write protected and not a write entry
   * pxx_bad(entry)    = A mapped and non-table entry
   * pxx_same(entry1, entry2)    = Both entries hold the exact same value
+ *
+ * Specific feature operations
+ *
+ * pte_mkspecial(entry)    = Creates a special entry at PTE level
+ * pte_special(entry)    = Tests a special entry at PTE level
+ *
+ * pte_protnone(entry)    = Tests a no access entry at PTE level
+ * pmd_protnone(entry)    = Tests a no access entry at PMD level
+ *
+ * pte_mkdevmap(entry)    = Creates a device entry at PTE level
+ * pmd_mkdevmap(entry)    = Creates a device entry at PMD level
+ * pud_mkdevmap(entry)    = Creates a device entry at PUD level
+ * pte_devmap(entry)    = Tests a device entry at PTE level
+ * pmd_devmap(entry)    = Tests a device entry at PMD level
+ * pud_devmap(entry)    = Tests a device entry at PUD level
+ *
+ * pte_mksoft_dirty(entry)    = Creates a soft dirty entry at PTE level
+ * pmd_mksoft_dirty(entry)    = Creates a soft dirty entry at PMD level
+ * pte_swp_mksoft_dirty(entry)    = Creates a soft dirty swap entry at PTE level
+ * pmd_swp_mksoft_dirty(entry)    = Creates a soft dirty swap entry at PMD level
+ * pte_soft_dirty(entry)    = Tests a soft dirty entry at PTE level
+ * pmd_soft_dirty(entry)    = Tests a soft dirty entry at PMD level
+ * pte_swp_soft_dirty(entry)    = Tests a soft dirty swap entry at PTE level
+ * pmd_swp_soft_dirty(entry)    = Tests a soft dirty swap entry at PMD level
+ * pte_clear_soft_dirty(entry)   = Clears a soft dirty entry at PTE level
+ * pmd_clear_soft_dirty(entry)   = Clears a soft dirty entry at PMD level
+ * pte_swp_clear_soft_dirty(entry) = Clears a soft dirty swap entry at PTE level
+ * pmd_swp_clear_soft_dirty(entry) = Clears a soft dirty swap entry at PMD level
+ *
+ * pte_mkhuge(entry)    = Creates a HugeTLB entry at given level
+ * pte_huge(entry)    = Tests a HugeTLB entry at given level
+ *
+ * pmd_trans_huge(entry)    = Tests a trans huge page at PMD level
+ * pud_trans_huge(entry)    = Tests a trans huge page at PUD level
+ * pmd_present(entry)    = Tests an entry points to memory at PMD level
+ * pud_present(entry)    = Tests an entry points to memory at PUD level
+ * pmd_mknotpresent(entry)    = Invalidates an PMD entry for MMU
+ * pud_mknotpresent(entry)    = Invalidates an PUD entry for MMU
   */
  #define VMFLAGS    (VM_READ|VM_WRITE|VM_EXEC)
  @@ -287,6 +325,233 @@ static void __init pmd_populate_tests(struct mm_struct *mm, pmd_t *pmdp,
  WARN_ON(pmd_bad(pmd));
  }
  +#ifdef CONFIG_ARCH_HAS_PTE_SPECIAL


Can we avoid #ifdefs unless necessary?

In mm/memory.c I see things like the following, which means
pte_special() always exists and an #ifdef is not necessary.


True, the #ifdef can be dropped here, done.



if (IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL)) {
    if (likely(!pte_special(pte)))
    goto check_pfn;
    if 
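
To make the IS_ENABLED() suggestion concrete, here is a small userspace
sketch of the same pattern. It is an illustration under stated assumptions:
is_defined() is a simplified copy of the preprocessor trick the kernel uses,
and CONFIG_DEMO_HAS_SPECIAL, CONFIG_DEMO_HAS_DEVMAP, demo_special() and
count_special() are hypothetical names made up for this example. The point
is that the guarded code is always compiled and type-checked, and the
compiler drops the dead branch when the option is off.

```c
#include <stdbool.h>

/*
 * Simplified version of the kernel's "is this macro defined to 1"
 * trick: a symbol defined as 1 expands to "0," and shifts the 1 into
 * the second argument slot; anything else leaves the 0 there.
 */
#define ARG_PLACEHOLDER_1 0,
#define take_second_arg(ignored, val, ...) val
#define is_defined(x) is_defined_(x)
#define is_defined_(val) is_defined__(ARG_PLACEHOLDER_##val)
#define is_defined__(arg1_or_junk) take_second_arg(arg1_or_junk 1, 0)

/* Hypothetical config symbols for the demo. */
#define CONFIG_DEMO_HAS_SPECIAL 1
/* CONFIG_DEMO_HAS_DEVMAP is deliberately left undefined. */

/* Hypothetical helper that exists whether or not the option is set. */
static bool demo_special(unsigned long entry)
{
	return entry & 1UL;
}

/* Counts "special" entries; compiles either way, no #ifdef needed. */
static int count_special(const unsigned long *entries, int n)
{
	int i, count = 0;

	if (!is_defined(CONFIG_DEMO_HAS_SPECIAL))
		return 0;

	for (i = 0; i < n; i++)
		if (demo_special(entries[i]))
			count++;

	return count;
}
```

This only works when the helper is declared in every configuration, which is
the property pointed out above for pte_special().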

Re: [PATCH] ima: add a new CONFIG for loading arch-specific policies

2020-03-02 Thread Heiko Carstens
On Mon, Mar 02, 2020 at 09:56:58AM -0500, Mimi Zohar wrote:
> On Mon, 2020-03-02 at 15:52 +0100, Ard Biesheuvel wrote:
> > On Mon, 2 Mar 2020 at 15:48, Mimi Zohar  wrote:
> > > > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > > > index beea77046f9b..cafa66313fe2 100644
> > > > --- a/arch/x86/Kconfig
> > > > +++ b/arch/x86/Kconfig
> > > > @@ -230,6 +230,7 @@ config X86
> > > >   select VIRT_TO_BUS
> > > >   select X86_FEATURE_NAMESif PROC_FS
> > > >   select PROC_PID_ARCH_STATUS if PROC_FS
> > > > + select IMA_SECURE_AND_OR_TRUSTED_BOOT   if EFI
> > >
> > > Not everyone is interested in enabling IMA or requiring IMA runtime
> > > policies.  With this patch, enabling IMA_ARCH_POLICY is therefore
> > > still left up to the person building the kernel.  As a result, I'm
> > > seeing the following warning, which is kind of cool.
> > >
> > > WARNING: unmet direct dependencies detected for
> > > IMA_SECURE_AND_OR_TRUSTED_BOOT
> > >   Depends on [n]: INTEGRITY [=y] && IMA [=y] && IMA_ARCH_POLICY [=n]
> > >   Selected by [y]:
> > >   - X86 [=y] && EFI [=y]
> > >
> > > Ard, Michael, Martin, just making sure this type of warning is
> > > acceptable before upstreaming this patch.  I would appreciate your
> > > tags.
> > >
> > 
> > Ehm, no, warnings like these are not really acceptable. It means there
> > is an inconsistency in the way the Kconfig dependencies are defined.
> > 
> > Does this help:
> > 
> >   select IMA_SECURE_AND_OR_TRUSTED_BOOT   if EFI && IMA_ARCH_POLICY
> > 
> > ?
> 
> Yes, that's fine for x86.  Michael, Martin, do you want something
> similar or would you prefer actually selecting IMA_ARCH_POLICY?

For s390 something like

select IMA_SECURE_AND_OR_TRUSTED_BOOT if IMA_ARCH_POLICY

should be fine.

Thanks,
Heiko



Re: [PATCH v3 7/7] mm/memremap: Set caching mode for PCI P2PDMA memory to WC

2020-03-02 Thread Logan Gunthorpe



On 2020-02-29 3:47 p.m., Dan Williams wrote:
> On Fri, Feb 21, 2020 at 10:25 AM Logan Gunthorpe  wrote:
>>
>> PCI BAR IO memory should never be mapped as WB, however prior to this
>> the PAT bits were set WB and it was typically overridden by MTRR
>> registers set by the firmware.
>>
>> Set PCI P2PDMA memory to be WC (writecombining) as the only current
>> user (the NVMe CMB) was originally mapped WC before the P2PDMA code
>> replaced the mapping with devm_memremap_pages().
> 
> Will the change to UC regress this existing use case?

I don't think so. They've been essentially mapped UC for a long time now
(since the P2PDMA patch set was merged) and nobody has complained.


Re: [RFC 00/11] perf: Enhancing perf to export processor hazard information

2020-03-02 Thread Paul Clarke
On 3/1/20 11:23 PM, Ravi Bangoria wrote:
> Most modern microprocessors employ complex instruction execution
> pipelines such that many instructions can be 'in flight' at any
> given point in time. Various factors affect this pipeline and
> hazards are the primary among them. Different types of hazards
> exist - Data hazards, Structural hazards and Control hazards.
> Data hazard is the case where data dependencies exist between
> instructions in different stages in the pipeline. Structural
> hazard is when the same processor hardware is needed by more
> than one instruction in flight at the same time. Control hazards
> are more the branch misprediction kinds. 
> 
> Information about these hazards are critical towards analyzing
> performance issues and also to tune software to overcome such
> issues. Modern processors export such hazard data in Performance
> Monitoring Unit (PMU) registers. Ex, 'Sampled Instruction Event
> Register' on IBM PowerPC[1][2] and 'Instruction-Based Sampling' on
> AMD[3] provides similar information.
> 
> Implementation detail:
> 
> A new sample_type called PERF_SAMPLE_PIPELINE_HAZ is introduced.
> If it's set, kernel converts arch specific hazard information
> into generic format:
> 
>   struct perf_pipeline_haz_data {
>  /* Instruction/Opcode type: Load, Store, Branch  */
>  __u8itype;

At the risk of bike-shedding (in an RFC, no less), "itype" doesn't convey 
enough meaning to me.  "inst_type"?  I see in 03/11, you use "perf_inst_type".

>  /* Instruction Cache source */
>  __u8icache;

Possibly same here, and you use "perf_inst_cache" in 03/11.
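
For illustration, this is roughly what the struct could look like with the
longer names. It is only a sketch of the naming suggestion: the struct name
pipeline_haz_sample and the fields inst_type and inst_cache are my
assumptions, and fixed-width stdint types stand in for the kernel's
__u8/__u16 so that it builds in userspace. The static assertions just
confirm that the rename leaves the 8-byte layout unchanged.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch only: same layout as the RFC struct, with longer field names. */
struct pipeline_haz_sample {
	uint8_t  inst_type;	/* Instruction/Opcode type: Load, Store, Branch */
	uint8_t  inst_cache;	/* Instruction Cache source */
	uint8_t  hazard_stage;	/* Pipeline stage where the hazard occurred */
	uint8_t  hazard_reason;	/* Hazard reason */
	uint8_t  stall_stage;	/* Pipeline stage where the stall occurred */
	uint8_t  stall_reason;	/* Stall reason */
	uint16_t pad;		/* Explicit padding up to 8 bytes */
};

/* Six one-byte fields plus a two-byte pad: no implicit padding expected. */
_Static_assert(sizeof(struct pipeline_haz_sample) == 8,
	       "sample record should stay 8 bytes");
_Static_assert(offsetof(struct pipeline_haz_sample, pad) == 6,
	       "pad should directly follow the six byte-sized fields");
```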

>  /* Instruction suffered hazard in pipeline stage */
>  __u8hazard_stage;
>  /* Hazard reason */
>  __u8hazard_reason;
>  /* Instruction suffered stall in pipeline stage */
>  __u8stall_stage;
>  /* Stall reason */
>  __u8stall_reason;
>  __u16   pad;
>   };
> 
> ... which can be read by user from mmap() ring buffer. With this
> approach, sample perf report in hazard mode looks like (On IBM
> PowerPC):
> 
>   # ./perf record --hazard ./ebizzy
>   # ./perf report --hazard
>   Overhead  Symbol          Shared  Instruction Type  Hazard Stage  Hazard Reason         Stall Stage  Stall Reason  ICache access
>     36.58%  [.] thread_run  ebizzy  Load              LSU           Mispredict            LSU          Load fin      L1 hit
>      9.46%  [.] thread_run  ebizzy  Load              LSU           Mispredict            LSU          Dcache_miss   L1 hit
>      1.76%  [.] thread_run  ebizzy  Fixed point       -             -                     -            -             L1 hit
>      1.31%  [.] thread_run  ebizzy  Load              LSU           ERAT Miss             LSU          Load fin      L1 hit
>      1.27%  [.] thread_run  ebizzy  Load              LSU           Mispredict            -            -             L1 hit
>      1.16%  [.] thread_run  ebizzy  Fixed point       -             -                     FXU          Fixed cycle   L1 hit
>      0.50%  [.] thread_run  ebizzy  Fixed point       ISU           Source Unavailable    FXU          Fixed cycle   L1 hit
>      0.30%  [.] thread_run  ebizzy  Load              LSU           LMQ Full, DERAT Miss  LSU          Load fin      L1 hit
>      0.24%  [.] thread_run  ebizzy  Load              LSU           ERAT Miss             -            -             L1 hit
>      0.08%  [.] thread_run  ebizzy  -                 -             -                     BRU          Fixed cycle   L1 hit
>      0.05%  [.] thread_run  ebizzy  Branch            -             -                     BRU          Fixed cycle   L1 hit
>      0.04%  [.] thread_run  ebizzy  Fixed point       ISU           Source Unavailable    -            -             L1 hit

How are these to be interpreted?  This is great information, but is it possible 
to make it more readable for non-experts?  If each of these maps 1:1 with 
hardware events, should you emit the name of the event here, so that can be 
used to look up further information?  For example, does the first line map to 
PM_CMPLU_STALL_LSU_FIN?
What was "Mispredict[ed]"? (Is it different from a branch misprediction?) And 
how does this relate to "L1 hit"?
Can we emit "Load finish" instead of "Load fin" for easier reading?  03/11 also 
has "Marked fin before NTC".
Nit: why does "Dcache_miss" have an underscore and none of the others?

> Also perf annotate with hazard data:

>  │static int
>  │compare(const void *p1, const void *p2)
>  │{
>33.23 │  std    r31,-8(r1)
>  │   {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}
>  │   {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}
>  │   {haz_stage: LSU, 

Re: [PATCH v3 6/7] mm/memory_hotplug: Add pgprot_t to mhp_params

2020-03-02 Thread Dan Williams
On Mon, Mar 2, 2020 at 10:55 AM Logan Gunthorpe  wrote:
>
>
>
> On 2020-02-29 3:44 p.m., Dan Williams wrote:
> > On Fri, Feb 21, 2020 at 10:25 AM Logan Gunthorpe wrote:
> >>
> >> devm_memremap_pages() is currently used by the PCI P2PDMA code to create
> >> struct page mappings for IO memory. At present, these mappings are created
> >> with PAGE_KERNEL which implies setting the PAT bits to be WB. However, on
> >> x86, an mtrr register will typically override this and force the cache
> >> type to be UC-. In the case firmware doesn't set this register it is
> >> effectively WB and will typically result in a machine check exception
> >> when it's accessed.
> >>
> >> Other arches are not currently likely to function correctly seeing they
> >> don't have any MTRR registers to fall back on.
> >>
> >> To solve this, provide a way to specify the pgprot value explicitly to
> >> arch_add_memory().
> >>
> >> Of the arches that support MEMORY_HOTPLUG: x86_64, and arm64 need a simple
> >> change to pass the pgprot_t down to their respective functions which set
> >> up the page tables. For x86_32, set the page tables explicitly using
> >> _set_memory_prot() (seeing they are already mapped). For ia64, s390 and
> >> sh, reject anything but PAGE_KERNEL settings -- this should be fine,
> >> for now, seeing these architectures don't support ZONE_DEVICE.
> >>
> >> A check in __add_pages() is also added to ensure the pgprot parameter was
> >> set for all arches.
> >>
> >> Cc: Dan Williams 
> >> Signed-off-by: Logan Gunthorpe 
> >> Acked-by: David Hildenbrand 
> >> Acked-by: Michal Hocko 
> >> ---
> >>  arch/arm64/mm/mmu.c| 3 ++-
> >>  arch/ia64/mm/init.c| 3 +++
> >>  arch/powerpc/mm/mem.c  | 3 ++-
> >>  arch/s390/mm/init.c| 3 +++
> >>  arch/sh/mm/init.c  | 3 +++
> >>  arch/x86/mm/init_32.c  | 5 +
> >>  arch/x86/mm/init_64.c  | 2 +-
> >>  include/linux/memory_hotplug.h | 2 ++
> >>  mm/memory_hotplug.c| 5 -
> >>  mm/memremap.c  | 6 +++---
> >>  10 files changed, 28 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> >> index ee37bca8aba8..ea3fa844a8a2 100644
> >> --- a/arch/arm64/mm/mmu.c
> >> +++ b/arch/arm64/mm/mmu.c
> >> @@ -1058,7 +1058,8 @@ int arch_add_memory(int nid, u64 start, u64 size,
> >> flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
> >>
> >> __create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
> >> -size, PAGE_KERNEL, __pgd_pgtable_alloc, 
> >> flags);
> >> +size, params->pgprot, __pgd_pgtable_alloc,
> >> +flags);
> >>
> >> memblock_clear_nomap(start, size);
> >>
> >> diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
> >> index 97bbc23ea1e3..d637b4ea3147 100644
> >> --- a/arch/ia64/mm/init.c
> >> +++ b/arch/ia64/mm/init.c
> >> @@ -676,6 +676,9 @@ int arch_add_memory(int nid, u64 start, u64 size,
> >> unsigned long nr_pages = size >> PAGE_SHIFT;
> >> int ret;
> >>
> >> +   if (WARN_ON_ONCE(params->pgprot.pgprot != PAGE_KERNEL.pgprot))
> >> +   return -EINVAL;
> >> +
> >> ret = __add_pages(nid, start_pfn, nr_pages, params);
> >> if (ret)
> >> printk("%s: Problem encountered in __add_pages() as 
> >> ret=%d\n",
> >> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> >> index 19b1da5d7eca..832412bc7fad 100644
> >> --- a/arch/powerpc/mm/mem.c
> >> +++ b/arch/powerpc/mm/mem.c
> >> @@ -138,7 +138,8 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
> >> resize_hpt_for_hotplug(memblock_phys_mem_size());
> >>
> >> start = (unsigned long)__va(start);
> >> -   rc = create_section_mapping(start, start + size, nid, PAGE_KERNEL);
> >> +   rc = create_section_mapping(start, start + size, nid,
> >> +   params->pgprot);
> >> if (rc) {
> >> pr_warn("Unable to create mapping for hot added memory 
> >> 0x%llx..0x%llx: %d\n",
> >> start, start + size, rc);
> >> diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
> >> index e9e4a7abd0cc..87b2d024e75a 100644
> >> --- a/arch/s390/mm/init.c
> >> +++ b/arch/s390/mm/init.c
> >> @@ -277,6 +277,9 @@ int arch_add_memory(int nid, u64 start, u64 size,
> >> if (WARN_ON_ONCE(params->altmap))
> >> return -EINVAL;
> >>
> >> +   if (WARN_ON_ONCE(params->pgprot.pgprot != PAGE_KERNEL.pgprot))
> >> +   return -EINVAL;
> >> +
> >> rc = vmem_add_mapping(start, size);
> >> if (rc)
> >> return rc;
> >> diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
> >> index e5114c053364..b9de2d4fa57e 100644
> >> --- a/arch/sh/mm/init.c
> >> +++ b/arch/sh/mm/init.c
> >> @@ -412,6 +412,9 @@ int arch_add_memory(int nid, u64 start, u64 size,
> >> unsigned long nr_pages = 

[RESEND PATCH] soc: fsl: Enable compile testing of FSL_RCPM

2020-03-02 Thread Krzysztof Kozlowski
FSL_RCPM can be compile tested to increase build coverage.

Signed-off-by: Krzysztof Kozlowski 
---
 drivers/soc/fsl/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/soc/fsl/Kconfig b/drivers/soc/fsl/Kconfig
index 4df32bc4c7a6..e142662d7c99 100644
--- a/drivers/soc/fsl/Kconfig
+++ b/drivers/soc/fsl/Kconfig
@@ -43,7 +43,7 @@ config DPAA2_CONSOLE
 
 config FSL_RCPM
bool "Freescale RCPM support"
-   depends on PM_SLEEP && (ARM || ARM64)
+   depends on PM_SLEEP && (ARM || ARM64 || COMPILE_TEST)
help
  The NXP QorIQ Processors based on ARM Core have RCPM module
  (Run Control and Power Management), which performs all device-level
-- 
2.17.1



Re: [Intel-gfx] [PATCH v7 00/12] Introduce CAP_PERFMON to secure system performance monitoring and observability

2020-03-02 Thread James Morris
On Sun, 1 Mar 2020, Serge Hallyn wrote:

> Thanks, this looks good to me, in keeping with the CAP_SYSLOG break.
> 
> Acked-by: Serge E. Hallyn 
> 
> for the set.
> 
> James/Ingo/Peter, if no one has remaining objections, whose branch
> should these go in through?
> 
> thanks,

I was assuming via the perf tree, but I am happy to take them.


> -serge
> 
> On Tue, Feb 25, 2020 at 12:55:54PM +0300, Alexey Budankov wrote:
> > 
> > Hi,
> > 
> > Is there anything else I could do in order to move the changes forward
> > or is something still missing from this patch set?
> > Could you please share your thoughts?
> > 
> > Thanks,
> > Alexey
> > 
> > On 17.02.2020 11:02, Alexey Budankov wrote:
> > > 
> > > Currently access to perf_events, i915_perf and other performance
> > > monitoring and observability subsystems of the kernel is open only for
> > > a privileged process [1] with CAP_SYS_ADMIN capability enabled in the
> > > process effective set [2].
> > > 
> > > This patch set introduces CAP_PERFMON capability designed to secure
> > > system performance monitoring and observability operations so that
> > > CAP_PERFMON would assist CAP_SYS_ADMIN capability in its governing role
> > > for performance monitoring and observability subsystems of the kernel.
> > > 
> > > CAP_PERFMON intends to harden system security and integrity during
> > > performance monitoring and observability operations by decreasing attack
> > > surface that is available to a CAP_SYS_ADMIN privileged process [2].
> > > Providing the access to performance monitoring and observability
> > > operations under CAP_PERFMON capability singly, without the rest of
> > > CAP_SYS_ADMIN credentials, excludes chances to misuse the credentials
> > > and makes the operation more secure. Thus, CAP_PERFMON implements the
> > > > principle of least privilege for performance monitoring and
> > > observability operations (POSIX IEEE 1003.1e: 2.2.2.39 principle of
> > > least privilege: A security design principle that states that a process
> > > or program be granted only those privileges (e.g., capabilities)
> > > necessary to accomplish its legitimate function, and only for the time
> > > that such privileges are actually required)
> > > 
> > > CAP_PERFMON intends to meet the demand to secure system performance
> > > monitoring and observability operations for adoption in security
> > > sensitive, restricted, multiuser production environments (e.g. HPC
> > > clusters, cloud and virtual compute environments), where root or
> > > CAP_SYS_ADMIN credentials are not available to mass users of a system,
> > > and securely unblock accessibility of system performance monitoring and
> > > observability operations beyond root and CAP_SYS_ADMIN use cases.
> > > 
> > > CAP_PERFMON intends to take over CAP_SYS_ADMIN credentials related to
> > > system performance monitoring and observability operations and balance
> > > amount of CAP_SYS_ADMIN credentials following the recommendations in
> > > the capabilities man page [2] for CAP_SYS_ADMIN: "Note: this capability
> > > is overloaded; see Notes to kernel developers, below." For backward
> > > compatibility reasons access to system performance monitoring and
> > > observability subsystems of the kernel remains open for CAP_SYS_ADMIN
> > > privileged processes but CAP_SYS_ADMIN capability usage for secure
> > > system performance monitoring and observability operations is
> > > discouraged with respect to the designed CAP_PERFMON capability.
> > > 
> > > Possible alternative solution to this system security hardening,
> > > capabilities balancing task of making performance monitoring and
> > > observability operations more secure and accessible could be to use
> > > the existing CAP_SYS_PTRACE capability to govern system performance
> > > monitoring and observability subsystems. However CAP_SYS_PTRACE
> > > capability still provides users with more credentials than are
> > > required for secure performance monitoring and observability
> > > operations and this excess is avoided by the designed CAP_PERFMON.
> > > 
> > > Although software running under CAP_PERFMON can not ensure avoidance of
> > > related hardware issues, the software can still mitigate those issues
> > > following the official hardware issues mitigation procedure [3]. The
> > > bugs in the software itself can be fixed following the standard kernel
> > > development process [4] to maintain and harden security of system
> > > performance monitoring and observability operations. Finally, the patch
> > > set is shaped in the way that simplifies backtracking procedure of
> > > possible induced issues [5] as much as possible.
> > > 
> > > The patch set is for tip perf/core repository:
> > > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip perf/core
> > > sha1: fdb64822443ec9fb8c3a74b598a74790ae8d2e22
> > > 
> > > ---
> > > Changes in v7:
> > > - updated and extended kernel.rst and perf-security.rst documentation 
> > >   files with the information about CAP_PERFMON 
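
To illustrate the access policy the cover letter describes, here is a minimal
userspace sketch in C. It only models the decision on a 64-bit effective
capability mask and is not the kernel implementation. The capability numbers
are assumptions taken from mainline include/uapi/linux/capability.h
(CAP_SYS_ADMIN is 21, CAP_PERFMON is 38), and the function names are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed capability numbers (mainline include/uapi/linux/capability.h). */
#define CAP_SYS_ADMIN_NR 21
#define CAP_PERFMON_NR   38

/* True when capability number cap is raised in the effective mask. */
bool cap_raised(uint64_t effective, unsigned int cap)
{
	return ((effective >> cap) & 1) != 0;
}

/*
 * Policy described in the cover letter: CAP_PERFMON alone is enough,
 * and CAP_SYS_ADMIN keeps working for backward compatibility.
 */
bool perfmon_allowed(uint64_t effective)
{
	return cap_raised(effective, CAP_PERFMON_NR) ||
	       cap_raised(effective, CAP_SYS_ADMIN_NR);
}
```

A process holding only CAP_PERFMON passes the check without carrying the rest
of the CAP_SYS_ADMIN privileges, which is the least-privilege point the series
argues for.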

Re: [PATCH V14] mm/debug: Add tests validating architecture page table helpers

2020-03-02 Thread Qian Cai
On Wed, 2020-02-26 at 10:51 -0500, Qian Cai wrote:
> On Wed, 2020-02-26 at 15:45 +0100, Christophe Leroy wrote:
> > 
> > Le 26/02/2020 à 15:09, Qian Cai a écrit :
> > > On Mon, 2020-02-17 at 08:47 +0530, Anshuman Khandual wrote:
> > > > > This adds tests which will validate architecture page table helpers
> > > > > and other accessors in their compliance with expected generic MM
> > > > > semantics. This will help various architectures in validating
> > > > > changes to existing page table helpers or addition of new ones.
> > > > > 
> > > > > This test covers basic page table entry transformations including
> > > > > but not limited to old, young, dirty, clean, write, write protect
> > > > > etc. at various levels, along with populating intermediate entries
> > > > > with the next page table page and validating them.
> > > > > 
> > > > > Test page table pages are allocated from system memory with the
> > > > > required size and alignment. The mapped pfns at page table levels
> > > > > are derived from a real pfn representing a valid kernel text
> > > > > symbol. This test gets called inside kernel_init() right after
> > > > > async_synchronize_full().
> > > > > 
> > > > > This test gets built and run when CONFIG_DEBUG_VM_PGTABLE is
> > > > > selected. Any architecture that is willing to subscribe to this
> > > > > test will need to select ARCH_HAS_DEBUG_VM_PGTABLE. For now this is
> > > > > limited to arc, arm64, x86, s390 and ppc32 platforms, where the
> > > > > test is known to build and run successfully. Going forward, other
> > > > > architectures can subscribe to the test too, after fixing any build
> > > > > or runtime problems with their page table helpers. Meanwhile, for
> > > > > better platform coverage, the test can also be enabled with
> > > > > CONFIG_EXPERT even without ARCH_HAS_DEBUG_VM_PGTABLE.
> > > > > 
> > > > > Folks interested in making sure that a given platform's page table
> > > > > helpers conform to expected generic MM semantics should enable the
> > > > > above config, which will just trigger this test during boot. Any
> > > > > non-conformity here will be reported as a warning which would need
> > > > > to be fixed. This test will help catch any changes to the agreed
> > > > > upon semantics expected from generic MM and enable platforms to
> > > > > accommodate it thereafter.
> > > 
> > > > How useful is this if it outright crashes powerpc?
> > > 
> > > [   23.263425][T1] debug_vm_pgtable: debug_vm_pgtable: Validating
> > > architecture page table helpers
> > > [   23.263625][T1] [ cut here ]
> > > [   23.263649][T1] kernel BUG at arch/powerpc/mm/pgtable.c:274!
> > 
> > The problem on PPC64 is known and has to be investigated and fixed.
> 
> It might be interesting to hear what powerpc64 maintainers would say about it
> and if it is actually worth "fixing" in the arch code, but that BUG_ON() was
> > there since 2009 and had not been exposed until this patch came along?

The patch below makes it work on powerpc64 by dodging the BUG_ON()s in
assert_pte_locked() triggered by pte_clear_tests().


diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 96dd7d574cef..50b385233971 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -55,6 +55,8 @@
 #define RANDOM_ORVALUE GENMASK(BITS_PER_LONG - 1, S390_MASK_BITS)
 #define RANDOM_NZVALUE GENMASK(7, 0)
 
+unsigned long vaddr;
+
 static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
 {
    pte_t pte = pfn_pte(pfn, prot);
@@ -256,7 +258,7 @@ static void __init pte_clear_tests(struct mm_struct *mm, pte_t *ptep)
 
    pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
    WRITE_ONCE(*ptep, pte);
-   pte_clear(mm, 0, ptep);
+   pte_clear(mm, vaddr, ptep);
    pte = READ_ONCE(*ptep);
    WARN_ON(!pte_none(pte));
 }
@@ -310,8 +312,9 @@ void __init debug_vm_pgtable(void)
    pgtable_t saved_ptep;
    pgprot_t prot;
    phys_addr_t paddr;
-   unsigned long vaddr, pte_aligned, pmd_aligned;
+   unsigned long pte_aligned, pmd_aligned;
    unsigned long pud_aligned, p4d_aligned, pgd_aligned;
+   spinlock_t *ptl;
 
    pr_info("Validating architecture page table helpers\n");
    prot = vm_get_page_prot(VMFLAGS);
@@ -344,7 +347,7 @@ void __init debug_vm_pgtable(void)
    p4dp = p4d_alloc(mm, pgdp, vaddr);
    pudp = pud_alloc(mm, p4dp, vaddr);
    pmdp = pmd_alloc(mm, pudp, vaddr);
-   ptep = pte_alloc_map(mm, pmdp, vaddr);
+   ptep = pte_alloc_map_lock(mm, pmdp, vaddr, &ptl);
 
    /*
     * Save all the page table page addresses as the page table
@@ -370,7 +373,7 @@ void __init debug_vm_pgtable(void)
    p4d_clear_tests(mm, p4dp);
    pgd_clear_tests(mm, pgdp);
 
-   pte_unmap(ptep);
+   pte_unmap_unlock(ptep, ptl);
 
    pmd_populate_tests(mm, pmdp, saved_ptep);
    pud_populate_tests(mm, pudp, saved_pmdp);


Re: [PATCH v3 6/8] perf/tools: Enhance JSON/metric infrastructure to handle "?"

2020-03-02 Thread Jiri Olsa
On Sat, Feb 29, 2020 at 03:11:57PM +0530, Kajol Jain wrote:

SNIP

>  #define PVR_VER(pvr)(((pvr) >>  16) & 0x) /* Version field */
>  #define PVR_REV(pvr)(((pvr) >>   0) & 0x) /* Revison field */
>  
> +#define SOCKETS_INFO_FILE_PATH "/devices/hv_24x7/interface/"
> +
>  int
>  get_cpuid(char *buffer, size_t sz)
>  {
> @@ -44,3 +51,43 @@ get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
>  
>   return bufp;
>  }
> +
> +int arch_get_runtimeparam(void)
> +{
> + int count = 0;
> + DIR *dir;
> + char path[PATH_MAX];
> + const char *sysfs = sysfs__mountpoint();
> + char filename[] = "sockets";
> + FILE *file;
> + char buf[16], *num;
> + int data;
> +
> + if (!sysfs)
> + goto out;
> +
> + snprintf(path, PATH_MAX,
> +  "%s" SOCKETS_INFO_FILE_PATH, sysfs);
> + dir = opendir(path);
> +
> + if (!dir)
> + goto out;
> +
> + strcat(path, filename);
> + file = fopen(path, "r");
> +
> + if (!file)
> + goto out;
> +
> + data = fread(buf, 1, sizeof(buf), file);
> +
> + if (data == 0)
> + goto out;
> +
> + count = strtol(buf, &num, 10);
> +out:
> + if (!count)
> + count = 1;
> +
> + return count;

we have sysfs__read_ull for this

jirka
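
As a rough illustration of the simplification suggested above, the open-coded
opendir()/fopen()/fread()/strtol() sequence can collapse into one "read an
unsigned integer from a file" helper plus the fallback to 1. The sketch below
is standalone C; read_ull() is a hypothetical stand-in for perf's
sysfs__read_ull() (which, as far as I know, takes an entry relative to the
sysfs mount point rather than a full path), and get_socket_count() is an
illustrative name, not one from the patch.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for sysfs__read_ull(): read one unsigned
 * integer from the file at path. Returns 0 on success, -1 on failure.
 */
int read_ull(const char *path, unsigned long long *value)
{
	char buf[64];
	char *end;
	FILE *file = fopen(path, "r");

	if (!file)
		return -1;
	if (!fgets(buf, sizeof(buf), file)) {
		fclose(file);
		return -1;
	}
	fclose(file);

	errno = 0;
	*value = strtoull(buf, &end, 10);
	if (errno || end == buf)
		return -1;
	return 0;
}

/* Socket count with the same "fall back to 1" behaviour as the patch. */
int get_socket_count(const char *path)
{
	unsigned long long count;

	if (read_ull(path, &count) < 0 || count == 0)
		return 1;
	return (int)count;
}
```

Besides being shorter, a helper of this shape always closes the file and
parses a NUL-terminated buffer, two things the quoted hunk does not appear to
do.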



Re: [PATCH v3 6/7] mm/memory_hotplug: Add pgprot_t to mhp_params

2020-03-02 Thread Logan Gunthorpe



On 2020-02-29 3:44 p.m., Dan Williams wrote:
> On Fri, Feb 21, 2020 at 10:25 AM Logan Gunthorpe  wrote:
>>
>> devm_memremap_pages() is currently used by the PCI P2PDMA code to create
>> struct page mappings for IO memory. At present, these mappings are created
>> with PAGE_KERNEL which implies setting the PAT bits to be WB. However, on
>> x86, an mtrr register will typically override this and force the cache
>> type to be UC-. In the case firmware doesn't set this register it is
>> effectively WB and will typically result in a machine check exception
>> when it's accessed.
>>
>> Other arches are not currently likely to function correctly seeing they
>> don't have any MTRR registers to fall back on.
>>
>> To solve this, provide a way to specify the pgprot value explicitly to
>> arch_add_memory().
>>
>> Of the arches that support MEMORY_HOTPLUG: x86_64, and arm64 need a simple
>> change to pass the pgprot_t down to their respective functions which set
>> up the page tables. For x86_32, set the page tables explicitly using
>> _set_memory_prot() (seeing they are already mapped). For ia64, s390 and
>> sh, reject anything but PAGE_KERNEL settings -- this should be fine,
>> for now, seeing these architectures don't support ZONE_DEVICE.
>>
>> A check in __add_pages() is also added to ensure the pgprot parameter was
>> set for all arches.
>>
>> Cc: Dan Williams 
>> Signed-off-by: Logan Gunthorpe 
>> Acked-by: David Hildenbrand 
>> Acked-by: Michal Hocko 
>> ---
>>  arch/arm64/mm/mmu.c| 3 ++-
>>  arch/ia64/mm/init.c| 3 +++
>>  arch/powerpc/mm/mem.c  | 3 ++-
>>  arch/s390/mm/init.c| 3 +++
>>  arch/sh/mm/init.c  | 3 +++
>>  arch/x86/mm/init_32.c  | 5 +
>>  arch/x86/mm/init_64.c  | 2 +-
>>  include/linux/memory_hotplug.h | 2 ++
>>  mm/memory_hotplug.c| 5 -
>>  mm/memremap.c  | 6 +++---
>>  10 files changed, 28 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>> index ee37bca8aba8..ea3fa844a8a2 100644
>> --- a/arch/arm64/mm/mmu.c
>> +++ b/arch/arm64/mm/mmu.c
>> @@ -1058,7 +1058,8 @@ int arch_add_memory(int nid, u64 start, u64 size,
>> flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
>>
>> __create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
>> -size, PAGE_KERNEL, __pgd_pgtable_alloc, flags);
>> +size, params->pgprot, __pgd_pgtable_alloc,
>> +flags);
>>
>> memblock_clear_nomap(start, size);
>>
>> diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
>> index 97bbc23ea1e3..d637b4ea3147 100644
>> --- a/arch/ia64/mm/init.c
>> +++ b/arch/ia64/mm/init.c
>> @@ -676,6 +676,9 @@ int arch_add_memory(int nid, u64 start, u64 size,
>> unsigned long nr_pages = size >> PAGE_SHIFT;
>> int ret;
>>
>> +   if (WARN_ON_ONCE(params->pgprot.pgprot != PAGE_KERNEL.pgprot))
>> +   return -EINVAL;
>> +
>> ret = __add_pages(nid, start_pfn, nr_pages, params);
>> if (ret)
>> printk("%s: Problem encountered in __add_pages() as 
>> ret=%d\n",
>> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
>> index 19b1da5d7eca..832412bc7fad 100644
>> --- a/arch/powerpc/mm/mem.c
>> +++ b/arch/powerpc/mm/mem.c
>> @@ -138,7 +138,8 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
>> resize_hpt_for_hotplug(memblock_phys_mem_size());
>>
>> start = (unsigned long)__va(start);
>> -   rc = create_section_mapping(start, start + size, nid, PAGE_KERNEL);
>> +   rc = create_section_mapping(start, start + size, nid,
>> +   params->pgprot);
>> if (rc) {
>> pr_warn("Unable to create mapping for hot added memory 
>> 0x%llx..0x%llx: %d\n",
>> start, start + size, rc);
>> diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
>> index e9e4a7abd0cc..87b2d024e75a 100644
>> --- a/arch/s390/mm/init.c
>> +++ b/arch/s390/mm/init.c
>> @@ -277,6 +277,9 @@ int arch_add_memory(int nid, u64 start, u64 size,
>> if (WARN_ON_ONCE(params->altmap))
>> return -EINVAL;
>>
>> +   if (WARN_ON_ONCE(params->pgprot.pgprot != PAGE_KERNEL.pgprot))
>> +   return -EINVAL;
>> +
>> rc = vmem_add_mapping(start, size);
>> if (rc)
>> return rc;
>> diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
>> index e5114c053364..b9de2d4fa57e 100644
>> --- a/arch/sh/mm/init.c
>> +++ b/arch/sh/mm/init.c
>> @@ -412,6 +412,9 @@ int arch_add_memory(int nid, u64 start, u64 size,
>> unsigned long nr_pages = size >> PAGE_SHIFT;
>> int ret;
>>
>> +   if (WARN_ON_ONCE(params->pgprot.pgprot != PAGE_KERNEL.pgprot))
>> +   return -EINVAL;
>> +
>> /* We only have ZONE_NORMAL, so this is easy.. */
>> ret = __add_pages(nid, start_pfn, nr_pages, params);
>>
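
For readers following the thread, the two checks the commit message describes
can be modelled in a few lines of standalone C. Everything here is a mock:
the types, MOCK_PAGE_KERNEL and its value, and the mock_* function names are
assumptions for illustration, and the "pgprot was never set" test is assumed
to be a comparison against zero.

```c
#include <errno.h>
#include <stddef.h>

/* Mock types; the real pgprot_t and struct mhp_params live in the kernel. */
typedef struct { unsigned long pgprot; } pgprot_t;

/* Arbitrary non-zero value standing in for PAGE_KERNEL. */
#define MOCK_PAGE_KERNEL ((pgprot_t){ 0x63 })

struct mhp_params {
	void *altmap;		/* stand-in for struct vmem_altmap * */
	pgprot_t pgprot;
};

/* Generic layer: refuse params whose pgprot was never filled in. */
int mock_add_pages(const struct mhp_params *params)
{
	if (params->pgprot.pgprot == 0)
		return -EINVAL;
	return 0;
}

/* Arch without ZONE_DEVICE support: accept only the default protection. */
int mock_arch_add_memory(const struct mhp_params *params)
{
	if (params->pgprot.pgprot != MOCK_PAGE_KERNEL.pgprot)
		return -EINVAL;
	return mock_add_pages(params);
}
```

The arch-level check is the stricter of the two: a caller that passes some
other valid protection gets through the generic check but is still rejected
on an architecture that only supports the default.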

Re: [PATCH v3 4/7] x86/mm: Introduce _set_memory_prot()

2020-03-02 Thread Logan Gunthorpe



On 2020-02-29 3:33 p.m., Dan Williams wrote:
> On Fri, Feb 21, 2020 at 10:25 AM Logan Gunthorpe  wrote:
>>
>> For use in the 32bit arch_add_memory() to set the pgprot type of the
>> memory to add.
>>
>> Cc: Thomas Gleixner 
>> Cc: Ingo Molnar 
>> Cc: Borislav Petkov 
>> Cc: "H. Peter Anvin" 
>> Cc: x...@kernel.org
>> Cc: Dave Hansen 
>> Cc: Andy Lutomirski 
>> Cc: Peter Zijlstra 
>> Signed-off-by: Logan Gunthorpe 
>> ---
>>  arch/x86/include/asm/set_memory.h | 1 +
>>  arch/x86/mm/pat/set_memory.c  | 7 +++
>>  2 files changed, 8 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/set_memory.h 
>> b/arch/x86/include/asm/set_memory.h
>> index 64c3dce374e5..0aca959cf9a4 100644
>> --- a/arch/x86/include/asm/set_memory.h
>> +++ b/arch/x86/include/asm/set_memory.h
>> @@ -34,6 +34,7 @@
>>   * The caller is required to take care of these.
>>   */
>>
>> +int _set_memory_prot(unsigned long addr, int numpages, pgprot_t prot);
> 
> I wonder if this should be separated from the naming convention of the
> other routines because this is only an internal helper for code paths
> where the prot was established by an upper layer. For example, I
> expect that the kernel does not want new usages to make the mistake of
> calling:
> 
>_set_memory_prot(..., pgprot_writecombine(pgprot))
> 
> ...instead of
> 
> _set_memory_wc()
> 
> I'm thinking just a double underscore rename (__set_memory_prot) and a
> kerneldoc comment for that pointing people to use the direct
> _set_memory_ helpers.

Thanks! Will do. Note, though, that even _set_memory_wc() is an internal
x86-specific function. But the extra comment and underscore still make
sense.

> With that you can add:
> 
> Reviewed-by: Dan Williams 
> 


Re: [PATCH v3 3/5] libnvdimm/namespace: Enforce memremap_compat_align()

2020-03-02 Thread Dan Williams
On Mon, Mar 2, 2020 at 4:09 AM Aneesh Kumar K.V
 wrote:
>
> Dan Williams  writes:
>
> > The pmem driver on PowerPC crashes with the following signature when
> > instantiating misaligned namespaces that map their capacity via
> > memremap_pages().
> >
> > BUG: Unable to handle kernel data access at 0xc00100040600
> > Faulting instruction address: 0xc0090790
> > NIP [c0090790] arch_add_memory+0xc0/0x130
> > LR [c0090744] arch_add_memory+0x74/0x130
> > Call Trace:
> >  arch_add_memory+0x74/0x130 (unreliable)
> >  memremap_pages+0x74c/0xa30
> >  devm_memremap_pages+0x3c/0xa0
> >  pmem_attach_disk+0x188/0x770
> >  nvdimm_bus_probe+0xd8/0x470
> >
> > With the assumption that only memremap_pages() has alignment
> > constraints, enforce memremap_compat_align() for
> > pmem_should_map_pages(), nd_pfn, and nd_dax cases. This includes
> > preventing the creation of namespaces where the base address is
> > misaligned and cases where infoblock padding parameters are invalid.
> >
>
> Reviewed-by: Aneesh Kumar K.V 
>
> > Reported-by: Aneesh Kumar K.V 
> > Cc: Jeff Moyer 
> > Fixes: a3619190d62e ("libnvdimm/pfn: stop padding pmem namespaces to 
> > section alignment")
> > Signed-off-by: Dan Williams 
> > ---
> >  drivers/nvdimm/namespace_devs.c |   12 
> >  drivers/nvdimm/pfn_devs.c   |   26 +++---
> >  2 files changed, 35 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/nvdimm/namespace_devs.c 
> > b/drivers/nvdimm/namespace_devs.c
> > index 032dc61725ff..68e89855f779 100644
> > --- a/drivers/nvdimm/namespace_devs.c
> > +++ b/drivers/nvdimm/namespace_devs.c
> > @@ -10,6 +10,7 @@
> >  #include 
> >  #include "nd-core.h"
> >  #include "pmem.h"
> > +#include "pfn.h"
> >  #include "nd.h"
> >
> >  static void namespace_io_release(struct device *dev)
> > @@ -1739,6 +1740,17 @@ struct nd_namespace_common 
> > *nvdimm_namespace_common_probe(struct device *dev)
> >   return ERR_PTR(-ENODEV);
> >   }
>
> Maybe add a comment here that both dax/fsdax namespace details are
> checked in nd_pfn_validate() so that we look at start_pad and end_trunc
> while validating the namespace?
>
> >
> > + if (pmem_should_map_pages(dev)) {
> > + struct nd_namespace_io *nsio = to_nd_namespace_io(>dev);
> > + struct resource *res = >res;
> > +
> > + if (!IS_ALIGNED(res->start | (res->end + 1),
> > + memremap_compat_align())) {
> > + dev_err(>dev, "%pr misaligned, unable to 
> > map\n", res);
> > + return ERR_PTR(-EOPNOTSUPP);
> > + }
> > + }
> > +
> >   if (is_namespace_pmem(>dev)) {
> >   struct nd_namespace_pmem *nspm;
> >
> > diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
> > index 79fe02d6f657..3bdd4b883d05 100644
> > --- a/drivers/nvdimm/pfn_devs.c
> > +++ b/drivers/nvdimm/pfn_devs.c
> > @@ -446,6 +446,7 @@ static bool nd_supported_alignment(unsigned long align)
> >  int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig)
> >  {
> >   u64 checksum, offset;
> > + struct resource *res;
> >   enum nd_pfn_mode mode;
> >   struct nd_namespace_io *nsio;
> >   unsigned long align, start_pad;
> > @@ -578,13 +579,14 @@ int nd_pfn_validate(struct nd_pfn *nd_pfn, const char 
> > *sig)
> >* established.
> >*/
> >   nsio = to_nd_namespace_io(>dev);
> > - if (offset >= resource_size(>res)) {
> > + res = >res;
> > + if (offset >= resource_size(res)) {
> >   dev_err(_pfn->dev, "pfn array size exceeds capacity of 
> > %s\n",
> >   dev_name(>dev));
> >   return -EOPNOTSUPP;
> >   }
> >
> > - if ((align && !IS_ALIGNED(nsio->res.start + offset + start_pad, 
> > align))
> > + if ((align && !IS_ALIGNED(res->start + offset + start_pad, align))
> >   || !IS_ALIGNED(offset, PAGE_SIZE)) {
> >   dev_err(_pfn->dev,
> >   "bad offset: %#llx dax disabled align: 
> > %#lx\n",
> > @@ -592,6 +594,18 @@ int nd_pfn_validate(struct nd_pfn *nd_pfn, const char 
> > *sig)
> >   return -EOPNOTSUPP;
> >   }
> >
> > + if (!IS_ALIGNED(res->start + le32_to_cpu(pfn_sb->start_pad),
> > + memremap_compat_align())) {
> > + dev_err(_pfn->dev, "resource start misaligned\n");
> > + return -EOPNOTSUPP;
> > + }
> > +
> > + if (!IS_ALIGNED(res->end + 1 - le32_to_cpu(pfn_sb->end_trunc),
> > + memremap_compat_align())) {
> > + dev_err(_pfn->dev, "resource end misaligned\n");
> > + return -EOPNOTSUPP;
> > + }
> > +
> >   return 0;
> >  }
> >  EXPORT_SYMBOL(nd_pfn_validate);
> > @@ -750,7 +764,13 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
> >   start = nsio->res.start;
> >   size = 
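
The IS_ALIGNED(res->start | (res->end + 1), ...) test in the namespace hunk
checks both ends of the range at once. A small standalone C sketch of that
idiom follows; the helper names are hypothetical, the alignment is assumed to
be a power of two, and end is inclusive, as in struct resource.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when x is a multiple of align; align must be a power of two. */
bool is_aligned(uint64_t x, uint64_t align)
{
	return (x & (align - 1)) == 0;
}

/*
 * start and end are inclusive. OR-ing the start with the first address
 * past the end leaves a low bit set if either boundary is misaligned,
 * so a single test covers both.
 */
bool range_aligned(uint64_t start, uint64_t end, uint64_t align)
{
	return is_aligned(start | (end + 1), align);
}
```

With an alignment of 16 MiB chosen purely for illustration, the range
0x1000000..0x1ffffff passes, while 0x1000000..0x17fffff fails on its end and
0x800000..0x1ffffff fails on its start.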

Re: [PATCH v3 15/27] powerpc/powernv/pmem: Add support for near storage commands

2020-03-02 Thread Dan Williams
On Mon, Mar 2, 2020 at 9:59 AM Frederic Barrat  wrote:
>
>
>
> Le 21/02/2020 à 04:27, Alastair D'Silva a écrit :
> > From: Alastair D'Silva 
> >
> > Similar to the previous patch, this adds support for near storage commands.
> >
> > Signed-off-by: Alastair D'Silva 
> > ---
>
>
> Are any of these new functions ever called?

This is my concern as well. The libnvdimm command support is limited
to the commands that Linux will use. Other passthrough commands are
supported through a passthrough interface. However, that passthrough
interface is explicitly limited to publicly documented command sets so
that the kernel has an opportunity to constrain and consolidate
command implementations across vendors.


Re: [PATCH v3 6/8] perf/tools: Enhance JSON/metric infrastructure to handle "?"

2020-03-02 Thread Jiri Olsa
On Sat, Feb 29, 2020 at 03:11:57PM +0530, Kajol Jain wrote:

SNIP

> diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
> index 02aee946b6c1..f629828cc0de 100644
> --- a/tools/perf/util/metricgroup.c
> +++ b/tools/perf/util/metricgroup.c
> @@ -399,6 +399,11 @@ void metricgroup__print(bool metrics, bool metricgroups, 
> char *filter,
>   strlist__delete(metriclist);
>  }
>  
> +int __weak arch_get_runtimeparam(void)
> +{
> + return 1;
> +}
> +
>  static int metricgroup__add_metric(const char *metric, struct strbuf *events,
>  struct list_head *group_list)
>  {
> @@ -419,52 +424,77 @@ static int metricgroup__add_metric(const char *metric, 
> struct strbuf *events,
>   continue;
>   if (match_metric(pe->metric_group, metric) ||
>   match_metric(pe->metric_name, metric)) {
> - const char **ids;
> - int idnum;
> - struct egroup *eg;
> - bool no_group = false;
> + int k, count;

two things in here.. there's already ack-ed patchset from Kan Liang:
  Support metric group constraint
  [PATCH V2 2/5] perf metricgroup: Factor out metricgroup__add_metric_weak_group()

that's changing this place, so you might want to synchronize with that


> +
> + if (strstr(pe->metric_expr, "?"))
> + count = arch_get_runtimeparam();
> + else
> + count = 1;
> +
> + /* This loop is added to create multiple
> +  * events depend on count value and add
> +  * those events to group_list.
> +  */
> + for (k = 0; k < count; k++) {
> + const char **ids;
> + int idnum;
> + struct egroup *eg;
> + bool no_group = false;
> + char value[PATH_MAX];
> +
> + pr_debug("metric expr %s for %s\n",
> +  pe->metric_expr, pe->metric_name);
> + expr__runtimeparam = k;

the other thing is that I don't really follow what's going on in here

you're setting expr__runtimeparam to the loop index,
which you get from some arch related file

we should do this in arch-specific way.. I think that Kan's change is
already moving some bits into separate function and that should make
all this more readable, but perhaps we might need more, so all the
'repeating' code will be in a function

please either separate this to arch code, or make it understandable
for people from other archs ;-)

jirka

> + if (expr__find_other(pe->metric_expr, NULL,
> +  &ids, &idnum) < 0)
> + continue;
> + if (events->len > 0)
> + strbuf_addf(events, ",");
> + for (j = 0; j < idnum; j++) {
> + pr_debug("found event %s\n", ids[j]);
> + /*
> +  * Duration time maps to a software
> +  * event and can make groups not count.
> +  * Always use it outside a group.
> +  */
> + if (!strcmp(ids[j], "duration_time")) {
> + if (j > 0)
> + strbuf_addf(events,
> + "}:W,");
> + strbuf_addf(events,
> + "duration_time");
> + no_group = true;
> + continue;
> + }
> + strbuf_addf(events, "%s%s",
> + j == 0 || no_group ? "{" :
> + ",", ids[j]);
> + no_group = false;
> + }
> + if (!no_group)
> + strbuf_addf(events, "}:W");
>  
> - pr_debug("metric expr %s for %s\n", pe->metric_expr, 
> pe->metric_name);
> + eg = malloc(sizeof(struct egroup));
> + if (!eg) {
> + ret = -ENOMEM;
> + break;
> + }
> + eg->ids = ids;
> +   

Re: [PATCH v3 6/8] perf/tools: Enhance JSON/metric infrastructure to handle "?"

2020-03-02 Thread Jiri Olsa
On Sat, Feb 29, 2020 at 03:11:57PM +0530, Kajol Jain wrote:

SNIP

> + *dst++ = paramval[i++];
> + free(paramval);
> + }
> + }
>   else
>   *dst++ = *str;
>   str++;
> @@ -72,8 +86,8 @@ number  [0-9]+
>  
>  sch  [-,=]
>  spec \\{sch}
> -sym  [0-9a-zA-Z_\.:@]+
> -symbol   {spec}*{sym}*{spec}*{sym}*
> +sym[0-9a-zA-Z_\.:@?]+
> +symbol {spec}*{sym}*{spec}*{sym}*{spec}*{sym}
>  
>  %%
>   {
> diff --git a/tools/perf/util/expr.y b/tools/perf/util/expr.y
> index 4720cbe79357..0f3ef0f37bf4 100644
> --- a/tools/perf/util/expr.y
> +++ b/tools/perf/util/expr.y
> @@ -38,6 +38,8 @@
>  %type  expr if_expr
>  
>  %{
> +int expr__runtimeparam;

we don't like global variables.. could this be part of the
context struct?

jirka
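
One way to picture the request above is a '?' substitution that takes its
value from a context object passed in by the caller, with no global
involved. The standalone C sketch below is only an illustration: struct
expr_ctx, its runtime_param field and expand_runtime_param() are hypothetical
names and do not come from the perf sources.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical parse context; the field name is an illustration only. */
struct expr_ctx {
	int runtime_param;	/* value substituted for each '?' */
};

/*
 * Copy src into dst, replacing every '?' with ctx->runtime_param.
 * Returns 0 on success, -1 if dst is too small.
 */
int expand_runtime_param(const struct expr_ctx *ctx, const char *src,
			 char *dst, size_t dst_len)
{
	size_t used = 0;

	if (dst_len == 0)
		return -1;
	dst[0] = '\0';

	for (; *src; src++) {
		int n;

		if (*src == '?')
			n = snprintf(dst + used, dst_len - used, "%d",
				     ctx->runtime_param);
		else
			n = snprintf(dst + used, dst_len - used, "%c", *src);

		/* snprintf reports the length it needed; detect truncation. */
		if (n < 0 || (size_t)n >= dst_len - used)
			return -1;
		used += (size_t)n;
	}
	return 0;
}
```

Because the value travels with the context, two expressions can be expanded
with different parameters without one call disturbing the other.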



Re: [PATCH v3 15/27] powerpc/powernv/pmem: Add support for near storage commands

2020-03-02 Thread Frederic Barrat




Le 21/02/2020 à 04:27, Alastair D'Silva a écrit :

From: Alastair D'Silva 

Similar to the previous patch, this adds support for near storage commands.

Signed-off-by: Alastair D'Silva 
---



Are any of these new functions ever called?

  Fred



  arch/powerpc/platforms/powernv/pmem/ocxl.c|  6 +++
  .../platforms/powernv/pmem/ocxl_internal.c| 41 +++
  .../platforms/powernv/pmem/ocxl_internal.h| 37 +
  3 files changed, 84 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c 
b/arch/powerpc/platforms/powernv/pmem/ocxl.c
index 4e782d22605b..b8bd7e703b19 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
@@ -259,12 +259,18 @@ static int setup_command_metadata(struct ocxlpmem 
*ocxlpmem)
int rc;
  
 	mutex_init(&ocxlpmem->admin_command.lock);
+	mutex_init(&ocxlpmem->ns_command.lock);
 
 	rc = extract_command_metadata(ocxlpmem, GLOBAL_MMIO_ACMA_CREQO,
 				      &ocxlpmem->admin_command);
 	if (rc)
 		return rc;
 
+	rc = extract_command_metadata(ocxlpmem, GLOBAL_MMIO_NSCMA_CREQO,
+				      &ocxlpmem->ns_command);
+   if (rc)
+   return rc;
+
return 0;
  }
  
diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c

index 583f48023025..3e0b133feddf 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
@@ -133,6 +133,47 @@ int admin_response_handled(const struct ocxlpmem *ocxlpmem)
  OCXL_LITTLE_ENDIAN, GLOBAL_MMIO_CHI_ACRA);
  }
  
+int ns_command_request(struct ocxlpmem *ocxlpmem, u8 op_code)
+{
+	u64 val;
+	int rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHI,
+					 OCXL_LITTLE_ENDIAN, &val);
+	if (rc)
+		return rc;
+
+	if (!(val & GLOBAL_MMIO_CHI_NSCRA))
+		return -EBUSY;
+
+	return scm_command_request(ocxlpmem, &ocxlpmem->ns_command, op_code);
+}
+
+int ns_response(const struct ocxlpmem *ocxlpmem)
+{
+	return command_response(ocxlpmem, &ocxlpmem->ns_command);
+}
+
+int ns_command_execute(const struct ocxlpmem *ocxlpmem)
+{
+	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_HCI,
+				      OCXL_LITTLE_ENDIAN, GLOBAL_MMIO_HCI_NSCRW);
+}
+
+bool ns_command_complete(const struct ocxlpmem *ocxlpmem)
+{
+	u64 val = 0;
+	int rc = ocxlpmem_chi(ocxlpmem, &val);
+
+	WARN_ON(rc);
+
+	return (val & GLOBAL_MMIO_CHI_NSCRA) != 0;
+}
+
+int ns_response_handled(const struct ocxlpmem *ocxlpmem)
+{
+	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHIC,
+				      OCXL_LITTLE_ENDIAN, GLOBAL_MMIO_CHI_NSCRA);
+}
+
  void warn_status(const struct ocxlpmem *ocxlpmem, const char *message,
 u8 status)
  {
diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h 
b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
index 2fef68c71271..28e2020f6355 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
@@ -107,6 +107,7 @@ struct ocxlpmem {
struct ocxl_context *ocxl_context;
void *metadata_addr;
struct command_metadata admin_command;
+   struct command_metadata ns_command;
struct resource pmem_res;
struct nd_region *nd_region;
char fw_version[8+1];
@@ -175,6 +176,42 @@ int admin_command_complete_timeout(const struct ocxlpmem 
*ocxlpmem,
   */
  int admin_response_handled(const struct ocxlpmem *ocxlpmem);
  
+/**

+ * ns_command_request() - Issue a near storage command request
+ * @ocxlpmem: the device metadata
+ * @op_code: The op-code for the command
+ * Returns an identifier for the command, or negative on error
+ */
+int ns_command_request(struct ocxlpmem *ocxlpmem, u8 op_code);
+
+/**
+ * ns_response() - Validate a near storage response
+ * @ocxlpmem: the device metadata
+ * Returns the status code of the command, or negative on error
+ */
+int ns_response(const struct ocxlpmem *ocxlpmem);
+
+/**
+ * ns_command_execute() - Notify the controller to start processing a pending 
near storage command
+ * @ocxlpmem: the device metadata
+ * Returns 0 on success, negative on error
+ */
+int ns_command_execute(const struct ocxlpmem *ocxlpmem);
+
+/**
+ * ns_command_complete() - Is a near storage command executing
+ * @ocxlpmem: the device metadata
+ * Returns true if the previous admin command has completed
+ */
+bool ns_command_complete(const struct ocxlpmem *ocxlpmem);
+
+/**
+ * ns_response_handled() - Notify the controller that the near storage 
response has been handled
+ * @ocxlpmem: the device metadata
+ * Returns 0 on success, negative on failure
+ */
+int ns_response_handled(const struct ocxlpmem *ocxlpmem);
+
  /**
   * 

Re: [PATCH v3 13/27] powerpc/powernv/pmem: Read the capability registers & wait for device ready

2020-03-02 Thread Frederic Barrat




Le 21/02/2020 à 04:27, Alastair D'Silva a écrit :

From: Alastair D'Silva 

This patch reads timeouts & firmware version from the controller, and
uses those timeouts to wait for the controller to report that it is ready
before handing the memory over to libnvdimm.

Signed-off-by: Alastair D'Silva 
---
  arch/powerpc/platforms/powernv/pmem/Makefile  |  2 +-
  arch/powerpc/platforms/powernv/pmem/ocxl.c| 92 +++
  .../platforms/powernv/pmem/ocxl_internal.c| 19 
  .../platforms/powernv/pmem/ocxl_internal.h| 24 +
  4 files changed, 136 insertions(+), 1 deletion(-)
  create mode 100644 arch/powerpc/platforms/powernv/pmem/ocxl_internal.c

diff --git a/arch/powerpc/platforms/powernv/pmem/Makefile 
b/arch/powerpc/platforms/powernv/pmem/Makefile
index 1c55c4193175..4ceda25907d4 100644
--- a/arch/powerpc/platforms/powernv/pmem/Makefile
+++ b/arch/powerpc/platforms/powernv/pmem/Makefile
@@ -4,4 +4,4 @@ ccflags-$(CONFIG_PPC_WERROR)+= -Werror
  
  obj-$(CONFIG_OCXL_PMEM) += ocxlpmem.o
  
-ocxlpmem-y := ocxl.o

+ocxlpmem-y := ocxl.o ocxl_internal.o
diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c 
b/arch/powerpc/platforms/powernv/pmem/ocxl.c
index 3c4eeb5dcc0f..431212c9f0cc 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
@@ -8,6 +8,7 @@
  
  #include 

  #include 
+#include 
  #include 
  #include 
  #include 
@@ -215,6 +216,36 @@ static int register_lpc_mem(struct ocxlpmem *ocxlpmem)
return 0;
  }
  
+/**

+ * is_usable() - Is a controller usable?
+ * @ocxlpmem: the device metadata
+ * @verbose: True to log errors
+ * Return: true if the controller is usable
+ */
+static bool is_usable(const struct ocxlpmem *ocxlpmem, bool verbose)
+{
+	u64 chi = 0;
+	int rc = ocxlpmem_chi(ocxlpmem, &chi);
+
+	if (rc < 0)
+		return false;
+
+	if (!(chi & GLOBAL_MMIO_CHI_CRDY)) {
+		if (verbose)
+			dev_err(&ocxlpmem->dev, "controller is not ready.\n");
+		return false;
+	}
+
+	if (!(chi & GLOBAL_MMIO_CHI_MA)) {
+		if (verbose)
+			dev_err(&ocxlpmem->dev,
+				"controller does not have memory available.\n");
+   return false;
+   }
+
+   return true;
+}
+
  /**
   * allocate_minor() - Allocate a minor number to use for an OpenCAPI pmem 
device
   * @ocxlpmem: the device metadata
@@ -328,6 +359,48 @@ static void ocxlpmem_remove(struct pci_dev *pdev)
}
  }
  
+/**

+ * read_device_metadata() - Retrieve config information from the AFU and save 
it for future use
+ * @ocxlpmem: the device metadata
+ * Return: 0 on success, negative on failure
+ */
+static int read_device_metadata(struct ocxlpmem *ocxlpmem)
+{
+	u64 val;
+	int rc;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CCAP0,
+				     OCXL_LITTLE_ENDIAN, &val);
+	if (rc)
+		return rc;
+
+	ocxlpmem->scm_revision = val & 0x;
+	ocxlpmem->read_latency = (val >> 32) & 0xFF;
+	ocxlpmem->readiness_timeout = (val >> 48) & 0x0F;
+	ocxlpmem->memory_available_timeout = val >> 52;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CCAP1,
+				     OCXL_LITTLE_ENDIAN, &val);
+	if (rc)
+		return rc;
+
+	ocxlpmem->max_controller_dump_size = val & 0x;
+
+	// Extract firmware version text
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_FWVER,
+				     OCXL_HOST_ENDIAN, (u64 *)ocxlpmem->fw_version);
+	if (rc)
+		return rc;
+
+	ocxlpmem->fw_version[8] = '\0';
+
+	dev_info(&ocxlpmem->dev,
+		 "Firmware version '%s' SCM revision %d:%d\n", ocxlpmem->fw_version,
+		 ocxlpmem->scm_revision >> 4, ocxlpmem->scm_revision & 0x0F);
+
+   return 0;
+}
+
  /**
   * probe_function0() - Set up function 0 for an OpenCAPI persistent memory 
device
   * This is important as it enables templates higher than 0 across all other 
functions,
@@ -368,6 +441,7 @@ static int probe(struct pci_dev *pdev, const struct 
pci_device_id *ent)
  {
struct ocxlpmem *ocxlpmem;
int rc;
+   u16 elapsed, timeout;
  
  	if (PCI_FUNC(pdev->devfn) == 0)

return probe_function0(pdev);
@@ -422,6 +496,24 @@ static int probe(struct pci_dev *pdev, const struct 
pci_device_id *ent)
goto err;
}
  
+	if (read_device_metadata(ocxlpmem)) {

+   dev_err(>dev, "Could not read metadata\n");




Need to set rc
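
To illustrate the concern, here is a small userspace sketch of the two shapes. It assumes, as the quoted hunk suggests but does not show in full, that rc is still 0 from the earlier successful steps and that the err label ends up returning rc. `read_metadata_stub()`, `probe_buggy()` and `probe_fixed()` are hypothetical stand-ins, not functions from the driver.

```c
#include <errno.h>

/* Hypothetical stand-in for read_device_metadata(); fails on request. */
static int read_metadata_stub(int should_fail)
{
	return should_fail ? -EIO : 0;
}

/* Shape of the quoted hunk: the helper's error code is discarded. */
static int probe_buggy(int should_fail)
{
	int rc = 0;	/* left at 0 by the earlier, successful steps */

	if (read_metadata_stub(should_fail))
		goto err;

	return 0;
err:
	return rc;	/* reports success even though setup failed */
}

/* Suggested shape: capture the return value before taking the error path. */
static int probe_fixed(int should_fail)
{
	int rc;

	rc = read_metadata_stub(should_fail);
	if (rc)
		goto err;

	return 0;
err:
	return rc;
}
```

With the first shape a failed metadata read would be reported to the caller as success; the second propagates the helper's error code.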




+   goto err;
+   }
+
+   elapsed = 0;
+   timeout = ocxlpmem->readiness_timeout + 
ocxlpmem->memory_available_timeout;
+   while (!is_usable(ocxlpmem, false)) {
+   if (elapsed++ > timeout) {
+   dev_warn(>dev, "OpenCAPI Persistent Memory ready 
timeout.\n");
+   

eh_frame confusion

2020-03-02 Thread Naveen N. Rao

Naveen N. Rao wrote:

Rasmus Villemoes wrote:

I'm building a ppc32 kernel, and noticed that after upgrading from gcc-7
to gcc-8 all object files now end up having .eh_frame section. For
vmlinux, that's not a problem, because they all get discarded in
arch/powerpc/kernel/vmlinux.lds.S . However, they stick around in
modules, which doesn't seem to be useful - given that everything worked
just fine with gcc-7, and I don't see anything in the module loader that
handles .eh_frame.

The reason I care is that my target has a rather tight rootfs budget,
and the .eh_frame section seem to occupy 10-30% of the file size
(obviously very depending on the particular module).

Comparing the .foo.o.cmd files, I don't see change in options that might
explain this (there's a bunch of new -Wno-*, and the -mspe=no spelling
is apparently no longer supported in gcc-8). Both before and after, there's

-fno-dwarf2-cfi-asm

about which gcc's documentation says

'-fno-dwarf2-cfi-asm'
 Emit DWARF unwind info as compiler generated '.eh_frame' section
 instead of using GAS '.cfi_*' directives.

Looking into where that comes from got me even more confused, because
both arm and unicore32 say

# Never generate .eh_frame
KBUILD_CFLAGS   += $(call cc-option,-fno-dwarf2-cfi-asm)

while the ppc32 case at hand says

# FIXME: the module load should be taught about the additional relocs
# generated by this.
# revert to pre-gcc-4.4 behaviour of .eh_frame


Michael opened a task to look into this recently and I had spent some 
time last week on this. The original commit/discussion adding 
-fno-dwarf2-cfi-asm refers to R_PPC64_REL32 relocations not being 
handled by our module loader:

http://lkml.kernel.org/r/20090224065112.ga6...@bombadil.infradead.org

However, that is now handled thanks to commit 9f751b82b491d:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9f751b82b491d

I did a test build and a simple module loaded fine, so I think 
-fno-dwarf2-cfi-asm is not required anymore, unless Michael has seen 
some breakages with it. Michael?




but prior to gcc-8, .eh_frame didn't seem to get generated anyway.

Can .eh_frame sections be discarded for modules (on ppc32 at least), or
is there some magic that makes them necessary when building with gcc-8?


As Segher points out, it looks like we need to add 
-fno-asynchronous-unwind-tables. Most other architectures seem to use 
that too.


Can you check if the below patch works? I am yet to test this in more 
detail, but would be good to know the implications for ppc32.


- Naveen


---
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index f35730548e42..5b5bf98b8217 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -239,10 +239,7 @@ KBUILD_CFLAGS += $(call cc-option,-mno-vsx)
KBUILD_CFLAGS += $(call cc-option,-mno-spe)
KBUILD_CFLAGS += $(call cc-option,-mspe=no)

-# FIXME: the module load should be taught about the additional relocs
-# generated by this.
-# revert to pre-gcc-4.4 behaviour of .eh_frame
-KBUILD_CFLAGS  += $(call cc-option,-fno-dwarf2-cfi-asm)
+KBUILD_CFLAGS  += $(call cc-option,-fno-asynchronous-unwind-tables)

# Never use string load/store instructions as they are
# often slow when they are implemented at all
diff --git a/arch/powerpc/kernel/vdso32/Makefile 
b/arch/powerpc/kernel/vdso32/Makefile
index e147bbdc12cd..d43b0b18137c 100644
--- a/arch/powerpc/kernel/vdso32/Makefile
+++ b/arch/powerpc/kernel/vdso32/Makefile
@@ -25,6 +25,7 @@ KCOV_INSTRUMENT := n
UBSAN_SANITIZE := n

ccflags-y := -shared -fno-common -fno-builtin -nostdlib \
+   -fasynchronous-unwind-tables \
   -Wl,-soname=linux-vdso32.so.1 -Wl,--hash-style=both
asflags-y := -D__VDSO32__ -s

diff --git a/arch/powerpc/kernel/vdso64/Makefile 
b/arch/powerpc/kernel/vdso64/Makefile
index 32ebb3522ea1..b2cbb5c49bad 100644
--- a/arch/powerpc/kernel/vdso64/Makefile
+++ b/arch/powerpc/kernel/vdso64/Makefile
@@ -13,6 +13,7 @@ KCOV_INSTRUMENT := n
UBSAN_SANITIZE := n

ccflags-y := -shared -fno-common -fno-builtin -nostdlib \
+   -fasynchronous-unwind-tables \
   -Wl,-soname=linux-vdso64.so.1 -Wl,--hash-style=both
asflags-y := -D__VDSO64__ -s





Re: eh_frame confusion

2020-03-02 Thread Naveen N. Rao

Segher Boessenkool wrote:

On Mon, Mar 02, 2020 at 11:56:05AM +0100, Rasmus Villemoes wrote:

I'm building a ppc32 kernel, and noticed that after upgrading from gcc-7
to gcc-8 all object files now end up having .eh_frame section.


Since GCC 8, we enable -fasynchronous-unwind-tables by default for
PowerPC.  See https://gcc.gnu.org/r259298 .


For
vmlinux, that's not a problem, because they all get discarded in
arch/powerpc/kernel/vmlinux.lds.S . However, they stick around in
modules, which doesn't seem to be useful - given that everything worked
just fine with gcc-7, and I don't see anything in the module loader that
handles .eh_frame.


It is useful for debugging.  Not many people debug the kernel like this,
of course.


I'm trying to understand if we need that. Other architectures seem to 
pass -fasynchronous-unwind-tables only for the vdso, but disable it for 
the kernel build. I suppose we can do the same.


If using -fno-asynchronous-unwind-tables, would crash/perf have 
problems?


- Naveen



eh_frame confusion

2020-03-02 Thread Naveen N. Rao

Rasmus Villemoes wrote:

I'm building a ppc32 kernel, and noticed that after upgrading from gcc-7
to gcc-8 all object files now end up having .eh_frame section. For
vmlinux, that's not a problem, because they all get discarded in
arch/powerpc/kernel/vmlinux.lds.S . However, they stick around in
modules, which doesn't seem to be useful - given that everything worked
just fine with gcc-7, and I don't see anything in the module loader that
handles .eh_frame.

The reason I care is that my target has a rather tight rootfs budget,
and the .eh_frame section seem to occupy 10-30% of the file size
(obviously very depending on the particular module).

Comparing the .foo.o.cmd files, I don't see change in options that might
explain this (there's a bunch of new -Wno-*, and the -mspe=no spelling
is apparently no longer supported in gcc-8). Both before and after, there's

-fno-dwarf2-cfi-asm

about which gcc's documentation says

'-fno-dwarf2-cfi-asm'
 Emit DWARF unwind info as compiler generated '.eh_frame' section
 instead of using GAS '.cfi_*' directives.

Looking into where that comes from got me even more confused, because
both arm and unicore32 say

# Never generate .eh_frame
KBUILD_CFLAGS   += $(call cc-option,-fno-dwarf2-cfi-asm)

while the ppc32 case at hand says

# FIXME: the module load should be taught about the additional relocs
# generated by this.
# revert to pre-gcc-4.4 behaviour of .eh_frame


Michael opened a task to look into this recently and I had spent some 
time last week on this. The original commit/discussion adding 
-fno-dwarf2-cfi-asm refers to R_PPC64_REL32 relocations not being 
handled by our module loader:

http://lkml.kernel.org/r/20090224065112.ga6...@bombadil.infradead.org

However, that is now handled thanks to commit 9f751b82b491d:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9f751b82b491d

I did a test build and a simple module loaded fine, so I think 
-fno-dwarf2-cfi-asm is not required anymore, unless Michael has seen 
some breakages with it. Michael?




but prior to gcc-8, .eh_frame didn't seem to get generated anyway.

Can .eh_frame sections be discarded for modules (on ppc32 at least), or
is there some magic that makes them necessary when building with gcc-8?


As Segher points out, it looks like we need to add 
-fno-asynchronous-unwind-tables. Most other architectures seem to use 
that too.



- Naveen
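
For anyone wanting to put a number on the size concern raised at the top of this thread, the following is a rough userspace sketch that looks up the .eh_frame section in an in-memory module image and returns its size. It is an illustration under stated assumptions only: it handles big-endian ELF32 (the ppc32 case) and nothing else, the field offsets are those of the standard ELF32 file and section headers, and the name `eh_frame_size()` is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Big-endian field readers, so the sketch also works on a little-endian host. */
static uint16_t be16(const unsigned char *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Return the size in bytes of .eh_frame in a big-endian ELF32 image,
 * 0 if the section is absent, or -1 if the image does not look like a
 * well-formed big-endian ELF32 file.
 */
static int64_t eh_frame_size(const unsigned char *img, size_t len)
{
	uint32_t shoff, str_off, str_size;
	uint16_t shentsize, shnum, shstrndx, i;
	const unsigned char *strhdr;

	if (len < 52 || memcmp(img, "\177ELF", 4) != 0 ||
	    img[4] != 1 /* ELFCLASS32 */ || img[5] != 2 /* ELFDATA2MSB */)
		return -1;

	shoff = be32(img + 32);		/* e_shoff */
	shentsize = be16(img + 46);	/* e_shentsize */
	shnum = be16(img + 48);		/* e_shnum */
	shstrndx = be16(img + 50);	/* e_shstrndx */
	if (shentsize < 40 || shstrndx >= shnum || shoff > len ||
	    (size_t)shnum * shentsize > len - shoff)
		return -1;

	/* Locate the section-name string table. */
	strhdr = img + shoff + (size_t)shstrndx * shentsize;
	str_off = be32(strhdr + 16);	/* sh_offset */
	str_size = be32(strhdr + 20);	/* sh_size */
	if (str_off > len || str_size > len - str_off)
		return -1;

	for (i = 0; i < shnum; i++) {
		const unsigned char *sh = img + shoff + (size_t)i * shentsize;
		uint32_t name = be32(sh);	/* sh_name */

		/* Compare including the terminating NUL, staying inside the table. */
		if (name < str_size &&
		    str_size - name >= sizeof(".eh_frame") &&
		    memcmp(img + str_off + name, ".eh_frame",
			   sizeof(".eh_frame")) == 0)
			return (int64_t)be32(sh + 20);	/* sh_size */
	}
	return 0;
}
```

Dividing the returned value by the length of the file gives the share of a module taken up by unwind tables, which can be compared before and after a change of compiler flags.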



[Bug 206733] i2c i2c-3: i2c-powermac: modalias failure on /uni-n@f8000000/i2c@f8001000/cereal@1c0

2020-03-02 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=206733

--- Comment #1 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 287761
  --> https://bugzilla.kernel.org/attachment.cgi?id=287761&action=edit
kernel .config (5.6-rc4, PowerMac G4 DP)

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 206733] New: i2c i2c-3: i2c-powermac: modalias failure on /uni-n@f8000000/i2c@f8001000/cereal@1c0

2020-03-02 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=206733

Bug ID: 206733
   Summary: i2c i2c-3: i2c-powermac: modalias failure on
/uni-n@f8000000/i2c@f8001000/cereal@1c0
   Product: Platform Specific/Hardware
   Version: 2.5
Kernel Version: 5.6-rc4
  Hardware: PPC-32
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: PPC-32
  Assignee: platform_ppc...@kernel-bugs.osdl.org
  Reporter: erhar...@mailbox.org
Regression: No

Created attachment 287759
  --> https://bugzilla.kernel.org/attachment.cgi?id=287759&action=edit
dmesg (5.6-rc4, PowerMac G4 DP)

The G4 MDD/DP can't quite pick up this device, even though it shows up
earlier in the boot log.

[...]
Mär 02 17:23:45 T600 kernel: i2c-dev: adapter [uni-n 1] registered as minor 3
Mär 02 17:23:45 T600 kernel: i2c i2c-3: adapter [uni-n 1] registered
Mär 02 17:23:45 T600 kernel: PowerMac i2c bus uni-n 1 registered
Mär 02 17:23:45 T600 kernel: i2c i2c-3: i2c-powermac: register
/uni-n@f8000000/i2c@f8001000/cereal@1c0
Mär 02 17:23:45 T600 kernel: i2c i2c-3: i2c-powermac: modalias failure on
/uni-n@f8000000/i2c@f8001000/cereal@1c0
Mär 02 17:23:45 T600 kernel: i2c-dev: adapter [uni-n 0] registered as minor 4
[...]

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

Re: [RFC PATCH v1] powerpc/prom_init: disable XIVE in Secure VM.

2020-03-02 Thread Cédric Le Goater
On 2/29/20 11:51 PM, Ram Pai wrote:
> On Sat, Feb 29, 2020 at 09:27:54AM +0100, Cédric Le Goater wrote:
>> On 2/29/20 8:54 AM, Ram Pai wrote:
>>> XIVE is not correctly enabled for Secure VM in the KVM Hypervisor yet.
>>>
>>> Hence Secure VM, must always default to XICS interrupt controller.
>>
>> have you tried XIVE emulation 'kernel-irqchip=off' ? 
> 
> yes and it hangs. I think that option, continues to enable some variant
> of XIVE in the VM. 

HW is not involved, KVM is not involved anymore and all is emulated at 
the QEMU level in user space. What is the issue ? 

> There are some known deficiencies between KVM
> and the ultravisor negotiation, resulting in a hang in the SVM.

That is something else to investigate. feature/capability negotiation
is the core of the hypervisor stack : 

OPAL <-> PowerNV <-> KVM <-> QEMU <-> guest OS

>>> If XIVE is requested through kernel command line option "xive=on",
>>> override and turn it off.
>>
>> This is incorrect. It is negotiated through CAS depending on the FW
>> capabilities and the KVM capabilities.
> 
> Yes I understand, qemu/KVM have predetermined a set of capabilities that
> it can offer to the VM.  The kernel within the VM has a list of
> capabilities it needs to operate correctly.  So both negotiate and
> determine something mutually amicable.
> 
> Here I am talking about the list of capabilities that the kernel is
> trying to determine, it needs to operate correctly.  "xive=on" is one of
> those capabilities the kernel is told by the VM-administrator, to enable.

XIVE is not a kernel capability. It's platform support and the default
for P9 is the native exploitation mode which makes full use of the P9
interrupt controller. For non XIVE aware kernels, the hypervisor emulates
the legacy interface on top of XIVE. 

"xive=off" was introduced for distro testing. It skips the negotiation 
process of the XIVE native exploitation mode on the guest. But it's not
a negotiation setting. It's a chicken switch.

> Unfortunately if the VM-administrator blindly requests to enable it, the
> kernel must override it, if it knows that will be switching the VM into
> a SVM soon. No point negotiating a capability with Qemu; through CAS,
> if it knows it cannot handle that capability.

I don't understand. Are you talking about SVM or XIVE ? 

>>> If XIVE is the only supported platform interrupt controller; specified
>>> through qemu option "ic-mode=xive", simply abort. Otherwise default to
>>> XICS.
>>
>>
>> I don't think it is a good approach to downgrade the guest kernel 
>> capabilities this way. 
>>
>> PAPR has specified the CAS negotiation process for this purpose. It 
>> comes in two parts under KVM. First the KVM hypervisor advertises or 
>> not a capability to QEMU. The second is the CAS negotiation process 
>> between QEMU and the guest OS.
> 
> Unfortunately, this is not viable.  At the time the hypervisor
> advertises its capabilities to qemu, the hypervisor has no idea whether
> that VM will switch into a SVM or not. 

OK, but the hypervisor knows if it can handle 'SVM' guests or not and,
if not, there is no point in advertising a 'SVM' capability to the guest. 

> The decision to switch into a SVM is taken by the kernel running in the VM.
> This happens much later, after the hypervisor has already conveyed its
> capabilities to the qemu, and qemu has then instantiated the VM.

So you don't have negotiation with the hypervisor ? How does the guest
know the hypervisor platform can handle SVMs ? try and see if it fails ?
If so, it seems quite broken to me.
 
> As a result, CAS in prom_init is the only place where this negotiation
> can take place.

Euh. I don't follow. This is indeed where CAS is performed and so it's 
*the* place to check that the hypervisor has 'SVM' support ? 

>> The SVM specifications might not be complete yet and if some features 
>> are incompatible, I think we should modify the capabilities advertised 
>> by the hypervisor : no XIVE in case of SVM. QEMU will automatically 
>> use the fallback path and emulate the XIVE device, same as setting 
>> 'kernel-irqchip=off'. 
> 
> As mentioned above, this would be an excellent approach, if the
> Hypervisor was aware of the VM's intent to switch into a SVM. Neither
> the hypervisor knows, nor the qemu.  Only the kernel running within the
> VM knows about it.


The hypervisor (KVM/QEMU) never knows what the guest OS capabilities
or its intents are. That is why there is a negotiation process. 

I would do :

 * OPAL FW advertises 'SVM' support to the Linux PowerNV (through DT) 
 * KVM advertises 'SVM' support to QEMU (extend KVM ioctls)
 * QEMU advertises 'SVM' support to guest OS (through CAS or DT) 
 * Guest OS should not try to use SVM if it is not supported. 
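
The chain proposed in the list above can be pictured with a toy model in which each layer passes up only what it was offered from below and also supports itself. This is purely an illustration of the idea: `caps_sketch`, `advertise()` and `guest_may_use_svm()` are invented names and do not correspond to any OPAL, KVM or QEMU interface.

```c
#include <stdbool.h>

/* Hypothetical capability set; not a real OPAL, KVM or QEMU structure. */
struct caps_sketch {
	bool svm;
	bool xive;
};

/* A layer can only pass up what it supports itself and what it was offered. */
static struct caps_sketch advertise(struct caps_sketch offered_from_below,
				    struct caps_sketch own)
{
	struct caps_sketch up = {
		.svm  = offered_from_below.svm && own.svm,
		.xive = offered_from_below.xive && own.xive,
	};
	return up;
}

/* The guest acts only on what reached it through the whole chain. */
static bool guest_may_use_svm(struct caps_sketch opal, struct caps_sketch kvm,
			      struct caps_sketch qemu)
{
	return advertise(advertise(opal, kvm), qemu).svm;
}
```

In this model a guest never has to "try and see": if any layer withholds the capability, the guest is told so before it attempts the switch.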

If the passthrough of HW pages is not supported by Ultravisor, KVM 
should not advertise XIVE to QEMU, which would then use fallback mode.

If emulated XIVE or XICS is not supported by SVM guests, then we have
a problem and we need to 

Re: [PATCH] selftests: powerpc: Add tlbie_test in .gitignore

2020-03-02 Thread Sasha Levin
Hi

[This is an automated email]

This commit has been processed because it contains a "Fixes:" tag
fixing commit: 93cad5f78995 ("selftests/powerpc: Add test case for tlbie vs mtpidr ordering issue").

The bot has tested the following trees: v5.5.6, v5.4.22, v4.19.106, v4.14.171.

v5.5.6: Failed to apply! Possible dependencies:
    5eb7cfb3a2b1 ("selftests/powerpc: Add a test of bad (out-of-range) accesses")

v5.4.22: Failed to apply! Possible dependencies:
    5eb7cfb3a2b1 ("selftests/powerpc: Add a test of bad (out-of-range) accesses")

v4.19.106: Failed to apply! Possible dependencies:
    16391bfc8623 ("selftests/powerpc: Add test of fork with mapping above 512TB")
    5eb7cfb3a2b1 ("selftests/powerpc: Add a test of bad (out-of-range) accesses")
    7b570361f6f6 ("selftests/powerpc: Add missing newline at end of file")
    b7683fc66eba ("selftests/powerpc: Add a test of wild bctr")

v4.14.171: Failed to apply! Possible dependencies:
    16391bfc8623 ("selftests/powerpc: Add test of fork with mapping above 512TB")
    5eb7cfb3a2b1 ("selftests/powerpc: Add a test of bad (out-of-range) accesses")
    6ed361586b32 ("selftests/powerpc: Add a test of SEGV error behaviour")
    7b570361f6f6 ("selftests/powerpc: Add missing newline at end of file")
    b7683fc66eba ("selftests/powerpc: Add a test of wild bctr")


NOTE: The patch will not be queued to stable trees until it is upstream.

How should we proceed with this patch?

-- 
Thanks
Sasha


[Bug 199471] [Bisected][Regression] windfarm_pm* no longer gets automatically loaded when CONFIG_I2C_POWERMAC=y is set

2020-03-02 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=199471

--- Comment #18 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 287757
  --> https://bugzilla.kernel.org/attachment.cgi?id=287757&action=edit
dmesg (kernel 5.6-rc4 + patch, PowerMac G5 11,2)

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 199471] [Bisected][Regression] windfarm_pm* no longer gets automatically loaded when CONFIG_I2C_POWERMAC=y is set

2020-03-02 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=199471

--- Comment #17 from Erhard F. (erhar...@mailbox.org) ---
(In reply to Wolfram Sang from comment #16)
> Created attachment 287755 [details]
> proof-of-concept patch for testing
> 
> Here is the promised patch. I converted all I2C MODULE tables. pm72 didn't
> have one, so we will see what pulls it in.
> 
> A test with a machine needing the lm75 driver would be great. Because some
> code change was needed there.
Excellent! Applied your patch on 5.6-rc4 and it just works fine on my G5 11,2!
I can leave CONFIG_WINDFARM=m and the correct modules get pulled in just as it
was before kernel 4.17.

I can't test on the G5 7,3 from my original bug report 'cause I sold this one.
But from my understanding this "lm75" sensor is used in pretty much any
windfarm_pm* module?
 # grep -i lm75 drivers/macintosh/windfarm_pm*.c
drivers/macintosh/windfarm_pm112.c: request_module("windfarm_lm75_sensor");
drivers/macintosh/windfarm_pm121.c: request_module("windfarm_lm75_sensor");
drivers/macintosh/windfarm_pm72.c:  request_module("windfarm_lm75_sensor");
drivers/macintosh/windfarm_pm81.c:  request_module("windfarm_lm75_sensor");
drivers/macintosh/windfarm_pm91.c:  request_module("windfarm_lm75_sensor");

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[powerpc:fixes-test] BUILD SUCCESS cb0cc635c7a9fa8a3a0f75d4d896721819c63add

2020-03-02 Thread kbuild test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git  fixes-test
branch HEAD: cb0cc635c7a9fa8a3a0f75d4d896721819c63add  powerpc: Include .BTF section

elapsed time: 4704m

configs tested: 350
configs skipped: 154

The following configs have been built successfully.
More configs may be tested in the coming days.

arm         allmodconfig
arm         allnoconfig
arm         allyesconfig
arm64       allmodconfig
arm64       allnoconfig
arm64       allyesconfig
arm         at91_dt_defconfig
arm         efm32_defconfig
arm         exynos_defconfig
arm         multi_v5_defconfig
arm         multi_v7_defconfig
arm         shmobile_defconfig
arm         sunxi_defconfig
arm64       defconfig
sparc       allyesconfig
mips        fuloong2e_defconfig
i386        allyesconfig
ia64        defconfig
powerpc     defconfig
i386        allnoconfig
powerpc     ppc64_defconfig
openrisc    or1ksim_defconfig
xtensa      common_defconfig
ia64        alldefconfig
h8300       edosk2674_defconfig
s390        allnoconfig
sparc       defconfig
nds32       allnoconfig
s390        alldefconfig
s390        zfcpdump_defconfig
mips        allnoconfig
mips        allmodconfig
s390        allmodconfig
arc         allyesconfig
mips        malta_kvm_defconfig
m68k        sun3_defconfig
xtensa      iss_defconfig
i386        alldefconfig
ia64        allnoconfig
h8300       h8s-sim_defconfig
m68k        m5475evb_defconfig
nios2       3c120_defconfig
m68k        multi_defconfig
powerpc     allnoconfig
sparc64     allnoconfig
openrisc    simple_smp_defconfig
sh          allnoconfig
s390        defconfig
alpha       defconfig
parisc      allnoconfig
i386        defconfig
ia64        allmodconfig
ia64        allyesconfig
c6x         allyesconfig
c6x         evmc6678_defconfig
nios2       10m50_defconfig
h8300       h8300h-sim_defconfig
m68k        allmodconfig
arc         defconfig
microblaze  mmu_defconfig
microblaze  nommu_defconfig
powerpc     rhel-kconfig
mips        32r2_defconfig
mips        64r6el_defconfig
mips        allyesconfig
parisc      allyesconfig
parisc      generic-32bit_defconfig
parisc      generic-64bit_defconfig
x86_64      randconfig-a001-20200228
x86_64      randconfig-a002-20200228
x86_64      randconfig-a003-20200228
i386        randconfig-a001-20200228
i386        randconfig-a002-20200228
i386        randconfig-a003-20200228
x86_64      randconfig-a001-20200229
x86_64      randconfig-a002-20200229
x86_64      randconfig-a003-20200229
i386        randconfig-a001-20200229
i386        randconfig-a002-20200229
i386        randconfig-a003-20200229
x86_64      randconfig-a001-20200301
x86_64      randconfig-a002-20200301
x86_64      randconfig-a003-20200301
i386        randconfig-a001-20200301
i386        randconfig-a002-20200301
i386        randconfig-a003-20200301
x86_64      randconfig-a001-20200302
x86_64      randconfig-a002-20200302
x86_64      randconfig-a003-20200302
i386        randconfig-a001-20200302
i386        randconfig-a002-20200302
i386        randconfig-a003-20200302
alpha       randconfig-a001-20200228
m68k        randconfig-a001-20200228
mips        randconfig-a001-20200228
nds32       randconfig-a001-20200228
parisc      randconfig-a001-20200228
riscv       randconfig-a001-20200228
alpha       randconfig-a001-20200302
parisc      randconfig-a001-20200302
alpha       randconfig-a001-20200229
m68k

Re: [PATCH] ima: add a new CONFIG for loading arch-specific policies

2020-03-02 Thread Mimi Zohar
On Mon, 2020-03-02 at 15:52 +0100, Ard Biesheuvel wrote:
> On Mon, 2 Mar 2020 at 15:48, Mimi Zohar  wrote:
> >
> > On Wed, 2020-02-26 at 14:10 -0500, Nayna Jain wrote:
> > > Every time a new architecture defines the IMA architecture specific
> > > functions - arch_ima_get_secureboot() and arch_ima_get_policy(), the IMA
> > > include file needs to be updated. To avoid this "noise", this patch
> > > defines a new IMA Kconfig IMA_SECURE_AND_OR_TRUSTED_BOOT option, allowing
> > > the different architectures to select it.
> > >
> > > Suggested-by: Linus Torvalds 
> > > Signed-off-by: Nayna Jain 
> > > Cc: Ard Biesheuvel 
> > > Cc: Martin Schwidefsky 
> > > Cc: Philipp Rudo 
> > > Cc: Michael Ellerman 
> > > ---
> > >  arch/powerpc/Kconfig   | 2 +-
> > >  arch/s390/Kconfig  | 1 +
> > >  arch/x86/Kconfig   | 1 +
> > >  include/linux/ima.h| 3 +--
> > >  security/integrity/ima/Kconfig | 9 +
> > >  5 files changed, 13 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> > > index 497b7d0b2d7e..b8ce1b995633 100644
> > > --- a/arch/powerpc/Kconfig
> > > +++ b/arch/powerpc/Kconfig
> > > @@ -246,6 +246,7 @@ config PPC
> > >   select SYSCTL_EXCEPTION_TRACE
> > >   select THREAD_INFO_IN_TASK
> > >   select VIRT_TO_BUS  if !PPC64
> > > + select IMA_SECURE_AND_OR_TRUSTED_BOOT   if PPC_SECURE_BOOT
> > >   #
> > >   # Please keep this list sorted alphabetically.
> > >   #
> > > @@ -978,7 +979,6 @@ config PPC_SECURE_BOOT
> > >   prompt "Enable secure boot support"
> > >   bool
> > >   depends on PPC_POWERNV
> > > - depends on IMA_ARCH_POLICY
> > >   help
> > > Systems with firmware secure boot enabled need to define security
> > > policies to extend secure boot to the OS. This config allows a 
> > > user
> > > diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> > > index 8abe77536d9d..90ff3633ade6 100644
> > > --- a/arch/s390/Kconfig
> > > +++ b/arch/s390/Kconfig
> > > @@ -195,6 +195,7 @@ config S390
> > >   select ARCH_HAS_FORCE_DMA_UNENCRYPTED
> > >   select SWIOTLB
> > >   select GENERIC_ALLOCATOR
> > > + select IMA_SECURE_AND_OR_TRUSTED_BOOT
> > >
> > >
> > >  config SCHED_OMIT_FRAME_POINTER
> > > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > > index beea77046f9b..cafa66313fe2 100644
> > > --- a/arch/x86/Kconfig
> > > +++ b/arch/x86/Kconfig
> > > @@ -230,6 +230,7 @@ config X86
> > >   select VIRT_TO_BUS
> > >   select X86_FEATURE_NAMESif PROC_FS
> > >   select PROC_PID_ARCH_STATUS if PROC_FS
> > > + select IMA_SECURE_AND_OR_TRUSTED_BOOT   if EFI
> >
> > Not everyone is interested in enabling IMA or requiring IMA runtime
> > policies.  With this patch, enabling IMA_ARCH_POLICY is therefore
> > still left up to the person building the kernel.  As a result, I'm
> > seeing the following warning, which is kind of cool.
> >
> > WARNING: unmet direct dependencies detected for
> > IMA_SECURE_AND_OR_TRUSTED_BOOT
> >   Depends on [n]: INTEGRITY [=y] && IMA [=y] && IMA_ARCH_POLICY [=n]
> >   Selected by [y]:
> >   - X86 [=y] && EFI [=y]
> >
> > Ard, Michael, Martin, just making sure this type of warning is
> > acceptable before upstreaming this patch.  I would appreciate your
> > tags.
> >
> 
> Ehm, no, warnings like these are not really acceptable. It means there
> is an inconsistency in the way the Kconfig dependencies are defined.
> 
> Does this help:
> 
>   select IMA_SECURE_AND_OR_TRUSTED_BOOT   if EFI && IMA_ARCH_POLICY
> 
> ?

Yes, that's fine for x86.  Michael, Martin, do you want something
similar or would you prefer actually selecting IMA_ARCH_POLICY?

Mimi



Re: [RFC 02/11] perf/core: Data structure to present hazard data

2020-03-02 Thread Mark Rutland
On Mon, Mar 02, 2020 at 10:53:46AM +0530, Ravi Bangoria wrote:
> From: Madhavan Srinivasan 
> 
> Introduce new perf sample_type PERF_SAMPLE_PIPELINE_HAZ to request kernel
> to provide cpu pipeline hazard data. Also, introduce arch independent
> structure 'perf_pipeline_haz_data' to pass hazard data to userspace. This
> is generic structure and arch specific data needs to be converted to this
> format.
> 
> Signed-off-by: Madhavan Srinivasan 
> Signed-off-by: Ravi Bangoria 
> ---
>  include/linux/perf_event.h|  7 ++
>  include/uapi/linux/perf_event.h   | 32 ++-
>  kernel/events/core.c  |  6 +
>  tools/include/uapi/linux/perf_event.h | 32 ++-
>  4 files changed, 75 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 547773f5894e..d5b606e3c57d 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -1001,6 +1001,7 @@ struct perf_sample_data {
>   u64 stack_user_size;
>  
>   u64 phys_addr;
> + struct perf_pipeline_haz_data   pipeline_haz;
>  } cacheline_aligned;

I don't think you can add this here, see below.

>  /* default value for data source */
> @@ -1021,6 +1022,12 @@ static inline void perf_sample_data_init(struct 
> perf_sample_data *data,
>   data->weight = 0;
>   data->data_src.val = PERF_MEM_NA;
>   data->txn = 0;
> + data->pipeline_haz.itype = PERF_HAZ__ITYPE_NA;
> + data->pipeline_haz.icache = PERF_HAZ__ICACHE_NA;
> + data->pipeline_haz.hazard_stage = PERF_HAZ__PIPE_STAGE_NA;
> + data->pipeline_haz.hazard_reason = PERF_HAZ__HREASON_NA;
> + data->pipeline_haz.stall_stage = PERF_HAZ__PIPE_STAGE_NA;
> + data->pipeline_haz.stall_reason = PERF_HAZ__SREASON_NA;
>  }
>  
>  extern void perf_output_sample(struct perf_output_handle *handle,
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 377d794d3105..ff252618ca93 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -142,8 +142,9 @@ enum perf_event_sample_format {
>   PERF_SAMPLE_REGS_INTR   = 1U << 18,
>   PERF_SAMPLE_PHYS_ADDR   = 1U << 19,
>   PERF_SAMPLE_AUX = 1U << 20,
> + PERF_SAMPLE_PIPELINE_HAZ= 1U << 21,
>  
> - PERF_SAMPLE_MAX = 1U << 21, /* non-ABI */
> + PERF_SAMPLE_MAX = 1U << 22, /* non-ABI */
>  
>   __PERF_SAMPLE_CALLCHAIN_EARLY   = 1ULL << 63, /* non-ABI; 
> internal use */
>  };
> @@ -870,6 +871,13 @@ enum perf_event_type {
>*  { u64   phys_addr;} && PERF_SAMPLE_PHYS_ADDR
>*  { u64   size;
>*char  data[size]; } && PERF_SAMPLE_AUX
> +  *  { u8itype;
> +  *u8icache;
> +  *u8hazard_stage;
> +  *u8hazard_reason;
> +  *u8stall_stage;
> +  *u8stall_reason;
> +  *u16   pad;} && PERF_SAMPLE_PIPELINE_HAZ
>* };

The existing comment shows the aux data *immediately* after the
phys_addr field, where you've placed struct perf_pipeline_haz_data.

If adding to struct perf_sample_data is fine, this needs to come before
the aux data in this comment. If adding to struct perf_sample_data is
not fine, struct perf_pipeline_haz_data cannot live there.

I suspect the latter is true, but you're getting away with it because
you're not using both PERF_SAMPLE_AUX and PERF_SAMPLE_PIPELINE_HAZ
simultaneously.

Thanks,
Mark.
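
To make the ordering concern above concrete, here is a minimal user-space
sketch, not kernel code. It assumes the optional sample fields are laid out
in the order given by the quoted comment; the helper names are made up for
illustration, and the PIPELINE_HAZ bit is the one proposed in this RFC. The
two helpers compute where a parser would expect the hazard block under the
two possible orderings.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit values copied from the quoted uapi enum; the HAZ bit is the one
 * proposed in this RFC. */
#define SAMPLE_PHYS_ADDR     (1U << 19)
#define SAMPLE_AUX           (1U << 20)
#define SAMPLE_PIPELINE_HAZ  (1U << 21)

/*
 * Hypothetical helper: byte offset of the hazard block, counted from the
 * phys_addr slot, if the record order is phys_addr, aux, hazard.
 * aux_size is the value of the aux "size" field for this sample.
 */
static size_t haz_offset_after_aux(uint32_t sample_type, uint64_t aux_size)
{
	size_t off = 0;

	if (sample_type & SAMPLE_PHYS_ADDR)
		off += sizeof(uint64_t);
	if (sample_type & SAMPLE_AUX)
		off += sizeof(uint64_t) + (size_t)aux_size;
	return off;
}

/* Same, for the alternative order: phys_addr, hazard, aux. */
static size_t haz_offset_before_aux(uint32_t sample_type)
{
	size_t off = 0;

	if (sample_type & SAMPLE_PHYS_ADDR)
		off += sizeof(uint64_t);
	return off;
}
```

With only PHYS_ADDR and PIPELINE_HAZ set, both helpers return 8, so the two
layouts cannot be told apart. With a 64-byte AUX payload requested as well,
the first returns 80 and the second still returns 8, which is the case where
the documented order and the real order have to agree.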

>*/
>   PERF_RECORD_SAMPLE  = 9,
> @@ -1185,4 +1193,26 @@ struct perf_branch_entry {
>   reserved:40;
>  };
>  
> +struct perf_pipeline_haz_data {
> + /* Instruction/Opcode type: Load, Store, Branch  */
> + __u8itype;
> + /* Instruction Cache source */
> + __u8icache;
> + /* Instruction suffered hazard in pipeline stage */
> + __u8hazard_stage;
> + /* Hazard reason */
> + __u8hazard_reason;
> + /* Instruction suffered stall in pipeline stage */
> + __u8stall_stage;
> + /* Stall reason */
> + __u8stall_reason;
> + __u16   pad;
> +};
> +
> +#define PERF_HAZ__ITYPE_NA   0x0
> +#define PERF_HAZ__ICACHE_NA  0x0
> +#define PERF_HAZ__PIPE_STAGE_NA  0x0
> +#define PERF_HAZ__HREASON_NA 0x0
> +#define PERF_HAZ__SREASON_NA 0x0
> +
>  #endif /* _UAPI_LINUX_PERF_EVENT_H */
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index e453589da97c..d00037c77ccf 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -1754,6 +1754,9 @@ static void __perf_event_header_size(struct perf_event 
> *event, u64 sample_type)
>  

Re: [PATCH] ima: add a new CONFIG for loading arch-specific policies

2020-03-02 Thread Ard Biesheuvel
On Mon, 2 Mar 2020 at 15:48, Mimi Zohar  wrote:
>
> On Wed, 2020-02-26 at 14:10 -0500, Nayna Jain wrote:
> > Every time a new architecture defines the IMA architecture specific
> > functions - arch_ima_get_secureboot() and arch_ima_get_policy(), the IMA
> > include file needs to be updated. To avoid this "noise", this patch
> > defines a new IMA Kconfig IMA_SECURE_AND_OR_TRUSTED_BOOT option, allowing
> > the different architectures to select it.
> >
> > Suggested-by: Linus Torvalds 
> > Signed-off-by: Nayna Jain 
> > Cc: Ard Biesheuvel 
> > Cc: Martin Schwidefsky 
> > Cc: Philipp Rudo 
> > Cc: Michael Ellerman 
> > ---
> >  arch/powerpc/Kconfig   | 2 +-
> >  arch/s390/Kconfig  | 1 +
> >  arch/x86/Kconfig   | 1 +
> >  include/linux/ima.h| 3 +--
> >  security/integrity/ima/Kconfig | 9 +
> >  5 files changed, 13 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> > index 497b7d0b2d7e..b8ce1b995633 100644
> > --- a/arch/powerpc/Kconfig
> > +++ b/arch/powerpc/Kconfig
> > @@ -246,6 +246,7 @@ config PPC
> >   select SYSCTL_EXCEPTION_TRACE
> >   select THREAD_INFO_IN_TASK
> >   select VIRT_TO_BUS  if !PPC64
> > + select IMA_SECURE_AND_OR_TRUSTED_BOOT   if PPC_SECURE_BOOT
> >   #
> >   # Please keep this list sorted alphabetically.
> >   #
> > @@ -978,7 +979,6 @@ config PPC_SECURE_BOOT
> >   prompt "Enable secure boot support"
> >   bool
> >   depends on PPC_POWERNV
> > - depends on IMA_ARCH_POLICY
> >   help
> > Systems with firmware secure boot enabled need to define security
> > policies to extend secure boot to the OS. This config allows a user
> > diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> > index 8abe77536d9d..90ff3633ade6 100644
> > --- a/arch/s390/Kconfig
> > +++ b/arch/s390/Kconfig
> > @@ -195,6 +195,7 @@ config S390
> >   select ARCH_HAS_FORCE_DMA_UNENCRYPTED
> >   select SWIOTLB
> >   select GENERIC_ALLOCATOR
> > + select IMA_SECURE_AND_OR_TRUSTED_BOOT
> >
> >
> >  config SCHED_OMIT_FRAME_POINTER
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index beea77046f9b..cafa66313fe2 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -230,6 +230,7 @@ config X86
> >   select VIRT_TO_BUS
> >   select X86_FEATURE_NAMESif PROC_FS
> >   select PROC_PID_ARCH_STATUS if PROC_FS
> > + select IMA_SECURE_AND_OR_TRUSTED_BOOT   if EFI
>
> Not everyone is interested in enabling IMA or requiring IMA runtime
> policies.  With this patch, enabling IMA_ARCH_POLICY is therefore
> still left up to the person building the kernel.  As a result, I'm
> seeing the following warning, which is kind of cool.
>
> WARNING: unmet direct dependencies detected for
> IMA_SECURE_AND_OR_TRUSTED_BOOT
>   Depends on [n]: INTEGRITY [=y] && IMA [=y] && IMA_ARCH_POLICY [=n]
>   Selected by [y]:
>   - X86 [=y] && EFI [=y]
>
> Ard, Michael, Martin, just making sure this type of warning is
> acceptable before upstreaming this patch.  I would appreciate your
> tags.
>

Ehm, no, warnings like these are not really acceptable. It means there
is an inconsistency in the way the Kconfig dependencies are defined.

Does this help:

  select IMA_SECURE_AND_OR_TRUSTED_BOOT   if EFI && IMA_ARCH_POLICY

?


>
> >
> >  config INSTRUCTION_DECODER
> >   def_bool y
> > diff --git a/include/linux/ima.h b/include/linux/ima.h
> > index 1659217e9b60..aefe758f4466 100644
> > --- a/include/linux/ima.h
> > +++ b/include/linux/ima.h
> > @@ -30,8 +30,7 @@ extern void ima_kexec_cmdline(const void *buf, int size);
> >  extern void ima_add_kexec_buffer(struct kimage *image);
> >  #endif
> >
> > -#if (defined(CONFIG_X86) && defined(CONFIG_EFI)) || defined(CONFIG_S390) \
> > - || defined(CONFIG_PPC_SECURE_BOOT)
> > +#ifdef CONFIG_IMA_SECURE_AND_OR_TRUSTED_BOOT
> >  extern bool arch_ima_get_secureboot(void);
> >  extern const char * const *arch_get_ima_policy(void);
> >  #else
> > diff --git a/security/integrity/ima/Kconfig b/security/integrity/ima/Kconfig
> > index 3f3ee4e2eb0d..d17972aa413a 100644
> > --- a/security/integrity/ima/Kconfig
> > +++ b/security/integrity/ima/Kconfig
> > @@ -327,3 +327,12 @@ config IMA_QUEUE_EARLY_BOOT_KEYS
> >   depends on IMA_MEASURE_ASYMMETRIC_KEYS
> >   depends on SYSTEM_TRUSTED_KEYRING
> >   default y
> > +
> > +config IMA_SECURE_AND_OR_TRUSTED_BOOT
> > + bool
> > + depends on IMA
> > + depends on IMA_ARCH_POLICY
> > + default n
> > + help
> > +This option is selected by architectures to enable secure and/or
> > +trusted boot based on IMA runtime policies.
>
>
>
>


Re: [PATCH] ima: add a new CONFIG for loading arch-specific policies

2020-03-02 Thread Mimi Zohar
On Wed, 2020-02-26 at 14:10 -0500, Nayna Jain wrote:
> Every time a new architecture defines the IMA architecture specific
> functions - arch_ima_get_secureboot() and arch_ima_get_policy(), the IMA
> include file needs to be updated. To avoid this "noise", this patch
> defines a new IMA Kconfig IMA_SECURE_AND_OR_TRUSTED_BOOT option, allowing
> the different architectures to select it.
> 
> Suggested-by: Linus Torvalds 
> Signed-off-by: Nayna Jain 
> Cc: Ard Biesheuvel 
> Cc: Martin Schwidefsky 
> Cc: Philipp Rudo 
> Cc: Michael Ellerman 
> ---
>  arch/powerpc/Kconfig   | 2 +-
>  arch/s390/Kconfig  | 1 +
>  arch/x86/Kconfig   | 1 +
>  include/linux/ima.h| 3 +--
>  security/integrity/ima/Kconfig | 9 +
>  5 files changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 497b7d0b2d7e..b8ce1b995633 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -246,6 +246,7 @@ config PPC
>   select SYSCTL_EXCEPTION_TRACE
>   select THREAD_INFO_IN_TASK
>   select VIRT_TO_BUS  if !PPC64
> + select IMA_SECURE_AND_OR_TRUSTED_BOOT   if PPC_SECURE_BOOT
>   #
>   # Please keep this list sorted alphabetically.
>   #
> @@ -978,7 +979,6 @@ config PPC_SECURE_BOOT
>   prompt "Enable secure boot support"
>   bool
>   depends on PPC_POWERNV
> - depends on IMA_ARCH_POLICY
>   help
> Systems with firmware secure boot enabled need to define security
> policies to extend secure boot to the OS. This config allows a user
> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> index 8abe77536d9d..90ff3633ade6 100644
> --- a/arch/s390/Kconfig
> +++ b/arch/s390/Kconfig
> @@ -195,6 +195,7 @@ config S390
>   select ARCH_HAS_FORCE_DMA_UNENCRYPTED
>   select SWIOTLB
>   select GENERIC_ALLOCATOR
> + select IMA_SECURE_AND_OR_TRUSTED_BOOT
>  
>  
>  config SCHED_OMIT_FRAME_POINTER
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index beea77046f9b..cafa66313fe2 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -230,6 +230,7 @@ config X86
>   select VIRT_TO_BUS
>   select X86_FEATURE_NAMESif PROC_FS
>   select PROC_PID_ARCH_STATUS if PROC_FS
> + select IMA_SECURE_AND_OR_TRUSTED_BOOT   if EFI

Not everyone is interested in enabling IMA or requiring IMA runtime
policies.  With this patch, enabling IMA_ARCH_POLICY is therefore
still left up to the person building the kernel.  As a result, I'm
seeing the following warning, which is kind of cool.

WARNING: unmet direct dependencies detected for
IMA_SECURE_AND_OR_TRUSTED_BOOT
  Depends on [n]: INTEGRITY [=y] && IMA [=y] && IMA_ARCH_POLICY [=n]
  Selected by [y]:
  - X86 [=y] && EFI [=y]

Ard, Michael, Martin, just making sure this type of warning is
acceptable before upstreaming this patch.  I would appreciate your
tags.

thanks!

Mimi

>  
>  config INSTRUCTION_DECODER
>   def_bool y
> diff --git a/include/linux/ima.h b/include/linux/ima.h
> index 1659217e9b60..aefe758f4466 100644
> --- a/include/linux/ima.h
> +++ b/include/linux/ima.h
> @@ -30,8 +30,7 @@ extern void ima_kexec_cmdline(const void *buf, int size);
>  extern void ima_add_kexec_buffer(struct kimage *image);
>  #endif
>  
> -#if (defined(CONFIG_X86) && defined(CONFIG_EFI)) || defined(CONFIG_S390) \
> - || defined(CONFIG_PPC_SECURE_BOOT)
> +#ifdef CONFIG_IMA_SECURE_AND_OR_TRUSTED_BOOT
>  extern bool arch_ima_get_secureboot(void);
>  extern const char * const *arch_get_ima_policy(void);
>  #else
> diff --git a/security/integrity/ima/Kconfig b/security/integrity/ima/Kconfig
> index 3f3ee4e2eb0d..d17972aa413a 100644
> --- a/security/integrity/ima/Kconfig
> +++ b/security/integrity/ima/Kconfig
> @@ -327,3 +327,12 @@ config IMA_QUEUE_EARLY_BOOT_KEYS
>   depends on IMA_MEASURE_ASYMMETRIC_KEYS
>   depends on SYSTEM_TRUSTED_KEYRING
>   default y
> +
> +config IMA_SECURE_AND_OR_TRUSTED_BOOT
> + bool
> + depends on IMA
> + depends on IMA_ARCH_POLICY
> + default n
> + help
> +This option is selected by architectures to enable secure and/or
> +trusted boot based on IMA runtime policies.






Re: [RFC 02/11] perf/core: Data structure to present hazard data

2020-03-02 Thread Mark Rutland
On Mon, Mar 02, 2020 at 10:53:46AM +0530, Ravi Bangoria wrote:
> From: Madhavan Srinivasan 
> 
> Introduce new perf sample_type PERF_SAMPLE_PIPELINE_HAZ to request kernel
> to provide cpu pipeline hazard data. Also, introduce arch independent
> structure 'perf_pipeline_haz_data' to pass hazard data to userspace. This
> is generic structure and arch specific data needs to be converted to this
> format.
> 
> Signed-off-by: Madhavan Srinivasan 
> Signed-off-by: Ravi Bangoria 
> ---
>  include/linux/perf_event.h|  7 ++
>  include/uapi/linux/perf_event.h   | 32 ++-
>  kernel/events/core.c  |  6 +
>  tools/include/uapi/linux/perf_event.h | 32 ++-
>  4 files changed, 75 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 547773f5894e..d5b606e3c57d 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -1001,6 +1001,7 @@ struct perf_sample_data {
>   u64 stack_user_size;
>  
>   u64 phys_addr;
> + struct perf_pipeline_haz_data   pipeline_haz;
>  } cacheline_aligned;
>  
>  /* default value for data source */
> @@ -1021,6 +1022,12 @@ static inline void perf_sample_data_init(struct 
> perf_sample_data *data,
>   data->weight = 0;
>   data->data_src.val = PERF_MEM_NA;
>   data->txn = 0;
> + data->pipeline_haz.itype = PERF_HAZ__ITYPE_NA;
> + data->pipeline_haz.icache = PERF_HAZ__ICACHE_NA;
> + data->pipeline_haz.hazard_stage = PERF_HAZ__PIPE_STAGE_NA;
> + data->pipeline_haz.hazard_reason = PERF_HAZ__HREASON_NA;
> + data->pipeline_haz.stall_stage = PERF_HAZ__PIPE_STAGE_NA;
> + data->pipeline_haz.stall_reason = PERF_HAZ__SREASON_NA;
>  }
>  
>  extern void perf_output_sample(struct perf_output_handle *handle,
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 377d794d3105..ff252618ca93 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -142,8 +142,9 @@ enum perf_event_sample_format {
>   PERF_SAMPLE_REGS_INTR   = 1U << 18,
>   PERF_SAMPLE_PHYS_ADDR   = 1U << 19,
>   PERF_SAMPLE_AUX = 1U << 20,
> + PERF_SAMPLE_PIPELINE_HAZ= 1U << 21,

Can we please have perf_event_open() reject this sample flag for PMUs
without the new callback (introduced in the next patch)?

That way it'll be possible to detect whether the PMU exposes this.

Thanks,
Mark.
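
As a rough illustration of the request above, the following user-space
sketch shows the shape such a check could take. It is not the kernel's
code: struct mock_pmu, its get_haz_data callback, and the choice of
-EOPNOTSUPP are all assumptions made for this example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SAMPLE_PIPELINE_HAZ (1U << 21)	/* bit proposed in this RFC */

/* Hypothetical stand-in for a PMU with an optional hazard callback. */
struct mock_pmu {
	void (*get_haz_data)(void *sample);
};

/* Dummy callback so a "supporting" PMU can be constructed. */
static void mock_fill_haz(void *sample)
{
	(void)sample;
}

/*
 * Reject the hazard sample flag when the PMU has no callback to fill it.
 * The error code is an arbitrary choice for this sketch.
 */
static int mock_check_sample_type(const struct mock_pmu *pmu,
				  uint64_t sample_type)
{
	if ((sample_type & SAMPLE_PIPELINE_HAZ) && !pmu->get_haz_data)
		return -EOPNOTSUPP;
	return 0;
}
```

Under these assumptions, userspace could probe for support by trying to
open an event with the flag set and checking for the error.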


[Bug 199471] [Bisected][Regression] windfarm_pm* no longer gets automatically loaded when CONFIG_I2C_POWERMAC=y is set

2020-03-02 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=199471

--- Comment #16 from Wolfram Sang (w...@the-dreams.de) ---
Created attachment 287755
  --> https://bugzilla.kernel.org/attachment.cgi?id=287755=edit
proof-of-concept patch for testing

Here is the promised patch. I converted all I2C MODULE tables. pm72 didn't have
one, so we will see what pulls it in.

A test with a machine needing the lm75 driver would be great, because some code
change was needed there.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

Re: [RFC 02/11] perf/core: Data structure to present hazard data

2020-03-02 Thread maddy




On 3/2/20 3:25 PM, Peter Zijlstra wrote:
> On Mon, Mar 02, 2020 at 10:53:46AM +0530, Ravi Bangoria wrote:
> > From: Madhavan Srinivasan
> >
> > Introduce new perf sample_type PERF_SAMPLE_PIPELINE_HAZ to request kernel
> > to provide cpu pipeline hazard data. Also, introduce arch independent
> > structure 'perf_pipeline_haz_data' to pass hazard data to userspace. This
> > is generic structure and arch specific data needs to be converted to this
> > format.
> >
> > Signed-off-by: Madhavan Srinivasan
> > Signed-off-by: Ravi Bangoria
> > ---
> >  include/linux/perf_event.h|  7 ++
> >  include/uapi/linux/perf_event.h   | 32 ++-
> >  kernel/events/core.c  |  6 +
> >  tools/include/uapi/linux/perf_event.h | 32 ++-
> >  4 files changed, 75 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> > index 547773f5894e..d5b606e3c57d 100644
> > --- a/include/linux/perf_event.h
> > +++ b/include/linux/perf_event.h
> > @@ -1001,6 +1001,7 @@ struct perf_sample_data {
> >   u64 stack_user_size;
> >
> >   u64 phys_addr;
> > + struct perf_pipeline_haz_data   pipeline_haz;
> >  } cacheline_aligned;
> >
> >  /* default value for data source */
> > @@ -1021,6 +1022,12 @@ static inline void perf_sample_data_init(struct
> > perf_sample_data *data,
> >   data->weight = 0;
> >   data->data_src.val = PERF_MEM_NA;
> >   data->txn = 0;
> > + data->pipeline_haz.itype = PERF_HAZ__ITYPE_NA;
> > + data->pipeline_haz.icache = PERF_HAZ__ICACHE_NA;
> > + data->pipeline_haz.hazard_stage = PERF_HAZ__PIPE_STAGE_NA;
> > + data->pipeline_haz.hazard_reason = PERF_HAZ__HREASON_NA;
> > + data->pipeline_haz.stall_stage = PERF_HAZ__PIPE_STAGE_NA;
> > + data->pipeline_haz.stall_reason = PERF_HAZ__SREASON_NA;
> >  }
>
> NAK, Don't touch anything outside of the first cacheline here.


My bad, should have looked at the comment in "struct perf_sample_data {".
Will move it to perf_prepare_sample().

Thanks for comments.
Maddy
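
A small user-space sketch of the idea discussed above: keep the per-sample
init path limited to hot fields and initialise the hazard block only when it
was requested. This is not the kernel's implementation; the mock_* names and
the single hot_field are assumptions made for illustration, and the field
names follow the quoted patch.

```c
#include <stdint.h>
#include <string.h>

#define SAMPLE_PIPELINE_HAZ (1U << 21)	/* bit proposed in this RFC */

/* Mock of the hazard block; field names follow the quoted patch. */
struct mock_haz_data {
	uint8_t itype, icache, hazard_stage, hazard_reason;
	uint8_t stall_stage, stall_reason;
	uint16_t pad;
};

struct mock_sample_data {
	uint64_t hot_field;		/* stands in for first-cacheline fields */
	struct mock_haz_data haz;	/* stands in for the cold tail */
};

/* Fast path: runs for every sample and touches only the hot field. */
static void mock_sample_data_init(struct mock_sample_data *d)
{
	d->hot_field = 0;
}

/* Slower path: initialise the cold block only if it was requested. */
static void mock_prepare_sample(struct mock_sample_data *d,
				uint64_t sample_type)
{
	/* All PERF_HAZ__*_NA values are 0x0 in the quoted patch. */
	if (sample_type & SAMPLE_PIPELINE_HAZ)
		memset(&d->haz, 0, sizeof(d->haz));
}
```

The point of the split is that samples which never asked for hazard data
never write to that part of the structure.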



[Bug 201723] [Bisected][Regression] THERM_WINDTUNNEL not working any longer in kernel 4.19.x (PowerMac G4 MDD)

2020-03-02 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=201723

Wolfram Sang (w...@the-dreams.de) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |CODE_FIX

--- Comment #7 from Wolfram Sang (w...@the-dreams.de) ---
Committed as 38b17afb0ebb ("macintosh: therm_windtunnel: fix regression when
instantiating devices") and available upstream since v5.6-rc4.

Thanks to everyone for helping, especially Erhard, of course!

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 199471] [Bisected][Regression] windfarm_pm* no longer gets automatically loaded when CONFIG_I2C_POWERMAC=y is set

2020-03-02 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=199471

Wolfram Sang (w...@the-dreams.de) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
         Regression|No                          |Yes

--- Comment #15 from Wolfram Sang (w...@the-dreams.de) ---
"I guess so 'cause if I build i2c_powermac as a module and manually modprobe
it, all the relevant windfarm modules get pulled in. But not before."

Maybe there is a module dependency I overlooked so far, but at least there is
no code loading the pm72 module from i2c-powermac.

However, the bisect is very valuable and very likely the commit is the culprit.
I was suspecting something changed the MODINFO, so loading fails, but I missed
this commit, so far. Also, it took me two approaches until I understood all the
behaviour involved. Macintosh drivers are still confusing.

I will cook up a patch to test later today to see if I was right.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

Re: eh_frame confusion

2020-03-02 Thread Segher Boessenkool
On Mon, Mar 02, 2020 at 11:56:05AM +0100, Rasmus Villemoes wrote:
> I'm building a ppc32 kernel, and noticed that after upgrading from gcc-7
> to gcc-8 all object files now end up having an .eh_frame section.

Since GCC 8, we enable -fasynchronous-unwind-tables by default for
PowerPC.  See https://gcc.gnu.org/r259298 .

> For
> vmlinux, that's not a problem, because they all get discarded in
> arch/powerpc/kernel/vmlinux.lds.S . However, they stick around in
> modules, which doesn't seem to be useful - given that everything worked
> just fine with gcc-7, and I don't see anything in the module loader that
> handles .eh_frame.

It is useful for debugging.  Not many people debug the kernel like this,
of course.


Segher


Re: [GIT PULL] Second batch of KVM changes for Linux 5.6-rc4 (or rc5)

2020-03-02 Thread Segher Boessenkool
On Mon, Mar 02, 2020 at 09:51:44PM +1100, Michael Ellerman wrote:
> Linus Torvalds  writes:
> > Michael, what tends to be the triggers for people using
> > PPC_DISABLE_WERROR? Do you have reports for it?
> 
> My memory is that we have had very few reports of it actually causing
> problems. But I don't have hard data to back that up.

I build all archs with GCC trunk.

It always breaks for me, with thousands of errors, which is why I have
for many years carried a 21-line patch to thoroughly disable -Werror for
the powerpc arch.  It takes over a year from when a warning is added
until the kernel takes care of it, and of course I build with the current
development version of the compiler, so I get to see many misfiring
warnings and other fallout as well.  (Currently there are more than 100
warnings; that is way too many to consider attacking as well.)

> It has tripped up the Clang folks, but that's partly because they're
> building clang HEAD, and also because ~zero powerpc kernel developers
> are building regularly with clang. I'm trying to fix the latter ...

Is anyone building regularly with GCC HEAD?  Power or any other arch?

> And then building with GCC head sometimes requires disabling -Werror
> because of some new warning, sometimes valid sometimes not.

Yes.  And never worth breaking the build for.

-Werror is something you use if you do not trust your developers.

Warnings are not errors.  The compiler warns for things that
heuristically look suspicious.  And it errors for things that are wrong.

Some warnings have many false positives, but are so useful (find many
nasty problems, for example) that it is worth enabling them often.
-Werror sabotages that, giving people an extra incentive to disable
useful warnings.

> I think we could mostly avoid those problems by having the option only
> on by default for known compiler versions.

Well, the kernel disables most useful warnings anyway, so that might
even work, sure.

> It'd also be nice if we could do:
> 
>  $ make WERROR=0
> 
> Or something similarly obvious to turn off the WERROR option. That way
> users don't even have to edit their .config manually, they just rerun
> make with WERROR=0 and it works.

That would be nice, yes, that would help my situation as well.


Segher


Re: [PATCH v3 3/5] libnvdimm/namespace: Enforce memremap_compat_align()

2020-03-02 Thread Aneesh Kumar K.V
Dan Williams  writes:

> The pmem driver on PowerPC crashes with the following signature when
> instantiating misaligned namespaces that map their capacity via
> memremap_pages().
>
> BUG: Unable to handle kernel data access at 0xc00100040600
> Faulting instruction address: 0xc0090790
> NIP [c0090790] arch_add_memory+0xc0/0x130
> LR [c0090744] arch_add_memory+0x74/0x130
> Call Trace:
>  arch_add_memory+0x74/0x130 (unreliable)
>  memremap_pages+0x74c/0xa30
>  devm_memremap_pages+0x3c/0xa0
>  pmem_attach_disk+0x188/0x770
>  nvdimm_bus_probe+0xd8/0x470
>
> With the assumption that only memremap_pages() has alignment
> constraints, enforce memremap_compat_align() for
> pmem_should_map_pages(), nd_pfn, and nd_dax cases. This includes
> preventing the creation of namespaces where the base address is
> misaligned and cases where infoblock padding parameters are invalid.
>

Reviewed-by: Aneesh Kumar K.V 

> Reported-by: Aneesh Kumar K.V 
> Cc: Jeff Moyer 
> Fixes: a3619190d62e ("libnvdimm/pfn: stop padding pmem namespaces to section 
> alignment")
> Signed-off-by: Dan Williams 
> ---
>  drivers/nvdimm/namespace_devs.c |   12 
>  drivers/nvdimm/pfn_devs.c   |   26 +++---
>  2 files changed, 35 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c
> index 032dc61725ff..68e89855f779 100644
> --- a/drivers/nvdimm/namespace_devs.c
> +++ b/drivers/nvdimm/namespace_devs.c
> @@ -10,6 +10,7 @@
>  #include 
>  #include "nd-core.h"
>  #include "pmem.h"
> +#include "pfn.h"
>  #include "nd.h"
>  
>  static void namespace_io_release(struct device *dev)
> @@ -1739,6 +1740,17 @@ struct nd_namespace_common 
> *nvdimm_namespace_common_probe(struct device *dev)
>   return ERR_PTR(-ENODEV);
>   }

Maybe add a comment here that both dax/fsdax namespace details are
checked in nd_pfn_validate() so that we look at start_pad and end_trunc
while validating the namespace?

>  
> + if (pmem_should_map_pages(dev)) {
> + struct nd_namespace_io *nsio = to_nd_namespace_io(>dev);
> + struct resource *res = >res;
> +
> + if (!IS_ALIGNED(res->start | (res->end + 1),
> + memremap_compat_align())) {
> + dev_err(>dev, "%pr misaligned, unable to map\n", 
> res);
> + return ERR_PTR(-EOPNOTSUPP);
> + }
> + }
> +
>   if (is_namespace_pmem(>dev)) {
>   struct nd_namespace_pmem *nspm;
>  
> diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
> index 79fe02d6f657..3bdd4b883d05 100644
> --- a/drivers/nvdimm/pfn_devs.c
> +++ b/drivers/nvdimm/pfn_devs.c
> @@ -446,6 +446,7 @@ static bool nd_supported_alignment(unsigned long align)
>  int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig)
>  {
>   u64 checksum, offset;
> + struct resource *res;
>   enum nd_pfn_mode mode;
>   struct nd_namespace_io *nsio;
>   unsigned long align, start_pad;
> @@ -578,13 +579,14 @@ int nd_pfn_validate(struct nd_pfn *nd_pfn, const char 
> *sig)
>* established.
>*/
>   nsio = to_nd_namespace_io(>dev);
> - if (offset >= resource_size(>res)) {
> + res = >res;
> + if (offset >= resource_size(res)) {
>   dev_err(_pfn->dev, "pfn array size exceeds capacity of %s\n",
>   dev_name(>dev));
>   return -EOPNOTSUPP;
>   }
>  
> - if ((align && !IS_ALIGNED(nsio->res.start + offset + start_pad, align))
> + if ((align && !IS_ALIGNED(res->start + offset + start_pad, align))
>   || !IS_ALIGNED(offset, PAGE_SIZE)) {
>   dev_err(_pfn->dev,
>   "bad offset: %#llx dax disabled align: %#lx\n",
> @@ -592,6 +594,18 @@ int nd_pfn_validate(struct nd_pfn *nd_pfn, const char 
> *sig)
>   return -EOPNOTSUPP;
>   }
>  
> + if (!IS_ALIGNED(res->start + le32_to_cpu(pfn_sb->start_pad),
> + memremap_compat_align())) {
> + dev_err(_pfn->dev, "resource start misaligned\n");
> + return -EOPNOTSUPP;
> + }
> +
> + if (!IS_ALIGNED(res->end + 1 - le32_to_cpu(pfn_sb->end_trunc),
> + memremap_compat_align())) {
> + dev_err(_pfn->dev, "resource end misaligned\n");
> + return -EOPNOTSUPP;
> + }
> +
>   return 0;
>  }
>  EXPORT_SYMBOL(nd_pfn_validate);
> @@ -750,7 +764,13 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
>   start = nsio->res.start;
>   size = resource_size(>res);
>   npfns = PHYS_PFN(size - SZ_8K);
> - align = max(nd_pfn->align, SUBSECTION_SIZE);
> + align = max(nd_pfn->align, memremap_compat_align());
> + if (!IS_ALIGNED(start, memremap_compat_align())) {
> + dev_err(_pfn->dev, "%s: start %pa misaligned to %#lx\n",
> + 

Re: [PATCH v3 2/5] libnvdimm/pfn: Prevent raw mode fallback if pfn-infoblock valid

2020-03-02 Thread Aneesh Kumar K.V
Dan Williams  writes:

> The EOPNOTSUPP return code from the pmem driver indicates that the
> namespace has a configuration that may be valid, but the current kernel
> does not support it. Expand this to all of the nd_pfn_validate() error
> conditions after the infoblock has been verified as self consistent.
>
> This prevents exposing the namespace to I/O when the infoblock needs to
> be corrected, or the system needs to be put into a different
> configuration (like changing the page size on PowerPC).
>

Reviewed-by: Aneesh Kumar K.V 

> Cc: Aneesh Kumar K.V 
> Cc: Jeff Moyer 
> Signed-off-by: Dan Williams 
> ---
>  drivers/nvdimm/pfn_devs.c |8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
> index a5c25cb87116..79fe02d6f657 100644
> --- a/drivers/nvdimm/pfn_devs.c
> +++ b/drivers/nvdimm/pfn_devs.c
> @@ -561,14 +561,14 @@ int nd_pfn_validate(struct nd_pfn *nd_pfn, const char 
> *sig)
>   dev_dbg(_pfn->dev, "align: %lx:%lx mode: %d:%d\n",
>   nd_pfn->align, align, nd_pfn->mode,
>   mode);
> - return -EINVAL;
> + return -EOPNOTSUPP;
>   }
>   }
>  
>   if (align > nvdimm_namespace_capacity(ndns)) {
>   dev_err(_pfn->dev, "alignment: %lx exceeds capacity %llx\n",
>   align, nvdimm_namespace_capacity(ndns));
> - return -EINVAL;
> + return -EOPNOTSUPP;
>   }
>  
>   /*
> @@ -581,7 +581,7 @@ int nd_pfn_validate(struct nd_pfn *nd_pfn, const char 
> *sig)
>   if (offset >= resource_size(>res)) {
>   dev_err(_pfn->dev, "pfn array size exceeds capacity of %s\n",
>   dev_name(>dev));
> - return -EBUSY;
> + return -EOPNOTSUPP;
>   }
>  
>   if ((align && !IS_ALIGNED(nsio->res.start + offset + start_pad, align))
> @@ -589,7 +589,7 @@ int nd_pfn_validate(struct nd_pfn *nd_pfn, const char 
> *sig)
>   dev_err(_pfn->dev,
>   "bad offset: %#llx dax disabled align: %#lx\n",
>   offset, align);
> - return -ENXIO;
> + return -EOPNOTSUPP;
>   }
>  
>   return 0;
> ___
> Linux-nvdimm mailing list -- linux-nvd...@lists.01.org
> To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH v3 1/5] mm/memremap_pages: Introduce memremap_compat_align()

2020-03-02 Thread Aneesh Kumar K.V
Dan Williams  writes:

> The "sub-section memory hotplug" facility allows memremap_pages() users
> like libnvdimm to compensate for hardware platforms like x86 that have a
> section size larger than their hardware memory mapping granularity.  The
> compensation that sub-section support affords is being tolerant of
> physical memory resources shifting by units smaller (64MiB on x86) than
> the memory-hotplug section size (128 MiB). Where the platform
> physical-memory mapping granularity is limited by the number and
> capability of address-decode-registers in the memory controller.
>
> While the sub-section support allows memremap_pages() to operate on
> sub-section (2MiB) granularity, the Power architecture may still
> require 16MiB alignment on "!radix_enabled()" platforms.
>
> In order for libnvdimm to be able to detect and manage this per-arch
> limitation, introduce memremap_compat_align() as a common minimum
> alignment across all driver-facing memory-mapping interfaces, and let
> Power override it to 16MiB in the "!radix_enabled()" case.
>
> The assumption / requirement for 16MiB to be a viable
> memremap_compat_align() value is that Power does not have platforms
> where its equivalent of address-decode-registers never hardware remaps a
> persistent memory resource on smaller than 16MiB boundaries. Note that I
> tried my best to not add a new Kconfig symbol, but header include
> entanglements defeated the #ifndef memremap_compat_align design pattern
> and the need to export it defeats the __weak design pattern for arch
> overrides.
>
> Based on an initial patch by Aneesh.
>
Reviewed-by: Aneesh Kumar K.V 

> Link: 
> http://lore.kernel.org/r/capcyv4gbgnp95apyabcsocea50tqj9b5h__83vgngjq3oug...@mail.gmail.com
> Reported-by: Aneesh Kumar K.V 
> Reported-by: Jeff Moyer 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Signed-off-by: Dan Williams 
> ---
>  arch/powerpc/Kconfig  |1 +
>  arch/powerpc/mm/ioremap.c |   21 +
>  drivers/nvdimm/pfn_devs.c |2 +-
>  include/linux/memremap.h  |8 
>  include/linux/mmzone.h|1 +
>  lib/Kconfig   |3 +++
>  mm/memremap.c |   23 +++
>  7 files changed, 58 insertions(+), 1 deletion(-)
>


eh_frame confusion

2020-03-02 Thread Rasmus Villemoes
I'm building a ppc32 kernel, and noticed that after upgrading from gcc-7
to gcc-8 all object files now end up with an .eh_frame section. For
vmlinux, that's not a problem, because they all get discarded in
arch/powerpc/kernel/vmlinux.lds.S. However, they stick around in
modules, which doesn't seem to be useful, given that everything worked
just fine with gcc-7, and I don't see anything in the module loader that
handles .eh_frame.

The reason I care is that my target has a rather tight rootfs budget,
and the .eh_frame sections seem to occupy 10-30% of the file size
(obviously depending heavily on the particular module).

Comparing the .foo.o.cmd files, I don't see any change in options that
might explain this (there's a bunch of new -Wno-*, and the -mspe=no
spelling is apparently no longer supported in gcc-8). Both before and
after, there's

-fno-dwarf2-cfi-asm

about which gcc's documentation says

'-fno-dwarf2-cfi-asm'
 Emit DWARF unwind info as compiler generated '.eh_frame' section
 instead of using GAS '.cfi_*' directives.

Looking into where that comes from got me even more confused, because
both arm and unicore32 say

# Never generate .eh_frame
KBUILD_CFLAGS   += $(call cc-option,-fno-dwarf2-cfi-asm)

while the ppc32 case at hand says

# FIXME: the module load should be taught about the additional relocs
# generated by this.
# revert to pre-gcc-4.4 behaviour of .eh_frame

but prior to gcc-8, .eh_frame didn't seem to get generated anyway.

Can .eh_frame sections be discarded for modules (on ppc32 at least), or
is there some magic that makes them necessary when building with gcc-8?

Rasmus


Re: [GIT PULL] Second batch of KVM changes for Linux 5.6-rc4 (or rc5)

2020-03-02 Thread Michael Ellerman
Linus Torvalds  writes:
> On Sun, Mar 1, 2020 at 1:03 PM Paolo Bonzini  wrote:
>>
>> Paolo Bonzini (4):
>>   KVM: allow disabling -Werror
>
> Honestly, this is just badly done.
>
> You've basically made it enable -Werror only for very random
> configurations - and apparently the one you test.
>
> Doing things like COMPILE_TEST disables it, but so does not having
> EXPERT enabled.
>
> So it looks entirely ad-hoc and makes very little sense. At least the
> "with KASAN, disable this" part makes sense, since that's a known
> source of warnings. But everything else looks very random.
>
> I've merged this, but I wonder why you couldn't just do what I
> suggested originally?
>
> Seriously, if you script your build tests, and don't even look at the
> results, then you might as well use
>
>make KCFLAGS=-Werror
>
> instead of having this kind of completely random option that has
> almost no logic to it at all.
>
> And if you depend entirely on random build infrastructure like the
> 0day bot etc, this likely _is_ going to break when it starts using a
> new gcc version, or when it starts testing using clang, or whatever.
> So then we end up with another odd random situation where now kvm (and
> only kvm) will fail those builds just because they are automated.
>
> Yes, as I said in that original thread, I'd love to do -Werror in
> general, at which point it wouldn't be some random ad-hoc kvm special
> case for some random option. But the "now it causes problems for
> random compiler versions" is a real issue again - but at least it
> wouldn't be a random kernel subsystem that happens to trigger it, it
> would be a _generic_ issue, and we'd have everybody involved when a
> compiler change introduces a new warning.
>
> I've pulled this for now, but I really think it's a horrible hack, and
> it's just done entirely wrong.
>
> Adding the powerpc people, since they have more history with their
> somewhat less hacky one. Except that one automatically gets disabled
> by "make allmodconfig" and friends, which is also kind of pointless.
>
> Michael, what tends to be the triggers for people using
> PPC_DISABLE_WERROR? Do you have reports for it?

My memory is that we have had very few reports of it actually causing
problems. But I don't have hard data to back that up.

It has tripped up the Clang folks, but that's partly because they're
building clang HEAD, and also because ~zero powerpc kernel developers
are building regularly with clang. I'm trying to fix the latter ...


The thing that makes me disable -Werror (enable PPC_DISABLE_WERROR) most
often is bisecting back to before fixes for my current compiler were
merged.

For example with GCC 8 if you go back before ~4.18 you hit the warning
fixed by bee20031772a ("disable -Wattribute-alias warning for
SYSCALL_DEFINEx()").

And then building with GCC head sometimes requires disabling -Werror
because of some new warning, sometimes valid sometimes not.

I think we could mostly avoid those problems by having the option only
on by default for known compiler versions.

eg:

config WERROR
	bool "Build with -Werror"
	default CC_IS_GCC && (GCC_VERSION >= 7 && GCC_VERSION <= 9)

And we could bump the upper version up once each new GCC version has had
any problems ironed out.

> Could we have a _generic_ option that just gets enabled by default,
> except it gets disabled by _known_ issues (like KASAN).

Right now I don't think we could have a generic option that's enabled by
default, there's too many warnings floating around on minor arches and
in odd configurations.

But we could have a generic option that signifies the desire to build
with -Werror where possible, and then each arch/subsystem/etc could use
that config option to enable -Werror in stages.

Then after a release or three we could change the option to globally
enable -Werror and opt-out any areas that are still problematic.

It's also possible to use -Wno-error to turn certain warnings back into
warnings even when -Werror is set, so that's another way we could
incrementally attack the problem.
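
A minimal sketch of that last point, with the caveat that the file name,
the function and the choice of warning are illustrative assumptions and
not taken from any kernel Makefile. The fragment triggers
-Wunused-variable under -Wall. Built with -Wall -Werror alone it should
be rejected; adding -Wno-error=unused-variable should demote just that
one diagnostic back to a warning while every other warning stays fatal.

```c
/*
 * demo.c - hypothetical example, not kernel code.
 *
 * Expected to fail:  gcc -Wall -Werror -c demo.c
 * Expected to build: gcc -Wall -Werror -Wno-error=unused-variable -c demo.c
 */
int scale(int x)
{
	int unused_tmp;	/* never used: -Wunused-variable fires here under -Wall */

	return x * 2;
}
```

The same per-warning form could be applied to a single subsystem's
cflags, so one noisy warning does not force -Werror off for everything.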


It'd also be nice if we could do:

 $ make WERROR=0

Or something similarly obvious to turn off the WERROR option. That way
users don't even have to edit their .config manually, they just rerun
make with WERROR=0 and it works.


> Being disabled for "make allmodconfig" is kind of against one of the
> _points_ of "the build should be warning-free".

True, it was just the conservative choice to disable it for allmod/yes.
We should probably revisit that these days.

cheers


RE: [PATCH net-next 00/23] Clean driver, module and FW versions

2020-03-02 Thread Madalin Bucur (OSS)
> -Original Message-
> From: David Miller 
> Sent: Monday, March 2, 2020 5:02 AM
> To: l...@kernel.org
> Subject: Re: [PATCH net-next 00/23] Clean driver, module and FW versions
> 
> From: Leon Romanovsky 
> Date: Sun,  1 Mar 2020 16:44:33 +0200
> 
> > This is the second batch of the series which removes various static
> > versions in favour of the globally defined Linux kernel version.
> 
> This generally looks fine to me but I'll let it sit for a few days so
> that others can review.

Reviewed drivers/net/ethernet/freescale changes, thank you!

Reviewed-by: Madalin Bucur 


Re: [PATCH v18 00/24] selftests, powerpc, x86: Memory Protection Keys

2020-03-02 Thread Sandipan Das
Hi Shuah,

On 31/01/20 3:21 am, Dave Hansen wrote:
> On 1/29/20 10:36 PM, Sandipan Das wrote:
>> v18:
>>  (1) Fixed issues with x86 multilib builds based on
>>  feedback from Dave.
>>  (2) Moved patch 2 to the end of the series.
> 
> These (finally) build and run successfully for me on an x86 system with
> protection keys.  Feel free to add my Tested-by, and Acked-by.
> 
> FWIW, I don't think they look perfect, but my standards are lower for
> selftests/ than normal kernel code. :)
> 

Any updates on considering this for merging?

- Sandipan



Re: [RFC 00/11] perf: Enhancing perf to export processor hazard information

2020-03-02 Thread Peter Zijlstra
On Mon, Mar 02, 2020 at 10:53:44AM +0530, Ravi Bangoria wrote:
> Modern processors export such hazard data in Performance
> Monitoring Unit (PMU) registers. Ex, 'Sampled Instruction Event
> Register' on IBM PowerPC[1][2] and 'Instruction-Based Sampling' on
> AMD[3] provides similar information.
> 
> Implementation detail:
> 
> A new sample_type called PERF_SAMPLE_PIPELINE_HAZ is introduced.
> If it's set, kernel converts arch specific hazard information
> into generic format:
> 
>   struct perf_pipeline_haz_data {
>  /* Instruction/Opcode type: Load, Store, Branch  */
>  __u8itype;
>  /* Instruction Cache source */
>  __u8icache;
>  /* Instruction suffered hazard in pipeline stage */
>  __u8hazard_stage;
>  /* Hazard reason */
>  __u8hazard_reason;
>  /* Instruction suffered stall in pipeline stage */
>  __u8stall_stage;
>  /* Stall reason */
>  __u8stall_reason;
>  __u16   pad;
>   };

Kim, does this format indeed work for AMD IBS?
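
For anyone checking the layout question, here is a userspace sketch of
the proposed struct. It assumes __u8/__u16 map to uint8_t/uint16_t and
is not the kernel header; haz_data_size() is a hypothetical helper added
only so the size can be inspected. The static assertions merely confirm
that the six one-byte fields plus the 16-bit pad pack into 8 bytes with
no implicit padding, which says nothing about whether the fields are
expressive enough for another PMU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the struct proposed in the RFC (assumed types). */
struct perf_pipeline_haz_data {
	uint8_t  itype;		/* Instruction/opcode type: load, store, branch */
	uint8_t  icache;	/* Instruction cache source */
	uint8_t  hazard_stage;	/* Pipeline stage where the hazard occurred */
	uint8_t  hazard_reason;	/* Hazard reason */
	uint8_t  stall_stage;	/* Pipeline stage where the stall occurred */
	uint8_t  stall_reason;	/* Stall reason */
	uint16_t pad;
};

/* Six u8 fields followed by a naturally aligned u16: 8 bytes, no holes. */
static_assert(sizeof(struct perf_pipeline_haz_data) == 8,
	      "unexpected padding in perf_pipeline_haz_data");
static_assert(offsetof(struct perf_pipeline_haz_data, pad) == 6,
	      "pad is not at offset 6");

/* Hypothetical helper so callers can query the record size at run time. */
size_t haz_data_size(void)
{
	return sizeof(struct perf_pipeline_haz_data);
}
```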


Re: [RFC 02/11] perf/core: Data structure to present hazard data

2020-03-02 Thread Peter Zijlstra
On Mon, Mar 02, 2020 at 10:53:46AM +0530, Ravi Bangoria wrote:
> From: Madhavan Srinivasan 
> 
> Introduce new perf sample_type PERF_SAMPLE_PIPELINE_HAZ to request kernel
> to provide cpu pipeline hazard data. Also, introduce arch independent
> structure 'perf_pipeline_haz_data' to pass hazard data to userspace. This
> is generic structure and arch specific data needs to be converted to this
> format.
> 
> Signed-off-by: Madhavan Srinivasan 
> Signed-off-by: Ravi Bangoria 
> ---
>  include/linux/perf_event.h|  7 ++
>  include/uapi/linux/perf_event.h   | 32 ++-
>  kernel/events/core.c  |  6 +
>  tools/include/uapi/linux/perf_event.h | 32 ++-
>  4 files changed, 75 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 547773f5894e..d5b606e3c57d 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -1001,6 +1001,7 @@ struct perf_sample_data {
>   u64 stack_user_size;
>  
>   u64 phys_addr;
> + struct perf_pipeline_haz_data   pipeline_haz;
>  } cacheline_aligned;
>  
>  /* default value for data source */
> @@ -1021,6 +1022,12 @@ static inline void perf_sample_data_init(struct 
> perf_sample_data *data,
>   data->weight = 0;
>   data->data_src.val = PERF_MEM_NA;
>   data->txn = 0;
> + data->pipeline_haz.itype = PERF_HAZ__ITYPE_NA;
> + data->pipeline_haz.icache = PERF_HAZ__ICACHE_NA;
> + data->pipeline_haz.hazard_stage = PERF_HAZ__PIPE_STAGE_NA;
> + data->pipeline_haz.hazard_reason = PERF_HAZ__HREASON_NA;
> + data->pipeline_haz.stall_stage = PERF_HAZ__PIPE_STAGE_NA;
> + data->pipeline_haz.stall_reason = PERF_HAZ__SREASON_NA;
>  }

NAK, Don't touch anything outside of the first cacheline here.
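
One way to read that constraint, sketched in userspace C under stated
assumptions: the struct below is a cut-down, hypothetical stand-in for
perf_sample_data, the 64-byte cacheline is assumed (the real size is
CPU-specific), and SAMPLE_PIPELINE_HAZ is an invented flag. The idea is
that the unconditional init path writes only fields that fit in the
first cacheline, while optional data such as the hazard record is
filled in later and only when it was requested.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHELINE_BYTES 64		/* assumption: real size is CPU-specific */
#define SAMPLE_PIPELINE_HAZ (1u << 0)	/* hypothetical sample_type bit */

/* Hypothetical, cut-down stand-in for struct perf_sample_data. */
struct sample_data {
	/* Written on every sample by sample_data_init(). */
	uint64_t addr;
	uint64_t period;
	uint64_t weight;
	uint64_t data_src;
	uint64_t txn;
	/* Everything below is written only when it was asked for. */
	uint64_t regs[8];
	uint64_t phys_addr;
	uint8_t  pipeline_haz[8];
};

/* The eagerly initialised fields must all end inside the first cacheline. */
static_assert(offsetof(struct sample_data, txn) + sizeof(uint64_t)
	      <= CACHELINE_BYTES, "hot fields spill past the first cacheline");
/* The hazard data lives beyond it, so the init path must leave it alone. */
static_assert(offsetof(struct sample_data, pipeline_haz) >= CACHELINE_BYTES,
	      "pipeline_haz unexpectedly sits in the first cacheline");

/* Hot path: touches only the first cacheline. */
void sample_data_init(struct sample_data *d, uint64_t addr, uint64_t period)
{
	d->addr = addr;
	d->period = period;
	d->weight = 0;
	d->data_src = 0;
	d->txn = 0;
}

/* Slow path: fill in optional fields only if they were requested. */
void sample_data_prepare(struct sample_data *d, unsigned int sample_type)
{
	if (sample_type & SAMPLE_PIPELINE_HAZ)
		memset(d->pipeline_haz, 0, sizeof(d->pipeline_haz));
}
```

With this split, samples that never ask for hazard data do not dirty a
second cacheline on every event.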


Re: [PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable

2020-03-02 Thread 王文虎
From: Scott Wood
Date: 2020-03-02 16:58:52
To: "王文虎"
Cc: wangwenhu, Kumar Gala, Benjamin Herrenschmidt, Paul Mackerras,
Michael Ellerman, linuxppc-dev@lists.ozlabs.org,
linux-ker...@vger.kernel.org, triv...@kernel.org, Rai Harninder
Subject: Re: [PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable
>On Mon, 2020-03-02 at 12:42 +0800, 王文虎 wrote:
>> From: Scott Wood
>> Date: 2020-03-01 07:12:58
>> To: "王文虎"
>> Cc: wangwenhu, Kumar Gala, Benjamin Herrenschmidt, Paul Mackerras
>> <pau...@samba.org>, Michael Ellerman,
>> linuxppc-dev@lists.ozlabs.org, linux-ker...@vger.kernel.org,
>> triv...@kernel.org, Rai Harninder
>> Subject: Re: Re: [PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable
>> > On Tue, 2020-01-21 at 14:38 +0800, 王文虎 wrote:
>> > > From: Scott Wood
>> > > Date: 2020-01-21 13:49:59
>> > > To: "王文虎"
>> > > Cc: wangwenhu, Kumar Gala <ga...@kernel.crashing.org>,
>> > > Benjamin Herrenschmidt, Paul Mackerras <pau...@samba.org>,
>> > > Michael Ellerman, linuxppc-dev@lists.ozlabs.org,
>> > > linux-ker...@vger.kernel.org, triv...@kernel.org, Rai Harninder
>> > > Subject: Re: [PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable
>> > > > On Tue, 2020-01-21 at 13:20 +0800, 王文虎 wrote:
>> > > > > From: Scott Wood
>> > > > > Date: 2020-01-21 11:25:25
>> > > > > To: wangwenhu, Kumar Gala <ga...@kernel.crashing.org>,
>> > > > > Benjamin Herrenschmidt, Paul Mackerras <pau...@samba.org>,
>> > > > > Michael Ellerman, linuxppc-dev@lists.ozlabs.org,
>> > > > > linux-ker...@vger.kernel.org
>> > > > > Cc: triv...@kernel.org, wenhu.w...@vivo.com, Rai Harninder
>> > > > > <harninder@nxp.com>
>> > > > > Subject: Re: [PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable
>> > > > > > On Mon, 2020-01-20 at 06:43 -0800, wangwenhu wrote:
>> > > > > > > From: wangwenhu 
>> > > > > > > 
>> > > > > > > When generating .config file with menuconfig on Freescale BOOKE
>> > > > > > > SOC, FSL_85XX_CACHE_SRAM is not configurable for the lack of
>> > > > > > > description in the Kconfig field, which makes it impossible
>> > > > > > > to support L2Cache-Sram driver. Add a description to make it
>> > > > > > > configurable.
>> > > > > > > 
>> > > > > > > Signed-off-by: wangwenhu 
>> > > > > > 
>> > > > > > The intent was that drivers using the SRAM API would select the
>> > > > > > symbol.  What
>> > > > > > is the use case for selecting it manually?
>> > > > > > 
>> > > > > 
>> > > > > With a repository of multiple products(meaning different defconfigs)
>> > > > > and
>> > > > > multiple
>> > > > > developers, the Kconfigs of the Kernel Source Tree change
>> > > > > frequently. So
>> > > > > the
>> > > > > "make menuconfig"
>> > > > > process is needed for defconfigs' re-generating or updating for the
>> > > > > complexity of dependencies
>> > > > > between different features defined in the Kconfigs.
>> > > > 
>> > > > That doesn't answer my question of how the SRAM code would be useful
>> > > > other
>> > > > than to some other driver that uses the API (which would use
>> > > > "select").  There
>> > > > is no userspace API.  You could use the kernel command line to
>> > > > configure
>> > > > the
>> > > > SRAM but you need to get the address of it for it to be useful.
>> > > > 
>> > > 
>> > > Like you've asked below, via /dev/mem or by calling it directly within
>> > > the kernel.
>> > > They have not been submitted yet; they are still under development.
>> > 
>> > If they are calling within the kernel, then whatever driver that is should
>> > select FSL_85XX_CACHE_SRAM.  Directly accessing /dev/mem without any way
>> > for
>> > the kernel to advertise where it is or which parts of SRAM are available
>> > for
>> > use sounds like a bad idea.
>> > 
>> 
>> Yes, definitely. So when we enable the module which should select
>> FSL_85XX_CACHE_SRAM to build vmlinux, FSL_85XX_CACHE_SRAM
>> cannot be selected because of the Kconfig definition problem
>> which I am trying to fix now.  So would you please merge the patch
>> for the convenience of later work depending on the driver?
>
>Sorry, I don't think it's something that should be enabled by itself with
>nothing using the allocators.  Suppose we took this patch, and people enabled
>it and accessed it via /dev/mem.  Then suppose a driver is patched to allocate
>some sram and use it.  They'd be stepping on each others' toes undetected.
>
Right, and maybe I did not explain it clearly: we are developing modules
both in the kernel, which call the FSL_85XX_CACHE_SRAM interfaces
directly, and in user space, which is a further consideration building on
the work we have done. Because we have not yet published the code under
development, it looks as though nothing uses FSL_85XX_CACHE_SRAM.

>If you want to expose it to userspace, add code that allocates some or all of
>the sram and make it something userspace can mmap.  Or, if nothing's going to
>use them, remove the allocators and export the entire thing to userspace
>(again via an sram-specific mappable rather than 

Re: [PATCH v3 0/6] implement KASLR for powerpc/fsl_booke/64

2020-03-02 Thread Jason Yan




On 2020/3/2 16:47, Scott Wood wrote:
> On Mon, 2020-03-02 at 15:12 +0800, Jason Yan wrote:
> >
> > On 2020/3/2 11:24, Scott Wood wrote:
> > > On Mon, 2020-03-02 at 10:17 +0800, Jason Yan wrote:
> > > >
> > > > On 2020/3/1 6:54, Scott Wood wrote:
> > > > > On Sat, 2020-02-29 at 15:27 +0800, Jason Yan wrote:
> > > > > >
> > > > > > Turning to %p may not be a good idea in this situation. So
> > > > > > for the REG logs printed when dumping stack, we can disable it when
> > > > > > KASLR is open. For the REG logs in other places like show_regs(),
> > > > > > only privileged can trigger it, and they are not combined with a
> > > > > > symbol, so I think it's ok to keep them.
> > > > > >
> > > > > > diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> > > > > > index fad50db9dcf2..659c51f0739a 100644
> > > > > > --- a/arch/powerpc/kernel/process.c
> > > > > > +++ b/arch/powerpc/kernel/process.c
> > > > > > @@ -2068,7 +2068,10 @@ void show_stack(struct task_struct *tsk, unsigned long *stack)
> > > > > >  newsp = stack[0];
> > > > > >  ip = stack[STACK_FRAME_LR_SAVE];
> > > > > >  if (!firstframe || ip != lr) {
> > > > > > -   printk("["REG"] ["REG"] %pS", sp, ip, (void *)ip);
> > > > > > +   if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
> > > > > > +   printk("%pS", (void *)ip);
> > > > > > +   else
> > > > > > +   printk("["REG"] ["REG"] %pS", sp, ip, (void *)ip);
> > > > >
> > > > > This doesn't deal with "nokaslr" on the kernel command line.  It also
> > > > > doesn't seem like something that every callsite should have to
> > > > > opencode, versus having an appropriate format specifier that behaves
> > > > > as I described above (and I still don't see why that format specifier
> > > > > should not be "%p").
> > > >
> > > > Actually I still do not understand why we should print the raw value
> > > > here. When KALLSYMS is enabled we have symbol name and offset like
> > > > put_cred_rcu+0x108/0x110, and when KALLSYMS is disabled we have the raw
> > > > address.
> > >
> > > I'm more concerned about the stack address for wading through a raw stack
> > > dump (to find function call arguments, etc).  The return address does help
> > > confirm that I'm on the right stack frame though, and also makes looking
> > > up a line number slightly easier than having to look up a symbol address
> > > and then add the offset (at least for non-module addresses).
> > >
> > > As a random aside, the mismatch between Linux printing a hex offset and
> > > GDB using decimal in disassembly is annoying...
> >
> > OK, I will send a RFC patch to add a new format specifier such as "%pk"
> > or change the existing "%pK" to print raw value of addresses when KASLR
> > is disabled and print hash value of addresses when KASLR is enabled.
> > Let's see what the printk guys would say :)
>
> I'm not sure that a new format specifier is needed versus changing the
> behavior of "%p", and "%pK" definitely doesn't seem suitable given that it's
> intended to be more restricted than "%p" (see commit ef0010a30935de4).  The
> question is whether there is a legitimate reason to hash in the absence of
> kaslr.

The problem is that if we change the behavior of "%p", we have to turn
all existing "%p" to "%pK". Hashing is still reasonable when there is no
kaslr because some architectures, such as arm64, support randomization at
build time.

> -Scott





Re: [PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable

2020-03-02 Thread Scott Wood
On Mon, 2020-03-02 at 12:42 +0800, 王文虎 wrote:
> From: Scott Wood
> Date: 2020-03-01 07:12:58
> To: "王文虎"
> Cc: wangwenhu, Kumar Gala, Benjamin Herrenschmidt, Paul Mackerras
> <pau...@samba.org>, Michael Ellerman,
> linuxppc-dev@lists.ozlabs.org, linux-ker...@vger.kernel.org,
> triv...@kernel.org, Rai Harninder
> Subject: Re: Re: [PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable
> > On Tue, 2020-01-21 at 14:38 +0800, 王文虎 wrote:
> > > From: Scott Wood
> > > Date: 2020-01-21 13:49:59
> > > To: "王文虎"
> > > Cc: wangwenhu, Kumar Gala <ga...@kernel.crashing.org>,
> > > Benjamin Herrenschmidt, Paul Mackerras <pau...@samba.org>,
> > > Michael Ellerman, linuxppc-dev@lists.ozlabs.org,
> > > linux-ker...@vger.kernel.org, triv...@kernel.org, Rai Harninder
> > > Subject: Re: [PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable
> > > > On Tue, 2020-01-21 at 13:20 +0800, 王文虎 wrote:
> > > > > From: Scott Wood
> > > > > Date: 2020-01-21 11:25:25
> > > > > To: wangwenhu, Kumar Gala <ga...@kernel.crashing.org>,
> > > > > Benjamin Herrenschmidt, Paul Mackerras <pau...@samba.org>,
> > > > > Michael Ellerman, linuxppc-dev@lists.ozlabs.org,
> > > > > linux-ker...@vger.kernel.org
> > > > > Cc: triv...@kernel.org, wenhu.w...@vivo.com, Rai Harninder
> > > > > <harninder@nxp.com>
> > > > > Subject: Re: [PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable
> > > > > > On Mon, 2020-01-20 at 06:43 -0800, wangwenhu wrote:
> > > > > > > From: wangwenhu 
> > > > > > > 
> > > > > > > When generating .config file with menuconfig on Freescale BOOKE
> > > > > > > SOC, FSL_85XX_CACHE_SRAM is not configurable for the lack of
> > > > > > > description in the Kconfig field, which makes it impossible
> > > > > > > to support L2Cache-Sram driver. Add a description to make it
> > > > > > > configurable.
> > > > > > > 
> > > > > > > Signed-off-by: wangwenhu 
> > > > > > 
> > > > > > The intent was that drivers using the SRAM API would select the
> > > > > > symbol.  What
> > > > > > is the use case for selecting it manually?
> > > > > > 
> > > > > 
> > > > > With a repository of multiple products(meaning different defconfigs)
> > > > > and
> > > > > multiple
> > > > > developers, the Kconfigs of the Kernel Source Tree change
> > > > > frequently. So
> > > > > the
> > > > > "make menuconfig"
> > > > > process is needed for defconfigs' re-generating or updating for the
> > > > > complexity of dependencies
> > > > > between different features defined in the Kconfigs.
> > > > 
> > > > That doesn't answer my question of how the SRAM code would be useful
> > > > other
> > > > than to some other driver that uses the API (which would use
> > > > "select").  There
> > > > is no userspace API.  You could use the kernel command line to
> > > > configure
> > > > the
> > > > SRAM but you need to get the address of it for it to be useful.
> > > > 
> > > 
> > > Like you've asked below, via /dev/mem or by calling it directly within
> > > the kernel.
> > > They have not been submitted yet; they are still under development.
> > 
> > If they are calling within the kernel, then whatever driver that is should
> > select FSL_85XX_CACHE_SRAM.  Directly accessing /dev/mem without any way
> > for
> > the kernel to advertise where it is or which parts of SRAM are available
> > for
> > use sounds like a bad idea.
> > 
> 
> Yes, definitely. So when we enable the module which should select
> FSL_85XX_CACHE_SRAM to build vmlinux, FSL_85XX_CACHE_SRAM
> cannot be selected because of the Kconfig definition problem
> which I am trying to fix now.  So would you please merge the patch
> for the convenience of later work depending on the driver?

Sorry, I don't think it's something that should be enabled by itself with
nothing using the allocators.  Suppose we took this patch, and people enabled
it and accessed it via /dev/mem.  Then suppose a driver is patched to allocate
some sram and use it.  They'd be stepping on each others' toes undetected.

If you want to expose it to userspace, add code that allocates some or all of
the sram and make it something userspace can mmap.  Or, if nothing's going to
use them, remove the allocators and export the entire thing to userspace
(again via an sram-specific mappable rather than /dev/mem).

-Scott




Re: [PATCH v3 0/6] implement KASLR for powerpc/fsl_booke/64

2020-03-02 Thread Scott Wood
On Mon, 2020-03-02 at 15:12 +0800, Jason Yan wrote:
> 
> On 2020/3/2 11:24, Scott Wood wrote:
> > On Mon, 2020-03-02 at 10:17 +0800, Jason Yan wrote:
> > > 
> > > On 2020/3/1 6:54, Scott Wood wrote:
> > > > On Sat, 2020-02-29 at 15:27 +0800, Jason Yan wrote:
> > > > > 
> > > > > Turning to %p may not be a good idea in this situation. So
> > > > > for the REG logs printed when dumping stack, we can disable it when
> > > > > KASLR is open. For the REG logs in other places like show_regs(),
> > > > > only privileged can trigger it, and they are not combined with a
> > > > > symbol, so I think it's ok to keep them.
> > > > > 
> > > > > diff --git a/arch/powerpc/kernel/process.c
> > > > > b/arch/powerpc/kernel/process.c
> > > > > index fad50db9dcf2..659c51f0739a 100644
> > > > > --- a/arch/powerpc/kernel/process.c
> > > > > +++ b/arch/powerpc/kernel/process.c
> > > > > @@ -2068,7 +2068,10 @@ void show_stack(struct task_struct *tsk,
> > > > > unsigned
> > > > > long *stack)
> > > > >newsp = stack[0];
> > > > >ip = stack[STACK_FRAME_LR_SAVE];
> > > > >if (!firstframe || ip != lr) {
> > > > > -   printk("["REG"] ["REG"] %pS", sp, ip, (void
> > > > > *)ip);
> > > > > +   if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
> > > > > +   printk("%pS", (void *)ip);
> > > > > +   else
> > > > > +   printk("["REG"] ["REG"] %pS", sp,
> > > > > ip,
> > > > > (void *)ip);
> > > > 
> > > > This doesn't deal with "nokaslr" on the kernel command line.  It also
> > > > doesn't
> > > > seem like something that every callsite should have to opencode,
> > > > versus
> > > > having
> > > > an appropriate format specifier behaves as I described above (and I
> > > > still
> > > > don't see why that format specifier should not be "%p").
> > > > 
> > > 
> > > Actually I still do not understand why we should print the raw value
> > > here. When KALLSYMS is enabled we have symbol name  and  offset like
> > > put_cred_rcu+0x108/0x110, and when KALLSYMS is disabled we have the raw
> > > address.
> > 
> > I'm more concerned about the stack address for wading through a raw stack
> > dump
> > (to find function call arguments, etc).  The return address does help
> > confirm
> > that I'm on the right stack frame though, and also makes looking up a line
> > number slightly easier than having to look up a symbol address and then
> > add
> > the offset (at least for non-module addresses).
> > 
> > As a random aside, the mismatch between Linux printing a hex offset and
> > GDB
> > using decimal in disassembly is annoying...
> > 
> 
> OK, I will send a RFC patch to add a new format specifier such as "%pk"
> or change the existing "%pK" to print raw value of addresses when KASLR
> is disabled and print hash value of addresses when KASLR is enabled.
> Let's see what the printk guys would say :)

I'm not sure that a new format specifier is needed versus changing the
behavior of "%p", and "%pK" definitely doesn't seem suitable given that it's
intended to be more restricted than "%p" (see commit ef0010a30935de4).  The
question is whether there is a legitimate reason to hash in the absence of
kaslr.
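
To make the behaviour under discussion concrete, here is a userspace
sketch under loud assumptions: format_addr() is a hypothetical helper,
not a printk specifier, and the multiply-and-shift step is a toy stand-in
for whatever keyed hash the kernel actually uses. It prints the raw
address when KASLR is not in effect and a hash otherwise. Note that
kaslr_active is a run-time value, so a "nokaslr" boot could be honoured,
which a compile-time IS_ENABLED(CONFIG_RANDOMIZE_BASE) check cannot do.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper, userspace only: write the raw address when KASLR is
 * not in effect, otherwise write a keyed hash of it. Returns what snprintf()
 * returns.
 */
int format_addr(char *buf, size_t len, uint64_t addr, int kaslr_active,
		uint64_t secret)
{
	uint64_t h;

	if (!kaslr_active)
		return snprintf(buf, len, "%016llx", (unsigned long long)addr);

	/* Toy mixing step standing in for a real keyed hash. */
	h = (addr ^ secret) * 0x9E3779B97F4A7C15ULL;
	return snprintf(buf, len, "%08x", (unsigned int)(h >> 32));
}
```

Centralising the decision in one formatter is the point Scott makes
above: callsites such as show_stack() would not need to open-code it.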

-Scott




Re: [RFC 1/3] mm/vma: Define a default value for VM_DATA_DEFAULT_FLAGS

2020-03-02 Thread Geert Uytterhoeven
On Mon, Mar 2, 2020 at 7:48 AM Anshuman Khandual
 wrote:
> There are many platforms with exact same value for VM_DATA_DEFAULT_FLAGS
> This creates a default value for VM_DATA_DEFAULT_FLAGS in line with the
> existing VM_STACK_DEFAULT_FLAGS. While here, also define some more macros
> with standard VMA access flag combinations that are used frequently across
> many platforms. Apart from simplification, this reduces code duplication
> as well.

> Signed-off-by: Anshuman Khandual 

>  arch/m68k/include/asm/page.h   |  3 ---

For m68k:
Acked-by: Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH 2/2] powerpc/pseries: update device tree before ejecting hotplug uevents

2020-03-02 Thread Hari Bathini



On 11/02/20 8:29 AM, Pingfan Liu wrote:
> A bug is observed on pseries by taking the following steps on rhel:
> -1. drmgr -c mem -r -q 5
> -2. echo c > /proc/sysrq-trigger
> 
> And then, the failure looks like:
> kdump: saving to /sysroot//var/crash/127.0.0.1-2020-01-16-02:06:14/
> kdump: saving vmcore-dmesg.txt
> kdump: saving vmcore-dmesg.txt complete
> kdump: saving vmcore
>  Checking for memory holes : [  0.0 %] /  
>  Checking for memory holes : [100.0 %] |  
>  Excluding unnecessary pages   : [100.0 %] \  
>  Copying data  : [  0.3 %] -  
> eta: 38s[   44.337636] hash-mmu: mm: Hashing failure ! EA=0x7fffba40 access=0x8004 current=makedumpfile
> [   44.337663] hash-mmu: trap=0x300 vsid=0x13a109c ssize=1 base psize=2 psize 2 pte=0xc0005504
> [   44.337677] hash-mmu: mm: Hashing failure ! EA=0x7fffba40 access=0x8004 current=makedumpfile
> [   44.337692] hash-mmu: trap=0x300 vsid=0x13a109c ssize=1 base psize=2 psize 2 pte=0xc0005504
> [   44.337708] makedumpfile[469]: unhandled signal 7 at 7fffba40 nip 7fffbbc4d7fc lr 00011356ca3c code 2
> [   44.338548] Core dump to |/bin/false pipe failed
> /lib/kdump-lib-initramfs.sh: line 98:   469 Bus error   $CORE_COLLECTOR /proc/vmcore $_mp/$KDUMP_PATH/$HOST_IP-$DATEDIR/vmcore-incomplete
> kdump: saving vmcore failed
> 
> * Root cause *
>   After analyzing, it turns out that in the current implementation,
> when hot-removing an lmb, the KOBJ_REMOVE event is emitted before the dt
> is updated, because __remove_memory() is called before drmem_update_dt().
> 
> From the viewpoint of listener and publisher, the publisher notifies the
> listener before the data is ready.  This introduces a problem where udev
> launches kexec-tools (due to KOBJ_REMOVE) and loads a stale dt before
> it is updated. In the capture kernel, makedumpfile then accesses memory
> based on the stale dt info and hits a SIGBUS error due to a non-existent lmb.
> 
> * Fix *
>   In order to fix this issue, update the dt before __remove_memory(), and
> apply the same rule in the hot-add path accordingly.
> 
> This introduces extra dt-update overhead for each lmb involved in a
> hotplug operation. But it should be fine since drmem_update_dt() is a
> memory-based operation and hotplug is not a hot path.
> 
> Signed-off-by: Pingfan Liu 
> Cc: Michael Ellerman 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Hari Bathini 
> To: linuxppc-dev@lists.ozlabs.org
> Cc: ke...@lists.infradead.org

KDump fails to capture vmcore as we end up looking at a stale elfcore hdr
with udev event happening before DT update. Resolved with these patches.
For the series:

Tested-by: Hari Bathini
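
The ordering problem described in the patch can be modelled in a few
lines of userspace C. Everything here is a hypothetical stand-in: the
struct, the listener and both remove functions are invented for
illustration and do not correspond to the pseries hotplug code. The
listener copies the device-tree view at the moment it is notified, the
way a kexec reload triggered by udev would, so it only sees the removal
if the dt was updated before the event fired.

```c
#include <string.h>

#define MAX_LMBS 8

/* Hypothetical model: the published dt view and the listener's copy of it. */
struct hotplug_state {
	int dt_present[MAX_LMBS];	/* what the device tree advertises */
	int snapshot[MAX_LMBS];		/* what the listener captured on the event */
};

/* Listener: on a remove event, capture whatever the dt says right now. */
static void on_remove_event(struct hotplug_state *s)
{
	memcpy(s->snapshot, s->dt_present, sizeof(s->snapshot));
}

/* Old order: notify first, update the dt afterwards (snapshot goes stale). */
void remove_lmb_event_first(struct hotplug_state *s, int lmb)
{
	on_remove_event(s);
	s->dt_present[lmb] = 0;
}

/* Fixed order: update the dt first, then notify. */
void remove_lmb_dt_first(struct hotplug_state *s, int lmb)
{
	s->dt_present[lmb] = 0;
	on_remove_event(s);
}
```

In the first variant the listener's snapshot still lists the removed
lmb; in the second it does not, which matches the behaviour the series
is meant to restore.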