Re: [Patch v3 01/11] arch/powerpc/pci: Fix compiling error for mpc85xx_edac

2016-08-04 Thread Michael Ellerman
York Sun  writes:

> Two symbols are missing if mpc85xx_edac driver is compiled as module.
>
> Signed-off-by: York Sun 
>
> ---
> Change log
>   v3: Change subject tag
>   v2: no change
>
>  arch/powerpc/kernel/pci-common.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/arch/powerpc/kernel/pci-common.c 
> b/arch/powerpc/kernel/pci-common.c
> index 0f7a60f..86bc484 100644
> --- a/arch/powerpc/kernel/pci-common.c
> +++ b/arch/powerpc/kernel/pci-common.c
> @@ -226,6 +226,7 @@ struct pci_controller* pci_find_hose_for_OF_device(struct 
> device_node* node)
>   }
>   return NULL;
>  }
> +EXPORT_SYMBOL(pci_find_hose_for_OF_device);
>  
>  /*
>   * Reads the interrupt pin to determine if interrupt is use by card.
> @@ -1585,6 +1586,7 @@ int early_find_capability(struct pci_controller *hose, 
> int bus, int devfn,
>  {
>   return pci_bus_find_capability(fake_pci_bus(hose, bus), devfn, cap);
>  }
> +EXPORT_SYMBOL(early_find_capability);

Does the driver really need to use these routines? They're meant for use
early in boot, before PCI is set up.

AFAICS this is just a regular driver, so when it's probed the PCI
devices should have already been scanned. In which case pci_get_device()
could work, couldn't it? (I see other edac drivers doing that.)

cheers


Re: [PATCH] crypto: powerpc - CRYPT_CRC32C_VPMSUM should depend on ALTIVEC

2016-08-04 Thread Michael Ellerman
Anton Blanchard  writes:

> Hi Michael,
>
>> The optimised crc32c implementation depends on VMX (aka. Altivec)
>> instructions, so the kernel must be built with Altivec support in
>> order for the crc32c code to build.
>
> Thanks for that, looks good.
>
> Acked-by: Anton Blanchard 

Thanks. Herbert, expecting you'll pick this up.

cheers


Re: mm: Initialise per_cpu_nodestats for all online pgdats at boot

2016-08-04 Thread Paul Mackerras
On Thu, Aug 04, 2016 at 10:24:04AM +0100, Mel Gorman wrote:
> Paul Mackerras and Reza Arbab reported that machines with memoryless nodes
> fail when vmstats are refreshed. Paul reported an oops as follows:
> 
> [1.713998] Unable to handle kernel paging request for data at address 
> 0xff7a1
> [1.714164] Faulting instruction address: 0xc0270cd0
> [1.714304] Oops: Kernel access of bad area, sig: 11 [#1]
> [1.714414] SMP NR_CPUS=2048 NUMA PowerNV
> [1.714530] Modules linked in:
> [1.714647] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.7.0-kvm+ #118
> [1.714786] task: c00ff0680010 task.stack: c00ff0704000
> [1.714926] NIP: c0270cd0 LR: c0270ce8 CTR: 
> 
> [1.715093] REGS: c00ff0707900 TRAP: 0300   Not tainted  (4.7.0-kvm+)
> [1.715232] MSR: 900102009033   CR: 
> 846b6824  XER: 2000
> [1.715748] CFAR: c0008768 DAR: 000ff7a1 DSISR: 4200 
> SOFTE: 1
> GPR00: c0270d08 c00ff0707b80 c11fb200 
> GPR04: 0800   
> GPR08:   000ff7a1 c122aae0
> GPR12: c0a1e440 cfb8 c000c188 
> GPR16:    
> GPR20:    c0cecad0
> GPR24: c0d035b8 c0d6cd18 c0d6cd18 c01fffa86300
> GPR28:  c01fffa96300 c1230034 c122eb18
> [1.717484] NIP [c0270cd0] refresh_zone_stat_thresholds+0x80/0x240
> [1.717568] LR [c0270ce8] refresh_zone_stat_thresholds+0x98/0x240
> [1.717648] Call Trace:
> [1.717687] [c00ff0707b80] [c0270d08] 
> refresh_zone_stat_thresholds+0xb8/0x240 (unreliable)
> 
> Both supplied potential fixes but one potentially misses checks and another
> had redundant initialisations. This version initialises per_cpu_nodestats
> on a per-pgdat basis instead of on a per-zone basis.
> 
> Reported-by: Paul Mackerras 
> Reported-by: Reza Arbab 
> Signed-off-by: Mel Gorman 

That works, thanks.

Tested-by: Paul Mackerras 


Re: [PATCH v13 06/30] powerpc/ptrace: Adapt gpr32_get, gpr32_set functions for transaction

2016-08-04 Thread Daniel Axtens
Michael Ellerman  writes:


>> Is there a nice simple fix we could deploy to squash this warning, or
>> will we just live with it?
>
> This series has been nothing but pain. Given we're already at v13, and people
> really want this support to go in, I'm going to leave it in the tree.
>
> Once it's in we can refactor the implementation, which is a bit of a mess, and
> hopefully in the process fix the warnings.

OK, I'll push this onto my stack to look at again in a couple of months.

Regards,
Daniel





[Patch v3 01/11] arch/powerpc/pci: Fix compiling error for mpc85xx_edac

2016-08-04 Thread York Sun
Two symbols are missing if mpc85xx_edac driver is compiled as module.

Signed-off-by: York Sun 

---
Change log
  v3: Change subject tag
  v2: no change

 arch/powerpc/kernel/pci-common.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 0f7a60f..86bc484 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -226,6 +226,7 @@ struct pci_controller* pci_find_hose_for_OF_device(struct 
device_node* node)
}
return NULL;
 }
+EXPORT_SYMBOL(pci_find_hose_for_OF_device);
 
 /*
  * Reads the interrupt pin to determine if interrupt is use by card.
@@ -1585,6 +1586,7 @@ int early_find_capability(struct pci_controller *hose, 
int bus, int devfn,
 {
return pci_bus_find_capability(fake_pci_bus(hose, bus), devfn, cap);
 }
+EXPORT_SYMBOL(early_find_capability);
 
 struct device_node *pcibios_get_phb_of_node(struct pci_bus *bus)
 {
-- 
2.7.4



Re: [Patch v3 01/11] arch/powerpc/pci: Fix compiling error for mpc85xx_edac

2016-08-04 Thread Andrew Donnellan

On 05/08/16 08:58, York Sun wrote:

Two symbols are missing if mpc85xx_edac driver is compiled as module.

Signed-off-by: York Sun 


Good catch! One comment below.

Reviewed-by: Andrew Donnellan 


 /*
  * Reads the interrupt pin to determine if interrupt is use by card.
@@ -1585,6 +1586,7 @@ int early_find_capability(struct pci_controller *hose, 
int bus, int devfn,
 {
return pci_bus_find_capability(fake_pci_bus(hose, bus), devfn, cap);
 }
+EXPORT_SYMBOL(early_find_capability);


It'd be nicer for this to be renamed as "pci_early_find_capability" or 
something like that with a "namespace", I think.



Andrew

--
Andrew Donnellan  OzLabs, ADL Canberra
andrew.donnel...@au1.ibm.com  IBM Australia Limited



Re: [PATCH V2 2/2] fadump: Register the memory reserved by fadump

2016-08-04 Thread Andrew Morton
On Thu,  4 Aug 2016 22:42:09 +0530 Srikar Dronamraju 
 wrote:

> Fadump kernel reserves large chunks of memory even before the pages are
> initialized. This could mean memory that corresponds to several nodes might
> fall in memblock reserved regions.
> 
> Kernels compiled with CONFIG_DEFERRED_STRUCT_PAGE_INIT will initialize
> only certain size memory per node. The certain size takes into account
> the dentry and inode cache sizes. Currently the cache sizes are
> calculated based on the total system memory including the reserved
> memory. However such a kernel when booting the same kernel as fadump
> kernel will not be able to allocate the required amount of memory to
> suffice for the dentry and inode caches. This results in crashes like
> the below on large systems such as 32 TB systems.
> 
> Dentry cache hash table entries: 536870912 (order: 16, 4294967296 bytes)
> vmalloc: allocation failure, allocated 4097114112 of 17179934720 bytes
> swapper/0: page allocation failure: order:0, mode:0x2080020(GFP_ATOMIC)
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6-master+ #3
> Call Trace:
> [c108fb10] [c07fac88] dump_stack+0xb0/0xf0 (unreliable)
> [c108fb50] [c0235264] warn_alloc_failed+0x114/0x160
> [c108fbf0] [c0281484] __vmalloc_node_range+0x304/0x340
> [c108fca0] [c028152c] __vmalloc+0x6c/0x90
> [c108fd40] [c0aecfb0]
> alloc_large_system_hash+0x1b8/0x2c0
> [c108fe00] [c0af7240] inode_init+0x94/0xe4
> [c108fe80] [c0af6fec] vfs_caches_init+0x8c/0x13c
> [c108ff00] [c0ac4014] start_kernel+0x50c/0x578
> [c108ff90] [c0008c6c] start_here_common+0x20/0xa8
> 
> Register the memory reserved by fadump, so that the cache sizes are
> calculated based on the free memory (i.e Total memory - reserved
> memory).

Looks harmless enough to me.  I'll schedule the patches for 4.8.  But
it sounds like they should be backported into older kernels?



Re: [v5.1] ucc_fast: Fix to avoid IS_ERR_VALUE abuses and dead code on 64bit systems.

2016-08-04 Thread Arnd Bergmann
On Thursday, August 4, 2016 10:22:43 PM CEST Arvind Yadav wrote:
> index df8ea79..ada9070 100644
> --- a/include/soc/fsl/qe/ucc_fast.h
> +++ b/include/soc/fsl/qe/ucc_fast.h
> @@ -165,10 +165,12 @@ struct ucc_fast_private {
> int stopped_tx; /* Whether channel has been stopped for Tx
>(STOP_TX, etc.) */
> int stopped_rx; /* Whether channel has been stopped for Rx */
> -   u32 ucc_fast_tx_virtual_fifo_base_offset;/* pointer to base of Tx
> -   virtual fifo */
> -   u32 ucc_fast_rx_virtual_fifo_base_offset;/* pointer to base of Rx
> -   virtual fifo */
> +   unsigned long ucc_fast_tx_virtual_fifo_base_offset;/* pointer to base 
> of
> +   * Tx virtual fifo
> +   */
> +   unsigned long ucc_fast_rx_virtual_fifo_base_offset;/* pointer to base 
> of
> +   * Rx virtual fifo
> +   */
>  #ifdef STATISTICS
> u32 tx_frames;  /* Transmitted frames counter. */
> u32 rx_frames;  /* Received frames counter (only frames
> 

This change seems ok, but what about the other u32 variables in ucc_geth.c
that get checked for IS_ERR_VALUE?

Arnd




Re: [patch] powerpc/fsl_rio: fix a missing error code

2016-08-04 Thread Andrew Morton
On Thu, 4 Aug 2016 08:35:25 +0300 Dan Carpenter  
wrote:

> We should set the error code here.  Otherwise static checkers complain.
> 

hm.

> --- a/arch/powerpc/sysdev/fsl_rio.c
> +++ b/arch/powerpc/sysdev/fsl_rio.c
> @@ -491,6 +491,7 @@ int fsl_rio_setup(struct platform_device *dev)
>   rmu_node = of_parse_phandle(dev->dev.of_node, "fsl,srio-rmu-handle", 0);
>   if (!rmu_node) {
>   dev_err(&dev->dev, "No valid fsl,srio-rmu-handle property\n");
> + rc = -ENOENT;
>   goto err_rmu;
>   }
>   rc = of_address_to_resource(rmu_node, 0, &rmu_regs);

afaict the function will return 0 in this case, which is a flat out
bug.  But why do static checkers complain?  The code will return a
suitably initialized value?

IOW, please always quote the checker/compiler output when fixing a bug!
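
For readers following along, here is a minimal userspace sketch of the bug shape being discussed, not the driver code itself. The helper find_node() and both setup functions are hypothetical names; the only thing taken from the patch is the idea of assigning -ENOENT before the goto.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for of_parse_phandle(): NULL means "not found". */
static const char *find_node(int present)
{
	return present ? "rmu" : NULL;
}

/* Buggy shape: rc still holds 0 from an earlier step that succeeded. */
int setup_buggy(int node_present)
{
	int rc = 0;

	if (!find_node(node_present))
		goto err;	/* rc is not updated, so the caller sees 0 */
	return 0;
err:
	return rc;
}

/* Fixed shape, as in the patch: set the error code before jumping. */
int setup_fixed(int node_present)
{
	int rc = 0;

	if (!find_node(node_present)) {
		rc = -ENOENT;
		goto err;
	}
	return 0;
err:
	return rc;
}
```

In the buggy variant the caller cannot tell the missing-node case from success, which matches the "will return 0" observation above.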




[PATCH -net] net/ethernet: tundra: fix dump_eth_one warning in tsi108_eth

2016-08-04 Thread Paul Gortmaker
The call site for this function appears as:

  #ifdef DEBUG
data->msg_enable = DEBUG;
dump_eth_one(dev);
  #endif

...leading to the following warning for !DEBUG builds:

drivers/net/ethernet/tundra/tsi108_eth.c:169:13: warning: 'dump_eth_one' 
defined but not used [-Wunused-function]
 static void dump_eth_one(struct net_device *dev)
 ^

...when using the arch/powerpc/configs/mpc7448_hpc2_defconfig

Put the function definition under the same #ifdef as the call site
to avoid the warning.

Cc: "David S. Miller" 
Cc: net...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Paul Gortmaker 
---
 drivers/net/ethernet/tundra/tsi108_eth.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/tundra/tsi108_eth.c 
b/drivers/net/ethernet/tundra/tsi108_eth.c
index 01a77145a0fa..8fd131207ee1 100644
--- a/drivers/net/ethernet/tundra/tsi108_eth.c
+++ b/drivers/net/ethernet/tundra/tsi108_eth.c
@@ -166,6 +166,7 @@ static struct platform_driver tsi_eth_driver = {
 
 static void tsi108_timed_checker(unsigned long dev_ptr);
 
+#ifdef DEBUG
 static void dump_eth_one(struct net_device *dev)
 {
struct tsi108_prv_data *data = netdev_priv(dev);
@@ -190,6 +191,7 @@ static void dump_eth_one(struct net_device *dev)
   TSI_READ(TSI108_EC_RXESTAT),
   TSI_READ(TSI108_EC_RXERR), data->rxpending);
 }
+#endif
 
 /* Synchronization is needed between the thread and up/down events.
  * Note that the PHY is accessed through the same registers for both
-- 
2.8.4
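
A reduced userspace sketch of the pattern this patch applies, assuming nothing about the driver beyond what the commit message says. dump_state() and probe() are hypothetical names; the point is only that the definition and the call site sit under the same #ifdef, so a !DEBUG build has no unused static function to warn about.

```c
#include <stdio.h>

#ifdef DEBUG
/* Only built when DEBUG is defined, matching the guarded call site. */
static void dump_state(int value)
{
	fprintf(stderr, "state: %d\n", value);
}
#endif

int probe(int value)
{
#ifdef DEBUG
	dump_state(value);
#endif
	return value;
}
```

Building this with and without -DDEBUG (and -Wunused-function) should be quiet in both cases.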



Re: [RESEND][PATCH v2 1/2] kexec: refactor code parsing size based on memory range

2016-08-04 Thread Hari Bathini

Hi Dave


Thanks for the review.


On Thursday 04 August 2016 02:56 PM, Dave Young wrote:

Hi Hari,

On 08/04/16 at 01:03am, Hari Bathini wrote:

crashkernel parameter supports different syntaxes to specify the amount
of memory to be reserved for kdump kernel. Below is one of the supported
syntaxes that needs parsing to find the memory size to reserve, based on
memory range:

 crashkernel=<range1>:<size1>[,<range2>:<size2>,...]

While such parsing is implemented for crashkernel parameter, it applies
to other parameters, like fadump_reserve_mem=, which could use similar
syntax. This patch moves crashkernel's parsing code for the above syntax
to the kernel/params.c file for reuse. Two functions is_param_range_based()
and parse_mem_range_size() are added to kernel/params.c file for this
purpose.

Any parameter that uses the above syntax can use is_param_range_based()
function to validate the syntax and parse_mem_range_size() function to
get the parsed memory size. While some code is moved to kernel/params.c
file, there is no change functionality wise in parsing the crashkernel
parameter.
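
To make the range-based syntax concrete, here is a rough userspace sketch of the matching logic, modelled on the loop this patch moves out of parse_crashkernel_mem(). It is not the kernel implementation: parse_size() is a hypothetical stand-in for memparse(), range_size() is a made-up name, and errors are collapsed into a return value of 0. Each range is assumed to have the form start-[end], with a missing end meaning "no upper limit".

```c
#include <limits.h>
#include <stdlib.h>

/* Stand-in for memparse(): a number with an optional K/M/G/T suffix. */
static unsigned long long parse_size(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);

	switch (**end) {
	case 'T': case 't':
		v <<= 10;	/* fall through */
	case 'G': case 'g':
		v <<= 10;	/* fall through */
	case 'M': case 'm':
		v <<= 10;	/* fall through */
	case 'K': case 'k':
		v <<= 10;
		(*end)++;
		break;
	default:
		break;
	}
	return v;
}

/*
 * Return the size of the first "start-[end]:size" entry whose range
 * contains system_ram, or 0 if nothing matches or the string is malformed.
 */
unsigned long long range_size(const char *spec, unsigned long long system_ram)
{
	const char *cur = spec;
	char *tmp;

	do {
		unsigned long long start, end = ULLONG_MAX, size;

		start = parse_size(cur, &tmp);
		if (tmp == cur || *tmp != '-')
			return 0;
		cur = tmp + 1;

		if (*cur != ':') {	/* an explicit end was given */
			end = parse_size(cur, &tmp);
			if (tmp == cur || end <= start)
				return 0;
			cur = tmp;
		}
		if (*cur != ':')
			return 0;
		cur++;

		size = parse_size(cur, &tmp);
		if (tmp == cur || size >= system_ram)
			return 0;
		cur = tmp;

		if (system_ram >= start && system_ram < end)
			return size;
	} while (*cur++ == ',');

	return 0;
}
```

With the string "512M-2G:64M,2G-:128M", a machine with 1G of RAM would match the first entry and one with 4G the second.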

Signed-off-by: Hari Bathini 
---

Changes from v1:
1. Updated changelog

  include/linux/kernel.h |5 +++
  kernel/kexec_core.c|   63 +++-
  kernel/params.c|   96 
  3 files changed, 106 insertions(+), 58 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index d96a611..2df7ba2 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -435,6 +435,11 @@ extern char *get_options(const char *str, int nints, int 
*ints);
  extern unsigned long long memparse(const char *ptr, char **retptr);
  extern bool parse_option_str(const char *str, const char *option);
  
+extern bool __init is_param_range_based(const char *cmdline);

+extern unsigned long long __init parse_mem_range_size(const char *param,
+ char **str,
+ unsigned long long 
system_ram);
+
  extern int core_kernel_text(unsigned long addr);
  extern int core_kernel_data(unsigned long addr);
  extern int __kernel_text_address(unsigned long addr);
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 5616755..3a74024 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1104,59 +1104,9 @@ static int __init parse_crashkernel_mem(char *cmdline,
char *cur = cmdline, *tmp;
  
  	/* for each entry of the comma-separated list */

-   do {
-   unsigned long long start, end = ULLONG_MAX, size;
-
-   /* get the start of the range */
-   start = memparse(cur, &tmp);
-   if (cur == tmp) {
-   pr_warn("crashkernel: Memory value expected\n");
-   return -EINVAL;
-   }
-   cur = tmp;
-   if (*cur != '-') {
-   pr_warn("crashkernel: '-' expected\n");
-   return -EINVAL;
-   }
-   cur++;
-
-   /* if no ':' is here, than we read the end */
-   if (*cur != ':') {
-   end = memparse(cur, &tmp);
-   if (cur == tmp) {
-   pr_warn("crashkernel: Memory value expected\n");
-   return -EINVAL;
-   }
-   cur = tmp;
-   if (end <= start) {
-   pr_warn("crashkernel: end <= start\n");
-   return -EINVAL;
-   }
-   }
-
-   if (*cur != ':') {
-   pr_warn("crashkernel: ':' expected\n");
-   return -EINVAL;
-   }
-   cur++;
-
-   size = memparse(cur, &tmp);
-   if (cur == tmp) {
-   pr_warn("Memory value expected\n");
-   return -EINVAL;
-   }
-   cur = tmp;
-   if (size >= system_ram) {
-   pr_warn("crashkernel: invalid size\n");
-   return -EINVAL;
-   }
-
-   /* match ? */
-   if (system_ram >= start && system_ram < end) {
-   *crash_size = size;
-   break;
-   }
-   } while (*cur++ == ',');
+   *crash_size = parse_mem_range_size("crashkernel", &cur, system_ram);
+   if (cur == cmdline)
+   return -EINVAL;
  
  	if (*crash_size > 0) {

while (*cur && *cur != ' ' && *cur != '@')
@@ -1293,7 +1243,6 @@ static int __init __parse_crashkernel(char *cmdline,
 const char *name,
 const char *suffix)
  {
-   char*first_colon, *first_space;
char*ck_cmdline;
  
  	BUG_ON(!crash_size || !crash_base);

@@ -1311,12 

Re: [RESEND][PATCH v2 2/2] powerpc/fadump: parse fadump reserve memory size based on memory range

2016-08-04 Thread Hari Bathini


On Thursday 04 August 2016 03:15 PM, Michael Ellerman wrote:

Hari Bathini  writes:
...

  /**
   * fadump_calculate_reserve_size(): reserve variable boot area 5% of System 
RAM
   *
@@ -212,12 +262,17 @@ static inline unsigned long 
fadump_calculate_reserve_size(void)
  {
unsigned long size;
  
+	/* sets fw_dump.reserve_bootvar */

+   parse_fadump_reserve_mem();
+
/*
 * Check if the size is specified through fadump_reserve_mem= cmdline
 * option. If yes, then use that.
 */
if (fw_dump.reserve_bootvar)
return fw_dump.reserve_bootvar;
+   else
+   printk(KERN_INFO "fadump: calculating default boot size\n");
  
  	/* divide by 20 to get 5% of value */

size = memblock_end_of_DRAM() / 20;

The code already knows how to reserve 5% based on the size of the machine's
memory, as long as no commandline parameter is passed. So why can't we
just use that logic?


Hi Michael,

That is the default value reserved but not a good enough value for
every case. It is a bit difficult to come up with a robust formula
that works for every case as new kernel changes could make the
values obsolete. But it won't be all that difficult to find values that
work for different memory ranges for a given kernel version.
Passing that as range based input with "fadump_reserve_mem"
parameter would work for every memory configuration on a
given system, which is what this patch is trying to provide.

Thanks
Hari



cheers





Re: [Patch v2 01/10] driver/edac/mpc85xx_edac: Fix compiling error

2016-08-04 Thread Bjorn Helgaas
On Thu, Aug 04, 2016 at 12:01:17PM +0200, Borislav Petkov wrote:
> On Thu, Jul 28, 2016 at 03:30:55PM -0700, York Sun wrote:
> > Two symbols are missing if mpc85xx_edac driver is compiled as module.
> > 
> > Signed-off-by: York Sun 
> > ---
> > Change log
> >   v2: no change
> > 
> >  arch/powerpc/kernel/pci-common.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/arch/powerpc/kernel/pci-common.c 
> > b/arch/powerpc/kernel/pci-common.c
> > index 0f7a60f..86bc484 100644
> > --- a/arch/powerpc/kernel/pci-common.c
> > +++ b/arch/powerpc/kernel/pci-common.c
> > @@ -226,6 +226,7 @@ struct pci_controller* 
> > pci_find_hose_for_OF_device(struct device_node* node)
> > }
> > return NULL;
> >  }
> > +EXPORT_SYMBOL(pci_find_hose_for_OF_device);
> >  
> >  /*
> >   * Reads the interrupt pin to determine if interrupt is use by card.
> > @@ -1585,6 +1586,7 @@ int early_find_capability(struct pci_controller 
> > *hose, int bus, int devfn,
> >  {
> > return pci_bus_find_capability(fake_pci_bus(hose, bus), devfn, cap);
> >  }
> > +EXPORT_SYMBOL(early_find_capability);

arch/microblaze also contains a declaration and implementation of
early_find_capability(), but as far as I can see, this was just copied
from powerpc, and it is never used on microblaze.  So just as a matter
of good code hygiene, please add a patch to remove it from microblaze.

mpc85xx looks like a weird mix of platform driver and PCI device
driver.  If loaded as a module, it shouldn't need
early_find_capability(); regular pci_find_capability() (or just
pci_is_pcie()) should work fine by the time we can load modules.
Maybe it would even work by the time mpc85xx_pci_err_probe() runs when
built-in.

The whole early_find_capability() thing seems a little questionable,
too, but it's really only used in the FSL enumeration path, so maybe
there's something really special about that system.

Bjorn


[PATCH V2 1/2] mm/page_alloc: Replace set_dma_reserve to set_memory_reserve

2016-08-04 Thread Srikar Dronamraju
Expand the scope of the existing dma_reserve to accommodate other memory
reserves too. Accordingly rename variable dma_reserve to
nr_memory_reserve.

set_memory_reserve also takes a new parameter that helps to identify if
the current value needs to be incremented.

Suggested-by: Mel Gorman 
Signed-off-by: Srikar Dronamraju 
---
 arch/x86/kernel/e820.c |  2 +-
 include/linux/mm.h |  2 +-
 mm/page_alloc.c| 20 
 3 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 621b501..d935983 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -1188,6 +1188,6 @@ void __init memblock_find_dma_reserve(void)
nr_free_pages += end_pfn - start_pfn;
}
 
-   set_dma_reserve(nr_pages - nr_free_pages);
+   set_memory_reserve(nr_pages - nr_free_pages, false);
 #endif
 }
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8f468e0..c884ffb 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1886,7 +1886,7 @@ extern int __meminit __early_pfn_to_nid(unsigned long pfn,
struct mminit_pfnnid_cache *state);
 #endif
 
-extern void set_dma_reserve(unsigned long new_dma_reserve);
+extern void set_memory_reserve(unsigned long nr_reserve, bool inc);
 extern void memmap_init_zone(unsigned long, int, unsigned long,
unsigned long, enum memmap_context);
 extern void setup_per_zone_wmarks(void);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c1069ef..a154c2f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -253,7 +253,7 @@ int watermark_scale_factor = 10;
 
 static unsigned long __meminitdata nr_kernel_pages;
 static unsigned long __meminitdata nr_all_pages;
-static unsigned long __meminitdata dma_reserve;
+static unsigned long __meminitdata nr_memory_reserve;
 
 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 static unsigned long __meminitdata arch_zone_lowest_possible_pfn[MAX_NR_ZONES];
@@ -5493,10 +5493,10 @@ static void __paginginit free_area_init_core(struct 
pglist_data *pgdat)
}
 
/* Account for reserved pages */
-   if (j == 0 && freesize > dma_reserve) {
-   freesize -= dma_reserve;
+   if (j == 0 && freesize > nr_memory_reserve) {
+   freesize -= nr_memory_reserve;
printk(KERN_DEBUG "  %s zone: %lu pages reserved\n",
-   zone_names[0], dma_reserve);
+   zone_names[0], nr_memory_reserve);
}
 
if (!is_highmem_idx(j))
@@ -6186,8 +6186,9 @@ void __init mem_init_print_info(const char *str)
 }
 
 /**
- * set_dma_reserve - set the specified number of pages reserved in the first 
zone
- * @new_dma_reserve: The number of pages to mark reserved
+ * set_memory_reserve - set number of pages reserved in the first zone
+ * @nr_reserve: The number of pages to mark reserved
+ * @inc: true increment to existing value; false set new value.
  *
  * The per-cpu batchsize and zone watermarks are determined by managed_pages.
  * In the DMA zone, a significant percentage may be consumed by kernel image
@@ -6196,9 +6197,12 @@ void __init mem_init_print_info(const char *str)
  * first zone (e.g., ZONE_DMA). The effect will be lower watermarks and
  * smaller per-cpu batchsize.
  */
-void __init set_dma_reserve(unsigned long new_dma_reserve)
+void __init set_memory_reserve(unsigned long nr_reserve, bool inc)
 {
-   dma_reserve = new_dma_reserve;
+   if (inc)
+   nr_memory_reserve += nr_reserve;
+   else
+   nr_memory_reserve = nr_reserve;
 }
 
 void __init free_area_init(unsigned long *zones_size)
-- 
1.8.5.6
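
A small userspace sketch of the set-versus-increment behaviour described in this patch. The kernel annotations (__init, __meminitdata) are dropped, and get_memory_reserve() is a hypothetical accessor added only so the value can be inspected; it is not part of the patch.

```c
#include <stdbool.h>

static unsigned long nr_memory_reserve;

/* inc == true adds to the existing value; inc == false overwrites it. */
void set_memory_reserve(unsigned long nr_reserve, bool inc)
{
	if (inc)
		nr_memory_reserve += nr_reserve;
	else
		nr_memory_reserve = nr_reserve;
}

/* Hypothetical accessor, for illustration only. */
unsigned long get_memory_reserve(void)
{
	return nr_memory_reserve;
}
```

This is why the e820 caller passes false (it computes the whole value itself), while a caller that registers one additional reserved region would pass true.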



[PATCH V2 2/2] fadump: Register the memory reserved by fadump

2016-08-04 Thread Srikar Dronamraju
Fadump kernel reserves large chunks of memory even before the pages are
initialized. This could mean memory that corresponds to several nodes might
fall in memblock reserved regions.

Kernels compiled with CONFIG_DEFERRED_STRUCT_PAGE_INIT will initialize
only certain size memory per node. The certain size takes into account
the dentry and inode cache sizes. Currently the cache sizes are
calculated based on the total system memory including the reserved
memory. However such a kernel when booting the same kernel as fadump
kernel will not be able to allocate the required amount of memory to
suffice for the dentry and inode caches. This results in crashes like
the below on large systems such as 32 TB systems.

Dentry cache hash table entries: 536870912 (order: 16, 4294967296 bytes)
vmalloc: allocation failure, allocated 4097114112 of 17179934720 bytes
swapper/0: page allocation failure: order:0, mode:0x2080020(GFP_ATOMIC)
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6-master+ #3
Call Trace:
[c108fb10] [c07fac88] dump_stack+0xb0/0xf0 (unreliable)
[c108fb50] [c0235264] warn_alloc_failed+0x114/0x160
[c108fbf0] [c0281484] __vmalloc_node_range+0x304/0x340
[c108fca0] [c028152c] __vmalloc+0x6c/0x90
[c108fd40] [c0aecfb0]
alloc_large_system_hash+0x1b8/0x2c0
[c108fe00] [c0af7240] inode_init+0x94/0xe4
[c108fe80] [c0af6fec] vfs_caches_init+0x8c/0x13c
[c108ff00] [c0ac4014] start_kernel+0x50c/0x578
[c108ff90] [c0008c6c] start_here_common+0x20/0xa8

Register the memory reserved by fadump, so that the cache sizes are
calculated based on the free memory (i.e Total memory - reserved
memory).

Suggested-by: Mel Gorman 
Signed-off-by: Srikar Dronamraju 
---
 arch/powerpc/kernel/fadump.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index 3cb3b02a..ca5ec88 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -330,6 +330,7 @@ int __init fadump_reserve_mem(void)
}
fw_dump.reserve_dump_area_start = base;
fw_dump.reserve_dump_area_size = size;
+   set_memory_reserve(size/PAGE_SIZE, true);
return 1;
 }
 
-- 
1.8.5.6



Re: [PATCH v3 1/5] powerpc/dts: add mcke-gpios for PM feature

2016-08-04 Thread Rob Herring
On Tue, Aug 02, 2016 at 07:58:44PM +0800, Chenhui Zhao wrote:
> Signed-off-by: Chenhui Zhao 
> ---
>  Documentation/devicetree/bindings/soc/fsl/rcpm.txt | 13 +
>  arch/powerpc/boot/dts/fsl/t1040si-post.dtsi|  3 +++
>  2 files changed, 16 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/soc/fsl/rcpm.txt 
> b/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
> index e284e4e..1d44a80 100644
> --- a/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
> +++ b/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
> @@ -21,6 +21,10 @@ Required properites:
>   * "fsl,qoriq-rcpm-2.0": for chassis 2.0 rcpm
>   * "fsl,qoriq-rcpm-2.1": for chassis 2.1 rcpm
>  
> +Optional properites:
> +  - mcke-gpios : The GPIO pin is used for keeping the MCKE signal of DDR 
> modules
> +  low in the LPM35 state on some platforms, such as T1040.

Needs a vendor prefix.


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-04 Thread Segher Boessenkool
On Thu, Aug 04, 2016 at 06:10:57PM +0200, Arnd Bergmann wrote:
> On Thursday, August 4, 2016 9:47:13 PM CEST Nicholas Piggin wrote:
> 
> > +   __used  \
> > +   __attribute__((section("___kentry" "+" #sym ",\"a\",@note #"), used)) \
> 
> 
> I've just started testing this, but the first problem I ran into
> is that @ and # are special characters that have an architecture
> specific meaning to the assembler. On ARM, you need "%note @" instead
> of "@note #".

That comment trick (I still feel guilty about it) causes more problems
than it solves.  Please don't try to use it :-)


Segher


[v5.1] ucc_fast: Fix to avoid IS_ERR_VALUE abuses and dead code on 64bit systems.

2016-08-04 Thread Arvind Yadav
IS_ERR_VALUE() assumes that its parameter is an unsigned long.
It cannot be used to check an 'unsigned int' that is passed instead
to reflect an error.
On 64-bit architectures sizeof (int) == 4 && sizeof (long) == 8.
IS_ERR_VALUE(x) is ((x) >= (unsigned long)-4095).
IS_ERR_VALUE() of an 'unsigned int' is always false because the 32-bit
value is zero-extended to 64 bits.

The problem in the UCC fast protocol code, drivers/soc/fsl/qe/ucc_fast.c:

/* Allocate memory for Tx Virtual Fifo */
uccf->ucc_fast_tx_virtual_fifo_base_offset =
  qe_muram_alloc(uf_info->utfs, UCC_FAST_VIRT_FIFO_REGS_ALIGNMENT);
if (IS_ERR_VALUE(uccf->ucc_fast_tx_virtual_fifo_base_offset)) {
printk(KERN_ERR "%s: cannot allocate MURAM for TX FIFO\n",
__func__);
uccf->ucc_fast_tx_virtual_fifo_base_offset = 0;
ucc_fast_free(uccf);
return -ENOMEM;
}

/* Allocate memory for Rx Virtual Fifo */
uccf->ucc_fast_rx_virtual_fifo_base_offset =
   qe_muram_alloc(uf_info->urfs +
   UCC_FAST_RECEIVE_VIRTUAL_FIFO_SIZE_FUDGE_FACTOR,
   UCC_FAST_VIRT_FIFO_REGS_ALIGNMENT);
if (IS_ERR_VALUE(uccf->ucc_fast_rx_virtual_fifo_base_offset)) {
printk(KERN_ERR "%s: cannot allocate MURAM for RX FIFO\n",
__func__);
uccf->ucc_fast_rx_virtual_fifo_base_offset = 0;
ucc_fast_free(uccf);
return -ENOMEM;
}

qe_muram_alloc (a.k.a. cpm_muram_alloc) returns unsigned long.
The return value is stored in a u32 (ucc_fast_tx_virtual_fifo_base_offset
and ucc_fast_rx_virtual_fifo_base_offset). If qe_muram_alloc returns
an error, IS_ERR_VALUE will always evaluate to 0, so ucc_fast_free is
not called on failure. The code inside the 'if' is dead code on 64-bit.
This patch avoids this problem on 64-bit machines.

Signed-off-by: Arvind Yadav 
---
 include/soc/fsl/qe/ucc_fast.h | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/include/soc/fsl/qe/ucc_fast.h b/include/soc/fsl/qe/ucc_fast.h
index df8ea79..ada9070 100644
--- a/include/soc/fsl/qe/ucc_fast.h
+++ b/include/soc/fsl/qe/ucc_fast.h
@@ -165,10 +165,12 @@ struct ucc_fast_private {
int stopped_tx; /* Whether channel has been stopped for Tx
   (STOP_TX, etc.) */
int stopped_rx; /* Whether channel has been stopped for Rx */
-   u32 ucc_fast_tx_virtual_fifo_base_offset;/* pointer to base of Tx
-   virtual fifo */
-   u32 ucc_fast_rx_virtual_fifo_base_offset;/* pointer to base of Rx
-   virtual fifo */
+   unsigned long ucc_fast_tx_virtual_fifo_base_offset;/* pointer to base of
+   * Tx virtual fifo
+   */
+   unsigned long ucc_fast_rx_virtual_fifo_base_offset;/* pointer to base of
+   * Rx virtual fifo
+   */
 #ifdef STATISTICS
u32 tx_frames;  /* Transmitted frames counter. */
u32 rx_frames;  /* Received frames counter (only frames
-- 
1.9.1
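
The truncation described in the commit message can be reproduced in userspace. The sketch below uses a simplified IS_ERR_VALUE() based on the form quoted above (the real kernel macro differs in detail), and the two helper functions are hypothetical names standing in for "store the allocator's return value in a u32" versus "store it in an unsigned long".

```c
#include <stdint.h>

#define MAX_ERRNO 4095
/* Simplified form of the check quoted in the commit message. */
#define IS_ERR_VALUE(x) ((unsigned long)(x) >= (unsigned long)-MAX_ERRNO)

/* Store the return value in a 32-bit field first, as the old struct did. */
int detected_with_u32(unsigned long ret)
{
	uint32_t offset = (uint32_t)ret;	/* upper bits are lost here */

	return IS_ERR_VALUE(offset);
}

/* Store it in an unsigned long, as the patched struct does. */
int detected_with_ulong(unsigned long ret)
{
	unsigned long offset = ret;

	return IS_ERR_VALUE(offset);
}
```

Where long is 64 bits wide, passing (unsigned long)-ENOMEM makes the first helper report no error and the second report one; where long is 32 bits wide, both agree.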



Re: [PATCH] cpufreq: powernv: Fix crash in gpstate_timer_handler

2016-08-04 Thread Viresh Kumar
On 04-08-16, 20:59, Akshay Adiga wrote:
> 'commit 09ca4c9b5958 ("cpufreq: powernv: Replacing pstate_id with
> frequency table index")' changes calc_global_pstate() to use
> cpufreq_table index instead of pstate_id.
> 
> But in gpstate_timer_handler() pstate_id was being passed instead
> of cpufreq_table index, which caused the index_to_pstate() to access
> out of bound indices, leading to this crash.
> 
> Adding sanity check for index and pstate, to ensure only valid pstate
> and index values are returned.
> 
> Call Trace:
> [c0078d66b130] [c011d224] __free_irq+0x234/0x360
> (unreliable)
> [c0078d66b1c0] [c011d44c] free_irq+0x6c/0xa0
> [c0078d66b1f0] [c006c4f8] opal_event_shutdown+0x88/0xd0
> [c0078d66b230] [c0067a4c] opal_shutdown+0x1c/0x90
> [c0078d66b260] [c0063a00] pnv_shutdown+0x20/0x40
> [c0078d66b280] [c0021538] machine_restart+0x38/0x90
> [c78d66b310] [c0965ea0] panic+0x284/0x300
> [c0078d66b3a0] [c001f508] die+0x388/0x450
> [c0078d66b430] [c0045a50] bad_page_fault+0xd0/0x140
> [c0078d66b4a0] [c0008964] handle_page_fault+0x2c/0x30
>interrupt: 300 at gpstate_timer_handler+0x150/0x260
> LR = gpstate_timer_handler+0x130/0x260
> [c0078d66b7f0] [c0132b58] call_timer_fn+0x58/0x1c0
> [c0078d66b880] [c0132e20] expire_timers+0x130/0x1d0
> [c0078d66b8f0] [c0133068] run_timer_softirq+0x1a8/0x230
> [c0078d66b980] [c00b535c] __do_softirq+0x18c/0x400
> [c0078d66ba70] [c00b5828] irq_exit+0xc8/0x100
> [c0078d66ba90] [c001e214] timer_interrupt+0xa4/0xe0
> [c0078d66bac0] [c00027d0] decrementer_common+0x150/0x180
>interrupt: 901 at arch_local_irq_restore+0x74/0x90
>   0] [c0106b34] call_cpuidle+0x44/0x90
> [c0078d66be50] [c010708c] cpu_startup_entry+0x38c/0x460
> [c0078d66bf20] [c003d930] start_secondary+0x330/0x380
> [c0078d66bf90] [c0008e6c] start_secondary_prolog+0x10/0x14
> 
> Fixes: 08d27eb ("cpufreq: powernv: Replacing pstate_id with
> frequency table index")
> Reported-by: Madhavan Srinivasan 
> Signed-off-by: Akshay Adiga 
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 21 -
>  1 file changed, 20 insertions(+), 1 deletion(-)

Acked-by: Viresh Kumar 

-- 
viresh
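
Since the patch body is not quoted here, the following is only a guess at what an index/pstate sanity check could look like, written as a userspace sketch. The table contents, NR_PSTATES, and the fallback-to-entry-0 policy are all assumptions made for illustration; the real driver builds its table from firmware data and may handle bad values differently.

```c
#include <stddef.h>

/* Hypothetical table; pstate ids here are non-positive on purpose. */
static const int pstate_ids[] = { 0, -1, -2, -3 };
#define NR_PSTATES (sizeof(pstate_ids) / sizeof(pstate_ids[0]))

/* Index -> pstate id, falling back to the first entry for a bad index. */
int index_to_pstate(unsigned int i)
{
	if (i >= NR_PSTATES)
		return pstate_ids[0];
	return pstate_ids[i];
}

/* pstate id -> index, falling back to index 0 if the id is unknown. */
unsigned int pstate_to_index(int pstate)
{
	size_t i;

	for (i = 0; i < NR_PSTATES; i++)
		if (pstate_ids[i] == pstate)
			return (unsigned int)i;
	return 0;
}
```

Passing a pstate id where an index is expected, which is the mix-up described above, turns a small negative id into a huge unsigned index; the bounds check is what keeps that from reading past the table.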


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-04 Thread Arnd Bergmann
On Thursday, August 4, 2016 9:47:13 PM CEST Nicholas Piggin wrote:

> + __used  \
> + __attribute__((section("___kentry" "+" #sym ",\"a\",@note #"), used)) \


I've just started testing this, but the first problem I ran into
is that @ and # are special characters that have an architecture
specific meaning to the assembler. On ARM, you need "%note @" instead
of "@note #".

Arnd
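
For illustration only, a minimal sketch of how the section string could
be assembled per architecture. KENTRY_NOTE_TYPE, KENTRY_ASM_COMMENT and
KENTRY_SECTION are hypothetical names, not part of the posted patch, and
the sketch only builds the string; it does not place anything in a
section:

```c
/*
 * Hypothetical helpers, not from the posted patch.  On 32-bit ARM the
 * assembler treats '@' as a comment character, so the section type is
 * spelled "%note" and '@' is what comments out the rest of the line.
 */
#if defined(__arm__)
#define KENTRY_NOTE_TYPE	"%note"
#define KENTRY_ASM_COMMENT	"@"
#else
#define KENTRY_NOTE_TYPE	"@note"
#define KENTRY_ASM_COMMENT	"#"
#endif

/* Same shape as the string in the quoted macro, with the two pieces swapped in. */
#define KENTRY_SECTION(sym) \
	"___kentry" "+" #sym ",\"a\"," KENTRY_NOTE_TYPE " " KENTRY_ASM_COMMENT

/* Return the section string for an example symbol so it can be inspected. */
const char *kentry_section_demo(void)
{
	return KENTRY_SECTION(demo_sym);
}
```

Other architectures may need yet another spelling, so a real version
would probably want the two pieces supplied by arch code.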


Re: [PATCH v2] powernv: Simplify searching for compatible device nodes

2016-08-04 Thread Jack Miller
On Thu, Aug 04, 2016 at 06:39:24PM +1000, Michael Ellerman wrote:
> Cyril Bur  writes:
> 
> > On Wed,  3 Aug 2016 12:18:00 -0500
> > Jack Miller  wrote:
> >
> >> (rebased on powerpc/next)
> >> 
> >> This condenses the opal node searching into a single function that finds
> >> all compatible nodes, instead of just searching the ibm,opal children,
> >> for ipmi, flash, and prd similar to how opal-i2c nodes are found.
> >> 
> >> Signed-off-by: Jack Miller 
> >
> > Using a version of the related skiboot patch that may not be the final one:
> > Tested-by: Cyril Bur 
> 
> Thanks. The part I'm still not clear on is *why* we're moving them in
> skiboot?

Ostensibly so the actual flash device nodes can inherit the #size-cells /
#address-cells properties properly set in the flash parent node instead of
the ibm,opal node (which has them set to 0). This would be more correct if
anything actually started to honor these settings.

The only concrete effect though is stopping dtc (and thus fwts) from whinging
when you run skiboot's output DT through it.

- Jack


Re: mm: Initialise per_cpu_nodestats for all online pgdats at boot

2016-08-04 Thread Reza Arbab

On Thu, Aug 04, 2016 at 10:24:04AM +0100, Mel Gorman wrote:

This has been compile-tested and boot-tested on a 32-bit KVM only. A
memoryless system was not available to test the patch with. A confirmation
from Paul and Reza that it resolves their problem is welcome.


Works for me. Thanks, Mel!

Tested-by: Reza Arbab 

--
Reza Arbab



Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-04 Thread Arnd Bergmann
On Thursday, August 4, 2016 11:54:18 PM CEST Nicholas Piggin wrote:
> On Thu, 4 Aug 2016 22:31:39 +1000
> Nicholas Piggin  wrote:
> > On Thu, 04 Aug 2016 14:09:02 +0200
> > Arnd Bergmann  wrote:
> > > Nicolas Pitre has done some related work, adding him to Cc. IIRC we have
> > > actually had multiple implementations of -ffunction-sections/--gc-sections
> > > in the past that people have used in production, but none of them
> > > ever made it upstream.  
> 
> After some googling around it seems lto has been difficult to
> get in and it was agreed this gc-sections should be done first
> anyway (although it may indeed provide a superset of DCE, but
> it's always going to be more costly and complicated). Lto would
> have the same issue with liveness of entry points, which is
> really the only thing you need change in the kernel as far as I
> can see.

Ok, good.

> I didn't really see what problems people were having with it
> though, so maybe it's architecture specific or something I
> haven't run into yet.

I remember trying it a few years ago without success, it's possible
that old binutils versions were more problematic.

I'm happy to test your patches on ARM, with my randconfig builder
I tend to find obscure bugs in corner cases that you might not
normally find with just defconfig/allmodconfig builds.

Arnd


[PATCH] cpufreq: powernv: Fix crash in gpstate_timer_handler

2016-08-04 Thread Akshay Adiga
'commit 09ca4c9b5958 ("cpufreq: powernv: Replacing pstate_id with
frequency table index")' changes calc_global_pstate() to use
cpufreq_table index instead of pstate_id.

But in gpstate_timer_handler() pstate_id was being passed instead
of cpufreq_table index, which caused idx_to_pstate() to access
out-of-bounds indices, leading to this crash.

Add sanity checks for index and pstate, to ensure only valid pstate
and index values are returned.

Call Trace:
[c0078d66b130] [c011d224] __free_irq+0x234/0x360
(unreliable)
[c0078d66b1c0] [c011d44c] free_irq+0x6c/0xa0
[c0078d66b1f0] [c006c4f8] opal_event_shutdown+0x88/0xd0
[c0078d66b230] [c0067a4c] opal_shutdown+0x1c/0x90
[c0078d66b260] [c0063a00] pnv_shutdown+0x20/0x40
[c0078d66b280] [c0021538] machine_restart+0x38/0x90
[c78d66b310] [c0965ea0] panic+0x284/0x300
[c0078d66b3a0] [c001f508] die+0x388/0x450
[c0078d66b430] [c0045a50] bad_page_fault+0xd0/0x140
[c0078d66b4a0] [c0008964] handle_page_fault+0x2c/0x30
   interrupt: 300 at gpstate_timer_handler+0x150/0x260
LR = gpstate_timer_handler+0x130/0x260
[c0078d66b7f0] [c0132b58] call_timer_fn+0x58/0x1c0
[c0078d66b880] [c0132e20] expire_timers+0x130/0x1d0
[c0078d66b8f0] [c0133068] run_timer_softirq+0x1a8/0x230
[c0078d66b980] [c00b535c] __do_softirq+0x18c/0x400
[c0078d66ba70] [c00b5828] irq_exit+0xc8/0x100
[c0078d66ba90] [c001e214] timer_interrupt+0xa4/0xe0
[c0078d66bac0] [c00027d0] decrementer_common+0x150/0x180
   interrupt: 901 at arch_local_irq_restore+0x74/0x90
  0] [c0106b34] call_cpuidle+0x44/0x90
[c0078d66be50] [c010708c] cpu_startup_entry+0x38c/0x460
[c0078d66bf20] [c003d930] start_secondary+0x330/0x380
[c0078d66bf90] [c0008e6c] start_secondary_prolog+0x10/0x14

Fixes: 08d27eb ("cpufreq: powernv: Replacing pstate_id with
frequency table index")
Reported-by: Madhavan Srinivasan 
Signed-off-by: Akshay Adiga 
---
 drivers/cpufreq/powernv-cpufreq.c | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/powernv-cpufreq.c 
b/drivers/cpufreq/powernv-cpufreq.c
index 87796e0..d3ffde8 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -145,11 +145,30 @@ static struct powernv_pstate_info {
 /* Use following macros for conversions between pstate_id and index */
 static inline int idx_to_pstate(unsigned int i)
 {
+   if (unlikely(i >= powernv_pstate_info.nr_pstates)) {
+   pr_warn_once("index %u is out of bound\n", i);
+   return powernv_freqs[powernv_pstate_info.nominal].driver_data;
+   }
+
return powernv_freqs[i].driver_data;
 }
 
 static inline unsigned int pstate_to_idx(int pstate)
 {
+   int min = powernv_freqs[powernv_pstate_info.min].driver_data;
+   int max = powernv_freqs[powernv_pstate_info.max].driver_data;
+
+   if (min > 0) {
+   if (unlikely((pstate < max) || (pstate > min))) {
+   pr_warn_once("pstate %d is out of bound\n", pstate);
+   return powernv_pstate_info.nominal;
+   }
+   } else {
+   if (unlikely((pstate > max) || (pstate < min))) {
+   pr_warn_once("pstate %d is out of bound\n", pstate);
+   return powernv_pstate_info.nominal;
+   }
+   }
/*
 * abs() is deliberately used so that is works with
 * both monotonically increasing and decreasing
@@ -593,7 +612,7 @@ void gpstate_timer_handler(unsigned long data)
} else {
gpstate_idx = calc_global_pstate(gpstates->elapsed_time,
 gpstates->highest_lpstate_idx,
-freq_data.pstate_id);
+gpstates->last_lpstate_idx);
}
 
/*
-- 
2.1.4
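
A user-space sketch of the index check, to show what happens when a
pstate id is passed where a table index is expected. The table contents
and the demo_* names are made up for the example; only the shape of the
check follows the patch:

```c
#include <stdio.h>

/* Stand-in for the driver's frequency table; the values are made up. */
struct freq_entry {
	int driver_data;	/* pstate id */
};

static const struct freq_entry demo_freqs[] = {
	{ 0 }, { -1 }, { -2 }, { -3 },
};

static const unsigned int demo_nr_pstates = 4;
static const unsigned int demo_nominal = 1;	/* index of the nominal pstate */

/* Index -> pstate id, falling back to the nominal pstate on a bad index. */
int demo_idx_to_pstate(unsigned int i)
{
	if (i >= demo_nr_pstates) {
		fprintf(stderr, "index %u is out of bound\n", i);
		return demo_freqs[demo_nominal].driver_data;
	}

	return demo_freqs[i].driver_data;
}
```

With this table, passing the pstate id -2 as an index converts it to a
huge unsigned value, which the check catches and maps to the nominal
pstate instead of reading past the end of the table.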



Re: [PATCH] fadump: Register the memory reserved by fadump

2016-08-04 Thread Srikar Dronamraju
* Mel Gorman  [2016-08-04 15:09:34]:

> > 
> > Suggested-by: Mel Gorman 
> 
> I didn't suggest this specifically. While it happens to be safe on ppc64,
> it potentially overwrites any future caller of set_dma_reserve. While the
> only other one is for the e820 map, it may be better to change the API
> to inc_dma_reserve?
> 
> It's also unfortunate that it's called dma_reserve because it has
> nothing to do with DMA or ZONE_DMA. inc_kernel_reserve may be more
> appropriate?

Yup, agreed. Will redo and send out.

> 
> -- 
> Mel Gorman
> SUSE Labs
> 

-- 
Thanks and Regards
Srikar Dronamraju



RE: [PATCH] cxl: Use fixed width predefined types in data structure.

2016-08-04 Thread David Laight
From: Philippe Bergheaud
> Sent: 04 August 2016 14:56
> This patch fixes a regression introduced by commit b810253.
> It substitutes the type __u8 for u8 in the uapi header cxl.h,
> because the latter is not always defined in userland build
> environments, in particular when cross-compiling libcxl on
> x86_64 linux machines (RHEL6.7 and Ubuntu 16.04).
> 
> It also makes the definition of cxl_event_afu_driver_reserved
> more consistent with the other definitions in the header file.
...
> diff --git a/include/uapi/misc/cxl.h b/include/uapi/misc/cxl.h
> index cbae529..180d526 100644
> --- a/include/uapi/misc/cxl.h
> +++ b/include/uapi/misc/cxl.h
> @@ -136,8 +136,8 @@ struct cxl_event_afu_driver_reserved {
>*
>* Of course the contents will be ABI, but that's up the AFU driver.
>*/
> - size_t data_size;
> - u8 data[];
> + __u32 data_size;
> + __u8 data[];
>  };

You've just changed 'data_size' from 64bit to 32bit (on 64bit systems).
This isn't mentioned in the commit message and changes the API.

I've NFI what it needs to be...

David
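
To make the size change concrete, a small user-space sketch comparing
the two layouts. uint32_t and uint8_t stand in for __u32 and __u8, and
the struct and function names are invented for the example:

```c
#include <stddef.h>
#include <stdint.h>

/* Layout before the patch: the width of data_size follows the ABI. */
struct reserved_old {
	size_t data_size;
	uint8_t data[];
};

/* Layout after the patch: data_size is always 32 bits wide. */
struct reserved_new {
	uint32_t data_size;
	uint8_t data[];
};

/* Offset of the payload in each layout. */
size_t old_data_offset(void)
{
	return offsetof(struct reserved_old, data);
}

size_t new_data_offset(void)
{
	return offsetof(struct reserved_new, data);
}
```

On an LP64 build the payload moves from offset 8 to offset 4, so a
binary built against one layout would misread events from the other.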



Re: [PATCH] fadump: Register the memory reserved by fadump

2016-08-04 Thread Mel Gorman
On Thu, Aug 04, 2016 at 07:12:45PM +0530, Srikar Dronamraju wrote:
> Fadump kernel reserves large chunks of memory even before the pages are
> initialized. This could mean memory that corresponds to several nodes might
> fall in memblock reserved regions.
> 
> Kernels compiled with CONFIG_DEFERRED_STRUCT_PAGE_INIT will initialize
> only certain size memory per node. The certain size takes into account
> the dentry and inode cache sizes. Currently the cache sizes are
> calculated based on the total system memory including the reserved
> memory. However such a kernel when booting the same kernel as fadump
> kernel will not be able to allocate the required amount of memory to
> suffice for the dentry and inode caches. This results in crashes like
> the below on large systems such as 32 TB systems.
> 
> Dentry cache hash table entries: 536870912 (order: 16, 4294967296 bytes)
> vmalloc: allocation failure, allocated 4097114112 of 17179934720 bytes
> swapper/0: page allocation failure: order:0, mode:0x2080020(GFP_ATOMIC)
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6-master+ #3
> Call Trace:
> [c108fb10] [c07fac88] dump_stack+0xb0/0xf0 (unreliable)
> [c108fb50] [c0235264] warn_alloc_failed+0x114/0x160
> [c108fbf0] [c0281484] __vmalloc_node_range+0x304/0x340
> [c108fca0] [c028152c] __vmalloc+0x6c/0x90
> [c108fd40] [c0aecfb0]
> alloc_large_system_hash+0x1b8/0x2c0
> [c108fe00] [c0af7240] inode_init+0x94/0xe4
> [c108fe80] [c0af6fec] vfs_caches_init+0x8c/0x13c
> [c108ff00] [c0ac4014] start_kernel+0x50c/0x578
> [c108ff90] [c0008c6c] start_here_common+0x20/0xa8
> 
> Register the memory reserved by fadump, so that the cache sizes are
> calculated based on the free memory (i.e Total memory - reserved
> memory).
> 
> Suggested-by: Mel Gorman 

I didn't suggest this specifically. While it happens to be safe on ppc64,
it potentially overwrites any future caller of set_dma_reserve. While the
only other one is for the e820 map, it may be better to change the API
to inc_dma_reserve?

It's also unfortunate that it's called dma_reserve because it has
nothing to do with DMA or ZONE_DMA. inc_kernel_reserve may be more
appropriate?

-- 
Mel Gorman
SUSE Labs
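
A user-space model of the difference between the two API shapes, as a
sketch only. The demo_* names are hypothetical and the counter here is
just a stand-in for the kernel's reserve accounting:

```c
/* Stand-in for the kernel's reserved-page counter. */
static unsigned long demo_reserve_pages;

/* "set" semantics: the last caller wins, earlier values are lost. */
void demo_set_reserve(unsigned long pages)
{
	demo_reserve_pages = pages;
}

/* "inc" semantics (hypothetical): every caller's pages are accumulated. */
void demo_inc_reserve(unsigned long pages)
{
	demo_reserve_pages += pages;
}

unsigned long demo_get_reserve(void)
{
	return demo_reserve_pages;
}
```

With two callers reserving 100 and 50 pages, the "set" form ends up
accounting for 50 while the "inc" form accounts for 150.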


[PATCH] fadump: Register the memory reserved by fadump

2016-08-04 Thread Srikar Dronamraju
Fadump kernel reserves large chunks of memory even before the pages are
initialized. This could mean memory that corresponds to several nodes might
fall in memblock reserved regions.

Kernels compiled with CONFIG_DEFERRED_STRUCT_PAGE_INIT will initialize
only certain size memory per node. The certain size takes into account
the dentry and inode cache sizes. Currently the cache sizes are
calculated based on the total system memory including the reserved
memory. However such a kernel when booting the same kernel as fadump
kernel will not be able to allocate the required amount of memory to
suffice for the dentry and inode caches. This results in crashes like
the below on large systems such as 32 TB systems.

Dentry cache hash table entries: 536870912 (order: 16, 4294967296 bytes)
vmalloc: allocation failure, allocated 4097114112 of 17179934720 bytes
swapper/0: page allocation failure: order:0, mode:0x2080020(GFP_ATOMIC)
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6-master+ #3
Call Trace:
[c108fb10] [c07fac88] dump_stack+0xb0/0xf0 (unreliable)
[c108fb50] [c0235264] warn_alloc_failed+0x114/0x160
[c108fbf0] [c0281484] __vmalloc_node_range+0x304/0x340
[c108fca0] [c028152c] __vmalloc+0x6c/0x90
[c108fd40] [c0aecfb0]
alloc_large_system_hash+0x1b8/0x2c0
[c108fe00] [c0af7240] inode_init+0x94/0xe4
[c108fe80] [c0af6fec] vfs_caches_init+0x8c/0x13c
[c108ff00] [c0ac4014] start_kernel+0x50c/0x578
[c108ff90] [c0008c6c] start_here_common+0x20/0xa8

Register the memory reserved by fadump, so that the cache sizes are
calculated based on the free memory (i.e Total memory - reserved
memory).

Suggested-by: Mel Gorman 
Signed-off-by: Srikar Dronamraju 
---
 arch/powerpc/kernel/fadump.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index 3cb3b02a..eae547d 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -330,6 +330,7 @@ int __init fadump_reserve_mem(void)
}
fw_dump.reserve_dump_area_start = base;
fw_dump.reserve_dump_area_size = size;
+   set_dma_reserve(size/PAGE_SIZE);
return 1;
 }
 
-- 
1.8.5.6



[PATCH] cxl: Use fixed width predefined types in data structure.

2016-08-04 Thread Philippe Bergheaud
This patch fixes a regression introduced by commit b810253.
It substitutes the type __u8 for u8 in the uapi header cxl.h,
because the latter is not always defined in userland build
environments, in particular when cross-compiling libcxl on
x86_64 linux machines (RHEL6.7 and Ubuntu 16.04).

It also makes the definition of cxl_event_afu_driver_reserved
more consistent with the other definitions in the header file.

Signed-off-by: Philippe Bergheaud 
---
 include/uapi/misc/cxl.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/uapi/misc/cxl.h b/include/uapi/misc/cxl.h
index cbae529..180d526 100644
--- a/include/uapi/misc/cxl.h
+++ b/include/uapi/misc/cxl.h
@@ -136,8 +136,8 @@ struct cxl_event_afu_driver_reserved {
 *
 * Of course the contents will be ABI, but that's up the AFU driver.
 */
-   size_t data_size;
-   u8 data[];
+   __u32 data_size;
+   __u8 data[];
 };
 
 struct cxl_event {
-- 
2.8.0



Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-04 Thread Nicholas Piggin
On Thu, 4 Aug 2016 22:31:39 +1000
Nicholas Piggin  wrote:
> On Thu, 04 Aug 2016 14:09:02 +0200
> Arnd Bergmann  wrote:
> > Nicolas Pitre has done some related work, adding him to Cc. IIRC we have
> > actually had multiple implementations of -ffunction-sections/--gc-sections
> > in the past that people have used in production, but none of them
> > ever made it upstream.  

After some googling around it seems lto has been difficult to
get in and it was agreed this gc-sections should be done first
anyway (although it may indeed provide a superset of DCE, but
it's always going to be more costly and complicated). Lto would
have the same issue with liveness of entry points, which is
really the only thing you need change in the kernel as far as I
can see.

I didn't really see what problems people were having with it
though, so maybe it's architecture specific or something I
haven't run into yet.

Thanks,
Nick


[pasemi] Radeon HD graphics card not recognised after the powerpc-4.8-1 commit

2016-08-04 Thread Christian Zigotzky

Hi All,

I figured out that the Git kernel (4.8) successfully detected my Radeon 
HD6870 but Xorg can't access it.


The reason is that the BusID has changed between kernels 4.7 and 4.8.

Kernel 4.7:

01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. 
[AMD/ATI] Barts XT [Radeon HD 6870]


xorg.conf:

Section "Device"
BusID   "PCI:1:0:0"

Git kernel (4.8):

e900:01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. 
[AMD/ATI] Barts XT [Radeon HD 6870]
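
The 4.8 address carries a PCI domain prefix in front of the usual
bus:device.function triple. As a sketch, the string can be split into
its four hexadecimal fields like this (struct pci_addr and
parse_pci_addr are names invented for the example; whether and how Xorg
accepts such a domain in BusID is not something the sketch answers):

```c
#include <stdio.h>

struct pci_addr {
	unsigned int domain, bus, dev, fn;
};

/*
 * Parse "dddd:bb:dd.f" (all fields hexadecimal, as in the kernel log).
 * Returns 0 on success, -1 if the string does not have all four fields.
 */
int parse_pci_addr(const char *s, struct pci_addr *out)
{
	if (sscanf(s, "%x:%x:%x.%x",
		   &out->domain, &out->bus, &out->dev, &out->fn) != 4)
		return -1;

	return 0;
}
```

For "e900:01:00.0" this gives domain 0xe900 (59648 decimal), bus 1,
device 0, function 0.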


What does the new BusID look like in xorg.conf?

dmesg | grep -i radeon

[0.997471] [drm] radeon kernel modesetting enabled.
[1.171136] radeon e900:01:00.0: VRAM: 1024M 0x 
- 0x3FFF (1024M used)
[1.171141] radeon e900:01:00.0: GTT: 1024M 0x4000 - 
0x7FFF

[1.171270] [drm] radeon: 1024M of VRAM memory ready
[1.171273] [drm] radeon: 1024M of GTT memory ready.
[1.176557] [drm] radeon: dpm initialized
[1.212690] radeon e900:01:00.0: WB enabled
[1.212697] radeon e900:01:00.0: fence driver on ring 0 use gpu 
addr 0x4c00 and cpu addr 0xc0026dbdbc00
[1.212702] radeon e900:01:00.0: fence driver on ring 3 use gpu 
addr 0x4c0c and cpu addr 0xc0026dbdbc0c
[1.219475] radeon e900:01:00.0: fence driver on ring 5 use gpu 
addr 0x00072118 and cpu addr 0xd800901b2118

[1.219485] radeon e900:01:00.0: radeon: MSI limited to 32-bit
[1.219530] [drm] radeon: irq initialized.
[2.070343] [drm] Radeon Display Connectors
[2.917509] radeon e900:01:00.0: fb0: radeondrmfb frame buffer 
device
[2.923810] [drm] Initialized radeon 2.46.0 20080528 for 
e900:01:00.0 on minor 0


dmesg | grep -i e900:01:00.0

[0.152680] pci e900:01:00.0: [1002:6738] type 00 class 0x03
[0.152694] pci e900:01:00.0: calling .pci_dev_pdn_setup+0x0/0x3c
[0.152710] pci e900:01:00.0: reg 0x10: [mem 
0x9000-0x9fff 64bit pref]
[0.152722] pci e900:01:00.0: reg 0x18: [mem 
0xa002-0xa003 64bit]

[0.152731] pci e900:01:00.0: reg 0x20: [io  0x2000-0x20ff]
[0.152744] pci e900:01:00.0: reg 0x30: [mem 
0xa000-0xa001 pref]
[0.152755] pci e900:01:00.0: calling 
.pcibios_fixup_resources+0x0/0x110

[0.152763] pci e900:01:00.0: calling .quirk_no_pm_reset+0x0/0x40
[0.152804] pci e900:01:00.0: supports D1 D2
[0.494871] pci e900:01:00.0: calling .fixup_vga+0x0/0x3c
[1.171136] radeon e900:01:00.0: VRAM: 1024M 0x 
- 0x3FFF (1024M used)
[1.171141] radeon e900:01:00.0: GTT: 1024M 0x4000 - 
0x7FFF

[1.212690] radeon e900:01:00.0: WB enabled
[1.212697] radeon e900:01:00.0: fence driver on ring 0 use gpu 
addr 0x4c00 and cpu addr 0xc0026dbdbc00
[1.212702] radeon e900:01:00.0: fence driver on ring 3 use gpu 
addr 0x4c0c and cpu addr 0xc0026dbdbc0c
[1.219475] radeon e900:01:00.0: fence driver on ring 5 use gpu 
addr 0x00072118 and cpu addr 0xd800901b2118

[1.219485] radeon e900:01:00.0: radeon: MSI limited to 32-bit
[2.917509] radeon e900:01:00.0: fb0: radeondrmfb frame buffer 
device
[2.923810] [drm] Initialized radeon 2.46.0 20080528 for 
e900:01:00.0 on minor 0


FBDEV with the Git kernel (4.8):

Fatal server error:
xf86MapLegacyIO(): domain out of range

Cheers,

Christian

On 03 August 2016 at 12:53 PM, Benjamin Herrenschmidt wrote:

On Wed, 2016-08-03 at 11:03 +0200, Christian Zigotzky wrote:

I reverted the commit "powerpc-4.8-1" and Xorg works. The commit
"powerpc-4.8-1" is the problem.

Link:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=bad60e6f259a01cf9f29a1ef8d435ab6c60b2de9

Which source code modification in the commit "powerpc-4.8-1" could be
the problem?

This is a merge, not a commit. Can you bisect down that branch ? Also
include the kernel dmesg log.

Cheers,
Ben.






Re: [PATCH] powerpc/book3s: Fix MCE console messages for unrecoverable MCE.

2016-08-04 Thread Nicholas Piggin
On Thu, 4 Aug 2016 17:35:45 +0530
Mahesh Jagannath Salgaonkar  wrote:

> On 08/04/2016 03:27 PM, Michael Ellerman wrote:
> > Mahesh J Salgaonkar  writes:
> >   
> >> From: Mahesh Salgaonkar 
> >>
> >> When machine check occurs with MSR(RI=0), it means MC interrupt is
> >> unrecoverable and kernel goes down to panic path. But the console
> >> message still shows it as recovered. This patch fixes the MCE console
> >> messages.
> >>
> >> Signed-off-by: Mahesh Salgaonkar 
> >> ---
> >>  arch/powerpc/kernel/mce.c |3 ++-
> >>  arch/powerpc/platforms/powernv/opal.c |2 ++
> >>  2 files changed, 4 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
> >> index ef267fd..5e7ece0 100644
> >> --- a/arch/powerpc/kernel/mce.c
> >> +++ b/arch/powerpc/kernel/mce.c
> >> @@ -92,7 +92,8 @@ void save_mce_event(struct pt_regs *regs, long handled,
> >>mce->in_use = 1;
> >>  
> >>mce->initiator = MCE_INITIATOR_CPU;
> >> -  if (handled)
> >> +  /* Mark it recovered if we have handled it and MSR(RI=1). */
> >> +  if (handled && (regs->msr & MSR_RI))
> >>mce->disposition = MCE_DISPOSITION_RECOVERED;  
> > 
> > This seems like it has bigger implications than just changing the
> > printk output? We're now (correctly) marking any MC where RI=0 as
> > unrecoverable.
> > 
> > Or is the only place that uses this the code below which *also* checks
> > MSR_RI?  
> 
> We would always check MSR_RI at code below and panic correctly. It was
> just that we were always printing it as recovered and then panic.
> 
> >   
> >> diff --git a/arch/powerpc/platforms/powernv/opal.c 
> >> b/arch/powerpc/platforms/powernv/opal.c
> >> index 5385434..8154171 100644
> >> --- a/arch/powerpc/platforms/powernv/opal.c
> >> +++ b/arch/powerpc/platforms/powernv/opal.c
> >> @@ -401,6 +401,8 @@ static int opal_recover_mce(struct pt_regs *regs,
> >>  
> >>if (!(regs->msr & MSR_RI)) {
> >>/* If MSR_RI isn't set, we cannot recover */  
> > 
> > Why do we check MSR_RI again here? Shouldn't we just be looking at the 
> > evt->disposition?  
> 
> When MSR_RI=0, where SRR0/SRR1 registers values have been thrashed,
> kernel can not continue reliably if we return from interrupt.

If it's a user process that raises a synchronous/instruction caused
exception that gets hit with the MCE, I wonder if we can kill the
process and continue? I'm not saying you should do it with this patch.

It might need some auditing to check we don't leave the paca in some bad
state or something, but it would be a nice feature because right now a user
doing a busy loop of the funny 0x1ebe syscall, or maybe an illegal
instruction or emulation could probably keep the MSR_RI bit clear for
probably half or more of the CPU's cycles couldn't they?

Thanks,
Nick
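
For reference, the check the patch adds boils down to the following,
shown as a stand-alone sketch. The DEMO_* names are invented, and the
0x2 value for the RI bit is an assumption made for the example rather
than something taken from this thread:

```c
#include <stdbool.h>

/* MSR[RI] bit; 0x2 is assumed here for the example. */
#define DEMO_MSR_RI	0x2UL

enum demo_disposition {
	DEMO_NOT_RECOVERED,
	DEMO_RECOVERED,
};

/* Recovered only if the handler dealt with the error and RI was set. */
enum demo_disposition demo_mce_disposition(bool handled, unsigned long msr)
{
	if (handled && (msr & DEMO_MSR_RI))
		return DEMO_RECOVERED;

	return DEMO_NOT_RECOVERED;
}
```

A handled machine check with RI=0 is therefore reported as not
recovered, which matches the panic that follows.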


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-04 Thread Nicholas Piggin
On Thu, 04 Aug 2016 14:09:02 +0200
Arnd Bergmann  wrote:

> On Thursday, August 4, 2016 9:47:13 PM CEST Nicholas Piggin wrote:
> > On Thu, 04 Aug 2016 12:37:41 +0200 Arnd Bergmann  wrote:  
> > > On Thursday, August 4, 2016 11:00:49 AM CEST Arnd Bergmann wrote:  
> > > > I tried this
> > > > 
> > > > diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
> > > > index b5e40ed86e60..89bca1a25916 100755
> > > > --- a/scripts/link-vmlinux.sh
> > > > +++ b/scripts/link-vmlinux.sh
> > > > @@ -44,7 +44,7 @@ modpost_link()
> > > > local objects
> > > >  
> > > > if [ -n "${CONFIG_THIN_ARCHIVES}" ]; then
> > > > -   objects="--whole-archive ${KBUILD_VMLINUX_INIT} 
> > > > ${KBUILD_VMLINUX_MAIN} --no-whole-archive"
> > > > +   objects="${KBUILD_VMLINUX_INIT} ${KBUILD_VMLINUX_MAIN}"
> > > > else
> > > > objects="${KBUILD_VMLINUX_INIT} --start-group 
> > > > ${KBUILD_VMLINUX_MAIN} --end-group"
> > > > fi
> > > > 
> > > > but that did not seem to change anything, the extra symbols are
> > > > still there. I have not tried to understand what that actually
> > > > does, so maybe I misunderstood your suggestion.
> > > > 
> > > 
> > > On a second attempt, I did the same change for vmlinux instead of the
> > > module (d'oh), and got a link failure instead:
> > > 
> > > 
> > > arch/arm/mm/proc-xscale.o: In function `cpu_xscale_do_resume':
> > > (.text+0x3d4): undefined reference to `cpu_resume_mmu'
> > > arch/arm/kernel/setup.o: In function `setup_arch':
> > > ...
> > > 
> > > However, I also see a link failure in some rare configurations
> > > with just your patch:
> > > 
> > > arch/arm/lib/lib.a(io-acorn.o): In function `outsl':
> > > (.text+0x38): undefined reference to `printk'
> > > 
> > > The problem being a file in a library object that is not referenced,
> > > but that references another symbol that is not defined
> > > (CONFIG_PRINTK=n).  
> > 
> > The first problem is the existing link system is buggy. I think an
> > unconditional switch to --whole-archive (at least for modular kernels)
> > should probably be done anyway. For example, on powerpc when building
> > with --whole-archive, I have:
> > 
> > +dma_noop_alloc
> > +dma_noop_free
> > +dma_noop_map_page
> > +dma_noop_mapping_error
> > +dma_noop_map_sg
> > +dma_noop_ops
> > +dma_noop_supported
> > +fdt_add_reservemap_entry
> > +fdt_begin_node
> > +fdt_create
> > +fdt_create_empty_tree
> > +fdt_end_node
> > +fdt_errtable
> > +find_cpio_data
> > +ioremap_page_range
> > 
> > find_cpio_data is unnecessary and it's a codesize regression to link it.
> > But dma_noop_ops and ioremap_page_range are exported symbols. If I
> > reference dma_noop_ops from some random module with otherwise unpatched
> > kernel:
> > 
> > ERROR: "dma_noop_ops" [drivers/char/bsr.ko] undefined!  
> 
> Right, but only on s390, which is the one architecture using this.
> I think we should just have a Kconfig symbol for this file that
> gets selected by any architecture that needs it.

No, the problem is that the module is being selected and built
but it is missing from the vmlinux despite being exported.


> This is also what we have ended up doing for almost all other
> files in lib/
> 
> > The real problem is that our linkage requirements are like a shared
> > library when we build modular.
> > 
> > We could build a list of exports and make it link objects with those
> > symbols, to solve this, but IMO that's just wasting lipstick on a pig.
> > But I will propose a patch to always use --whole-archive, thin
> > archives or not, and transition all archs over to it in a few release
> > cycles. It just works by luck right now.
> >
> > Why is it a pig? Because having the linker to notice no external
> > references and just skipping the .o completely is trying to use a hammer
> > as a scalpel. It's just not a very effective way to eliminate dead code
> > --  I pulled in only a handful of unneeded functions by switching it.  
> 
> If we do that, we may just as well get rid of $(lib-y) in the process and
> always use $(obj-y).

Sure, after we switch everybody over.


> > I mean it is a quick simple feature that probably works well enough with
> > simple build systems. But not an advanced one that builds almost
> > everything on demand and also has loadable modules and must act like a
> > shared library.
> > 
> > Real linker DCE is a valid optimisation that can't be replaced by the
> > build system of course, but we need to do it properly. Here's what I'm
> > working on.
> > 
> > It applies on top of the previous patch I sent, plus some powerpc stuff
> > I'm working on that you should be able to just ignore for another arch.
> > it's a WIP, but if you can see if it works for arm that would be cool.
> > 
> > It doesn't actually build allyesconfig after this,
> > ld: .tmp_vmlinux1: Too many sections: 220655 (>= 65280)
> > 
> > But on a more reasonable configuration (ppc64le)
> > text 

[PATCH 7/7] ima: support restoring multiple template formats

2016-08-04 Thread Mimi Zohar
The configured IMA measurement list template format can be replaced at
runtime on the boot command line, including a custom template format.
This patch adds support for restoring a measurement list containing
multiple builtin/custom template formats.

Signed-off-by: Mimi Zohar 
---
 security/integrity/ima/ima_template.c | 58 +--
 1 file changed, 55 insertions(+), 3 deletions(-)

diff --git a/security/integrity/ima/ima_template.c 
b/security/integrity/ima/ima_template.c
index b7bcb62..3923718 100644
--- a/security/integrity/ima/ima_template.c
+++ b/security/integrity/ima/ima_template.c
@@ -56,6 +56,8 @@ static int __init ima_template_setup(char *str)
if (ima_template)
return 1;
 
+   ima_init_template_list();
+
/*
 * Verify that a template with the supplied name exists.
 * If not, use CONFIG_IMA_DEFAULT_TEMPLATE.
@@ -150,9 +152,14 @@ static int template_desc_init_fields(const char 
*template_fmt,
 {
const char *template_fmt_ptr;
struct ima_template_field *found_fields[IMA_TEMPLATE_NUM_FIELDS_MAX];
-   int template_num_fields = template_fmt_size(template_fmt);
+   int template_num_fields;
int i, len;
 
+   if (num_fields && *num_fields > 0) /* already initialized? */
+   return 0;
+
+   template_num_fields = template_fmt_size(template_fmt);
+
if (template_num_fields > IMA_TEMPLATE_NUM_FIELDS_MAX) {
pr_err("format string '%s' contains too many fields\n",
   template_fmt);
@@ -202,6 +209,9 @@ void __init ima_init_template_list(void)
 {
int i;
 
+   if (!list_empty(&defined_templates))
+   return;
+
spin_lock(&template_list);
for (i = 0; i < ARRAY_SIZE(builtin_templates); i++) {
list_add_tail_rcu(&builtin_templates[i].list,
@@ -227,6 +237,35 @@ int __init ima_init_template(void)
return result;
 }
 
+static struct ima_template_desc *restore_template_fmt(char *template_name)
+{
+   struct ima_template_desc *template_desc = NULL;
+   int ret;
+
+   ret = template_desc_init_fields(template_name, NULL, NULL);
+   if (ret < 0) {
+   pr_err("attempting to initialize the template \"%s\" failed\n",
+   template_name);
+   goto out;
+   }
+
+   template_desc = kzalloc(sizeof(*template_desc), GFP_KERNEL);
+   if (!template_desc)
+   goto out;
+
+   template_desc->name = "";
+   template_desc->fmt = kstrdup(template_name, GFP_KERNEL);
+   if (!template_desc->fmt)
+   goto out;
+
+   spin_lock(&template_list);
+   list_add_tail_rcu(&template_desc->list, &defined_templates);
+   spin_unlock(&template_list);
+   synchronize_rcu();
+out:
+   return template_desc;
+}
+
 static int ima_restore_template_data(struct ima_template_desc *template_desc,
 void *template_data,
 int template_data_size,
@@ -359,10 +398,23 @@ int ima_restore_measurement_list(loff_t size, void *buf)
}
data_v1 = bufp += (u_int8_t)hdr_v1->template_name_len;
 
-   /* get template format */
template_desc = lookup_template_desc(template_name);
if (!template_desc) {
-   pr_err("template \"%s\" not found\n", template_name);
+   template_desc = restore_template_fmt(template_name);
+   if (!template_desc)
+   break;
+   }
+
+   /*
+* Only the running system's template format is initialized
+* on boot.  As needed, initialize the other template formats.
+*/
+   ret = template_desc_init_fields(template_desc->fmt,
+   &(template_desc->fields),
+   &(template_desc->num_fields));
+   if (ret < 0) {
+   pr_err("attempting to restore the template fmt \"%s\" \
+   failed\n", template_desc->fmt);
ret = -EINVAL;
break;
}
-- 
2.1.0
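
The lookup-or-add flow used when restoring an unknown template format
can be sketched in user space as below. This is an analogue only: it
uses a plain singly linked list with no RCU or locking, prepends rather
than appends, and all demo_* names are invented for the example:

```c
#include <stdlib.h>
#include <string.h>

struct demo_template {
	struct demo_template *next;
	const char *name;
	char *fmt;
};

static struct demo_template *demo_templates;	/* list head */

/* Match on either the template name or its format string. */
struct demo_template *demo_lookup(const char *key)
{
	struct demo_template *t;

	for (t = demo_templates; t; t = t->next)
		if (!strcmp(t->name, key) || !strcmp(t->fmt, key))
			return t;

	return NULL;
}

/* Look the format up; if it is unknown, add an unnamed entry for it. */
struct demo_template *demo_lookup_or_add(const char *fmt)
{
	struct demo_template *t = demo_lookup(fmt);

	if (t)
		return t;

	t = calloc(1, sizeof(*t));
	if (!t)
		return NULL;

	t->name = "";
	t->fmt = malloc(strlen(fmt) + 1);
	if (!t->fmt) {
		free(t);	/* do not hand back a half-built entry */
		return NULL;
	}
	strcpy(t->fmt, fmt);

	t->next = demo_templates;
	demo_templates = t;
	return t;
}
```

Restoring two entries with the same custom format then yields a single
list node, since the second call finds the entry by its format string.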



[PATCH 6/7] ima: store the builtin/custom template definitions in a list

2016-08-04 Thread Mimi Zohar
The builtin and single custom templates are currently stored in an
array.  In preparation for being able to restore a measurement list
containing multiple builtin/custom templates, this patch stores the
builtin and custom templates as a linked list.  This will permit
defining more than one custom template per boot.

Signed-off-by: Mimi Zohar 
---
 security/integrity/ima/ima.h  |  2 ++
 security/integrity/ima/ima_main.c |  1 +
 security/integrity/ima/ima_template.c | 37 +++
 3 files changed, 32 insertions(+), 8 deletions(-)

diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index f972296..9d7fdd5 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -81,6 +81,7 @@ struct ima_template_field {
 
 /* IMA template descriptor definition */
 struct ima_template_desc {
+   struct list_head list;
char *name;
char *fmt;
int num_fields;
@@ -135,6 +136,7 @@ int ima_restore_measurement_list(loff_t bufsize, void *buf);
 int ima_measurements_show(struct seq_file *m, void *v);
 unsigned long ima_get_binary_runtime_size(void);
 int ima_init_template(void);
+void ima_init_template_list(void);
 
 /*
  * used to protect h_table and sha_table
diff --git a/security/integrity/ima/ima_main.c 
b/security/integrity/ima/ima_main.c
index 596ef61..592f318 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -418,6 +418,7 @@ static int __init init_ima(void)
 {
int error;
 
+   ima_init_template_list();
hash_setup(CONFIG_IMA_DEFAULT_HASH);
error = ima_init();
if (!error) {
diff --git a/security/integrity/ima/ima_template.c 
b/security/integrity/ima/ima_template.c
index c6510f0..b7bcb62 100644
--- a/security/integrity/ima/ima_template.c
+++ b/security/integrity/ima/ima_template.c
@@ -15,16 +15,20 @@
 
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
+#include 
 #include "ima.h"
 #include "ima_template_lib.h"
 
-static struct ima_template_desc defined_templates[] = {
+static struct ima_template_desc builtin_templates[] = {
{.name = IMA_TEMPLATE_IMA_NAME, .fmt = IMA_TEMPLATE_IMA_FMT},
{.name = "ima-ng", .fmt = "d-ng|n-ng"},
{.name = "ima-sig", .fmt = "d-ng|n-ng|sig"},
{.name = "", .fmt = ""},/* placeholder for a custom format */
 };
 
+static LIST_HEAD(defined_templates);
+spinlock_t template_list;
+
 static struct ima_template_field supported_fields[] = {
{.field_id = "d", .field_init = ima_eventdigest_init,
 .field_show = ima_show_template_digest},
@@ -80,7 +84,7 @@ __setup("ima_template=", ima_template_setup);
 
 static int __init ima_template_fmt_setup(char *str)
 {
-   int num_templates = ARRAY_SIZE(defined_templates);
+   int num_templates = ARRAY_SIZE(builtin_templates);
 
if (ima_template)
return 1;
@@ -91,20 +95,24 @@ static int __init ima_template_fmt_setup(char *str)
return 1;
}
 
-   defined_templates[num_templates - 1].fmt = str;
-   ima_template = defined_templates + num_templates - 1;
+   builtin_templates[num_templates - 1].fmt = str;
+   ima_template = builtin_templates + num_templates - 1;
+
return 1;
 }
 __setup("ima_template_fmt=", ima_template_fmt_setup);
 
 static struct ima_template_desc *lookup_template_desc(const char *name)
 {
-   int i;
+   struct ima_template_desc *template_desc;
 
-   for (i = 0; i < ARRAY_SIZE(defined_templates); i++) {
-   if (strcmp(defined_templates[i].name, name) == 0)
-   return defined_templates + i;
+   rcu_read_lock();
+   list_for_each_entry_rcu(template_desc, &defined_templates, list) {
+   if ((strcmp(template_desc->name, name) == 0) ||
+   (strcmp(template_desc->fmt, name) == 0))
+   return template_desc;
}
+   rcu_read_unlock();
 
return NULL;
 }
@@ -190,6 +198,19 @@ struct ima_template_desc *ima_template_desc_current(void)
return ima_template;
 }
 
+void __init ima_init_template_list(void)
+{
+   int i;
+
+   spin_lock(&template_list);
+   for (i = 0; i < ARRAY_SIZE(builtin_templates); i++) {
+   list_add_tail_rcu(&builtin_templates[i].list,
+ &defined_templates);
+   }
+   spin_unlock(&template_list);
+   synchronize_rcu();
+}
+
 int __init ima_init_template(void)
 {
struct ima_template_desc *template = ima_template_desc_current();
-- 
2.1.0
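
For readers following the series, the list-based lookup above can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct template_desc, template_add() and template_lookup() are hypothetical stand-ins, a singly linked list replaces struct list_head, and there is no locking at all. In a kernel version, every rcu_read_lock() would need a matching rcu_read_unlock() on each return path, including the early return on a match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct ima_template_desc. */
struct template_desc {
    struct template_desc *next;  /* replaces the kernel's struct list_head */
    const char *name;
    const char *fmt;
};

/* Head of the list of defined templates; new entries go on the tail. */
static struct template_desc *templates_head;
static struct template_desc **templates_tail = &templates_head;

/* Append one descriptor, mirroring what list_add_tail_rcu() does above. */
static void template_add(struct template_desc *desc)
{
    assert(desc && desc->name && desc->fmt);
    desc->next = NULL;
    *templates_tail = desc;
    templates_tail = &desc->next;
}

/* Return the first descriptor whose name or format string equals key. */
static struct template_desc *template_lookup(const char *key)
{
    struct template_desc *d;

    for (d = templates_head; d; d = d->next) {
        if (strcmp(d->name, key) == 0 || strcmp(d->fmt, key) == 0)
            return d;
    }
    return NULL;
}
```

Matching on either the name or the format string is what lets a restored measurement list refer to a custom template that has no name of its own.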



[PATCH 5/7] ima: on soft reboot, save the measurement list

2016-08-04 Thread Mimi Zohar
From: Thiago Jung Bauermann 

This patch uses the kexec buffer passing mechanism to pass the
serialized IMA binary_runtime_measurements to the next kernel.

Changelog:
- updated to call IMA functions  (Mimi)
- move code from ima_template.c to ima_kexec.c (Mimi)

Signed-off-by: Thiago Jung Bauermann 
Signed-off-by: Mimi Zohar 
---
 include/linux/ima.h| 15 +++
 kernel/kexec_file.c|  3 ++
 security/integrity/ima/ima_kexec.c | 83 ++
 3 files changed, 101 insertions(+)

diff --git a/include/linux/ima.h b/include/linux/ima.h
index b553367..ba484ed 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -11,6 +11,7 @@
 #define _LINUX_IMA_H
 
 #include 
+#include 
 struct linux_binprm;
 
 enum ima_buffer_id {
@@ -35,6 +36,14 @@ extern int ima_add_measurement_check(const char *hashname, 
u8 *digest,
 loff_t size, enum ima_buffer_id buffer_id,
 char *hint);
 
+#ifdef CONFIG_IMA_KEXEC
+extern void ima_add_kexec_buffer(struct kimage *image);
+#else
+static inline void ima_add_kexec_buffer(struct kimage *image)
+{
+}
+#endif
+
 #else
 static inline int ima_bprm_check(struct linux_binprm *bprm)
 {
@@ -85,6 +94,12 @@ static inline int ima_add_measurement_check(const char 
*hashname, u8 *digest,
 {
return 0;
 }
+
+#ifdef CONFIG_IMA_KEXEC
+static inline void ima_add_kexec_buffer(struct kimage *image)
+{
+}
+#endif
 #endif /* CONFIG_IMA */
 
 #ifdef CONFIG_IMA_APPRAISE
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 852adb2..622c126 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -202,6 +202,9 @@ kimage_file_prepare_segments(struct kimage *image, int 
kernel_fd, int initrd_fd,
return ret;
image->kernel_buf_len = size;
 
+   /* IMA needs to pass the measurement list to the next kernel. */
+   ima_add_kexec_buffer(image);
+
/* Call arch image probe handlers */
ret = arch_kexec_kernel_image_probe(image, image->kernel_buf,
image->kernel_buf_len);
diff --git a/security/integrity/ima/ima_kexec.c 
b/security/integrity/ima/ima_kexec.c
index e77ca9d..3fed417 100644
--- a/security/integrity/ima/ima_kexec.c
+++ b/security/integrity/ima/ima_kexec.c
@@ -23,6 +23,11 @@
 
 #include "ima.h"
 
+#ifdef CONFIG_IMA_KEXEC
+/* Physical address of the measurement buffer in the next kernel. */
+static unsigned long kexec_buffer_load_addr;
+static size_t kexec_segment_size;
+
 static int ima_dump_measurement_list(unsigned long *buffer_size, void **buffer,
 unsigned long segment_size)
 {
@@ -75,6 +80,84 @@ out:
 }
 
 /*
+ * Called during kexec execute so that IMA can save the measurement list.
+ */
+static int ima_update_kexec_buffer(struct notifier_block *self,
+  unsigned long action, void *data)
+{
+   void *kexec_buffer = NULL;
+   size_t kexec_buffer_size;
+   int ret;
+
+   if (!kexec_in_progress)
+   return NOTIFY_OK;
+
+   kexec_buffer_size = ima_get_binary_runtime_size();
+   if (kexec_buffer_size >
+   (kexec_segment_size - sizeof(struct ima_kexec_hdr))) {
+   pr_err("Binary measurement list grew too large.\n");
+   goto out;
+   }
+
+   ima_dump_measurement_list(&kexec_buffer_size, &kexec_buffer,
+ kexec_segment_size);
+   if (!kexec_buffer) {
+   pr_err("Not enough memory for the kexec measurement buffer.\n");
+   goto out;
+   }
+   ret = kexec_update_segment(kexec_buffer, kexec_buffer_size,
+  kexec_buffer_load_addr, kexec_segment_size);
+   if (ret)
+   pr_err("Error updating kexec buffer: %d\n", ret);
+out:
+   return NOTIFY_OK;
+}
+
+struct notifier_block update_buffer_nb = {
+   .notifier_call = ima_update_kexec_buffer,
+};
+
+/*
+ * Called during kexec_file_load so that IMA can add a segment to the kexec
+ * image for the measurement list for the next kernel.
+ */
+void ima_add_kexec_buffer(struct kimage *image)
+{
+   struct kexec_buf kbuf = { .image = image, .buf_align = PAGE_SIZE,
+ .buf_min = 0, .buf_max = ULONG_MAX,
+ .top_down = true };
+   int ret;
+
+   if (!kexec_can_hand_over_buffer())
+   return;
+
+   kexec_segment_size = ALIGN(ima_get_binary_runtime_size() + PAGE_SIZE,
+  PAGE_SIZE);
+
+   if (kexec_segment_size >= (ULONG_MAX - sizeof(long))) {
+   pr_err("Binary measurement list too large.\n");
+   return;
+   }
+
+   /* Ask not to checksum the segment, we will update it later. */
+   kbuf.buffer = NULL;
+   kbuf.bufsz = 0;
+   kbuf.memsz = 

[PATCH 4/7] ima: serialize the binary_runtime_measurements

2016-08-04 Thread Mimi Zohar
The TPM PCRs are only reset on a hard reboot.  In order to validate a
TPM's quote after a soft reboot (e.g. kexec -e), the IMA measurement list
of the running kernel must be saved and restored on boot.  This patch
serializes the IMA measurement list in the binary_runtime_measurements
format.

Signed-off-by: Mimi Zohar 
---
 security/integrity/ima/ima.h   |  1 +
 security/integrity/ima/ima_fs.c|  2 +-
 security/integrity/ima/ima_kexec.c | 51 ++
 3 files changed, 53 insertions(+), 1 deletion(-)

diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 222668f..f972296 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -132,6 +132,7 @@ struct ima_template_desc *ima_template_desc_current(void);
 void ima_load_kexec_buffer(void);
 int ima_restore_measurement_entry(struct ima_template_entry *entry);
 int ima_restore_measurement_list(loff_t bufsize, void *buf);
+int ima_measurements_show(struct seq_file *m, void *v);
 unsigned long ima_get_binary_runtime_size(void);
 int ima_init_template(void);
 
diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index c07a384..66e5dd5 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -116,7 +116,7 @@ void ima_putc(struct seq_file *m, void *data, int datalen)
  *   [eventdata length]
  *   eventdata[n]=template specific data
  */
-static int ima_measurements_show(struct seq_file *m, void *v)
+int ima_measurements_show(struct seq_file *m, void *v)
 {
/* the list never shrinks, so we don't need a lock here */
struct ima_queue_entry *qe = v;
diff --git a/security/integrity/ima/ima_kexec.c 
b/security/integrity/ima/ima_kexec.c
index 6a046ad..e77ca9d 100644
--- a/security/integrity/ima/ima_kexec.c
+++ b/security/integrity/ima/ima_kexec.c
@@ -23,6 +23,57 @@
 
 #include "ima.h"
 
+static int ima_dump_measurement_list(unsigned long *buffer_size, void **buffer,
+unsigned long segment_size)
+{
+   struct ima_queue_entry *qe;
+   struct seq_file file;
+   struct ima_kexec_hdr khdr = {
+   .version = 1, .buffer_size = 0, .count = 0};
+   int ret = 0;
+
+   /* segment size can't change between kexec load and execute */
+   file.buf = vmalloc(segment_size);
+   if (!file.buf) {
+   ret = -ENOMEM;
+   goto out;
+   }
+
+   file.size = segment_size;
+   file.read_pos = 0;
+   file.count = sizeof(khdr);  /* reserved space */
+
+   list_for_each_entry_rcu(qe, &ima_measurements, later) {
+   if (file.count < file.size) {
+   khdr.count++;
+   ima_measurements_show(&file, qe);
+   } else {
+   ret = -EINVAL;
+   break;
+   }
+   }
+
+   if (ret < 0)
+   goto out;
+
+   /*
+* fill in reserved space with some buffer details
+* (eg. version, buffer size, number of measurements)
+*/
+   khdr.buffer_size = file.count;
+   memcpy(file.buf, &khdr, sizeof(khdr));
+   print_hex_dump(KERN_DEBUG, "ima dump: ", DUMP_PREFIX_NONE,
+   16, 1, file.buf,
+   file.count < 100 ? file.count : 100, true);
+
+   *buffer_size = file.count;
+   *buffer = file.buf;
+out:
+   if (ret == -EINVAL)
+   vfree(file.buf);
+   return ret;
+}
+
 /*
  * Restore the measurement list from the previous kernel.
  */
-- 
2.1.0
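
The "reserve space for the header, fill it in last" pattern used by ima_dump_measurement_list() can be sketched in userspace C as below. The names struct dump_hdr and dump_records() are hypothetical, fixed-size records stand in for the variable-length measurement entries, and the bounds check is done per record rather than relying on seq_file overflow behaviour as the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical header, modelled on struct ima_kexec_hdr in this series. */
struct dump_hdr {
    unsigned short version;
    unsigned long buffer_size;
    unsigned long count;
};

/*
 * Serialize n records of rec_len bytes each into buf (capacity cap).
 * Space for the header is reserved first and filled in last, once the
 * final size and record count are known.
 * Returns the number of bytes used, or 0 if the records do not fit.
 */
static size_t dump_records(unsigned char *buf, size_t cap,
                           const unsigned char *recs, size_t rec_len,
                           size_t n)
{
    struct dump_hdr hdr = { .version = 1, .buffer_size = 0, .count = 0 };
    size_t used = sizeof(hdr);  /* reserved space for the header */
    size_t i;

    assert(buf && recs);
    if (cap < used)
        return 0;

    for (i = 0; i < n; i++) {
        if (rec_len > cap - used)
            return 0;           /* would overrun the segment */
        memcpy(buf + used, recs + i * rec_len, rec_len);
        used += rec_len;
        hdr.count++;
    }

    /* Now that the totals are known, fill in the reserved space. */
    hdr.buffer_size = used;
    memcpy(buf, &hdr, sizeof(hdr));
    return used;
}
```

Writing the header last means a single pass over the list is enough; the reader on the other side can validate buffer_size and count before parsing any entry.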



[PATCH 3/7] ima: maintain memory size needed for serializing the measurement list

2016-08-04 Thread Mimi Zohar
In preparation for serializing the binary_runtime_measurements, this patch
maintains the amount of memory required.

Signed-off-by: Mimi Zohar 
---
 security/integrity/ima/Kconfig | 12 ++
 security/integrity/ima/ima.h   |  1 +
 security/integrity/ima/ima_queue.c | 49 --
 3 files changed, 60 insertions(+), 2 deletions(-)

diff --git a/security/integrity/ima/Kconfig b/security/integrity/ima/Kconfig
index 0fb54d3..d764027 100644
--- a/security/integrity/ima/Kconfig
+++ b/security/integrity/ima/Kconfig
@@ -27,6 +27,18 @@ config IMA
  to learn more about IMA.
  If unsure, say N.
 
+config IMA_KEXEC
+   bool "Enable carrying the IMA measurement list across a soft boot"
+   depends on IMA && TCG_TPM && KEXEC_FILE
+   default n
+   help
+  TPM PCRs are only reset on a hard reboot.  In order to validate
+  a TPM's quote after a soft boot, the IMA measurement list of the
+  running kernel must be saved and restored on boot.
+
+  Depending on the IMA policy, the measurement list can grow to
+  be very large.
+
 config IMA_MEASURE_PCR_IDX
int
depends on IMA
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 84e8d36..222668f 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -132,6 +132,7 @@ struct ima_template_desc *ima_template_desc_current(void);
 void ima_load_kexec_buffer(void);
 int ima_restore_measurement_entry(struct ima_template_entry *entry);
 int ima_restore_measurement_list(loff_t bufsize, void *buf);
+unsigned long ima_get_binary_runtime_size(void);
 int ima_init_template(void);
 
 /*
diff --git a/security/integrity/ima/ima_queue.c 
b/security/integrity/ima/ima_queue.c
index 12d1b04..8f0661b 100644
--- a/security/integrity/ima/ima_queue.c
+++ b/security/integrity/ima/ima_queue.c
@@ -29,6 +29,11 @@
 #define AUDIT_CAUSE_LEN_MAX 32
 
 LIST_HEAD(ima_measurements);   /* list of all measurements */
+#ifdef CONFIG_IMA_KEXEC
+static unsigned long binary_runtime_size;
+#else
+static unsigned long binary_runtime_size = ULONG_MAX;
+#endif
 
 /* key: inode (before secure-hashing a file) */
 struct ima_h_table ima_htable = {
@@ -64,6 +69,24 @@ static struct ima_queue_entry *ima_lookup_digest_entry(u8 
*digest_value,
return ret;
 }
 
+/*
+ * Calculate the memory required for serializing a single
+ * binary_runtime_measurement list entry, which contains a
+ * couple of variable length fields (e.g. template name and data).
+ */
+static int get_binary_runtime_size(struct ima_template_entry *entry)
+{
+   int size = 0;
+
+   size += sizeof(u32);/* pcr */
+   size += sizeof(entry->digest);
+   size += sizeof(int);/* template name size field */
+   size += strlen(entry->template_desc->name);
+   size += sizeof(entry->template_data_len);
+   size += entry->template_data_len;
+   return size;
+}
+
 /* ima_add_template_entry helper function:
  * - Add template entry to the measurement list and hash table, for
  *   all entries except those carried across kexec.
@@ -90,9 +113,26 @@ static int ima_add_digest_entry(struct ima_template_entry 
*entry, int flags)
key = ima_hash_key(entry->digest);
hlist_add_head_rcu(&qe->hnext, &ima_htable.queue[key]);
}
+
+   if (binary_runtime_size != ULONG_MAX) {
+   int size;
+
+   size = get_binary_runtime_size(entry);
+   binary_runtime_size = (binary_runtime_size < ULONG_MAX - size) ?
+binary_runtime_size + size : ULONG_MAX;
+   }
return 0;
 }
 
+/*
+ * Return the amount of memory required for serializing the
+ * entire binary_runtime_measurement list.
+ */
+unsigned long ima_get_binary_runtime_size(void)
+{
+   return binary_runtime_size;
+};
+
 static int ima_pcr_extend(const u8 *hash, int pcr)
 {
int result = 0;
@@ -106,8 +146,13 @@ static int ima_pcr_extend(const u8 *hash, int pcr)
return result;
 }
 
-/* Add template entry to the measurement list and hash table,
- * and extend the pcr.
+/*
+ * Add template entry to the measurement list and hash table, and
+ * extend the pcr.
+ *
+ * On systems which support carrying the IMA measurement list across
+ * kexec, maintain the total memory size required for serializing the
+ * binary_runtime_measurements.
  */
 int ima_add_template_entry(struct ima_template_entry *entry, int violation,
   const char *op, struct inode *inode,
-- 
2.1.0
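
The size accounting in this patch combines a per-entry size calculation with an addition that saturates at ULONG_MAX. A minimal userspace sketch follows; runtime_size_add() and entry_size() are hypothetical names, DIGEST_SIZE of 20 is an assumed SHA-1 sized digest, and unsigned int stands in for the kernel's u32 and int length fields.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define DIGEST_SIZE 20  /* assumed digest length, for illustration */

/* Running total; ULONG_MAX means "overflowed" or "not tracked". */
static unsigned long runtime_size;

/* Add size to the running total, sticking at ULONG_MAX on overflow. */
static void runtime_size_add(unsigned long size)
{
    if (runtime_size == ULONG_MAX)
        return;
    runtime_size = (runtime_size < ULONG_MAX - size) ?
                   runtime_size + size : ULONG_MAX;
}

/* Bytes needed to serialize one entry: fixed fields plus two variable ones. */
static unsigned long entry_size(const char *template_name,
                                unsigned int data_len)
{
    assert(template_name);
    return sizeof(unsigned int)      /* pcr */
         + DIGEST_SIZE               /* digest */
         + sizeof(unsigned int)      /* template name length field */
         + strlen(template_name)     /* template name */
         + sizeof(unsigned int)      /* template data length field */
         + data_len;                 /* template data */
}
```

Using ULONG_MAX as a sticky sentinel lets the same variable mean "tracking disabled" when the feature is compiled out, which is how the patch initialises it without CONFIG_IMA_KEXEC.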



[PATCH 2/7] ima: permit duplicate measurement list entries

2016-08-04 Thread Mimi Zohar
Measurements carried across kexec need to be added to the IMA
measurement list, but should not prevent measurements of the newly
booted kernel from being added to the measurement list. This patch
adds support for allowing duplicate measurements.

The "boot_aggregate" measurement entry is the delimiter between soft
boots.

Signed-off-by: Mimi Zohar 
---
 security/integrity/ima/ima_queue.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/security/integrity/ima/ima_queue.c 
b/security/integrity/ima/ima_queue.c
index 4b1bb77..12d1b04 100644
--- a/security/integrity/ima/ima_queue.c
+++ b/security/integrity/ima/ima_queue.c
@@ -65,11 +65,12 @@ static struct ima_queue_entry *ima_lookup_digest_entry(u8 
*digest_value,
 }
 
 /* ima_add_template_entry helper function:
- * - Add template entry to measurement list and hash table.
+ * - Add template entry to the measurement list and hash table, for
+ *   all entries except those carried across kexec.
  *
  * (Called with ima_extend_list_mutex held.)
  */
-static int ima_add_digest_entry(struct ima_template_entry *entry)
+static int ima_add_digest_entry(struct ima_template_entry *entry, int flags)
 {
struct ima_queue_entry *qe;
unsigned int key;
@@ -85,8 +86,10 @@ static int ima_add_digest_entry(struct ima_template_entry 
*entry)
list_add_tail_rcu(&qe->later, &ima_measurements);
 
atomic_long_inc(&ima_htable.len);
-   key = ima_hash_key(entry->digest);
-   hlist_add_head_rcu(&qe->hnext, &ima_htable.queue[key]);
+   if (flags) {
+   key = ima_hash_key(entry->digest);
+   hlist_add_head_rcu(&qe->hnext, &ima_htable.queue[key]);
+   }
+   }
return 0;
 }
 
@@ -126,7 +129,7 @@ int ima_add_template_entry(struct ima_template_entry 
*entry, int violation,
}
}
 
-   result = ima_add_digest_entry(entry);
+   result = ima_add_digest_entry(entry, 1);
if (result < 0) {
audit_cause = "ENOMEM";
audit_info = 0;
@@ -155,7 +158,7 @@ int ima_restore_measurement_entry(struct ima_template_entry 
*entry)
int result = 0;
 
mutex_lock(&ima_extend_list_mutex);
-   result = ima_add_digest_entry(entry);
+   result = ima_add_digest_entry(entry, 0);
mutex_unlock(&ima_extend_list_mutex);
return result;
 }
-- 
2.1.0
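
The effect of the new flags argument can be sketched in userspace C: every entry is appended to the measurement list, but only entries added by the running kernel become visible to the duplicate check. All names here (struct entry, add_entry(), measure(), restore()) are hypothetical, a flat array with an "indexed" flag stands in for the list plus hash table, and a detected duplicate simply returns 0, which is a simplification of the real error handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 16
#define DIGEST_LEN  4   /* shortened digest, for illustration only */

struct entry {
    unsigned char digest[DIGEST_LEN];
    bool indexed;       /* true if visible to duplicate lookups */
};

static struct entry log_entries[MAX_ENTRIES];
static size_t log_len;

/* Return true if an indexed entry with this digest already exists. */
static bool digest_known(const unsigned char *digest)
{
    size_t i;

    for (i = 0; i < log_len; i++) {
        if (log_entries[i].indexed &&
            memcmp(log_entries[i].digest, digest, DIGEST_LEN) == 0)
            return true;
    }
    return false;
}

/* Always append to the log; only index the entry when asked to. */
static int add_entry(const unsigned char *digest, bool index)
{
    assert(digest);
    if (log_len == MAX_ENTRIES)
        return -1;
    memcpy(log_entries[log_len].digest, digest, DIGEST_LEN);
    log_entries[log_len].indexed = index;
    log_len++;
    return 0;
}

/* New measurement: skipped if an indexed duplicate exists. */
static int measure(const unsigned char *digest)
{
    if (digest_known(digest))
        return 0;
    return add_entry(digest, true);
}

/* Measurement carried across kexec: recorded, but never indexed. */
static int restore(const unsigned char *digest)
{
    return add_entry(digest, false);
}
```

Because restored entries are never indexed, a file measured by the previous kernel is measured again by the new one, which is the duplicate the commit message asks to permit.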



[PATCH 1/7] ima: on soft reboot, restore the measurement list

2016-08-04 Thread Mimi Zohar
The TPM PCRs are only reset on a hard reboot.  In order to validate a
TPM's quote after a soft reboot (e.g. kexec -e), the IMA measurement list
of the running kernel must be saved and restored on boot.  This patch
restores the measurement list.

Changelog:
- call ima_load_kexec_buffer() (Thiago)

Signed-off-by: Mimi Zohar 
---
 security/integrity/ima/Makefile   |   1 +
 security/integrity/ima/ima.h  |  10 ++
 security/integrity/ima/ima_init.c |   2 +
 security/integrity/ima/ima_kexec.c|  55 +++
 security/integrity/ima/ima_queue.c|  10 ++
 security/integrity/ima/ima_template.c | 171 ++
 6 files changed, 249 insertions(+)
 create mode 100644 security/integrity/ima/ima_kexec.c

diff --git a/security/integrity/ima/Makefile b/security/integrity/ima/Makefile
index c34599f..c0ce7b1 100644
--- a/security/integrity/ima/Makefile
+++ b/security/integrity/ima/Makefile
@@ -8,4 +8,5 @@ obj-$(CONFIG_IMA) += ima.o
 ima-y := ima_fs.o ima_queue.o ima_init.o ima_main.o ima_crypto.o ima_api.o \
 ima_policy.o ima_template.o ima_template_lib.o ima_buffer.o
 ima-$(CONFIG_IMA_APPRAISE) += ima_appraise.o
+ima-$(CONFIG_KEXEC_FILE) += ima_kexec.o
 obj-$(CONFIG_IMA_BLACKLIST_KEYRING) += ima_mok.o
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index b5728da..84e8d36 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -102,6 +102,13 @@ struct ima_queue_entry {
 };
 extern struct list_head ima_measurements;  /* list of all measurements */
 
+/* Some details preceding the binary serialized measurement list */
+struct ima_kexec_hdr {
+   unsigned short version;
+   unsigned long buffer_size;
+   unsigned long count;
+} __packed;
+
 /* Internal IMA function definitions */
 int ima_init(void);
 int ima_fs_init(void);
@@ -122,6 +129,9 @@ int ima_init_crypto(void);
 void ima_putc(struct seq_file *m, void *data, int datalen);
 void ima_print_digest(struct seq_file *m, u8 *digest, u32 size);
 struct ima_template_desc *ima_template_desc_current(void);
+void ima_load_kexec_buffer(void);
+int ima_restore_measurement_entry(struct ima_template_entry *entry);
+int ima_restore_measurement_list(loff_t bufsize, void *buf);
 int ima_init_template(void);
 
 /*
diff --git a/security/integrity/ima/ima_init.c 
b/security/integrity/ima/ima_init.c
index 32912bd..3ba0ca4 100644
--- a/security/integrity/ima/ima_init.c
+++ b/security/integrity/ima/ima_init.c
@@ -128,6 +128,8 @@ int __init ima_init(void)
if (rc != 0)
return rc;
 
+   ima_load_kexec_buffer();
+
rc = ima_add_boot_aggregate();  /* boot aggregate must be first entry */
if (rc != 0)
return rc;
diff --git a/security/integrity/ima/ima_kexec.c 
b/security/integrity/ima/ima_kexec.c
new file mode 100644
index 0000000..6a046ad
--- /dev/null
+++ b/security/integrity/ima/ima_kexec.c
@@ -0,0 +1,55 @@
+/*
+ * Copyright (C) 2016 IBM Corporation
+ *
+ * Authors:
+ * Thiago Jung Bauermann 
+ * Mimi Zohar 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "ima.h"
+
+/*
+ * Restore the measurement list from the previous kernel.
+ */
+void ima_load_kexec_buffer(void)
+{
+   void *kexec_buffer = NULL;
+   size_t kexec_buffer_size = 0;
+   int rc;
+
+   rc = kexec_get_handover_buffer(&kexec_buffer, &kexec_buffer_size);
+   switch (rc) {
+   case 0:
+   rc = ima_restore_measurement_list(kexec_buffer_size,
+ kexec_buffer);
+   if (rc != 0)
+   pr_err("Failed to restore the measurement list: %d\n",
+   rc);
+
+   kexec_free_handover_buffer();
+   break;
+   case -ENOTSUPP:
+   pr_debug("Restoring the measurement list not supported\n");
+   break;
+   case -ENOENT:
+   pr_debug("No measurement list to restore\n");
+   break;
+   default:
+   pr_debug("Error restoring the measurement list: %d\n", rc);
+   }
+}
diff --git a/security/integrity/ima/ima_queue.c 
b/security/integrity/ima/ima_queue.c
index 32f6ac0..4b1bb77 100644
--- a/security/integrity/ima/ima_queue.c
+++ b/security/integrity/ima/ima_queue.c
@@ -149,3 +149,13 @@ out:
op, audit_cause, result, audit_info);
return result;
 }
+
+int ima_restore_measurement_entry(struct ima_template_entry *entry)
+{
+   int result = 0;
+
+   mutex_lock(&ima_extend_list_mutex);
+   result = 

[PATCH 0/7] ima: carry the measurement list across kexec

2016-08-04 Thread Mimi Zohar
The TPM PCRs are only reset on a hard reboot.  In order to validate a
TPM's quote after a soft reboot (e.g. kexec -e), the IMA measurement list
of the running kernel must be saved and then restored on the subsequent
boot.

The existing securityfs binary_runtime_measurements file conveniently
provides a serialized format of the IMA measurement list. This patch
set serializes the measurement list in this format and restores it.

This patch set depends on Thiago Bauermann's "kexec_file: Add buffer
hand-over for the next kernel" patch set*, which actually carries the
serialized measurement list across the kexec.

Mimi

*https://lists.infradead.org/pipermail/kexec/2016-June/016157.html

Mimi Zohar (6):
  ima: on soft reboot, restore the measurement list
  ima: permit duplicate measurement list entries
  ima: maintain memory size needed for serializing the measurement list
  ima: serialize the binary_runtime_measurements
  ima: store the builtin/custom template definitions in a list
  ima: support restoring multiple template formats

Thiago Jung Bauermann (1):
  ima: on soft reboot, save the measurement list

 include/linux/ima.h   |  15 ++
 kernel/kexec_file.c   |   3 +
 security/integrity/ima/Kconfig|  12 ++
 security/integrity/ima/Makefile   |   1 +
 security/integrity/ima/ima.h  |  14 ++
 security/integrity/ima/ima_fs.c   |   2 +-
 security/integrity/ima/ima_init.c |   2 +
 security/integrity/ima/ima_kexec.c| 189 
 security/integrity/ima/ima_main.c |   1 +
 security/integrity/ima/ima_queue.c|  72 +-
 security/integrity/ima/ima_template.c | 262 --
 11 files changed, 556 insertions(+), 17 deletions(-)
 create mode 100644 security/integrity/ima/ima_kexec.c

-- 
2.1.0



Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-04 Thread Arnd Bergmann
On Thursday, August 4, 2016 9:47:13 PM CEST Nicholas Piggin wrote:
> On Thu, 04 Aug 2016 12:37:41 +0200 Arnd Bergmann  wrote:
> > On Thursday, August 4, 2016 11:00:49 AM CEST Arnd Bergmann wrote:
> > > I tried this
> > > 
> > > diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
> > > index b5e40ed86e60..89bca1a25916 100755
> > > --- a/scripts/link-vmlinux.sh
> > > +++ b/scripts/link-vmlinux.sh
> > > @@ -44,7 +44,7 @@ modpost_link()
> > > local objects
> > >  
> > > if [ -n "${CONFIG_THIN_ARCHIVES}" ]; then
> > > -   objects="--whole-archive ${KBUILD_VMLINUX_INIT} 
> > > ${KBUILD_VMLINUX_MAIN} --no-whole-archive"
> > > +   objects="${KBUILD_VMLINUX_INIT} ${KBUILD_VMLINUX_MAIN}"
> > > else
> > > objects="${KBUILD_VMLINUX_INIT} --start-group 
> > > ${KBUILD_VMLINUX_MAIN} --end-group"
> > > fi
> > > 
> > > but that did not seem to change anything, the extra symbols are
> > > still there. I have not tried to understand what that actually
> > > does, so maybe I misunderstood your suggestion.
> > >   
> > 
> > On a second attempt, I did the same change for vmlinux instead of the
> > module (d'oh), and got a link failure instead:
> > 
> > 
> > arch/arm/mm/proc-xscale.o: In function `cpu_xscale_do_resume':
> > (.text+0x3d4): undefined reference to `cpu_resume_mmu'
> > arch/arm/kernel/setup.o: In function `setup_arch':
> > ...
> > 
> > However, I also see a link failure in some rare configurations
> > with just your patch:
> > 
> > arch/arm/lib/lib.a(io-acorn.o): In function `outsl':
> > (.text+0x38): undefined reference to `printk'
> > 
> > The problem being a file in a library object that is not referenced,
> > but that references another symbol that is not defined
> > (CONFIG_PRINTK=n).
> 
> The first problem is the existing link system is buggy. I think an
> unconditional switch to --whole-archive (at least for modular kernels)
> should probably be done anyway. For example, on powerpc when building
> with --whole-archive, I have:
> 
> +dma_noop_alloc
> +dma_noop_free
> +dma_noop_map_page
> +dma_noop_mapping_error
> +dma_noop_map_sg
> +dma_noop_ops
> +dma_noop_supported
> +fdt_add_reservemap_entry
> +fdt_begin_node
> +fdt_create
> +fdt_create_empty_tree
> +fdt_end_node
> +fdt_errtable
> +find_cpio_data
> +ioremap_page_range
> 
> find_cpio_data is unnecessary and it's a codesize regression to link it.
> But dma_noop_ops and ioremap_page_range are exported symbols. If I
> reference dma_noop_ops from some random module with otherwise unpatched
> kernel:
> 
> ERROR: "dma_noop_ops" [drivers/char/bsr.ko] undefined!

Right, but only on s390, which is the one architecture using this.
I think we should just have a Kconfig symbol for this file that
gets selected by any architecture that needs it.

This is also what we have ended up doing for almost all other
files in lib/

> The real problem is that our linkage requirements are like a shared
> library when we build modular.
> 
> We could build a list of exports and make it link objects with those
> symbols, to solve this, but IMO that's just wasting lipstick on a pig.
> But I will propose a patch to always use --whole-archive, thin
> archives or not, and transition all archs over to it in a few release
> cycles. It just works by luck right now.
>
> Why is it a pig? Because having the linker notice no external
> references and just skipping the .o completely is trying to use a hammer
> as a scalpel. It's just not a very effective way to eliminate dead code
> --  I pulled in only a handful of unneeded functions by switching it.

If we do that, we may just as well get rid of $(lib-y) in the process and
always use $(obj-y).

> I mean it is a quick simple feature that probably works well enough with
> simple build systems. But not an advanced one that builds almost
> everything on demand and also has loadable modules and must act like a
> shared library.
> 
> Real linker DCE is a valid optimisation that can't be replaced by the
> build system of course, but we need to do it properly. Here's what I'm
> working on.
> 
> It applies on top of the previous patch I sent, plus some powerpc stuff
> I'm working on that you should be able to just ignore for another arch.
> it's a WIP, but if you can see if it works for arm that would be cool.
> 
> It doesn't actually build allyesconfig after this,
> ld: .tmp_vmlinux1: Too many sections: 220655 (>= 65280)
> 
> But on a more reasonable configuration (ppc64le)
>     text      data       bss       dec   filename
> 11191672   1183536   1923820  14299028   vmlinux
> 10625528    861895   1919707  13407130   vmlinux.thin+gc
> 
> 10M-552K   1M-314K ~   13M-870K

Nice!

> And it actually boots too, which is fairly astounding considering that
> it lost half a meg of code and 1/3 of its data. I'm not completely sure
> I've not done something wrong...

Nicolas Pitre has done some related work, adding him to Cc. IIRC we have

Re: [PATCH] powerpc/book3s: Fix MCE console messages for unrecoverable MCE.

2016-08-04 Thread Mahesh Jagannath Salgaonkar
On 08/04/2016 03:27 PM, Michael Ellerman wrote:
> Mahesh J Salgaonkar  writes:
> 
>> From: Mahesh Salgaonkar 
>>
>> When machine check occurs with MSR(RI=0), it means MC interrupt is
>> unrecoverable and kernel goes down to panic path. But the console
>> message still shows it as recovered. This patch fixes the MCE console
>> messages.
>>
>> Signed-off-by: Mahesh Salgaonkar 
>> ---
>>  arch/powerpc/kernel/mce.c |3 ++-
>>  arch/powerpc/platforms/powernv/opal.c |2 ++
>>  2 files changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
>> index ef267fd..5e7ece0 100644
>> --- a/arch/powerpc/kernel/mce.c
>> +++ b/arch/powerpc/kernel/mce.c
>> @@ -92,7 +92,8 @@ void save_mce_event(struct pt_regs *regs, long handled,
>>  mce->in_use = 1;
>>  
>>  mce->initiator = MCE_INITIATOR_CPU;
>> -if (handled)
>> +/* Mark it recovered if we have handled it and MSR(RI=1). */
>> +if (handled && (regs->msr & MSR_RI))
>>  mce->disposition = MCE_DISPOSITION_RECOVERED;
> 
> This seems like it has bigger implications than just changing the
> printk output? We're now (correctly) marking any MC where RI=0 as
> unrecoverable.
> 
> Or is the only place that uses this the code below which *also* checks
> MSR_RI?

The code below always checks MSR_RI and panics correctly. The only
problem was that we always printed the event as recovered and then panicked.

> 
>> diff --git a/arch/powerpc/platforms/powernv/opal.c 
>> b/arch/powerpc/platforms/powernv/opal.c
>> index 5385434..8154171 100644
>> --- a/arch/powerpc/platforms/powernv/opal.c
>> +++ b/arch/powerpc/platforms/powernv/opal.c
>> @@ -401,6 +401,8 @@ static int opal_recover_mce(struct pt_regs *regs,
>>  
>>  if (!(regs->msr & MSR_RI)) {
>>  /* If MSR_RI isn't set, we cannot recover */
> 
> Why do we check MSR_RI again here? Shouldn't we just be looking at the 
> evt->disposition?

When MSR_RI=0, the SRR0/SRR1 register values have been trashed, so the
kernel cannot continue reliably if we return from the interrupt. It should
definitely go down the panic path. Hence we check for RI=0 and return 0.
Whereas, if MSR_RI=1 and the disposition is "unrecovered", we can minimise
the damage to the user process if this MCE was hit in user space context.

The print is just to tell that the kernel panicked because the MCE occurred
during a rare window where the MSR RI bit was zero, and not because the
handler could not fix the error.
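
The decision described above can be sketched as a small pure function. This is a userspace illustration only: classify_mce(), the enum names and SKETCH_MSR_RI are made up for the example, the bit value is illustrative (the kernel defines MSR_RI in its own headers), and falling back to panic for an unhandled MCE in kernel context is an assumption rather than something stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit value; not taken from the kernel headers. */
#define SKETCH_MSR_RI 0x2UL

enum mce_outcome {
    MCE_PANIC,      /* cannot safely return from the interrupt */
    MCE_RECOVERED,  /* RI=1 and the error was handled */
    MCE_KILL_TASK   /* RI=1, not handled, hit in user space */
};

/*
 * Classify a machine check from the saved MSR, whether the handler
 * fixed the error, and whether it was taken in user space context.
 */
static enum mce_outcome classify_mce(unsigned long msr, bool handled,
                                     bool user_mode)
{
    /* RI=0: SRR0/SRR1 are not trustworthy, so always panic. */
    if (!(msr & SKETCH_MSR_RI))
        return MCE_PANIC;

    if (handled)
        return MCE_RECOVERED;

    /* Unrecovered but RI=1: limit the damage to the user process. */
    return user_mode ? MCE_KILL_TASK : MCE_PANIC;
}
```

Checking RI first is the point of the patch: an event with RI=0 is never reported as recovered, whatever the handler returned.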

> 
>> +printk(KERN_ERR "Machine check interrupt unrecoverable:"
>> +" MSR(RI=0)\n");
> 
> Are we sure it's safe to call printk() there?

Yes, we had just printed MCE event info before we came here.

> 
> Please don't split the message across lines, and use pr_err() like the
> rest of the code in this file. So it would be:
> 
>   pr_err("Machine check interrupt unrecoverable: MSR(RI=0)\n");

Sure. Will make the change.

> 
>>  recovered = 0;
>>  } else if (evt->disposition == MCE_DISPOSITION_RECOVERED) {
>>  /* Platform corrected itself */
> 
> cheers
> 



Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-04 Thread Nicholas Piggin
On Thu, 04 Aug 2016 12:37:41 +0200
Arnd Bergmann  wrote:

> On Thursday, August 4, 2016 11:00:49 AM CEST Arnd Bergmann wrote:
> > I tried this
> > 
> > diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
> > index b5e40ed86e60..89bca1a25916 100755
> > --- a/scripts/link-vmlinux.sh
> > +++ b/scripts/link-vmlinux.sh
> > @@ -44,7 +44,7 @@ modpost_link()
> > local objects
> >  
> > if [ -n "${CONFIG_THIN_ARCHIVES}" ]; then
> > -   objects="--whole-archive ${KBUILD_VMLINUX_INIT} 
> > ${KBUILD_VMLINUX_MAIN} --no-whole-archive"
> > +   objects="${KBUILD_VMLINUX_INIT} ${KBUILD_VMLINUX_MAIN}"
> > else
> > objects="${KBUILD_VMLINUX_INIT} --start-group 
> > ${KBUILD_VMLINUX_MAIN} --end-group"
> > fi
> > 
> > but that did not seem to change anything, the extra symbols are
> > still there. I have not tried to understand what that actually
> > does, so maybe I misunderstood your suggestion.
> >   
> 
> On a second attempt, I did the same change for vmlinux instead of the
> module (d'oh), and got a link failure instead:
> 
> 
> arch/arm/mm/proc-xscale.o: In function `cpu_xscale_do_resume':
> (.text+0x3d4): undefined reference to `cpu_resume_mmu'
> arch/arm/kernel/setup.o: In function `setup_arch':
> setup.c:(.init.text+0x910): undefined reference to `init_uts_ns'
> kernel/nsproxy.o:(.data+0x4): undefined reference to `init_uts_ns'
> kernel/sched/core.o: In function `update_rq_clock':
> core.c:(.text+0x6d8): undefined reference to `paravirt_steal_rq_enabled'
> core.c:(.text+0x6dc): undefined reference to `pv_time_ops'
> kernel/sched/cputime.o: In function `account_process_tick':
> cputime.c:(.text+0x794): undefined reference to `paravirt_steal_enabled'
> cputime.c:(.text+0x7a0): undefined reference to `pv_time_ops'
> kernel/locking/lockdep.o: In function `save_trace':
> lockdep.c:(.text+0xfe8): undefined reference to `save_stack_trace'
> kernel/module.o: In function `load_module':
> module.c:(.text+0x1b54): undefined reference to `elf_check_arch'
> module.c:(.text+0x2024): undefined reference to `apply_relocate'
> kernel/debug/debug_core.o: In function `kgdb_unregister_io_module':
> debug_core.c:(.text+0x2e4): undefined reference to `kgdb_arch_exit'
> kernel/debug/debug_core.o: In function `kgdb_arch_set_breakpoint':
> debug_core.c:(.text+0x3bc): undefined reference to `arch_kgdb_ops'
> kernel/debug/debug_core.o: In function `dbg_remove_all_break':
> debug_core.c:(.text+0x6d0): undefined reference to `arch_kgdb_ops'
> ...
> 
> However, I also see a link failure in some rare configurations
> with just your patch:
> 
> arch/arm/lib/lib.a(io-acorn.o): In function `outsl':
> (.text+0x38): undefined reference to `printk'
> 
> The problem being a file in a library object that is not referenced,
> but that references another symbol that is not defined
> (CONFIG_PRINTK=n).

The first problem is the existing link system is buggy. I think an
unconditional switch to --whole-archive (at least for modular kernels)
should probably be done anyway. For example, on powerpc when building
with --whole-archive, I have:

+dma_noop_alloc
+dma_noop_free
+dma_noop_map_page
+dma_noop_mapping_error
+dma_noop_map_sg
+dma_noop_ops
+dma_noop_supported
+fdt_add_reservemap_entry
+fdt_begin_node
+fdt_create
+fdt_create_empty_tree
+fdt_end_node
+fdt_errtable
+find_cpio_data
+ioremap_page_range

find_cpio_data is unnecessary and it's a codesize regression to link it.
But dma_noop_ops and ioremap_page_range are exported symbols. If I
reference dma_noop_ops from some random module with otherwise unpatched
kernel:

ERROR: "dma_noop_ops" [drivers/char/bsr.ko] undefined!

The real problem is that our linkage requirements are like a shared
library when we build modular.
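
To illustrate why an exported-but-unreferenced symbol goes missing, here
is a toy userspace model of archive member selection. It is a sketch
under simplifying assumptions: the struct, the function names and the
member names in the usage note are all hypothetical, and a real linker
has many more rules than this.

```c
#include <stdbool.h>
#include <string.h>

#define SKETCH_MAX_SYMS 4

/* One object file inside an archive (toy model, not a linker API). */
struct sketch_member {
	const char *name;
	const char *defines[SKETCH_MAX_SYMS]; /* NULL-terminated */
	const char *needs[SKETCH_MAX_SYMS];   /* NULL-terminated */
	bool linked;
};

/* Is sym referenced by the root objects or by any member linked so far? */
static bool sketch_wanted(const struct sketch_member *ms, int n,
			  const char *const *roots, const char *sym)
{
	for (int i = 0; roots[i]; i++)
		if (strcmp(roots[i], sym) == 0)
			return true;
	for (int i = 0; i < n; i++) {
		if (!ms[i].linked)
			continue;
		for (int j = 0; j < SKETCH_MAX_SYMS && ms[i].needs[j]; j++)
			if (strcmp(ms[i].needs[j], sym) == 0)
				return true;
	}
	return false;
}

/* Mark the members that end up in the image; return how many. */
int sketch_link(struct sketch_member *ms, int n,
		const char *const *roots, bool whole_archive)
{
	int linked = 0;
	bool progress = true;

	/* With whole-archive semantics every member is taken up front. */
	for (int i = 0; i < n; i++)
		ms[i].linked = whole_archive;

	/* Otherwise pull a member only if something references it. */
	while (!whole_archive && progress) {
		progress = false;
		for (int i = 0; i < n; i++) {
			if (ms[i].linked)
				continue;
			for (int j = 0; j < SKETCH_MAX_SYMS && ms[i].defines[j]; j++) {
				if (sketch_wanted(ms, n, roots, ms[i].defines[j])) {
					ms[i].linked = true;
					progress = true;
					break;
				}
			}
		}
	}

	for (int i = 0; i < n; i++)
		linked += ms[i].linked;
	return linked;
}
```

In this model a member that only defines dma_noop_ops is skipped unless
something in the image references it, so a module loaded later cannot
find the symbol. Taking the whole archive keeps it.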

We could build a list of exports and make it link objects with those
symbols, to solve this, but IMO that's just wasting lipstick on a pig.
But I will propose a patch to always use --whole-archive, thin
archives or not, and transition all archs over to it in a few release
cycles. It just works by luck right now.

Why is it a pig? Because having the linker notice no external
references and just skip the .o completely is trying to use a hammer
as a scalpel. It's just not a very effective way to eliminate dead code:
I pulled in only a handful of unneeded functions by switching it.

I mean it is a quick simple feature that probably works well enough with
simple build systems. But not an advanced one that builds almost
everything on demand and also has loadable modules and must act like a
shared library.

Real linker DCE is a valid optimisation that can't be replaced by the
build system of course, but we need to do it properly. Here's what I'm
working on.

It applies on top of the previous patch I sent, plus some powerpc stuff
I'm working on that you should be able to just ignore for another arch.
It's a WIP, but if you can see if it works for arm that would be cool.

It doesn't actually 

Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-04 Thread Arnd Bergmann
On Thursday, August 4, 2016 11:00:49 AM CEST Arnd Bergmann wrote:
> I tried this
> 
> diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
> index b5e40ed86e60..89bca1a25916 100755
> --- a/scripts/link-vmlinux.sh
> +++ b/scripts/link-vmlinux.sh
> @@ -44,7 +44,7 @@ modpost_link()
> local objects
>  
> if [ -n "${CONFIG_THIN_ARCHIVES}" ]; then
> -   objects="--whole-archive ${KBUILD_VMLINUX_INIT} 
> ${KBUILD_VMLINUX_MAIN} --no-whole-archive"
> +   objects="${KBUILD_VMLINUX_INIT} ${KBUILD_VMLINUX_MAIN}"
> else
> objects="${KBUILD_VMLINUX_INIT} --start-group 
> ${KBUILD_VMLINUX_MAIN} --end-group"
> fi
> 
> but that did not seem to change anything, the extra symbols are
> still there. I have not tried to understand what that actually
> does, so maybe I misunderstood your suggestion.
> 

On a second attempt, I did the same change for vmlinux instead of the
module (d'oh), and got a link failure instead:


arch/arm/mm/proc-xscale.o: In function `cpu_xscale_do_resume':
(.text+0x3d4): undefined reference to `cpu_resume_mmu'
arch/arm/kernel/setup.o: In function `setup_arch':
setup.c:(.init.text+0x910): undefined reference to `init_uts_ns'
kernel/nsproxy.o:(.data+0x4): undefined reference to `init_uts_ns'
kernel/sched/core.o: In function `update_rq_clock':
core.c:(.text+0x6d8): undefined reference to `paravirt_steal_rq_enabled'
core.c:(.text+0x6dc): undefined reference to `pv_time_ops'
kernel/sched/cputime.o: In function `account_process_tick':
cputime.c:(.text+0x794): undefined reference to `paravirt_steal_enabled'
cputime.c:(.text+0x7a0): undefined reference to `pv_time_ops'
kernel/locking/lockdep.o: In function `save_trace':
lockdep.c:(.text+0xfe8): undefined reference to `save_stack_trace'
kernel/module.o: In function `load_module':
module.c:(.text+0x1b54): undefined reference to `elf_check_arch'
module.c:(.text+0x2024): undefined reference to `apply_relocate'
kernel/debug/debug_core.o: In function `kgdb_unregister_io_module':
debug_core.c:(.text+0x2e4): undefined reference to `kgdb_arch_exit'
kernel/debug/debug_core.o: In function `kgdb_arch_set_breakpoint':
debug_core.c:(.text+0x3bc): undefined reference to `arch_kgdb_ops'
kernel/debug/debug_core.o: In function `dbg_remove_all_break':
debug_core.c:(.text+0x6d0): undefined reference to `arch_kgdb_ops'
...

However, I also see a link failure in some rare configurations
with just your patch:

arch/arm/lib/lib.a(io-acorn.o): In function `outsl':
(.text+0x38): undefined reference to `printk'

The problem being a file in a library object that is not referenced,
but that references another symbol that is not defined
(CONFIG_PRINTK=n).

Arnd


Re: [PATCH] powerpc: Align hot loops of memset() and backwards_memcpy()

2016-08-04 Thread Anton Blanchard
Hi Christophe,

> > Align the hot loops in our assembly implementation of memset()
> > and backwards_memcpy().
> >
> > backwards_memcpy() is called from tcp_v4_rcv(), so we might
> > want to optimise this a little more.
> >
> > Signed-off-by: Anton Blanchard   
> 
> Shouldn't this patch be titled powerpc/64, as powerpc32 has a
> different memset() ?

Yeah, good point. Michael can you make this change if you choose to
merge it? 

Anton

> > ---
> >  arch/powerpc/lib/mem_64.S | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/arch/powerpc/lib/mem_64.S b/arch/powerpc/lib/mem_64.S
> > index 43435c6..eda7a96 100644
> > --- a/arch/powerpc/lib/mem_64.S
> > +++ b/arch/powerpc/lib/mem_64.S
> > @@ -37,6 +37,7 @@ _GLOBAL(memset)
> > clrldi  r5,r5,58
> > mtctr   r0
> > beq 5f
> > +   .balign 16
> >  4: std r4,0(r6)
> > std r4,8(r6)
> > std r4,16(r6)
> > @@ -90,6 +91,7 @@ _GLOBAL(backwards_memcpy)
> > andi.   r0,r6,3
> > mtctr   r7
> > bne 5f
> > +   .balign 16
> >  1: lwz r7,-4(r4)
> > lwzu r8,-8(r4)
> > stw r7,-4(r6)
> >  
> 



Re: [PATCH] crypto: crc32c-vpmsum - Convert to CPU feature based module autoloading

2016-08-04 Thread Anton Blanchard
Hi Michael,

> Is VEC_CRYPTO the right feature?
> 
> That's new power8 crypto stuff.

The vpmsum* instructions are part of the same pipeline as the vcipher*
instructions, introduced in POWER8.

> I thought this only used VMX? (but I haven't looked closely)

Yes, vcipher* and vpmsum* are VMX instructions.

Anton


Re: [Patch v2 01/10] driver/edac/mpc85xx_edac: Fix compiling error

2016-08-04 Thread Borislav Petkov
On Thu, Jul 28, 2016 at 03:30:55PM -0700, York Sun wrote:
> Two symbols are missing if mpc85xx_edac driver is compiled as module.
> 
> Signed-off-by: York Sun 
> ---
> Change log
>   v2: no change
> 
>  arch/powerpc/kernel/pci-common.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/powerpc/kernel/pci-common.c 
> b/arch/powerpc/kernel/pci-common.c
> index 0f7a60f..86bc484 100644
> --- a/arch/powerpc/kernel/pci-common.c
> +++ b/arch/powerpc/kernel/pci-common.c
> @@ -226,6 +226,7 @@ struct pci_controller* pci_find_hose_for_OF_device(struct 
> device_node* node)
>   }
>   return NULL;
>  }
> +EXPORT_SYMBOL(pci_find_hose_for_OF_device);
>  
>  /*
>   * Reads the interrupt pin to determine if interrupt is use by card.
> @@ -1585,6 +1586,7 @@ int early_find_capability(struct pci_controller *hose, 
> int bus, int devfn,
>  {
>   return pci_bus_find_capability(fake_pci_bus(hose, bus), devfn, cap);
>  }
> +EXPORT_SYMBOL(early_find_capability);

Please use get_maintainer.pl in order to CC the proper people when
preparing patches for the kernel.

I've CCed them now.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--


Re: [PATCH v2 3/3] powernv: Fix MCE handler to avoid trashing CR0/CR1 registers.

2016-08-04 Thread Michael Ellerman
Mahesh Jagannath Salgaonkar  writes:

> On 08/04/2016 09:44 AM, Stewart Smith wrote:
>> Mahesh J Salgaonkar  writes:
>>> From: Mahesh Salgaonkar 
>>>
>>> The current implementation of MCE early handling modifies CR0/1 registers
>>> without saving its old values. Fix this by moving early check for
>>> powersaving mode to machine_check_handle_early().
>> 
>> From (internal bug report) it seems as though in a test where one
>> injects continuous SLB Multi Hit errors, this bug could lead to rebooting
>> "due to to Platform error" rather than continuing to recover
>> successfully. It might be a good idea to mention that in commit message
>> here.
>
> This patch does not address the specific internal bug that you are talking
> about. I am still chasing that bug.
> 
>> Also, should this go to stable?
>
> However yes. This should go to stable tree.

Can you please rebase it on to Linus' master.

cheers


Re: [PATCH] powerpc/book3s: Fix MCE console messages for unrecoverable MCE.

2016-08-04 Thread Michael Ellerman
Mahesh J Salgaonkar  writes:

> From: Mahesh Salgaonkar 
>
> When machine check occurs with MSR(RI=0), it means MC interrupt is
> unrecoverable and kernel goes down to panic path. But the console
> message still shows it as recovered. This patch fixes the MCE console
> messages.
>
> Signed-off-by: Mahesh Salgaonkar 
> ---
>  arch/powerpc/kernel/mce.c |3 ++-
>  arch/powerpc/platforms/powernv/opal.c |2 ++
>  2 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
> index ef267fd..5e7ece0 100644
> --- a/arch/powerpc/kernel/mce.c
> +++ b/arch/powerpc/kernel/mce.c
> @@ -92,7 +92,8 @@ void save_mce_event(struct pt_regs *regs, long handled,
>   mce->in_use = 1;
>  
>   mce->initiator = MCE_INITIATOR_CPU;
> - if (handled)
> + /* Mark it recovered if we have handled it and MSR(RI=1). */
> + if (handled && (regs->msr & MSR_RI))
>   mce->disposition = MCE_DISPOSITION_RECOVERED;

This seems like it has bigger implications than just changing the
printk output? We're now (correctly) marking any MC where RI=0 as
unrecoverable.

Or is the only place that uses this the code below which *also* checks
MSR_RI?

> diff --git a/arch/powerpc/platforms/powernv/opal.c 
> b/arch/powerpc/platforms/powernv/opal.c
> index 5385434..8154171 100644
> --- a/arch/powerpc/platforms/powernv/opal.c
> +++ b/arch/powerpc/platforms/powernv/opal.c
> @@ -401,6 +401,8 @@ static int opal_recover_mce(struct pt_regs *regs,
>  
>   if (!(regs->msr & MSR_RI)) {
>   /* If MSR_RI isn't set, we cannot recover */

Why do we check MSR_RI again here? Shouldn't we just be looking at the 
evt->disposition?

> + printk(KERN_ERR "Machine check interrupt unrecoverable:"
> + " MSR(RI=0)\n");

Are we sure it's safe to call printk() there?

Please don't split the message across lines, and use pr_err() like the
rest of the code in this file. So it would be:

pr_err("Machine check interrupt unrecoverable: MSR(RI=0)\n");

>   recovered = 0;
>   } else if (evt->disposition == MCE_DISPOSITION_RECOVERED) {
>   /* Platform corrected itself */

cheers


Re: [PATCH] powerpc/book3s: Fix MCE console messages for unrecoverable MCE.

2016-08-04 Thread Michael Ellerman
Mahesh Jagannath Salgaonkar  writes:

> On 08/04/2016 01:35 PM, Greg KH wrote:
>> On Thu, Aug 04, 2016 at 10:16:48AM +0530, Mahesh J Salgaonkar wrote:
>>> From: Mahesh Salgaonkar 
>>>
>>> When machine check occurs with MSR(RI=0), it means MC interrupt is
>>> unrecoverable and kernel goes down to panic path. But the console
>>> message still shows it as recovered. This patch fixes the MCE console
>>> messages.
>>>
>>> Signed-off-by: Mahesh Salgaonkar 
>> 
>> 
>> 
>> This is not the correct way to submit patches for inclusion in the
>> stable kernel tree.  Please read Documentation/stable_kernel_rules.txt
>> for how to do this properly.
>> 
>> 
>
> Ouch. My mistake. Will follow Documentation/stable_kernel_rules.txt

Additionally, I'd appreciate it if you could add a Fixes: line, which
indicates the commit that introduced the bug you are fixing. It is
documented in Documentation/SubmittingPatches.

In this case it looks like it would be:

Fixes: 36df96f8acaf ("powerpc/book3s: Decode and save machine check event.")

cheers


Re: [RESEND][PATCH v2 2/2] powerpc/fadump: parse fadump reserve memory size based on memory range

2016-08-04 Thread Michael Ellerman
Hari Bathini  writes:
...
>  /**
>   * fadump_calculate_reserve_size(): reserve variable boot area 5% of System 
> RAM
>   *
> @@ -212,12 +262,17 @@ static inline unsigned long 
> fadump_calculate_reserve_size(void)
>  {
>   unsigned long size;
>  
> + /* sets fw_dump.reserve_bootvar */
> + parse_fadump_reserve_mem();
> +
>   /*
>* Check if the size is specified through fadump_reserve_mem= cmdline
>* option. If yes, then use that.
>*/
>   if (fw_dump.reserve_bootvar)
>   return fw_dump.reserve_bootvar;
> + else
> + printk(KERN_INFO "fadump: calculating default boot size\n");
>  
>   /* divide by 20 to get 5% of value */
>   size = memblock_end_of_DRAM() / 20;

The code already knows how to reserve 5% based on the size of the machine's
memory, as long as no commandline parameter is passed. So why can't we
just use that logic?
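
For reference, the existing fallback in the quoted hunk boils down to
something like the following userspace sketch. The function and
parameter names are hypothetical stand-ins (cmdline_size for
fw_dump.reserve_bootvar, dram_end for memblock_end_of_DRAM()), and
whatever the real function does after the quoted lines is left out.

```c
#include <stdint.h>

/*
 * Hypothetical model of the quoted logic: an explicit
 * fadump_reserve_mem= value wins, otherwise fall back to 5% of memory.
 */
uint64_t sketch_fadump_reserve_size(uint64_t cmdline_size, uint64_t dram_end)
{
	/* Size given on the command line: use it as-is. */
	if (cmdline_size)
		return cmdline_size;

	/* Divide by 20 to get 5% of the value. */
	return dram_end / 20;
}
```

So a machine with 20GB of memory and no parameter would get 1GB in this
model, without any range table on the command line.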

cheers


Re: [PATCH] powerpc/xics: Properly set Edge/Level type and enable resend

2016-08-04 Thread Benjamin Herrenschmidt
On Thu, 2016-08-04 at 14:40 +1000, Michael Ellerman wrote:
> > Finally now that edge interrupts are properly identified, we can enable
> > CONFIG_HARDIRQS_SW_RESEND which will make the core re-send them if
> > they occur while masked, which some drivers rely upon.
> >
> > This fixes issues with lost interrupts on some Mellanox adapters.
> >
> > > Signed-off-by: Benjamin Herrenschmidt 
> 
> Broken since forever?
> 
> Cc stable?

Broken since forever but I want a bit more testing before that goes to
stable. Some drivers like Mellanox had a workaround but they want to
remove it.

Cheers,
Ben.



mm: Initialise per_cpu_nodestats for all online pgdats at boot

2016-08-04 Thread Mel Gorman
Paul Mackerras and Reza Arbab reported that machines with memoryless nodes
fail when vmstats are refreshed. Paul reported an oops as follows:

[1.713998] Unable to handle kernel paging request for data at address 
0xff7a1
[1.714164] Faulting instruction address: 0xc0270cd0
[1.714304] Oops: Kernel access of bad area, sig: 11 [#1]
[1.714414] SMP NR_CPUS=2048 NUMA PowerNV
[1.714530] Modules linked in:
[1.714647] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.7.0-kvm+ #118
[1.714786] task: c00ff0680010 task.stack: c00ff0704000
[1.714926] NIP: c0270cd0 LR: c0270ce8 CTR: 
[1.715093] REGS: c00ff0707900 TRAP: 0300   Not tainted  (4.7.0-kvm+)
[1.715232] MSR: 900102009033   CR: 
846b6824  XER: 2000
[1.715748] CFAR: c0008768 DAR: 000ff7a1 DSISR: 4200 
SOFTE: 1
GPR00: c0270d08 c00ff0707b80 c11fb200 
GPR04: 0800   
GPR08:   000ff7a1 c122aae0
GPR12: c0a1e440 cfb8 c000c188 
GPR16:    
GPR20:    c0cecad0
GPR24: c0d035b8 c0d6cd18 c0d6cd18 c01fffa86300
GPR28:  c01fffa96300 c1230034 c122eb18
[1.717484] NIP [c0270cd0] refresh_zone_stat_thresholds+0x80/0x240
[1.717568] LR [c0270ce8] refresh_zone_stat_thresholds+0x98/0x240
[1.717648] Call Trace:
[1.717687] [c00ff0707b80] [c0270d08] 
refresh_zone_stat_thresholds+0xb8/0x240 (unreliable)

Both supplied potential fixes but one potentially misses checks and another
had redundant initialisations. This version initialises per_cpu_nodestats
on a per-pgdat basis instead of on a per-zone basis.
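
The difference between the two schemes can be shown with a small
userspace model. This is a sketch only: the struct and function names
are invented for illustration, and a boolean stands in for the
per_cpu_nodestats pointer.

```c
#include <stdbool.h>

#define SKETCH_ZONES_PER_NODE 2

/* Toy stand-in for a node; not the kernel's pglist_data. */
struct sketch_node {
	bool online;
	unsigned long zone_pages[SKETCH_ZONES_PER_NODE]; /* 0 = unpopulated */
	bool nodestats_allocated; /* models per_cpu_nodestats != NULL */
};

/* Old scheme: node stats allocated as a side effect of populated zones. */
void sketch_init_per_zone(struct sketch_node *nodes, int n)
{
	for (int i = 0; i < n; i++)
		for (int z = 0; z < SKETCH_ZONES_PER_NODE; z++)
			if (nodes[i].zone_pages[z] > 0)
				nodes[i].nodestats_allocated = true;
}

/* New scheme: allocate once per online node, populated or not. */
void sketch_init_per_pgdat(struct sketch_node *nodes, int n)
{
	for (int i = 0; i < n; i++)
		if (nodes[i].online)
			nodes[i].nodestats_allocated = true;
}

/* Online nodes whose stats would be touched while still unallocated. */
int sketch_count_unallocated(const struct sketch_node *nodes, int n)
{
	int missing = 0;

	for (int i = 0; i < n; i++)
		if (nodes[i].online && !nodes[i].nodestats_allocated)
			missing++;
	return missing;
}
```

A memoryless online node has no populated zone, so the per-zone scheme
never reaches it, while the per-pgdat loop covers it.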

Reported-by: Paul Mackerras 
Reported-by: Reza Arbab 
Signed-off-by: Mel Gorman 
---
This has been compile-tested and boot-tested on a 32-bit KVM only. A
memoryless system was not available to test the patch with. A confirmation
from Paul and Reza that it resolves their problem is welcome.

 mm/page_alloc.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 39a372a2a1d6..fb975cec3518 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5257,11 +5257,6 @@ static void __meminit setup_zone_pageset(struct zone 
*zone)
zone->pageset = alloc_percpu(struct per_cpu_pageset);
for_each_possible_cpu(cpu)
zone_pageset_init(zone, cpu);
-
-   if (!zone->zone_pgdat->per_cpu_nodestats) {
-   zone->zone_pgdat->per_cpu_nodestats =
-   alloc_percpu(struct per_cpu_nodestat);
-   }
 }
 
 /*
@@ -5270,10 +5265,15 @@ static void __meminit setup_zone_pageset(struct zone 
*zone)
  */
 void __init setup_per_cpu_pageset(void)
 {
+   struct pglist_data *pgdat;
struct zone *zone;
 
for_each_populated_zone(zone)
setup_zone_pageset(zone);
+
+   for_each_online_pgdat(pgdat)
+   pgdat->per_cpu_nodestats =
+   alloc_percpu(struct per_cpu_nodestat);
 }
 
 static noinline __ref


Re: [RESEND][PATCH v2 1/2] kexec: refactor code parsing size based on memory range

2016-08-04 Thread Dave Young
Hi Hari,

On 08/04/16 at 01:03am, Hari Bathini wrote:
> crashkernel parameter supports different syntaxes to specify the amount
> of memory to be reserved for kdump kernel. Below is one of the supported
> syntaxes that needs parsing to find the memory size to reserve, based on
> memory range:
> 
> crashkernel=:[,:,...]
> 
> While such parsing is implemented for crashkernel parameter, it applies
> to other parameters, like fadump_reserve_mem=, which could use similar
> syntax. This patch moves crashkernel's parsing code for above syntax to
> to kernel/params.c file for reuse. Two functions is_param_range_based()
> and parse_mem_range_size() are added to kernel/params.c file for this
> purpose.
> 
> Any parameter that uses the above syntax can use is_param_range_based()
> function to validate the syntax and parse_mem_range_size() function to
> get the parsed memory size. While some code is moved to kernel/params.c
> file, there is no change functionality wise in parsing the crashkernel
> parameter.
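
For readers following along, the range matching being moved can be
modelled in userspace roughly as below. It is a sketch, not the proposed
kernel code: sketch_memparse() is a simplified stand-in for memparse(),
sketch_range_size() is a hypothetical name, and errors are collapsed
into a return value of 0 instead of -EINVAL plus a warning.

```c
#include <stdlib.h>

/* Number with an optional K/M/G suffix (simplified memparse stand-in). */
static unsigned long long sketch_memparse(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);

	switch (**end) {
	case 'G': case 'g':
		v <<= 10; /* fall through */
	case 'M': case 'm':
		v <<= 10; /* fall through */
	case 'K': case 'k':
		v <<= 10;
		(*end)++;
		break;
	default:
		break;
	}
	return v;
}

/*
 * Walk "start-[end]:size[,start-[end]:size,...]" and return the size of
 * the first range containing system_ram, or 0 if nothing matches or the
 * string is malformed.
 */
unsigned long long sketch_range_size(const char *spec,
				     unsigned long long system_ram)
{
	const char *cur = spec;
	char *tmp;

	do {
		unsigned long long start, end = ~0ull, size;

		/* start of the range, followed by '-' */
		start = sketch_memparse(cur, &tmp);
		if (tmp == cur || *tmp != '-')
			return 0;
		cur = tmp + 1;

		/* optional end of the range; open-ended if ':' comes next */
		if (*cur != ':') {
			end = sketch_memparse(cur, &tmp);
			if (tmp == cur || end <= start)
				return 0;
			cur = tmp;
		}
		if (*cur != ':')
			return 0;
		cur++;

		/* size to reserve for this range */
		size = sketch_memparse(cur, &tmp);
		if (tmp == cur || size >= system_ram)
			return 0;
		cur = tmp;

		if (system_ram >= start && system_ram < end)
			return size;
	} while (*cur++ == ',');

	return 0;
}
```

With an example string such as "512M-2G:64M,2G-:128M" (chosen here for
illustration), a 1G machine falls in the first range and a 4G machine in
the open-ended second one.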
> 
> Signed-off-by: Hari Bathini 
> ---
> 
> Changes from v1:
> 1. Updated changelog
> 
>  include/linux/kernel.h |5 +++
>  kernel/kexec_core.c|   63 +++-
>  kernel/params.c|   96 
> 
>  3 files changed, 106 insertions(+), 58 deletions(-)
> 
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index d96a611..2df7ba2 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -435,6 +435,11 @@ extern char *get_options(const char *str, int nints, int 
> *ints);
>  extern unsigned long long memparse(const char *ptr, char **retptr);
>  extern bool parse_option_str(const char *str, const char *option);
>  
> +extern bool __init is_param_range_based(const char *cmdline);
> +extern unsigned long long __init parse_mem_range_size(const char *param,
> +   char **str,
> +   unsigned long long 
> system_ram);
> +
>  extern int core_kernel_text(unsigned long addr);
>  extern int core_kernel_data(unsigned long addr);
>  extern int __kernel_text_address(unsigned long addr);
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index 5616755..3a74024 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -1104,59 +1104,9 @@ static int __init parse_crashkernel_mem(char *cmdline,
>   char *cur = cmdline, *tmp;
>  
>   /* for each entry of the comma-separated list */
> - do {
> - unsigned long long start, end = ULLONG_MAX, size;
> -
> - /* get the start of the range */
> - start = memparse(cur, );
> - if (cur == tmp) {
> - pr_warn("crashkernel: Memory value expected\n");
> - return -EINVAL;
> - }
> - cur = tmp;
> - if (*cur != '-') {
> - pr_warn("crashkernel: '-' expected\n");
> - return -EINVAL;
> - }
> - cur++;
> -
> - /* if no ':' is here, than we read the end */
> - if (*cur != ':') {
> - end = memparse(cur, );
> - if (cur == tmp) {
> - pr_warn("crashkernel: Memory value expected\n");
> - return -EINVAL;
> - }
> - cur = tmp;
> - if (end <= start) {
> - pr_warn("crashkernel: end <= start\n");
> - return -EINVAL;
> - }
> - }
> -
> - if (*cur != ':') {
> - pr_warn("crashkernel: ':' expected\n");
> - return -EINVAL;
> - }
> - cur++;
> -
> - size = memparse(cur, );
> - if (cur == tmp) {
> - pr_warn("Memory value expected\n");
> - return -EINVAL;
> - }
> - cur = tmp;
> - if (size >= system_ram) {
> - pr_warn("crashkernel: invalid size\n");
> - return -EINVAL;
> - }
> -
> - /* match ? */
> - if (system_ram >= start && system_ram < end) {
> - *crash_size = size;
> - break;
> - }
> - } while (*cur++ == ',');
> + *crash_size = parse_mem_range_size("crashkernel", , system_ram);
> + if (cur == cmdline)
> + return -EINVAL;
>  
>   if (*crash_size > 0) {
>   while (*cur && *cur != ' ' && *cur != '@')
> @@ -1293,7 +1243,6 @@ static int __init __parse_crashkernel(char *cmdline,
>const char *name,
>const char *suffix)
>  {
> - char*first_colon, *first_space;
>   char*ck_cmdline;
>  
>   BUG_ON(!crash_size || 

Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-04 Thread Arnd Bergmann
On Thursday, August 4, 2016 10:10:51 AM CEST Stephen Rothwell wrote:
> Hi Arnd,
> 
> On Wed, 03 Aug 2016 20:52:48 +0200 Arnd Bergmann  wrote:
> >
> > Most of the difference appears to be in branch trampolines (634 added,
> > 559 removed, 14837 unchanged) as you suspect, but I also see a couple
> > of symbols show up in vmlinux that were not there before:
> > 
> > -A __crc_dma_noop_ops
> > -D dma_noop_ops
> > -R __clz_tab
> > -r fdt_errtable
> > -r __kcrctab_dma_noop_ops
> > -r __kstrtab_dma_noop_ops
> > -R __ksymtab_dma_noop_ops
> > -t dma_noop_alloc
> > -t dma_noop_free
> > -t dma_noop_map_page
> > -t dma_noop_mapping_error
> > -t dma_noop_map_sg
> > -t dma_noop_supported
> > -T fdt_add_reservemap_entry
> > -T fdt_begin_node
> > -T fdt_create
> > -T fdt_create_empty_tree
> > -T fdt_end_node
> > -T fdt_finish
> > -T fdt_finish_reservemap
> > -T fdt_property
> > -T fdt_resize
> > -T fdt_strerror
> > -T find_cpio_data
> > 
> > From my first look, it seems that all of lib/*.o is now getting linked
> > into vmlinux, while we traditionally leave out everything from lib/
> > that is not referenced.
> 
> You could try removing the --{,no-}whole-archive arguments to ld in
> scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh.  Last time I did
> that, though, a whole lot of stuff failed to be linked in. (Especially
> > stuff only referenced by EXPORT_SYMBOL()s, but that may have been fixed).

I tried this

diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index b5e40ed86e60..89bca1a25916 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -44,7 +44,7 @@ modpost_link()
local objects
 
if [ -n "${CONFIG_THIN_ARCHIVES}" ]; then
-   objects="--whole-archive ${KBUILD_VMLINUX_INIT} 
${KBUILD_VMLINUX_MAIN} --no-whole-archive"
+   objects="${KBUILD_VMLINUX_INIT} ${KBUILD_VMLINUX_MAIN}"
else
objects="${KBUILD_VMLINUX_INIT} --start-group 
${KBUILD_VMLINUX_MAIN} --end-group"
fi

but that did not seem to change anything, the extra symbols are
still there. I have not tried to understand what that actually
does, so maybe I misunderstood your suggestion.

> > I also see a noticeable overhead in link time, the numbers are for
> > a cache-hot rebuild after a successful allyesconfig build, using a
> > 24-way Opteron@2.5Ghz, just relinking vmlinux:
> 
> I was afraid of that, but it is offset by the time saved by not doing
> the "ld -r"s along the way?  It may also be that (for powerpc anyway)
> the linker is doing a better job.

At least on a big SMP system, it doesn't seem to make much difference,
as the "ld -r" steps are easily parallelized:

$ find build/ -name built-in.o | xargs rm ; time make -skj30 vmlinux
real    2m12.092s
user    3m52.932s
sys     0m51.248s

$ time make -skj30 vmlinux
real    2m12.162s
user    3m44.788s
sys     0m47.788s

I tried this twice with identical results: "user" time increases
by eight seconds today when we have to rebuild all "built-in.o"
files rather than just relinking vmlinux, but elapsed time
is unchanged.

After your patch that difference becomes smaller (three seconds
in one run, could be within the noise), but we still have the
extra two minutes for the total build time:

$ find build/ -name built-in.o | xargs rm ; time make -skj30 vmlinux
real    4m20.717s
user    5m47.556s
sys     0m54.128s

$ time make -skj30 vmlinux
real    4m18.835s
user    5m44.552s
sys     0m53.152s

FWIW, here is a sample build output I get on an allyesconfig build,
with timestamps added:

$ time make W= -kj30 vmlinux 
make[1]: Entering directory '/git/arm-soc'
make[2]: Entering directory '/git/arm-soc/build/tmp'
10:46:12   CHK include/config/kernel.release
10:46:13   GEN ./Makefile
10:46:13   CHK include/generated/uapi/linux/version.h
  Using /git/arm-soc as source for kernel
10:46:13   CHK include/generated/utsrelease.h
10:46:13   CHK include/generated/timeconst.h
10:46:13   CHK include/generated/bounds.h
10:46:13   CHK include/generated/asm-offsets.h
10:46:13   CALL    /git/arm-soc/scripts/checksyscalls.sh
10:46:14   CHK include/generated/compile.h
10:46:18   CHK kernel/config_data.h
10:46:20   CC  drivers/misc/lkdtm_rodata.o
10:46:20   OBJCOPY drivers/misc/lkdtm_rodata_objcopy.o
10:46:20   LD  drivers/misc/lkdtm.o
10:46:20   LD  drivers/misc/built-in.o
10:46:20   DTC drivers/gpu/drm/tilcdc/tilcdc_slave_compat.dtb
10:46:20   DTB drivers/gpu/drm/tilcdc/tilcdc_slave_compat.dtb.S
10:46:20   AS  drivers/gpu/drm/tilcdc/tilcdc_slave_compat.dtb.o
10:46:20   LD  drivers/gpu/drm/tilcdc/built-in.o
rm drivers/gpu/drm/tilcdc/tilcdc_slave_compat.dtb.S 
drivers/gpu/drm/tilcdc/tilcdc_slave_compat.dtb
10:46:33   LD  drivers/gpu/drm/built-in.o
10:46:33   LD  drivers/gpu/built-in.o
10:46:36   CHK include/generated/uapi/linux/version.h
10:46:36   LINK    vmlinux
10:46:37   LD  vmlinux.o
10:47:14   MODPOST vmlinux.o
10:47:16   GEN .version

Re: [PATCH] powerpc/book3s: Fix MCE console messages for unrecoverable MCE.

2016-08-04 Thread Mahesh Jagannath Salgaonkar
On 08/04/2016 01:35 PM, Greg KH wrote:
> On Thu, Aug 04, 2016 at 10:16:48AM +0530, Mahesh J Salgaonkar wrote:
>> From: Mahesh Salgaonkar 
>>
>> When machine check occurs with MSR(RI=0), it means MC interrupt is
>> unrecoverable and kernel goes down to panic path. But the console
>> message still shows it as recovered. This patch fixes the MCE console
>> messages.
>>
>> Signed-off-by: Mahesh Salgaonkar 
>> ---
>>  arch/powerpc/kernel/mce.c |3 ++-
>>  arch/powerpc/platforms/powernv/opal.c |2 ++
>>  2 files changed, 4 insertions(+), 1 deletion(-)
> 
> 
> 
> 
> This is not the correct way to submit patches for inclusion in the
> stable kernel tree.  Please read Documentation/stable_kernel_rules.txt
> for how to do this properly.
> 
> 
> 

Ouch. My mistake. Will follow Documentation/stable_kernel_rules.txt

Thanks,
-Mahesh.



Re: [PATCH v2 3/3] powernv: Fix MCE handler to avoid trashing CR0/CR1 registers.

2016-08-04 Thread Mahesh Jagannath Salgaonkar
On 08/04/2016 09:44 AM, Stewart Smith wrote:
> Mahesh J Salgaonkar  writes:
>> From: Mahesh Salgaonkar 
>>
>> The current implementation of MCE early handling modifies CR0/1 registers
>> without saving its old values. Fix this by moving early check for
>> powersaving mode to machine_check_handle_early().
> 
> From (internal bug report) it seems as though in a test where one
> injects continuous SLB Multi Hit errors, this bug could lead to rebooting
> "due to to Platform error" rather than continuing to recover
> successfully. It might be a good idea to mention that in commit message
> here.

This patch does not address the specific internal bug that you are talking
about. I am still chasing that bug.

> 
> Also, should this go to stable?
> 

However, yes, this should go to the stable tree.



Re: [PATCH v2] powernv: Simplify searching for compatible device nodes

2016-08-04 Thread Michael Ellerman
Cyril Bur  writes:

> On Wed,  3 Aug 2016 12:18:00 -0500
> Jack Miller  wrote:
>
>> (rebased on powerpc/next)
>> 
>> This condenses the opal node searching into a single function that finds
>> all compatible nodes, instead of just searching the ibm,opal children,
>> for ipmi, flash, and prd similar to how opal-i2c nodes are found.
>> 
>> Signed-off-by: Jack Miller 
>
> Using a version of the related skiboot patch that may not be the final one:
> Tested-by: Cyril Bur 

Thanks. The part I'm still not clear on is *why* we're moving them in
skiboot?

cheers


Re: [PATCH] crypto: crc32c-vpmsum - Convert to CPU feature based module autoloading

2016-08-04 Thread Michael Ellerman
Anton Blanchard  writes:

> From: Anton Blanchard 
>
> This patch utilises the GENERIC_CPU_AUTOPROBE infrastructure
> to automatically load the crc32c-vpmsum module if the CPU supports
> it.
>
> Signed-off-by: Anton Blanchard 
> ---
>  arch/powerpc/crypto/crc32c-vpmsum_glue.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/crypto/crc32c-vpmsum_glue.c 
> b/arch/powerpc/crypto/crc32c-vpmsum_glue.c
> index bfe3d37..9fa046d 100644
> --- a/arch/powerpc/crypto/crc32c-vpmsum_glue.c
> +++ b/arch/powerpc/crypto/crc32c-vpmsum_glue.c
> @@ -4,6 +4,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  #define CHKSUM_BLOCK_SIZE 1
> @@ -157,7 +158,7 @@ static void __exit crc32c_vpmsum_mod_fini(void)
>   crypto_unregister_shash();
>  }
>  
> -module_init(crc32c_vpmsum_mod_init);
> +module_cpu_feature_match(PPC_MODULE_FEATURE_VEC_CRYPTO, 
> crc32c_vpmsum_mod_init);

Is VEC_CRYPTO the right feature?

That's new power8 crypto stuff.

I thought this only used VMX? (but I haven't looked closely)

cheers


Re: [PATCH v13 06/30] powerpc/ptrace: Adapt gpr32_get, gpr32_set functions for transaction

2016-08-04 Thread Michael Ellerman
Daniel Axtens  writes:

> [ Unknown signature status ]
> Hi all,
>
> This is causing cppcheck warnings (having just landed in next):
>
> [arch/powerpc/kernel/ptrace.c:2062]: (error) Uninitialized variable: ckpt_regs
> [arch/powerpc/kernel/ptrace.c:2130]: (error) Uninitialized variable: ckpt_regs

Sigh.

> This is from...
>> -static int gpr32_get(struct task_struct *target,
>> +static int gpr32_get_common(struct task_struct *target,
>>   const struct user_regset *regset,
>>   unsigned int pos, unsigned int count,
>> - void *kbuf, void __user *ubuf)
>> +void *kbuf, void __user *ubuf, bool tm_active)
>>  {
>>  const unsigned long *regs = >thread.regs->gpr[0];
>> +const unsigned long *ckpt_regs;
>>  compat_ulong_t *k = kbuf;
>>  compat_ulong_t __user *u = ubuf;
>>  compat_ulong_t reg;
>>  int i;
>>  
>> -if (target->thread.regs == NULL)
>> -return -EIO;
>> +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>> ckpt_regs = &target->thread.ckpt_regs.gpr[0];
>> +#endif
>> +if (tm_active) {
>> +regs = ckpt_regs;
> ... this bit here. If the ifdef doesn't trigger, cppcheck can't find an
> initialisation for ckpt_regs, so it complains.
>
> Technically it's a false positive as (I assume!) tm_active cannot ever
> be true in the absence of CONFIG_PPC_TRANSACTIONAL_MEM.

That's correct, so the code is safe. See the one call site (which passes an 
int!):

  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
  static int tm_cgpr32_get(struct task_struct *target,
 const struct user_regset *regset,
 unsigned int pos, unsigned int count,
 void *kbuf, void __user *ubuf)
  {
return gpr32_get_common(target, regset, pos, count, kbuf, ubuf, 1);
  }

> Is there a nice simple fix we could deploy to squash this warning, or
> will we just live with it?

This series has been nothing but pain. Given we're already at v13, and people
really want this support to go in, I'm going to leave it in the tree.

Once it's in we can refactor the implementation, which is a bit of a mess, and
hopefully in the process fix the warnings.

cheers
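
One possible shape for a fix, shown as a user-space sketch rather than the actual ptrace code: give ckpt_regs a defined initial value so every path sees an initialised pointer. Everything here is an assumption for illustration: HAVE_TM stands in for CONFIG_PPC_TRANSACTIONAL_MEM, and struct fake_thread and pick_regs() are hypothetical names that do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* HAVE_TM is a hypothetical stand-in for CONFIG_PPC_TRANSACTIONAL_MEM. */
#ifndef HAVE_TM
#define HAVE_TM 0
#endif

/* Simplified stand-in for the thread state; not the kernel layout. */
struct fake_thread {
	unsigned long gpr[4];
	unsigned long ckpt_gpr[4];
};

/*
 * Pick the register bank to read from. ckpt_regs starts out as NULL, so
 * a static checker sees an initialised pointer on every path, and the
 * caller can test for NULL if tm_active is set on a build without TM.
 */
static const unsigned long *pick_regs(const struct fake_thread *t,
				      bool tm_active)
{
	const unsigned long *regs;
	const unsigned long *ckpt_regs = NULL;

	assert(t != NULL);
	regs = &t->gpr[0];
#if HAVE_TM
	ckpt_regs = &t->ckpt_gpr[0];
#endif
	if (tm_active)
		regs = ckpt_regs;
	return regs;
}
```

With HAVE_TM left at 0, pick_regs(&t, true) returns NULL instead of an indeterminate pointer. Whether the real code should return an error in that case or simply never be reachable is a separate design question that this sketch does not settle.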


Re: [PATCH] powerpc/book3s: Fix MCE console messages for unrecoverable MCE.

2016-08-04 Thread Greg KH
On Thu, Aug 04, 2016 at 10:16:48AM +0530, Mahesh J Salgaonkar wrote:
> From: Mahesh Salgaonkar 
> 
> When machine check occurs with MSR(RI=0), it means MC interrupt is
> unrecoverable and kernel goes down to panic path. But the console
> message still shows it as recovered. This patch fixes the MCE console
> messages.
> 
> Signed-off-by: Mahesh Salgaonkar 
> ---
>  arch/powerpc/kernel/mce.c |3 ++-
>  arch/powerpc/platforms/powernv/opal.c |2 ++
>  2 files changed, 4 insertions(+), 1 deletion(-)




This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read Documentation/stable_kernel_rules.txt
for how to do this properly.




Re: [PATCH] powerpc: Align hot loops of memset() and backwards_memcpy()

2016-08-04 Thread Christophe Leroy


On 04/08/2016 at 08:53, Anton Blanchard wrote:

From: Anton Blanchard 

Align the hot loops in our assembly implementation of memset()
and backwards_memcpy().

backwards_memcpy() is called from tcp_v4_rcv(), so we might
want to optimise this a little more.

Signed-off-by: Anton Blanchard 


Shouldn't this patch be titled powerpc/64, as powerpc32 has a different 
memset() ?


Christophe


---
 arch/powerpc/lib/mem_64.S | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/lib/mem_64.S b/arch/powerpc/lib/mem_64.S
index 43435c6..eda7a96 100644
--- a/arch/powerpc/lib/mem_64.S
+++ b/arch/powerpc/lib/mem_64.S
@@ -37,6 +37,7 @@ _GLOBAL(memset)
clrldi  r5,r5,58
mtctr   r0
beq 5f
+   .balign 16
 4: std r4,0(r6)
std r4,8(r6)
std r4,16(r6)
@@ -90,6 +91,7 @@ _GLOBAL(backwards_memcpy)
andi.   r0,r6,3
mtctr   r7
bne 5f
+   .balign 16
 1: lwz r7,-4(r4)
    lwzu    r8,-8(r4)
stw r7,-4(r6)



Re: [PATCH v2] powernv: Simplify searching for compatible device nodes

2016-08-04 Thread Cyril Bur
On Wed,  3 Aug 2016 12:18:00 -0500
Jack Miller  wrote:

> (rebased on powerpc/next)
> 
> This condenses the opal node searching into a single function that finds
> all compatible nodes, instead of just searching the ibm,opal children,
> for ipmi, flash, and prd similar to how opal-i2c nodes are found.
> 
> Signed-off-by: Jack Miller 

Using a version of the related skiboot patch that may not be the final one:
Tested-by: Cyril Bur 

> ---
>  arch/powerpc/platforms/powernv/opal.c | 24 +++-
>  1 file changed, 7 insertions(+), 17 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/opal.c 
> b/arch/powerpc/platforms/powernv/opal.c
> index 8b4fc68..9db12ce 100644
> --- a/arch/powerpc/platforms/powernv/opal.c
> +++ b/arch/powerpc/platforms/powernv/opal.c
> @@ -631,21 +631,11 @@ static void __init opal_dump_region_init(void)
>   "rc = %d\n", rc);
>  }
>  
> -static void opal_pdev_init(struct device_node *opal_node,
> - const char *compatible)
> +static void opal_pdev_init(const char *compatible)
>  {
>   struct device_node *np;
>  
> - for_each_child_of_node(opal_node, np)
> - if (of_device_is_compatible(np, compatible))
> - of_platform_device_create(np, NULL, NULL);
> -}
> -
> -static void opal_i2c_create_devs(void)
> -{
> - struct device_node *np;
> -
> - for_each_compatible_node(np, NULL, "ibm,opal-i2c")
> + for_each_compatible_node(np, NULL, compatible)
>   of_platform_device_create(np, NULL, NULL);
>  }
>  
> @@ -717,7 +707,7 @@ static int __init opal_init(void)
>   opal_hmi_handler_init();
>  
>   /* Create i2c platform devices */
> - opal_i2c_create_devs();
> + opal_pdev_init("ibm,opal-i2c");
>  
>   /* Setup a heatbeat thread if requested by OPAL */
>   opal_init_heartbeat();
> @@ -752,12 +742,12 @@ static int __init opal_init(void)
>   }
>  
>   /* Initialize platform devices: IPMI backend, PRD & flash interface */
> - opal_pdev_init(opal_node, "ibm,opal-ipmi");
> - opal_pdev_init(opal_node, "ibm,opal-flash");
> - opal_pdev_init(opal_node, "ibm,opal-prd");
> + opal_pdev_init("ibm,opal-ipmi");
> + opal_pdev_init("ibm,opal-flash");
> + opal_pdev_init("ibm,opal-prd");
>  
>   /* Initialise platform device: oppanel interface */
> - opal_pdev_init(opal_node, "ibm,opal-oppanel");
> + opal_pdev_init("ibm,opal-oppanel");
>  
>   /* Initialise OPAL kmsg dumper for flushing console on panic */
>   opal_kmsg_init();



Re: [patch] ppc/cell: missing error code in spufs_mkgang()

2016-08-04 Thread Arnd Bergmann
On Thursday, August 4, 2016 8:37:03 AM CEST Dan Carpenter wrote:
> We should return -ENOMEM if alloc_spu_gang() fails.
> 
> Signed-off-by: Dan Carpenter 
> 

Acked-by: Arnd Bergmann 


[PATCH] powerpc: Align hot loops of memset() and backwards_memcpy()

2016-08-04 Thread Anton Blanchard
From: Anton Blanchard 

Align the hot loops in our assembly implementation of memset()
and backwards_memcpy().

backwards_memcpy() is called from tcp_v4_rcv(), so we might
want to optimise this a little more.

Signed-off-by: Anton Blanchard 
---
 arch/powerpc/lib/mem_64.S | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/lib/mem_64.S b/arch/powerpc/lib/mem_64.S
index 43435c6..eda7a96 100644
--- a/arch/powerpc/lib/mem_64.S
+++ b/arch/powerpc/lib/mem_64.S
@@ -37,6 +37,7 @@ _GLOBAL(memset)
clrldi  r5,r5,58
mtctr   r0
beq 5f
+   .balign 16
 4: std r4,0(r6)
std r4,8(r6)
std r4,16(r6)
@@ -90,6 +91,7 @@ _GLOBAL(backwards_memcpy)
andi.   r0,r6,3
mtctr   r7
bne 5f
+   .balign 16
 1: lwz r7,-4(r4)
    lwzu    r8,-8(r4)
stw r7,-4(r6)
-- 
2.7.4



Crashes in refresh_zone_stat_thresholds when some nodes have no memory

2016-08-04 Thread Paul Mackerras
It appears that commit 75ef71840539 ("mm, vmstat: add infrastructure
for per-node vmstats", 2016-07-28) has introduced a regression on
machines that have nodes which have no memory, such as the POWER8
server that I use for testing.  When I boot current upstream, I get a
splat like this:

[1.713998] Unable to handle kernel paging request for data at address 
0xff7a1
[1.714164] Faulting instruction address: 0xc0270cd0
[1.714304] Oops: Kernel access of bad area, sig: 11 [#1]
[1.714414] SMP NR_CPUS=2048 NUMA PowerNV
[1.714530] Modules linked in:
[1.714647] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.7.0-kvm+ #118
[1.714786] task: c00ff0680010 task.stack: c00ff0704000
[1.714926] NIP: c0270cd0 LR: c0270ce8 CTR: 
[1.715093] REGS: c00ff0707900 TRAP: 0300   Not tainted  (4.7.0-kvm+)
[1.715232] MSR: 900102009033   CR: 
846b6824  XER: 2000
[1.715748] CFAR: c0008768 DAR: 000ff7a1 DSISR: 4200 
SOFTE: 1 
GPR00: c0270d08 c00ff0707b80 c11fb200  
GPR04: 0800    
GPR08:   000ff7a1 c122aae0 
GPR12: c0a1e440 cfb8 c000c188  
GPR16:     
GPR20:    c0cecad0 
GPR24: c0d035b8 c0d6cd18 c0d6cd18 c01fffa86300 
GPR28:  c01fffa96300 c1230034 c122eb18 
[1.717484] NIP [c0270cd0] refresh_zone_stat_thresholds+0x80/0x240
[1.717568] LR [c0270ce8] refresh_zone_stat_thresholds+0x98/0x240
[1.717648] Call Trace:
[1.717687] [c00ff0707b80] [c0270d08] 
refresh_zone_stat_thresholds+0xb8/0x240 (unreliable)
[1.717818] [c00ff0707bd0] [c0a1e4d4] 
init_per_zone_wmark_min+0x94/0xb0
[1.717932] [c00ff0707c30] [c000b90c] do_one_initcall+0x6c/0x1d0
[1.718036] [c00ff0707cf0] [c0d04244] 
kernel_init_freeable+0x294/0x384
[1.718150] [c00ff0707dc0] [c000c1a8] kernel_init+0x28/0x160
[1.718249] [c00ff0707e30] [c0009968] 
ret_from_kernel_thread+0x5c/0x74
[1.718358] Instruction dump:
[1.718408] 3fc20003 3bde4e34 3b80 6042 3860 3fbb0001 481c 
6042 
[1.718575] 3d220003 3929f8e0 7d49502a e93d9c00 <7f8a49ae> 38a30001 38800800 
7ca507b4 

It turns out that we can get a pgdat in the online pgdat list where
pgdat->per_cpu_nodestats is NULL.  On my machine the pgdats for nodes
1 and 17 are like this.  All the memory is in nodes 0 and 16.

With the patch below, the system boots normally.  I don't guarantee to
have found every place that needs a check, and it may be better to fix
this by allocating space for per-cpu statistics on nodes which have no
memory rather than checking at each use site.

Paul.
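
The guard used at each site can be modelled in user space roughly as follows. This is a sketch only: struct node, struct node_stats, NR_CPUS_MODEL and node_snapshot() are hypothetical stand-ins, and a plain array replaces the kernel's per-cpu allocation and per_cpu_ptr().

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_MODEL 4   /* hypothetical CPU count for the model */

/* Simplified stand-ins; the real pg_data_t and per-cpu API differ. */
struct node_stats {
	long diff[NR_CPUS_MODEL];   /* one pending delta per CPU */
};

struct node {
	int id;
	long global;                /* already-folded counter value */
	struct node_stats *stats;   /* NULL for a memoryless node here */
};

/* Snapshot = global counter plus pending per-CPU deltas, clamped at 0. */
static long node_snapshot(const struct node *n)
{
	long x;
	int cpu;

	assert(n != NULL);
	x = n->global;
	if (n->stats) {             /* skip nodes with no per-CPU area */
		for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
			x += n->stats->diff[cpu];
	}
	return x < 0 ? 0 : x;
}
```

A node whose stats pointer is NULL simply reports its global value. The alternative mentioned above, allocating the per-cpu area for every online node, would make the check unnecessary at all call sites.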

mm: cope with memoryless nodes not having per-cpu statistics allocated

It seems that the pgdat for nodes which have no memory will also have
no per-cpu statistics space allocated, that is, pgdat->per_cpu_nodestats
is NULL.  Avoid crashing on machines which have memoryless nodes by
checking for non-NULL pgdat->per_cpu_nodestats.

Signed-off-by: Paul Mackerras 
---
diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 6137719..48b2780 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -184,8 +184,9 @@ static inline unsigned long 
node_page_state_snapshot(pg_data_t *pgdat,
 
 #ifdef CONFIG_SMP
int cpu;
-   for_each_online_cpu(cpu)
-   x += per_cpu_ptr(pgdat->per_cpu_nodestats, 
cpu)->vm_node_stat_diff[item];
+   if (pgdat->per_cpu_nodestats)
+   for_each_online_cpu(cpu)
+   x += per_cpu_ptr(pgdat->per_cpu_nodestats, 
cpu)->vm_node_stat_diff[item];
 
if (x < 0)
x = 0;
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 89cec42..d83881e 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -176,6 +176,10 @@ void refresh_zone_stat_thresholds(void)
 
/* Zero current pgdat thresholds */
for_each_online_pgdat(pgdat) {
+   if (!pgdat->per_cpu_nodestats) {
+   pr_err("No nodestats for node %d\n", pgdat->node_id);
+   continue;
+   }
for_each_online_cpu(cpu) {
per_cpu_ptr(pgdat->per_cpu_nodestats, 
cpu)->stat_threshold = 0;
}
@@ -184,6 +188,10 @@ void refresh_zone_stat_thresholds(void)
for_each_populated_zone(zone) {
struct pglist_data *pgdat = zone->zone_pgdat;
unsigned long max_drift, tolerate_drift;
+   if (!pgdat->per_cpu_nodestats) {
+   pr_err("No per cpu nodestats\n");
+

[PATCH] crypto: crc32c-vpmsum - Convert to CPU feature based module autoloading

2016-08-04 Thread Anton Blanchard
From: Anton Blanchard 

This patch utilises the GENERIC_CPU_AUTOPROBE infrastructure
to automatically load the crc32c-vpmsum module if the CPU supports
it.

Signed-off-by: Anton Blanchard 
---
 arch/powerpc/crypto/crc32c-vpmsum_glue.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/crypto/crc32c-vpmsum_glue.c 
b/arch/powerpc/crypto/crc32c-vpmsum_glue.c
index bfe3d37..9fa046d 100644
--- a/arch/powerpc/crypto/crc32c-vpmsum_glue.c
+++ b/arch/powerpc/crypto/crc32c-vpmsum_glue.c
@@ -4,6 +4,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #define CHKSUM_BLOCK_SIZE  1
@@ -157,7 +158,7 @@ static void __exit crc32c_vpmsum_mod_fini(void)
crypto_unregister_shash();
 }
 
-module_init(crc32c_vpmsum_mod_init);
+module_cpu_feature_match(PPC_MODULE_FEATURE_VEC_CRYPTO, 
crc32c_vpmsum_mod_init);
 module_exit(crc32c_vpmsum_mod_fini);
 
 MODULE_AUTHOR("Anton Blanchard ");
-- 
2.7.4



Re: [PATCH v13 06/30] powerpc/ptrace: Adapt gpr32_get, gpr32_set functions for transaction

2016-08-04 Thread Daniel Axtens
Hi all,

This is causing cppcheck warnings (having just landed in next):

[arch/powerpc/kernel/ptrace.c:2062]: (error) Uninitialized variable: ckpt_regs
[arch/powerpc/kernel/ptrace.c:2130]: (error) Uninitialized variable: ckpt_regs

This is from...
> -static int gpr32_get(struct task_struct *target,
> +static int gpr32_get_common(struct task_struct *target,
>const struct user_regset *regset,
>unsigned int pos, unsigned int count,
> -  void *kbuf, void __user *ubuf)
> + void *kbuf, void __user *ubuf, bool tm_active)
>  {
>   const unsigned long *regs = &target->thread.regs->gpr[0];
> + const unsigned long *ckpt_regs;
>   compat_ulong_t *k = kbuf;
>   compat_ulong_t __user *u = ubuf;
>   compat_ulong_t reg;
>   int i;
>  
> - if (target->thread.regs == NULL)
> - return -EIO;
> +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> + ckpt_regs = &target->thread.ckpt_regs.gpr[0];
> +#endif
> + if (tm_active) {
> + regs = ckpt_regs;
... this bit here. If the ifdef doesn't trigger, cppcheck can't find an
initialisation for ckpt_regs, so it complains.

Technically it's a false positive as (I assume!) tm_active cannot ever
be true in the absence of CONFIG_PPC_TRANSACTIONAL_MEM.

Is there a nice simple fix we could deploy to squash this warning, or
will we just live with it?

> -static int gpr32_set(struct task_struct *target,
> +static int gpr32_set_common(struct task_struct *target,
>const struct user_regset *regset,
>unsigned int pos, unsigned int count,
> -  const void *kbuf, const void __user *ubuf)
> +  const void *kbuf, const void __user *ubuf, bool tm_active)
>  {
>   unsigned long *regs = &target->thread.regs->gpr[0];
> + unsigned long *ckpt_regs;
>   const compat_ulong_t *k = kbuf;
>   const compat_ulong_t __user *u = ubuf;
>   compat_ulong_t reg;
>  
> - if (target->thread.regs == NULL)
> - return -EIO;
> +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> + ckpt_regs = &target->thread.ckpt_regs.gpr[0];
> +#endif
>  
> - CHECK_FULL_REGS(target->thread.regs);
> + if (tm_active) {
> + regs = ckpt_regs;
FWIW it happens again here.

Regards,
Daniel Axtens




Re: [PATCH v2] powerpc/32: fix csum_partial_copy_generic()

2016-08-04 Thread Alessio Igor Bogani
Scott,

On 4 August 2016 at 05:53, Scott Wood  wrote:
> On Tue, 2016-08-02 at 10:07 +0200, Christophe Leroy wrote:
>> commit 7aef4136566b0 ("powerpc32: rewrite csum_partial_copy_generic()
>> based on copy_tofrom_user()") introduced a bug when destination
>> address is odd and initial csum is not null
>>
>> In that (rare) case the initial csum value has to be rotated one byte
>> as well as the resulting value is
>>
>> This patch also fixes related comments
>>
>> Fixes: 7aef4136566b0 ("powerpc32: rewrite csum_partial_copy_generic()
>> based on copy_tofrom_user()")
>> Cc: sta...@vger.kernel.org
>>
>> Signed-off-by: Christophe Leroy 
>> ---
>>  v2: updated comments as suggested by Segher
>>
>>  arch/powerpc/lib/checksum_32.S | 7 ---
>>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> Alessio, can you confirm whether this fixes the problem you reported?

No, unfortunately.

Ciao,
Alessio
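
The rotation that the quoted commit message describes can be illustrated with a 16-bit user-space model of the ones' complement sum. This is a sketch of the arithmetic only, not of the PowerPC assembly: the kernel routine works on a 32-bit csum, and fold16(), rot8(), csum_ref() and csum_swapped_lanes() are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into 16 bits with end-around carry. */
static uint16_t fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Swap the two bytes of a 16-bit value (a one-byte rotation). */
static uint16_t rot8(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/*
 * Reference: ones' complement sum of buf as big-endian 16-bit words,
 * starting from init. A trailing odd byte is padded with zero.
 * The unfolded accumulator is only safe for short buffers.
 */
static uint16_t csum_ref(const uint8_t *buf, size_t len, uint16_t init)
{
	uint32_t sum = init;
	size_t i;

	assert(buf != NULL);
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (i < len)
		sum += (uint32_t)buf[i] << 8;
	return fold16(sum);
}

/*
 * Model of a routine that sums the data with its byte lanes swapped, as
 * an odd address can cause, and rotates the result back at the end.
 * rotate_init selects whether the initial value is rotated in as well.
 */
static uint16_t csum_swapped_lanes(const uint8_t *buf, size_t len,
				   uint16_t init, bool rotate_init)
{
	uint32_t sum = rotate_init ? rot8(init) : init;
	size_t i;

	assert(buf != NULL);
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i + 1] << 8) | buf[i];
	if (i < len)
		sum += buf[i];
	return rot8(fold16(sum));
}
```

In this model, skipping the rotation of the initial value is harmless when init is zero and gives a wrong result for a typical non-zero init, which matches the "odd destination and non-null initial csum" condition in the commit message.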


Re: [PATCH kernel 04/15] powerpc/powernv/ioda: Fix TCE invalidate to work in real mode again

2016-08-04 Thread David Gibson
On Wed, Aug 03, 2016 at 06:40:45PM +1000, Alexey Kardashevskiy wrote:
> "powerpc/powernv/pci: Rework accessing the TCE invalidate register"
> broke TCE invalidation on IODA2/PHB3 for real mode.
> 
> This makes invalidate work again.
> 
> Fixes: fd141d1a99a3
> Signed-off-by: Alexey Kardashevskiy 
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
> b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 53b56c0..59c7e7d 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -1877,7 +1877,7 @@ static void pnv_pci_phb3_tce_invalidate(struct 
> pnv_ioda_pe *pe, bool rm,
>   unsigned shift, unsigned long index,
>   unsigned long npages)
>  {
> - __be64 __iomem *invalidate = pnv_ioda_get_inval_reg(pe->phb, false);
> + __be64 __iomem *invalidate = pnv_ioda_get_inval_reg(pe->phb, rm);
>   unsigned long start, end, inc;
>  
>   /* We'll invalidate DMA address in PE scope */
> @@ -1935,10 +1935,12 @@ static void pnv_pci_ioda2_tce_invalidate(struct 
> iommu_table *tbl,
>   pnv_pci_phb3_tce_invalidate(pe, rm, shift,
>   index, npages);
>   else if (rm)
> + {
>   opal_rm_pci_tce_kill(phb->opal_id,
>OPAL_PCI_TCE_KILL_PAGES,
>pe->pe_number, 1u << shift,
>index << shift, npages);
> + }

These braces look a) unrelated to the actual point of the patch, b)
unnecessary and c) not in keeping with normal coding style.

>   else
>   opal_pci_tce_kill(phb->opal_id,
> OPAL_PCI_TCE_KILL_PAGES,

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [PATCH kernel 02/15] KVM: PPC: Finish enabling VFIO KVM device on POWER

2016-08-04 Thread David Gibson
On Wed, Aug 03, 2016 at 06:40:43PM +1000, Alexey Kardashevskiy wrote:
> 178a787502 "vfio: Enable VFIO device for powerpc" made an attempt to
> enable VFIO KVM device on POWER.
> 
> However as CONFIG_KVM_BOOK3S_64 does not use "common-objs-y",
> VFIO KVM device was not enabled for Book3s KVM, this adds VFIO to
> the kvm-book3s_64-objs-y list.
> 
> While we are here, enforce KVM_VFIO on KVM_BOOK3S as other platforms
> already do.
> 
> Signed-off-by: Alexey Kardashevskiy 

Reviewed-by: David Gibson 

This should be merged regardless of the rest of the series.  There's
no reason not to include the kvm device on Power, and it makes life
easier for userspace because it doesn't have to have conditionals
about whether to instantiate it or not.

> ---
>  arch/powerpc/kvm/Kconfig  | 1 +
>  arch/powerpc/kvm/Makefile | 3 +++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
> index c2024ac..b7c494b 100644
> --- a/arch/powerpc/kvm/Kconfig
> +++ b/arch/powerpc/kvm/Kconfig
> @@ -64,6 +64,7 @@ config KVM_BOOK3S_64
>   select KVM_BOOK3S_64_HANDLER
>   select KVM
>   select KVM_BOOK3S_PR_POSSIBLE if !KVM_BOOK3S_HV_POSSIBLE
> + select KVM_VFIO if VFIO
>   ---help---
> Support running unmodified book3s_64 and book3s_32 guest kernels
> in virtual machines on book3s_64 host processors.
> diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
> index 1f9e552..8907af9 100644
> --- a/arch/powerpc/kvm/Makefile
> +++ b/arch/powerpc/kvm/Makefile
> @@ -88,6 +88,9 @@ endif
>  kvm-book3s_64-objs-$(CONFIG_KVM_XICS) += \
>   book3s_xics.o
>  
> +kvm-book3s_64-objs-$(CONFIG_KVM_VFIO) += \
> + $(KVM)/vfio.o
> +
>  kvm-book3s_64-module-objs += \
>   $(KVM)/kvm_main.o \
>   $(KVM)/eventfd.o \

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson

