date:20140821

RE: kernel boot fail with efi earlyprintk (bisected)

2014-08-21 Thread Zheng, Lv

Hi,

I checked arch/x86/platform/efi/early_printk.c.
In early_efi_scroll_up(), 2 mapping entries are used for the src/dst screen
buffers.
In drivers/acpi/acpica/tbutils.c, we've improved the early table loading code
in acpi_tb_parse_root_table().
We now need 2 mapping entries:
1. One mapping entry is used for the RSDT table mapping. Each RSDT entry
contains an address for another ACPI table.
2. For each entry in the RSDT, we need another mapping entry to map the table
and perform the necessary checks/overrides before installing it.

When acpi_tb_parse_root_table() prints something through the EFI earlyprintk
console, we'll have 4 mapping entries in use.
The current 4-slot setting of early_ioremap() seems to be too small for such a
use case.

I'm not 100% sure if this is the cause.
If it is the cause and we think both of the mappings are reasonable, we can
simply increase FIX_BTMAPS_SLOTS, defined in arch/x86/include/asm/fixmap.h.
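
To make the slot arithmetic concrete, here is a minimal user-space sketch of
a fixed slot pool. It is only a model under stated assumptions: NR_SLOTS,
struct slot_pool, slot_map(), slot_unmap() and scroll_up() are hypothetical
names, not the kernel's early_ioremap() API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the number of early mapping slots. */
#define NR_SLOTS 4

struct slot_pool {
	bool used[NR_SLOTS];
};

/* Claim the first free slot; return its index, or -1 if none is free. */
int slot_map(struct slot_pool *p)
{
	for (int i = 0; i < NR_SLOTS; i++) {
		if (!p->used[i]) {
			p->used[i] = true;
			return i;
		}
	}
	return -1;
}

/* Release a slot previously returned by slot_map(). */
void slot_unmap(struct slot_pool *p, int idx)
{
	assert(idx >= 0 && idx < NR_SLOTS && p->used[idx]);
	p->used[idx] = false;
}

/*
 * Model of a console scroll: it needs two slots at once (source and
 * destination) and releases both before returning.
 * Returns 0 on success, -1 if either mapping cannot be made.
 */
int scroll_up(struct slot_pool *p)
{
	int src = slot_map(p);
	int dst;

	if (src < 0)
		return -1;
	dst = slot_map(p);
	if (dst < 0) {
		slot_unmap(p, src);
		return -1;
	}
	slot_unmap(p, dst);
	slot_unmap(p, src);
	return 0;
}
```

In this model, two held table mappings plus the two scroll mappings fit
exactly into 4 slots, so there is no headroom: one more concurrent holder is
enough to make the scroll fail.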

What do you think of this?

Thanks and best regards
-Lv

> From: Zheng, Lv
> Sent: Friday, August 22, 2014 9:43 AM
> 
> Hi,
> 
> There are only a limited number of entries in the x86 early mapping, which
> is implemented by the FIXMAP.
> So this means that for every __init call invoked for x86, if there is an
> early mapping in it, it should be unmapped before exiting the __init call.
> 
> Using this rule, all __init call implementers can make sure that before
> entering the __init call, the limited number of FIXMAP entries is
> enough.
> 
> The following bisected commit just increases the number of early mappings
> from 1 to 2 in the ACPICA early table handling code.
> The number 2 is less than the number of available FIXMAP entries.
> And the ACPICA code has ensured that all mappings are correctly unmapped
> after the table initialization.
> So we didn't break the rule.
> 
> We can offer a workaround in ACPICA to reduce the mapping count from 2 to 1
> using a global option.
> But this report sounds like the root cause is that earlyprintk=efi has
> broken the above rule, and the existing issue was merely triggered by
> this cleanup.
> So could someone check the earlyprintk=efi code first?
> I think earlyprintk=efi should either unmap the additional mappings or
> increase the number of FIXMAP entries in case earlyprintk=efi
> needs additional early mappings.
> Otherwise there will always be a chance for earlyprintk=efi to break future
> code.
> 
> Thanks and best regards
> -Lv
> 
> > From: Matt Fleming [mailto:m...@console-pimps.org]
> > Sent: Friday, August 22, 2014 4:52 AM
> >
> > On Tue, 19 Aug, at 04:16:58PM, Dave Young wrote:
> > > Hi,
> > >
> > > 3.16 kernel boot fails with earlyprintk=efi on my laptop.
> > > It keeps scrolling at the bottom line of the screen.
> > >
> > > Bisected, the first bad commit is below:
> > > commit 86dfc6f339886559d80ee0d4bd20fe5ee90450f0
> > > Author: Lv Zheng 
> > > Date:   Fri Apr 4 12:38:57 2014 +0800
> > >
> > > ACPICA: Tables: Fix table checksums verification before installation.
> > >
> > >
> > > I did some debugging by enabling both serial and efi earlyprintk, below is
> > > some debug dmesg, seems early_ioremap fails in scroll up function due to
> > > no free slot, but I'm still not sure if the debug info is right or not.
> >
> > Thanks Dave, your callstack seems to make sense.
> >
> > Can you also enable early_ioremap_debug so that we can figure out where
> > all the FIXMAP slots are going?
> >
> > --
> > Matt Fleming, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 9/9] ARM: zynq: Rename 'zynq_platform_cpu_die'

2014-08-21 Thread Michal Simek

On 08/21/2014 09:02 PM, Sören Brinkmann wrote:
> On Thu, 2014-08-21 at 02:28PM +0200, Michal Simek wrote:
>> On 08/20/2014 10:41 PM, Soren Brinkmann wrote:
>>> Match the naming pattern of all other SMP ops and rename
>>> zynq_platform_cpu_die --> zynq_cpu_die.
>>>
>>> Signed-off-by: Soren Brinkmann 
>>> ---
>>>  arch/arm/mach-zynq/platsmp.c | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/arm/mach-zynq/platsmp.c b/arch/arm/mach-zynq/platsmp.c
>>> index 04e578718aa2..95933c5e70e1 100644
>>> --- a/arch/arm/mach-zynq/platsmp.c
>>> +++ b/arch/arm/mach-zynq/platsmp.c
>>> @@ -138,7 +138,7 @@ static int zynq_cpu_kill(unsigned cpu)
>>>   *
>>>   * Called with IRQs disabled
>>>   */
>>> -static void zynq_platform_cpu_die(unsigned int cpu)
>>> +static void zynq_cpu_die(unsigned int cpu)
>>>  {

Sorry for the incorrect place - I meant to use kernel-doc here, not below. :-)


>>> zynq_slcr_cpu_state_write(cpu, true);
>>>  
>>> @@ -158,7 +158,7 @@ struct smp_operations zynq_smp_ops __initdata = {
>>> .smp_boot_secondary = zynq_boot_secondary,
>>> .smp_secondary_init = zynq_secondary_init,
>>>  #ifdef CONFIG_HOTPLUG_CPU
>>> -   .cpu_die= zynq_platform_cpu_die,
>>> +   .cpu_die= zynq_cpu_die,
>>> .cpu_kill   = zynq_cpu_kill,
>>>  #endif
>>>  };
>>
>> It would be good if you could fix the kernel-doc format for this function
>> too. It is just a nice-to-have.
> 
> All these SMP-ops should be documented in the header defining that
> struct, shouldn't they?

This struct is not necessary. I added my comment in the wrong place.

Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng), OpenPGP -> KeyID: FE3D1F91
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform





Re: [PATCH 1/2] perf/x86/intel: Add Haswell-EP uncore support

2014-08-21 Thread Andi Kleen

> diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c 
> b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> index 4785ee8..2485fd9 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> @@ -883,6 +883,8 @@ static int __init uncore_pci_init(void)
>   case 62: /* Ivy Bridge-EP */
>   ret = ivbep_uncore_pci_init();
>   break;
> + case 63: /* Haswell-EP */
> + ret = hswep_uncore_pci_init();

Is the lack of break intentional? If yes please add a /* FALL THROUGH */
comment.

>   case 42: /* Sandy Bridge */
>   ret = snb_uncore_pci_init();
>   break;
> 
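
For readers following along, the effect of the missing break can be
reproduced in a small stand-alone sketch. The names init_a(), init_b() and
the two dispatch functions are hypothetical and only mirror the shape of the
quoted switch; they are not the uncore code.

```c
#include <assert.h>

/* Hypothetical init routines; each returns a distinct value. */
int init_a(void) { return 10; }
int init_b(void) { return 20; }

/* Case 63 has no break, so control continues into case 42. */
int dispatch_missing_break(int model)
{
	int ret = 0;

	switch (model) {
	case 63:
		ret = init_a();
		/* FALL THROUGH */
	case 42:
		ret = init_b();	/* overwrites the result of init_a() */
		break;
	default:
		break;
	}
	return ret;
}

/* Same dispatch with the break in place. */
int dispatch_with_break(int model)
{
	int ret = 0;

	switch (model) {
	case 63:
		ret = init_a();
		break;
	case 42:
		ret = init_b();
		break;
	default:
		break;
	}
	return ret;
}
```

Without the break, model 63 ends up with the return value of the second
routine, which is why an explicit comment is wanted if that is intended.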

-- 
a...@linux.intel.com -- Speaking for myself only.

Re: [PATCH RFC 1/3] x86: Make page cache mode a real type

2014-08-21 Thread Juergen Gross


On 08/22/2014 12:09 AM, Toshi Kani wrote:

On Tue, 2014-08-19 at 15:25 +0200, jgr...@suse.com wrote:

From: Juergen Gross 

At the moment there are a lot of places that handle setting or getting
the page cache mode by treating the pgprot bits equal to the cache mode.
This is only true because there are a lot of assumptions about the setup
of the PAT MSR. Otherwise the cache type needs to get translated into
pgprot bits and vice versa.

This patch tries to prepare for that by introducing a separate type
for the cache mode and adding functions to translate between those and pgprot
values.

To avoid too much performance penalty the translation between cache mode
and pgprot values is done via tables which contain the relevant information.
Write-back cache mode is hard-wired to be 0, all other modes are configurable
via those tables. For large pages there are translation functions as the
PAT bit is located at different positions in the ptes of 4k and large pages.


One more comment below..


diff --git a/arch/x86/include/asm/cacheflush.h 
b/arch/x86/include/asm/cacheflush.h

  :

-static inline void set_page_memtype(struct page *pg, unsigned long memtype)
+static inline void set_page_memtype(struct page *pg,
+   enum page_cache_mode memtype)
  {
unsigned long memtype_flags = _PGMT_DEFAULT;
unsigned long old_flags;
unsigned long new_flags;

switch (memtype) {
-   case _PAGE_CACHE_WC:
+   case _PAGE_CACHE_MODE_WC:
memtype_flags = _PGMT_WC;
break;
-   case _PAGE_CACHE_UC_MINUS:
+   case _PAGE_CACHE_MODE_UC_MINUS:
memtype_flags = _PGMT_UC_MINUS;
break;
-   case _PAGE_CACHE_WB:
+   case _PAGE_CACHE_MODE_WB:
+   default:
memtype_flags = _PGMT_WB;
break;
}


Adding the "default" case handled as _PGMT_WB is not correct here.
free_ram_pages_type() calls set_page_memtype() with -1, which needs to
be set to _PGMT_DEFAULT.
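
A minimal sketch of the behaviour being asked for, assuming hypothetical
stand-in names (MODE_*, PGMT_*, memtype_to_flag()) rather than the real
kernel definitions: unknown values such as -1 must leave the default in
place instead of being treated as write-back.

```c
#include <assert.h>

/* Hypothetical stand-ins for the cache modes and the page-flag values. */
enum cache_mode { MODE_WB = 0, MODE_WC = 1, MODE_UC_MINUS = 2 };
enum pgmt_flag { PGMT_DEFAULT, PGMT_WC, PGMT_UC_MINUS, PGMT_WB };

/* Map a cache mode to a flag; anything unrecognised keeps the default. */
enum pgmt_flag memtype_to_flag(int memtype)
{
	enum pgmt_flag flag = PGMT_DEFAULT;

	switch (memtype) {
	case MODE_WC:
		flag = PGMT_WC;
		break;
	case MODE_UC_MINUS:
		flag = PGMT_UC_MINUS;
		break;
	case MODE_WB:
		flag = PGMT_WB;
		break;
	default:
		/* e.g. -1: leave PGMT_DEFAULT untouched */
		break;
	}
	return flag;
}
```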


It says so in the comment above. I'll correct it, thanks.

Juergen


Re: [alsa-devel] /proc/asound/card0/oss_mixer stack corruption

2014-08-21 Thread Takashi Iwai

At Thu, 21 Aug 2014 20:55:21 +0200,
Clemens Ladisch wrote:
> 
> Tommi Rantala wrote:
> > Trinity discovered that writing 128 bytes to
> > /proc/asound/card0/oss_mixer triggers a stack corruption.
> >
> > Call Trace:
> >  [] __stack_chk_fail+0x16/0x20
> >  [] snd_mixer_oss_proc_write+0x24a/0x270
> 
> snd_info_get_line() wants the len parameter to be one less than the
> buffer size, but it isn't:
> 
>   while (!snd_info_get_line(buffer, line, sizeof(line))) {
> 
> Not that *any* other caller got it correct either:
> 
> sound/core/oss/pcm_oss.c: while (!snd_info_get_line(buffer, line, 
> sizeof(line))) {
> sound/core/pcm.c: if (!snd_info_get_line(buffer, line, sizeof(line)))
> sound/core/pcm_memory.c:  if (!snd_info_get_line(buffer, line, 
> sizeof(line))) {
> sound/drivers/dummy.c:while (!snd_info_get_line(buffer, line, 
> sizeof(line))) {
> sound/pci/ac97/ac97_proc.c:   while (!snd_info_get_line(buffer, line, 
> sizeof(line))) {
> sound/pci/ca0106/ca0106_proc.c:while (!snd_info_get_line(buffer, 
> line, sizeof(line))) {
> sound/pci/ca0106/ca0106_proc.c:while (!snd_info_get_line(buffer, 
> line, sizeof(line))) {
> sound/pci/ca0106/ca0106_proc.c:while (!snd_info_get_line(buffer, 
> line, sizeof(line))) {
> sound/pci/emu10k1/emu10k1x.c: while (!snd_info_get_line(buffer, line, 
> sizeof(line))) {
> sound/pci/emu10k1/emuproc.c:  while (!snd_info_get_line(buffer, line, 
> sizeof(line))) {
> sound/pci/emu10k1/emuproc.c:  while (!snd_info_get_line(buffer, line, 
> sizeof(line))) {
> sound/pci/hda/hda_eld.c:  while (!snd_info_get_line(buffer, line, 
> sizeof(line))) {
> sound/pci/ice1712/pontis.c:   while (!snd_info_get_line(buffer, line, 
> sizeof(line))) {
> sound/pci/ice1712/prodigy_hifi.c: while (!snd_info_get_line(buffer, line, 
> sizeof(line))) {
> sound/pci/lola/lola_proc.c:   while (!snd_info_get_line(buffer, line, 
> sizeof(line))) {
> sound/pci/pcxhr/pcxhr.c:  while (!snd_info_get_line(buffer, line, 
> sizeof(line))) {
> 
> Oh well.  At least these proc files are writable only by root, and the
> fix is easy:

Indeed.  Applied now, thanks.


Takashi

> 
> --8<>8--
> ALSA: core: fix buffer overflow in snd_info_get_line()
> 
> snd_info_get_line() documents that its last parameter must be one
> less than the buffer size, but this API design guarantees that
> (literally) every caller gets it wrong.
> 
> Just change this parameter to have its obvious meaning.
> 
> Reported-by: Tommi Rantala 
> Cc:  # v2.2.26+
> Signed-off-by: Clemens Ladisch 
> ---
>  sound/core/info.c |4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> --- a/sound/core/info.c
> +++ b/sound/core/info.c
> @@ -684,7 +684,7 @@ int snd_info_card_free(struct snd_card *card)
>   * snd_info_get_line - read one line from the procfs buffer
>   * @buffer: the procfs buffer
>   * @line: the buffer to store
> - * @len: the max. buffer size - 1
> + * @len: the max. buffer size
>   *
>   * Reads one line from the buffer and stores the string.
>   *
> @@ -704,7 +704,7 @@ int snd_info_get_line(struct snd_info_buffer *buffer, 
> char *line, int len)
>   buffer->stop = 1;
>   if (c == '\n')
>   break;
> - if (len) {
> + if (len > 1) {
>   len--;
>   *line++ = c;
>   }
> 
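
The semantics after the fix can be sketched outside the kernel as follows.
This is a simplified model, not the ALSA function: get_line() and its
parameters are hypothetical, and it reads from a plain memory buffer instead
of a struct snd_info_buffer. The point is only that len is the full buffer
size and one byte is always kept for the terminating NUL.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy one line from src (srclen bytes) into line, a buffer of len bytes
 * including room for the terminating NUL. Characters that do not fit are
 * dropped. Returns the number of source bytes consumed.
 */
size_t get_line(const char *src, size_t srclen, char *line, size_t len)
{
	size_t i = 0;

	assert(len > 0);
	while (i < srclen) {
		char c = src[i++];

		if (c == '\n')
			break;
		if (len > 1) {	/* always keep one byte for the NUL */
			len--;
			*line++ = c;
		}
	}
	*line = '\0';
	return i;
}
```

With this contract a caller can pass sizeof(line) directly, which is what
every caller listed above already does.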

Re: [PATCH] drivers: pci: convert generic host controller to DT host bridge creation API

2014-08-21 Thread Bjorn Helgaas

On Thu, Aug 21, 2014 at 6:01 PM, Liviu Dudau  wrote:
> On Thu, Aug 21, 2014 at 12:02:16PM -0600, Bjorn Helgaas wrote:
>> [+cc Lorenzo]
>>
>> On Wed, Aug 20, 2014 at 05:35:59PM -0500, Bjorn Helgaas wrote:
>> > On Wed, Aug 20, 2014 at 7:31 AM, Liviu Dudau  wrote:
>> > > On Wed, Aug 20, 2014 at 01:27:57PM +0200, Arnd Bergmann wrote:
>> > >> On Tuesday 12 August 2014, Liviu Dudau wrote:
>> > >> > +   return of_create_pci_host_bridge(dev, 0, 0xff, &gen_pci_ops,
>> > >> > +   gen_pci_setup, pci);
>> > >>
>> > >> I had not noticed it earlier, but the setup callback is actually a 
>> > >> feature
>> > >> of the arm32 PCI code that I had hoped to avoid when moving to the
>> > >> generic API. Can we do this as a more regular sequence of
>> > >>
>> > >>
>> > >>   ret = of_create_pci_host_bridge(dev, 0, 0xff, &gen_pci_ops, pci);
>> > >>   if (ret)
>> > >>   return ret;
>> > >>
>> > >>   ret = gen_pci_setup(pci);
>> > >>   if (ret)
>> > >>   pci_destroy_host_bridge(dev, pci);
>> > >>   return ret;
>> > >>
>> > >> ?
>> > >>
>> > >>   Arnd
>> > >
>> > > Hi Arnd,
>> > >
>> > > That has been the general approach of my patchset up to v9. But, as 
>> > > Bjorn has
>> > > mentioned in his v8 review and I have put in my cover letter, the regular
>> > > approach means that architectures that use pci_scan_root_bus() will have 
>> > > to
>> > > drop their one liner and replace it with the more verbose 
>> > > of_create_pci_host_bridge()
>> > > followed by pci_scan_child_bus() and pci_bus_add_devices() (basically, 
>> > > the content
>> > > of pci_scan_root_bus()). For those architectures it will lead to a net 
>> > > increase of
>> > > lines of code.
>> > >
>> > > The patch for pci-host-generic.c is the first to use the callback setup 
>> > > function, but
>> > > not the only one. My PCI host bridge driver for Juno has the same need, 
>> > > and I'm betting
>> > > all other host bridge controllers will use it as it will be the only 
>> > > opportunity to
>> > > finish the controller setup before we start scanning the child busses. 
>> > > I'm trying to
>> > > balance ease of read vs ease of use here and it is the best version I've 
>> > > come up with
>> > > so far.
>> >
>> > My guess is that you're referring to
>> > http://lkml.kernel.org/r/20140708011136.ge22...@google.com
>> >
>> > I'm trying to get to the point where arch code can discover the host
>> > bridge, configure it, learn its properties (apertures, etc.), then
>> > pass it off completely to the PCI core for PCI device enumeration.
>> > pci_scan_root_bus() is the closest thing we have to that right now, so
>> > that's why I point to that.  Here's the current pci_scan_root_bus():
>> >
>> >   pci_scan_root_bus()
>> >   {
>> > pci_create_root_bus();
>> > /* 1 */
>> > pci_scan_child_bus()
>> > /* 2 */
>> > pci_bus_add_devices()
>> >   }
>> >
>> > This is obviously incomplete as it is -- for example, it does nothing
>> > about assigning resources to PCI devices, so it only works if we rely
>> > completely on the firmware to do that.  Some arches (x86, ia64, etc.)
>> > don't want to rely on firmware, so they basically open-code
>> > pci_scan_root_bus() and insert resource assignment at (2) above.  That
>> > resource assignment really *should* be done in pci_scan_root_bus()
>> > itself, but it's quite a bit of work to make that happen.
>> >
>> > In your case, of_create_pci_host_bridge() open-codes
>> > pci_scan_root_bus() and calls the "setup" callback at (1) in the
>> > outline above.  I don't have any problem with that, and I don't care
>> > whether you do it by passing in a callback function pointer or via
>> > some other means.
>> >
>> > However, I would ask whether this is really a requirement.  Most
>> > (maybe all) other arches require nothing special at (1), i.e., between
>> > pci_create_root_bus() and pci_scan_child_bus().  If you can do it
>> > *before* pci_create_root_bus(), I think that would be nicer, but maybe
>> > you can't.
>>
>> I talked to Lorenzo here at LinuxCon and he explained this so it makes a
>> lot more sense to me now.  Would something like the following work?
>>
>>   gen_pci_probe()
>>   {
>> LIST_HEAD(res);
>> resource_size_t io_base = 0;
>>
>> of_parse_pci_host_bridge_resources(dev, &res, 0, 0xff, &io_base);
>> gen_pci_setup(&res, io_base);
>>
>> pci_create_root_bus(..., &res);
>> pci_scan_child_bus();
>> ... pci_assign_unassigned_bus_resources
>> pci_bus_add_resources();
>>   }
>>
>> Then we at least have all the PCI-related code consolidated, without
>> the arch-specific stuff mixed in.  We could almost use pci_scan_root_bus(),
>> but not quite, because of the pci_assign_unassigned_bus_resources() call
>> that pci_scan_root_bus() doesn't do.
>
> Hmm, after having a little bit more time to get my brain back into the problem
> I'm now not sure this will be good enough.
>
> Let me explain what I was trying to solve with the

Re: [PATCH] powerpc: edac: Fix build error

2014-08-21 Thread Borislav Petkov

On Thu, Aug 21, 2014 at 10:19:51PM -0400, Pranith Kumar wrote:
> Fix the following build error:
> 
> drivers/edac/ppc4xx_edac.c: In function 'mfsdram':
> drivers/edac/ppc4xx_edac.c:249: error: implicit declaration of function
> '__mfdcri'
> drivers/edac/ppc4xx_edac.c: In function 'mtsdram':
> drivers/edac/ppc4xx_edac.c:266: error: implicit declaration of function
> '__mtdcri'
> drivers/edac/ppc4xx_edac.c:269: warning: 'return' with a value, in function
> returning void
> drivers/edac/ppc4xx_edac.c: In function 'ppc4xx_edac_init_csrows':
> drivers/edac/ppc4xx_edac.c:924: warning: initialization from incompatible
> pointer type
> drivers/edac/ppc4xx_edac.c:977: error: request for member 'dimm' in something
> not a structure or union
> drivers/edac/ppc4xx_edac.c: In function 'ppc4xx_edac_map_dcrs':
> drivers/edac/ppc4xx_edac.c:1209: warning: passing argument 1 of 'dcr_map_mmio'
> discards qualifiers from pointer target type
> 
> This driver depends on PPC_DCR_NATIVE to be set for the relevant headers to be
> included. Also if PPC_DCR_MMIO=n the build fails. So make PPC_DCR depend on 
> both
> these options.
> 
> This is compile tested only.

This driver has been carried through the years from misc people after
its initial drop in 2009. And frankly, I'd prefer if someone with the
hardware could actually test it before we break it any further.

Alternatively, if it is obsolete and no one uses it, we could very well
delete it instead. Initial commit talks about the hw it supports:

dba7a77c0e40 ("edac: new ppc4xx driver module")

So it would be good if we got some opinions from the PPC crowd. Who are
on CC.

(leaving in the rest for context).

> 
> Signed-off-by: Pranith Kumar 
> CC: Andrew Morton 
> ---
>  arch/powerpc/Kconfig   | 6 +++---
>  drivers/edac/ppc4xx_edac.c | 8 
>  2 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 4bc7b62..9b90c1c 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -233,15 +233,15 @@ config ARCH_SUSPEND_POSSIBLE
>  
>  config PPC_DCR_NATIVE
>   bool
> - default n
> + default y
>  
>  config PPC_DCR_MMIO
>   bool
> - default n
> + default y
>  
>  config PPC_DCR
>   bool
> - depends on PPC_DCR_NATIVE || PPC_DCR_MMIO
> + depends on PPC_DCR_NATIVE && PPC_DCR_MMIO
>   default y
>  
>  config PPC_OF_PLATFORM_PCI
> diff --git a/drivers/edac/ppc4xx_edac.c b/drivers/edac/ppc4xx_edac.c
> index ef6b7e0..8725b73 100644
> --- a/drivers/edac/ppc4xx_edac.c
> +++ b/drivers/edac/ppc4xx_edac.c
> @@ -246,8 +246,8 @@ static const char * const ppc4xx_plb_masters[9] = {
>  static inline u32
>  mfsdram(const dcr_host_t *dcr_host, unsigned int idcr_n)
>  {
> - return __mfdcri(dcr_host->base + SDRAM_DCR_ADDR_OFFSET,
> - dcr_host->base + SDRAM_DCR_DATA_OFFSET,
> + return __mfdcri(dcr_host->host.native.base + SDRAM_DCR_ADDR_OFFSET,
> + dcr_host->host.native.base + SDRAM_DCR_DATA_OFFSET,
>   idcr_n);
>  }
>  
> @@ -263,8 +263,8 @@ mfsdram(const dcr_host_t *dcr_host, unsigned int idcr_n)
>  static inline void
>  mtsdram(const dcr_host_t *dcr_host, unsigned int idcr_n, u32 value)
>  {
> - return __mtdcri(dcr_host->base + SDRAM_DCR_ADDR_OFFSET,
> - dcr_host->base + SDRAM_DCR_DATA_OFFSET,
> + return __mtdcri(dcr_host->host.native.base + SDRAM_DCR_ADDR_OFFSET,
> + dcr_host->host.native.base + SDRAM_DCR_DATA_OFFSET,
>   idcr_n,
>   value);
>  }
> -- 
> 1.9.1
> 
> 

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.

Re: [PATCH net-next 2/2] net: exit busy loop when another process is runnable

2014-08-21 Thread Mike Galbraith

On Thu, 2014-08-21 at 16:05 +0800, Jason Wang wrote: 
> Rx busy loop does not scale well when several parallel sessions are
> active. This is because we keep looping even if another process is
> runnable. For example, if that process is about to send a packet,
> continuing to busy poll in the current process adds extra delay and
> hurts performance.
> 
> This patch solves this issue by exiting the busy loop when another
> process is runnable on the current cpu. A simple test that pins two
> netperf sessions to the same cpu on the receiving side shows an obvious
> improvement:

That patch says to me it's a bad idea to spin when someone (anyone) else
can get some work done on a CPU, which intuitively makes sense.  But..

(ponders net goop: with silly 1 byte ping-pong load, throughput is bound
by fastpath latency, net plus sched plus fixable nohz and governor crud
if not polling, so you can't get a lot of data moved byte at a time no
matter how sexy the pipe whether polling or not due to bound.  If OTOH
net hardware is a blazing fast large bore packet cannon, net overhead
per unit payload drops, sched+crud is a constant)

Seems the only time it's a good idea to poll is if blasting big packets
on sexy hardware, and if you're doing that, you want to poll regardless
of whether somebody else is waiting, or?

> Before:
> netperf -H 192.168.100.2 -T 0,0 -t TCP_RR -P 0 & \
> netperf -H 192.168.100.2 -T 1,0 -t TCP_RR -P 0
> 16384  87380  1        1       10.00    15513.74
> 16384  87380
> 16384  87380  1        1       10.00    15092.78
> 16384  87380
> 
> After:
> netperf -H 192.168.100.2 -T 0,0 -t TCP_RR -P 0 & \
> netperf -H 192.168.100.2 -T 1,0 -t TCP_RR -P 0
> 16384  87380  1        1       10.00    23334.53
> 16384  87380
> 16384  87380  1        1       10.00    23327.58
> 16384  87380
> 
> The benchmark was done with two 8-core Xeon machines connected back to back
> with mlx4, using the netperf TCP_RR test (busy_read was set to 50):
> 
> sessions/bytes/before/after/+improvement%/busy_read=0/
> 1/1/30062.10/30034.72/+0%/20228.96/
> 16/1/214719.83/307669.01/+43%/268997.71/
> 32/1/231252.81/345845.16/+49%/336157.442/
> 64/512/212467.39/373464.93/+75%/397449.375/
> 
> Signed-off-by: Jason Wang 
> ---
>  include/net/busy_poll.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/include/net/busy_poll.h b/include/net/busy_poll.h
> index 1d67fb6..8a33fb2 100644
> --- a/include/net/busy_poll.h
> +++ b/include/net/busy_poll.h
> @@ -109,7 +109,8 @@ static inline bool sk_busy_loop(struct sock *sk, int 
> nonblock)
>   cpu_relax();
>  
>   } while (!nonblock && skb_queue_empty(&sk->sk_receive_queue) &&
> -  !need_resched() && !busy_loop_timeout(end_time));
> +  !need_resched() && !busy_loop_timeout(end_time) &&
> +  nr_running_this_cpu() < 2);
>  
>   rc = !skb_queue_empty(&sk->sk_receive_queue);
>  out:
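
The loop condition in the hunk above can be read as a single predicate. The
sketch below restates it in user space under stated assumptions: struct
poll_state and keep_polling() are hypothetical, and nr_running stands for
whatever the patch's nr_running_this_cpu() helper reports.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the inputs the busy loop checks each round. */
struct poll_state {
	bool nonblock;		/* caller asked for a non-blocking poll */
	bool queue_empty;	/* nothing has arrived on the socket yet */
	bool need_resched;	/* scheduler wants this CPU back */
	bool timed_out;		/* busy-poll time budget is used up */
	unsigned int nr_running;/* runnable tasks on this CPU, incl. us */
};

/* Keep spinning only while every condition for polling still holds. */
bool keep_polling(const struct poll_state *s)
{
	return !s->nonblock && s->queue_empty &&
	       !s->need_resched && !s->timed_out &&
	       s->nr_running < 2;
}
```

With nr_running equal to 1 only the poller itself is runnable, so the loop
behaves as before; a second runnable task ends the spin early.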



Re: [PATCH v9 03/12] PCI: Introduce helper functions to deal with PCI I/O ranges.

2014-08-21 Thread Rob Herring

On Tue, Aug 12, 2014 at 11:25 AM, Liviu Dudau  wrote:
> Some architectures do not have a simple view of the PCI I/O space
> and instead use a range of CPU addresses that map to bus addresses.
> For some architectures these ranges will be expressed by OF bindings
> in a device tree file.
>
> This patch introduces a pci_register_io_range() helper function with
> a generic implementation that can be used by such architectures to
> keep track of the I/O ranges described by the PCI bindings. If the
> PCI_IOBASE macro is not defined that signals lack of support for PCI
> and we return an error.
>
> In order to retrieve the CPU address associated with an I/O port, a
> new helper function pci_pio_to_address() is introduced. This will
> search in the list of ranges registered with pci_register_io_range()
> and return the CPU address that corresponds to the given port.
>
> Cc: Grant Likely 
> Cc: Rob Herring 
> Cc: Arnd Bergmann 
> Signed-off-by: Liviu Dudau 

Acked-by: Rob Herring 
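
As an illustration of the port/address bookkeeping the commit message
describes, here is a small user-space sketch. It is an assumption-laden
model, not the patch itself: it uses a plain array instead of a locked list,
and struct io_range, pio_to_address(), address_to_pio(), BAD_ADDR and
BAD_PIO are hypothetical names. Ranges are packed back to back in port
space in registration order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BAD_ADDR UINT64_MAX
#define BAD_PIO  ((unsigned long)-1)

/* One registered I/O window: CPU physical start address and size. */
struct io_range {
	uint64_t start;
	uint64_t size;
};

/* Translate a port number into the CPU address that backs it. */
uint64_t pio_to_address(const struct io_range *r, size_t n, unsigned long pio)
{
	uint64_t base = 0;	/* first port number of the current range */

	for (size_t i = 0; i < n; i++) {
		if (pio >= base && pio < base + r[i].size)
			return r[i].start + (pio - base);
		base += r[i].size;
	}
	return BAD_ADDR;
}

/* Translate a CPU address back into its port number. */
unsigned long address_to_pio(const struct io_range *r, size_t n, uint64_t addr)
{
	uint64_t base = 0;

	for (size_t i = 0; i < n; i++) {
		if (addr >= r[i].start && addr < r[i].start + r[i].size)
			return (unsigned long)(addr - r[i].start + base);
		base += r[i].size;
	}
	return BAD_PIO;
}
```

The two functions are inverses of each other for any port that falls inside
a registered range.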


> ---
>  drivers/of/address.c   | 95 
> ++
>  include/linux/of_address.h |  2 +
>  2 files changed, 97 insertions(+)
>
> diff --git a/drivers/of/address.c b/drivers/of/address.c
> index 5edfcb0..4dab700 100644
> --- a/drivers/of/address.c
> +++ b/drivers/of/address.c
> @@ -5,6 +5,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>
>  /* Max address size we deal with */
> @@ -601,12 +602,106 @@ const __be32 *of_get_address(struct device_node *dev, 
> int index, u64 *size,
>  }
>  EXPORT_SYMBOL(of_get_address);
>
> +#ifdef PCI_IOBASE
> +struct io_range {
> +   struct list_head list;
> +   phys_addr_t start;
> +   resource_size_t size;
> +};
> +
> +static LIST_HEAD(io_range_list);
> +static DEFINE_SPINLOCK(io_range_lock);
> +#endif
> +
> +/*
> + * Record the PCI IO range (expressed as CPU physical address + size).
> + * Return a negative value if an error has occured, zero otherwise
> + */
> +int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
> +{
> +#ifdef PCI_IOBASE
> +   struct io_range *range;
> +   resource_size_t allocated_size = 0;
> +
> +   /* check if the range hasn't been previously recorded */
> +   spin_lock(&io_range_lock);
> +   list_for_each_entry(range, &io_range_list, list) {
> +   if (addr >= range->start && addr + size <= range->start + 
> size)
> +   return 0;
> +   allocated_size += range->size;
> +   }
> +   spin_unlock(&io_range_lock);
> +
> +   /* range not registed yet, check for available space */
> +   if (allocated_size + size - 1 > IO_SPACE_LIMIT) {
> +   /* if it's too big check if 64K space can be reserved */
> +   if (allocated_size + SZ_64K - 1 > IO_SPACE_LIMIT)
> +   return -E2BIG;
> +
> +   size = SZ_64K;
> +   pr_warn("Requested IO range too big, new size set to 64K\n");
> +   }
> +
> +   /* add the range to the list */
> +   range = kzalloc(sizeof(*range), GFP_KERNEL);
> +   if (!range)
> +   return -ENOMEM;
> +
> +   range->start = addr;
> +   range->size = size;
> +
> +   list_add_tail(&range->list, &io_range_list);
> +#endif
> +
> +   return 0;
> +}
> +
> +phys_addr_t pci_pio_to_address(unsigned long pio)
> +{
> +   phys_addr_t address = (phys_addr_t)OF_BAD_ADDR;
> +
> +#ifdef PCI_IOBASE
> +   struct io_range *range;
> +   resource_size_t allocated_size = 0;
> +
> +   if (pio > IO_SPACE_LIMIT)
> +   return address;
> +
> +   spin_lock(&io_range_lock);
> +   list_for_each_entry(range, &io_range_list, list) {
> +   if (pio >= allocated_size && pio < allocated_size + 
> range->size) {
> +   address = range->start + pio - allocated_size;
> +   break;
> +   }
> +   allocated_size += range->size;
> +   }
> +   spin_unlock(&io_range_lock);
> +#endif
> +
> +   return address;
> +}
> +
>  unsigned long __weak pci_address_to_pio(phys_addr_t address)
>  {
> +#ifdef PCI_IOBASE
> +   struct io_range *res;
> +   resource_size_t offset = 0;
> +
> +   list_for_each_entry(res, &io_range_list, list) {
> +   if (address >= res->start &&
> +   address < res->start + res->size) {
> +   return res->start - address + offset;
> +   }
> +   offset += res->size;
> +   }
> +
> +   return (unsigned long)-1;
> +#else
> if (address > IO_SPACE_LIMIT)
> return (unsigned long)-1;
>
> return (unsigned long) address;
> +#endif
>  }
>
>  static int __of_address_to_resource(struct device_node *dev,
> diff --git a/include/linux/of_address.h b/include/linux/of_address.h
> index c13b878..28e6836 100644
> --- a/include/linux/of_address.h
> +++ b/include/linux/of_address.h
> @@ -55,7 +55,9 @@ extern void __iomem *of_iomap(struct

Re: amd_mce.c redundant if check?

2014-08-21 Thread Borislav Petkov

Hi,

first of all, please remember to hit Reply-to-all when replying to mails
on lkml otherwise your note might get lost in the flood.

On Wed, Aug 20, 2014 at 10:08:18PM -0600, Chip wrote:
> On Wed, Aug 20, 2014 at 11:18:21AM -0600, Adam Duskett wrote:
> 
> >I have recently come upon this section of code in
> >arch/x86/kernel/cpu/mcheck/mce_amd.c that seems to be a redundant
> >unnecessary if check.
> >
> >
> >From line 170 - 176:
> >
> >if (tr->set_lvt_off) {
> >if (lvt_off_valid(tr->b, tr->lvt_off, lo, hi)) {
> >/* set new lvt offset */
> >hi &= ~MASK_LVTOFF_HI;
> >hi |= tr->lvt_off << 20;
> >}
> >}
> >
> >
> >This seems like it's not actually doing anything because it's setting
> >the same value that the bit-field already has to itself.
> 
> I brought this up to Adam the other day, so he posted the question
> to this list today to elicit a response from the original
> developer(s).  I realize the quickest response is to ask the
> original poster (Adam) to investigate further, such as with pen and
> paper, but that is not a proper response to a legitimate question.
> Here is the #define that is referenced, and the two routines in
> question.  This is current in kernel version 3.16 in
> arch/x86/kernel/cpu/mcheck/mce_amd.c.
> 
> #define MASK_LVTOFF_HI 0x00F00000
> 
> static int lvt_off_valid(struct threshold_block *b, int apic, u32 lo, u32 hi)
> {
> 	int msr = (hi & MASK_LVTOFF_HI) >> 20;
> 
> 	if (apic < 0) {
> 		pr_err(FW_BUG "cpu %d, failed to setup threshold interrupt "
> 		       "for bank %d, block %d (MSR%08X=0x%x%08x)\n", b->cpu,
> 		       b->bank, b->block, b->address, hi, lo);
> 		return 0;
> 	}
> 
> 	if (apic != msr) {
> 		pr_err(FW_BUG "cpu %d, invalid threshold interrupt offset %d "
> 		       "for bank %d, block %d (MSR%08X=0x%x%08x)\n",
> 		       b->cpu, apic, b->bank, b->block, b->address, hi, lo);
> 		return 0;
> 	}
> 
> 	return 1;
> };
> 
> /*
>  * Called via smp_call_function_single(), must be called with correct
>  * cpu affinity.
>  */
> static void threshold_restart_bank(void *_tr)
> {
> 	struct thresh_restart *tr = _tr;
> 	u32 hi, lo;
> 
> 	rdmsr(tr->b->address, lo, hi);
> 
> 	if (tr->b->threshold_limit < (hi & THRESHOLD_MAX))
> 		tr->reset = 1;	/* limit cannot be lower than err count */
> 
> 	if (tr->reset) {		/* reset err count and overflow bit */
> 		hi =
> 		    (hi & ~(MASK_ERR_COUNT_HI | MASK_OVERFLOW_HI)) |
> 		    (THRESHOLD_MAX - tr->b->threshold_limit);
> 	} else if (tr->old_limit) {	/* change limit w/o reset */
> 		int new_count = (hi & THRESHOLD_MAX) +
> 		    (tr->old_limit - tr->b->threshold_limit);
> 
> 		hi = (hi & ~MASK_ERR_COUNT_HI) |
> 		    (new_count & THRESHOLD_MAX);
> 	}
> 
> 	/* clear IntType */
> 	hi &= ~MASK_INT_TYPE_HI;
> 
> 	if (!tr->b->interrupt_capable)
> 		goto done;
> 
> 	if (tr->set_lvt_off) {
> 		if (lvt_off_valid(tr->b, tr->lvt_off, lo, hi)) {
> 			/* set new lvt offset */
> 			hi &= ~MASK_LVTOFF_HI;
> 			hi |= tr->lvt_off << 20;
> 		}
> 	}
> 
> 	if (tr->b->interrupt_enable)
> 		hi |= INT_TYPE_APIC;
> 
>  done:
> 
> 	hi |= MASK_COUNT_EN_HI;
> 	wrmsr(tr->b->address, lo, hi);
> }
> 
> 
> If one were to actually analyze the source file from which this
> snippet comes (lines 117 - 185), one would realize the call to
> lvt_off_valid() is given tr->lvt_off as the input "apic" value that
> is compared to the content in "hi" at bit positions 23:20 (MSR bits
> 55:52); this field is called LVT Offset (LVTOFF).  The value for
> tr->lvt_off is usually from 0 to 4, inclusive.  If this value is
> equal to the LVTOFF value in "hi", then lvt_off_valid() returns 1
> for true.  If the value for tr->lvt_off differs from the LVTOFF
> value in "hi", then lvt_off_valid() returns 0 for false.
> 
> Now, if the return from lvt_off_valid() is false, then nothing is
> changed in "hi".  However, if the return is true, which means the
> value in tr->lvt_off is equal to the LVTOFF value in "hi", then the
> LVTOFF value in "hi" is replaced with the value in tr->lvt_off.  One
> has to wonder, then, why bother actually calling lvt_off_valid() in
> the first place when the end result is that "hi" does not change.
> What is the rationale for having the code snippet at lines 170 - 176
> when that condition check does nothing to change "hi"?

Right, I see what you mean now. This is

bbaff08dca3c ("mce, amd: Add helper functions to setup APIC")

Frankly, I'm not too worried about overwriting the LVT offset with
the same value in the success case - that doesn't hurt anyone.
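
To make the no-op concrete, here is a minimal userspace sketch (not the
kernel code) of the LVTOFF field handling. It assumes MASK_LVTOFF_HI covers
bits 23:20 of the high MSR word, as the quoted analysis describes. The
helper names lvt_off_get(), lvt_off_set() and update_if_valid() are made up
for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed from the quoted text: LVTOFF occupies bits 23:20 of "hi". */
#define MASK_LVTOFF_HI 0x00F00000u

/* Extract the LVT offset field from the high MSR word. */
static unsigned int lvt_off_get(uint32_t hi)
{
	return (hi & MASK_LVTOFF_HI) >> 20;
}

/* Return "hi" with the LVT offset field replaced by "off". */
static uint32_t lvt_off_set(uint32_t hi, unsigned int off)
{
	hi &= ~MASK_LVTOFF_HI;
	hi |= ((uint32_t)off << 20) & MASK_LVTOFF_HI;
	return hi;
}

/*
 * Mirror of the guarded update: the field is only rewritten when the
 * requested offset already equals the one stored in "hi", so the
 * returned value is always identical to the input.
 */
static uint32_t update_if_valid(uint32_t hi, int apic)
{
	if (apic >= 0 && (unsigned int)apic == lvt_off_get(hi))
		hi = lvt_off_set(hi, (unsigned int)apic);
	return hi;
}
```

Whatever offset is passed in, update_if_valid() hands back the word it was
given: a mismatch skips the write, and a match writes the value that is
already there.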

What is more interesting is what we do in

[PATCH v3 3/3] mmc: mmci: rename sdio flag in vendor data to st_sdio

2014-08-21 Thread Srinivas Kandagatla

This patch renames the sdio flag in the vendor data to st_sdio, as this
flag is only used to enable the ST-specific SDIO setup. This also ensures
that the ST-specific setup is not done on other vendors' variants, such as
Qualcomm.

Originally the issue was detected while testing WLAN ath6kl on IFC6410
board with APQ8064 SOC.

Signed-off-by: Srinivas Kandagatla 
---
 drivers/mmc/host/mmci.c | 48 
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/drivers/mmc/host/mmci.c b/drivers/mmc/host/mmci.c
index a25759e..264c947 100644
--- a/drivers/mmc/host/mmci.c
+++ b/drivers/mmc/host/mmci.c
@@ -61,7 +61,7 @@ static unsigned int fmax = 515633;
  * @fifohalfsize: number of bytes that can be written when MCI_TXFIFOHALFEMPTY
  *   is asserted (likewise for RX)
  * @data_cmd_enable: enable value for data commands.
- * @sdio: variant supports SDIO
+ * @st_sdio: enable ST specific SDIO logic
  * @st_clkdiv: true if using a ST-specific clock divider algorithm
  * @datactrl_mask_ddrmode: ddr mode mask in datactrl register.
  * @blksz_datactrl16: true if Block size is at b16..b30 position in datactrl 
register
@@ -91,7 +91,7 @@ struct variant_data {
unsigned intdata_cmd_enable;
unsigned intdatactrl_mask_ddrmode;
unsigned intdatactrl_mask_sdio;
-   boolsdio;
+   boolst_sdio;
boolst_clkdiv;
boolblksz_datactrl16;
boolblksz_datactrl4;
@@ -141,7 +141,7 @@ static struct variant_data variant_u300 = {
.clkreg_8bit_bus_enable = MCI_ST_8BIT_BUS,
.datalength_bits= 16,
.datactrl_mask_sdio = MCI_ST_DPSM_SDIOEN,
-   .sdio   = true,
+   .st_sdio= true,
.pwrreg_powerup = MCI_PWR_ON,
.f_max  = 1,
.signal_direction   = true,
@@ -155,7 +155,7 @@ static struct variant_data variant_nomadik = {
.clkreg = MCI_CLK_ENABLE,
.datalength_bits= 24,
.datactrl_mask_sdio = MCI_ST_DPSM_SDIOEN,
-   .sdio   = true,
+   .st_sdio= true,
.st_clkdiv  = true,
.pwrreg_powerup = MCI_PWR_ON,
.f_max  = 1,
@@ -173,7 +173,7 @@ static struct variant_data variant_ux500 = {
.clkreg_neg_edge_enable = MCI_ST_UX500_NEG_EDGE,
.datalength_bits= 24,
.datactrl_mask_sdio = MCI_ST_DPSM_SDIOEN,
-   .sdio   = true,
+   .st_sdio= true,
.st_clkdiv  = true,
.pwrreg_powerup = MCI_PWR_ON,
.f_max  = 1,
@@ -193,7 +193,7 @@ static struct variant_data variant_ux500v2 = {
.datactrl_mask_ddrmode  = MCI_ST_DPSM_DDRMODE,
.datalength_bits= 24,
.datactrl_mask_sdio = MCI_ST_DPSM_SDIOEN,
-   .sdio   = true,
+   .st_sdio= true,
.st_clkdiv  = true,
.blksz_datactrl16   = true,
.pwrreg_powerup = MCI_PWR_ON,
@@ -818,26 +818,26 @@ static void mmci_start_data(struct mmci_host *host, 
struct mmc_data *data)
if (data->flags & MMC_DATA_READ)
datactrl |= MCI_DPSM_DIRECTION;
 
-   if (variant->sdio && host->mmc->card)
-   if (mmc_card_sdio(host->mmc->card)) {
-   u32 clk;
-   datactrl |= variant->datactrl_mask_sdio;
+   if (host->mmc->card && mmc_card_sdio(host->mmc->card)) {
+   u32 clk;
 
-   /*
-* The ST Micro variant for SDIO small write transfers
-* needs to have clock H/W flow control disabled,
-* otherwise the transfer will not start. The threshold
-* depends on the rate of MCLK.
-*/
-   if (data->flags & MMC_DATA_WRITE &&
-   (host->size < 8 ||
-(host->size <= 8 && host->mclk > 5000)))
-   clk = host->clk_reg & ~variant->clkreg_enable;
-   else
-   clk = host->clk_reg | variant->clkreg_enable;
+   datactrl |= variant->datactrl_mask_sdio;
 
-   mmci_write_clkreg(host, clk);
-   }
+   /*
+* The ST Micro variant for SDIO small write transfers
+* needs to have clock H/W flow control disabled,
+* otherwise the transfer will not start. The threshold
+* depends on the rate of MCLK.
+*/
+   if (variant->st_sdio && data->flags & MMC_DATA_WRITE &&
+   (host->size < 8 ||
+

[PATCH v3 2/3] mmc: mmci: Add sdio enable mask in variant data

2014-08-21 Thread Srinivas Kandagatla

This patch adds an SDIO enable mask to the variant data. SoCs like ST have
a special bit in the datactrl register to enable SDIO. Unconditionally
setting this bit in the driver breaks other SoCs, like Qualcomm, which map
this bit to something else, so taking the enable bit from the variant data
solves the issue.

Originally the issue was detected while testing WLAN ath6kl on Qualcomm
APQ8064.

Reviewed-by: Linus Walleij 
Signed-off-by: Srinivas Kandagatla 
---
 drivers/mmc/host/mmci.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/mmc/host/mmci.c b/drivers/mmc/host/mmci.c
index 533ad2b..a25759e 100644
--- a/drivers/mmc/host/mmci.c
+++ b/drivers/mmc/host/mmci.c
@@ -67,6 +67,7 @@ static unsigned int fmax = 515633;
  * @blksz_datactrl16: true if Block size is at b16..b30 position in datactrl 
register
  * @blksz_datactrl4: true if Block size is at b4..b16 position in datactrl
  *  register
+ * @datactrl_mask_sdio: SDIO enable mask in datactrl register
  * @pwrreg_powerup: power up value for MMCIPOWER register
  * @f_max: maximum clk frequency supported by the controller.
  * @signal_direction: input/out direction of bus signals can be indicated
@@ -89,6 +90,7 @@ struct variant_data {
unsigned intfifohalfsize;
unsigned intdata_cmd_enable;
unsigned intdatactrl_mask_ddrmode;
+   unsigned intdatactrl_mask_sdio;
boolsdio;
boolst_clkdiv;
boolblksz_datactrl16;
@@ -138,6 +140,7 @@ static struct variant_data variant_u300 = {
.clkreg_enable  = MCI_ST_U300_HWFCEN,
.clkreg_8bit_bus_enable = MCI_ST_8BIT_BUS,
.datalength_bits= 16,
+   .datactrl_mask_sdio = MCI_ST_DPSM_SDIOEN,
.sdio   = true,
.pwrreg_powerup = MCI_PWR_ON,
.f_max  = 1,
@@ -151,6 +154,7 @@ static struct variant_data variant_nomadik = {
.fifohalfsize   = 8 * 4,
.clkreg = MCI_CLK_ENABLE,
.datalength_bits= 24,
+   .datactrl_mask_sdio = MCI_ST_DPSM_SDIOEN,
.sdio   = true,
.st_clkdiv  = true,
.pwrreg_powerup = MCI_PWR_ON,
@@ -168,6 +172,7 @@ static struct variant_data variant_ux500 = {
.clkreg_8bit_bus_enable = MCI_ST_8BIT_BUS,
.clkreg_neg_edge_enable = MCI_ST_UX500_NEG_EDGE,
.datalength_bits= 24,
+   .datactrl_mask_sdio = MCI_ST_DPSM_SDIOEN,
.sdio   = true,
.st_clkdiv  = true,
.pwrreg_powerup = MCI_PWR_ON,
@@ -187,6 +192,7 @@ static struct variant_data variant_ux500v2 = {
.clkreg_neg_edge_enable = MCI_ST_UX500_NEG_EDGE,
.datactrl_mask_ddrmode  = MCI_ST_DPSM_DDRMODE,
.datalength_bits= 24,
+   .datactrl_mask_sdio = MCI_ST_DPSM_SDIOEN,
.sdio   = true,
.st_clkdiv  = true,
.blksz_datactrl16   = true,
@@ -812,16 +818,10 @@ static void mmci_start_data(struct mmci_host *host, 
struct mmc_data *data)
if (data->flags & MMC_DATA_READ)
datactrl |= MCI_DPSM_DIRECTION;
 
-   /* The ST Micro variants has a special bit to enable SDIO */
if (variant->sdio && host->mmc->card)
if (mmc_card_sdio(host->mmc->card)) {
-   /*
-* The ST Micro variants has a special bit
-* to enable SDIO.
-*/
u32 clk;
-
-   datactrl |= MCI_ST_DPSM_SDIOEN;
+   datactrl |= variant->datactrl_mask_sdio;
 
/*
 * The ST Micro variant for SDIO small write transfers
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH v3 1/3] mmc: mmci: Support any block sizes for ux500v2 and qcom variant

2014-08-21 Thread Srinivas Kandagatla

From: Ulf Hansson 

For the ux500v2 variant of the PL18x block, any block size is
supported. This makes it possible to decrease data overhead
for SDIO transfers.

This patch is based on Ulf Hansson's patch
http://www.spinics.net/lists/linux-mmc/msg12160.html

Signed-off-by: Srinivas Kandagatla 
enabled this support on qcom variant.

Signed-off-by: Ulf Hansson 
Signed-off-by: Linus Walleij 
---
 drivers/mmc/host/mmci.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/mmc/host/mmci.c b/drivers/mmc/host/mmci.c
index c11cb05..533ad2b 100644
--- a/drivers/mmc/host/mmci.c
+++ b/drivers/mmc/host/mmci.c
@@ -77,6 +77,7 @@ static unsigned int fmax = 515633;
  * @qcom_fifo: enables qcom specific fifo pio read logic.
  * @reversed_irq_handling: handle data irq before cmd irq.
  * @qcom_dml: enables qcom specific dma glue for dma transfers.
+ * @any_blksize: true if block any sizes are supported
  */
 struct variant_data {
unsigned intclkreg;
@@ -102,6 +103,7 @@ struct variant_data {
boolqcom_fifo;
boolreversed_irq_handling;
boolqcom_dml;
+   boolany_blksize;
 };
 
 static struct variant_data variant_arm = {
@@ -194,6 +196,7 @@ static struct variant_data variant_ux500v2 = {
.pwrreg_clkgate = true,
.busy_detect= true,
.pwrreg_nopower = true,
+   .any_blksize= true,
 };
 
 static struct variant_data variant_qcom = {
@@ -212,6 +215,7 @@ static struct variant_data variant_qcom = {
.explicit_mclk_control  = true,
.qcom_fifo  = true,
.qcom_dml   = true,
+   .any_blksize= true,
 };
 
 static int mmci_card_busy(struct mmc_host *mmc)
@@ -239,10 +243,11 @@ static int mmci_card_busy(struct mmc_host *mmc)
 static int mmci_validate_data(struct mmci_host *host,
  struct mmc_data *data)
 {
+   struct variant_data *variant = host->variant;
+
if (!data)
return 0;
-
-   if (!is_power_of_2(data->blksz)) {
+   if (!is_power_of_2(data->blksz) && !variant->any_blksize) {
dev_err(mmc_dev(host->mmc),
"unsupported block size (%d bytes)\n", data->blksz);
return -EINVAL;
@@ -796,7 +801,6 @@ static void mmci_start_data(struct mmci_host *host, struct 
mmc_data *data)
writel(host->size, base + MMCIDATALENGTH);
 
blksz_bits = ffs(data->blksz) - 1;
-   BUG_ON(1 << blksz_bits != data->blksz);
 
if (variant->blksz_datactrl16)
datactrl = MCI_DPSM_ENABLE | (data->blksz << 16);
-- 
1.9.1


[PATCH v3 0/3] mmc: mmci: sdio related fixes

2014-08-21 Thread Srinivas Kandagatla

This patchset fixes a few SDIO-related issues encountered while testing
WLAN ath6kl via SDIO on the IFC6410 board with the Qualcomm APQ8064 SoC.

Patch "mmc: mmci: Support any block sizes for ux500v2 and qcom variant" is
a very old patch by Ulf to support IPs that accept any block size
(http://www.spinics.net/lists/linux-mmc/msg12160.html). I modified the
subject line to include qcom.
The patch fixes the issue below, reported while testing SDIO:
the ath6kl driver was issuing 12-byte and 24-byte reads, which were
caught by the error handling in the driver and resulted in failures.
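
The relaxed block-size check can be sketched in plain C as follows. This is
a userspace illustration, not the driver code: struct variant_sketch and
validate_blksz() are hypothetical names, and is_power_of_2() is
reimplemented locally rather than taken from the kernel headers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant part of the variant data. */
struct variant_sketch {
	bool any_blksize;	/* controller accepts arbitrary block sizes */
};

/* Local reimplementation: true for 1, 2, 4, 8, ... */
static bool is_power_of_2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Reject non-power-of-2 block sizes unless the variant allows them. */
static int validate_blksz(const struct variant_sketch *v, unsigned int blksz)
{
	if (!is_power_of_2(blksz) && !v->any_blksize)
		return -EINVAL;
	return 0;
}
```

With any_blksize clear, a 12-byte or 24-byte request is rejected with
-EINVAL; with it set, both pass, and power-of-2 sizes pass either way.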

Patch "mmc: mmci: Add sdio enable mask in variant data" adds an extra
variant parameter to enable SDIO. This makes the mmci driver more flexible.
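
As a rough sketch of the idea (userspace C, hypothetical names throughout),
the enable bit becomes per-variant data instead of a constant hard-coded in
the data path. The bit value used for SK_ST_DPSM_SDIOEN is illustrative
only; the real definition lives in the driver headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit value only, not taken from the driver. */
#define SK_ST_DPSM_SDIOEN (1u << 11)

/* Hypothetical stand-in for the variant data. */
struct sdio_variant_sketch {
	uint32_t datactrl_mask_sdio;	/* 0 when the variant has no such bit */
};

static const struct sdio_variant_sketch sk_variant_st = {
	.datactrl_mask_sdio = SK_ST_DPSM_SDIOEN,
};

static const struct sdio_variant_sketch sk_variant_other = {
	.datactrl_mask_sdio = 0,
};

/* OR in whatever enable mask the variant declares, only for SDIO cards. */
static uint32_t sk_datactrl_for_card(const struct sdio_variant_sketch *v,
				     uint32_t datactrl, bool card_is_sdio)
{
	if (card_is_sdio)
		datactrl |= v->datactrl_mask_sdio;
	return datactrl;
}
```

A variant that leaves the mask at zero gets an unchanged datactrl value, so
the common code path needs no vendor check for this bit.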

Patch "mmc: mmci: rename sdio flag in vendor data to st_sdio" renames the
sdio flag in the vendor data to st_sdio, as this flag is only used to set up
ST-specific SDIO logic.

All these patches are tested on IFC6410 board with ath6kl WLAN via SDIO.

Thanks to Linus W, Ulf and Russell for comments since RFC.

Changes since v2:
- removed the "mmc: mmci: move block size validation under relevant code"
  patch, as this is already fixed by the original patch from Ulf.

Changes since RFC:
- moved sdio flag to st_sdio to simplify the checks.
- use Ulf's patch to address IPs which support any block size.

Thanks,
srini

Srinivas Kandagatla (2):
  mmc: mmci: Add sdio enable mask in variant data
  mmc: mmci: rename sdio flag in vendor data to st_sdio

Ulf Hansson (1):
  mmc: mmci: Support any block sizes for ux500v2 and qcom variant

 drivers/mmc/host/mmci.c | 68 ++---
 1 file changed, 36 insertions(+), 32 deletions(-)

-- 
1.9.1


Re: [PATCH] usb: phy: twl4030-usb: Fix regressions to runtime PM on omaps

2014-08-21 Thread Kishon Vijay Abraham I

Hi,

On Thursday 21 August 2014 10:13 PM, Tony Lindgren wrote:
> Commit 30a70b026b4cd ("usb: musb: fix obex in g_nokia.ko causing kernel
> panic") attempted to fix runtime PM handling for PHYs that are on the
> I2C bus. Commit 3063a12be2b0 (usb: musb: fix PHY power on/off) then
> changed things around to enable of PHYs that rely on runtime PM.
> 
> These changes however broke idling of the PHY and causes at least
> 100 mW extra power consumption on omaps, which is a lot with
> the idle power consumption being below 10 mW range on many devices.
> 
> As calling phy_power_on/off from runtime PM calls in the USB
> causes complicated issues with I2C connected PHYs, let's just let
> the PHY do it's own runtime PM as needed. This leaves out the
> dependency between PHYs and USB controller drivers for runtime
> PM.
> 
> Let's fix the regression for twl4030-usb by adding minimal runtime
> PM support. This allows idling the PHY on disconnect.
> 
> Note that we are changing to use standard runtime PM handling
> for twl4030_phy_init() as that function just checks the state
> and does not initialize the PHY. The PHY won't get initialized
> until in twl4030_phy_power_on().
> 
> Fixes: 30a70b026b4cd ("usb: musb: fix obex in g_nokia.ko causing kernel 
> panic")
> Fixes: 3063a12be2b0 ("usb: musb: fix PHY power on/off")
> Cc: sta...@vger.kernel.org # v3.15+
> Signed-off-by: Tony Lindgren 
> 
> ---
> 
> Kishon, this regression fix would be nice to get into the v3.17-rc
> series if no objections. If you don't have other fixes, I can also
> queue via arm-soc with proper acks.

I can queue this one up once put_autosuspend() is used.

Thanks
Kishon
> 
> It probably does not make sense to try to fix this without using
> runtime PM without complicating the code further.
> 
> --- a/drivers/phy/phy-twl4030-usb.c
> +++ b/drivers/phy/phy-twl4030-usb.c
> @@ -34,6 +34,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -422,37 +423,55 @@ static void twl4030_phy_power(struct twl4030_usb *twl, 
> int on)
>   }
>  }
>  
> -static int twl4030_phy_power_off(struct phy *phy)
> +static int twl4030_usb_runtime_suspend(struct device *dev)
>  {
> - struct twl4030_usb *twl = phy_get_drvdata(phy);
> + struct twl4030_usb *twl = dev_get_drvdata(dev);
>  
> + dev_dbg(twl->dev, "%s\n", __func__);
>   if (twl->asleep)
>   return 0;
>  
>   twl4030_phy_power(twl, 0);
>   twl->asleep = 1;
> - dev_dbg(twl->dev, "%s\n", __func__);
> +
>   return 0;
>  }
>  
> -static void __twl4030_phy_power_on(struct twl4030_usb *twl)
> +static int twl4030_usb_runtime_resume(struct device *dev)
>  {
> + struct twl4030_usb *twl = dev_get_drvdata(dev);
> +
> + dev_dbg(twl->dev, "%s\n", __func__);
> + if (!twl->asleep)
> + return 0;
> +
>   twl4030_phy_power(twl, 1);
> - twl4030_i2c_access(twl, 1);
> - twl4030_usb_set_mode(twl, twl->usb_mode);
> - if (twl->usb_mode == T2_USB_MODE_ULPI)
> - twl4030_i2c_access(twl, 0);
> + twl->asleep = 0;
> +
> + return 0;
> +}
> +
> +static int twl4030_phy_power_off(struct phy *phy)
> +{
> + struct twl4030_usb *twl = phy_get_drvdata(phy);
> +
> + dev_dbg(twl->dev, "%s\n", __func__);
> + pm_runtime_mark_last_busy(twl->dev);
> + pm_runtime_put_autosuspend(twl->dev);
> +
> + return 0;
>  }
>  
>  static int twl4030_phy_power_on(struct phy *phy)
>  {
>   struct twl4030_usb *twl = phy_get_drvdata(phy);
>  
> - if (!twl->asleep)
> - return 0;
> - __twl4030_phy_power_on(twl);
> - twl->asleep = 0;
>   dev_dbg(twl->dev, "%s\n", __func__);
> + pm_runtime_get_sync(twl->dev);
> + twl4030_i2c_access(twl, 1);
> + twl4030_usb_set_mode(twl, twl->usb_mode);
> + if (twl->usb_mode == T2_USB_MODE_ULPI)
> + twl4030_i2c_access(twl, 0);
>  
>   /*
>* XXX When VBUS gets driven after musb goes to A mode,
> @@ -558,6 +577,16 @@ static irqreturn_t twl4030_usb_irq(int irq, void *_twl)
>* USB_LINK_VBUS state.  musb_hdrc won't care until it
>* starts to handle softconnect right.
>*/
> + if ((status == OMAP_MUSB_VBUS_VALID) ||
> + (status == OMAP_MUSB_ID_GROUND)) {
> + if (twl->asleep)
> + pm_runtime_get_sync(twl->dev);
> + } else {
> + if (!twl->asleep) {
> + pm_runtime_mark_last_busy(twl->dev);
> + pm_runtime_put_autosuspend(twl->dev);
> + }
> + }
>   omap_musb_mailbox(status);
>   }
>   sysfs_notify(&twl->dev->kobj, NULL, "vbus");
> @@ -599,22 +628,17 @@ static int twl4030_phy_init(struct phy *phy)
>   struct twl4030_usb *twl = phy_get_drvdata(phy);
>   enum omap_musb_vbus_id_status status;
>  
> - /*
> -  * Start in sleep state, we'll get called through

Re: [PATCH v3 0/7] Dynamic Pause Loop Exiting window.

2014-08-21 Thread Wanpeng Li

Hi Radim,
On Thu, Aug 21, 2014 at 06:50:03PM +0200, Radim Krčmář wrote:
>2014-08-21 18:30+0200, Paolo Bonzini:
>> Il 21/08/2014 18:08, Radim Krčmář ha scritto:
>> I'm not sure of the usefulness of patch 6, so I'm going to drop it.
>> I'll keep it in my local junkyard branch in case it's going to be useful
>> in some scenario I didn't think of.
>
>I've been using it to benchmark different values, because it is more

Is there any benchmark data for this patchset?

Regards,
Wanpeng Li 

>convenient than reloading the module after shutting down guests.
>(And easier to sell than writing to kernel memory.)
>
>I don't think the additional code is worth it though.
>
>> Patch 7 can be easily rebased, so no need to repost (and I might even
>> squash it into patch 3, what do you think?).
>
>Yeah, the core is already a huge patch, so it does look weird without
>squashing.  (No-one wants to rebase to that point anyway.)
>
>Thanks.
>--
>To unsubscribe from this list: send the line "unsubscribe kvm" in
>the body of a message to majord...@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [Xen-devel] [V0 PATCH 1/2] AMD-PVH: set EFER.NX and EFER.SCE for the boot vcpu

2014-08-21 Thread Borislav Petkov

On Thu, Aug 21, 2014 at 07:46:56PM -0700, Mukesh Rathor wrote:
> Intel doesn't have EFER.NX bit.

Of course it does.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.

Re: [PATCH v2 3/3] tg3: Fix tx_pending checks for tg3_tso_bug

2014-08-21 Thread Prashant Sreedharan

On Thu, 2014-08-21 at 14:57 -0700, Benjamin Poirier wrote:
> In tg3_set_ringparam(), the tx_pending test to cover the cases where
> tg3_tso_bug() is entered has two problems
> 1) the check is only done for certain hardware whereas the workaround
> is now used more broadly. IOW, the check may not be performed when it
> is needed.
> 2) the check is too optimistic.
> 
> For example, with a 5761 (SHORT_DMA_BUG), tg3_set_ringparam() skips over the
> "tx_pending <= (MAX_SKB_FRAGS * 3)" check because TSO_BUG is false. Even if it
> did do the check, with a full sized skb, frag_cnt_est = 135 but the check is
> for <= MAX_SKB_FRAGS * 3 (= 17 * 3 = 51). So the check is insufficient. This
> leads to the following situation: by setting, ex. tx_pending = 100, there can
> be an skb that triggers tg3_tso_bug() and that is large enough to cause
> tg3_tso_bug() to stop the queue even when it is empty. We then end up with a
> netdev watchdog transmit timeout.
> 
> Given that 1) some of the conditions tested for in tg3_tx_frag_set() apply
> regardless of the chipset flags and that 2) it is difficult to estimate ahead
> of time the max possible number of frames that a large skb may be split into
> by gso, we instead take the approach of adjusting dev->gso_max_segs according
> to the requested tx_pending size.
> 
> This puts us in the exceptional situation that a single skb that triggers
> tg3_tso_bug() may require the entire tx ring. Usually the tx queue is woken up
> when at least a quarter of it is available (TG3_TX_WAKEUP_THRESH) but that
> would be insufficient now. To avoid useless wakeups, the tx queue wake up
> threshold is made dynamic. Likewise, usually the tx queue is stopped as soon
> as an skb with max frags may overrun it. Since the skbs submitted from
> tg3_tso_bug() use a controlled number of descriptors, the tx queue stop
> threshold may be lowered.
> 
> Signed-off-by: Benjamin Poirier 
> ---
> Changes v1->v2
> * in tg3_set_ringparam(), reduce gso_max_segs further to budget 3 descriptors
>   per gso seg instead of only 1 as in v1
> * in tg3_tso_bug(), check that this estimation (3 desc/seg) holds, otherwise
>   linearize some skbs as needed
> * in tg3_start_xmit(), make the queue stop threshold a parameter, for the
>   reason explained in the commit description
> 
> I was concerned that this last change, because of the extra call in the
> default xmit path, may impact performance so I performed an rr latency test
> but I did not measure a significant impact. That test was with default mtu and
> ring size.
> 
> # perf stat -r10 -ad netperf -H 192.168.9.30 -l60 -T 0,0 -t omni -- -d rr
> 
> * without patches
>   rr values: 7039.63 6865.03 6939.21 6919.31 6931.88 6932.74 6925.1 6953.33 6868.43 6935.65
>   sample size: 10
>   mean: 6931.031
>   standard deviation: 48.10918
>   quantiles: 6865.03 6920.757 6932.31 6938.32 7039.63
>   6930±50
> 
>  Performance counter stats for 'netperf -H 192.168.9.30 -l60 -T 0,0 -t omni -- -d rr' (10 runs):
> 
>      480643.024723 task-clock                #    8.001 CPUs utilized            ( +-  0.00% ) [100.00%]
>            855,136 context-switches          #    0.002 M/sec                    ( +-  0.23% ) [100.00%]
>                521 CPU-migrations            #    0.000 M/sec                    ( +-  6.49% ) [100.00%]
>                104 page-faults               #    0.000 M/sec                    ( +-  2.73% )
>    298,416,906,437 cycles                    #    0.621 GHz                      ( +-  4.08% ) [15.01%]
>    812,072,320,370 stalled-cycles-frontend   #  272.13% frontend cycles idle     ( +-  1.89% ) [25.01%]
>    685,633,562,247 stalled-cycles-backend    #  229.76% backend  cycles idle     ( +-  2.50% ) [35.00%]
>    117,665,891,888 instructions              #    0.39  insns per cycle
>                                              #    6.90  stalled cycles per insn  ( +-  2.22% ) [45.00%]
>     26,158,399,505 branches                  #   54.424 M/sec                    ( +-  2.10% ) [50.00%]
>        205,688,614 branch-misses             #    0.79% of all branches          ( +-  0.78% ) [50.00%]
>     27,882,474,171 L1-dcache-loads           #   58.011 M/sec                    ( +-  1.98% ) [50.00%]
>        369,911,372 L1-dcache-load-misses     #    1.33% of all L1-dcache hits    ( +-  0.62% ) [50.00%]
>         76,240,847 LLC-loads                 #    0.159 M/sec                    ( +-  1.04% ) [40.00%]
>              3,220 LLC-load-misses           #    0.00% of all LL-cache hits     ( +- 19.49% ) [ 5.00%]
> 
>       60.074059340 seconds time elapsed                                          ( +-  0.00% )
> 
> * with patches
>   rr values: 6732.65 6920.1 6909.46 7032.41 6864.43 6897.6 6815.19 6967.83 6849.23 6929.52
>   sample size: 10
>   mean: 6891.842
>   standard deviation: 82.91901
>   quantiles: 6732.65 6853.03 6903.53 6927.165 7032.41
>   6890±80
> 
>

Re: [PATCH v5 1/2] net: moxa: clear TX descriptor length bits

2014-08-21 Thread David Miller

From: Jonas Jensen 
Date: Wed, 20 Aug 2014 16:18:42 +0200

> @@ -348,7 +348,8 @@ static int moxart_mac_start_xmit(struct sk_buff *skb, 
> struct net_device *ndev)
>  
>   txdes1 = readl(desc + TX_REG_OFFSET_DESC1);
>   txdes1 |= TX_DESC1_LTS | TX_DESC1_FTS;
> - txdes1 &= ~(TX_DESC1_FIFO_COMPLETE | TX_DESC1_INTR_COMPLETE);
> + txdes1 &= ~(TX_DESC1_FIFO_COMPLETE | TX_DESC1_INTR_COMPLETE |
> + TX_DESC1_BUF_SIZE_MASK);
>   txdes1 |= (len & TX_DESC1_BUF_SIZE_MASK);
>   writel(txdes1, desc + TX_REG_OFFSET_DESC1);
>   writel(TX_DESC0_DMA_OWN, desc + TX_REG_OFFSET_DESC0);

Like others I wonder why the existing descriptor value is being read
at all.

It's inefficient and completely unnecessary, you can just compute a new
value from scratch, and that way all of these "uncleared field" issues
just automatically disappear.
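
A small userspace sketch of the difference, under stated assumptions: the
bit positions below are placeholders chosen for illustration, not the real
register layout, and the sk_ names are hypothetical. It only shows why a
read-modify-write that ORs in a new length can leave stale length bits
behind, while a value built from scratch cannot.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder layout for illustration; not the real descriptor format. */
#define SK_DESC1_BUF_SIZE_MASK	0x7ffu
#define SK_DESC1_FTS		(1u << 28)
#define SK_DESC1_LTS		(1u << 29)

/* Read-modify-write without clearing the length field first. */
static uint32_t sk_txdes1_rmw(uint32_t old, uint32_t len)
{
	old |= SK_DESC1_LTS | SK_DESC1_FTS;
	old |= len & SK_DESC1_BUF_SIZE_MASK;	/* old length bits survive */
	return old;
}

/* Build the descriptor word from scratch: nothing stale can leak in. */
static uint32_t sk_txdes1_fresh(uint32_t len)
{
	return SK_DESC1_LTS | SK_DESC1_FTS | (len & SK_DESC1_BUF_SIZE_MASK);
}
```

If a descriptor previously carried length 0x600 and is reused for a 0x040
byte frame, the OR-only path ends up with 0x640 in the length field, while
the fresh value carries exactly 0x040. Any bit that has to persist across
reuse would need to be supplied explicitly in the fresh path.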

Re: [PATCH] net: ethernet: broadcom: bnx2x: Remove redundant #ifdef

2014-08-21 Thread David Miller

From: Rasmus Villemoes 
Date: Wed, 20 Aug 2014 15:14:49 +0200

> Nothing defines _ASM_GENERIC_INT_L64_H, it is a weird way to check for
> 64 bit longs, and u64 should be printed using %llx anyway.
> 
> Signed-off-by: Rasmus Villemoes 

It's not correct and will warn on some platforms where "u64" is just
a plain "unsigned long".

I.e., all of those which use include/asm-generic/int-l64.h

I'm not applying this.
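
The same class of warning shows up in portable userspace C, where uint64_t
may be a plain unsigned long on LP64 targets. One common way to sidestep it
is an explicit cast to unsigned long long before using %llx (PRIx64 from
<inttypes.h> is another). The helper name below is made up; this is a
sketch of the idiom, not a statement about how the bnx2x code should be
changed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a 64-bit value as hex. The cast makes the argument type match
 * %llx regardless of which underlying type uint64_t maps to.
 */
static int sk_format_u64_hex(char *buf, size_t len, uint64_t v)
{
	return snprintf(buf, len, "%llx", (unsigned long long)v);
}
```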

Re: [PATCH 1/1] sctp: not send SCTP_PEER_ADDR_CHANGE notifications with failed probe

2014-08-21 Thread David Miller

From: Zhu Yanjun 
Date: Wed, 20 Aug 2014 17:31:43 +0800

> Since the transport has always been in state SCTP_UNCONFIRMED, it
> therefore wasn't active before and hasn't been used before, and it
> always has been, so it is unnecessary to bug the user with a 
> notification.
> 
> Reported-by: Deepak Khandelwal   
> Suggested-by: Vlad Yasevich  
> Suggested-by: Michael Tuexen 
> Suggested-by: Daniel Borkmann 
> Signed-off-by: Zhu Yanjun 

Applied, thanks.

Re: [PATCH net-next] hyperv: Add handling of IP header with option field in netvsc_set_hash()

2014-08-21 Thread David Miller

From: Haiyang Zhang 
Date: Tue, 19 Aug 2014 20:53:55 +

> @@ -200,12 +202,18 @@ static bool netvsc_set_hash(u32 *hash, struct sk_buff 
> *skb)
>   iphdr = ip_hdr(skb);
>  
>   if (iphdr->version == 4) {
> - if (iphdr->protocol == IPPROTO_TCP)
> + data = (u8 *)&iphdr->saddr;
> + if (iphdr->protocol == IPPROTO_TCP) {
>   data_len = 12;
> - else
> + if (iphdr->ihl > 5) {
> + memcpy(dbuf, &iphdr->saddr, 8);
> + memcpy(&dbuf[8], &tcp_hdr(skb)->source, 4);

This is ridiculous.

Make comp_hash() take a void pointer for the buffer.

Then your code is simply:

__be32 dbuf[3];

dbuf[0] = iph->saddr;
dbuf[1] = iph->daddr;
dbuf[2] = *(__be32 *)&tcph->source;

*hash = comp_hash(netvsc_hash_key, HASH_KEYLEN, dbuf, 12);

No special cases for IP options or any garbage like that.
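
For illustration, the packing step can be written as a standalone userspace
function. This is a sketch with hypothetical names, and it uses memcpy()
for the port pair instead of a pointer cast to stay clear of aliasing
concerns; the values are treated as opaque, already in network byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Fill a 12-byte hash input: source address, destination address, then
 * the source/destination port pair. Returns the number of bytes filled.
 */
static size_t sk_pack_hash_input(uint32_t dbuf[3], uint32_t saddr,
				 uint32_t daddr, uint16_t sport,
				 uint16_t dport)
{
	uint16_t ports[2] = { sport, dport };

	dbuf[0] = saddr;
	dbuf[1] = daddr;
	memcpy(&dbuf[2], ports, sizeof(ports));
	return 3 * sizeof(dbuf[0]);
}
```

Because the addresses and ports are copied by field, the layout of the
buffer does not depend on whether the IP header carries options.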

Re: [PATCH 2/3] tg3: Fix tx_pending checks for tg3_tso_bug

2014-08-21 Thread David Miller

From: Benjamin Poirier 
Date: Thu, 21 Aug 2014 14:59:11 -0700

> Ah, now I understand the reason for the * 3 in
>   u32 frag_cnt_est = skb_shinfo(skb)->gso_segs * 3;
> 
>   /* Estimate the number of fragments in the worst case */
> but that is not really the "worst case". It's not forbidden to have more than
> two frags per skb output from skb_gso_segment(). I've kept this estimation
> approach but I've added code to validate the estimation or else linearize the
> skb.

This is a common situation drivers run into, and there have been a few
notable situations in virtualization drivers recently.  Although in
those cases the problems arise from the fact that if an SKB fragment
is a compound page, this can evaluate to requiring multiple
descriptors, one for each 4K segment within that fragment.

Anyways, the point I wanted to make is that you shouldn't do anything
too complicated to handle all of this.  And I think your conclusion
to linearize if the estimation fails is a good one.
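
As a sketch of the compound-page case mentioned above: if a device needs
one descriptor per 4K segment, the count for a single fragment depends on
both its length and its starting offset. The function below is a
hypothetical userspace illustration of that arithmetic, not code from any
driver.

```c
#include <assert.h>
#include <stddef.h>

#define SK_SEG_SIZE 4096u	/* assumed segment size per descriptor */

/*
 * Number of 4K segments touched by the byte range
 * [offset, offset + len). An empty range needs no descriptor.
 */
static unsigned int sk_descs_for_frag(size_t offset, size_t len)
{
	size_t first, last;

	if (len == 0)
		return 0;

	first = offset / SK_SEG_SIZE;
	last = (offset + len - 1) / SK_SEG_SIZE;
	return (unsigned int)(last - first + 1);
}
```

A 16384-byte fragment starting at offset 100 touches five segments rather
than four, which is how a per-fragment estimate can come up short.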

Re: [PATCH v9 12/12] PCI: Introduce pci_remap_iospace() for remapping PCI I/O bus resources into CPU space

2014-08-21 Thread Rob Herring

On Tue, Aug 12, 2014 at 11:25 AM, Liviu Dudau  wrote:
> Introduce a default implementation for remapping PCI bus I/O resources
> onto the CPU address space. Architectures with special needs may
> provide their own version, but most should be able to use this one.
>
> Cc: Bjorn Helgaas 
> Cc: Arnd Bergmann 
> Cc: Rob Herring 
> Signed-off-by: Liviu Dudau 

Reviewed-by: Rob Herring 

However, I would like to see ARM pci_ioremap_io converted over to this function.

Rob

> ---
>  drivers/pci/pci.c   | 33 +
>  include/linux/pci.h |  3 +++
>  2 files changed, 36 insertions(+)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 29d1775..76d21b6 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -2707,6 +2707,39 @@ int pci_request_regions_exclusive(struct pci_dev 
> *pdev, const char *res_name)
>  }
>  EXPORT_SYMBOL(pci_request_regions_exclusive);
>
> +/**
> + * pci_remap_iospace - Remap the memory mapped I/O space
> + * @res: Resource describing the I/O space
> + * @phys_addr: physical address where the range will be mapped.
> + *
> + * Remap the memory mapped I/O space described by the @res
> + * into the CPU physical address space. Only architectures
> + * that have memory mapped IO defined (and hence PCI_IOBASE)
> + * should call this function.
> + */
> +int __weak pci_remap_iospace(const struct resource *res, phys_addr_t 
> phys_addr)
> +{
> +   int err = -ENODEV;
> +
> +#ifdef PCI_IOBASE
> +   if (!(res->flags & IORESOURCE_IO))
> +   return -EINVAL;
> +
> +   if (res->end > IO_SPACE_LIMIT)
> +   return -EINVAL;
> +
> +   err = ioremap_page_range(res->start + (unsigned long)PCI_IOBASE,
> +   res->end + 1 + (unsigned long)PCI_IOBASE,
> +   phys_addr, pgprot_device(PAGE_KERNEL));
> +#else
> +   /* this architecture does not have memory mapped I/O space,
> +  so this function should never be called */
> +   WARN_ON(1);
> +#endif
> +
> +   return err;
> +}
> +
>  static void __pci_set_master(struct pci_dev *dev, bool enable)
>  {
> u16 old_cmd, cmd;
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index e1e0d80..988c2f5 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1098,6 +1098,9 @@ int __must_check pci_bus_alloc_resource(struct pci_bus 
> *bus,
>   resource_size_t),
> void *alignf_data);
>
> +
> +int pci_remap_iospace(const struct resource *res, phys_addr_t phys_addr);
> +
>  static inline dma_addr_t pci_bus_address(struct pci_dev *pdev, int bar)
>  {
> struct pci_bus_region region;
> --
> 2.0.4
>
>
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
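
The address arithmetic in pci_remap_iospace() can be modeled in user space. The sketch below is an illustration only: every sketch_* name is hypothetical, the values of SKETCH_PCI_IOBASE and SKETCH_IO_SPACE_LIMIT are assumptions (the real ones are per-architecture), and nothing is actually mapped. It only mirrors the two sanity checks from the patch and computes the virtual range that would be handed to ioremap_page_range().

```c
#include <errno.h>

/* Hypothetical stand-ins for kernel definitions; values are illustrative. */
#define SKETCH_IORESOURCE_IO   0x00000100UL
#define SKETCH_IO_SPACE_LIMIT  0xffffUL      /* assumed, per-architecture in reality */
#define SKETCH_PCI_IOBASE      0xffe00000UL  /* assumed virtual base of the I/O window */

struct sketch_resource {
	unsigned long start;
	unsigned long end;    /* inclusive, as in struct resource */
	unsigned long flags;
};

struct sketch_vrange {
	unsigned long vstart; /* first virtual address to map */
	unsigned long vend;   /* one past the last virtual address */
};

/*
 * Apply the same checks as the patch (must be an I/O resource, must fit
 * below the I/O space limit), then compute the virtual address range.
 * Returns 0 on success or a negative errno value.
 */
static int sketch_iospace_range(const struct sketch_resource *res,
				struct sketch_vrange *out)
{
	if (!(res->flags & SKETCH_IORESOURCE_IO))
		return -EINVAL;

	if (res->end > SKETCH_IO_SPACE_LIMIT)
		return -EINVAL;

	out->vstart = res->start + SKETCH_PCI_IOBASE;
	out->vend = res->end + 1 + SKETCH_PCI_IOBASE;
	return 0;
}
```

Because res->end is inclusive, the "+ 1" turns it into the exclusive end address that a start/end style mapping call expects.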

[PATCH] checkpatch.pl: New instances of ENOSYS are errors

2014-08-21 Thread Andy Lutomirski

ENOSYS means that a nonexistent system call was called.  We have a
bad habit of using it for things like invalid operations on
otherwise valid syscalls.  We should avoid this in new code.

Signed-off-by: Andy Lutomirski 
---

Pervasive incorrect usage of ENOSYS came up at the kernel summit ABI
review discussion.  Let's see if checkpatch can help.

 scripts/checkpatch.pl | 9 +
 1 file changed, 9 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 182be0f..5749a44 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2372,6 +2372,15 @@ sub process {
 "Using $1 is unnecessary\n" . $herecurr);
}
 
+# ENOSYS means "bad syscall nr" and nothing else
+# (note that this doesn't run on assembly files, so entry*.S is okay)
+		if ($line =~ /ENOSYS/) {
+			my $herevet = "$here\n" . cat_vet($line) . "\n";
+			ERROR("ENOSYS",
+			      "ENOSYS means 'invalid syscall nr' and nothing else\n" .
+			      "   (ignore if this really is syscall entry code)\n" . $herevet);
+		}
+
 # Check for potential 'bare' types
my ($stat, $cond, $line_nr_next, $remain_next, $off_next,
$realline_next);
-- 
1.9.3
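
To make the distinction concrete, here is a small C sketch of an otherwise valid call rejecting bad input without reaching for ENOSYS. The function, its command numbers, and the choice of EINVAL and EOPNOTSUPP are assumptions made for illustration; the patch itself only says that ENOSYS should be reserved for nonexistent system calls.

```c
#include <errno.h>

/* Hypothetical command numbers for an imaginary, otherwise valid syscall. */
enum sketch_cmd {
	SKETCH_CMD_GET = 0,
	SKETCH_CMD_SET = 1,
	SKETCH_CMD_MAX
};

/*
 * The call itself exists, so -ENOSYS would mislead the caller.
 * An unknown command is treated as an invalid argument; a known command
 * that this build does not implement is reported as not supported.
 */
static int sketch_do_cmd(int cmd, int set_supported)
{
	if (cmd < 0 || cmd >= SKETCH_CMD_MAX)
		return -EINVAL;

	if (cmd == SKETCH_CMD_SET && !set_supported)
		return -EOPNOTSUPP;

	return 0;
}
```

With this split, a caller that sees ENOSYS can safely conclude that the kernel lacks the syscall entirely, which is the property the checkpatch rule is trying to protect.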


Re: [PATCH v9 04/12] PCI: OF: Fix the conversion of IO ranges into IO resources.

2014-08-21 Thread Rob Herring

On Tue, Aug 12, 2014 at 11:25 AM, Liviu Dudau  wrote:
> The ranges property for a host bridge controller in DT describes
> the mapping between the PCI bus address and the CPU physical address.
> The resources framework however expects that the IO resources start
> at a pseudo "port" address 0 (zero) and have a maximum size of IO_SPACE_LIMIT.
> The conversion from pci ranges to resources failed to take that into account.
>
> In the process move the function into drivers/of/address.c as it now
> depends on pci_address_to_pio() code and make it return an error code.
>
> Cc: Grant Likely 
> Cc: Rob Herring 

Humm, this says I'm cc'ed, but I'm not, which defeats the point of
recording the Cc's in the commit.

I still have the same concerns that this will break existing users.
Are you sure integrator is the only platform affected?

Rob

> Cc: Arnd Bergmann 
> Signed-off-by: Liviu Dudau 
> ---
>  drivers/of/address.c   | 46 
> ++
>  include/linux/of_address.h | 13 ++---
>  2 files changed, 48 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/of/address.c b/drivers/of/address.c
> index 4dab700..3735ac7 100644
> --- a/drivers/of/address.c
> +++ b/drivers/of/address.c
> @@ -906,3 +906,49 @@ bool of_dma_is_coherent(struct device_node *np)
> return false;
>  }
>  EXPORT_SYMBOL_GPL(of_dma_is_coherent);
> +
> +/*
> + * of_pci_range_to_resource - Create a resource from an of_pci_range
> + * @range: the PCI range that describes the resource
> + * @np:device node where the range belongs to
> + * @res:   pointer to a valid resource that will be updated to
> + *  reflect the values contained in the range.
> + *
> + * Returns EINVAL if the range cannot be converted to resource.
> + *
> + * Note that if the range is an IO range, the resource will be converted
> + * using pci_address_to_pio() which can fail if it is called too early or
> + * if the range cannot be matched to any host bridge IO space (our case here).
> + * To guard against that we try to register the IO range first.
> + * If that fails we know that pci_address_to_pio() will do too.
> + */
> +int of_pci_range_to_resource(struct of_pci_range *range,
> +	struct device_node *np, struct resource *res)
> +{
> +	int err;
> +	res->flags = range->flags;
> +	res->parent = res->child = res->sibling = NULL;
> +	res->name = np->full_name;
> +
> +	if (res->flags & IORESOURCE_IO) {
> +		unsigned long port = -1;
> +		err = pci_register_io_range(range->cpu_addr, range->size);
> +		if (err)
> +			goto invalid_range;
> +		port = pci_address_to_pio(range->cpu_addr);
> +		if (port == (unsigned long)-1) {
> +			err = -EINVAL;
> +			goto invalid_range;
> +		}
> +		res->start = port;
> +	} else {
> +		res->start = range->cpu_addr;
> +	}
> +	res->end = res->start + range->size - 1;
> +	return 0;
> +
> +invalid_range:
> +	res->start = (resource_size_t)OF_BAD_ADDR;
> +	res->end = (resource_size_t)OF_BAD_ADDR;
> +	return err;
> +}
> diff --git a/include/linux/of_address.h b/include/linux/of_address.h
> index 28e6836..6015f21 100644
> --- a/include/linux/of_address.h
> +++ b/include/linux/of_address.h
> @@ -23,17 +23,8 @@ struct of_pci_range {
>  #define for_each_of_pci_range(parser, range) \
> for (; of_pci_range_parser_one(parser, range);)
>
> -static inline void of_pci_range_to_resource(struct of_pci_range *range,
> -   struct device_node *np,
> -   struct resource *res)
> -{
> -   res->flags = range->flags;
> -   res->start = range->cpu_addr;
> -   res->end = range->cpu_addr + range->size - 1;
> -   res->parent = res->child = res->sibling = NULL;
> -   res->name = np->full_name;
> -}
> -
> +extern int of_pci_range_to_resource(struct of_pci_range *range,
> +   struct device_node *np, struct resource *res);
>  /* Translate a DMA address from device space to CPU space */
>  extern u64 of_translate_dma_address(struct device_node *dev,
> const __be32 *in_addr);
> --
> 2.0.4
>
>
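
The conversion described in the patch (register the I/O range first, then translate the CPU address into a port number starting from zero) can be modeled in user space. This is a sketch under stated assumptions: all sketch_* names are hypothetical, and the rule that registered ranges are packed back to back starting at port 0 is an assumption made for the example, not something this thread confirms about pci_register_io_range() or pci_address_to_pio().

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_IORESOURCE_IO  0x00000100UL
#define SKETCH_BAD_PORT       ((unsigned long)-1)
#define SKETCH_MAX_IO_RANGES  4

struct sketch_io_range {
	uint64_t cpu_addr;
	uint64_t size;
};

static struct sketch_io_range sketch_ranges[SKETCH_MAX_IO_RANGES];
static int sketch_nr_ranges;

/* Record a CPU address window as I/O space; re-registering is harmless. */
static int sketch_register_io_range(uint64_t addr, uint64_t size)
{
	for (int i = 0; i < sketch_nr_ranges; i++)
		if (sketch_ranges[i].cpu_addr == addr)
			return 0;
	if (sketch_nr_ranges == SKETCH_MAX_IO_RANGES)
		return -ENOSPC;
	sketch_ranges[sketch_nr_ranges].cpu_addr = addr;
	sketch_ranges[sketch_nr_ranges].size = size;
	sketch_nr_ranges++;
	return 0;
}

/* Translate a CPU address to a port; ranges are assumed packed from port 0. */
static unsigned long sketch_address_to_pio(uint64_t addr)
{
	uint64_t base = 0;

	for (int i = 0; i < sketch_nr_ranges; i++) {
		const struct sketch_io_range *r = &sketch_ranges[i];

		if (addr >= r->cpu_addr && addr < r->cpu_addr + r->size)
			return (unsigned long)(base + (addr - r->cpu_addr));
		base += r->size;
	}
	return SKETCH_BAD_PORT;
}

struct sketch_resource {
	uint64_t start;
	uint64_t end;         /* inclusive */
	unsigned long flags;
};

/* I/O ranges become port-based resources; memory ranges keep the CPU address. */
static int sketch_range_to_resource(uint64_t cpu_addr, uint64_t size,
				    unsigned long flags,
				    struct sketch_resource *res)
{
	res->flags = flags;
	if (flags & SKETCH_IORESOURCE_IO) {
		unsigned long port;
		int err = sketch_register_io_range(cpu_addr, size);

		if (err)
			return err;
		port = sketch_address_to_pio(cpu_addr);
		if (port == SKETCH_BAD_PORT)
			return -EINVAL;
		res->start = port;
	} else {
		res->start = cpu_addr;
	}
	res->end = res->start + size - 1;
	return 0;
}
```

Registering before translating is what lets the translation succeed for a range that no host bridge has claimed yet, which is the failure mode the patch comment describes.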

Re: [PATCH] cgroup: add tracepoints to track cgroup events

2014-08-21 Thread Andrea Righi

On Thu, Aug 21, 2014 at 07:45:14PM -0400, Steven Rostedt wrote:
> On Thu, 21 Aug 2014 12:07:01 -0500
> Tejun Heo  wrote:
> 
> > > Hello, Andrea.
> > 
> > On Thu, Aug 21, 2014 at 11:00:02AM -0600, Andrea Righi wrote:
> > > hmm... am I missing something or we already support directory events?
> > 
> > Ah, right, those mkdir/rmdir and writes automatically generate those
> > events.
> > 
> > > root@Dell:~# grep cgroups /proc/mounts
> > > > none /cgroups cgroup rw,relatime,cpuset,cpu,cpuacct,memory,devices,freezer,perf_event,hugetlb 0 0
> > > > root@Dell:~# inotifywait -m -r -e modify -e move -e create -e delete /cgroups
> > > Setting up watches.  Beware: since -r was given, this may take a while!
> > > Watches established.
> > > /cgroups/ CREATE,ISDIR test
> > > /cgroups/test/ MODIFY cgroup.procs
> > > /cgroups/test/ MODIFY cgroup.procs
> > > /cgroups/test/ MODIFY cgroup.populated
> > > /cgroups/ MODIFY cgroup.procs
> > > /cgroups/ MODIFY cgroup.procs
> > > /cgroups/test/ MODIFY cgroup.populated
> > > /cgroups/ DELETE,ISDIR test
> > > 
> > > I still need to figure out a smart way to track which PIDs are
> > > added/removed to/from cgroup.procs from userland (inotifywait + git? :)),
> > > > but all the other information provided by my tracepoint patch seems to
> > > > be already available via [di]notify.
> > 
> > Hmmm... yeah, determining exactly which pids got added / removed can
> > be cumbersome from just MODIFY events.  That said, what are you trying
> > to do with such information?
> > 
> 
> OK, is this patch not being pushed then? I have a lot of comments to
> make about it, but if this patch is being dropped for another way of
> doing things I won't waste my time on it.
> 
> Thanks,
> 
> -- Steve

Comments are always welcome, but at this point I'd say we can drop this
patch, so don't waste your time on it. I can find an alternative way to
get the same information from user-space.

Thanks,
-Andrea

Re: [PATCH 0/6] mm/hugetlb: gigantic hugetlb page pools shrink supporting

2014-08-21 Thread Wanpeng Li

On Fri, Aug 22, 2014 at 09:34:30AM +0800, Zhang Yanfei wrote:
>Hello Wanpeng
>
>On 08/22/2014 07:37 AM, Wanpeng Li wrote:
>> Hi Andi,
>> On Fri, Apr 12, 2013 at 05:22:37PM +0200, Andi Kleen wrote:
>>> On Fri, Apr 12, 2013 at 07:29:07AM +0800, Wanpeng Li wrote:
>>>> Ping Andi,
>>>> On Thu, Apr 04, 2013 at 05:09:08PM +0800, Wanpeng Li wrote:
>>>>> order >= MAX_ORDER pages are only allocated at boot stage using the
>>>>> bootmem allocator with the "hugepages=xxx" option. These pages are never
>>>>> free after boot by default since it would be a one-way street(>= MAX_ORDER
>>>>> pages cannot be allocated later), but if administrator confirm not to
>>>>> use these gigantic pages any more, these pinned pages will waste memory
>>>>> since other users can't grab free pages from gigantic hugetlb pool even
>>>>> if OOM, it's not flexible.  The patchset add hugetlb gigantic page pools
>>>>> shrink supporting. Administrator can enable knob exported in sysctl to
>>>>> permit to shrink gigantic hugetlb pool.
>>>
>>>
>>> I originally didn't allow this because it's only one way and it seemed
>>> dubious.  I've been recently working on a new patchkit to allocate
>>> GB pages from CMA. With that freeing actually makes sense, as 
>>> the pages can be reallocated.
>>>
>> 
>> More than a year has passed. Has your patchkit to allocate GB pages from CMA been merged?
>
>commit 944d9fec8d7aee3f2e16573e9b6a16634b33f403
>Author: Luiz Capitulino 
>Date:   Wed Jun 4 16:07:13 2014 -0700
>
>hugetlb: add support for gigantic page allocation at runtime
>
>

Ah, thanks for pointing that out.

Regards,
Wanpeng Li 

>> 
>> Regards,
>> Wanpeng Li 
>> 
>>> -Andi
>>>
>>> --
>>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>>> the body to majord...@kvack.org.  For more info on Linux MM,
>>> see: http://www.linux-mm.org/ .
>>> Don't email: mailto:"d...@kvack.org;> em...@kvack.org 
>> 
>
>
>-- 
>Thanks.
>Zhang Yanfei
>

Re: [PATCH v7 1/8] clk: Add temporary mapping to the existing API

2014-08-21 Thread Simon Horman

On Thu, Aug 21, 2014 at 02:10:07PM -0400, Jason Cooper wrote:
> On Thu, Aug 21, 2014 at 11:04:48AM -0700, Tony Lindgren wrote:
> > * Mike Turquette  [140820 07:53]:
> > > Quoting Tomeu Vizoso (2014-08-18 08:30:27)
> > > > > To preserve git-bisectability, add aliases from the future provider API
> > > > > to the existing public API.
> > > > 
> > > > Also includes clk-provider.h and clk-dev.h in a few places so the right
> > > > functions are defined.
> > > > 
> > > > Signed-off-by: Tomeu Vizoso 
> > > 
> > > Cc'ing Tony for the OMAP2+ parts, Simon & Magnus for the SHMobile parts,
> > > Jason & Andrew for the Orion parts, Mauro & Kukjin for the Exynos parts.
> > > 
> > > This change is super trivial but it's best not to touch these files
> > > without a heads-up for the owners.
> > 
> > As long as it compiles omap2plus_defconfig it seems safe to me:
> > 
> > Acked-by: Tony Lindgren 
> 
> Same here for orion5x_defconfig, dove_defconfig, and mv78xx_defconfig.
> 
> Acked-by: Jason Cooper 
> 
> Also, added Sebastian to the Cc as he's the maintainer of mach-berlin.

I will add my usual 2c worth about global changes vs adding things to
infrastructure, updating the platforms, then removing things from
infrastructure, in separate patches picked up by the relevant maintainers.
But if everyone else is happy with doing things this way I won't say no.

Ideally I'd like the shmobile portion patch to apply cleanly to linux-next
(it looks like it should) and for the following to still compile:
ape6evm_defconfig, armadillo800eva_defconfig, bockw_defconfig,
koelsch_defconfig, kzm9g_defconfig, lager_defconfig, mackerel_defconfig,
marzen_defconfig, shmobile_defconfig.

With all that said and done:

Acked-by: Simon Horman 

> thx,
> 
> Jason.
> 
> > 
> > > > ---
> > > > v7: * Add mappings for clk_notifier_[un]register
> > > > * Add more clk-provider.h includes to clk implementations
> > > > 
> > > > v4: * Add more clk-provider.h includes to clk implementations
> > > > * Add mapping for clk_provider_round_rate
> > > > ---
> > > >  arch/arm/mach-omap2/display.c |  1 +
> > > >  arch/arm/mach-omap2/omap_device.c |  1 +
> > > >  arch/arm/mach-shmobile/clock.c|  1 +
> > > >  arch/arm/plat-orion/common.c  |  1 +
> > > >  drivers/clk/berlin/bg2.c  |  1 +
> > > >  drivers/clk/berlin/bg2q.c |  1 +
> > > >  drivers/clk/clk-conf.c|  1 +
> > > >  drivers/clk/clkdev.c  |  1 +
> > > >  drivers/media/platform/exynos4-is/media-dev.c |  1 +
> > > >  include/linux/clk-provider.h  | 25 
> > > > +
> > > >  include/linux/clk/zynq.h  |  1 +
> > > >  11 files changed, 35 insertions(+)
> > > > 
> > > > diff --git a/arch/arm/mach-omap2/display.c 
> > > > b/arch/arm/mach-omap2/display.c
> > > > index bf852d7..0f9e479 100644
> > > > --- a/arch/arm/mach-omap2/display.c
> > > > +++ b/arch/arm/mach-omap2/display.c
> > > > @@ -21,6 +21,7 @@
> > > >  #include 
> > > >  #include 
> > > >  #include 
> > > > +#include 
> > > >  #include 
> > > >  #include 
> > > >  #include 
> > > > diff --git a/arch/arm/mach-omap2/omap_device.c 
> > > > b/arch/arm/mach-omap2/omap_device.c
> > > > index 01ef59d..fbe8cf0 100644
> > > > --- a/arch/arm/mach-omap2/omap_device.c
> > > > +++ b/arch/arm/mach-omap2/omap_device.c
> > > > @@ -32,6 +32,7 @@
> > > >  #include 
> > > >  #include 
> > > >  #include 
> > > > +#include 
> > > >  #include 
> > > >  #include 
> > > >  #include 
> > > > diff --git a/arch/arm/mach-shmobile/clock.c 
> > > > b/arch/arm/mach-shmobile/clock.c
> > > > index 806f940..ed415dc 100644
> > > > --- a/arch/arm/mach-shmobile/clock.c
> > > > +++ b/arch/arm/mach-shmobile/clock.c
> > > > @@ -24,6 +24,7 @@
> > > >  
> > > >  #ifdef CONFIG_COMMON_CLK
> > > >  #include 
> > > > +#include 
> > > >  #include 
> > > >  #include "clock.h"
> > > >  
> > > > diff --git a/arch/arm/plat-orion/common.c b/arch/arm/plat-orion/common.c
> > > > index 3ec6e8e..961b593 100644
> > > > --- a/arch/arm/plat-orion/common.c
> > > > +++ b/arch/arm/plat-orion/common.c
> > > > @@ -15,6 +15,7 @@
> > > >  #include 
> > > >  #include 
> > > >  #include 
> > > > +#include 
> > > >  #include 
> > > >  #include 
> > > >  #include 
> > > > diff --git a/drivers/clk/berlin/bg2.c b/drivers/clk/berlin/bg2.c
> > > > index 515fb13..4c81e09 100644
> > > > --- a/drivers/clk/berlin/bg2.c
> > > > +++ b/drivers/clk/berlin/bg2.c
> > > > @@ -19,6 +19,7 @@
> > > >  
> > > >  #include 
> > > >  #include 
> > > > +#include 
> > > >  #include 
> > > >  #include 
> > > >  #include 
> > > > diff --git a/drivers/clk/berlin/bg2q.c b/drivers/clk/berlin/bg2q.c
> > > > index 21784e4..748da9b 100644
> > > > --- a/drivers/clk/berlin/bg2q.c
> > > > +++ b/drivers/clk/berlin/bg2q.c
> > > > @@ -19,6 +19,7 @@
> > > >  
> > > >  #include 
> > > >  #include 
> > > > +#include 
> > > >  #include 
> > > >  #include

Re: rcu: Throttle rcu_try_advance_all_cbs() execution causes visible slowdown in ftrace switching

2014-08-21 Thread Fengguang Wu

Hi Petr,

Sorry for picking up this old thread, but I noticed your attached
ftrace test script and would like to ask for your permission to
include it in

https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git

which is GPLv2. If you kindly agree, I'll run it actively for testing
the upstream linux kernels.

Thanks,
Fengguang


test-ftrace
Description: application/shellscript

Re: [PATCH 1/2] sh: intc: Confine SH_INTC to platforms that need it

2014-08-21 Thread Simon Horman

On Wed, Aug 20, 2014 at 03:39:22PM +0200, Geert Uytterhoeven wrote:
> Currently the sh-intc driver is compiled on all SuperH and
> non-multiplatform SH-Mobile platforms, while it's only used on a limited
> number of platforms:
>   - SuperH: SH2(A), SH3(A), SH4(A)(L) (all but SH5)
>   - ARM: sh7372, sh73a0
> 
> Drop the "default y" on SH_INTC, make all CPU platforms that use it
> select it, and protect all sub-options by "if SH_INTC" to fix this.

Thanks, I have queued this up with Magnus's Ack.

> Signed-off-by: Geert Uytterhoeven 
> ---
>  arch/arm/mach-shmobile/Kconfig | 2 ++
>  arch/sh/Kconfig| 3 +++
>  drivers/sh/Makefile| 3 +--
>  drivers/sh/intc/Kconfig| 6 +-
>  4 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm/mach-shmobile/Kconfig b/arch/arm/mach-shmobile/Kconfig
> index 5814754c1240..dae4c73a5f00 100644
> --- a/arch/arm/mach-shmobile/Kconfig
> +++ b/arch/arm/mach-shmobile/Kconfig
> @@ -71,6 +71,7 @@ config ARCH_SH7372
>   select ARM_CPU_SUSPEND if PM || CPU_IDLE
>   select CPU_V7
>   select SH_CLK_CPG
> + select SH_INTC
>   select SYS_SUPPORTS_SH_CMT
>   select SYS_SUPPORTS_SH_TMU
>  
> @@ -81,6 +82,7 @@ config ARCH_SH73A0
>   select CPU_V7
>   select I2C
>   select SH_CLK_CPG
> + select SH_INTC
>   select RENESAS_INTC_IRQPIN
>   select SYS_SUPPORTS_SH_CMT
>   select SYS_SUPPORTS_SH_TMU
> diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
> index 453fa5c09550..b319846ad97f 100644
> --- a/arch/sh/Kconfig
> +++ b/arch/sh/Kconfig
> @@ -172,6 +172,7 @@ menu "System type"
>  #
>  config CPU_SH2
>   bool
> + select SH_INTC
>  
>  config CPU_SH2A
>   bool
> @@ -182,6 +183,7 @@ config CPU_SH3
>   bool
>   select CPU_HAS_INTEVT
>   select CPU_HAS_SR_RB
> + select SH_INTC
>   select SYS_SUPPORTS_SH_TMU
>  
>  config CPU_SH4
> @@ -189,6 +191,7 @@ config CPU_SH4
>   select CPU_HAS_INTEVT
>   select CPU_HAS_SR_RB
>   select CPU_HAS_FPU if !CPU_SH4AL_DSP
> + select SH_INTC
>   select SYS_SUPPORTS_SH_TMU
>   select SYS_SUPPORTS_HUGETLBFS if MMU
>  
> diff --git a/drivers/sh/Makefile b/drivers/sh/Makefile
> index 788ed9b59b4e..114203f32843 100644
> --- a/drivers/sh/Makefile
> +++ b/drivers/sh/Makefile
> @@ -1,8 +1,7 @@
>  #
>  # Makefile for the SuperH specific drivers.
>  #
> -obj-$(CONFIG_SUPERH) += intc/
> -obj-$(CONFIG_ARCH_SHMOBILE_LEGACY)   += intc/
> +obj-$(CONFIG_SH_INTC)+= intc/
>  ifneq ($(CONFIG_COMMON_CLK),y)
>  obj-$(CONFIG_HAVE_CLK)   += clk/
>  endif
> diff --git a/drivers/sh/intc/Kconfig b/drivers/sh/intc/Kconfig
> index 60228fae943f..6a1b05ddc8c9 100644
> --- a/drivers/sh/intc/Kconfig
> +++ b/drivers/sh/intc/Kconfig
> @@ -1,7 +1,9 @@
>  config SH_INTC
> - def_bool y
> + bool
>   select IRQ_DOMAIN
>  
> +if SH_INTC
> +
>  comment "Interrupt controller options"
>  
>  config INTC_USERIMASK
> @@ -37,3 +39,5 @@ config INTC_MAPPING_DEBUG
> between system IRQs and the per-controller id tables.
>  
> If in doubt, say N.
> +
> +endif
> -- 
> 1.9.1
> 

Re: [PATCH 2/2] ARM: shmobile: Move legacy INTC definitions from irqs.h to intc.h

2014-08-21 Thread Simon Horman

On Wed, Aug 20, 2014 at 03:39:23PM +0200, Geert Uytterhoeven wrote:
> Move all definitions for legacy INTC from the common "irqs.h" to the
> INTC-specific "intc.h".
> Include "intc.h" in sh7372/sh73a0 CPU and board files where needed.
> 
> Signed-off-by: Geert Uytterhoeven 

Thanks, I have queued this up with Magnus's Ack after removing
the whitespace change noted below.

> ---
>  arch/arm/mach-shmobile/board-kzm9g.c| 1 +
>  arch/arm/mach-shmobile/board-mackerel.c | 1 +
>  arch/arm/mach-shmobile/intc.h   | 6 ++
>  arch/arm/mach-shmobile/irqs.h   | 6 --
>  arch/arm/mach-shmobile/setup-sh7372.c   | 1 +
>  arch/arm/mach-shmobile/setup-sh73a0.c   | 1 +
>  6 files changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/arm/mach-shmobile/board-kzm9g.c 
> b/arch/arm/mach-shmobile/board-kzm9g.c
> index 36593cb20e57..77e36fa0b142 100644
> --- a/arch/arm/mach-shmobile/board-kzm9g.c
> +++ b/arch/arm/mach-shmobile/board-kzm9g.c
> @@ -50,6 +50,7 @@
>  #include 
>  
>  #include "common.h"
> +#include "intc.h"
>  #include "irqs.h"
>  #include "sh73a0.h"
>  
> diff --git a/arch/arm/mach-shmobile/board-mackerel.c 
> b/arch/arm/mach-shmobile/board-mackerel.c
> index 79f448e93abb..b7c4261492b0 100644
> --- a/arch/arm/mach-shmobile/board-mackerel.c
> +++ b/arch/arm/mach-shmobile/board-mackerel.c
> @@ -63,6 +63,7 @@
>  #include 
>  
>  #include "common.h"
> +#include "intc.h"
>  #include "irqs.h"
>  #include "pm-rmobile.h"
>  #include "sh-gpio.h"
> diff --git a/arch/arm/mach-shmobile/intc.h b/arch/arm/mach-shmobile/intc.h
> index a5603c76cfe0..0313cf798c32 100644
> --- a/arch/arm/mach-shmobile/intc.h
> +++ b/arch/arm/mach-shmobile/intc.h
> @@ -1,5 +1,6 @@
>  #ifndef __ASM_MACH_INTC_H
>  #define __ASM_MACH_INTC_H
> +
>  #include 
>  
>  #define INTC_IRQ_PINS_ENUM_16L(p)\

I have removed the above hunk as it seems unrelated to the rest of the patch.

> @@ -287,4 +288,9 @@ static struct intc_desc p ## _desc __initdata = { 
> \
>p ## _sense_registers, NULL),  \
>  }
>  
> +/* INTCS */
> +#define INTCS_VECT_BASE  0x3400
> +#define INTCS_VECT(n, vect)  INTC_VECT((n), INTCS_VECT_BASE + (vect))
> +#define intcs_evt2irq(evt)   evt2irq(INTCS_VECT_BASE + (evt))
> +
>  #endif  /* __ASM_MACH_INTC_H */
> diff --git a/arch/arm/mach-shmobile/irqs.h b/arch/arm/mach-shmobile/irqs.h
> index 8e28223f1b3c..3070f6d887eb 100644
> --- a/arch/arm/mach-shmobile/irqs.h
> +++ b/arch/arm/mach-shmobile/irqs.h
> @@ -1,18 +1,12 @@
>  #ifndef __SHMOBILE_IRQS_H
>  #define __SHMOBILE_IRQS_H
>  
> -#include 
>  #include "include/mach/irqs.h"
>  
>  /* GIC */
>  #define gic_spi(nr)  ((nr) + 32)
>  #define gic_iid(nr)  (nr) /* ICCIAR / interrupt ID */
>  
> -/* INTCS */
> -#define INTCS_VECT_BASE  0x3400
> -#define INTCS_VECT(n, vect)  INTC_VECT((n), INTCS_VECT_BASE + (vect))
> -#define intcs_evt2irq(evt)   evt2irq(INTCS_VECT_BASE + (evt))
> -
>  /* GPIO IRQ */
>  #define _GPIO_IRQ_BASE   2500
>  #define GPIO_IRQ_BASE(x) (_GPIO_IRQ_BASE + (32 * x))
> diff --git a/arch/arm/mach-shmobile/setup-sh7372.c 
> b/arch/arm/mach-shmobile/setup-sh7372.c
> index 3731eef4..eaf5d1332c4b 100644
> --- a/arch/arm/mach-shmobile/setup-sh7372.c
> +++ b/arch/arm/mach-shmobile/setup-sh7372.c
> @@ -41,6 +41,7 @@
>  
>  #include "common.h"
>  #include "dma-register.h"
> +#include "intc.h"
>  #include "irqs.h"
>  #include "pm-rmobile.h"
>  #include "sh7372.h"
> diff --git a/arch/arm/mach-shmobile/setup-sh73a0.c 
> b/arch/arm/mach-shmobile/setup-sh73a0.c
> index e7a0296b81b1..4c7022830e30 100644
> --- a/arch/arm/mach-shmobile/setup-sh73a0.c
> +++ b/arch/arm/mach-shmobile/setup-sh73a0.c
> @@ -40,6 +40,7 @@
>  
>  #include "common.h"
>  #include "dma-register.h"
> +#include "intc.h"
>  #include "irqs.h"
>  #include "sh73a0.h"
>  
> -- 
> 1.9.1
> 

Re: [PATCH net-next 2/2] net: exit busy loop when another process is runnable

2014-08-21 Thread Jason Wang

On 08/21/2014 04:11 PM, Michael S. Tsirkin wrote:
> On Thu, Aug 21, 2014 at 04:05:10PM +0800, Jason Wang wrote:
>> > Rx busy loop does not scale well when several parallel sessions are
>> > active. This is because we keep looping even if another process is
>> > runnable. For example, if that process is about to send a packet,
>> > continuing to busy poll in the current process adds extra delay and
>> > hurts performance.
>> > 
>> > This patch solves this issue by exiting the busy loop when another
>> > process is runnable on the current cpu. A simple test that pins two
>> > netperf sessions to the same cpu on the receiving side shows an obvious
>> > improvement:
>> > 
>> > Before:
>> > netperf -H 192.168.100.2 -T 0,0 -t TCP_RR -P 0 & \
>> > netperf -H 192.168.100.2 -T 1,0 -t TCP_RR -P 0
>> > 16384  87380  1        1       10.00    15513.74
>> > 16384  87380
>> > 16384  87380  1        1       10.00    15092.78
>> > 16384  87380
>> > 
>> > After:
>> > netperf -H 192.168.100.2 -T 0,0 -t TCP_RR -P 0 & \
>> > netperf -H 192.168.100.2 -T 1,0 -t TCP_RR -P 0
>> > 16384  87380  1        1       10.00    23334.53
>> > 16384  87380
>> > 16384  87380  1        1       10.00    23327.58
>> > 16384  87380
>> > 
>> > The benchmark was run with the netperf TCP_RR test on two 8-core Xeon
>> > machines connected back to back with mlx4 (busy_read was set to 50):
>> > 
>> > sessions/bytes/before/after/+improvement%/busy_read=0/
>> > 1/1/30062.10/30034.72/+0%/20228.96/
>> > 16/1/214719.83/307669.01/+43%/268997.71/
>> > 32/1/231252.81/345845.16/+49%/336157.442/
>> > 64/512/212467.39/373464.93/+75%/397449.375/
>> > 
>> > Signed-off-by: Jason Wang 
>> > ---
>> >  include/net/busy_poll.h | 3 ++-
>> >  1 file changed, 2 insertions(+), 1 deletion(-)
>> > 
>> > diff --git a/include/net/busy_poll.h b/include/net/busy_poll.h
>> > index 1d67fb6..8a33fb2 100644
>> > --- a/include/net/busy_poll.h
>> > +++ b/include/net/busy_poll.h
>> > @@ -109,7 +109,8 @@ static inline bool sk_busy_loop(struct sock *sk, int nonblock)
>> >  		cpu_relax();
>> >  
>> >  	} while (!nonblock && skb_queue_empty(&sk->sk_receive_queue) &&
>> > -		 !need_resched() && !busy_loop_timeout(end_time));
>> > +		 !need_resched() && !busy_loop_timeout(end_time) &&
>> > +		 nr_running_this_cpu() < 2);
> <= 1 would be a bit clearer? We want at most one process here.
>

Ok, will change it in next version.
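
For reference, the loop condition under discussion can be written as a standalone predicate. This is a user-space sketch, not the kernel code: the struct and function names are hypothetical, and each field stands in for the corresponding kernel check (skb_queue_empty(), need_resched(), busy_loop_timeout(), nr_running_this_cpu()). It uses the "<= 1" spelling suggested in the review.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the inputs to the busy-poll loop condition. */
struct sketch_poll_state {
	bool nonblock;           /* caller asked for a non-blocking read */
	bool queue_empty;        /* receive queue has no packets yet */
	bool need_resched;       /* scheduler wants this CPU back */
	bool timed_out;          /* busy-poll budget is used up */
	unsigned int nr_running; /* runnable tasks on this CPU, poller included */
};

/*
 * Keep spinning only while nothing has arrived, nothing else wants the
 * CPU, the time budget remains, and the poller is the only runnable task.
 */
static bool sketch_keep_polling(const struct sketch_poll_state *s)
{
	return !s->nonblock && s->queue_empty &&
	       !s->need_resched && !s->timed_out &&
	       s->nr_running <= 1;
}
```

Writing the last term as "nr_running <= 1" states the intent directly: at most one runnable task, the poller itself.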

Re: [PATCH v2 2/3] net: Add Keystone NetCP ethernet driver

2014-08-21 Thread Stephen Hemminger

On Fri, 15 Aug 2014 11:12:41 -0400
Santosh Shilimkar  wrote:

> NetCP driver has a plug-in module architecture where each of the NetCP
> sub-modules exist as a loadable kernel module which plug in to the netcp
> core. These sub-modules are represented as "netcp-devices" in the dts
> bindings. It is mandatory to have the ethernet switch sub-module for
> the ethernet interface to be operational. Any other sub-module like the
> PA is optional.

What are you doing to prevent/discourage proprietary binary only
sub-modules?

Re: [PATCH] cpufreq: powernv: Register the driver with reboot notifier

2014-08-21 Thread Preeti U Murthy

Hi Viresh,

On 08/21/2014 11:56 AM, Viresh Kumar wrote:
> On 21 August 2014 10:36, Shilpasri G Bhat  wrote:
>> The intention here is to stop the cpufreq governor and then set the cpus to
>> nominal frequency so as to ensure that the frequency won't be changed later.
>>
>> The .suspend callback of the driver is not called during reboot/kexec.
>> So we need an explicit reboot notifier to call cpufreq_suspend() to
>> meet the requirement.
> 
> Hi Shilpa,
> 
> No, we can't allow any platform driver to misuse cpufreq_suspend().
> Platform drivers aren't *allowed* to call this routine.

At the moment this looks like the best way forward. We need to do this
cleanly by ensuring that we stop the governors and then call into the
driver to deal with the cpu frequency in its own way during reboot. The
best way to do this would be by calling this routine. Either this or
cpufreq_suspend() should be called in the reboot path generically. The
latter might not be an enticing option for other platforms.

Regards
Preeti U Murthy
> 
> Now the deal is how do we move to nominal frequency on reboot..
> @Rafael: Any suggestions? How do we ensure that governors
> are stopped on these notifiers, or if there is some other solution here?
> ___
> Linuxppc-dev mailing list
> linuxppc-...@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
> 


Re: [Xen-devel] [V0 PATCH 1/2] AMD-PVH: set EFER.NX and EFER.SCE for the boot vcpu

2014-08-21 Thread Mukesh Rathor

On Thu, 21 Aug 2014 21:39:04 -0400
Konrad Rzeszutek Wilk  wrote:

> On Wed, Aug 20, 2014 at 07:16:39PM -0700, Mukesh Rathor wrote:
> > On AMD, NX feature must be enabled in the efer for NX to be honored
> > in the pte entries, otherwise protection fault. We also set SC for
> > system calls to be enabled.
> 
> How come we don't need to do that for Intel (that is set the NX bit)?
> Could you include the explanation here please?

Intel doesn't have EFER.NX bit. The SC bit is being set in xen, but it
doesn't need to be, and I'm going to submit a patch to undo it.

> 
> > 
> > Signed-off-by: Mukesh Rathor 
> > ---
> >  arch/x86/xen/enlighten.c | 12 
> >  1 file changed, 12 insertions(+)
> > 
> > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> > index c0cb11f..4af512d 100644
> > --- a/arch/x86/xen/enlighten.c
> > +++ b/arch/x86/xen/enlighten.c
> > @@ -1499,6 +1499,17 @@ void __ref xen_pvh_secondary_vcpu_init(int cpu)
> >  	xen_pvh_set_cr_flags(cpu);
> >  }
> >  
> > +/* This is done in secondary_startup_64 for hvm guests. */
> > +static void __init xen_configure_efer(void)
> > +{
> > +   u64 efer;
> > +
> > +   rdmsrl(MSR_EFER, efer);
> > +   efer |= EFER_SCE;
> > +   efer |= (cpuid_edx(0x80000001) & (1 << 20)) ? EFER_NX : 0;
> 
> Ahem? #defines for these magic values please?

Linux uses these directly all over the code as they are set in stone
pretty much, and I didn't find any #defines. See cpu/common.c for one of
the places. Also see secondary_startup_64, and others...

> Or could you use 'boot_cpu_has'?

Nop, it's not initialized at this point.

thanks,
Mukesh
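
The bit manipulation in xen_configure_efer() can be expressed with named constants, along the lines of the review request. The sketch below is a user-space illustration: the sketch_* names are hypothetical, and the function takes the EDX value of CPUID leaf 0x80000001 as a parameter instead of executing CPUID or touching the MSR. The bit positions used (EFER.SCE is bit 0, EFER.NX is bit 11, NX support is EDX bit 20 of that leaf) follow the architecture manuals.

```c
#include <stdint.h>

#define SKETCH_EFER_SCE      (1ULL << 0)   /* SYSCALL/SYSRET enable */
#define SKETCH_EFER_NX       (1ULL << 11)  /* no-execute enable */
#define SKETCH_CPUID_EDX_NX  (1U << 20)    /* CPUID 0x80000001 EDX: NX supported */

/*
 * Compute the EFER value the boot vcpu should end up with: always enable
 * SCE, and enable NX only if the CPU advertises support for it.
 * ext_edx is the EDX output of CPUID leaf 0x80000001.
 */
static uint64_t sketch_configure_efer(uint64_t efer, uint32_t ext_edx)
{
	efer |= SKETCH_EFER_SCE;
	if (ext_edx & SKETCH_CPUID_EDX_NX)
		efer |= SKETCH_EFER_NX;
	return efer;
}
```

Keeping the function pure (value in, value out) makes the bit logic easy to check in isolation; in the patch the same logic sits between the rdmsrl() that reads MSR_EFER and the write that stores the result back.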

[PATCH V2 1/1] netfilter/jump_label: HAVE_JUMP_LABEL instead of CONFIG_JUMP_LABEL

2014-08-21 Thread Zhouyi Zhou

Use HAVE_JUMP_LABEL as elsewhere in the kernel to ensure
that the toolchain has the required support in addition to
CONFIG_JUMP_LABEL being set.


Signed-off-by: Zhouyi Zhou 
Reviewed-by: Florian Westphal 
---
 include/linux/netfilter.h |5 +++--
 net/netfilter/core.c  |6 +++---
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 2077489..83a1952 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #ifdef CONFIG_NETFILTER
 static inline int NF_DROP_GETERR(int verdict)
@@ -99,8 +100,8 @@ void nf_unregister_sockopt(struct nf_sockopt_ops *reg);
 
 extern struct list_head nf_hooks[NFPROTO_NUMPROTO][NF_MAX_HOOKS];
 
-#if defined(CONFIG_JUMP_LABEL)
-#include 
+#ifdef HAVE_JUMP_LABEL
+
 extern struct static_key nf_hooks_needed[NFPROTO_NUMPROTO][NF_MAX_HOOKS];
 static inline bool nf_hooks_active(u_int8_t pf, unsigned int hook)
 {
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index a93c97f..024a2e2 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -54,7 +54,7 @@ EXPORT_SYMBOL_GPL(nf_unregister_afinfo);
 struct list_head nf_hooks[NFPROTO_NUMPROTO][NF_MAX_HOOKS] __read_mostly;
 EXPORT_SYMBOL(nf_hooks);
 
-#if defined(CONFIG_JUMP_LABEL)
+#ifdef HAVE_JUMP_LABEL
 struct static_key nf_hooks_needed[NFPROTO_NUMPROTO][NF_MAX_HOOKS];
 EXPORT_SYMBOL(nf_hooks_needed);
 #endif
@@ -72,7 +72,7 @@ int nf_register_hook(struct nf_hook_ops *reg)
 	}
 	list_add_rcu(&reg->list, elem->list.prev);
 	mutex_unlock(&nf_hook_mutex);
-#if defined(CONFIG_JUMP_LABEL)
+#ifdef HAVE_JUMP_LABEL
 	static_key_slow_inc(&nf_hooks_needed[reg->pf][reg->hooknum]);
 #endif
 	return 0;
@@ -84,7 +84,7 @@ void nf_unregister_hook(struct nf_hook_ops *reg)
 	mutex_lock(&nf_hook_mutex);
 	list_del_rcu(&reg->list);
 	mutex_unlock(&nf_hook_mutex);
-#if defined(CONFIG_JUMP_LABEL)
+#ifdef HAVE_JUMP_LABEL
 	static_key_slow_dec(&nf_hooks_needed[reg->pf][reg->hooknum]);
 #endif
 	synchronize_net();
-- 
1.7.10.4


RE: [PATCH net-next 3/4] r8152: remove clear_bp function

2014-08-21 Thread Hayes Wang

 Sergei Shtylyov [mailto:sergei.shtyl...@cogentembedded.com] 
[...]
> >>  Why leave 2 empty lines? One is enough.
> 
> > The next patch would use another function at the
> > same location. I skipped removing the empty line and
> > re-adding it again. Would it be better to do so? I would
> > resend the patches if the answer is yes.
> 
> Sorry, I haven't looked at your next patch, too big for me. :-)

It's my mistake. I will avoid it next time. Thanks for pointing it out.
 
Best Regards,
Hayes

RE: [PATCHv12 0/3] power_supply: Introduce power supply charging driver

2014-08-21 Thread Tc, Jenny

Ping...

> -Original Message-
> From: Tc, Jenny
> Sent: Wednesday, August 13, 2014 5:40 PM
> To: linux-kernel@vger.kernel.org; Sebastian Reichel; Pavel Machek
> Cc: Dmitry Eremin-Solenikov; Stephen Rothwell; Anton Vorontsov; David
> Woodhouse; David Cohen; Pallala, Ramakrishna; Tc, Jenny
> Subject: [PATCHv12 0/3] power_supply: Introduce power supply charging driver
> 
> v1: Introduced feature as a framework within power supply class driver with
>   separate files for battid framework and charging framework
> v2: Fixed review comments, moved macros and inline functions to power_supply.h
> v3: Moved the feature as a separate driver, combined battid framework and
>   charging framework inside the power supply charging driver. Moved
>   charger specific properties to power_supply_charger.h and plugged the
>   driver with power supply subsystem using power_supply_notifier
>   introduced in my previous patch. Also a sample charger chip driver
>   (bq24261) patch added to give more idea on the psy charging driver
>   usage
> v4: Fixed review comments, no major design changes.
> v5: Fixed makefile inconsistencies, removed unused pdata callbacks
> v6: Fixed nested loops, commenting style
> v7: added kerneldocs for structs and minor fixes
> v8: used msecs_to_jiffies instead of HZ directly, modified Kconfig help text
> for POWER_SUPPLY_CHARGING_ALGO_PSE
> v9: Removed string lookups, static cable initialization
> v10: Fixed bug in algorithm lookup
> v11: Few variable name changes for better readability
> v12: Enabled polling and RTC wakeup which is supported in charger-manager as
>  suggested by Sebastian. Fixed review comments from Sebastian and Pavel
> 
> Jenny TC (3):
>   power_supply: Introduce generic psy charging driver
>   power_supply: Introduce PSE compliant algorithm
>   power_supply: bq24261 charger driver
> 
>  Documentation/power/power_supply_charger.txt |  349 +++
>  drivers/power/Kconfig|   33 +
>  drivers/power/Makefile   |3 +
>  drivers/power/bq24261_charger.c  | 1348
> ++
>  drivers/power/charging_algo_pse.c|  217 +
>  drivers/power/power_supply_charger.c | 1186 ++
>  drivers/power/power_supply_charger.h |  225 +
>  drivers/power/power_supply_core.c|3 +
>  include/linux/power/bq24261-charger.h|   25 +
>  include/linux/power/power_supply_charger.h   |  374 +++
>  include/linux/power_supply.h |  161 +++
>  11 files changed, 3924 insertions(+)
>  create mode 100644 Documentation/power/power_supply_charger.txt
>  create mode 100644 drivers/power/bq24261_charger.c
>  create mode 100644 drivers/power/charging_algo_pse.c
>  create mode 100644 drivers/power/power_supply_charger.c
>  create mode 100644 drivers/power/power_supply_charger.h
>  create mode 100644 include/linux/power/bq24261-charger.h
>  create mode 100644 include/linux/power/power_supply_charger.h
> 
> --
> 1.7.9.5


Re: [PATCH 2/2] pmbus: ltc2978: add regulator gating

2014-08-21 Thread Guenter Roeck

On Thu, Aug 21, 2014 at 08:26:22PM -0500, Mark Brown wrote:
> On Thu, Aug 21, 2014 at 06:18:10PM -0700, Guenter Roeck wrote:
> > On Thu, Aug 21, 2014 at 07:36:50PM -0500, Mark Brown wrote:
> > > On Thu, Aug 21, 2014 at 05:21:26PM -0500, at...@opensource.altera.com 
> > > wrote:
> 
> > > This all looks very much like pmbus could use regmap and then the regmap
> > > helpers.  I'd not insist on it though.  What I would however suggest is
> 
> > Not unless regmap got extended recently to support quick, byte, and word
> > smbus accesses at the same time.
> 
> Depending on how you decide which it quite possibly does - if it's based
> on the register number that'd work.

Mostly per register, but also per chip (for manufacturing specific registers
the register size is determined by the chip type). Also, there are block
registers. Plus, the scope of each register (ie if it is paged or not)
is chip specific.  The same register may be paged on one chip, and unpaged
on another. I'll have another look to see if that all can be mapped into the
regmap model. If yes, it might actually be quite helpful and might simplify
the pmbus code quite a bit.
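
To make the per-chip variation concrete, here is a minimal sketch of a register descriptor table. It is not how the pmbus core or regmap represent registers; the struct, the enum, both chip tables and find_reg() are hypothetical. Only the standard command codes 0x01 (OPERATION) and 0x21 (VOUT_COMMAND) and the manufacturer-specific range starting at 0xd0 are taken from the PMBus specification.

```c
#include <stddef.h>
#include <stdint.h>

enum reg_width { REG_BYTE, REG_WORD, REG_BLOCK };

struct reg_desc {
    uint8_t reg;            /* PMBus command code */
    enum reg_width width;   /* access size for this chip */
    int paged;              /* nonzero if the register is per page */
};

/* Two made-up chips that disagree about the same register numbers. */
const struct reg_desc chip_a_regs[] = {
    { 0x01, REG_BYTE, 1 },  /* OPERATION */
    { 0x21, REG_WORD, 1 },  /* VOUT_COMMAND */
    { 0xd0, REG_BYTE, 0 },  /* manufacturer specific, invented here */
};

const struct reg_desc chip_b_regs[] = {
    { 0x01, REG_BYTE, 0 },  /* same register, but unpaged on this chip */
    { 0x21, REG_WORD, 0 },
    { 0xd0, REG_WORD, 1 },  /* same number, different size and scope */
};

/* Look up how a register must be accessed on a given chip. */
const struct reg_desc *find_reg(const struct reg_desc *tbl, size_t n,
                                uint8_t reg)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (tbl[i].reg == reg)
            return &tbl[i];
    return NULL;
}
```

Any regmap conversion would have to carry this kind of per-chip information somewhere, which is the open question raised above.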

Either way, even if regmap now supports all the PMBus oddities, converting
the pmbus drivers to use regmap should be a separate patch set and not be
tied to this one.

Thanks,
Guenter

Re: [PATCH] regulator: RK808: Fix uninitialized value

2014-08-21 Thread Mark Brown

On Thu, Aug 21, 2014 at 05:54:55PM -0700, Doug Anderson wrote:
> The RK808 regulator driver was putting its config on the stack but not
> initting it.  That means that you got a semi-random config.  Fix this.

Applied, thanks.


signature.asc
Description: Digital signature

Re: [PATCH v7 04/11] arm: Support restart through restart handler call chain

2014-08-21 Thread Guenter Roeck

On Fri, Aug 22, 2014 at 03:32:42AM +0200, Andreas Färber wrote:
> Hi,
> 
> Am 20.08.2014 02:45, schrieb Guenter Roeck:
> > The kernel core now supports a restart handler call chain for system
> > restart functions.
> > 
> > With this change, the arm_pm_restart callback is now optional, so
> > drop its initialization and check if it is set before calling it.
> > Only call the kernel restart handler if arm_pm_restart is not set.
> [...]
> > diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
> > index 81ef686..ea279f7 100644
> > --- a/arch/arm/kernel/process.c
> > +++ b/arch/arm/kernel/process.c
> > @@ -114,17 +114,13 @@ void soft_restart(unsigned long addr)
> > BUG();
> >  }
> >  
> > -static void null_restart(enum reboot_mode reboot_mode, const char *cmd)
> > -{
> > -}
> > -
> >  /*
> >   * Function pointers to optional machine specific functions
> >   */
> >  void (*pm_power_off)(void);
> >  EXPORT_SYMBOL(pm_power_off);
> >  
> > -void (*arm_pm_restart)(enum reboot_mode reboot_mode, const char *cmd) = null_restart;
> > +void (*arm_pm_restart)(enum reboot_mode reboot_mode, const char *cmd);
> 
> Stupid newbie question maybe, but isn't this variable uninitialized now,
> like any non-static variable in C99? Or does the kernel assure that all
> such "fields" are zero-initialized?
> 
It is initialized with NULL, like all other global and static variables in the
kernel (and like pm_power_off a few lines above).
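
A small user-space sketch of the same rule, assuming nothing beyond standard C: objects with static storage duration that have no initializer start out zeroed, so a file-scope function pointer is a null pointer until something assigns it. The names restart_hook, fallback_restart() and machine_restart_sketch() are made up for the example.

```c
#include <stddef.h>

/*
 * No initializer: static storage duration, so the C standard requires
 * this pointer to start out as a null pointer.
 */
void (*restart_hook)(int mode);

static int fallback_calls;

/* Hypothetical stand-in for the generic restart handler chain. */
static void fallback_restart(int mode)
{
    (void)mode;
    fallback_calls++;
}

/* Call the optional hook if one was registered, else fall back. */
void machine_restart_sketch(int mode)
{
    if (restart_hook)
        restart_hook(mode);
    else
        fallback_restart(mode);
}

int fallback_call_count(void)
{
    return fallback_calls;
}
```

This mirrors the shape of the change: the callback is optional, and the fallback only runs when it was never set.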

Guenter

[PATCH] powerpc: edac: Fix build error

2014-08-21 Thread Pranith Kumar

Fix the following build error:

drivers/edac/ppc4xx_edac.c: In function 'mfsdram':
drivers/edac/ppc4xx_edac.c:249: error: implicit declaration of function
'__mfdcri'
drivers/edac/ppc4xx_edac.c: In function 'mtsdram':
drivers/edac/ppc4xx_edac.c:266: error: implicit declaration of function
'__mtdcri'
drivers/edac/ppc4xx_edac.c:269: warning: 'return' with a value, in function
returning void
drivers/edac/ppc4xx_edac.c: In function 'ppc4xx_edac_init_csrows':
drivers/edac/ppc4xx_edac.c:924: warning: initialization from incompatible
pointer type
drivers/edac/ppc4xx_edac.c:977: error: request for member 'dimm' in something
not a structure or union
drivers/edac/ppc4xx_edac.c: In function 'ppc4xx_edac_map_dcrs':
drivers/edac/ppc4xx_edac.c:1209: warning: passing argument 1 of 'dcr_map_mmio'
discards qualifiers from pointer target type

This driver depends on PPC_DCR_NATIVE to be set for the relevant headers to be
included. Also if PPC_DCR_MMIO=n the build fails. So make PPC_DCR depend on both
these options.

This is compile tested only.

Signed-off-by: Pranith Kumar 
CC: Andrew Morton 
---
 arch/powerpc/Kconfig   | 6 +++---
 drivers/edac/ppc4xx_edac.c | 8 
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 4bc7b62..9b90c1c 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -233,15 +233,15 @@ config ARCH_SUSPEND_POSSIBLE
 
 config PPC_DCR_NATIVE
bool
-   default n
+   default y
 
 config PPC_DCR_MMIO
bool
-   default n
+   default y
 
 config PPC_DCR
bool
-   depends on PPC_DCR_NATIVE || PPC_DCR_MMIO
+   depends on PPC_DCR_NATIVE && PPC_DCR_MMIO
default y
 
 config PPC_OF_PLATFORM_PCI
diff --git a/drivers/edac/ppc4xx_edac.c b/drivers/edac/ppc4xx_edac.c
index ef6b7e0..8725b73 100644
--- a/drivers/edac/ppc4xx_edac.c
+++ b/drivers/edac/ppc4xx_edac.c
@@ -246,8 +246,8 @@ static const char * const ppc4xx_plb_masters[9] = {
 static inline u32
 mfsdram(const dcr_host_t *dcr_host, unsigned int idcr_n)
 {
-   return __mfdcri(dcr_host->base + SDRAM_DCR_ADDR_OFFSET,
-   dcr_host->base + SDRAM_DCR_DATA_OFFSET,
+   return __mfdcri(dcr_host->host.native.base + SDRAM_DCR_ADDR_OFFSET,
+   dcr_host->host.native.base + SDRAM_DCR_DATA_OFFSET,
idcr_n);
 }
 
@@ -263,8 +263,8 @@ mfsdram(const dcr_host_t *dcr_host, unsigned int idcr_n)
 static inline void
 mtsdram(const dcr_host_t *dcr_host, unsigned int idcr_n, u32 value)
 {
-   return __mtdcri(dcr_host->base + SDRAM_DCR_ADDR_OFFSET,
-   dcr_host->base + SDRAM_DCR_DATA_OFFSET,
+   return __mtdcri(dcr_host->host.native.base + SDRAM_DCR_ADDR_OFFSET,
+   dcr_host->host.native.base + SDRAM_DCR_DATA_OFFSET,
idcr_n,
value);
 }
-- 
1.9.1


[blkg_stat] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.

2014-08-21 Thread Fengguang Wu

Hi Hong and Jens,

FYI, this patch still has the error that impacts the latest linux-next.

git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
commit 2c575026fae6e63771bd2a4c1d407214a8096a89
Author: Hong Zhiguo 
AuthorDate: Wed Nov 20 10:35:05 2013 -0700
Commit: Jens Axboe 
CommitDate: Wed Nov 20 15:33:04 2013 -0700

Update of blkg_stat and blkg_rwstat may happen in bh context.
While u64_stats_fetch_retry is only preempt_disable on 32bit
UP system. This is not enough to avoid preemption by bh and
may read strange 64 bit value.

Signed-off-by: Hong Zhiguo 
Acked-by: Tejun Heo 
Cc: sta...@kernel.org
Signed-off-by: Jens Axboe 
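
The "strange 64 bit value" mentioned in the commit message is a torn read: on a 32-bit CPU a 64-bit counter is written as two halves, and a reader that runs between the two stores sees a mix of old and new. Below is a simplified, single-threaded sketch of the sequence-counter retry idea behind the fetch/retry pair. It is not the kernel's u64_stats implementation, it omits the memory barriers a real one needs, and all names are hypothetical.

```c
#include <stdint.h>

/* A 64-bit counter stored as two 32-bit halves, as a 32-bit CPU sees it. */
struct stat64 {
    unsigned int seq;   /* odd while an update is in progress */
    uint32_t lo, hi;
};

void stat_update_begin(struct stat64 *s) { s->seq++; }
void stat_update_end(struct stat64 *s)   { s->seq++; }

unsigned int stat_fetch_begin(const struct stat64 *s) { return s->seq; }

/* Nonzero means the snapshot may be torn and must be read again. */
int stat_fetch_retry(const struct stat64 *s, unsigned int start)
{
    return (start & 1) || s->seq != start;
}

/* Reader loop: repeat until a snapshot was taken with no update in between. */
uint64_t stat_read(const struct stat64 *s)
{
    unsigned int start;
    uint64_t v;

    do {
        start = stat_fetch_begin(s);
        v = ((uint64_t)s->hi << 32) | s->lo;
    } while (stat_fetch_retry(s, start));
    return v;
}
```

If the updater can be interrupted by a reader on the same CPU, as with a bottom half, the reader would spin on an odd sequence number, which is why the update side has to keep such readers out while it runs.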

+------------------------------------------------------------------------+------------+------------+
|                                                                        | 82023bb7f7 | 2c575026fa |
+------------------------------------------------------------------------+------------+------------+
| boot_successes                                                         | 496        | 0          |
| boot_failures                                                          | 494        | 330        |
| WARNING:CPU:PID:at_arch/x86/mm/ioremap.c:__early_ioremap()             | 493        | 177        |
| WARNING:CPU:PID:at_kernel/trace/ring_buffer.c:rb_reserve_next_event()  | 493        | 177        |
| backtrace:acpi_initialize_tables                                       | 493        | 177        |
| backtrace:acpi_table_init                                              | 493        | 177        |
| backtrace:acpi_boot_table_init                                         | 493        | 177        |
| backtrace:ring_buffer_producer_thread                                  | 493        | 177        |
| BUG:unable_to_handle_kernel_NULL_pointer_dereference                   | 3          | 2          |
| Oops                                                                   | 3          | 2          |
| EIP_is_at_strlen                                                       | 3          | 2          |
| Kernel_panic-not_syncing:Fatal_exception_in_interrupt                  | 2          |            |
| Kernel_panic-not_syncing:Fatal_exception                               | 1          | 2          |
| backtrace:vfs_write                                                    | 1          | 2          |
| backtrace:SyS_write                                                    | 1          | 2          |
| WARNING:CPU:PID:at_kernel/softirq.c:local_bh_enable()                  | 0          | 330        |
| inconsistent_IN-SOFTIRQ-W-SOFTIRQ-ON-W_usage                           | 0          | 330        |
| backtrace:do_mount                                                     | 0          | 330        |
| backtrace:SyS_mount                                                    | 0          | 330        |
| backtrace:smpboot_thread_fn                                            | 0          | 182        |
+------------------------------------------------------------------------+------------+------------+

[7.266963] scsi_id (235) used greatest stack depth: 6008 bytes left
[7.403676] ------------[ cut here ]------------
[7.404033] WARNING: CPU: 0 PID: 264 at kernel/softirq.c:156 local_bh_enable+0x9c/0x1e0()
[7.404033] CPU: 0 PID: 264 Comm: mount Tainted: GW 3.12.0-02795-g2c57502 #16
[7.404033]  0001 511d1a58 420d4200 511d1a88 4109f3dd 426e5c40  
0108
[7.404033]  426e5fb0 009c 410a68dc 410a68dc 0001 4183189d 0001 
511d1a98
[7.404033]  4109f4c2 0009  511d1aac 410a68dc 51c9f008 51c9f23c 
511d1ad8
[7.404033] Call Trace:
[7.404033]  [<420d4200>] dump_stack+0x16/0x18
[7.404033]  [<4109f3dd>] warn_slowpath_common+0x8d/0xb0
[7.404033]  [<410a68dc>] ? local_bh_enable+0x9c/0x1e0
[7.404033]  [<410a68dc>] ? local_bh_enable+0x9c/0x1e0
[7.404033]  [<4183189d>] ? cfqg_stats_update_avg_queue_size+0x2d/0x100
[7.404033]  [<4109f4c2>] warn_slowpath_null+0x22/0x30
[7.404033]  [<410a68dc>] local_bh_enable+0x9c/0x1e0
[7.404033]  [<4183189d>] cfqg_stats_update_avg_queue_size+0x2d/0x100
[7.404033]  [<41833f4a>] __cfq_set_active_queue+0x15a/0x210
[7.404033]  [<418300d9>] ? cfq_group_service_tree_add+0x199/0x260
[7.404033]  [<41831f84>] ? cfq_service_tree_add+0x404/0x4f0
[7.404033]  [<418320a9>] ? cfq_resort_rr_list+0x39/0x40
[7.404033]  [<41832fff>] ? cfq_add_cfqq_rr+0x16f/0x1c0
[7.404033]  [<4183a014>] ? cfq_update_idle_window.isra.78+0x84/0x3a0
[7.404033]  [<41836b4c>] cfq_select_queue+0x7ec/0xa90
[7.404033]  [<4183988f>] cfq_dispatch_requests+0x2bf/0x9c0
[7.404033]  [<410833ec>] ? pvclock_clocksource_read+0xfc/0x240
[7.404033]  [<410822f3>] ? kvm_clock_read+0x13/0x20
[7.404033]  [<4183a702>] ? cfq_insert_request+0x3d2/0x8b0
[7.404033]  [<410e0fc3>] ? sched_clock_local.constprop.2+0x43/0x190
[7.404033]

Re: [PATCH v5] irqchip: gic: Allow gic_arch_extn hooks to call into scheduler

2014-08-21 Thread Stephen Boyd

On 08/21/14 02:47, Russell King - ARM Linux wrote:
> What would make more sense is if this were a read-write lock, then
> gic_raise_softirq() could run concurrently on several CPUs without
> interfering with each other, yet still be safe with gic_migrate_target().
>
> I'd then argue that we wouldn't need the ifdeffery, we might as well
> keep the locking in place - its overhead is likely small (when lockdep
> is disabled) when compared to everything else which goes on in this
> path.

Ok.

>> @@ -690,6 +700,7 @@ void gic_migrate_target(unsigned int new_cpu_id)
>>  ror_val = (cur_cpu_id - new_cpu_id) & 31;
>>  
>>  raw_spin_lock(&irq_controller_lock);
>> +raw_spin_lock(&gic_sgi_lock);
>>  
>>  /* Update the target interface for this logical CPU */
>>  gic_cpu_map[cpu] = 1 << new_cpu_id;
>> @@ -709,6 +720,7 @@ void gic_migrate_target(unsigned int new_cpu_id)
>>  }
>>  }
>>  
>> +raw_spin_unlock(&gic_sgi_lock);
>>  raw_spin_unlock(&irq_controller_lock);
> I would actually suggest we go a bit further.  Use gic_sgi_lock to only
> lock gic_cpu_map[] itself, and not have it beneath any other lock.
> That's an advantage because right now, lockdep learns from the above that
> there's a dependency between irq_controller_lock and gic_sgi_lock.
> Reasonably keeping lock dependencies to a minimum is always a good idea.
>
> The places where gic_cpu_map[] is used are:
>
> static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
> bool force)
> {
> ...
> raw_spin_lock(&irq_controller_lock);
> mask = 0xff << shift;
> bit = gic_cpu_map[cpu] << shift;
> val = readl_relaxed(reg) & ~mask;
> writel_relaxed(val | bit, reg);
> raw_spin_unlock(&irq_controller_lock);
>
> So, we can move:
>
> mask = 0xff << shift;
> bit = gic_cpu_map[cpu] << shift;
>
> out from under irq_controller_lock and put it under gic_sgi_lock.  The
> "mask" bit doesn't need to be under any lock at all.
>
> There's gic_cpu_init():
>
> cpu_mask = gic_get_cpumask(gic);
> gic_cpu_map[cpu] = cpu_mask;
>
> /*
>  * Clear our mask from the other map entries in case they're
>  * still undefined.
>  */
> for (i = 0; i < NR_GIC_CPU_IF; i++)
> if (i != cpu)
> gic_cpu_map[i] &= ~cpu_mask;
>
> which better had be stable after boot - if not, this needs locking.
> Remember that the above will be called on hotplug too.
>

Perhaps you'd like to send this patch? It isn't clear to me from your
description how this would work. What happens if we update the
gic_cpu_map between the time we read the map and acquire the
irq_controller_lock in gic_set_affinity()? I think we would program the
affinity for the wrong CPU?
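
The window in question can be reproduced in user space with the two steps split apart. This is only a sketch of the concern: the helper names affinity_bits() and apply_affinity() are hypothetical, the locks are left out, and the register is modelled as a plain 32-bit value rather than an MMIO access.

```c
#include <stdint.h>

#define NR_GIC_CPU_IF 8
uint8_t gic_cpu_map[NR_GIC_CPU_IF];

/* Step 1: snapshot the target byte (under the map lock in the patch). */
uint32_t affinity_bits(unsigned int cpu, unsigned int shift)
{
    return (uint32_t)gic_cpu_map[cpu] << shift;
}

/*
 * Step 2: merge the snapshot into the 32-bit target register value
 * (under irq_controller_lock in the patch).
 */
uint32_t apply_affinity(uint32_t reg, uint32_t bit, unsigned int shift)
{
    uint32_t mask = 0xffu << shift;

    return (reg & ~mask) | bit;
}
```

If gic_cpu_map[cpu] changes between step 1 and step 2, the value written in step 2 still carries the old interface bit, which is the stale-target case being asked about.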

Either way, here's the patch I think you described.

8<-

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 4b959e606fe8..d159590461c7 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -39,6 +39,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -79,6 +80,7 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock);
  */
 #define NR_GIC_CPU_IF 8
 static u8 gic_cpu_map[NR_GIC_CPU_IF] __read_mostly;
+static DEFINE_RWLOCK(gic_cpu_map_lock);
 
 /*
  * Supported arch specific GIC irq extension.
@@ -233,9 +235,13 @@ static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
 	if (cpu >= NR_GIC_CPU_IF || cpu >= nr_cpu_ids)
 		return -EINVAL;
 
-	raw_spin_lock(&irq_controller_lock);
 	mask = 0xff << shift;
+
+	read_lock(&gic_cpu_map_lock);
 	bit = gic_cpu_map[cpu] << shift;
+	read_unlock(&gic_cpu_map_lock);
+
+	raw_spin_lock(&irq_controller_lock);
 	val = readl_relaxed(reg) & ~mask;
 	writel_relaxed(val | bit, reg);
 	raw_spin_unlock(&irq_controller_lock);
@@ -605,7 +611,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	int cpu;
 	unsigned long flags, map = 0;
 
-	raw_spin_lock_irqsave(&irq_controller_lock, flags);
+	read_lock_irqsave(&gic_cpu_map_lock, flags);
 
 	/* Convert our logical CPU mask into a physical one. */
 	for_each_cpu(cpu, mask)
@@ -620,7 +626,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	/* this always happens on GIC0 */
 	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
-	raw_spin_unlock_irqrestore(&irq_controller_lock, flags);
+	read_unlock_irqrestore(&gic_cpu_map_lock, flags);
 }
 #endif
 
@@ -689,11 +695,12 @@ void gic_migrate_target(unsigned int new_cpu_id)
 	cur_target_mask = 0x01010101 << cur_cpu_id;
 	ror_val = (cur_cpu_id - new_cpu_id) & 31;
 
-	raw_spin_lock(&irq_controller_lock);
-
+	write_lock(&gic_cpu_map_lock);
 	/* Update the target interface for this logical CPU */
 	gic_cpu_map[cpu] = 1 <<

Re: [PATCH] ACPI / scan: Allow ACPI drivers to bind to PNP device objects

2014-08-21 Thread Zhang Rui

On Thu, 2014-08-21 at 19:10 +0200, Rafael J. Wysocki wrote:
> On Thursday, August 21, 2014 08:08:54 PM Zhang Rui wrote:
> > Hi, Rafael,
> > 
> > On Thu, 2014-08-21 at 06:04 +0200, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki 
> 
> [cut]
> 
> > Note that I've just tested on my machine and it works well.
> > I still need the bug reporter to check if the patch fixes bug 81511 or not.
> 
> The FUJ02B1 and FUJ02E3 devices in bug 81971 have the same problem and
> they aren't motherboard devices.

Right, but IMO, the root cause of that bug is that
1. the PNP id table in the fujitsu-laptop driver was introduced for some
reason, probably as an indicator for module auto-loading. Nowadays
this is redundant, because the fujitsu-laptop driver probes ACPI
devices only, and the driver will be loaded if the ACPI device objects
for FUJ02B1 and FUJ02E3 are created.
2. This "redundant" PNP id table causes those IDs to be added to the
PNP ID list unnecessarily, which in turn causes PNP device nodes for
those devices to be created unnecessarily.

>   Yes, we need to convert that driver
> to use a PNP driver structure or a platform device, but (1) we need a
> -stable fix *first*

I agree.

>  and (2) the cases we already know about may not be
> the only broken ones.

Agree.
But the issue addressed in your patch is that PNP scan handler blocks
ACPI driver from being probed, right?
So my question would be,
1. If the id in the PNP scan handler does not have a PNP driver, like the
   FUJ02B1/FUJ02E3 issue, why do we need the id in the PNP scan handler?
   In fact, I think this is a good chance for us to clean up the ACPI PNP
   id list, as long as we can fix them in time.
2. If the id in the PNP scan handler has a PNP driver, should we allow both
   the PNP driver and the ACPI driver to be loaded? I think the PNP system
   driver is the only case that forces us to say yes, and what I'm trying
   to do is to fix this in the following patch.

Plus, IMO, your patch only fixes the PNP bus vs. ACPI bus issue. Sooner or
later we may still get bug reports complaining that some *PLATFORM* driver
stops functioning if the ACPI node has _CID PNP0C01/PNP0C02, right?

thanks,
rui

> 
> > From c6c388728d08a6368f21dab61d6f0a940e0ea13a Mon Sep 17 00:00:00 2001
> > From: Zhang Rui 
> > Date: Thu, 21 Aug 2014 13:39:47 +0800
> > Subject: [RFC PATCH] ACPI: introduce motherboard resource management
> > 
> > ACPI devices with _HID/_CID PNP0C01/PNP0C02 indicate that
> > they have some motherboard resources that need to be reserved.
> > 
> > We used to enumerate those devices to the PNP bus and rely on
> > the PNP system driver to do the resource reservation.
> > But this mechanism does not work well nowadays, as many devices
> > not only represent motherboard resources, but also represent
> > physical devices that need native drivers other than the PNP system
> > driver for the device to work. For example,
> > 1) https://bugzilla.kernel.org/show_bug.cgi?id=46741,
> > Device (NIPM)
> > {
> > Name (_HID, EisaId ("IPI0001"))  // _HID: Hardware ID
> > Name (_CID, EisaId ("PNP0C01"))  // _CID: Compatible ID
> >    the NIPM device has _CID PNP0C01 but it is an IPMI device.
> >    The PNP system driver blocks the PNP IPMI driver from probing.
> 
> That is a good reason for PNP0C01 to be dropped from acpi_pnp_device_ids[].
> 
> > 2) https://bugzilla.kernel.org/show_bug.cgi?id=81511
> > Device (IFFS)
> > {
> > Name (_HID, EisaId ("INT3392"))  // _HID: Hardware ID
> > Name (_CID, EisaId ("PNP0C02"))  // _CID: Compatible ID
> >    the IFFS device has _CID PNP0C02, but it is an Intel Rapid Start
> >    device, which already has an ACPI driver at
> >    drivers/platform/x86/intel-rst.c
> 
> And which should be a platform driver really.
> 
> > 3) a couple of machines, including the one in
> >    https://bugzilla.kernel.org/show_bug.cgi?id=81511, have AML code
> >    like the following
> > Device (PTID)
> > {
> > Name (_HID, EisaId ("INT340E"))  // _HID: Hardware ID
> > Name (_CID, EisaId ("PNP0C02"))  // _CID: Compatible ID
> >    the PTID device has _CID PNP0C02, but it also represents an
> >    INT340E device; there is a platform bus driver for this device
> >    which I will introduce soon.
> 
> Again, that's a good reason for dropping PNP0C02 from acpi_pnp_device_ids[].
> 
> > In any of the above cases, the current code for managing PNP0C01/PNP0C02
> > resources in the Linux kernel is broken, because it either blocks the
> > physical device driver on the same bus, or results in multiple drivers
> > being loaded for the same ACPI device node, which also carries some risk.
> > 
> > Thus, IMO, we need a clean way to handle those motherboard resources.
> > Given that PNP0C01/PNP0C02 is more like an indicator for reserving the
> > resources, this patch
> > 1. does the resource reservation in ACPI code directly, with the same logic
> >and time point in drivers/pnp/quirks.c and

Re: [PATCH] softlockup: make detector be aware of task switch of processes hogging cpu

2014-08-21 Thread Don Zickus

On Thu, Aug 21, 2014 at 01:42:22PM +0800, chai wen wrote:
> For now, soft lockup detector warns once for each case of process softlockup.
> But the thread 'watchdog/n' may not always get the cpu at the time slot 
> between
> the task switch of two processes hogging that cpu to reset soft_watchdog_warn.
> 
> An example would be two processes hogging the cpu.  Process A causes the
> softlockup warning and is killed manually by a user.  Process B immediately
> becomes the new process hogging the cpu preventing the softlockup code from
> resetting the soft_watchdog_warn variable.
> 
> This case is a false negative of "warn only once for a process", as there may
> be a different process that is going to hog the cpu.  Resolve this by
> saving/checking the task pointer of the hogging process and use that to reset
> soft_watchdog_warn too.
> 
> Signed-off-by: chai wen 
> Signed-off-by: Don Zickus 

Acked-by: Don Zickus 

> ---
>  kernel/watchdog.c |   16 +++-
>  1 files changed, 15 insertions(+), 1 deletions(-)
> 
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index 0037db6..2e55620 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -42,6 +42,7 @@ static DEFINE_PER_CPU(bool, softlockup_touch_sync);
>  static DEFINE_PER_CPU(bool, soft_watchdog_warn);
>  static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts);
>  static DEFINE_PER_CPU(unsigned long, soft_lockup_hrtimer_cnt);
> +static DEFINE_PER_CPU(struct task_struct *, softlockup_task_ptr_saved);
>  #ifdef CONFIG_HARDLOCKUP_DETECTOR
>  static DEFINE_PER_CPU(bool, hard_watchdog_warn);
>  static DEFINE_PER_CPU(bool, watchdog_nmi_touch);
> @@ -328,8 +329,20 @@ static enum hrtimer_restart watchdog_timer_fn(struct 
> hrtimer *hrtimer)
>   return HRTIMER_RESTART;
>  
>   /* only warn once */
> - if (__this_cpu_read(soft_watchdog_warn) == true)
> + if (__this_cpu_read(soft_watchdog_warn) == true) {
> + /*
> +  * Handle the case where multiple processes are
> +  * causing softlockups but the duration is small
> +  * enough, the softlockup detector can not reset
> +  * itself in time.  Use task pointers to detect this.
> +  */
> + if (__this_cpu_read(softlockup_task_ptr_saved) !=
> + current) {
> + __this_cpu_write(soft_watchdog_warn, false);
> + __touch_watchdog();
> + }
>   return HRTIMER_RESTART;
> + }
>  
>   if (softlockup_all_cpu_backtrace) {
>   /* Prevent multiple soft-lockup reports if one cpu is 
> already
> @@ -345,6 +358,7 @@ static enum hrtimer_restart watchdog_timer_fn(struct 
> hrtimer *hrtimer)
>   pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
>   smp_processor_id(), duration,
>   current->comm, task_pid_nr(current));
> + __this_cpu_write(softlockup_task_ptr_saved, current);
>   print_modules();
>   print_irqtrace_events(current);
>   if (regs)
> -- 
> 1.7.1
> 
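
The re-arm logic in the patch above can be modelled in user space roughly as follows. This is a sketch, not the kernel code: struct task stands in for task_struct, lockup_tick() is a hypothetical name, and the function only models the part of watchdog_timer_fn() that runs once a soft lockup has already been detected on the CPU.

```c
#include <stdbool.h>
#include <stddef.h>

struct task { int pid; };   /* hypothetical stand-in for task_struct */

struct lockup_state {
    bool warned;               /* mirrors soft_watchdog_warn */
    const struct task *saved;  /* mirrors softlockup_task_ptr_saved */
};

/*
 * Called on a timer tick while a soft lockup is detected.
 * Returns true when a warning should be printed for 'current'.
 */
bool lockup_tick(struct lockup_state *st, const struct task *current)
{
    if (st->warned) {
        /* A different task is hogging the CPU now: re-arm the detector. */
        if (st->saved != current)
            st->warned = false;
        return false;
    }
    st->warned = true;
    st->saved = current;
    return true;
}
```

Comparing task pointers is what lets the detector notice that process B has replaced process A even though the watchdog thread never got to run in between.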

Re: [PATCH V4 3/8] namespaces: expose ns instance serial numbers in proc

2014-08-21 Thread Richard Guy Briggs

On 14/08/21, Andy Lutomirski wrote:
> On Aug 20, 2014 8:12 PM, "Richard Guy Briggs"  wrote:
> > Expose the namespace instance serial numbers in the proc filesystem at
> > /proc//ns/_snum.  The link text gives the serial number in hex.
> 
> What's the use case?
> 
> I understand the utility of giving unique numbers to the audit code,
> but I don't think this part is necessary for that, and I'd like to
> understand what else will use this before committing to a duplicative
> API like this.

How does a container manager get those numbers?  It could provoke a task
to cause an audit event that emits a NS_INFO message, or it could run a
task in that container to report its namespace serial numbers directly
from its /proc mount.
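
As a sketch of the second option, a tool that has read one of the proposed links could pull the serial number out of the link text. This assumes the "%s_snum:[%llx]" format used in the patch quoted below; the interface is only a proposal, and parse_ns_snum() is a hypothetical helper, not an existing API.

```c
#include <stdio.h>

/*
 * Parse link text such as "net_snum:[1a]".
 * On success stores the namespace kind and the serial number (hex in the
 * link text) and returns 0; returns -1 if the text has another shape.
 */
int parse_ns_snum(const char *link, char kind[32], unsigned long long *snum)
{
    int consumed = 0;

    /* %n only fires if the closing ']' was matched as well. */
    if (sscanf(link, "%31[^_]_snum:[%llx]%n", kind, snum, &consumed) != 2)
        return -1;
    if (consumed == 0 || link[consumed] != '\0')
        return -1;
    return 0;
}
```

The existing inode-style link text, for example "net:[4026531956]", is rejected by this parser, so the two kinds of entries cannot be confused.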

The discussion in this thread touches on the use cases:
https://lkml.org/lkml/2014/4/22/662

> Note that this API is thoroughly incompatible with CRIU.  If we do
> this, someone will ask for a namespace number namespace, and that way
> lies madness.

I had a very brief look at CRIU, but not enough to understand the issue.
Others have hinted at this problem.

Do you have a suggestion of a different approach that would be
compatible with CRIU?

I'd originally considered some sort of UUID that would be globally
unique, but that would be very hard to devise or guarantee, and besides,
namespaces aren't only used by containers and could be shared in other
ways.  Tracking the usage and migration of namespaces should be the task
of an upper layer.

> --Andy
> 
> >
> > "snum" was chosen instead of "seq" for consistency with inum and there are a
> > number of other uses of "seq" in the namespace code.
> >
> > Suggested-by: Serge E. Hallyn 
> > Signed-off-by: Richard Guy Briggs 
> > ---
> >  fs/proc/namespaces.c |   33 +
> >  1 files changed, 25 insertions(+), 8 deletions(-)
> >
> > diff --git a/fs/proc/namespaces.c b/fs/proc/namespaces.c
> > index 8902609..e953e0a 100644
> > --- a/fs/proc/namespaces.c
> > +++ b/fs/proc/namespaces.c
> > @@ -47,12 +47,15 @@ static char *ns_dname(struct dentry *dentry, char 
> > *buffer, int buflen)
> > struct inode *inode = dentry->d_inode;
> > const struct proc_ns_operations *ns_ops = PROC_I(inode)->ns.ns_ops;
> >
> > -   return dynamic_dname(dentry, buffer, buflen, "%s:[%lu]",
> > -   ns_ops->name, inode->i_ino);
> > +   if (strstr(dentry->d_iname, "_snum"))
> > +   return dynamic_dname(dentry, buffer, buflen, 
> > "%s_snum:[%llx]",
> > +   ns_ops->name, ns_ops->snum(PROC_I(inode)->ns.ns));
> > +   else
> > +   return dynamic_dname(dentry, buffer, buflen, "%s:[%lu]",
> > +   ns_ops->name, inode->i_ino);
> >  }
> >
> > -const struct dentry_operations ns_dentry_operations =
> > -{
> > +const struct dentry_operations ns_dentry_operations = {
> > .d_delete   = always_delete_dentry,
> > .d_dname= ns_dname,
> >  };
> > @@ -160,7 +163,10 @@ static int proc_ns_readlink(struct dentry *dentry, 
> > char __user *buffer, int bufl
> > if (!ns)
> > goto out_put_task;
> >
> > -   snprintf(name, sizeof(name), "%s:[%u]", ns_ops->name, 
> > ns_ops->inum(ns));
> > +   if (strstr(dentry->d_iname, "_snum"))
> > +   snprintf(name, sizeof(name), "%s_snum:[%llx]", 
> > ns_ops->name, ns_ops->snum(ns));
> > +   else
> > +   snprintf(name, sizeof(name), "%s:[%u]", ns_ops->name, 
> > ns_ops->inum(ns));
> > res = readlink_copy(buffer, buflen, name);
> > ns_ops->put(ns);
> >  out_put_task:
> > @@ -210,16 +216,23 @@ static int proc_ns_dir_readdir(struct file *file, 
> > struct dir_context *ctx)
> >
> > if (!dir_emit_dots(file, ctx))
> > goto out;
> > -   if (ctx->pos >= 2 + ARRAY_SIZE(ns_entries))
> > +   if (ctx->pos >= 2 + 2 * ARRAY_SIZE(ns_entries))
> > goto out;
> > entry = ns_entries + (ctx->pos - 2);
> > last = &ns_entries[ARRAY_SIZE(ns_entries) - 1];
> > while (entry <= last) {
> > const struct proc_ns_operations *ops = *entry;
> > +   char name[50];
> > +
> > if (!proc_fill_cache(file, ctx, ops->name, 
> > strlen(ops->name),
> >  proc_ns_instantiate, task, ops))
> > break;
> > ctx->pos++;
> > +   snprintf(name, sizeof(name), "%s_snum", ops->name);
> > +   if (!proc_fill_cache(file, ctx, name, strlen(name),
> > +proc_ns_instantiate, task, ops))
> > +   break;
> > +   ctx->pos++;
> > entry++;
> > }
> >  out:
> > @@ -247,9 +260,13 @@ static struct dentry *proc_ns_dir_lookup(struct inode 
> > *dir,
> >
> > last = &ns_entries[ARRAY_SIZE(ns_entries)];
> > for (entry = ns_entries; entry < last; entry++) {
> > -   if

Re: [ext4] 71d4f7d0321: -49.6% xfstests.generic.274.seconds

2014-08-21 Thread Fengguang Wu

On Thu, Aug 21, 2014 at 10:30:06PM +0800, Fengguang Wu wrote:
> Hi Ted,
> 
> We noticed increased xfstests 274's test speed and the first good
> commit is 71d4f7d032149b935a26eb3ff85c6c837f3714e1 ("ext4: remove
> metadata reservation checks").
> 
> test case: snb-drag/xfstests/4HDD-ext4-generic-slow2
> snb-drag is a Sandy Bridge PC with 6G memory.
> 
> d5e03cbb0c88cd1  71d4f7d032149b935a26eb3ff
> ---------------  -------------------------
>         51 ± 2%     -49.6%         25 ± 1%  TOTAL xfstests.generic.274.seconds
>        817 ± 1%      -3.0%        792 ± 1%  TOTAL time.elapsed_time
> 
>   xfstests.generic.274.seconds
> 
>   [ASCII plot, y-axis 25 to 60: bisect-good samples [*] lie between
>    roughly 50 and 55, bisect-bad samples [O] all sit at 25]

This is sweet, got another improvement:

testbox/testcase/testparams: snb-drag/xfstests/4HDD-ext4-generic-mid

d5e03cbb0c88cd1  71d4f7d032149b935a26eb3ff
---------------  -------------------------
        45 ± 4%     -50.2%         22 ± 7%  xfstests.generic.256.seconds
   7044859 ± 2%      -7.0%    6553446 ± 0%  time.minor_page_faults
     36156 ± 2%      -7.0%      33638 ± 0%  time.involuntary_context_switches
    171551 ± 1%      -4.8%     163312 ± 0%  time.voluntary_context_switches
       436 ± 1%      -5.7%        411 ± 0%  time.elapsed_time

   time.voluntary_context_switches

   [ASCII plot, y-axis 158000 to 176000: [*] samples in the upper part of
    the range, [O] samples in the lower part]

   xfstests.generic.256.seconds

   [ASCII plot, y-axis 20 to 60: [*] samples mostly between 40 and 50,
    [O] samples between 20 and 25; the plot is cut off in the archive]

RE: kernel boot fail with efi earlyprintk (bisected)

2014-08-21 Thread Zheng, Lv

Hi,

There are only a limited number of entries in the x86 early mapping, which is
implemented by the FIXMAP.
This means that for every __init call invoked on x86, any early mapping
created in it should be unmapped before the __init call exits.

Following this rule, all __init call implementers can be sure that, on
entering the __init call, the limited number of FIXMAP entries is enough.

The bisected commit below just increases the number of early mappings from 1
to 2 in the ACPICA early table handling code.
2 is less than the number of available FIXMAP entries,
and the ACPICA code ensures that all mappings are correctly unmapped after
table initialization.
So we didn't break the rule.

We can offer a workaround in ACPICA to reduce the mapping count from 2 to 1
using a global option.
But this report sounds like the root cause is that earlyprintk=efi has broken
the above rule, and the existing issue was merely triggered by this cleanup.
So could someone check the earlyprintk=efi code first?
I think earlyprintk=efi should either unmap the additional mapping or increase
the number of FIXMAP entries in case earlyprintk=efi needs additional early
mappings.
Otherwise there will always be a chance for earlyprintk=efi to break future
code.
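
The map/unmap pairing rule can be pictured with a small userspace model. This
is only a sketch under stated assumptions: toy_early_map(), toy_early_unmap(),
toy_slots_in_use() and the pool size of 4 are hypothetical names and values
chosen for illustration, and none of this is the kernel's early_ioremap()
implementation.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_SLOTS 4	/* assumed pool size, for illustration only */

struct toy_slot_pool {
	int used[TOY_NR_SLOTS];
};

/* Take a free slot; returns its index, or -1 when the pool is exhausted. */
static int toy_early_map(struct toy_slot_pool *pool)
{
	assert(pool != NULL);
	for (int i = 0; i < TOY_NR_SLOTS; i++) {
		if (!pool->used[i]) {
			pool->used[i] = 1;
			return i;
		}
	}
	return -1;
}

/* Give a slot back; every successful map must be paired with one unmap. */
static void toy_early_unmap(struct toy_slot_pool *pool, int slot)
{
	assert(pool != NULL);
	assert(slot >= 0 && slot < TOY_NR_SLOTS && pool->used[slot]);
	pool->used[slot] = 0;
}

/* Count the slots currently held. */
static int toy_slots_in_use(const struct toy_slot_pool *pool)
{
	int n = 0;

	assert(pool != NULL);
	for (int i = 0; i < TOY_NR_SLOTS; i++)
		n += pool->used[i];
	return n;
}
```

In this model a caller that holds two slots leaves only two for anything it
calls into, so a console path that itself needs slots while they are held can
run the pool dry even though each user stays within the rule on its own.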

Thanks and best regards
-Lv

> From: Matt Fleming [mailto:m...@console-pimps.org]
> Sent: Friday, August 22, 2014 4:52 AM
> 
> On Tue, 19 Aug, at 04:16:58PM, Dave Young wrote:
> > Hi,
> >
> > 3.16 kernel boot fail with earlyprintk=efi on my laptop.
> > It keeps scrolling at the bottom line of screen.
> >
> > Bisected, the first bad commit is below:
> > commit 86dfc6f339886559d80ee0d4bd20fe5ee90450f0
> > Author: Lv Zheng 
> > Date:   Fri Apr 4 12:38:57 2014 +0800
> >
> > ACPICA: Tables: Fix table checksums verification before installation.
> >
> >
> > I did some debugging by enabling both serial and efi earlyprintk, below is
> > some debug dmesg, seems early_ioremap fails in scroll up function due to
> > no free slot, but I'm still not sure if the debug info is right or not.
> 
> Thanks Dave, your callstack seems to make sense.
> 
> Can you also enable early_ioremap_debug so that we can figure out where
> all the FIXMAP slots are going?
> 
> --
> Matt Fleming, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [Xen-devel] [V0 PATCH 1/2] AMD-PVH: set EFER.NX and EFER.SCE for the boot vcpu

2014-08-21 Thread Konrad Rzeszutek Wilk

On Wed, Aug 20, 2014 at 07:16:39PM -0700, Mukesh Rathor wrote:
> On AMD, NX feature must be enabled in the efer for NX to be honored in
> the pte entries, otherwise protection fault. We also set SC for
> system calls to be enabled.

How come we don't need to do that for Intel (that is set the NX bit)?
Could you include the explanation here please?


> 
> Signed-off-by: Mukesh Rathor 
> ---
>  arch/x86/xen/enlighten.c | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index c0cb11f..4af512d 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -1499,6 +1499,17 @@ void __ref xen_pvh_secondary_vcpu_init(int cpu)
>   xen_pvh_set_cr_flags(cpu);
>  }
>  
> +/* This is done in secondary_startup_64 for hvm guests. */
> +static void __init xen_configure_efer(void)
> +{
> + u64 efer;
> +
> + rdmsrl(MSR_EFER, efer);
> + efer |= EFER_SCE;
> + efer |= (cpuid_edx(0x80000001) & (1 << 20)) ? EFER_NX : 0;

Ahem? #defines for these magic values please?

Or could you use 'boot_cpu_has'?
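
To illustrate the request for named constants, here is a minimal userspace
sketch. The macro and function names are hypothetical and not taken from the
kernel headers; the bit positions are the architectural ones (CPUID leaf
0x80000001 EDX bit 20 for NX, EFER bit 0 for SCE, EFER bit 11 for NX). In the
kernel itself, boot_cpu_has() would presumably be used with X86_FEATURE_NX
instead of open-coding the CPUID test.

```c
#include <assert.h>
#include <stdint.h>

/* CPUID extended feature leaf and the NX bit in its EDX output. */
#define TOY_CPUID_EXT_FEATURES	0x80000001u
#define TOY_CPUID_EXT_EDX_NX	(1u << 20)

/* EFER bits, named here only for the sketch. */
#define TOY_EFER_SCE		(1ULL << 0)	/* SYSCALL enable */
#define TOY_EFER_NX		(1ULL << 11)	/* no-execute enable */

/* Compute the EFER value to write back, given EDX of the extended leaf. */
static uint64_t toy_efer_value(uint64_t efer, uint32_t ext_edx)
{
	efer |= TOY_EFER_SCE;
	if (ext_edx & TOY_CPUID_EXT_EDX_NX)
		efer |= TOY_EFER_NX;
	return efer;
}

/* Usage example: NX is only set when the CPU advertises it. */
static void toy_efer_demo(void)
{
	assert(toy_efer_value(0, 0) == TOY_EFER_SCE);
	assert(toy_efer_value(0, TOY_CPUID_EXT_EDX_NX) ==
	       (TOY_EFER_SCE | TOY_EFER_NX));
}
```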

> + wrmsrl(MSR_EFER, efer);
> +}
> +
>  static void __init xen_pvh_early_guest_init(void)
>  {
>   if (!xen_feature(XENFEAT_auto_translated_physmap))
> @@ -1508,6 +1519,7 @@ static void __init xen_pvh_early_guest_init(void)
>   return;
>  
>   xen_have_vector_callback = 1;
> + xen_configure_efer();
>   xen_pvh_set_cr_flags(0);
>  
>  #ifdef CONFIG_X86_32
> -- 
> 1.8.3.1
> 
> 
> ___
> Xen-devel mailing list
> xen-de...@lists.xen.org
> http://lists.xen.org/xen-devel

Re: [PATCH 0/6] mm/hugetlb: gigantic hugetlb page pools shrink supporting

2014-08-21 Thread Zhang Yanfei

Hello Wanpeng

On 08/22/2014 07:37 AM, Wanpeng Li wrote:
> Hi Andi,
> On Fri, Apr 12, 2013 at 05:22:37PM +0200, Andi Kleen wrote:
>> On Fri, Apr 12, 2013 at 07:29:07AM +0800, Wanpeng Li wrote:
>>> Ping Andi,
>>> On Thu, Apr 04, 2013 at 05:09:08PM +0800, Wanpeng Li wrote:
>>>> order >= MAX_ORDER pages are only allocated at boot stage using the
>>>> bootmem allocator with the "hugepages=xxx" option. These pages are never
>>>> freed after boot by default, since it would be a one-way street
>>>> (>= MAX_ORDER pages cannot be allocated later). But if the administrator
>>>> decides not to use these gigantic pages any more, the pinned pages waste
>>>> memory, since other users can't grab free pages from the gigantic hugetlb
>>>> pool even under OOM; it's not flexible. The patchset adds support for
>>>> shrinking gigantic hugetlb page pools. The administrator can enable a
>>>> knob exported in sysctl to permit shrinking the gigantic hugetlb pool.
>>
>>
>> I originally didn't allow this because it's only one way and it seemed
>> dubious.  I've been recently working on a new patchkit to allocate
>> GB pages from CMA. With that freeing actually makes sense, as 
>> the pages can be reallocated.
>>
> 
> More than a year has passed. Has your patchkit to allocate GB pages from CMA
> been merged?

commit 944d9fec8d7aee3f2e16573e9b6a16634b33f403
Author: Luiz Capitulino 
Date:   Wed Jun 4 16:07:13 2014 -0700

hugetlb: add support for gigantic page allocation at runtime


> 
> Regards,
> Wanpeng Li 
> 
>> -Andi
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majord...@kvack.org.  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: mailto:"d...@kvack.org;> em...@kvack.org 
> 


-- 
Thanks.
Zhang Yanfei

Re: [PATCH v7 04/11] arm: Support restart through restart handler call chain

2014-08-21 Thread Andreas Färber

Hi,

Am 20.08.2014 02:45, schrieb Guenter Roeck:
> The kernel core now supports a restart handler call chain for system
> restart functions.
> 
> With this change, the arm_pm_restart callback is now optional, so
> drop its initialization and check if it is set before calling it.
> Only call the kernel restart handler if arm_pm_restart is not set.
[...]
> diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
> index 81ef686..ea279f7 100644
> --- a/arch/arm/kernel/process.c
> +++ b/arch/arm/kernel/process.c
> @@ -114,17 +114,13 @@ void soft_restart(unsigned long addr)
>   BUG();
>  }
>  
> -static void null_restart(enum reboot_mode reboot_mode, const char *cmd)
> -{
> -}
> -
>  /*
>   * Function pointers to optional machine specific functions
>   */
>  void (*pm_power_off)(void);
>  EXPORT_SYMBOL(pm_power_off);
>  
> -void (*arm_pm_restart)(enum reboot_mode reboot_mode, const char *cmd) = 
> null_restart;
> +void (*arm_pm_restart)(enum reboot_mode reboot_mode, const char *cmd);

Stupid newbie question maybe, but isn't this variable uninitialized now,
like any non-static variable in C99? Or does the kernel assure that all
such "fields" are zero-initialized?

>  EXPORT_SYMBOL_GPL(arm_pm_restart);

(This doesn't seem to be affecting the value of arm_pm_restart, just
redeclaring it extern and adding further derived symbols.)

>  
>  /*
> @@ -230,7 +226,10 @@ void machine_restart(char *cmd)
>   local_irq_disable();
>   smp_send_stop();
>  
> - arm_pm_restart(reboot_mode, cmd);
> + if (arm_pm_restart)

Here we seem to be relying on arm_pm_restart to be NULL when not
callable. I.e., wondering whether it's ruled out that the following line
is triggered due to non-zero garbage in arm_pm_restart?
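
On the C side of the question: an object declared at file scope has static
storage duration whether or not it carries the "static" keyword, and the C
standard initializes such objects to zero (a null pointer for pointers) when
no initializer is given. Only automatic, block-scope variables start with an
indeterminate value. A minimal userspace sketch of the same pattern follows;
all toy_* names are hypothetical stand-ins, not the kernel's symbols.

```c
#include <assert.h>
#include <stddef.h>

/*
 * File-scope pointer without an initializer. It has static storage
 * duration (the "static" keyword only affects linkage), so it starts
 * out as a null pointer.
 */
void (*toy_restart_hook)(const char *cmd);

static int toy_hook_calls;
static int toy_fallback_calls;

/* Stand-in for a board-specific restart routine. */
static void toy_board_restart(const char *cmd)
{
	assert(cmd != NULL);
	toy_hook_calls++;
}

/* Stand-in for the generic restart path. */
static void toy_fallback_restart(const char *cmd)
{
	assert(cmd != NULL);
	toy_fallback_calls++;
}

/* Same shape as the NULL check in the quoted hunk. */
static void toy_machine_restart(const char *cmd)
{
	if (toy_restart_hook)
		toy_restart_hook(cmd);
	else
		toy_fallback_restart(cmd);
}
```

With no hook installed, toy_machine_restart() takes the fallback path; after
assigning toy_board_restart to toy_restart_hook it calls the hook instead.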

Thanks,
Andreas

> + arm_pm_restart(reboot_mode, cmd);
> + else
> + do_kernel_restart(cmd);
>  
>   /* Give a grace period for failure to restart of 1s */
>   mdelay(1000);

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [PATCH v5 2/3] mfd: qcom-rpm: Driver for the Qualcomm RPM

2014-08-21 Thread Bjorn Andersson

On Thu 21 Aug 06:22 PDT 2014, Lee Jones wrote:
> > diff --git a/drivers/mfd/qcom_rpm.c b/drivers/mfd/qcom_rpm.c
> > +static const struct qcom_rpm_data msm8660_template = {
> > +   .version = -1,
> 
> -1?
> 

2 would be a better number...

> > +   .resource_table = msm8660_rpm_resource_table,
> > +   .n_resources = ARRAY_SIZE(msm8660_rpm_resource_table),
> > +};
> 
> [...]
> 
> > +struct qcom_rpm *dev_get_qcom_rpm(struct device *dev)
> > +{
> > +   return dev_get_drvdata(dev);
> > +}
> > +EXPORT_SYMBOL(dev_get_qcom_rpm);
> 
> No need for this at all.  Use dev_get_drvdata() direct instead.
> 

I see that others have put this as static inline in the header file, so I will
follow that. I don't want to expose this directly in the implementation of the
clients.

Let me know if you object.

> [...]
> 
> > +static int qcom_rpm_probe(struct platform_device *pdev)
> > +{
[...]
> > +
> > +   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > +   rpm->status_regs = devm_ioremap_resource(&pdev->dev, res);
> > +   rpm->ctrl_regs = rpm->status_regs + 0x400;
> > +   rpm->req_regs = rpm->status_regs + 0x600;
> > +   if (IS_ERR(rpm->status_regs))
> > +   return PTR_ERR(rpm->status_regs);
> 
> You should probably do this _before_ using it above.
> 

There's no difference in behaviour, but it just feels cleaner than:

res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
rpm->status_regs = devm_ioremap_resource(&pdev->dev, res);
if (IS_ERR(rpm->status_regs))
return PTR_ERR(rpm->status_regs);
rpm->ctrl_regs = rpm->status_regs + 0x400;
rpm->req_regs = rpm->status_regs + 0x600;

If you don't like it, I'll change it.

[...]
> > +
> > +   writel(fw_version[0], RPM_CTRL_REG(rpm, 0));
> > +   writel(fw_version[1], RPM_CTRL_REG(rpm, 1));
> > +   writel(fw_version[2], RPM_CTRL_REG(rpm, 2));
> 
> A comment documenting what this is doing would be helpful here.
> 

Sounds reasonable, this seems to be incorrect now that I had to investigate
what's really happening. So I'll update it.

[...]
> > +}
> > +
> > +static int qcom_rpm_remove_child(struct device *dev, void *unused)
> > +{
> > +   platform_device_unregister(to_platform_device(dev));
> > +   return 0;
> > +}
> > +
> > +static int qcom_rpm_remove(struct platform_device *pdev)
> > +{
> > +   device_for_each_child(&pdev->dev, NULL, qcom_rpm_remove_child);
> 
> of_platform_depopulate()?
> 

Forgot that we had that now, will update.

> > +   return 0;
> > +}
> > +
> > +static struct platform_driver qcom_rpm_driver = {
> > +   .probe = qcom_rpm_probe,
> > +   .remove = qcom_rpm_remove,
> > +   .driver  = {
> > +   .name  = "qcom_rpm",
> > +   .owner = THIS_MODULE,
> 
> Remove this line, it's taken care of for you.
> 

OK

> > +   .of_match_table = qcom_rpm_of_match,
> > +   },
> > +};
> > +
> > +static int __init qcom_rpm_init(void)
> > +{
> > +   return platform_driver_register(&qcom_rpm_driver);
> > +}
> > +arch_initcall(qcom_rpm_init);
> > +
> > +static void __exit qcom_rpm_exit(void)
> > +{
> > +   platform_driver_unregister(&qcom_rpm_driver);
> > +}
> > +module_exit(qcom_rpm_exit)
> > +
> > +MODULE_DESCRIPTION("Qualcomm Resource Power Manager driver");
> > +MODULE_LICENSE("GPL v2");
> 
> No one authored this driver?
> 

Thought that was optional, will update.


Thanks for the review!

Regards,
Bjorn

Re: [PATCH] kbuild: Make scripts executable

2014-08-21 Thread Masahiro Yamada

Hi Michal,

On Thu, 21 Aug 2014 11:32:56 +0200
Michal Marek  wrote:

> On 2014-08-21 05:25, Masahiro Yamada wrote:
> > Hi Michal,
> > 
> > 
> > On Wed, 20 Aug 2014 16:10:48 +0200
> > Michal Marek  wrote:
> > 
> >> The Makefiles call the respective interpreter explicitly, but this makes
> >> it easier to use the scripts manually.
> >>
> >> Signed-off-by: Michal Marek 
> > 
> > 
> > I am not sure at all, but
> > it seems scripts/checkpatch.pl has a rule
> > to ban execute permissions.
> 
> I didn't know about this, but the intent of the rule seems to be to
> avoid *.c files with execute permissions.
> 
> 
> > # Check for incorrect file permissions
> > if ($line =~ /^new (file )?mode.*[7531]\d{0,2}$/) {
> > my $permhere = $here . "FILE: $realfile\n";
> > if ($realfile !~ m@scripts/@ &&
> > $realfile !~ /\.(py|pl|awk|sh)$/) {
> 
> Here it explicitly skips files below scripts/ and files with known
> script suffixes.
> 

OK then. I replied without understanding this code well.
My apologies.


Best Regards
Masahiro Yamada


Re: [PATCH 2/2] pmbus: ltc2978: add regulator gating

2014-08-21 Thread Mark Brown

On Thu, Aug 21, 2014 at 06:18:10PM -0700, Guenter Roeck wrote:
> On Thu, Aug 21, 2014 at 07:36:50PM -0500, Mark Brown wrote:
> > On Thu, Aug 21, 2014 at 05:21:26PM -0500, at...@opensource.altera.com wrote:

> > This all looks very much like pmbus could use regmap and then the regmap
> > helpers.  I'd not insist on it though.  What I would however suggest is

> Not unless regmap got extended recently to support quick, byte, and word
> smbus accesses at the same time.

Depending on how you decide which to use, it quite possibly does - if it's
based on the register number, that'd work.



Re: [PATCH 2/2] pmbus: ltc2978: add regulator gating

2014-08-21 Thread Guenter Roeck

On Thu, Aug 21, 2014 at 07:36:50PM -0500, Mark Brown wrote:
> On Thu, Aug 21, 2014 at 05:21:26PM -0500, at...@opensource.altera.com wrote:
> 
> > +config SENSORS_LTC2978_REGULATOR
> > +   boolean "Regulator support for LTC2974, LTC2978, LTC3880, and LTC3883"
> > +   default n
> 
> No need to say default n here, it's the default default.
> 
> > +   depends on SENSORS_LTC2978
> > +   select REGULATOR
> 
> I'd expect a depends here.
> 
> > +#include 
> > +#include 
> 
> If you need machine.h that's suspicious...  why do you need it?
> 
> > +static int ltc2978_write_pmbus_operation(struct regulator_dev *rdev, u8 
> > value)
> > +{
> > +   struct device *dev = rdev_get_dev(rdev);
> > +   struct i2c_client *client = to_i2c_client(dev->parent);
> > +   int ret;
> > +
> > +   ret = pmbus_set_page(client, 0xff);
> > +   if (ret < 0)
> > +   return ret;
> > +
> > +   return i2c_smbus_write_byte_data(client, PMBUS_OPERATION, value);
> > +}
> 
> This all looks very much like pmbus could use regmap and then the regmap
> helpers.  I'd not insist on it though.  What I would however suggest is

Not unless regmap got extended recently to support quick, byte, and word
smbus accesses at the same time.

Guenter

> that these functions should all be helpers which read the specific
> page, addresses and bits to write from the driver structure - I bet the
> code is going to be identical for most pmbus using regulators and so it
> makes sense to share it like we do with the generic regmap functions.
> 
> That means that any good practice can be deployed more easily and any
> API updates only need to update the helpers.
> 
> > +static struct regulator_init_data ltc2978_regulator_init = {
> > +   .constraints = {
> > +   .valid_ops_mask = REGULATOR_CHANGE_STATUS,
> > +   },
> > +};
> 
> You should not be forcing this on, you don't know what's safe on any
> given board.  Allow the board to specify constraints then it has
> control.



Re: [PATCH] softlockup: make detector be aware of task switch of processes hogging cpu

2014-08-21 Thread Chai Wen

On 08/21/2014 01:42 PM, chai wen wrote:

> For now, soft lockup detector warns once for each case of process softlockup.
> But the thread 'watchdog/n' may not always get the cpu at the time slot 
> between
> the task switch of two processes hogging that cpu to reset soft_watchdog_warn.
> 
> An example would be two processes hogging the cpu.  Process A causes the
> softlockup warning and is killed manually by a user.  Process B immediately
> becomes the new process hogging the cpu preventing the softlockup code from
> resetting the soft_watchdog_warn variable.
> 
> This case is a false negative of "warn only once for a process", as there may
> be a different process that is going to hog the cpu.  Resolve this by
> saving/checking the task pointer of the hogging process and use that to reset
> soft_watchdog_warn too.
> 
> Signed-off-by: chai wen 
> Signed-off-by: Don Zickus 


Hi Ingo & Don

Ping...

This patch uses the task pointer to catch the cases where the softlockup
detector cannot reset itself, and it has been tested.
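
The warn-once logic with the saved task pointer can be sketched in userspace
as below. This is a simplified model, not the kernel code: the toy_* names are
hypothetical, every call stands for one timer tick on which a lockup is
detected, and the timestamp reset done by __touch_watchdog() is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_task {
	int pid;
};

struct toy_watchdog {
	bool warned;			/* models soft_watchdog_warn */
	const struct toy_task *saved;	/* models softlockup_task_ptr_saved */
};

/*
 * One timer tick on which a soft lockup is detected for 'task'.
 * Returns true when a warning would be printed on this tick.
 */
static bool toy_lockup_tick(struct toy_watchdog *wd,
			    const struct toy_task *task)
{
	assert(wd != NULL && task != NULL);

	if (wd->warned) {
		/* A different task is hogging the CPU now: re-arm. */
		if (wd->saved != task)
			wd->warned = false;
		return false;
	}

	wd->warned = true;
	wd->saved = task;
	return true;
}
```

In this model task A is reported once, repeated ticks for A stay silent, the
first tick for task B only re-arms the detector, and the next tick reports B.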

thanks
chai wen

> ---
>  kernel/watchdog.c |   16 +++-
>  1 files changed, 15 insertions(+), 1 deletions(-)
> 
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index 0037db6..2e55620 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -42,6 +42,7 @@ static DEFINE_PER_CPU(bool, softlockup_touch_sync);
>  static DEFINE_PER_CPU(bool, soft_watchdog_warn);
>  static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts);
>  static DEFINE_PER_CPU(unsigned long, soft_lockup_hrtimer_cnt);
> +static DEFINE_PER_CPU(struct task_struct *, softlockup_task_ptr_saved);
>  #ifdef CONFIG_HARDLOCKUP_DETECTOR
>  static DEFINE_PER_CPU(bool, hard_watchdog_warn);
>  static DEFINE_PER_CPU(bool, watchdog_nmi_touch);
> @@ -328,8 +329,20 @@ static enum hrtimer_restart watchdog_timer_fn(struct 
> hrtimer *hrtimer)
>   return HRTIMER_RESTART;
>  
>   /* only warn once */
> - if (__this_cpu_read(soft_watchdog_warn) == true)
> + if (__this_cpu_read(soft_watchdog_warn) == true) {
> + /*
> +  * Handle the case where multiple processes are
> +  * causing softlockups but the duration is small
> +  * enough, the softlockup detector can not reset
> +  * itself in time.  Use task pointers to detect this.
> +  */
> + if (__this_cpu_read(softlockup_task_ptr_saved) !=
> + current) {
> + __this_cpu_write(soft_watchdog_warn, false);
> + __touch_watchdog();
> + }
>   return HRTIMER_RESTART;
> + }
>  
>   if (softlockup_all_cpu_backtrace) {
>   /* Prevent multiple soft-lockup reports if one cpu is 
> already
> @@ -345,6 +358,7 @@ static enum hrtimer_restart watchdog_timer_fn(struct 
> hrtimer *hrtimer)
>   pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
>   smp_processor_id(), duration,
>   current->comm, task_pid_nr(current));
> + __this_cpu_write(softlockup_task_ptr_saved, current);
>   print_modules();
>   print_irqtrace_events(current);
>   if (regs)



-- 
Regards

Chai Wen

Re: [PATCH v3] i2c: rk3x: fix bug that cause transfer fails in master receive mode

2014-08-21 Thread Doug Anderson

Addy,

On Thu, Aug 21, 2014 at 6:13 PM, Addy Ke  wrote:
> In rk3x SOC, the I2C controller can receive/transmit up to 32 bytes data
> in one chunk, so the size of data to be write/read to/from TXDATAx/RXDATAx
> must be less than or equal 32 bytes at a time.
>
> Tested on rk3288-pinky board, elan receive 158 bytes data.
>
> Acked-by: Max Schwarz 
> Signed-off-by: Addy Ke 
> ---
> Changes in v2:
> - Use cleaner syntax as suggested by Sergei.
> - Update commit message as suggested by Wolfram.
>
> Changes in v3:
> - fix typo: maste --> master and double spaces after 'len'
>
>  drivers/i2c/busses/i2c-rk3x.c | 4 
>  1 file changed, 4 insertions(+)

Oops, we collided in the ether.  This looks good to me.

Tested-by: Doug Anderson 

Re: [PATCH v2] i2c: rk3x: fix bug that cause transfer fails in maste receive mode

2014-08-21 Thread Doug Anderson

Addy,

Title should probably have "master", not "maste"

On Thu, Aug 21, 2014 at 2:38 PM, Addy Ke  wrote:
> In rk3x SOC, the I2C controller can receive/transmit up to 32 bytes data
> in one chunk, so the size of data to be write/read to/from TXDATAx/RXDATAx
> must be less than or equal 32 bytes at a time.
>
> Tested on rk3288-pinky board, elan receive 158 bytes data.
>
> Acked-by: Max Schwarz 
> Signed-off-by: Addy Ke 
> ---
> Changes in v2:
> - Use cleaner syntax as suggested by Sergei.
> - Update commit message as suggested by Wolfram.
>
>  drivers/i2c/busses/i2c-rk3x.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/i2c/busses/i2c-rk3x.c b/drivers/i2c/busses/i2c-rk3x.c
> index 69e1185..806724a 100644
> --- a/drivers/i2c/busses/i2c-rk3x.c
> +++ b/drivers/i2c/busses/i2c-rk3x.c
> @@ -323,6 +323,10 @@ static void rk3x_i2c_handle_read(struct rk3x_i2c *i2c, 
> unsigned int ipd)
> /* ack interrupt */
> i2c_writel(i2c, REG_INT_MBRF, REG_IPD);
>
> +   /* Can only handle a maximum of 32 bytes at a time */
> +   if (unlikely(len > 32))
> +   len  = 32;

nit: one space before "=", not two.

Tested-by: Doug Anderson 

[PATCH v3] i2c: rk3x: fix bug that cause transfer fails in master receive mode

2014-08-21 Thread Addy Ke

On rk3x SoCs, the I2C controller can receive/transmit up to 32 bytes of data
in one chunk, so the size of the data written to TXDATAx or read from RXDATAx
must be less than or equal to 32 bytes at a time.

Tested on rk3288-pinky board, elan receive 158 bytes data.

Acked-by: Max Schwarz 
Signed-off-by: Addy Ke 
---
Changes in v2:
- Use cleaner syntax as suggested by Sergei.
- Update commit message as suggested by Wolfram.

Changes in v3:
- fix typo: maste --> master and double spaces after 'len'

 drivers/i2c/busses/i2c-rk3x.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/i2c/busses/i2c-rk3x.c b/drivers/i2c/busses/i2c-rk3x.c
index 69e1185..806724a 100644
--- a/drivers/i2c/busses/i2c-rk3x.c
+++ b/drivers/i2c/busses/i2c-rk3x.c
@@ -323,6 +323,10 @@ static void rk3x_i2c_handle_read(struct rk3x_i2c *i2c, 
unsigned int ipd)
/* ack interrupt */
i2c_writel(i2c, REG_INT_MBRF, REG_IPD);
 
+   /* Can only handle a maximum of 32 bytes at a time */
+   if (unlikely(len > 32))
+   len = 32;
+
/* read the data from receive buffer */
for (i = 0; i < len; ++i) {
if (i % 4 == 0)
-- 
1.8.3.2
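
The effect of the clamp can be sketched in userspace as follows. The toy_*
names are hypothetical, and how the driver schedules the remaining bytes is
not shown in the patch, so the chunk-counting loop is only an assumption made
to illustrate the 32-byte limit.

```c
#include <assert.h>

#define TOY_MAX_CHUNK 32u	/* per-chunk limit stated in the commit message */

/* Clamp one request to what fits in a single chunk (the patch's check). */
static unsigned int toy_clamp_chunk(unsigned int len)
{
	return len > TOY_MAX_CHUNK ? TOY_MAX_CHUNK : len;
}

/* Count how many clamped chunks are needed to move 'total' bytes. */
static unsigned int toy_chunks_needed(unsigned int total)
{
	unsigned int chunks = 0;

	while (total > 0) {
		unsigned int step = toy_clamp_chunk(total);

		assert(step > 0 && step <= TOY_MAX_CHUNK);
		total -= step;
		chunks++;
	}
	return chunks;
}
```

Under this model the 158-byte transfer mentioned in the commit message would
take five chunks: four of 32 bytes and one of 30.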



RE: [PATCH 2/2] ACPI/EC: Add support to disallow QR_EC to be issued before completing the previous QR_EC.

2014-08-21 Thread Zheng, Lv

Hi, Rafael

I think PATCH 1 is required for now, and this PATCH is not that useful if you
are planning to merge my storming fix, since the logic has already been
cleaned up there.

This flag was originally used to mark the period after SCI_EVT is indicated
and before the QR_EC is polled.
It is useful because during this period some malfunctioning firmware may
trigger a GPE storm, as we haven't handled SCI_EVT right in the IRQ context.
But so far we don't have the ACPICA GPE APIs ready to achieve such storm
prevention, so the original use case is not that obvious now.
Actually, this flag is currently only used to mask query issuing.

This patch changes the flag to be used for this case: marking the period after
SCI_EVT is indicated and before the QR_EC is completed.
Since the original usage is actually not implemented, we can make this change.
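
The difference between clearing the flag when the query starts and clearing it
when the query completes can be pictured with a toy model. This is a sketch
only: the toy_* names and the clear_on_start parameter are hypothetical and do
not exist in drivers/acpi/ec.c, and locking is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_ec {
	bool query_pending;		/* models EC_FLAGS_QUERY_PENDING */
	unsigned int queries_scheduled;
};

/* SCI_EVT observed: schedule a query unless one is already pending. */
static bool toy_sci_evt(struct toy_ec *ec)
{
	assert(ec != NULL);
	if (ec->query_pending)
		return false;
	ec->query_pending = true;
	ec->queries_scheduled++;
	return true;
}

/* The query transaction starts. The old code cleared the flag here. */
static void toy_query_start(struct toy_ec *ec, bool clear_on_start)
{
	assert(ec != NULL);
	if (clear_on_start)
		ec->query_pending = false;
}

/* The query transaction has finished. The patch clears the flag here. */
static void toy_query_complete(struct toy_ec *ec)
{
	assert(ec != NULL);
	ec->query_pending = false;
}
```

With clear_on_start set, a second SCI_EVT arriving while the first query is
still in flight schedules another query; with the flag held until completion,
that second event is rejected until toy_query_complete() has run.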

But you can also ignore this patch, as the cleanup in the storming series is
cleaner than this.
In that series, both cases are marked and thus can be protected by further
code.

So it's up to you whether you need PATCH 2 or not. ☺
If you take it, I will rebase the storming series to reflect this change.

Thanks and best regards
-Lv


From: Wysocki, Rafael J 
Sent: Friday, August 22, 2014 3:55 AM
To: Zheng, Lv; Brown, Len
Cc: Lv Zheng; linux-kernel@vger.kernel.org; linux-a...@vger.kernel.org
Subject: RE: [PATCH 2/2] ACPI/EC: Add support to disallow QR_EC to be issued 
before completing the previous QR_EC.

Looks OK to me.

Rafael



 Original message 
From "Zheng, Lv"  
Date: 21/08/2014 08:41 (GMT+01:00) 
To "Wysocki, Rafael J" ,"Brown, Len" 
 
Cc "Zheng, Lv" ,Lv Zheng 
,linux-kernel@vger.kernel.org,linux-a...@vger.kernel.org 
Subject [PATCH 2/2] ACPI/EC: Add support to disallow QR_EC to be issued before 
completing the previous QR_EC. 

Some platforms refuse to respond to QR_EC when SCI_EVT isn't set.
One known such platform is the Acer Aspire V5-573G.

By disallowing a new QR_EC from being issued before the previous one has
completed, we are able to reduce the likelihood of triggering issues on such
platforms.

Note that this fix can only reduce the occurrence rate of this issue, but
this issue may still occur when such a platform doesn't clear SCI_EVT
before or immediately after completing the previous QR_EC transaction. This
patch cannot fix CLEAR_ON_RESUME quirk which also relies on the assumption
that the platforms are able to respond even when SCI_EVT isn't set.

But this patch is still useful as it can help to reduce the number of
scheduled QR_EC work items.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=82611
Reported-and-tested-by: Alexander Mezin 
Signed-off-by: Lv Zheng 
---
 drivers/acpi/ec.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index 5e1ed31..9922cc4 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -303,11 +303,11 @@ static int acpi_ec_transaction_unlocked(struct acpi_ec 
*ec,
 /* following two actions should be kept atomic */
 ec->curr = t;
 start_transaction(ec);
-   if (ec->curr->command == ACPI_EC_COMMAND_QUERY)
-   clear_bit(EC_FLAGS_QUERY_PENDING, &ec->flags);
 spin_unlock_irqrestore(&ec->lock, tmp);
 ret = ec_poll(ec);
 spin_lock_irqsave(&ec->lock, tmp);
+   if (ec->curr->command == ACPI_EC_COMMAND_QUERY)
+   clear_bit(EC_FLAGS_QUERY_PENDING, &ec->flags);
 ec->curr = NULL;
 spin_unlock_irqrestore(&ec->lock, tmp);
 return ret;
-- 
1.7.10

Re: percpu: Define this_cpu_cpumask_var_t_ptr

2014-08-21 Thread Christoph Lameter

On Thu, 21 Aug 2014, Tejun Heo wrote:

> >
> > +#define this_cpu_cpumask_var_t_ptr(x) this_cpu_ptr()
>
> Urgh, this is nasty but yeah I can't think of any other way around it
> either. :(
>
> Do we need the "_t" in the name tho?  Maybe we can shorten the name to
> this_cpumask_var_ptr(x)?  Also, wouldn't it be better to define it as
> a static inline function so that the input type is explicit?

It's a pretty simple function (actually more a name substitution) so I
did not think it worth creating an inline function.

_t is there because I wanted to include the full "ugly" name of the
variable to make it similarly ugly. It is needed to make the clear
distinction to "struct cpumask *" which does not have these issues.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] tpm_tis: Verify ACPI-specified interrupt

2014-08-21 Thread Scot Doyle


Some machines, such as the Acer C720 and Toshiba CB35, have TPMs
that do not use interrupts while also having an ACPI TPM entry
indicating a specific interrupt to be used. Since this interrupt
is invalid, these machines freeze on resume until the interrupt
times out.

Generate the ACPI-specified interrupt. If none is received, then
fall back to polling mode.

Signed-off-by: Scot Doyle 
Tested-by: James Duley 
Tested-by: Michael Mullin 
---
 drivers/char/tpm/tpm_tis.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
index 2c46734..736ed4a 100644
--- a/drivers/char/tpm/tpm_tis.c
+++ b/drivers/char/tpm/tpm_tis.c
@@ -633,12 +633,14 @@ static int tpm_tis_init(struct device *dev, resource_size_t start,
iowrite32(intmask,
  chip->vendor.iobase +
  TPM_INT_ENABLE(chip->vendor.locality));
-   if (interrupts)
-   chip->vendor.irq = irq;
-   if (interrupts && !chip->vendor.irq) {
-   irq_s =
-   ioread8(chip->vendor.iobase +
-   TPM_INT_VECTOR(chip->vendor.locality));
+   chip->vendor.irq = 0;
+   if (interrupts) {
+   if (irq)
+   irq_s = irq;
+   else
+   irq_s =
+   ioread8(chip->vendor.iobase +
+   TPM_INT_VECTOR(chip->vendor.locality));
if (irq_s) {
irq_e = irq_s;
} else {
--
2.0.4

[PATCH] regulator: RK808: Fix uninitialized value

2014-08-21 Thread Doug Anderson

The RK808 regulator driver was putting its config on the stack but not
initting it.  That means that you got a semi-random config.  Fix this.

Signed-off-by: Doug Anderson 
---
 drivers/regulator/rk808-regulator.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/regulator/rk808-regulator.c b/drivers/regulator/rk808-regulator.c
index 94753fd..4d5041c 100644
--- a/drivers/regulator/rk808-regulator.c
+++ b/drivers/regulator/rk808-regulator.c
@@ -334,7 +334,7 @@ static int rk808_regulator_probe(struct platform_device *pdev)
 {
struct rk808 *rk808 = dev_get_drvdata(pdev->dev.parent);
struct rk808_board *pdata;
-   struct regulator_config config;
+   struct regulator_config config = {};
struct regulator_dev *rk808_rdev;
struct regulator_init_data *reg_data;
int i = 0;
-- 
2.1.0.rc2.206.gedb03e5
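
The one-line fix above relies on aggregate initialization zeroing every member of a stack variable. A minimal user-space sketch, assuming a hypothetical struct demo_config in place of struct regulator_config (the fields are invented for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct regulator_config; fields are illustrative. */
struct demo_config {
    void *dev;
    const void *init_data;
    void *driver_data;
    int ena_gpio;
};

/*
 * Return a config whose members all start out as zero/NULL.
 * The patch spells this "= {}", which GCC accepts as an extension;
 * "= {0}" is the ISO C spelling and has the same effect here.
 */
struct demo_config demo_config_init(void)
{
    struct demo_config config = {0};

    return config;
}

/* Every member is well defined before the caller fills in what it needs. */
void demo_config_selftest(void)
{
    struct demo_config config = demo_config_init();

    assert(config.dev == NULL);
    assert(config.init_data == NULL);
    assert(config.driver_data == NULL);
    assert(config.ena_gpio == 0);
}
```

Without the initializer, any member the probe function does not assign explicitly would hold whatever happened to be on the stack, which is the "semi-random config" the commit message refers to.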



Re: [PATCH v3] spi: spi-imx: add DMA support

2014-08-21 Thread Mark Brown

On Thu, Aug 21, 2014 at 11:53:10AM +0800, Robin Gong wrote:

> Change from v2:
> http://thread.gmane.org/gmane.linux.ports.arm.kernel/291722/focus=294363
> 1. dma setup only for imx51-ecspi
> 2. use one small dummy buffer(1 bd size) to temporarily store data
>    for meaningless rx/tx, instead of malloc the actual transfer size.

You can use the must_tx and must_rx flags to use the core implementation
of this functionality.  This will mean you get any performance or other
improvements we implement there.

> + int (*txrx_bufs)(struct spi_device *spi, struct spi_transfer *t);
> + struct dma_chan *dma_chan_rx;
> + struct dma_chan *dma_chan_tx;

The SPI controller has variables for this already - you should use them
(and the core support).  In general my main comment on this patch is
that you should be using the core DMA support, it fixes some problems
(like not having mapping for vmalloc() buffers) and factors out some
code.


Re: [PATCH v3] scsi: ufs-msm: add UFS controller support for Qualcomm MSM chips

2014-08-21 Thread subhashj

>> On Aug 14, 2014, at 9:22 AM, Yaniv Gardi  wrote:
>>> The files in this change implement the UFS HW (controller & PHY) specific
>>> behavior in Qualcomm MSM chips.
>>> Signed-off-by: Yaniv Gardi 
>>> ---
>>> Documentation/devicetree/bindings/ufs/ufs-msm.txt  |   37 +
>>> .../devicetree/bindings/ufs/ufshcd-pltfrm.txt  |4 +
>>> drivers/scsi/ufs/Kconfig   |   12 +
>>> drivers/scsi/ufs/Makefile  |4 +
>>> drivers/scsi/ufs/ufs-msm-phy-qmp-20nm.c|  254 +
>>> drivers/scsi/ufs/ufs-msm-phy-qmp-20nm.h|  216 
>>> drivers/scsi/ufs/ufs-msm-phy-qmp-28nm.c|  368 +++
>>> drivers/scsi/ufs/ufs-msm-phy-qmp-28nm.h|  735 +
>>> drivers/scsi/ufs/ufs-msm-phy.c |  646 
>>> drivers/scsi/ufs/ufs-msm-phy.h |  193 
>> Any reason not to put the phy driver in drivers/phy ?
> Yes. Phy driver introduces a generic phy framework.
> And as a framework it provides with API's, callbacks,
> And data structures.
> I think the right place to have the >implementation< of the ufs-msm-phy code
> Is under drivers/scsi/ufs as it's more related to ufs than it's related to
> the framework itself.

I would agree with Kumar that a PHY-specific platform driver (which uses the
generic PHY framework) can actually go under drivers/phy/*. If I look at
3.17-rc1, there are many platform-specific PHY drivers under
drivers/phy/.

>>> drivers/scsi/ufs/ufs-msm.c | 1105
>>> 
>>> drivers/scsi/ufs/ufs-msm.h |  158 +++
>>> 12 files changed, 3732 insertions(+)
>>> create mode 100644 Documentation/devicetree/bindings/ufs/ufs-msm.txt
>>> create mode 100644 drivers/scsi/ufs/ufs-msm-phy-qmp-20nm.c
>>> create mode 100644 drivers/scsi/ufs/ufs-msm-phy-qmp-20nm.h
>>> create mode 100644 drivers/scsi/ufs/ufs-msm-phy-qmp-28nm.c
>>> create mode 100644 drivers/scsi/ufs/ufs-msm-phy-qmp-28nm.h
>>> create mode 100644 drivers/scsi/ufs/ufs-msm-phy.c
>>> create mode 100644 drivers/scsi/ufs/ufs-msm-phy.h
>>> create mode 100644 drivers/scsi/ufs/ufs-msm.c
>>> create mode 100644 drivers/scsi/ufs/ufs-msm.h
>> Seems like we should split this into two patches, one for the phy and one
>> for the UFS driver itself.  Maybe even three, one for the 20nm phy, one
>> for the 28nm phy, and one for ufs-msm.c,h.
> we could try to split it, but since we didn't split this change into
> functional sub-changes, we decided to upload this change as a whole, as
> one change without the other wouldn't work anyhow, and they are both
> needed for proper functionality.
>>> diff --git a/Documentation/devicetree/bindings/ufs/ufs-msm.txt b/Documentation/devicetree/bindings/ufs/ufs-msm.txt
>>> new file mode 100644
>>> index 000..b5caace
>>> --- /dev/null
>>> +++ b/Documentation/devicetree/bindings/ufs/ufs-msm.txt
>> This should probably be bindings/phy/qcom-ufs-phy.txt

If Yaniv also agrees that Qualcomm UFS MSM PHY specific driver should move
under drivers/phy then yes, "ufs-msm.txt" should also move under
bindings/phy/.

Btw, regarding using "qcom-ufs-phy.txt" instead of "ufs-msm.txt": I would
agree that we need to add "phy" to the file name, but I'm not sure we would
like to replace "msm" with "qcom", because we used the "msm" prefix/postfix
almost everywhere.



>>> @@ -0,0 +1,37 @@
>>> +* MSM Universal Flash Storage (UFS) PHY
>>> +
>>> +UFSPHY nodes are defined to describe on-chip UFS PHY hardware macro.
>>> +Each UFS PHY node should have its own node.
>>> +
>>> +To bind UFS PHY with UFS host controller, the controller node should
>>> +contain a phandle reference to UFS PHY node.
>>> +
>>> +Required properties:
>>> +- compatible: compatible list, contains "qcom,ufs-msm-phy-qmp-28nm"
>>> +  or "qcom,ufs-msm-phy-qmp-20nm" according to the relevant
>>> +  phy in use
>> Do we really need -msm in the compat name?

That's the convention we followed for all file names and hence carried
forward to the compatible name as well. Any specific issues with it?

>>> +- reg   : 
>>> +- #phy-cells : This property shall be set to 0
>>> +- vdda-phy-supply   : phandle to main PHY supply for analog domain
>>> +- vdda-pll-supply   : phandle to PHY PLL and Power-Gen block power supply
>>> +
>>> +Optional properties:
>>> +- vdda-phy-max-microamp : specifies max. load that can be drawn from phy supply
>>> +- vdda-pll-max-microamp : specifies max. load that can be drawn from pll supply
>>> +
>>> +Example:
>>> +
>>> +   ufsphy1: ufsphy@0xfc597000 {
>>> +   compatible = "qcom,ufs-msm-phy-qmp-28nm";
>>> +   reg = <0xfc597000 0x800>;
>>> +   #phy-cells = <0>;
>>> +   vdda-phy-supply = <_l4>;
>>> +   vdda-pll-supply = <_l12>;
>>> +   vdda-phy-max-microamp = <5>;
>>> +   vdda-pll-max-microamp = <1000>;
>>> +   };
>>> +
>>> +   ufshc@0xfc598000 {
>>> +   ...
>>> +   phys = <>;
>>> +   };
>>> diff --git

Re: [PATCH v6 5/6] arm64: add SIGSYS siginfo for compat task

2014-08-21 Thread AKASHI Takahiro


On 08/22/2014 02:54 AM, Kees Cook wrote:

On Thu, Aug 21, 2014 at 3:56 AM, AKASHI Takahiro
 wrote:

SIGSYS is primarily used in secure computing to notify tracer.
This patch allows signal handler on compat task to get correct information
with SA_SYSINFO specified when this signal is delivered.


typo: SA_SIGINFO


Signed-off-by: AKASHI Takahiro 


I'm unable to test this myself, but if you've got the test suite
passing in compat mode, then this patch must be correct. :)


Thanks.
Actually I found this bug when I ran your test programs, TRAP.handler, on 32bit 
userland.

-Takahiro AKASHI



Reviewed-by: Kees Cook 

-Kees


---
  arch/arm64/include/asm/compat.h |7 +++
  arch/arm64/kernel/signal32.c|8 
  2 files changed, 15 insertions(+)

diff --git a/arch/arm64/include/asm/compat.h b/arch/arm64/include/asm/compat.h
index 253e33b..c877915 100644
--- a/arch/arm64/include/asm/compat.h
+++ b/arch/arm64/include/asm/compat.h
@@ -205,6 +205,13 @@ typedef struct compat_siginfo {
 compat_long_t _band;/* POLL_IN, POLL_OUT, POLL_MSG */
 int _fd;
 } _sigpoll;
+
+   /* SIGSYS */
+   struct {
+   compat_uptr_t _call_addr; /* calling user insn */
+   int _syscall;   /* triggering system call number */
+   unsigned int _arch; /* AUDIT_ARCH_* of syscall */
+   } _sigsys;
 } _sifields;
  } compat_siginfo_t;

diff --git a/arch/arm64/kernel/signal32.c b/arch/arm64/kernel/signal32.c
index 1b9ad02..aa550d6 100644
--- a/arch/arm64/kernel/signal32.c
+++ b/arch/arm64/kernel/signal32.c
@@ -186,6 +186,14 @@ int copy_siginfo_to_user32(compat_siginfo_t __user *to, const siginfo_t *from)
 err |= __put_user(from->si_uid, &to->si_uid);
 err |= __put_user((compat_uptr_t)(unsigned long)from->si_ptr, &to->si_ptr);
 break;
+#ifdef __ARCH_SIGSYS
+   case __SI_SYS:
+   err |= __put_user((compat_uptr_t)(unsigned long)
+   from->si_call_addr, &to->si_call_addr);
+   err |= __put_user(from->si_syscall, &to->si_syscall);
+   err |= __put_user(from->si_arch, &to->si_arch);
+   break;
+#endif
 default: /* this is just in case for now ... */
 err |= __put_user(from->si_pid, &to->si_pid);
 err |= __put_user(from->si_uid, &to->si_uid);
--
1.7.9.5







[PATCH v4 0/4] zram memory control enhance

2014-08-21 Thread Minchan Kim

Currently, zram has no feature to limit memory, so theoretically
zram can deplete system memory.
Users have asked for a limit several times because, even without
exhaustion, zram makes it hard to control the platform's memory usage.
This patchset adds the feature.

Patch 1 makes zs_get_total_size_bytes faster because it will be
used frequently in later patches for the new feature.

Patch 2 changes zs_get_total_size_bytes's return unit from bytes
to pages so that zsmalloc doesn't need an unnecessary operation
(i.e., << PAGE_SHIFT).

Patch 3 adds the new feature. I added it to the zram layer,
not zsmalloc, because the limitation is zram's requirement, not
zsmalloc's, so any other user of zsmalloc (e.g., zpool) shouldn't be
affected by an unnecessary branch in zsmalloc. In the future, if every
user of zsmalloc wants the feature, we could easily move it from the
client side to zsmalloc, but the reverse would be painful.

Patch 4 adds a new facility to report the maximum memory usage of zram
so that users don't have to poll /sys/block/zram0/mem_used_total
frequently, and so that transient maximums are not missed.

* From v3
 * get_zs_total_size_byte function name change - Dan
 * clarification of the document - Dan
 * atomic account instead of introducing new lock in zsmalloc - David
 * remove unnecessary atomic instruction in updating max - David

* From v2
 * introduce helper function to update max_used_pages
   for readability - David
 * avoid unnecessary zs_get_total_size call in updating loop
   for max_used_pages - David

* From v1
 * rebased on next-20140815
 * fix up race problem - David, Dan
 * reset mem_used_max as current total_bytes, rather than 0 - David
 * resetting works with only "0" write for extensibility - David, Dan

Minchan Kim (4):
  zsmalloc: move pages_allocated to zs_pool
  zsmalloc: change return value unit of  zs_get_total_size_bytes
  zram: zram memory size limitation
  zram: report maximum used memory

 Documentation/ABI/testing/sysfs-block-zram |  20 ++
 Documentation/blockdev/zram.txt|  25 +--
 drivers/block/zram/zram_drv.c  | 101 -
 drivers/block/zram/zram_drv.h  |   6 ++
 include/linux/zsmalloc.h   |   2 +-
 mm/zsmalloc.c  |  30 -
 6 files changed, 158 insertions(+), 26 deletions(-)

-- 
2.0.0


[PATCH v4 1/4] zsmalloc: move pages_allocated to zs_pool

2014-08-21 Thread Minchan Kim

pages_allocated has been counted in the size_class structure, so when a
user of zsmalloc wants to see total_size_bytes, it has to gather the
counts from every size_class and report the sum.

That's not bad if the user doesn't read the value often, but if the user
starts to read it frequently, it is not good for performance.

This patch moves the count from size_class to zs_pool, which also
reduces the memory footprint (from [255 * 8byte] to
[sizeof(atomic_long_t)]).

Signed-off-by: Minchan Kim 
---
 mm/zsmalloc.c | 23 ---
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 94f38fac5e81..2a4acf400846 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -199,9 +199,6 @@ struct size_class {
 
spinlock_t lock;
 
-   /* stats */
-   u64 pages_allocated;
-
struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS];
 };
 
@@ -220,6 +217,7 @@ struct zs_pool {
struct size_class size_class[ZS_SIZE_CLASSES];
 
gfp_t flags;/* allocation flags used when growing pool */
+   atomic_long_t pages_allocated;
 };
 
 /*
@@ -1028,8 +1026,9 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)
return 0;
 
set_zspage_mapping(first_page, class->index, ZS_EMPTY);
+   atomic_long_add(class->pages_per_zspage,
+   &pool->pages_allocated);
spin_lock(&class->lock);
-   class->pages_allocated += class->pages_per_zspage;
}
 
obj = (unsigned long)first_page->freelist;
@@ -1082,14 +1081,13 @@ void zs_free(struct zs_pool *pool, unsigned long obj)
 
first_page->inuse--;
fullness = fix_fullness_group(pool, first_page);
-
-   if (fullness == ZS_EMPTY)
-   class->pages_allocated -= class->pages_per_zspage;
-
spin_unlock(&class->lock);
 
-   if (fullness == ZS_EMPTY)
+   if (fullness == ZS_EMPTY) {
+   atomic_long_sub(class->pages_per_zspage,
+   &pool->pages_allocated);
free_zspage(first_page);
+   }
 }
 EXPORT_SYMBOL_GPL(zs_free);
 
@@ -1185,12 +1183,7 @@ EXPORT_SYMBOL_GPL(zs_unmap_object);
 
 u64 zs_get_total_size_bytes(struct zs_pool *pool)
 {
-   int i;
-   u64 npages = 0;
-
-   for (i = 0; i < ZS_SIZE_CLASSES; i++)
-   npages += pool->size_class[i].pages_allocated;
-
+   u64 npages = atomic_long_read(&pool->pages_allocated);
return npages << PAGE_SHIFT;
 }
 EXPORT_SYMBOL_GPL(zs_get_total_size_bytes);
-- 
2.0.0
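
The bookkeeping change in the patch above (one pool-wide atomic counter instead of a per-class sum) can be sketched with C11 atomics. This is a user-space model, not zsmalloc code: struct demo_pool, the helper names, and the 4 KiB page size (DEMO_PAGE_SHIFT) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>

#define DEMO_PAGE_SHIFT 12   /* assumed 4 KiB pages */

/* Hypothetical pool with a single counter, updated without a lock. */
struct demo_pool {
    atomic_long pages_allocated;
};

void demo_pool_init(struct demo_pool *pool)
{
    atomic_init(&pool->pages_allocated, 0);
}

/* Called when a zspage is added to the pool. */
void demo_pool_grow(struct demo_pool *pool, long pages_per_zspage)
{
    atomic_fetch_add(&pool->pages_allocated, pages_per_zspage);
}

/* Called when an empty zspage is released. */
void demo_pool_shrink(struct demo_pool *pool, long pages_per_zspage)
{
    atomic_fetch_sub(&pool->pages_allocated, pages_per_zspage);
}

/* Reading the total is a single load rather than a loop over all classes. */
unsigned long long demo_pool_total_bytes(struct demo_pool *pool)
{
    unsigned long long npages =
        (unsigned long long)atomic_load(&pool->pages_allocated);

    return npages << DEMO_PAGE_SHIFT;
}

void demo_pool_selftest(void)
{
    struct demo_pool pool;

    demo_pool_init(&pool);
    demo_pool_grow(&pool, 4);
    demo_pool_grow(&pool, 2);
    demo_pool_shrink(&pool, 4);
    assert(demo_pool_total_bytes(&pool) == 2ULL << DEMO_PAGE_SHIFT);
}
```

The read side no longer depends on the number of size classes, which is what makes frequent queries cheap in the later patches of the series.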


[PATCH v4 4/4] zram: report maximum used memory

2014-08-21 Thread Minchan Kim

Normally, a zram user could get the maximum memory usage zram consumed
by polling mem_used_total via sysfs from userspace.

But this has a critical problem: the user can miss the peak memory
usage between polls. To avoid that, the user would have to poll at a
shorter interval (e.g., 0.01s) with mlocking, to avoid page fault
delays when memory pressure is heavy. That would be troublesome.

This patch adds a new knob, "mem_used_max", so the user can see
the maximum memory usage easily by reading the knob, and reset
it via "echo 0 > /sys/block/zram0/mem_used_max".

Signed-off-by: Minchan Kim 
---
 Documentation/ABI/testing/sysfs-block-zram | 10 +
 Documentation/blockdev/zram.txt|  1 +
 drivers/block/zram/zram_drv.c  | 60 +-
 drivers/block/zram/zram_drv.h  |  1 +
 4 files changed, 70 insertions(+), 2 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-block-zram 
b/Documentation/ABI/testing/sysfs-block-zram
index b8c779d64968..7b8fca6a9b77 100644
--- a/Documentation/ABI/testing/sysfs-block-zram
+++ b/Documentation/ABI/testing/sysfs-block-zram
@@ -120,6 +120,16 @@ Description:
statistic.
Unit: bytes
 
+What:  /sys/block/zram/mem_used_max
+Date:  August 2014
+Contact:   Minchan Kim 
+Description:
+   The mem_used_max file is read/write and specifies the amount
+   of maximum memory zram have consumed to store compressed data.
+   For resetting the value, you should write "0". Otherwise,
+   you could see -EINVAL.
+   Unit: bytes
+
 What:  /sys/block/zram/mem_limit
 Date:  August 2014
 Contact:   Minchan Kim 
diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
index 82c6a41116db..7fcf9c6592ec 100644
--- a/Documentation/blockdev/zram.txt
+++ b/Documentation/blockdev/zram.txt
@@ -111,6 +111,7 @@ size of the disk when not in use so a huge zram is wasteful.
orig_data_size
compr_data_size
mem_used_total
+   mem_used_max
 
 8) Deactivate:
swapoff /dev/zram0
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 370c355eb127..1a2b3e320ea5 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -149,6 +149,41 @@ static ssize_t mem_limit_store(struct device *dev,
return len;
 }
 
+static ssize_t mem_used_max_show(struct device *dev,
+   struct device_attribute *attr, char *buf)
+{
+   u64 val = 0;
+   struct zram *zram = dev_to_zram(dev);
+
+   down_read(&zram->init_lock);
+   if (init_done(zram))
+   val = atomic_long_read(&zram->stats.max_used_pages);
+   up_read(&zram->init_lock);
+
+   return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
+}
+
+static ssize_t mem_used_max_store(struct device *dev,
+   struct device_attribute *attr, const char *buf, size_t len)
+{
+   int err;
+   unsigned long val;
+   struct zram *zram = dev_to_zram(dev);
+   struct zram_meta *meta = zram->meta;
+
+   err = kstrtoul(buf, 10, &val);
+   if (err || val != 0)
+   return -EINVAL;
+
+   down_read(&zram->init_lock);
+   if (init_done(zram))
+   atomic_long_set(&zram->stats.max_used_pages,
+   zs_get_total_pages(meta->mem_pool));
+   up_read(&zram->init_lock);
+
+   return len;
+}
+
 static ssize_t max_comp_streams_store(struct device *dev,
struct device_attribute *attr, const char *buf, size_t len)
 {
@@ -461,6 +496,21 @@ out_cleanup:
return ret;
 }
 
+static inline void update_used_max(struct zram *zram,
+   const unsigned long pages)
+{
+   int old_max, cur_max;
+
+   old_max = atomic_long_read(&zram->stats.max_used_pages);
+
+   do {
+   cur_max = old_max;
+   if (pages > cur_max)
+   old_max = atomic_long_cmpxchg(
+   &zram->stats.max_used_pages, cur_max, pages);
+   } while (old_max != cur_max);
+}
+
 static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
   int offset)
 {
@@ -472,6 +522,7 @@ static int zram_bvec_write(struct zram *zram, struct 
bio_vec *bvec, u32 index,
struct zram_meta *meta = zram->meta;
struct zcomp_strm *zstrm;
bool locked = false;
+   unsigned long alloced_pages;
 
page = bvec->bv_page;
if (is_partial_io(bvec)) {
@@ -541,13 +592,15 @@ static int zram_bvec_write(struct zram *zram, struct 
bio_vec *bvec, u32 index,
goto out;
}
 
-   if (zram->limit_pages &&
-   zs_get_total_pages(meta->mem_pool) > zram->limit_pages) {
+   alloced_pages = zs_get_total_pages(meta->mem_pool);
+   if (zram->limit_pages && alloced_pages > zram->limit_pages) {
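
The lock-free "raise the maximum" loop in update_used_max() above can be sketched in user space with C11 atomics. This is an illustration under assumptions, not the zram code: demo_update_used_max is a hypothetical name, and the sketch keeps its local copies as long so they have the same width as the counter.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Raise *max to pages if pages is larger; never lower it.
 * Safe against concurrent callers: a failed compare-exchange reloads
 * the current value into cur, and the loop condition is re-checked.
 */
void demo_update_used_max(atomic_long *max, long pages)
{
    long cur = atomic_load(max);

    while (pages > cur) {
        if (atomic_compare_exchange_weak(max, &cur, pages))
            break;   /* our value was published as the new maximum */
    }
}

void demo_update_used_max_selftest(void)
{
    atomic_long max;

    atomic_init(&max, 10);
    demo_update_used_max(&max, 5);    /* smaller: unchanged */
    assert(atomic_load(&max) == 10);
    demo_update_used_max(&max, 20);   /* larger: raised */
    assert(atomic_load(&max) == 20);
}
```

If another writer publishes a larger value between the load and the compare-exchange, the loop exits without writing, so the stored maximum only ever increases until it is reset.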

RE: [RFC PATCH -logging 00/10] scsi/constants: Output continuous error messages on trace

2014-08-21 Thread Elliott, Robert (Server Storage)

> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Yoshihiro YUNOMAE
> Sent: Friday, 08 August, 2014 6:50 AM
> Subject: [RFC PATCH -logging 00/10] scsi/constants: Output continuous
> error messages on trace
...
> 1) printk
> Keeps current implementation of upstream kernel.
> The messages are divided and can be mixed, but all users can
> check the error messages without any settings.

scsi_io_completion ignores the scsi_logging_level and always calls
printk if it detects ACTION_FAIL, resulting in messages like:

[10240.338600] sd 2:0:0:0: [sdr]
[10240.339722] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[10240.341662] sd 2:0:0:0: [sdr]
[10240.342792] Sense Key : Hardware Error [current]
[10240.344575] sd 2:0:0:0: [sdr]
[10240.345653] Add. Sense: Logical unit failure
[10240.347138] sd 2:0:0:0: [sdr] CDB:
[10240.348309] Read(10): 28 00 00 00 00 80 00 00 08 00

If you trigger hundreds of errors (e.g., hot remove a device
during heavy IO), then all the prints to the linux serial console
bog down the system, causing timeouts in commands to other
devices and soft lockups for applications.

Some changes that would help are:
1. Put them under SCSI logging level control
2. Use printk_ratelimited so an excessive number are trimmed

Would you like to include something like this in your
patch set?

This is an example patch that only prints them if the MLCOMPLETE 
logging level is nonzero.
Off: scsi_logging_level --set --mlcomplete=0
On: scsi_logging_level --set --mlcomplete=1

Some other loglevel (e.g., ERROR_RECOVERY) could be used.

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index d6b4ea8..dbb601f 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1037,7 +1037,9 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
switch (action) {
case ACTION_FAIL:
/* Give up and fail the remainder of the request */
-   if (!(req->cmd_flags & REQ_QUIET)) {
+   if (!(req->cmd_flags & REQ_QUIET) &&
+   SCSI_LOG_LEVEL(SCSI_LOG_MLCOMPLETE_SHIFT,
+   SCSI_LOG_MLCOMPLETE_BITS)) {
scsi_print_result(cmd);
if (driver_byte(result) & DRIVER_SENSE)
scsi_print_sense("", cmd);

Converting to printk_ratelimited is harder since the prints
are spread out over three functions (and as your patch
series notes, many individual printk calls).  The rates
for the printk calls might not match, which would lead to
even more confusing output.
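
For readers unfamiliar with the logging-level check used in the example patch, here is a user-space sketch of the idea: each facility owns a small bit field inside one integer, and printing is skipped when that field is zero. The shift of 12 and width of 3 bits are assumptions made for this sketch, and the function names are hypothetical.

```c
#include <assert.h>

/* Assumed field position and width for the MLCOMPLETE facility. */
#define DEMO_LOG_MLCOMPLETE_SHIFT 12u
#define DEMO_LOG_MLCOMPLETE_BITS   3u

/* Extract one facility's level from the packed logging-level word. */
unsigned int demo_log_level(unsigned int logging_level,
                            unsigned int shift, unsigned int bits)
{
    return (logging_level >> shift) & ((1u << bits) - 1u);
}

/* Print only if the request is not quiet and the facility level is nonzero. */
int demo_should_print(unsigned int logging_level, int quiet)
{
    return !quiet &&
           demo_log_level(logging_level, DEMO_LOG_MLCOMPLETE_SHIFT,
                          DEMO_LOG_MLCOMPLETE_BITS) != 0;
}

void demo_should_print_selftest(void)
{
    assert(!demo_should_print(0u, 0));                             /* level 0 */
    assert(demo_should_print(1u << DEMO_LOG_MLCOMPLETE_SHIFT, 0)); /* level 1 */
    assert(!demo_should_print(1u << DEMO_LOG_MLCOMPLETE_SHIFT, 1)); /* quiet */
}
```

With this shape, setting the facility's level to 0 silences the failure messages entirely, while any nonzero level restores them.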

---
Rob Elliott    HP Server Storage

[PATCH v4 2/4] zsmalloc: change return value unit of zs_get_total_size_bytes

2014-08-21 Thread Minchan Kim

zs_get_total_size_bytes returns the amount of memory zsmalloc
has consumed in *bytes*, but zsmalloc operates in *pages*
rather than bytes. So let's change the API; the benefit is that
we avoid unnecessary overhead (i.e., converting pages to bytes)
in zsmalloc.

Since the return unit is now pages, "zs_get_total_pages" is a better
name than "zs_get_total_size_bytes".

Signed-off-by: Minchan Kim 
---
 drivers/block/zram/zram_drv.c | 4 ++--
 include/linux/zsmalloc.h  | 2 +-
 mm/zsmalloc.c | 9 -
 3 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index d00831c3d731..f0b8b30a7128 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -103,10 +103,10 @@ static ssize_t mem_used_total_show(struct device *dev,
 
down_read(&zram->init_lock);
if (init_done(zram))
-   val = zs_get_total_size_bytes(meta->mem_pool);
+   val = zs_get_total_pages(meta->mem_pool);
up_read(&zram->init_lock);
 
-   return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
+   return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
 }
 
 static ssize_t max_comp_streams_show(struct device *dev,
diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
index e44d634e7fb7..05c214760977 100644
--- a/include/linux/zsmalloc.h
+++ b/include/linux/zsmalloc.h
@@ -46,6 +46,6 @@ void *zs_map_object(struct zs_pool *pool, unsigned long 
handle,
enum zs_mapmode mm);
 void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
 
-u64 zs_get_total_size_bytes(struct zs_pool *pool);
+unsigned long zs_get_total_pages(struct zs_pool *pool);
 
 #endif
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 2a4acf400846..c4a91578dc96 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -297,7 +297,7 @@ static void zs_zpool_unmap(void *pool, unsigned long handle)
 
 static u64 zs_zpool_total_size(void *pool)
 {
-   return zs_get_total_size_bytes(pool);
+   return zs_get_total_pages(pool) << PAGE_SHIFT;
 }
 
 static struct zpool_driver zs_zpool_driver = {
@@ -1181,12 +1181,11 @@ void zs_unmap_object(struct zs_pool *pool, unsigned 
long handle)
 }
 EXPORT_SYMBOL_GPL(zs_unmap_object);
 
-u64 zs_get_total_size_bytes(struct zs_pool *pool)
+unsigned long zs_get_total_pages(struct zs_pool *pool)
 {
-   u64 npages = atomic_long_read(&pool->pages_allocated);
-   return npages << PAGE_SHIFT;
+   return atomic_long_read(&pool->pages_allocated);
 }
-EXPORT_SYMBOL_GPL(zs_get_total_size_bytes);
+EXPORT_SYMBOL_GPL(zs_get_total_pages);
 
 module_init(zs_init);
 module_exit(zs_exit);
-- 
2.0.0


[PATCH v4 3/4] zram: zram memory size limitation

2014-08-21 Thread Minchan Kim

Since zram has no control feature to limit memory usage,
it is hard to manage system memory.

This patch adds a new knob, "mem_limit", via sysfs to set up
a limit so that zram fails allocation once it reaches
the limit.

In addition, the user can change the limit at runtime to
manage memory more dynamically.

The default is no limit, so it doesn't break old behavior.

Signed-off-by: Minchan Kim 
---
 Documentation/ABI/testing/sysfs-block-zram | 10 
 Documentation/blockdev/zram.txt| 24 ++---
 drivers/block/zram/zram_drv.c  | 41 ++
 drivers/block/zram/zram_drv.h  |  5 
 4 files changed, 76 insertions(+), 4 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-block-zram 
b/Documentation/ABI/testing/sysfs-block-zram
index 70ec992514d0..b8c779d64968 100644
--- a/Documentation/ABI/testing/sysfs-block-zram
+++ b/Documentation/ABI/testing/sysfs-block-zram
@@ -119,3 +119,13 @@ Description:
efficiency can be calculated using compr_data_size and this
statistic.
Unit: bytes
+
+What:  /sys/block/zram/mem_limit
+Date:  August 2014
+Contact:   Minchan Kim 
+Description:
+   The mem_limit file is read/write and specifies the amount
+   of memory to be able to consume memory to store store
+   compressed data. The limit could be changed in run time
+   and "0" is default which means disable the limit.
+   Unit: bytes
diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
index 0595c3f56ccf..82c6a41116db 100644
--- a/Documentation/blockdev/zram.txt
+++ b/Documentation/blockdev/zram.txt
@@ -74,14 +74,30 @@ There is little point creating a zram of greater than twice 
the size of memory
 since we expect a 2:1 compression ratio. Note that zram uses about 0.1% of the
 size of the disk when not in use so a huge zram is wasteful.
 
-5) Activate:
+5) Set memory limit: Optional
+   Set memory limit by writing the value to sysfs node 'mem_limit'.
+   The value can be either in bytes or you can use mem suffixes.
+   In addition, you could change the value in runtime.
+   Examples:
+   # limit /dev/zram0 with 50MB memory
+   echo $((50*1024*1024)) > /sys/block/zram0/mem_limit
+
+   # Using mem suffixes
+   echo 256K > /sys/block/zram0/mem_limit
+   echo 512M > /sys/block/zram0/mem_limit
+   echo 1G > /sys/block/zram0/mem_limit
+
+   # To disable memory limit
+   echo 0 > /sys/block/zram0/mem_limit
+
+6) Activate:
mkswap /dev/zram0
swapon /dev/zram0
 
mkfs.ext4 /dev/zram1
mount /dev/zram1 /tmp
 
-6) Stats:
+7) Stats:
Per-device statistics are exported as various nodes under
/sys/block/zram/
disksize
@@ -96,11 +112,11 @@ size of the disk when not in use so a huge zram is 
wasteful.
compr_data_size
mem_used_total
 
-7) Deactivate:
+8) Deactivate:
swapoff /dev/zram0
umount /dev/zram1
 
-8) Reset:
+9) Reset:
Write any positive value to 'reset' sysfs node
echo 1 > /sys/block/zram0/reset
echo 1 > /sys/block/zram1/reset
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index f0b8b30a7128..370c355eb127 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -122,6 +122,33 @@ static ssize_t max_comp_streams_show(struct device *dev,
return scnprintf(buf, PAGE_SIZE, "%d\n", val);
 }
 
+static ssize_t mem_limit_show(struct device *dev,
+   struct device_attribute *attr, char *buf)
+{
+   u64 val;
+   struct zram *zram = dev_to_zram(dev);
+
+   down_read(&zram->init_lock);
+   val = zram->limit_pages;
+   up_read(&zram->init_lock);
+
+   return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
+}
+
+static ssize_t mem_limit_store(struct device *dev,
+   struct device_attribute *attr, const char *buf, size_t len)
+{
+   u64 limit;
+   struct zram *zram = dev_to_zram(dev);
+
+   limit = memparse(buf, NULL);
+   down_write(&zram->init_lock);
+   zram->limit_pages = PAGE_ALIGN(limit) >> PAGE_SHIFT;
+   up_write(&zram->init_lock);
+
+   return len;
+}
+
 static ssize_t max_comp_streams_store(struct device *dev,
struct device_attribute *attr, const char *buf, size_t len)
 {
@@ -513,6 +540,14 @@ static int zram_bvec_write(struct zram *zram, struct 
bio_vec *bvec, u32 index,
ret = -ENOMEM;
goto out;
}
+
+   if (zram->limit_pages &&
+   zs_get_total_pages(meta->mem_pool) > zram->limit_pages) {
+   zs_free(meta->mem_pool, handle);
+   ret = -ENOMEM;
+   goto out;
+   }
+
cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_WO);
 
if ((clen ==
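
The unit handling in mem_limit_store() and the limit check in zram_bvec_write() above can be sketched in user space. The names and the 4 KiB page size are assumptions for illustration: a byte limit is rounded up to whole pages, and a limit of 0 pages means "no limit".

```c
#include <assert.h>

#define DEMO_PAGE_SHIFT 12                      /* assumed 4 KiB pages */
#define DEMO_PAGE_SIZE  (1ULL << DEMO_PAGE_SHIFT)

/* Round a byte limit up to a page boundary, then convert it to pages. */
unsigned long long demo_limit_to_pages(unsigned long long limit_bytes)
{
    unsigned long long aligned =
        (limit_bytes + DEMO_PAGE_SIZE - 1) & ~(DEMO_PAGE_SIZE - 1);

    return aligned >> DEMO_PAGE_SHIFT;
}

/* A limit of 0 disables the check; otherwise fail once usage exceeds it. */
int demo_over_limit(unsigned long long limit_pages,
                    unsigned long long used_pages)
{
    return limit_pages != 0 && used_pages > limit_pages;
}

void demo_limit_selftest(void)
{
    /* 50MB, as in the documentation example, is 12800 pages of 4 KiB. */
    assert(demo_limit_to_pages(50ULL * 1024 * 1024) == 12800ULL);
    assert(demo_limit_to_pages(1) == 1ULL);     /* partial page rounds up */
    assert(!demo_over_limit(0, 1000000));       /* default: unlimited */
    assert(demo_over_limit(10, 11));
}
```

Because the comparison uses "greater than", usage exactly equal to the limit is still accepted; only the allocation that pushes the pool past the limit is freed and reported as -ENOMEM.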

Re: [PATCH v6 4/6] arm64: add seccomp syscall for compat task

2014-08-21 Thread AKASHI Takahiro


On 08/22/2014 02:52 AM, Kees Cook wrote:

On Thu, Aug 21, 2014 at 3:56 AM, AKASHI Takahiro
 wrote:

This patch allows compat task to issue seccomp() system call.

Signed-off-by: AKASHI Takahiro 
---
  arch/arm64/include/asm/unistd.h   |2 +-
  arch/arm64/include/asm/unistd32.h |3 +++
  2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 4bc95d2..cf6ee31 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -41,7 +41,7 @@
  #define __ARM_NR_compat_cacheflush (__ARM_NR_COMPAT_BASE+2)
  #define __ARM_NR_compat_set_tls    (__ARM_NR_COMPAT_BASE+5)

-#define __NR_compat_syscalls   383
+#define __NR_compat_syscalls   384
  #endif

  #define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index e242600..2922c40 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -787,3 +787,6 @@ __SYSCALL(__NR_sched_setattr, sys_sched_setattr)
  __SYSCALL(__NR_sched_getattr, sys_sched_getattr)
  #define __NR_renameat2 382
  __SYSCALL(__NR_renameat2, sys_renameat2)
+#define __NR_seccomp 383
+__SYSCALL(__NR_seccomp, sys_seccomp)
+


Nit: this adds a trailing blank line. Other than that:


I will fix it.
Thanks,

-Takahiro AKASHI


Reviewed-by: Kees Cook 

-Kees


--
1.7.9.5






--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v6 3/6] asm-generic: add generic seccomp.h for secure computing mode 1

2014-08-21 Thread AKASHI Takahiro


On 08/22/2014 02:51 AM, Kees Cook wrote:

On Thu, Aug 21, 2014 at 3:56 AM, AKASHI Takahiro
 wrote:

Those values (__NR_seccomp_*) are used solely in secure_computing()
to identify mode 1 system calls. If compat system calls have different
syscall numbers, asm/seccomp.h may override them.

Acked-by: Arnd Bergmann 
Signed-off-by: AKASHI Takahiro 


Reviewed-by: Kees Cook 


---
  include/asm-generic/seccomp.h |   28 
  1 file changed, 28 insertions(+)
  create mode 100644 include/asm-generic/seccomp.h

diff --git a/include/asm-generic/seccomp.h b/include/asm-generic/seccomp.h
new file mode 100644
index 0000000..5e97022
--- /dev/null
+++ b/include/asm-generic/seccomp.h
@@ -0,0 +1,28 @@
+/*
+ * include/asm-generic/seccomp.h
+ *
+ * Copyright (C) 2014 Linaro Limited
+ * Author: AKASHI Takahiro 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef _ASM_GENERIC_SECCOMP_H
+#define _ASM_GENERIC_SECCOMP_H
+
+#include 


While this isn't a problem for ARM, this should be linux/unistd.h for
other architectures to get the right stuff.


I will fix it.


+
+#if defined(CONFIG_COMPAT) && !defined(__NR_seccomp_read_32)
+#define __NR_seccomp_read_32   __NR_read
+#define __NR_seccomp_write_32  __NR_write
+#define __NR_seccomp_exit_32   __NR_exit
+#define __NR_seccomp_sigreturn_32  __NR_rt_sigreturn
+#endif /* CONFIG_COMPAT && ! already defined */
+
+#define __NR_seccomp_read  __NR_read
+#define __NR_seccomp_write __NR_write
+#define __NR_seccomp_exit  __NR_exit
+#define __NR_seccomp_sigreturn __NR_rt_sigreturn


Some architectures use __NR_sigreturn, so this will need to be
adjusted in the future into:

#ifndef __NR_seccomp_sigreturn
#define __NR_seccomp_sigreturn __NR_rt_sigreturn
#endif


I will fix it.


After these changes, I was able to port x86 to using this
asm-generic/seccomp.h too.
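
The override scheme discussed in this review can be sketched in plain C.
This is only a minimal user-space illustration, not the header from the
patch: the DEMO_NR_* syscall numbers are made up, DEMO_ARCH_HAS_SIGRETURN is
a hypothetical switch standing in for an architecture's own asm/seccomp.h,
and demo_mode1_allowed() is an invented helper that mimics how the mode 1
list might be consulted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Made-up syscall numbers, for illustration only. */
#define DEMO_NR_read          63
#define DEMO_NR_write         64
#define DEMO_NR_exit          93
#define DEMO_NR_sigreturn     119
#define DEMO_NR_rt_sigreturn  139

/*
 * An architecture header would be seen first and may pre-define any of
 * the DEMO_NR_seccomp_* names. Building with -DDEMO_ARCH_HAS_SIGRETURN
 * models an architecture that uses sigreturn instead of rt_sigreturn.
 */
#ifdef DEMO_ARCH_HAS_SIGRETURN
#define DEMO_NR_seccomp_sigreturn DEMO_NR_sigreturn
#endif

/* Generic defaults: fill in only what the architecture left undefined. */
#ifndef DEMO_NR_seccomp_read
#define DEMO_NR_seccomp_read DEMO_NR_read
#endif
#ifndef DEMO_NR_seccomp_write
#define DEMO_NR_seccomp_write DEMO_NR_write
#endif
#ifndef DEMO_NR_seccomp_exit
#define DEMO_NR_seccomp_exit DEMO_NR_exit
#endif
#ifndef DEMO_NR_seccomp_sigreturn
#define DEMO_NR_seccomp_sigreturn DEMO_NR_rt_sigreturn
#endif

/* Return true if syscall number nr is on the mode 1 (strict) list. */
static bool demo_mode1_allowed(int nr)
{
	static const int allowed[] = {
		DEMO_NR_seccomp_read,
		DEMO_NR_seccomp_write,
		DEMO_NR_seccomp_exit,
		DEMO_NR_seccomp_sigreturn,
	};

	for (size_t i = 0; i < sizeof(allowed) / sizeof(allowed[0]); i++) {
		if (allowed[i] == nr)
			return true;
	}
	return false;
}
```

With #ifndef guards, an architecture that defines its own value keeps it,
while everyone else falls through to the generic default.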


Thanks,
-Takahiro AKASHI


-Kees


+
+#endif /* _ASM_GENERIC_SECCOMP_H */
--
1.7.9.5







Re: [PATCH 2/2] pmbus: ltc2978: add regulator gating

2014-08-21 Thread Mark Brown

On Thu, Aug 21, 2014 at 05:21:26PM -0500, at...@opensource.altera.com wrote:

> +config SENSORS_LTC2978_REGULATOR
> + boolean "Regulator support for LTC2974, LTC2978, LTC3880, and LTC3883"
> + default n

No need to say default n here, it's the default default.

> + depends on SENSORS_LTC2978
> + select REGULATOR

I'd expect a depends here.

> +#include 
> +#include 

If you need machine.h that's suspicious...  why do you need it?

> +static int ltc2978_write_pmbus_operation(struct regulator_dev *rdev, u8 value)
> +{
> + struct device *dev = rdev_get_dev(rdev);
> + struct i2c_client *client = to_i2c_client(dev->parent);
> + int ret;
> +
> + ret = pmbus_set_page(client, 0xff);
> + if (ret < 0)
> + return ret;
> +
> + return i2c_smbus_write_byte_data(client, PMBUS_OPERATION, value);
> +}

This all looks very much like pmbus could use regmap and then the regmap
helpers.  I'd not insist on it though.  What I would however suggest is
that these functions should all be helpers which read the specific
page, addresses and bits to write from the driver structure - I bet the
code is going to be identical for most pmbus using regulators and so it
makes sense to share it like we do with the generic regmap functions.

That means that any good practice can be deployed more easily and any
API updates only need to update the helpers.

> +static struct regulator_init_data ltc2978_regulator_init = {
> + .constraints = {
> + .valid_ops_mask = REGULATOR_CHANGE_STATUS,
> + },
> +};

You should not be forcing this on, you don't know what's safe on any
given board.  Allow the board to specify constraints then it has
control.


Re: [PATCH net-next] MAINTAINERS: Add section for MRF24J40 IEEE 802.15.4 radio driver

2014-08-21 Thread David Miller

From: Alan Ott 
Date: Sat, 16 Aug 2014 17:09:03 -0400

> Alan is the original author of the driver. This change was discussed
> with the 802.15.4 subsystem maintainer, Alexander Aring.
> 
> Signed-off-by: Alan Ott 

Applied, thanks.

Re: [PATCH v6 2/6] arm64: ptrace: allow tracer to skip a system call

2014-08-21 Thread AKASHI Takahiro


On 08/22/2014 02:08 AM, Kees Cook wrote:

On Thu, Aug 21, 2014 at 3:56 AM, AKASHI Takahiro
 wrote:

If tracer specifies -1 as a syscall number, this traced system call should
be skipped with a value in x0 used as a return value.
This patch enables these semantics, but there is a restriction here:

when syscall(-1) is issued by user, tracer cannot skip this system call
and modify a return value at syscall entry.

In order to ease this flavor, we need to treat whatever value in x0 as
a return value, but this might result in a bogus value being returned,
especially when tracer doesn't do anything at this syscall.
So we always return ENOSYS instead, while we have another chance to change
a return value at syscall exit.

Please also note:
* syscall entry tracing and syscall exit tracing (ftrace tracepoint and
   audit) are always executed, if enabled, even when skipping a system call
   (that is, -1).
   In this way, we can avoid a potential bug where audit_syscall_entry()
   might be called without audit_syscall_exit() having been called for the
   previous system call, which would cause an Oops in audit_syscall_entry().

* syscallno may also be set to -1 if a fatal signal (SIGKILL) is detected
   in tracehook_report_syscall_entry(), but since a value set to x0 (ENOSYS)
   is not used in this case, we may neglect the case.

Signed-off-by: AKASHI Takahiro 
---
  arch/arm64/include/asm/ptrace.h |8 
  arch/arm64/kernel/entry.S   |4 
  arch/arm64/kernel/ptrace.c  |   20 
  3 files changed, 32 insertions(+)

diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index 501000f..a58cf62 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -65,6 +65,14 @@
  #define COMPAT_PT_TEXT_ADDR        0x10000
  #define COMPAT_PT_DATA_ADDR        0x10004
  #define COMPAT_PT_TEXT_END_ADDR    0x10008
+
+/*
+ * used to skip a system call when tracer changes its number to -1
+ * with ptrace(PTRACE_SET_SYSCALL)
+ */
+#define RET_SKIP_SYSCALL   -1
+#define IS_SKIP_SYSCALL(no)    ((int)(no & 0xffffffff) == -1)
+
  #ifndef __ASSEMBLY__

  /* sizeof(struct user) for AArch32 */
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index f0b5e51..fdd6eae 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -25,6 +25,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 

@@ -671,6 +672,8 @@ ENDPROC(el0_svc)
  __sys_trace:
 mov x0, sp
 bl  syscall_trace_enter
+   cmp w0, #RET_SKIP_SYSCALL   // skip syscall?
+   b.eq    __sys_trace_return_skipped
 adr lr, __sys_trace_return  // return address
 uxtw    scno, w0            // syscall number (possibly new)
 mov x1, sp                  // pointer to regs
@@ -685,6 +688,7 @@ __sys_trace:

  __sys_trace_return:
 str x0, [sp]                // save returned x0
+__sys_trace_return_skipped:     // x0 already in regs[0]
 mov x0, sp
 bl  syscall_trace_exit
 b   ret_to_user
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 8876049..c54dbcc 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1121,9 +1121,29 @@ static void tracehook_report_syscall(struct pt_regs *regs,

  asmlinkage int syscall_trace_enter(struct pt_regs *regs)
  {
+   unsigned int saved_syscallno = regs->syscallno;
+
 if (test_thread_flag(TIF_SYSCALL_TRACE))
 tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);

+   if (IS_SKIP_SYSCALL(regs->syscallno)) {
+   /*
+* RESTRICTION: we can't modify a return value of user
+* issued syscall(-1) here. In order to ease this flavor,
+* we need to treat whatever value in x0 as a return value,
+* but this might result in a bogus value being returned.
+*/
+   /*
+* NOTE: syscallno may also be set to -1 if fatal signal is
+* detected in tracehook_report_syscall_entry(), but since
+* a value set to x0 here is not used in this case, we may
+* neglect the case.
+*/
+   if (!test_thread_flag(TIF_SYSCALL_TRACE) ||
+   (IS_SKIP_SYSCALL(saved_syscallno)))
+   regs->regs[0] = -ENOSYS;
+   }
+


I don't have a runtime environment yet for arm64, so I can't test this
directly myself, so I'm just trying to eyeball this. :)

Once the seccomp logic is added here, I don't think using -2 as a
special value will work. Doesn't this mean the Oops is possible by the
user issuing a "-2" syscall? As in, if TIF_SYSCALL_WORK is set, and
the user passed -2 as the syscall, audit will be called only on entry,
and then skipped on exit?
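
For reference, the entry-path decision in the patch under review can be
modelled in user space roughly as follows. This is a sketch under stated
assumptions, not the kernel code: struct demo_regs is a hypothetical
stand-in for the two struct pt_regs fields involved, the skip test is assumed
to compare the low 32 bits of the syscall number with -1, and the tracer is
reduced to a single "new syscall number" parameter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct pt_regs fields used here. */
struct demo_regs {
	uint64_t x0;         /* first argument on entry, return value on exit */
	uint64_t syscallno;  /* syscall number as seen at entry */
};

/* Assumed skip test: the low 32 bits of the number are all ones (-1). */
static bool demo_is_skip_syscall(uint64_t no)
{
	return (uint32_t)no == UINT32_MAX;
}

/*
 * Model of the syscall entry hook.
 *   traced: whether a tracer is attached (TIF_SYSCALL_TRACE in the patch)
 *   new_no: syscall number left behind by the tracer (ignored if !traced)
 * Returns true when the syscall body should be skipped.
 */
static bool demo_trace_enter(struct demo_regs *regs, bool traced,
			     uint64_t new_no)
{
	uint64_t saved;

	assert(regs != NULL);
	saved = regs->syscallno;

	if (traced)
		regs->syscallno = new_no;  /* tracer may rewrite the number */

	if (!demo_is_skip_syscall(regs->syscallno))
		return false;

	/*
	 * Nobody is tracing, or the user issued syscall(-1) directly:
	 * x0 cannot be trusted as a return value, so force -ENOSYS.
	 * Otherwise the tracer asked for the skip and x0 is kept as is.
	 */
	if (!traced || demo_is_skip_syscall(saved))
		regs->x0 = (uint64_t)-ENOSYS;

	return true;
}
```

The model only covers the value -1; it says nothing about how other special
values would interact with audit or seccomp.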


Oops, you're

Re: [PATCH v6 1/6] arm64: ptrace: add PTRACE_SET_SYSCALL

2014-08-21 Thread AKASHI Takahiro


On 08/22/2014 01:47 AM, Kees Cook wrote:

On Thu, Aug 21, 2014 at 3:56 AM, AKASHI Takahiro
 wrote:

To allow tracer to be able to change/skip a system call by re-writing
a syscall number, there are several approaches:

(1) modify x8 register with ptrace(PTRACE_SETREGSET), and handle this case
 later on in syscall_trace_enter(), or
(2) support ptrace(PTRACE_SET_SYSCALL) as on arm

Thinking of the fact that user_pt_regs doesn't expose 'syscallno' to
tracer as well as that secure_computing() expects a changed syscall number
to be visible, especially case of -1, before this function returns in
syscall_trace_enter(), we'd better take (2).

Signed-off-by: AKASHI Takahiro 


Thanks, I like having this on both arm and arm64.


Yeah, having this simplifies the code of syscall_trace_enter() a bit, but
it also imposes some restrictions on arm64.

> I wonder if other archs should add this option too.

Do you think so? I assumed that SET_SYSCALL is to be avoided if possible.

I also think that SET_SYSCALL should take an extra argument for a return value
just in case of -1 (or we have SKIP_SYSCALL?).

-Takahiro AKASHI


Reviewed-by: Kees Cook 


---
  arch/arm64/include/uapi/asm/ptrace.h |1 +
  arch/arm64/kernel/ptrace.c   |   14 +-
  2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/uapi/asm/ptrace.h b/arch/arm64/include/uapi/asm/ptrace.h
index 6913643..49c6174 100644
--- a/arch/arm64/include/uapi/asm/ptrace.h
+++ b/arch/arm64/include/uapi/asm/ptrace.h
@@ -23,6 +23,7 @@

  #include 

+#define PTRACE_SET_SYSCALL 23

  /*
   * PSR bits
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 0310811..8876049 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1077,7 +1077,19 @@ const struct user_regset_view *task_user_regset_view(struct task_struct *task)
  long arch_ptrace(struct task_struct *child, long request,
  unsigned long addr, unsigned long data)
  {
-   return ptrace_request(child, request, addr, data);
+   int ret;
+
+   switch (request) {
+   case PTRACE_SET_SYSCALL:
+   task_pt_regs(child)->syscallno = data;
+   ret = 0;
+   break;
+   default:
+   ret = ptrace_request(child, request, addr, data);
+   break;
+   }
+
+   return ret;
  }

  enum ptrace_syscall_dir {
--
1.7.9.5







[PATCH v3] perf hists browser: Consolidate callchain print functions in TUI

2014-08-21 Thread Namhyung Kim

Currently there are two callchain print functions in the TUI - one for the
hists browser and another for file dump.  They do almost the same job, so
it'd be better to consolidate the code.

To do that, provide two callbacks to the generic logic - one for
printing and another for checking whether it should stop.
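
The two-callback idea can be sketched outside of perf like this. It is a
simplified, self-contained illustration only: every demo_* name is
hypothetical, the callchain is reduced to an array of strings, and the
"screen" printer just counts rows instead of drawing anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical state handed to whichever print callback is in use. */
struct demo_print_arg {
	FILE *fp;     /* file dump only: where to write */
	int printed;  /* screen: rows drawn; file dump: characters written */
};

typedef void (*demo_print_fn)(const char *str, int row,
			      struct demo_print_arg *arg);
typedef bool (*demo_full_fn)(int row, int max_rows);

/* Screen flavour: pretend to draw one row. */
static void demo_print_screen(const char *str, int row,
			      struct demo_print_arg *arg)
{
	(void)str;
	(void)row;
	arg->printed++;
}

/* File-dump flavour: write the entry and count the characters. */
static void demo_print_dump(const char *str, int row,
			    struct demo_print_arg *arg)
{
	(void)row;
	arg->printed += fprintf(arg->fp, "%s\n", str);
}

/* The screen is full once every available row has been used. */
static bool demo_screen_full(int row, int max_rows)
{
	return row == max_rows;
}

/* A file dump never runs out of rows. */
static bool demo_dump_never_full(int row, int max_rows)
{
	(void)row;
	(void)max_rows;
	return false;
}

/* Generic walker: the callbacks decide how to print and when to stop. */
static int demo_show_entries(const char *const *entries, size_t n,
			     int max_rows, demo_print_fn print,
			     struct demo_print_arg *arg, demo_full_fn is_full)
{
	int row = 0;

	assert(entries != NULL && arg != NULL);
	for (size_t i = 0; i < n; i++) {
		if (is_full(row, max_rows))
			break;
		print(entries[i], row, arg);
		row++;
	}
	return row;  /* number of entries actually printed */
}
```

The walker itself contains no knowledge of screens or files, which is the
point of the consolidation: a new output target only needs its own pair of
callbacks.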

Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/ui/browsers/hists.c | 203 -
 1 file changed, 80 insertions(+), 123 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 519353d9f5fb..026421e0d53d 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -477,20 +477,37 @@ static char *callchain_list__sym_name(struct callchain_list *cl,
return bf;
 }
 
+struct callchain_print_arg {
+   /* for hists browser */
+   off_t row_offset;
+   bool is_current_entry;
+
+   /* for file dump */
+   FILE *fp;
+   int printed;
+};
+
+typedef void (*print_callchain_entry_fn)(struct hist_browser *browser,
+struct callchain_list *chain,
+const char *str, int offset,
+unsigned short row,
+struct callchain_print_arg *arg);
+
 static void hist_browser__show_callchain_entry(struct hist_browser *browser,
   struct callchain_list *chain,
-  unsigned short row, int offset,
-  char folded_sign, const char *str,
-  bool *is_current_entry)
+  const char *str, int offset,
+  unsigned short row,
+  struct callchain_print_arg *arg)
 {
int color, width;
+   char folded_sign = callchain_list__folded(chain);
 
color = HE_COLORSET_NORMAL;
width = browser->b.width - (offset + 2);
if (ui_browser__is_current_entry(&browser->b, row)) {
browser->selection = &chain->ms;
color = HE_COLORSET_SELECTED;
-   *is_current_entry = true;
+   arg->is_current_entry = true;
}
 
ui_browser__set_color(&browser->b, color);
@@ -500,12 +517,41 @@ static void hist_browser__show_callchain_entry(struct hist_browser *browser,
slsmg_write_nstring(str, width);
 }
 
+static void hist_browser__fprintf_callchain_entry(struct hist_browser *b __maybe_unused,
+ struct callchain_list *chain,
+ const char *str, int offset,
+ unsigned short row __maybe_unused,
+ struct callchain_print_arg *arg)
+{
+   char folded_sign = callchain_list__folded(chain);
+
+   arg->printed += fprintf(arg->fp, "%*s%c %s\n", offset, " ",
+   folded_sign, str);
+}
+
+typedef bool (*check_output_full_fn)(struct hist_browser *browser,
+unsigned short row);
+
+static bool hist_browser__check_output_full(struct hist_browser *browser,
+   unsigned short row)
+{
+   return browser->b.rows == row;
+}
+
+static bool hist_browser__check_dump_full(struct hist_browser *browser __maybe_unused,
+ unsigned short row __maybe_unused)
+{
+   return false;
+}
+
 #define LEVEL_OFFSET_STEP 3
 
 static int hist_browser__show_callchain(struct hist_browser *browser,
struct rb_root *root, int level,
-   unsigned short row, off_t *row_offset,
-   u64 total, bool *is_current_entry)
+   unsigned short row, u64 total,
+   print_callchain_entry_fn print,
+   struct callchain_print_arg *arg,
+   check_output_full_fn check)
 {
struct rb_node *node;
int first_row = row, offset = level * LEVEL_OFFSET_STEP;
@@ -532,8 +578,8 @@ static int hist_browser__show_callchain(struct hist_browser *browser,
extra_offset = LEVEL_OFFSET_STEP;
 
folded_sign = callchain_list__folded(chain);
-   if (*row_offset != 0) {
-   --*row_offset;
+   if (arg->row_offset != 0) {
+   arg->row_offset--;
goto do_next;
}
 
@@ -550,13 +596,11 @@ static int hist_browser__show_callchain(struct hist_browser *browser,

Re: Issue with clone() and CLONE_NEWUSER as unprivileged user

2014-08-21 Thread Andy Lutomirski

On Thu, Aug 21, 2014 at 3:26 PM, Marcel Holtmann  wrote:
> Hi Andy,
>
>>> I am trying to use clone() and CLONE_NEWUSER for creating a new user 
>>> namespace as an unprivileged user. I always get an operation not permitted 
>>> error. However when I used fork() + unshare() as unprivileged user, I can 
>>> create the new user namespace just fine.
>>>
>>> Is there something obvious that I am missing? My understanding is that 
>>> CLONE_NEWUSER should not require any special capabilities. I tried the 
>>> sample code from the manpage and also from LWN.net, but both give me the 
>>> same error.
>>
>> It works for me on 3.16 and 3.15 but not on 3.15.8-200.fc20.x86_64.  I'm
>> a bit confused.  What kernel are you using?
>
> I am running 3.15.6-200.fc20.x86_64 actually. What confused me is that fork() 
> + unshare() works fine, but clone() doesn't.

Ok, tracked it down.  This is a Fedora-specific issue.

https://bugzilla.redhat.com/show_bug.cgi?id=917708

--Andy

Re: [PATCH] checkpatch: look for common misspellings

2014-08-21 Thread Joe Perches

On Thu, 2014-08-21 at 18:01 -0500, Kees Cook wrote:
> On Thu, Aug 21, 2014 at 12:18 PM, Joe Perches  wrote:
> > On Thu, 2014-08-21 at 09:20 -0700, Kees Cook wrote:
> >> Check for misspellings, based on Debian's lintian list. Several false
> >> positives were removed, and several additional words added that were
> >> common in the kernel:
> > []
> >> diff --git a/MAINTAINERS b/MAINTAINERS
> > []
> >> @@ -2311,6 +2311,7 @@ M:  Andy Whitcroft 
> >>  M:   Joe Perches 
> >>  S:   Maintained
> >>  F:   scripts/checkpatch.pl
> >> +F:   scripts/spelling.txt
> >
> > I don't want to be responsible for misspellings.
> >
> > Maybe this should be moved to another section
> > and you could be added as a maintainer for that.
> 
> Okay, sure. Happy to do so. Or maybe some -docs folks want to join me? :)

Masanari Iida does a lot of these (added to cc)
Maybe he's interested in helping out or adding
his collection of misspellings to spelling.txt.

Maybe Geert Uytterhoeven too (also cc'd)

Maybe the lintian list should also be added if it's
around somewhere instead of trying to duplicate it.

http://anonscm.debian.org/cgit/lintian/lintian.git/tree/data/spelling/corrections

[]

> Cool, let me grok your suggested patch and reply. Thanks for looking this 
> over!

One defect was that it checked only for all-lowercase uses.

Here's what I have now.  It:

o relocates the check above specific file extension tests
o checks lines in the commit log and any added lines
o checks case-insensitive misspellings
o does Proper case corrections for start of sentences
o adds a working --fix option.

Maybe you could merge it with your other changes.

You could add some tag like improved-by: / signed-off-by,
acked-by: for me if you want.
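
The case-insensitive lookup with proper-case correction described above can
be illustrated in a few lines of C. This is only a sketch of the idea and is
not taken from checkpatch.pl, which is Perl: the table contents and all
demo_* names are invented for the example, and only whole words are matched.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One "typo||fix" pair, stored in lower case. */
struct demo_fix {
	const char *typo;
	const char *fix;
};

/* Tiny illustrative table; a real list would be loaded from a file. */
static const struct demo_fix demo_table[] = {
	{ "recieve", "receive" },
	{ "seperate", "separate" },
	{ "occured", "occurred" },
};

/* Case-insensitive string equality, ASCII only. */
static int demo_equal_nocase(const char *a, const char *b)
{
	while (*a && *b) {
		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
			return 0;
		a++;
		b++;
	}
	return *a == '\0' && *b == '\0';
}

/*
 * Look up word regardless of case. On a hit, copy the correction into
 * out and capitalise its first letter if the typo started with a
 * capital (as at the start of a sentence). Returns 1 on a hit, else 0.
 */
static int demo_spell_fix(const char *word, char *out, size_t outlen)
{
	assert(word != NULL && out != NULL && outlen > 0);

	for (size_t i = 0; i < sizeof(demo_table) / sizeof(demo_table[0]); i++) {
		if (!demo_equal_nocase(word, demo_table[i].typo))
			continue;
		snprintf(out, outlen, "%s", demo_table[i].fix);
		if (isupper((unsigned char)word[0]))
			out[0] = (char)toupper((unsigned char)out[0]);
		return 1;
	}
	return 0;
}
```

Only the first letter is re-capitalised, so an all-caps typo comes back in
proper case rather than all caps.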

---
 scripts/checkpatch.pl | 41 -
 1 file changed, 40 insertions(+), 1 deletion(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index b385bcb..9075ed5 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -9,7 +9,8 @@ use strict;
 use POSIX;
 
 my $P = $0;
-$P =~ s@.*/@@g;
+$P =~ s@(.*)/@@g;
+my $D = $1;
 
 my $V = '0.32';
 
@@ -43,6 +44,7 @@ my $configuration_file = ".checkpatch.conf";
 my $max_line_length = 80;
 my $ignore_perl_version = 0;
 my $minimum_perl_version = 5.10.0;
+my $spelling_file = "$D/spelling.txt";
 
 sub help {
my ($exitcode) = @_;
@@ -429,6 +431,29 @@ our $allowed_asm_includes = qr{(?x:
 )};
 # memory.h: ARM has a custom one
 
+# Load common spelling mistakes and build regular expression list.
+my $misspellings;
+my @spelling_list;
+my %spelling_fix;
+open(my $spelling, '<', $spelling_file)
+or die "$P: Can't open $spelling_file for reading: $!\n";
+while (<$spelling>) {
+   my $line = $_;
+
+   $line =~ s/\s*\n?$//g;
+   $line =~ s/^\s*//g;
+
+   next if ($line =~ m/^\s*#/);
+   next if ($line =~ m/^\s*$/);
+
+   my ($suspect, $fix) = split(/\|\|/, $line);
+
+   push(@spelling_list, $suspect);
+   $spelling_fix{$suspect} = $fix;
+}
+close($spelling);
+$misspellings = join("|", @spelling_list);
+
 sub build_types {
my $mods = "(?x:  \n" . join("|\n  ", @modifierList) . "\n)";
my $all = "(?x:  \n" . join("|\n  ", @typeList) . "\n)";
@@ -2337,6 +2362,20 @@ sub process {
}
}
 
+# Check for various spelling / typo mistakes
+   if ($in_commit_log || $line =~ /^\+/) {
+   while ($rawline =~ /(?:^|[^a-z@])($misspellings)(?:\$|[^a-z@])/gi) {
+   my $typo = $1;
+   my $fixed = $spelling_fix{lc($typo)};
+   $fixed = ucfirst($fixed) if ($typo =~ /^[A-Z]/);
+   if (WARN("TYPO_SPELLING",
+"'$typo' may be misspelled - perhaps '$fixed'?\n" . $herecurr) &&
+   $fix) {
+   $fixed[$fixlinenr] =~ s/(^|[^A-Za-z@])($typo)(\$|[^A-Za-z@])/$1$fixed$3/;
+   }
+   }
+   }
+
 # check we are in a valid source file if not then ignore this hunk
next if ($realfile !~ /\.(h|c|s|S|pl|sh)$/);
 



Re: Oops: 17 SMP ARM (v3.16-rc2)

2014-08-21 Thread Fabio Estevam

On Thu, Aug 21, 2014 at 6:39 AM, Iain Paton  wrote:

> two and a half days of running this against both a sabre-lite and a
> wandboard quad B1 and I still have no reason to think there's any
> sort of a problem.
>
> Up to now, my testing has been done with my own config, I'll now
> repeat the whole thing using the config Mattis posted to see if
> I can reproduce it that way.
>
> Suggestions on a better / easier / quicker way to reproduce it are
> welcome.

Thanks, Iain.

Mattis,

What is the silicon version of the mx6 in your sabrelite? What GCC
version do you use?

Re: [PATCH] edac: Fix build error caused by wrong member access

2014-08-21 Thread Pranith Kumar

On Thu, Aug 21, 2014 at 5:28 PM, Andrew Morton
 wrote:
>
> This driver seems pretty unhealthy and I suspect it has been
> broken for quite a while.
>
> drivers/edac/ppc4xx_edac.c: In function 'mfsdram':
> drivers/edac/ppc4xx_edac.c:249: error: implicit declaration of function 
> '__mfdcri'
> drivers/edac/ppc4xx_edac.c: In function 'mtsdram':
> drivers/edac/ppc4xx_edac.c:266: error: implicit declaration of function 
> '__mtdcri'
> drivers/edac/ppc4xx_edac.c:269: warning: 'return' with a value, in function 
> returning void
> drivers/edac/ppc4xx_edac.c: In function 'ppc4xx_edac_init_csrows':
> drivers/edac/ppc4xx_edac.c:924: warning: initialization from incompatible 
> pointer type
> drivers/edac/ppc4xx_edac.c:977: error: request for member 'dimm' in something 
> not a structure or union
> drivers/edac/ppc4xx_edac.c: In function 'ppc4xx_edac_map_dcrs':
> drivers/edac/ppc4xx_edac.c:1209: warning: passing argument 1 of 
> 'dcr_map_mmio' discards qualifiers from pointer target type
>
>

Yes, not sure if anyone is actually using it.

Anyway, I will send in a patch to fix the errors which you point out here.

-- 
Pranith

Re: [PATCH] cgroup: add tracepoints to track cgroup events

2014-08-21 Thread Steven Rostedt

On Thu, 21 Aug 2014 12:07:01 -0500
Tejun Heo  wrote:

> Hello, Andrea.
> 
> On Thu, Aug 21, 2014 at 11:00:02AM -0600, Andrea Righi wrote:
> > hmm... am I missing something or we already support directory events?
> 
> Ah, right, those mkdir/rmdir and writes automatically generate those
> events.
> 
> > root@Dell:~# grep cgroups /proc/mounts
> > none /cgroups cgroup 
> > rw,relatime,cpuset,cpu,cpuacct,memory,devices,freezer,perf_event,hugetlb 0 0
> > root@Dell:~# inotifywait -m -r -e modify -e move -e create -e delete 
> > /cgroups
> > Setting up watches.  Beware: since -r was given, this may take a while!
> > Watches established.
> > /cgroups/ CREATE,ISDIR test
> > /cgroups/test/ MODIFY cgroup.procs
> > /cgroups/test/ MODIFY cgroup.procs
> > /cgroups/test/ MODIFY cgroup.populated
> > /cgroups/ MODIFY cgroup.procs
> > /cgroups/ MODIFY cgroup.procs
> > /cgroups/test/ MODIFY cgroup.populated
> > /cgroups/ DELETE,ISDIR test
> > 
> > I still need to figure out a smart way to track which PIDs are
> > added/removed to/from cgroup.procs from userland (inotifywait + git? :)),
> > but all the other information provided by my tracepoint patch seems to
> > be already available via [di]notify.
> 
> Hmmm... yeah, determining exactly which pids got added / removed can
> be cumbersome from just MODIFY events.  That said, what are you trying
> to do with such information?
> 

OK, is this patch not being pushed then? I have a lot of comments to
make about it, but if this patch is being dropped for another way of
doing things I won't waste my time on it.

Thanks,

-- Steve

Re: [PATCH 1/2] net: ec_bhf: remove excessive debug messages

2014-08-21 Thread David Miller

From: Dariusz Marcinkiewicz 
Date: Fri, 15 Aug 2014 17:49:41 +0200

> This cuts down on the amount of debug information spit out by
> the driver. Some of the potentially useful debug info gets exposed
> by debugfs.
> 
> Signed-off-by: Dariusz Marcinkiewicz 

I think you should just flat out remove a lot of this stuff:

> +static struct debugfs_reg32 ec_bhf_debugfs_mii_regs[] = {
> + {
> + .name = "link-status",
> + .offset = MII_LINK_STATUS
> + }
> +};

This is completely unnecessary, if you want to export MII register
values to the user, we have a mechanism for that, via the SIOCGMII*
ioctls.

> +static struct debugfs_reg32 ec_bhf_debugfs_fifo_regs[] = {
> + {
> + .name = "fifo-tx",
> + .offset = FIFO_TX_REG
> + },
> + {
> + .name = "fifo-rx",
> + .offset = FIFO_RX_REG
> + }
> +};

You can export chip register values via the ethtool register dump API.
Simply implement ethtool_ops->get_regs_len and ethtool_ops->get_regs
and off you go.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 2/2] MAINTAINERS: add entry for ec_bhf driver

2014-08-21 Thread David Miller

From: Dariusz Marcinkiewicz 
Date: Fri, 15 Aug 2014 17:50:47 +0200

> Added entry for ec_bhf driver.
> 
> Signed-off-by: Dariusz Marcinkiewicz 

Applied, thanks.

Re: [PATCH 8/9] ARM: zynq: Remove hotplug.c

2014-08-21 Thread Sören Brinkmann

On Thu, 2014-08-21 at 03:32AM +0200, Daniel Lezcano wrote:
> On 08/20/2014 10:41 PM, Soren Brinkmann wrote:
> >The hotplug code contains only a single function, which is an SMP
> >function. Move that to platsmp.c where all other SMP functions reside.
> >That allows removing hotplug.c and declaring the cpu_die function
> >static.
> >
> >Signed-off-by: Soren Brinkmann 
> >---
> >  arch/arm/mach-zynq/Makefile  |  1 -
> >  arch/arm/mach-zynq/common.h  |  3 +--
> >  arch/arm/mach-zynq/hotplug.c | 17 -
> >  arch/arm/mach-zynq/platsmp.c | 18 ++
> >  4 files changed, 19 insertions(+), 20 deletions(-)
> >
> >diff --git a/arch/arm/mach-zynq/Makefile b/arch/arm/mach-zynq/Makefile
> >index 820dff6e1eba..c85fb3f7d5cd 100644
> >--- a/arch/arm/mach-zynq/Makefile
> >+++ b/arch/arm/mach-zynq/Makefile
> >@@ -6,5 +6,4 @@
> >  obj-y  := common.o slcr.o pm.o
> >  CFLAGS_REMOVE_hotplug.o=-march=armv6k
> >  CFLAGS_hotplug.o   =-Wa,-march=armv7-a -mcpu=cortex-a9
> >-obj-$(CONFIG_HOTPLUG_CPU)   += hotplug.o
> >  obj-$(CONFIG_SMP)  += headsmp.o platsmp.o
> >diff --git a/arch/arm/mach-zynq/common.h b/arch/arm/mach-zynq/common.h
> >index c0773e87e83c..e6bb12c50a23 100644
> >--- a/arch/arm/mach-zynq/common.h
> >+++ b/arch/arm/mach-zynq/common.h
> >@@ -39,8 +39,7 @@ extern struct smp_operations zynq_smp_ops __initdata;
> >
> >  extern void __iomem *zynq_scu_base;
> >
> >-/* Hotplug */
> >-extern void zynq_platform_cpu_die(unsigned int cpu);
> >+int zynq_pm_late_init(void);
> >
> >  int zynq_pm_late_init(void);
> >
> >diff --git a/arch/arm/mach-zynq/hotplug.c b/arch/arm/mach-zynq/hotplug.c
> >index fe44a05677e2..b685c89f11e4 100644
> >--- a/arch/arm/mach-zynq/hotplug.c
> >+++ b/arch/arm/mach-zynq/hotplug.c
> >@@ -12,20 +12,3 @@
> >   */
> >  #include 
> >
> >-/*
> >- * platform-specific code to shutdown a CPU
> >- *
> >- * Called with IRQs disabled
> >- */
> >-void zynq_platform_cpu_die(unsigned int cpu)
> >-{
> >-zynq_slcr_cpu_state_write(cpu, true);
> >-
> >-/*
> >- * there is no power-control hardware on this platform, so all
> >- * we can do is put the core into WFI; this is safe as the calling
> >- * code will have already disabled interrupts
> >- */
> >-for (;;)
> >-cpu_do_idle();
> >-}
> >diff --git a/arch/arm/mach-zynq/platsmp.c b/arch/arm/mach-zynq/platsmp.c
> >index f77f7ca4c45b..04e578718aa2 100644
> >--- a/arch/arm/mach-zynq/platsmp.c
> >+++ b/arch/arm/mach-zynq/platsmp.c
> >@@ -132,6 +132,24 @@ static int zynq_cpu_kill(unsigned cpu)
> > zynq_slcr_cpu_stop(cpu);
> > return 1;
> >  }
> >+
> >+/*
> >+ * platform-specific code to shutdown a CPU
> >+ *
> >+ * Called with IRQs disabled
> >+ */
> >+static void zynq_platform_cpu_die(unsigned int cpu)
> >+{
> >+zynq_slcr_cpu_state_write(cpu, true);
> >+
> >+/*
> >+ * there is no power-control hardware on this platform, so all
> >+ * we can do is put the core into WFI; this is safe as the calling
> >+ * code will have already disabled interrupts
> >+ */
> >+for (;;)
> >+cpu_do_idle();
> 
> IIUC, the cpu_do_idle() will flush the L1 cache and then call the
> WFI. It makes sense if we are about to power down the core. So I am
> wondering if we can just call wfi() instead.

I'm not sure - it's not that trivial to trace that through the sources.
But I think cpu_do_idle ends up in cpu_v7_do_idle which is just:
ENTRY(cpu_v7_do_idle)
	dsb			@ WFI may enter a low-power mode
	wfi
	ret	lr
ENDPROC(cpu_v7_do_idle)

I think that is what we want here.

Thanks,
Sören

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 0/6] mm/hugetlb: gigantic hugetlb page pools shrink supporting

2014-08-21 Thread Wanpeng Li

Hi Andi,
On Fri, Apr 12, 2013 at 05:22:37PM +0200, Andi Kleen wrote:
>On Fri, Apr 12, 2013 at 07:29:07AM +0800, Wanpeng Li wrote:
>> Ping Andi,
>> On Thu, Apr 04, 2013 at 05:09:08PM +0800, Wanpeng Li wrote:
>> >order >= MAX_ORDER pages are only allocated at boot stage using the 
>> >bootmem allocator with the "hugepages=xxx" option. These pages are never 
>> >free after boot by default since it would be a one-way street(>= MAX_ORDER
>> >pages cannot be allocated later), but if administrator confirm not to 
>> >use these gigantic pages any more, these pinned pages will waste memory
>> >since other users can't grab free pages from gigantic hugetlb pool even
>> >if OOM, it's not flexible.  The patchset add hugetlb gigantic page pools
>> >shrink supporting. Administrator can enable knob exported in sysctl to
>> >permit to shrink gigantic hugetlb pool.
>
>
>I originally didn't allow this because it's only one way and it seemed
>dubious.  I've been recently working on a new patchkit to allocate
>GB pages from CMA. With that freeing actually makes sense, as 
>the pages can be reallocated.
>

More than a year has passed. Has your patchkit to allocate GB pages from CMA been merged?

Regards,
Wanpeng Li 

>-Andi
>
>--
>To unsubscribe, send a message with 'unsubscribe linux-mm' in
>the body to majord...@kvack.org.  For more info on Linux MM,
>see: http://www.linux-mm.org/ .
>Don't email: em...@kvack.org

Re: [PATCH v2 0/3] net: Add Keystone NetCP ethernet driver support

2014-08-21 Thread David Miller

From: Santosh Shilimkar 
Date: Fri, 15 Aug 2014 11:12:39 -0400

> Update version after incorporating David Miller's comment from earlier
> posting [1]. I would like to get these merged for upcoming 3.18 merge
> window if there are no concerns on this version.
> 
> The network coprocessor (NetCP) is a hardware accelerator that processes
> Ethernet packets. NetCP has a gigabit Ethernet (GbE) subsystem with a ethernet
> switch sub-module to send and receive packets. NetCP also includes a packet
> accelerator (PA) module to perform packet classification operations such as
> header matching, and packet modification operations such as checksum
> generation. NetCP can also optionally include a Security Accelerator(SA)
> capable of performing IPSec operations on ingress/egress packets.
> 
> Keystone SoC's also have a 10 Gigabit Ethernet Subsystem (XGbE) which
> includes a 3-port Ethernet switch sub-module capable of 10Gb/s and
> 1Gb/s rates per Ethernet port.
> 
> NetCP driver has a plug-in module architecture where each of the NetCP
> sub-modules exist as a loadable kernel module which plug in to the netcp
> core. These sub-modules are represented as "netcp-devices" in the dts
> bindings. It is mandatory to have the ethernet switch sub-module for
> the ethernet interface to be operational. Any other sub-module like the
> PA is optional.
> 
> Both GBE and XGBE network processors supported using common driver. It
> is also designed to handle future variants of NetCP.

I don't want to see an offload driver that doesn't plug into the existing
generic frameworks for configuration et al.

If no existing facility exists to support what you need, you must work
with the upstream maintainers to design and create one.

It is absolutely not reasonable for every "switch on a chip" driver to
export its own configuration knob; we need a standard interface all
such drivers will plug into and provide.

Re: [PATCH net] net: fix checksum features handling in netif_skb_features()

2014-08-21 Thread David Miller

From: Michal Kubecek 
Date: Fri, 15 Aug 2014 11:56:17 +0200 (CEST)

> @@ -2587,13 +2587,18 @@ netdev_features_t netif_skb_features(struct sk_buff 
> *skb)
>   return harmonize_features(skb, features);
>   }
>  
> - features &= (skb->dev->vlan_features | NETIF_F_HW_VLAN_CTAG_TX |
> -NETIF_F_HW_VLAN_STAG_TX);
> + features = netdev_intersect_features(features, skb->dev->vlan_features |
> +NETIF_F_HW_VLAN_CTAG_TX |
> +NETIF_F_HW_VLAN_STAG_TX);

This is not indented properly.

The second and subsequent lines of the function call should start precisely
at the first column after the opening parenthesis on the first line.  You
should use the appropriate number of TAB and SPACE characters necessary
to do so.

Thanks.

Re: [PATCH v3 3/4] zram: zram memory size limitation

2014-08-21 Thread Minchan Kim

On Thu, Aug 21, 2014 at 03:08:12PM -0400, Dan Streetman wrote:
> On Wed, Aug 20, 2014 at 8:27 PM, Minchan Kim  wrote:
> > Since zram has no control feature to limit memory usage,
> > it is hard to manage system memory.
> >
> > This patch adds a new knob "mem_limit" via sysfs to set up
> > a limit so that zram could fail allocation once it reaches
> > the limit.
> >
> > Signed-off-by: Minchan Kim 
> > ---
> >  Documentation/ABI/testing/sysfs-block-zram |  9 +++
> >  Documentation/blockdev/zram.txt| 20 ---
> >  drivers/block/zram/zram_drv.c  | 41 
> > ++
> >  drivers/block/zram/zram_drv.h  |  5 
> >  4 files changed, 71 insertions(+), 4 deletions(-)
> >
> > diff --git a/Documentation/ABI/testing/sysfs-block-zram 
> > b/Documentation/ABI/testing/sysfs-block-zram
> > index 70ec992514d0..025331c19045 100644
> > --- a/Documentation/ABI/testing/sysfs-block-zram
> > +++ b/Documentation/ABI/testing/sysfs-block-zram
> > @@ -119,3 +119,12 @@ Description:
> > efficiency can be calculated using compr_data_size and this
> > statistic.
> > Unit: bytes
> > +
> > +What:  /sys/block/zram/mem_limit
> > +Date:  August 2014
> > +Contact:   Minchan Kim 
> > +Description:
> > +   The mem_limit file is read/write and specifies the amount
> > +   of memory zram may consume to store
> > +   compressed data.
> 
> might want to clarify here that the value "0", which is the default,
> disables the limit.

Okay.

> 
> > +   Unit: bytes
> > diff --git a/Documentation/blockdev/zram.txt 
> > b/Documentation/blockdev/zram.txt
> > index 0595c3f56ccf..9f239ff8c444 100644
> > --- a/Documentation/blockdev/zram.txt
> > +++ b/Documentation/blockdev/zram.txt
> > @@ -74,14 +74,26 @@ There is little point creating a zram of greater than 
> > twice the size of memory
> >  since we expect a 2:1 compression ratio. Note that zram uses about 0.1% of 
> > the
> >  size of the disk when not in use so a huge zram is wasteful.
> >
> > -5) Activate:
> > +5) Set memory limit: Optional
> > +   Set memory limit by writing the value to sysfs node 'mem_limit'.
> > +   The value can be either in bytes or you can use mem suffixes.
> > +   Examples:
> > +   # limit /dev/zram0 with 50MB memory
> > +   echo $((50*1024*1024)) > /sys/block/zram0/mem_limit
> > +
> > +   # Using mem suffixes
> > +   echo 256K > /sys/block/zram0/mem_limit
> > +   echo 512M > /sys/block/zram0/mem_limit
> > +   echo 1G > /sys/block/zram0/mem_limit
> 
> # To disable memory limit
> echo 0 > /sys/block/zram0/mem_limit

Yep.

> 
> > +
> > +6) Activate:
> > mkswap /dev/zram0
> > swapon /dev/zram0
> >
> > mkfs.ext4 /dev/zram1
> > mount /dev/zram1 /tmp
> >
> > -6) Stats:
> > +7) Stats:
> > Per-device statistics are exported as various nodes under
> > /sys/block/zram/
> > disksize
> > @@ -96,11 +108,11 @@ size of the disk when not in use so a huge zram is 
> > wasteful.
> > compr_data_size
> > mem_used_total
> >
> > -7) Deactivate:
> > +8) Deactivate:
> > swapoff /dev/zram0
> > umount /dev/zram1
> >
> > -8) Reset:
> > +9) Reset:
> > Write any positive value to 'reset' sysfs node
> > echo 1 > /sys/block/zram0/reset
> > echo 1 > /sys/block/zram1/reset
> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> > index 302dd37bcea3..adc91c7ecaef 100644
> > --- a/drivers/block/zram/zram_drv.c
> > +++ b/drivers/block/zram/zram_drv.c
> > @@ -122,6 +122,33 @@ static ssize_t max_comp_streams_show(struct device 
> > *dev,
> > return scnprintf(buf, PAGE_SIZE, "%d\n", val);
> >  }
> >
> > +static ssize_t mem_limit_show(struct device *dev,
> > +   struct device_attribute *attr, char *buf)
> > +{
> > +   u64 val;
> > +   struct zram *zram = dev_to_zram(dev);
> > +
> > +   down_read(&zram->init_lock);
> > +   val = zram->limit_pages;
> > +   up_read(&zram->init_lock);
> > +
> > +   return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
> > +}
> > +
> > +static ssize_t mem_limit_store(struct device *dev,
> > +   struct device_attribute *attr, const char *buf, size_t len)
> > +{
> > +   u64 limit;
> > +   struct zram *zram = dev_to_zram(dev);
> > +
> > +   limit = memparse(buf, NULL);
> > +   down_write(&zram->init_lock);
> > +   zram->limit_pages = PAGE_ALIGN(limit) >> PAGE_SHIFT;
> > +   up_write(&zram->init_lock);
> > +
> > +   return len;
> > +}
> > +
> >  static ssize_t max_comp_streams_store(struct device *dev,
> > struct device_attribute *attr, const char *buf, size_t len)
> >  {
> > @@ -513,6 +540,14 @@ static int zram_bvec_write(struct zram *zram, struct 
> > bio_vec *bvec, u32 index,
> >
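
As a rough userspace sketch of the arithmetic in mem_limit_store() and of the documented suffix handling (not the kernel code itself): sketch_memparse() only approximates memparse() for the K/M/G suffixes, the 4 KiB page size is an assumption, and sketch_over_limit() is a hypothetical helper that treats 0 as "no limit", as suggested in the review above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_PAGE_SHIFT 12                       /* assumption: 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)

/* Userspace approximation of memparse(): a number plus an optional K/M/G suffix. */
uint64_t sketch_memparse(const char *s)
{
    char *end;
    uint64_t val;

    assert(s != NULL);
    val = strtoull(s, &end, 0);
    switch (*end) {
    case 'G': case 'g':
        val <<= 10;
        /* fall through */
    case 'M': case 'm':
        val <<= 10;
        /* fall through */
    case 'K': case 'k':
        val <<= 10;
        break;
    default:
        break;
    }
    return val;
}

/* Round a byte limit up to whole pages, like PAGE_ALIGN(limit) >> PAGE_SHIFT. */
uint64_t sketch_limit_to_pages(uint64_t bytes)
{
    return (bytes + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}

/* Hypothetical check: a limit of 0 pages means the limit is disabled. */
int sketch_over_limit(uint64_t limit_pages, uint64_t used_pages)
{
    return limit_pages != 0 && used_pages > limit_pages;
}
```

With these assumptions, "echo 1 > mem_limit" would still reserve one whole page, because the byte value is rounded up before the shift.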

[PATCH linux-next] irq: export handle_fasteoi_irq

2014-08-21 Thread Vincent Stehlé

Export handle_fasteoi_irq to be able to use it in e.g. the Zynq gpio driver
since commit 6dd859508336 ("gpio: zynq: Fix IRQ handlers").

This fixes the following link issue:

  ERROR: "handle_fasteoi_irq" [drivers/gpio/gpio-zynq.ko] undefined!

Signed-off-by: Vincent Stehlé 
Cc: Thomas Gleixner 
Cc: Lars-Peter Clausen 
Cc: Linus Walleij 
---

Hi,

This can be seen in Linux next-20140822 with e.g. arm allmodconfig.

Zynq gpio seems to be the first user of handle_fasteoi_irq that can be
compiled as a module.

Best regards,

V.

 kernel/irq/chip.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index a2b28a2..6223fab 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -517,6 +517,7 @@ out:
chip->irq_eoi(&desc->irq_data);
raw_spin_unlock(&desc->lock);
 }
+EXPORT_SYMBOL_GPL(handle_fasteoi_irq);

 /**
  * handle_edge_irq - edge type IRQ handler
-- 
2.1.0.rc1


Re: [PATCH v6 2/2] KVM: nVMX: nested TPR shadow/threshold emulation

2014-08-21 Thread Wanpeng Li

Hi Paolo,
On Thu, Aug 21, 2014 at 02:33:36PM +0200, Paolo Bonzini wrote:
[...]
>>  return;
>> @@ -7847,6 +7859,27 @@ static bool nested_get_vmcs12_pages(struct kvm_vcpu 
>> *vcpu,
>>  vmx->nested.apic_access_page =
>>  nested_get_page(vcpu, vmcs12->apic_access_addr);
>>  }
>> +
>> +if (nested_cpu_has(vmcs12, CPU_BASED_TPR_SHADOW)) {
>
>Missing PAGE_ALIGNED check.  I should have spotted this before, so I
>just fixed it and will commit the patch soon.
>

Maybe I misunderstand your comments "On real hardware you could point
the virtual-APIC page to an invalid address."
http://lists.openwall.net/linux-kernel/2014/08/07/344

>Thanks for your persistence!
>

Thanks for your great help. ;-)

Regards,
Wanpeng Li 

>Paolo

Re: [PATCH -v2] x86, acpi: Handle xapic/x2apic entries in MADT at same time

2014-08-21 Thread Yinghai Lu

On Thu, Aug 21, 2014 at 12:00 AM, Ingo Molnar  wrote:
>
> (lkml Cc:-ed, in case someone wants to help out.)
>
> The changelog quality and organization of your submitted
> patches is still poor, they are hard to read and review. This
> is a repeat complaint against your patches, yet not much has
> happened over the last few years. Please improve them before
> resending your patches.
>
> As a positive example, here's a couple of x86 architecture
> commits with good changelogs:
>
> 95d76acc7518 ("x86, irq: Count legacy IRQs by legacy_pic->nr_legacy_irqs 
> instead of NR_IRQS_LEGACY")
> 6b9fb7082409 ("x86, ACPI, irq: Consolidate algorithm of mapping (ioapic, pin) 
> to IRQ number")
> 2e0ad0e2c135 ("x86, ACPI, irq: Fix possible error in GSI to IRQ mapping for 
> legacy IRQ")
> 44a69f619562 ("acpi, apei, ghes: Make NMI error notification to be GHES 
> architecture extension")
>
> Please match or exceed the changelog quality of these commits.

How about this version ?

Subject: [PATCH -v3] x86, acpi: Make cpu sequence to be consistent with MADT
From: Yinghai Lu 

On an 8-socket system with x2apic pre-enabled, we get the following sequence:
CPU0: socket0, core0, thread0.
CPU1 - CPU 40: socket 4 - socket 7, thread 0
CPU41 - CPU 80: socket 4 - socket 7, thread 1
CPU81 - CPU 119: socket 0 - socket 3, thread 0
CPU120 - CPU 159: socket 0 - socket 3, thread 1

The system has a mix of xapic and x2apic entries in the MADT and SRAT.
The current kernel parses all x2apic entries before all xapic entries, and
at the same time reserves the CPU0 slot for the boot cpu, so we get an
out-of-order cpu sequence.

Some users have scripts that simply assume the cpu sequence starts with
socket 0 and continues with the following sockets. Based on the socket
number/core number in the system, they use a simple mapping from kernel
cpu index to socket index.

The BIOS developers insist that, per the ACPI 4.0 spec, cpus with
apic id < 255 must still use xapic entries instead of x2apic entries,
even when x2apic mode is pre-enabled.

We could handle each MADT entry as it appears, whether xapic or x2apic,
instead of handling all x2apic entries and then all xapic entries.

After the patch we have:
CPU0 - CPU 79: socket 0 - socket 7, thread 0
CPU80 - CPU 159: socket 0 - socket 7, thread 1
which matches the cpu sequence in the MADT.

-v2: update some comments, and change to pass array pointer.
-v3: update changelog.
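
To illustrate the ordering difference described in the changelog, here is a minimal userspace sketch. It is not the kernel's MADT parser: the struct, the enum and both function names are hypothetical, and it ignores the CPU0 slot reserved for the boot cpu. It only shows that one pass per entry type reorders the cpus, while a single pass keeps the table order.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_type { SKETCH_XAPIC, SKETCH_X2APIC };

/* Hypothetical, simplified MADT processor entry. */
struct sketch_entry {
    enum sketch_type type;
    int apic_id;
};

/* Old behaviour: all x2apic entries first, then all xapic entries. */
size_t sketch_enumerate_by_type(const struct sketch_entry *madt, size_t n,
                                int *cpu_apic_id)
{
    size_t cpu = 0;

    assert(madt != NULL && cpu_apic_id != NULL);
    for (size_t i = 0; i < n; i++)
        if (madt[i].type == SKETCH_X2APIC)
            cpu_apic_id[cpu++] = madt[i].apic_id;
    for (size_t i = 0; i < n; i++)
        if (madt[i].type == SKETCH_XAPIC)
            cpu_apic_id[cpu++] = madt[i].apic_id;
    return cpu;
}

/* Proposed behaviour: one pass, both entry types handled as they appear. */
size_t sketch_enumerate_in_table_order(const struct sketch_entry *madt,
                                       size_t n, int *cpu_apic_id)
{
    size_t cpu = 0;

    assert(madt != NULL && cpu_apic_id != NULL);
    for (size_t i = 0; i < n; i++)
        cpu_apic_id[cpu++] = madt[i].apic_id;
    return cpu;
}
```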

Re: [PATCH 1/3] tg3: Limit minimum tx queue wakeup threshold

2014-08-21 Thread Michael Chan

On Thu, 2014-08-21 at 16:06 -0700, Benjamin Poirier wrote: 
> On 2014/08/21 15:32, Michael Chan wrote:
> > On Thu, 2014-08-21 at 15:04 -0700, Benjamin Poirier wrote: 
> > > On 2014/08/19 15:00, Michael Chan wrote:
> > > > On Tue, 2014-08-19 at 11:52 -0700, Benjamin Poirier wrote: 
> > > > > diff --git a/drivers/net/ethernet/broadcom/tg3.c 
> > > > > b/drivers/net/ethernet/broadcom/tg3.c
> > > > > index 3ac5d23..b11c0fd 100644
> > > > > --- a/drivers/net/ethernet/broadcom/tg3.c
> > > > > +++ b/drivers/net/ethernet/broadcom/tg3.c
> > > > > @@ -202,7 +202,8 @@ static inline void _tg3_flag_clear(enum TG3_FLAGS 
> > > > > flag, unsigned long *bits)
> > > > >  #endif
> > > > >  
> > > > >  /* minimum number of free TX descriptors required to wake up TX 
> > > > > process */
> > > > > -#define TG3_TX_WAKEUP_THRESH(tnapi)((tnapi)->tx_pending 
> > > > > / 4)
> > > > > +#define TG3_TX_WAKEUP_THRESH(tnapi)max_t(u32, 
> > > > > (tnapi)->tx_pending / 4, \
> > > > > + MAX_SKB_FRAGS + 1)
> > > > 
> > > > I think we should precompute this and store it in something like
> > > > tp->tx_wake_thresh.
> > > 
> > > I've tried this by adding the following patch at the end of the v2
> > > series but I did not measure a significant latency improvement. Was
> > > there another reason for the change? 
> > 
> > Just performance.  The wake up threshold is checked in the tx fast path
> > in both start_xmit() and tg3_tx().  I would optimize such code for speed
> 
> I don't see what you mean. The code in those two functions that used to
> invoke TG3_TX_WAKEUP_THRESH is wrapped in unlikely() conditions. You
> can't tell me that's the fast path ;) It's only checked when the queue
> is stopped.

I missed the unlikely().  So you're right.  It's not really in the fast
path.

> 
> Moreover, the patches I've sent already add tg3_napi.wakeup_thresh. It
> is over those patches that I've made the measurements.

Right.  But my original comment was over your original patch #1 which
was adding max_t() to the macro TG3_TX_WAKEUP_THRESH without adding a
wakeup_thresh field.  All my comments (performance and smaller code)
were based on your original patch #1.  Later I did see that your patch 3
converted TG3_TX_WAKEUP_THRESH to a structure field so it's no longer an
issue.

> 
> > as much as possible.  In the current code, it was just a right shift
> > operation.  Now, with max_t() added, I think I prefer having it
> > pre-computed.  The performance difference may not be measurable, but I
> > think the compiled code size may be smaller too.
> 
> Maybe in certain areas, but not overall:
> 
> with v2 patches 1-3
>    text    data     bss     dec     hex filename
>  149495    1247       0  150742   24cd6 drivers/net/ethernet/broadcom/tg3.o
> with v2 patches 1-3 + tx_wake_thresh_def
>    text    data     bss     dec     hex filename
>  149524    1247       0  150771   24cf3 drivers/net/ethernet/broadcom/tg3.o
> 
> I really don't see a gain.
> 

Agreed.  Once you have converted the TG3_TX_WAKEUP_THRESH to a structure
field, that's sufficient.  No need to have multiple fields.  Thanks.
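
For readers following along, the "compute once, store in a field" idea can be sketched in userspace as below. This is only an illustration, not the tg3 code: struct sketch_napi and both functions are hypothetical, and SKETCH_MAX_SKB_FRAGS is assumed to be 17 (the kernel's MAX_SKB_FRAGS depends on the page size).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value for illustration; the kernel derives MAX_SKB_FRAGS from PAGE_SIZE. */
#define SKETCH_MAX_SKB_FRAGS 17u

/* Hypothetical stand-in for the per-queue state discussed in the thread. */
struct sketch_napi {
    uint32_t tx_pending;     /* configured TX ring size */
    uint32_t wakeup_thresh;  /* precomputed whenever tx_pending changes */
};

/* max(tx_pending / 4, MAX_SKB_FRAGS + 1), the value the patched macro computes. */
uint32_t sketch_wakeup_thresh(uint32_t tx_pending)
{
    uint32_t quarter = tx_pending / 4;
    uint32_t floor = SKETCH_MAX_SKB_FRAGS + 1;

    return quarter > floor ? quarter : floor;
}

/* Store the threshold at configuration time so the wake-up check is a plain load. */
void sketch_set_tx_pending(struct sketch_napi *tnapi, uint32_t tx_pending)
{
    assert(tnapi != NULL);
    tnapi->tx_pending = tx_pending;
    tnapi->wakeup_thresh = sketch_wakeup_thresh(tx_pending);
}
```

The floor matters only for small rings: with 512 pending descriptors the quarter rule wins, with 8 the floor of MAX_SKB_FRAGS + 1 applies.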


Re: [PATCH v3 2/4] zsmalloc: change return value unit of zs_get_total_size_bytes

2014-08-21 Thread Seth Jennings

On Thu, Aug 21, 2014 at 02:53:57PM -0400, Dan Streetman wrote:
> On Wed, Aug 20, 2014 at 8:27 PM, Minchan Kim  wrote:
> > zs_get_total_size_bytes returns the amount of memory zsmalloc
> > has consumed in *bytes*, but zsmalloc operates in *pages*
> > rather than bytes, so let's change the API. The benefit
> > is that we avoid unnecessary overhead
> > (i.e., converting page units to byte units) in zsmalloc.
> >
> > Now, zswap can rollback to zswap_pool_pages.
> > Over to zswap guys ;-)
> 
> We could change zpool/zswap over to total pages instead of total
> bytes, since both zbud and zsmalloc now report size in pages.  The
> only downside would be if either changed later to not use only whole
> pages (or if they start using huge pages for storage...), but for what
> they do that seems unlikely.  After this patch is finalized I can
> write up a quick patch unless Seth disagrees (or already has a patch
> :)

I agree that we should move everything (back) to pages.

I can write the patch or you can; doesn't matter to me.  I might have
one started on the previous version of this patchset where I erroneously
determined that Minchan had broken stuff :-/

Seth

> 
> >
> > Signed-off-by: Minchan Kim 
> > ---
> >  drivers/block/zram/zram_drv.c |  4 ++--
> >  include/linux/zsmalloc.h  |  2 +-
> >  mm/zsmalloc.c | 10 +-
> >  3 files changed, 8 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> > index d00831c3d731..302dd37bcea3 100644
> > --- a/drivers/block/zram/zram_drv.c
> > +++ b/drivers/block/zram/zram_drv.c
> > @@ -103,10 +103,10 @@ static ssize_t mem_used_total_show(struct device *dev,
> >
> > > down_read(&zram->init_lock);
> > > if (init_done(zram))
> > > -   val = zs_get_total_size_bytes(meta->mem_pool);
> > > +   val = zs_get_total_size(meta->mem_pool);
> > > up_read(&zram->init_lock);
> >
> > -   return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
> > +   return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
> >  }
> >
> >  static ssize_t max_comp_streams_show(struct device *dev,
> > diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
> > index e44d634e7fb7..105b56e45d23 100644
> > --- a/include/linux/zsmalloc.h
> > +++ b/include/linux/zsmalloc.h
> > @@ -46,6 +46,6 @@ void *zs_map_object(struct zs_pool *pool, unsigned long 
> > handle,
> > enum zs_mapmode mm);
> >  void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
> >
> > -u64 zs_get_total_size_bytes(struct zs_pool *pool);
> > +unsigned long zs_get_total_size(struct zs_pool *pool);
> 
> minor naming suggestion, but since the name is changing anyway,
> "zs_get_total_size" implies to me the units are bytes, would
> "zs_get_total_pages" be clearer that it's returning size in # of
> pages, not bytes?
> 
> >
> >  #endif
> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > index a65924255763..80408a1da03a 100644
> > --- a/mm/zsmalloc.c
> > +++ b/mm/zsmalloc.c
> > @@ -299,7 +299,7 @@ static void zs_zpool_unmap(void *pool, unsigned long 
> > handle)
> >
> >  static u64 zs_zpool_total_size(void *pool)
> >  {
> > -   return zs_get_total_size_bytes(pool);
> > +   return zs_get_total_size(pool) << PAGE_SHIFT;
> >  }
> >
> >  static struct zpool_driver zs_zpool_driver = {
> > @@ -1186,16 +1186,16 @@ void zs_unmap_object(struct zs_pool *pool, unsigned 
> > long handle)
> >  }
> >  EXPORT_SYMBOL_GPL(zs_unmap_object);
> >
> > -u64 zs_get_total_size_bytes(struct zs_pool *pool)
> > +unsigned long zs_get_total_size(struct zs_pool *pool)
> >  {
> > -   u64 npages;
> > +   unsigned long npages;
> >
> > > spin_lock(&pool->stat_lock);
> > > npages = pool->pages_allocated;
> > > spin_unlock(&pool->stat_lock);
> > -   return npages << PAGE_SHIFT;
> > +   return npages;
> >  }
> > -EXPORT_SYMBOL_GPL(zs_get_total_size_bytes);
> > +EXPORT_SYMBOL_GPL(zs_get_total_size);
> >
> >  module_init(zs_init);
> >  module_exit(zs_exit);
> > --
> > 2.0.0
> >

Re: [PATCH v3 2/4] zsmalloc: change return value unit of zs_get_total_size_bytes

2014-08-21 Thread Minchan Kim

On Thu, Aug 21, 2014 at 02:53:57PM -0400, Dan Streetman wrote:
> On Wed, Aug 20, 2014 at 8:27 PM, Minchan Kim  wrote:
> > zs_get_total_size_bytes returns the amount of memory zsmalloc
> > has consumed in *bytes*, but zsmalloc operates in *pages*
> > rather than bytes, so let's change the API. The benefit
> > is that we avoid unnecessary overhead
> > (i.e., converting page units to byte units) in zsmalloc.
> >
> > Now, zswap can rollback to zswap_pool_pages.
> > Over to zswap guys ;-)
> 
> We could change zpool/zswap over to total pages instead of total
> bytes, since both zbud and zsmalloc now report size in pages.  The
> only downside would be if either changed later to not use only whole
> pages (or if they start using huge pages for storage...), but for what
> they do that seems unlikely.  After this patch is finalized I can
> write up a quick patch unless Seth disagrees (or already has a patch
> :)
> 
> >
> > Signed-off-by: Minchan Kim 
> > ---
> >  drivers/block/zram/zram_drv.c |  4 ++--
> >  include/linux/zsmalloc.h  |  2 +-
> >  mm/zsmalloc.c | 10 +-
> >  3 files changed, 8 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> > index d00831c3d731..302dd37bcea3 100644
> > --- a/drivers/block/zram/zram_drv.c
> > +++ b/drivers/block/zram/zram_drv.c
> > @@ -103,10 +103,10 @@ static ssize_t mem_used_total_show(struct device *dev,
> >
> > > down_read(&zram->init_lock);
> > > if (init_done(zram))
> > > -   val = zs_get_total_size_bytes(meta->mem_pool);
> > > +   val = zs_get_total_size(meta->mem_pool);
> > > up_read(&zram->init_lock);
> >
> > -   return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
> > +   return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
> >  }
> >
> >  static ssize_t max_comp_streams_show(struct device *dev,
> > diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
> > index e44d634e7fb7..105b56e45d23 100644
> > --- a/include/linux/zsmalloc.h
> > +++ b/include/linux/zsmalloc.h
> > @@ -46,6 +46,6 @@ void *zs_map_object(struct zs_pool *pool, unsigned long 
> > handle,
> > enum zs_mapmode mm);
> >  void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
> >
> > -u64 zs_get_total_size_bytes(struct zs_pool *pool);
> > +unsigned long zs_get_total_size(struct zs_pool *pool);
> 
> minor naming suggestion, but since the name is changing anyway,
> "zs_get_total_size" implies to me the units are bytes, would
> "zs_get_total_pages" be clearer that it's returning size in # of
> pages, not bytes?

It's better. Will change.
Thanks!
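
To make the unit split concrete, here is a small userspace sketch of the renamed interface. It assumes 4 KiB pages and uses hypothetical names (struct sketch_pool, sketch_get_total_pages, sketch_total_bytes) rather than the real zsmalloc symbols, and it leaves out the locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12   /* assumption: 4 KiB pages */

/* Hypothetical stand-in for the pool; only the page counter matters here. */
struct sketch_pool {
    unsigned long pages_allocated;
};

/* Report the pool size in pages, the allocator's native unit. */
unsigned long sketch_get_total_pages(const struct sketch_pool *pool)
{
    assert(pool != NULL);
    return pool->pages_allocated;
}

/* A caller that needs bytes widens to 64 bits first, then shifts. */
uint64_t sketch_total_bytes(const struct sketch_pool *pool)
{
    return (uint64_t)sketch_get_total_pages(pool) << SKETCH_PAGE_SHIFT;
}
```

Widening before the shift is a choice made for this sketch; it keeps the byte count from wrapping where unsigned long is 32 bits wide.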
