Re: linux-next: build failure after merge of the omap_dss2 tree

2013-04-25 Thread Stephen Rothwell
Hi Tomi,

On Fri, 26 Apr 2013 08:31:03 +0300 Tomi Valkeinen  wrote:
>
> On 2013-04-26 08:10, Stephen Rothwell wrote:
> > 
> > After merging the omap_dss2 tree, today's linux-next build (powerpc
> > ppc64_defconfig) failed like this:
> > 
> > drivers/video/ps3fb.c: In function 'ps3fb_mmap':
> > drivers/video/ps3fb.c:710:2: error: implicit declaration of function 
> > 'vm_ioremap_memory' [-Werror=implicit-function-declaration]
> > drivers/video/ps3fb.c:712:2: error: 'offset' undeclared (first use in this 
> > function)
> > 
> > Caused by commit 6ea19860d6c5 ("fbdev/ps3fb: use vm_iomap_memory()").
> > 
> > I have used the omap_dss2 tree from next-20130424 for today.
> > 
> 
> Thanks. Updated patch below. I couldn't right away find where to download a
> ppc64 toolchain, so not compile tested...

https://www.kernel.org/pub/tools/crosstool/

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




Re: [PATCH 1/2] lib: Add lz4 compressor module

2013-04-25 Thread Stephen Rothwell
Hi,

On Fri, 26 Apr 2013 14:02:01 +0900 "Chanho Min"  wrote:
>
> 
> @@ -0,0 +1,23 @@
> +#include 
> +
> +int __attribute__((weak)) __clzsi2(int val)

We have __weak in 

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




Re: For review (v2): user_namespaces(7) man page

2013-04-25 Thread richard -rw- weinberger
On Fri, Apr 26, 2013 at 2:54 AM, Eric W. Biederman
 wrote:
> richard -rw- weinberger  writes:
>
>> On Wed, Mar 27, 2013 at 10:26 PM, Michael Kerrisk (man-pages)
>>  wrote:
>>>Inside the user namespace, the shell has user and group  ID  0,
>>>and a full set of permitted and effective capabilities:
>>>
>>>bash$ cat /proc/$$/status | egrep '^[UG]id'
>>>Uid: 0000
>>>Gid: 0000
>>>bash$ cat /proc/$$/status | egrep '^Cap(Prm|Inh|Eff)'
>>>CapInh:   
>>>CapPrm:   001f
>>>CapEff:   001f
>>
>> I've tried your demo program, but inside the new ns I'm automatically nobody.
>> As Eric said, setuid(0)/setgid(0) are missing.
>
> Is it the setuid/setgid or not setting up the uid/gid map?

uid/gid mappings are set up.

>> Eric, maybe you can help me. How can I drop capabilities within a user
>> namespace?
>
>> In childFunc() I added prctl(PR_CAPBSET_DROP, CAP_NET_ADMIN), but it always
>> fails with EPERM.
>> Why is that? I thought I would get a completely fresh set of caps that I can
>> modify. I don't want uid 0 inside the container to have all caps.
>
> There are weird things that happen with exec and the user namespace.  If
> you have exec'd as an unmapped user all of your capabilities have
> already been dropped.

I've set up the mappings. If I look into /proc/*/status I see that my process
has all caps.
So, in general, is it possible to drop caps within a user namespace?
I really want to drop CAP_NET_ADMIN and some others.
root within my container must not change any networking settings.

>> And why does /proc/*/loginuid always contain 4294967295 in a new user 
>> namespace?
>> Writing to it also fails. (Noticed that because pam_loginuid.so does not 
>> work).
>
> Almost certainly because the loginuid has already been set.  Yes. It
> looks like I am simply using from_kuid instead of from_kuid_munged on
> the read.  So an unmapped loginuid will be reported as 4294967295.
>
> For some circumstances 65534 (nobody) is definitely better in some it is
> a toss up, and most of the time no one really cares.  So I have tried to
> do something but in this case I don't know which was the best policy.

Hmm, I had hoped that loginuid would be reset upon entering a user namespace.

>> Final question, is it by design that uid 0 within a namespace is not
>> allowed to write to
>> /proc/*/oom_score_adj?
>
> Essentially.  It is by design that uid 0 within a namespace be mapped to
> some other uid outside the namespace, and that the permissions on writes
> should use the permission needed outside of the user namespace.

Okay, I've asked because systemd is a heavy user of this file and
fails due to this
within a user namespace.
Luckily it is possible to remove all the score changes from the .service files.

--
Thanks,
//richard
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
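
The value 4294967295 reported for an unmapped loginuid in the exchange above is (uid_t)-1 printed as an unsigned 32-bit number, while 65534 is the conventional "nobody" overflow ID. Below is a minimal userspace sketch of the two reporting policies being contrasted, assuming a 32-bit uid type. The names report_uid, INVALID_UID32 and OVERFLOW_UID32 are hypothetical and chosen for illustration; this is not the kernel's from_kuid() or from_kuid_munged() implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants for illustration only. */
#define INVALID_UID32  ((uint32_t)-1)    /* prints as 4294967295 */
#define OVERFLOW_UID32 ((uint32_t)65534) /* conventional "nobody" */

/*
 * Report a uid to a reader inside a namespace.
 * mapped: whether the uid has a mapping in the reader's namespace.
 * munged: if true, an unmapped uid is shown as the overflow uid
 *         instead of the invalid uid.
 */
uint32_t report_uid(uint32_t uid, bool mapped, bool munged)
{
	if (mapped)
		return uid;
	return munged ? OVERFLOW_UID32 : INVALID_UID32;
}
```

With munged set to false an unmapped uid reads back as 4294967295, which matches what was observed in /proc/*/loginuid; with munged set to true the same uid would read back as 65534.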


Re: [PATCH 0/6] Davinci fbdev driver and enable it for DMx platform

2013-04-25 Thread Prabhakar Lad
Hi Laurent,

On Thu, Apr 25, 2013 at 2:32 AM, Laurent Pinchart
 wrote:
> Hi Prabhakar,
>
> Thank you for the patch.
>
> On Wednesday 24 April 2013 17:30:02 Prabhakar Lad wrote:
>> From: Lad, Prabhakar 
>>
>> This patch series adds an fbdev driver for Texas
>> Instruments Davinci SoC. The display subsystem consists
>> of OSD and VENC, with OSD supporting 2 RGB planes and
>> 2 video planes.
>> http://focus.ti.com/general/docs/lit/
>> getliterature.tsp?literatureNumber=sprue37d=pdf
>>
>> A good amount of the OSD and VENC enabling code is
>> present in the kernel, and this patch series adds the
>> fbdev interface.
>>
>> The fbdev driver exports 4 nodes representing each
>> plane to the user - from fb0 to fb3.
>
> The obvious question is: why not a KMS driver instead ? :-)
>
I did go through the KMS model (thanks for pointing to your work and the
video), and it looks like this would require a fair amount of development.
At this point I would go with the current implementation and revisit the
KMS model later.

Regards,
--Prabhakar


Re: [PATCH 07/21] oss/dmabuf: use dma_map_single

2013-04-25 Thread Takashi Iwai
At Thu, 25 Apr 2013 19:28:50 +0200,
Arnd Bergmann wrote:
> 
> The virt_to_bus/bus_to_virt functions have been deprecated
> for as long as I can remember, and they are used in very
> few remaining instances, usually in obscure ISA device
> drivers. The OSS sound drivers are the only ones that are
> still used on the ARM architecture, and only on some of
> the earliest StrongARM machines.
> 
> The problem for converting the OSS subsystem to use
> dma_map_single instead is that the caller of virt_to_bus
> does not have a device pointer, since the subsystem has
> never been ported to use the common device infrastructure.
> 
> Signed-off-by: Arnd Bergmann 
> Cc: Jaroslav Kysela 
> Cc: Takashi Iwai 
> Cc: alsa-de...@alsa-project.org

Applied now, but the only problem is that it's difficult to test it :)
Let's see.


Thanks!

Takashi

> ---
>  sound/oss/dmabuf.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/sound/oss/dmabuf.c b/sound/oss/dmabuf.c
> index bcc3e8e..a59c888 100644
> --- a/sound/oss/dmabuf.c
> +++ b/sound/oss/dmabuf.c
> @@ -114,7 +114,7 @@ static int sound_alloc_dmap(struct dma_buffparms *dmap)
>   }
>   }
>   dmap->raw_buf = start_addr;
> - dmap->raw_buf_phys = virt_to_bus(start_addr);
> + dmap->raw_buf_phys = dma_map_single(NULL, start_addr, dmap->buffsize, 
> DMA_BIDIRECTIONAL);
>  
>   for (page = virt_to_page(start_addr); page <= virt_to_page(end_addr); 
> page++)
>   SetPageReserved(page);
> @@ -139,6 +139,7 @@ static void sound_free_dmap(struct dma_buffparms *dmap)
>   for (page = virt_to_page(start_addr); page <= virt_to_page(end_addr); 
> page++)
>   ClearPageReserved(page);
>  
> + dma_unmap_single(NULL, dmap->raw_buf_phys, dmap->buffsize, 
> DMA_BIDIRECTIONAL);
>   free_pages((unsigned long) dmap->raw_buf, sz);
>   dmap->raw_buf = NULL;
>  }
> -- 
> 1.8.1.2
> 


Re: [PATCH 06/21] ALSA: ali5451: use mdelay instead of large udelay constants

2013-04-25 Thread Takashi Iwai
At Thu, 25 Apr 2013 19:28:49 +0200,
Arnd Bergmann wrote:
> 
> ARM cannot handle udelay for more than 2 milliseconds, so we
> should use mdelay instead for those.
> 
> Signed-off-by: Arnd Bergmann 
> Cc: Takashi Iwai 
> Cc: alsa-de...@alsa-project.org

Thanks, applied to sound git tree.


Takashi


> ---
>  sound/pci/ali5451/ali5451.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/sound/pci/ali5451/ali5451.c b/sound/pci/ali5451/ali5451.c
> index e760af9..53754f5 100644
> --- a/sound/pci/ali5451/ali5451.c
> +++ b/sound/pci/ali5451/ali5451.c
> @@ -451,10 +451,10 @@ static int snd_ali_reset_5451(struct snd_ali *codec)
>   if (pci_dev) {
>   pci_read_config_dword(pci_dev, 0x7c, &dwVal);
>   pci_write_config_dword(pci_dev, 0x7c, dwVal | 0x0800);
> - udelay(5000);
> + mdelay(5);
>   pci_read_config_dword(pci_dev, 0x7c, &dwVal);
>   pci_write_config_dword(pci_dev, 0x7c, dwVal & 0xf7ff);
> - udelay(5000);
> + mdelay(5);
>   }
>   
>   pci_dev = codec->pci;
> @@ -463,14 +463,14 @@ static int snd_ali_reset_5451(struct snd_ali *codec)
>   udelay(500);
>   pci_read_config_dword(pci_dev, 0x44, &dwVal);
>   pci_write_config_dword(pci_dev, 0x44, dwVal & 0xfffb);
> - udelay(5000);
> + mdelay(5);
>   
>   wCount = 200;
>   while(wCount--) {
>   wReg = snd_ali_codec_peek(codec, 0, AC97_POWERDOWN);
>   if ((wReg & 0x000f) == 0x000f)
>   return 0;
> - udelay(5000);
> + mdelay(5);
>   }
>  
>   /* non-fatal if you have a non PM capable codec */
> -- 
> 1.8.1.2
> 
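
The patch above swaps udelay(5000) for mdelay(5) because, per its commit message, ARM cannot handle a udelay of more than 2 milliseconds. Below is a userspace sketch of that rule as arithmetic: a requested microsecond delay is split into a millisecond part and a microsecond remainder once it exceeds the limit. The names split_delay, struct delay_plan and MAX_UDELAY_US are hypothetical, and the 2000 limit is taken from the commit message rather than from any kernel header.

```c
#include <stdint.h>

/* Assumed limit from the commit message (2 ms); not a kernel constant. */
#define MAX_UDELAY_US 2000u

struct delay_plan {
	uint32_t ms; /* part to hand to an mdelay()-style call */
	uint32_t us; /* remainder for a udelay()-style call */
};

/* Split a delay so the microsecond part never exceeds the limit. */
struct delay_plan split_delay(uint32_t total_us)
{
	struct delay_plan plan = { 0, total_us };

	if (total_us > MAX_UDELAY_US) {
		plan.ms = total_us / 1000u;
		plan.us = total_us % 1000u;
	}
	return plan;
}
```

For the 5000 microsecond delays touched by the patch this gives 5 ms and no remainder, while the udelay(500) call that the patch leaves alone stays a pure microsecond delay.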


Re: linux-next: build failure after merge of the omap_dss2 tree

2013-04-25 Thread Tomi Valkeinen
On 2013-04-26 08:10, Stephen Rothwell wrote:
> Hi Tomi,
> 
> After merging the omap_dss2 tree, today's linux-next build (powerpc
> ppc64_defconfig) failed like this:
> 
> drivers/video/ps3fb.c: In function 'ps3fb_mmap':
> drivers/video/ps3fb.c:710:2: error: implicit declaration of function 
> 'vm_ioremap_memory' [-Werror=implicit-function-declaration]
> drivers/video/ps3fb.c:712:2: error: 'offset' undeclared (first use in this 
> function)
> 
> Caused by commit 6ea19860d6c5 ("fbdev/ps3fb: use vm_iomap_memory()").
> 
> I have used the omap_dss2 tree from next-20130424 for today.
> 

Thanks. Updated patch below. I couldn't right away find where to download a
ppc64 toolchain, so not compile tested...

 Tomi


From 11bd5933abe033fb7a3a0d1f1bd2cb4b6df8143f Mon Sep 17 00:00:00 2001
From: Tomi Valkeinen 
Date: Thu, 18 Apr 2013 07:52:42 +0300
Subject: [PATCH] fbdev/ps3fb: use vm_iomap_memory()

Use vm_iomap_memory() instead of [io_]remap_pfn_range().
vm_iomap_memory() gives us much simpler API to map memory to userspace,
and reduces possibilities for bugs.

Signed-off-by: Tomi Valkeinen 
Cc: Geert Uytterhoeven 
---
 drivers/video/ps3fb.c |   18 ++
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/video/ps3fb.c b/drivers/video/ps3fb.c
index 920c27b..d9f08c6 100644
--- a/drivers/video/ps3fb.c
+++ b/drivers/video/ps3fb.c
@@ -705,21 +705,15 @@ static int ps3fb_pan_display(struct fb_var_screeninfo 
*var,
 
 static int ps3fb_mmap(struct fb_info *info, struct vm_area_struct *vma)
 {
-   unsigned long size, offset;
+   int r;
 
-   size = vma->vm_end - vma->vm_start;
-   offset = vma->vm_pgoff << PAGE_SHIFT;
-   if (offset + size > info->fix.smem_len)
-   return -EINVAL;
-
-   offset += info->fix.smem_start;
-   if (remap_pfn_range(vma, vma->vm_start, offset >> PAGE_SHIFT,
-   size, vma->vm_page_prot))
-   return -EAGAIN;
+   r = vm_iomap_memory(vma, info->fix.smem_start, info->fix.smem_len);
 
dev_dbg(info->device, "ps3fb: mmap framebuffer P(%lx)->V(%lx)\n",
-   offset, vma->vm_start);
-   return 0;
+   info->fix.smem_start + (vma->vm_pgoff << PAGE_SHIFT),
+   vma->vm_start);
+
+   return r;
 }
 
 /*
-- 
1.7.10.4
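
The lines removed by the patch above open-code a range check (page offset plus mapping size against smem_len) before calling remap_pfn_range(). Below is a userspace sketch of that kind of check, written so that neither the shift nor the addition can wrap. The names mmap_range_ok and SKETCH_PAGE_SHIFT are hypothetical and a 4 KiB page size is assumed; this illustrates the removed check and does not reproduce vm_iomap_memory().

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumed 4 KiB pages */

/*
 * Return true if a mapping of `size` bytes that starts `pgoff` pages
 * into a buffer fits inside a buffer of `buf_len` bytes.
 */
bool mmap_range_ok(uint64_t pgoff, uint64_t size, uint64_t buf_len)
{
	uint64_t offset;

	if (pgoff > (UINT64_MAX >> SKETCH_PAGE_SHIFT))
		return false; /* the shift below would overflow */
	offset = pgoff << SKETCH_PAGE_SHIFT;
	if (offset > buf_len)
		return false;
	/* Equivalent to offset + size <= buf_len, without wrapping. */
	return size <= buf_len - offset;
}
```

Keeping such a check in one helper means each driver no longer has to repeat it, which is the reduction in "possibilities for bugs" the commit message refers to.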







linux-next: manual merge of the trivial tree with the arm tree

2013-04-25 Thread Stephen Rothwell
Hi Jiri,

Today's linux-next merge of the trivial tree got a conflict in
arch/arm/kvm/arm.c between commit 3414bbfff98b ("ARM: KVM: move exit
handler selection to a separate file") from the arm tree and commit
b23f7a09f935 ("treewide: Fix typo in printk and comments") from the
trivial tree.

The former removed the code that the latter fixed, so I did that and can
carry the fix as necessary (no action is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




Re: [PATCH 1/2] acpi: video: add function to support unregister backlight

2013-04-25 Thread Aaron Lu

On 04/22/2013 08:39 PM, Chun-Yi Lee wrote:

From: "Lee, Chun-Yi" 

There are situations where a downstream driver unregisters the whole acpi/video
driver when it only wants to remove the backlight control interface of
acpi/video. This causes us to lose the other functions of acpi/video, e.g.
translating ACPI events into input events.

So this patch adds a new function, find_video_unregister_backlight, which
provides an interface that lets a downstream driver tell acpi/video to
unregister the backlight interface of all ACPI video devices. That way we keep
the other functions of acpi/video and remove only backlight support.

Reference: bko#35622
 https://bugzilla.kernel.org/show_bug.cgi?id=35622

Tested-by: Andrzej Krentosz 
Cc: Carlos Corbacho 
Cc: Matthew Garrett 
Cc: Dmitry Torokhov 
Cc: Corentin Chary 
Cc: Rafael J. Wysocki 
Cc: Aaron Lu 
Cc: Thomas Renninger 
Signed-off-by: Lee, Chun-Yi 
---
  drivers/acpi/video.c | 46 ++
  include/acpi/video.h |  2 ++
  2 files changed, 48 insertions(+)

diff --git a/drivers/acpi/video.c b/drivers/acpi/video.c
index 313f959..acd2e7a 100644
--- a/drivers/acpi/video.c
+++ b/drivers/acpi/video.c
@@ -1793,6 +1793,52 @@ static int __init intel_opregion_present(void)
return opregion;
  }

+static acpi_status
+find_video_unregister_backlight(acpi_handle handle, u32 lvl, void *context,
+   void **rv)
+{
+   struct acpi_device *acpi_dev;
+   struct acpi_video_bus *video = NULL;
+   struct acpi_video_device *dev, *next;
+
+   if (acpi_bus_get_device(handle, &acpi_dev))
+   return AE_OK;
+
+   if (!acpi_match_device_ids(acpi_dev, video_device_ids)) {
+   video = acpi_driver_data(acpi_dev);
+   acpi_video_bus_stop_devices(video);
+   mutex_lock(&video->device_list_lock);
+   list_for_each_entry_safe(dev, next, &video->video_device_list,
+   entry) {
+   if (dev->backlight) {
+   backlight_device_unregister(dev->backlight);
+   dev->backlight = NULL;
+   kfree(dev->brightness->levels);
+   kfree(dev->brightness);
+   }


The cooling_dev should also be unregistered I think.

Thanks,
Aaron


+   }
+   mutex_unlock(&video->device_list_lock);
+   acpi_video_bus_start_devices(video);
+   }
+   return AE_OK;
+}
+
+void acpi_video_backlight_unregister(void)
+{
+   if (!register_count) {
+   /*
+* If the acpi video bus is already unloaded, don't
+* unregister backlight of devices and return directly.
+*/
+   return;
+   }
+   acpi_walk_namespace(ACPI_TYPE_DEVICE, ACPI_ROOT_OBJECT,
+   ACPI_UINT32_MAX, find_video_unregister_backlight,
+   NULL, NULL, NULL);
+   return;
+}
+EXPORT_SYMBOL(acpi_video_backlight_unregister);
+
  int acpi_video_register(void)
  {
int result = 0;
diff --git a/include/acpi/video.h b/include/acpi/video.h
index 61109f2..1e810a1 100644
--- a/include/acpi/video.h
+++ b/include/acpi/video.h
@@ -19,11 +19,13 @@ struct acpi_device;
  #if (defined CONFIG_ACPI_VIDEO || defined CONFIG_ACPI_VIDEO_MODULE)
  extern int acpi_video_register(void);
  extern void acpi_video_unregister(void);
+extern void acpi_video_backlight_unregister(void);
  extern int acpi_video_get_edid(struct acpi_device *device, int type,
   int device_id, void **edid);
  #else
  static inline int acpi_video_register(void) { return 0; }
  static inline void acpi_video_unregister(void) { return; }
+static inline void acpi_video_backlight_unregister(void) { return; }
  static inline int acpi_video_get_edid(struct acpi_device *device, int type,
  int device_id, void **edid)
  {





Re: [PATCH] x86: add phys addr validity check for /dev/mem mmap

2013-04-25 Thread Will Huck

Hi Peter,
On 04/02/2013 08:28 PM, Frantisek Hrbata wrote:

When CR4.PAE is set, 64b PTEs are used (ARCH_PHYS_ADDR_T_64BIT is set for
X86_64 || X86_PAE). According to [1] Chapter 4 Paging, some higher bits in 64b
PTE are reserved and have to be set to zero. For example, for IA-32e and 4KB
page [1] 4.5 IA-32e Paging: Table 4-19, bits 51-M(MAXPHYADDR) are reserved. So
for a CPU with e.g. 48bit phys addr width, bits 51-48 have to be zero. If one of
the reserved bits is set, [1] 4.7 Page-Fault Exceptions, the #PF is generated
with RSVD error code.


RSVD flag (bit 3).
This flag is 1 if there is no valid translation for the linear address because a
reserved bit was set in one of the paging-structure entries used to translate
that address. (Because reserved bits are not checked in a paging-structure entry
whose P flag is 0, bit 3 of the error code can be set only if bit 0 is also
set.)


In mmap_mem() the first check is valid_mmap_phys_addr_range(), but it always
returns 1 on x86. So it's possible to use any pgoff we want and to set the PTE's
reserved bits in remap_pfn_range(). Meaning there is a possibility to use mmap


In this case, remap_pfn_range() sets up the mapping, including the reserved
bits, for the mmio memory, so the mmio memory is already populated. Why would
a #PF be triggered?



on /dev/mem and cause system panic. It's probably not that serious, because
access to /dev/mem is limited and the system has to have panic_on_oops set, but
still I think we should check this and return error.

This patch adds check for x86 when ARCH_PHYS_ADDR_T_64BIT is set, the same way
as it is already done in e.g. ioremap. With this fix mmap returns -EINVAL if the
requested phys addr is bigger than the supported phys addr width.

[1] Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A

Signed-off-by: Frantisek Hrbata 
---
  arch/x86/include/asm/io.h |  4 
  arch/x86/mm/mmap.c| 13 +
  2 files changed, 17 insertions(+)

diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index d8e8eef..39607c6 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -242,6 +242,10 @@ static inline void flush_write_buffers(void)
  #endif
  }
  
+#define ARCH_HAS_VALID_PHYS_ADDR_RANGE

+extern int valid_phys_addr_range(phys_addr_t addr, size_t count);
+extern int valid_mmap_phys_addr_range(unsigned long pfn, size_t count);
+
  #endif /* __KERNEL__ */
  
  extern void native_io_delay(void);

diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c
index 845df68..92ec31c 100644
--- a/arch/x86/mm/mmap.c
+++ b/arch/x86/mm/mmap.c
@@ -31,6 +31,8 @@
  #include 
  #include 
  
+#include "physaddr.h"

+
  struct __read_mostly va_alignment va_align = {
.flags = -1,
  };
@@ -122,3 +124,14 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
mm->unmap_area = arch_unmap_area_topdown;
}
  }
+
+int valid_phys_addr_range(phys_addr_t addr, size_t count)
+{
+   return addr + count <= __pa(high_memory);
+}
+
+int valid_mmap_phys_addr_range(unsigned long pfn, size_t count)
+{
+   resource_size_t addr = (pfn << PAGE_SHIFT) + count;
+   return phys_addr_valid(addr);
+}
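
The check described in the commit message above amounts to rejecting any physical address that has bits set at or above the CPU's physical address width. Below is a userspace sketch of that test under stated assumptions: the width is passed in as phys_bits (in the kernel it would come from CPU data), pages are assumed to be 4 KiB, and the names addr_fits, mmap_pfn_range_ok and SKETCH_PAGE_SHIFT are hypothetical. It is not the kernel's phys_addr_valid().

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumed 4 KiB pages */

/* True if addr has no bits set at or above bit `phys_bits`. */
bool addr_fits(uint64_t addr, unsigned int phys_bits)
{
	if (phys_bits >= 64)
		return true;
	return (addr >> phys_bits) == 0;
}

/*
 * True if the end of the range that starts at page frame `pfn` and
 * spans `count` bytes still fits in the physical address width.
 */
bool mmap_pfn_range_ok(uint64_t pfn, uint64_t count, unsigned int phys_bits)
{
	uint64_t start;

	if (pfn > (UINT64_MAX >> SKETCH_PAGE_SHIFT))
		return false; /* the shift below would overflow */
	start = pfn << SKETCH_PAGE_SHIFT;
	if (count > UINT64_MAX - start)
		return false; /* start + count would wrap */
	return addr_fits(start + count, phys_bits);
}
```

With a 48-bit width, as in the example from the commit message, any range whose end reaches bit 48 or above is refused, which corresponds to mmap returning -EINVAL.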




linux-next: build failure after merge of the omap_dss2 tree

2013-04-25 Thread Stephen Rothwell
Hi Tomi,

After merging the omap_dss2 tree, today's linux-next build (powerpc
ppc64_defconfig) failed like this:

drivers/video/ps3fb.c: In function 'ps3fb_mmap':
drivers/video/ps3fb.c:710:2: error: implicit declaration of function 
'vm_ioremap_memory' [-Werror=implicit-function-declaration]
drivers/video/ps3fb.c:712:2: error: 'offset' undeclared (first use in this 
function)

Caused by commit 6ea19860d6c5 ("fbdev/ps3fb: use vm_iomap_memory()").

I have used the omap_dss2 tree from next-20130424 for today.
-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




[PATCH -next] x86, doc: Add LZ4 magic number for the new compression

2013-04-25 Thread Kyungsik Lee
Documentation/x86/boot.txt is updated to list the LZ4 magic number.
This LZ4 magic number is used for the new compression format.

Signed-off-by: Kyungsik Lee 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: Thomas Gleixner 
Cc: Russell King 
Cc: Borislav Petkov 
Cc: Florian Fainelli 
Cc: Yann Collet 
---
 Documentation/x86/boot.txt | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/Documentation/x86/boot.txt b/Documentation/x86/boot.txt
index 3840b6f..fc66d42 100644
--- a/Documentation/x86/boot.txt
+++ b/Documentation/x86/boot.txt
@@ -657,9 +657,10 @@ Protocol:  2.08+
   uncompressed data should be determined using the standard magic
   numbers.  The currently supported compression formats are gzip
   (magic numbers 1F 8B or 1F 9E), bzip2 (magic number 42 5A), LZMA
-  (magic number 5D 00), and XZ (magic number FD 37).  The uncompressed
-  payload is currently always ELF (magic number 7F 45 4C 46).
-  
+  (magic number 5D 00), XZ (magic number FD 37), and LZ4 (magic number
+  02 21).  The uncompressed payload is currently always ELF (magic
+  number 7F 45 4C 46).
+
 Field name:payload_length
 Type:  read
 Offset/size:   0x24c/4
-- 
1.8.1.1
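
The boot.txt hunk above lists the magic numbers used to recognize the payload format. Below is a userspace sketch of how a loader might match a buffer against that list. The byte values are the ones given in the patch; the names struct magic, magics and identify_payload are hypothetical, and this is not code from the kernel's decompressor.

```c
#include <stddef.h>
#include <string.h>

struct magic {
	const char *name;
	unsigned char bytes[4];
	size_t len;
};

/* Magic numbers as listed in Documentation/x86/boot.txt by the patch. */
static const struct magic magics[] = {
	{ "gzip",  { 0x1f, 0x8b }, 2 },
	{ "gzip",  { 0x1f, 0x9e }, 2 },
	{ "bzip2", { 0x42, 0x5a }, 2 },
	{ "lzma",  { 0x5d, 0x00 }, 2 },
	{ "xz",    { 0xfd, 0x37 }, 2 },
	{ "lz4",   { 0x02, 0x21 }, 2 },
	{ "elf",   { 0x7f, 0x45, 0x4c, 0x46 }, 4 },
};

/* Return the format name whose magic prefixes buf, or "unknown". */
const char *identify_payload(const unsigned char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < sizeof(magics) / sizeof(magics[0]); i++) {
		if (len >= magics[i].len &&
		    memcmp(buf, magics[i].bytes, magics[i].len) == 0)
			return magics[i].name;
	}
	return "unknown";
}
```

A table keeps the addition of a new format, such as LZ4 here, down to a single new row.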



Re: [PATCH 1/2] lib: Add lz4 compressor module

2013-04-25 Thread Chanho Min

>> gcc seems to define __builtin_clz as __clzsi2 in some architecture.
>> But, kernel doesn't link libgcc.a.
>> If kernel should use gcc's built-in function without libgcc.a,
>> do we need to port __clzsi2 to 'arch/*/lib/*'?
>
>This breaks alpha (gcc-4.4.4) as well.  Can we please get this fixed
>promptly?

__clzsi2 can be implemented using generic functions.
Because it is declared weak, it can be overridden by linking an arch-specific
version on architectures that implement one.
Does this approach look acceptable?

diff --git a/lib/Makefile b/lib/Makefile
index af79e8c..e17b3ee 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -23,7 +23,7 @@ lib-y += kobject.o klist.o

 obj-y += bcd.o div64.o sort.o parser.o halfmd4.o debug_locks.o random32.o \
   bust_spinlocks.o hexdump.o kasprintf.o bitmap.o scatterlist.o \
-  gcd.o lcm.o list_sort.o uuid.o flex_array.o \
+  gcd.o lcm.o list_sort.o uuid.o flex_array.o clz.o\
   bsearch.o find_last_bit.o find_next_bit.o llist.o memweight.o kfifo.o
 obj-y += string_helpers.o
 obj-$(CONFIG_TEST_STRING_HELPERS) += test-string_helpers.o
diff --git a/lib/clz.c b/lib/clz.c
index e69de29..6794b83 100644
--- a/lib/clz.c
+++ b/lib/clz.c
@@ -0,0 +1,23 @@
+#include 
+
+int __attribute__((weak)) __clzsi2(int val)
+{
+ return 32 - fls(val);
+}
+EXPORT_SYMBOL(__clzsi2);
+
+#if BITS_PER_LONG == 32
+int __attribute__((weak)) __clzdi2(long val)
+{
+ return BITS_PER_LONG - fls((int)val);
+}
+EXPORT_SYMBOL(__clzdi2);
+#elif BITS_PER_LONG == 64
+int __attribute__((weak)) __clzdi2(long val)
+{
+ return BITS_PER_LONG - fls64((u64)val);
+}
+EXPORT_SYMBOL(__clzdi2);
+#else
+#error BITS_PER_LONG not 32 or 64
+#endif
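
The helper proposed above derives count-leading-zeros from fls() ("find last set"). Below is a userspace sketch of that relation for a 32-bit value. The names sketch_fls32 and sketch_clz32 are hypothetical stand-ins, and the loop is a plain portable version rather than the kernel's fls(). Note that the subtraction uses the operand width (32), which differs from the machine word size on 64-bit systems.

```c
#include <stdint.h>

/* Find last set: 1-based index of the highest set bit, 0 if x is 0. */
int sketch_fls32(uint32_t x)
{
	int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

/* Count leading zeros of a 32-bit value; yields 32 for x == 0. */
int sketch_clz32(uint32_t x)
{
	return 32 - sketch_fls32(x);
}
```

The zero case is defined here only because the sketch needs some result; GCC documents __builtin_clz(0) as undefined.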



Re: [PATCH 2/3] f2fs: add a tracepoint on f2fs_new_inode

2013-04-25 Thread Namjae Jeon
2013/4/25, Jaegeuk Kim :
> This can help when debugging the free nid allocation flows.
>
> Signed-off-by: Jaegeuk Kim 
Yes, Agreed also.
Reviewed-by: Namjae Jeon 

Thanks.


Re: [PATCH] process cputimer is moving faster than its corresponding clock

2013-04-25 Thread Olivier Langlois
On Fri, 2013-04-19 at 11:08 -0700, KOSAKI Motohiro wrote:
> On Fri, Apr 19, 2013 at 10:38 AM, KOSAKI Motohiro
>  wrote:
> >> I feel we are hitting the same issue than this patch:
> >> https://lkml.org/lkml/2013/4/5/116
> >>
> >> I'm adding Kosaki in Cc, who proposed roughly the same fix.
> >
> > Thanks to CCing. I'm now sitting LSF and I can't read whole tons emails.
> > However the fix is definitely same and I definitely agree this approach.
> >
> > thank you.
> 
> And if I understand correctly, update_gt_cputime() is no longer
> necessary after this patch because time never makes backward.
> 
> What do you think?

Kosaki, I would tend to say that what you propose is correct. After having
added the task deltas, I was puzzled to see the cputimer still moving
faster than the process clock. I was seeing it with the help of a
printk statement inside update_gt_cputime().

After nailing down the last remaining cause of that inside sched/core.c,
I have not seen the cputimer running ahead since.




Re: Linux 3.0.75

2013-04-25 Thread Greg KH

diff --git a/Makefile b/Makefile
index 71e8efa..30ad2fe 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
 VERSION = 3
 PATCHLEVEL = 0
-SUBLEVEL = 74
+SUBLEVEL = 75
 EXTRAVERSION =
 NAME = Sneaky Weasel
 
diff --git a/arch/arm/mm/cache-feroceon-l2.c b/arch/arm/mm/cache-feroceon-l2.c
index e0b0e7a..09f8851 100644
--- a/arch/arm/mm/cache-feroceon-l2.c
+++ b/arch/arm/mm/cache-feroceon-l2.c
@@ -342,6 +342,7 @@ void __init feroceon_l2_init(int __l2_wt_override)
outer_cache.inv_range = feroceon_l2_inv_range;
outer_cache.clean_range = feroceon_l2_clean_range;
outer_cache.flush_range = feroceon_l2_flush_range;
+   outer_cache.inv_all = l2_inv_all;
 
enable_l2();
 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d2ac8e2..1eb45de 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -391,8 +391,8 @@ struct kvm_vcpu_arch {
gpa_t time;
struct pvclock_vcpu_time_info hv_clock;
unsigned int hw_tsc_khz;
-   unsigned int time_offset;
-   struct page *time_page;
+   struct gfn_to_hva_cache pv_time;
+   bool pv_time_enabled;
u64 last_guest_tsc;
u64 last_kernel_ns;
u64 last_tsc_nsec;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e329dc5..15e79a6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1073,7 +1073,6 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
 {
unsigned long flags;
   struct kvm_vcpu_arch *vcpu = &v->arch;
-   void *shared_kaddr;
unsigned long this_tsc_khz;
s64 kernel_ns, max_kernel_ns;
u64 tsc_timestamp;
@@ -1109,7 +1108,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
 
local_irq_restore(flags);
 
-   if (!vcpu->time_page)
+   if (!vcpu->pv_time_enabled)
return 0;
 
/*
@@ -1167,14 +1166,9 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
 */
vcpu->hv_clock.version += 2;
 
-   shared_kaddr = kmap_atomic(vcpu->time_page, KM_USER0);
-
-   memcpy(shared_kaddr + vcpu->time_offset, &vcpu->hv_clock,
-  sizeof(vcpu->hv_clock));
-
-   kunmap_atomic(shared_kaddr, KM_USER0);
-
-   mark_page_dirty(v->kvm, vcpu->time >> PAGE_SHIFT);
+   kvm_write_guest_cached(v->kvm, &vcpu->pv_time,
+   &vcpu->hv_clock,
+   sizeof(vcpu->hv_clock));
return 0;
 }
 
@@ -1454,7 +1448,8 @@ static int kvm_pv_enable_async_pf(struct kvm_vcpu *vcpu, 
u64 data)
return 0;
}
 
-   if (kvm_gfn_to_hva_cache_init(vcpu->kvm, &vcpu->arch.apf.data, gpa))
+   if (kvm_gfn_to_hva_cache_init(vcpu->kvm, &vcpu->arch.apf.data, gpa,
+   sizeof(u32)))
return 1;
 
vcpu->arch.apf.send_user_only = !(data & KVM_ASYNC_PF_SEND_ALWAYS);
@@ -1464,10 +1459,7 @@ static int kvm_pv_enable_async_pf(struct kvm_vcpu *vcpu, 
u64 data)
 
 static void kvmclock_reset(struct kvm_vcpu *vcpu)
 {
-   if (vcpu->arch.time_page) {
-   kvm_release_page_dirty(vcpu->arch.time_page);
-   vcpu->arch.time_page = NULL;
-   }
+   vcpu->arch.pv_time_enabled = false;
 }
 
 int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
@@ -1527,6 +1519,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, 
u64 data)
break;
case MSR_KVM_SYSTEM_TIME_NEW:
case MSR_KVM_SYSTEM_TIME: {
+   u64 gpa_offset;
kvmclock_reset(vcpu);
 
vcpu->arch.time = data;
@@ -1536,16 +1529,14 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, 
u64 data)
if (!(data & 1))
break;
 
-   /* ...but clean it before doing the actual write */
-   vcpu->arch.time_offset = data & ~(PAGE_MASK | 1);
+   gpa_offset = data & ~(PAGE_MASK | 1);
 
-   vcpu->arch.time_page =
-   gfn_to_page(vcpu->kvm, data >> PAGE_SHIFT);
-
-   if (is_error_page(vcpu->arch.time_page)) {
-   kvm_release_page_clean(vcpu->arch.time_page);
-   vcpu->arch.time_page = NULL;
-   }
+   if (kvm_gfn_to_hva_cache_init(vcpu->kvm,
+&vcpu->arch.pv_time, data & ~1ULL,
+sizeof(struct pvclock_vcpu_time_info)))
+   vcpu->arch.pv_time_enabled = false;
+   else
+   vcpu->arch.pv_time_enabled = true;
break;
}
case MSR_KVM_ASYNC_PF_EN:
@@ -6252,6 +6243,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
   if (!zalloc_cpumask_var(&vcpu->arch.wbinvd_dirty_mask, GFP_KERNEL))
goto fail_free_mce_banks;
 
+   vcpu->arch.pv_time_enabled = false;
kvm_async_pf_hash_reset(vcpu);
 
return 0;
diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index 62122a1..fed2868 

Linux 3.0.75

2013-04-25 Thread Greg KH
I'm announcing the release of the 3.0.75 kernel.

All users of the 3.0 kernel series must upgrade.

The updated 3.0.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git 
linux-3.0.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Makefile  |2 -
 arch/arm/mm/cache-feroceon-l2.c   |1 
 arch/x86/include/asm/kvm_host.h   |4 +-
 arch/x86/kvm/x86.c|   40 -
 crypto/algif_hash.c   |2 +
 crypto/algif_skcipher.c   |1 
 drivers/char/hpet.c   |   14 ---
 drivers/gpu/vga/vga_switcheroo.c  |3 +
 drivers/mtd/mtdchar.c |   32 +
 drivers/net/can/sja1000/sja1000_of_platform.c |   31 
 drivers/net/wireless/ath/ath9k/htc_drv_init.c |2 -
 drivers/video/console/fbcon.c |   11 -
 drivers/video/fbmem.c |   42 --
 fs/btrfs/tree-log.c   |   48 ++
 fs/hfsplus/extents.c  |2 -
 fs/sysfs/dir.c|   14 ---
 include/linux/kvm_host.h  |2 -
 include/linux/kvm_types.h |1 
 include/linux/mm.h|2 +
 kernel/events/core.c  |2 -
 kernel/hrtimer.c  |3 -
 kernel/sched.c|6 ++-
 kernel/signal.c   |2 -
 mm/hugetlb.c  |   12 +-
 mm/memory.c   |   47 +
 net/8021q/vlan.c  |   14 +++
 sound/core/pcm_native.c   |   12 +-
 virt/kvm/ioapic.c |7 ++-
 virt/kvm/kvm_main.c   |   39 -
 29 files changed, 227 insertions(+), 171 deletions(-)

Andrew Honig (1):
  KVM: Allow cross page reads and writes from cached translations.

Andy Honig (3):
  KVM: x86: fix for buffer overflow in handling of MSR_KVM_SYSTEM_TIME 
(CVE-2013-1796)
  KVM: x86: Convert MSR_KVM_SYSTEM_TIME to use gfn_to_hva_cache functions 
(CVE-2013-1797)
  KVM: Fix bounds checking in ioapic indirect register reads (CVE-2013-1798)

Christoph Fritz (1):
  can: sja1000: fix handling on dt properties on little endian systems

Dave Airlie (1):
  fbcon: fix locking harder

Emese Revfy (1):
  kernel/signal.c: stop info leak via the tkill and the tgkill syscalls

Felix Fietkau (1):
  ath9k_htc: accept 1.x firmware newer than 1.3

Greg Kroah-Hartman (2):
  Revert "8021q: fix a potential use-after-free"
  Linux 3.0.75

Illia Ragozin (1):
  ARM: 7696/1: Fix kexec by setting outer_cache.inv_all for Feroceon

Jiri Kosina (1):
  Revert "sysfs: fix race between readdir and lseek"

Josef Bacik (1):
  Btrfs: make sure nbytes are right after log replay

Linus Torvalds (5):
  vm: add vm_iomap_memory() helper function
  vm: convert snd_pcm_lib_mmap_iomem() to vm_iomap_memory() helper
  vm: convert fb_mmap to vm_iomap_memory() helper
  vm: convert HPET mmap to vm_iomap_memory() helper
  vm: convert mtdchar mmap to vm_iomap_memory() helper

Mathias Krause (1):
  crypto: algif - suppress sending source address information in recvmsg

Michael Bohan (1):
  hrtimer: Don't reinitialize a cpu_base lock on CPU_UP

Naoya Horiguchi (1):
  hugetlbfs: add swap entry check in follow_hugetlb_page()

Tejun Heo (1):
  sched: Convert BUG_ON()s in try_to_wake_up_local() to WARN_ON_ONCE()s

Tommi Rantala (1):
  perf: Treat attr.config as u64 in perf_swevent_init()

Vyacheslav Dubeyko (1):
  hfsplus: fix potential overflow in hfsplus_file_truncate()





Re: Linux 3.4.42

2013-04-25 Thread Greg KH
diff --git a/Makefile b/Makefile
index 90c3a6f..35c00db 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
 VERSION = 3
 PATCHLEVEL = 4
-SUBLEVEL = 41
+SUBLEVEL = 42
 EXTRAVERSION =
 NAME = Saber-toothed Squirrel
 
diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index 186c8cb..85d6332 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -319,7 +319,10 @@ validate_event(struct pmu_hw_events *hw_events,
struct hw_perf_event fake_event = event->hw;
struct pmu *leader_pmu = event->group_leader->pmu;
 
-   if (event->pmu != leader_pmu || event->state <= PERF_EVENT_STATE_OFF)
+   if (event->pmu != leader_pmu || event->state < PERF_EVENT_STATE_OFF)
+   return 1;
+
+   if (event->state == PERF_EVENT_STATE_OFF && !event->attr.enable_on_exec)
return 1;
 
return armpmu->get_event_idx(hw_events, &fake_event) >= 0;
diff --git a/arch/arm/mm/cache-feroceon-l2.c b/arch/arm/mm/cache-feroceon-l2.c
index dd3d591..48bc3c0 100644
--- a/arch/arm/mm/cache-feroceon-l2.c
+++ b/arch/arm/mm/cache-feroceon-l2.c
@@ -343,6 +343,7 @@ void __init feroceon_l2_init(int __l2_wt_override)
outer_cache.inv_range = feroceon_l2_inv_range;
outer_cache.clean_range = feroceon_l2_clean_range;
outer_cache.flush_range = feroceon_l2_flush_range;
+   outer_cache.inv_all = l2_inv_all;
 
enable_l2();
 
diff --git a/arch/arm/mm/proc-arm920.S b/arch/arm/mm/proc-arm920.S
index cb941ae..aeeb126 100644
--- a/arch/arm/mm/proc-arm920.S
+++ b/arch/arm/mm/proc-arm920.S
@@ -383,7 +383,7 @@ ENTRY(cpu_arm920_set_pte_ext)
 /* Suspend/resume support: taken from arch/arm/plat-s3c24xx/sleep.S */
 .globl cpu_arm920_suspend_size
 .equ   cpu_arm920_suspend_size, 4 * 3
-#ifdef CONFIG_PM_SLEEP
+#ifdef CONFIG_ARM_CPU_SUSPEND
 ENTRY(cpu_arm920_do_suspend)
stmfd   sp!, {r4 - r6, lr}
mrc p15, 0, r4, c13, c0, 0  @ PID
diff --git a/arch/arm/mm/proc-arm926.S b/arch/arm/mm/proc-arm926.S
index 820259b..ee29dc4 100644
--- a/arch/arm/mm/proc-arm926.S
+++ b/arch/arm/mm/proc-arm926.S
@@ -398,7 +398,7 @@ ENTRY(cpu_arm926_set_pte_ext)
 /* Suspend/resume support: taken from arch/arm/plat-s3c24xx/sleep.S */
 .globl cpu_arm926_suspend_size
 .equ   cpu_arm926_suspend_size, 4 * 3
-#ifdef CONFIG_PM_SLEEP
+#ifdef CONFIG_ARM_CPU_SUSPEND
 ENTRY(cpu_arm926_do_suspend)
stmfd   sp!, {r4 - r6, lr}
mrc p15, 0, r4, c13, c0, 0  @ PID
diff --git a/arch/arm/mm/proc-sa1100.S b/arch/arm/mm/proc-sa1100.S
index 3aa0da1..d92dfd0 100644
--- a/arch/arm/mm/proc-sa1100.S
+++ b/arch/arm/mm/proc-sa1100.S
@@ -172,7 +172,7 @@ ENTRY(cpu_sa1100_set_pte_ext)
 
 .globl cpu_sa1100_suspend_size
 .equ   cpu_sa1100_suspend_size, 4 * 3
-#ifdef CONFIG_PM_SLEEP
+#ifdef CONFIG_ARM_CPU_SUSPEND
 ENTRY(cpu_sa1100_do_suspend)
stmfd   sp!, {r4 - r6, lr}
mrc p15, 0, r4, c3, c0, 0   @ domain ID
diff --git a/arch/arm/mm/proc-v6.S b/arch/arm/mm/proc-v6.S
index 5900cd5..897486c 100644
--- a/arch/arm/mm/proc-v6.S
+++ b/arch/arm/mm/proc-v6.S
@@ -132,7 +132,7 @@ ENTRY(cpu_v6_set_pte_ext)
 /* Suspend/resume support: taken from arch/arm/mach-s3c64xx/sleep.S */
 .globl cpu_v6_suspend_size
 .equ   cpu_v6_suspend_size, 4 * 6
-#ifdef CONFIG_PM_SLEEP
+#ifdef CONFIG_ARM_CPU_SUSPEND
 ENTRY(cpu_v6_do_suspend)
stmfd   sp!, {r4 - r9, lr}
mrc p15, 0, r4, c13, c0, 0  @ FCSE/PID
diff --git a/arch/arm/mm/proc-xsc3.S b/arch/arm/mm/proc-xsc3.S
index b0d5786..a2d1e86 100644
--- a/arch/arm/mm/proc-xsc3.S
+++ b/arch/arm/mm/proc-xsc3.S
@@ -410,7 +410,7 @@ ENTRY(cpu_xsc3_set_pte_ext)
 
 .globl cpu_xsc3_suspend_size
 .equ   cpu_xsc3_suspend_size, 4 * 6
-#ifdef CONFIG_PM_SLEEP
+#ifdef CONFIG_ARM_CPU_SUSPEND
 ENTRY(cpu_xsc3_do_suspend)
stmfd   sp!, {r4 - r9, lr}
mrc p14, 0, r4, c6, c0, 0   @ clock configuration, for turbo mode
diff --git a/arch/arm/mm/proc-xscale.S b/arch/arm/mm/proc-xscale.S
index 4ffebaa..9882153 100644
--- a/arch/arm/mm/proc-xscale.S
+++ b/arch/arm/mm/proc-xscale.S
@@ -524,7 +524,7 @@ ENTRY(cpu_xscale_set_pte_ext)
 
 .globl cpu_xscale_suspend_size
 .equ   cpu_xscale_suspend_size, 4 * 6
-#ifdef CONFIG_PM_SLEEP
+#ifdef CONFIG_ARM_CPU_SUSPEND
 ENTRY(cpu_xscale_do_suspend)
stmfd   sp!, {r4 - r9, lr}
mrc p14, 0, r4, c6, c0, 0   @ clock configuration, for turbo mode
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index e216ba0..d57eacb 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -407,8 +407,8 @@ struct kvm_vcpu_arch {
gpa_t time;
struct pvclock_vcpu_time_info hv_clock;
unsigned int hw_tsc_khz;
-   unsigned int time_offset;
-   struct page *time_page;
+   struct gfn_to_hva_cache pv_time;
+   bool pv_time_enabled;
 
struct {
u64 msr_val;
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 26b3e2f..268b245 

Linux 3.4.42

2013-04-25 Thread Greg KH
I'm announcing the release of the 3.4.42 kernel.

All users of the 3.4 kernel series must upgrade.

The updated 3.4.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git 
linux-3.4.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Makefile |2 
 arch/arm/kernel/perf_event.c |5 +
 arch/arm/mm/cache-feroceon-l2.c  |1 
 arch/arm/mm/proc-arm920.S|2 
 arch/arm/mm/proc-arm926.S|2 
 arch/arm/mm/proc-sa1100.S|2 
 arch/arm/mm/proc-v6.S|2 
 arch/arm/mm/proc-xsc3.S  |2 
 arch/arm/mm/proc-xscale.S|2 
 arch/x86/include/asm/kvm_host.h  |4 -
 arch/x86/kernel/cpu/perf_event_intel.c   |   15 -
 arch/x86/kvm/x86.c   |   43 +++--
 crypto/algif_hash.c  |2 
 crypto/algif_skcipher.c  |1 
 drivers/char/hpet.c  |   14 -
 drivers/gpu/vga/vga_switcheroo.c |3 +
 drivers/mtd/mtdchar.c|   32 
 drivers/net/can/sja1000/sja1000_of_platform.c|   31 +---
 drivers/net/wireless/ath/ath9k/ar9580_1p0_initvals.h |2 
 drivers/net/wireless/ath/ath9k/htc_drv_init.c|2 
 drivers/net/wireless/b43/phy_n.c |3 -
 drivers/ssb/driver_chipcommon_pmu.c  |   29 +++
 drivers/video/console/fbcon.c|   11 +++-
 drivers/video/fbmem.c|   42 ++--
 fs/btrfs/tree-log.c  |   48 ---
 fs/hfsplus/extents.c |2 
 include/linux/kvm_host.h |2 
 include/linux/kvm_types.h|1 
 include/linux/mm.h   |2 
 include/linux/ssb/ssb_driver_chipcommon.h|2 
 kernel/events/core.c |2 
 kernel/hrtimer.c |3 -
 kernel/sched/core.c  |6 +-
 kernel/signal.c  |2 
 mm/hugetlb.c |   12 
 mm/memory.c  |   47 ++
 sound/core/pcm_native.c  |   12 
 virt/kvm/ioapic.c|7 +-
 virt/kvm/kvm_main.c  |   47 ++
 39 files changed, 283 insertions(+), 166 deletions(-)

Andrew Honig (1):
  KVM: Allow cross page reads and writes from cached translations.

Andy Honig (3):
  KVM: x86: fix for buffer overflow in handling of MSR_KVM_SYSTEM_TIME 
(CVE-2013-1796)
  KVM: x86: Convert MSR_KVM_SYSTEM_TIME to use gfn_to_hva_cache functions 
(CVE-2013-1797)
  KVM: Fix bounds checking in ioapic indirect register reads (CVE-2013-1798)

Christoph Fritz (1):
  can: sja1000: fix handling on dt properties on little endian systems

Dave Airlie (1):
  fbcon: fix locking harder

Emese Revfy (1):
  kernel/signal.c: stop info leak via the tkill and the tgkill syscalls

Felix Fietkau (2):
  ath9k_htc: accept 1.x firmware newer than 1.3
  ath9k_hw: change AR9580 initvals to fix a stability issue

Greg Kroah-Hartman (1):
  Linux 3.4.42

Illia Ragozin (1):
  ARM: 7696/1: Fix kexec by setting outer_cache.inv_all for Feroceon

Josef Bacik (1):
  Btrfs: make sure nbytes are right after log replay

Linus Torvalds (5):
  vm: add vm_iomap_memory() helper function
  vm: convert snd_pcm_lib_mmap_iomem() to vm_iomap_memory() helper
  vm: convert fb_mmap to vm_iomap_memory() helper
  vm: convert HPET mmap to vm_iomap_memory() helper
  vm: convert mtdchar mmap to vm_iomap_memory() helper

Mathias Krause (1):
  crypto: algif - suppress sending source address information in recvmsg

Michael Bohan (1):
  hrtimer: Don't reinitialize a cpu_base lock on CPU_UP

Naoya Horiguchi (1):
  hugetlbfs: add swap entry check in follow_hugetlb_page()

Rafał Miłecki (1):
  ssb: implement spurious tone avoidance

Russell King (1):
  ARM: Do 15e0d9e37c (ARM: pm: let platforms select cpu_suspend support) 
properly

Stephane Eranian (1):
  perf/x86: Fix offcore_rsp valid mask for SNB/IVB

Tejun Heo (1):
  sched: Convert BUG_ON()s in try_to_wake_up_local() to WARN_ON_ONCE()s

Tommi Rantala (1):
  perf: Treat attr.config as u64 in perf_swevent_init()

Re: [PATCH 2/3] posix_timers: Defer per process timer stop after timers processing

2013-04-25 Thread Olivier Langlois
On Fri, 2013-04-19 at 14:47 +0200, Frederic Weisbecker wrote:

> 
> >
> > I might be mistaken but I believe that firing timers are not rescheduled
> > in the current interrupt context. They are going to be rescheduled later
> > from the task context handling the timer generated signal.
> 
> No, when the timer fires, it might generate a signal. But it won't
> execute that signal right away in the same code path. Instead, after
> signal generation, it may reschedule the timer if necessary then look
> at the next firing timer in the list. This is all made from the same
> timer interrupt context from the same call to run_posix_cpu_timers().
> The signal itself is executed asynchronously. Either by interrupting a
> syscall, or from the irq return path.
> 
Frederic, be careful with the interpretation, there are 2 locations from
where posix_cpu_timer_schedule() can be called.

The call to posix_cpu_timer_schedule() from cpu_timer_fire() only happens if
the signal isn't sent because it is ignored by the recipient.

Maybe the condition around the posix_cpu_timer_schedule() block inside
cpu_timer_fire() could even be a good candidate for the 'unlikely'
qualifier.

IMO, the more likely scenario is that posix_cpu_timer_schedule() is called
from dequeue_signal(), which runs in a different context than
the interrupt context.

At best, you have an interesting race!

dequeue_signal() is called when delivering a signal, not when it is
generated, right?

If you have a different understanding, then please explain when the call to
posix_cpu_timer_schedule() from dequeue_signal() happens.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux 3.8.9

2013-04-25 Thread Greg KH
diff --git a/Makefile b/Makefile
index 7684f95..3ae4796 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
 VERSION = 3
 PATCHLEVEL = 8
-SUBLEVEL = 8
+SUBLEVEL = 9
 EXTRAVERSION =
 NAME = Displaced Humerus Anterior
 
diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index f9e8657..23fa6a2 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -261,7 +261,10 @@ validate_event(struct pmu_hw_events *hw_events,
struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
struct pmu *leader_pmu = event->group_leader->pmu;
 
-   if (event->pmu != leader_pmu || event->state <= PERF_EVENT_STATE_OFF)
+   if (event->pmu != leader_pmu || event->state < PERF_EVENT_STATE_OFF)
+   return 1;
+
+   if (event->state == PERF_EVENT_STATE_OFF && !event->attr.enable_on_exec)
return 1;
 
return armpmu->get_event_idx(hw_events, event) >= 0;
diff --git a/arch/arm/mach-imx/clk-imx35.c b/arch/arm/mach-imx/clk-imx35.c
index 0edce4b..5e3ca7a 100644
--- a/arch/arm/mach-imx/clk-imx35.c
+++ b/arch/arm/mach-imx/clk-imx35.c
@@ -265,6 +265,8 @@ int __init mx35_clocks_init()
clk_prepare_enable(clk[gpio3_gate]);
clk_prepare_enable(clk[iim_gate]);
clk_prepare_enable(clk[emi_gate]);
+   clk_prepare_enable(clk[max_gate]);
+   clk_prepare_enable(clk[iomuxc_gate]);
 
/*
 * SCC is needed to boot via mmc after a watchdog reset. The clock code
diff --git a/arch/arm/mm/cache-feroceon-l2.c b/arch/arm/mm/cache-feroceon-l2.c
index dd3d591..48bc3c0 100644
--- a/arch/arm/mm/cache-feroceon-l2.c
+++ b/arch/arm/mm/cache-feroceon-l2.c
@@ -343,6 +343,7 @@ void __init feroceon_l2_init(int __l2_wt_override)
outer_cache.inv_range = feroceon_l2_inv_range;
outer_cache.clean_range = feroceon_l2_clean_range;
outer_cache.flush_range = feroceon_l2_flush_range;
+   outer_cache.inv_all = l2_inv_all;
 
enable_l2();
 
diff --git a/arch/arm/mm/proc-arm920.S b/arch/arm/mm/proc-arm920.S
index 2c3b942..2556cf1 100644
--- a/arch/arm/mm/proc-arm920.S
+++ b/arch/arm/mm/proc-arm920.S
@@ -387,7 +387,7 @@ ENTRY(cpu_arm920_set_pte_ext)
 /* Suspend/resume support: taken from arch/arm/plat-s3c24xx/sleep.S */
 .globl cpu_arm920_suspend_size
 .equ   cpu_arm920_suspend_size, 4 * 3
-#ifdef CONFIG_PM_SLEEP
+#ifdef CONFIG_ARM_CPU_SUSPEND
 ENTRY(cpu_arm920_do_suspend)
stmfd   sp!, {r4 - r6, lr}
mrc p15, 0, r4, c13, c0, 0  @ PID
diff --git a/arch/arm/mm/proc-arm926.S b/arch/arm/mm/proc-arm926.S
index f1803f7e..344c8a5 100644
--- a/arch/arm/mm/proc-arm926.S
+++ b/arch/arm/mm/proc-arm926.S
@@ -402,7 +402,7 @@ ENTRY(cpu_arm926_set_pte_ext)
 /* Suspend/resume support: taken from arch/arm/plat-s3c24xx/sleep.S */
 .globl cpu_arm926_suspend_size
 .equ   cpu_arm926_suspend_size, 4 * 3
-#ifdef CONFIG_PM_SLEEP
+#ifdef CONFIG_ARM_CPU_SUSPEND
 ENTRY(cpu_arm926_do_suspend)
stmfd   sp!, {r4 - r6, lr}
mrc p15, 0, r4, c13, c0, 0  @ PID
diff --git a/arch/arm/mm/proc-mohawk.S b/arch/arm/mm/proc-mohawk.S
index 82f9cdc..0b60dd3 100644
--- a/arch/arm/mm/proc-mohawk.S
+++ b/arch/arm/mm/proc-mohawk.S
@@ -350,7 +350,7 @@ ENTRY(cpu_mohawk_set_pte_ext)
 
 .globl cpu_mohawk_suspend_size
 .equ   cpu_mohawk_suspend_size, 4 * 6
-#ifdef CONFIG_PM_SLEEP
+#ifdef CONFIG_ARM_CPU_SUSPEND
 ENTRY(cpu_mohawk_do_suspend)
stmfd   sp!, {r4 - r9, lr}
mrc p14, 0, r4, c6, c0, 0   @ clock configuration, for turbo mode
diff --git a/arch/arm/mm/proc-sa1100.S b/arch/arm/mm/proc-sa1100.S
index 3aa0da1..d92dfd0 100644
--- a/arch/arm/mm/proc-sa1100.S
+++ b/arch/arm/mm/proc-sa1100.S
@@ -172,7 +172,7 @@ ENTRY(cpu_sa1100_set_pte_ext)
 
 .globl cpu_sa1100_suspend_size
 .equ   cpu_sa1100_suspend_size, 4 * 3
-#ifdef CONFIG_PM_SLEEP
+#ifdef CONFIG_ARM_CPU_SUSPEND
 ENTRY(cpu_sa1100_do_suspend)
stmfd   sp!, {r4 - r6, lr}
mrc p15, 0, r4, c3, c0, 0   @ domain ID
diff --git a/arch/arm/mm/proc-v6.S b/arch/arm/mm/proc-v6.S
index 09c5233..d15 100644
--- a/arch/arm/mm/proc-v6.S
+++ b/arch/arm/mm/proc-v6.S
@@ -138,7 +138,7 @@ ENTRY(cpu_v6_set_pte_ext)
 /* Suspend/resume support: taken from arch/arm/mach-s3c64xx/sleep.S */
 .globl cpu_v6_suspend_size
 .equ   cpu_v6_suspend_size, 4 * 6
-#ifdef CONFIG_PM_SLEEP
+#ifdef CONFIG_ARM_CPU_SUSPEND
 ENTRY(cpu_v6_do_suspend)
stmfd   sp!, {r4 - r9, lr}
mrc p15, 0, r4, c13, c0, 0  @ FCSE/PID
diff --git a/arch/arm/mm/proc-xsc3.S b/arch/arm/mm/proc-xsc3.S
index eb93d64..e8efd83 100644
--- a/arch/arm/mm/proc-xsc3.S
+++ b/arch/arm/mm/proc-xsc3.S
@@ -413,7 +413,7 @@ ENTRY(cpu_xsc3_set_pte_ext)
 
 .globl cpu_xsc3_suspend_size
 .equ   cpu_xsc3_suspend_size, 4 * 6
-#ifdef CONFIG_PM_SLEEP
+#ifdef CONFIG_ARM_CPU_SUSPEND
 ENTRY(cpu_xsc3_do_suspend)
stmfd   sp!, {r4 - r9, lr}
mrc p14, 0, r4, c6, c0, 0   @ clock configuration, for turbo mode
diff --git a/arch/arm/mm/proc-xscale.S b/arch/arm/mm/proc-xscale.S
index 

Linux 3.8.9

2013-04-25 Thread Greg KH
I'm announcing the release of the 3.8.9 kernel.

All users of the 3.8 kernel series must upgrade.

The updated 3.8.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git 
linux-3.8.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Makefile |2 
 arch/arm/kernel/perf_event.c |5 +
 arch/arm/mach-imx/clk-imx35.c|2 
 arch/arm/mm/cache-feroceon-l2.c  |1 
 arch/arm/mm/proc-arm920.S|2 
 arch/arm/mm/proc-arm926.S|2 
 arch/arm/mm/proc-mohawk.S|2 
 arch/arm/mm/proc-sa1100.S|2 
 arch/arm/mm/proc-v6.S|2 
 arch/arm/mm/proc-xsc3.S  |2 
 arch/arm/mm/proc-xscale.S|2 
 arch/mips/include/asm/page.h |2 
 arch/powerpc/kernel/entry_64.S   |2 
 arch/powerpc/kvm/e500mc.c|7 ++
 arch/s390/include/asm/io.h   |4 -
 arch/s390/include/asm/pgtable.h  |4 +
 arch/x86/include/asm/kvm_host.h  |4 -
 arch/x86/kernel/cpu/perf_event_intel.c   |   20 +--
 arch/x86/kvm/lapic.c |2 
 arch/x86/kvm/x86.c   |   51 --
 crypto/algif_hash.c  |2 
 crypto/algif_skcipher.c  |1 
 drivers/char/hpet.c  |   14 -
 drivers/md/raid1.c   |7 ++
 drivers/md/raid10.c  |9 ++-
 drivers/mtd/mtdchar.c|   32 ---
 drivers/net/can/mcp251x.c|   10 ++-
 drivers/net/can/sja1000/sja1000_of_platform.c|   31 +--
 drivers/net/ethernet/broadcom/tg3.c  |   18 ++
 drivers/net/ethernet/broadcom/tg3.h  |2 
 drivers/net/wireless/ath/ath9k/ar9580_1p0_initvals.h |2 
 drivers/net/wireless/ath/ath9k/htc_drv_init.c|2 
 drivers/net/wireless/b43/phy_n.c |3 -
 drivers/ssb/driver_chipcommon_pmu.c  |   29 ++
 drivers/video/fbmem.c|   39 +-
 fs/binfmt_elf.c  |1 
 fs/btrfs/tree-log.c  |   48 +++--
 fs/hfsplus/extents.c |2 
 fs/hugetlbfs/inode.c |2 
 fs/proc/array.c  |1 
 include/linux/kvm_host.h |2 
 include/linux/kvm_types.h|1 
 include/linux/mm.h   |2 
 include/linux/sched.h|5 +
 include/linux/ssb/ssb_driver_chipcommon.h|2 
 include/trace/events/sched.h |2 
 kernel/events/core.c |2 
 kernel/hrtimer.c |3 -
 kernel/kthread.c |   52 ++-
 kernel/sched/core.c  |8 +-
 kernel/signal.c  |2 
 kernel/user_namespace.c  |   22 
 mm/hugetlb.c |   12 
 mm/memory.c  |   47 +
 net/mac80211/mlme.c  |   24 +++-
 sound/core/pcm_native.c  |   12 
 virt/kvm/ioapic.c|7 +-
 virt/kvm/kvm_main.c  |   47 +
 58 files changed, 405 insertions(+), 222 deletions(-)

Andrew Honig (1):
  KVM: Allow cross page reads and writes from cached translations.

Andy Honig (3):
  KVM: x86: fix for buffer overflow in handling of MSR_KVM_SYSTEM_TIME 
(CVE-2013-1796)
  KVM: x86: Convert MSR_KVM_SYSTEM_TIME to use gfn_to_hva_cache functions 
(CVE-2013-1797)
  KVM: Fix bounds checking in ioapic indirect register reads (CVE-2013-1798)

Andy Lutomirski (2):
  userns: Check uid_map's opener's fsuid, not the current fsuid
  userns: Changing any namespace id mappings should require privileges

Christoph Fritz (1):
  can: sja1000: fix handling on dt properties on little endian systems

Emese Revfy (1):
  kernel/signal.c: stop info leak via the tkill and the tgkill syscalls

Eric 

Re: [PATCH] PCI: Remove duplicate pci_disable_device for pcie port

2013-04-25 Thread Yijing Wang
Hi Yinghai,
   We should not remove this additional pci_disable_device(),
because the pcie port device is enabled twice beforehand. The first enable is in
pci_enable_bridges(), which on x86 is called from
pci_assign_unassigned_resources(). The second is in
pcie_port_device_register().
So we have to call pci_disable_device() twice to keep pci_dev->enable_cnt
balanced.

But there is still a problem here. If we unbind the pcie port driver from a
pcie port device, we cannot use its child devices again, because the pcie port
device has been disabled completely.

So I think we should move the second pci_disable_device() to remove.c.

I sent this patch to Bjorn, and the following is Bjorn's reply:
"And it's not clear to me whether unbinding the
pcie port driver should disable the bridge at all.  I think one could
argue that the bridge should remain functional even if the driver is
unloaded, because the PCI core *enables* the bridge even if the driver
is never loaded."

Yinghai, what do you think about this issue?
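
The enable_cnt argument above can be illustrated with a small userspace model. This is only a sketch: `struct model_dev`, `model_enable()`, `model_disable()` and `bridge_enabled_after_unbind()` are hypothetical names, not kernel code, and the model assumes the usual reference-count behaviour in which the hardware is only touched on the 0 -> 1 and 1 -> 0 transitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct pci_dev; only the enable count is modelled. */
struct model_dev {
	int enable_cnt;		/* number of outstanding enable calls */
	bool hw_enabled;	/* whether the bridge is actually enabled */
};

/* Models an enable call: the hardware is touched only on the 0 -> 1 edge. */
static void model_enable(struct model_dev *d)
{
	if (d->enable_cnt++ == 0)
		d->hw_enabled = true;
}

/* Models a disable call: the hardware is touched only on the 1 -> 0 edge. */
static void model_disable(struct model_dev *d)
{
	assert(d->enable_cnt > 0);
	if (--d->enable_cnt == 0)
		d->hw_enabled = false;
}

/*
 * Replays the sequence described in the mail: one enable from the PCI core,
 * one from the port driver probe, then a given number of disables when the
 * port driver is unbound. Returns whether the bridge is still enabled.
 */
static bool bridge_enabled_after_unbind(int disables_in_remove)
{
	struct model_dev bridge = { 0, false };
	int i;

	model_enable(&bridge);	/* PCI core, during resource assignment */
	model_enable(&bridge);	/* port driver, at probe time */

	for (i = 0; i < disables_in_remove; i++)
		model_disable(&bridge);

	return bridge.hw_enabled;
}
```

With two disables in the remove path the count reaches zero and the bridge (and everything behind it) goes away on unbind; with a single disable the count stays at one and the bridge keeps working, which is the behaviour Bjorn's reply argues for.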



On 2013/4/26 9:47, Yinghai Lu wrote:
> During chasing one PCI xHCI hotplug problem, David Bulkow found
> 
>   static void pcie_portdrv_remove(struct pci_dev *dev)
>   {
>   pcie_port_device_remove(dev);
>   pci_disable_device(dev);
>   }
> and
>   void pcie_port_device_remove(struct pci_dev *dev)
>   {
>   device_for_each_child(&dev->dev, NULL, remove_iter);
>   cleanup_service_irqs(dev);
>   pci_disable_device(dev);
>   }
> 
> that extra pci_disable_device in pcie_port_device_remove() was added by
> | commit dc5351784eb36f1fec4efa88e01581be72c0b711
> | Author: Kenji Kaneshige 
> | Date:   Wed Nov 25 21:04:00 2009 +0900
> |
> |PCI: portdrv: cleanup service irqs initialization
> 
> so pci_disable_device is called two times.
> 
> We should remove extra one in pcie_portdrv_remove.
> 
> Reported-by: David Bulkow 
> Signed-off-by: Yinghai Lu 
> 
> ---
>  drivers/pci/pcie/portdrv_pci.c |1 -
>  1 file changed, 1 deletion(-)
> 
> Index: linux-2.6/drivers/pci/pcie/portdrv_pci.c
> ===
> --- linux-2.6.orig/drivers/pci/pcie/portdrv_pci.c
> +++ linux-2.6/drivers/pci/pcie/portdrv_pci.c
> @@ -223,7 +223,6 @@ static int pcie_portdrv_probe(struct pci
>  static void pcie_portdrv_remove(struct pci_dev *dev)
>  {
>   pcie_port_device_remove(dev);
> - pci_disable_device(dev);
>  }
>  
>  static int error_detected_iter(struct device *device, void *data)


-- 
Thanks!
Yijing
From 44914e0e39dbe51e1a932492d6b4909d5967308e Mon Sep 17 00:00:00 2001
From: Yijing Wang 
Date: Tue, 16 Apr 2013 11:41:47 +0800
Subject: [PATCH] PCI: move second pci_disable_device into pci_stop_bus_device() for symmetry

Currently, enabling and disabling a pcie port device is not symmetrical. If
we unbind the pcie port driver from a pcie port device, pci_disable_device() is
called twice. The pcie port device is then disabled, and any child devices
under it may no longer be able to transmit data. This patch moves the
second pci_disable_device() into pci_stop_bus_device() to avoid this bug.

Signed-off-by: Yijing Wang 
---
 drivers/pci/pcie/portdrv_pci.c |1 -
 drivers/pci/remove.c   |1 +
 2 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index ed4d094..2ca1a0b 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -223,7 +223,6 @@ static int pcie_portdrv_probe(struct pci_dev *dev,
 static void pcie_portdrv_remove(struct pci_dev *dev)
 {
 	pcie_port_device_remove(dev);
-	pci_disable_device(dev);
 }
 
 static int error_detected_iter(struct device *device, void *data)
diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index cc875e6..e8f7c3c 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -73,6 +73,7 @@ static void pci_stop_bus_device(struct pci_dev *dev)
 		list_for_each_entry_safe_reverse(child, tmp,
 		 &bus->devices, bus_list)
 			pci_stop_bus_device(child);
+			pci_disable_device(dev);
 	}
 
 	pci_stop_dev(dev);
-- 
1.7.1



Re: [PATCH v2] PowerPC: kernel: compiling issue, make additional room in exception vector area

2013-04-25 Thread Mike Qiu

On 2013/4/26 11:42, Chen Gang wrote:

On 2013-04-26 11:25, Chen Gang wrote:

On 2013-04-26 11:08, Mike Qiu wrote:

On 2013/4/26 10:06, Chen Gang wrote:

On 2013-04-26 10:03, Mike Qiu wrote:

On 2013/4/26 9:36, Chen Gang wrote:

On 2013-04-26 09:18, Chen Gang wrote:

On 2013-04-26 09:06, Chen Gang wrote:

CFAR is the Come-From Address Register.  It saves the location of the last
branch and is hence overwritten by any branch.


Should we handle it the way the other vectors do (e.g. 0x300, 0xe00,
0xe20 ...)?
 . = 0x900
 .globl decrementer_pSeries
decrementer_pSeries:
   HMT_MEDIUM_PPR_DISCARD
 SET_SCRATCH0(r13)
 b decrementer_pSeries_0

 ...



Oh, it seems EXCEPTION_PROLOG_1 saves the registers related to
CFAR, so I think we need to move EXCEPTION_PROLOG_1 close to 0x900.

I will try your diff V2 to see if the machine can boot up.

OK, thanks. (hope it can work)

It seems that the machine can boot up in powernv mode, but I'm not
sure whether my machine actually exercises that code.

At least my machine can boot up.

Please reference commit number: 1707dd161349e6c54170c88d94fed012e3d224e3
(1707dd1 powerpc: Save CFAR before branching in interrupt entry paths)

What our diff v2 has done is just the fix for our patch v2 (just like
the commit 1707dd1 has done).

Please check, thanks.

:-)
I will check this evening or tomorrow; I have something else to do this
afternoon.

Thank you for your information!

I have checked the disassembly with powerpc64-linux-gnu-objdump; it seems
that what we have done for 0x900 is almost the same as what was originally
done for 0x200.

I am just learning about the CFAR (googling it), and I plan to wait for a
day. If all goes smoothly, I will send patch v3.


:-)







Re: linux-next: manual merge of the net-next tree with the infiniband tree

2013-04-25 Thread Stephen Rothwell
Hi Cascardo,

On Wed, 24 Apr 2013 14:53:04 -0300 Thadeu Lima de Souza Cascardo 
 wrote:
>
> On Thu, Apr 18, 2013 at 01:18:43PM +1000, Stephen Rothwell wrote:
> > 
> > Today's linux-next merge of the net-next tree got a conflict in
> > drivers/infiniband/hw/cxgb4/qp.c between commit 5b0c275926b8
> > ("RDMA/cxgb4: Fix SQ allocation when on-chip SQ is disabled") from the
> > infiniband tree and commit 9919d5bd01b9 ("RDMA/cxgb4: Fix onchip queue
> > support for T5") from the net-next tree.
> > 
> > I think that they are 2 different fixes for the same problem, so I just
> > used the net-next version and can carry the fix as necessary (no action
> > is required).
> 
> Commit 5b0c275926b8 also keeps the intention of the original patch which
> broke it, which was to return an error code, in case the allocation fails.
> Commit 9919d5bd01b9 fix will return 0 in case the allocation fails.
> 
> We should keep the other fix or fix the code again to return the proper
> error code.

OK, so today I switched the conflict fix to use the version from the
infiniband tree.

-- 
Cheers,
Stephen Rothwells...@canb.auug.org.au




linux-next: build failure after merge of the net-next tree

2013-04-25 Thread Stephen Rothwell
Hi all,

After merging the net-next tree, today's linux-next build (powerpc
ppc64_defconfig) failed like this:

drivers/net/ethernet/emulex/benet/be_main.c: In function 'be_insert_vlan_in_pkt':
drivers/net/ethernet/emulex/benet/be_main.c:786:3: error: too few arguments to function '__vlan_put_tag'
include/linux/if_vlan.h:220:31: note: declared here
drivers/net/ethernet/emulex/benet/be_main.c:796:3: error: too few arguments to function '__vlan_put_tag'
include/linux/if_vlan.h:220:31: note: declared here

Caused by the interaction of commit 86a9bad3ab6b ("net: vlan: add
protocol argument to packet tagging functions") from the net-next tree
and commit bc0c3405abbb ("be2net: fix a Tx stall bug caused by a specific
ipv6 packet") from the net tree.

I applied the following merge fix patch:

From: Stephen Rothwell 
Date: Fri, 26 Apr 2013 13:45:23 +1000
Subject: [PATCH] be2net: merge fix for __vlan_put_tag() API change

Signed-off-by: Stephen Rothwell 
---
 drivers/net/ethernet/emulex/benet/be_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index 768a7d1..d6ab777 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -783,7 +783,7 @@ static struct sk_buff *be_insert_vlan_in_pkt(struct be_adapter *adapter,
}
 
if (vlan_tag) {
-   skb = __vlan_put_tag(skb, vlan_tag);
+   skb = __vlan_put_tag(skb, htons(ETH_P_8021Q), vlan_tag);
if (unlikely(!skb))
return skb;
 
@@ -793,7 +793,7 @@ static struct sk_buff *be_insert_vlan_in_pkt(struct be_adapter *adapter,
/* Insert the outer VLAN, if any */
if (adapter->qnq_vid) {
vlan_tag = adapter->qnq_vid;
-   skb = __vlan_put_tag(skb, vlan_tag);
+   skb = __vlan_put_tag(skb, htons(ETH_P_8021Q), vlan_tag);
if (unlikely(!skb))
return skb;
if (skip_hw_vlan)
-- 
1.8.1

-- 
Cheers,
Stephen Rothwells...@canb.auug.org.au




Re: [PATCHv3 00/14] drivers: mailbox: framework creation

2013-04-25 Thread Jassi Brar
Hi Suman,

On 26 April 2013 03:59, Suman Anna  wrote:
> On 04/25/2013 12:20 AM, Jassi Brar wrote:
> transmitting right away. OK, I thought you didn't want buffering, if that
> is not the case, then the buffering should be within the main driver
> code, like it is now, but configurable based on the controller or
> mailbox properties. If it is present in individual controller drivers,
> then we would be duplicating stuff. Are you envisioning that this be
> left to the individual controllers?
>
Please don't accuse me of such bad visions :)
I never said no-buffering and I never said buffering should be in
controller drivers. In fact I don't remember ever objecting to how
buffering is done in TI's framework.
A controller can service only 1 request at a time, so let's give it
just 1 at a time. Let the API handle the complexity of buffering.

>> I am afraid you are confusing the meaning of 'atomic context' here.
>> atomic context doesn't mean instant transmission of data, but that the
>> API calls could be made from even atomic context and that the client &
>> controller can't sleep in callbacks from the API. So it's not moot.
>
> I understood the atomic context, and the question is about the behavior
> of the '.tx_done' callback when sending from atomic context. Is there
> such a usecase/need for you in that you want to send a response back
> from an atomic context, yet get a callback?
>
Let me go into detail...
The TX-Wheel has to tick. Someone has to tell the framework that the
last TX was consumed by the remote and now it's time to submit the
next TX (RX will always be driven by the controller's IRQ so it's
straight).
If the controller h/w gets some interrupt indicating
Remote-RTR/TX-Done then the ticker is driven by controller's TX-IRQ
handler. Otherwise, if the controller does sense RTR but not report
(by reading status in some register but no irq), then API has to poll
it periodically and move the ticker. If the controller can neither
report nor sense RTR, the client/protocol driver must run the ticker
(usually upon receiving some ACK packet on the RX channel).
This TX ticker should be callable from atomic context (controller's
IRQ handler) and calls into callback of the client. It is desirable
that the client be able to submit yet another TX request from the
callback. That way the client can avoid having to schedule work from
the callback if the TX doesn't involve any sleepable task. The scheme
is working very well in DMA-Engine stack.
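
The ticker scheme described above could be modelled in plain C roughly as follows. This is a userspace sketch under stated assumptions, not a proposed kernel API: `struct chan`, `chan_send()`, `chan_txdone()` and the demo client are all hypothetical names. It only shows the two properties argued for here: the controller is handed one message at a time while the API does the buffering, and the client may submit the next TX from inside its tx_done callback without sleeping.

```c
#include <assert.h>
#include <stddef.h>

#define TXQ_LEN 8

struct chan;
typedef void (*tx_done_fn)(struct chan *c, int msg);

/* Hypothetical channel: a small ring of pending messages plus one in flight. */
struct chan {
	int queue[TXQ_LEN];
	size_t head, count;
	int busy;		/* 1 while the controller owns a message */
	int last_on_wire;	/* the message currently in flight */
	int sent;		/* messages handed to the controller so far */
	tx_done_fn tx_done;	/* client callback, must not sleep */
};

/* Hand the next queued message to the "controller" if it is idle. */
static void chan_kick(struct chan *c)
{
	if (c->busy || c->count == 0)
		return;
	c->last_on_wire = c->queue[c->head];
	c->head = (c->head + 1) % TXQ_LEN;
	c->count--;
	c->busy = 1;
	c->sent++;
}

/* Client entry point: only queues and kicks, so it is safe in a callback. */
static int chan_send(struct chan *c, int msg)
{
	if (c->count == TXQ_LEN)
		return -1;	/* queue full */
	c->queue[(c->head + c->count) % TXQ_LEN] = msg;
	c->count++;
	chan_kick(c);
	return 0;
}

/*
 * The ticker. Whoever learns that the remote consumed the last TX calls
 * this: the controller's TX IRQ handler, a poll loop, or the client on ACK.
 */
static void chan_txdone(struct chan *c)
{
	int msg;

	if (!c->busy)
		return;
	msg = c->last_on_wire;
	c->busy = 0;
	if (c->tx_done)
		c->tx_done(c, msg);	/* client may call chan_send() here */
	chan_kick(c);			/* no-op if the callback already refilled */
}

/* Demo client: chains messages 1 -> 2 -> 3 from within the callback. */
static void chain_cb(struct chan *c, int msg)
{
	if (msg < 3)
		chan_send(c, msg + 1);
}

/* Drives the ticker until the channel drains; returns the number sent. */
static int demo_run(void)
{
	struct chan c = { .tx_done = chain_cb };

	chan_send(&c, 1);
	while (c.busy)
		chan_txdone(&c);
	return c.sent;
}
```

Because the callback resubmits directly, the client never has to schedule deferred work just to keep the queue moving, which is the point made above about the DMA-Engine style of completion handling.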

BTW, TI's RX mechanism too seems broken for a common API. Receiving
every few bytes via the 'notify' mechanism is very inefficient. Imagine a
platform with no shared memory between co-processors, where the local side
wants to diagnose the remote by requesting critical data at least KBs in
size.
So when the API has nothing to do with a received packet and the controller
has to get rid of it asap so as to be able to receive the next, IMHO
there should be a short-circuit from controller to client via the API.
No delay, no buffering of RX.


>> It's the controller driver that actually puts the data on the bus. So
>> only it should define the format in which it accepts data from the
>> clients. Every client should simply populate the packet structure
>> defined in  my_lovely_controller.h and pass on the struct pointer to
>> the controller driver via API.
>> No negotiations for the driver seat among passengers :)
>
> OK, I was trying to avoid including my_lovely_controller.h and only
> include the standard .h file as a client user, the client would anyway
> need to have the intrinsic knowledge of the packet structure.
>
Not including my_controller.h doesn't make things standard.
As we know, the client anyway has to have intrinsic knowledge of the
packet structure (which is dictated by the controller), so not
including my_controller.h will only confuse people as to where the
packet info came from.

>>>
>> I think the mailbox should be exclusively held by a client. That makes
>> many things simpler. Also remote firmwares won't be always robust
>> enough to handle commands from different subsystems intermixed. The
>> API only has to make sure the mailbox_get/put operations are very
>> thin.
>
> This might be the case for specific remotes where we expect only one
> client driver to be responsible for talking to it, but for generic
> offloading, you do not want to have this restriction. You do not want
> peer clients to go through a single main client, as the latencies or the
> infrastructure imposed by the main client may not be suitable for the
> other clients. The stricter usecase here would be the shareable mailbox,
> and if it is exclusive, as dictated by a controller or device property,
> then so be it and things would get simplified for that controller/device.
>
Shared Vs Exclusive had been the dilemma of DMAEngine too.
If the controller has physical channels at least as many as clients,
exclusivity is no problem.
Sharing is desirable when the controller has to serve clients more
than its physical 

Re: [PATCH v2] PowerPC: kernel: compiling issue, make additional room in exception vector area

2013-04-25 Thread Chen Gang
On 2013-04-26 11:25, Chen Gang wrote:
> On 2013-04-26 11:08, Mike Qiu wrote:
>> On 2013/4/26 10:06, Chen Gang wrote:
>>> On 2013-04-26 10:03, Mike Qiu wrote:
 On 2013/4/26 9:36, Chen Gang wrote:
>> On 2013-04-26 09:18, Chen Gang wrote:
 On 2013-04-26 09:06, Chen Gang wrote:
 CFAR is the Come From Register.  It saves the location of the
 last
>> branch and is hence overwritten by any branch.
>>
>> Do we process it just like others done (e.g. 0x300, 0xe00,
>> 0xe20 ...) ?
>> . = 0x900
>> .globl decrementer_pSeries
>> decrementer_pSeries:
>>   HMT_MEDIUM_PPR_DISCARD
>> SET_SCRATCH0(r13)
>> b decrementer_pSeries_0
>>
>> ...
>>
>>
>> Oh, it seems EXCEPTION_PROLOG_1 will save the registers related to
>> CFAR, so I think we need to move EXCEPTION_PROLOG_1 to near 0x900.
 I will try your diff V2, to see if the machine can boot up
>>> OK, thanks. (hope it can work)
>> It seems that the machine can boot up in powernv mode, but I'm not
>> sure if my machine calls that module.
>>
>> At least my machine can boot up
> 

Please refer to commit 1707dd161349e6c54170c88d94fed012e3d224e3
(1707dd1 powerpc: Save CFAR before branching in interrupt entry paths)

What our diff v2 does is just fix our patch v2 in the same way
(just as commit 1707dd1 did).

Please check, thanks.

:-)

> Thank you for your information !
> 
> I have checked the disassembly with powerpc64-linux-gnu-objdump; it seems
> what we have done for 0x900 is almost the same as what was originally
> done for 0x200.
> 
> I am just learning about the CFAR (googling it), and I plan to wait for a
> day; if all goes smoothly, I will send patch v3.
> 
> 
> :-)
> 


-- 
Chen Gang

Asianux Corporation
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


linux-next: manual merge of the net-next tree with the net tree

2013-04-25 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the net-next tree got a conflict in
include/net/tcp.h between commit 093162553c33 ("tcp: force a dst refcount
when prequeue packet") from the net tree and commit b2fb4f54ecd4 ("tcp:
uninline tcp_prequeue()") from the net-next tree.

I fixed it up (I used the net-next version of tcp.h and added the
following patch) and can carry the fix as necessary (no action is
required).  Thanks for the heads up Dave!

From: Stephen Rothwell 
Date: Fri, 26 Apr 2013 13:35:50 +1000
Subject: [PATCH] tcp: merge fixup for movement of tcp_prequeue

Signed-off-by: Stephen Rothwell 
---
 net/ipv4/tcp_ipv4.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 8667aaa..5dcf177 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1926,6 +1926,7 @@ bool tcp_prequeue(struct sock *sk, struct sk_buff *skb)
 	    skb_queue_len(&tp->ucopy.prequeue) == 0)
 		return false;
 
+	skb_dst_force(skb);
 	__skb_queue_tail(&tp->ucopy.prequeue, skb);
 	tp->ucopy.memory += skb->truesize;
 	if (tp->ucopy.memory > sk->sk_rcvbuf) {
-- 
1.8.1

-- 
Cheers,
Stephen Rothwells...@canb.auug.org.au




linux-next: manual merge of the net-next tree with the pci tree

2013-04-25 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the net-next tree got a conflict in
include/linux/pci.h between commit f39d5b72913e ("PCI: Remove "extern"
from function declarations") from the pci tree and commit 5a8eb24292ff
("pci: Add SRIOV helper function to determine if VFs are assigned to
guest") from the net-next tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

(I would have preferred that all these declarations were changed to have
"extern" added to ones that were missing it, but ...)

-- 
Cheers,
Stephen Rothwells...@canb.auug.org.au

diff --cc include/linux/pci.h
index e73dfa3,43e45ac..000
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@@ -1657,12 -1640,13 +1657,13 @@@ int pci_ext_cfg_avail(void)
  void __iomem *pci_ioremap_bar(struct pci_dev *pdev, int bar);
  
  #ifdef CONFIG_PCI_IOV
 -extern int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
 -extern void pci_disable_sriov(struct pci_dev *dev);
 -extern irqreturn_t pci_sriov_migration(struct pci_dev *dev);
 -extern int pci_num_vf(struct pci_dev *dev);
 +int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
 +void pci_disable_sriov(struct pci_dev *dev);
 +irqreturn_t pci_sriov_migration(struct pci_dev *dev);
 +int pci_num_vf(struct pci_dev *dev);
+ int pci_vfs_assigned(struct pci_dev *dev);
 -extern int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs);
 -extern int pci_sriov_get_totalvfs(struct pci_dev *dev);
 +int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs);
 +int pci_sriov_get_totalvfs(struct pci_dev *dev);
  #else
  static inline int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
  {




linux-next: manual merge of the net-next tree with the net tree

2013-04-25 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the net-next tree got a conflict in
drivers/net/ethernet/emulex/benet/be.h between commit bc0c3405abbb
("be2net: fix a Tx stall bug caused by a specific ipv6 packet") from the
net tree and commit 0ad3157e813a ("be2net: Avoid flashing BE3 UFI on
BE3-R chip") from the net-next tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwells...@canb.auug.org.au

diff --cc drivers/net/ethernet/emulex/benet/be.h
index 941aa1f,e2d5ced..000
--- a/drivers/net/ethernet/emulex/benet/be.h
+++ b/drivers/net/ethernet/emulex/benet/be.h
@@@ -435,7 -435,7 +436,8 @@@ struct be_adapter 
u8 wol_cap;
bool wol;
u32 uc_macs;/* Count of secondary UC MAC programmed */
 +  u16 qnq_vid;
+   u16 asic_rev;
u32 msg_enable;
int be_get_temp_freq;
u16 max_mcast_mac;




Re: [PATCH v2] PowerPC: kernel: compiling issue, make additional room in exception vector area

2013-04-25 Thread Chen Gang
On 2013-04-26 11:08, Mike Qiu wrote:
> On 2013/4/26 10:06, Chen Gang wrote:
>> On 2013-04-26 10:03, Mike Qiu wrote:
>>> On 2013/4/26 9:36, Chen Gang wrote:
> On 2013-04-26 09:18, Chen Gang wrote:
>>> On 2013-04-26 09:06, Chen Gang wrote:
>>> CFAR is the Come From Register.  It saves the location of the
>>> last
> branch and is hence overwritten by any branch.
>
> Do we process it just like others done (e.g. 0x300, 0xe00,
> 0xe20 ...) ?
> . = 0x900
> .globl decrementer_pSeries
> decrementer_pSeries:
>   HMT_MEDIUM_PPR_DISCARD
> SET_SCRATCH0(r13)
> b decrementer_pSeries_0
>
> ...
>
>
> Oh, it seems EXCEPTION_PROLOG_1 will save the registers related to
> CFAR, so I think we need to move EXCEPTION_PROLOG_1 to near 0x900.
>>> I will try your diff V2, to see if the machine can boot up
>> OK, thanks. (hope it can work)
> It seems that the machine can boot up in powernv mode, but I'm not
> sure if my machine calls that module.
> 
> At least my machine can boot up

Thank you for your information !

I have checked the disassembly with powerpc64-linux-gnu-objdump; it seems
what we have done for 0x900 is almost the same as what was originally
done for 0x200.

I am just learning about the CFAR (googling it), and I plan to wait for a
day; if all goes smoothly, I will send patch v3.


:-)

-- 
Chen Gang

Asianux Corporation


linux-next: manual merge of the net-next tree with the net tree

2013-04-25 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the net-next tree got a conflict in
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c between commit
ecf01c22be03 ("bnx2x: Prevent NULL pointer dereference in kdump") from
the net tree and commit 5b0752c863d7 ("bnx2x: Fix VF statistics") from
the net-next tree.

I am not sure about this: I just used the net tree version, but I
assume something more may be needed. I can carry the fix as necessary (no
action is required).

-- 
Cheers,
Stephen Rothwells...@canb.auug.org.au




Re: "attempt to move .org backwards" still show up

2013-04-25 Thread Mike Qiu

On 2013/4/25 14:25, Paul Mackerras wrote:
> On Thu, Apr 25, 2013 at 12:05:54PM +0800, Mike Qiu wrote:
>> This has blocked my work now
>> So I hope you can take a look ASAP
>> Thanks
>> :)
>>
>> Mike
> As a quick fix, turn on CONFIG_KVM_BOOK3S_64_HV.  That will eliminate
> the immediate problem.

Thanks
got it, I will have a try.

> Paul.





Re: mm: BUG in do_huge_pmd_wp_page

2013-04-25 Thread Sasha Levin
On 04/25/2013 10:01 PM, Dave Jones wrote:
> On Thu, Apr 25, 2013 at 08:51:27PM -0400, Sasha Levin wrote:
>  > On 04/24/2013 06:46 PM, Andrew Morton wrote:
>  > > Guys, did this get fixed?
>  > 
>  > I've stopped seeing that during fuzzing, so I guess that it got fixed
>  > somehow...
> 
> We've had reports of users hitting this in 3.8
> 
> eg:
> https://bugzilla.redhat.com/show_bug.cgi?id=947985
> https://bugzilla.redhat.com/show_bug.cgi?id=956730 
> 
> I'm sure there are other reports of it too.
> 
> Would be good if we can figure out what fixed it (if it is actually fixed)
> for backporting to stable

If it's interesting to know I'll bisect it over the weekend...

Think it's enough to look at mm/ commits?


Thanks,
Sasha



Re: [PATCH V4 0/5] powerpc, perf: BHRB based branch stack enablement on POWER8

2013-04-25 Thread Michael Neuling
Anshuman,

IIRC there are new bits in the FSCR and HFSCR you need to enable for the
PMU and BHRB.  Can you please check these are enabled?

Mikey

Anshuman Khandual  wrote:
> Branch History Rolling Buffer (BHRB) is a new PMU feature in the IBM
> POWER8 processor which records the branch instructions inside the execution
> pipeline. This patchset enables the basic functionality of the feature through
> the generic perf branch stack sampling framework.
> 
> Sample output
> -
> $./perf record -b top
> $./perf report
> 
> Overhead  Command  Source Shared Object  Source Symbol      Target Shared Object  Target Symbol
> # ......  .......  ....................  .................  ....................  .................
> #
>    7.82%  top      libc-2.11.2.so        [k] _IO_vfscanf    libc-2.11.2.so        [k] _IO_vfscanf
>    6.17%  top      libc-2.11.2.so        [k] _IO_vfscanf    [unknown]             [k]
>    2.37%  top      [unknown]             [k] 0xf7aafb30     [unknown]             [k]
>    1.80%  top      [unknown]             [k] 0x0fe07978     libc-2.11.2.so        [k] _IO_vfscanf
>    1.60%  top      libc-2.11.2.so        [k] _IO_vfscanf    [kernel.kallsyms]     [k] .do_task_stat
>    1.20%  top      [kernel.kallsyms]     [k] .do_task_stat  [kernel.kallsyms]     [k] .do_task_stat
>    1.02%  top      libc-2.11.2.so        [k] vfprintf       libc-2.11.2.so        [k] vfprintf
>    0.92%  top      top                   [k] _init          [unknown]             [k] 0x0fe037f4
> 
> Changes in V2
> --
> - Added copyright messages to the newly created files
> - Modified couple of commit messages
> 
> Changes in V3
> -
> - Incorporated review comments from Segher https://lkml.org/lkml/2013/4/16/350
> - Worked on a solution for review comment from Michael Ellerman
>   https://lkml.org/lkml/2013/4/17/548
>   - Could not move the updated cpu_hw_events structure from the
>     core-book3s.c file into perf_event_server.h, because
>     perf_event_server.h is pulled in first inside linux/perf_event.h,
>     before the definition of the perf_branch_entry structure. That's the
>     reason why the perf_branch_entry definition is not available inside
>     perf_event_server.h, where we define the array inside the
>     cpu_hw_events structure.
> 
>   - Finally have pulled in the code from perf_event_bhrb.c into
>     core-book3s.c
> 
> - Improved documentation for the patchset
> 
> Changes in V4
> -
> - Incorporated review comments on V3 regarding new instruction encoding
> 
> Anshuman Khandual (5):
>   powerpc, perf: Add new BHRB related instructions for POWER8
>   powerpc, perf: Add basic assembly code to read BHRB entries on POWER8
>   powerpc, perf: Add new BHRB related generic functions, data and flags
>   powerpc, perf: Define BHRB generic functions, data and flags for POWER8
>   powerpc, perf: Enable branch stack sampling framework
> 
>  arch/powerpc/include/asm/perf_event_server.h |   7 ++
>  arch/powerpc/include/asm/ppc-opcode.h        |   8 ++
>  arch/powerpc/perf/Makefile                   |   2 +-
>  arch/powerpc/perf/bhrb.S                     |  44 +++
>  arch/powerpc/perf/core-book3s.c              | 167 ++-
>  arch/powerpc/perf/power8-pmu.c               |  57 -
>  6 files changed, 280 insertions(+), 5 deletions(-)
>  create mode 100644 arch/powerpc/perf/bhrb.S
> 
> -- 
> 1.7.11.7
> 


Re: [PATCH] Reset PCIe devices to stop ongoing DMA

2013-04-25 Thread Takao Indoh
(2013/04/26 3:01), Don Dutile wrote:
> On 04/25/2013 01:11 AM, Takao Indoh wrote:
>> (2013/04/25 4:59), Don Dutile wrote:
>>> On 04/24/2013 12:58 AM, Takao Indoh wrote:
>>>> This patch resets PCIe devices on boot to stop ongoing DMA. When
>>>> "pci=pcie_reset_devices" is specified, a hot reset is triggered on each
>>>> PCIe root port and downstream port to reset its downstream endpoint.
>>>>
>>>> Problem:
>>>> This patch solves the problem that kdump can fail when intel_iommu=on is
>>>> specified. When intel_iommu=on is specified, many dma-remapping errors
>>>> occur in the second kernel and cause problems like driver errors or PCI
>>>> SERR, and in the end kdump fails. This problem is caused as follows.
>>>> 1) Devices are working on first kernel.
>>>> 2) Switch to second kernel (kdump kernel). The devices are still working
>>>>    and their DMA continues during this switch.
>>>> 3) iommu is initialized during second kernel boot and ongoing DMA causes
>>>>    dma-remapping errors.
>>>>
>>>> Solution:
>>>> All DMA transactions have to be stopped before iommu is initialized. By
>>>> this patch devices are reset and in-flight DMA is stopped before
>>>> pci_iommu_init.
>>>>
>>>> To invoke hot reset on an endpoint, its upstream link needs to be reset.
>>>> reset_pcie_devices() is called from fs_initcall_sync, and it finds each
>>>> root port/downstream port whose child is a PCIe endpoint, and then resets
>>>> the link between them. If the endpoint is a VGA device, it is skipped
>>>> because the monitor blacks out if the VGA controller is reset.

>>> Couple questions wrt VGA device:
>>> (1) Many graphics devices are multi-function, one function being VGA;
>>>is the VGA always function 0, so this scan sees it first&  doesn't
>>>do a reset on that PCIe link?  if the VGA is not function 0, won't
>>>this logic break (will reset b/c function 0 is non-VGA graphics) ?
>>
>> VGA is not reset irrespective of its function number. The logic of this
>> patch is:
>>
>> for_each_pci_dev(dev) {
>>   if (dev is not PCIe)
>>  continue;
>>   if (dev is not root port/downstream port) ---(1)
>>  continue;
>>   list_for_each_entry(child, &dev->subordinate->devices, bus_list) {
>>   if (child is upstream port or bridge or VGA) ---(2)
>>   continue;
>>   }
>>   do_reset_its_child(dev);
>> }
>>
>> Therefore VGA itself is skipped by (1), and upstream device(root port or
>> downstream port) of VGA is also skipped by (2).
>>
>>
>>> (2) I'm hearing VGA will soon not be a required console; this logic
>>>    assumes it is, and why it isn't blanked.
>>>    Q: Should the filter be based on a device having a device-class of
>>>    display ?
>>
>> I want to avoid the situation that user's monitor blacks out and user
>> cannot know what's going on. That's reason why I introduced the logic to
>> skip VGA. As far as I tested the logic based on device-class works well,
> Sorry, I read your description, which said VGA, but you are filtering on
> display class, which includes non-VGA as well. So, all set ... but large
> (x16) non-VGA display devices are probably one of the most aggressive DMA
> engines on a system, and that will grow as asymmetric processing using GPUs
> gets architected in a device-agnostic manner.
> So, this may work well for servers, which are the primary consumer/user of
> this feature, and they typically have built-in graphics that are generally
> used in simple VGA mode, so this may be sufficient for now.

Ok, understood.
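
For readers following the skip logic discussed above, here is a small
user-space sketch in C of the decision "should this port's link be
reset?". It is only a toy model: the types and names (toy_dev,
toy_need_reset, the TOY_* constants) are hypothetical and are not taken
from the patch, and treating a port with no children as "no reset" is an
assumption. It shows why the function number of the display device does
not matter: any display-class child on the secondary bus vetoes the reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of a PCI hierarchy; not kernel structures. */
enum toy_type {
	TOY_ROOT_PORT,
	TOY_DOWNSTREAM_PORT,
	TOY_UPSTREAM_PORT,
	TOY_BRIDGE,
	TOY_ENDPOINT
};

struct toy_dev {
	bool is_pcie;
	enum toy_type type;
	bool is_display;		/* device class is "display" */
	const struct toy_dev *children;	/* devices on the secondary bus */
	size_t nr_children;
};

/* Return true if the link below @port should be hot reset. */
bool toy_need_reset(const struct toy_dev *port)
{
	size_t i;

	assert(port != NULL);

	/* (1) Only PCIe root ports and downstream ports are candidates. */
	if (!port->is_pcie)
		return false;
	if (port->type != TOY_ROOT_PORT && port->type != TOY_DOWNSTREAM_PORT)
		return false;

	/* Assumption: nothing below the port means nothing to reset. */
	if (port->nr_children == 0)
		return false;

	/* (2) One switch port, bridge or display child vetoes the reset. */
	for (i = 0; i < port->nr_children; i++) {
		const struct toy_dev *child = &port->children[i];

		if (child->type == TOY_UPSTREAM_PORT ||
		    child->type == TOY_BRIDGE ||
		    child->is_display)
			return false;
	}
	return true;
}
```

In this model a multi-function card whose display function is function 1
rather than function 0 is still skipped, because every child on the bus
is examined before the port is reset.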


> 
>> but I would appreciate it if there are better ways.
>>
> You probably don't want to hear it but
> a) only turn off cmd-reg master enable bit
> b) only do reset based on a list of devices known not to
> obey their cmd-reg master enable bit, and only do reset to those devices.
> But, given the testing you've done so far, and that this is an optional
> (needs cmdline) feature, let's start here.

Ok. Either way I think we need more testing.

 
>>>
>>>> Actually this is v8 patch but quite different from v7 and it's been so
>>>> long since previous post, so I start over again.
>>> Thanks for this re-start.  I need to continue reviewing the rest.
>>
>> Thank you for your review!
>>
>>>
>>> Q: Why not force IOMMU off when re-booting a kexec kernel to perform a crash
>>>   dump?  After the crash dump, the system is rebooting to previous
>>>   (iommu=on) setting.
>>>   That logic, along w/your previous patch to disable the IOMMU if
>>>   iommu=off is set, would remove this (relatively slow) PCI init
>>>   sequencing ?
>>
>> To force iommu off, all ongoing DMA have to be stopped before that since
>> they are accessing the device address, not physical address. If we disable
>> iommu without stopping in-flight DMA, devices access invalid memory area
>> and it causes memory corruption or PCI-SERR due to DMA error.
> Right, that's a 'duh' on my part.
> I thought 'disable iommu' == 'block all dma' and it just turns it off &
> let's 

Re: [PATCH v2] PowerPC: kernel: compiling issue, make additional room in exception vector area

2013-04-25 Thread Mike Qiu

On 2013/4/26 10:06, Chen Gang wrote:

On 2013-04-26 10:03, Mike Qiu wrote:

On 2013/4/26 9:36, Chen Gang wrote:

On 2013-04-26 09:18, Chen Gang wrote:

On 2013-04-26 09:06, Chen Gang wrote:

CFAR is the Come From Register.  It saves the location of the last

branch and is hence overwritten by any branch.


Do we process it just like others done (e.g. 0x300, 0xe00, 0xe20 ...) ?
. = 0x900
.globl decrementer_pSeries
decrementer_pSeries:
HMT_MEDIUM_PPR_DISCARD
SET_SCRATCH0(r13)
b decrementer_pSeries_0

...



Oh, it seems EXCEPTION_PROLOG_1 will save the registers related to
CFAR, so I think we need to move EXCEPTION_PROLOG_1 to near 0x900.

I will try your diff V2, to see if the machine can boot up

OK, thanks. (hope it can work)
It seems that the machine can boot up in powernv mode, but I'm not
sure if my machine calls that module.


At least my machine can boot up

Thanks
Mike


:-)





[PATCH] cgroup: restore the call to eventfd->poll()

2013-04-25 Thread Li Zefan
I mistakenly removed the call to eventfd->poll() while I was actually
intending to remove the return value...

Calling eventfd->poll() will hook cgroup_event_wake() to the poll
waitqueue, which will be called to unregister the eventfd when a
cgroup is removed (rmdir) or the eventfd is closed.

Signed-off-by: Li Zefan 
---
 kernel/cgroup.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index a45aa12..4b0f2ef 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -3882,6 +3882,8 @@ static int cgroup_write_event_control(struct cgroup *cgrp, struct cftype *cft,
 	if (ret)
 		goto fail;
 
+	efile->f_op->poll(efile, &event->pt);
+
 	/*
 	 * Events should be removed after rmdir of cgroup directory, but before
 	 * destroying subsystem state objects. Let's take reference to cgroup
-- 
1.8.0.2



[PATCH] cgroup: fix use-after-free when umounting cgroupfs

2013-04-25 Thread Li Zefan
Try:
  # mount -t cgroup xxx /cgroup
  # mkdir /cgroup/sub && rmdir /cgroup/sub && umount /cgroup

And you might see this:

ida_remove called for id=1 which is not allocated.

It's because cgroup_kill_sb() is called to destroy root->cgroup_ida
and free cgrp->root before ida_simple_remove() is called. What's
worse is we're accessing cgrp->root after it has been freed.

Signed-off-by: Li Zefan 
---
 kernel/cgroup.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 6780459..a45aa12 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -845,9 +845,12 @@ static void cgroup_free_fn(struct work_struct *work)
 	 */
 	dput(cgrp->parent->dentry);
 
+	ida_simple_remove(&cgrp->root->cgroup_ida, cgrp->id);
+
 	/*
 	 * Drop the active superblock reference that we took when we
-	 * created the cgroup
+	 * created the cgroup. This will free cgrp->root, if we are
+	 * holding the last reference to @sb.
 	 */
 	deactivate_super(cgrp->root->sb);
 
@@ -859,7 +862,6 @@ static void cgroup_free_fn(struct work_struct *work)
 
 	simple_xattrs_free(&cgrp->xattrs);
 
-	ida_simple_remove(&cgrp->root->cgroup_ida, cgrp->id);
 	kfree(rcu_dereference_raw(cgrp->name));
 	kfree(cgrp);
 }
-- 
1.8.0.2
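
The ordering rule behind this fix can be shown with a tiny user-space
model in C. The names (toy_root, toy_cgrp, toy_put_root, toy_free_cgrp)
are hypothetical stand-ins, not kernel code: toy_put_root() plays the role
of deactivate_super() and frees the root on the last reference, so the ID
has to be released while the root is still guaranteed to be alive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical toy model; not the kernel's cgroup structures. */
#define TOY_MAX_ID 4

struct toy_root {
	int refs;			/* references held on the root */
	bool id_used[TOY_MAX_ID];	/* stand-in for the root's ID allocator */
};

struct toy_cgrp {
	struct toy_root *root;
	int id;
};

/* Drop one reference; free the root on the last one. Returns true if freed. */
bool toy_put_root(struct toy_root *root)
{
	if (--root->refs > 0)
		return false;
	free(root);
	return true;
}

/* Free a cgroup. Returns true if this also freed the root. */
bool toy_free_cgrp(struct toy_cgrp *cgrp)
{
	struct toy_root *root = cgrp->root;
	bool root_freed;

	assert(cgrp->id >= 0 && cgrp->id < TOY_MAX_ID);
	assert(root->id_used[cgrp->id]);

	/* 1. Release the ID while the root is still valid. */
	root->id_used[cgrp->id] = false;

	/* 2. Only now drop the reference; the root may be freed here. */
	root_freed = toy_put_root(root);

	/* 3. Nothing below this point may touch the root. */
	free(cgrp);
	return root_freed;
}
```

Swapping steps 1 and 2 in this model reproduces the bug described in the
commit message: the ID would be released through a root that may already
have been freed.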


Re: mm: BUG in do_huge_pmd_wp_page

2013-04-25 Thread H. Peter Anvin
On 04/24/2013 04:40 PM, Simon Jeons wrote:
> 
> I see in memblock_trim_memory(): start = round_up(orig_start, align);
> here align is PAGE_SIZE, so the dump of zone ranges on my machine is
> [   0.00]  DMA  [mem 0x1000-0x00ff]. Why is PFN 0 not
> used? Just for alignment?
> 

PFN 0 contains the real-mode interrupt vector table and BIOS data area,
so we just reserve it.  Avoids issues with zero being special, too.

-hpa




[PATCH] forced argument Was Re: sparse: incorrect type in argument 1 (different address spaces)

2013-04-25 Thread Christopher Li
On 04/22/2013 11:16 PM, Dan Carpenter wrote:
> That didn't work.  It's the void * in the parameter list that's
> the problem.  We'd need to do something like the patch below:
> 
> Otherwise we could add "__ok_to_cast" thing to Sparse maybe?

Thanks for the insight. I made a small patch to test the __ok_to_cast
feature. The syntax is to add the force attribute to the argument
declaration.

It will look like this:
static inline long __must_check PTR_ERR( __force const void *ptr)

That means the "ptr" argument will perform a forced cast when receiving
the argument. It is OK to pass an __iomem pointer to "ptr".

The examples are in the patch. Both sparse and the Linux tree need to
be patched.

What do you say?

Chris

From a0974ed0fc1e67c41608c780b748c205622956b8 Mon Sep 17 00:00:00 2001
From: Christopher Li 
Date: Thu, 25 Apr 2013 18:09:43 -0700
Subject: [PATCH] Allow forced attribute in function argument

This indicates that the argument skips the type compatibility check.
---
 evaluate.c             |  2 +-
 parse.c                |  1 +
 symbol.h               |  3 ++-
 validation/fored_arg.c | 18 ++++++++++++++++++
 4 files changed, 22 insertions(+), 2 deletions(-)
 create mode 100644 validation/fored_arg.c

diff --git a/evaluate.c b/evaluate.c
index 9f2c4ac..0dfa519 100644
--- a/evaluate.c
+++ b/evaluate.c
@@ -2137,7 +2137,7 @@ static int evaluate_arguments(struct symbol *f, struct symbol *fn, struct expres
 else
 	degenerate(expr);
 			}
-		} else {
+		} else if (!target->forced_arg){
 			static char where[30];
 			examine_symbol_type(target);
 			sprintf(where, "argument %d", i);
diff --git a/parse.c b/parse.c
index 45ffc10..890e56b 100644
--- a/parse.c
+++ b/parse.c
@@ -1841,6 +1841,7 @@ static struct token *parameter_declaration(struct token *token, struct symbol *s
 	sym->ctype = ctx.ctype;
 	sym->ctype.modifiers |= storage_modifiers();
 	sym->endpos = token->pos;
+	sym->forced_arg = ctx.storage_class == SForced;
 	return token;
 }
 
diff --git a/symbol.h b/symbol.h
index 1e74579..1c6ad66 100644
--- a/symbol.h
+++ b/symbol.h
@@ -157,7 +157,8 @@ struct symbol {
 	expanding:1,
 	evaluated:1,
 	string:1,
-	designated_init:1;
+	designated_init:1,
+	forced_arg:1;
 			struct expression *array_size;
 			struct ctype ctype;
 			struct symbol_list *arguments;
diff --git a/validation/fored_arg.c b/validation/fored_arg.c
new file mode 100644
index 000..4ab7141
--- /dev/null
+++ b/validation/fored_arg.c
@@ -0,0 +1,18 @@
+/*
+ * check-name: Forced function argument type.
+ */
+
+#define __iomem	__attribute__((noderef, address_space(2)))
+#define __force __attribute__((force))
+
+static void foo(__force void * addr)
+{
+}
+
+
+static void bar(void)
+{
+	void __iomem  *a;
+	foo(a);
+}
+
-- 
1.8.1.4



Re: [PATCH v2] PowerPC: kernel: compiling issue, make additional room in exception vector area

2013-04-25 Thread Chen Gang
On 2013-04-26 10:03, Mike Qiu wrote:
> On 2013/4/26 9:36, Chen Gang wrote:
>> > On 2013-04-26 09:18, Chen Gang wrote:
>>> >> On 2013-04-26 09:06, Chen Gang wrote:
>  CFAR is the Come From Register.  It saves the location of the last
>> > branch and is hence overwritten by any branch.
>> >
 >>> Do we process it just like others done (e.g. 0x300, 0xe00, 0xe20 ...) ?
 >>>. = 0x900
 >>>.globl decrementer_pSeries
 >>> decrementer_pSeries:
 >>>HMT_MEDIUM_PPR_DISCARD
 >>>SET_SCRATCH0(r13)
 >>>b decrementer_pSeries_0
 >>>
 >>>...
 >>>
 >>>
>> > Oh, it seems EXCEPTION_PROLOG_1 will save the registers related to
>> > CFAR, so I think we need to move EXCEPTION_PROLOG_1 to near 0x900.
> I will try your diff V2, to see if the machine can boot up

OK, thanks. (hope it can work)

:-)

-- 
Chen Gang

Asianux Corporation


Re: "attempt to move .org backwards" still show up

2013-04-25 Thread Chen Gang
On 2013-04-26 09:58, Mike Qiu wrote:
> On 2013/4/25 19:16, Chen Gang wrote:
>> On 2013-04-25 14:25, Paul Mackerras wrote:
>>> On Thu, Apr 25, 2013 at 12:05:54PM +0800, Mike Qiu wrote:
> This has blocked my work now
> So I hope you can take a look ASAP
> Thanks
> :)
>
> Mike
>>> As a quick fix, turn on CONFIG_KVM_BOOK3S_64_HV.  That will eliminate
>>> the immediate problem.
>> Yes, just as in my original reply to Mike on bypassing it, but I got no
>> reply; I guess he has to deal with CONFIG_KVM_BOOK3S_64_PR.
>>
>> Now I am fixing it; when I finish a patch, please help check.
> Actually, compilation passes with your patch, but I saw Michael Neuling's
> reply, so I stopped and am waiting for your new patch :)
> 

I am continuing (until it gets fixed, I should continue)


> Now I will use your V2 patch to build

Please see the discussion of patch v2; it still has other issues, but
I am still trying (I guess Michael is checking it).

:-)

-- 
Chen Gang

Asianux Corporation


Re: [PATCH v2] PowerPC: kernel: compiling issue, make additional room in exception vector area

2013-04-25 Thread Mike Qiu
On 2013/4/26 9:36, Chen Gang wrote:
> On 2013-04-26 09:18, Chen Gang wrote:
>> On 2013-04-26 09:06, Chen Gang wrote:
 CFAR is the Come From Register.  It saves the location of the last
> branch and is hence overwritten by any branch.
>
>>> Do we process it just like others done (e.g. 0x300, 0xe00, 0xe20 ...) ?
>>> . = 0x900
>>> .globl decrementer_pSeries
>>> decrementer_pSeries:
>>> HMT_MEDIUM_PPR_DISCARD
>>> SET_SCRATCH0(r13)
>>> b decrementer_pSeries_0
>>>
>>> ...
>>>
>>>
> Oh, it seems EXCEPTION_PROLOG_1 will save the registers related to
> CFAR, so I think we need to move EXCEPTION_PROLOG_1 to near 0x900.

I will try your diff V2, to see if the machine can boot up
> -diff v2 begin-
>
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index e789ee7..f0489c4 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -254,7 +254,15 @@ hardware_interrupt_hv:
>   STD_EXCEPTION_PSERIES(0x800, 0x800, fp_unavailable)
>   KVM_HANDLER_PR(PACA_EXGEN, EXC_STD, 0x800)
>
> - MASKABLE_EXCEPTION_PSERIES(0x900, 0x900, decrementer)
> + . = 0x900
> + .globl decrementer_pSeries
> +decrementer_pSeries:
> + HMT_MEDIUM_PPR_DISCARD
> + SET_SCRATCH0(r13)   /* save r13 */
> + EXCEPTION_PROLOG_0(PACA_EXGEN)
> + EXCEPTION_PROLOG_1(PACA_EXGEN, SOFTEN_TEST_PR, 0x900)
> + b   decrementer_pSeries_0
> +
>   STD_EXCEPTION_HV(0x980, 0x982, hdecrementer)
>
>   MASKABLE_EXCEPTION_PSERIES(0xa00, 0xa00, doorbell_super)
> @@ -536,6 +544,11 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_206)
>  #endif
>
>   .align  7
> + /* moved from 0x900 */
> +decrementer_pSeries_0:
> + EXCEPTION_PROLOG_PSERIES_1(decrementer_common, EXC_STD)
> +
> + .align  7
>   /* moved from 0xe00 */
>   STD_EXCEPTION_HV_OOL(0xe02, h_data_storage)
>   KVM_HANDLER_SKIP(PACA_EXGEN, EXC_HV, 0xe02)
>
>
> -diff v2 end---
>
>
>> Is something like the fix below OK (the same approach already used for 0x300 and 0x200)?
>>
>> Please check, thanks.
>>
>> ---diff begin-
>>
>> diff --git a/arch/powerpc/kernel/exceptions-64s.S 
>> b/arch/powerpc/kernel/exceptions-64s.S
>> index e789ee7..a0a5ff2 100644
>> --- a/arch/powerpc/kernel/exceptions-64s.S
>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>> @@ -254,7 +254,14 @@ hardware_interrupt_hv:
>>  STD_EXCEPTION_PSERIES(0x800, 0x800, fp_unavailable)
>>  KVM_HANDLER_PR(PACA_EXGEN, EXC_STD, 0x800)
>>  
>> -MASKABLE_EXCEPTION_PSERIES(0x900, 0x900, decrementer)
>> +. = 0x900
>> +.globl decrementer_pSeries
>> +decrementer_pSeries:
>> +HMT_MEDIUM_PPR_DISCARD
>> +SET_SCRATCH0(r13)   /* save r13 */
>> +EXCEPTION_PROLOG_0(PACA_EXGEN)
>> +b   decrementer_pSeries_0
>> +
>>  STD_EXCEPTION_HV(0x980, 0x982, hdecrementer)
>>  
>>  MASKABLE_EXCEPTION_PSERIES(0xa00, 0xa00, doorbell_super)
>> @@ -536,6 +543,12 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_206)
>>  #endif
>>  
>>  .align  7
>> +/* moved from 0x900 */
>> +decrementer_pSeries_0:
>> +EXCEPTION_PROLOG_1(PACA_EXGEN, SOFTEN_TEST_PR, 0x900)
>> +EXCEPTION_PROLOG_PSERIES_1(decrementer_common, EXC_STD)
>> +
>> +.align  7
>>  /* moved from 0xe00 */
>>  STD_EXCEPTION_HV_OOL(0xe02, h_data_storage)
>>  KVM_HANDLER_SKIP(PACA_EXGEN, EXC_HV, 0xe02)
>>
>> ---diff end---
>>
>



Re: mm: BUG in do_huge_pmd_wp_page

2013-04-25 Thread Dave Jones
On Thu, Apr 25, 2013 at 08:51:27PM -0400, Sasha Levin wrote:
 > On 04/24/2013 06:46 PM, Andrew Morton wrote:
 > > Guys, did this get fixed?
 > 
 > I've stopped seeing that during fuzzing, so I guess that it got fixed 
 > somehow...

We've had reports of users hitting this in 3.8

eg:
https://bugzilla.redhat.com/show_bug.cgi?id=947985
https://bugzilla.redhat.com/show_bug.cgi?id=956730 

I'm sure there are other reports of it too.

Would be good if we can figure out what fixed it (if it is actually fixed)
for backporting to stable

Dave
 


Re: "attempt to move .org backwards" still show up

2013-04-25 Thread Mike Qiu

On 2013/4/25 19:16, Chen Gang wrote:

On 2013-04-25 14:25, Paul Mackerras wrote:

On Thu, Apr 25, 2013 at 12:05:54PM +0800, Mike Qiu wrote:

This is blocking my work now,
so I hope you can take a look ASAP.
Thanks
:)

Mike

As a quick fix, turn on CONFIG_KVM_BOOK3S_64_HV.  That will eliminate
the immediate problem.

Yes, that is what I suggested to Mike in my original reply as a way to bypass
it, but I got no reply; I guess he has to use CONFIG_KVM_BOOK3S_64_PR.

I am working on a fix now; when I have a patch ready, please help check it.
Actually, I got a clean compile with your patch, but then I saw Michael
Neuling's reply,

so I stopped there and will wait for your new patch :)

Now I will use your v2 patch to build.

Thanks

Mike

Thanks.





Re: [Resend][Bug fix PATCH v5] Reusing a resource structure allocated by bootmem

2013-04-25 Thread Yasuaki Ishimatsu

2013/04/25 5:37, Andrew Morton wrote:

On Wed, 24 Apr 2013 08:50:21 +0900 Yasuaki Ishimatsu 
 wrote:


When hot removing memory presented at boot time, following messages are shown:

[  296.867031] [ cut here ]
[  296.922273] kernel BUG at mm/slub.c:3409!

...

These messages appear because a resource structure allocated by bootmem is
released with kfree(). So when we release a resource structure, we should
check whether it was allocated by bootmem.

But even if we know a resource structure was allocated by bootmem, we cannot
release it, since the SLxB allocators cannot handle it. So, to reuse such a
resource structure, this patch remembers it in bootmem_resource as follows:

When free_resource() releases a resource structure, it checks whether the
structure was allocated by bootmem. If it was, free_resource() adds it to
bootmem_resource. If it was not, free_resource() releases it with kfree().

When get_resource() is asked for a new resource structure, it checks whether
bootmem_resource holds any released resource structures. If it does,
get_resource() returns one of them. If it does not, get_resource() returns a
new resource structure
allocated by kzalloc().
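
The reuse scheme described above can be sketched in userspace C. This is
only an illustration under stated assumptions: struct res, res_free() and
res_alloc() are hypothetical names, a from_boot flag stands in for the
PageSlab() test used by the patch, and locking is left out because the
sketch is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct resource; only what the sketch needs. */
struct res {
	struct res *sibling;	/* reused as the free-list link, as in the patch */
	bool from_boot;		/* assumption: a flag replaces the PageSlab() test */
	int payload;
};

/* Entries that cannot be handed back to the heap are parked here. */
static struct res *boot_free_list;

static void res_free(struct res *r)
{
	if (!r)
		return;

	if (r->from_boot) {
		/* Not heap memory: push it onto the reuse list. */
		r->sibling = boot_free_list;
		boot_free_list = r;
	} else {
		free(r);
	}
}

static struct res *res_alloc(void)
{
	struct res *r = boot_free_list;

	if (r) {
		/* Pop a parked entry and hand it out zeroed. */
		boot_free_list = r->sibling;
		memset(r, 0, sizeof(*r));
		r->from_boot = true;	/* the origin must survive the memset */
		return r;
	}

	/* Nothing parked: fall back to a zeroed heap allocation. */
	return calloc(1, sizeof(*r));
}
```

Because the sketch records the origin in a flag, it has to restore the flag
after zeroing the entry. The patch derives the origin from the page backing
the address, so it needs no such step.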

...



Looks good to me.


--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -21,6 +21,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 


@@ -50,6 +51,14 @@ struct resource_constraint {

  static DEFINE_RWLOCK(resource_lock);

+/*
+ * For memory hotplug, there is no way to free resource entries allocated
+ * by boot mem after the system is up. So for reusing the resource entry
+ * we need to remember the resource.
+ */
+static struct resource *bootmem_resource_free;
+static DEFINE_SPINLOCK(bootmem_resource_lock);
+
  static void *r_next(struct seq_file *m, void *v, loff_t *pos)
  {
struct resource *p = v;
@@ -151,6 +160,40 @@ __initcall(ioresources_init);

  #endif /* CONFIG_PROC_FS */

+static void free_resource(struct resource *res)
+{
+   if (!res)
+   return;
+
+   if (!PageSlab(virt_to_head_page(res))) {


Did you consider using a bit in resource.flags?  There appear to be
four free ones left.  The VM trickery will work OK I guess, but isn't
very "nice".


+   spin_lock(&bootmem_resource_lock);
+   res->sibling = bootmem_resource_free;
+   bootmem_resource_free = res;
+   spin_unlock(&bootmem_resource_lock);
+   } else {
+   kfree(res);
+   }
+}
+
+static struct resource *get_resource(gfp_t flags)
+{
+   struct resource *res = NULL;
+
+   spin_lock(&bootmem_resource_lock);
+   if (bootmem_resource_free) {
+   res = bootmem_resource_free;
+   bootmem_resource_free = res->sibling;
+   }
+   spin_unlock(&bootmem_resource_lock);
+
+   if (res)
+   memset(res, 0, sizeof(struct resource));
+   else
+   res = kzalloc(sizeof(struct resource), flags);
+
+   return res;
+}





I think I'll rename this to alloc_resource().  In Linux "get" often
(but not always) means "take a reference on".  So "get" pairs with
"put" and "alloc" pairs with "free".


I forgot to answer this.
I agree, and I have no objection to your updated patch.

Thanks,
Yasuaki Ishimatsu



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majord...@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .






[PATCH] PCI: move down pci_fixup_final for hotplug path

2013-04-25 Thread Yinghai Lu
David found a resource conflict issue after
| PCI: Put pci_dev in device tree as early as possible
| commit 4f535093cf8f6da8cfda7c36c2c1ecd2e9586ee4

and
| USB: Fix handoff when BIOS disables host PCI device
| commit: cab928ee1f221c9cc48d6615070fefe2e444384a

in the USB quirks on the hotplug path.

Looking at pci_fixup_device() with pci_fixup_final,
the boot path and the hot-add path now behave differently.

Boot path: because pci_apply_fix_final_quirks is not set yet,
pci_fixup_device(pci_fixup_final) is skipped
in pci_device_add().
Later, pci_apply_final_quirks() is called for all
PCI devices via fs_initcall.
That is after pci_assign_unassigned_resources().
In that case a quirk can use the BARs without problems.

Hotadd path: pci_fixup_device(pci_fixup_final) is executed
via pci_device_add(), and that is too early for the hotplug
path, because the BARs of hot-added devices are not assigned yet
after commit 4f535093.

So we need to move it down for the hotplug path and call it in
pci_bus_add_devices() instead, which runs just before
drivers get attached.
That is a similar calling point to pci_device_add() before
commit 4f535093 was applied.

This fix should go into v3.9, but it is too late for that now,
so get it into v3.10 and then into v3.9 stable.

Reported-by: David Bulkow 
Tested-by: David Bulkow 
Signed-off-by: Yinghai Lu 

---
 drivers/pci/bus.c   |1 +
 drivers/pci/probe.c |1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/drivers/pci/bus.c
===
--- linux-2.6.orig/drivers/pci/bus.c
+++ linux-2.6/drivers/pci/bus.c
@@ -201,6 +201,7 @@ void pci_bus_add_devices(const struct pc
/* Skip already-added devices */
if (dev->is_added)
continue;
+   pci_fixup_device(pci_fixup_final, dev);
retval = pci_bus_add_device(dev);
if (retval)
dev_err(&dev->dev, "Error adding device (%d)\n",
Index: linux-2.6/drivers/pci/probe.c
===
--- linux-2.6.orig/drivers/pci/probe.c
+++ linux-2.6/drivers/pci/probe.c
@@ -1341,7 +1341,6 @@ void pci_device_add(struct pci_dev *dev,
list_add_tail(&dev->bus_list, &bus->devices);
up_write(&pci_bus_sem);
 
-   pci_fixup_device(pci_fixup_final, dev);
ret = pcibios_add_device(dev);
WARN_ON(ret < 0);
 


[PATCH -v3] PCI, ACPI, hotplug: Fix BUS_CHECK event handle on root bridge

2013-04-25 Thread Yinghai Lu
Gavin found that acpiphp no longer handles hotplug, even now that
acpiphp is built in, in preparation for v3.10.

Bjorn analyzed the boot log and found:
acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
ACPI: PCI Root Bridge [PCI0] (domain  [bus 00-3e])
\_SB_.PCI0:_OSC invalid UUID
acpiphp: Slot [1] registered
acpiphp: Slot [1-1] registered
   acpi root: \_SB_.PCI0 notify handler is installed
_handle_hotplug_event_root: Bus check notify on \_SB_.PCI0
_handle_hotplug_event_root: Bus check notify on \_SB_.PCI0
And that means:
   So we should be using acpiphp, which you do have built in statically,
   and it found a couple slots.  And we did get two bus check notifies on
   \_SB_.PCI0, so we *should* be re-enumerating PCI bus :00.  But it
   looks like we're handling this as a host bridge hotplug event instead
   of a PCI device hotplug.  My guess is that
   handle_root_bridge_insertion() does nothing because the PCI0 ACPI
   device already exists, though I would expect to see the "acpi device
   exists..." in your dmesg log if this were the case.

Also, according to Rafael and Bjorn, it is perfectly fine to
enumerate the bus by sending an event to the root bridge after hot-adding
a device to a slot under that root bridge or one of its child bridges.

It turns out that this is a regression caused by
| commit 668192b678201d2fff27c6cc76bb003c1ec4a52a
| Author: Yinghai Lu 
| Date:   Mon Jan 21 13:20:48 2013 -0800
|
|PCI: acpiphp: Move host bridge hotplug to pci_root.c

We should check slots when BUS_CHECK is sent to the root bridge's ACPI handle.

Restore the old behavior by calling acpiphp_check_bridge() and
check_sub_bridges() in acpiphp.

Jiang Liu actually has a similar patch, but it forgets to call
acpiphp_check_bridge() for systems that have slots directly on the root bus.
That case is still valid; QEMU, at least, still has slots on bus 0.
My first version of this patch, however, wrongly checked whether the root
bridge exists before running check_sub_bridges() for the child bridges.

-v2: Don't check the bridge before acpi_walk_namespace() with check_sub_bridges.
 Also put back the bridge reference.
-v3: Expand the changelog, etc.

Reported-by: Gavin Guo 
Tested-by: Gavin Guo 
Signed-off-by: Yinghai Lu 

---
 drivers/acpi/pci_root.c|2 ++
 drivers/pci/hotplug/acpiphp_glue.c |   14 ++
 include/linux/pci-acpi.h   |2 ++
 3 files changed, 18 insertions(+)

Index: linux-2.6/drivers/acpi/pci_root.c
===
--- linux-2.6.orig/drivers/acpi/pci_root.c
+++ linux-2.6/drivers/acpi/pci_root.c
@@ -643,6 +643,8 @@ static void _handle_hotplug_event_root(s
 (char *)buffer.pointer);
if (!root)
handle_root_bridge_insertion(handle);
+   else
+   acpiphp_check_host_bridge(handle);
 
break;
 
Index: linux-2.6/drivers/pci/hotplug/acpiphp_glue.c
===
--- linux-2.6.orig/drivers/pci/hotplug/acpiphp_glue.c
+++ linux-2.6/drivers/pci/hotplug/acpiphp_glue.c
@@ -950,6 +950,20 @@ check_sub_bridges(acpi_handle handle, u3
return AE_OK ;
 }
 
+void acpiphp_check_host_bridge(acpi_handle handle)
+{
+   struct acpiphp_bridge *bridge;
+
+   bridge = acpiphp_handle_to_bridge(handle);
+   if (bridge) {
+   acpiphp_check_bridge(bridge);
+   put_bridge(bridge);
+   }
+
+   acpi_walk_namespace(ACPI_TYPE_DEVICE, handle,
+   ACPI_UINT32_MAX, check_sub_bridges, NULL, NULL, NULL);
+}
+
 static void _handle_hotplug_event_bridge(struct work_struct *work)
 {
struct acpiphp_bridge *bridge;
Index: linux-2.6/include/linux/pci-acpi.h
===
--- linux-2.6.orig/include/linux/pci-acpi.h
+++ linux-2.6/include/linux/pci-acpi.h
@@ -60,11 +60,13 @@ static inline void acpi_pci_slot_remove(
 void acpiphp_init(void);
 void acpiphp_enumerate_slots(struct pci_bus *bus, acpi_handle handle);
 void acpiphp_remove_slots(struct pci_bus *bus);
+void acpiphp_check_host_bridge(acpi_handle handle);
 #else
 static inline void acpiphp_init(void) { }
 static inline void acpiphp_enumerate_slots(struct pci_bus *bus,
   acpi_handle handle) { }
 static inline void acpiphp_remove_slots(struct pci_bus *bus) { }
+static inline void acpiphp_check_host_bridge(acpi_handle handle) { }
 #endif
 
 #else  /* CONFIG_ACPI */


[PATCH] PCI: Fix racing for pci device removing via sysfs

2013-04-25 Thread Yinghai Lu
Gu found that nested removal through
echo -n 1 > /sys/bus/pci/devices/\:10\:00.0/remove ; echo -n 1 >
/sys/bus/pci/devices/\:1a\:01.0/remove

causes a kernel crash because the bus gets freed.

[  418.946462] CPU 4
[  418.968377] Pid: 512, comm: kworker/u:2 Tainted: GW3.8.0 #2
FUJITSU-SV PRIMEQUEST 1800E/SB
[  419.081763] RIP: 0010:[]  []
pci_bus_read_config_word+0x5e/0x90
[  420.494137] Call Trace:
[  420.523326]  [] ? remove_callback+0x1f/0x40
[  420.591984]  [] pci_pme_active+0x4b/0x1c0
[  420.658545]  [] pci_stop_bus_device+0x57/0xb0
[  420.729259]  [] pci_stop_and_remove_bus_device+0x16/0x30
[  420.811392]  [] remove_callback+0x2b/0x40
[  420.877955]  [] sysfs_schedule_callback_work+0x26/0x70

https://bugzilla.kernel.org/show_bug.cgi?id=54411

We have a patch that makes the device hold a bus reference to keep the bus
from being freed, but that still triggers a warning:

[ cut here ]
WARNING: at lib/list_debug.c:53 __list_del_entry+0x63/0xd0()
Hardware name: PRIMEQUEST 1800E
list_del corruption, 8807d1b6c000->next is LIST_POISON1 (dead00100100)
Call Trace:
 [] warn_slowpath_common+0x7f/0xc0
 [] warn_slowpath_fmt+0x46/0x50
 [] __list_del_entry+0x63/0xd0
 [] list_del+0x11/0x40
 [] pci_destroy_dev+0x31/0xc0
 [] pci_remove_bus_device+0x5b/0x70
 [] pci_stop_and_remove_bus_device+0x1e/0x30
 [] remove_callback+0x29/0x40
 [] sysfs_schedule_callback_work+0x24/0x70

We can simply check, while holding pci_remove_rescan_mutex, whether
the device has already been removed from the PCI tree.
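
The idea of looking the device up again by its address before removing it
can be sketched in userspace C. This is a simplified illustration, not the
kernel code: the registry, struct dev_key and remove_request() are
hypothetical, and the sketch is single-threaded, so comments mark where
pci_remove_rescan_mutex would be held.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_DEVS 8

/* Address triple recorded before the lock is taken, as in the patch. */
struct dev_key {
	int domain;
	unsigned char bus;
	unsigned char devfn;
};

struct dev_slot {
	bool present;
	struct dev_key key;
	int parent;	/* registry index of the upstream bridge, or -1 */
};

static struct dev_slot registry[MAX_DEVS];

/* Return the registry index of a live device with this address, or -1. */
static int find_dev(struct dev_key k)
{
	for (int i = 0; i < MAX_DEVS; i++)
		if (registry[i].present &&
		    registry[i].key.domain == k.domain &&
		    registry[i].key.bus == k.bus &&
		    registry[i].key.devfn == k.devfn)
			return i;
	return -1;
}

static int add_dev(struct dev_key k, int parent)
{
	for (int i = 0; i < MAX_DEVS; i++) {
		if (!registry[i].present) {
			registry[i].present = true;
			registry[i].key = k;
			registry[i].parent = parent;
			return i;
		}
	}
	return -1;
}

/* Removing a bridge also removes everything below it. */
static void remove_tree(int idx)
{
	for (int i = 0; i < MAX_DEVS; i++)
		if (registry[i].present && registry[i].parent == idx)
			remove_tree(i);
	registry[idx].present = false;
}

/* Remove by address; a request for an already removed device is a no-op. */
static bool remove_request(struct dev_key k)
{
	bool removed = false;
	int idx;

	/* the removal mutex would be taken here */
	idx = find_dev(k);
	if (idx >= 0) {
		remove_tree(idx);
		removed = true;
	}
	/* the removal mutex would be released here */

	return removed;
}
```

With a bridge and a child registered, removing the bridge first makes the
later request for the child return false instead of touching freed state.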

Reported-by: Gu Zheng 
Tested-by: Gu Zheng 
Signed-off-by: Yinghai Lu 

---
 drivers/pci/pci-sysfs.c |9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

Index: linux-2.6/drivers/pci/pci-sysfs.c
===
--- linux-2.6.orig/drivers/pci/pci-sysfs.c
+++ linux-2.6/drivers/pci/pci-sysfs.c
@@ -329,9 +329,16 @@ dev_rescan_store(struct device *dev, str
 static void remove_callback(struct device *dev)
 {
struct pci_dev *pdev = to_pci_dev(dev);
+   int domain = pci_domain_nr(pdev->bus);
+   u8 bus = pdev->bus->number;
+   u8 devfn = pdev->devfn;
 
mutex_lock(&pci_remove_rescan_mutex);
-   pci_stop_and_remove_bus_device(pdev);
+   pdev = pci_get_domain_bus_and_slot(domain, bus, devfn);
+   if (pdev) {
+   pci_dev_put(pdev);
+   pci_stop_and_remove_bus_device(pdev);
+   }
mutex_unlock(&pci_remove_rescan_mutex);
 }
 


[PATCH] PCI: Remove duplicate pci_disable_device for pcie port

2013-04-25 Thread Yinghai Lu
While chasing a PCI xHCI hotplug problem, David Bulkow found

static void pcie_portdrv_remove(struct pci_dev *dev)
{
pcie_port_device_remove(dev);
pci_disable_device(dev);
}
and
void pcie_port_device_remove(struct pci_dev *dev)
{
device_for_each_child(&dev->dev, NULL, remove_iter);
cleanup_service_irqs(dev);
pci_disable_device(dev);
}

The extra pci_disable_device() call in pcie_port_device_remove() was added by
| commit dc5351784eb36f1fec4efa88e01581be72c0b711
| Author: Kenji Kaneshige 
| Date:   Wed Nov 25 21:04:00 2009 +0900
|
|PCI: portdrv: cleanup service irqs initialization

so pci_disable_device() is called twice.

We should remove the extra one in pcie_portdrv_remove().

Reported-by: David Bulkow 
Signed-off-by: Yinghai Lu 

---
 drivers/pci/pcie/portdrv_pci.c |1 -
 1 file changed, 1 deletion(-)

Index: linux-2.6/drivers/pci/pcie/portdrv_pci.c
===
--- linux-2.6.orig/drivers/pci/pcie/portdrv_pci.c
+++ linux-2.6/drivers/pci/pcie/portdrv_pci.c
@@ -223,7 +223,6 @@ static int pcie_portdrv_probe(struct pci
 static void pcie_portdrv_remove(struct pci_dev *dev)
 {
pcie_port_device_remove(dev);
-   pci_disable_device(dev);
 }
 
 static int error_detected_iter(struct device *device, void *data)


[PATCH 0/2] sched: Add cond_resched_rcu_lock() helper

2013-04-25 Thread Simon Horman
Add a helper for use in loops that read data protected by RCU and
may have a large number of iterations.  Such an example is dumping the list
of connections known to IPVS: ip_vs_conn_array() and ip_vs_conn_seq_next().

This series also updates the two ip_vs functions mentioned above
to use the helper.

As suggested by Eric Dumazet.

Simon Horman (2):
  sched: Add cond_resched_rcu_lock() helper
  ipvs: Use cond_resched_rcu_lock() helper when dumping connections

 include/linux/sched.h   | 9 +
 net/netfilter/ipvs/ip_vs_conn.c | 6 ++
 2 files changed, 11 insertions(+), 4 deletions(-)

-- 
1.8.2.1



[PATCH 1/2] sched: Add cond_resched_rcu_lock() helper

2013-04-25 Thread Simon Horman
This is intended for use in loops which read data protected by RCU and may
have a large number of iterations.  Such an example is dumping the list of
connections known to IPVS: ip_vs_conn_array() and ip_vs_conn_seq_next().

As suggested by Eric Dumazet.

Cc: Eric Dumazet 
Cc: Julian Anastasov 
Signed-off-by: Simon Horman 
---
 include/linux/sched.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index e692a02..7eec4c7 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2787,3 +2787,12 @@ static inline unsigned long rlimit_max(unsigned int 
limit)
 }
 
 #endif
+
+static void inline cond_resched_rcu_lock(void)
+{
+   if (need_resched()) {
+   rcu_read_unlock();
+   cond_resched();
+   rcu_read_lock();
+   }
+}
-- 
1.8.2.1



[PATCH 2/2] ipvs: Use cond_resched_rcu_lock() helper when dumping connections

2013-04-25 Thread Simon Horman
This avoids the situation where a dump of a large number of connections
may prevent scheduling for a long time while also avoiding excessive
calls to rcu_read_unlock() and rcu_read_lock().
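
The effect can be illustrated with a userspace sketch. Everything here is
an assumption made for illustration: the *_stub functions are hypothetical
stand-ins for need_resched(), cond_resched(), rcu_read_lock() and
rcu_read_unlock(), and walk_buckets() is an invented loop that only counts
how often the read-side lock is dropped.

```c
#include <assert.h>
#include <stdbool.h>

static bool resched_pending;	/* stand-in for the scheduler's request flag */
static int lock_drops;		/* how often the read-side lock was dropped */

/* Hypothetical userspace stubs; the kernel primitives do real work. */
static bool need_resched_stub(void) { return resched_pending; }
static void cond_resched_stub(void) { resched_pending = false; }
static void rcu_read_lock_stub(void) { }
static void rcu_read_unlock_stub(void) { }

/* Same shape as the helper added in patch 1/2. */
static void cond_resched_rcu_lock_sketch(void)
{
	if (need_resched_stub()) {
		rcu_read_unlock_stub();
		lock_drops++;
		cond_resched_stub();
		rcu_read_lock_stub();
	}
}

/*
 * Walk nbuckets buckets while a reschedule request arrives once every
 * `period` buckets. Returns how often the lock was dropped mid-walk.
 */
static int walk_buckets(int nbuckets, int period)
{
	lock_drops = 0;
	resched_pending = false;

	rcu_read_lock_stub();
	for (int i = 0; i < nbuckets; i++) {
		if ((i + 1) % period == 0)
			resched_pending = true;
		cond_resched_rcu_lock_sketch();
	}
	rcu_read_unlock_stub();

	return lock_drops;
}
```

In this model, walking 1000 buckets with a request every 100 buckets drops
the lock 10 times, where an unconditional unlock/lock pair per bucket would
drop it 1000 times.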

Cc: Eric Dumazet 
Cc: Julian Anastasov 
Signed-off-by: Simon Horman 
---
 net/netfilter/ipvs/ip_vs_conn.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
index a083bda..42a7b33 100644
--- a/net/netfilter/ipvs/ip_vs_conn.c
+++ b/net/netfilter/ipvs/ip_vs_conn.c
@@ -975,8 +975,7 @@ static void *ip_vs_conn_array(struct seq_file *seq, loff_t 
pos)
return cp;
}
}
-   rcu_read_unlock();
-   rcu_read_lock();
+   cond_resched_rcu_lock();
}
 
return NULL;
@@ -1015,8 +1014,7 @@ static void *ip_vs_conn_seq_next(struct seq_file *seq, 
void *v, loff_t *pos)
iter->l = &ip_vs_conn_tab[idx];
return cp;
}
-   rcu_read_unlock();
-   rcu_read_lock();
+   cond_resched_rcu_lock();
}
iter->l = NULL;
return NULL;
-- 
1.8.2.1



Re: [PATCH 3/3] f2fs: enhnace alloc_nid and build_free_nids flows

2013-04-25 Thread Namjae Jeon
2013/4/25, Jaegeuk Kim :
> In order to avoid build_free_nid lock contention, let's change the order of
> function calls as follows.
>
> At first, check whether there are enough free nids.
>  - If available, just get a free nid with spin_lock without any overhead.
>  - Otherwise, conduct build_free_nids.
>   : scan nat pages, journal nat entries, and nat cache entries.
>
> We should take care not to serve free nids produced partway through
> build_free_nids.
> We can get stable free nids only after build_free_nids is done.
>
> Signed-off-by: Jaegeuk Kim 
I can't find any issues in this patch.
Reviewed-by: Namjae Jeon 

Thanks.
> ---


Re: [PATCH v2] PowerPC: kernel: compiling issue, make additional room in exception vector area

2013-04-25 Thread Chen Gang
On 2013-04-26 09:18, Chen Gang wrote:
> On 2013-04-26 09:06, Chen Gang wrote:
>>> CFAR is the Come From Register.  It saves the location of the last
>>> branch and is hence overwritten by any branch.

>> Do we process it just like others done (e.g. 0x300, 0xe00, 0xe20 ...) ?
>>  . = 0x900
>>  .globl decrementer_pSeries
>> decrementer_pSeries:
>>  HMT_MEDIUM_PPR_DISCARD
>>  SET_SCRATCH0(r13)
>>  b decrementer_pSeries_0
>>
>>  ...
>>
>>

Oh, it seems EXCEPTION_PROLOG_1 saves the registers related to
CFAR, so I think we need to move EXCEPTION_PROLOG_1 close to 0x900.

-diff v2 begin-

diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index e789ee7..f0489c4 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -254,7 +254,15 @@ hardware_interrupt_hv:
STD_EXCEPTION_PSERIES(0x800, 0x800, fp_unavailable)
KVM_HANDLER_PR(PACA_EXGEN, EXC_STD, 0x800)
 
-   MASKABLE_EXCEPTION_PSERIES(0x900, 0x900, decrementer)
+   . = 0x900
+   .globl decrementer_pSeries
+decrementer_pSeries:
+   HMT_MEDIUM_PPR_DISCARD
+   SET_SCRATCH0(r13)   /* save r13 */
+   EXCEPTION_PROLOG_0(PACA_EXGEN)
+   EXCEPTION_PROLOG_1(PACA_EXGEN, SOFTEN_TEST_PR, 0x900)
+   b   decrementer_pSeries_0
+
STD_EXCEPTION_HV(0x980, 0x982, hdecrementer)
 
MASKABLE_EXCEPTION_PSERIES(0xa00, 0xa00, doorbell_super)
@@ -536,6 +544,11 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_206)
 #endif
 
.align  7
+   /* moved from 0x900 */
+decrementer_pSeries_0:
+   EXCEPTION_PROLOG_PSERIES_1(decrementer_common, EXC_STD)
+
+   .align  7
/* moved from 0xe00 */
STD_EXCEPTION_HV_OOL(0xe02, h_data_storage)
KVM_HANDLER_SKIP(PACA_EXGEN, EXC_HV, 0xe02)


-diff v2 end---


> 
> Is something like the fix below OK (the same approach already used for 0x300 and 0x200)?
> 
> Please check, thanks.
> 
> ---diff begin-
> 
> diff --git a/arch/powerpc/kernel/exceptions-64s.S 
> b/arch/powerpc/kernel/exceptions-64s.S
> index e789ee7..a0a5ff2 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -254,7 +254,14 @@ hardware_interrupt_hv:
>   STD_EXCEPTION_PSERIES(0x800, 0x800, fp_unavailable)
>   KVM_HANDLER_PR(PACA_EXGEN, EXC_STD, 0x800)
>  
> - MASKABLE_EXCEPTION_PSERIES(0x900, 0x900, decrementer)
> + . = 0x900
> + .globl decrementer_pSeries
> +decrementer_pSeries:
> + HMT_MEDIUM_PPR_DISCARD
> + SET_SCRATCH0(r13)   /* save r13 */
> + EXCEPTION_PROLOG_0(PACA_EXGEN)
> + b   decrementer_pSeries_0
> +
>   STD_EXCEPTION_HV(0x980, 0x982, hdecrementer)
>  
>   MASKABLE_EXCEPTION_PSERIES(0xa00, 0xa00, doorbell_super)
> @@ -536,6 +543,12 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_206)
>  #endif
>  
>   .align  7
> + /* moved from 0x900 */
> +decrementer_pSeries_0:
> + EXCEPTION_PROLOG_1(PACA_EXGEN, SOFTEN_TEST_PR, 0x900)
> + EXCEPTION_PROLOG_PSERIES_1(decrementer_common, EXC_STD)
> +
> + .align  7
>   /* moved from 0xe00 */
>   STD_EXCEPTION_HV_OOL(0xe02, h_data_storage)
>   KVM_HANDLER_SKIP(PACA_EXGEN, EXC_HV, 0xe02)
> 
> ---diff end---
> 


-- 
Chen Gang

Asianux Corporation


Re: mm: BUG in do_huge_pmd_wp_page

2013-04-25 Thread Minchan Kim
Hello hpa,

On Wed, Apr 24, 2013 at 03:51:12PM -0700, H. Peter Anvin wrote:
> On 04/10/2013 01:02 AM, Minchan Kim wrote:
> > 
> > While looking at the code, I wondered about the logic of GHZP (aka
> > get_huge_zero_page) reference handling. The logic depends on the page
> > allocator never allocating PFN 0.
> > 
> > Who guarantees that? What happens if the allocator does allocate PFN 0?
> > I don't know whether every architecture guarantees it.
> > Did you investigate it for all arches?
> > 
> 
> This isn't manifest, right?  At least on x86 we should never, ever
> allocate PFN 0.

Thanks for confirming.

-- 
Kind regards,
Minchan Kim


Re: [PATCH 1/2] netfilter: idletimers - fix the case of already expired timer

2013-04-25 Thread Pablo Neira Ayuso
Hi,

Same thing with this patch:

https://patchwork.kernel.org/patch/2333841/

Regards.

On Sun, Apr 21, 2013 at 11:53:13AM +0200, dmitry pervushin wrote:
> From: dmitry pervushin 
> 
> Fix the case in which timer has expired and we refresh it without
> sending the notification
> 
> Signed-off-by: Ashish Sharma 
> Signed-off-by: JP Abgrall 
> Signed-off-by: John Stultz 
> Signed-off-by: dmitry pervushin 
> ---
>  net/netfilter/xt_IDLETIMER.c |   18 --
>  1 file changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/net/netfilter/xt_IDLETIMER.c b/net/netfilter/xt_IDLETIMER.c
> index f407ebc1..3540c04 100644
> --- a/net/netfilter/xt_IDLETIMER.c
> +++ b/net/netfilter/xt_IDLETIMER.c
> @@ -168,14 +168,22 @@ static unsigned int idletimer_tg_target(struct sk_buff 
> *skb,
>const struct xt_action_param *par)
>  {
>   const struct idletimer_tg_info *info = par->targinfo;
> + unsigned long now = jiffies;
>  
>   pr_debug("resetting timer %s, timeout period %u\n",
>info->label, info->timeout);
>  
>   BUG_ON(!info->timer);
>  
> + if (time_before(info->timer->timer.expires, now)) {
> + schedule_work(&info->timer->work);
> + pr_debug("Starting timer %s (Expired, Jiffies): %lu, %lu\n",
> +  info->label, info->timer->timer.expires, now);
> + }
> +
> + /* TODO: Avoid modifying timers on each packet */
>   mod_timer(&info->timer->timer,
> -   msecs_to_jiffies(info->timeout * 1000) + jiffies);
> +   msecs_to_jiffies(info->timeout * 1000) + now);
>  
>   return XT_CONTINUE;
>  }
> @@ -184,6 +192,7 @@ static int idletimer_tg_checkentry(const struct 
> xt_tgchk_param *par)
>  {
>   struct idletimer_tg_info *info = par->targinfo;
>   int ret;
> + unsigned long now = jiffies;
>  
>   pr_debug("checkentry targinfo%s\n", info->label);
>  
> @@ -204,8 +213,13 @@ static int idletimer_tg_checkentry(const struct 
> xt_tgchk_param *par)
>   info->timer = __idletimer_tg_find_by_label(info->label);
>   if (info->timer) {
>   info->timer->refcnt++;
> + if (time_before(info->timer->timer.expires, now)) {
> + schedule_work(&info->timer->work);
> + pr_debug("Starting Checkentry timer (Expired, Jiffies): 
> %lu, %lu\n",
> + info->timer->timer.expires, now);
> + }
>   mod_timer(&info->timer->timer,
> -   msecs_to_jiffies(info->timeout * 1000) + jiffies);
> +   msecs_to_jiffies(info->timeout * 1000) + now);
>  
>   pr_debug("increased refcnt of timer %s to %u\n",
>info->label, info->timer->refcnt);
> -- 
> 1.7.10.4
> 
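
The expiry check in the quoted patch relies on a wraparound-safe comparison
of tick counters. A minimal userspace sketch of that comparison follows,
assuming a free-running unsigned long tick counter. tick_before(), struct
idle_timer and idle_timer_refresh() are hypothetical names; the kernel's
own macro is time_before().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * "a is earlier than b" for a counter that wraps around. Subtract in
 * unsigned arithmetic, then look at the sign of the difference. This
 * assumes the usual two's complement conversion to long.
 */
static bool tick_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Hypothetical timer state for the sketch. */
struct idle_timer {
	unsigned long expires;
	int notifications;
};

/* Re-arm the timer; if the deadline already passed, notify first. */
static void idle_timer_refresh(struct idle_timer *t, unsigned long now,
			       unsigned long timeout_ticks)
{
	if (tick_before(t->expires, now))
		t->notifications++;
	t->expires = now + timeout_ticks;
}
```

A plain `expires < now` would give the wrong answer for a deadline set just
before the counter wraps and checked just after it.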


Re: [PATCH 2/2] netfilter: idletimers - add send_nl_msg field

2013-04-25 Thread Pablo Neira Ayuso
Hi Dmitry,

You got some feedback for this patch:

https://patchwork.kernel.org/patch/2333851/

This patch still does not seem to address some points I already mentioned.

Please have a look at my previous email and let me know if you have
any questions.

Thanks.
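
The notify_netlink_uevent() helper in the patch quoted below builds
KEY=value environment strings with snprintf() and rejects any that would
not fit. A small userspace sketch of that truncation check follows.
build_env() and ENV_MAX are hypothetical names, with ENV_MAX mirroring
NLMSG_MAX_SIZE from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ENV_MAX 64	/* mirrors NLMSG_MAX_SIZE in the quoted patch */

/*
 * Format "key=val" into buf. snprintf() returns the length the full
 * string would have had, so a result >= size means it was truncated.
 * Returns 0 on success, -1 on truncation or a formatting error.
 */
static int build_env(char *buf, size_t size, const char *key, const char *val)
{
	int res = snprintf(buf, size, "%s=%s", key, val);

	if (res < 0 || (size_t)res >= size)
		return -1;
	return 0;
}
```

The buffer is still NUL-terminated after a truncated write, so the caller
can log it safely, but it should not be sent as an event variable.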

On Sun, Apr 21, 2013 at 11:53:14AM +0200, dmitry pervushin wrote:
> Send notifications when the label becomes active after an idle period.
> Send netlink message notifications in addition to sysfs notifications.
> Using a uevent with
>   subsystem=xt_idletimer
>   INTERFACE=...
>   STATE={active,inactive}
> 
> This is a backport from common android-3.0
> commit: beb914e987cbbd368988d2b94a6661cb907c4d5a
> with uevent support instead of a new netlink message type.
> 
> Cc: JP Abgrall 
> Cc: Ashish Sharma 
> Cc: Peter P Waskiewicz Jr 
> Signed-off-by: Ashish Sharma 
> Signed-off-by: JP Abgrall 
> Signed-off-by: John Stultz 
> Signed-off-by: dmitry pervushin 
> ---
>  include/uapi/linux/netfilter/xt_IDLETIMER.h |   16 +-
>  net/netfilter/xt_IDLETIMER.c|  234 
> +++
>  2 files changed, 180 insertions(+), 70 deletions(-)
> 
> diff --git a/include/uapi/linux/netfilter/xt_IDLETIMER.h 
> b/include/uapi/linux/netfilter/xt_IDLETIMER.h
> index 208ae93..e50bc3d 100644
> --- a/include/uapi/linux/netfilter/xt_IDLETIMER.h
> +++ b/include/uapi/linux/netfilter/xt_IDLETIMER.h
> @@ -39,7 +39,21 @@ struct idletimer_tg_info {
>   char label[MAX_IDLETIMER_LABEL_SIZE];
>  
>   /* for kernel module internal use only */
> - struct idletimer_tg *timer __attribute__((aligned(8)));
> + struct idletimer_tg *timer __aligned(8);
>  };
>  
> +#define NL_EVENT_TYPE_INACTIVE 0
> +#define NL_EVENT_TYPE_ACTIVE 1
> +
> +struct idletimer_tg_info_v1 {
> + __u32 timeout;
> +
> + char label[MAX_IDLETIMER_LABEL_SIZE];
> +
> + /* Use netlink messages for notification in addition to sysfs */
> + __u8 send_nl_msg;
> +
> + /* for kernel module internal use only */
> + struct idletimer_tg *timer __aligned(8);
> +};
>  #endif
> diff --git a/net/netfilter/xt_IDLETIMER.c b/net/netfilter/xt_IDLETIMER.c
> index 3540c04..6eb25ab 100644
> --- a/net/netfilter/xt_IDLETIMER.c
> +++ b/net/netfilter/xt_IDLETIMER.c
> @@ -41,6 +41,8 @@
>  #include 
>  #include 
>  
> +#define NLMSG_MAX_SIZE 64
> +
>  struct idletimer_tg_attr {
>   struct attribute attr;
>   ssize_t (*show)(struct kobject *kobj,
> @@ -56,6 +58,8 @@ struct idletimer_tg {
>   struct idletimer_tg_attr attr;
>  
>   unsigned int refcnt;
> + bool send_nl_msg;
> + bool active;
>  };
>  
>  static LIST_HEAD(idletimer_tg_list);
> @@ -63,6 +67,30 @@ static DEFINE_MUTEX(list_mutex);
>  
>  static struct kobject *idletimer_tg_kobj;
>  
> +static void notify_netlink_uevent(const char *iface, struct idletimer_tg 
> *timer)
> +{
> + char iface_msg[NLMSG_MAX_SIZE];
> + char state_msg[NLMSG_MAX_SIZE];
> + char *envp[] = { iface_msg, state_msg, NULL };
> + int res;
> +
> + res = snprintf(iface_msg, NLMSG_MAX_SIZE, "INTERFACE=%s",
> +iface);
> + if (NLMSG_MAX_SIZE <= res) {
> + pr_err("message too long (%d)", res);
> + return;
> + }
> + res = snprintf(state_msg, NLMSG_MAX_SIZE, "STATE=%s",
> +timer->active ? "active" : "inactive");
> + if (NLMSG_MAX_SIZE <= res) {
> + pr_err("message too long (%d)", res);
> + return;
> + }
> + pr_debug("putting nlmsg: <%s> <%s>\n", iface_msg, state_msg);
> + kobject_uevent_env(idletimer_tg_kobj, KOBJ_CHANGE, envp);
> + return;
> +}
> +
>  static
>  struct idletimer_tg *__idletimer_tg_find_by_label(const char *label)
>  {
> @@ -83,6 +111,7 @@ static ssize_t idletimer_tg_show(struct kobject *kobj, 
> struct attribute *attr,
>  {
>   struct idletimer_tg *timer;
>   unsigned long expires = 0;
> + unsigned long now = jiffies;
>  
>   mutex_lock(&list_mutex);
>  
> @@ -92,11 +121,15 @@ static ssize_t idletimer_tg_show(struct kobject *kobj, 
> struct attribute *attr,
>  
>   mutex_unlock(&list_mutex);
>  
> - if (time_after(expires, jiffies))
> + if (time_after(expires, now))
>   return sprintf(buf, "%u\n",
> -jiffies_to_msecs(expires - jiffies) / 1000);
> +jiffies_to_msecs(expires - now) / 1000);
>  
> - return sprintf(buf, "0\n");
> + if (timer->send_nl_msg)
> + return sprintf(buf, "0 %d\n",
> + jiffies_to_msecs(now - expires) / 1000);
> + else
> + return sprintf(buf, "0\n");
>  }
>  
>  static void idletimer_tg_work(struct work_struct *work)
> @@ -105,6 +138,9 @@ static void idletimer_tg_work(struct work_struct *work)
> work);
>  
>   sysfs_notify(idletimer_tg_kobj, NULL, timer->attr.attr.name);
> +
> + if (timer->send_nl_msg)
> + notify_netlink_uevent(timer->attr.attr.name, 

Re: [PATCH v2] PowerPC: kernel: compiling issue, make additional room in exception vector area

2013-04-25 Thread Chen Gang
On 2013-04-26 09:06, Chen Gang wrote:
>> CFAR is the Come From Register.  It saves the location of the last
>> > branch and is hence overwritten by any branch.
>> > 
> Should we handle it the same way the others do (e.g. 0x300, 0xe00, 0xe20 ...)?
>   . = 0x900
>   .globl decrementer_pSeries
> decrementer_pSeries:
>   HMT_MEDIUM_PPR_DISCARD
>   SET_SCRATCH0(r13)
>   b decrementer_pSeries_0
> 
>   ...
> 
> 

Would a fix such as the one below be OK (it follows what 0x300 and 0x200 already do)?

Please check, thanks.

---diff begin-

diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index e789ee7..a0a5ff2 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -254,7 +254,14 @@ hardware_interrupt_hv:
STD_EXCEPTION_PSERIES(0x800, 0x800, fp_unavailable)
KVM_HANDLER_PR(PACA_EXGEN, EXC_STD, 0x800)
 
-   MASKABLE_EXCEPTION_PSERIES(0x900, 0x900, decrementer)
+   . = 0x900
+   .globl decrementer_pSeries
+decrementer_pSeries:
+   HMT_MEDIUM_PPR_DISCARD
+   SET_SCRATCH0(r13)   /* save r13 */
+   EXCEPTION_PROLOG_0(PACA_EXGEN)
+   b   decrementer_pSeries_0
+
STD_EXCEPTION_HV(0x980, 0x982, hdecrementer)
 
MASKABLE_EXCEPTION_PSERIES(0xa00, 0xa00, doorbell_super)
@@ -536,6 +543,12 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_206)
 #endif
 
.align  7
+   /* moved from 0x900 */
+decrementer_pSeries_0:
+   EXCEPTION_PROLOG_1(PACA_EXGEN, SOFTEN_TEST_PR, 0x900)
+   EXCEPTION_PROLOG_PSERIES_1(decrementer_common, EXC_STD)
+
+   .align  7
/* moved from 0xe00 */
STD_EXCEPTION_HV_OOL(0xe02, h_data_storage)
KVM_HANDLER_SKIP(PACA_EXGEN, EXC_HV, 0xe02)

---diff end---

-- 
Chen Gang

Asianux Corporation


Re: [PATCH 1/1] driver,usb: Fix a warning in uhci-hcd driver

2013-04-25 Thread ZhenHua


On 04/25/2013 10:54 PM, Alan Stern wrote:

On Thu, 25 Apr 2013, ZhenHua wrote:


+#define UHCI_SUSPENDRH_RETRY_MAX  10
+#define UHCI_SUSPENDRH_RETRY_DELAY100

Why is the delay set to 100 us?  Isn't that excessively large?  How
long does it take for this controller to go into suspend?

This controller takes about 200~400 us, but I am not sure how long
other devices take.
I set the interval to 100 us, so it will save more time.

A 400-us delay is fairly long.  It would be better to avoid it
entirely.

It is the device, not the OS, that needs about 200~400 us to stop.
Other devices will not have to wait.


Why are these variables u16?  Why not int?

uhci_readw will return u16.

That's not a good reason, since u16 fits perfectly well inside an
int.  But never mind...


Anyway, a better approach would be not to add a delay loop at all.
Instead, change this test:

	if (!auto_stop && !(uhci_readw(uhci, USBSTS) & USBSTS_HCH)) {
		uhci->rh_state = UHCI_RH_SUSPENDING;
		spin_unlock_irq(&uhci->lock);
		msleep(1);
		spin_lock_irq(&uhci->lock);
		if (uhci->dead)
			return;
	}

When the iLo controller is present, make the "if" statement always
succeed.  Then you'll get a whole 1-ms delay.

This will cause extra operations and take more time for other devices.

Actually what I wrote was wrong anyway.  I forgot that when auto_stop
is set, the routine is not allowed to sleep.

A better way to solve your problem is to change uhci_hub_status_data().
In the UHCI_RH_RUNNING_NODEVS case, change the line that says

else if (time_after_eq(jiffies, uhci->auto_stop_time))

to

else if (time_after_eq(jiffies, uhci->auto_stop_time) &&
!uhci->no_auto_stops)

where uhci->no_auto_stops is a new bitflag that you set inside
uhci_pci_init() if you detect that the controller is an iLo virtual
UHCI controller.

This way there will always be a 1-ms delay, so the slow controller will
suspend successfully.  And other types of host controllers won't be
affected, because the no_auto_stops flag won't get set for them.
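
A minimal userspace sketch of the check described above, assuming a cut-down
stand-in for the driver state. The struct uhci_sketch and the helpers
should_auto_stop() and time_after_eq_ul() are hypothetical names used only for
illustration; just auto_stop_time and the proposed no_auto_stops flag come from
the text, and the comparison follows the wraparound-safe idea behind the
kernel's time_after_eq():

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Wraparound-safe "a is at or after b", in the spirit of the kernel's
 * time_after_eq(). Like the kernel macro, it relies on the usual
 * two's-complement conversion from unsigned long to long. */
static bool time_after_eq_ul(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/* Hypothetical, cut-down stand-in for the controller state. */
struct uhci_sketch {
	unsigned long auto_stop_time;	/* tick count at which auto-stop may happen */
	bool no_auto_stops;		/* would be set at init time for the slow controller */
};

/* Auto-stop only when the timeout has passed and the flag is clear. */
static bool should_auto_stop(const struct uhci_sketch *uhci, unsigned long now)
{
	assert(uhci != NULL);
	return time_after_eq_ul(now, uhci->auto_stop_time) &&
	       !uhci->no_auto_stops;
}
```

With the flag set, should_auto_stop() never returns true, so in this sketch
the controller would always take the non-auto-stop path.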

Alan Stern



I think it is a good idea, and the logic of the code may be
clearer.  I will do some testing on my system.


Thanks
Zhen-Hua



Re: [PATCH v2] PowerPC: kernel: compiling issue, make additional room in exception vector area

2013-04-25 Thread Chen Gang
On 2013-04-26 07:16, Michael Neuling wrote:
>> > diff --git a/arch/powerpc/kernel/exceptions-64s.S 
>> > b/arch/powerpc/kernel/exceptions-64s.S
>> > index e789ee7..8997de2 100644
>> > --- a/arch/powerpc/kernel/exceptions-64s.S
>> > +++ b/arch/powerpc/kernel/exceptions-64s.S
>> > @@ -254,7 +254,11 @@ hardware_interrupt_hv:
>> >STD_EXCEPTION_PSERIES(0x800, 0x800, fp_unavailable)
>> >KVM_HANDLER_PR(PACA_EXGEN, EXC_STD, 0x800)
>> >  
>> > -  MASKABLE_EXCEPTION_PSERIES(0x900, 0x900, decrementer)
>> > +  . = 0x900
>> > +  .globl decrementer_pSeries
>> > +decrementer_pSeries:
>> > +  b   decrementer_pSeries_0
>> > +
> Unfortunately you can't do this either, as we need to save the CFAR[1]
> before it's overwritten by any branch. MASKABLE_EXCEPTION_PSERIES does
> this.
> 

Thanks for checking.

> CFAR is the Come From Register.  It saves the location of the last
> branch and is hence overwritten by any branch.
> 

Should we handle it the same way the others do (e.g. 0x300, 0xe00, 0xe20 ...)?
. = 0x900
.globl decrementer_pSeries
decrementer_pSeries:
HMT_MEDIUM_PPR_DISCARD
SET_SCRATCH0(r13)
b decrementer_pSeries_0

...


> Thanks for trying.
> 

Not at all. Until someone else fixes it, I should keep trying.

-- 
Chen Gang

Asianux Corporation


Re: For review (v2): user_namespaces(7) man page

2013-04-25 Thread Eric W. Biederman
richard -rw- weinberger  writes:

> On Wed, Mar 27, 2013 at 10:26 PM, Michael Kerrisk (man-pages)
>  wrote:
>>Inside the user namespace, the shell has user and group  ID  0,
>>and a full set of permitted and effective capabilities:
>>
>>bash$ cat /proc/$$/status | egrep '^[UG]id'
>>Uid: 0000
>>Gid: 0000
>>bash$ cat /proc/$$/status | egrep '^Cap(Prm|Inh|Eff)'
>>CapInh:   
>>CapPrm:   001f
>>CapEff:   001f
>
> I've tried your demo program, but inside the new ns I'm automatically nobody.
> As Eric said, setuid(0)/setgid(0) are missing.

Is it the setuid/setgid or not setting up the uid/gid map?

> Eric, maybe you can help me. How can I drop capabilities within a user
> namespace?

> In childFunc() I did add prctl(PR_CAPBSET_DROP, CAP_NET_ADMIN) but it always
> returns EPERM.
> Why is that? I thought I would get a completely fresh set of caps which I can modify.
> I don't want uid 0 inside the container to have all caps.

There are weird things that happen with exec and the user namespace.  If
you have exec'd as an unmapped user all of your capabilities have
already been dropped.

> And why does /proc/*/loginuid always contain 4294967295 in a new user 
> namespace?
> Writing to it also fails. (Noticed that because pam_loginuid.so does not 
> work).

Almost certainly because the loginuid has already been set.  Yes. It
looks like I am simply using from_kuid instead of from_kuid_munged on
the read.  So an unmapped loginuid will be reported as 4294967295.

In some circumstances 65534 (nobody) is definitely better, in some it is
a toss-up, and most of the time no one really cares.  So I have tried to
do something, but in this case I don't know which is the best policy.
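
A rough userspace sketch of the difference described above, assuming a
single-range uid map. The struct uid_map_sketch and the helpers map_uid() and
map_uid_munged() are hypothetical stand-ins, not the kernel's from_kuid() and
from_kuid_munged() implementations; 65534 is the "nobody" id mentioned above,
and (uint32_t)-1 is the value that prints as 4294967295:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INVALID_UID_SKETCH  ((uint32_t)-1)	/* prints as 4294967295 */
#define OVERFLOW_UID_SKETCH ((uint32_t)65534)	/* "nobody" */

/* Hypothetical one-range map: outside ids [lower, lower + count)
 * appear inside the namespace as [first, first + count). */
struct uid_map_sketch {
	uint32_t first;
	uint32_t lower;
	uint32_t count;
};

/* Plain translation: an unmapped id comes back as (uint32_t)-1. */
static uint32_t map_uid(const struct uid_map_sketch *map, uint32_t kuid)
{
	assert(map != NULL);
	if (kuid >= map->lower && kuid - map->lower < map->count)
		return map->first + (kuid - map->lower);
	return INVALID_UID_SKETCH;
}

/* "Munged" translation: an unmapped id is reported as the overflow id. */
static uint32_t map_uid_munged(const struct uid_map_sketch *map, uint32_t kuid)
{
	uint32_t uid = map_uid(map, kuid);

	return uid == INVALID_UID_SKETCH ? OVERFLOW_UID_SKETCH : uid;
}
```

In this sketch a loginuid that was set outside the mapped range reads back as
4294967295 through map_uid() and as 65534 through map_uid_munged().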

> Final question, is it by design that uid 0 within a namespace is not
> allowed to write to
> /proc/*/oom_score_adj?

Essentially.  It is by design that uid 0 within a namespace be mapped to
some other uid outside the namespace, and that the permissions on writes
should use the permission needed outside of the user namespace.

Which means there are all kinds of things only uid 0 can write to that
you can't touch in a user namespace.  For some of those things the policy
may need to be reconsidered.  For a lot of them the default policy
is good.  Regardless, we are now defaulting to not letting root in a
container do risky things, which is a good thing.

Eric


Re: mm: BUG in do_huge_pmd_wp_page

2013-04-25 Thread Sasha Levin
On 04/24/2013 06:46 PM, Andrew Morton wrote:
> On Thu, 11 Apr 2013 11:14:29 -0400 Sasha Levin  wrote:
> 
>> On 04/11/2013 11:13 AM, Kirill A. Shutemov wrote:
>>> Sasha Levin wrote:
>>>> On 04/10/2013 04:02 AM, Minchan Kim wrote:
>>>>> I don't know whether this issue has already been resolved. If so, my reply
>>>>> becomes just a question to Kirill, regardless of this BUG.
>>>>
>>>> The issue is still reproducible with today's -next.
>>>
>>> Could you share your kernel config and configuration of your virtual 
>>> machine?
>>
>> I've attached my .config.
>>
>> I start the vm using:
>>
>> ./vm sandbox --rng --balloon -k /usr/src/linux/arch/x86/boot/bzImage -d run 
>> -d /dev/shm/swap --no-dhcp -m 3072 -c 6 -p
>> "init=/virt/init zcache ftrace_dump_on_oops debugpat kvm.mmu_audit=1 
>> memblock=debug slub_debug=FZPU" -- /runtrin.sh
>>
>> Where /runtrin.sh inside the vm simply mounts some stuff like sysfs and proc,
>> creates the swap space and runs trinity.
> 
> Guys, did this get fixed?

I've stopped seeing that during fuzzing, so I guess that it got fixed somehow...


Thanks,
Sasha



Re: [PATCH] MAINTAINERS: Update Grant's email address and maintainership

2013-04-25 Thread Stephen Rothwell
Hi Mark,

On Fri, 19 Apr 2013 11:22:37 +0100 Mark Brown  wrote:
>
> On Fri, Apr 19, 2013 at 09:18:58AM +1000, Stephen Rothwell wrote:
> 
> > Done.  Should I also use your kernel.org address as your contact address
> > (instead of broo...@opensource.wolfsonmicro.com)?
> 
> Yes, please - the Wolfson one won't work at some point.  I always
> assumed you got that stuff from MAINTAINERS, please also check that
> you're using lgirdw...@gmail.com for Liam Girdwood.

Done.  I had already changed Liam's address.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




Re: [PATCH 11/21] irqdomain: export irq_domain_add_simple

2013-04-25 Thread Simon Horman
On Thu, Apr 25, 2013 at 07:28:54PM +0200, Arnd Bergmann wrote:
> All other irq_domain_add_* functions are exported already, and apparently
> this one got left out by mistake, which causes build errors for ARM
> allmodconfig kernels:
> 
> ERROR: "irq_domain_add_simple" [drivers/gpio/gpio-rcar.ko] undefined!
> ERROR: "irq_domain_add_simple" [drivers/gpio/gpio-em.ko] undefined!
> 
> Signed-off-by: Arnd Bergmann 
> Cc: Benjamin Herrenschmidt 
> Cc: Grant Likely 
> Cc: Thomas Gleixner 
> Cc: Simon Horman 
> Cc: Laurent Pinchart 
> Cc: Magnus Damm 

Acked-by: Simon Horman 

Grant, could you consider taking this one?

> ---
>  kernel/irq/irqdomain.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
> index 059a280..c532c15 100644
> --- a/kernel/irq/irqdomain.c
> +++ b/kernel/irq/irqdomain.c
> @@ -194,6 +194,7 @@ struct irq_domain *irq_domain_add_simple(struct 
> device_node *of_node,
>   /* A linear domain is the default */
>   return irq_domain_add_linear(of_node, size, ops, host_data);
>  }
> +EXPORT_SYMBOL_GPL(irq_domain_add_simple);
>  
>  /**
>   * irq_domain_add_legacy() - Allocate and register a legacy revmap 
> irq_domain.
> -- 
> 1.8.1.2
> 


Re: [PATCH 03/21] ARM: shmobile: don't call irqchip_init unconditionally

2013-04-25 Thread Simon Horman
On Thu, Apr 25, 2013 at 07:28:46PM +0200, Arnd Bergmann wrote:
> The irqchip_init function is only available when building
> with CONFIG_OF enabled, which causes this build failure for
> bonito_defconfig:
> 
> arch/arm/mach-shmobile/built-in.o: In function `r8a7740_init_irq_of':
> :(.init.text+0x580): undefined reference to `irqchip_init'
> 
> This makes both the OF and the ATAGS portion of the driver
> conditional, which avoids the build error and also results
> in smaller object code if not both are enabled, without the
> need for an #ifdef.
> 
> Signed-off-by: Arnd Bergmann 
> Cc: Bastian Hecht 
> Cc: Simon Horman 
> Cc: Kuninori Morimoto 
> ---
>  arch/arm/mach-shmobile/intc-r8a7740.c | 13 +++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/mach-shmobile/intc-r8a7740.c 
> b/arch/arm/mach-shmobile/intc-r8a7740.c
> index 8871f77..5dc57f1 100644
> --- a/arch/arm/mach-shmobile/intc-r8a7740.c
> +++ b/arch/arm/mach-shmobile/intc-r8a7740.c
> @@ -53,14 +53,23 @@ static void __init r8a7740_init_irq_common(void)
>  
>  void __init r8a7740_init_irq_of(void)
>  {
> + if (!IS_ENABLED(CONFIG_OF))
> + return;
> +

In other parts of the shmobile code I believe that such code is
guarded by #ifdef CONFIG_OF, and I believe not guarding this code in
some way was an oversight.

The above change seems fine to me.

>   irqchip_init();
>   r8a7740_init_irq_common();
>  }
>  
>  void __init r8a7740_init_irq(void)
>  {
> - void __iomem *gic_dist_base = ioremap_nocache(0xc280, 0x1000);
> - void __iomem *gic_cpu_base = ioremap_nocache(0xc200, 0x1000);
> + void __iomem *gic_dist_base;
> + void __iomem *gic_cpu_base;
> +
> + if (!IS_ENABLED(CONFIG_ATAGS))
> + return;
> +
> + gic_dist_base = ioremap_nocache(0xc280, 0x1000);
> + gic_cpu_base = ioremap_nocache(0xc200, 0x1000);
>  
>   /* initialize the Generic Interrupt Controller PL390 r0p0 */
>   gic_init(0, 29, gic_dist_base, gic_cpu_base);

This one seems broken as the armadillo800eva board currently uses
it to initialise GIC even if CONFIG_ATAGS is not defined.

I tested the above change on the armadillo800eva board
with CONFIG_ATAGS disabled; the result was
a boot failure. With the change reverted, booting seems fine.


Re: [PATCH 10/21] [SCSI] nsp32: use mdelay instead of large udelay constants

2013-04-25 Thread Masanori Goto
2013/4/25 Arnd Bergmann 
>
> ARM cannot handle udelay for more than 2 milliseconds, so we
> should use mdelay instead for those.
>

Signed-off-by: GOTO Masanori 

> Signed-off-by: Arnd Bergmann 
> Cc: GOTO Masanori 
> Cc: YOKOTA Hiroshi 
> Cc: "James E.J. Bottomley" 
> Cc: linux-s...@vger.kernel.org
> ---
>  drivers/scsi/nsp32.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/nsp32.c b/drivers/scsi/nsp32.c
> index 1e3879d..0665f9c 100644
> --- a/drivers/scsi/nsp32.c
> +++ b/drivers/scsi/nsp32.c
> @@ -2899,7 +2899,7 @@ static void nsp32_do_bus_reset(nsp32_hw_data *data)
>  * reset SCSI bus
>  */
> nsp32_write1(base, SCSI_BUS_CONTROL, BUSCTL_RST);
> -   udelay(RESET_HOLD_TIME);
> +   mdelay(RESET_HOLD_TIME / 1000);
> nsp32_write1(base, SCSI_BUS_CONTROL, 0);
> for(i = 0; i < 5; i++) {
> intrdat = nsp32_read2(base, IRQ_STATUS); /* dummy read */
> --
> 1.8.1.2
>


Re: [PATCH] mm, highmem: remove useless virtual variable in page_address_map

2013-04-25 Thread Joonsoo Kim
On Thu, Apr 25, 2013 at 03:00:57PM -0700, Andrew Morton wrote:
> On Mon, 22 Apr 2013 17:26:28 +0900 Joonsoo Kim  wrote:
> 
> > We can get virtual address without virtual field.
> > So remove it.
> > 
> > ...
> >
> > --- a/mm/highmem.c
> > +++ b/mm/highmem.c
> > @@ -320,7 +320,6 @@ EXPORT_SYMBOL(kunmap_high);
> >   */
> >  struct page_address_map {
> > struct page *page;
> > -   void *virtual;
> > struct list_head list;
> >  };
> >  
> > @@ -362,7 +361,10 @@ void *page_address(const struct page *page)
> >  
> > list_for_each_entry(pam, &pas->lh, list) {
> > if (pam->page == page) {
> > -   ret = pam->virtual;
> > +   int nr;
> > +
> > +   nr = pam - page_address_map;
> 
> Doesn't compile.  Presumably you meant page_address_maps.
> 
> I'll drop this - please resend if/when it has been runtime tested.

Sorry for that.
I'll resend when it has been runtime tested.

Thanks.
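
For reference, the idea behind the quoted hunk (recovering an entry's index
from its position in a static array instead of storing a separate field) can be
sketched in userspace as below. The table size, base address and page shift
are made-up values, and entry_index() and entry_vaddr() are hypothetical
helpers, not the functions in mm/highmem.c:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_ENTRIES_SKETCH 8			/* made-up table size */
#define BASE_ADDR_SKETCH  UINT64_C(0xff000000)	/* made-up base address */
#define PAGE_SHIFT_SKETCH 12			/* 4 KiB pages assumed */

struct map_entry_sketch {
	void *page;	/* stand-in for struct page * */
};

static struct map_entry_sketch entries[NR_ENTRIES_SKETCH];

/* The index is the distance from the start of the array. */
static ptrdiff_t entry_index(const struct map_entry_sketch *e)
{
	assert(e >= entries && e < entries + NR_ENTRIES_SKETCH);
	return e - entries;
}

/* Derive the address from the index, so no per-entry field is needed. */
static uint64_t entry_vaddr(const struct map_entry_sketch *e)
{
	return BASE_ADDR_SKETCH +
	       ((uint64_t)entry_index(e) << PAGE_SHIFT_SKETCH);
}
```

The subtraction only makes sense against the array itself, which is why the
name of the array matters in the hunk above.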



Re: [PATCH v2 12/15] powerpc/85xx: add time base sync support for e6500

2013-04-25 Thread Scott Wood

On 04/24/2013 07:28:18 PM, Zhao Chenhui wrote:

On Wed, Apr 24, 2013 at 05:38:16PM -0500, Scott Wood wrote:
> On 04/24/2013 06:29:29 AM, Zhao Chenhui wrote:
> >On Tue, Apr 23, 2013 at 07:04:06PM -0500, Scott Wood wrote:
> >> On 04/19/2013 05:47:45 AM, Zhao Chenhui wrote:
> >> >From: Chen-Hui Zhao 
> >> >
> >> >For e6500, two threads in one core share one time base. Just need
> >> >to do time base sync on first thread of one core, and skip it on
> >> >the other thread.
> >> >
> >> >Signed-off-by: Zhao Chenhui 
> >> >Signed-off-by: Li Yang 
> >> >Signed-off-by: Andy Fleming 
> >> >---
> >> > arch/powerpc/platforms/85xx/smp.c |   52
> >> >+++-
> >> > 1 files changed, 44 insertions(+), 8 deletions(-)
> >> >
> >> >diff --git a/arch/powerpc/platforms/85xx/smp.c
> >> >b/arch/powerpc/platforms/85xx/smp.c
> >> >index 74d8cde..5f3eee3 100644
> >> >--- a/arch/powerpc/platforms/85xx/smp.c
> >> >+++ b/arch/powerpc/platforms/85xx/smp.c
> >> >@@ -53,26 +55,40 @@ static inline u32 get_phy_cpu_mask(void)
> >> >  u32 mask;
> >> >  int cpu;
> >> >
> >> >- mask = 1 << cur_booting_core;
> >> >- for_each_online_cpu(cpu)
> >> >- mask |= 1 << get_hard_smp_processor_id(cpu);
> >> >+ if (smt_capable()) {
> >> >+         /* two threads in one core share one time base */
> >> >+         mask = 1 << cpu_core_index_of_thread(cur_booting_core);
> >> >+         for_each_online_cpu(cpu)
> >> >+                 mask |= 1 << cpu_core_index_of_thread(
> >> >+                                 get_hard_smp_processor_id(cpu));
> >> >+ } else {
> >> >+         mask = 1 << cur_booting_core;
> >> >+         for_each_online_cpu(cpu)
> >> >+                 mask |= 1 << get_hard_smp_processor_id(cpu);
> >> >+ }
> >>
> >> Where is smt_capable() defined?  I assume somewhere in the patchset
> >> but it's a pain to search 12 patches...
> >>
> >
> >It is defined in arch/powerpc/include/asm/topology.h.
> >   #define smt_capable()   (cpu_has_feature(CPU_FTR_SMT))
> >
> >Thanks for your review again.
>
> We shouldn't base it on CPU_FTR_SMT.  For example, e6500 doesn't
> claim that feature yet, except in our SDK kernel.  That doesn't
> change the topology of CPU numbering.
>

Then where can I get the thread information? From the dts?
Or should I wait for e6500 thread support to land upstream?


It's an inherent property of e6500 (outside of some virtualization  
scenarios, but you wouldn't run this code under a hypervisor) that you  
have two threads per core (whether Linux uses them or not).  Or you  
could read TMCFG0[NTHRD] if you know you're on a chip that has TMRs but  
aren't positive it's an e6500, but I wouldn't bother.  If we do ever  
have such a chip, there are probably other things that will need  
updating.



> >static inline u32 get_phy_cpu_mask(void)
> >{
> >   u32 mask;
> >   int cpu;
> >
> >   mask = 1 << cpu_core_index_of_thread(cur_booting_core);
> >   for_each_online_cpu(cpu)
> >   mask |= 1 << cpu_core_index_of_thread(
> >   get_hard_smp_processor_id(cpu));
> >
> >   return mask;
> >}
>
> Likewise, this will get it wrong if SMT is disabled or not yet
> implemented on a core.
>
> -Scott

Let's look into cpu_core_index_of_thread() in  
arch/powerpc/kernel/smp.c.


  int cpu_core_index_of_thread(int cpu)
  {
  return cpu >> threads_shift;
  }

If there are no threads, threads_shift is equal to 0, so it also works
without threads.


My point is that if threads are disabled, threads_shift will be 0, but  
e6500 cores will still be numbered 0, 2, 4, etc.
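
To make the numbering point concrete, here is a small userspace sketch. It
assumes hardware thread ids 0, 1, 2, 3, ... with two threads per core, as
described above; core_index() and core_mask() are hypothetical helpers
modelled on the quoted cpu_core_index_of_thread(), with the shift passed in
explicitly:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same arithmetic as the quoted helper, with the shift as a parameter. */
static unsigned int core_index(unsigned int hw_id, unsigned int threads_shift)
{
	return hw_id >> threads_shift;
}

/* Build a mask with one bit per computed core index. */
static uint32_t core_mask(const unsigned int *hw_ids, size_t n,
			  unsigned int threads_shift)
{
	uint32_t mask = 0;
	size_t i;

	assert(hw_ids != NULL || n == 0);
	for (i = 0; i < n; i++) {
		unsigned int idx = core_index(hw_ids[i], threads_shift);

		assert(idx < 32);	/* keep the shift below well defined */
		mask |= UINT32_C(1) << idx;
	}
	return mask;
}
```

With the first threads of three cores (ids 0, 2, 4), a shift of 1 gives the
per-core mask 0x7, while a shift of 0 gives 0x15, that is, one bit per
hardware thread id rather than per core.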


Perhaps, I should submit this patch after the thread patches for  
e6500.


Why?

-Scott


[PATCH -v2] x86: Add a Kconfig shortcut for kvm guest kernel

2013-04-25 Thread Borislav Petkov
From: Borislav Petkov 
Date: Tue, 16 Apr 2013 18:24:34 +0200
Subject: [PATCH -v2] x86: Add a Kconfig shortcut for kvm guest kernel

This is pretty useful for the case where people want to boot the
resulting kernel in qemu/kvm. Instead of going and searching for each
required option through the Kconfig maze, this single option should
simply enable everything required/good to have to boot the resulting
kernel in the guest.

Cc: Fengguang Wu 
Originally-by: Pekka Enberg 
Originally-by: Sasha Levin 
Signed-off-by: Borislav Petkov 
---


Here's v2 which should be addressing all review comments so far.


 arch/x86/Kconfig | 38 ++
 1 file changed, 38 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5651374d179f..76a95ffa959a 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -680,6 +680,44 @@ config KVM_GUEST
  underlying device model, the host provides the guest with
  timing infrastructure such as time of day, and system time
 
+config KVM_GUEST_COMMODITY_OPTIONS
+   bool "Enable commodity options for a standalone KVM guest"
+   depends on KVM_GUEST
+   select NET
+   select NETDEVICES
+   select BLOCK
+   select BLK_DEV
+   select NETWORK_FILESYSTEMS
+   select INET
+   select EXPERIMENTAL
+   select TTY
+   select SERIAL_8250
+   select SERIAL_8250_CONSOLE
+   select IP_PNP
+   select IP_PNP_DHCP
+   select BINFMT_ELF
+   select PCI_MSI
+   select HAVE_ARCH_KGDB
+   select DEBUG_KERNEL
+   select KGDB
+   select KGDB_SERIAL_CONSOLE
+   select VIRTUALIZATION
+   select VIRTIO
+   select VIRTIO_RING
+   select VIRTIO_PCI
+   select VIRTIO_BLK
+   select VIRTIO_CONSOLE
+   select VIRTIO_NET
+   select 9P_FS
+   select NET_9P
+   select NET_9P_VIRTIO
+   ---help---
+ Select guest kernel functionality which facilitates booting the
+ kernel as a guest in qemu/kvm. This entails basic stuff like
+ serial support, kgdb, virtio and other so that you can be able to
+ have commodity functionality like serial output from the guest,
+ networking, etc.
+
 source "arch/x86/lguest/Kconfig"
 
 config PARAVIRT_TIME_ACCOUNTING
-- 
1.8.2.135.g7b592fa

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH 1/3] f2fs: check nid == 0 in add_free_nid

2013-04-25 Thread Namjae Jeon
2013/4/25, Jaegeuk Kim :
> It is more obvious that add_free_nid checks whether the free nid is zero or
> not.
>
> Signed-off-by: Jaegeuk Kim 
Looks reasonable to me.
Reviewed-by: Namjae Jeon 

Thanks~.


Re: [PATCH v2] f2fs: avoid frequent background GC

2013-04-25 Thread Namjae Jeon
2013/4/25, Jaegeuk Kim :
> Hi, Namjae,
>
> Agreed. How about this?
>
> Change log from v1:
>  o change timings - min 30s, max 60s, nogc 5 min
>  o remove unreachable routine
>  o consider NOGC_SLEEP_TIME in increase/decrease_sleep_time
>
> From 806e344624414fcf9fc87f6193265859027d51b5 Mon Sep 17 00:00:00 2001
> From: Jaegeuk Kim 
> Date: Wed, 24 Apr 2013 13:00:14 +0900
> Subject: [PATCH] f2fs: avoid frequent background GC
> Cc: linux-fsde...@vger.kernel.org, linux-kernel@vger.kernel.org,
> linux-f2fs-de...@lists.sourceforge.net
>
> If there are no victim segments selected by background GC, let's wait
> a little bit longer to collect dirty segments.
> By default, let's give 5 minutes.
>
> Signed-off-by: Jaegeuk Kim 
Looks good!
Reviewed-by: Namjae Jeon 

Thanks~


Re: [PATCHv3 00/14] drivers: mailbox: framework creation

2013-04-25 Thread Andy Green

On 26/04/13 06:29, the mail apparently from Suman Anna included:

Hi -


3. Shareable/exclusive nature of a mailbox. If it is shareable, then
duplicating the behavior between clients is not worth it, and this
should be absorbed into the respective controller driver.


I think the mailbox should be exclusively held by a client. That makes
many things simpler. Also remote firmwares won't be always robust
enough to handle commands from different subsystems intermixed. The
API only has to make sure the mailbox_get/put operations are very
thin.


This might be the case for specific remotes where we expect only one
client driver to be responsible for talking to it, but for generic
offloading, you do not want to have this restriction. You do not want
peer clients to go through a single main client, as the latencies or the
infrastructure imposed by the main client may not be suitable for the
other clients. The stricter usecase here would be the shareable mailbox,
and if it is exclusive, as dictated by a controller or device property,
then so be it and things would get simplified for that controller/device.


Knowing why Jassi mentioned this, the situation is a bit different from
what you replied to.  There are in fact multiple client drivers that can
asynchronously decide to initiate communication on the same mailbox.
Some of the clients need to perform multi-step sequencing and lock the
mailbox for the duration.


Right now we can implement that by having a driver on top to mediate.
Jassi is suggesting that being able to do the client locking at your layer
as a primitive would simplify things, not least by getting rid of the
mediation driver.  Your layer already has the concepts of completion and
notifier, so it seems it wouldn't take much more.


-Andy

--
Andy Green | Fujitsu Landing Team Leader
Linaro.org │ Open source software for ARM SoCs | Follow Linaro
http://facebook.com/pages/Linaro/155974581091106  - 
http://twitter.com/#!/linaroorg - http://linaro.org/linaro-blog



[GIT PULL for v3.9-final] media fixes

2013-04-25 Thread Mauro Carvalho Chehab
Hi Linus,

Please pull from:

  git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media v4l_for_linus

For two driver fixes. One avoids reading any file at a system with a
cx25821 board (fortunately, this is not a common device). The other one
prevents reading after a buffer with ISDB-T devices based on mb86a20s.

Regards,
Mauro

-

The following changes since commit 35ccecef6ed48a5602755ddf580c45a026a1dc05:

  [media] [REGRESSION] bt8xx: Fix too large height in cropcap (2013-03-26 08:37:00 -0300)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media v4l_for_linus

for you to fetch changes up to c95789ecd5a979fd718ae09763df3fa50dd97a91:

  [media] cx25821: do not expose broken video output streams (2013-04-15 
08:28:41 -0300)


Hans Verkuil (1):
  [media] cx25821: do not expose broken video output streams

Mauro Carvalho Chehab (1):
  [media] mb86a20s: Fix estimate_rate setting

 drivers/media/dvb-frontends/mb86a20s.c| 2 +-
 drivers/media/pci/cx25821/cx25821-video.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)



mmotm 2013-04-25-16-24 uploaded

2013-04-25 Thread akpm
The mm-of-the-moment snapshot 2013-04-25-16-24 has been uploaded to

   http://www.ozlabs.org/~akpm/mmotm/

mmotm-readme.txt says

README for mm-of-the-moment:

http://www.ozlabs.org/~akpm/mmotm/

This is a snapshot of my -mm patch queue.  Uploaded at random hopefully
more than once a week.

You will need quilt to apply these patches to the latest Linus release (3.x
or 3.x-rcY).  The series file is in broken-out.tar.gz and is duplicated in
http://ozlabs.org/~akpm/mmotm/series

The file broken-out.tar.gz contains two datestamp files: .DATE and
.DATE-yyyy-mm-dd-hh-mm-ss.  Both contain the string yyyy-mm-dd-hh-mm-ss,
followed by the base kernel version against which this patch series is to
be applied.

This tree is partially included in linux-next.  To see which patches are
included in linux-next, consult the `series' file.  Only the patches
within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
linux-next.

A git tree which contains the memory management portion of this tree is
maintained at git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
by Michal Hocko.  It contains the patches which are between the
"#NEXT_PATCHES_START mm" and "#NEXT_PATCHES_END" markers, from the series
file, http://www.ozlabs.org/~akpm/mmotm/series.


A full copy of the full kernel tree with the linux-next and mmotm patches
already applied is available through git within an hour of the mmotm
release.  Individual mmotm releases are tagged.  The master branch always
points to the latest release, so it's constantly rebasing.

http://git.cmpxchg.org/?p=linux-mmotm.git;a=summary

To develop on top of mmotm git:

  $ git remote add mmotm git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
  $ git remote update mmotm
  $ git checkout -b topic mmotm/master
  
  $ git send-email mmotm/master.. [...]

To rebase a branch with older patches to a new mmotm release:

  $ git remote update mmotm
  $ git rebase --onto mmotm/master  topic




The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
contains daily snapshots of the -mm tree.  It is updated more frequently
than mmotm, and is untested.

A git copy of this tree is available at

http://git.cmpxchg.org/?p=linux-mmots.git;a=summary

and use of this tree is similar to
http://git.cmpxchg.org/?p=linux-mmotm.git, described above.


This mmotm tree contains the following patches against 3.9-rc8:
(patches marked "*" will be included in linux-next)

  origin.patch
  linux-next.patch
  arch-alpha-kernel-systblss-remove-debug-check.patch
  i-need-old-gcc.patch
  drivers-usb-storage-realtek_crc-fix-build.patch
* revert-ipc-dont-allocate-a-copy-larger-than-max.patch
* drivers-char-randomc-fix-priming-of-last_data.patch
* kthread-introduce-to_live_kthread.patch
* kthread-kill-task_get_live_kthread.patch
* sound-soc-codecs-si476xc-dont-use-0bnnn.patch
* x86-make-mem=-option-to-work-for-efi-platform.patch
* auditsc-use-kzalloc-instead-of-kmallocmemset.patch
* auditsc-use-kzalloc-instead-of-kmallocmemset-fix.patch
* audit-dont-check-if-kauditd-is-valid-every-time.patch
* audit-remove-duplicate-export-of-audit_enabled.patch
* audit-remove-unnecessary-if-config_audit.patch
* kernel-auditfilter-resource-management-tree-and-watch-will-memory-leak-when-failure-occurs.patch
* kernel-audit_treec-tree-will-leak-memory-when-failure-occurs-in-audit_trim_trees.patch
* kernel-audit_treec-tree-will-leak-memory-when-failure-occurs-in-audit_trim_trees-fix.patch
* mm-remove-free_area_cache-use-in-powerpc-architecture.patch
* mm-use-vm_unmapped_area-on-powerpc-architecture.patch
* drm-fb-helper-dont-sleep-for-screen-unblank-when-an-oopps-is-in-progress.patch
* matroxfb-convert-struct-i2c_msg-initialization-to-c99-format.patch
* drivers-video-console-fbcon_cwc-fix-compiler-warning-in-cw_update_attr.patch
* drivers-video-add-hyper-v-synthetic-video-frame-buffer-driver.patch
* drivers-video-add-hyper-v-synthetic-video-frame-buffer-driver-fix.patch
* drivers-video-exynos-exynos_mipi_dsic-convert-to-devm_ioremap_resource.patch
* video-ep93xx-fbc-fix-section-mismatch-and-use-module_platform_driver.patch
* drivers-video-mmp-remove-legacy-hw-definitions.patch
* drivers-video-implement-a-simple-framebuffer-driver.patch
* drivers-video-implement-a-simple-framebuffer-driver-fix.patch
* cyber2000fb-avoid-palette-corruption-at-higher-clocks.patch
* fs-fscache-statsc-fix-memory-leak.patch
* inotify-invalid-mask-should-return-a-error-number-but-not-set-it.patch
* inotify-invalid-mask-should-return-a-error-number-but-not-set-it-fix.patch
* posix_cpu_timer-consolidate-expiry-time-type.patch
* posix_cpu_timers-consolidate-timer-list-cleanups.patch
* posix_cpu_timers-consolidate-expired-timers-check.patch
* selftests-add-basic-posix-timers-selftests.patch
* posix-timers-correctly-get-dying-task-time-sample-in-posix_cpu_timer_schedule.patch
* posix_timers-fix-racy-timer-delta-caching-on-task-exit.patch
* mkcapflagspl-convert-to-mkcapflagssh.patch
* 

Re: [PATCHv2 1/4] Documentation: Add memory mapped ARM architected timer binding

2013-04-25 Thread Stephen Boyd
On 04/25/13 16:06, Rob Herring wrote:
> On 04/25/2013 05:48 PM, Stephen Boyd wrote:
>
>> We don't really care about CNTFRQ because it's duplicated into each
>> view. We do care about CNTNSAR. Luckily the spec "just works" there in
>> the sense that we can use CNTTIDR in conjunction with CNTACRn and
>> determine if we have access to a frame we're interested in if the
>> CNTTIDR bits say the frame is present and the CNTACRn register says we
>> can access it. If not then it must be locked down for secure users.
>>
>> Unfortunately hardware doesn't have a way to say that a particular frame
>> is reserved for the hypervisor or the guest kernel/userspace. We need
>> some help from software, so we have the status property express that a
>> particular frame is available. We have to assume the DT is going to be
>> different depending on if you're the hypervisor or the guest. That's a
>> valid assumption right? Otherwise I hope we can do some trapping of the
>> guest's mapping to the control base and then rewrite what they read so
>> that they only see the frame that we want to be available to them.
> Yeah, I believe the only way to prevent access within non-secure world
> is with the MMU. So I guess the example is just policy that the
> hypervisor would/may not create a stage2 mapping. You still have the
> same issue that the guest should not be passed the control base. You
> could make the reg property optional, but then what do you do with the
> node name?

I don't follow. Why shouldn't we tell the guest about the hardware
that's there? Shouldn't they be able to safely assume they can access
the control base just like a non-guest kernel running in PL1 would be
able to?
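
For illustration, the CNTTIDR/CNTACRn probing described in the quoted
text above might be sketched as below.  This is a userspace sketch, not
driver code.  The thread only says CNTTIDR has 4 bits per frame; the
exact bit meanings used here (bit 0 of each nibble for "implemented",
bit 1 for "virtual timer capable") are my reading of the spec and should
be treated as assumptions, as should the helper names and the caller
supplied access mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CNTTIDR packs 4 bits per frame; frame n occupies bits [4n+3:4n]. */
#define CNTTIDR_FRAME_SHIFT(n)	((n) * 4)
#define CNTTIDR_IMPLEMENTED	0x1u	/* assumed: frame is implemented */
#define CNTTIDR_VIRT		0x2u	/* assumed: frame has a virtual timer */

/* True if the CNTTIDR nibble for this frame says the frame exists. */
static bool frame_present(uint32_t cnttidr, unsigned int frame)
{
	assert(frame < 8);
	return ((cnttidr >> CNTTIDR_FRAME_SHIFT(frame)) & CNTTIDR_IMPLEMENTED) != 0;
}

/* True if the CNTTIDR nibble for this frame advertises a virtual timer. */
static bool frame_has_virt(uint32_t cnttidr, unsigned int frame)
{
	assert(frame < 8);
	return ((cnttidr >> CNTTIDR_FRAME_SHIFT(frame)) & CNTTIDR_VIRT) != 0;
}

/*
 * A frame is usable when CNTTIDR says it is present and the frame's
 * CNTACRn value grants every access bit the caller needs.
 */
static bool frame_usable(uint32_t cnttidr, uint32_t cntacr_n,
			 uint32_t needed, unsigned int frame)
{
	return frame_present(cnttidr, frame) && (cntacr_n & needed) == needed;
}
```

Anything that fails frame_usable() would be treated as locked down for
secure users, per the description above.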

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation



Re: [PATCH] time: Revert ALWAYS_USE_PERSISTENT_CLOCK compile time optimizaitons

2013-04-25 Thread Kay Sievers
On Wed, Apr 24, 2013 at 8:55 PM, John Stultz  wrote:

>> FWIW, in the light of the original change, I've just removed the
>> /dev/rtc creation from the default udev rules now, so that thing will
>> be phased out in the future.
>
> Is that actually wanted? What happens to applications that use /dev/rtc?
>
> I think setting up the /dev/rtc link is important. It's just that setting it
> up exclusively by the hctosys flag is maybe more fragile than we'd like.
> Instead the hctosys flag maybe should only be used as a hint if there is
> more than one RTC available.

Ok, convinced.

I've changed the udev rules now to first "search" for the rtc with
"hctosys" flag set, and if none is found, just fall back to /dev/rtc0.

It should work reliably on most boxes, and still do the right thing in
most cases if none of the rtcs has that flag.
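
The selection logic is simple enough to state as code.  The real change
lives in udev rules, not C; the sketch below only models the decision,
and pick_rtc() plus its array-of-flags input are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return the index of the first RTC whose hctosys flag is set.
 * If no RTC has the flag, fall back to index 0 (i.e. rtc0).
 */
static size_t pick_rtc(const bool *hctosys, size_t count)
{
	size_t i;

	assert(count > 0);
	for (i = 0; i < count; i++)
		if (hctosys[i])
			return i;
	return 0;
}
```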

Thanks,
Kay


Re: [PATCH V2 1/4] perf: Add hardware breakpoint address mask

2013-04-25 Thread Jacob Shin
On Thu, Apr 25, 2013 at 10:17:35AM -0700, H. Peter Anvin wrote:
> On 04/25/2013 10:06 AM, Oleg Nesterov wrote:
> >>
> >> The downside is that in userland perf tool we need differing documentation
> >> on what the mask syntax means for each architecture.
> > 
> > Personally I think this is acceptable.
> > 
> > But I am new to this code, so...
> > 
> 
> That would seem really, really awkward.  Yes, perf has a bunch of
> low-level stuff, but it would seem highly undesirable to force the user
> to deal with something like that.
> 
> It would be good to have a user-friendly syntax that covers most of what
> users may want to do and perhaps a longer form that can express
> everything including ARM's byte selects; if the system can't honor the
> request it should return an error.

Okay,

If arch specific masks are a no go, then I think I'm convinced that
Oleg's idea of using bp_len is the right thing to do. Right now the perf
userland tool hard codes bp_len to 4, so I need to modify it to allow
the user to override the length if desired.

Oleg, Frederic, et al.

Which syntax do you prefer?

If we want to set bp_len to 16:

  $ perf stat -e mem:0x1000:rw:16

Or

  $ perf stat -e mem:0x1000:16

Or

  $ perf stat -e mem:0x1000/16

If no bp_len value is specified, it will still default to 4 as it did
before.
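
To make the first proposed syntax concrete, here is a small standalone
parser sketch for "mem:<addr>[:<access>[:<len>]]".  It is not perf's
actual event parser; struct bp_spec and parse_bp() are hypothetical, and
the default access of "rw" when the field is omitted is an assumption.
The default length of 4 follows the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BP_LEN_DEFAULT 4

struct bp_spec {
	uint64_t addr;
	char type[4];		/* up to three of r/w/x, NUL terminated */
	unsigned int len;
};

/* Parse "mem:<addr>[:<access>[:<len>]]"; returns false on malformed input. */
static bool parse_bp(const char *s, struct bp_spec *out)
{
	const char *p;
	char *end;
	size_t n;
	unsigned long len;

	assert(s != NULL && out != NULL);
	if (strncmp(s, "mem:", 4) != 0)
		return false;
	p = s + 4;

	out->addr = strtoull(p, &end, 0);
	if (end == p)
		return false;
	p = end;

	/* Defaults used when the optional fields are absent. */
	strcpy(out->type, "rw");
	out->len = BP_LEN_DEFAULT;

	if (*p == ':') {
		n = strspn(p + 1, "rwx");
		if (n == 0 || n > 3)
			return false;
		memcpy(out->type, p + 1, n);
		out->type[n] = '\0';
		p += 1 + n;
	}

	if (*p == ':') {
		len = strtoul(p + 1, &end, 0);
		if (end == p + 1 || len == 0)
			return false;
		out->len = (unsigned int)len;
		p = end;
	}

	/* Reject trailing garbage. */
	return *p == '\0';
}
```

Note that this sketch rejects the second form (mem:0x1000:16), since it
expects an access field before the length; that ambiguity is one
argument for keeping the access field or using a distinct separator.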



Re: Bcache v. whatever

2013-04-25 Thread Andrew Morton
On Mon, 14 Jan 2013 14:32:02 -0800 Kent Overstreet  
wrote:

> Bcache: a block layer SSD cache

sparc64 gcc-3.4.5:

drivers/md/bcache/btree.c: In function `bch_btree_read':
drivers/md/bcache/btree.c:266: error: invalid operands to binary +
drivers/md/bcache/btree.c: In function `__btree_write':
drivers/md/bcache/btree.c:379: error: invalid operands to binary +
drivers/md/bcache/btree.c: In function `btree_node_free':
drivers/md/bcache/btree.c:980: error: invalid operands to binary +
drivers/md/bcache/btree.c: In function `btree_insert_key':
drivers/md/bcache/btree.c:1857: error: invalid operands to binary +
drivers/md/bcache/btree.c:1857: error: invalid operands to binary +
drivers/md/bcache/btree.c:1859: error: invalid operands to binary +
drivers/md/bcache/btree.c:1859: error: invalid operands to binary +
drivers/md/bcache/btree.c:1864: error: invalid operands to binary +
drivers/md/bcache/btree.c:1864: error: invalid operands to binary +
drivers/md/bcache/btree.c: In function `btree_split':
drivers/md/bcache/btree.c:1934: error: invalid operands to binary +
drivers/md/bcache/btree.c: In function `bch_btree_set_root':
drivers/md/bcache/btree.c:2159: error: invalid operands to binary +
drivers/md/bcache/btree.c: In function `bch_btree_search_recurse':
drivers/md/bcache/btree.c:2262: error: invalid operands to binary +
drivers/md/bcache/btree.c: In function `bch_btree_refill_keybuf':
drivers/md/bcache/btree.c:2330: error: invalid operands to binary +

due to

#define pbtree(b)   (_pbtree(b).s[0])

I don't know why this is happening (presumably a gcc glitch), but
returning an 80-byte struct by value from bch_pkey() and bch_pbtree()
is just gruesome.  The compiler has to allocate the space on the caller
stack, pass a hidden pointer into the callee and the callee copies its
return value into that caller stack slot.  It's slow and consumes stack.

Something different, please.
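
One possible shape for "something different" is to have the caller own
the buffer and let the callee format into it.  The sketch below is only
an illustration of that pattern: fmt_key(), its arguments and the
"inode:offset" output format are made up, not bcache's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a key into a caller-supplied buffer instead of returning an
 * 80-byte struct by value.  Returns what snprintf() returns: the length
 * the full string needs, which may exceed size if it was truncated.
 */
static int fmt_key(char *buf, size_t size, uint64_t inode, uint64_t offset)
{
	assert(buf != NULL && size > 0);
	return snprintf(buf, size, "%llu:%llu",
			(unsigned long long)inode,
			(unsigned long long)offset);
}
```

The caller then decides where the 80 bytes live (stack, heap or a
per-CPU buffer), and no hidden struct copy is needed on return.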




Re: [PATCH v2] PowerPC: kernel: compiling issue, make additional room in exception vector area

2013-04-25 Thread Michael Neuling
Chen Gang  wrote:

> 
> When CONFIG_KVM_BOOK3S_64_PR is enabled,
> MASKABLE_EXCEPTION_PSERIES(0x900 ...) includes __KVMTEST, so it
> exceeds 0x980, which STD_EXCEPTION_HV(0x980 ...) uses. This causes a
> build failure.
> 
> The related errors:
> arch/powerpc/kernel/exceptions-64s.S: Assembler messages:
> arch/powerpc/kernel/exceptions-64s.S:258: Error: attempt to move .org backwards
> make[1]: *** [arch/powerpc/kernel/head_64.o] Error 1
> 
> The positions 0x900 and 0x980 are fixed, so they cannot be moved
> to make more room. The solution is to jump to another area to
> execute the related code.
> 
> 
> Signed-off-by: Chen Gang 
> ---
>  arch/powerpc/kernel/exceptions-64s.S |   12 +++-
>  1 files changed, 11 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/exceptions-64s.S 
> b/arch/powerpc/kernel/exceptions-64s.S
> index e789ee7..8997de2 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -254,7 +254,11 @@ hardware_interrupt_hv:
>   STD_EXCEPTION_PSERIES(0x800, 0x800, fp_unavailable)
>   KVM_HANDLER_PR(PACA_EXGEN, EXC_STD, 0x800)
>  
> - MASKABLE_EXCEPTION_PSERIES(0x900, 0x900, decrementer)
> + . = 0x900
> + .globl decrementer_pSeries
> +decrementer_pSeries:
> + b   decrementer_pSeries_0
> +

Unfortunately you can't do this either as we need to save the CFAR[1]
before it's overwritten by any branch. MASKABLE_EXCEPTION_PSERIES does
this.

[1] CFAR is the Come-From Address Register.  It saves the location of
the last branch and is hence overwritten by any branch.

Thanks for trying.

Mikey

>   STD_EXCEPTION_HV(0x980, 0x982, hdecrementer)
>  
>   MASKABLE_EXCEPTION_PSERIES(0xa00, 0xa00, doorbell_super)
> @@ -536,6 +540,12 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_206)
>  #endif
>  
>   .align  7
> + /* moved from 0x900 */
> +decrementer_pSeries_0:
> + _MASKABLE_EXCEPTION_PSERIES(0x900, decrementer,
> + EXC_STD, SOFTEN_TEST_PR)
> +
> + .align  7
>   /* moved from 0xe00 */
>   STD_EXCEPTION_HV_OOL(0xe02, h_data_storage)
>   KVM_HANDLER_SKIP(PACA_EXGEN, EXC_HV, 0xe02)
> -- 
> 1.7.7.6
> 
> 


Re: [PATCH] pinctrl: document the "GPIO mode" pitfall

2013-04-25 Thread Laurent Pinchart
Hi Linus,

On Thursday 25 April 2013 23:39:18 Linus Walleij wrote:
> On Tue, Apr 23, 2013 at 3:33 PM, Laurent Pinchart wrote:
> >> +And your machine configuration may look like this:
> >> +--
> >> +
> >> +static unsigned long uart_default_mode[] = {
> >> +PIN_CONF_PACKED(PIN_CONFIG_DRIVE_PUSH_PULL, 0),
> >> +};
> >> +
> >> +static unsigned long uart_sleep_mode[] = {
> >> +PIN_CONF_PACKED(PIN_CONFIG_OUTPUT, 0),
> >> +};
> > 
> > I'm having a bit of trouble with PIN_CONFIG_DRIVE_PUSH_PULL and
> > PIN_CONFIG_OUTPUT. Strictly speaking, when configured in output mode, the
> > pin will be in a push-pull configuration.
> 
> For your system or for any system? Open drain, open source are also
> output modes, and none of them are push-pull.

Indeed. I was actually thinking about the opposite, push-pull is output.

> > Could you clarify the exact scope of the two configuration parameters ?
> 
> PIN_CONFIG_OUTPUT is left a bit unspecified, but here the idea was a passive
> drive, like just connecting the pin to VDD or GND without any driver stage
> at all.

Isn't that a driver stage ? :-)

> Maybe I should patch the documentation since we seem to be the only user?
> 
> In the above case (which is derived from the ABx500) I think what is
> happening is that the pin is connected to ground during sleep, without any
> enabled driver stages, which saves a lot of power, since you do not need to
> bias the totempole during sleep in that way.

Right. What is unclear to me is the interaction between OUTPUT and DRIVE_*. 
That's the part I would like to see clarified. Does DRIVE_* imply that the pin 
is driven by the selected function, and OUTPUT imply that the pin is driven to 
a fixed level ? If so, how do you configure the drive type of a pin that will 
be used through the GPIO API ? What about cases where I want to drive the pin 
to a fixed level in a non low-power output mode (for instance because I need 
more current than the low-power output mode provides) ?

-- 
Regards,

Laurent Pinchart



Re: [GIT PULL] x86 fixes for 3.9

2013-04-25 Thread Michel Lespinasse
On Thu, Apr 25, 2013 at 3:54 PM, H. Peter Anvin  wrote:
> On 04/25/2013 03:53 PM, Michel Lespinasse wrote:
>> On Thu, Apr 25, 2013 at 3:23 PM, Matthew Garrett
>>  wrote:
>>> On Thu, 2013-04-25 at 15:20 -0700, Linus Torvalds wrote:
>>>> On Thu, Apr 25, 2013 at 2:44 PM, H. Peter Anvin wrote:
>>>>>
>>>>> -   if (!sys_table->runtime->query_variable_info)
>>>>> +   if (sys_table->runtime->hdr.revision < EFI_2_00_SYSTEM_TABLE_REVISION)
>>>>>         return EFI_UNSUPPORTED;
>>>>
>>>> Is a EFI 2.00 system table *guaranteed* to have that
>>>> "query_variable_info" function? The above adds the version check, but
>>>> removes the check for a NULL pointer.
>>>
>>> As far as the spec's concerned, yes. As far as reality's concerned - if
>>> anything doesn't provide it, we're already crashing when
>>> efi_virt_query_variable_info() gets called. Nobody's complained so far.
>>
>> Well, I don't know if this is related, but commit e971318bbed6 broke
>> the google EFI SMI driver with
>> BUG: unable to handle kernel NULL pointer dereference at (null)
>> IP: [] variable_is_present+0x55/0x170
>> Call Trace:
>> [] register_efivars+0x106/0x370
>> [] ? firmware_map_add_early+0xb1/0xb1
>> [] gsmi_init+0x2ad/0x3da
>> [] do_one_initcall+0x3f/0x170
>> ...
>
> I don't know either.  Could you test this patch and see if it does anything?

Nope, still seeing the crash with this patch applied.
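
For reference, the concern raised in the quoted exchange (the version
check replacing the NULL check) can be met by keeping both.  The sketch
below uses stand-in types and constants so that it compiles on its own:
the struct names, the stub service and the numeric value of
EFI_UNSUPPORTED are placeholders, not the kernel's definitions.  It does
not address the gsmi crash reported above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in definitions; the real ones live in the kernel's EFI headers. */
#define EFI_2_00_SYSTEM_TABLE_REVISION	((2u << 16) | 0u)
#define EFI_SUCCESS			0ul
#define EFI_UNSUPPORTED			3ul	/* placeholder value */

struct efi_hdr_sketch {
	uint32_t revision;
};

typedef unsigned long (*query_variable_info_fn)(void);

struct efi_runtime_sketch {
	struct efi_hdr_sketch hdr;
	query_variable_info_fn query_variable_info;
};

/* Stub firmware service used only to exercise the checks. */
static unsigned long stub_query_variable_info(void)
{
	return EFI_SUCCESS;
}

/* Refuse old tables and tables that lack the service pointer. */
static unsigned long query_variable_info_checked(const struct efi_runtime_sketch *rt)
{
	assert(rt != NULL);
	if (rt->hdr.revision < EFI_2_00_SYSTEM_TABLE_REVISION)
		return EFI_UNSUPPORTED;
	if (!rt->query_variable_info)
		return EFI_UNSUPPORTED;
	return rt->query_variable_info();
}
```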

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


Re: [PATCHv2 1/4] Documentation: Add memory mapped ARM architected timer binding

2013-04-25 Thread Rob Herring
On 04/25/2013 05:48 PM, Stephen Boyd wrote:
> On 04/25/13 14:47, Rob Herring wrote:
>> On 04/15/2013 04:33 PM, Stephen Boyd wrote:
>>> On 04/15/13 14:20, Rob Herring wrote:
>>>> On Fri, Apr 12, 2013 at 7:27 PM, Stephen Boyd wrote:
>>>>> @@ -26,3 +30,52 @@ Example:
>>>>>  <1 10 0xf08>;
>>>>> clock-frequency = <1>;
>>>>> };
>>>>> +
>>>>> +** Memory mapped timer node properties
>>>>> +
>>>>> +- compatible : Should at least contain "arm,armv7-timer-mem".
>>>> Everything about this timer is architecturally defined? If not, let's
>>>> use a more specific name.
>>> I'm not sure I'm following you, but everything described here is part of
>>> the ARM definition. What would be a more specific name?
>> Something that corresponds to the particular implementation like
>> cortex-a15 (obviously not an example that has this). I'm fine with
>> leaving this for now, but would like to see that added when specific SOC
>> dts are added. Better to be specific in case we need to use it for
>> something like errata work-arounds. Of course we haven't done that so
>> far with the arch timer bindings...
> 
> Agreed. I'm under the impression that most of our compatible fields
> should be more specific than they currently are so we can workaround hw
> bugs like you say. Perhaps the catch all generic one should just be
> "arm,arm-timer-mem" since it isn't tied to any particular CPU type?
> 
>>
>>>>> +
>>>>> +- clock-frequency : The frequency of the main counter, in Hz. Optional.
>>>>> +
>>>>> +- reg : The control frame base address.
>>>>> +
>>>>> +Note that #address-cells, #size-cells, and ranges shall be present to ensure
>>>>> +the CPU can address a frame's registers.
>>>>> +
>>>>> +Frame:
>>>>> +
>>>>> +- frame-number: 0 to 7.
>>>> I'd really like to get rid of the frame numbers and sub-nodes. Is the
>>>> frame number significant to software?
>>> We need the frame number to read and write registers in the control
>>> frame (the first base in the parent node). We currently use it to
>>> determine if a frame has support for the virtual timer by reading the
>>> CNTTIDR (a register with 4 bits per frame describing capabilities). If
>>> we wanted to control access to the second view of a frame we would also
>>> need to configure the CNTPL0ACRn register that pertains to the frame
>>> we're controlling. Without a frame number we wouldn't know which
>>> register to write.
>> I've gone thru the memory mapped part of the spec now, so I think I
>> understand things better. I see how the frame number is needed, but...
>>
>> The control base is only accessible in secure or hyp mode. How does a
>> guest know that it is a guest and can't map the control base? Seems like
>> we need to allow a subset of the binding that is just a reg and
>> interrupts properties without the frame sub nodes.
> 
> I don't see that part. My understanding is that the control base is
> accessible in non-secure mode and by the guests. There are certain
> registers within that base that are only accessible in secure mode
> though: CNTFRQ and CNTNSAR. Also some registers are configurable:
> CNTACRn and CNTVOFFN. CNTVOFFN is only accessible in the hypervisor.

The example in section E.8 seems to say otherwise. Perhaps Mark R can
comment further.

> We don't really care about CNTFRQ because it's duplicated into each
> view. We do care about CNTNSAR. Luckily the spec "just works" there in
> the sense that we can use CNTTIDR in conjunction with CNTACRn and
> determine if we have access to a frame we're interested in if the
> CNTTIDR bits say the frame is present and the CNTACRn register says we
> can access it. If not then it must be locked down for secure users.
> 
> Unfortunately hardware doesn't have a way to say that a particular frame
> is reserved for the hypervisor or the guest kernel/userspace. We need
> some help from software, so we have the status property express that a
> particular frame is available. We have to assume the DT is going to be
> different depending on if you're the hypervisor or the guest. That's a
> valid assumption right? Otherwise I hope we can do some trapping of the
> guest's mapping to the control base and then rewrite what they read so
> that they only see the frame that we want to be available to them.

Yeah, I believe the only way to prevent access within non-secure world
is with the MMU. So I guess the example is just policy that the
hypervisor would/may not create a stage2 mapping. You still have the
same issue that the guest should not be passed the control base. You
could make the reg property optional, but then what do you do with the
node name?

Certainly the guest dtb will be different.

Rob



Re: [PATCH 6/8] gpio-tz1090: add TZ1090 gpio driver

2013-04-25 Thread Linus Walleij
On Tue, Apr 23, 2013 at 4:33 PM, James Hogan  wrote:

> Add a GPIO driver for the main GPIOs found in the TZ1090 (Comet) SoC.
> This doesn't include low-power GPIOs as they're controlled separately
> via the Powerdown Controller (PDC) registers.
>
> The driver is instantiated by device tree and supports interrupts for
> all GPIOs.
>
> Signed-off-by: James Hogan 
> Cc: Grant Likely 
> Cc: Rob Herring 
> Cc: Rob Landley 
> Cc: Linus Walleij 
> Cc: linux-...@vger.kernel.org

(...)
> +  - #gpio-cells: Should be 2. The syntax of the gpio specifier used by client
> +nodes should have the following values.
> +   <[phandle of the gpio controller node]
> +[gpio number within the gpio bank]
> +[standard Linux gpio flags]>

So when someone using this device tree for Symbian or Windows
Mobile starts to work, what does "standard Linux gpio flags" tell them?

> +Values for gpio specifier:
> +- GPIO number: a value in the range 0 to 29.
> +- GPIO flags: standard Linux GPIO flags as found in of_gpio.h

Ditto. Linux-specifics are not generally allowed in device trees,
and if they are used anyway they shall be prefixed with "linux,"

> +  Bank subnode optional properties:
> +  - gpio-ranges: Mapping to pin controller pins

Here you seem to use DT GPIO ranges, yet the pinctrl driver registers
some GPIO range, care to explain how that fits together?

> +  - #interrupt-cells: Should be 2. The syntax of the interrupt specifier 
> used by
> +client nodes should have the following values.
> +   <[phandle of the interurupt controller]
> +[gpio number within the gpio bank]
> +[standard Linux irq flags]>
> +
> +Values for irq specifier:
> +- GPIO number: a value in the range 0 to 29
> +- IRQ flags: standard Linux IRQ flags for edge and level triggering

Same comments.

(...)

+#include 

What on earth is that. I can only fear it. I don't like the
looks of that thing.

(...)
> +/* Convenience register accessors */
> +static void tz1090_gpio_write(struct tz1090_gpio_bank *bank,
> + unsigned int reg_offs, u32 data)
> +{
> +   iowrite32(data, bank->reg + reg_offs);
> +}
> +
> +static u32 tz1090_gpio_read(struct tz1090_gpio_bank *bank,
> +   unsigned int reg_offs)
> +{
> +   return ioread32(bank->reg + reg_offs);
> +}

The pinctrl driver included the keyword "inline" for these so
this should be consistent and do that too.

(...)
> +static void tz1090_gpio_clear_bit(struct tz1090_gpio_bank *bank,
> + unsigned int reg_offs,
> + unsigned int offset)
> +{
> +   int lstat;
> +
> +   __global_lock2(lstat);
> +   _tz1090_gpio_clear_bit(bank, reg_offs, offset);
> +   __global_unlock2(lstat);
> +}

This global lock scares me.

+static inline void _tz1090_gpio_clear_bit(struct tz1090_gpio_bank *bank,
+ unsigned int reg_offs,
+ unsigned int offset)
+{
+   u32 value;
+
+   value = tz1090_gpio_read(bank, reg_offs);
+   value &= ~(0x1 << offset);

I usually do this:

#include 

value &= ~BIT(offset);

+   tz1090_gpio_write(bank, reg_offs, value);
+}

> +/* caller must hold LOCK2 */
> +static inline void _tz1090_gpio_set_bit(struct tz1090_gpio_bank *bank,
> +   unsigned int reg_offs,
> +   unsigned int offset)
> +{
> +   u32 value;
> +
> +   value = tz1090_gpio_read(bank, reg_offs);
> +   value |= 0x1 << offset;

I usually do this:

#include 

value |= BIT(offset);

> +/* caller must hold LOCK2 */
> +static inline void _tz1090_gpio_mod_bit(struct tz1090_gpio_bank *bank,
> +   unsigned int reg_offs,
> +   unsigned int offset,
> +   int val)

If val is used as it is, make it a bool.

> +{
> +   u32 value;
> +
> +   value = tz1090_gpio_read(bank, reg_offs);
> +   value &= ~(0x1 << offset);
> +   value |= !!val << offset;

You're clamping val to [0,1], so obviously it's a bool.

> +   tz1090_gpio_write(bank, reg_offs, value);
> +}
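
To spell out the BIT() and bool suggestions above, here is a standalone
sketch of the three read-modify-write helpers operating on a plain
value.  The BIT() macro is a userspace stand-in for the kernel one, and
the reg_*_bit() names are hypothetical; the real helpers would wrap
tz1090_gpio_read()/tz1090_gpio_write() and run under the lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's BIT() macro. */
#define BIT(n)	(1u << (n))

/* Return value with the bit at offset set. */
static uint32_t reg_set_bit(uint32_t value, unsigned int offset)
{
	assert(offset < 32);
	return value | BIT(offset);
}

/* Return value with the bit at offset cleared. */
static uint32_t reg_clear_bit(uint32_t value, unsigned int offset)
{
	assert(offset < 32);
	return value & ~BIT(offset);
}

/* Return value with the bit at offset forced to val; bool needs no !! clamp. */
static uint32_t reg_mod_bit(uint32_t value, unsigned int offset, bool val)
{
	assert(offset < 32);
	value &= ~BIT(offset);
	if (val)
		value |= BIT(offset);
	return value;
}
```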

(...)
> +static int tz1090_gpio_request(struct gpio_chip *chip, unsigned offset)
> +{
> +   struct tz1090_gpio_bank *bank = to_bank(chip);
> +   int ret;
> +
> +   ret = pinctrl_request_gpio(chip->base + offset);
> +   if (ret)
> +   return ret;
> +
> +   tz1090_gpio_set_bit(bank, REG_GPIO_DIR, offset);
> +   tz1090_gpio_set_bit(bank, REG_GPIO_BIT_EN, offset);
> +
> +   return 0;
> +}

This is nice, it just glues smoothly into pinctrl here.

> +static void tz1090_gpio_free(struct gpio_chip *chip, unsigned offset)
> +{
> +   struct tz1090_gpio_bank *bank = to_bank(chip);
> +
> +   pinctrl_free_gpio(chip->base + offset);
> +
> +   tz1090_gpio_clear_bit(bank, REG_GPIO_BIT_EN, 

Re: [GIT PULL] x86 fixes for 3.9

2013-04-25 Thread H. Peter Anvin
On 04/25/2013 03:53 PM, Michel Lespinasse wrote:
> On Thu, Apr 25, 2013 at 3:23 PM, Matthew Garrett
>  wrote:
>> On Thu, 2013-04-25 at 15:20 -0700, Linus Torvalds wrote:
>>> On Thu, Apr 25, 2013 at 2:44 PM, H. Peter Anvin wrote:
>>>>
>>>> -   if (!sys_table->runtime->query_variable_info)
>>>> +   if (sys_table->runtime->hdr.revision < EFI_2_00_SYSTEM_TABLE_REVISION)
>>>>         return EFI_UNSUPPORTED;
>>>
>>> Is a EFI 2.00 system table *guaranteed* to have that
>>> "query_variable_info" function? The above adds the version check, but
>>> removes the check for a NULL pointer.
>>
>> As far as the spec's concerned, yes. As far as reality's concerned - if
>> anything doesn't provide it, we're already crashing when
>> efi_virt_query_variable_info() gets called. Nobody's complained so far.
> 
> Well, I don't know if this is related, but commit e971318bbed6 broke
> the google EFI SMI driver with
> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: [] variable_is_present+0x55/0x170
> Call Trace:
> [] register_efivars+0x106/0x370
> [] ? firmware_map_add_early+0xb1/0xb1
> [] gsmi_init+0x2ad/0x3da
> [] do_one_initcall+0x3f/0x170
> ...
> 

I don't know either.  Could you test this patch and see if it does anything?

-hpa



Re: [GIT PULL] x86 fixes for 3.9

2013-04-25 Thread Michel Lespinasse
On Thu, Apr 25, 2013 at 3:23 PM, Matthew Garrett
 wrote:
> On Thu, 2013-04-25 at 15:20 -0700, Linus Torvalds wrote:
>> On Thu, Apr 25, 2013 at 2:44 PM, H. Peter Anvin  wrote:
>> >
>> > -   if (!sys_table->runtime->query_variable_info)
>> > +   if (sys_table->runtime->hdr.revision < 
>> > EFI_2_00_SYSTEM_TABLE_REVISION)
>> > return EFI_UNSUPPORTED;
>>
>> Is a EFI 2.00 system table *guaranteed* to have that
>> "query_variable_info" function? The above adds the version check, but
>> removes the check for a NULL pointer.
>
> As far as the spec's concerned, yes. As far as reality's concerned - if
> anything doesn't provide it, we're already crashing when
> efi_virt_query_variable_info() gets called. Nobody's complained so far.

Well, I don't know if this is related, but commit e971318bbed6 broke
the google EFI SMI driver with
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [] variable_is_present+0x55/0x170
Call Trace:
[] register_efivars+0x106/0x370
[] ? firmware_map_add_early+0xb1/0xb1
[] gsmi_init+0x2ad/0x3da
[] do_one_initcall+0x3f/0x170
...

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


Re: [PATCHv2 1/4] Documentation: Add memory mapped ARM architected timer binding

2013-04-25 Thread Stephen Boyd
On 04/25/13 14:47, Rob Herring wrote:
> On 04/15/2013 04:33 PM, Stephen Boyd wrote:
>> On 04/15/13 14:20, Rob Herring wrote:
>>> On Fri, Apr 12, 2013 at 7:27 PM, Stephen Boyd  wrote:
>>>> @@ -26,3 +30,52 @@ Example:
>>>>  <1 10 0xf08>;
>>>> clock-frequency = <1>;
>>>> };
>>>> +
>>>> +** Memory mapped timer node properties
>>>> +
>>>> +- compatible : Should at least contain "arm,armv7-timer-mem".
>>> Everything about this timer is architecturally defined? If not, let's
>>> use a more specific name.
>> I'm not sure I'm following you, but everything described here is part of
>> the ARM definition. What would be a more specific name?
> Something that corresponds to the particular implementation like
> cortex-a15 (obviously not an example that has this). I'm fine with
> leaving this for now, but would like to see that added when specific SOC
> dts are added. Better to be specific in case we need to use it for
> something like errata work-arounds. Of course we haven't done that so
> far with the arch timer bindings...

Agreed. I'm under the impression that most of our compatible fields
should be more specific than they currently are so we can workaround hw
bugs like you say. Perhaps the catch all generic one should just be
"arm,arm-timer-mem" since it isn't tied to any particular CPU type?

>
>>>> +
>>>> +- clock-frequency : The frequency of the main counter, in Hz. Optional.
>>>> +
>>>> +- reg : The control frame base address.
>>>> +
>>>> +Note that #address-cells, #size-cells, and ranges shall be present to ensure
>>>> +the CPU can address a frame's registers.
>>>> +
>>>> +Frame:
>>>> +
>>>> +- frame-number: 0 to 7.
>>> I'd really like to get rid of the frame numbers and sub-nodes. Is the
>>> frame number significant to software?
>> We need the frame number to read and write registers in the control
>> frame (the first base in the parent node). We currently use it to
>> determine if a frame has support for the virtual timer by reading the
>> CNTTIDR (a register with 4 bits per frame describing capabilities). If
>> we wanted to control access to the second view of a frame we would also
>> need to configure the CNTPL0ACRn register that pertains to the frame
>> we're controlling. Without a frame number we wouldn't know which
>> register to write.
> I've gone thru the memory mapped part of the spec now, so I think I
> understand things better. I see how the frame number is needed, but...
>
> The control base is only accessible in secure or hyp mode. How does a
> guest know that it is a guest and can't map the control base? Seems like
> we need to allow a subset of the binding that is just a reg and
> interrupts properties without the frame sub nodes.

I don't see that part. My understanding is that the control base is
accessible in non-secure mode and by the guests. There are certain
registers within that base that are only accessible in secure mode
though: CNTFRQ and CNTNSAR. Also some registers are configurable:
CNTACRn and CNTVOFFN. CNTVOFFN is only accessible in the hypervisor.

We don't really care about CNTFRQ because it's duplicated into each
view. We do care about CNTNSAR. Luckily the spec "just works" there in
the sense that we can use CNTTIDR in conjunction with CNTACRn and
determine if we have access to a frame we're interested in if the
CNTTIDR bits say the frame is present and the CNTACRn register says we
can access it. If not then it must be locked down for secure users.

Unfortunately hardware doesn't have a way to say that a particular frame
is reserved for the hypervisor or the guest kernel/userspace. We need
some help from software, so we have the status property express that a
particular frame is available. We have to assume the DT is going to be
different depending on if you're the hypervisor or the guest. That's a
valid assumption right? Otherwise I hope we can do some trapping of the
guest's mapping to the control base and then rewrite what they read so
that they only see the frame that we want to be available to them.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
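
The CNTTIDR/CNTACRn check described above can be sketched roughly as
follows. This is only an illustration: the helper names are made up,
the bit positions within each 4-bit CNTTIDR field are assumptions, and
treating any non-zero CNTACRn value as "accessible" is a simplification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: frame n owns bits [4n+3:4n] of CNTTIDR. */
#define SKETCH_TIDR_SHIFT(n)	((n) * 4)
#define SKETCH_TIDR_PRESENT	0x1u	/* assumed: frame implemented */
#define SKETCH_TIDR_VIRT	0x2u	/* assumed: virtual timer capable */

/* Extract the 4-bit capability field for one frame (0 to 7). */
static inline uint32_t sketch_frame_caps(uint32_t cnttidr, unsigned int frame)
{
	return (cnttidr >> SKETCH_TIDR_SHIFT(frame)) & 0xfu;
}

/*
 * A frame is treated as usable only if CNTTIDR reports it present and
 * the value read back from the matching CNTACRn grants some access
 * (non-zero here, which is a simplification).
 */
static inline bool sketch_frame_usable(uint32_t cnttidr, uint32_t cntacr,
				       unsigned int frame)
{
	if (!(sketch_frame_caps(cnttidr, frame) & SKETCH_TIDR_PRESENT))
		return false;
	return cntacr != 0;
}

/* Report whether the frame advertises virtual timer support. */
static inline bool sketch_frame_has_virt(uint32_t cnttidr, unsigned int frame)
{
	return (sketch_frame_caps(cnttidr, frame) & SKETCH_TIDR_VIRT) != 0;
}
```

Without the frame number there is no way to pick the right 4-bit field
or the right CNTACRn register, which is the point made in the quoted
discussion.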



Re: [PATCH v3 -next 4/5] x86: Add support for LZ4-compressed kernel

2013-04-25 Thread Andrew Morton
On Wed, 6 Mar 2013 15:37:38 +0900 Kyungsik Lee  wrote:

> On Tue, Mar 05, 2013 at 08:13:38AM -0800, H. Peter Anvin wrote:
> > Please add the new magic to Documentation/x86/boot.txt as well.
> 
> Ok, I will update it as soon as the patch set is stabilized.

It's been six weeks.  Please send the update asap.


Re: [PATCH 1/2] lib: Add lz4 compressor module

2013-04-25 Thread Andrew Morton
On Mon, 22 Apr 2013 18:22:18 +0900 "Chanho Min"  wrote:

> >> +#define HTYPE const u8*
> >> +
> >> +#ifdef __BIG_ENDIAN
> >> +#define LZ4_NBCOMMONBYTES(val) (__builtin_clz(val) >> 3)
> >> +#else
> >> +#define LZ4_NBCOMMONBYTES(val) (__builtin_ctz(val) >> 3)
> >> +#endif
> >
> >It seems at least m68k and sparc don't have the __builtin_clz() functions:
> >
> >m68k-allmodconfig (http://kisskb.ellerman.id.au/kisskb/buildresult/8572593/):
> >
> >ERROR: "__clzsi2" [lib/lz4/lz4hc_compress.ko] undefined!
> >ERROR: "__clzsi2" [lib/lz4/lz4_compress.ko] undefined!
> 
> gcc seems to define __builtin_clz as __clzsi2 in some architecture. 
> But, kernel doesn't link libgcc.a.
> If kernel should use gcc's built-in function without libgcc.a,
> do we need to port __clzsi2 to 'arch/*/lib/*'?

This breaks alpha (gcc-4.4.4) as well.  Can we please get this fixed
promptly?
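
For illustration, a libgcc-free count-leading-zeros could look roughly
like the sketch below. The names are hypothetical and this is not the
code from the patch; it only shows the kind of helper that would stand
in for __clzsi2 on architectures where gcc emits a call to it.

```c
#include <stdint.h>

/*
 * Portable count-leading-zeros for a 32-bit value, using a binary
 * search over the bit positions. Hypothetical name; returns 32 for
 * val == 0, whereas __builtin_clz(0) is undefined.
 */
static inline int sketch_clz32(uint32_t val)
{
	int count = 0;

	if (val == 0)
		return 32;
	if (!(val & 0xffff0000u)) { count += 16; val <<= 16; }
	if (!(val & 0xff000000u)) { count += 8;  val <<= 8; }
	if (!(val & 0xf0000000u)) { count += 4;  val <<= 4; }
	if (!(val & 0xc0000000u)) { count += 2;  val <<= 2; }
	if (!(val & 0x80000000u)) { count += 1; }
	return count;
}

/* Big-endian form of LZ4_NBCOMMONBYTES from the quoted patch. */
static inline int sketch_nbcommonbytes_be(uint32_t diff)
{
	return sketch_clz32(diff) >> 3;
}
```

Something along these lines in lib/ would avoid per-arch ports, at the
cost of being slower than a native instruction where one exists.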



Re: [PATCH 5/8] pinctrl-tz1090: add TZ1090 pinctrl driver

2013-04-25 Thread Linus Walleij
On Tue, Apr 23, 2013 at 4:33 PM, James Hogan  wrote:

> Add a pin control driver for the main pins on the TZ1090 SoC. This
> doesn't include the low-power pins as they're controlled separately via
> the Powerdown Controller (PDC) registers.
>
> Signed-off-by: James Hogan 
> Cc: Grant Likely 
> Cc: Rob Herring 
> Cc: Rob Landley 
> Cc: Linus Walleij 
> Cc: linux-...@vger.kernel.org

(...)
> +++ b/drivers/pinctrl/Kconfig
> @@ -196,6 +196,12 @@ config PINCTRL_TEGRA114
> bool
> select PINCTRL_TEGRA
>
> +config PINCTRL_TZ1090
> +   bool "Toumaz Xenif TZ1090 pin control driver"
> +   depends on SOC_TZ1090
> +   select PINMUX
> +   select PINCONF

Why are you not using GENERIC_PINCONF?

It doesn't seem like this pin controller is using something
that isn't covered by that library.

This way you get rid of TZ1090_PINCONF_PACK() etc
and can use the standard packing.
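
To illustrate what "standard packing" means here: a parameter ID and
its argument are folded into a single unsigned long config word. The
sketch below only shows the idea; the macro names, the parameter IDs
and the 16-bit split are assumptions for the example, not copied from
the pinconf-generic header.

```c
/* Hypothetical names: parameter in the low 16 bits, argument above it. */
#define SKETCH_CONF_PACK(param, arg) \
	((((unsigned long)(arg)) << 16) | ((unsigned long)(param) & 0xffffUL))
#define SKETCH_CONF_PARAM(cfg)	((unsigned int)((cfg) & 0xffffUL))
#define SKETCH_CONF_ARG(cfg)	((unsigned int)((cfg) >> 16))

/* Illustrative parameter IDs, not the kernel's enum pin_config_param. */
enum sketch_param {
	SKETCH_BIAS_PULL_UP = 1,
	SKETCH_DRIVE_STRENGTH = 2,
};

/* Build a config word requesting a given drive strength. */
static inline unsigned long sketch_drive_strength(unsigned int milliamps)
{
	return SKETCH_CONF_PACK(SKETCH_DRIVE_STRENGTH, milliamps);
}
```

With a shared encoding like this, a driver-private pack/unpack macro
becomes unnecessary.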

> +#include 

As mentioned I want the pin definitions from the arch to be in this
subsystem as well.

If the GPIO driver is also using them, then move the GPIO driver
into drivers/pinctrl, that is recommended in such cases.

> + * @drv:   Drive control supported, 0 if unsupported.
> + * This means Schmitt, Slew, and Drive strength.
> + * @slw_bit:   Slew register bit. 0 if unsupported.
> + * The same bit is used for Schmitt, and Drive (*2).
(...)
> +   u32 drv:1;

So what about you use a bool for that?

> +   u32 slw_bit:5;
> +};

(...)
> +/* Pin names */
> +
> +static const struct pinctrl_pin_desc tz1090_pins[] = {
> +   /* Normal GPIOs */
> +   PINCTRL_PIN(TZ1090_PIN_SDIO_CLK,"sdio_clk"),
> +   PINCTRL_PIN(TZ1090_PIN_SDIO_CMD,"sdio_cmd"),

Are these actually the names from the datasheet? Usually these
have geographical names, like D1, A7... but just checking.

(...)
> +/* Sub muxes */

Can you describe here briefly what a sub mux is and how it is
deployed in this system? It's getting complicated at this point
so some help would be appreciated.

> +/* Pin group with mux control */
> +#define MUX_PG(pg_name, f0, f1, f2, f3, f4,\
> +  mux_r, mux_b, mux_w, slw_b)  \
> +   {   \
> +   .name = #pg_name,   \
> +   .pins = pg_name##_pins, \
> +   .npins = ARRAY_SIZE(pg_name##_pins),\
> +   .mux = MUX(f0, f1, f2, f3, f4,  \
> +  mux_r, mux_b, mux_w),\
> +   .drv = ((slw_b) >= 0),  \
> +   .slw_bit = (slw_b), \
> +   }
> +
> +#define SIMPLE_PG(pg_name) \
> +   {   \
> +   .name = #pg_name,   \
> +   .pins = pg_name##_pins, \
> +   .npins = ARRAY_SIZE(pg_name##_pins),\
> +   }
> +
> +#define SIMPLE_DRV_PG(pg_name, slw_b)  \
> +   {   \
> +   .name = #pg_name,   \
> +   .pins = pg_name##_pins, \
> +   .npins = ARRAY_SIZE(pg_name##_pins),\
> +   .drv = 1,   \
> +   .slw_bit = (slw_b), \
> +   }
> +
> +#define DRV_PG(pg_name, slw_b) \
> +   {   \
> +   .name = "drive_"#pg_name,   \
> +   .pins = drive_##pg_name##_pins, \
> +   .npins = ARRAY_SIZE(drive_##pg_name##_pins),\
> +   .drv = 1,   \
> +   .slw_bit = (slw_b), \
> +   }
> +
> +/*name f0,  f1,f2,f3, f4, mux r/b/w */
> +DEFINE_SUBMUX(ext_dac, DAC, NOT_IQADC_STB, IQDAC_STB, NA, NA, IF_CTL, 6, 2);

Again, this is not very easy to understand, so more commenting is warranted.
The macros may need individual documentation for being quite
hard to understand.

> +/**
> + * struct tz1090_pmx - Private pinctrl data
> + * @dev:   Platform device
> + * @pctl:  Pin control device
> + * @regs:  Register region
> + * @lock:  Lock protecting coherency of pin_en, gpio_en, select_en, and
> + * SELECT regs
> + * @pin_en:Pins that have been enabled (32 pins packed into each element)
> + * @gpio_en:   GPIOs that have been enabled (32 pins packed into each 
> element)
> + * @select_en: Pins that have been force seleced by pinconf (32 pins packed
> + * into 

Re: [PATCHv3 00/14] drivers: mailbox: framework creation

2013-04-25 Thread Suman Anna
Jassi,

On 04/25/2013 12:20 AM, Jassi Brar wrote:
> On 25 April 2013 04:46, Suman Anna  wrote:
>> On 04/24/2013 03:56 AM, Jassi Brar wrote:
>>
> 
>> I think there are two things here - one is what the client needs to do
>> upon sending/receiving a message, and the other is what the send API or
>> the mailbox controller should do when a client tried to send a message
>> and the controller's shared message/transport is not free (irrespective
>> of the atomic or non-atomic context). The kfifos in the common driver
>> code are currently solving the latter problem. The current send API
>> directly uses the controller to send if it is free, and uses buffering
>> only when it is busy. But, I see your point that this should be
>> the responsibility of the specific controller, or even a property of the
>> specific mailbox belonging to that controller. This direction would put
>> most of the onus on the specific controller drivers, and probably the
>> main API would be very simple. Another factor / attribute to consider is
>> that a given mailbox may be sharable or exclusive, things are simple if
>> it is exclusive to a client driver.
>>
> I never suggested we don't use buffering. I too believe the API should
> buffer requests but also that it should still do atomic callbacks. The
> impact on implementation would be that the queue buffer can not grow
> at runtime. But that should be fine because a reasonable size (say 10
> or even 50) could be chosen and we allow submission of requests from
> tx_done callback.

Yeah, even the current kfifo approach doesn't grow the queue buffer at
runtime, and the size of the kfifo is determined at driver init time
based on the controller. If the queue is also full, then you fail
transmitting right away. OK, I thought you didn't want buffering, if that
is not the case, then the buffering should be within the main driver
code, like it is now, but configurable based on the controller or
mailbox properties. If it is present in individual controller drivers,
then we would be duplicating stuff. Are you envisioning that this be
left to the individual controllers?

> 
>>
>> We are talking two fundamentally different usecases/needs here,
>> depending on the type of the controller. You seem to be coming from a
>> usecase where the client driver needs to know when every message is
>> transmitted (from an atomic context, it is a moot point since either you
>> are successful or not when transmitting).
>>
> I am afraid you are confusing the meaning of 'atomic context' here.
> atomic context doesn't mean instant transmission of data, but that the
> API calls could be made from even atomic context and that the client &
> controller can't sleep in callbacks from the API. So it's not moot.

I understood the atomic context, and the question is about the behavior
of the '.tx_done' callback when sending from atomic context. Is there
such a usecase/need for you in that you want to send a response back
from an atomic context, yet get a callback?

> 
>> The remote
>> has to ack before it can be shutdown. I would imagine that getting a
>> .tx_done on a particular message is not good enough to know that the
>> remote is ready for shutdown. I can imagine it to be useful where there
>> is some inherent knowledge that the client needs to proceed with the
>> next steps when a message is sent.
>>
> Of course we are not specifying how the mailbox signals are
> interpreted by the remote. It should suffice just to realize that
> there exists genuine requirement for a client to know when its message
> was received by the remote.
> 
>> That said, we need to go with the
>> stricter one.
>>
> Great, that we agree.
> 
>> My only concern here is that if there can be multiple
>> clients for a particular mailbox/controller, then all the clients would
>> have to have an agreement on the controller packet type, and the clients
>> would mostly have to include the standard mailbox.h as well as a
>> controller-specific header.
>>
> It's the controller driver that actually puts the data on the bus. So
> only it should define the format in which it accepts data from the
> clients. Every client should simply populate the packet structure
> defined in  my_lovely_controller.h and pass on the struct pointer to
> the controller driver via API.
> No negotiations for the driver seat among passengers :)

OK, I was trying to avoid including my_lovely_controller.h and only
include the standard .h file as a client user, the client would anyway
need to have the intrinsic knowledge of the packet structure. And if you
were to do buffering in the common driver code, you would need the size
field outside. The void *msg packet structure would still be an
understanding between the client and the controller. So, it is a
tradeoff between just using void * and leave the buffering to controller
driver (potential duplication) & using a size and void *, along with
buffering in common driver code.
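
A rough sketch of the second option, with the size carried next to an
opaque payload and a queue whose depth is fixed at init time. All names
and sizes here are hypothetical and this is not the mailbox framework
code; a real implementation would also need locking suitable for atomic
context.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_QUEUE_SLOTS	4	/* fixed at init; never grows */
#define SKETCH_MSG_MAX		16	/* largest message the queue copies */

struct sketch_msg {
	size_t size;			/* size travels outside the payload */
	unsigned char data[SKETCH_MSG_MAX];
};

struct sketch_queue {
	struct sketch_msg slots[SKETCH_QUEUE_SLOTS];
	unsigned int head;		/* next slot to dequeue */
	unsigned int count;		/* messages currently queued */
};

/* Copy an opaque message into the queue; fail right away when full. */
static inline int sketch_enqueue(struct sketch_queue *q, const void *msg,
				 size_t size)
{
	struct sketch_msg *slot;

	if (q->count == SKETCH_QUEUE_SLOTS || size > SKETCH_MSG_MAX)
		return -1;
	slot = &q->slots[(q->head + q->count) % SKETCH_QUEUE_SLOTS];
	slot->size = size;
	memcpy(slot->data, msg, size);
	q->count++;
	return 0;
}

/* Pop the oldest message, e.g. from a tx_done-style completion path. */
static inline int sketch_dequeue(struct sketch_queue *q, struct sketch_msg *out)
{
	if (q->count == 0)
		return -1;
	*out = q->slots[q->head];
	q->head = (q->head + 1) % SKETCH_QUEUE_SLOTS;
	q->count--;
	return 0;
}
```

The common code never interprets the payload, so the packet layout
stays an agreement between the client and the controller driver.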

> 
>> Overall, I see it coming down to following 

Re: [PATCH V2] usb: storage: Convert US_DEBUGP to usb_stor_dbg

2013-04-25 Thread Andrew Morton
On Mon, 22 Apr 2013 12:35:26 -0700 (PDT) David Rientjes  
wrote:

> On Fri, 19 Apr 2013, Joe Perches wrote:
> 
> > @@ -966,11 +934,13 @@ static int realtek_cr_autosuspend_setup(struct 
> > us_data *us)
> >  static void realtek_cr_destructor(void *extra)
> >  {
> > struct rts51x_chip *chip = (struct rts51x_chip *)extra;
> > -
> > -   US_DEBUGP("%s: <---\n", __func__);
> > +   struct us_data *us;
> >  
> > if (!chip)
> > return;
> > +
> > +   us = chip->us;
> > +
> >  #ifdef CONFIG_REALTEK_AUTOPM
> > if (ss_en) {
> > del_timer(&chip->rts51x_suspend_timer);
> 
> From linux-next:
> 
> drivers/usb/storage/realtek_cr.c: In function 'realtek_cr_destructor':
> drivers/usb/storage/realtek_cr.c:942:11: error: 'struct rts51x_chip' has no 
> member named 'us'
> 
> chip->us here is only defined when CONFIG_REALTEK_AUTOPM is enabled.

local var `us' doesn't get used by anything and it's unclear why that
patch added it?  



From: Andrew Morton 
Subject: drivers/usb/storage/realtek_cr.c: fix build

Remove unused local `us', which broke the build.  Also nuke an unneeded
cast.

Repairs 191648d03d20 ("usb: storage: Convert US_DEBUGP to usb_stor_dbg").

Cc: Joe Perches 
Cc: David Rientjes 
Cc: Greg KH 
Signed-off-by: Andrew Morton 
---

 drivers/usb/storage/realtek_cr.c |5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff -puN 
drivers/usb/storage/realtek_cr.c~drivers-usb-storage-realtek_crc-fix-build 
drivers/usb/storage/realtek_cr.c
--- a/drivers/usb/storage/realtek_cr.c~drivers-usb-storage-realtek_crc-fix-build
+++ a/drivers/usb/storage/realtek_cr.c
@@ -933,14 +933,11 @@ static int realtek_cr_autosuspend_setup(
 
 static void realtek_cr_destructor(void *extra)
 {
-   struct rts51x_chip *chip = (struct rts51x_chip *)extra;
-   struct us_data *us;
+   struct rts51x_chip *chip = extra;
 
if (!chip)
return;
 
-   us = chip->us;
-
 #ifdef CONFIG_REALTEK_AUTOPM
if (ss_en) {
del_timer(&chip->rts51x_suspend_timer);
_


Left unfixed:

drivers/usb/storage/realtek_cr.c:628: warning: 
'config_autodelink_before_power_down' defined but not used
drivers/usb/storage/realtek_cr.c:698: warning: 'fw5895_init' defined but not 
used



[PATCH 2/2] vfio: Use down_reads to protect iommu disconnects

2013-04-25 Thread Alex Williamson
If a group or device is released or a container is unset from a group
it can race against file ops on the container.  Protect these with
down_reads to allow concurrent users.

Signed-off-by: Alex Williamson 
Reported-by: Michael S. Tsirkin 
---
 drivers/vfio/vfio.c |   62 ++-
 1 file changed, 46 insertions(+), 16 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 073788e..ac7423b 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -704,9 +704,13 @@ EXPORT_SYMBOL_GPL(vfio_del_group_dev);
 static long vfio_ioctl_check_extension(struct vfio_container *container,
   unsigned long arg)
 {
-   struct vfio_iommu_driver *driver = container->iommu_driver;
+   struct vfio_iommu_driver *driver;
long ret = 0;
 
+   down_read(&container->group_lock);
+
+   driver = container->iommu_driver;
+
switch (arg) {
/* No base extensions yet */
default:
@@ -736,6 +740,8 @@ static long vfio_ioctl_check_extension(struct 
vfio_container *container,
 VFIO_CHECK_EXTENSION, arg);
}
 
+   up_read(&container->group_lock);
+
return ret;
 }
 
@@ -844,9 +850,6 @@ static long vfio_fops_unl_ioctl(struct file *filep,
if (!container)
return ret;
 
-   driver = container->iommu_driver;
-   data = container->iommu_data;
-
switch (cmd) {
case VFIO_GET_API_VERSION:
ret = VFIO_API_VERSION;
@@ -858,8 +861,15 @@ static long vfio_fops_unl_ioctl(struct file *filep,
ret = vfio_ioctl_set_iommu(container, arg);
break;
default:
+   down_read(&container->group_lock);
+
+   driver = container->iommu_driver;
+   data = container->iommu_data;
+
if (driver) /* passthrough all unrecognized ioctls */
ret = driver->ops->ioctl(data, cmd, arg);
+
+   up_read(&container->group_lock);
}
 
return ret;
@@ -910,35 +920,55 @@ static ssize_t vfio_fops_read(struct file *filep, char 
__user *buf,
  size_t count, loff_t *ppos)
 {
struct vfio_container *container = filep->private_data;
-   struct vfio_iommu_driver *driver = container->iommu_driver;
+   struct vfio_iommu_driver *driver;
+   ssize_t ret = -EINVAL;
 
-   if (unlikely(!driver || !driver->ops->read))
-   return -EINVAL;
+   down_read(&container->group_lock);
+
+   driver = container->iommu_driver;
+   if (likely(driver && driver->ops->read))
+   ret = driver->ops->read(container->iommu_data,
+   buf, count, ppos);
 
-   return driver->ops->read(container->iommu_data, buf, count, ppos);
+   up_read(&container->group_lock);
+
+   return ret;
 }
 
 static ssize_t vfio_fops_write(struct file *filep, const char __user *buf,
   size_t count, loff_t *ppos)
 {
struct vfio_container *container = filep->private_data;
-   struct vfio_iommu_driver *driver = container->iommu_driver;
+   struct vfio_iommu_driver *driver;
+   ssize_t ret = -EINVAL;
 
-   if (unlikely(!driver || !driver->ops->write))
-   return -EINVAL;
+   down_read(&container->group_lock);
+
+   driver = container->iommu_driver;
+   if (likely(driver && driver->ops->write))
+   ret = driver->ops->write(container->iommu_data,
+buf, count, ppos);
+
+   up_read(&container->group_lock);
 
-   return driver->ops->write(container->iommu_data, buf, count, ppos);
+   return ret;
 }
 
 static int vfio_fops_mmap(struct file *filep, struct vm_area_struct *vma)
 {
struct vfio_container *container = filep->private_data;
-   struct vfio_iommu_driver *driver = container->iommu_driver;
+   struct vfio_iommu_driver *driver;
+   int ret = -EINVAL;
 
-   if (unlikely(!driver || !driver->ops->mmap))
-   return -EINVAL;
+   down_read(&container->group_lock);
 
-   return driver->ops->mmap(container->iommu_data, vma);
+   driver = container->iommu_driver;
+   if (likely(driver && driver->ops->mmap))
+   ret = driver->ops->mmap(container->iommu_data, vma);
+
+   up_read(&container->group_lock);
+
+   return ret;
 }
 
 static const struct file_operations vfio_fops = {



[PATCH 1/2] vfio: Convert container->group_lock to rwsem

2013-04-25 Thread Alex Williamson
All current users are writers, maintaining current mutual exclusion.
This lets us add read users next.

Signed-off-by: Alex Williamson 
---
 drivers/vfio/vfio.c |   21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 21eddd9..073788e 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -57,7 +58,7 @@ struct vfio_iommu_driver {
 struct vfio_container {
struct kref kref;
struct list_headgroup_list;
-   struct mutexgroup_lock;
+   struct rw_semaphore group_lock;
struct vfio_iommu_driver*iommu_driver;
void*iommu_data;
 };
@@ -738,7 +739,7 @@ static long vfio_ioctl_check_extension(struct 
vfio_container *container,
return ret;
 }
 
-/* hold container->group_lock */
+/* hold write lock on container->group_lock */
 static int __vfio_container_attach_groups(struct vfio_container *container,
  struct vfio_iommu_driver *driver,
  void *data)
@@ -769,7 +770,7 @@ static long vfio_ioctl_set_iommu(struct vfio_container 
*container,
struct vfio_iommu_driver *driver;
long ret = -ENODEV;
 
-   mutex_lock(&container->group_lock);
+   down_write(&container->group_lock);
 
/*
 * The container is designed to be an unprivileged interface while
@@ -780,7 +781,7 @@ static long vfio_ioctl_set_iommu(struct vfio_container 
*container,
 * the container is deprivileged and returns to an unset state.
 */
if (list_empty(&container->group_list) || container->iommu_driver) {
-   mutex_unlock(&container->group_lock);
+   up_write(&container->group_lock);
return -EINVAL;
}
 
@@ -827,7 +828,7 @@ static long vfio_ioctl_set_iommu(struct vfio_container 
*container,
 
mutex_unlock(&vfio.iommu_drivers_lock);
 skip_drivers_unlock:
-   mutex_unlock(&container->group_lock);
+   up_write(&container->group_lock);
 
return ret;
 }
@@ -882,7 +883,7 @@ static int vfio_fops_open(struct inode *inode, struct file 
*filep)
return -ENOMEM;
 
INIT_LIST_HEAD(&container->group_list);
-   mutex_init(&container->group_lock);
+   init_rwsem(&container->group_lock);
kref_init(&container->kref);
 
filep->private_data = container;
@@ -961,7 +962,7 @@ static void __vfio_group_unset_container(struct vfio_group 
*group)
struct vfio_container *container = group->container;
struct vfio_iommu_driver *driver;
 
-   mutex_lock(&container->group_lock);
+   down_write(&container->group_lock);
 
driver = container->iommu_driver;
if (driver)
@@ -979,7 +980,7 @@ static void __vfio_group_unset_container(struct vfio_group 
*group)
container->iommu_data = NULL;
}
 
-   mutex_unlock(&container->group_lock);
+   up_write(&container->group_lock);
 
vfio_container_put(container);
 }
@@ -1039,7 +1040,7 @@ static int vfio_group_set_container(struct vfio_group 
*group, int container_fd)
container = f.file->private_data;
WARN_ON(!container); /* fget ensures we don't race vfio_release */
 
-   mutex_lock(&container->group_lock);
+   down_write(&container->group_lock);
 
driver = container->iommu_driver;
if (driver) {
@@ -1057,7 +1058,7 @@ static int vfio_group_set_container(struct vfio_group 
*group, int container_fd)
atomic_inc(&group->container_users);
 
 unlock_out:
-   mutex_unlock(&container->group_lock);
+   up_write(&container->group_lock);
fdput(f);
return ret;
 }



[PATCH 0/2] Protect against iommu driver disconnect

2013-04-25 Thread Alex Williamson
Michael Tsirkin pointed out that file operations on /dev/vfio/vfio
dereference iommu_driver and iommu_data without a lock.  If releasing
a group or unsetting the container occurs concurrently, we could race.
We currently use a mutex when setting this association, so we can
convert to a rwsem keeping the existing mutex critical sections as
down_writes and add down_reads where these are used.  Thanks,

Alex

---

Alex Williamson (2):
  vfio: Convert container->group_lock to rwsem
  vfio: Use down_reads to protect iommu disconnects


 drivers/vfio/vfio.c |   83 +++
 1 file changed, 57 insertions(+), 26 deletions(-)


Re: [GIT PULL] x86 fixes for 3.9

2013-04-25 Thread Matthew Garrett
On Thu, 2013-04-25 at 15:20 -0700, Linus Torvalds wrote:
> On Thu, Apr 25, 2013 at 2:44 PM, H. Peter Anvin  wrote:
> >
> > -   if (!sys_table->runtime->query_variable_info)
> > +   if (sys_table->runtime->hdr.revision < 
> > EFI_2_00_SYSTEM_TABLE_REVISION)
> > return EFI_UNSUPPORTED;
> 
> Is a EFI 2.00 system table *guaranteed* to have that
> "query_variable_info" function? The above adds the version check, but
> removes the check for a NULL pointer.

As far as the spec's concerned, yes. As far as reality's concerned - if
anything doesn't provide it, we're already crashing when
efi_virt_query_variable_info() gets called. Nobody's complained so far.

> And why the hell don't we have a real structure that has been filled
> out properly, and instead apparently just do this "point to random
> memory that doesn't necessarily have the full structure?

This is early boot code, we're not in the kernel proper yet. All we have
is the structure that's handed to us by the firmware, and the size of
that structure varies depending on its version.
-- 
Matthew Garrett | mj...@srcf.ucam.org
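
For illustration, the two checks being discussed in this thread (table
revision versus NULL function pointer) can be combined. The sketch
below uses stand-in types and hypothetical names rather than the real
EFI definitions, and the revision encoding (major << 16 | minor) is
stated as an assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_REV_2_00		((2u << 16) | 0u)	/* assumed: major << 16 | minor */
#define SKETCH_UNSUPPORTED	(-1L)

struct sketch_hdr {
	uint32_t revision;
};

/* Stand-in for the firmware table: later fields exist only in newer revisions. */
struct sketch_runtime {
	struct sketch_hdr hdr;
	long (*get_variable)(void);
	long (*query_variable_info)(void);	/* valid only for revision >= 2.00 */
};

/* Trivial callee standing in for a working firmware service. */
static long sketch_ok(void)
{
	return 0;
}

static inline long sketch_query_variable_info(const struct sketch_runtime *rt)
{
	/* Older tables are shorter, so do not even read the pointer. */
	if (rt->hdr.revision < SKETCH_REV_2_00)
		return SKETCH_UNSUPPORTED;
	/* Defensive check for firmware that claims 2.00 but leaves it NULL. */
	if (!rt->query_variable_info)
		return SKETCH_UNSUPPORTED;
	return rt->query_variable_info();
}
```

The revision test has to come first, since on an older table the
pointer field may lie beyond the end of the structure the firmware
actually provided.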

Re: [ANNOUNCE] 3.4.41-rt55-feat1

2013-04-25 Thread Steven Rostedt
On Fri, 2013-04-26 at 00:20 +0200, Thomas Gleixner wrote:
> Tim,
> 
> On Thu, 25 Apr 2013, Tim Sander wrote:
> > handler. But normally the HR_TIMER is set. So we switched it off on this 
> > very 
> > purpose. As we also have also PREEMPT_RT_FULL set the proposed solution to 
> > allow only PREEMPT_RT_FULL with PREEMPT_RT_FULL set is not an option for us.
> 
> -ENOPARSE

I think he made a typo and was referring to my post. I think he meant
that they require not having high res timers set when PREEMPT_RT_FULL is
set. As you stated to me before.

-- Steve




[PATCH] kvm, svm: Fix typo in printk message

2013-04-25 Thread Borislav Petkov
From: Borislav Petkov 

It is "exit_int_info". It is actually EXITINTINFO in the official docs
but we don't like screaming docs.

Signed-off-by: Borislav Petkov 
---
 arch/x86/kvm/svm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index a3bba7786ecc..272d29844cc5 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3491,7 +3491,7 @@ static int handle_exit(struct kvm_vcpu *vcpu)
exit_code != SVM_EXIT_EXCP_BASE + PF_VECTOR &&
exit_code != SVM_EXIT_NPF && exit_code != SVM_EXIT_TASK_SWITCH &&
exit_code != SVM_EXIT_INTR && exit_code != SVM_EXIT_NMI)
-   printk(KERN_ERR "%s: unexpected exit_ini_info 0x%x "
+   printk(KERN_ERR "%s: unexpected exit_int_info 0x%x "
   "exit_code 0x%x\n",
   __func__, svm->vmcb->control.exit_int_info,
   exit_code);
-- 
1.8.2.135.g7b592fa


