Re: [PATCH 06/22] tools lib traceevent: Add kmem plugin

2013-11-24 Thread Namhyung Kim
On Sat, 23 Nov 2013 04:06:45 -0500, Steven Rostedt wrote:
> On Fri, 22 Nov 2013 23:38:17 +0900
> Namhyung Kim  wrote:
>> It'd be great if the "call_site" in the output changes to display
>> function names instead of hex addresses directly.
>> 
>
> Actually, that's what's in the (). 
>
>kmem:kmalloc_node: (__alloc_skb+0x7e) call_site=8153c67e
>
> This uses a short cut, where we don't overwrite the entire handler, in
> case the TP_printk() gets new fields.
>
> If the registered handler for an event, like "call_site_handler" (see
> how we use it for all of tracepoints) returns >0, that tells the
> library that we only added extra information, and to print the
> tracepoint as it is normally.

Yeah, I know.  But I just want to say that it'd be better if it were
displayed like below.

  kmem:kmalloc_node: call_site=__alloc_skb+0x7e ...

But that requires writing new handlers for each event.

>
> The better solution here is to use "%pS" or something in the actual
> tracepoint instead.

Agreed.

Thanks,
Namhyung
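
The return-value convention discussed above can be illustrated with a small,
self-contained sketch. It does not use the real libtraceevent API: struct
record, call_site_handler_sketch() and print_event() are hypothetical
stand-ins, and the only behaviour taken from the thread is that a handler
returning >0 adds extra output and leaves the default fields to the caller.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a parsed kmem event; not a libtraceevent type. */
struct record {
	const char *name;        /* e.g. "kmem:kmalloc_node" */
	const char *symbol;      /* resolved call site, e.g. "__alloc_skb+0x7e" */
	unsigned long call_site; /* raw address carried by the event */
};

/*
 * Handler in the style described in the thread: it only adds extra
 * information and returns >0 so that the caller still prints the
 * default fields.
 */
static int call_site_handler_sketch(char *buf, size_t len,
				    const struct record *rec)
{
	snprintf(buf, len, "(%s) ", rec->symbol);
	return 1;
}

/* Caller: append the default fields only when the handler returned >0. */
static void print_event(char *out, size_t len, const struct record *rec)
{
	size_t used;

	if (len == 0)
		return;

	snprintf(out, len, "%s: ", rec->name);
	used = strlen(out);

	if (call_site_handler_sketch(out + used, len - used, rec) > 0) {
		used = strlen(out);
		snprintf(out + used, len - used, "call_site=%lx",
			 rec->call_site);
	}
}
```

With this shape, getting "call_site=__alloc_skb+0x7e" instead would mean the
handler returns 0 and prints every field itself, which is the per-event work
mentioned above.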
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] pipe_to_sendpage: Ensure that MSG_MORE is set if we set MSG_SENDPAGE_NOTLAST

2013-11-24 Thread Richard Weinberger
On Sunday, 24 November 2013, 17:25:06, Eric Dumazet wrote:
> On Mon, 2013-11-25 at 00:42 +0100, Richard Weinberger wrote:
> > Commit 35f9c09fe (tcp: tcp_sendpages() should call tcp_push() once)
> > added an internal flag MSG_SENDPAGE_NOTLAST.
> > We have to ensure that MSG_MORE is also set if we set
> > MSG_SENDPAGE_NOTLAST.
> > Otherwise users that check against MSG_MORE will not see it.
> > 
> > This fixes sendfile() on AF_ALG.
> > 
> > Cc: Tom Herbert 
> > Cc: Eric Dumazet 
> > Cc: David S. Miller 
> > Cc:  # 3.4.x
> > Reported-and-tested-by: Shawn Landden 
> > Signed-off-by: Richard Weinberger 
> > ---
> > 
> >  fs/splice.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/fs/splice.c b/fs/splice.c
> > index 3b7ee65..b93f1b8 100644
> > --- a/fs/splice.c
> > +++ b/fs/splice.c
> > @@ -701,7 +701,7 @@ static int pipe_to_sendpage(struct pipe_inode_info *pipe,
> >     more = (sd->flags & SPLICE_F_MORE) ? MSG_MORE : 0;
> >
> >     if (sd->len < sd->total_len && pipe->nrbufs > 1)
> > -           more |= MSG_SENDPAGE_NOTLAST;
> > +           more |= MSG_SENDPAGE_NOTLAST | MSG_MORE;
> >
> >     return file->f_op->sendpage(file, buf->page, buf->offset,
> >                                 sd->len, , more);
> 
> I do not think this patch is right. It looks like a revert of a useful
> patch for TCP zero copy. Given the time it took to discover this
> regression, I bet tcp zero copy has more users than AF_ALG, by 5 or 6
> orders of magnitude ;)

Yeah, but AF_ALG broke. That's why I did the patch.

> Here we want to make the difference between the two flags, not merge
> them.
> 
> If AF_ALG does not care about the difference, try this instead:
> 
> diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
> index ef5356cd280a..850246206b12 100644
> --- a/crypto/algif_hash.c
> +++ b/crypto/algif_hash.c
> @@ -114,6 +114,9 @@ static ssize_t hash_sendpage(struct socket *sock, struct page *page,
>   struct hash_ctx *ctx = ask->private;
>   int err;
> 
> + if (flags & MSG_SENDPAGE_NOTLAST)
> + flags |= MSG_MORE;
> +

In the commit message of your patch you wrote "For all sendpage() providers, 
its a transparent change.". Why does AF_ALG need special handling?
If users have to care about MSG_SENDPAGE_NOTLAST it is no longer really an 
internal flag.

Thanks,
//richard
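
For reference, the per-provider approach suggested above boils down to the
following self-contained sketch. normalize_sendpage_flags() is a hypothetical
helper name, and the two flag values are believed to match
include/linux/socket.h of that era but should be treated as illustrative.

```c
/* Illustrative values; check include/linux/socket.h for the real ones. */
#define MSG_MORE             0x8000
#define MSG_SENDPAGE_NOTLAST 0x20000

/*
 * Hypothetical helper: a sendpage() provider that does not need to
 * distinguish the two flags folds NOTLAST into MORE before it looks
 * at the flags.
 */
static int normalize_sendpage_flags(int flags)
{
	if (flags & MSG_SENDPAGE_NOTLAST)
		flags |= MSG_MORE;
	return flags;
}
```

The trade-off raised in the reply still applies: every provider that keys off
MSG_MORE has to carry this check, so the flag is no longer purely internal.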


Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)

2013-11-24 Thread Francis Moreau
On 11/24/2013 10:06 PM, Rafael J. Wysocki wrote:
> On Sunday, November 24, 2013 10:39:20 AM Francis Moreau wrote:
>> Hello Thomas
>>
>> On 11/22/2013 11:27 PM, Thomas Gleixner wrote:
>>> On Fri, 22 Nov 2013, Rafael J. Wysocki wrote:
>>>> On Friday, November 22, 2013 10:36:23 PM Francis Moreau wrote:
>>>>> Ok, I've finally managed to find out the bad commit:
>>>>> ad07277e82dedabacc52c82746633680a3187d25: ACPI / PM: Hold acpi_scan_lock
>>>>> over system PM transitions
>>>>>
>>>>> I verified that the parent commit doesn't have the problem.
>>>>
>>>> Interesting.
>>>>
>>>>> Rafael, you're the man now ;)
>>>>
>>>> I kind of don't see how that commit may result in behavior that you
>>>> described earlier in the thread.
>>>>
>>>> You get a memory corruption that seems to have started to happen because
>>>> we're holding an additional lock over suspend resume now.  Something's fishy
>>>> on that machine and we need to figure out what it is.
>>>
>>> The hiccup happens in the timer softirq.
>>>
>>> @Francis: Did you try to enable DEBUG_OBJECTS.*? If not, please give it
>>>   a try.
>>
>> This looks like it was a good idea.
>>
>> The kernel now outputs the following traces after resuming.
>>
>> [   26.973928] WARNING: CPU: 0 PID: 4 at lib/debugobjects.c:260
>> debug_print_object+0x83/0xa0()
>> [   26.973932] ODEBUG: free active (active state 0) object type:
>> timer_list hint: delayed_work_timer_fn+0x0/0x20
>> [   26.973972] Modules linked in: x86_pkg_temp_thermal intel_powerclamp
>> rtsx_pci_ms coretemp memstick kvm_intel i2c_i801 iTCO_wdt
>> iTCO_vendor_support i915 i2c_algo_bit intel_agp intel_gtt drm_kms_helper
>> r8169 drm kvm mii agpgart i2c_core lpc_ich ac shpchp crc32c_intel
>> battery thermal wmi evdev mei_me video mei button mperf processor
>> serio_raw microcode ext4 crc16 mbcache jbd2 sr_mod cdrom sd_mod
>> usb_storage rtsx_pci_sdmmc mmc_core ahci libahci libata ehci_pci
>> ehci_hcd xhci_hcd scsi_mod rtsx_pci usbcore usb_common
>> [   26.974013] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted
>> 3.11.0-rc2-ARCH #64
>> [   26.974014] Hardware name: CLEVO CO.W55xEU
>>/W55xEU  , BIOS 4.6.5
>> 03/05/2013
>> [   26.974019] Workqueue: kacpi_hotplug hotplug_event_work
>> [   26.974020]  0009 880407d0da18 81459fe9
>> 880407d0da60
>> [   26.974023]  880407d0da50 8104dc7d 880407fad488
>> 81836fc0
>> [   26.974025]  81701358 81afef70 0003
>> 880407d0dab0
>> [   26.974027] Call Trace:
>> [   26.974031]  [] dump_stack+0x54/0x8d
>> [   26.974043]  [] warn_slowpath_common+0x7d/0xa0
>> [   26.974044]  [] warn_slowpath_fmt+0x4c/0x50
>> [   26.974047]  [] debug_print_object+0x83/0xa0
>> [   26.974050]  [] ? queue_work_on+0x50/0x50
>> [   26.974053]  [] __debug_check_no_obj_freed+0x1fb/0x240
>> [   26.974059]  [] ? rtsx_pci_remove+0x119/0x1d0
>> [rtsx_pci]
> 
> So a device driven by rtsx_pcr.c is removed after resume.  Without the commit
> you've bisected it is removed as well, but that happens during resume, so
> rtsx_pci_resume() is likely not called in that case.

I'm not sure I understand your point.

> 
> I bet that there's a bug either in rtsx_pci_remove() or in rtsx_pci_resume().
> The latter definitely should check if the device is actually still present
> before scheduling the delayed work, but then Boris' patch should take care
> of that anyway.
> 

With Boris' patch applied, I still have the problem.

Thanks.



Re: [PATCH 04/22] tools lib traceevent: Add jbd2 plugin

2013-11-24 Thread Namhyung Kim
On Sat, 23 Nov 2013 03:52:21 -0500, Steven Rostedt wrote:
> On Fri, 22 Nov 2013 23:27:57 +0900
> Namhyung Kim  wrote:
>
>> [SNIP]
>> > +#define MINORBITS 20
>> > +#define MINORMASK ((1U << MINORBITS) - 1)
>> > +
>> > +#define MAJOR(dev) ((unsigned int) ((dev) >> MINORBITS))
>> > +#define MINOR(dev) ((unsigned int) ((dev) & MINORMASK))
>> > +
>> > +unsigned long long process_jbd2_dev_to_name(struct trace_seq *s,
>> > +  unsigned long long *args)
>> > +{
>> > +  unsigned int dev = args[0];
>> > +
>> > +  trace_seq_printf(s, "%d:%d", MAJOR(dev), MINOR(dev));
>> > +  return 0;
>> > +}
>> > +
>> > +unsigned long long process_jiffies_to_msecs(struct trace_seq *s,
>> > +  unsigned long long *args)
>> > +{
>> > +  unsigned long long jiffies = args[0];
>> > +
>> > +  trace_seq_printf(s, "%lld", jiffies);
>> > +  return jiffies;
>> > +}
>> > +
>> > +int PEVENT_PLUGIN_LOADER(struct pevent *pevent)
>> > +{
>> > +  pevent_register_print_function(pevent,
>> > + process_jbd2_dev_to_name,
>> > + PEVENT_FUNC_ARG_STRING,
>> 
>> Actually the function returns long long not string.  But it seems the
>> current code doesn't care about the return type.
>
> Actually it's not representing what process_jbd2_dev_to_name() returns
> (which will always return unsigned long long), but what
> "jbd2_dev_to_name()" returns that is (was) defined in the kernel. That
> was:
>
>   const char *jbd2_dev_to_name(dev_t device)
>
> When registering a function to handle, you need to express the
> prototype of that function (not the handler). The third argument is the
> ret_type of that function.

Aha, got it.  Thank you for the explanation.


> But this is interesting, the ret_type doesn't seem to be used in
> event_parse.c. The return value of the callback is only used in
> eval_num_arg() where we could put a warning if the ret_type is not a
> number.

Yes. :)

Thanks,
Namhyung
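
As a side note on the quoted macros, the major/minor split can be tried
outside the plugin with a minimal sketch like the one below. format_dev() is
a hypothetical helper and is not part of the patch; the macros are copied
from the quoted code.

```c
#include <stdio.h>

#define MINORBITS 20
#define MINORMASK ((1U << MINORBITS) - 1)

#define MAJOR(dev) ((unsigned int) ((dev) >> MINORBITS))
#define MINOR(dev) ((unsigned int) ((dev) & MINORMASK))

/*
 * Hypothetical helper: format a device number as "major:minor".
 * The upper bits hold the major number, the low 20 bits the minor.
 */
static int format_dev(char *buf, size_t len, unsigned int dev)
{
	return snprintf(buf, len, "%u:%u", MAJOR(dev), MINOR(dev));
}
```

The sketch prints with %u because both macros cast to unsigned int.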


Re: Why is O_DSYNC on linux so slow / what's wrong with my SSD?

2013-11-24 Thread Stefan Priebe

Hi Ric,

On 23.11.2013 20:35, Ric Wheeler wrote:

On 11/23/2013 01:27 PM, Stefan Priebe wrote:

Hi Ric,

On 22.11.2013 21:37, Ric Wheeler wrote:

On 11/22/2013 03:01 PM, Stefan Priebe wrote:

Hi Christoph,
On 21.11.2013 11:11, Christoph Hellwig wrote:


2. Some drives may implement CMD_FLUSH to return immediately i.e. no
guarantee the data is actually on disk.


In which case they aren't spec compliant.  While I've seen countless
data integrity bugs on lower end ATA SSDs I've not seen one that simply
ignores flush.  If you'd want to cheat that bluntly you'd be better
off just claiming to not have a writeback cache.

You solve your performance problem by completely disabling any chance
of having data integrity guarantees, and do so in a way that is not
detectable for applications or users.

If you have a workload with lots of small synchronous writes disabling
the writeback cache on the disk does indeed often help, especially with
the non-queueable FLUSH on all but the most recent ATA devices.


But this isn't correct for drives with capacitors like the Crucial m500,
Intel DC S3500 and DC S3700, is it? Shouldn't the Linux kernel have an
option to disable this for drives like these?
/sys/block/sdX/device/ignore_flush


If you know 100% for sure that your drive has a non-volatile write
cache, you can run the file system without the flushing by mounting "-o
nobarrier".  With most devices, this is not needed since they tend to
simply ignore the flushes if they know they are power failure safe.

Block level, we did something similar for users who are not running
through a file system for SCSI devices - James added support to echo
"temporary" into the sd's device's cache_type field:

See:

https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=2ee3e26c673e75c05ef8b914f54fadee3d7b9c88



At least for me this does not work. I get the same awful speed as
before, and the I/O waits stay the same. I'm still seeing CMD
flushes going to the devices.

Is there any way to check whether the temporary got accepted and works?

I simply executed:
for i in /sys/class/scsi_disk/*/cache_type; do echo $i; echo temporary write back >$i; done

Stefan


What kernel are you running?  This is a new addition.

Also, you can "cat" the same file to see what it says.

Regards,

Ric



Is the output I sent you fine? Is anything wrong?

Stefan



Re: [PATCH 01/22] tools lib traceevent: Add plugin support

2013-11-24 Thread Namhyung Kim
On Sat, 23 Nov 2013 03:12:19 -0500, Steven Rostedt wrote:
> On Fri, 22 Nov 2013 23:17:06 +0900
> Namhyung Kim  wrote:
>
>> > 
>> [SNIP]
>> > +static void
>> > +load_plugin(struct pevent *pevent, const char *path,
>> > +  const char *file, void *data)
>> > +{
>> > +  struct plugin_list **plugin_list = data;
>> > +  pevent_plugin_load_func func;
>> > +  struct plugin_list *list;
>> > +  const char *alias;
>> > +  char *plugin;
>> > +  void *handle;
>> > +
>> > +  plugin = malloc_or_die(strlen(path) + strlen(file) + 2);
>> 
>> I'd like not to see this malloc_or_die() anymore in a new code.  Just
>> returning after showing a warning looks enough here.
>
> Yeah I agree. This is a relic from my code. I think it's OK to add
> here, as it is pretty much direct port of my code, and then we can just
> add a patch against it to remove it.

Okay.  I agree that it'd be better to make them separate patches.

>
>> 
>> > +
>> > +  strcpy(plugin, path);
>> > +  strcat(plugin, "/");
>> > +  strcat(plugin, file);
>> > +
>> > +  handle = dlopen(plugin, RTLD_NOW | RTLD_GLOBAL);
>> 
>> Why RTLD_NOW and RTLD_GLOBAL?  Hmm.. maybe using _NOW is needed to
>> prevent a runtime error, but not sure why _GLOBAL is needed.
>
> Yes, we want to make sure all symbols defined are available at time of
> load, otherwise bail out.
>
>> 
>> IIUC _GLOBAL is for exporting symbols to *other libraries*.  Is it
>> intended for this plugin support?
>
> That was the plan. To have one plugin supply a set of functions that
> other plugins may use. That is what GLOBAL is for, right?  I don't
> recall if I ever did this, but it was something I wanted for future
> work.
>
> Now if we don't need it, we could remove it, but is it bad to have?

It might slow down symbol resolution of new plugins a tiny bit.  But I
don't think it's a real problem as its effect will be negligible.

I don't object to the code; I just wanted to know your intention. :)

Thanks,
Namhyung
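
The "warn and return instead of malloc_or_die()" idea from this thread could
look roughly like the sketch below. build_plugin_path() is a hypothetical
helper, not a function from the patch, and the warning text is made up; the
caller would skip the plugin when it gets NULL back.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: join a directory and a file name into a newly
 * allocated "path/file" string.  On allocation failure it prints a
 * warning and returns NULL instead of exiting.  The caller frees the
 * result.
 */
static char *build_plugin_path(const char *path, const char *file)
{
	size_t len = strlen(path) + strlen(file) + 2; /* '/' and '\0' */
	char *plugin = malloc(len);

	if (!plugin) {
		fprintf(stderr, "warning: could not allocate plugin path\n");
		return NULL;
	}

	snprintf(plugin, len, "%s/%s", path, file);
	return plugin;
}
```

Using snprintf() with the computed length replaces the strcpy()/strcat()
sequence in one call; the resulting string is the same.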


Re: [PATCH] clocksource: Do not drop unheld reference on device node

2013-11-24 Thread Uwe Kleine-König
Hello Daniel,

On Sun, Nov 24, 2013 at 10:28:15PM +0100, Daniel Lezcano wrote:
> On 11/22/2013 08:22 PM, Uwe Kleine-König wrote:
> >On Fri, Nov 22, 2013 at 05:31:46PM +0100, Daniel Lezcano wrote:
> >>On 11/22/2013 05:16 PM, Thierry Reding wrote:
> >>>On Sat, Oct 19, 2013 at 12:49:48AM +0200, Thierry Reding wrote:
> >>Yes. Sounds like I missed it.
> >>
> >>This regression has been introduced by:
> >>
> >>commit 326e31eebe61dc838e031ea16968b2cfb43443e3
> >>Author: Uwe Kleine-König 
> >>Date:   Tue Oct 1 11:00:53 2013 +0200
> >>
> >> clocksource: Put nodes passed to CLOCKSOURCE_OF_DECLARE
> >>callbacks centrally
> >>
> >> Instead of letting each driver call of_node_put do it centrally in the
> >> loop that also calls the CLOCKSOURCE_OF_DECLARE callbacks. This is less
> >> prone to error and also moves getting and putting the references
> >>into the
> >> same function.
> >>
> >> Consequently all respective of_node_put calls in drivers are removed.
> >>
> >> Signed-off-by: Uwe Kleine-König 
> >> Signed-off-by: Daniel Lezcano 
> >> Acked-by: David Brown 
> >Still all but the hook in clocksource_of_init of this commit was
> >correct, right? (Well, but this buggy hunk makes the commit log wrong.)
> 
> I don't understand your comment, can you elaborate ?
My patch added an of_node_put in clocksource_of_init and dropped several
of_node_puts in drivers. This thread is about the first being wrong. My
question was if dropping the others was correct.

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |


Re: [PATCH 03/22] tools lib traceevent: Add traceevent_host_bigendian function

2013-11-24 Thread Namhyung Kim
Hi Steve,

On Sat, 23 Nov 2013 03:27:08 -0500, Steven Rostedt wrote:
> On Fri, 22 Nov 2013 23:22:52 +0900
> Namhyung Kim  wrote:
>
>> 2013-11-21 (Thu), 12:01 +0100, Jiri Olsa:
>> > Adding traceevent_host_bigendian function to get host
>> > endianness. It's used in the following patches.
>> 
>> [SNIP]
>> > +static inline int traceevent_host_bigendian(void)
>> > +{
>> > +  unsigned char str[] = { 0x1, 0x2, 0x3, 0x4 };
>> > +  unsigned int *ptr;
>> > +
>> > +  ptr = (unsigned int *)str;
>> > +  return *ptr == 0x01020304;
>> 
>> Is it safe for every architecture supported - especially ones that
>> require stricter alignment?  I know many architectures/compilers align
>> the stack, but I'm not sure doing this is safe for all architectures.
>
> Would you prefer this (I tested it on both a big and little endian)
>
> {
>   unsigned char str[] = { 0x1, 0x2, 0x3, 0x4 };
>   unsigned int val;
>
>   memcpy(&val, str, 4);
>   return val == 0x01020304;
> }

Yeah, looks good to me.

Thanks,
Namhyung


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-24 Thread Namhyung Kim
Hi Oleg,

On Tue, 12 Nov 2013 17:00:01 +0900, Namhyung Kim wrote:
> For @+addr syntax: user-space uses relative symbol address from a loaded
>base address and kernel calculates the base address
>using "current->utask->vaddr - tu->offset".

I tried this approach and realized that current->utask is not set or has
an invalid vaddr when handler_chain() is called.  So I had to apply
following patch and it seems to work well for me.  Could you confirm it?

Thanks,
Namhyung
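
The address arithmetic quoted at the top of this mail can be written down as
a tiny standalone sketch. resolve_relative_addr() and its parameter names are
hypothetical; only the "vaddr - offset" base calculation comes from the
description above, and the final addition assumes that an @+addr argument is
relative to that base.

```c
/*
 * Hypothetical helper mirroring the quoted description: the base
 * address of the mapping is the breakpoint's virtual address minus
 * the probe's offset, and the relative address is added to that base.
 */
static unsigned long resolve_relative_addr(unsigned long bp_vaddr,
					   unsigned long probe_offset,
					   unsigned long rel_addr)
{
	unsigned long base = bp_vaddr - probe_offset;

	return base + rel_addr;
}
```

This only works if the breakpoint address is already recorded when the
handlers run, which is what the patch below arranges.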


diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index ad8e1bdca70e..e63748d3520e 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1456,7 +1456,7 @@ static void prepare_uretprobe(struct uprobe *uprobe, struct pt_regs *regs)
 
 /* Prepare to single-step probed instruction out of line. */
 static int
-pre_ssout(struct uprobe *uprobe, struct pt_regs *regs, unsigned long bp_vaddr)
+pre_ssout(struct uprobe *uprobe, struct pt_regs *regs)
 {
struct uprobe_task *utask;
unsigned long xol_vaddr;
@@ -1471,7 +1471,6 @@ pre_ssout(struct uprobe *uprobe, struct pt_regs *regs, unsigned long bp_vaddr)
return -ENOMEM;
 
utask->xol_vaddr = xol_vaddr;
-   utask->vaddr = bp_vaddr;
 
err = arch_uprobe_pre_xol(&uprobe->arch, regs);
if (unlikely(err)) {
@@ -1701,6 +1700,7 @@ static bool handle_trampoline(struct pt_regs *regs)
 static void handle_swbp(struct pt_regs *regs)
 {
struct uprobe *uprobe;
+   struct uprobe_task *utask;
unsigned long bp_vaddr;
int uninitialized_var(is_swbp);
 
@@ -1744,11 +1744,17 @@ static void handle_swbp(struct pt_regs *regs)
if (unlikely(!test_bit(UPROBE_COPY_INSN, &uprobe->flags)))
goto out;
 
+   utask = get_utask();
+   if (!utask)
+   goto out;
+
+   utask->vaddr = bp_vaddr;
+
handler_chain(uprobe, regs);
if (can_skip_sstep(uprobe, regs))
goto out;
 
-   if (!pre_ssout(uprobe, regs, bp_vaddr))
+   if (!pre_ssout(uprobe, regs))
return;
 
/* can_skip_sstep() succeeded, or restart if can't singlestep */


[PATCH 1/1] Drop INITRAMFS_COMPRESSION_GZIP option

2013-11-24 Thread P J P

   Hello Andrew, all

Please find attached a patch that replaces the INITRAMFS_COMPRESSION_GZIP
option with CONFIG_RD_GZIP=y, since INITRAMFS_COMPRESSION_GZIP is not set.
The patch also removes the choice text for the INITRAMFS_* options from
usr/Kconfig.


Thank you.
--
Prasad J Pandit / Red Hat Security Response Team

From e5ab6002603a901d40c7f84b4d6d240bf0e91aca Mon Sep 17 00:00:00 2001
From: Hristo Venev 
Date: Mon, 25 Nov 2013 11:54:29 +0530
Subject: [PATCH 1/1] Drop INITRAMFS_COMPRESSION_GZIP option

Replaced INITRAMFS_COMPRESSION_GZIP=y option in the mips' &
powerpc's defconfig file with the CONFIG_RD_GZIP=y, for
INITRAMFS_COMPRESSION_GZIP is not set. Also removed the
corresponding Kconfig choice text for all INITRAMFS_COMPRESSION_*
options.

Signed-off-by: P J P 

diff --git a/arch/mips/configs/nlm_xlr_defconfig b/arch/mips/configs/nlm_xlr_defconfig
index 44b4734..17388d0 100644
--- a/arch/mips/configs/nlm_xlr_defconfig
+++ b/arch/mips/configs/nlm_xlr_defconfig
@@ -25,7 +25,7 @@ CONFIG_BLK_DEV_INITRD=y
 CONFIG_INITRAMFS_SOURCE=""
 CONFIG_RD_BZIP2=y
 CONFIG_RD_LZMA=y
-CONFIG_INITRAMFS_COMPRESSION_GZIP=y
+CONFIG_RD_GZIP=y
 CONFIG_EXPERT=y
 CONFIG_KALLSYMS_ALL=y
 # CONFIG_ELF_CORE is not set
diff --git a/arch/powerpc/configs/chroma_defconfig b/arch/powerpc/configs/chroma_defconfig
index 4f35fc4..4d09a56 100644
--- a/arch/powerpc/configs/chroma_defconfig
+++ b/arch/powerpc/configs/chroma_defconfig
@@ -29,7 +29,7 @@ CONFIG_BLK_DEV_INITRD=y
 CONFIG_INITRAMFS_SOURCE=""
 CONFIG_RD_BZIP2=y
 CONFIG_RD_LZMA=y
-CONFIG_INITRAMFS_COMPRESSION_GZIP=y
+CONFIG_RD_GZIP=y
 CONFIG_KALLSYMS_ALL=y
 CONFIG_EMBEDDED=y
 CONFIG_PERF_EVENTS=y
diff --git a/usr/Kconfig b/usr/Kconfig
index 642f503..2d4c77e 100644
--- a/usr/Kconfig
+++ b/usr/Kconfig
@@ -98,80 +98,3 @@ config RD_LZ4
help
  Support loading of a LZ4 encoded initial ramdisk or cpio buffer
  If unsure, say N.
-
-choice
-   prompt "Built-in initramfs compression mode" if INITRAMFS_SOURCE!=""
-   help
- This option decides by which algorithm the builtin initramfs
- will be compressed.  Several compression algorithms are
- available, which differ in efficiency, compression and
- decompression speed.  Compression speed is only relevant
- when building a kernel.  Decompression speed is relevant at
- each boot.
-
- If you have any problems with bzip2 or LZMA compressed
- initramfs, mail me (Alain Knaff) .
-
- High compression options are mostly useful for users who are
- low on RAM, since it reduces the memory consumption during
- boot.
-
- If in doubt, select 'gzip'
-
-config INITRAMFS_COMPRESSION_NONE
-   bool "None"
-   help
- Do not compress the built-in initramfs at all. This may
- sound wasteful in space, but, you should be aware that the
- built-in initramfs will be compressed at a later stage
- anyways along with the rest of the kernel, on those
- architectures that support this.
- However, not compressing the initramfs may lead to slightly
- higher memory consumption during a short time at boot, while
- both the cpio image and the unpacked filesystem image will
- be present in memory simultaneously
-
-config INITRAMFS_COMPRESSION_GZIP
-   bool "Gzip"
-   depends on RD_GZIP
-   help
- The old and tried gzip compression. It provides a good balance
- between compression ratio and decompression speed.
-
-config INITRAMFS_COMPRESSION_BZIP2
-   bool "Bzip2"
-   depends on RD_BZIP2
-   help
- Its compression ratio and speed is intermediate.
- Decompression speed is slowest among the choices.  The initramfs
- size is about 10% smaller with bzip2, in comparison to gzip.
- Bzip2 uses a large amount of memory. For modern kernels you
- will need at least 8MB RAM or more for booting.
-
-config INITRAMFS_COMPRESSION_LZMA
-   bool "LZMA"
-   depends on RD_LZMA
-   help
- This algorithm's compression ratio is best.
- Decompression speed is between the other choices.
- Compression is slowest. The initramfs size is about 33%
- smaller with LZMA in comparison to gzip.
-
-config INITRAMFS_COMPRESSION_XZ
-   bool "XZ"
-   depends on RD_XZ
-   help
- XZ uses the LZMA2 algorithm. The initramfs size is about 30%
- smaller with XZ in comparison to gzip. Decompression speed
- is better than that of bzip2 but worse than gzip and LZO.
- Compression is slow.
-
-config INITRAMFS_COMPRESSION_LZO
-   bool "LZO"
-   depends on RD_LZO
-   help
- Its compression ratio is the poorest among the choices. The kernel
- size is about 10% bigger than gzip; however its speed
- (both compression and decompression) is the fastest.
-
-endchoice
-- 
1.8.3.1



Re: [PATCH v2] irqchip: exynos-combiner: remove hard-coded irq_base value

2013-11-24 Thread Chander Kashyap
Hi Kukjin,

On 21 October 2013 02:32, Kukjin Kim  wrote:
> On 10/18/13 02:53, Tomasz Figa wrote:
>>
>> Hi Kukjin,
>>
>> On Thursday 26 of September 2013 14:05:09 Kukjin Kim wrote:
>>>
>>> Chander Kashyap wrote:

>>>> Replace irq_domain_add_simple with "irq_domain_add_linear" in order to
>>>> use linear irq domain, and to remove hardcoded irq_base_value.
>>>>
>>>> Signed-off-by: Chander Kashyap
>>>> ---
>>>>
>>>> Changes since v1:
>>>> - Replaced irq_domain_add_simple with irq_domain_add_linear,
>>>>   as suggested by Tomasz
>>>>
>>>>   drivers/irqchip/exynos-combiner.c |   15 +++
>>>>   1 file changed, 3 insertions(+), 12 deletions(-)
>>
>> [snip]
>>>
>>>
>>> Looks nice to me, applied with Tomasz's review.
>>
>>
>> I don't see this patch in your tree. Did you apply it in the end?
>>
> Thanks for your gentle reminder.

I still cannot see these patches in your branches.

>
> Applied.
> - Kukjin



-- 
with warm regards,
Chander Kashyap


[PATCH] update consumers of MSG_MORE to recognize MSG_SENDPAGE_NOTLAST

2013-11-24 Thread Shawn Landden
Commit 35f9c09fe (tcp: tcp_sendpages() should call tcp_push() once)
added an internal flag MSG_SENDPAGE_NOTLAST, similar to
MSG_MORE.

algif_hash, algif_skcipher, and udp used MSG_MORE from tcp_sendpages()
and need to see the new flag as identical to MSG_MORE.

This fixes sendfile() on AF_ALG.

v3: also fix udp

Cc: Tom Herbert 
Cc: Eric Dumazet 
Cc: David S. Miller 
Cc:  # 3.4.x + 3.2.x
Reported-and-tested-by: Shawn Landden 
Original-patch: Richard Weinberger 
Signed-off-by: Shawn Landden 
---
 crypto/algif_hash.c | 3 +++
 crypto/algif_skcipher.c | 3 +++
 net/ipv4/udp.c  | 3 +++
 3 files changed, 9 insertions(+)

diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index ef5356c..8502462 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -114,6 +114,9 @@ static ssize_t hash_sendpage(struct socket *sock, struct page *page,
struct hash_ctx *ctx = ask->private;
int err;
 
+   if (flags & MSG_SENDPAGE_NOTLAST)
+   flags |= MSG_MORE;
+
lock_sock(sk);
sg_init_table(ctx->sgl.sg, 1);
sg_set_page(ctx->sgl.sg, page, size, offset);
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index 6a6dfc0..a19c027 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -378,6 +378,9 @@ static ssize_t skcipher_sendpage(struct socket *sock, struct page *page,
struct skcipher_sg_list *sgl;
int err = -EINVAL;
 
+   if (flags & MSG_SENDPAGE_NOTLAST)
+   flags |= MSG_MORE;
+
lock_sock(sk);
if (!ctx->more && ctx->used)
goto unlock;
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 5944d7d..8bd04df 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1098,6 +1098,9 @@ int udp_sendpage(struct sock *sk, struct page *page, int offset,
struct udp_sock *up = udp_sk(sk);
int ret;
 
+   if (flags & MSG_SENDPAGE_NOTLAST)
+   flags |= MSG_MORE;
+
if (!up->pending) {
struct msghdr msg = {   .msg_flags = flags|MSG_MORE };
 
-- 
1.8.4.4



Re: [PATCH V2 2/2] ARM: dts: Enable ahci sata and sata phy

2013-11-24 Thread Kishon Vijay Abraham I
Hi,

On Monday 11 November 2013 02:02 PM, Yuvaraj Kumar C D wrote:
> This patch adds dt entry for ahci sata controller and its
> corresponding phy controller. The phy node has been added w.r.t. the
> new generic phy framework.
> 
> Changes since V1:
>   1.Minor changes to node name convention
>   2.Updated binding document.
> 
> Signed-off-by: Yuvaraj Kumar C D 
> ---
>  .../devicetree/bindings/ata/exynos-sata-phy.txt|   19 +-
>  .../devicetree/bindings/ata/exynos-sata.txt|   17 +++-
>  arch/arm/boot/dts/exynos5250-arndale.dts   |9 -
>  arch/arm/boot/dts/exynos5250-smdk5250.dts  |8 ++--
>  arch/arm/boot/dts/exynos5250.dtsi  |   21 
> 
>  5 files changed, 53 insertions(+), 21 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/ata/exynos-sata-phy.txt b/Documentation/devicetree/bindings/ata/exynos-sata-phy.txt
> index 37824fa..a679e17 100644
> --- a/Documentation/devicetree/bindings/ata/exynos-sata-phy.txt
> +++ b/Documentation/devicetree/bindings/ata/exynos-sata-phy.txt
> @@ -4,11 +4,20 @@ SATA PHY nodes are defined to describe on-chip SATA Physical layer controllers.
>  Each SATA PHY controller should have its own node.
>  
>  Required properties:
> -- compatible: compatible list, contains "samsung,exynos5-sata-phy"
> +- compatible: compatible list, contains "samsung,exynos5250-sata-phy"

What if someone is already using samsung,exynos5-sata-phy? You can mark the old
one as deprecated and add the new compatible string.
>  - reg   : 
>  
>  Example:
> -sata@ffe07000 {
> -compatible = "samsung,exynos5-sata-phy";
> -reg = <0xffe07000 0x1000>;
> -};
> + sata_phy: sata-phy@1217 {
> + compatible = "samsung,exynos5250-sata-phy";
> + reg = <0x1217 0x1ff>;
> + clocks = < 287>;
> + clock-names = "sata_phyctrl";
> + #phy-cells = <0>;
> + #address-cells = <1>;
> + #size-cells = <1>;
> + ranges;
> + sataphy-pmu {
> + reg = <0x10040724 0x4>;
> + };

alignment problem..
> + };
> diff --git a/Documentation/devicetree/bindings/ata/exynos-sata.txt b/Documentation/devicetree/bindings/ata/exynos-sata.txt
> index 0849f10..8ec7327 100644
> --- a/Documentation/devicetree/bindings/ata/exynos-sata.txt
> +++ b/Documentation/devicetree/bindings/ata/exynos-sata.txt
> @@ -8,10 +8,17 @@ Required properties:
>  - interrupts: 
>  - reg   : 
>  - samsung,sata-freq : 
> +- phys  : as mentioned in phy-bindings.txt
> +- phy-names : as mentioned in phy-bindings.txt
>  
>  Example:
> -sata@ffe08000 {
> -compatible = "samsung,exynos5-sata";
> -reg = <0xffe08000 0x1000>;
> -interrupts = <115>;
> -};
> + sata@122F {

use lower case here..
> + compatible = "snps,dwc-ahci";
> + samsung,sata-freq = <66>;
> + reg = <0x122F 0x1ff>;
here too..
> + interrupts = <0 115 0>;
> + clocks = < 277>, < 143>;
> + clock-names = "sata", "sclk_sata";
> + phys = <&sata_phy>;
> + phy-names = "sata-phy";
> + };
> diff --git a/arch/arm/boot/dts/exynos5250-arndale.dts b/arch/arm/boot/dts/exynos5250-arndale.dts
> index b77a37e..434e4f3 100644
> --- a/arch/arm/boot/dts/exynos5250-arndale.dts
> +++ b/arch/arm/boot/dts/exynos5250-arndale.dts
> @@ -381,7 +381,14 @@
>   };
>  
>   i2c@121D {
> - status = "disabled";
> + samsung,i2c-sda-delay = <100>;
> + samsung,i2c-max-bus-freq = <4>;
> + samsung,i2c-slave-addr = <0x38>;
> +
> + sata-phy {
> + compatible = "sata-phy-i2c";

Do you have documentation for this compatible string?
> + reg = <0x38>;
> + };
>   };
>  
>   mmc_0: mmc@1220 {
> diff --git a/arch/arm/boot/dts/exynos5250-smdk5250.dts b/arch/arm/boot/dts/exynos5250-smdk5250.dts
> index 13746df..ef9 100644
> --- a/arch/arm/boot/dts/exynos5250-smdk5250.dts
> +++ b/arch/arm/boot/dts/exynos5250-smdk5250.dts
> @@ -90,16 +90,12 @@
>   samsung,i2c-max-bus-freq = <4>;
>   samsung,i2c-slave-addr = <0x38>;
>  
> - sata-phy {
> - compatible = "samsung,sata-phy";
> + sata-phy@38 {
> + compatible = "sata-phy-i2c";
>   reg = <0x38>;
>   };
>   };
>  
> - sata@122F {
> - samsung,sata-freq = <66>;
> - };
> -
>   i2c@12C8 {
>   samsung,i2c-sda-delay = <100>;
>   samsung,i2c-max-bus-freq = <66000>;
> diff --git a/arch/arm/boot/dts/exynos5250.dtsi 
> b/arch/arm/boot/dts/exynos5250.dtsi
> 

Re: [PATCH v3 07/15] KVM: MMU: introduce nulls desc

2013-11-24 Thread Xiao Guangrong
On 11/25/2013 02:11 PM, Xiao Guangrong wrote:
> 
> On Nov 23, 2013, at 3:14 AM, Marcelo Tosatti  wrote:
> 
>> On Wed, Oct 23, 2013 at 09:29:25PM +0800, Xiao Guangrong wrote:
>>> It is like a nulls list: we use the pte-list as the nulls, which helps us
>>> detect whether the "desc" has been moved to another rmap, so that we can
>>> re-walk the rmap if that happened.
>>>
>>> kvm->slots_lock is held while we do the lockless walk, which prevents the
>>> rmap from being reused (freeing an rmap requires that lock), so we cannot
>>> see the same nulls used on different rmaps.
>>>
>>> Signed-off-by: Xiao Guangrong 
>>
>> How about simplified lockless walk on the slot while rmapp entry
>> contains a single spte? (which should be the case with two-dimensional
>> paging).
>>
>> That is, grab the lock when finding a rmap with more than one spte in
>> it (and then keep it locked until the end).
> 
> Hmm… that isn't straightforward and is more complex than the approach
> in this patchset. It would also lose the improvement for the shadow MMU,
> which benefits greatly from this patchset.
> 
>>
>> For example, nothing prevents lockless walker to move into some
>> parent_ptes chain, right?
> 
> No.
> 
> The nulls help us detect this case: for parent_ptes the nulls point
> to the "shadow page", but for rmaps the nulls point to slot.arch.rmap. There
> is no chance that the "rmap" is used as a shadow page while the slot-lock is held.
> 
>>
>> Also, there is no guarantee of termination (as long as sptes are
>> deleted with the correct timing). BTW, can't see any guarantee of
>> termination for rculist nulls either (a writer can race with a lockless
>> reader indefinitely, restarting the lockless walk every time).
> 
> Hmm, that can be avoided by checking the dirty-bitmap before the rewalk:
> if the dirty-bitmap has been set during lockless write-protection,
> it's unnecessary to write-protect its sptes. What do you think?

This idea is based on the fact that the number of entries in an rmap is
limited by RMAP_RECYCLE_THRESHOLD. So when a new spte is added to the rmap
we can break out of the rewalk at once, and when sptes are deleted we
rewalk at most RMAP_RECYCLE_THRESHOLD times.
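
A minimal userspace sketch of the bound described above, under stated
assumptions: struct desc, walk_rmap(), count_visit() and MAX_REWALKS are
hypothetical names, MAX_REWALKS only stands in for RMAP_RECYCLE_THRESHOLD
(its value here is arbitrary), and the nulls is modelled as an owner pointer
kept in the tail node. This is not the patchset's code, and it is
single-threaded, so it shows only the termination logic and none of the
memory ordering a real lockless walk needs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for RMAP_RECYCLE_THRESHOLD; the value is arbitrary. */
#define MAX_REWALKS 8

/* One node of a chain. The tail node carries the "nulls" value, modelled
 * here as a pointer identifying the chain's owner (the rmap head). */
struct desc {
	struct desc *next;	/* NULL on the tail node */
	const void *nulls;	/* only meaningful on the tail node */
	int spte;		/* payload handed to the visitor */
};

/* Example visitor: counts how many nodes were visited. */
static void count_visit(int spte, void *ctx)
{
	(void)spte;
	(*(int *)ctx)++;
}

/*
 * Walk the chain from head on behalf of owner, calling visit() per node.
 * If the nulls found at the tail does not name owner, the walker assumes
 * it drifted onto another chain and starts over, at most MAX_REWALKS times.
 * Returns the number of rewalks that were needed, or -1 once the bound
 * is exhausted.
 */
static int walk_rmap(const struct desc *head, const void *owner,
		     void (*visit)(int spte, void *ctx), void *ctx)
{
	assert(head != NULL);

	for (int rewalks = 0; rewalks <= MAX_REWALKS; rewalks++) {
		const struct desc *d = head;

		for (;;) {
			if (visit)
				visit(d->spte, ctx);
			if (d->next == NULL)
				break;
			d = d->next;
		}
		if (d->nulls == owner)
			return rewalks;	/* finished on the right chain */
	}
	return -1;			/* bound reached, give up */
}
```

Nodes can be visited more than once when a rewalk happens, so the visitor
has to be idempotent, as write-protecting an spte would be.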




Re: [PATCH V2 1/2] Phy: Exynos: Add Exynos5250 sata phy driver

2013-11-24 Thread Kishon Vijay Abraham I
Hi,

On Friday 22 November 2013 11:31 AM, Yuvaraj Kumar wrote:
> Any comments on this patch?
> 
> On Mon, Nov 11, 2013 at 2:02 PM, Yuvaraj Kumar C D  
> wrote:
>> This patch adds the SATA PHY driver for Exynos5250. The Exynos5250 SATA
>> PHY comprises CMU and TRSV blocks, which are accessed through an I2C
>> register map. So this patch also adds an i2c client driver, which is used
>> to configure the CMU and TRSV blocks of the Exynos5250 SATA PHY.
>>
>> This patch incorporates the generic phy framework to deal with sata
>> phy.
>>
>> This patch depends on the below patches
>> [1].drivers: phy: add generic PHY framework
>> by Kishon Vijay Abraham I
>> [2].ata: ahci_platform: Manage SATA PHY
>> by Roger Quadros 
>> Changes from V1:
>> 1.Adapted to latest version of Generic PHY framework
>> 2.Removed exynos_sata_i2c_remove function.
>>
>> Signed-off-by: Yuvaraj Kumar C D 
>> Signed-off-by: Girish K S 
>> Signed-off-by: Vasanth Ananthan 
>> ---
>>  drivers/phy/Kconfig   |7 ++
>>  drivers/phy/Makefile  |1 +
>>  drivers/phy/exynos5250_phy_i2c.c  |   43 +++
>>  drivers/phy/sata_phy_exynos5250.c |  245 
>> +
>>  drivers/phy/sata_phy_exynos5250.h |   33 +
>>  5 files changed, 329 insertions(+)
>>  create mode 100644 drivers/phy/exynos5250_phy_i2c.c
>>  create mode 100644 drivers/phy/sata_phy_exynos5250.c
>>  create mode 100644 drivers/phy/sata_phy_exynos5250.h
>>
>> diff --git a/drivers/phy/Kconfig b/drivers/phy/Kconfig
>> index 349bef2..8afd423 100644
>> --- a/drivers/phy/Kconfig
>> +++ b/drivers/phy/Kconfig
>> @@ -15,4 +15,11 @@ config GENERIC_PHY
>>   phy users can obtain reference to the PHY. All the users of this
>>   framework should select this config.
>>
>> +config EXYNOS5250_SATA_PHY
>> +   tristate "Exynos5250 Sata SerDes/PHY driver"
>> +   depends on GENERIC_PHY && SOC_EXYNOS5250

select GENERIC_PHY?
>> +   help
>> + Support for Exynos5250 sata SerDes/Phy found on Samsung
>> + SoCs.

checkpatch gives a warning if it doesn't have at least 4 help lines :-s
>> +
>>  endmenu
>> diff --git a/drivers/phy/Makefile b/drivers/phy/Makefile
>> index 9e9560f..824f47b 100644
>> --- a/drivers/phy/Makefile
>> +++ b/drivers/phy/Makefile
>> @@ -3,3 +3,4 @@
>>  #
>>
>>  obj-$(CONFIG_GENERIC_PHY)  += phy-core.o
>> +obj-$(CONFIG_EXYNOS5250_SATA_PHY)  += sata_phy_exynos5250.o 
>> exynos5250_phy_i2c.o
>> diff --git a/drivers/phy/exynos5250_phy_i2c.c 
>> b/drivers/phy/exynos5250_phy_i2c.c
>> new file mode 100644
>> index 000..752c8fe
>> --- /dev/null
>> +++ b/drivers/phy/exynos5250_phy_i2c.c
>> @@ -0,0 +1,43 @@
>> +/*
>> + * Copyright (C) 2013 Samsung Electronics Co.Ltd
>> + * Author:
>> + * Yuvaraj C D 
>> + *
>> + * This program is free software; you can redistribute  it and/or modify it
>> + * under  the terms of  the GNU General  Public License as published by the
>> + * Free Software Foundation;  either version 2 of the  License, or (at your
>> + * option) any later version.
>> + *
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include "sata_phy_exynos5250.h"

arrange these headers in alphabetical order.. so it's easier to check if a
header has already been added while adding new headers.
>> +
>> +static int exynos_sata_i2c_probe(struct i2c_client *client,
>> +   const struct i2c_device_id *i2c_id)
>> +{
>> +   sataphy_attach_i2c_client(client);
>> +
>> +   dev_info(&client->adapter->dev,
>> +   "attached %s into sataphy i2c adapter successfully\n",
>> +   client->name);
>> +
>> +   return 0;
>> +}
>> +
>> +static const struct i2c_device_id phy_i2c_device_match[] = {
>> +   { "sata-phy-i2c", 0 },

Please use .compatible to assign compatible strings. Do you have DT documentation?
It should be *exynos,sata-phy-i2c*.
>> +};
>> +MODULE_DEVICE_TABLE(of, phy_i2c_device_match);
>> +
>> +struct i2c_driver sataphy_i2c_driver = {
>> +   .probe= exynos_sata_i2c_probe,
>> +   .id_table = phy_i2c_device_match,
>> +   .driver   = {
>> +   .name = "sata-phy-i2c",
>> +   .owner = THIS_MODULE,
>> +   .of_match_table = (void *)phy_i2c_device_match,

use of_match_ptr here.
>> +   },
>> +};
>> diff --git a/drivers/phy/sata_phy_exynos5250.c 
>> b/drivers/phy/sata_phy_exynos5250.c
>> new file mode 100644
>> index 000..13f4ce0
>> --- /dev/null
>> +++ b/drivers/phy/sata_phy_exynos5250.c
>> @@ -0,0 +1,245 @@
>> +/*
>> + * Samsung SATA SerDes(PHY) driver
>> + *
>> + * Copyright (C) 2013 Samsung Electronics Co., Ltd.
>> + * Authors: Girish K S 
>> + * Yuvaraj Kumar C D 
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 

Re: [PATCH v3 07/15] KVM: MMU: introduce nulls desc

2013-11-24 Thread Xiao Guangrong

On Nov 23, 2013, at 3:14 AM, Marcelo Tosatti  wrote:

> On Wed, Oct 23, 2013 at 09:29:25PM +0800, Xiao Guangrong wrote:
>> It is like a nulls list: we use the pte-list as the nulls, which helps us
>> detect whether the "desc" has been moved to another rmap, so that we can
>> re-walk the rmap if that happened.
>> 
>> kvm->slots_lock is held while we do the lockless walk, which prevents the
>> rmap from being reused (freeing an rmap requires that lock), so we cannot
>> see the same nulls used on different rmaps.
>> 
>> Signed-off-by: Xiao Guangrong 
> 
> How about simplified lockless walk on the slot while rmapp entry
> contains a single spte? (which should be the case with two-dimensional
> paging).
> 
> That is, grab the lock when finding a rmap with more than one spte in
> it (and then keep it locked until the end).

Hmm… that isn't straightforward and is more complex than the approach
in this patchset. It would also lose the improvement for the shadow MMU,
which benefits greatly from this patchset.

> 
> For example, nothing prevents lockless walker to move into some
> parent_ptes chain, right?

No.

The nulls help us detect this case: for parent_ptes the nulls point
to the "shadow page", but for rmaps the nulls point to slot.arch.rmap. There
is no chance that the "rmap" is used as a shadow page while the slot-lock is held.

> 
> Also, there is no guarantee of termination (as long as sptes are
> deleted with the correct timing). BTW, can't see any guarantee of
> termination for rculist nulls either (a writer can race with a lockless
> reader indefinitely, restarting the lockless walk every time).

Hmm, that can be avoided by checking the dirty-bitmap before the rewalk:
if the dirty-bitmap has been set during lockless write-protection,
it's unnecessary to write-protect its sptes. What do you think?

But… do we really need to care about it? :(








[PATCH] phy: Add Vitesse 8514 phy ID

2013-11-24 Thread shh.xie
From: Shaohui Xie 

Phy is compatible with Vitesse 82xx

Signed-off-by: Shaohui Xie 
---
 drivers/net/phy/vitesse.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/drivers/net/phy/vitesse.c b/drivers/net/phy/vitesse.c
index 508e435..14372c6 100644
--- a/drivers/net/phy/vitesse.c
+++ b/drivers/net/phy/vitesse.c
@@ -64,6 +64,7 @@
 
 #define PHY_ID_VSC8234 0x000fc620
 #define PHY_ID_VSC8244 0x000fc6c0
+#define PHY_ID_VSC8514 0x00070670
 #define PHY_ID_VSC8574 0x000704a0
 #define PHY_ID_VSC8662 0x00070660
 #define PHY_ID_VSC8221 0x000fc550
@@ -131,6 +132,7 @@ static int vsc82xx_config_intr(struct phy_device *phydev)
err = phy_write(phydev, MII_VSC8244_IMASK,
(phydev->drv->phy_id == PHY_ID_VSC8234 ||
 phydev->drv->phy_id == PHY_ID_VSC8244 ||
+phydev->drv->phy_id == PHY_ID_VSC8514 ||
 phydev->drv->phy_id == PHY_ID_VSC8574) ?
MII_VSC8244_IMASK_MASK :
MII_VSC8221_IMASK_MASK);
@@ -246,6 +248,18 @@ static struct phy_driver vsc82xx_driver[] = {
.config_intr= _config_intr,
.driver = { .owner = THIS_MODULE,},
 }, {
+   .phy_id = PHY_ID_VSC8514,
+   .name   = "Vitesse VSC8514",
+   .phy_id_mask= 0x0000,
+   .features   = PHY_GBIT_FEATURES,
+   .flags  = PHY_HAS_INTERRUPT,
+   .config_init= _config_init,
+   .config_aneg= _config_aneg,
+   .read_status= _read_status,
+   .ack_interrupt  = _ack_interrupt,
+   .config_intr= _config_intr,
+   .driver = { .owner = THIS_MODULE,},
+}, {
.phy_id = PHY_ID_VSC8574,
.name   = "Vitesse VSC8574",
.phy_id_mask= 0x0000,
@@ -315,6 +329,7 @@ module_exit(vsc82xx_exit);
 static struct mdio_device_id __maybe_unused vitesse_tbl[] = {
{ PHY_ID_VSC8234, 0x0000 },
{ PHY_ID_VSC8244, 0x000fffc0 },
+   { PHY_ID_VSC8514, 0x0000 },
{ PHY_ID_VSC8574, 0x0000 },
{ PHY_ID_VSC8662, 0x0000 },
{ PHY_ID_VSC8221, 0x0000 },
-- 
1.8.4.1
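
For readers following the ID table, a hedged userspace sketch of how an
ID/mask pair selects an entry: a device matches when its ID and the entry's
ID agree on every bit set in the mask. struct phy_entry, example_tbl and
find_phy_entry() are hypothetical names, and the 0x000ffff0 mask is an
assumption chosen for the example rather than a value taken from the hunks
above; only the PHY_ID_* values come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PHY_ID_VSC8514 0x00070670
#define PHY_ID_VSC8574 0x000704a0
#define PHY_ID_VSC8662 0x00070660

/* Hypothetical, simplified table entry: ID, mask of significant bits, name. */
struct phy_entry {
	uint32_t phy_id;
	uint32_t phy_id_mask;
	const char *name;
};

/* Mask value is an assumption: it ignores the low four bits of the ID. */
static const struct phy_entry example_tbl[] = {
	{ PHY_ID_VSC8514, 0x000ffff0, "Vitesse VSC8514" },
	{ PHY_ID_VSC8574, 0x000ffff0, "Vitesse VSC8574" },
	{ PHY_ID_VSC8662, 0x000ffff0, "Vitesse VSC8662" },
};

/* Return the first entry whose masked ID equals the masked device ID,
 * or NULL if no entry matches. */
static const struct phy_entry *find_phy_entry(const struct phy_entry *tbl,
					      size_t n, uint32_t id)
{
	assert(tbl != NULL);

	for (size_t i = 0; i < n; i++) {
		uint32_t mask = tbl[i].phy_id_mask;

		if ((id & mask) == (tbl[i].phy_id & mask))
			return &tbl[i];
	}
	return NULL;
}
```

With this mask the VSC8514 and VSC8662 IDs (0x00070670 and 0x00070660)
stay distinct, because they differ in a bit the mask keeps.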




WARNING at net/core/dev.c:netdev_all_upper_get_next_dev_rcu()

2013-11-24 Thread Yuanhan Liu
Greetings,

We got the following warning:
  [   25.040056] BIOS EDD facility v0.16 2004-Jun-25, 0 devices found
  [   25.047312] EDD information not available.
  [   25.637680] [ cut here ]
  [   25.643383] WARNING: CPU: 10 PID: 1 at /c/kernel-tests/src/x86_64/net/core/dev.c:4503 netdev_all_upper_get_next_dev_rcu+0x40/0x84()
  [   25.657508] Modules linked in:
  [   25.661515] CPU: 10 PID: 1 Comm: swapper/0 Not tainted 3.12.0-11530-g873cd59 #1751
  [   25.670889] Hardware name: Intel Corporation LH Pass/S4600LH, BIOS SE5C600.86B.99.02.1047.032320122259 03/23/2012
  [   25.683647]  0001 880427c9bc68 81a4168e 

  [   25.693182]  880427c9bca0 810c5530 81934918 
880427c9bce8
  [   25.702742]  880818e98000  0040 
880427c9bcb0
  [   25.712306] Call Trace:
  [   25.715532]  [] dump_stack+0x4d/0x66
  [   25.721770]  [] warn_slowpath_common+0x7f/0x98
  [   25.728974]  [] ? netdev_all_upper_get_next_dev_rcu+0x40/0x84
  [   25.738020]  [] warn_slowpath_null+0x1a/0x1c
  [   25.745022]  [] netdev_all_upper_get_next_dev_rcu+0x40/0x84
  [   25.753517]  [] ixgbe_configure+0x74f/0x786
  [   25.760431]  [] ixgbe_open+0x18e/0x409
  [   25.766895]  [] ? raw_notifier_call_chain+0x14/0x16
  [   25.774608]  [] ? call_netdevice_notifiers_info+0x52/0x59
  [   25.782914]  [] __dev_open+0x90/0xd0
  [   25.789174]  [] __dev_change_flags+0xa9/0x14b
  [   25.796279]  [] dev_change_flags+0x26/0x59
  [   25.803136]  [] ip_auto_config+0x204/0xe82
  [   25.809974]  [] ? lock_release_holdtime.part.7+0xcc/0xd9
  [   25.818181]  [] ? tcp_set_default_congestion_control+0xb4/0xb9
  [   25.827349]  [] ? _raw_spin_unlock+0x27/0x32
  [   25.834353]  [] ? root_nfs_parse_addr+0xaf/0xaf
  [   25.841668]  [] do_one_initcall+0xa4/0x13a
  [   25.848495]  [] ? parse_args+0x261/0x33f
  [   25.855127]  [] kernel_init_freeable+0x1d9/0x25f
  [   25.862516]  [] ? do_early_param+0x88/0x88
  [   25.869348]  [] ? rest_init+0xcd/0xcd
  [   25.875669]  [] kernel_init+0xe/0x109
  [   25.882011]  [] ret_from_fork+0x7c/0xb0
  [   25.888535]  [] ? rest_init+0xcd/0xcd
  [   25.894850] ---[ end trace 083c1411a531ab55 ]---
  [   25.904421] pps pps0: new PPS source ptp0
  [   25.909429] ixgbe :06:00.0: registered PHC device on eth0
  [   26.321643] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready

And the first bad commit is:

  commit 2a47fa45d4dfbc54659d28de311a1f764b296a3c
  Author: John Fastabend 
  Date:   Wed Nov 6 09:54:52 2013 -0800
  
  ixgbe: enable l2 forwarding acceleration for macvlans
  
  Now that l2 acceleration ops are in place from the prior patch,
  enable ixgbe to take advantage of these operations.  Allow it to
  allocate queues for a macvlan so that when we transmit a frame,
  we can do the switching in hardware inside the ixgbe card, rather
  than in software.
  
  Signed-off-by: John Fastabend 
  Signed-off-by: Neil Horman 
  CC: Andy Gospodarek 
  CC: "David S. Miller" 
  Signed-off-by: David S. Miller 
  
  :04 04 6407c4e5932446e035cfd57b786845e49746948f 
60c62718a990436d6d4589b8124affaa0412aa14 Mdrivers
  bisect run success
  
  # bad: [873cd59de3c0e84596ee1790fb3047df45d0da43] Merge 'drm-exynos/exynos-drm-fixes' into devel-hourly-2013112214
  # good: [5e01dc7b26d9f24f39abace5da98ccbd6a5ceb52] Linux 3.12
  git bisect start '873cd59de3c0e84596ee1790fb3047df45d0da43' '5e01dc7b26d9f24f39abace5da98ccbd6a5ceb52' '--'
  # good: [5cbb3d216e2041700231bcfc383ee5f8b7fc8b74] Merge branch 'akpm' (patches from Andrew Morton)
  git bisect good 5cbb3d216e2041700231bcfc383ee5f8b7fc8b74
  # bad: [3aeb58ab6216d864821e8dafb248e8d77403f3e9] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
  git bisect bad 3aeb58ab6216d864821e8dafb248e8d77403f3e9
  # bad: [dcd607718385d02ce3741de225927a57f528f93b] inet: fix a UFO regression
  git bisect bad dcd607718385d02ce3741de225927a57f528f93b
  # good: [3ba405db1c1b05d157474c71e559393f7ea436ad] gianfar: Simplify MQ polling to avoid soft lockup
  git bisect good 3ba405db1c1b05d157474c71e559393f7ea436ad
  # good: [ba275241030cfe87b87d6592345c7e7ebd9b6fba] virtio-net: coalesce rx frags when possible during rx
  git bisect good ba275241030cfe87b87d6592345c7e7ebd9b6fba
  # good: [a72e25f78134cc0c1ef2adc99d6c3680ebd80e35] Merge branch 'for-linville' of git://github.com/kvalo/ath
  git bisect good a72e25f78134cc0c1ef2adc99d6c3680ebd80e35
  # good: [53c5a099b8fd45632f4021f0a908b43aabe883fc] rt2x00: rt2800lib: autodetect 5GHz band support
  git bisect good 53c5a099b8fd45632f4021f0a908b43aabe883fc
  # good: [01925efdf7e03b4b803b5c9f985163d687f7f017] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
  git bisect good 01925efdf7e03b4b803b5c9f985163d687f7f017
  # good: [95ed40196f965177ee0d044ab304e5cab3aee9c1] Merge branch 'tipc_fragmentation'
  

Re: [PATCH] Cpufreq: Change sysfs interface cpuinfo_cur_freq access privilege

2013-11-24 Thread Lan Tianyu
On 2013-11-25 12:30, Viresh Kumar wrote:
> On 25 November 2013 08:23, Lan Tianyu  wrote:
>> Currently, cpuinfo_cur_freq is only accessible to the root user, while
>> other cpufreq sysfs interfaces (e.g. scaling_cur_freq) are available
>> to ordinary users. This seems to make no sense. This patch changes
>> it.
> 
> There is nothing wrong with the code and so this is more of a design
> change..
> 
> Probably Rafael can help us here as cpuinfo_cur_freq will read stuff
> directly from hardware instead of using cached value in software.

I think so, too. I also tried to find the reason for the privilege via
git log, but the code was already there before the Linux kernel was
migrated to the git repository.

> 
> --
> viresh
> 


-- 
Best regards
Tianyu Lan


Re: [PATCHv3] usb: chipidea: add support for USB OTG controller on TI-NSPIRE

2013-11-24 Thread Daniel Tang
Hi,

On 25/11/2013, at 4:32 PM, Peter Chen  wrote:

> 
>> 
>> From: Daniel Tang 
>> 
>> The USB controllers in TI-NSPIRE calculators are based off either
>> Freescale's USB OTG controller or the USB controller found in the
>> IMX233, both of which are Chipidea compatible.
>> 
>> This patch adds a device tree binding for the controller.
>> 
>> Signed-off-by: Daniel Tang 
>> ---
>> 
>> Changelog v3:
>> * Removed redundant module aliases
>> 
>> Changelog v2:
>> * Rename ci13xxx to ci_hdrc
>> * Fixed alignment issues
>> 
>> .../devicetree/bindings/usb/ci-hdrc-nspire.txt | 17 +
>> drivers/usb/chipidea/Makefile  |  1 +
>> drivers/usb/chipidea/ci_hdrc_nspire.c  | 72
>> ++
>> 3 files changed, 90 insertions(+)
>> create mode 100644 Documentation/devicetree/bindings/usb/ci-hdrc-
>> nspire.txt
>> create mode 100644 drivers/usb/chipidea/ci_hdrc_nspire.c
>> 
>> diff --git a/Documentation/devicetree/bindings/usb/ci-hdrc-nspire.txt
>> b/Documentation/devicetree/bindings/usb/ci-hdrc-nspire.txt
>> new file mode 100644
>> index 000..5ba8e90
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/usb/ci-hdrc-nspire.txt
>> @@ -0,0 +1,17 @@
>> +* TI-Nspire USB OTG Controller
>> +
>> +Required properties:
>> +- compatible: Should be "zevio,nspire-usb"
>> +- reg: Should contain registers location and length
>> +- interrupts: Should contain controller interrupt
>> +
>> +Recommended properties:
>> +- vbus-supply: regulator for vbus
>> +
>> +Examples:
>> +usb0: usb@B000 {
>> +reg = <0xB000 0x1000>;
>> +compatible = "zevio,nspire-usb";
>> +interrupts = <8>;
>> +vbus-supply = <_reg>;
>> +};
>> diff --git a/drivers/usb/chipidea/Makefile
>> b/drivers/usb/chipidea/Makefile
>> index a99d980..245ea4d 100644
>> --- a/drivers/usb/chipidea/Makefile
>> +++ b/drivers/usb/chipidea/Makefile
>> @@ -10,6 +10,7 @@ ci_hdrc-$(CONFIG_USB_CHIPIDEA_DEBUG)   += debug.o
>> # Glue/Bridge layers go here
>> 
>> obj-$(CONFIG_USB_CHIPIDEA)   += ci_hdrc_msm.o
>> +obj-$(CONFIG_USB_CHIPIDEA)  += ci_hdrc_nspire.o
>> 
>> # PCI doesn't provide stubs, need to check
>> ifneq ($(CONFIG_PCI),)
>> diff --git a/drivers/usb/chipidea/ci_hdrc_nspire.c
>> b/drivers/usb/chipidea/ci_hdrc_nspire.c
>> new file mode 100644
>> index 000..517ce41
>> --- /dev/null
>> +++ b/drivers/usb/chipidea/ci_hdrc_nspire.c
>> @@ -0,0 +1,72 @@
>> +/*
>> + *  Copyright (C) 2013 Daniel Tang 
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2, as
>> + * published by the Free Software Foundation.
>> + *
>> + * Based off drivers/usb/chipidea/ci_hdrc_msm.c
>> + *
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +#include "ci.h"
>> +
>> +static struct ci_hdrc_platform_data ci_hdrc_nspire_platdata = {
>> +.name   = "ci_hdrc_nspire",
>> +.flags  = CI_HDRC_REGS_SHARED,
>> +.capoffset  = DEF_CAPOFFSET,
>> +};
>> +
>> +static int ci_hdrc_nspire_probe(struct platform_device *pdev)
>> +{
>> +struct platform_device *ci_pdev;
>> +
>> +dev_dbg(&pdev->dev, "ci_hdrc_nspire_probe\n");
>> +
>> +ci_pdev = ci_hdrc_add_device(&pdev->dev,
>> +pdev->resource, pdev->num_resources,
>> +&ci_hdrc_nspire_platdata);
>> +
>> +if (IS_ERR(ci_pdev)) {
>> +dev_err(&pdev->dev, "ci_hdrc_add_device failed!\n");
>> +return PTR_ERR(ci_pdev);
>> +}
>> +
>> +platform_set_drvdata(pdev, ci_pdev);
>> +
>> +return 0;
>> +}
>> +
>> +static int ci_hdrc_nspire_remove(struct platform_device *pdev)
>> +{
>> +struct platform_device *ci_pdev = platform_get_drvdata(pdev);
>> +
>> +ci_hdrc_remove_device(ci_pdev);
>> +
>> +return 0;
>> +}
>> +
>> +static const struct of_device_id ci_hdrc_nspire_dt_ids[] = {
>> +{ .compatible = "zevio,nspire-usb", },
>> +{ /* sentinel */ }
>> +};
>> +
>> +static struct platform_driver ci_hdrc_nspire_driver = {
>> +.probe = ci_hdrc_nspire_probe,
>> +.remove = ci_hdrc_nspire_remove,
>> +.driver = {
>> +.name = "nspire_usb",
>> +.owner = THIS_MODULE,
>> +.of_match_table = ci_hdrc_nspire_dt_ids,
>> +},
>> +};
>> +
>> +MODULE_DEVICE_TABLE(of, ci_hdrc_nspire_dt_ids);
>> +module_platform_driver(ci_hdrc_nspire_driver);
>> +
>> +MODULE_LICENSE("GPL v2");
>> --
> 
> You can decide to add module alias or not.

It wasn't really required.

> 
> Acked-by: Peter Chen 
> for driver part.
> 
> I haven't seen your dts patch.

If you mean the dts files for the platform, I'm still working on getting the 
regulator working so I'll probably send it in when it's all done.

> 
> Peter
> 
> 

Cheers,
Daniel Tang

RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)

2013-11-24 Thread Bharat Bhushan


> -Original Message-
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Friday, November 22, 2013 2:31 AM
> To: Wood Scott-B07421
> Cc: Bhushan Bharat-R65777; linux-...@vger.kernel.org; ag...@suse.de; Yoder
> Stuart-B08248; io...@lists.linux-foundation.org; bhelg...@google.com; 
> linuxppc-
> d...@lists.ozlabs.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
> 
> On Thu, 2013-11-21 at 14:47 -0600, Scott Wood wrote:
> > On Thu, 2013-11-21 at 13:43 -0700, Alex Williamson wrote:
> > > On Thu, 2013-11-21 at 11:20 +, Bharat Bhushan wrote:
> > > >
> > > > > -Original Message-
> > > > > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > > > > Sent: Thursday, November 21, 2013 12:17 AM
> > > > > To: Bhushan Bharat-R65777
> > > > > Cc: j...@8bytes.org; bhelg...@google.com; ag...@suse.de; Wood
> > > > > Scott-B07421; Yoder Stuart-B08248;
> > > > > io...@lists.linux-foundation.org; linux- p...@vger.kernel.org;
> > > > > linuxppc-...@lists.ozlabs.org; linux- ker...@vger.kernel.org;
> > > > > Bhushan Bharat-R65777
> > > > > Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale
> > > > > IOMMU (PAMU)
> > > > >
> > > > > Is VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT per aperture (ie. each
> > > > > vfio user has $COUNT regions at their disposal exclusively)?
> > > >
> > > > The number of MSI banks is system wide and not per aperture, but we
> > > > will be setting windows for the banks in the device aperture.
> > > > So say we are directly assigning 2 PCI devices (each in a different
> > > > iommu group, so 2 apertures in the iommu) to a VM.
> > > > Now qemu can make only one call to know how many MSI banks there are,
> > > > but it must set sub-windows for all banks for both PCI devices in
> > > > their respective apertures.
> > >
> > > I'm still confused.  What I want to make sure of is that the banks
> > > are independent per aperture.  For instance, if we have two separate
> > > userspace processes operating independently and they both chose to
> > > use msi bank zero for their device, that's bank zero within each
> > > aperture and doesn't interfere.  Or another way to ask is can a
> > > malicious user interfere with other users by using the wrong bank.
> > > Thanks,
> >
> > They can interfere.

I want to be sure: how can they interfere?

> > With this hardware, the only way to prevent that
> > is to make sure that a bank is not shared by multiple protection contexts.
> > For some of our users, though, I believe preventing this is less
> > important than the performance benefit.

So should we let this patch series in without protection?

> 
> I think we need some sort of ownership model around the msi banks then.
> Otherwise there's nothing preventing another userspace from attempting an MSI
> based attack on other users, or perhaps even on the host.  VFIO can't allow
> that.  Thanks,

We have very few MSI banks (3 on most chips), so we cannot assign one to
each userspace. What we can do is ensure that the host and userspace do not
share an MSI bank, while userspace processes will share an MSI bank among
themselves.


Thanks
-Bharat

> 
> Alex
> 


RE: [PATCHv3] usb: chipidea: add support for USB OTG controller on TI-NSPIRE

2013-11-24 Thread Peter Chen
 
> 
> From: Daniel Tang 
> 
> The USB controllers in TI-NSPIRE calculators are based off either
> Freescale's USB OTG controller or the USB controller found in the
> IMX233, both of which are Chipidea compatible.
> 
> This patch adds a device tree binding for the controller.
> 
> Signed-off-by: Daniel Tang 
> ---
> 
> Changelog v3:
>  * Removed redundant module aliases
> 
> Changelog v2:
>  * Rename ci13xxx to ci_hdrc
>  * Fixed alignment issues
> 
> .../devicetree/bindings/usb/ci-hdrc-nspire.txt | 17 +
>  drivers/usb/chipidea/Makefile  |  1 +
>  drivers/usb/chipidea/ci_hdrc_nspire.c  | 72
> ++
>  3 files changed, 90 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/usb/ci-hdrc-
> nspire.txt
>  create mode 100644 drivers/usb/chipidea/ci_hdrc_nspire.c
> 
> diff --git a/Documentation/devicetree/bindings/usb/ci-hdrc-nspire.txt
> b/Documentation/devicetree/bindings/usb/ci-hdrc-nspire.txt
> new file mode 100644
> index 000..5ba8e90
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/usb/ci-hdrc-nspire.txt
> @@ -0,0 +1,17 @@
> +* TI-Nspire USB OTG Controller
> +
> +Required properties:
> +- compatible: Should be "zevio,nspire-usb"
> +- reg: Should contain registers location and length
> +- interrupts: Should contain controller interrupt
> +
> +Recommended properties:
> +- vbus-supply: regulator for vbus
> +
> +Examples:
> + usb0: usb@B000 {
> + reg = <0xB000 0x1000>;
> + compatible = "zevio,nspire-usb";
> + interrupts = <8>;
> + vbus-supply = <_reg>;
> + };
> diff --git a/drivers/usb/chipidea/Makefile
> b/drivers/usb/chipidea/Makefile
> index a99d980..245ea4d 100644
> --- a/drivers/usb/chipidea/Makefile
> +++ b/drivers/usb/chipidea/Makefile
> @@ -10,6 +10,7 @@ ci_hdrc-$(CONFIG_USB_CHIPIDEA_DEBUG)+= debug.o
>  # Glue/Bridge layers go here
> 
>  obj-$(CONFIG_USB_CHIPIDEA)   += ci_hdrc_msm.o
> +obj-$(CONFIG_USB_CHIPIDEA)   += ci_hdrc_nspire.o
> 
>  # PCI doesn't provide stubs, need to check
>  ifneq ($(CONFIG_PCI),)
> diff --git a/drivers/usb/chipidea/ci_hdrc_nspire.c
> b/drivers/usb/chipidea/ci_hdrc_nspire.c
> new file mode 100644
> index 000..517ce41
> --- /dev/null
> +++ b/drivers/usb/chipidea/ci_hdrc_nspire.c
> @@ -0,0 +1,72 @@
> +/*
> + *   Copyright (C) 2013 Daniel Tang 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2, as
> + * published by the Free Software Foundation.
> + *
> + * Based off drivers/usb/chipidea/ci_hdrc_msm.c
> + *
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "ci.h"
> +
> +static struct ci_hdrc_platform_data ci_hdrc_nspire_platdata = {
> + .name   = "ci_hdrc_nspire",
> + .flags  = CI_HDRC_REGS_SHARED,
> + .capoffset  = DEF_CAPOFFSET,
> +};
> +
> +static int ci_hdrc_nspire_probe(struct platform_device *pdev)
> +{
> + struct platform_device *ci_pdev;
> +
> + dev_dbg(&pdev->dev, "ci_hdrc_nspire_probe\n");
> +
> + ci_pdev = ci_hdrc_add_device(&pdev->dev,
> + pdev->resource, pdev->num_resources,
> + &ci_hdrc_nspire_platdata);
> +
> + if (IS_ERR(ci_pdev)) {
> + dev_err(&pdev->dev, "ci_hdrc_add_device failed!\n");
> + return PTR_ERR(ci_pdev);
> + }
> +
> + platform_set_drvdata(pdev, ci_pdev);
> +
> + return 0;
> +}
> +
> +static int ci_hdrc_nspire_remove(struct platform_device *pdev)
> +{
> + struct platform_device *ci_pdev = platform_get_drvdata(pdev);
> +
> + ci_hdrc_remove_device(ci_pdev);
> +
> + return 0;
> +}
> +
> +static const struct of_device_id ci_hdrc_nspire_dt_ids[] = {
> + { .compatible = "zevio,nspire-usb", },
> + { /* sentinel */ }
> +};
> +
> +static struct platform_driver ci_hdrc_nspire_driver = {
> + .probe = ci_hdrc_nspire_probe,
> + .remove = ci_hdrc_nspire_remove,
> + .driver = {
> + .name = "nspire_usb",
> + .owner = THIS_MODULE,
> + .of_match_table = ci_hdrc_nspire_dt_ids,
> + },
> +};
> +
> +MODULE_DEVICE_TABLE(of, ci_hdrc_nspire_dt_ids);
> +module_platform_driver(ci_hdrc_nspire_driver);
> +
> +MODULE_LICENSE("GPL v2");
> --

You can decide to add module alias or not.

Acked-by: Peter Chen 
for driver part.

I haven't seen your dts patch.

Peter





Re: [PATCH 4/4] lpc_ich: Add Device IDs for Intel Wildcat Point-LP PCH

2013-11-24 Thread Alexander Beregalov
On 4 November 2013 21:31, James Ralston  wrote:
> This patch adds the TCO Watchdog Device IDs for the Intel Wildcat Point-LP 
> PCH.
>
> Signed-off-by: James Ralston 
> ---
>  drivers/mfd/lpc_ich.c | 13 +
>  1 file changed, 13 insertions(+)
>

> +   [LPC_WPT_LP] = {
> +   .name = "Lynx Point_LP",
> +   .iTCO_version = 2,
> +   },

Hi
Shouldn't it be "Wildcat Point_LP"?


Re: [patch 5/9 v3] efi: export more efi table variable to sysfs

2013-11-24 Thread Dave Young
On 11/23/13 at 02:15pm, Borislav Petkov wrote:
> On Fri, Nov 22, 2013 at 10:48:50AM +0800, Dave Young wrote:
> > >   efi.config_table = (unsigned long)efi.systab->tables;
> > >   efi.fw_vendor= (unsigned long)efi.systab->fw_vendor;
> > >   efi.runtime  = (unsigned long)efi.systab->runtime;
> > 
> > Hmm, the UEFI spec mentions them like below, so I used that order:
> 
> I'm sure by now you know you should not really trust the UEFI spec, or
> any other spec for that matter :)
> 
> > Several fields of the EFI System Table must be converted from
> > physical pointers to virtual pointers using the ConvertPointer()
> > service. These fields include FirmwareVendor, RuntimeServices,
> > and ConfigurationTable.
> > 
> > But since you like the reverse I can change it in the next version.
> 
> The reverse was simply a suggestion. The vertical alignment was more
> what I aimed at because it makes this chunk much more readable IMO.
> 

Got your point about alignment, will update.

--
Thanks
Dave


Compile error in i2c-bcm-kona.c

2013-11-24 Thread Guenter Roeck

Upstream HEAD, arm allmodconfig:

drivers/i2c/busses/i2c-bcm-kona.c:894:1: error: '__mod_of_device_table' aliased to undefined symbol 'kona_i2c_of_match'
make[3]: *** [drivers/i2c/busses/i2c-bcm-kona.o] Error 1
make[2]: *** [drivers/i2c/busses] Error 2
make[1]: *** [drivers/i2c] Error 2
make[1]: *** Waiting for unfinished jobs

Source:

static const struct of_device_id bcm_kona_i2c_of_match[] = {
{.compatible = "brcm,kona-i2c",},
{},
};
MODULE_DEVICE_TABLE(of, kona_i2c_of_match);

Makes me wonder how this was tested :-(.

Guenter
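
For reference, a minimal userspace sketch of why the build breaks. The
error text suggests that MODULE_DEVICE_TABLE() declares an alias of the
table it is given, so the second argument has to name a table that is
really defined. DEMO_DEVICE_TABLE, demo_mod_of_device_table and the
reduced struct of_device_id below are hypothetical stand-ins, not the
kernel's definitions, and the sketch relies on the GCC alias attribute
(ELF targets).

```c
#include <string.h>

/* Stand-in for the kernel's struct of_device_id (only one field kept). */
struct of_device_id {
	char compatible[128];
};

#define DEMO_STR_1(x) #x
#define DEMO_STR(x) DEMO_STR_1(x)

/*
 * Hypothetical stand-in for MODULE_DEVICE_TABLE(): declare an alias of
 * the table called `name`. The compiler rejects the alias when `name`
 * is not defined in this translation unit.
 */
#define DEMO_DEVICE_TABLE(type, name)					\
	extern const __typeof__(name) demo_mod_##type##_device_table	\
		__attribute__((unused, alias(DEMO_STR(name))))

static const struct of_device_id bcm_kona_i2c_of_match[] = {
	{ .compatible = "brcm,kona-i2c" },
	{ },
};

/* Passing kona_i2c_of_match here instead reproduces the error above. */
DEMO_DEVICE_TABLE(of, bcm_kona_i2c_of_match);

/* Returns 1 when the alias resolves to the table defined above. */
static int demo_alias_matches(void)
{
	return strcmp(demo_mod_of_device_table[0].compatible,
		      bcm_kona_i2c_of_match[0].compatible) == 0;
}
```

With the table name spelled as it is defined, the alias resolves and the
translation unit builds.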


[PATCH v8 1/4] zsmalloc: add Kconfig for enabling page table method

2013-11-24 Thread Minchan Kim
Zsmalloc has two methods, 1) copy-based and 2) pte-based, to
access objects that span two pages.
See [1] for the history of why both approaches are supported.

But hard-coding the selection of architectures that want the
pte-based method was a bad choice, because there are lots of
SoCs in an architecture and they can have different cache sizes,
CPU speeds and so on. It would be better to expose it to the user
as a selectable Kconfig option, as Andrew Morton suggested.

[1] https://lkml.org/lkml/2012/7/11/58

Acked-by: Nitin Gupta 
Reviewed-by: Konrad Rzeszutek Wilk 
Signed-off-by: Minchan Kim 
---
 drivers/staging/zsmalloc/Kconfig |   13 +
 drivers/staging/zsmalloc/zsmalloc-main.c |   19 ---
 2 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/drivers/staging/zsmalloc/Kconfig b/drivers/staging/zsmalloc/Kconfig
index 0ae13cd0908e..9d1f2a24ad62 100644
--- a/drivers/staging/zsmalloc/Kconfig
+++ b/drivers/staging/zsmalloc/Kconfig
@@ -9,3 +9,16 @@ config ZSMALLOC
  non-standard allocator interface where a handle, not a pointer, is
  returned by an alloc().  This handle must be mapped in order to
  access the allocated space.
+
+config PGTABLE_MAPPING
+   bool "Use page table mapping to access object in zsmalloc"
+   depends on ZSMALLOC
+   help
+ By default, zsmalloc uses a copy-based object mapping method to
+ access allocations that span two pages. However, if a particular
+ architecture (ex, ARM) performs VM mapping faster than copying,
+ then you should select this. This causes zsmalloc to use page table
+ mapping rather than copying for object mapping.
+
+ You can check speed with zsmalloc benchmark[1].
+ [1] https://github.com/spartacus06/zsmalloc
diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
index 1a67537dbc56..f57258fa0c9d 100644
--- a/drivers/staging/zsmalloc/zsmalloc-main.c
+++ b/drivers/staging/zsmalloc/zsmalloc-main.c
@@ -218,19 +218,8 @@ struct zs_pool {
 #define CLASS_IDX_MASK ((1 << CLASS_IDX_BITS) - 1)
 #define FULLNESS_MASK  ((1 << FULLNESS_BITS) - 1)
 
-/*
- * By default, zsmalloc uses a copy-based object mapping method to access
- * allocations that span two pages. However, if a particular architecture
- * performs VM mapping faster than copying, then it should be added here
- * so that USE_PGTABLE_MAPPING is defined. This causes zsmalloc to use
- * page table mapping rather than copying for object mapping.
- */
-#if defined(CONFIG_ARM) && !defined(MODULE)
-#define USE_PGTABLE_MAPPING
-#endif
-
 struct mapping_area {
-#ifdef USE_PGTABLE_MAPPING
+#ifdef CONFIG_PGTABLE_MAPPING
struct vm_struct *vm; /* vm area for mapping object that span pages */
 #else
char *vm_buf; /* copy buffer for objects that span pages */
@@ -622,7 +611,7 @@ static struct page *find_get_zspage(struct size_class *class)
return page;
 }
 
-#ifdef USE_PGTABLE_MAPPING
+#ifdef CONFIG_PGTABLE_MAPPING
 static inline int __zs_cpu_up(struct mapping_area *area)
 {
/*
@@ -660,7 +649,7 @@ static inline void __zs_unmap_object(struct mapping_area *area,
unmap_kernel_range(addr, PAGE_SIZE * 2);
 }
 
-#else /* USE_PGTABLE_MAPPING */
+#else /* CONFIG_PGTABLE_MAPPING */
 
 static inline int __zs_cpu_up(struct mapping_area *area)
 {
@@ -738,7 +727,7 @@ out:
pagefault_enable();
 }
 
-#endif /* USE_PGTABLE_MAPPING */
+#endif /* CONFIG_PGTABLE_MAPPING */
 
 static int zs_cpu_notifier(struct notifier_block *nb, unsigned long action,
void *pcpu)
-- 
1.7.9.5
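
To make the two methods concrete, here is a minimal userspace sketch of
the copy-based one, assuming an object that starts near the end of one
page and spills into the next. All demo_ names and DEMO_PAGE_SIZE are
hypothetical and not taken from the patch. With CONFIG_PGTABLE_MAPPING
set, the two pages would instead be mapped back to back, so no copy is
needed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096u	/* assumed page size for the sketch */

/*
 * Copy-based "map": gather an object that starts at offset `off` in
 * page0 and spills into page1 into one contiguous buffer.
 */
static void demo_map_copy(char *buf, const char *page0, const char *page1,
			  size_t off, size_t size)
{
	size_t first = DEMO_PAGE_SIZE - off;	/* bytes held by page0 */

	assert(off < DEMO_PAGE_SIZE);
	assert(size > first && size <= DEMO_PAGE_SIZE);
	memcpy(buf, page0 + off, first);
	memcpy(buf + first, page1, size - first);
}

/* Copy-based "unmap": scatter the buffer back into the two pages. */
static void demo_unmap_copy(const char *buf, char *page0, char *page1,
			    size_t off, size_t size)
{
	size_t first = DEMO_PAGE_SIZE - off;

	assert(off < DEMO_PAGE_SIZE);
	assert(size > first && size <= DEMO_PAGE_SIZE);
	memcpy(page0 + off, buf, first);
	memcpy(page1, buf + first, size - first);
}

/* Round trip through both helpers. Returns 1 when the data survives. */
static int demo_roundtrip(void)
{
	static char page0[DEMO_PAGE_SIZE], page1[DEMO_PAGE_SIZE];
	char in[64], out[64];
	size_t off = DEMO_PAGE_SIZE - 24;	/* 24 bytes in page0, 40 in page1 */

	memset(in, 0x5a, sizeof(in));
	demo_unmap_copy(in, page0, page1, off, sizeof(in));
	demo_map_copy(out, page0, page1, off, sizeof(out));
	return memcmp(in, out, sizeof(in)) == 0 &&
	       page1[39] == 0x5a && page1[40] == 0;
}
```

Every map and unmap pays for two memcpy() calls here, which is the cost
the Kconfig help text asks users to weigh against VM mapping speed.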



[PATCH v8 2/4] zsmalloc: add more comment

2013-11-24 Thread Minchan Kim
From: Nitin Gupta 

This patch adds lots of comments, which will help others
review and enhance the code.

Signed-off-by: Seth Jennings 
Signed-off-by: Nitin Gupta 
Signed-off-by: Minchan Kim 
---
 drivers/staging/zsmalloc/zsmalloc-main.c |   66 +-
 drivers/staging/zsmalloc/zsmalloc.h  |9 +++-
 2 files changed, 64 insertions(+), 11 deletions(-)

diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
index f57258fa0c9d..52ebddd7fe8c 100644
--- a/drivers/staging/zsmalloc/zsmalloc-main.c
+++ b/drivers/staging/zsmalloc/zsmalloc-main.c
@@ -10,16 +10,14 @@
  * Released under the terms of GNU General Public License Version 2.0
  */
 
-
 /*
- * This allocator is designed for use with zcache and zram. Thus, the
- * allocator is supposed to work well under low memory conditions. In
- * particular, it never attempts higher order page allocation which is
- * very likely to fail under memory pressure. On the other hand, if we
- * just use single (0-order) pages, it would suffer from very high
- * fragmentation -- any object of size PAGE_SIZE/2 or larger would occupy
- * an entire page. This was one of the major issues with its predecessor
- * (xvmalloc).
+ * This allocator is designed for use with zram. Thus, the allocator is
+ * supposed to work well under low memory conditions. In particular, it
+ * never attempts higher order page allocation which is very likely to
+ * fail under memory pressure. On the other hand, if we just use single
+ * (0-order) pages, it would suffer from very high fragmentation --
+ * any object of size PAGE_SIZE/2 or larger would occupy an entire page.
+ * This was one of the major issues with its predecessor (xvmalloc).
  *
  * To overcome these issues, zsmalloc allocates a bunch of 0-order pages
  * and links them together using various 'struct page' fields. These linked
@@ -27,6 +25,21 @@
  * page boundaries. The code refers to these linked pages as a single entity
  * called zspage.
  *
+ * For simplicity, zsmalloc can only allocate objects of size up to PAGE_SIZE
+ * since this satisfies the requirements of all its current users (in the
+ * worst case, page is incompressible and is thus stored "as-is" i.e. in
+ * uncompressed form). For allocation requests larger than this size, failure
+ * is returned (see zs_malloc).
+ *
+ * Additionally, zs_malloc() does not return a dereferenceable pointer.
+ * Instead, it returns an opaque handle (unsigned long) which encodes actual
+ * location of the allocated object. The reason for this indirection is that
+ * zsmalloc does not keep zspages permanently mapped since that would cause
+ * issues on 32-bit systems where the VA region for kernel space mappings
+ * is very small. So, before using the allocated memory, the object has to
+ * be mapped using zs_map_object() to get a usable pointer and subsequently
+ * unmapped using zs_unmap_object().
+ *
  * Following is how we use various fields and flags of underlying
  * struct page(s) to form a zspage.
  *
@@ -98,7 +111,7 @@
 
 /*
  * Object location (<PFN>, <obj_idx>) is encoded as
- * as single (void *) handle value.
+ * as single (unsigned long) handle value.
  *
  * Note that object index <obj_idx> is relative to system
  * page <PFN> it is stored in, so for each sub-page belonging
@@ -264,6 +277,13 @@ static void set_zspage_mapping(struct page *page, unsigned int class_idx,
page->mapping = (struct address_space *)m;
 }
 
+/*
+ * zsmalloc divides the pool into various size classes where each
+ * class maintains a list of zspages where each zspage is divided
+ * into equal sized chunks. Each allocation falls into one of these
+ * classes depending on its size. This function returns index of the
+ * size class which has chunk size big enough to hold the given size.
+ */
 static int get_size_class_index(int size)
 {
int idx = 0;
@@ -275,6 +295,13 @@ static int get_size_class_index(int size)
return idx;
 }
 
+/*
+ * For each size class, zspages are divided into different groups
+ * depending on how "full" they are. This was done so that we could
+ * easily find empty or nearly empty zspages when we try to shrink
+ * the pool (not yet implemented). This function returns fullness
+ * status of the given page.
+ */
 static enum fullness_group get_fullness_group(struct page *page)
 {
int inuse, max_objects;
@@ -296,6 +323,12 @@ static enum fullness_group get_fullness_group(struct page *page)
return fg;
 }
 
+/*
+ * Each size class maintains various freelists and zspages are assigned
+ * to one of these freelists based on the number of live objects they
+ * have. This function inserts the given zspage into the freelist
+ * identified by <class, fullness_group>.
+ */
 static void insert_zspage(struct page *page, struct size_class *class,
enum fullness_group fullness)
 {
@@ -313,6 +346,10 @@ static void insert_zspage(struct page *page, struct size_class *class,
*head = page;
 }
 
+/*
+ * This function 
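
The opaque handle described in the comments above can be sketched in
userspace as follows. This is a simplified illustration, assuming the
handle is just the page frame number shifted left with the object index
in the low bits. DEMO_OBJ_INDEX_BITS and the demo_ helpers are
hypothetical and the bit width is not taken from the patch.

```c
#include <assert.h>
#include <limits.h>

#define DEMO_OBJ_INDEX_BITS 12	/* assumed width, not taken from the patch */
#define DEMO_OBJ_INDEX_MASK ((1UL << DEMO_OBJ_INDEX_BITS) - 1)

/* Pack (pfn, obj_idx) into one opaque unsigned long handle. */
static unsigned long demo_encode_handle(unsigned long pfn,
					unsigned long obj_idx)
{
	assert(obj_idx <= DEMO_OBJ_INDEX_MASK);
	assert(pfn <= (ULONG_MAX >> DEMO_OBJ_INDEX_BITS));
	return (pfn << DEMO_OBJ_INDEX_BITS) | obj_idx;
}

/* Recover the page frame number from a handle. */
static unsigned long demo_handle_to_pfn(unsigned long handle)
{
	return handle >> DEMO_OBJ_INDEX_BITS;
}

/* Recover the object index, relative to its page, from a handle. */
static unsigned long demo_handle_to_obj_idx(unsigned long handle)
{
	return handle & DEMO_OBJ_INDEX_MASK;
}
```

Because the handle is not an address, a caller cannot dereference it and
has to go through a map step first, which is the indirection the new
comment explains.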

[PATCH v8 4/4] zram: promote zram from staging

2013-11-24 Thread Minchan Kim
Zram has lived in staging for a LONG LONG time and has been
fixed/improved by many contributors, so the code is clean and stable now.
Of course, there are lots of products using zram in real practice.

The major TV companies have used zram as swap for two years,
and recently our production team released an Android smartphone with zram
used as swap, too. Android KitKat has recently started to use zram
for small-memory smartphones. There was also a report that Google released
ChromeOS with zram, and CyanogenMod has used zram for a
long time. I have heard that some distros use a zram block device
for tmpfs. In addition, I have seen many reports from many other people.
For example, Lubuntu has started to use it.

The benefit of zram is very clear. In my experience, one of the benefits
was removing jitter in video applications under background memory pressure.
That is partly the effect of efficient memory usage through compression,
but the bigger issue is whether the system has swap at all. Recent mobile
platforms use Java, so there are many anonymous pages. But embedded systems
are normally reluctant to use eMMC or an SD card as swap because of
wear-leveling and latency issues. If we do not use swap, we can't reclaim
anonymous pages and, in the end, we could encounter OOM kills. :(

Even when we have real storage as swap, it is a problem, too, because
it sometimes ends up making the system very unresponsive due to slow
swap storage performance.

Quote from Luigi at Google:
"
Since Chrome OS was mentioned: the main reason why we don't use swap
to a disk (rotating or SSD) is because it doesn't degrade gracefully
and leads to a bad interactive experience.  Generally we prefer to
manage RAM at a higher level, by transparently killing and restarting
processes.  But we noticed that zram is fast enough to be competitive
with the latter, and it lets us make more efficient use of the
available RAM.
"
and his announcement: http://www.spinics.net/lists/linux-mm/msg57717.html

Another use case is to use zram as a block device. Zram is a block device,
so anyone can format it and mount it, and some people
on the internet have started to use zram as /var/tmp.
http://forums.gentoo.org/viewtopic-t-838198-start-0.html

Let's promote zram and enhance/maintain it instead of removing.

Reviewed-by: Konrad Rzeszutek Wilk 
Acked-by: Nitin Gupta 
Acked-by: Pekka Enberg 
Signed-off-by: Minchan Kim 
---
 drivers/block/Kconfig   |2 +
 drivers/block/Makefile  |2 +
 drivers/block/zram/Kconfig  |   25 +
 drivers/block/zram/Makefile |3 +
 drivers/block/zram/zram.txt |   77 +++
 drivers/block/zram/zram_drv.c   |  981 ++
 drivers/staging/Kconfig |2 -
 drivers/staging/Makefile|1 -
 drivers/staging/zram/Kconfig|   25 -
 drivers/staging/zram/Makefile   |3 -
 drivers/staging/zram/zram.txt   |   77 ---
 drivers/staging/zram/zram_drv.c |  982 ---
 drivers/staging/zram/zram_drv.h |  124 -
 include/linux/zram_drv.h|  124 +
 14 files changed, 1214 insertions(+), 1214 deletions(-)
 create mode 100644 drivers/block/zram/Kconfig
 create mode 100644 drivers/block/zram/Makefile
 create mode 100644 drivers/block/zram/zram.txt
 create mode 100644 drivers/block/zram/zram_drv.c
 delete mode 100644 drivers/staging/zram/Kconfig
 delete mode 100644 drivers/staging/zram/Makefile
 delete mode 100644 drivers/staging/zram/zram.txt
 delete mode 100644 drivers/staging/zram/zram_drv.c
 delete mode 100644 drivers/staging/zram/zram_drv.h
 create mode 100644 include/linux/zram_drv.h

diff --git a/drivers/block/Kconfig b/drivers/block/Kconfig
index 86b9f37d102e..28b6bf2fe886 100644
--- a/drivers/block/Kconfig
+++ b/drivers/block/Kconfig
@@ -108,6 +108,8 @@ source "drivers/block/paride/Kconfig"
 
 source "drivers/block/mtip32xx/Kconfig"
 
+source "drivers/block/zram/Kconfig"
+
 config BLK_CPQ_DA
tristate "Compaq SMART2 support"
depends on PCI && VIRT_TO_BUS && 0
diff --git a/drivers/block/Makefile b/drivers/block/Makefile
index 8cc98cd0d4a8..1beffa2f3a5d 100644
--- a/drivers/block/Makefile
+++ b/drivers/block/Makefile
@@ -44,6 +44,8 @@ obj-$(CONFIG_BLK_DEV_PCIESSD_MTIP32XX)+= mtip32xx/
 obj-$(CONFIG_BLK_DEV_RSXX) += rsxx/
 obj-$(CONFIG_BLK_DEV_NULL_BLK) += null_blk.o
 
+obj-$(CONFIG_ZRAM) += zram/
+
 nvme-y := nvme-core.o nvme-scsi.o
 skd-y  := skd_main.o
 swim_mod-y := swim.o swim_asm.o
diff --git a/drivers/block/zram/Kconfig b/drivers/block/zram/Kconfig
new file mode 100644
index ..983314c41349
--- /dev/null
+++ b/drivers/block/zram/Kconfig
@@ -0,0 +1,25 @@
+config ZRAM
+   tristate "Compressed RAM block device support"
+   depends on BLOCK && SYSFS && ZSMALLOC
+   select LZO_COMPRESS
+   select LZO_DECOMPRESS
+   default n
+   help
+ Creates virtual block devices called /dev/zramX (X = 0, 1, ...).
+ Pages written to these disks are 

[PATCH v8 0/4] zram/zsmalloc promotion

2013-11-24 Thread Minchan Kim
Zram is a simple pseudo block device which keeps data
compressed in memory.[1]

It has been used in many embedded systems for several years.
One significant use case is as an in-memory swap device.
NAND, which is very popular on most embedded devices,
is weak under frequent writes without good wear-leveling,
and slow I/O hurts the system's responsiveness, so zram is a really
good choice for using memory efficiently.

In a previous attempt, there was some argument[2] that zram has
a goal similar to zswap's, so zram's functionality should be merged
into zswap by adding a pseudo block device to zswap, but I and
some other people (at least Hugh and Rik) believe it's not a good idea
[2][3][4], and zswap might move to a writethrough model[5], which makes
the difference between zram and zswap clear.

Zram itself is simple, well designed and a good abstraction, so it has
a clear market (e.g. Android, TV, ChromeOS, some Linux distros) which
is never niche. :)

Another use case for the zram block device is as follows.
The admin can use it like tmpfs, which could help small-memory systems.
Tmpfs is never a good solution for a swapless embedded system.

Patch 1 adds a new Kconfig option so that zram can use the page table
method instead of copying.

Patch 2 adds more comments to zsmalloc.

Patch 3 moves zsmalloc under mm.

Patch 4 moves zram from drivers/staging to drivers/block, finally.

[1] http://en.wikipedia.org/wiki/Zram
[2] https://lkml.org/lkml/2013/8/21/54
[3] https://lkml.org/lkml/2013/11/13/570
[4] https://lkml.org/lkml/2013/11/7/318
[5] http://www.spinics.net/lists/linux-mm/msg65499.html

 * From v7
  * Remove unnecessary zswap VS zram comparison in cover letter.
  * Add Reviewed-by/Acked-by I forgot.
  * Remove exporting unmap_kernel_range patch. I will do if promotion is done.
  * Move zsmalloc under mm - Hugh
  
Minchan Kim (3):
  zsmalloc: add Kconfig for enabling page table method
  zsmalloc: move it under mm
  zram: promote zram from staging

Nitin Gupta (1):
  zsmalloc: add more comment

 drivers/block/Kconfig|2 +
 drivers/block/Makefile   |2 +
 drivers/block/zram/Kconfig   |   25 +
 drivers/block/zram/Makefile  |3 +
 drivers/block/zram/zram.txt  |   77 +++
 drivers/block/zram/zram_drv.c|  981 ++
 drivers/staging/Kconfig  |4 -
 drivers/staging/Makefile |2 -
 drivers/staging/zram/Kconfig |   25 -
 drivers/staging/zram/Makefile|3 -
 drivers/staging/zram/zram.txt|   77 ---
 drivers/staging/zram/zram_drv.c  |  982 --
 drivers/staging/zram/zram_drv.h  |  125 
 drivers/staging/zsmalloc/Kconfig |   11 -
 drivers/staging/zsmalloc/Makefile|3 -
 drivers/staging/zsmalloc/zsmalloc-main.c | 1063 -
 drivers/staging/zsmalloc/zsmalloc.h  |   43 --
 include/linux/zram_drv.h |  124 
 include/linux/zsmalloc.h |   50 ++
 mm/Kconfig   |   25 +
 mm/Makefile  |1 +
 mm/zsmalloc.c| 1097 ++
 22 files changed, 2387 insertions(+), 2338 deletions(-)
 create mode 100644 drivers/block/zram/Kconfig
 create mode 100644 drivers/block/zram/Makefile
 create mode 100644 drivers/block/zram/zram.txt
 create mode 100644 drivers/block/zram/zram_drv.c
 delete mode 100644 drivers/staging/zram/Kconfig
 delete mode 100644 drivers/staging/zram/Makefile
 delete mode 100644 drivers/staging/zram/zram.txt
 delete mode 100644 drivers/staging/zram/zram_drv.c
 delete mode 100644 drivers/staging/zram/zram_drv.h
 delete mode 100644 drivers/staging/zsmalloc/Kconfig
 delete mode 100644 drivers/staging/zsmalloc/Makefile
 delete mode 100644 drivers/staging/zsmalloc/zsmalloc-main.c
 delete mode 100644 drivers/staging/zsmalloc/zsmalloc.h
 create mode 100644 include/linux/zram_drv.h
 create mode 100644 include/linux/zsmalloc.h
 create mode 100644 mm/zsmalloc.c

-- 
1.7.9.5



Re: [PATCH] PCI: Move device_del() from pci_stop_dev() to pci_destroy_dev()

2013-11-24 Thread Yinghai Lu
On Sun, Nov 24, 2013 at 8:54 PM, Yinghai Lu  wrote:
> On Sat, Nov 23, 2013 at 4:17 PM, Rafael J. Wysocki  wrote:
>> From: Rafael J. Wysocki 
>>
>> After commit bcdde7e221a8 (sysfs: make __sysfs_remove_dir() recursive)
>> I'm seeing traces analogous to the one below in Thunderbolt testing:
>>
>> WARNING: CPU: 3 PID: 76 at 
>> /scratch/rafael/work/linux-pm/fs/sysfs/group.c:214 
>> sysfs_remove_group+0x59/0xe0()
>>  sysfs group 81c6c500 not found for kobject ':08'
>>  Modules linked in: fuse hidp af_packet xt_tcpudp xt_pkttype xt_LOG xt_limit 
>> ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw ipt_REJECT 
>> iptable_raw xt_CT iptable_filter ip6table_mangle nf_conntrack_netbios_ns 
>> nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables 
>> xt_conntrack nf_conntrack rfcomm ip6table_filter bnep ip6_tables x_tables 
>> arc4 ath9k mac80211 x86_pkg_temp_thermal intel_powerclamp coretemp 
>> crct10dif_pclmul crc32_pclmul iTCO_wdt crc32c_intel iTCO_vendor_support 
>> ghash_clmulni_intel aesni_intel ablk_helper acer_wmi sparse_keymap 
>> ath9k_common ath9k_hw cryptd lrw gf128mul ath3k glue_helper aes_x86_64 btusb 
>> microcode ath pcspkr joydev uvcvideo serio_raw videobuf2_core i2c_i801 
>> videodev snd_hda_codec_hdmi cfg80211 videobuf2_vmalloc tg3 videobuf2_memops 
>> sg ptp pps_core lpc_ich mfd_core snd_hda_codec_realtek bluetooth 
>> hid_logitech_dj rfkill snd_hda_intel snd_hda_codec shpchp battery ac wmi 
>> acpi_cpufreq edd snd_usb_audio snd_pcm snd_page_alloc snd_hwdep snd_usbmidi_lib
>> snd_rawmidi snd_seq snd_timer snd_seq_device snd soundcore autofs4 xhci_hcd 
>> processor scsi_dh_hp_sw scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_dh
>>  CPU: 3 PID: 76 Comm: kworker/u16:7 Not tainted 3.13.0-rc1+ #76
>>  Hardware name: Acer Aspire S5-391/Venus, BIOS V1.02 05/29/2012
>>  Workqueue: kacpi_hotplug acpi_hotplug_work_fn
>>   0009 8801644b9ac8 816b23bf 0007
>>   8801644b9b18 8801644b9b08 81046607 88016925b800
>>    81c6c500 88016924f928 88016924f800
>>  Call Trace:
>>   [] dump_stack+0x4e/0x71
>>   [] warn_slowpath_common+0x87/0xb0
>>   [] warn_slowpath_fmt+0x41/0x50
>>   [] ? sysfs_get_dirent_ns+0x6f/0x80
>>   [] sysfs_remove_group+0x59/0xe0
>>   [] dpm_sysfs_remove+0x3b/0x50
>>   [] device_del+0x58/0x1c0
>>   [] device_unregister+0x48/0x60
>>   [] pci_remove_bus+0x6e/0x80
>>   [] pci_remove_bus_device+0x38/0x110
>>   [] pci_remove_bus_device+0x4d/0x110
>>   [] pci_stop_and_remove_bus_device+0x19/0x20
>>   [] disable_slot+0x20/0xe0
>>   [] acpiphp_check_bridge+0xa8/0xd0
>>   [] hotplug_event+0x17d/0x220
>>   [] hotplug_event_work+0x30/0x70
>>   [] acpi_hotplug_work_fn+0x18/0x24
>>   [] process_one_work+0x261/0x450
>>   [] worker_thread+0x21e/0x370
>>   [] ? rescuer_thread+0x300/0x300
>>   [] kthread+0xd2/0xe0
>>   [] ? flush_kthread_worker+0x70/0x70
>>   [] ret_from_fork+0x7c/0xb0
>>   [] ? flush_kthread_worker+0x70/0x70
>>
>> (Mika Westerberg sees them too in his tests).
>>
>> Some investigation documented in kernel bug #65281 led me to the
>> conclusion that the source of the problem is the device_del() in
>> pci_stop_dev() as it now causes the sysfs directory of the device
>> to be removed recursively along with all of its subdirectories.
>> That includes the sysfs directory of the device's subordinate
>> bus (dev->subordinate) and its "power" group.
>>
>> Consequently, when pci_remove_bus() is called for dev->subordinate
>> in pci_remove_bus_device(), it calls device_unregister(&bus->dev),
>> but at this point the sysfs directory of bus->dev doesn't exist any
>> more and its "power" group doesn't exist either.  Thus, when
>> dpm_sysfs_remove() called from device_del() tries to remove that
>> group, it triggers the above warning.
>>
>> That indicates a logical mistake in the design of
>> pci_stop_and_remove_bus_device(), which causes bus device objects
>> to be left behind their parents (bridge device objects) and can be
>> fixed by moving the device_del() from pci_stop_dev() into
>> pci_destroy_dev(), so pci_remove_bus() can be called for the
>> device's subordinate bus before the device itself is unregistered
>> from the hierarchy.  Still, the driver, if any, should be detached
>> from the device in pci_stop_dev(), so use device_release_driver()
>> directly from there.
>>
>> References: https://bugzilla.kernel.org/show_bug.cgi?id=65281#c6
>> Reported-by: Mika Westerberg 
>> Signed-off-by: Rafael J. Wysocki 
>> ---
>>  drivers/pci/remove.c |4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> Index: linux-pm/drivers/pci/remove.c
>> ===
>> --- linux-pm.orig/drivers/pci/remove.c
>> +++ linux-pm/drivers/pci/remove.c
>> @@ -24,7 +24,7 @@ static void pci_stop_dev(struct pci_dev
>> if (dev->is_added) {
>> pci_proc_detach_device(dev);
>> pci_remove_sysfs_dev_files(dev);
>> - 

Re: [PATCH] PCI: Move device_del() from pci_stop_dev() to pci_destroy_dev()

2013-11-24 Thread Yinghai Lu
On Sat, Nov 23, 2013 at 4:17 PM, Rafael J. Wysocki  wrote:
> From: Rafael J. Wysocki 
>
> After commit bcdde7e221a8 (sysfs: make __sysfs_remove_dir() recursive)
> I'm seeing traces analogous to the one below in Thunderbolt testing:
>
> WARNING: CPU: 3 PID: 76 at /scratch/rafael/work/linux-pm/fs/sysfs/group.c:214 
> sysfs_remove_group+0x59/0xe0()
>  sysfs group 81c6c500 not found for kobject ':08'
>  Modules linked in: fuse hidp af_packet xt_tcpudp xt_pkttype xt_LOG xt_limit 
> ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw ipt_REJECT 
> iptable_raw xt_CT iptable_filter ip6table_mangle nf_conntrack_netbios_ns 
> nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables 
> xt_conntrack nf_conntrack rfcomm ip6table_filter bnep ip6_tables x_tables 
> arc4 ath9k mac80211 x86_pkg_temp_thermal intel_powerclamp coretemp 
> crct10dif_pclmul crc32_pclmul iTCO_wdt crc32c_intel iTCO_vendor_support 
> ghash_clmulni_intel aesni_intel ablk_helper acer_wmi sparse_keymap 
> ath9k_common ath9k_hw cryptd lrw gf128mul ath3k glue_helper aes_x86_64 btusb 
> microcode ath pcspkr joydev uvcvideo serio_raw videobuf2_core i2c_i801 
> videodev snd_hda_codec_hdmi cfg80211 videobuf2_vmalloc tg3 videobuf2_memops 
> sg ptp pps_core lpc_ich mfd_core snd_hda_codec_realtek bluetooth 
> hid_logitech_dj rfkill snd_hda_intel snd_hda_codec shpchp battery ac wmi 
> acpi_cpufreq edd snd_usb_audio snd_pcm snd_page_alloc snd_hwdep snd_usbmidi_lib
> snd_rawmidi snd_seq snd_timer snd_seq_device snd soundcore autofs4 xhci_hcd 
> processor scsi_dh_hp_sw scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_dh
>  CPU: 3 PID: 76 Comm: kworker/u16:7 Not tainted 3.13.0-rc1+ #76
>  Hardware name: Acer Aspire S5-391/Venus, BIOS V1.02 05/29/2012
>  Workqueue: kacpi_hotplug acpi_hotplug_work_fn
>   0009 8801644b9ac8 816b23bf 0007
>   8801644b9b18 8801644b9b08 81046607 88016925b800
>    81c6c500 88016924f928 88016924f800
>  Call Trace:
>   [] dump_stack+0x4e/0x71
>   [] warn_slowpath_common+0x87/0xb0
>   [] warn_slowpath_fmt+0x41/0x50
>   [] ? sysfs_get_dirent_ns+0x6f/0x80
>   [] sysfs_remove_group+0x59/0xe0
>   [] dpm_sysfs_remove+0x3b/0x50
>   [] device_del+0x58/0x1c0
>   [] device_unregister+0x48/0x60
>   [] pci_remove_bus+0x6e/0x80
>   [] pci_remove_bus_device+0x38/0x110
>   [] pci_remove_bus_device+0x4d/0x110
>   [] pci_stop_and_remove_bus_device+0x19/0x20
>   [] disable_slot+0x20/0xe0
>   [] acpiphp_check_bridge+0xa8/0xd0
>   [] hotplug_event+0x17d/0x220
>   [] hotplug_event_work+0x30/0x70
>   [] acpi_hotplug_work_fn+0x18/0x24
>   [] process_one_work+0x261/0x450
>   [] worker_thread+0x21e/0x370
>   [] ? rescuer_thread+0x300/0x300
>   [] kthread+0xd2/0xe0
>   [] ? flush_kthread_worker+0x70/0x70
>   [] ret_from_fork+0x7c/0xb0
>   [] ? flush_kthread_worker+0x70/0x70
>
> (Mika Westerberg sees them too in his tests).
>
> Some investigation documented in kernel bug #65281 led me to the
> conclusion that the source of the problem is the device_del() in
> pci_stop_dev() as it now causes the sysfs directory of the device
> to be removed recursively along with all of its subdirectories.
> That includes the sysfs directory of the device's subordinate
> bus (dev->subordinate) and its "power" group.
>
> Consequently, when pci_remove_bus() is called for dev->subordinate
> in pci_remove_bus_device(), it calls device_unregister(&bus->dev),
> but at this point the sysfs directory of bus->dev doesn't exist any
> more and its "power" group doesn't exist either.  Thus, when
> dpm_sysfs_remove() called from device_del() tries to remove that
> group, it triggers the above warning.
>
> That indicates a logical mistake in the design of
> pci_stop_and_remove_bus_device(), which causes bus device objects
> to be left behind their parents (bridge device objects) and can be
> fixed by moving the device_del() from pci_stop_dev() into
> pci_destroy_dev(), so pci_remove_bus() can be called for the
> device's subordinate bus before the device itself is unregistered
> from the hierarchy.  Still, the driver, if any, should be detached
> from the device in pci_stop_dev(), so use device_release_driver()
> directly from there.
>
> References: https://bugzilla.kernel.org/show_bug.cgi?id=65281#c6
> Reported-by: Mika Westerberg 
> Signed-off-by: Rafael J. Wysocki 
> ---
>  drivers/pci/remove.c |4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> Index: linux-pm/drivers/pci/remove.c
> ===
> --- linux-pm.orig/drivers/pci/remove.c
> +++ linux-pm/drivers/pci/remove.c
> @@ -24,7 +24,7 @@ static void pci_stop_dev(struct pci_dev
> if (dev->is_added) {
> pci_proc_detach_device(dev);
> pci_remove_sysfs_dev_files(dev);
> -   device_del(&dev->dev);
> +   device_release_driver(&dev->dev);
> dev->is_added = 0;
> }
>
> @@ -34,6 +34,8 @@ static 

Re: [PATCH] Cpufreq: Change sysfs interface cpuinfo_cur_freq access privilege

2013-11-24 Thread Viresh Kumar
On 25 November 2013 08:23, Lan Tianyu  wrote:
> Currently, cpuinfo_cur_freq is only accessible to the root user, while
> other cpufreq sysfs interfaces (e.g. scaling_cur_freq) are available
> to ordinary users. This seems to make no sense. This patch changes
> it.

There is nothing wrong with the code and so this is more of a design
change..

Probably Rafael can help us here, as cpuinfo_cur_freq reads the value
directly from hardware instead of using the cached value in software.

--
viresh


[PATCH 2/2] arm: nspire: fix nspire_restart to take enum reboot_mode instead of a char so the correct function pointer is passed to DT_MACHINE_START

2013-11-24 Thread dt . tangr
From: Daniel Tang 


Signed-off-by: Daniel Tang 
---
 arch/arm/mach-nspire/nspire.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/mach-nspire/nspire.c b/arch/arm/mach-nspire/nspire.c
index 4b2ed2e..3d24ebf 100644
--- a/arch/arm/mach-nspire/nspire.c
+++ b/arch/arm/mach-nspire/nspire.c
@@ -63,7 +63,7 @@ static void __init nspire_init(void)
nspire_auxdata, NULL);
 }

-static void nspire_restart(char mode, const char *cmd)
+static void nspire_restart(enum reboot_mode mode, const char *cmd)
 {
void __iomem *base = ioremap(NSPIRE_MISC_PHYS_BASE, SZ_4K);
if (!base)
--
1.8.1.3



[PATCH 1/2] clocksource: zevio-timer: fix incorrect function definition so the correct function pointer is passed to CLOCKSOURCE_OF_DECLARE

2013-11-24 Thread dt . tangr
From: Daniel Tang 


Signed-off-by: Daniel Tang 
---
 drivers/clocksource/zevio-timer.c | 19 +--
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/drivers/clocksource/zevio-timer.c b/drivers/clocksource/zevio-timer.c
index ca81809..a1bd107 100644
--- a/drivers/clocksource/zevio-timer.c
+++ b/drivers/clocksource/zevio-timer.c
@@ -122,28 +122,27 @@ static irqreturn_t zevio_timer_interrupt(int irq, void *dev_id)
return IRQ_HANDLED;
 }

-static int __init zevio_timer_add(struct device_node *node)
+static void __init zevio_timer_add(struct device_node *node)
 {
struct zevio_timer *timer;
struct resource res;
-   int irqnr, ret;
+   int irqnr;

timer = kzalloc(sizeof(*timer), GFP_KERNEL);
if (!timer)
-   return -ENOMEM;
+   return;

timer->base = of_iomap(node, 0);
-   if (!timer->base) {
-   ret = -EINVAL;
+   if (!timer->base)
goto error_free;
-   }
+
timer->timer1 = timer->base + IO_TIMER1;
timer->timer2 = timer->base + IO_TIMER2;

timer->clk = of_clk_get(node, 0);
if (IS_ERR(timer->clk)) {
-   ret = PTR_ERR(timer->clk);
-   pr_err("Timer clock not found! (error %d)\n", ret);
+   pr_err("Timer clock not found! (error %d)\n",
+   (int)PTR_ERR(timer->clk));
goto error_unmap;
}

@@ -204,12 +203,12 @@ static int __init zevio_timer_add(struct device_node *node)

pr_info("Added %s as clocksource\n", timer->clocksource_name);

-   return 0;
+   return;
 error_unmap:
iounmap(timer->base);
 error_free:
kfree(timer);
-   return ret;
+   return;
 }

 CLOCKSOURCE_OF_DECLARE(zevio_timer, "lsi,zevio-timer", zevio_timer_add);
--
1.8.1.3


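For context on why the return type in the zevio-timer patch above matters: the
init function is stored in a table and later called through a function pointer,
so its signature has to match the pointer type the table uses. Below is a
minimal userspace sketch of that idea. Everything in it (struct node,
timer_init_fn, TIMER_DECLARE, run_timer_inits) is a hypothetical stand-in, not
the kernel's CLOCKSOURCE_OF_DECLARE machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct device_node. */
struct node {
	const char *compatible;
};

/* Every init hook is called through this type, so hooks must return void. */
typedef void (*timer_init_fn)(struct node *);

struct timer_decl {
	const char *compatible;
	timer_init_fn init;
};

static int timers_added;

/* Returns void: a failing hook reports the error itself and just returns. */
static void demo_timer_add(struct node *np)
{
	assert(np != NULL);
	timers_added++;
}

/*
 * Hypothetical analogue of a *_OF_DECLARE macro. The compiler diagnoses the
 * initializer if fn does not have the timer_init_fn type, e.g. if it
 * returned int.
 */
#define TIMER_DECLARE(compat, fn) { (compat), (fn) }

static const struct timer_decl timer_table[] = {
	TIMER_DECLARE("lsi,zevio-timer", demo_timer_add),
};

/* Call the init hook of every table entry matching np; return the count. */
static int run_timer_inits(struct node *np)
{
	int called = 0;
	size_t i;

	for (i = 0; i < sizeof(timer_table) / sizeof(timer_table[0]); i++) {
		if (strcmp(timer_table[i].compatible, np->compatible) == 0) {
			timer_table[i].init(np);
			called++;
		}
	}
	return called;
}
```

Calling run_timer_inits() with a node whose compatible string is
"lsi,zevio-timer" invokes demo_timer_add() once; any other string matches
nothing.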

Re: [PATCH V2 2/2] cpufreq: Change freq before suspending governors

2013-11-24 Thread Viresh Kumar
On 25 November 2013 03:02, Rafael J. Wysocki  wrote:
> Viresh, maybe make it possible for the cpufreq driver to provide
> suspend/resume callbacks to be executed by cpufreq_suspend() and
> cpufreq_resume() introduced by [1/2]?  Then Tegra could set the frequencies
> to what it wants from there before the governors are stopped.

Giving cpufreq drivers a chance to do whatever they want looks correct.
So maybe a prepare() or suspend_prepare() callback can be implemented
for them.

Though I would still go for a generic function in the core, which Samsung
and Tegra can simply reuse to set cores to specific frequencies.

--
viresh

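The ordering discussed in the message above (let the driver set the frequency
it wants, then stop the governor) can be sketched in userspace C as follows.
This is only an illustration under stated assumptions: the struct names, the
optional suspend hook and the 216000 kHz value are all hypothetical and do not
come from the cpufreq core or from any driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical driver ops: the suspend hook is optional. */
struct demo_driver {
	int (*suspend)(unsigned int *cur_khz);
};

struct demo_policy {
	unsigned int cur_khz;
	int governor_running;
};

/* Core suspend path: driver hook first, governor stop second. */
static int demo_cpufreq_suspend(const struct demo_driver *drv,
				struct demo_policy *p)
{
	assert(drv != NULL && p != NULL);

	if (drv->suspend) {
		int ret = drv->suspend(&p->cur_khz);

		if (ret)
			return ret;	/* governor is left running on failure */
	}
	p->governor_running = 0;
	return 0;
}

/* Example hook: pin the CPU to a fixed frequency (illustrative value). */
static int demo_drv_suspend(unsigned int *cur_khz)
{
	*cur_khz = 216000;
	return 0;
}
```

A driver without a hook simply gets its governor stopped at whatever frequency
it was running at.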

Re: [PATCH] update consumers of MSG_MORE to recognize MSG_SENDPAGE_NOTLAST

2013-11-24 Thread Hannes Frederic Sowa
On Sun, Nov 24, 2013 at 06:08:59PM -0800, Shawn Landden wrote:
> Commit 35f9c09fe (tcp: tcp_sendpages() should call tcp_push() once)
> added an internal flag MSG_SENDPAGE_NOTLAST, similar to
> MSG_MORE.
> 
> algif_hash and algif_skcipher used MSG_MORE from tcp_sendpages()
> and need to see the new flag as identical to MSG_MORE.
> 
> This fixes sendfile() on AF_ALG.

Don't we need a similar fix for udp_sendpage?

Greetings,

  Hannes



Re: [PATCH] cpufreq: suspend/resume governors with PM notifiers

2013-11-24 Thread Viresh Kumar
On 22 November 2013 14:41, viresh kumar  wrote:
> So, what about something like this ?
>
> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> index f48370d..523c0bc 100644
> --- a/drivers/base/cpu.c
> +++ b/drivers/base/cpu.c
> @@ -120,6 +120,45 @@ static DEVICE_ATTR(release, S_IWUSR, NULL, cpu_release_store);
>  #endif /* CONFIG_ARCH_CPU_PROBE_RELEASE */
>  #endif /* CONFIG_HOTPLUG_CPU */
>
> +int cpu_subsys_suspend_noirq(struct device *dev)
> +{
> +   struct bus_type *bus = dev->bus;
> +   struct subsys_interface *sif;
> +   int ret = 0;
> +
> +   list_for_each_entry(sif, &bus->p->interfaces, node) {
> +   if (sif->pm && sif->pm->suspend_noirq) {
> +   ret = sif->suspend_noirq(dev);
> +   if (ret)
> +   break;
> +   }
> +   }
> +
> +   return ret;
> +}
> +
> +int cpu_subsys_resume_noirq(struct device *dev)
> +{
> +   struct bus_type *bus = dev->bus;
> +   struct subsys_interface *sif;
> +   int ret = 0;
> +
> +   list_for_each_entry(sif, &bus->p->interfaces, node) {
> +   if (sif->pm && sif->pm->resume_noirq) {
> +   ret = sif->resume_noirq(dev);
> +   if (ret)
> +   break;
> +   }
> +   }
> +
> +   return ret;
> +}
> +
> +static const struct dev_pm_ops cpu_subsys_pm_ops = {
> +   .suspend_noirq = cpu_subsys_suspend_noirq,
> +   .resume_noirq = cpu_subsys_resume_noirq,
> +};
> +
>  struct bus_type cpu_subsys = {
> .name = "cpu",
> .dev_name = "cpu",
> @@ -128,6 +167,7 @@ struct bus_type cpu_subsys = {
> .online = cpu_subsys_online,
> .offline = cpu_subsys_offline,
>  #endif
> +   .pm = &cpu_subsys_pm_ops,
>  };
>  EXPORT_SYMBOL_GPL(cpu_subsys);
>
> diff --git a/include/linux/device.h b/include/linux/device.h
> index b025925..fa01273 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -298,11 +298,16 @@ struct device *driver_find_device(struct device_driver *drv,
>   * @node:   the list of functions registered at the subsystem
>   * @add_dev:device hookup to device function handler
>   * @remove_dev: device hookup to device function handler
> + * @pm: Power management operations of this interface.
>   *
>   * Simple interfaces attached to a subsystem. Multiple interfaces can
>   * attach to a subsystem and its devices. Unlike drivers, they do not
>   * exclusively claim or control devices. Interfaces usually represent
>   * a specific functionality of a subsystem/class of devices.
> + *
> + * PM callbacks are called from individual subsystems instead of PM core. And
> + * hence might not be available for all subsystems. Currently present for:
> + * cpu_subsys.
>   */
>  struct subsys_interface {
> const char *name;
> @@ -310,6 +315,7 @@ struct subsys_interface {
> struct list_head node;
> int (*add_dev)(struct device *dev, struct subsys_interface *sif);
> int (*remove_dev)(struct device *dev, struct subsys_interface *sif);
> +   const struct dev_pm_ops *pm;
>  };
>
>  int subsys_interface_register(struct subsys_interface *sif);

Any inputs?

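The dispatch pattern in the draft above (walk the registered interfaces, call
each one's noirq suspend callback, stop at the first error) can be modelled in
userspace C roughly as below. The types are hypothetical and an array stands in
for the kernel list. Note that the draft checks sif->pm->suspend_noirq but then
calls sif->suspend_noirq(dev); this sketch calls through pm, which is
presumably what was intended.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced version of struct dev_pm_ops. */
struct demo_pm_ops {
	int (*suspend_noirq)(int cpu);
};

/* Hypothetical reduced version of struct subsys_interface. */
struct demo_interface {
	const struct demo_pm_ops *pm;	/* may be NULL */
};

static int demo_calls;

static int demo_ok(int cpu)
{
	(void)cpu;
	demo_calls++;
	return 0;
}

static int demo_fail(int cpu)
{
	(void)cpu;
	demo_calls++;
	return -1;
}

static const struct demo_pm_ops demo_ok_ops = { demo_ok };
static const struct demo_pm_ops demo_fail_ops = { demo_fail };

/* Walk the interfaces in order and stop at the first error. */
static int demo_subsys_suspend_noirq(const struct demo_interface *ifs,
				     size_t n, int cpu)
{
	int ret = 0;
	size_t i;

	assert(ifs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		const struct demo_pm_ops *pm = ifs[i].pm;

		if (pm && pm->suspend_noirq) {
			ret = pm->suspend_noirq(cpu);	/* called via pm */
			if (ret)
				break;
		}
	}
	return ret;
}
```

Interfaces without PM operations are skipped, and callbacks registered after a
failing one are never reached.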

[PATCH V2] cpufreq: Make sure CPU is running on a freq from freq-table

2013-11-24 Thread Viresh Kumar
Sometimes boot loaders set the CPU frequency to a value outside the frequency
table known to the cpufreq core. In such cases the CPU might be unstable if it
has to run at that frequency for a long time, so it's better to set it to a
frequency which is specified in the freq-table. An out-of-table frequency also
makes cpufreq stats inconsistent, as cpufreq-stats would fail to register
because the current frequency of the CPU isn't found in the freq-table.

Because we don't want this change to affect the boot process badly, we go for
the next freq which is >= policy->cur ('cur' must be set by now, otherwise we
will end up setting freq to the lowest of the table as 'cur' is initialized to
zero).

In case the CPU is already running at one of the frequencies present in the
freq-table, this turns into a dummy call as __cpufreq_driver_target() would
return early.

Reported-by: Carlos Hernandez 
Reported-and-tested-by: Nishanth Menon 
Signed-off-by: Viresh Kumar 
---
V1->V2
- Set to (policy->cur - 1) instead of policy->cur.
- return early in case __cpufreq_driver_target() fails.

 drivers/cpufreq/cpufreq.c | 32 
 1 file changed, 32 insertions(+)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 02d534d..7be996c 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1038,6 +1038,38 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif,
}
}
 
+   /*
+* Sometimes boot loaders set CPU frequency to a value outside of
+* frequency table present with cpufreq core. In such cases CPU might be
+* unstable if it has to run on that frequency for long duration of time
+* and so it's better to set it to a frequency which is specified in
+* freq-table. This also makes cpufreq stats inconsistent as
+* cpufreq-stats would fail to register because current frequency of CPU
+* isn't found in freq-table.
+*
+* Because we don't want this change to affect boot process badly, we go
+* for the next freq which is >= policy->cur ('cur' must be set by now,
+* otherwise we will end up setting freq to lowest of the table as 'cur'
+* is initialized to zero).
+*
+* In case where CPU is already running on one of the frequencies
+* present in freq-table, this would turn into a dummy call as
+* __cpufreq_driver_target() would return early.
+*
+* We are passing target-freq as "policy->cur - 1" otherwise
+* __cpufreq_driver_target() would simply fail, as policy->cur will be
+* equal to target-freq.
+*/
+   if (has_target()) {
+   ret = __cpufreq_driver_target(policy, policy->cur - 1,
+   CPUFREQ_RELATION_L);
+   if (ret) {
+   pr_err("%s: Unable to set frequency from table: %d\n",
+   __func__, ret);
+   goto err_out_unregister;
+   }
+   }
+
/* related cpus should atleast have policy->cpus */
cpumask_or(policy->related_cpus, policy->related_cpus, policy->cpus);
 
-- 
1.7.12.rc2.18.g61b472e


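The frequency selection described in the patch above can be illustrated with a
small userspace model. It assumes that CPUFREQ_RELATION_L means "lowest table
frequency at or above the target" and that the highest entry is used when
nothing is high enough. demo_pick_relation_l() and the table values are
hypothetical, and the model ignores policy min/max limits.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the lowest table entry >= target_khz. If no entry is that high,
 * fall back to the highest entry. Entries must be non-zero; the table does
 * not need to be sorted.
 */
static unsigned int demo_pick_relation_l(const unsigned int *table, size_t n,
					 unsigned int target_khz)
{
	unsigned int best = 0, highest = 0;
	size_t i;

	assert(table != NULL && n > 0);

	for (i = 0; i < n; i++) {
		if (table[i] > highest)
			highest = table[i];
		if (table[i] >= target_khz && (best == 0 || table[i] < best))
			best = table[i];
	}
	return best ? best : highest;
}
```

With a table of 300000, 600000, 800000 and 1000000 kHz, a CPU left at 700000 by
the boot loader gets a target of 699999 and lands on 800000, while a CPU
already at 600000 gets a target of 599999 and stays at 600000.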

Re: [PATCH v6 1/6] gpio: davinci: use readl/writel instead of __raw_*

2013-11-24 Thread Prabhakar Lad
Hi Taras,

On Fri, Nov 22, 2013 at 3:38 PM, Taras Kondratiuk
 wrote:
> On 21 November 2013 20:15, Prabhakar Lad  wrote:
>> From: "Lad, Prabhakar" 
>>
>> This patch replaces the __raw_readl/writel with
>> readl and writel. Although the code runs on ARMv5
>> based SOCs, changing this will help copying the code
>> for other uses.
>
> This replacement has a functional impact: it adds memory barriers.
> Please note this in the description.
> Also please add a bit of explanation on why do you need to add barriers.
>
Agreed, this adds memory barriers; I'll add a note about it.

Thanks,
--Prabhakar Lad

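The functional difference raised in the review above (the ordered accessors
add memory barriers that the __raw_* ones lack) can be pictured with the
following userspace sketch. It is only a model: the demo_* names are
hypothetical, a C11 fence stands in for whatever barrier the architecture
actually uses, and the byte-order handling of the real accessors is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Plain volatile access with no ordering guarantee (models __raw_readl). */
static inline uint32_t demo_raw_readl(const volatile uint32_t *addr)
{
	assert(addr != 0);
	return *addr;
}

/* Plain volatile access with no ordering guarantee (models __raw_writel). */
static inline void demo_raw_writel(uint32_t v, volatile uint32_t *addr)
{
	assert(addr != 0);
	*addr = v;
}

/* Ordered read: the barrier comes after the access (models readl). */
static inline uint32_t demo_readl(const volatile uint32_t *addr)
{
	uint32_t v = demo_raw_readl(addr);

	atomic_thread_fence(memory_order_seq_cst);
	return v;
}

/* Ordered write: the barrier comes before the access (models writel). */
static inline void demo_writel(uint32_t v, volatile uint32_t *addr)
{
	atomic_thread_fence(memory_order_seq_cst);
	demo_raw_writel(v, addr);
}
```

Both pairs return the same values; the difference is only in how the access is
ordered against surrounding memory operations.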

Re: [PATCH 3/3] [trivial] mei: Fix typo in mei drivers

2013-11-24 Thread Randy Dunlap
On 11/23/13 07:36, Masanari Iida wrote:
> Correct spelling typo in comments within mei.
> 
> Signed-off-by: Masanari Iida 
> ---
>  drivers/misc/mei/amthif.c  |  4 ++--
>  drivers/misc/mei/debugfs.c |  2 +-
>  drivers/misc/mei/hbm.c | 12 ++--
>  drivers/misc/mei/mei_dev.h |  2 +-
>  4 files changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/misc/mei/debugfs.c b/drivers/misc/mei/debugfs.c
> index e3870f2..10b172a 100644
> --- a/drivers/misc/mei/debugfs.c
> +++ b/drivers/misc/mei/debugfs.c
> @@ -43,7 +43,7 @@ static ssize_t mei_dbgfs_read_meclients(struct file *fp, char __user *ubuf,
>  
>   mutex_lock(&dev->device_lock);
>  
> - /*  if the driver is not enabled the list won't b consitent */
> + /*  if the driver is not enabled the list won't b consistent */

be

>   if (dev->dev_state != MEI_DEV_ENABLED)
>   goto out;
>  
> diff --git a/drivers/misc/mei/hbm.c b/drivers/misc/mei/hbm.c
> index 9b3a0fb..96fed82 100644
> --- a/drivers/misc/mei/hbm.c
> +++ b/drivers/misc/mei/hbm.c
> @@ -90,7 +90,7 @@ void mei_hbm_cl_hdr(struct mei_cl *cl, u8 hbm_cmd, void *buf, size_t len)
>   * @file: private data of the file object.
>   * @disconn: disconnection request.
>   *
> - * returns true if addres are same
> + * returns true if address are same

  addresses

>   */
>  static inline
>  bool mei_hbm_cl_addr_equal(struct mei_cl *cl, void *buf)


-- 
~Randy


[PATCHv2 RESEND] irqchip: Add support for TI-NSPIRE irqchip

2013-11-24 Thread dt . tangr
From: Daniel Tang 

This patch adds support for the interrupt controllers found in some
TI-Nspire models.

FIQ support was taken out to simplify the driver
code and may be added in later. Since Linux on this platform doesn't
really use FIQs, this wasn't really that important in the first
place.

Changes from v1 to v2:
* Converted to use generic IRQ chips.
* Removed FIQ for now to simplify driver code.
* Based against tip/irq/core and uses IRQ domain support for generic
 chips.

Signed-off-by: Daniel Tang 
Acked-by: Grant Likely 
---
 .../interrupt-controller/lsi,zevio-intc.txt|  18 +++
 drivers/irqchip/Makefile   |   1 +
 drivers/irqchip/irq-zevio.c| 129 +
 3 files changed, 148 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/interrupt-controller/lsi,zevio-intc.txt
 create mode 100644 drivers/irqchip/irq-zevio.c

diff --git a/Documentation/devicetree/bindings/interrupt-controller/lsi,zevio-intc.txt b/Documentation/devicetree/bindings/interrupt-controller/lsi,zevio-intc.txt
new file mode 100644
index 000..aee38e7
--- /dev/null
+++ b/Documentation/devicetree/bindings/interrupt-controller/lsi,zevio-intc.txt
@@ -0,0 +1,18 @@
+TI-NSPIRE interrupt controller
+
+Required properties:
+- compatible: Compatible property value should be "lsi,zevio-intc".
+
+- reg: Physical base address of the controller and length of memory mapped
+   region.
+
+- interrupt-controller : Identifies the node as an interrupt controller
+
+Example:
+
+interrupt-controller {
+   compatible = "lsi,zevio-intc";
+   interrupt-controller;
+   reg = <0xDC00 0x1000>;
+   #interrupt-cells = <1>;
+};
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index cda4cb5..f313d14 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -15,4 +15,5 @@ obj-$(CONFIG_SIRF_IRQ)+= irq-sirfsoc.o
 obj-$(CONFIG_RENESAS_INTC_IRQPIN)  += irq-renesas-intc-irqpin.o
 obj-$(CONFIG_RENESAS_IRQC) += irq-renesas-irqc.o
 obj-$(CONFIG_VERSATILE_FPGA_IRQ)   += irq-versatile-fpga.o
+obj-$(CONFIG_ARCH_NSPIRE)  += irq-zevio.o
 obj-$(CONFIG_ARCH_VT8500)  += irq-vt8500.o
diff --git a/drivers/irqchip/irq-zevio.c b/drivers/irqchip/irq-zevio.c
new file mode 100644
index 000..92e6c7b
--- /dev/null
+++ b/drivers/irqchip/irq-zevio.c
@@ -0,0 +1,129 @@
+/*
+ *  linux/drivers/irqchip/irq-zevio.c
+ *
+ *  Copyright (C) 2013 Daniel Tang 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include "irqchip.h"
+
+#define IO_STATUS  0x000
+#define IO_RAW_STATUS  0x004
+#define IO_ENABLE  0x008
+#define IO_DISABLE 0x00C
+#define IO_CURRENT 0x020
+#define IO_RESET   0x028
+#define IO_MAX_PRIOTY  0x02C
+
+#define IO_IRQ_BASE0x000
+#define IO_FIQ_BASE0x100
+
+#define IO_INVERT_SEL  0x200
+#define IO_STICKY_SEL  0x204
+#define IO_PRIORITY_SEL0x300
+
+#define MAX_INTRS  32
+#define FIQ_START  MAX_INTRS
+
+static struct irq_domain *zevio_irq_domain;
+static void __iomem *zevio_irq_io;
+
+static void zevio_irq_ack(struct irq_data *irqd)
+{
+   struct irq_chip_generic *gc = irq_data_get_irq_chip_data(irqd);
+   struct irq_chip_regs *regs =
+   &container_of(irqd->chip, struct irq_chip_type, chip)->regs;
+
+   irq_gc_lock(gc);
+   readl(gc->reg_base + regs->ack);
+   irq_gc_unlock(gc);
+}
+
+static void init_base(void __iomem *base)
+{
+   /* Disable all interrupts */
+   writel(~0, base + IO_DISABLE);
+
+   /* Accept interrupts of all priorities */
+   writel(0xF, base + IO_MAX_PRIOTY);
+
+   /* Reset existing interrupts */
+   readl(base + IO_RESET);
+}
+
+asmlinkage void __exception_irq_entry zevio_handle_irq(struct pt_regs *regs)
+{
+   int irqnr;
+
+   while (readl(zevio_irq_io + IO_STATUS)) {
+   irqnr = readl(zevio_irq_io + IO_CURRENT);
+   irqnr = irq_find_mapping(zevio_irq_domain, irqnr);
+   handle_IRQ(irqnr, regs);
+   };
+}
+
+static int __init zevio_of_init(struct device_node *node,
+   struct device_node *parent)
+{
+   unsigned int clr = IRQ_NOREQUEST | IRQ_NOPROBE | IRQ_NOAUTOEN;
+   struct irq_chip_generic *gc;
+   int ret;
+
+   if (WARN_ON(zevio_irq_io || zevio_irq_domain))
+   return -EBUSY;
+
+   zevio_irq_io = of_iomap(node, 0);
+   BUG_ON(!zevio_irq_io);
+
+   /* Do not invert interrupt status bits */
+   writel(~0, zevio_irq_io + IO_INVERT_SEL);
+
+   /* Disable sticky interrupts */
+   writel(0, zevio_irq_io + IO_STICKY_SEL);
+
+   /* We don't use IRQ priorities. Set each IRQ to highest priority. */
+

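The dispatch loop in zevio_handle_irq() above (keep reading the status register
and handle the current interrupt until nothing is pending) can be modelled in
userspace C as below. The simulated controller is hypothetical: it picks the
lowest-numbered pending line, the handler clears the pending bit itself, and
the IRQ domain mapping step is left out. None of this is taken from the
hardware documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated controller: one pending bit per interrupt line. */
struct demo_intc {
	uint32_t pending;
};

/* Models reading the status register: non-zero while work remains. */
static uint32_t demo_read_status(const struct demo_intc *c)
{
	return c->pending;
}

/* Models reading the "current" register: lowest-numbered pending line. */
static int demo_read_current(const struct demo_intc *c)
{
	int i;

	assert(c->pending != 0);

	for (i = 0; i < 32; i++)
		if (c->pending & (UINT32_C(1) << i))
			return i;
	return -1;	/* not reached while the assertion holds */
}

static int demo_handled[32];

/* Models the per-interrupt handler; it clears the source in this model. */
static void demo_handle(struct demo_intc *c, int irq)
{
	demo_handled[irq]++;
	c->pending &= ~(UINT32_C(1) << irq);
}

/* Same shape as the while loop in the driver: drain until status is zero. */
static int demo_handle_all(struct demo_intc *c)
{
	int n = 0;

	while (demo_read_status(c)) {
		demo_handle(c, demo_read_current(c));
		n++;
	}
	return n;
}
```

With lines 3 and 17 pending, demo_handle_all() runs the handler twice and
leaves the pending mask empty.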
[PATCHv3] usb: chipidea: add support for USB OTG controller on TI-NSPIRE

2013-11-24 Thread dt . tangr
From: Daniel Tang 

The USB controller in TI-NSPIRE calculators is based on either Freescale's
USB OTG controller or the USB controller found in the IMX233, both of which
are Chipidea compatible.

This patch adds a device tree binding for the controller.

Signed-off-by: Daniel Tang 
---

Changelog v3:
 * Removed redundant module aliases

Changelog v2:
 * Rename ci13xxx to ci_hdrc
 * Fixed alignment issues

.../devicetree/bindings/usb/ci-hdrc-nspire.txt | 17 +
 drivers/usb/chipidea/Makefile  |  1 +
 drivers/usb/chipidea/ci_hdrc_nspire.c  | 72 ++
 3 files changed, 90 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/usb/ci-hdrc-nspire.txt
 create mode 100644 drivers/usb/chipidea/ci_hdrc_nspire.c

diff --git a/Documentation/devicetree/bindings/usb/ci-hdrc-nspire.txt b/Documentation/devicetree/bindings/usb/ci-hdrc-nspire.txt
new file mode 100644
index 000..5ba8e90
--- /dev/null
+++ b/Documentation/devicetree/bindings/usb/ci-hdrc-nspire.txt
@@ -0,0 +1,17 @@
+* TI-Nspire USB OTG Controller
+
+Required properties:
+- compatible: Should be "zevio,nspire-usb"
+- reg: Should contain registers location and length
+- interrupts: Should contain controller interrupt
+
+Recommended properties:
+- vbus-supply: regulator for vbus
+
+Examples:
+   usb0: usb@B000 {
+   reg = <0xB000 0x1000>;
+   compatible = "zevio,nspire-usb";
+   interrupts = <8>;
+   vbus-supply = <_reg>;
+   };
diff --git a/drivers/usb/chipidea/Makefile b/drivers/usb/chipidea/Makefile
index a99d980..245ea4d 100644
--- a/drivers/usb/chipidea/Makefile
+++ b/drivers/usb/chipidea/Makefile
@@ -10,6 +10,7 @@ ci_hdrc-$(CONFIG_USB_CHIPIDEA_DEBUG)  += debug.o
 # Glue/Bridge layers go here

 obj-$(CONFIG_USB_CHIPIDEA) += ci_hdrc_msm.o
+obj-$(CONFIG_USB_CHIPIDEA) += ci_hdrc_nspire.o

 # PCI doesn't provide stubs, need to check
 ifneq ($(CONFIG_PCI),)
diff --git a/drivers/usb/chipidea/ci_hdrc_nspire.c b/drivers/usb/chipidea/ci_hdrc_nspire.c
new file mode 100644
index 000..517ce41
--- /dev/null
+++ b/drivers/usb/chipidea/ci_hdrc_nspire.c
@@ -0,0 +1,72 @@
+/*
+ * Copyright (C) 2013 Daniel Tang 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ * Based off drivers/usb/chipidea/ci_hdrc_msm.c
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#include "ci.h"
+
+static struct ci_hdrc_platform_data ci_hdrc_nspire_platdata = {
+   .name   = "ci_hdrc_nspire",
+   .flags  = CI_HDRC_REGS_SHARED,
+   .capoffset  = DEF_CAPOFFSET,
+};
+
+static int ci_hdrc_nspire_probe(struct platform_device *pdev)
+{
+   struct platform_device *ci_pdev;
+
+   dev_dbg(&pdev->dev, "ci_hdrc_nspire_probe\n");
+
+   ci_pdev = ci_hdrc_add_device(&pdev->dev,
+   pdev->resource, pdev->num_resources,
+   &ci_hdrc_nspire_platdata);
+
+   if (IS_ERR(ci_pdev)) {
+   dev_err(&pdev->dev, "ci_hdrc_add_device failed!\n");
+   return PTR_ERR(ci_pdev);
+   }
+
+   platform_set_drvdata(pdev, ci_pdev);
+
+   return 0;
+}
+
+static int ci_hdrc_nspire_remove(struct platform_device *pdev)
+{
+   struct platform_device *ci_pdev = platform_get_drvdata(pdev);
+
+   ci_hdrc_remove_device(ci_pdev);
+
+   return 0;
+}
+
+static const struct of_device_id ci_hdrc_nspire_dt_ids[] = {
+   { .compatible = "zevio,nspire-usb", },
+   { /* sentinel */ }
+};
+
+static struct platform_driver ci_hdrc_nspire_driver = {
+   .probe = ci_hdrc_nspire_probe,
+   .remove = ci_hdrc_nspire_remove,
+   .driver = {
+   .name = "nspire_usb",
+   .owner = THIS_MODULE,
+   .of_match_table = ci_hdrc_nspire_dt_ids,
+   },
+};
+
+MODULE_DEVICE_TABLE(of, ci_hdrc_nspire_dt_ids);
+module_platform_driver(ci_hdrc_nspire_driver);
+
+MODULE_LICENSE("GPL v2");
--
1.8.1.3


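The match table in the driver above ends with an empty sentinel entry so that
the lookup code knows where to stop. A simplified userspace sketch of such a
lookup follows. struct demo_of_id and demo_match() are hypothetical, and a NULL
pointer marks the sentinel here, which is a simplification of the kernel's
struct of_device_id.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reduced match entry: only the compatible string. */
struct demo_of_id {
	const char *compatible;
};

static const struct demo_of_id demo_ids[] = {
	{ "zevio,nspire-usb" },
	{ NULL }	/* sentinel */
};

/* Walk the table up to the sentinel; return the matching entry or NULL. */
static const struct demo_of_id *demo_match(const struct demo_of_id *ids,
					   const char *compat)
{
	assert(ids != NULL && compat != NULL);

	for (; ids->compatible != NULL; ids++)
		if (strcmp(ids->compatible, compat) == 0)
			return ids;
	return NULL;
}
```

Without the sentinel the loop would run past the end of the array, which is why
the driver's table ends with an empty entry.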

Re: [PATCHv2] usb: chipidea: add support for USB OTG controller on TI-NSPIRE

2013-11-24 Thread Daniel Tang
Hi,

On 24/11/2013, at 9:12 PM, Peter Chen  wrote:

> On Sun, Nov 24, 2013 at 06:37:47PM +1100, dt.ta...@gmail.com wrote:
> 
>> +};
>> +
>> +MODULE_DEVICE_TABLE(of, ci_hdrc_nspire_dt_ids);
>> +module_platform_driver(ci_hdrc_nspire_driver);
>> +
>> +MODULE_ALIAS("platform:nspire_usb");
>> +MODULE_ALIAS("platform:ci_hdrc_nspire");
> 
> Just curious, why do you need two aliases?

Oops, I must've forgotten to remove this. I'll remove it in the next patch.

> 
> -- 
> 
> Best Regards,
> Peter Chen
> 

Cheers,
Daniel Tang


[PATCH] misc: eeprom_93xx46: remove unnecessary spi_set_drvdata()

2013-11-24 Thread Jingoo Han
The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, there is no need to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han 
---
 drivers/misc/eeprom/eeprom_93xx46.c |1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/misc/eeprom/eeprom_93xx46.c b/drivers/misc/eeprom/eeprom_93xx46.c
index 3a015ab..78e55b5 100644
--- a/drivers/misc/eeprom/eeprom_93xx46.c
+++ b/drivers/misc/eeprom/eeprom_93xx46.c
@@ -378,7 +378,6 @@ static int eeprom_93xx46_remove(struct spi_device *spi)
device_remove_file(&spi->dev, &dev_attr_erase);
 
sysfs_remove_bin_file(&spi->dev.kobj, &edev->bin);
-   spi_set_drvdata(spi, NULL);
kfree(edev);
return 0;
 }
-- 
1.7.10.4



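The reasoning in the patch above is that the core, not the driver, resets the
driver data pointer when probing fails or the device is unbound. A toy
userspace model of that division of labour is sketched below. All names are
hypothetical and the real driver core does considerably more.

```c
#include <assert.h>
#include <stddef.h>

struct demo_device {
	void *drvdata;
};

struct demo_driver {
	int (*probe)(struct demo_device *dev);
	int (*remove)(struct demo_device *dev);
};

/* Core side of binding: drvdata is cleared here when probe fails. */
static int demo_core_bind(const struct demo_driver *drv,
			  struct demo_device *dev)
{
	int ret;

	assert(drv != NULL && drv->probe != NULL && dev != NULL);

	ret = drv->probe(dev);
	if (ret)
		dev->drvdata = NULL;
	return ret;
}

/* Core side of unbinding: drvdata is cleared after remove returns. */
static void demo_core_unbind(const struct demo_driver *drv,
			     struct demo_device *dev)
{
	assert(drv != NULL && dev != NULL);

	if (drv->remove)
		drv->remove(dev);
	dev->drvdata = NULL;
}

static int demo_state;

/* A probe that sets drvdata and then fails, without cleaning up. */
static int demo_probe_fail(struct demo_device *dev)
{
	dev->drvdata = &demo_state;
	return -1;
}

static int demo_probe_ok(struct demo_device *dev)
{
	dev->drvdata = &demo_state;
	return 0;
}

/* The driver's remove does not touch drvdata at all. */
static int demo_remove(struct demo_device *dev)
{
	(void)dev;
	return 0;
}
```

In this model the pointer is NULL after a failed bind and after unbind even
though neither driver callback ever clears it.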

[PATCH] mfd: mc13xxx: remove unnecessary spi_set_drvdata()

2013-11-24 Thread Jingoo Han
The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, there is no need to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han 
---
 drivers/mfd/mc13xxx-spi.c |1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/mfd/mc13xxx-spi.c b/drivers/mfd/mc13xxx-spi.c
index 5f14ef6..cbcc86d 100644
--- a/drivers/mfd/mc13xxx-spi.c
+++ b/drivers/mfd/mc13xxx-spi.c
@@ -149,7 +149,6 @@ static int mc13xxx_spi_probe(struct spi_device *spi)
ret = PTR_ERR(mc13xxx->regmap);
dev_err(mc13xxx->dev, "Failed to initialize register map: %d\n",
ret);
-   spi_set_drvdata(spi, NULL);
return ret;
}
 
-- 
1.7.10.4




[PATCH] rtc: ds1305: remove unnecessary spi_set_drvdata()

2013-11-24 Thread Jingoo Han
The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, there is no need to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han 
---
 drivers/rtc/rtc-ds1305.c |1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/rtc/rtc-ds1305.c b/drivers/rtc/rtc-ds1305.c
index 80f3237..2dd586a 100644
--- a/drivers/rtc/rtc-ds1305.c
+++ b/drivers/rtc/rtc-ds1305.c
@@ -787,7 +787,6 @@ static int ds1305_remove(struct spi_device *spi)
cancel_work_sync(&ds1305->work);
}
 
-   spi_set_drvdata(spi, NULL);
return 0;
 }
 
-- 
1.7.10.4




[V2 PATCH] sctp: Restore 'resent' bit to avoid retransmitted chunks for RTT measurements

2013-11-24 Thread Xufeng Zhang
Currently retransmitted DATA chunks could also be used for
RTT measurements since there is no flag to identify whether
the transmitted DATA chunk is a new one or a retransmitted one.
This problem was introduced by commit ae19c5486 ("sctp: remove
'resent' bit from the chunk"), which inappropriately removed the
'resent' bit completely. Instead, we should set the resent bit
only for the retransmitted DATA chunks.

Signed-off-by: Xufeng Zhang 
---
v1->v2:
Removed initialization for resent bit.
Combined two if clauses.

 include/net/sctp/structs.h |1 +
 net/sctp/output.c  |3 ++-
 net/sctp/outqueue.c|3 +++
 3 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 2174d8d..ea0ca5f 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -629,6 +629,7 @@ struct sctp_chunk {
 #define SCTP_NEED_FRTX 0x1
 #define SCTP_DONT_FRTX 0x2
__u16   rtt_in_progress:1,  /* This chunk used for RTT calc? */
+   resent:1,   /* Has this chunk ever been resent. */
has_tsn:1,  /* Does this chunk have a TSN yet? */
has_ssn:1,  /* Does this chunk have a SSN yet? */
singleton:1,/* Only chunk in the packet? */
diff --git a/net/sctp/output.c b/net/sctp/output.c
index e650978..0e2644d 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -474,10 +474,11 @@ int sctp_packet_transmit(struct sctp_packet *packet)
 * for a given destination transport address.
 */
 
-   if (!tp->rto_pending) {
+   if (!chunk->resent && !tp->rto_pending) {
chunk->rtt_in_progress = 1;
tp->rto_pending = 1;
}
+
has_data = 1;
}
 
diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
index 94df758..70f4f56 100644
--- a/net/sctp/outqueue.c
+++ b/net/sctp/outqueue.c
@@ -446,6 +446,8 @@ void sctp_retransmit_mark(struct sctp_outq *q,
transport->rto_pending = 0;
}
 
+   chunk->resent = 1;
+
/* Move the chunk to the retransmit queue. The chunks
 * on the retransmit queue are always kept in order.
 */
@@ -1375,6 +1377,7 @@ static void sctp_check_transmitted(struct sctp_outq *q,
 * instance).
 */
if (!tchunk->tsn_gap_acked &&
+   !tchunk->resent &&
tchunk->rtt_in_progress) {
tchunk->rtt_in_progress = 0;
rtt = jiffies - tchunk->sent_at;
-- 
1.7.0.2


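The rule restored by the sctp patch above (never take an RTT sample from a
chunk that has been retransmitted) can be sketched in userspace C as follows.
The field names mirror the patch, but the demo_* types and functions are
hypothetical. Clearing rto_pending when a sample is taken is an assumption of
this model, not something shown in the diff.

```c
#include <assert.h>

struct demo_chunk {
	unsigned int rtt_in_progress:1,	/* chunk is timing an RTT sample */
		     resent:1;		/* chunk has been retransmitted */
	unsigned long sent_at;
};

struct demo_transport {
	unsigned int rto_pending:1;	/* a sample is already in flight */
};

/* Start an RTT measurement only on a first transmission. */
static void demo_transmit(struct demo_chunk *c, struct demo_transport *tp,
			  unsigned long now)
{
	assert(c != 0 && tp != 0);

	if (!c->resent && !tp->rto_pending) {
		c->rtt_in_progress = 1;
		tp->rto_pending = 1;
	}
	c->sent_at = now;
}

/* Queue a chunk for retransmission: abandon its sample and mark it. */
static void demo_retransmit_mark(struct demo_chunk *c,
				 struct demo_transport *tp)
{
	if (c->rtt_in_progress) {
		c->rtt_in_progress = 0;
		tp->rto_pending = 0;
	}
	c->resent = 1;
}

/* On ack: return 1 and store the sample only for never-resent chunks. */
static int demo_ack(struct demo_chunk *c, struct demo_transport *tp,
		    unsigned long now, unsigned long *rtt)
{
	if (!c->resent && c->rtt_in_progress) {
		c->rtt_in_progress = 0;
		tp->rto_pending = 0;	/* model assumption, see lead-in */
		*rtt = now - c->sent_at;
		return 1;
	}
	return 0;
}
```

A chunk sent once and acked yields a sample. A chunk that went through
demo_retransmit_mark() never does, however many times it is sent afterwards.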

ARM: nommu: DEBUG_LOCKS_WARN_ON(!depth)

2013-11-24 Thread Axel Lin
I'm testing on a nommu platform (arm7tdmi SoC),
using current Linus' tree + out-of-tree patches for this SoC.
I got the hang below while executing ls (busybox) after boot.

/ # ls
[   51.036191] [ cut here ]
[   51.042242] WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:3312 lock_set_class+0x5c8/0x660()
[   51.051426] DEBUG_LOCKS_WARN_ON(!depth)
[   51.055842] CPU: 0 PID: 1 Comm:  Not tainted 3.13.0-rc1-00100-g4b061f7-dirty #1917
[   51.065415] [] (unwind_backtrace+0x0/0xe0) from [] (show_stack+0x10/0x14)
[   51.075781] [] (show_stack+0x10/0x14) from [] (warn_slowpath_common+0x58/0x78)
[   51.086549] [] (warn_slowpath_common+0x58/0x78) from [] (warn_slowpath_fmt+0x2c/0x3c)
[   51.098162] [] (warn_slowpath_fmt+0x2c/0x3c) from [<00036d9c>] (lock_set_class+0x5c8/0x660)
[   50.934805] [<00036d9c>] (lock_set_class+0x5c8/0x660) from [<000367d4>] (lock_set_class+0x0/0x660)
[   50.945255] [<000367d4>] (lock_set_class+0x0/0x660) from [<>] (  (null))
[   50.953242] ---[ end trace 7d1e4eb80001 ]---

BTW, I also hit the hang below once a few days ago (just before the 3.13-rc1 release).

Below is my timers config:
#
# Timers subsystem
#
CONFIG_HZ_PERIODIC=y
# CONFIG_NO_HZ_IDLE is not set
# CONFIG_NO_HZ is not set
# CONFIG_HIGH_RES_TIMERS is not set

/ # ls /bin
[   81.272231] BUG: scheduling while atomic: ls/33/0x0037a001
[   81.284450] 2 locks held by ls/33:
[   81.292221]  #0: [   81.292221]  #0:  ( 
(>i_mutex_dir_key>i_mutex_dir_key){+.+.+.}){+.+.+.}[   81.304370] 
BUG: recent printk recursion!
[   81.304370] BUG: recent printk recursion!
, at: , at: [<0006c9c8>] lookup_slow+0x30/0xa0
[<0006c9c8>] lookup_slow+0x30/0xa0
[   81.323810]  #1: [   81.323810]  #1:  ( 
(>s_type->i_lock_key>s_type->i_lock_key#13#13){+.+...}){+.+...}, at: , 
at: [<000764cc>] d_instantia8
[<000764cc>] d_instantiate+0x28/0x48
[   81.345069] irq event stamp: 3753
[   81.352717] hardirqs last  enabled at (3753): [   81.352717] hardirqs last  
enabled at (3753): [<002943dc>] _raw_spin_unlock_irqrestore+0x3c/0x5c
[<002943dc>] _raw_spin_unlock_irqrestore+0x3c/0x5c
[   81.372183] hardirqs last disabled at (3752): [   81.372183] hardirqs last 
disabled at (3752): [<00294254>] _raw_spin_lock_irqsave+0x1c/0x68
[<00294254>] _raw_spin_lock_irqsave+0x1c/0x68
[   81.216282] softirqs last  enabled at (3712): [   81.216282] softirqs last  
enabled at (3712): [<000127ec>] __do_softirq+0x190/0x20c
[<000127ec>] __do_softirq+0x190/0x20c
[   81.233554] softirqs last disabled at (3705): [   81.233554] softirqs last 
disabled at (3705): [<00012c1c>] irq_exit+0x90/0xb8
[<00012c1c>] irq_exit+0x90/0xb8
[   81.250354] CPU: 0 PID: 0 Comm: ���z Tainted: GW 3.12.0-11171-g6adc047-dirty #1911
[   81.270570] [] (unwind_backtrace+0x0/0xe0) from [] (show_stack+0x10/0x14)
[   81.290277] [] (show_stack+0x10/0x14) from [<0028dd8c>] (__schedule_bug+0x5c/0x74)
[   81.309935] [<0028dd8c>] (__schedule_bug+0x5c/0x74) from [<00290c00>] (__schedule+0x58/0x38c)
[   81.329818] [<00290c00>] (__schedule+0x58/0x38c) from [<002908f4>] (do_nanosleep+0x78/0xd0)
[   81.349258] [<002908f4>] (do_nanosleep+0x78/0xd0) from [<00029418>] (hrtimer_nanosleep+0x88/0x10c)
[   81.369874] [<00029418>] (hrtimer_nanosleep+0x88/0x10c) from [<00025680>] (common_nsleep+0x0/0x20)

Thanks for any comments and advice.
Regards,
Axel




Re: [PATCH 0/4] ACPI / bind: Simplify child devices lookup

2013-11-24 Thread Aaron Lu
On 11/25/2013 08:09 AM, Rafael J. Wysocki wrote:
> Hi,
> 
> The following series of four patches (on top of current 
> linux-pm.git/bleeding-edge)
> rework child device lookup in drivers/acpi/glue.c and related things:
> 
> [1/4] ACPI / bind: Simplify child device lookup
> [2/4] PCI/ ACPI: Use acpi_find_child_device() for child device lookup
> [3/4] ACPI / bind: Redefine acpi_get_child()
> [4/4] ACPI / bind: Redefine acpi_preset_companion()
> 
> Thanks!
> 

Reviewed-by: Aaron Lu 
Tested-by: Aaron Lu  #for ATA binding


[PATCH] Cpufreq: Change sysfs interface cpuinfo_cur_freq access privilege

2013-11-24 Thread Lan Tianyu
Currently, cpuinfo_cur_freq is only accessible to the root user, while
other cpufreq sysfs interfaces (e.g. scaling_cur_freq) are available
to ordinary users. This seems to make no sense. This patch changes
it.

Signed-off-by: Lan Tianyu 
---
 drivers/cpufreq/cpufreq.c | 2 +-
 include/linux/cpufreq.h   | 4 
 2 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 02d534d..1926465 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -605,7 +605,7 @@ static ssize_t show_bios_limit(struct cpufreq_policy *policy, char *buf)
return sprintf(buf, "%u\n", policy->cpuinfo.max_freq);
 }
 
-cpufreq_freq_attr_ro_perm(cpuinfo_cur_freq, 0400);
+cpufreq_freq_attr_ro(cpuinfo_cur_freq);
 cpufreq_freq_attr_ro(cpuinfo_min_freq);
 cpufreq_freq_attr_ro(cpuinfo_max_freq);
 cpufreq_freq_attr_ro(cpuinfo_transition_latency);
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 5bd6ab9..b4bb677 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -166,10 +166,6 @@ struct freq_attr {
 static struct freq_attr _name =\
 __ATTR(_name, 0444, show_##_name, NULL)
 
-#define cpufreq_freq_attr_ro_perm(_name, _perm)\
-static struct freq_attr _name =\
-__ATTR(_name, _perm, show_##_name, NULL)
-
 #define cpufreq_freq_attr_rw(_name)\
 static struct freq_attr _name =\
 __ATTR(_name, 0644, show_##_name, store_##_name)
-- 
1.8.4.rc0.1.g8f6a3e5.dirty



RE: [PATCH 0/4] ACPICA: Stable material of ACPI executer fixes for linux-3.8.

2013-11-24 Thread Zheng, Lv
> From: Greg Kroah-Hartman [mailto:gre...@linuxfoundation.org]
> Sent: Sunday, November 24, 2013 11:22 AM
> 
> On Fri, Nov 01, 2013 at 02:58:16AM +, Zheng, Lv wrote:
> > > From: Rafael J. Wysocki [mailto:r...@rjwysocki.net]
> > > Sent: Thursday, October 31, 2013 8:22 PM
> > >
> > > On Thursday, October 31, 2013 05:08:50 AM Greg Kroah-Hartman wrote:
> > > > On Thu, Oct 31, 2013 at 12:39:21PM +0100, Rafael J. Wysocki wrote:
> > > > > On Thursday, October 31, 2013 09:07:40 AM Lv Zheng wrote:
> > > > > > There are bug-fixes for AML interpreter upstreamed, fixing some
> > > > > > serious issues found in recent platforms.  These fixes make Linux
> > > > > > AML interpreter more ACPI 2.0 ASL concept compliant.  Further AML
> > > > > > interpreter fixes should be based on such improvements, thus they
> > > > > > are good materials for stable.
> > > > > >
> > > > > > This patch set can be safely applied to linux-3.8:
> > > > > > commit 19f949f52599ba7c3f67a5897ac6be14bfcb1200 upstream.
> > > > > >
> > > > > > The patch set has passed build/boot tests on the following machines:
> > > > > >   Dell Inspiron Mini 1010 (i386)
> > > > > >   HP Compaq 8200 Elite SFF PC (x86-64)
> > > > > >
> > > > > > Bob Moore (4):
> > > > > >   ACPICA: Interpreter: Fix Store() when implicit conversion is not
> > > > > > possible.
> > > > > >   ACPICA: DeRefOf operator: Update to fully resolve FieldUnit and
> > > > > > BufferField refs.
> > > > > >   ACPICA: Return error if DerefOf resolves to a null package 
> > > > > > element.
> > > > > >   ACPICA: Fix for a Store->ArgX when ArgX contains a reference to a
> > > > > > field.
> > > > >
> > > > > Hi Greg,
> > > > >
> > > > > Please take patches [1-4/4] for stable.
> > > >
> > > > "Which" stable tree?
> > > >
> > > > I don't do 3.8, it's long been end-of-life, although one company is
> > > > trying to keep it alive, but that's not me.
> > > >
> > > > I'm only handling 3.4, 3.10, and 3.11 stable trees right now, which
> > > > one(s) should these be applied to?
> > >
> > > 3.10.x and 3.11.x then.
> > >
> > > Lv, do the original mainline commits apply to these kernels?
> > >
> > > Rafael
> >
> > Hi, Rafael and Greg
> >
> > I checked the back port dependencies since v3.8:
> > 1. [PATCH 1] belongs to v3.9.
> > 2. [PATCH 4] includes an empty line belonging to a coding style fix 
> > affecting this series (between [PATCH 3] and [PATCH 4]).
> > Thus,
> > 1. For v3.10:
> > [PATCH 1]: It's already in the repo, so please drop it.
> > [PATCH 2-4]: They can be used directly as 3.10.x stable materials.
> > 2. For v3.11:
> > [PATCH 1]: It's already in the repo, so please drop it.
> > [PATCH 2-3]: They can be used directly as 3.11.x stable materials.
> > [PATCH 4]: The original commit from Linus' tree should be used instead.
> >
> >  I checked the commit log since v3.4.
> >  There is no functional change done to the AML executer between v3.4
> >  and v3.8.
> >  The problem is there is a coding style fix affecting this series
> >  (between v3.4 and [PATCH 1]).
> >  I generated the following diff block before applying [PATCH 1], and
> >  obtained a successful build/boot to a v3.4 kernel with these patches
> >  applied.
> > Thus,
> > 1. For v3.4:
> > [PATCH 1]: You can merge this diff block to [PATCH 1] or simply modify
> > the [PATCH 1] by manually adding this white space.
> > [PATCH 2-4]: They can be used directly as 3.4.x stable materials.
> 
> Ok, I think I have this all properly queued up for 3.4, 3.10, and
> 3.11-stable trees, can you please check and verify I didn't mess
> anything up?

I pulled and checked, they are all correct.

Thanks and best regards
-Lv

> 
> thanks,
> 
> greg k-h


[PATCH 2/2] arch: hexagon: include: asm: use 'affinity' instead of 'locdis' for __vmintop_affinity() in "hexagon_vm.h"

2013-11-24 Thread Chen Gang
Every __vmintop_*() helper uses __vmintop(*, ...), so __vmintop_affinity()
needs to use __vmintop(vm_affinity, ...).


Signed-off-by: Chen Gang 
---
 arch/hexagon/include/asm/hexagon_vm.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/hexagon/include/asm/hexagon_vm.h b/arch/hexagon/include/asm/hexagon_vm.h
index e1e0470..3ea99c1 100644
--- a/arch/hexagon/include/asm/hexagon_vm.h
+++ b/arch/hexagon/include/asm/hexagon_vm.h
@@ -161,7 +161,7 @@ static inline long __vmintop_locdis(long i)
 
 static inline long __vmintop_affinity(long i, long cpu)
 {
-   return __vmintop(locdis, i, cpu, 0, 0);
+   return __vmintop(vm_affinity, i, cpu, 0, 0);
 }
 
 static inline long __vmintop_get(void)
-- 
1.7.7.6


[PATCH 1/2] arch: hexagon: include: asm: add prefix "vm_" for all enum members in "hexagon_vm.h"

2013-11-24 Thread Chen Gang
Add the prefix "vm_" to all enum members, whose names are common enough to
conflict with other subsystems. The related error with allmodconfig:

    CC [M]  drivers/md/raid1.o
  drivers/md/raid1.c:1440:13: error: 'status' redeclared as different kind of symbol
  arch/hexagon/include/asm/hexagon_vm.h:76:2: note: previous definition of 'status' was here


Signed-off-by: Chen Gang 
---
 arch/hexagon/include/asm/hexagon_vm.h |   70 
 1 files changed, 35 insertions(+), 35 deletions(-)

diff --git a/arch/hexagon/include/asm/hexagon_vm.h b/arch/hexagon/include/asm/hexagon_vm.h
index 67bb6d6..e1e0470 100644
--- a/arch/hexagon/include/asm/hexagon_vm.h
+++ b/arch/hexagon/include/asm/hexagon_vm.h
@@ -55,27 +55,27 @@
 #ifndef __ASSEMBLY__
 
 enum VM_CACHE_OPS {
-   ickill,
-   dckill,
-   l2kill,
-   dccleaninva,
-   icinva,
-   idsync,
-   fetch_cfg
+   vm_ickill,
+   vm_dckill,
+   vm_l2kill,
+   vm_dccleaninva,
+   vm_icinva,
+   vm_idsync,
+   vm_fetch_cfg
 };
 
 enum VM_INT_OPS {
-   nop,
-   globen,
-   globdis,
-   locen,
-   locdis,
-   affinity,
-   get,
-   peek,
-   status,
-   post,
-   clear
+   vm_nop,
+   vm_globen,
+   vm_globdis,
+   vm_locen,
+   vm_locdis,
+   vm_affinity,
+   vm_get,
+   vm_peek,
+   vm_status,
+   vm_post,
+   vm_clear
 };
 
 extern void _K_VM_event_vector(void);
@@ -98,65 +98,65 @@ long __vmvpid(void);
 
 static inline long __vmcache_ickill(void)
 {
-   return __vmcache(ickill, 0, 0);
+   return __vmcache(vm_ickill, 0, 0);
 }
 
 static inline long __vmcache_dckill(void)
 {
-   return __vmcache(dckill, 0, 0);
+   return __vmcache(vm_dckill, 0, 0);
 }
 
 static inline long __vmcache_l2kill(void)
 {
-   return __vmcache(l2kill, 0, 0);
+   return __vmcache(vm_l2kill, 0, 0);
 }
 
 static inline long __vmcache_dccleaninva(unsigned long addr, unsigned long len)
 {
-   return __vmcache(dccleaninva, addr, len);
+   return __vmcache(vm_dccleaninva, addr, len);
 }
 
 static inline long __vmcache_icinva(unsigned long addr, unsigned long len)
 {
-   return __vmcache(icinva, addr, len);
+   return __vmcache(vm_icinva, addr, len);
 }
 
 static inline long __vmcache_idsync(unsigned long addr,
   unsigned long len)
 {
-   return __vmcache(idsync, addr, len);
+   return __vmcache(vm_idsync, addr, len);
 }
 
 static inline long __vmcache_fetch_cfg(unsigned long val)
 {
-   return __vmcache(fetch_cfg, val, 0);
+   return __vmcache(vm_fetch_cfg, val, 0);
 }
 
 /* interrupt operations  */
 
 static inline long __vmintop_nop(void)
 {
-   return __vmintop(nop, 0, 0, 0, 0);
+   return __vmintop(vm_nop, 0, 0, 0, 0);
 }
 
 static inline long __vmintop_globen(long i)
 {
-   return __vmintop(globen, i, 0, 0, 0);
+   return __vmintop(vm_globen, i, 0, 0, 0);
 }
 
 static inline long __vmintop_globdis(long i)
 {
-   return __vmintop(globdis, i, 0, 0, 0);
+   return __vmintop(vm_globdis, i, 0, 0, 0);
 }
 
 static inline long __vmintop_locen(long i)
 {
-   return __vmintop(locen, i, 0, 0, 0);
+   return __vmintop(vm_locen, i, 0, 0, 0);
 }
 
 static inline long __vmintop_locdis(long i)
 {
-   return __vmintop(locdis, i, 0, 0, 0);
+   return __vmintop(vm_locdis, i, 0, 0, 0);
 }
 
 static inline long __vmintop_affinity(long i, long cpu)
@@ -166,27 +166,27 @@ static inline long __vmintop_affinity(long i, long cpu)
 
 static inline long __vmintop_get(void)
 {
-   return __vmintop(get, 0, 0, 0, 0);
+   return __vmintop(vm_get, 0, 0, 0, 0);
 }
 
 static inline long __vmintop_peek(void)
 {
-   return __vmintop(peek, 0, 0, 0, 0);
+   return __vmintop(vm_peek, 0, 0, 0, 0);
 }
 
 static inline long __vmintop_status(long i)
 {
-   return __vmintop(status, i, 0, 0, 0);
+   return __vmintop(vm_status, i, 0, 0, 0);
 }
 
 static inline long __vmintop_post(long i)
 {
-   return __vmintop(post, i, 0, 0, 0);
+   return __vmintop(vm_post, i, 0, 0, 0);
 }
 
 static inline long __vmintop_clear(long i)
 {
-   return __vmintop(clear, i, 0, 0, 0);
+   return __vmintop(vm_clear, i, 0, 0, 0);
 }
 
 #else /* Only assembly code should reference these */
-- 
1.7.7.6
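
(Illustration only, not part of the patch.)  The build error quoted in the
commit message comes from the fact that, in C, enumerators share one
namespace with functions and variables.  Below is a minimal sketch under
assumptions: the enum tag VM_INT_OPS_SKETCH, the signature of status() and
demo_no_collision() are hypothetical; the enumerator names and their order
are taken from the diff above.

```c
#include <assert.h>

/* Prefixed enumerators, in the same order as the patched header. */
enum VM_INT_OPS_SKETCH {
	vm_nop,
	vm_globen,
	vm_globdis,
	vm_locen,
	vm_locdis,
	vm_affinity,
	vm_get,
	vm_peek,
	vm_status,
	vm_post,
	vm_clear
};

/*
 * Stand-in for an unrelated driver function that happens to be named
 * "status" (hypothetical signature).  If the enum above still had an
 * unprefixed member "status", this definition would not compile:
 * "'status' redeclared as different kind of symbol".
 */
static int status(int disks)
{
	return disks > 0;
}

/* Both names can now live in the same translation unit. */
static int demo_no_collision(void)
{
	return status(2) + vm_status;	/* 1 + 8 */
}
```

The prefix only changes the spelling of the enumerators, not their values,
so callers of __vmintop() and __vmcache() see the same numbers as before.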


Re: [PATCH] extcon: gpio: Request gpio pin before modifying its state

2013-11-24 Thread MyungJoo Ham
> Commit 338de0ca (extcon: gpio: Use gpio driver/chip debounce if supported)
> introduced a call to gpio_set_debounce() before actually requesting the
> respective gpio pin from the gpio subsystem.
> 
> The gpio subsystem expects that a gpio pin was requested before modifying its
> state. Not doing so results in a warning from gpiolib, and the gpio pin is
> auto-requested. This in turn causes the subsequent devm_gpio_request_one()
> to fail. So devm_gpio_request_one() must be called prior to calling
> gpio_set_debounce().
> 
> Signed-off-by: Guenter Roeck 

Thank you.


Acked-by: MyungJoo Ham 

> ---
>  drivers/extcon/extcon-gpio.c |   11 ++-
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/extcon/extcon-gpio.c b/drivers/extcon/extcon-gpio.c
> index 7e0dff5..4736a9c 100644
> --- a/drivers/extcon/extcon-gpio.c
> +++ b/drivers/extcon/extcon-gpio.c
> @@ -105,6 +105,12 @@ static int gpio_extcon_probe(struct platform_device *pdev)
>   extcon_data->state_off = pdata->state_off;
>   if (pdata->state_on && pdata->state_off)
>   extcon_data->edev.print_state = extcon_gpio_print_state;
> +
> + ret = devm_gpio_request_one(&pdev->dev, extcon_data->gpio, GPIOF_DIR_IN,
> + pdev->name);
> + if (ret < 0)
> + return ret;
> +
>   if (pdata->debounce) {
>   ret = gpio_set_debounce(extcon_data->gpio,
>   pdata->debounce * 1000);
> @@ -117,11 +123,6 @@ static int gpio_extcon_probe(struct platform_device *pdev)
>   if (ret < 0)
>   return ret;
>  
> - ret = devm_gpio_request_one(&pdev->dev, extcon_data->gpio, GPIOF_DIR_IN,
> - pdev->name);
> - if (ret < 0)
> - goto err;
> -
>   INIT_DELAYED_WORK(&extcon_data->work, gpio_extcon_work);
>  
>   extcon_data->irq = gpio_to_irq(extcon_data->gpio);
> -- 
> 1.7.9.7
> 
> 
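
(Illustration only, not part of the patch.)  The ordering problem described
in the commit message can be modelled in plain userspace C.  This is a toy
model under assumptions: struct mock_gpio and all mock_*/probe_* functions
are hypothetical and are not gpiolib APIs; returning -EBUSY for a pin that
is already requested is an assumption of the model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of one pin; not the gpiolib data structure. */
struct mock_gpio {
	bool requested;
	unsigned int debounce_us;
};

/* Model of requesting a pin: fails if the pin already has an owner. */
static int mock_gpio_request(struct mock_gpio *g)
{
	if (g->requested)
		return -EBUSY;
	g->requested = true;
	return 0;
}

/* Model of setting debounce: warns and auto-requests an unrequested pin. */
static int mock_gpio_set_debounce(struct mock_gpio *g, unsigned int us)
{
	if (!g->requested) {
		fprintf(stderr, "warning: pin auto-requested\n");
		g->requested = true;
	}
	g->debounce_us = us;
	return 0;
}

/* Order before the patch: debounce first, request afterwards. */
static int probe_old_order(struct mock_gpio *g, unsigned int ms)
{
	int ret = mock_gpio_set_debounce(g, ms * 1000);

	if (ret < 0)
		return ret;
	return mock_gpio_request(g);	/* fails: pin was auto-requested */
}

/* Order after the patch: request first, then set debounce. */
static int probe_new_order(struct mock_gpio *g, unsigned int ms)
{
	int ret = mock_gpio_request(g);

	if (ret < 0)
		return ret;
	return mock_gpio_set_debounce(g, ms * 1000);
}

/* Run both orders on fresh pins and compare the outcome. */
static void demo_probe_order(void)
{
	struct mock_gpio a = { false, 0 };
	struct mock_gpio b = { false, 0 };

	assert(probe_old_order(&a, 5) == -EBUSY);
	assert(probe_new_order(&b, 5) == 0);
	assert(b.debounce_us == 5000);
}
```

In the model, the old order fails at the request step because the debounce
call already claimed the pin; the new order succeeds and keeps the
millisecond-to-microsecond conversion (debounce * 1000) from the driver.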


[PATCH 0/2] arch: hexagon: include: asm: add prefix "vm_" for all enum members in "hexagon_vm.h"

2013-11-24 Thread Chen Gang
Add the prefix "vm_" to all enum members, whose names are common enough to
conflict with other subsystems. The related error with allmodconfig:

    CC [M]  drivers/md/raid1.o
  drivers/md/raid1.c:1440:13: error: 'status' redeclared as different kind of symbol
  arch/hexagon/include/asm/hexagon_vm.h:76:2: note: previous definition of 'status' was here

Also fix a typo issue: use 'affinity' instead of 'locdis' for
__vmintop_affinity() in "hexagon_vm.h"


Signed-off-by: Chen Gang 
---
 arch/hexagon/include/asm/hexagon_vm.h |   72 
 1 files changed, 36 insertions(+), 36 deletions(-)


Re: The man page of /proc/stat needs to be fixed

2013-11-24 Thread Randy Dunlap
On 11/24/13 14:59, OSDepend wrote:
> The man page of the /proc fs only offers a definition of the first column
> of "intr" in /proc/stat, but lacks an explanation of the other columns.
> Same issue for the softirq line in /proc/stat!  How can I identify which
> interrupts the numbers relate to, respectively?
> 

There is a man page for /proc/stat ?

Anyway, please take a look at 
/Documentation/filesystems/proc.txt .
It contains a section (1.8) for "Miscellaneous kernel statistics in /proc/stat".
If that does not answer your questions, please let us know.

and please use a hard newline in your text every 70-72 characters or so...
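
For what it is worth, proc.txt describes the first column of the "intr"
line as the total of all interrupts serviced and each subsequent column as
the total for one particular interrupt.  A minimal userspace sketch of
reading the line that way follows; parse_intr_line() and demo_intr() are
hypothetical names, and treating the position of a column as the interrupt
number (to be matched against /proc/interrupts) is an assumption here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse an "intr" line from /proc/stat.
 * Stores the first column (the total) in *total and up to 'max'
 * per-interrupt counters in counts[].  Returns the number of
 * per-interrupt counters stored, or -1 if this is not an "intr" line.
 */
static int parse_intr_line(const char *line, unsigned long long *total,
			   unsigned long long *counts, int max)
{
	const char *p;
	char *end;
	int n = 0;

	if (strncmp(line, "intr ", 5) != 0)
		return -1;
	p = line + 5;

	*total = strtoull(p, &end, 10);
	if (end == p)
		return -1;	/* no total present */
	p = end;

	while (n < max) {
		unsigned long long v = strtoull(p, &end, 10);

		if (end == p)
			break;	/* no more numbers on the line */
		counts[n++] = v;
		p = end;
	}
	return n;
}

/* Parse the first few columns of the line quoted in this thread. */
static void demo_intr(void)
{
	unsigned long long total, c[8];
	int n = parse_intr_line("intr 1409191313 2119 2 0 0 1\n", &total, c, 8);

	assert(n == 5);
	assert(total == 1409191313ULL);
	assert(c[0] == 2119 && c[1] == 2 && c[4] == 1);
}
```

The "softirq" line has the same shape (a total followed by one counter per
softirq type), so the same loop applies once the prefix check is changed.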


> [root@hw092 proc]# cat /proc/stat 
> cpu  80451071 175 11782191 1329038749 69071 198 147056 0 0 0
> cpu0 14287214 23 2683574 42127654 40971 35 45265 0 0 0
> cpu1 9638646 35 1600926 47943968 2389 27 15931 0 0 0
> intr 1409191313 2119 2 0 0 1 0 0 0 1 0 0 0 4 0 0 0 29 0 2 0 0 0 0 0 0 0 0 0 0 
> 0 0 0 281421 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 
> 57583114 48846505 49762624 51599059 50570653 50783369 53543974 46920184 0 0 0 
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0
> ctxt 288625991
> btime 1384770128
> processes 84957
> procs_running 1
> procs_blocked 0
> softirq 1494638321 158 941224871 35100444 417538347 283479 158 159 59647062 
> 120767 40722876


-- 
~Randy


[PATCH] update consumers of MSG_MORE to recognize MSG_SENDPAGE_NOTLAST

2013-11-24 Thread Shawn Landden
Commit 35f9c09fe (tcp: tcp_sendpages() should call tcp_push() once)
added an internal flag MSG_SENDPAGE_NOTLAST, similar to
MSG_MORE.

algif_hash and algif_skcipher used MSG_MORE from tcp_sendpages()
and need to see the new flag as identical to MSG_MORE.

This fixes sendfile() on AF_ALG.

Cc: Tom Herbert 
Cc: Eric Dumazet 
Cc: David S. Miller 
Cc:  # 3.4.x + 3.2.x
Reported-and-tested-by: Shawn Landden 
Original-patch: Richard Weinberger 
Signed-off-by: Shawn Landden 
---
 crypto/algif_hash.c | 3 +++
 crypto/algif_skcipher.c | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index ef5356c..8502462 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -114,6 +114,9 @@ static ssize_t hash_sendpage(struct socket *sock, struct page *page,
 	struct hash_ctx *ctx = ask->private;
 	int err;
 
+	if (flags & MSG_SENDPAGE_NOTLAST)
+		flags |= MSG_MORE;
+
 	lock_sock(sk);
 	sg_init_table(ctx->sgl.sg, 1);
 	sg_set_page(ctx->sgl.sg, page, size, offset);
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index 6a6dfc0..a19c027 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -378,6 +378,9 @@ static ssize_t skcipher_sendpage(struct socket *sock, struct page *page,
 	struct skcipher_sg_list *sgl;
 	int err = -EINVAL;
 
+	if (flags & MSG_SENDPAGE_NOTLAST)
+		flags |= MSG_MORE;
+
 	lock_sock(sk);
 	if (!ctx->more && ctx->used)
 		goto unlock;
-- 
1.8.4.4
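
(Illustration only, not part of the patch.)  The flag handling added to
both sendpage handlers can be exercised in userspace.  This is a minimal
sketch under assumptions: the SKETCH_* constants mirror what I believe are
the values of MSG_MORE and MSG_SENDPAGE_NOTLAST in include/linux/socket.h,
and fold_notlast(), more_data_expected() and demo_flags() are hypothetical
names.

```c
#include <assert.h>

/* Assumed to match include/linux/socket.h; renamed to avoid clashes. */
#define SKETCH_MSG_MORE			0x8000
#define SKETCH_MSG_SENDPAGE_NOTLAST	0x20000

/* Fold the internal "not the last page" hint into MSG_MORE. */
static int fold_notlast(int flags)
{
	if (flags & SKETCH_MSG_SENDPAGE_NOTLAST)
		flags |= SKETCH_MSG_MORE;
	return flags;
}

/* What a consumer that only tests MSG_MORE concludes after folding. */
static int more_data_expected(int flags)
{
	return (fold_notlast(flags) & SKETCH_MSG_MORE) != 0;
}

/* Check the three interesting inputs. */
static void demo_flags(void)
{
	assert(more_data_expected(SKETCH_MSG_SENDPAGE_NOTLAST));
	assert(more_data_expected(SKETCH_MSG_MORE));
	assert(!more_data_expected(0));
}
```

The fold happens in the consumer, so the sender can still tell the two
flags apart; only code that wants "more data is coming" semantics treats
them alike.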



Re: [PATCH v7 0/4] Add dual-fifo mode support of i.MX ssi

2013-11-24 Thread Shawn Guo
On Sat, Nov 23, 2013 at 12:31:32AM +0800, Nicolin Chen wrote:
> Hi all,
> 
>I'm sorry to push this. But this series has been an orphan for a while.
>Could any one please receive and foster it?

Vinod,

I expect you will pick up the series.  But otherwise, I can apply it via
IMX tree with your ACKs on the first two patches.

Shawn



Re: [GIT] Security subsystem updates for 3.13

2013-11-24 Thread James Morris
Sorry about all this -- we'll do better next time.


-- 
James Morris



[PATCH] Commit 35f9c09fe (tcp: tcp_sendpages() should call tcp_push() once) added an internal flag MSG_SENDPAGE_NOTLAST, similar to MSG_MORE.

2013-11-24 Thread Shawn Landden
algif_hash and algif_skcipher used MSG_MORE from tcp_sendpages()
and need to see the new flag as identical to MSG_MORE.

This fixes sendfile() on AF_ALG.

Cc: Tom Herbert 
Cc: Eric Dumazet 
Cc: David S. Miller 
Cc:  # 3.4.x + 3.2.x
Reported-and-tested-by: Shawn Landden 
Original-patch: Richard Weinberger 
Signed-off-by: Shawn Landden 
---
 crypto/algif_hash.c | 3 +++
 crypto/algif_skcipher.c | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index ef5356c..8502462 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -114,6 +114,9 @@ static ssize_t hash_sendpage(struct socket *sock, struct page *page,
 	struct hash_ctx *ctx = ask->private;
 	int err;
 
+	if (flags & MSG_SENDPAGE_NOTLAST)
+		flags |= MSG_MORE;
+
 	lock_sock(sk);
 	sg_init_table(ctx->sgl.sg, 1);
 	sg_set_page(ctx->sgl.sg, page, size, offset);
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index 6a6dfc0..a19c027 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -378,6 +378,9 @@ static ssize_t skcipher_sendpage(struct socket *sock, struct page *page,
 	struct skcipher_sg_list *sgl;
 	int err = -EINVAL;
 
+	if (flags & MSG_SENDPAGE_NOTLAST)
+		flags |= MSG_MORE;
+
 	lock_sock(sk);
 	if (!ctx->more && ctx->used)
 		goto unlock;
-- 
1.8.4.4



Re: [GIT PULL] ima: bug fixes for Linus

2013-11-24 Thread James Morris
On Sun, 24 Nov 2013, Mimi Zohar wrote:

> On Mon, 2013-11-25 at 09:44 +1100, James Morris wrote:
> > On Sun, 24 Nov 2013, Mimi Zohar wrote:
> > 
> > > Hi James,
> > > 
> > > Linus has already reverted the trusted keyring support for IMA patches.
> > > These patches are re-based on -rc1.
> > > 
> > > The following changes since commit 
> > > 4c1cc40a2d49500d84038ff751bc6cd183e729b5:
> > > 
> > >   Revert "KEYS: verify a certificate is signed by a 'trusted' key" 
> > > (2013-11-23 16:38:17 -0800)
> > > 
> > > are available in the git repository at:
> > > 
> > >   git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity 
> > > for-linus
> > > 
> > > for you to fetch changes up to 3eeb2d63ab623be55bb2ff584e123c0df45691e3:
> > > 
> > >   ima: make a copy of template_fmt in template_desc_init_fields() 
> > > (2013-11-24 00:29:23 -0500)
> > > 
> > 
> > I don't understand -- are these all fixes for regressions in the new 
> > kernel?
> 
> Yes, mostly.  There's one code cleanup, that could be deferred and a
> documentation update.

Can we leave documentation and code cleanups to the next cycle and only 
include essential fixes for regressions at this stage?

Also, please identify which upstream commits specifically are fixed by 
each patch.


- James

-- 
James Morris



[PATCH] clocksource: bcm_kona_timer: Remove unused bcm_timer_ids

2013-11-24 Thread Axel Lin
bcm_timer_ids is no longer used after converting to CLOCKSOURCE_OF_DECLARE.

Signed-off-by: Axel Lin 
---
 drivers/clocksource/bcm_kona_timer.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/clocksource/bcm_kona_timer.c b/drivers/clocksource/bcm_kona_timer.c
index 0d7d8c3..5176e76 100644
--- a/drivers/clocksource/bcm_kona_timer.c
+++ b/drivers/clocksource/bcm_kona_timer.c
@@ -98,12 +98,6 @@ kona_timer_get_counter(void *timer_base, uint32_t *msw, uint32_t *lsw)
return;
 }
 
-static const struct of_device_id bcm_timer_ids[] __initconst = {
-   {.compatible = "brcm,kona-timer"},
-   {.compatible = "bcm,kona-timer"}, /* deprecated name */
-   {},
-};
-
 static void __init kona_timers_init(struct device_node *node)
 {
u32 freq;
-- 
1.8.1.2





Re: [PATCH] arch: hexagon: kernel: add export symbol function __delay()

2013-11-24 Thread Chen Gang
On 11/25/2013 09:19 AM, rkuo wrote:
> On Tue, Nov 19, 2013 at 11:10:43AM +0800, Chen Gang wrote:
>> Need to add a __delay() implementation, otherwise allmodconfig cannot
>> be built in the next-20131118 tree.
>>
>> The related error:
>>
>> CC  kernel/locking/spinlock_debug.o
>>   kernel/locking/spinlock_debug.c: In function '__spin_lock_debug':
>>   kernel/locking/spinlock_debug.c:114:3: error: implicit declaration of function '__delay' [-Werror=implicit-function-declaration]
>>
>>
>> Signed-off-by: Chen Gang 
>> ---
>>  arch/hexagon/include/asm/delay.h |1 +
>>  arch/hexagon/kernel/time.c   |9 +
>>  2 files changed, 10 insertions(+), 0 deletions(-)
> 
> Thanks again for all the cleanups.  I've tested this and the rest of your
> patches on my internal tree and everything checks out.
> 

OK, thanks. I will/should continue. :-)

> Also just to let you know, I'm still waiting to hear back on that compiler
> bug.
> 

If it is necessary and suitable, please include me in the related discussion.

> 
> 
> Acked-by: Richard Kuo 
> 

Thanks.
-- 
Chen Gang


linux-next: Tree for Nov 25

2013-11-24 Thread Stephen Rothwell
Hi all,

I have had a request for me to include some metric of the size of
linux-next in these reports - see below.

Changes since 20131122:

My fixes tree contains:

Revert "powerpc: Add CONFIG_CPU_LITTLE_ENDIAN kernel config option."

Non-merge commits (relative to Linus' tree): 1036
 1041 files changed, 32934 insertions(+), 20320 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" as mentioned in the FAQ on the wiki
(see below).

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64 and a
multi_v7_defconfig for arm. After the final fixups (if any), it is also
built with powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig and
allyesconfig (minus CONFIG_PROFILE_ALL_BRANCHES - this fails its final
link) and i386, sparc, sparc64 and arm defconfig. These builds also have
CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and
CONFIG_DEBUG_INFO disabled when necessary.

Below is a summary of the state of the merge.

I am currently merging 208 trees (counting Linus' and 29 trees of patches
pending for Linus' tree).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

There is a wiki covering stuff to do with linux-next at
http://linux.f-seidel.de/linux-next/pmwiki/ .  Thanks to Frank Seidel.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

$ git checkout master
$ git reset --hard stable
Merging origin/master (7e3528c3660a slab.h: remove duplicate kmalloc declaration and fix kernel-doc warnings)
Merging fixes/master (f5331539eb90 Revert "powerpc: Add CONFIG_CPU_LITTLE_ENDIAN kernel config option.")
Merging kbuild-current/rc-fixes (19514fc665ff arm, kbuild: make "make install" not depend on vmlinux)
Merging arc-current/for-curr (be3bdd0d2f2a ARC: Add guard macro to uapi/asm/unistd.h)
Merging arm-current/fixes (0c403462d682 ARM: 7894/1: kconfig: select GENERIC_CLOCKEVENTS if HAVE_ARM_ARCH_TIMER)
Merging m68k-current/for-linus (77a42796786c m68k: Remove deprecated IRQF_DISABLED)
Merging metag-fixes/fixes (3b2f64d00c46 Linux 3.11-rc2)
Merging powerpc-merge/merge (c13f20ac4832 powerpc/signals: Mark VSX not saved with small contexts)
Merging sparc/master (b4789b8e6be3 aacraid: prevent invalid pointer dereference)
Merging net/master (2c7a9dc16416 be2net: Avoid programming permenant MAC by BE3-R VFs)
Merging ipsec/master (be408cd3e1fe Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net)
Merging sound-current/for-linus (5db4d34b54c7 ALSA: hda - Set current_headset_type to ALC_HEADSET_TYPE_ENUM (janitorial))
Merging pci-current/for-linus (e7cc5cf74544 PCI: Remove duplicate pci_disable_device() from pcie_portdrv_remove())
Merging wireless/master (3b1bace9960b brcmfmac: fix possible memory leak)
Merging driver-core.current/driver-core-linus (027a485d12e0 sysfs: use a separate locking class for open files depending on mmap)
Merging tty.current/tty-linus (6ce4eac1f600 Linux 3.13-rc1)
Merging usb.current/usb-linus (6ce4eac1f600 Linux 3.13-rc1)
Merging staging.current/staging-linus (6ce4eac1f600 Linux 3.13-rc1)
Merging char-misc.current/char-misc-linus (6ce4eac1f600 Linux 3.13-rc1)
Merging input-current/for-linus (5cf0eb9875cb Merge branch 'next' into for-linus)
Merging md-current/for-linus (d47648fcf061 raid5: avoid finding "discard" stripe)
Merging crypto-current/master (f262f0f5cad0 crypto: s390 - Fix aes-cbc IV corruption)
Merging ide/master (c2f7d1e103ef ide: pmac: remove unnecessary pci_set_drvdata())
Merging dwmw2/master (5950f0803ca9 pcmcia: remove RPX board stuff)
Merging sh-current/sh-fixes-for-linus (44033109e99c SH: Convert out[bwl] macros to inline functions)
Merging devicetree-current/devicetree/merge (1931ee143b0a Revert "drivers: of: add initialization code for dma reserved memory")
Merging rr-fixes/fixes (f6537f2f0eba scripts/kallsyms: filter symbols not in kernel address space)
Merging mfd-fixes/master (ed2fe55fd91e mfd: ti-ssp: Fix build)
Merging vfio-fixes/for-linus (d93b3ac0edb8 VFIO: vfio_iommu_type1: fix bug caused by break in nested loop)
Merging 

Re: [PATCH] pipe_to_sendpage: Ensure that MSG_MORE is set if we set MSG_SENDPAGE_NOTLAST

2013-11-24 Thread Shawn Landden
On Sun, Nov 24, 2013 at 5:25 PM, Eric Dumazet  wrote:
> On Mon, 2013-11-25 at 00:42 +0100, Richard Weinberger wrote:
>> Commit 35f9c09fe (tcp: tcp_sendpages() should call tcp_push() once)
>> added an internal flag MSG_SENDPAGE_NOTLAST.
>> We have to ensure that MSG_MORE is also set if we set MSG_SENDPAGE_NOTLAST.
>> Otherwise users that check against MSG_MORE will not see it.
>>
>> This fixes sendfile() on AF_ALG.
>>
>> Cc: Tom Herbert 
>> Cc: Eric Dumazet 
>> Cc: David S. Miller 
>> Cc:  # 3.4.x
>> Reported-and-tested-by: Shawn Landden 
>> Signed-off-by: Richard Weinberger 
>> ---
>>  fs/splice.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/fs/splice.c b/fs/splice.c
>> index 3b7ee65..b93f1b8 100644
>> --- a/fs/splice.c
>> +++ b/fs/splice.c
>> @@ -701,7 +701,7 @@ static int pipe_to_sendpage(struct pipe_inode_info *pipe,
>>   more = (sd->flags & SPLICE_F_MORE) ? MSG_MORE : 0;
>>
>>   if (sd->len < sd->total_len && pipe->nrbufs > 1)
>> - more |= MSG_SENDPAGE_NOTLAST;
>> + more |= MSG_SENDPAGE_NOTLAST | MSG_MORE;
>>
>>   return file->f_op->sendpage(file, buf->page, buf->offset,
>>   sd->len, , more);
>
> I do not think this patch is right. It looks like a revert of a useful
> patch for TCP zero copy. Given the time it took to discover this
> regression, I bet TCP zero copy has more users than AF_ALG, by 5 or 6
> orders of magnitude ;)
>
> Here we want to distinguish between the two flags, not merge
> them.
>
> If AF_ALG does not care about the difference, try this instead:
>
> diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
> index ef5356cd280a..850246206b12 100644
> --- a/crypto/algif_hash.c
> +++ b/crypto/algif_hash.c
> @@ -114,6 +114,9 @@ static ssize_t hash_sendpage(struct socket *sock, struct page *page,
> struct hash_ctx *ctx = ask->private;
> int err;
>
> +   if (flags & MSG_SENDPAGE_NOTLAST)
> +   flags |= MSG_MORE;
> +
> lock_sock(sk);
> sg_init_table(ctx->sgl.sg, 1);
> sg_set_page(ctx->sgl.sg, page, size, offset);
>

From my testing this works.

-- 

---
Shawn Landden
+1 360 389 3001 (SMS preferred)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [f2fs-dev] [PATCH V3 2/2] f2fs: read contiguous sit entry pages by merging for mount performance

2013-11-24 Thread Chao Yu
Hi,

> -Original Message-
> From: Jaegeuk Kim [mailto:jaegeuk@samsung.com]
> Sent: Sunday, November 24, 2013 12:26 PM
> To: Chao Yu
> Cc: linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org; 
> linux-f2fs-de...@lists.sourceforge.net; 谭姝
> Subject: Re: [f2fs-dev] [PATCH V3 2/2] f2fs: read contiguous sit entry pages 
> by merging for mount performance
> 
> Hi,
> 
> 2013-11-22 (Fri), 09:09 +0800, Chao Yu:
> > Previously we read SIT entry pages one by one, which lost the chance
> > to read contiguous pages together. So we now read pages as contiguously
> > as possible for better mount performance.
> >
> > change log:
> >  o merge judgements/use 'Continue' or 'Break' instead of 'Goto' as Gu Zheng
> >suggested.
> >  o add mark_page_accessed() before release page to delay VM reclaiming.
> >  o remove '*order' for simplification of function as Jaegeuk Kim suggested.
> >
> > Signed-off-by: Chao Yu 
> > ---
> >  fs/f2fs/segment.c |  103 +++--
> >  fs/f2fs/segment.h |    2 ++
> >  2 files changed, 78 insertions(+), 27 deletions(-)
> >
> > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> > index 8149eba..998e7d3 100644
> > --- a/fs/f2fs/segment.c
> > +++ b/fs/f2fs/segment.c
> > @@ -14,6 +14,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >
> >  #include "f2fs.h"
> >  #include "segment.h"
> > @@ -1488,41 +1489,89 @@ static int build_curseg(struct f2fs_sb_info *sbi)
> > return restore_curseg_summaries(sbi);
> >  }
> >
> > +static int ra_sit_pages(struct f2fs_sb_info *sbi, int start, int nrpages)
> > +{
> > +   struct address_space *mapping = sbi->meta_inode->i_mapping;
> > +   struct page *page;
> > +   block_t blk_addr, prev_blk_addr = 0;
> > +   int sit_blk_cnt = SIT_BLK_CNT(sbi);
> > +   int blkno = start;
> > +
> > +   for (; blkno < start + nrpages && blkno < sit_blk_cnt; blkno++) {
> > +
> > +   blk_addr = current_sit_addr(sbi, start * SIT_ENTRY_PER_BLOCK);
> 
> Should be:
>   blk_addr = current_sit_addr(sbi, blkno * SIT_ENTRY_PER_BLOCK);
>   ---
> I'll fix this and merge the patch though.
> Thanks,

Oh, it's my mistake, sorry for that.
Thanks for your review! :)

> 
> --
> Jaegeuk Kim
> Samsung

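To illustrate the fix pointed out in the review above, here is a minimal sketch of why the address has to be derived from the loop variable `blkno` rather than from `start`. The helper `toy_sit_addr()`, the constants and the function `toy_ra_addrs()` are hypothetical stand-ins invented for this example; they are not the real f2fs code.

```c
/* Hypothetical stand-ins; the real f2fs values and helpers differ. */
#define TOY_ENTRIES_PER_BLOCK 55
#define TOY_SIT_BASE 1000u

/* Toy mapping from a SIT entry index to a block address. */
static unsigned int toy_sit_addr(unsigned int entry)
{
	return TOY_SIT_BASE + entry / TOY_ENTRIES_PER_BLOCK;
}

/*
 * Fill out[] with the address computed for each block in
 * [start, start + nrpages). With use_blkno == 0 the loop repeats the
 * mistake from the patch and derives every address from 'start'.
 */
static void toy_ra_addrs(unsigned int start, unsigned int nrpages,
			 int use_blkno, unsigned int *out)
{
	unsigned int blkno;

	for (blkno = start; blkno < start + nrpages; blkno++) {
		unsigned int idx = use_blkno ? blkno : start;

		out[blkno - start] = toy_sit_addr(idx * TOY_ENTRIES_PER_BLOCK);
	}
}
```

With `use_blkno == 0` every iteration yields the same address, so the "contiguous" check in a readahead loop would never see consecutive blocks.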


Re: [PATCH] sctp: Restore 'resent' bit to avoid retransmitted chunks for RTT measurements

2013-11-24 Thread Xufeng Zhang

On 11/22/2013 10:18 PM, Vlad Yasevich wrote:
> On 11/22/2013 03:30 AM, Xufeng Zhang wrote:
>> From: Signed-off-by: Xufeng Zhang
>>
>> Currently retransmitted DATA chunks could also be used for
>> RTT measurements since there is no flag to identify whether
>> the transmitted DATA chunk is a new one or a retransmitted one.
>> This problem is introduced by commit ae19c5486 ("sctp: remove
>> 'resent' bit from the chunk") which inappropriately removed the
>> 'resent' bit completely. Instead of doing this, we should set
>> the resent bit only for the retransmitted DATA chunks.
>>
>> Signed-off-by: Xufeng Zhang
>> ---
>>  include/net/sctp/structs.h |    1 +
>>  net/sctp/output.c          |   23 +--
>>  net/sctp/outqueue.c        |    3 +++
>>  net/sctp/sm_make_chunk.c   |    1 +
>>  4 files changed, 18 insertions(+), 10 deletions(-)
>>
>> diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
>> index 2174d8d..ea0ca5f 100644
>> --- a/include/net/sctp/structs.h
>> +++ b/include/net/sctp/structs.h
>> @@ -629,6 +629,7 @@ struct sctp_chunk {
>>  #define SCTP_NEED_FRTX 0x1
>>  #define SCTP_DONT_FRTX 0x2
>>      __u16   rtt_in_progress:1,  /* This chunk used for RTT calc? */
>> +            resent:1,           /* Has this chunk ever been resent. */
>>              has_tsn:1,          /* Does this chunk have a TSN yet? */
>>              has_ssn:1,          /* Does this chunk have a SSN yet? */
>>              singleton:1,        /* Only chunk in the packet? */
>> diff --git a/net/sctp/output.c b/net/sctp/output.c
>> index e650978..32c214d 100644
>> --- a/net/sctp/output.c
>> +++ b/net/sctp/output.c
>> @@ -467,17 +467,20 @@ int sctp_packet_transmit(struct sctp_packet *packet)
>>      list_for_each_entry_safe(chunk, tmp,>chunk_list, list) {
>>          list_del_init(>list);
>>          if (sctp_chunk_is_data(chunk)) {
>> -            /* 6.3.1 C4) When data is in flight and when allowed
>> -             * by rule C5, a new RTT measurement MUST be made each
>> -             * round trip.  Furthermore, new RTT measurements
>> -             * SHOULD be made no more than once per round-trip
>> -             * for a given destination transport address.
>> -             */
>> -
>> -            if (!tp->rto_pending) {
>> -                chunk->rtt_in_progress = 1;
>> -                tp->rto_pending = 1;
>> +            if (!chunk->resent) {
>> +                /* 6.3.1 C4) When data is in flight and when allowed
>> +                 * by rule C5, a new RTT measurement MUST be made each
>> +                 * round trip.  Furthermore, new RTT measurements
>> +                 * SHOULD be made no more than once per round-trip
>> +                 * for a given destination transport address.
>> +                 */
>> +
>> +                if (!tp->rto_pending) {
>
> Could be combined as
>     if (!chunk->resent && !tp->rto_pending) {

Got it!

>> +                    chunk->rtt_in_progress = 1;
>> +                    tp->rto_pending = 1;
>> +                }
>>              }
>> +
>>              has_data = 1;
>>          }
>>
>> diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
>> index 94df758..70f4f56 100644
>> --- a/net/sctp/outqueue.c
>> +++ b/net/sctp/outqueue.c
>> @@ -446,6 +446,8 @@ void sctp_retransmit_mark(struct sctp_outq *q,
>>              transport->rto_pending = 0;
>>          }
>>
>> +        chunk->resent = 1;
>> +
>>          /* Move the chunk to the retransmit queue. The chunks
>>           * on the retransmit queue are always kept in order.
>>           */
>> @@ -1375,6 +1377,7 @@ static void sctp_check_transmitted(struct sctp_outq *q,
>>           * instance).
>>           */
>>          if (!tchunk->tsn_gap_acked &&
>> +            !tchunk->resent &&
>>              tchunk->rtt_in_progress) {
>>              tchunk->rtt_in_progress = 0;
>>              rtt = jiffies - tchunk->sent_at;
>> diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
>> index fe69032..8fe89f8 100644
>> --- a/net/sctp/sm_make_chunk.c
>> +++ b/net/sctp/sm_make_chunk.c
>> @@ -1321,6 +1321,7 @@ struct sctp_chunk *sctp_chunkify(struct sk_buff *skb,
>>      INIT_LIST_HEAD(>list);
>>      retval->skb  = skb;
>>      retval->asoc = (struct sctp_association *)asoc;
>> +    retval->resent  = 0;
>
> Not needed due to zeroed out malloc.

Thank you!
Will send V2 later.

Thanks,
Xufeng

> -vlad
>
>>      retval->singleton= 1;
>>      retval->fast_retransmit = SCTP_CAN_FRTX;

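The rule discussed in this thread (never take an RTT sample from a chunk that has been retransmitted) can be sketched in userspace C. This is only a toy model: the `toy_*` structures and functions are invented for illustration and carry just the fields mentioned in the patch, not the real SCTP data structures.

```c
#include <stdbool.h>

/* Toy stand-ins for the kernel structures; only the fields discussed. */
struct toy_chunk {
	bool resent;          /* set once the chunk has been retransmitted */
	bool rtt_in_progress; /* chunk is being timed for an RTT sample */
};

struct toy_transport {
	bool rto_pending;     /* an RTT measurement is already under way */
};

/* Called when a DATA chunk is handed to the transmit path. */
static void toy_on_transmit(struct toy_chunk *c, struct toy_transport *tp)
{
	/* Combined test, as suggested in the review. */
	if (!c->resent && !tp->rto_pending) {
		c->rtt_in_progress = true;
		tp->rto_pending = true;
	}
}

/* Called when a chunk is moved to the retransmit queue. */
static void toy_on_retransmit_mark(struct toy_chunk *c,
				   struct toy_transport *tp)
{
	/* Abandon any measurement that was timing this chunk. */
	if (c->rtt_in_progress) {
		c->rtt_in_progress = false;
		tp->rto_pending = false;
	}
	c->resent = true;
}
```

Once `resent` is set, a later transmit of the same chunk no longer starts a measurement, which is the behaviour the patch aims to restore.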

RE: [f2fs-dev] [PATCH 1/2] f2fs: adds a tracepoint for submit_read_page

2013-11-24 Thread Chao Yu
Hi,

> -Original Message-
> From: Jaegeuk Kim [mailto:jaegeuk@samsung.com]
> Sent: Sunday, November 24, 2013 11:09 AM
> To: Chao Yu
> Cc: linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org; 
> linux-f2fs-de...@lists.sourceforge.net; 谭姝
> Subject: Re: [f2fs-dev] [PATCH 1/2] f2fs: adds a tracepoint for 
> submit_read_page
> 
> Hi,
> 
> We need to avoid redundancy as much as possible.
> So, how about this patch?

Ah, it's neater!
Thanks. :)


> -Original Message-
> From: Jaegeuk Kim [mailto:jaegeuk@samsung.com]
> Sent: Sunday, November 24, 2013 11:51 AM
> To: Chao Yu
> Cc: linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org; 
> linux-f2fs-de...@lists.sourceforge.net; 谭姝
> Subject: Re: [f2fs-dev] [PATCH 2/2] f2fs: adds a tracepoint for 
> f2fs_submit_read_bio
> 
> Hi,
> 
> Again, this tracepoint can also be integrated with write_bios like this.

Alright, Thanks!

Regards,
Yu



Re: [PATCH] pipe_to_sendpage: Ensure that MSG_MORE is set if we set MSG_SENDPAGE_NOTLAST

2013-11-24 Thread Eric Dumazet
On Mon, 2013-11-25 at 00:42 +0100, Richard Weinberger wrote:
> Commit 35f9c09fe (tcp: tcp_sendpages() should call tcp_push() once)
> added an internal flag MSG_SENDPAGE_NOTLAST.
> We have to ensure that MSG_MORE is also set if we set MSG_SENDPAGE_NOTLAST.
> Otherwise users that check against MSG_MORE will not see it.
> 
> This fixes sendfile() on AF_ALG.
> 
> Cc: Tom Herbert 
> Cc: Eric Dumazet 
> Cc: David S. Miller 
> Cc:  # 3.4.x
> Reported-and-tested-by: Shawn Landden 
> Signed-off-by: Richard Weinberger 
> ---
>  fs/splice.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/splice.c b/fs/splice.c
> index 3b7ee65..b93f1b8 100644
> --- a/fs/splice.c
> +++ b/fs/splice.c
> @@ -701,7 +701,7 @@ static int pipe_to_sendpage(struct pipe_inode_info *pipe,
>   more = (sd->flags & SPLICE_F_MORE) ? MSG_MORE : 0;
>  
>   if (sd->len < sd->total_len && pipe->nrbufs > 1)
> - more |= MSG_SENDPAGE_NOTLAST;
> + more |= MSG_SENDPAGE_NOTLAST | MSG_MORE;
>  
>   return file->f_op->sendpage(file, buf->page, buf->offset,
>   sd->len, , more);

I do not think this patch is right. It looks like a revert of a useful
patch for TCP zero copy. Given the time it took to discover this
regression, I bet TCP zero copy has more users than AF_ALG, by 5 or 6
orders of magnitude ;)

Here we want to distinguish between the two flags, not merge
them.

If AF_ALG does not care about the difference, try this instead:

diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index ef5356cd280a..850246206b12 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -114,6 +114,9 @@ static ssize_t hash_sendpage(struct socket *sock, struct page *page,
struct hash_ctx *ctx = ask->private;
int err;
 
+   if (flags & MSG_SENDPAGE_NOTLAST)
+   flags |= MSG_MORE;
+
lock_sock(sk);
sg_init_table(ctx->sgl.sg, 1);
sg_set_page(ctx->sgl.sg, page, size, offset);



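The idea behind the alternative patch above can be sketched in plain C: the sender keeps the two flags distinct, and a consumer that does not care about the difference maps "not last" onto "more" on its own side. The flag values and the `toy_*` names below are illustrative assumptions, not the kernel's definitions.

```c
/* Illustrative flag values; NOT the kernel's definitions. */
#define TOY_MSG_MORE             0x1
#define TOY_MSG_SENDPAGE_NOTLAST 0x2

/*
 * Consumer-side normalisation in the spirit of the algif_hash change:
 * NOTLAST is treated as implying MORE, while the sender is left free
 * to set the two flags independently.
 */
static int toy_normalize_flags(int flags)
{
	if (flags & TOY_MSG_SENDPAGE_NOTLAST)
		flags |= TOY_MSG_MORE;
	return flags;
}

/* Returns nonzero when the consumer should wait for more data. */
static int toy_expects_more(int flags)
{
	return (toy_normalize_flags(flags) & TOY_MSG_MORE) != 0;
}
```

Doing the mapping in the consumer leaves callers that rely on the distinction, such as the TCP path mentioned in the thread, unaffected.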


Re: 3.10.16 cgroup_mutex deadlock

2013-11-24 Thread Li Zefan
On 2013/11/23 6:54, William Dauchy wrote:
> Hi Tejun,
> 
> On Fri, Nov 22, 2013 at 11:18 PM, Tejun Heo  wrote:
>> Just applied to cgroup/for-3.13-fixes w/ stable cc'd.  Will push to
>> Linus next week.
> 
> Thank you for your quick reply. Do you also have a backport for
> v3.10.x already available?
> 

I'll do this after the patch hits mainline, if Tejun doesn't plan to.



Re: [PATCH] arch: hexagon: include: asm: add "vga.h" in Kbuild

2013-11-24 Thread rkuo
On Tue, Nov 19, 2013 at 01:17:21PM +0800, Chen Gang wrote:
> We need to include the generic "vga.h"; otherwise compilation with
> allmodconfig fails with the following error:
> 
> CC [M]  drivers/gpu/drm/drm_irq.o
>   In file included from include/linux/vgaarb.h:34:0,
>from drivers/gpu/drm/drm_irq.c:42:
>   include/video/vga.h:22:21: fatal error: asm/vga.h: No such file or directory
> 
> Also move "preempt.h" up to match the sort order.
> 
> 
> Signed-off-by: Chen Gang 
> ---
>  arch/hexagon/include/asm/Kbuild |3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 

Acked-by: Richard Kuo 

-- 

Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


Re: [PATCH] arch: hexagon: Kconfig: add HAVE_DMA_ATTR in Kconfig and remove "linux/dma-mapping.h" from "asm/dma-mapping.h"

2013-11-24 Thread rkuo
On Tue, Nov 19, 2013 at 12:57:27PM +0800, Chen Gang wrote:
> When HAS_DMA is set and the generic implementation is used, HAVE_DMA_ATTR
> must be enabled; otherwise compilation with allmodconfig fails with the
> following error:
> 
> CC [M]  drivers/ata/libata-core.o
>   drivers/ata/libata-core.c: In function 'ata_sg_clean':
>   drivers/ata/libata-core.c:4598:3: error: implicit declaration of function 
> 'dma_unmap_sg' [-Werror=implicit-function-declaration]
>   drivers/ata/libata-core.c: In function 'ata_sg_setup':
>   drivers/ata/libata-core.c:4708:2: error: implicit declaration of function 
> 'dma_map_sg' [-Werror=implicit-function-declaration]
> 
> "linux/dma-mapping.h" already includes "asm/dma-mapping.h", so
> "linux/dma-mapping.h" needs to be removed from "asm/dma-mapping.h".
> 
> 
> Signed-off-by: Chen Gang 

Acked-by: Richard Kuo 

-- 

Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


Re: [PATCH] arch: hexagon: kernel: add export symbol function __delay()

2013-11-24 Thread rkuo
On Tue, Nov 19, 2013 at 11:10:43AM +0800, Chen Gang wrote:
> We need to add a __delay() implementation; otherwise allmodconfig fails
> to build in the next-20131118 tree.
> 
> The related error:
> 
> CC  kernel/locking/spinlock_debug.o
>   kernel/locking/spinlock_debug.c: In function '__spin_lock_debug':
>   kernel/locking/spinlock_debug.c:114:3: error: implicit declaration of 
> function '__delay' [-Werror=implicit-function-declaration]
> 
> 
> Signed-off-by: Chen Gang 
> ---
>  arch/hexagon/include/asm/delay.h |1 +
>  arch/hexagon/kernel/time.c   |9 +
>  2 files changed, 10 insertions(+), 0 deletions(-)

Thanks again for all the cleanups.  I've tested this and the rest of your
patches on my internal tree and everything checks out.

Also just to let you know, I'm still waiting to hear back on that compiler
bug.



Acked-by: Richard Kuo 



-- 

Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


Re: [PATCH cgroup/for-3.13-fixes] cgroup: use a dedicated workqueue for cgroup destruction

2013-11-24 Thread Li Zefan
> Since be44562613851 ("cgroup: remove synchronize_rcu() from
> cgroup_diput()"), cgroup destruction path makes use of workqueue.  css
> freeing is performed from a work item from that point on and a later
> commit, ea15f8ccdb430 ("cgroup: split cgroup destruction into two
> steps"), moves css offlining to workqueue too.
> 
> As cgroup destruction isn't depended upon for memory reclaim, the
> destruction work items were put on the system_wq; unfortunately, some
> controller may block in the destruction path for considerable duration
> while holding cgroup_mutex.  As large part of destruction path is
> synchronized through cgroup_mutex, when combined with high rate of
> cgroup removals, this has potential to fill up system_wq's max_active
> of 256.
> 
> Also, it turns out that memcg's css destruction path ends up queueing
> and waiting for work items on system_wq through work_on_cpu().  If
> such operation happens while system_wq is fully occupied by cgroup
> destruction work items, work_on_cpu() can't make forward progress
> because system_wq is full and other destruction work items on
> system_wq can't make forward progress because the work item waiting
> for work_on_cpu() is holding cgroup_mutex, leading to deadlock.
> 
> This can be fixed by queueing destruction work items on a separate
> workqueue.  This patch creates a dedicated workqueue -
> cgroup_destroy_wq - for this purpose.  As these work items shouldn't
> have inter-dependencies and mostly serialized by cgroup_mutex anyway,
> giving high concurrency level doesn't buy anything and the workqueue's
> @max_active is set to 1 so that destruction work items are executed
> one by one on each CPU.
> 
> Hugh Dickins: Because cgroup_init() is run before init_workqueues(),
> cgroup_destroy_wq can't be allocated from cgroup_init().  Do it from a
> separate core_initcall().  In the future, we probably want to reorder
> so that workqueue init happens before cgroup_init().
> 
> Signed-off-by: Tejun Heo 
> Reported-by: Hugh Dickins 
> Reported-by: Shawn Bohrer 
> Link: 
> http://lkml.kernel.org/r/2013220626.ga7...@sbohrermbp13-local.rgmadvisors.com
> Link: http://lkml.kernel.org/g/alpine.LNX.2.00.1310301606080.2333@eggly.anvils
> Cc: sta...@vger.kernel.org # v3.9+

Acked-by: Li Zefan 

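The deadlock described in the commit message above can be shown with a very small counting model. It only tracks worker slots; the `toy_*` names are invented for this sketch and nothing here uses the real workqueue API.

```c
/* Toy workqueue: only counts how many work items are running. */
struct toy_wq {
	int max_active;
	int active;
};

/* Try to start one work item; fails when every slot is taken. */
static int toy_wq_start(struct toy_wq *wq)
{
	if (wq->active >= wq->max_active)
		return 0;
	wq->active++;
	return 1;
}

/*
 * Model the scenario from the commit message. 'pending' destruction
 * items are queued and block on the mutex; the one holding the mutex
 * then needs a nested item to run on the system queue. Returns 1 if
 * the nested item gets a slot, 0 if the system would be stuck.
 */
static int toy_nested_item_runs(int dedicated, int pending)
{
	struct toy_wq system_wq = { 256, 0 };
	struct toy_wq destroy_wq = { 1, 0 };
	struct toy_wq *wq = dedicated ? &destroy_wq : &system_wq;
	int i;

	/* Start as many destruction items as the chosen queue allows. */
	for (i = 0; i < pending; i++)
		if (!toy_wq_start(wq))
			break;

	/* The mutex holder now waits for a nested item on system_wq. */
	return toy_wq_start(&system_wq);
}
```

With a shared queue and enough pending removals every slot is held by an item waiting on the mutex, so the nested item never starts; with a dedicated queue the system queue stays free.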


Re: [RFC PATCH 0/4] sched: remove cpu_load decay.

2013-11-24 Thread Alex Shi
On 11/22/2013 08:13 PM, Daniel Lezcano wrote:
> 
> Hi Alex,
> 
> I tried on my Xeon server (2 x 4 cores) your patchset and got the
> following result:
> 
> kernel a5d6e63323fe7799eb0e6  / + patchset
> 
> hackbench -T -s 4096 -l 1000 -g 10 -f 40
>   27.604  38.556

Hi Daniel, could you give the detailed server info? 2 sockets * 4
cores sounds like it isn't a modern machine.

-- 
Thanks
Alex


Re: [PROBLEM] possible divide by 0 in kernel/sched/cputime.c scale_stime()

2013-11-24 Thread Christian Engelmayer
On Mon, 18 Nov 2013 18:27:06 +0100, Peter Zijlstra  wrote:
> That is not actually correct in the case time wraps.
> 
> There's a further problem with this code though -- ever since Frederic
> added NO_HZ_FULL a CPU can in fact aggregate a runtime delta larger than
> 4 seconds, due to running without a tick.
> 
> Therefore we need to be able to deal with u64 deltas.
> 
> The below is a compile tested only attempt to deal with both these
> problems. Comments?

I had this patch applied during daily use. No systematic testing, but no user
perceived regressions either. The originally reported divide by 0 scenario
could no longer be reproduced with this change.

> +/* 
> + * delta_exec * weight / lw.weight
> + *   OR
> + * (delta_exec * (weight * lw->inv_weight)) >> WMULT_SHIFT
> + *
> + * Either weight := NICE_0_LOAD and lw \e prio_to_wmult[], in which case
> + * we're guaranteed shift stays positive because inv_weight is guaranteed to
> + * fit 32 bits, and NICE_0_LOAD gives another 10 bits; therefore shift >= 22.
> + *
> + * Or, weight =< lw.weight (because lw.weight is the runqueue weight), thus
> + * XXX mind got twisted, but I'm fairly sure shift will stay positive.
> + *
> + */
> +static u64 __calc_delta(u64 delta_exec, unsigned long weight, struct load_weight *lw)

The patch itself seems comprehensible to me, although I have to admit that I
would have to read into the code more deeply in order to understand why the
changed __calc_delta() will always prove correct.
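As a reading aid for the two formulas in the quoted comment, here is a minimal sketch comparing the direct division with the multiply-by-inverse form. The `toy_*` names are made up, the inverse is computed on the fly instead of being cached, and the sketch does not include the overflow handling that the real patch is concerned with.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_WMULT_SHIFT 32

/* Direct form: delta_exec * weight / lw_weight. */
static uint64_t toy_calc_delta_div(uint64_t delta_exec, uint32_t weight,
				   uint32_t lw_weight)
{
	assert(lw_weight != 0);
	return delta_exec * weight / lw_weight;
}

/*
 * Fixed-point form: multiply by a precomputed inverse and shift.
 * inv_weight approximates 2^32 / lw_weight. This sketch does NOT
 * guard against 64-bit overflow of the product, so it is only valid
 * for small inputs.
 */
static uint64_t toy_calc_delta_mul(uint64_t delta_exec, uint32_t weight,
				   uint32_t lw_weight)
{
	uint64_t inv_weight;

	assert(lw_weight != 0);
	inv_weight = (UINT64_C(1) << TOY_WMULT_SHIFT) / lw_weight;
	return (delta_exec * (weight * inv_weight)) >> TOY_WMULT_SHIFT;
}
```

Because the inverse is rounded down, the fixed-point result can be slightly smaller than the exact quotient, which is the price paid for avoiding a 64-bit division.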

On Mon, 18 Nov 2013 15:19:56 +0100, Peter Zijlstra  wrote:
> I'm not sure what tool you used to generate that, but it's broken; that's
> model 0x25 (37), it somehow truncates the upper model bits.

Correct, that was the fairly outdated cpuid (http://www.ka9q.net/code/cpuid)
currently shipped with Ubuntu 13.10. Debian already switched to packaging a
maintained version (http://www.etallen.com/cpuid.html).

> That said, it's a westmere core and I've seen wsm-ep (dual socket)
> machines lose their TSC sync quite regularly, but this would be the
> first case a single socket wsm would lose its TSC sync.
>
> That leads me to believe your BIOS is screwing you over with SMIs or the
> like.

Having rechecked the running microcode, as hinted by Henrique de Moraes Holschuh
off-list, and having run the Intel BIOS Implementation Test Suite
(http://biosbits.org), that seems to be an educated guess.

Regards,
Christian


Re: [PATCH] max17042: Fix build errors caused by missing REGMAP_I2C config

2013-11-24 Thread jonghwa3 . lee
On 2013-11-24 19:41, Austin Boyle wrote:

> max17042 now uses regmap interface but does not enable config option. This 
> patch fixes the following build errors:
> 
> drivers/power/max17042_battery.c:661:15: error: variable 
> ‘max17042_regmap_config’ has initializer but incomplete type
> drivers/power/max17042_battery.c:662:2: error: unknown field ‘reg_bits’ 
> specified in initializer
> drivers/power/max17042_battery.c:662:2: warning: excess elements in struct 
> initializer
> drivers/power/max17042_battery.c:662:2: warning: (near initialization for 
> ‘max17042_regmap_config’)
> drivers/power/max17042_battery.c:663:2: error: unknown field ‘val_bits’ 
> specified in initializer
> drivers/power/max17042_battery.c:663:2: warning: excess elements in struct 
> initializer
> drivers/power/max17042_battery.c:663:2: warning: (near initialization for 
> ‘max17042_regmap_config’)
> drivers/power/max17042_battery.c:664:2: error: unknown field 
> ‘val_format_endian’ specified in initializer
> drivers/power/max17042_battery.c:664:23: error: ‘REGMAP_ENDIAN_NATIVE’ 
> undeclared here (not in a function)
> drivers/power/max17042_battery.c:664:2: warning: excess elements in struct 
> initializer
> drivers/power/max17042_battery.c:664:2: warning: (near initialization for 
> ‘max17042_regmap_config’)
> drivers/power/max17042_battery.c: In function ‘max17042_probe’:
> drivers/power/max17042_battery.c:684:2: error: implicit declaration of 
> function ‘devm_regmap_init_i2c’
> 
> Signed-off-by: Austin Boyle 
> ---
> diff --git a/drivers/power/Kconfig b/drivers/power/Kconfig
> index 5e2054a..85ad58c 100644
> --- a/drivers/power/Kconfig
> +++ b/drivers/power/Kconfig
> @@ -196,6 +196,7 @@ config BATTERY_MAX17040
>  config BATTERY_MAX17042
>   tristate "Maxim MAX17042/17047/17050/8997/8966 Fuel Gauge"
>   depends on I2C
> + select REGMAP_I2C
>   help
> MAX17042 is fuel-gauge systems for lithium-ion (Li+) batteries
> in handheld and portable equipment. The MAX17042 is configured
> 


Sorry, It's my fault. Thanks.

Acked-by: Jonghwa Lee 


Re: [GIT PULL] ima: bug fixes for Linus

2013-11-24 Thread Mimi Zohar
On Mon, 2013-11-25 at 09:44 +1100, James Morris wrote:
> On Sun, 24 Nov 2013, Mimi Zohar wrote:
> 
> > Hi James,
> > 
> > Linus has already reverted the trusted keyring support for IMA patches.
> > These patches are re-based on -rc1.
> > 
> > The following changes since commit 4c1cc40a2d49500d84038ff751bc6cd183e729b5:
> > 
> >   Revert "KEYS: verify a certificate is signed by a 'trusted' key" 
> > (2013-11-23 16:38:17 -0800)
> > 
> > are available in the git repository at:
> > 
> >   git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity 
> > for-linus
> > 
> > for you to fetch changes up to 3eeb2d63ab623be55bb2ff584e123c0df45691e3:
> > 
> >   ima: make a copy of template_fmt in template_desc_init_fields() 
> > (2013-11-24 00:29:23 -0500)
> > 
> 
> I don't understand -- are these all fixes for regressions in the new 
> kernel?

Yes, mostly.  There's one code cleanup, that could be deferred and a
documentation update.

Mimi





linux-next stats (Was: Linux 3.13-rc1 is out)

2013-11-24 Thread Stephen Rothwell
On Fri, 22 Nov 2013 12:36:37 -0800 Linus Torvalds wrote:
>
> Talking about mistakes... I suspect it was a mistake to have that
> extra week before the merge window opened, and I probably should just
> have done a 3.12-rc8 instead. Because the linux-next statistics look
> suspicious, and we had extra stuff show up there not just in that
> first week. Clearly people took that "let's have an extra week of
> merge window" and extrapolated it a bit too much. Oh, well. Live and
> learn.

As usual, the executive friendly graph is at
http://neuling.org/linux-next-size.html :-)

(No merge commits counted, next-20131105 was the linux-next based on v3.12)

Commits in v3.13-rc1 (relative to v3.12): 10518 (v3.12-rc11:9474)
Commits in next-20131105:  9029 (next-20130903: 8891)
Commits with the same SHA1:7979 (   7991)
Commits with the same patch_id: 621 (1) (472)
Commits with the same subject line:  70 (1) ( 70)

(1) not counting those in the lines above.

So commits in -rc1 that were in next-20131105:     8670 82.4%   (8533 90.1%)
Commits in -rc1 that were not in next-20131105:    1848 17.6%   ( 941  9.9%)

So worse than last time, probably because of the 1 week delay.
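The arithmetic behind the two summary lines above can be checked with a short sketch. The helper names are invented for illustration; percentages are returned in tenths of a percent to stay in integer arithmetic.

```c
/* Commits counted as "in linux-next": same SHA1, patch-id or subject. */
static unsigned int toy_in_next(unsigned int same_sha1,
				unsigned int same_patch_id,
				unsigned int same_subject)
{
	return same_sha1 + same_patch_id + same_subject;
}

/* Share of 'part' in 'total', in tenths of a percent, rounded. */
static unsigned int toy_permille(unsigned int part, unsigned int total)
{
	return (part * 1000u + total / 2) / total;
}
```

Feeding in the figures from the table (7979, 621 and 70 out of 10518) gives 8670 commits, and the remainder is 10518 minus that sum.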

[Aside: if I use next-2013 as a base (when Linus started doing
merges in earnest), the stats look like this:

Commits in next-2013:  9906
Commits with the same SHA1:9156
Commits with the same patch_id: 354 (1)
Commits with the same subject line:  41 (1)

So commits in -rc1 that were in next-2013:      9551 90.8%
Commits in -rc1 that were not in next-20131105:  967  9.2%

So, much more in line with previous releases.
]

Some breakdown of the list of extra commits (relative to next-20131105)
in -rc1:

Top ten first word of commit summary:

337 drm
115 btrfs
 83 perf
 68 alsa
 50 asoc
 49 arm
 46 net
 31 powerpc
 27 netfilter
 27 acpi

Top ten authors:

 66 Ben Skeggs 
 63 Takashi Iwai 
 63 Al Viro 
 46 Alex Deucher 
 39 Ben Widawsky 
 33 Josef Bacik 
 31 Johannes Berg 
 29 J. Bruce Fields 
 29 Dan Carpenter 
 27 Peter Zijlstra 

Top ten committers:

195 David S. Miller 
117 Chris Mason 
 95 Daniel Vetter 
 94 Al Viro 
 86 Ben Skeggs 
 82 Arnaldo Carvalho de Melo 
 74 Alex Deucher 
 69 Takashi Iwai 
 53 Dave Airlie 
 52 Mark Brown 

There are also 358 commits in next-20131105 that didn't make it into
v3.13-rc1.

Top ten first word of commit summary:

 51 arm
 40 crypto
 21 block
 11 x86
 11 ocfs2
 11 dm
 10 ceph
 10 bluetooth
  9 iov_iter
  9 9p

Top ten authors:

 27 Kent Overstreet 
 22 Dave Kleikamp 
 19 Andrew Morton 
  9 Zach Brown 
  9 Denis Carikli 
  8 Geyslan G. Bem 
  7 Sachin Kamat 
  7 Kees Cook 
  7 Alex Porosanu 
  6 Yan, Zheng 

Some of Andrew's patches are fixes for other patches in his tree (and
have been merged into those).

Top ten committers:

 93 Stephen Rothwell 
 48 Herbert Xu 
 33 Dave Kleikamp 
 30 Jens Axboe 
 27 Shawn Guo 
 17 Kukjin Kim 
 11 Mauro Carvalho Chehab 
 10 Jason Wessel 
  9 Sage Weil 
  9 Eric Van Hensbergen 

Those commits by me are from the quilt series (mainly Andrew's mmotm
tree).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




[PATCH 1/4] ACPI / bind: Simplify child device lookups

2013-11-24 Thread Rafael J. Wysocki
From: Rafael J. Wysocki 

Now that we create a struct acpi_device object for every ACPI
namespace node representing a device, it is not necessary to
use acpi_walk_namespace() for child device lookup in
acpi_find_child() any more.  Instead, we can simply walk the
list of children of the given struct acpi_device object and
return the matching one (or the one which is the best match if
there are more of them).  The checks done during the matching
loop can be simplified too so that the secondary namespace walks
in find_child_checks() are not necessary any more.

Signed-off-by: Rafael J. Wysocki 
---
 drivers/acpi/glue.c |  134 ++--
 include/acpi/acpi_bus.h |3 +
 2 files changed, 56 insertions(+), 81 deletions(-)

Index: linux-pm/drivers/acpi/glue.c
===
--- linux-pm.orig/drivers/acpi/glue.c
+++ linux-pm/drivers/acpi/glue.c
@@ -82,107 +82,79 @@ static struct acpi_bus_type *acpi_get_bu
 #define FIND_CHILD_MIN_SCORE   1
 #define FIND_CHILD_MAX_SCORE   2
 
-static acpi_status acpi_dev_present(acpi_handle handle, u32 lvl_not_used,
- void *not_used, void **ret_p)
-{
-   struct acpi_device *adev = NULL;
-
-   acpi_bus_get_device(handle, );
-   if (adev) {
-   *ret_p = handle;
-   return AE_CTRL_TERMINATE;
-   }
-   return AE_OK;
-}
-
-static int do_find_child_checks(acpi_handle handle, bool is_bridge)
+static int find_child_checks(struct acpi_device *adev, bool check_children)
 {
bool sta_present = true;
unsigned long long sta;
acpi_status status;
 
-   status = acpi_evaluate_integer(handle, "_STA", NULL, );
+   status = acpi_evaluate_integer(adev->handle, "_STA", NULL, );
if (status == AE_NOT_FOUND)
sta_present = false;
else if (ACPI_FAILURE(status) || !(sta & ACPI_STA_DEVICE_ENABLED))
return -ENODEV;
 
-   if (is_bridge) {
-   void *test = NULL;
+   if (check_children && list_empty(>children))
+   return -ENODEV;
 
-   /* Check if this object has at least one child device. */
-   acpi_walk_namespace(ACPI_TYPE_DEVICE, handle, 1,
-   acpi_dev_present, NULL, NULL, );
-   if (!test)
-   return -ENODEV;
-   }
return sta_present ? FIND_CHILD_MAX_SCORE : FIND_CHILD_MIN_SCORE;
 }
 
-struct find_child_context {
-   u64 addr;
-   bool is_bridge;
-   acpi_handle ret;
-   int ret_score;
-};
-
-static acpi_status do_find_child(acpi_handle handle, u32 lvl_not_used,
-void *data, void **not_used)
+struct acpi_device *acpi_find_child_device(struct acpi_device *parent,
+  u64 address, bool check_children)
 {
-   struct find_child_context *context = data;
-   unsigned long long addr;
-   acpi_status status;
-   int score;
+   struct acpi_device *adev, *ret = NULL;
+   int ret_score = 0;
 
-   status = acpi_evaluate_integer(handle, METHOD_NAME__ADR, NULL, );
-   if (ACPI_FAILURE(status) || addr != context->addr)
-   return AE_OK;
-
-   if (!context->ret) {
-   /* This is the first matching object.  Save its handle. */
-   context->ret = handle;
-   return AE_OK;
-   }
-   /*
-* There is more than one matching object with the same _ADR value.
-* That really is unexpected, so we are kind of beyond the scope of the
-* spec here.  We have to choose which one to return, though.
-*
-* First, check if the previously found object is good enough and return
-* its handle if so.  Second, check the same for the object that we've
-* just found.
-*/
-   if (!context->ret_score) {
-   score = do_find_child_checks(context->ret, context->is_bridge);
-   if (score == FIND_CHILD_MAX_SCORE)
-   return AE_CTRL_TERMINATE;
-   else
-   context->ret_score = score;
-   }
-   score = do_find_child_checks(handle, context->is_bridge);
-   if (score == FIND_CHILD_MAX_SCORE) {
-   context->ret = handle;
-   return AE_CTRL_TERMINATE;
-   } else if (score > context->ret_score) {
-   context->ret = handle;
-   context->ret_score = score;
+   list_for_each_entry(adev, &parent->children, node) {
+   unsigned long long addr;
+   acpi_status status;
+   int score;
+
+   status = acpi_evaluate_integer(adev->handle, METHOD_NAME__ADR,
+  NULL, &addr);
+   if (ACPI_FAILURE(status) || addr != address)
+   continue;
+
+   if (!ret) {
+   /* This is the 

Re: [PATCH] pipe_to_sendpage: Ensure that MSG_MORE is set if we set MSG_SENDPAGE_NOTLAST

2013-11-24 Thread Shawn Landden
On Sun, Nov 24, 2013 at 3:42 PM, Richard Weinberger  wrote:
> Commit 35f9c09fe (tcp: tcp_sendpages() should call tcp_push() once)
> added an internal flag MSG_SENDPAGE_NOTLAST.
> We have to ensure that MSG_MORE is also set if we set MSG_SENDPAGE_NOTLAST.
> Otherwise users that check against MSG_MORE will not see it.
>
> This fixes sendfile() on AF_ALG.
>
> Cc: Tom Herbert 
> Cc: Eric Dumazet 
> Cc: David S. Miller 
> Cc:  # 3.4.x

The offending commit also got backported to the 3.2 stable kernel, so
we need this fix there as well.
---
Shawn Landden
+1 360 389 3001 (SMS preferred)


[PATCH 3/4] ACPI / bind: Redefine acpi_get_child()

2013-11-24 Thread Rafael J. Wysocki
From: Rafael J. Wysocki 

Since acpi_get_child() is the only user of acpi_find_child() now,
drop the static inline definition of the former and redefine the
latter as new acpi_get_child().

Signed-off-by: Rafael J. Wysocki 
---
 drivers/acpi/glue.c |6 +++---
 include/acpi/acpi_bus.h |6 +-
 2 files changed, 4 insertions(+), 8 deletions(-)

Index: linux-pm/drivers/acpi/glue.c
===
--- linux-pm.orig/drivers/acpi/glue.c
+++ linux-pm/drivers/acpi/glue.c
@@ -147,16 +147,16 @@ struct acpi_device *acpi_find_child_devi
return ret;
 }
 
-acpi_handle acpi_find_child(acpi_handle handle, u64 addr, bool is_bridge)
+acpi_handle acpi_get_child(acpi_handle handle, u64 addr)
 {
struct acpi_device *parent;
 
if (!handle || acpi_bus_get_device(handle, &parent))
return NULL;
 
-   return acpi_find_child_device(parent, addr, is_bridge);
+   return acpi_find_child_device(parent, addr, false);
 }
-EXPORT_SYMBOL_GPL(acpi_find_child);
+EXPORT_SYMBOL_GPL(acpi_get_child);
 
 static void acpi_physnode_link_name(char *buf, unsigned int node_id)
 {
Index: linux-pm/include/acpi/acpi_bus.h
===
--- linux-pm.orig/include/acpi/acpi_bus.h
+++ linux-pm/include/acpi/acpi_bus.h
@@ -436,11 +436,7 @@ struct acpi_pci_root {
 
 struct acpi_device *acpi_find_child_device(struct acpi_device *parent,
   u64 address, bool check_children);
-acpi_handle acpi_find_child(acpi_handle, u64, bool);
-static inline acpi_handle acpi_get_child(acpi_handle handle, u64 addr)
-{
-   return acpi_find_child(handle, addr, false);
-}
+acpi_handle acpi_get_child(acpi_handle handle, u64 addr);
 void acpi_preset_companion(struct device *dev, acpi_handle parent, u64 addr);
 int acpi_is_root_bridge(acpi_handle);
 struct acpi_pci_root *acpi_pci_find_root(acpi_handle handle);



[PATCH 0/4] ACPI / bind: Simplify child devices lookup

2013-11-24 Thread Rafael J. Wysocki
Hi,

The following series of four patches (on top of current
linux-pm.git/bleeding-edge) reworks child device lookup in
drivers/acpi/glue.c and related things:

[1/4] ACPI / bind: Simplify child device lookup
[2/4] PCI / ACPI: Use acpi_find_child_device() for child device lookup
[3/4] ACPI / bind: Redefine acpi_get_child()
[4/4] ACPI / bind: Redefine acpi_preset_companion()

Thanks!

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


[PATCH 4/4] ACPI / bind: Redefine acpi_preset_companion()

2013-11-24 Thread Rafael J. Wysocki
From: Rafael J. Wysocki 

Modify acpi_preset_companion() to take a struct acpi_device pointer
instead of an ACPI handle as its second argument and redefine it as
a static inline wrapper around ACPI_COMPANION_SET() passing the
return value of acpi_find_child_device() directly as the second
argument to it.  Update its users to pass struct acpi_device
pointers instead of ACPI handles to it.

This allows some unnecessary acpi_bus_get_device() calls to be
avoided.

Signed-off-by: Rafael J. Wysocki 
---
 drivers/acpi/glue.c |   10 +-
 drivers/ata/libata-acpi.c   |   26 +-
 drivers/mmc/core/sdio_bus.c |2 +-
 include/acpi/acpi_bus.h |1 -
 include/linux/acpi.h|6 ++
 5 files changed, 21 insertions(+), 24 deletions(-)

Index: linux-pm/drivers/ata/libata-acpi.c
===
--- linux-pm.orig/drivers/ata/libata-acpi.c
+++ linux-pm/drivers/ata/libata-acpi.c
@@ -180,12 +180,12 @@ static const struct acpi_dock_ops ata_ac
 /* bind acpi handle to pata port */
 void ata_acpi_bind_port(struct ata_port *ap)
 {
-   acpi_handle host_handle = ACPI_HANDLE(ap->host->dev);
+   struct acpi_device *host_companion = ACPI_COMPANION(ap->host->dev);
 
-   if (libata_noacpi || ap->flags & ATA_FLAG_ACPI_SATA || !host_handle)
+   if (libata_noacpi || ap->flags & ATA_FLAG_ACPI_SATA || !host_companion)
return;
 
-   acpi_preset_companion(&ap->tdev, host_handle, ap->port_no);
+   acpi_preset_companion(&ap->tdev, host_companion, ap->port_no);
 
if (ata_acpi_gtm(ap, &ap->__acpi_init_gtm) == 0)
ap->pflags |= ATA_PFLAG_INIT_GTM_VALID;
@@ -198,17 +198,17 @@ void ata_acpi_bind_port(struct ata_port
 void ata_acpi_bind_dev(struct ata_device *dev)
 {
struct ata_port *ap = dev->link->ap;
-   acpi_handle port_handle = ACPI_HANDLE(&ap->tdev);
-   acpi_handle host_handle = ACPI_HANDLE(ap->host->dev);
-   acpi_handle parent_handle;
+   struct acpi_device *port_companion = ACPI_COMPANION(&ap->tdev);
+   struct acpi_device *host_companion = ACPI_COMPANION(ap->host->dev);
+   struct acpi_device *parent;
u64 adr;
 
/*
-* For both sata/pata devices, host handle is required.
-* For pata device, port handle is also required.
+* For both sata/pata devices, host companion device is required.
+* For pata device, port companion device is also required.
 */
-   if (libata_noacpi || !host_handle ||
-   (!(ap->flags & ATA_FLAG_ACPI_SATA) && !port_handle))
+   if (libata_noacpi || !host_companion ||
+   (!(ap->flags & ATA_FLAG_ACPI_SATA) && !port_companion))
return;
 
if (ap->flags & ATA_FLAG_ACPI_SATA) {
@@ -216,13 +216,13 @@ void ata_acpi_bind_dev(struct ata_device
adr = SATA_ADR(ap->port_no, NO_PORT_MULT);
else
adr = SATA_ADR(ap->port_no, dev->link->pmp);
-   parent_handle = host_handle;
+   parent = host_companion;
} else {
adr = dev->devno;
-   parent_handle = port_handle;
+   parent = port_companion;
}
 
-   acpi_preset_companion(&dev->tdev, parent_handle, adr);
+   acpi_preset_companion(&dev->tdev, parent, adr);
 
register_hotplug_dock_device(ata_dev_acpi_handle(dev),
 &ata_acpi_dev_dock_ops, dev, NULL, NULL);
Index: linux-pm/drivers/mmc/core/sdio_bus.c
===
--- linux-pm.orig/drivers/mmc/core/sdio_bus.c
+++ linux-pm/drivers/mmc/core/sdio_bus.c
@@ -308,7 +308,7 @@ static void sdio_acpi_set_handle(struct
struct mmc_host *host = func->card->host;
u64 addr = (host->slotno << 16) | func->num;
 
-   acpi_preset_companion(&func->dev, ACPI_HANDLE(host->parent), addr);
+   acpi_preset_companion(&func->dev, ACPI_COMPANION(host->parent), addr);
 }
 #else
 static inline void sdio_acpi_set_handle(struct sdio_func *func) {}
Index: linux-pm/drivers/acpi/glue.c
===
--- linux-pm.orig/drivers/acpi/glue.c
+++ linux-pm/drivers/acpi/glue.c
@@ -146,6 +146,7 @@ struct acpi_device *acpi_find_child_devi
}
return ret;
 }
+EXPORT_SYMBOL_GPL(acpi_find_child_device);
 
 acpi_handle acpi_get_child(acpi_handle handle, u64 addr)
 {
@@ -294,15 +295,6 @@ int acpi_unbind_one(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(acpi_unbind_one);
 
-void acpi_preset_companion(struct device *dev, acpi_handle parent, u64 addr)
-{
-   struct acpi_device *adev;
-
-   if (!acpi_bus_get_device(acpi_get_child(parent, addr), &adev))
-   ACPI_COMPANION_SET(dev, adev);
-}
-EXPORT_SYMBOL_GPL(acpi_preset_companion);
-
 static int acpi_platform_notify(struct device *dev)
 {
struct acpi_bus_type *type = acpi_get_bus_type(dev);
Index: 

[PATCH 2/4] PCI / ACPI: Use acpi_find_child_device() for child devices lookup

2013-11-24 Thread Rafael J. Wysocki
From: Rafael J. Wysocki 

It is much more efficient to use acpi_find_child_device()
for child devices lookup in acpi_pci_find_device() and pass
ACPI_COMPANION(dev->parent) to it directly instead of obtaining
ACPI_HANDLE() of ACPI_COMPANION(dev->parent) and passing it to
acpi_find_child() which has to run acpi_bus_get_device() to
obtain ACPI_COMPANION(dev->parent) from that again.

Signed-off-by: Rafael J. Wysocki 
---
 drivers/pci/pci-acpi.c |   16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

Index: linux-pm/drivers/pci/pci-acpi.c
===
--- linux-pm.orig/drivers/pci/pci-acpi.c
+++ linux-pm/drivers/pci/pci-acpi.c
@@ -309,7 +309,8 @@ void acpi_pci_remove_bus(struct pci_bus
 static int acpi_pci_find_device(struct device *dev, acpi_handle *handle)
 {
struct pci_dev *pci_dev = to_pci_dev(dev);
-   bool is_bridge;
+   struct acpi_device *adev;
+   bool check_children;
u64 addr;
 
/*
@@ -317,14 +318,17 @@ static int acpi_pci_find_device(struct d
 * is set only after acpi_pci_find_device() has been called for the
 * given device.
 */
-   is_bridge = pci_dev->hdr_type == PCI_HEADER_TYPE_BRIDGE
+   check_children = pci_dev->hdr_type == PCI_HEADER_TYPE_BRIDGE
|| pci_dev->hdr_type == PCI_HEADER_TYPE_CARDBUS;
/* Please ref to ACPI spec for the syntax of _ADR */
addr = (PCI_SLOT(pci_dev->devfn) << 16) | PCI_FUNC(pci_dev->devfn);
-   *handle = acpi_find_child(ACPI_HANDLE(dev->parent), addr, is_bridge);
-   if (!*handle)
-   return -ENODEV;
-   return 0;
+   adev = acpi_find_child_device(ACPI_COMPANION(dev->parent), addr,
+ check_children);
+   if (adev) {
+   *handle = adev->handle;
+   return 0;
+   }
+   return -ENODEV;
 }
 
 static void pci_acpi_setup(struct device *dev)



[PATCH] pipe_to_sendpage: Ensure that MSG_MORE is set if we set MSG_SENDPAGE_NOTLAST

2013-11-24 Thread Richard Weinberger
Commit 35f9c09fe (tcp: tcp_sendpages() should call tcp_push() once)
added an internal flag MSG_SENDPAGE_NOTLAST.
We have to ensure that MSG_MORE is also set if we set MSG_SENDPAGE_NOTLAST.
Otherwise users that check against MSG_MORE will not see it.

This fixes sendfile() on AF_ALG.
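
The flag computation described above can be sketched in user space as
follows.  This is only an illustration under stated assumptions: the
helper sendpage_flags() and its parameters are hypothetical stand-ins
for the logic in pipe_to_sendpage(), and the numeric flag values are
meant to mirror the kernel headers but should be checked against them.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative values; the authoritative definitions are in the kernel headers. */
#define SPLICE_F_MORE        0x04
#define MSG_MORE             0x8000
#define MSG_SENDPAGE_NOTLAST 0x20000

/*
 * Hypothetical helper mirroring the flag computation in pipe_to_sendpage().
 * len is the size of the current chunk, total_len the size of the whole
 * splice request, nrbufs the number of buffers still queued in the pipe.
 */
int sendpage_flags(unsigned int splice_flags, size_t len,
                   size_t total_len, unsigned int nrbufs)
{
    int more = (splice_flags & SPLICE_F_MORE) ? MSG_MORE : 0;

    assert(len <= total_len);

    /* More pages follow: set both flags so that MSG_MORE checks see it. */
    if (len < total_len && nrbufs > 1)
        more |= MSG_SENDPAGE_NOTLAST | MSG_MORE;

    return more;
}
```

With this change, a sendpage implementation that only tests MSG_MORE
behaves the same whether the hint came from SPLICE_F_MORE or from the
internal "not the last page" condition.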

Cc: Tom Herbert 
Cc: Eric Dumazet 
Cc: David S. Miller 
Cc:  # 3.4.x
Reported-and-tested-by: Shawn Landden 
Signed-off-by: Richard Weinberger 
---
 fs/splice.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/splice.c b/fs/splice.c
index 3b7ee65..b93f1b8 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -701,7 +701,7 @@ static int pipe_to_sendpage(struct pipe_inode_info *pipe,
more = (sd->flags & SPLICE_F_MORE) ? MSG_MORE : 0;
 
if (sd->len < sd->total_len && pipe->nrbufs > 1)
-   more |= MSG_SENDPAGE_NOTLAST;
+   more |= MSG_SENDPAGE_NOTLAST | MSG_MORE;
 
return file->f_op->sendpage(file, buf->page, buf->offset,
sd->len, , more);
-- 
1.8.1.4



[patch 4/9] mm: filemap: move radix tree hole searching here

2013-11-24 Thread Johannes Weiner
The radix tree hole searching code is only used for page cache, for
example the readahead code trying to get a picture of the area
surrounding a fault.

Until now, it sufficed to rely on the radix tree definition of holes,
which is "empty tree slot".  This is about to change, though, as shadow
page descriptors will be stored in the page cache after the actual
pages get evicted from memory.

Move the functions over to mm/filemap.c and make them native page
cache operations, where they can later be adapted to handle the new
definition of "page cache hole".
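
For readers unfamiliar with the hole search, the scan can be sketched
over a toy array instead of a radix tree.  The loop follows
radix_tree_next_hole() as removed by this patch; struct toy_cache and
toy_lookup() are hypothetical stand-ins for the mapping and the tree
lookup, not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_SLOTS 16UL

/* Toy stand-in for a mapping: a non-NULL slot means "page present". */
struct toy_cache {
    const void *slots[TOY_NR_SLOTS];
};

/* Hypothetical lookup; indices beyond the toy array count as holes. */
const void *toy_lookup(const struct toy_cache *c, unsigned long index)
{
    return index < TOY_NR_SLOTS ? c->slots[index] : NULL;
}

/*
 * Scan [index, index + max_scan - 1] for the lowest-indexed hole.
 * If no hole is found, the returned index lies outside the scanned
 * range; on index wrap-around, 0 is returned.
 */
unsigned long toy_next_hole(const struct toy_cache *c,
                            unsigned long index, unsigned long max_scan)
{
    unsigned long i;

    assert(c != NULL);

    for (i = 0; i < max_scan; i++) {
        if (!toy_lookup(c, index))
            break;
        index++;
        if (index == 0)
            break;
    }
    return index;
}
```

A caller tells "no hole in range" apart from "hole found" by checking
whether the returned index minus the start index is at least max_scan.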

Signed-off-by: Johannes Weiner 
---
 fs/nfs/blocklayout/blocklayout.c |  2 +-
 include/linux/pagemap.h  |  5 +++
 include/linux/radix-tree.h   |  4 ---
 lib/radix-tree.c | 75 ---
 mm/filemap.c | 76 
 mm/readahead.c   |  4 +--
 6 files changed, 84 insertions(+), 82 deletions(-)

diff --git a/fs/nfs/blocklayout/blocklayout.c b/fs/nfs/blocklayout/blocklayout.c
index e242bbf..fdb74cb 100644
--- a/fs/nfs/blocklayout/blocklayout.c
+++ b/fs/nfs/blocklayout/blocklayout.c
@@ -1220,7 +1220,7 @@ static u64 pnfs_num_cont_bytes(struct inode *inode, pgoff_t idx)
end = DIV_ROUND_UP(i_size_read(inode), PAGE_CACHE_SIZE);
if (end != NFS_I(inode)->npages) {
rcu_read_lock();
-   end = radix_tree_next_hole(&mapping->page_tree, idx + 1, ULONG_MAX);
+   end = page_cache_next_hole(mapping, idx + 1, ULONG_MAX);
rcu_read_unlock();
}
 
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index e3dea75..c73130c 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -243,6 +243,11 @@ static inline struct page *page_cache_alloc_readahead(struct address_space *x)
 
 typedef int filler_t(void *, struct page *);
 
+pgoff_t page_cache_next_hole(struct address_space *mapping,
+pgoff_t index, unsigned long max_scan);
+pgoff_t page_cache_prev_hole(struct address_space *mapping,
+pgoff_t index, unsigned long max_scan);
+
 extern struct page * find_get_page(struct address_space *mapping,
pgoff_t index);
 extern struct page * find_lock_page(struct address_space *mapping,
diff --git a/include/linux/radix-tree.h b/include/linux/radix-tree.h
index 1bf0a9c..e8be53e 100644
--- a/include/linux/radix-tree.h
+++ b/include/linux/radix-tree.h
@@ -227,10 +227,6 @@ radix_tree_gang_lookup(struct radix_tree_root *root, void **results,
 unsigned int radix_tree_gang_lookup_slot(struct radix_tree_root *root,
void ***results, unsigned long *indices,
unsigned long first_index, unsigned int max_items);
-unsigned long radix_tree_next_hole(struct radix_tree_root *root,
-   unsigned long index, unsigned long max_scan);
-unsigned long radix_tree_prev_hole(struct radix_tree_root *root,
-   unsigned long index, unsigned long max_scan);
 int radix_tree_preload(gfp_t gfp_mask);
 int radix_tree_maybe_preload(gfp_t gfp_mask);
 void radix_tree_init(void);
diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index f442e32..e8adb5d 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -946,81 +946,6 @@ next:
 }
 EXPORT_SYMBOL(radix_tree_range_tag_if_tagged);
 
-
-/**
- * radix_tree_next_hole - find the next hole (not-present entry)
- * @root:  tree root
- * @index: index key
- * @max_scan:  maximum range to search
- *
- * Search the set [index, min(index+max_scan-1, MAX_INDEX)] for the lowest
- * indexed hole.
- *
- * Returns: the index of the hole if found, otherwise returns an index
- * outside of the set specified (in which case 'return - index >= max_scan'
- * will be true). In rare cases of index wrap-around, 0 will be returned.
- *
- * radix_tree_next_hole may be called under rcu_read_lock. However, like
- * radix_tree_gang_lookup, this will not atomically search a snapshot of
- * the tree at a single point in time. For example, if a hole is created
- * at index 5, then subsequently a hole is created at index 10,
- * radix_tree_next_hole covering both indexes may return 10 if called
- * under rcu_read_lock.
- */
-unsigned long radix_tree_next_hole(struct radix_tree_root *root,
-   unsigned long index, unsigned long max_scan)
-{
-   unsigned long i;
-
-   for (i = 0; i < max_scan; i++) {
-   if (!radix_tree_lookup(root, index))
-   break;
-   index++;
-   if (index == 0)
-   break;
-   }
-
-   return index;
-}
-EXPORT_SYMBOL(radix_tree_next_hole);
-
-/**
- * radix_tree_prev_hole - find the prev hole (not-present entry)
- * @root:  tree root
- * @index: index key
- *

[patch 0/9] mm: thrash detection-based file cache sizing v6

2013-11-24 Thread Johannes Weiner
Changes in this revision:

o Based on suggestions from Dave Chinner and Rik van Riel, rework the
  shadow entry reclaim to directly track and scan radix tree nodes
  containing only shadows instead of operating on an inode level.
  This adds one word to the address space (thus inode) and two words
  to the radix_tree_node, but the number of objects per slab remains
  unchanged in both cases.  The shrinker no longer needs to scan radix
  trees but can just walk the list of immediately reclaimable nodes.

[ Dave, I looked into getting rid of the AS_EXITING flag but since
  reclaim can't participate in inode lifetime management (no iput in
  NOFS context), the fs somehow needs to communicate the final
  truncate so that reclaim can stop putting shadow entries into the
  tree.  We can't detect it in the truncate call, unless we modify the
  API to carry that bit of information, and switch every filesystem
  over to the new truncate, but at that point we might as well just
  leave the AS_EXITING setting in one place in the vfs code with a
  comment; it seems less error prone.

  In the last revision, it seems you were mostly thrown by the dumb
  shrinker linking every inode, thus increasing the inode footprint
  massively.  All inode involvement is gone now, maybe you won't hate
  the address space flag as much anymore after a fresh look... ]

Summary

The VM maintains cached filesystem pages on two types of lists.  One
list holds the pages recently faulted into the cache, the other list
holds pages that have been referenced repeatedly on that first list.
The idea is to prefer reclaiming young pages over those that have been
shown to benefit from caching in the past.  We call the recently used
list "inactive list" and the frequently used list "active list".

Currently, the VM aims for a 1:1 ratio between the lists, which is the
"perfect" trade-off between the ability to *protect* frequently used
pages and the ability to *detect* frequently used pages.  This means
that working set changes bigger than half of cache memory go
undetected and thrash indefinitely, whereas working sets bigger than
half of cache memory are unprotected against used-once streams that
don't even need caching.

This happens on file servers and media streaming servers, where the
popular files and file sections change over time.  Even though the
individual files might be smaller than half of memory, concurrent
access to many of them may still result in their inter-reference
distance being greater than half of memory.  It's also been reported
as a problem on database workloads that switch back and forth between
tables that are bigger than half of memory.  In these cases the VM
never recognizes the new working set and will for the remainder of the
workload thrash disk data which could easily live in memory.

Historically, every reclaim scan of the inactive list also took a
smaller number of pages from the tail of the active list and moved
them to the head of the inactive list.  This model gave established
working sets more grace time in the face of temporary use-once streams,
but ultimately was not significantly better than a FIFO policy and
still thrashed cache based on eviction speed, rather than actual
demand for cache.

This series solves the problem by maintaining a history of pages
evicted from the inactive list, enabling the VM to detect frequently
used pages regardless of inactive list size and facilitate working set
transitions.
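
One way such an eviction history could work is sketched below.  This is
a simplified illustration, not the code from this series: struct
toy_zone, toy_evict() and toy_refault_should_activate() are hypothetical
names, and the comparison against the active list size is an assumption
about how the recorded history would be used.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-zone state. */
struct toy_zone {
    unsigned long inactive_age; /* advances on every eviction */
    unsigned long nr_active;    /* current size of the active list */
};

/* On eviction, snapshot the clock; a shadow entry would hold this value. */
unsigned long toy_evict(struct toy_zone *z)
{
    return z->inactive_age++;
}

/*
 * On refault, the number of evictions since the page left the cache
 * approximates how many extra inactive slots it would have needed to
 * stay resident.  If the active list could have supplied that many
 * slots, the page is treated as part of the working set.
 */
bool toy_refault_should_activate(const struct toy_zone *z,
                                 unsigned long shadow)
{
    unsigned long distance = z->inactive_age - shadow;

    return distance <= z->nr_active;
}
```

The point of the sketch is that the decision no longer depends on the
size of the inactive list itself, only on the recorded distance.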

Tests

The reported database workload is easily demonstrated on an 8G machine
with two filesets of 6G each.  This fio workload operates on one set first,
then switches to the other.  The VM should obviously always cache the
set that the workload is currently using.

unpatched:
db1: READ: io=98304MB, aggrb=803577KB/s, minb=803577KB/s, maxb=803577KB/s, mint= 125269msec, maxt= 125269msec
db2: READ: io=98304MB, aggrb= 65610KB/s, minb= 65610KB/s, maxb= 65610KB/s, mint=1534266msec, maxt=1534266msec
sdb: ios=835729/7, merge=4/2, ticks=4620185/318869, in_queue=4938281, util=98.33%

real    27m40.094s
user    0m20.017s
sys     1m35.293s

patched:
db1: READ: io=98304MB, aggrb=796954KB/s, minb=796954KB/s, maxb=796954KB/s, mint=126310msec, maxt=126310msec
db2: READ: io=98304MB, aggrb=376076KB/s, minb=376076KB/s, maxb=376076KB/s, mint=267667msec, maxt=267667msec
sdb: ios=170660/4, merge=2/1, ticks=956451/62623, in_queue=1018896, util=86.23%

real    6m34.717s
user    0m17.120s
sys     0m54.790s

As can be seen, the unpatched kernel simply never adapts to the
workingset change and db2 is stuck indefinitely with secondary storage
speed.  The patched kernel needs 2-3 iterations over db2 before it
replaces db1 and reaches full memory speed.  Given the unbounded
negative effect of the existing VM behavior, these patches should be
considered correctness fixes rather than performance optimizations.

Another test resembles a fileserver or streaming server workload,
where data in excess of memory size 

[patch 6/9] mm + fs: store shadow entries in page cache

2013-11-24 Thread Johannes Weiner
Reclaim will be leaving shadow entries in the page cache radix tree
upon evicting the real page.  As those pages are found from the LRU,
an iput() can lead to the inode being freed concurrently.  At this
point, reclaim must no longer install shadow pages because the inode
freeing code needs to ensure the page tree is really empty.

Add an address_space flag, AS_EXITING, that the inode freeing code
sets under the tree lock before doing the final truncate.  Reclaim
will check for this flag before installing shadow pages.
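
The handshake between teardown and reclaim can be sketched in user
space as follows.  This is a minimal model under stated assumptions: a
pthread mutex stands in for the tree lock, a bool stands in for the
AS_EXITING bit, and struct toy_aspace and both functions are
hypothetical names rather than kernel interfaces.

```c
#include <pthread.h>
#include <stdbool.h>

struct toy_aspace {
    pthread_mutex_t tree_lock;
    bool exiting;             /* stands in for AS_EXITING */
    unsigned long nrpages;
    unsigned long nrshadows;
};

/* Inode teardown: set the flag under the lock before the final truncate. */
void toy_mapping_set_exiting(struct toy_aspace *m)
{
    pthread_mutex_lock(&m->tree_lock);
    m->exiting = true;
    pthread_mutex_unlock(&m->tree_lock);
}

/*
 * Reclaim: remove one page and leave a shadow entry behind, unless the
 * address space is already exiting.  Returns true if a shadow entry
 * was installed.
 */
bool toy_reclaim_page(struct toy_aspace *m)
{
    bool installed = false;

    pthread_mutex_lock(&m->tree_lock);
    if (m->nrpages > 0) {
        m->nrpages--;             /* the real page leaves the tree */
        if (!m->exiting) {
            m->nrshadows++;       /* a shadow entry takes its slot */
            installed = true;
        }
    }
    pthread_mutex_unlock(&m->tree_lock);
    return installed;
}
```

Because both sides take the same lock, any reclaim that did not see the
flag has finished its tree update before the final truncate starts.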

Signed-off-by: Johannes Weiner 
---
 fs/block_dev.c  |  2 +-
 fs/inode.c  | 18 +-
 fs/nilfs2/inode.c   |  4 ++--
 include/linux/fs.h  |  1 +
 include/linux/pagemap.h | 13 -
 mm/filemap.c| 23 +++
 mm/truncate.c   |  7 ---
 mm/vmscan.c |  2 +-
 8 files changed, 57 insertions(+), 13 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 1e86823..391ffe5 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -83,7 +83,7 @@ void kill_bdev(struct block_device *bdev)
 {
struct address_space *mapping = bdev->bd_inode->i_mapping;
 
-   if (mapping->nrpages == 0)
+   if (mapping->nrpages == 0 && mapping->nrshadows == 0)
return;
 
invalidate_bh_lrus();
diff --git a/fs/inode.c b/fs/inode.c
index b33ba8e..7858fb7 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -503,6 +503,7 @@ void clear_inode(struct inode *inode)
 */
spin_lock_irq(&inode->i_data.tree_lock);
BUG_ON(inode->i_data.nrpages);
+   BUG_ON(inode->i_data.nrshadows);
spin_unlock_irq(&inode->i_data.tree_lock);
BUG_ON(!list_empty(&inode->i_data.private_list));
BUG_ON(!(inode->i_state & I_FREEING));
@@ -545,10 +546,25 @@ static void evict(struct inode *inode)
 */
inode_wait_for_writeback(inode);
 
+   /*
+* Page reclaim can not do iput() and thus can race with the
+* inode teardown.  Tell it when the address space is exiting,
+* so that it does not install eviction information after the
+* final truncate has begun.
+*
+* As truncation uses a lockless tree lookup, acquire the
+* spinlock to make sure any ongoing tree modification that
+* does not see AS_EXITING is completed before starting the
+* final truncate.
+*/
+   spin_lock_irq(&inode->i_data.tree_lock);
+   mapping_set_exiting(&inode->i_data);
+   spin_unlock_irq(&inode->i_data.tree_lock);
+
if (op->evict_inode) {
op->evict_inode(inode);
} else {
-   if (inode->i_data.nrpages)
+   if (inode->i_data.nrpages || inode->i_data.nrshadows)
truncate_inode_pages(&inode->i_data, 0);
clear_inode(inode);
}
diff --git a/fs/nilfs2/inode.c b/fs/nilfs2/inode.c
index 7e350c5..42fcbe3 100644
--- a/fs/nilfs2/inode.c
+++ b/fs/nilfs2/inode.c
@@ -783,7 +783,7 @@ void nilfs_evict_inode(struct inode *inode)
int ret;
 
if (inode->i_nlink || !ii->i_root || unlikely(is_bad_inode(inode))) {
-   if (inode->i_data.nrpages)
+   if (inode->i_data.nrpages || inode->i_data.nrshadows)
truncate_inode_pages(&inode->i_data, 0);
clear_inode(inode);
nilfs_clear_inode(inode);
@@ -791,7 +791,7 @@ void nilfs_evict_inode(struct inode *inode)
}
nilfs_transaction_begin(sb, &ti, 0); /* never fails */
 
-   if (inode->i_data.nrpages)
+   if (inode->i_data.nrpages || inode->i_data.nrshadows)
truncate_inode_pages(&inode->i_data, 0);
 
/* TODO: some of the following operations may fail.  */
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 3f40547..9bfa5a5 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -416,6 +416,7 @@ struct address_space {
struct mutex    i_mmap_mutex;   /* protect tree, count, list */
/* Protected by tree_lock together with the radix tree */
unsigned long   nrpages;    /* number of total pages */
+   unsigned long   nrshadows;  /* number of shadow entries */
pgoff_t writeback_index;/* writeback starts here */
const struct address_space_operations *a_ops;   /* methods */
unsigned long   flags;  /* error bits/gfp mask */
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index b6854b7..f132fdf 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -25,6 +25,7 @@ enum mapping_flags {
AS_MM_ALL_LOCKS = __GFP_BITS_SHIFT + 2, /* under mm_take_all_locks() */
AS_UNEVICTABLE  = __GFP_BITS_SHIFT + 3, /* e.g., ramdisk, SHM_LOCK */
AS_BALLOON_MAP  = __GFP_BITS_SHIFT + 4, /* balloon page special map */
+   AS_EXITING  = __GFP_BITS_SHIFT + 5, /* final truncate in progress */
 };
 
 static inline void mapping_set_error(struct address_space *mapping, 

[patch 9/9] mm: keep page cache radix tree nodes in check

2013-11-24 Thread Johannes Weiner
Previously, page cache radix tree nodes were freed after reclaim
emptied out their page pointers.  But now reclaim stores shadow
entries in their place, which are only reclaimed when the inodes
themselves are reclaimed.  This is problematic for bigger files that
are still in use after they have a significant amount of their cache
reclaimed, without any of those pages actually refaulting.  The shadow
entries will just sit there and waste memory.  In the worst case, the
shadow entries will accumulate until the machine runs out of memory.

To get this under control, the VM will track radix tree nodes
exclusively containing shadow entries on a per-NUMA node list.  A
simple shrinker will reclaim these nodes on memory pressure.

A few things need to be stored in the radix tree node to implement the
shadow node LRU and allow tree deletions coming from the list:

1. There is no index available that would describe the reverse path
   from the node up to the tree root, which is needed to perform a
   deletion.  To solve this, encode in each node its offset inside the
   parent.  This can be stored in the unused upper bits of the same
   member that stores the node's height at no extra space cost.

2. The number of shadow entries needs to be counted in addition to the
   regular entries, to quickly detect when the node is ready to go to
   the shadow node LRU list.  The current entry count is an unsigned
   int but the maximum number of entries is 64, so a shadow counter
   can easily be stored in the unused upper bits.

3. Tree modification needs the lock, which is located in the address
   space, so store a backpointer to it.  The parent pointer is in a
   union with the 2-word rcu_head, so the backpointer comes at no
   extra cost as well.

4. The node needs to be linked to an LRU list, which requires a list
   head inside the node.  This does increase the size of the node, but
   it does not change the number of objects that fit into a slab page.
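
Points 1 and 2 above amount to packing two small values into one
unsigned int.  A minimal sketch follows, assuming 64 slots per node as
stated in the text.  The helper names and the 8-bit split used for the
height field are hypothetical and chosen only for illustration; the
patch itself defines its own shifts and masks.

```c
#include <assert.h>

#define MAP_SHIFT    6                       /* 64 slots per node */
#define MAP_SIZE     (1U << MAP_SHIFT)
#define HEIGHT_SHIFT 8                       /* hypothetical split point */
#define HEIGHT_MASK  ((1U << HEIGHT_SHIFT) - 1)
#define COUNT_SHIFT  (MAP_SHIFT + 1)         /* counts span 0..64: 7 bits */
#define COUNT_MASK   ((1U << COUNT_SHIFT) - 1)

/* Point 1: height in the low bits, offset inside the parent above it. */
unsigned int pack_path(unsigned int height, unsigned int offset)
{
    assert(height <= HEIGHT_MASK && offset < MAP_SIZE);
    return (offset << HEIGHT_SHIFT) | height;
}

unsigned int path_height(unsigned int path) { return path & HEIGHT_MASK; }
unsigned int path_offset(unsigned int path) { return path >> HEIGHT_SHIFT; }

/* Point 2: regular entries in the low bits, shadow entries above them. */
unsigned int pack_count(unsigned int entries, unsigned int shadows)
{
    assert(entries <= MAP_SIZE && shadows <= MAP_SIZE);
    return (shadows << COUNT_SHIFT) | entries;
}

unsigned int count_entries(unsigned int count) { return count & COUNT_MASK; }
unsigned int count_shadows(unsigned int count) { return count >> COUNT_SHIFT; }

/* A node holding only shadow entries is a candidate for the shadow LRU. */
int node_only_shadows(unsigned int count)
{
    return count_entries(count) == 0 && count_shadows(count) > 0;
}
```

Since a count of 64 needs seven bits, a six-bit field would not be
enough, which is why the sketch shifts by MAP_SHIFT + 1.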

Signed-off-by: Johannes Weiner 
---
 fs/super.c|   4 +-
 fs/xfs/xfs_buf.c  |   2 +-
 fs/xfs/xfs_qm.c   |   2 +-
 include/linux/list_lru.h  |   2 +-
 include/linux/radix-tree.h|  30 +++---
 include/linux/swap.h  |   1 +
 include/linux/vm_event_item.h |   1 +
 lib/radix-tree.c  |  36 +++-
 mm/filemap.c  |  70 
 mm/list_lru.c |   4 +-
 mm/truncate.c |  19 ++-
 mm/vmstat.c   |   2 +
 mm/workingset.c   | 124 ++
 13 files changed, 255 insertions(+), 42 deletions(-)

diff --git a/fs/super.c b/fs/super.c
index 0225c20..a958d52 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -196,9 +196,9 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags)
INIT_HLIST_BL_HEAD(&s->s_anon);
INIT_LIST_HEAD(&s->s_inodes);
 
-   if (list_lru_init(&s->s_dentry_lru))
+   if (list_lru_init(&s->s_dentry_lru, NULL))
goto err_out;
-   if (list_lru_init(&s->s_inode_lru))
+   if (list_lru_init(&s->s_inode_lru, NULL))
goto err_out_dentry_lru;
 
INIT_LIST_HEAD(&s->s_mounts);
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 2634700..c49cbce 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1670,7 +1670,7 @@ xfs_alloc_buftarg(
if (xfs_setsize_buftarg_early(btp, bdev))
goto error;
 
-   if (list_lru_init(&btp->bt_lru))
+   if (list_lru_init(&btp->bt_lru, NULL))
goto error;
 
btp->bt_shrinker.count_objects = xfs_buftarg_shrink_count;
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index 3e6c2e6..57d6aa9 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -831,7 +831,7 @@ xfs_qm_init_quotainfo(
 
qinf = mp->m_quotainfo = kmem_zalloc(sizeof(xfs_quotainfo_t), KM_SLEEP);
 
-   if ((error = list_lru_init(&qinf->qi_lru))) {
+   if ((error = list_lru_init(&qinf->qi_lru, NULL))) {
kmem_free(qinf);
mp->m_quotainfo = NULL;
return error;
diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h
index 3ce5417..b970a45 100644
--- a/include/linux/list_lru.h
+++ b/include/linux/list_lru.h
@@ -32,7 +32,7 @@ struct list_lru {
 };
 
 void list_lru_destroy(struct list_lru *lru);
-int list_lru_init(struct list_lru *lru);
+int list_lru_init(struct list_lru *lru, struct lock_class_key *key);
 
 /**
  * list_lru_add: add an element to the lru list's tail
diff --git a/include/linux/radix-tree.h b/include/linux/radix-tree.h
index 13636c4..29df11f 100644
--- a/include/linux/radix-tree.h
+++ b/include/linux/radix-tree.h
@@ -72,21 +72,35 @@ static inline int radix_tree_is_indirect_ptr(void *ptr)
 #define RADIX_TREE_TAG_LONGS   \
((RADIX_TREE_MAP_SIZE + BITS_PER_LONG - 1) / BITS_PER_LONG)
 
+#define RADIX_TREE_INDEX_BITS  (8 /* CHAR_BIT */ * 

[patch 8/9] lib: radix_tree: tree node interface

2013-11-24 Thread Johannes Weiner
Make struct radix_tree_node part of the public interface and provide
API functions to create, look up, and delete whole nodes.  Refactor
the existing insert, look up, delete functions on top of these new
node primitives.

This will allow the VM to track and garbage collect page cache radix
tree nodes.
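
As background for the constants this patch moves into the header, the
sketch below shows how an index maps to a slot at each tree level.  The
macro names and definitions follow the patch (for the non-BASE_SMALL
kernel case), but DIV_ROUND_UP is defined locally and slot_offset() is a
hypothetical helper for illustration, not part of the radix tree API.

```c
#include <assert.h>
#include <limits.h>

#define RADIX_TREE_MAP_SHIFT  6
#define RADIX_TREE_MAP_SIZE   (1UL << RADIX_TREE_MAP_SHIFT)
#define RADIX_TREE_MAP_MASK   (RADIX_TREE_MAP_SIZE - 1)

#define DIV_ROUND_UP(n, d)    (((n) + (d) - 1) / (d))
#define RADIX_TREE_INDEX_BITS (CHAR_BIT * sizeof(unsigned long))
#define RADIX_TREE_MAX_PATH   (DIV_ROUND_UP(RADIX_TREE_INDEX_BITS, \
                                            RADIX_TREE_MAP_SHIFT))

/*
 * Hypothetical helper: slot offset inside the node at the given level
 * (0 is the level closest to the leaves) for a given index.
 */
unsigned long slot_offset(unsigned long index, unsigned int level)
{
    assert(level < RADIX_TREE_MAX_PATH);
    return (index >> (level * RADIX_TREE_MAP_SHIFT)) & RADIX_TREE_MAP_MASK;
}
```

Each level consumes RADIX_TREE_MAP_SHIFT bits of the index, so
RADIX_TREE_MAX_PATH levels are always enough to cover a full unsigned
long.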

Signed-off-by: Johannes Weiner 
---
 include/linux/radix-tree.h |  34 ++
 lib/radix-tree.c   | 261 +
 2 files changed, 180 insertions(+), 115 deletions(-)

diff --git a/include/linux/radix-tree.h b/include/linux/radix-tree.h
index e8be53e..13636c4 100644
--- a/include/linux/radix-tree.h
+++ b/include/linux/radix-tree.h
@@ -60,6 +60,33 @@ static inline int radix_tree_is_indirect_ptr(void *ptr)
 
 #define RADIX_TREE_MAX_TAGS 3
 
+#ifdef __KERNEL__
+#define RADIX_TREE_MAP_SHIFT   (CONFIG_BASE_SMALL ? 4 : 6)
+#else
+#define RADIX_TREE_MAP_SHIFT   3   /* For more stressful testing */
+#endif
+
+#define RADIX_TREE_MAP_SIZE    (1UL << RADIX_TREE_MAP_SHIFT)
+#define RADIX_TREE_MAP_MASK    (RADIX_TREE_MAP_SIZE-1)
+
+#define RADIX_TREE_TAG_LONGS   \
+   ((RADIX_TREE_MAP_SIZE + BITS_PER_LONG - 1) / BITS_PER_LONG)
+
+struct radix_tree_node {
+   unsigned int    height; /* Height from the bottom */
+   unsigned int    count;
+   union {
+   struct radix_tree_node *parent; /* Used when ascending tree */
+   struct rcu_head rcu_head;   /* Used when freeing node */
+   };
+   void __rcu  *slots[RADIX_TREE_MAP_SIZE];
+   unsigned long   tags[RADIX_TREE_MAX_TAGS][RADIX_TREE_TAG_LONGS];
+};
+
+#define RADIX_TREE_INDEX_BITS  (8 /* CHAR_BIT */ * sizeof(unsigned long))
+#define RADIX_TREE_MAX_PATH (DIV_ROUND_UP(RADIX_TREE_INDEX_BITS, \
+ RADIX_TREE_MAP_SHIFT))
+
 /* root tags are stored in gfp_mask, shifted by __GFP_BITS_SHIFT */
 struct radix_tree_root {
unsigned intheight;
@@ -101,6 +128,7 @@ do { \
  *   concurrently with other readers.
  *
  * The notable exceptions to this rule are the following functions:
+ * __radix_tree_lookup
  * radix_tree_lookup
  * radix_tree_lookup_slot
  * radix_tree_tag_get
@@ -216,9 +244,15 @@ static inline void radix_tree_replace_slot(void **pslot, void *item)
rcu_assign_pointer(*pslot, item);
 }
 
+int __radix_tree_create(struct radix_tree_root *root, unsigned long index,
+   struct radix_tree_node **nodep, void ***slotp);
 int radix_tree_insert(struct radix_tree_root *, unsigned long, void *);
+void *__radix_tree_lookup(struct radix_tree_root *root, unsigned long index,
+ struct radix_tree_node **nodep, void ***slotp);
 void *radix_tree_lookup(struct radix_tree_root *, unsigned long);
 void **radix_tree_lookup_slot(struct radix_tree_root *, unsigned long);
+bool __radix_tree_delete_node(struct radix_tree_root *root, unsigned long index,
+ struct radix_tree_node *node);
 void *radix_tree_delete_item(struct radix_tree_root *, unsigned long, void *);
 void *radix_tree_delete(struct radix_tree_root *, unsigned long);
 unsigned int
diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index e8adb5d..e601c56 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -35,33 +35,6 @@
 #include  /* in_interrupt() */
 
 
-#ifdef __KERNEL__
-#define RADIX_TREE_MAP_SHIFT   (CONFIG_BASE_SMALL ? 4 : 6)
-#else
-#define RADIX_TREE_MAP_SHIFT   3   /* For more stressful testing */
-#endif
-
-#define RADIX_TREE_MAP_SIZE(1UL << RADIX_TREE_MAP_SHIFT)
-#define RADIX_TREE_MAP_MASK(RADIX_TREE_MAP_SIZE-1)
-
-#define RADIX_TREE_TAG_LONGS   \
-   ((RADIX_TREE_MAP_SIZE + BITS_PER_LONG - 1) / BITS_PER_LONG)
-
-struct radix_tree_node {
-   unsigned intheight; /* Height from the bottom */
-   unsigned intcount;
-   union {
-   struct radix_tree_node *parent; /* Used when ascending tree */
-   struct rcu_head rcu_head;   /* Used when freeing node */
-   };
-   void __rcu  *slots[RADIX_TREE_MAP_SIZE];
-   unsigned long   tags[RADIX_TREE_MAX_TAGS][RADIX_TREE_TAG_LONGS];
-};
-
-#define RADIX_TREE_INDEX_BITS  (8 /* CHAR_BIT */ * sizeof(unsigned long))
-#define RADIX_TREE_MAX_PATH (DIV_ROUND_UP(RADIX_TREE_INDEX_BITS, \
- RADIX_TREE_MAP_SHIFT))
-
 /*
  * The height_to_maxindex array needs to be one deeper than the maximum
  * path as height 0 holds only 1 entry.
@@ -387,23 +360,28 @@ out:
 }
 
 /**
- * radix_tree_insert-insert into a radix tree
+ * __radix_tree_create -   create a slot in a radix tree
  * @root:  radix tree root
  * @index: index key
- * @item:  item to insert
+ * @nodep: returns node
+ * @slotp: returns slot
  *
- * Insert an item into the radix tree at position 

[patch 2/9] lib: radix-tree: radix_tree_delete_item()

2013-11-24 Thread Johannes Weiner
Provide a function that does not just delete an entry at a given
index, but also allows passing in an expected item.  Delete only if
that item is still located at the specified index.

This is handy when lockless tree traversals want to delete entries as
well because they don't have to do a second, locked lookup to verify
the slot has not changed under them before deleting the entry.
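The "delete only if the expected item is still there" semantics described
above can be illustrated with a small userspace sketch.  This is not the
kernel implementation: struct toy_tree, toy_delete_item() and toy_delete()
are hypothetical names, and a flat array stands in for the radix tree.  As
in the patch, a NULL expected item means "delete whatever is there".

```c
#include <stddef.h>

#define TOY_SLOTS 8

/* Hypothetical stand-in for a radix tree: a flat array of slots. */
struct toy_tree {
	void *slots[TOY_SLOTS];
};

/*
 * Delete the entry at @index.  If @expected is non-NULL, delete only
 * when the slot still holds @expected.  Returns the deleted entry, or
 * NULL if the slot was empty or held something else.
 */
static void *toy_delete_item(struct toy_tree *t, unsigned long index,
			     void *expected)
{
	void *entry;

	if (index >= TOY_SLOTS)
		return NULL;

	entry = t->slots[index];
	if (!entry)
		return NULL;

	/* Verification and deletion happen in the same lookup. */
	if (expected && entry != expected)
		return NULL;

	t->slots[index] = NULL;
	return entry;
}

/* Unconditional delete, expressed on top of the conditional one. */
static void *toy_delete(struct toy_tree *t, unsigned long index)
{
	return toy_delete_item(t, index, NULL);
}
```

A caller that found an entry during a lockless walk would pass that entry
as @expected and compare the return value against it; a mismatch means the
slot changed in the meantime and nothing was removed.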

Signed-off-by: Johannes Weiner 
---
 include/linux/radix-tree.h |  1 +
 lib/radix-tree.c   | 31 +++
 2 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/include/linux/radix-tree.h b/include/linux/radix-tree.h
index 4039407..1bf0a9c 100644
--- a/include/linux/radix-tree.h
+++ b/include/linux/radix-tree.h
@@ -219,6 +219,7 @@ static inline void radix_tree_replace_slot(void **pslot, void *item)
 int radix_tree_insert(struct radix_tree_root *, unsigned long, void *);
 void *radix_tree_lookup(struct radix_tree_root *, unsigned long);
 void **radix_tree_lookup_slot(struct radix_tree_root *, unsigned long);
+void *radix_tree_delete_item(struct radix_tree_root *, unsigned long, void *);
 void *radix_tree_delete(struct radix_tree_root *, unsigned long);
 unsigned int
 radix_tree_gang_lookup(struct radix_tree_root *root, void **results,
diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index 7811ed3..f442e32 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -1335,15 +1335,18 @@ static inline void radix_tree_shrink(struct radix_tree_root *root)
 }
 
 /**
- * radix_tree_delete-delete an item from a radix tree
+ * radix_tree_delete_item-delete an item from a radix tree
  * @root:  radix tree root
  * @index: index key
+ * @item:  expected item
  *
- * Remove the item at @index from the radix tree rooted at @root.
+ * Remove @item at @index from the radix tree rooted at @root.
  *
- * Returns the address of the deleted item, or NULL if it was not present.
+ * Returns the address of the deleted item, or NULL if it was not present
+ * or the entry at the given @index was not @item.
  */
-void *radix_tree_delete(struct radix_tree_root *root, unsigned long index)
+void *radix_tree_delete_item(struct radix_tree_root *root,
+unsigned long index, void *item)
 {
struct radix_tree_node *node = NULL;
struct radix_tree_node *slot = NULL;
@@ -1378,6 +1381,11 @@ void *radix_tree_delete(struct radix_tree_root *root, unsigned long index)
if (slot == NULL)
goto out;
 
+   if (item && slot != item) {
+   slot = NULL;
+   goto out;
+   }
+
/*
 * Clear all tags associated with the item to be deleted.
 * This way of doing it would be inefficient, but seldom is any set.
@@ -1422,6 +1430,21 @@ void *radix_tree_delete(struct radix_tree_root *root, unsigned long index)
 out:
return slot;
 }
+EXPORT_SYMBOL(radix_tree_delete_item);
+
+/**
+ * radix_tree_delete-delete an item from a radix tree
+ * @root:  radix tree root
+ * @index: index key
+ *
+ * Remove the item at @index from the radix tree rooted at @root.
+ *
+ * Returns the address of the deleted item, or NULL if it was not present.
+ */
+void *radix_tree_delete(struct radix_tree_root *root, unsigned long index)
+{
+   return radix_tree_delete_item(root, index, NULL);
+}
 EXPORT_SYMBOL(radix_tree_delete);
 
 /**
-- 
1.8.4.2



[patch 5/9] mm + fs: prepare for non-page entries in page cache radix trees

2013-11-24 Thread Johannes Weiner
shmem mappings already contain exceptional entries where swap slot
information is remembered.

To be able to store eviction information for regular page cache,
prepare every site dealing with the radix trees directly to handle
entries other than pages.

The common lookup functions will filter out non-page entries and
return NULL for page cache holes, just as before.  But provide a raw
version of the API which returns non-page entries as well, and switch
shmem over to use it.
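A rough userspace sketch of the split between the filtering lookup and the
raw lookup follows.  It assumes that non-page ("exceptional") entries are
distinguished from page pointers by a low tag bit; the bit value, the toy_
names and the flat slot array are illustrative assumptions, not the code in
this patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed tag bit marking a slot value as "not a page pointer". */
#define TOY_EXCEPTIONAL_BIT 2UL

static int toy_is_exceptional(const void *entry)
{
	return ((uintptr_t)entry & TOY_EXCEPTIONAL_BIT) != 0;
}

/* Encode a small payload (e.g. eviction info) as a tagged entry. */
static void *toy_make_exceptional(unsigned long payload)
{
	return (void *)(uintptr_t)((payload << 2) | TOY_EXCEPTIONAL_BIT);
}

/* Raw lookup: returns whatever is in the slot, tagged or not. */
static void *toy_lookup_raw(void *const *slots, size_t nr, size_t index)
{
	return index < nr ? slots[index] : NULL;
}

/* Common lookup: tagged entries look like holes to the caller. */
static void *toy_lookup_page(void *const *slots, size_t nr, size_t index)
{
	void *entry = toy_lookup_raw(slots, nr, index);

	return toy_is_exceptional(entry) ? NULL : entry;
}
```

Callers that only care about pages keep using the filtering variant and
never see the tagged values; only code that understands them, such as
shmem, would call the raw variant.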

Signed-off-by: Johannes Weiner 
---
 fs/btrfs/compression.c   |   2 +-
 include/linux/mm.h   |   8 ++
 include/linux/pagemap.h  |  15 ++--
 include/linux/pagevec.h  |   3 +
 include/linux/shmem_fs.h |   1 +
 mm/filemap.c | 190 +--
 mm/mincore.c |  20 +++--
 mm/readahead.c   |   2 +-
 mm/shmem.c   |  97 +---
 mm/swap.c|  47 
 mm/truncate.c|  73 ++
 11 files changed, 331 insertions(+), 127 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 6aad98c..c883165 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -474,7 +474,7 @@ static noinline int add_ra_bio_pages(struct inode *inode,
rcu_read_lock();
page = radix_tree_lookup(>page_tree, pg_index);
rcu_read_unlock();
-   if (page) {
+   if (page && !radix_tree_exceptional_entry(page)) {
misses++;
if (misses > 4)
break;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8b6e55e..c09ef3a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -906,6 +906,14 @@ extern void show_free_areas(unsigned int flags);
 extern bool skip_free_areas_node(unsigned int flags, int nid);
 
 int shmem_zero_setup(struct vm_area_struct *);
+#ifdef CONFIG_SHMEM
+bool shmem_mapping(struct address_space *mapping);
+#else
+static inline bool shmem_mapping(struct address_space *mapping)
+{
+   return false;
+}
+#endif
 
 extern int can_do_mlock(void);
 extern int user_shm_lock(size_t, struct user_struct *);
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index c73130c..b6854b7 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -248,12 +248,15 @@ pgoff_t page_cache_next_hole(struct address_space *mapping,
 pgoff_t page_cache_prev_hole(struct address_space *mapping,
 pgoff_t index, unsigned long max_scan);
 
-extern struct page * find_get_page(struct address_space *mapping,
-   pgoff_t index);
-extern struct page * find_lock_page(struct address_space *mapping,
-   pgoff_t index);
-extern struct page * find_or_create_page(struct address_space *mapping,
-   pgoff_t index, gfp_t gfp_mask);
+struct page *__find_get_page(struct address_space *mapping, pgoff_t offset);
+struct page *find_get_page(struct address_space *mapping, pgoff_t offset);
+struct page *__find_lock_page(struct address_space *mapping, pgoff_t offset);
+struct page *find_lock_page(struct address_space *mapping, pgoff_t offset);
+struct page *find_or_create_page(struct address_space *mapping, pgoff_t index,
+gfp_t gfp_mask);
+unsigned __find_get_pages(struct address_space *mapping, pgoff_t start,
+ unsigned int nr_pages, struct page **pages,
+ pgoff_t *indices);
 unsigned find_get_pages(struct address_space *mapping, pgoff_t start,
unsigned int nr_pages, struct page **pages);
 unsigned find_get_pages_contig(struct address_space *mapping, pgoff_t start,
diff --git a/include/linux/pagevec.h b/include/linux/pagevec.h
index e4dbfab..3c6b8b1 100644
--- a/include/linux/pagevec.h
+++ b/include/linux/pagevec.h
@@ -22,6 +22,9 @@ struct pagevec {
 
 void __pagevec_release(struct pagevec *pvec);
 void __pagevec_lru_add(struct pagevec *pvec);
+unsigned __pagevec_lookup(struct pagevec *pvec, struct address_space *mapping,
+ pgoff_t start, unsigned nr_pages, pgoff_t *indices);
+void pagevec_remove_exceptionals(struct pagevec *pvec);
 unsigned pagevec_lookup(struct pagevec *pvec, struct address_space *mapping,
pgoff_t start, unsigned nr_pages);
 unsigned pagevec_lookup_tag(struct pagevec *pvec,
diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index 30aa0dc..deb4960 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -49,6 +49,7 @@ extern struct file *shmem_file_setup(const char *name,
loff_t size, unsigned long flags);
 extern int shmem_zero_setup(struct vm_area_struct *);
 extern int shmem_lock(struct file *file, int lock, struct user_struct *user);
+extern bool shmem_mapping(struct address_space *mapping);
 extern void 

[patch 7/9] mm: thrash detection-based file cache sizing

2013-11-24 Thread Johannes Weiner
The VM maintains cached filesystem pages on two types of lists.  One
list holds the pages recently faulted into the cache, the other list
holds pages that have been referenced repeatedly on that first list.
The idea is to prefer reclaiming young pages over those that have
shown to benefit from caching in the past.  We call the recently used
list "inactive list" and the frequently used list "active list".

Currently, the VM aims for a 1:1 ratio between the lists, which is the
"perfect" trade-off between the ability to *protect* frequently used
pages and the ability to *detect* frequently used pages.  This means
that working set changes bigger than half of cache memory go
undetected and thrash indefinitely, whereas working sets bigger than
half of cache memory are unprotected against used-once streams that
don't even need caching.

Historically, every reclaim scan of the inactive list also took a
smaller number of pages from the tail of the active list and moved
them to the head of the inactive list.  This model gave established
working sets more grace time in the face of temporary use-once streams,
but ultimately was not significantly better than a FIFO policy and
still thrashed cache based on eviction speed, rather than actual
demand for cache.

This patch solves one half of the problem by decoupling the ability to
detect working set changes from the inactive list size.  By
maintaining a history of recently evicted file pages it can detect
frequently used pages with an arbitrarily small inactive list size,
and subsequently apply pressure on the active list based on actual
demand for cache, not just overall eviction speed.

Every zone maintains a counter that tracks inactive list aging speed.
When a page is evicted, a snapshot of this counter is stored in the
now-empty page cache radix tree slot.  On refault, the minimum access
distance of the page can be assessed, to evaluate whether the page
should be part of the active list or not.
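The counter-snapshot idea can be sketched in userspace as follows.  The
struct, the toy_ function names, and in particular the activation rule
(activate when the distance does not exceed the active list size) are
assumptions made for illustration; the description above only says that
the distance is used to decide whether the page belongs on the active list.

```c
#include <stdbool.h>

/* Hypothetical model of one zone; field names are illustrative. */
struct toy_zone {
	unsigned long inactive_age;	/* advances as the inactive list ages */
	unsigned long nr_active;	/* pages currently on the active list */
};

/*
 * Eviction: advance the aging counter and return a snapshot of it.
 * The snapshot is what would be left behind in the emptied slot.
 */
static unsigned long toy_eviction(struct toy_zone *z)
{
	return ++z->inactive_age;
}

/*
 * Refault: how far the counter moved between eviction and refault.
 * Unsigned subtraction keeps this correct across counter wraparound.
 */
static unsigned long toy_refault_distance(const struct toy_zone *z,
					  unsigned long shadow)
{
	return z->inactive_age - shadow;
}

/* Assumed policy: a short enough distance means "activate the page". */
static bool toy_refault(const struct toy_zone *z, unsigned long shadow)
{
	return toy_refault_distance(z, shadow) <= z->nr_active;
}
```

With nr_active = 4, a page that refaults after two further evictions would
be activated under this rule, while one that refaults after twelve would
not.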

This fixes the VM's blindness towards working set changes in excess of
the inactive list.  And it's the foundation to further improve the
protection ability and reduce the minimum inactive list size of 50%.

Signed-off-by: Johannes Weiner 
---
 include/linux/mmzone.h |   5 +
 include/linux/swap.h   |   5 +
 mm/Makefile|   2 +-
 mm/filemap.c   |  61 
 mm/swap.c  |   2 +
 mm/vmscan.c|  24 -
 mm/vmstat.c|   2 +
 mm/workingset.c| 253 +
 8 files changed, 331 insertions(+), 23 deletions(-)
 create mode 100644 mm/workingset.c

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index bd791e4..118ba9f 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -142,6 +142,8 @@ enum zone_stat_item {
NUMA_LOCAL, /* allocation from local node */
NUMA_OTHER, /* allocation from other node */
 #endif
+   WORKINGSET_REFAULT,
+   WORKINGSET_ACTIVATE,
NR_ANON_TRANSPARENT_HUGEPAGES,
NR_FREE_CMA_PAGES,
NR_VM_ZONE_STAT_ITEMS };
@@ -392,6 +394,9 @@ struct zone {
spinlock_t  lru_lock;
struct lruvec   lruvec;
 
+   /* Evictions & activations on the inactive file list */
+   atomic_long_t   inactive_age;
+
unsigned long   pages_scanned; /* since last reclaim */
unsigned long   flags; /* zone flags, see below */
 
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 46ba0c6..b83cf61 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -260,6 +260,11 @@ struct swap_list_t {
int next;   /* swapfile to be used next */
 };
 
+/* linux/mm/workingset.c */
+void *workingset_eviction(struct address_space *mapping, struct page *page);
+bool workingset_refault(void *shadow);
+void workingset_activation(struct page *page);
+
 /* linux/mm/page_alloc.c */
 extern unsigned long totalram_pages;
 extern unsigned long totalreserve_pages;
diff --git a/mm/Makefile b/mm/Makefile
index 305d10a..b30aeb8 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -17,7 +17,7 @@ obj-y := filemap.o mempool.o oom_kill.o fadvise.o \
   util.o mmzone.o vmstat.o backing-dev.o \
   mm_init.o mmu_context.o percpu.o slab_common.o \
   compaction.o balloon_compaction.o \
-  interval_tree.o list_lru.o $(mmu-y)
+  interval_tree.o list_lru.o workingset.o $(mmu-y)
 
 obj-y += init-mm.o
 
diff --git a/mm/filemap.c b/mm/filemap.c
index 9761f6a..30a74be 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -461,7 +461,7 @@ int replace_page_cache_page(struct page *old, struct page *new, gfp_t gfp_mask)
 EXPORT_SYMBOL_GPL(replace_page_cache_page);
 
 static int page_cache_tree_insert(struct address_space *mapping,
- struct page *page)
+ 

[patch 3/9] mm: shmem: save one radix tree lookup when truncating swapped pages

2013-11-24 Thread Johannes Weiner
Page cache radix tree slots are usually stabilized by the page lock,
but shmem's swap cookies have no such thing.  Because the overall
truncation loop is lockless, the swap entry is currently confirmed by
a tree lookup and then deleted by another tree lookup under the same
tree lock region.

Use radix_tree_delete_item() instead, which does the verification and
deletion with only one lookup.  This also allows removing the
delete-only special case from shmem_radix_tree_replace().

Signed-off-by: Johannes Weiner 
---
 mm/shmem.c | 25 -
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 8297623..7c67249 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -242,19 +242,17 @@ static int shmem_radix_tree_replace(struct address_space *mapping,
pgoff_t index, void *expected, void *replacement)
 {
void **pslot;
-   void *item = NULL;
+   void *item;
 
VM_BUG_ON(!expected);
+   VM_BUG_ON(!replacement);
pslot = radix_tree_lookup_slot(&mapping->page_tree, index);
-   if (pslot)
-   item = radix_tree_deref_slot_protected(pslot,
-   &mapping->tree_lock);
+   if (!pslot)
+   return -ENOENT;
+   item = radix_tree_deref_slot_protected(pslot, &mapping->tree_lock);
if (item != expected)
return -ENOENT;
-   if (replacement)
-   radix_tree_replace_slot(pslot, replacement);
-   else
-   radix_tree_delete(>page_tree, index);
+   radix_tree_replace_slot(pslot, replacement);
return 0;
 }
 
@@ -386,14 +384,15 @@ export:
 static int shmem_free_swap(struct address_space *mapping,
   pgoff_t index, void *radswap)
 {
-   int error;
+   void *old;
 
spin_lock_irq(&mapping->tree_lock);
-   error = shmem_radix_tree_replace(mapping, index, radswap, NULL);
+   old = radix_tree_delete_item(&mapping->page_tree, index, radswap);
spin_unlock_irq(&mapping->tree_lock);
-   if (!error)
-   free_swap_and_cache(radix_to_swp_entry(radswap));
-   return error;
+   if (old != radswap)
+   return -ENOENT;
+   free_swap_and_cache(radix_to_swp_entry(radswap));
+   return 0;
 }
 
 /*
-- 
1.8.4.2



[patch 1/9] fs: cachefiles: use add_to_page_cache_lru()

2013-11-24 Thread Johannes Weiner
This code used to have its own lru cache pagevec up until a0b8cab3
("mm: remove lru parameter from __pagevec_lru_add and remove parts of
pagevec API").  Now it's just add_to_page_cache() followed by
lru_cache_add(), so it might as well use add_to_page_cache_lru() directly.

Signed-off-by: Johannes Weiner 
---
 fs/cachefiles/rdwr.c | 33 +
 1 file changed, 13 insertions(+), 20 deletions(-)

diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
index ebaff36..4b1fb5c 100644
--- a/fs/cachefiles/rdwr.c
+++ b/fs/cachefiles/rdwr.c
@@ -265,24 +265,22 @@ static int cachefiles_read_backing_file_one(struct cachefiles_object *object,
goto nomem_monitor;
}
 
-   ret = add_to_page_cache(newpage, bmapping,
-   netpage->index, cachefiles_gfp);
+   ret = add_to_page_cache_lru(newpage, bmapping,
+   netpage->index, cachefiles_gfp);
if (ret == 0)
goto installed_new_backing_page;
if (ret != -EEXIST)
goto nomem_page;
}
 
-   /* we've installed a new backing page, so now we need to add it
-* to the LRU list and start it reading */
+   /* we've installed a new backing page, so now we need to start
+* it reading */
 installed_new_backing_page:
_debug("- new %p", newpage);
 
backpage = newpage;
newpage = NULL;
 
-   lru_cache_add_file(backpage);
-
 read_backing_page:
ret = bmapping->a_ops->readpage(NULL, backpage);
if (ret < 0)
@@ -510,24 +508,23 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
goto nomem;
}
 
-   ret = add_to_page_cache(newpage, bmapping,
-   netpage->index, cachefiles_gfp);
+   ret = add_to_page_cache_lru(newpage, bmapping,
+   netpage->index,
+   cachefiles_gfp);
if (ret == 0)
goto installed_new_backing_page;
if (ret != -EEXIST)
goto nomem;
}
 
-   /* we've installed a new backing page, so now we need to add it
-* to the LRU list and start it reading */
+   /* we've installed a new backing page, so now we need
+* to start it reading */
installed_new_backing_page:
_debug("- new %p", newpage);
 
backpage = newpage;
newpage = NULL;
 
-   lru_cache_add_file(backpage);
-
reread_backing_page:
ret = bmapping->a_ops->readpage(NULL, backpage);
if (ret < 0)
@@ -538,8 +535,8 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
monitor_backing_page:
_debug("- monitor add");
 
-   ret = add_to_page_cache(netpage, op->mapping, netpage->index,
-   cachefiles_gfp);
+   ret = add_to_page_cache_lru(netpage, op->mapping,
+   netpage->index, cachefiles_gfp);
if (ret < 0) {
if (ret == -EEXIST) {
page_cache_release(netpage);
@@ -549,8 +546,6 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
goto nomem;
}
 
-   lru_cache_add_file(netpage);
-
/* install a monitor */
page_cache_get(netpage);
monitor->netfs_page = netpage;
@@ -613,8 +608,8 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
backing_page_already_uptodate:
_debug("- uptodate");
 
-   ret = add_to_page_cache(netpage, op->mapping, netpage->index,
-   cachefiles_gfp);
+   ret = add_to_page_cache_lru(netpage, op->mapping,
+   netpage->index, cachefiles_gfp);
if (ret < 0) {
if (ret == -EEXIST) {
page_cache_release(netpage);
@@ -631,8 +626,6 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
 
fscache_mark_page_cached(op, netpage);
 
-   lru_cache_add_file(netpage);
-
/* the netpage is unlocked and marked up to date here */
fscache_end_io(op, netpage, 0);
page_cache_release(netpage);
-- 
1.8.4.2


[PATCH 1/1] rsxx: Reset the pcie slot of adapter trains incorrectly.

2013-11-24 Thread Philip J. Kelleher
From: Philip J Kelleher 

This patch contains a software workaround for a firmware bug that
can cause the pcie adapter to train to a width below the desired
width of x8.

It will reset the adapter 3 times before the driver gives up and
informs the user that the link width has been trained to something
unexpected.
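The check-and-retry behaviour described above can be sketched in userspace
like this.  The negotiated link width is taken from bits 9:4 of the PCIe
Link Status register; everything else (the TOY_ constants, the function
names, and the array of simulated register readings that replaces real
config-space reads and slot resets) is a hypothetical illustration that
follows the description rather than the driver code.

```c
#include <stdint.h>

#define TOY_LNKSTA_NLW		0x03f0	/* Negotiated Link Width, bits 9:4 */
#define TOY_LNKSTA_NLW_SHIFT	4
#define TOY_EXPECTED_WIDTH	8
#define TOY_MAX_RETRAIN		3

/* Extract the negotiated link width from a Link Status value. */
static unsigned int toy_link_width(uint16_t lnksta)
{
	return (lnksta & TOY_LNKSTA_NLW) >> TOY_LNKSTA_NLW_SHIFT;
}

/*
 * @lnksta_after_reset[i] is the simulated Link Status value observed
 * after i resets; it must hold TOY_MAX_RETRAIN + 1 entries.
 *
 * Returns the number of resets performed and stores the last observed
 * width in @width.  The caller treats *width != TOY_EXPECTED_WIDTH as
 * "gave up, warn the user".
 */
static unsigned int toy_verify_link_width(const uint16_t *lnksta_after_reset,
					  unsigned int *width)
{
	unsigned int resets = 0;

	for (;;) {
		*width = toy_link_width(lnksta_after_reset[resets]);
		if (*width == TOY_EXPECTED_WIDTH || resets == TOY_MAX_RETRAIN)
			return resets;
		resets++;	/* a real driver would reset the slot here */
	}
}
```

For example, a Link Status value of 0x0041 decodes to x4 and 0x0081 to x8,
so a device that reads 0x0041 twice and then 0x0081 would be accepted
after two resets.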

Signed-off-by: Philip J Kelleher 



diff --git a/drivers/block/rsxx/core.c b/drivers/block/rsxx/core.c
index a8de2ee..d15b678 100644
--- a/drivers/block/rsxx/core.c
+++ b/drivers/block/rsxx/core.c
@@ -42,6 +42,8 @@
 
 #define NO_LEGACY 0
 #define SYNC_START_TIMEOUT (10 * 60) /* 10 minutes */
+#define EXPECTED_LINK_WIDTH 8
+#define MAX_RETRAIN_CNT 3
 
 MODULE_DESCRIPTION("IBM Flash Adapter 900GB Full Height Device Driver");
 MODULE_AUTHOR("Joshua Morris/Philip Kelleher, IBM");
@@ -600,6 +602,50 @@ static int card_shutdown(struct rsxx_cardinfo *card)
return 0;
 }
 
+static void rsxx_reset_slot(struct rsxx_cardinfo *card)
+{
+   if (card->retrain_cnt >= MAX_RETRAIN_CNT) {
+   dev_warn(CARD_TO_DEV(card), "Failed to train the adapter to x%d "
+   "(is x%d), performance degradation "
+   "possible", EXPECTED_LINK_WIDTH,
+   card->link_width);
+   return;
+   }
+
+   pci_cfg_access_lock(card->dev);
+   pci_set_pcie_reset_state(card->dev, pcie_warm_reset);
+   msleep(500);
+   pci_set_pcie_reset_state(card->dev, pcie_deassert_reset);
+   msleep(2000);
+   pci_cfg_access_unlock(card->dev);
+
+}
+
+static void rsxx_verify_link_width(struct rsxx_cardinfo *card)
+{
+   int pos;
+   u16 reg16;
+
+   card->retrain_cnt = 0;
+
+   do {
+   pos = pci_find_capability(card->dev, PCI_CAP_ID_EXP);
+   pci_read_config_word(card->dev,
+pos + PCI_EXP_LNKSTA,
+&reg16);
+   card->link_width = (reg16 & PCI_EXP_LNKSTA_NLW) >> 4;
+
+   if (card->link_width != EXPECTED_LINK_WIDTH) {
+   card->retrain_cnt++;
+   rsxx_reset_slot(card);
+   } else {
+   pci_restore_state(card->dev);
+   break;
+   }
+
+   } while (card->retrain_cnt < MAX_RETRAIN_CNT);
+}
+
 static int rsxx_eeh_frozen(struct pci_dev *dev)
 {
struct rsxx_cardinfo *card = pci_get_drvdata(dev);
@@ -723,6 +769,8 @@ static pci_ers_result_t rsxx_slot_reset(struct pci_dev *dev)
dev_warn(&dev->dev,
"IBM Flash Adapter PCI: recovering from slot reset.\n");
 
+   rsxx_verify_link_width(card);
+
st = pci_enable_device(dev);
if (st)
goto failed_hw_setup;
@@ -837,6 +885,14 @@ static int rsxx_pci_probe(struct pci_dev *dev,
if (st)
goto failed_ida_get;
 
+   st = pci_save_state(dev);
+   if (st) {
+   dev_err(CARD_TO_DEV(card), "Failed to save PCI config space\n");
+   goto failed_enable;
+   }
+
+   rsxx_verify_link_width(card);
+
st = pci_enable_device(dev);
if (st)
goto failed_enable;
diff --git a/drivers/block/rsxx/rsxx_priv.h b/drivers/block/rsxx/rsxx_priv.h
index 6bbc64d..348dbbb 100644
--- a/drivers/block/rsxx/rsxx_priv.h
+++ b/drivers/block/rsxx/rsxx_priv.h
@@ -52,7 +52,7 @@ struct proc_cmd;
 #define RS70_PCI_REV_SUPPORTED 4
 
 #define DRIVER_NAME "rsxx"
-#define DRIVER_VERSION "4.0.3.2516"
+#define DRIVER_VERSION "4.0.4"
 
 /* Block size is 4096 */
 #define RSXX_HW_BLK_SHIFT  12
@@ -122,6 +122,8 @@ struct rsxx_cardinfo {
struct pci_dev  *dev;
unsigned inthalt;
unsigned inteeh_state;
+   unsigned short  link_width;
+   unsigned intretrain_cnt;
 
void__iomem *regmap;
spinlock_t  irq_lock;



Re: kernel BUG at drivers/md/raid5.c:693!

2013-11-24 Thread NeilBrown
On Sat, 23 Nov 2013 11:22:05 +0800 fengguang...@intel.com wrote:

> Shaohua,
> 
> FYI, we are still seeing this bug.. dmesg attached.

Thanks for the report.  However the dmesg you attached doesn't mention:

 kernel BUG at drivers/md/raid5.c:693!

at all.  It is quite different.

The "BUG" it reports is that a spinlock is blocking for too long - over
99msecs.
As far as I can tell, this is a scheduling issue, possibly specific to qemu,
rather than a bug in raid5.

md/raid5 is calling md_wakeup_thread while holding a spinlock, and that seems
to get lost in the scheduler for long enough that a different thread which
also wants the spinlock starts to complain.
It has always been safe to call 'wake_up' from under a spinlock before.

Ingo/Peter: is it considered OK to call wake_up while holding a spinlock?
Could "sleeping spinlocks" affect this at all? (some sample stack traces are
below).

Thanks,
NeilBrown

> 
> 566c09c53455d7c4f1130928ef8071da1a24ea65 is the first bad commit
> commit 566c09c53455d7c4f1130928ef8071da1a24ea65
> Author: Shaohua Li 
> Date:   Thu Nov 14 15:16:17 2013 +1100
> 
> raid5: relieve lock contention in get_active_stripe()
> 
> get_active_stripe() is the last place we have lock contention. It has two
> paths. One is stripe isn't found and new stripe is allocated, the other is
> stripe is found.
> 
> The first path basically calls __find_stripe and init_stripe. It accesses
> conf->generation, conf->previous_raid_disks, conf->raid_disks,
> conf->prev_chunk_sectors, conf->chunk_sectors, conf->max_degraded,
> conf->prev_algo, conf->algorithm, the stripe_hashtbl and inactive_list.
> Except stripe_hashtbl and inactive_list, other fields are changed very
> rarely.
> 
> With this patch, we split inactive_list and add new hash locks. Each free
> stripe belongs to a specific inactive list. Which inactive list is
> determined by stripe's lock_hash. Note, even a stripe hasn't a sector
> assigned, it has a lock_hash assigned. Stripe's inactive list is protected
> by a hash lock, which is determined by its lock_hash too. The lock_hash is
> derived from current stripe_hashtbl hash, which guarantees any
> stripe_hashtbl list will be assigned to a specific lock_hash, so we can use
> new hash lock to protect stripe_hashtbl list too. The goal of the new hash
> locks introduced is we can only use the new locks in the first path of
> get_active_stripe(). Since we have several hash locks, lock contention is
> relieved significantly.
> 
> The first path of get_active_stripe() accesses other fields, since they
> are changed rarely, changing them now need take conf->device_lock and all
> hash locks. For a slow path, this isn't a problem.
> 
> If we need lock device_lock and hash lock, we always lock hash lock first.
> The tricky part is release_stripe and friends. We need take device_lock
> first. Neil's suggestion is we put inactive stripes to a temporary list and
> readd it to inactive_list after device_lock is released. In this way, we
> add stripes to temporary list with device_lock hold and remove stripes from
> the list with hash lock hold. So we don't allow concurrent access to the
> temporary list, which means we need allocate temporary list for all
> participants of release_stripe.
> 
> One downside is free stripes are maintained in their inactive list, they
> can't across between the lists. By default, we have total 256 stripes and 8
> lists, so each list will have 32 stripes. It's possible one list has free
> stripe but other list hasn't. The chance should be rare because stripes
> allocation are even distributed. And we can always allocate more stripes
> for cache, several mega bytes memory isn't a big deal.
> 
> This completely removes the lock contention of the first path of
> get_active_stripe(). It slows down the second code path a little bit though
> because we now need takes two locks, but since the hash lock isn't
> contended, the overhead should be quite small (several atomic
> instructions). The second path of get_active_stripe() (basically sequential
> write or big request size randwrite) still has lock contentions.
> 
> Signed-off-by: Shaohua Li 
> Signed-off-by: NeilBrown 
> 
> :04 04 88fa28d1decc5454cf4d58421fa3eb12bc9ad524 5d1f104188f72b17d10cd569bf2924dab5d789cb Mdrivers
> bisect run success
> 
> # bad: [02ffe4cc90dce5a1bbee5daae98a40a431c29c6d] Merge 'yuanhan/slub-experimental' into devel-hourly-2013112217
> # good: [5e01dc7b26d9f24f39abace5da98ccbd6a5ceb52] Linux 3.12
> git bisect start '02ffe4cc90dce5a1bbee5daae98a40a431c29c6d' '5e01dc7b26d9f24f39abace5da98ccbd6a5ceb52' '--'
> # good: [5cbb3d216e2041700231bcfc383ee5f8b7fc8b74] Merge branch 'akpm' (patches from Andrew Morton)
> git bisect 

Re: [PATCH] block: submit_bio_wait() conversions

2013-11-24 Thread Jens Axboe
On Sat, Nov 23 2013, Kent Overstreet wrote:
> It was being open coded in a few places.

Thanks, applied (with Neil's ack).

-- 
Jens Axboe



Re: [PATCH] ARM: pxa: Move iotable mapping inside vmalloc region

2013-11-24 Thread Nicolas Pitre
On Sun, 24 Nov 2013, Ezequiel Garcia wrote:

> In order to remove the following ugly message:
> 
>   BUG: mapping for 0x at 0xff00 out of vmalloc space
> 
> the iotable mappings should be re-located inside the vmalloc
> region. Such move was introduced at commit:
> 
> commit 0536bdf33faff4d940ac094c77998cfac368cfff
> Author: Nicolas Pitre 
> Date:   Thu Aug 25 00:35:59 2011 -0400
> 
> ARM: move iotable mappings within the vmalloc region
> 
> While at it, let's add some nicer defines to make the code
> more readable.
> 
> Cc: Nicolas Pitre 
> Signed-off-by: Ezequiel Garcia 
> ---
>  arch/arm/mach-pxa/generic.c | 12 +---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm/mach-pxa/generic.c b/arch/arm/mach-pxa/generic.c
> index 4225417..7c01095 100644
> --- a/arch/arm/mach-pxa/generic.c
> +++ b/arch/arm/mach-pxa/generic.c
> @@ -24,6 +24,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -77,6 +78,11 @@ EXPORT_SYMBOL(get_clk_frequency_khz);
>   * Note: virtual 0xfffe-0x is reserved for the vector table
>   *   and cache flush area.
>   */
> +
> +#define UNCACHED_PHY 0x
> +#define UNCACHED_SIZESZ_1M
> +#define UNCACHED_VIRT(VMALLOC_END - UNCACHED_SIZE)
> +
>  static struct map_desc common_io_desc[] __initdata = {
>   {   /* Devs */
>   .virtual=  0xf200,
> @@ -84,9 +90,9 @@ static struct map_desc common_io_desc[] __initdata = {
>   .length = 0x0200,
>   .type   = MT_DEVICE
>   }, {/* UNCACHED_PHYS_0 */
> - .virtual= 0xff00,
> - .pfn= __phys_to_pfn(0x),
> - .length = 0x0010,
> + .virtual= UNCACHED_VIRT,
> + .pfn= __phys_to_pfn(UNCACHED_PHY),
> + .length = UNCACHED_SIZE,
>   .type   = MT_DEVICE
>   }
>  };

You might wish to change the virtual address for this mapping, but
nowhere in your patch are you changing the users of that virtual mapping.


Nicolas

