date:20131124

Re: [PATCH 06/22] tools lib traceevent: Add kmem plugin

2013-11-24 Thread Namhyung Kim

On Sat, 23 Nov 2013 04:06:45 -0500, Steven Rostedt wrote:
> On Fri, 22 Nov 2013 23:38:17 +0900
> Namhyung Kim  wrote:
>> It'd be great if the "call_site" in the output changes to display
>> function names instead of hex addresses directly.
>> 
>
> Actually, that's what's in the (). 
>
>kmem:kmalloc_node: (__alloc_skb+0x7e) call_site=8153c67e
>
> This uses a short cut, where we don't overwrite the entire handler, in
> case the TP_printk() gets new fields.
>
> If the registered handler for an event, like "call_site_handler" (see
> how we use it for all of tracepoints) returns >0, that tells the
> library that we only added extra information, and to print the
> tracepoint as it is normally.

Yeah, I know.  I just wanted to say that it'd be better if it were
displayed like below.

  kmem:kmalloc_node: call_site=__alloc_skb+0x7e ...

But that requires writing new handlers for each event.

>
> The better solution here is to use "%pS" or something in the actual
> tracepoint instead.

Agreed.

Thanks,
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
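
The "symbol+offset" form requested in the message above can be illustrated in plain userspace C. This is only a minimal sketch, assuming a symbol table sorted by address is already available; `struct sym`, `lookup_sym()` and `format_call_site()` are hypothetical names and not part of libtraceevent, and any addresses fed to it are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical symbol table entry; entries must be sorted by address. */
struct sym {
    unsigned long long addr;
    const char *name;
};

/* Binary search for the last symbol whose start address is <= addr. */
static const struct sym *lookup_sym(const struct sym *tab, size_t n,
                                    unsigned long long addr)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (tab[mid].addr <= addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo ? &tab[lo - 1] : NULL;
}

/*
 * Write "call_site=<symbol>+0x<offset>" into buf, or the raw hex address
 * when no symbol starts at or below addr. Returns the snprintf() result.
 */
static int format_call_site(char *buf, size_t len, const struct sym *tab,
                            size_t n, unsigned long long addr)
{
    const struct sym *s;

    assert(buf != NULL && len > 0);
    s = lookup_sym(tab, n, addr);
    if (!s)
        return snprintf(buf, len, "call_site=%llx", addr);
    return snprintf(buf, len, "call_site=%s+0x%llx", s->name,
                    addr - s->addr);
}
```

With a table that places `__alloc_skb` at the made-up address 0x8153c600, the address 0x8153c67e formats as `call_site=__alloc_skb+0x7e`. Doing this in a plugin means one handler per event, which is why resolving the symbol in the tracepoint itself (the "%pS" suggestion) is the simpler route.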

Re: [PATCH] pipe_to_sendpage: Ensure that MSG_MORE is set if we set MSG_SENDPAGE_NOTLAST

2013-11-24 Thread Richard Weinberger

On Sunday, 24 November 2013, 17:25:06, Eric Dumazet wrote:
> On Mon, 2013-11-25 at 00:42 +0100, Richard Weinberger wrote:
> > Commit 35f9c09fe (tcp: tcp_sendpages() should call tcp_push() once)
> > added an internal flag MSG_SENDPAGE_NOTLAST.
> > We have to ensure that MSG_MORE is also set if we set
> > MSG_SENDPAGE_NOTLAST.
> > Otherwise users that check against MSG_MORE will not see it.
> > 
> > This fixes sendfile() on AF_ALG.
> > 
> > Cc: Tom Herbert 
> > Cc: Eric Dumazet 
> > Cc: David S. Miller 
> > Cc:  # 3.4.x
> > Reported-and-tested-by: Shawn Landden 
> > Signed-off-by: Richard Weinberger 
> > ---
> > 
> >  fs/splice.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/fs/splice.c b/fs/splice.c
> > index 3b7ee65..b93f1b8 100644
> > --- a/fs/splice.c
> > +++ b/fs/splice.c
> > @@ -701,7 +701,7 @@ static int pipe_to_sendpage(struct pipe_inode_info *pipe,
> >      more = (sd->flags & SPLICE_F_MORE) ? MSG_MORE : 0;
> >  
> >      if (sd->len < sd->total_len && pipe->nrbufs > 1)
> > -        more |= MSG_SENDPAGE_NOTLAST;
> > +        more |= MSG_SENDPAGE_NOTLAST | MSG_MORE;
> >  
> >      return file->f_op->sendpage(file, buf->page, buf->offset,
> >                                  sd->len, &pos, more);
> 
> I do not think this patch is right. It looks like a revert of a useful
> patch for TCP zero copy. Given the time it took to discover this
> regression, I bet tcp zero copy has more users than AF_ALG, by 5 or 6
> order of magnitude ;)

Yeah, but AF_ALG broke. That's why I did the patch.

> Here we want to make the difference between the two flags, not merge
> them.
> 
> If AF_ALG do not care of the difference, try instead :
> 
> diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
> index ef5356cd280a..850246206b12 100644
> --- a/crypto/algif_hash.c
> +++ b/crypto/algif_hash.c
> @@ -114,6 +114,9 @@ static ssize_t hash_sendpage(struct socket *sock, struct page *page,
>      struct hash_ctx *ctx = ask->private;
>      int err;
>  
> +    if (flags & MSG_SENDPAGE_NOTLAST)
> +        flags |= MSG_MORE;
> +

In the commit message of your patch you wrote "For all sendpage() providers, 
its a transparent change.". Why does AF_ALG need special handling?
If users have to care about MSG_SENDPAGE_NOTLAST it is no longer really an 
internal flag.

Thanks,
//richard
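
The two approaches debated above can be compared with a small userspace model. This is a sketch only: the flag values are copied from the kernel headers as I understand them and should be treated as assumptions, and `splice_sendpage_flags()` and `hash_sendpage_flags()` are hypothetical names that mirror the quoted hunks rather than real kernel functions.

```c
#include <assert.h>

/* Assumed values, mirroring the kernel headers; illustrative only. */
#define SPLICE_F_MORE        0x04
#define MSG_MORE             0x8000
#define MSG_SENDPAGE_NOTLAST 0x20000

/*
 * Flag computation as in the unpatched fs/splice.c context quoted above:
 * MSG_SENDPAGE_NOTLAST is set on its own, without MSG_MORE.
 */
static int splice_sendpage_flags(unsigned int splice_flags, unsigned int len,
                                 unsigned int total_len, unsigned int nrbufs)
{
    int more;

    assert(len <= total_len);
    more = (splice_flags & SPLICE_F_MORE) ? MSG_MORE : 0;
    if (len < total_len && nrbufs > 1)
        more |= MSG_SENDPAGE_NOTLAST;
    return more;
}

/*
 * Receiver-side normalisation as in the algif_hash suggestion: a provider
 * that does not distinguish the two flags folds NOTLAST into MSG_MORE.
 */
static int hash_sendpage_flags(int flags)
{
    if (flags & MSG_SENDPAGE_NOTLAST)
        flags |= MSG_MORE;
    return flags;
}
```

In this model a non-final page without SPLICE_F_MORE carries only MSG_SENDPAGE_NOTLAST, so a provider that tests MSG_MORE alone treats it as the last chunk. The splice-side patch changes what every provider sees; the provider-side variant keeps the two flags distinct for TCP and changes only AF_ALG.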

Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)

2013-11-24 Thread Francis Moreau

On 11/24/2013 10:06 PM, Rafael J. Wysocki wrote:
> On Sunday, November 24, 2013 10:39:20 AM Francis Moreau wrote:
>> Hello Thomas
>>
>> On 11/22/2013 11:27 PM, Thomas Gleixner wrote:
>>> On Fri, 22 Nov 2013, Rafael J. Wysocki wrote:
>>>> On Friday, November 22, 2013 10:36:23 PM Francis Moreau wrote:
>>>>> Ok, I've finally managed to find out the bad commit:
>>>>> ad07277e82dedabacc52c82746633680a3187d25: ACPI / PM: Hold acpi_scan_lock
>>>>> over system PM transitions
>>>>>
>>>>> I verified that the parent commit doesn't have the problem.
>>>>
>>>> Interesting.
>>>>
>>>>> Rafael, you're the man now ;)
>>>>
>>>> I kind of don't see how that commit may result in behavior that you
>>>> described earlier in the thread.
>>>>
>>>> You get a memory corruption that seems to have started to happen because
>>>> we're holding an additional lock over suspend resume now.  Something's fishy
>>>> on that machine and we need to figure out what it is.
>>>
>>> The hiccup happens in the timer softirq.
>>>
>>> @Francis: Did you try to enable DEBUG_OBJECTS.*. If not please give it
>>>   a try.
>>
>> This looks like it was a good idea.
>>
>> The kernel now outputs the following traces after resuming.
>>
>> [   26.973928] WARNING: CPU: 0 PID: 4 at lib/debugobjects.c:260
>> debug_print_object+0x83/0xa0()
>> [   26.973932] ODEBUG: free active (active state 0) object type:
>> timer_list hint: delayed_work_timer_fn+0x0/0x20
>> [   26.973972] Modules linked in: x86_pkg_temp_thermal intel_powerclamp
>> rtsx_pci_ms coretemp memstick kvm_intel i2c_i801 iTCO_wdt
>> iTCO_vendor_support i915 i2c_algo_bit intel_agp intel_gtt drm_kms_helper
>> r8169 drm kvm mii agpgart i2c_core lpc_ich ac shpchp crc32c_intel
>> battery thermal wmi evdev mei_me video mei button mperf processor
>> serio_raw microcode ext4 crc16 mbcache jbd2 sr_mod cdrom sd_mod
>> usb_storage rtsx_pci_sdmmc mmc_core ahci libahci libata ehci_pci
>> ehci_hcd xhci_hcd scsi_mod rtsx_pci usbcore usb_common
>> [   26.974013] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted
>> 3.11.0-rc2-ARCH #64
>> [   26.974014] Hardware name: CLEVO CO. W55xEU/W55xEU, BIOS 4.6.5 03/05/2013
>> [   26.974019] Workqueue: kacpi_hotplug hotplug_event_work
>> [   26.974020]  0009 880407d0da18 81459fe9
>> 880407d0da60
>> [   26.974023]  880407d0da50 8104dc7d 880407fad488
>> 81836fc0
>> [   26.974025]  81701358 81afef70 0003
>> 880407d0dab0
>> [   26.974027] Call Trace:
>> [   26.974031]  [] dump_stack+0x54/0x8d
>> [   26.974043]  [] warn_slowpath_common+0x7d/0xa0
>> [   26.974044]  [] warn_slowpath_fmt+0x4c/0x50
>> [   26.974047]  [] debug_print_object+0x83/0xa0
>> [   26.974050]  [] ? queue_work_on+0x50/0x50
>> [   26.974053]  [] __debug_check_no_obj_freed+0x1fb/0x240
>> [   26.974059]  [] ? rtsx_pci_remove+0x119/0x1d0
>> [rtsx_pci]
> 
> So a device driven by rtsx_pcr.c is removed after resume.  Without the commit
> you've bisected it is removed as well, but that happens during resume, so
> rtsx_pci_resume() is likely not called in that case.

I'm not sure I understand your point.

> 
> I bet that there's a bug either in rtsx_pci_remove() or in rtsx_pci_resume().
> The latter definitely should check if the device is actually still present
> before scheduling the delayed work, but then the Boris' patch should take care
> of that anyway.
> 

With Boris' patch applied, I still have the problem.

Thanks.


Re: [PATCH 04/22] tools lib traceevent: Add jbd2 plugin

2013-11-24 Thread Namhyung Kim

On Sat, 23 Nov 2013 03:52:21 -0500, Steven Rostedt wrote:
> On Fri, 22 Nov 2013 23:27:57 +0900
> Namhyung Kim  wrote:
>
>> [SNIP]
>> > +#define MINORBITS 20
>> > +#define MINORMASK ((1U << MINORBITS) - 1)
>> > +
>> > +#define MAJOR(dev) ((unsigned int) ((dev) >> MINORBITS))
>> > +#define MINOR(dev) ((unsigned int) ((dev) & MINORMASK))
>> > +
>> > +unsigned long long process_jbd2_dev_to_name(struct trace_seq *s,
>> > +  unsigned long long *args)
>> > +{
>> > +  unsigned int dev = args[0];
>> > +
>> > +  trace_seq_printf(s, "%d:%d", MAJOR(dev), MINOR(dev));
>> > +  return 0;
>> > +}
>> > +
>> > +unsigned long long process_jiffies_to_msecs(struct trace_seq *s,
>> > +  unsigned long long *args)
>> > +{
>> > +  unsigned long long jiffies = args[0];
>> > +
>> > +  trace_seq_printf(s, "%lld", jiffies);
>> > +  return jiffies;
>> > +}
>> > +
>> > +int PEVENT_PLUGIN_LOADER(struct pevent *pevent)
>> > +{
>> > +  pevent_register_print_function(pevent,
>> > + process_jbd2_dev_to_name,
>> > + PEVENT_FUNC_ARG_STRING,
>> 
>> Actually the function returns long long not string.  But it seems the
>> current code doesn't care about the return type.
>
> Actually it's not representing what process_jbd2_dev_to_name() returns
> (which will always return unsigned long long), but what
> "jbd2_dev_to_name()" returns that is (was) defined in the kernel. That
> was:
>
>   const char *jbd2_dev_to_name(dev_t device)
>
> When registering a function to handle, you need to express the
> prototype of that function (not the handler). The third argument is the
> ret_type of that function.

Aha, got it.  Thank you for the explanation.


> But this is interesting, the ret_type doesn't seem to be used in
> event_parse.c. The return value of the callback is only done in
> eval_num_arg() where we could put a warning if the ret_type is not a
> number.

Yes. :)

Thanks,
Namhyung
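
The device decoding done by the quoted `process_jbd2_dev_to_name()` handler can be tried outside the plugin. The following is a minimal sketch that reuses the macros from the patch; `pack_dev()` and `format_dev()` are hypothetical helpers added for illustration, and `snprintf()` stands in for `trace_seq_printf()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Macros as quoted in the patch: 20 low bits of minor, major above them. */
#define MINORBITS 20
#define MINORMASK ((1U << MINORBITS) - 1)

#define MAJOR(dev) ((unsigned int) ((dev) >> MINORBITS))
#define MINOR(dev) ((unsigned int) ((dev) & MINORMASK))

/* Hypothetical helper: pack major and minor into the same layout. */
static unsigned int pack_dev(unsigned int major, unsigned int minor)
{
    assert(minor <= MINORMASK);
    return (major << MINORBITS) | minor;
}

/* Format dev as "major:minor"; returns the snprintf() result. */
static int format_dev(char *buf, size_t len, unsigned int dev)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%u:%u", MAJOR(dev), MINOR(dev));
}
```

For example, `pack_dev(8, 1)` yields 0x800001, which `format_dev()` renders as `8:1`. The sketch uses `%u` because both macros cast to `unsigned int`; the quoted patch uses `%d`.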

Re: Why is O_DSYNC on linux so slow / what's wrong with my SSD?

2013-11-24 Thread Stefan Priebe

Hi Ric,

On 23.11.2013 20:35, Ric Wheeler wrote:
> On 11/23/2013 01:27 PM, Stefan Priebe wrote:
>> Hi Ric,
>>
>> On 22.11.2013 21:37, Ric Wheeler wrote:
>>> On 11/22/2013 03:01 PM, Stefan Priebe wrote:
>>>> Hi Christoph,
>>>>
>>>> On 21.11.2013 11:11, Christoph Hellwig wrote:
>>>>>> 2. Some drives may implement CMD_FLUSH to return immediately i.e. no
>>>>>> guarantee the data is actually on disk.
>>>>>
>>>>> In which case they aren't spec compliant. While I've seen countless
>>>>> data integrity bugs on lower end ATA SSDs I've not seen one that
>>>>> simply ignores flush. If you'd want to cheat that bluntly you'd be
>>>>> better off just claiming to not have a writeback cache.
>>>>>
>>>>> You solve your performance problem by completely disabling any chance
>>>>> of having data integrity guarantees, and do so in a way that is not
>>>>> detectable for applications or users.
>>>>>
>>>>> If you have a workload with lots of small synchronous writes disabling
>>>>> the writeback cache on the disk does indeed often help, especially with
>>>>> the non-queueable FLUSH on all but the most recent ATA devices.
>>>>
>>>> But this isn't correct for drives with capacitors like Crucial m500,
>>>> Intel DC S3500, DC S3700, is it? Shouldn't the Linux kernel have an
>>>> option to disable this for drives like these?
>>>> /sys/block/sdX/device/ignore_flush
>>>
>>> If you know 100% for sure that your drive has a non-volatile write
>>> cache, you can run the file system without the flushing by mounting "-o
>>> nobarrier". With most devices, this is not needed since they tend to
>>> simply ignore the flushes if they know they are power failure safe.
>>>
>>> Block level, we did something similar for users who are not running
>>> through a file system for SCSI devices - James added support to echo
>>> "temporary" into the sd's device's cache_type field:
>>>
>>> See:
>>>
>>> https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=2ee3e26c673e75c05ef8b914f54fadee3d7b9c88
>>
>> At least for me this does not work. I get the same awful speed as
>> before - also the I/O waits stay the same. I'm still seeing CMD
>> flushes going to the devices.
>>
>> Is there any way to check whether the temporary got accepted and works?
>>
>> I simply executed:
>> for i in /sys/class/scsi_disk/*/cache_type; do echo $i; echo temporary write back >$i; done
>>
>> Stefan
>
> What kernel are you running? This is a new addition
>
> Also, you can "cat" the same file to see what it says.
>
> Regards,
>
> Ric

Is the output I sent to you fine? Anything wrong?

Stefan
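
The "cat the same file" check suggested above can also be done programmatically. This is a minimal sketch, assuming the attribute reports strings such as "write back", "write through" or "none"; `read_attr()` and `is_write_back()` are hypothetical helpers, and the caller has to supply a real path under /sys/class/scsi_disk/.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of a sysfs attribute into buf and strip the newline.
 * Returns 0 on success, -1 if the file cannot be opened or read.
 */
static int read_attr(const char *path, char *buf, size_t len)
{
    FILE *f;

    assert(buf != NULL && len > 0);
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Return 1 when the reported cache type starts with "write back". */
static int is_write_back(const char *cache_type)
{
    return strncmp(cache_type, "write back", strlen("write back")) == 0;
}
```

As far as I understand the sd driver, the "temporary" prefix is only accepted on write and is not echoed back on read, so this check shows the cache type the kernel currently assumes but may not by itself confirm that the temporary override was taken.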
