Re: [PATCH] Btrfs-progs use safe string manipulation functions
On Thu, 2011-02-10 at 10:29 -0300, Eduardo Silva wrote:
> [PATCH] Add safe string manipulation functions
>
> Deprecate direct use of strcpy(3)
>
> The following string manipulation function has been added:
>
> - string_copy() : wrapper of strcpy(3)
> - string_ncopy(): wrapper of strncpy(3)
>
> both function compose safe NULL terminated strings.

I'd like to make some comments, which I hope will be acceptable.

Firstly, calling strcpy dangerous is, to me, rather overblown. It is easy to make mistakes, but it is not at all dangerous the way, for example, gets(3) is dangerous. strcpy can be used safely, gets cannot. Also, if you consider strcpy to be dangerous, then strcat should be dangerous too. However, given the risk of overwriting a buffer with strcpy, I agree that it's good to see if an alternative can be found.

Secondly, if you're going to make wrappers or helper functions for string handling in C, you need to decide several things right from the start:

* do you do static or dynamic allocation?
* how do you handle errors?
* do you want a minimal wrapper or replacement, or a whole new library?

I am not familiar enough with the btrfs-progs code base to give any strong recommendations, but off the top of my head I would suggest these, for this patch:

* make use of fairly minimal wrappers/replacements (at least for now)
* handle errors by calling abort or exit
* don't allocate data dynamically (or else it's not a minimal wrapper)

For error handling, there are two kinds of things that can happen: normal run-time errors (malloc returns NULL, writing to a file fails, etc), and programming errors (wrong parameters to functions). If we're doing a minimal wrapper without dynamic memory allocation, the only thing string functions should need to worry about is programming errors. For those, abort(3) is the appropriate way to terminate the program, since it causes a core dump, which can be inspected with a debugger. Since btrfs-progs are non-interactive command line tools, this should be OK.

For checking function arguments, the assert macro is appropriate. It calls abort if the test fails. I am not sure I would check for parameters being non-NULL, though, since the kernel will trap such usage and cause a segfault, which, again, can be analyzed with a debugger.

For things like string copying, another problem to consider is what to do if the target array is not large enough. The three possibilities are to silently truncate the output string, to return an error code of some sort, or to abort. The error code is a bit tedious, since it requires the caller to check for it, and do something sensible if the array was not large enough. For btrfs-progs, I would suggest aborting.

Taking all of these together, my suggestion for a safer strcpy would be along these lines (outline only, not tested code):

    void safer_strcpy(char *target, size_t tsize, const char *source)
    {
        size_t n;

        n = snprintf(target, tsize, "%s", source);
        assert(n < tsize);
    }

    void safer_strncpy(char *tgt, size_t tsize, const char *src, size_t n)
    {
        assert(n < tsize); /* There must be space for the '\0'. */
        memset(tgt, '\0', tsize);
        strncpy(tgt, src, n);
    }

Note that for any reasonable error checking to be possible the safety functions need to know the size of the target memory area. Otherwise no sensible checks can be done -- you have to rely on the caller to check that the target array is big enough, and then you're no better off than with plain strcpy.

(Also note that I did not call the function string_copy, since global names starting with str followed by a lowercase letter are reserved to the C implementation.)

Your function fills in the target array with zero bytes. Is that necessary? If it is, then the memset call needs to be added to safer_strcpy.

(I don't find it useful to return the target array as the return value of the function, so I didn't do that.)
-- Blog/wiki/website hosting with ikiwiki (free for free software): http://www.branchable.com/ -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
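For comparison with the aborting outline in the message above, here is a minimal sketch of the error-code alternative it mentions. The names copy_checked and copy_or_abort and the 0 / -1 return convention are assumptions made for illustration; they are not part of btrfs-progs or of the original patch. Like the outline, the sketch relies on snprintf(3) and needs to be told the size of the target array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from btrfs-progs): copy src into dst, which
 * has room for dsize bytes including the terminating '\0'.
 * Returns 0 on success and -1 if src does not fit. On a failed copy into
 * a non-empty buffer, dst holds a truncated, '\0'-terminated prefix.
 */
int copy_checked(char *dst, size_t dsize, const char *src)
{
    int n;

    if (dsize == 0)
        return -1;

    /* snprintf writes at most dsize bytes, including the '\0'. */
    n = snprintf(dst, dsize, "%s", src);

    /* n is the length snprintf wanted to write, excluding the '\0'. */
    if (n < 0 || (size_t)n >= dsize)
        return -1;
    return 0;
}

/*
 * Aborting variant in the spirit of the outline above: a target that is
 * too small is treated as a programming error. Note that assert() is
 * compiled out when NDEBUG is defined.
 */
void copy_or_abort(char *dst, size_t dsize, const char *src)
{
    int rc = copy_checked(dst, dsize, src);

    assert(rc == 0);
    (void)rc; /* avoid an unused-variable warning under NDEBUG */
}
```

A caller that can recover (for example by printing a message and skipping one device) would check the return value of copy_checked; a caller that cannot would use copy_or_abort and rely on the core dump.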
Re: LOOP_GET_STATUS(64) truncates pathnames to 64 chars (was Re: Bug in mkfs.btrfs?!)
Hi,

are you sure that patch is in the kernel? I'm using 2.6.37 and don't have those attributes in my /sys.

Felix

On 10. February 2011 - 13:29, Petr Uzel wrote:

> Date: Thu, 10 Feb 2011 13:29:27 +0100
> From: Petr Uzel petr.u...@suse.cz
> To: Chris Samuel ch...@csamuel.org
> Cc: Felix Blanke felixbla...@gmail.com, kreij...@inwind.it, Hugo Mills hugo-l...@carfax.org.uk, linux-btrfs@vger.kernel.org, Linux Kernel linux-ker...@vger.kernel.org
> Subject: Re: LOOP_GET_STATUS(64) truncates pathnames to 64 chars (was Re: Bug in mkfs.btrfs?!)
> Mail-Followup-To: Chris Samuel ch...@csamuel.org, Felix Blanke felixbla...@gmail.com, kreij...@inwind.it, Hugo Mills hugo-l...@carfax.org.uk, linux-btrfs@vger.kernel.org, Linux Kernel linux-ker...@vger.kernel.org
>
> On Tue, Jan 25, 2011 at 11:15:11AM +1100, Chris Samuel wrote:
> > /*
> >  * CC'd to linux-kernel in case they have any feedback on this.
> >  *
> >  * Long thread, trying to work out why mkfs.btrfs failed to
> >  * make a filesystem on an encrypted loopback mount called
> >  * /dev/loop2. Cause turned out to be mkfs.btrfs calling
> >  * LOOP_GET_STATUS to find out if the block device was mounted
> >  * and getting a truncated device name back and so it later
> >  * fails when lstat() is called on the truncated device path.
> >  *
> >  * The long device name for the encrypted loopback mount was
> >  * because /dev/disk/by-id/$ID was used when Felix created it
> >  * to cope with devices moving around.
> >  */
> >
> > On 25/01/11 00:01, Felix Blanke wrote:
> >
> > > you were talking about the LOOP_GET_STATUS function. I'm not quite sure where it comes from. Is it part of the kernel? Or does it come from the util-linux package?
> >
> > It's in the kernel, and there is both LOOP_GET_STATUS (old implementation) and LOOP_GET_STATUS64 (new implementation). They return structures called loop_info and loop_info64 respectively and both are defined in include/linux/loop.h .
> >
> > Sadly in both cases the lengths of paths are defined to be LO_NAME_SIZE which is currently 64 and hence either implementation will cause the problematic:
> >
> > lstat("/dev/disk/by-id/ata-INTEL_SSDSA2M160G2GC_CVPO939201JX160AGN-par", 0x7fffa30b3cf0) = -1 ENOENT (No such file or directory)
> >
> > I've CC'd this to the LKML in case they have any feedback on this apparent problem with the API.
>
> Since 2.6.37, you can get the full path to the backing file from sysfs:
>
> cat /sys/block/loopX/loop/backing_file
>
> See http://linux.derkeiler.com/Mailing-Lists/Kernel/2010-07/msg10996.html
>
> HTH,
> Petr
>
> --
> Petr Uzel
> IRC: ptr_uzl @ freenode

---end quoted text---
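As a sketch of the sysfs approach Petr describes: the backing_file attribute can be read like any other text file, so a tool does not have to rely on the LO_NAME_SIZE-limited name returned by LOOP_GET_STATUS(64). The helper names below are hypothetical and the error handling is minimal. The attribute is only present on kernels that provide it (2.6.37 or later according to the message above, and apparently not with a replaced loop driver).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format "/sys/block/loopN/loop/backing_file" into
 * buf. Returns 0 on success, -1 if buf is too small.
 */
int loop_sysfs_path(unsigned int loopno, char *buf, size_t size)
{
    int n = snprintf(buf, size, "/sys/block/loop%u/loop/backing_file",
                     loopno);

    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/*
 * Hypothetical helper: read the first line of the file at path into buf,
 * without the trailing newline. Returns 0 on success, -1 if the file
 * cannot be read or the line does not fit into size bytes.
 */
int read_first_line(const char *path, char *buf, size_t size)
{
    FILE *f;
    size_t len;

    assert(buf != NULL && size > 0); /* programming error otherwise */

    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(buf, (int)size, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 0;
    }
    /* No newline seen: fine if the buffer was not filled, else truncated. */
    return (len + 1 < size) ? 0 : -1;
}
```

A caller would build the path with loop_sysfs_path(2, path, sizeof(path)) for /dev/loop2 and pass it to read_first_line with a buffer of PATH_MAX bytes, falling back to the ioctl if the attribute is missing.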
ENOSPC Regression
I've recently been encountering premature ENOSPC issues where my Btrfs testing partition will either prematurely return an ENOSPC, or lock up the operations trying to access the partition.

I have bisected the problem to this commit:
http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable.git;a=commit;h=914ee295af418e936ec20a08c1663eaabe4cd07a
(Btrfs: pwrite blocked when writing from the mmaped buffer of the same page)

I am encountering the problem on a small-ish 3.5 GB Btrfs partition. I can replicate the problem with and without compression. I can also replicate the problem with and without reformatting the partition.

For most operations I run on this partition, Btrfs is performing without error. But when I compile openmotif-2.3.3 on a kernel that is after the above referenced commit, I'll get either an ENOSPC error or the partition locks up.

When I encounter a lock-up issue, there are no errors in dmesg, and no delayed processes are showing (unless I try to run an additional operation on that partition, such as 'ls', which will subsequently show up as delayed). However, the build process for openmotif-2.3.3 appears frozen, and several processes related to the build are shown as running, and will not even respond to 'kill -s 9 pid'.

The partition only has about 500 MB of data when I encounter the problems, and openmotif-2.3.3 typically only requires about 30-60 MB to compile. However, running 'btrfs fi show' indicates that btrfs has attempted to reserve all the space on the disk for data and metadata.

When running a kernel prior to the above referenced commit, btrfs will compile openmotif-2.3.3 without needing to reserve much extra space on the partition.

Let me know if you would like any additional information or tests.
2.6.37: Multi-second I/O latency while untarring
As I type this, I have an ssh process running that's dumping data into a fifo at high speed (maybe 500Mbps) and a tar process that's untarring from the same fifo onto btrfs. The btrfs fs is mounted -o space_cache,compress. This machine has 8GB ram, 8 logical cores, and a fast (i7-2600) CPU, so it's not an issue with the machine struggling under load.

Every few tens of seconds, my system stalls for several seconds. These stalls cause keyboard input to be lost, firefox to hang, etc.

Setting tar's ionice priority to best effort / 7 or to idle makes no difference. ionice idle and queue_depth = 1 on the disk (a slow 2TB WD) also makes no difference. max_sectors_kb = 64 in addition to the above doesn't help either.

latencytop shows regular instances of 2-7 *second* latency, variously in sync_page, start_transaction, btrfs_start_ordered_extent, and do_get_write_access (from jbd2 on my ext4 root partition).

echo 3 > drop_caches gave me 7 GB free RAM. I still had stalls when 4-5 GB were still free (so it shouldn't be a problem with important pages being evicted).

In case it matters, all of my partitions are on LVM on dm-crypt, but this machine has AES-NI so the overhead from that should be minimal. In fact, overall CPU usage is only about 10%.

What gives? I thought this stuff was supposed to be better on modern kernels.

--Andy
Re: 2.6.37: Multi-second I/O latency while untarring
Excerpts from Andrew Lutomirski's message of 2011-02-11 10:08:52 -0500:
> As I type this, I have an ssh process running that's dumping data into a fifo at high speed (maybe 500Mbps) and a tar process that's untarring from the same fifo onto btrfs. The btrfs fs is mounted -o space_cache,compress. This machine has 8GB ram, 8 logical cores, and a fast (i7-2600) CPU, so it's not an issue with the machine struggling under load.
>
> Every few tens of seconds, my system stalls for several seconds. These stalls cause keyboard input to be lost, firefox to hang, etc.
>
> Setting tar's ionice priority to best effort / 7 or to idle makes no difference. ionice idle and queue_depth = 1 on the disk (a slow 2TB WD) also makes no difference. max_sectors_kb = 64 in addition to the above doesn't help either.
>
> latencytop shows regular instances of 2-7 *second* latency, variously in sync_page, start_transaction, btrfs_start_ordered_extent, and do_get_write_access (from jbd2 on my ext4 root partition).
>
> echo 3 > drop_caches gave me 7 GB free RAM. I still had stalls when 4-5 GB were still free (so it shouldn't be a problem with important pages being evicted).
>
> In case it matters, all of my partitions are on LVM on dm-crypt, but this machine has AES-NI so the overhead from that should be minimal. In fact, overall CPU usage is only about 10%.
>
> What gives? I thought this stuff was supposed to be better on modern kernels.

We can tell more if you post the full traces from latencytop. I have a patch here for latencytop that adds a -c mode, which dumps the traces out to a text file.

http://oss.oracle.com/~mason/latencytop.patch

Based on what you have here, I think it's probably a latency problem between btrfs and the dm-crypt stuff. How easily can you set up a test partition without dm-crypt?

-chris
Re: 2.6.37: Multi-second I/O latency while untarring
On Fri, Feb 11, 2011 at 3:08 PM, Andrew Lutomirski a...@luto.us wrote:
> As I type this, I have an ssh process running that's dumping data into a fifo at high speed (maybe 500Mbps) and a tar process that's untarring from the same fifo onto btrfs. The btrfs fs is mounted -o space_cache,compress. This machine has 8GB ram, 8 logical cores, and a fast (i7-2600) CPU, so it's not an issue with the machine struggling under load.
>
> Every few tens of seconds, my system stalls for several seconds. These stalls cause keyboard input to be lost, firefox to hang, etc.
>
> Setting tar's ionice priority to best effort / 7 or to idle makes no difference. ionice idle and queue_depth = 1 on the disk (a slow 2TB WD) also makes no difference. max_sectors_kb = 64 in addition to the above doesn't help either.
>
> latencytop shows regular instances of 2-7 *second* latency, variously in sync_page, start_transaction, btrfs_start_ordered_extent, and do_get_write_access (from jbd2 on my ext4 root partition).
>
> echo 3 > drop_caches gave me 7 GB free RAM. I still had stalls when 4-5 GB were still free (so it shouldn't be a problem with important pages being evicted).
>
> In case it matters, all of my partitions are on LVM on dm-crypt, but this machine has AES-NI so the overhead from that should be minimal. In fact, overall CPU usage is only about 10%.
>
> What gives? I thought this stuff was supposed to be better on modern kernels.
>
> --Andy

Hi Andrew,

you could try the following patch to speed up dm-crypt:

https://patchwork.kernel.org/patch/365542/

I'm using it on top of a highly-patched 2.6.37 kernel. I'm not sure if exactly that version was included in 2.6.38.

There are some additional knobs to speed up dm, e.g.

CONFIG_CRYPTO_PCRYPT=y

Regards

Matt
Re: Recovering data from disk with loose cable
On Wed, 9 Feb 2011 21:46:38 -0500, Ben Gamari bgam...@gmail.com wrote:
> We have a disk array behind two external SATA port multipliers (four disks on each multiplier) which has been running btrfs (RAID 1 for both data and metadata). Unfortunately, earlier today it seems one of the SATA cables came loose, resulting in the kernel (2.6.37) eventually OOPSing although apparently not before writing quite a bit of data. Upon reboot, I was met with the dreaded,
>
>     disk-io.c:741: open_ctree_fd: Assertion `!(!tree_root->node)' failed.
>
> Unfortunately any attempt to run any of the btrfs-progs utilities (from git) met a similar end. There was recently a patch to try harder in recovering from this problem posted to the list[1], although unfortunately it is unable to find a root.
>
> Considering there are eight disks in the array and only four were affected by the loose cable, I find it very hard to believe there is no way to recover this volume. Any suggestions at all would be greatly appreciated. Recovering this data would mean a lot. Thanks,

Given there has been no response to this, I suppose I should assume this data is unrecoverable? It's not the end of the world if so, but again, it would be nice to get a few files and it seems like a small subset of the metadata is corrupted.

Cheers,

- Ben
Re: ENOSPC Regression
On Fri, Feb 11, 2011 at 07:21:47AM -0600, Mitch Harder wrote:
> I'm encountering premature ENOSPC issues recently where my Btrfs testing partition will either prematurely return an ENOSPC, or lock up the operations trying to access the partition.
>
> I have bisected the problem to this commit:
> http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable.git;a=commit;h=914ee295af418e936ec20a08c1663eaabe4cd07a
> (Btrfs: pwrite blocked when writing from the mmaped buffer of the same page)
>
> I am encountering the problem on a small-ish 3.5 GB Btrfs partition. I can replicate the problem with and without compression. I can also replicate the problem with and without reformatting the partition.
>
> For most operations I run on this partition, Btrfs is performing without error. But when I compile openmotif-2.3.3 on a kernel that is after the above referenced commit, I'll get either an ENOSPC error or the partition locks up.
>
> When I encounter a lock-up issue, there are no errors in dmesg, and no delayed processes are showing (unless I try to run an additional operation on that partition, such as 'ls', which will subsequently show up as delayed). However, the build process for openmotif-2.3.3 appears frozen, and several processes related to the build are shown as running, and will not even respond to 'kill -s 9 pid'.
>
> The partition only has about 500 MB of data when I encounter the problems, and openmotif-2.3.3 typically only requires about 30-60 MB to compile. However, running 'btrfs fi show' indicates that btrfs has attempted to reserve all the space on the disk for data and metadata.
>
> When running a kernel prior to the above referenced commit, btrfs will compile openmotif-2.3.3 without needing to reserve much extra space on the partition.
>
> Let me know if you would like any additional information or tests.

Can you try my btrfs-work tree and see if you still have the same problem?

Thanks,

Josef
Re: ENOSPC Regression
On Fri, Feb 11, 2011 at 10:22 AM, Josef Bacik jo...@redhat.com wrote:
> On Fri, Feb 11, 2011 at 07:21:47AM -0600, Mitch Harder wrote:
> > I'm encountering premature ENOSPC issues recently where my Btrfs testing partition will either prematurely return an ENOSPC, or lock up the operations trying to access the partition.
> >
> > I have bisected the problem to this commit:
> > http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable.git;a=commit;h=914ee295af418e936ec20a08c1663eaabe4cd07a
> > (Btrfs: pwrite blocked when writing from the mmaped buffer of the same page)
> >
> > I am encountering the problem on a small-ish 3.5 GB Btrfs partition. I can replicate the problem with and without compression. I can also replicate the problem with and without reformatting the partition.
> >
> > For most operations I run on this partition, Btrfs is performing without error. But when I compile openmotif-2.3.3 on a kernel that is after the above referenced commit, I'll get either an ENOSPC error or the partition locks up.
> >
> > When I encounter a lock-up issue, there are no errors in dmesg, and no delayed processes are showing (unless I try to run an additional operation on that partition, such as 'ls', which will subsequently show up as delayed). However, the build process for openmotif-2.3.3 appears frozen, and several processes related to the build are shown as running, and will not even respond to 'kill -s 9 pid'.
> >
> > The partition only has about 500 MB of data when I encounter the problems, and openmotif-2.3.3 typically only requires about 30-60 MB to compile. However, running 'btrfs fi show' indicates that btrfs has attempted to reserve all the space on the disk for data and metadata.
> >
> > When running a kernel prior to the above referenced commit, btrfs will compile openmotif-2.3.3 without needing to reserve much extra space on the partition.
> >
> > Let me know if you would like any additional information or tests.
>
> Can you try my btrfs-work tree and see if you still have the same problem?
>
> Thanks,
>
> Josef

I've built and tested the 2.6.38-rc1 kernel from the master branch of git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-work.git, and I am still getting the same issue.

I've just noticed there is another thread going on about this same problem. I'll just pile on that thread if I come across something new.
Re: LOOP_GET_STATUS(64) truncates pathnames to 64 chars (was Re: Bug in mkfs.btrfs?!)
On 02/11/2011 08:23 PM, Felix Blanke wrote:
> What do you mean with configured? I'm using loop devices with loop aes, and I've looked into /sys for a device which is actually in use.

Ehm. Is it really Loop-AES? Then ask the author to backport it there; Loop-AES is not mainline code. He usually replaces the whole upstream loop implementation with an old patched version.

Milan
Re: LOOP_GET_STATUS(64) truncates pathnames to 64 chars (was Re: Bug in mkfs.btrfs?!)
Yeah, for me it's loop-aes. Ah ok, I didn't know that it replaces the whole loop thing :)

Felix

On Feb 11, 2011 8:32 PM, Milan Broz mb...@redhat.com wrote:
> On 02/11/2011 08:23 PM, Felix Blanke wrote:
> > What do you mean with configured? I'm using loop devices with loop aes, and I've looked into /sys for a device which is actually in use.
>
> Ehm. Is it really Loop-AES? Then ask the author to backport it there; Loop-AES is not mainline code. He usually replaces the whole upstream loop implementation with an old patched version.
>
> Milan
Re: 2.6.37: Multi-second I/O latency while untarring
On Fri, Feb 11, 2011 at 10:44 AM, Chris Mason chris.ma...@oracle.com wrote:
> Excerpts from Andrew Lutomirski's message of 2011-02-11 10:08:52 -0500:
> > As I type this, I have an ssh process running that's dumping data into a fifo at high speed (maybe 500Mbps) and a tar process that's untarring from the same fifo onto btrfs. The btrfs fs is mounted -o space_cache,compress. This machine has 8GB ram, 8 logical cores, and a fast (i7-2600) CPU, so it's not an issue with the machine struggling under load.
> >
> > Every few tens of seconds, my system stalls for several seconds. These stalls cause keyboard input to be lost, firefox to hang, etc.
> >
> > Setting tar's ionice priority to best effort / 7 or to idle makes no difference. ionice idle and queue_depth = 1 on the disk (a slow 2TB WD) also makes no difference. max_sectors_kb = 64 in addition to the above doesn't help either.
> >
> > latencytop shows regular instances of 2-7 *second* latency, variously in sync_page, start_transaction, btrfs_start_ordered_extent, and do_get_write_access (from jbd2 on my ext4 root partition).
> >
> > echo 3 > drop_caches gave me 7 GB free RAM. I still had stalls when 4-5 GB were still free (so it shouldn't be a problem with important pages being evicted).
> >
> > In case it matters, all of my partitions are on LVM on dm-crypt, but this machine has AES-NI so the overhead from that should be minimal. In fact, overall CPU usage is only about 10%.
> >
> > What gives? I thought this stuff was supposed to be better on modern kernels.
>
> We can tell more if you post the full traces from latencytop. I have a patch here for latencytop that adds a -c mode, which dumps the traces out to a text file.
>
> http://oss.oracle.com/~mason/latencytop.patch

Big dump at end of email from latencytop git + your patch.

> Based on what you have here, I think it's probably a latency problem between btrfs and the dm-crypt stuff. How easily can you set up a test partition without dm-crypt?

Not so easily on that disk. I left some space inside the LVM to play with but none outside. I'll try hooking up another disk over eSATA (on a Cougar Point 3Gbps controller, so it might blow up).

And here's the dump:

=== Fri Feb 11 14:44:07 2011
Globals: Cause                           Maximum        Percentage
synchronous write                        4249.1 msec     35.5 %
Writing to a pipe                        4248.5 msec     35.5 %
Writing a page to disk                    105.9 msec      2.1 %
Page fault                                 23.7 msec      0.2 %
Reading from a pipe                         4.7 msec     19.8 %
Waiting for event (select)                  4.6 msec      6.4 %
Waiting for event (poll)                    1.3 msec      0.0 %
Executing raw SCSI command                  1.3 msec      0.2 %
opening cdrom device                        1.3 msec      0.3 %

Process details:

Process ksoftirqd/1 (10) Total: 50.0 msec
  [run_ksoftirqd]                           4.8 msec    100.0 %
    run_ksoftirqd kthread kernel_thread_helper
Process ksoftirqd/2 (15) Total: 8.7 msec
  [run_ksoftirqd]                           4.9 msec    100.0 %
    run_ksoftirqd kthread kernel_thread_helper
Process ksoftirqd/3 (19) Total: 2.9 msec
  [run_ksoftirqd]                           2.9 msec    100.0 %
    run_ksoftirqd kthread kernel_thread_helper
Process ksoftirqd/5 (27) Total: 80.6 msec
  [run_ksoftirqd]                           5.0 msec    100.0 %
    run_ksoftirqd kthread kernel_thread_helper
Process scsi_eh_1 (62) Total: 45.0 msec
  Executing internal ATA command            0.7 msec     62.3 %
    ata_exec_internal_sg ata_exec_internal atapi_eh_request_sense ata_eh_link_autopsy ata_eh_autopsy sata_pmp_error_handler ahci_error_handler ata_scsi_error scsi_error_handler kthread kernel_thread_helper
  SCSI error handler                        0.6 msec     37.7 %
    scsi_error_handler kthread kernel_thread_helper
Process kworker/7:1 (76) Total: 8.7 msec
  .                                         3.9 msec    100.0 %
    worker_thread kthread kernel_thread_helper
Process kworker/4:1 (139) Total: 124.0 msec
  .                                         4.9 msec    100.0 %
    worker_thread kthread kernel_thread_helper
Process kworker/6:1 (140) Total: 11.7 msec
  .                                         3.8 msec    100.0 %
    worker_thread kthread kernel_thread_helper
Process kworker/5:1 (141) Total: 12.5 msec
  .                                         4.9 msec    100.0 %
    worker_thread kthread kernel_thread_helper
Process kworker/2:1 (142) Total: 26.1 msec
  .                                         4.9 msec    100.0 %
    worker_thread kthread kernel_thread_helper
Process kworker/1:1 (143) Total: 47.1 msec
  .                                         4.9 msec    100.0 %
    worker_thread kthread kernel_thread_helper
Process kworker/3:1 (150) Total: 4.6 msec
  .                                         3.1 msec    100.0 %
    worker_thread kthread kernel_thread_helper
Process jbd2/dm-1-8 (376) Total: 66.7 msec
  Writing buffer to disk (synchronous)     66.7 msec    100.0 %
kernel BUG at /usr/src/btrfs-work/fs/btrfs/extent-tree.c:2195
Hi,

While testing with my Ceph cluster I saw some btrfs messages: http://pastebin.com/URN3ShVb

I'm not sure when these messages came up (that is, in what state the OSD was at the time).

To keep up with the recent btrfs changes I'm using Josef's btrfs-work repository ( aba63cd31ab85e3ec7e9805fadc77dad8b7fc945 ) with the 2.6.38 kernel.

One of my OSDs (Object Store Daemons) is still blocking; this is the OSD which is using /dev/sdc (see the pastebin errors about sdc). It's in state D, and the stack is showing:

root@noisy:~# cat /proc/1974/task/2043/stack
[a033fc3a] btrfs_commit_transaction_async+0x25a/0x2e0 [btrfs]
[a036e48e] btrfs_mksubvol+0x2ae/0x350 [btrfs]
[a036e62a] btrfs_ioctl_snap_create_transid+0xfa/0x150 [btrfs]
[a036e709] btrfs_ioctl_snap_create_v2+0x89/0x100 [btrfs]
[a0371692] btrfs_ioctl+0x762/0xa90 [btrfs]
[8116de1d] vfs_ioctl+0x1d/0x50
[8116e8b9] do_vfs_ioctl+0x69/0x1d0
[8116eab4] sys_ioctl+0x94/0xa0
[8100c002] system_call_fastpath+0x16/0x1b
[] 0x
root@noisy:~#

I don't know if it is related to the messages in my dmesg, but I thought I'd send it anyway.

Is this a known bug?

Thank you,

Wido
null pointer dereference in iov_iter_copy_from_user_atomic while updating rpm packages
Hi,

While updating my Fedora rawhide installation, I got the Oops listed at the end of this email. Is this a known bug (I didn't find anything specific), or should I file a bug?

Thank you in advance, Clemens

Feb 10 10:59:45 testbox kernel: [ 524.495751] BUG: unable to handle kernel NULL pointer dereference at (null)
Feb 10 10:59:45 testbox kernel: [ 524.496006] IP: [c04267a2] kmap_atomic_prot+0x1c/0x111
Feb 10 10:59:45 testbox kernel: [ 524.496006] *pde =
Feb 10 10:59:45 testbox kernel: [ 524.496006] Oops: [#1] SMP
Feb 10 10:59:45 testbox kernel: [ 524.496006] last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
Feb 10 10:59:45 testbox kernel: [ 524.496006] Modules linked in: sunrpc cpufreq_ondemand acpi_cpufreq mperf ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables snd_hda_codec_si3054 snd_hda_codec_realtek arc4 snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device iwl3945 snd_pcm iwlcore mac80211 snd_timer ppdev e1000e snd cfg80211 parport_pc soundcore iTCO_wdt toshiba_bluetooth joydev parport snd_page_alloc toshiba_acpi microcode iTCO_vendor_support sparse_keymap rfkill uinput ipv6 btrfs zlib_deflate libcrc32c sdhci_pci sdhci firewire_ohci mmc_core firewire_core crc_itu_t yenta_socket i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
Feb 10 10:59:45 testbox kernel: [ 524.496006]
Feb 10 10:59:45 testbox kernel: [ 524.496006] Pid: 1465, comm: build-locale-ar Not tainted 2.6.38-0.rc3.git4.1.fc15.i686 #1 Portable PC/Tecra A8
Feb 10 10:59:45 testbox kernel: [ 524.496006] EIP: 0060:[c04267a2] EFLAGS: 00210202 CPU: 0
Feb 10 10:59:45 testbox kernel: [ 524.496006] EIP is at kmap_atomic_prot+0x1c/0x111
Feb 10 10:59:45 testbox kernel: [ 524.496006] EAX: f1d56000 EBX: f1d57eb8 ECX: EDX: 0163
Feb 10 10:59:45 testbox kernel: [ 524.496006] ESI: EDI: 0163 EBP: f1d57de8 ESP: f1d57dd4
Feb 10 10:59:45 testbox kernel: [ 524.496006] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Feb 10 10:59:45 testbox kernel: [ 524.496006] Process build-locale-ar (pid: 1465, ti=f1d56000 task=f1d1f110 task.ti=f1d56000)
Feb 10 10:59:45 testbox kernel: [ 524.496006] Stack:
Feb 10 10:59:45 testbox kernel: [ 524.496006] f1d57df0 f1d57eb8 1000 f1d57df0 c04268aa f1d57e08
Feb 10 10:59:45 testbox kernel: [ 524.496006] c04ab3cd 012c 1000 f1d57e2c f8217b41 012c
Feb 10 10:59:45 testbox kernel: [ 524.496006] 1010 0002 1000 f1d57eb8 113c f1d57edc f8218129
Feb 10 10:59:45 testbox kernel: [ 524.496006] Call Trace:
Feb 10 10:59:45 testbox kernel: [ 524.496006] [c04268aa] __kmap_atomic+0x13/0x15
Feb 10 10:59:45 testbox kernel: [ 524.496006] [c04ab3cd] iov_iter_copy_from_user_atomic+0x28/0x6c
Feb 10 10:59:45 testbox kernel: [ 524.496006] [f8217b41] btrfs_copy_from_user.isra.6+0x5c/0x96 [btrfs]
Feb 10 10:59:45 testbox kernel: [ 524.496006] [f8218129] btrfs_file_aio_write+0x480/0x79b [btrfs]
Feb 10 10:59:45 testbox kernel: [ 524.496006] [c04dd8e4] ? mem_cgroup_update_page_stat+0x1a/0xd4
Feb 10 10:59:45 testbox kernel: [ 524.496006] [c04e3e76] do_sync_write+0x96/0xcf
Feb 10 10:59:45 testbox kernel: [ 524.496006] [c04e4265] ? rw_verify_area+0xd0/0xf3
Feb 10 10:59:45 testbox kernel: [ 524.496006] [c04e44fd] vfs_write+0x8f/0xd7
Feb 10 10:59:45 testbox kernel: [ 524.496006] [c04e3de0] ? do_sync_write+0x0/0xcf
Feb 10 10:59:45 testbox kernel: [ 524.496006] [c04e46bf] sys_write+0x42/0x63
Feb 10 10:59:45 testbox kernel: [ 524.496006] [c07d449c] syscall_call+0x7/0xb
Feb 10 10:59:45 testbox kernel: [ 524.496006] Code: 26 00 8b 15 08 b9 af c0 e8 58 f9 ff ff 5d c3 55 89 e5 57 56 53 83 ec 08 3e 8d 74 26 00 89 c6 89 e0 25 00 e0 ff ff 89 d7 ff 40 14 8b 06 c1 e8 1e 69 c0 80 03 00 00 05 00 07 a3 c0 e8 49 fe ff ff
Feb 10 10:59:45 testbox kernel: [ 524.496006] EIP: [c04267a2] kmap_atomic_prot+0x1c/0x111 SS:ESP 0068:f1d57dd4
Feb 10 10:59:45 testbox kernel: [ 524.496006] CR2:
Feb 10 10:59:45 testbox kernel: [ 524.582447] ---[ end trace e16f2400ae6eb809 ]---
Feb 10 10:59:45 testbox kernel: [ 524.584816] note: build-locale-ar[1465] exited with preempt_count 2
Feb 10 10:59:45 testbox kernel: [ 524.584819] BUG: sleeping function called from invalid context at kernel/rwsem.c:21
Feb 10 10:59:45 testbox kernel: [ 524.584822] in_atomic(): 1, irqs_disabled(): 0, pid: 1465, name: build-locale-ar
Feb 10 10:59:45 testbox kernel: [ 524.584828] Pid: 1465, comm: build-locale-ar Tainted: G D 2.6.38-0.rc3.git4.1.fc15.i686 #1
Feb 10 10:59:45 testbox kernel: [ 524.584830] Call Trace:
Feb 10 10:59:45 testbox kernel: [ 524.584835] [c042e20a] ? __might_sleep+0xdd/0xe4
Feb 10 10:59:45 testbox kernel: [ 524.584839] [c07d382c] ? down_read+0x1c/0x30
Feb 10 10:59:45 testbox kernel: [ 524.584843] [c046c69f] ? acct_collect+0x3e/0x138
Feb 10 10:59:45 testbox kernel: [ 524.584847] [c043da92] ? do_exit+0x1d0/0x62c
Feb 10
Re: null pointer dereference in iov_iter_copy_from_user_atomic while updating rpm packages
Excerpts from Clemens Eisserer's message of 2011-02-11 18:05:55 -0500:
> Hi,
>
> While updating my Fedora rawhide installation, I got the oops listed at the end of this email. Is this a known bug (I didn't find anything specific), or should I file a bug?
>
> Thank you in advance, Clemens

I think we've fixed this in rc4, or you can git pull from the current btrfs-unstable tree.

-chris

> Feb 10 10:59:45 testbox kernel: [ 524.495751] BUG: unable to handle kernel NULL pointer dereference at (null)
> Feb 10 10:59:45 testbox kernel: [ 524.496006] IP: [c04267a2] kmap_atomic_prot+0x1c/0x111
> Feb 10 10:59:45 testbox kernel: [ 524.496006] *pde =
> Feb 10 10:59:45 testbox kernel: [ 524.496006] Oops: [#1] SMP
> Feb 10 10:59:45 testbox kernel: [ 524.496006] last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
> Feb 10 10:59:45 testbox kernel: [ 524.496006] Modules linked in: sunrpc cpufreq_ondemand acpi_cpufreq mperf ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables snd_hda_codec_si3054 snd_hda_codec_realtek arc4 snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device iwl3945 snd_pcm iwlcore mac80211 snd_timer ppdev e1000e snd cfg80211 parport_pc soundcore iTCO_wdt toshiba_bluetooth joydev parport snd_page_alloc toshiba_acpi microcode iTCO_vendor_support sparse_keymap rfkill uinput ipv6 btrfs zlib_deflate libcrc32c sdhci_pci sdhci firewire_ohci mmc_core firewire_core crc_itu_t yenta_socket i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
> Feb 10 10:59:45 testbox kernel: [ 524.496006]
> Feb 10 10:59:45 testbox kernel: [ 524.496006] Pid: 1465, comm: build-locale-ar Not tainted 2.6.38-0.rc3.git4.1.fc15.i686 #1 Portable PC/Tecra A8
> Feb 10 10:59:45 testbox kernel: [ 524.496006] EIP: 0060:[c04267a2] EFLAGS: 00210202 CPU: 0
> Feb 10 10:59:45 testbox kernel: [ 524.496006] EIP is at kmap_atomic_prot+0x1c/0x111
> Feb 10 10:59:45 testbox kernel: [ 524.496006] EAX: f1d56000 EBX: f1d57eb8 ECX: EDX: 0163
> Feb 10 10:59:45 testbox kernel: [ 524.496006] ESI: EDI: 0163 EBP: f1d57de8 ESP: f1d57dd4
> Feb 10 10:59:45 testbox kernel: [ 524.496006] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> Feb 10 10:59:45 testbox kernel: [ 524.496006] Process build-locale-ar (pid: 1465, ti=f1d56000 task=f1d1f110 task.ti=f1d56000)
> Feb 10 10:59:45 testbox kernel: [ 524.496006] Stack:
> Feb 10 10:59:45 testbox kernel: [ 524.496006] f1d57df0 f1d57eb8 1000 f1d57df0 c04268aa f1d57e08
> Feb 10 10:59:45 testbox kernel: [ 524.496006] c04ab3cd 012c 1000 f1d57e2c f8217b41 012c
> Feb 10 10:59:45 testbox kernel: [ 524.496006] 1010 0002 1000 f1d57eb8 113c f1d57edc f8218129
> Feb 10 10:59:45 testbox kernel: [ 524.496006] Call Trace:
> Feb 10 10:59:45 testbox kernel: [ 524.496006] [c04268aa] __kmap_atomic+0x13/0x15
> Feb 10 10:59:45 testbox kernel: [ 524.496006] [c04ab3cd] iov_iter_copy_from_user_atomic+0x28/0x6c
> Feb 10 10:59:45 testbox kernel: [ 524.496006] [f8217b41] btrfs_copy_from_user.isra.6+0x5c/0x96 [btrfs]
> Feb 10 10:59:45 testbox kernel: [ 524.496006] [f8218129] btrfs_file_aio_write+0x480/0x79b [btrfs]
> Feb 10 10:59:45 testbox kernel: [ 524.496006] [c04dd8e4] ? mem_cgroup_update_page_stat+0x1a/0xd4
> Feb 10 10:59:45 testbox kernel: [ 524.496006] [c04e3e76] do_sync_write+0x96/0xcf
> Feb 10 10:59:45 testbox kernel: [ 524.496006] [c04e4265] ? rw_verify_area+0xd0/0xf3
> Feb 10 10:59:45 testbox kernel: [ 524.496006] [c04e44fd] vfs_write+0x8f/0xd7
> Feb 10 10:59:45 testbox kernel: [ 524.496006] [c04e3de0] ? do_sync_write+0x0/0xcf
> Feb 10 10:59:45 testbox kernel: [ 524.496006] [c04e46bf] sys_write+0x42/0x63
> Feb 10 10:59:45 testbox kernel: [ 524.496006] [c07d449c] syscall_call+0x7/0xb
> Feb 10 10:59:45 testbox kernel: [ 524.496006] Code: 26 00 8b 15 08 b9 af c0 e8 58 f9 ff ff 5d c3 55 89 e5 57 56 53 83 ec 08 3e 8d 74 26 00 89 c6 89 e0 25 00 e0 ff ff 89 d7 ff 40 14 8b 06 c1 e8 1e 69 c0 80 03 00 00 05 00 07 a3 c0 e8 49 fe ff ff
> Feb 10 10:59:45 testbox kernel: [ 524.496006] EIP: [c04267a2] kmap_atomic_prot+0x1c/0x111 SS:ESP 0068:f1d57dd4
> Feb 10 10:59:45 testbox kernel: [ 524.496006] CR2:
> Feb 10 10:59:45 testbox kernel: [ 524.582447] ---[ end trace e16f2400ae6eb809 ]---
> Feb 10 10:59:45 testbox kernel: [ 524.584816] note: build-locale-ar[1465] exited with preempt_count 2
> Feb 10 10:59:45 testbox kernel: [ 524.584819] BUG: sleeping function called from invalid context at kernel/rwsem.c:21
> Feb 10 10:59:45 testbox kernel: [ 524.584822] in_atomic(): 1, irqs_disabled(): 0, pid: 1465, name: build-locale-ar
> Feb 10 10:59:45 testbox kernel: [ 524.584828] Pid: 1465, comm: build-locale-ar Tainted: G D 2.6.38-0.rc3.git4.1.fc15.i686 #1
> Feb 10 10:59:45 testbox kernel: [ 524.584830] Call Trace:
> Feb 10 10:59:45 testbox kernel: [ 524.584835] [c042e20a] ?
Re: 2.6.37: Multi-second I/O latency while untarring
On Fri, Feb 11, 2011 at 10:44 AM, Chris Mason chris.ma...@oracle.com wrote:
> Excerpts from Andrew Lutomirski's message of 2011-02-11 10:08:52 -0500:
>> As I type this, I have an ssh process running that's dumping data into a fifo at high speed (maybe 500Mbps) and a tar process that's untarring from the same fifo onto btrfs. The btrfs fs is mounted -o space_cache,compress. This machine has 8GB ram, 8 logical cores, and a fast (i7-2600) CPU, so it's not an issue with the machine struggling under load.
>>
>> Every few tens of seconds, my system stalls for several seconds. These stalls cause keyboard input to be lost, firefox to hang, etc.
>>
>> Setting tar's ionice priority to best effort / 7 or to idle makes no difference.
>>
>> ionice idle and queue_depth = 1 on the disk (a slow 2TB WD) also makes no difference.
>>
>> max_sectors_kb = 64 in addition to the above doesn't help either.
>>
>> latencytop shows regular instances of 2-7 *second* latency, variously in sync_page, start_transaction, btrfs_start_ordered_extent, and do_get_write_access (from jbd2 on my ext4 root partition).
>>
>> echo 3 > drop_caches gave me 7 GB free RAM. I still had stalls when 4-5 GB were still free (so it shouldn't be a problem with important pages being evicted).
>>
>> In case it matters, all of my partitions are on LVM on dm-crypt, but this machine has AES-NI so the overhead from that should be minimal. In fact, overall CPU usage is only about 10%.
>>
>> What gives? I thought this stuff was supposed to be better on modern kernels.
>
> We can tell more if you post the full traces from latencytop. I have a patch here for latencytop that adds a -c mode, which dumps the traces out to a text file.
>
> http://oss.oracle.com/~mason/latencytop.patch
>
> Based on what you have here, I think it's probably a latency problem between btrfs and the dm-crypt stuff. How easily can you set up a test partition without dm-crypt?

Done, on the same physical disk as before. The latency is just as bad. On this test, I wrote a total of 3.1G, which is under half of my RAM.
That should rule out lots of VM issues. latencytop trace below.

The impression I get (from watching the disk activity light) is that the disk is mostly idle but every now and then writes out a ton of data. While it's writing, the system often becomes unusable.

P.S. How bad is this? I got it on both disks.

btrfs: free space inode generation (0) did not match free space cache generation (11070) for block group 1103101952

=== Fri Feb 11 19:30:57 2011
Globals:
Cause                                          Maximum    Percentage
Writing a page to disk                     2009.0 msec        19.7 %
fsync() on a file (type 'F' for details)    612.2 msec         5.0 %
synchronous write                           573.6 msec         1.8 %
Page fault                                   57.3 msec         0.7 %
Writing buffer to disk (synchronous)         45.2 msec         0.1 %
Unlinking file                               12.6 msec         0.0 %
Waiting for event (select)                    5.0 msec        22.3 %
Reading from a pipe                           5.0 msec        29.9 %
Waiting for event (poll)                      5.0 msec        17.8 %

Process details:
Process kthreadd (2) Total: 1.9 msec
    kthreadd kernel thread                        1.9 msec       100.0 %
        kthreadd kernel_thread_helper
Process ksoftirqd/0 (3) Total: 18.5 msec
    [run_ksoftirqd]                               4.0 msec       100.0 %
        run_ksoftirqd kthread kernel_thread_helper
Process ksoftirqd/1 (10) Total: 19.6 msec
    [run_ksoftirqd]                               4.9 msec       100.0 %
        run_ksoftirqd kthread kernel_thread_helper
Process kworker/0:1 (11) Total: 556.3 msec
    .                                             5.0 msec       100.0 %
        worker_thread kthread kernel_thread_helper
Process ksoftirqd/2 (15) Total: 8.1 msec
    [run_ksoftirqd]                               2.9 msec       100.0 %
        run_ksoftirqd kthread kernel_thread_helper
Process ksoftirqd/4 (23) Total: 11.2 msec
    [run_ksoftirqd]                               4.3 msec       100.0 %
        run_ksoftirqd kthread kernel_thread_helper
Process scsi_eh_1 (62) Total: 38.8 msec
    SCSI error handler                            0.9 msec        39.9 %
        scsi_error_handler kthread kernel_thread_helper
    Executing internal ATA command                0.7 msec        60.1 %
        ata_exec_internal_sg ata_exec_internal atapi_eh_request_sense ata_eh_link_autopsy ata_eh_autopsy sata_pmp_error_handler ahci_error_handler ata_scsi_error scsi_error_handler kthread kernel_thread_helper
Process kworker/u:4 (69) Total: 616.5 msec
    Creating block layer request                 54.9 msec        77.8 %
        get_request_wait __make_request generic_make_request kcryptd_crypt_write_io_submit kcryptd_crypt process_one_work worker_thread kthread kernel_thread_helper
    .                                             5.0 msec        22.2 %
        worker_thread kthread kernel_thread_helper
Process kworker/u:5 (70) Total: 1712.3 msec
    Creating block layer request                492.8 msec        94.3 %