Copied Files from Btrfs partition larger than original

2011-02-14 Thread MOB
While doing some maintenance on my personal video server, I noticed that
files copied off of my btrfs partitions end up larger than the originals.

First partition is the original:
http://pastebin.com/GM5xWetR

I have 3 affected partitions. This appears to have started with 2.6.37, but
it could have started earlier. I have ~3300 video files, of which ~840 are on
btrfs partitions; files get shuffled on and off those partitions to even out
free space.

Thank you,
Michael
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Copied Files from Btrfs partition larger than original

2011-02-14 Thread Hugo Mills
On Mon, Feb 14, 2011 at 12:41:46AM -0800, MOB wrote:
> So as I'm doing some maintenance on my personal video server, I'm noticing
> that when I'm copying files off of my btrfs partitions, they are getting
> larger...
>
> First partition is the original:
> http://pastebin.com/GM5xWetR
>
> I have 3 affected partitions, This appears to have started with 2.6.37 but
> could have started happening before. I have ~3300 video files where ~840 are
> on btrfs partitions that randomly get shuffled on/off for free space
> distribution

   Pastebins aren't forever. For the archives:

-- (begin)
ls -lah /mnt/store-p00/1280x720/~NCIS\ Los\ Angeles~2010-05-11~720.mov
-rw-rw-r-- 1 root hdhr 1.7G Nov  6 18:39 /mnt/store-p00/1280x720/~NCIS Los Angeles~2010-05-11~720.mov

ls -lah /hdhr/demux/1280x720/~NCIS Los Angeles~2010-05-11~720.mov
-rw-rw-r-- 1 root hdhr 3.7G Nov  6 18:39 /hdhr/demux/1280x720/~NCIS Los Angeles~2010-05-11~720.mov
-- (end)
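
   For anyone who wants to check their own files, here is a rough Python
sketch (not from the original report) that compares an original and its copy
by apparent size, allocated size and content hash. The function names are
made up for illustration, and the 512-byte unit for st_blocks is the Linux
convention, assumed here.

```python
import hashlib
import os


def file_report(path):
    """Return apparent size, allocated bytes and SHA-256 digest of a file."""
    st = os.stat(path)
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so multi-gigabyte videos do not fill memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return {
        "apparent_bytes": st.st_size,            # the size ls -l reports
        "allocated_bytes": st.st_blocks * 512,   # roughly what du reports (Linux)
        "sha256": digest.hexdigest(),
    }


def compare(original, copy):
    """Compare two files by apparent size and by content hash."""
    a, b = file_report(original), file_report(copy)
    return {
        "same_size": a["apparent_bytes"] == b["apparent_bytes"],
        "same_content": a["sha256"] == b["sha256"],
        "original": a,
        "copy": b,
    }
```

   If same_size is false, the copy really does differ in length and it is
not just a difference in how much disk space each file occupies.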

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
  --- Attempted murder, now honestly, what is that?  Do they give a ---  
  Nobel Prize for attempted chemistry?   




2.6.37 kernel BUG at fs/btrfs/inode.c:6752!

2011-02-14 Thread Zhong, Xin
We build packages in a chroot environment under kvm-qemu, and the root fs is
btrfs. The system hung while installing packages, and we found this error
message in dmesg:

[   84.320466] btrfs: use compression
[  288.711396] ------------[ cut here ]------------
[  288.711569] kernel BUG at fs/btrfs/inode.c:6752!
[  288.711730] invalid opcode:  [#1] PREEMPT SMP 
[  288.712014] last sysfs file: /sys/devices/virtual/bdi/btrfs-1/uevent
[  288.712014] Modules linked in: dm_snapshot dm_mod
[  288.712014] 
[  288.712014] Pid: 869, comm: ldconfig Not tainted 2.6.37-14.3 #1 /Bochs
[  288.712014] EIP: 0060:[c1191f0a] EFLAGS: 00010286 CPU: 0
[  288.712014] EIP is at btrfs_rename+0x2d6/0x41c
[  288.712014] EAX: ffe4 EBX: 4d584e22 ECX: 0007 EDX: c3a88838
[  288.712014] ESI: 2a60ff2b EDI: db43fcdc EBP: c3acfe5c ESP: c3acfe18
[  288.712014]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[  288.712014] Process ldconfig (pid: 869, ti=c3ace000 task=c3a883b0 
task.ti=c3ace000)
[  288.712014] Stack:
[  288.712014]  0246 00acfe2c f4114900 01bfb428 f3f02988 f4114b1c db7a7358 
f4114b1c
[  288.712014]  f3f02988  debfb38c f247a4a0 0336  c155ee98 
f4fcd9d0
[  288.712014]  debfb38c c3acfe80 c10fe222 f4fcd9d0 f4114b1c db7a7358 f4114b1c 

[  288.712014] Call Trace:
[  288.712014]  [c10fe222] ? vfs_rename_other+0x62/0xc7
[  288.712014]  [c10fec68] ? vfs_rename+0xda/0x1a4
[  288.712014]  [c1100d20] ? sys_renameat+0x16d/0x1ce
[  288.712014]  [c104a93b] ? do_page_fault+0x330/0x35d
[  288.712014]  [c104a60b] ? do_page_fault+0x0/0x35d
[  288.712014]  [c1100d93] ? sys_rename+0x12/0x17
[  288.712014]  [c154065d] ? syscall_call+0x7/0xb
[  288.712014] Code: c4 10 eb 23 8b 55 d4 8b 4d d0 8b 42 2c ff 40 30 8b 45 e8 
ff 72 40 ff 72 44 ff 72 2c 8b 55 dc e8 ce dc ff ff 83 c4 0c 85 c0 74 02 0f 0b 
80 7d cb 00 0f 84 85 00 00 00 e8 ea 1e ef ff 8b 4d e4 83 
[  288.712014] EIP: [c1191f0a] btrfs_rename+0x2d6/0x41c SS:ESP 0068:c3acfe18
[  288.730501] ---[ end trace f18bb6381110715f ]---

We also captured the output of sysrq+w:

[15976.024048] SysRq : Show Blocked State
[15976.025023]   task                PC stack   pid father
[15976.025023] mic-image-cre D 0043     0   273    271 0x
[15976.025023]  f452bee0 0046 4e94762e 0043 c18920c0 00207c5b  
c18920c0
[15976.025023]  f45988b8 4e73f9d3 0043 f4598630 0001 f3f02c20 c107c3a6 
f3f02c30
[15976.025023]  0001 f3f02bd8 c153f063 f3f02c10 f452bec4 0046 0046 
f3f02bd8
[15976.025023] Call Trace:
[15976.025023]  [c107c3a6] ? prepare_to_wait+0x4f/0x56
[15976.025023]  [c153f063] ? mutex_unlock+0x8/0xa
[15976.025023]  [c108e368] ? trace_hardirqs_on+0xb/0xd
[15976.025023]  [c11b3955] wait_for_writer.clone.17+0x89/0xaf
[15976.025023]  [c107c1a7] ? autoremove_wake_function+0x0/0x2f
[15976.025023]  [c11b3aa9] btrfs_sync_log+0xc2/0x43b
[15976.025023]  [c108e368] ? trace_hardirqs_on+0xb/0xd
[15976.025023]  [c1193c22] btrfs_sync_file+0x116/0x156
[15976.025023]  [c1113e2c] vfs_fsync_range+0x48/0x65
[15976.025023]  [c1113ec5] vfs_fsync+0x14/0x16
[15976.025023]  [c1113eeb] do_fsync+0x24/0x34
[15976.025023]  [c1114167] sys_fdatasync+0x10/0x12
[15976.025023]  [c154065d] syscall_call+0x7/0xb
[15976.025023] btrfs-transac D 004a     0   337      2 0x
[15976.025023]  f3f71ecc 0046 913eb54f 004a c18920c0 04c2  
c18920c0
[15976.025023]  f47f1038 913eb08d 004a f47f0db0  f4561e70 c153dddf 

[15976.025023]   c119fa68 f3f71ec4 0002  f4561ffc 0002 

[15976.025023] Call Trace:
[15976.025023]  [c153dddf] ? schedule+0x6c0/0x774
[15976.025023]  [c119fa68] ? btrfs_run_ordered_operations+0x2a/0x195
[15976.025023]  [c119fa78] ? btrfs_run_ordered_operations+0x3a/0x195
[15976.025023]  [c107c36f] ? prepare_to_wait+0x18/0x56
[15976.025023]  [c153e11f] schedule_timeout+0x18/0x2c8
[15976.025023]  [c107c3a6] ? prepare_to_wait+0x4f/0x56
[15976.025023]  [c108e368] ? trace_hardirqs_on+0xb/0xd
[15976.025023]  [c107c3a6] ? prepare_to_wait+0x4f/0x56
[15976.025023]  [c1188089] btrfs_commit_transaction+0x252/0x5c4
[15976.025023]  [c107c1a7] ? autoremove_wake_function+0x0/0x2f
[15976.025023]  [c11833f3] transaction_kthread+0x141/0x1dd
[15976.025023]  [c1051a2d] ? complete+0x34/0x3e
[15976.025023]  [c11832b2] ? transaction_kthread+0x0/0x1dd
[15976.025023]  [c107bd84] kthread+0x63/0x68
[15976.025023]  [c107bd21] ? kthread+0x0/0x68
[15976.025023]  [c102eaba] kernel_thread_helper+0x6/0x10
[15976.025023] Sched Debug Version: v0.09, 2.6.37-14.3 #1
[15976.025023] now at 15976066.938938 msecs
[15976.025023]   .jiffies : 15676024
[15976.025023]   .sysctl_sched_latency: 6.00
[15976.025023]   .sysctl_sched_min_granularity: 0.75
[15976.025023]   .sysctl_sched_wakeup_granularity : 1.00
[15976.025023]   .sysctl_sched_child_runs_first   : 0
[15976.025023]   .sysctl_sched_features   : 31855
[15976.025023]   

Re: 2.6.37 kernel BUG at fs/btrfs/inode.c:6752!

2011-02-14 Thread cwillu
On Mon, Feb 14, 2011 at 3:59 AM, Zhong, Xin xin.zh...@intel.com wrote:
 We build packages in a kvm-qemu chroot environment. And the root fs is btrfs. 
 It hang during installing packages. And we found error message in dmesg:

 [   84.320466] btrfs: use compression
 [  288.711396] [ cut here ]
 [  288.711569] kernel BUG at fs/btrfs/inode.c:6752!
 [  288.711730] invalid opcode:  [#1] PREEMPT SMP
 [  288.712014] last sysfs file: /sys/devices/virtual/bdi/btrfs-1/uevent
 [  288.712014] Modules linked in: dm_snapshot dm_mod
 [  288.712014]
 [  288.712014] Pid: 869, comm: ldconfig Not tainted 2.6.37-14.3 #1 /Bochs
 [  288.712014] EIP: 0060:[c1191f0a] EFLAGS: 00010286 CPU: 0
 [  288.712014] EIP is at btrfs_rename+0x2d6/0x41c
 [  288.712014] EAX: ffe4 EBX: 4d584e22 ECX: 0007 EDX: c3a88838
 [  288.712014] ESI: 2a60ff2b EDI: db43fcdc EBP: c3acfe5c ESP: c3acfe18
 [  288.712014]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
 [  288.712014] Process ldconfig (pid: 869, ti=c3ace000 task=c3a883b0 
 task.ti=c3ace000)
 [  288.712014] Stack:
 [  288.712014]  0246 00acfe2c f4114900 01bfb428 f3f02988 f4114b1c 
 db7a7358 f4114b1c
 [  288.712014]  f3f02988  debfb38c f247a4a0 0336  
 c155ee98 f4fcd9d0
 [  288.712014]  debfb38c c3acfe80 c10fe222 f4fcd9d0 f4114b1c db7a7358 
 f4114b1c 
 [  288.712014] Call Trace:
 [  288.712014]  [c10fe222] ? vfs_rename_other+0x62/0xc7
 [  288.712014]  [c10fec68] ? vfs_rename+0xda/0x1a4
 [  288.712014]  [c1100d20] ? sys_renameat+0x16d/0x1ce
 [  288.712014]  [c104a93b] ? do_page_fault+0x330/0x35d
 [  288.712014]  [c104a60b] ? do_page_fault+0x0/0x35d
 [  288.712014]  [c1100d93] ? sys_rename+0x12/0x17
 [  288.712014]  [c154065d] ? syscall_call+0x7/0xb
 [  288.712014] Code: c4 10 eb 23 8b 55 d4 8b 4d d0 8b 42 2c ff 40 30 8b 45 e8 
 ff 72 40 ff 72 44 ff 72 2c 8b 55 dc e8 ce dc ff ff 83 c4 0c 85 c0 74 02 0f 
 0b 80 7d cb 00 0f 84 85 00 00 00 e8 ea 1e ef ff 8b 4d e4 83
 [  288.712014] EIP: [c1191f0a] btrfs_rename+0x2d6/0x41c SS:ESP 0068:c3acfe18
 [  288.730501] ---[ end trace f18bb6381110715f ]---

 We also got the output of sysrq+w as below:

 [15976.024048] SysRq : Show Blocked State
 [15976.025023]   task                PC stack   pid father
 [15976.025023] mic-image-cre D 0043     0   273    271 0x
 [15976.025023]  f452bee0 0046 4e94762e 0043 c18920c0 00207c5b 
  c18920c0
 [15976.025023]  f45988b8 4e73f9d3 0043 f4598630 0001 f3f02c20 
 c107c3a6 f3f02c30
 [15976.025023]  0001 f3f02bd8 c153f063 f3f02c10 f452bec4 0046 
 0046 f3f02bd8
 [15976.025023] Call Trace:
 [15976.025023]  [c107c3a6] ? prepare_to_wait+0x4f/0x56
 [15976.025023]  [c153f063] ? mutex_unlock+0x8/0xa
 [15976.025023]  [c108e368] ? trace_hardirqs_on+0xb/0xd
 [15976.025023]  [c11b3955] wait_for_writer.clone.17+0x89/0xaf
 [15976.025023]  [c107c1a7] ? autoremove_wake_function+0x0/0x2f
 [15976.025023]  [c11b3aa9] btrfs_sync_log+0xc2/0x43b
 [15976.025023]  [c108e368] ? trace_hardirqs_on+0xb/0xd
 [15976.025023]  [c1193c22] btrfs_sync_file+0x116/0x156
 [15976.025023]  [c1113e2c] vfs_fsync_range+0x48/0x65
 [15976.025023]  [c1113ec5] vfs_fsync+0x14/0x16
 [15976.025023]  [c1113eeb] do_fsync+0x24/0x34
 [15976.025023]  [c1114167] sys_fdatasync+0x10/0x12
 [15976.025023]  [c154065d] syscall_call+0x7/0xb
 [15976.025023] btrfs-transac D 004a     0   337      2 0x
 [15976.025023]  f3f71ecc 0046 913eb54f 004a c18920c0 04c2 
  c18920c0
 [15976.025023]  f47f1038 913eb08d 004a f47f0db0  f4561e70 
 c153dddf 
 [15976.025023]   c119fa68 f3f71ec4 0002  f4561ffc 
 0002 
 [15976.025023] Call Trace:
 [15976.025023]  [c153dddf] ? schedule+0x6c0/0x774
 [15976.025023]  [c119fa68] ? btrfs_run_ordered_operations+0x2a/0x195
 [15976.025023]  [c119fa78] ? btrfs_run_ordered_operations+0x3a/0x195
 [15976.025023]  [c107c36f] ? prepare_to_wait+0x18/0x56
 [15976.025023]  [c153e11f] schedule_timeout+0x18/0x2c8
 [15976.025023]  [c107c3a6] ? prepare_to_wait+0x4f/0x56
 [15976.025023]  [c108e368] ? trace_hardirqs_on+0xb/0xd
 [15976.025023]  [c107c3a6] ? prepare_to_wait+0x4f/0x56
 [15976.025023]  [c1188089] btrfs_commit_transaction+0x252/0x5c4
 [15976.025023]  [c107c1a7] ? autoremove_wake_function+0x0/0x2f
 [15976.025023]  [c11833f3] transaction_kthread+0x141/0x1dd
 [15976.025023]  [c1051a2d] ? complete+0x34/0x3e
 [15976.025023]  [c11832b2] ? transaction_kthread+0x0/0x1dd
 [15976.025023]  [c107bd84] kthread+0x63/0x68
 [15976.025023]  [c107bd21] ? kthread+0x0/0x68
 [15976.025023]  [c102eaba] kernel_thread_helper+0x6/0x10
 [15976.025023] Sched Debug Version: v0.09, 2.6.37-14.3 #1
 [15976.025023] now at 15976066.938938 msecs
 [15976.025023]   .jiffies                                 : 15676024
 [15976.025023]   .sysctl_sched_latency                    : 6.00
 [15976.025023]   .sysctl_sched_min_granularity            : 0.75
 [15976.025023]   .sysctl_sched_wakeup_granularity         : 

Re: Question on subvolumes and mount options

2011-02-14 Thread Yuri D'Elia
On Sun, 13 Feb 2011 19:18:20 +
Hugo Mills hugo-l...@carfax.org.uk wrote:

> Yes, it's the same piece of storage, just appearing at more than
> one point in your overall filesystem. Similar to the way that bind
> mounts work.

I've noticed that I can also rename subvolumes using mv(1).
Can I move or re-arrange subvolumes relative to one another simply by using mv?

Does a move between subvolumes involve a copy?

> > So you would recommend creating both /root and /home subvolumes, to be
> > mounted separately, or create /root and /root/home subvolumes?
>
> The former.

Thanks. I can see why this is clearly more flexible in the long term.

> > What if I remount the /home subvol into /home2. What happens when I
> > touch a file through /home (nodatasum) and what happens when I use
> > /home2 - since both are available at the same time?
>
> They'll stay in sync with respect to the files written to either
> one. I'm not sure what the behaviour of nodatasum is with different
> mounts of the same subvolume.

Can we get an exact answer/behavior for that? :)

I mean, can I mount two different subvolumes in the same file system with 
different flags (such as ssd, nodatasum, compress)?



Re: btrfs: compression breaks cp and cross-FS mv, FS_IOC_FIEMAP bug?

2011-02-14 Thread Chris Mason
Excerpts from Josef Bacik's message of 2011-02-13 11:13:30 -0500:
> On Sun, Feb 13, 2011 at 06:07:36PM +0200, Marti Raudsepp wrote:
> > On Sun, Feb 13, 2011 at 17:57, Josef Bacik jo...@redhat.com wrote:
> > > Does the same problem happen when you use cp --sparse=never?
> >
> > You are right. cp --sparse=never does not cause data loss.
>
>
> So fiemap probably isn't doing the right thing when compression is enabled,
> which doesn't surprise me since we don't do the right thing with delalloc
> either.
> I will try and get to this soon.  Thanks,

This might be a bug in the cp code.  We're setting the disk extent to
zero but setting different flags to say we're inline and compressed.
The cp fiemap code might be ignoring the flags?

Or, it could just be delalloc ;)
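
To make the point about flags concrete, here is a small Python sketch of
how a FIEMAP consumer might decide whether an extent's physical offset can be
trusted. The flag values are the ones from the kernel's linux/fiemap.h; the
helper names and the policy are illustrative assumptions and are not taken
from the cp source.

```python
# Flag values as defined in the kernel header linux/fiemap.h.
FIEMAP_EXTENT_LAST = 0x0001         # last extent in the file
FIEMAP_EXTENT_UNKNOWN = 0x0002      # data location unknown
FIEMAP_EXTENT_DELALLOC = 0x0004     # location not yet allocated
FIEMAP_EXTENT_ENCODED = 0x0008      # data is compressed or otherwise encoded
FIEMAP_EXTENT_NOT_ALIGNED = 0x0100  # extent offsets may not be block aligned
FIEMAP_EXTENT_DATA_INLINE = 0x0200  # data is stored inline with metadata

FLAG_NAMES = [
    (FIEMAP_EXTENT_LAST, "LAST"),
    (FIEMAP_EXTENT_UNKNOWN, "UNKNOWN"),
    (FIEMAP_EXTENT_DELALLOC, "DELALLOC"),
    (FIEMAP_EXTENT_ENCODED, "ENCODED"),
    (FIEMAP_EXTENT_NOT_ALIGNED, "NOT_ALIGNED"),
    (FIEMAP_EXTENT_DATA_INLINE, "DATA_INLINE"),
]

# Hypothetical policy: with any of these set, fe_physical says nothing
# useful about where (or whether) the data lives on disk.
UNRELIABLE_PHYSICAL = (
    FIEMAP_EXTENT_UNKNOWN
    | FIEMAP_EXTENT_DELALLOC
    | FIEMAP_EXTENT_ENCODED
    | FIEMAP_EXTENT_DATA_INLINE
)


def flag_names(flags):
    """Decode a flags word into the names of the bits listed above."""
    return [name for bit, name in FLAG_NAMES if flags & bit]


def physical_offset_is_reliable(flags):
    """True if none of the 'do not trust fe_physical' bits are set."""
    return not (flags & UNRELIABLE_PHYSICAL)
```

A consumer that only looks at a zero physical offset, and never calls
something like physical_offset_is_reliable(), could mistake an inline
compressed extent for one that holds no data.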

-chris


Re: 2.6.37: Multi-second I/O latency while untarring

2011-02-14 Thread Chris Mason
Excerpts from Andrew Lutomirski's message of 2011-02-11 19:35:02 -0500:
 On Fri, Feb 11, 2011 at 10:44 AM, Chris Mason chris.ma...@oracle.com wrote:
  Excerpts from Andrew Lutomirski's message of 2011-02-11 10:08:52 -0500:
  As I type this, I have an ssh process running that's dumping data into
  a fifo at high speed (maybe 500Mbps) and a tar process that's
  untarring from the same fifo onto btrfs.  The btrfs fs is mounted -o
  space_cache,compress.  This machine has 8GB ram, 8 logical cores, and
  a fast (i7-2600) CPU, so it's not an issue with the machine struggling
  under load.
 
  Every few tens of seconds, my system stalls for several seconds.
  These stalls cause keyboard input to be lost, firefox to hang, etc.
 
  Setting tar's ionice priority to best effort / 7 or to idle makes no 
  difference.
 
  ionice idle and queue_depth = 1 on the disk (a slow 2TB WD) also makes
  no difference.
 
  max_sectors_kb = 64 in addition to the above doesn't help either.
 
  latencytop shows regular instances of 2-7 *second* latency, variously
  in sync_page, start_transaction, btrfs_start_ordered_extent, and
  do_get_write_access (from jbd2 on my ext4 root partition).
 
  echo 3 >drop_caches gave me 7 GB free RAM.  I still had stalls when
  4-5 GB were still free (so it shouldn't be a problem with important
  pages being evicted).
 
  In case it matters, all of my partitions are on LVM on dm-crypt, but
  this machine has AES-NI so the overhead from that should be minimal.
  In fact, overall CPU usage is only about 10%.
 
  What gives?  I thought this stuff was supposed to be better on modern 
  kernels.
 
  We can tell more if you post the full traces from latencytop.  I have a
  patch here for latencytop that adds a -c mode, which dumps the traces
  out to a text file.
 
  http://oss.oracle.com/~mason/latencytop.patch
 
  Based on what you have here, I think it's probably a latency problem
  between btrfs and the dm-crypt stuff.  How easily can you set up a test
  partition without dm-crypt?
 
 Done, on the same physical disk as before.  The latency is just as
 bad.  On this test, I wrote a total of 3.1G, which is under half of my
 RAM.  That should rule out lots of VM issues.  latencytop trace below.

Just to confirm: by "the same physical disk", you mean without dm-crypt?

 
 The impression I get (from watching the disk activity light) is that
 the disk is mostly idle but every now and then writes out a ton of
 data.  While it's writing, the system often becomes unusable.

Could you please run btrfs fi df /mnt (where /mnt is your test filesystem)?

 
 P.S.  How bad is this?  I got it on both disks.
 btrfs: free space inode generation (0) did not match free space cache
 generation (11070) for block group 1103101952

We got rid of these in later kernels, they are fine.

The latencytop data shows us basically waiting for the disk.  We're
either waiting for synchronous reads or writes, and we're heavily
waiting for supers to be sent down to the disk as part of committing
transactions.

There are a few things I'd like you to try:

1) Try deadline instead of cfq, unless you're using deadline in which
case you could try cfq.

2) Try increasing the number of io requests we allow in flight:

echo 2048 > /sys/block/xxx/queue/nr_requests

Here xxx is your physical disk (like sda)

3) Try without firefox running.  Firefox is generating a lot of
synchronous IO here.  The btrfs log tries really hard to manage this
without making the box stall, but somehow we might not be doing well.

One place we don't do well is if your disk was freshly formatted and
you're still growing chunks to cover new writes.  In this case the
fsyncs done by firefox will lead to more expensive transaction commits.
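
Before changing either knob, it helps to record the current values so you
can put them back afterwards. A minimal read-only Python sketch, assuming the
usual sysfs layout (queue/scheduler lists the available schedulers with the
active one in square brackets); the function names and the sysfs_root
parameter are hypothetical and exist only to make the sketch testable:

```python
from pathlib import Path


def parse_scheduler(text):
    """Split a queue/scheduler line such as 'noop deadline [cfq]'.

    Returns (active, available); active is None if nothing is bracketed.
    """
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def queue_settings(disk, sysfs_root="/sys/block"):
    """Read the current I/O scheduler and nr_requests for a disk like 'sda'."""
    queue = Path(sysfs_root) / disk / "queue"
    active, available = parse_scheduler((queue / "scheduler").read_text())
    return {
        "scheduler": active,
        "available": available,
        "nr_requests": int((queue / "nr_requests").read_text()),
    }
```

Calling queue_settings("sda") before and after each change gives a record
of what was actually in effect for each latencytop run.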

-chris


Re: btrfs: compression breaks cp and cross-FS mv, FS_IOC_FIEMAP bug?

2011-02-14 Thread Marti Raudsepp
On Mon, Feb 14, 2011 at 17:01, Chris Mason chris.ma...@oracle.com wrote:
> Or, it could just be delalloc ;)

I suspect delalloc. After creating the file, filefrag reports 1
extent found, but for some reason it doesn't actually print out
details of the extent.

After a sync call, the extent appears and cp starts working as expected:

% rm -f foo bar
% echo foo > foo
% sync
% filefrag -v foo
Filesystem type is: 9123683e
File size of foo is 4 (1 block, blocksize 4096)
 ext logical physical expected length flags
   0   004096 not_aligned,inline,eof
foo: 1 extent found
% cp foo bar
% hexdump bar
0000000 6f66 0a6f
0000004

Without sync:

% rm -f foo bar
% echo foo > foo
% filefrag -v foo
Filesystem type is: 9123683e
File size of foo is 4 (1 block, blocksize 4096)
 ext logical physical expected length flags
foo: 1 extent found
% cp foo bar
% hexdump bar
0000000 0000 0000
0000004
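
Until this is fixed, a script that writes a file and copies it right away
can flush the file first and verify the copy afterwards. A small Python
sketch of that idea follows; the function names are made up, and it assumes
that fsync on the one file is enough to get its extent allocated, where the
test above used a global sync. As far as I know shutil.copyfile does not use
FIEMAP, so this does not reproduce the bug; it only shows the flush-then-verify
pattern.

```python
import os
import shutil


def write_and_flush(path, data):
    """Write data and force it to disk before anyone copies the file."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # push Python's buffer to the kernel
        os.fsync(f.fileno())   # ask the kernel to write the file out


def copy_and_verify(src, dst):
    """Copy src to dst and report whether the contents match byte for byte."""
    shutil.copyfile(src, dst)
    with open(src, "rb") as a, open(dst, "rb") as b:
        return a.read() == b.read()
```

For large files the final comparison should of course read in chunks or
compare hashes instead of loading both files into memory.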

Regards,
Marti


Re: btrfs: compression breaks cp and cross-FS mv, FS_IOC_FIEMAP bug?

2011-02-14 Thread Chris Mason
Excerpts from Marti Raudsepp's message of 2011-02-14 12:58:17 -0500:
> On Mon, Feb 14, 2011 at 17:01, Chris Mason chris.ma...@oracle.com wrote:
> > Or, it could just be delalloc ;)
>
> I suspect delalloc. After creating the file, filefrag reports 1
> extent found, but for some reason it doesn't actually print out
> details of the extent.
>
> After a sync call, the extent appears and cp starts working as expected:

Great, that's a ton easier than fixing cp.

-chris


Re: [PATCH] fix uncheck memory allocations

2011-02-14 Thread Tsutomu Itoh
Sano-san,

(2011/02/14 22:57), Yoshinori Sano wrote:
 On 14 February 2011 at 8:57, Tsutomu Itoh t-i...@jp.fujitsu.com wrote:
 (2011/02/12 20:17), Yoshinori Sano wrote:
 To make Btrfs code more robust, several return value checks where memory
 allocation can fail are introduced.  I use BUG_ON where I don't know how
 to handle the error properly, which increases the number of using the
 notorious BUG_ON, though.

 Signed-off-by: Yoshinori Sano yoshinori.s...@gmail.com
 ---
  fs/btrfs/compression.c |6 ++
  fs/btrfs/extent-tree.c |2 ++
  fs/btrfs/file.c|8 ++--
  fs/btrfs/inode.c   |5 +
  4 files changed, 19 insertions(+), 2 deletions(-)

 diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
 index 4d2110e..f596554 100644
 --- a/fs/btrfs/compression.c
 +++ b/fs/btrfs/compression.c
 @@ -340,6 +340,8 @@ int btrfs_submit_compressed_write(struct inode *inode, 
 u64 start,

   WARN_ON(start & ((u64)PAGE_CACHE_SIZE - 1));
   cb = kmalloc(compressed_bio_size(root, compressed_len), GFP_NOFS);
 + if (!cb)
 + return -ENOMEM;
   atomic_set(&cb->pending_bios, 0);
   cb->errors = 0;
   cb->inode = inode;
 @@ -354,6 +356,10 @@ int btrfs_submit_compressed_write(struct inode *inode, u64 start,
   bdev = BTRFS_I(inode)->root->fs_info->fs_devices->latest_bdev;

   bio = compressed_bio_alloc(bdev, first_byte, GFP_NOFS);
 + if (!bio) {
 + kfree(cb);
 + return -ENOMEM;
 + }
   bio->bi_private = cb;
   bio->bi_end_io = end_compressed_bio_write;
   atomic_inc(&cb->pending_bios);
 diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
 index 565e22d..aed16f4 100644
 --- a/fs/btrfs/extent-tree.c
 +++ b/fs/btrfs/extent-tree.c
 @@ -6931,6 +6931,8 @@ static noinline int get_new_locations(struct inode 
 *reloc_inode,
   struct disk_extent *old = exts;
   max *= 2;
   exts = kzalloc(sizeof(*exts) * max, GFP_NOFS);
 + if (!exts)
 + goto out;

'ret = -ENOMEM' is necessary before 'goto out'.

   memcpy(exts, old, sizeof(*exts) * nr);
   if (old != *extents)
   kfree(old);
 diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
 index b0ff34b..4895ad2 100644
 --- a/fs/btrfs/file.c
 +++ b/fs/btrfs/file.c
 @@ -181,10 +181,14 @@ int btrfs_drop_extent_cache(struct inode *inode, u64 
 start, u64 end,
   testend = 0;
   }
   while (1) {
 - if (!split)
 + if (!split) {
   split = alloc_extent_map(GFP_NOFS);
 - if (!split2)
 + BUG_ON(!split || IS_ERR(split));

 alloc_extent_map() returns only a valid address or NULL.
 Therefore, I think the IS_ERR() check is unnecessary.

 Regards,
 Itoh
 
 Exactly.  IS_ERR is not required.
 I should read alloc_extent_map()'s implementation more carefully.
 Thank you.

Could you please merge my patch
(http://marc.info/?l=linux-btrfs&m=129764438122741&w=2)
with your patch, and post it again?

Thanks,
Itoh



 + }
 + if (!split2) {
   split2 = alloc_extent_map(GFP_NOFS);
 + BUG_ON(!split2 || IS_ERR(split2));
 + }

   write_lock(&em_tree->lock);
   em = lookup_extent_mapping(em_tree, start, len);
 diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
 index c9bc0af..40bbe00 100644
 --- a/fs/btrfs/inode.c
 +++ b/fs/btrfs/inode.c
 @@ -287,6 +287,7 @@ static noinline int add_async_extent(struct async_cow 
 *cow,
   struct async_extent *async_extent;

   async_extent = kmalloc(sizeof(*async_extent), GFP_NOFS);
 + BUG_ON(!async_extent);
   async_extent->start = start;
   async_extent->ram_size = ram_size;
   async_extent->compressed_size = compressed_size;
 @@ -384,6 +385,7 @@ again:
(BTRFS_I(inode)->force_compress))) {
   WARN_ON(pages);
   pages = kzalloc(sizeof(struct page *) * nr_pages, GFP_NOFS);
 + BUG_ON(!pages);

   if (BTRFS_I(inode)->force_compress)
   compress_type = BTRFS_I(inode)->force_compress;
 @@ -644,6 +646,7 @@ retry:
   async_extent->ram_size - 1, 0);

   em = alloc_extent_map(GFP_NOFS);
 + BUG_ON(!em || IS_ERR(em));
   em->start = async_extent->start;
   em->len = async_extent->ram_size;
   em->orig_start = em->start;
 @@ -820,6 +823,7 @@ static noinline int cow_file_range(struct inode *inode,
   BUG_ON(ret);

   em = alloc_extent_map(GFP_NOFS);
 + BUG_ON(!em || IS_ERR(em));
   em->start = start;
   em->orig_start = em->start;
   ram_size = ins.offset;
 @@ -1169,6 +1173,7 @@ out_check:
   struct extent_map_tree *em_tree;
   em_tree = 

Re: [PATCH] fix uncheck memory allocations

2011-02-14 Thread Yoshinori Sano
On 15 February 2011 at 9:14, Tsutomu Itoh t-i...@jp.fujitsu.com wrote:
 Sano-san,

 (2011/02/14 22:57), Yoshinori Sano wrote:
 On 14 February 2011 at 8:57, Tsutomu Itoh t-i...@jp.fujitsu.com wrote:
 (2011/02/12 20:17), Yoshinori Sano wrote:
 To make Btrfs code more robust, several return value checks where memory
 allocation can fail are introduced.  I use BUG_ON where I don't know how
 to handle the error properly, which increases the number of using the
 notorious BUG_ON, though.

 Signed-off-by: Yoshinori Sano yoshinori.s...@gmail.com
 ---
  fs/btrfs/compression.c |6 ++
  fs/btrfs/extent-tree.c |2 ++
  fs/btrfs/file.c|8 ++--
  fs/btrfs/inode.c   |5 +
  4 files changed, 19 insertions(+), 2 deletions(-)

 diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
 index 4d2110e..f596554 100644
 --- a/fs/btrfs/compression.c
 +++ b/fs/btrfs/compression.c
 @@ -340,6 +340,8 @@ int btrfs_submit_compressed_write(struct inode *inode, 
 u64 start,

   WARN_ON(start & ((u64)PAGE_CACHE_SIZE - 1));
   cb = kmalloc(compressed_bio_size(root, compressed_len), GFP_NOFS);
 + if (!cb)
 + return -ENOMEM;
   atomic_set(&cb->pending_bios, 0);
   cb->errors = 0;
   cb->inode = inode;
 @@ -354,6 +356,10 @@ int btrfs_submit_compressed_write(struct inode *inode, u64 start,
   bdev = BTRFS_I(inode)->root->fs_info->fs_devices->latest_bdev;

   bio = compressed_bio_alloc(bdev, first_byte, GFP_NOFS);
 + if (!bio) {
 + kfree(cb);
 + return -ENOMEM;
 + }
   bio->bi_private = cb;
   bio->bi_end_io = end_compressed_bio_write;
   atomic_inc(&cb->pending_bios);
 diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
 index 565e22d..aed16f4 100644
 --- a/fs/btrfs/extent-tree.c
 +++ b/fs/btrfs/extent-tree.c
 @@ -6931,6 +6931,8 @@ static noinline int get_new_locations(struct inode 
 *reloc_inode,
   struct disk_extent *old = exts;
   max *= 2;
   exts = kzalloc(sizeof(*exts) * max, GFP_NOFS);
 + if (!exts)
 + goto out;

 'ret = -ENOMEM' is necessary before 'goto out'.

I'll make sure to fix this too.



   memcpy(exts, old, sizeof(*exts) * nr);
   if (old != *extents)
   kfree(old);
 diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
 index b0ff34b..4895ad2 100644
 --- a/fs/btrfs/file.c
 +++ b/fs/btrfs/file.c
 @@ -181,10 +181,14 @@ int btrfs_drop_extent_cache(struct inode *inode, u64 
 start, u64 end,
   testend = 0;
   }
   while (1) {
 - if (!split)
 + if (!split) {
   split = alloc_extent_map(GFP_NOFS);
 - if (!split2)
 + BUG_ON(!split || IS_ERR(split));

 alloc_extent_map() returns only a valid address or NULL.
 Therefore, I think the IS_ERR() check is unnecessary.

 Regards,
 Itoh

 Exactly.  IS_ERR is not required.
 I should read alloc_extent_map()'s implementation more carefully.
 Thank you.

 Could you please merge my patch
 (http://marc.info/?l=linux-btrfs&m=129764438122741&w=2)
 with your patch, and post it again?

Yes, this is a good idea :)
I'll merge your patch and post it again later.
Thank you.



 Thanks,
 Itoh



 + }
 + if (!split2) {
   split2 = alloc_extent_map(GFP_NOFS);
 + BUG_ON(!split2 || IS_ERR(split2));
 + }

   write_lock(&em_tree->lock);
   em = lookup_extent_mapping(em_tree, start, len);
 diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
 index c9bc0af..40bbe00 100644
 --- a/fs/btrfs/inode.c
 +++ b/fs/btrfs/inode.c
 @@ -287,6 +287,7 @@ static noinline int add_async_extent(struct async_cow 
 *cow,
   struct async_extent *async_extent;

   async_extent = kmalloc(sizeof(*async_extent), GFP_NOFS);
 + BUG_ON(!async_extent);
   async_extent->start = start;
   async_extent->ram_size = ram_size;
   async_extent->compressed_size = compressed_size;
 @@ -384,6 +385,7 @@ again:
(BTRFS_I(inode)->force_compress))) {
   WARN_ON(pages);
   pages = kzalloc(sizeof(struct page *) * nr_pages, GFP_NOFS);
 + BUG_ON(!pages);

   if (BTRFS_I(inode)->force_compress)
   compress_type = BTRFS_I(inode)->force_compress;
 @@ -644,6 +646,7 @@ retry:
   async_extent->ram_size - 1, 0);

   em = alloc_extent_map(GFP_NOFS);
 + BUG_ON(!em || IS_ERR(em));
   em->start = async_extent->start;
   em->len = async_extent->ram_size;
   em->orig_start = em->start;
 @@ -820,6 +823,7 @@ static noinline int cow_file_range(struct inode *inode,
   BUG_ON(ret);

   em = alloc_extent_map(GFP_NOFS);
 + BUG_ON(!em || IS_ERR(em));
   em->start = start;
   em->orig_start = em->start;
 

[GIT PULL] Btrfs updates

2011-02-14 Thread Chris Mason
Hi everyone,

The master branch of the btrfs unstable tree has some important btrfs
fixes:

git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable.git master

I was seeing very rare metadata corruptions during long stress runs, and
eventually tracked it down to two different races in the btrfs
releasepage code.  One was making btrfs think pages were up to date when
they really weren't, and the other led to corrupted csum fields in
metadata.

Chris Mason (2) commits (+50/-6):
Btrfs: don't release pages when we can't clear the uptodate bits (+9/-1)
Btrfs: fix page-private races (+41/-5)

Zheng Yan (1) commits (+1/-0):
Btrfs: Fix balance panic

Dan Rosenberg (1) commits (+8/-2):
btrfs: prevent heap corruption in btrfs_ioctl_space_info()

Tsutomu Itoh (1) commits (+7/-3):
Btrfs: check return value of alloc_extent_map()

Ilya Dryomov (1) commits (+2/-0):
Btrfs - Fix memory leak in btrfs_init_new_device()

Total: (6) commits (+68/-11)

 fs/btrfs/disk-io.c |8 ++--
 fs/btrfs/extent-tree.c |2 +-
 fs/btrfs/extent_io.c   |   48 
 fs/btrfs/extent_map.c  |4 ++--
 fs/btrfs/file.c|1 +
 fs/btrfs/inode.c   |3 +++
 fs/btrfs/ioctl.c   |   10 --
 fs/btrfs/relocation.c  |1 +
 fs/btrfs/volumes.c |2 ++
 9 files changed, 68 insertions(+), 11 deletions(-)