Re: BUG at fs/buffer.c:2925! when mounted USB-disk is disconnected

2008-12-18 Thread Chris Mason
On Thu, 2008-12-18 at 04:15 +0100, Kay Sievers wrote:
 I see the following when disconnecting a USB stick containing a
 mounted two-partition btrfs volume, and then try to umount it later.
 
 I reproduced it 3 times, always after a fresh reboot. The box
 is unstable after that: modules cannot be unloaded, and other
 filesystems cannot be unmounted.
 

Ok, this is just one of many problems in the forced USB disconnect area,
but I'll take a look after the holidays.

-chris


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Compressed Filesystem

2008-12-18 Thread Chris Mason
On Tue, 2008-12-16 at 20:45 +0100, Roland wrote:
 I agree that adding more options will add more complexity, but it seems
 the same amount of work will have to be done in kernel space either way.
 
 Regarding lzo compression itself - it's already there (since July 2007).
 The in-kernel lzo is equivalent to minilzo.
 (http://www.oberhumer.com/opensource/lzo/)

The compression code initially used the kernel lzo modules.  Even though
the zlib API is clunky and strange, it is actually a better fit for the
multi-page compression that btrfs needs to do.  So adding LZO support
would require some work to compress over multiple pages at a
time.

-chris




Re: Notes on support for multiple devices for a single filesystem

2008-12-18 Thread Bryan Henderson
 Features like the very nice and useful directory-based snapshots would
 also not be possible with simple block-based multi-devices, right?

 Snapshotting via block device has always been an incredibly dumb hack,
 existing primarily because filesystem-based snapshots did not exist for
 the filesystem in question.

I can see that if the filesystem driver in question could already do 
snapshots, nobody would have added a snapshot function to the block device 
driver under it, but this doesn't explain why someone at some time created 
block device snapshots instead of creating them for the filesystem in 
question.

 Snapshots are better at the filesystem level because the filesystem is
 the only entity that knows when the filesystem is quiescent and
 snapshot-able.

You can use the same logic to say that snapshots are better at the 
application level because only the application knows when its database is 
quiescent and snapshot-able.  In fact, carrying it to the extreme, you 
could say snapshots are best done manually by the human end user, with 
no part of the computer knowing anything about it.

It probably minimizes engineering effort to have snapshot capability at 
every level, with the implementation at each level exploiting the function 
at the level below.  E.g. when someone tells a filesystem driver to 
snapshot a filesystem that resides on two block devices, the filesystem 
driver quiesces the filesystem, then snapshots each device (implemented in 
the block device driver), then resumes.  The new snapshot filesystem lives 
on the two new snapshot block devices.

Of course, if you want to do a form of snapshot that makes sense only in 
the context of a filesystem, like the directory snapshot mentioned above, 
then you can't get as much help from snapshot functions in the storage 
devices.

--
Bryan Henderson
IBM Almaden Research Center, San Jose, CA
Storage Systems



Re: [PATCH] fix wrong value returned from btrfs_listxattr when buffer is too small

2008-12-18 Thread Sage Weil
BTW it looks like the commit log made it into git, but not the patch 
itself... :)

sage


On Fri, 12 Dec 2008, Yehuda Sadeh Weinraub wrote:

 Fix a bug: btrfs_listxattr doesn't return an error when the buffer size
 is too small (ret was overridden).
 
 Signed-off-by: Yehuda Sadeh yeh...@hq.newdream.net
 ---
  fs/btrfs/xattr.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)
 
 diff --git a/fs/btrfs/xattr.c b/fs/btrfs/xattr.c
 index adb4b32..4146f07 100644
 --- a/fs/btrfs/xattr.c
 +++ b/fs/btrfs/xattr.c
 @@ -226,7 +226,7 @@ ssize_t btrfs_listxattr(struct dentry *dentry, char *buffer, size_t size)
 
   if (!buffer || (name_len + 1) > size_left) {
   ret = -ERANGE;
 - break;
 + goto err;
   }
 
   name_ptr = (unsigned long)(di + 1);
 -- 
 1.5.6.5
 
 


Re: inode data not getting included in commits?

2008-12-18 Thread Yan Zheng
2008/12/19 Sage Weil s...@newdream.net:
 Hi Chris-

 I noticed some data and metadata getting out of sync on disk, despite
 wrapping my writes with btrfs transactions.  After digging into it a bit,
 it appears to be a larger problem with inode size/data not getting written
 during a regular commit.

 I have a test program that appends a few bytes at a time to a few different
 files, in a loop.  I let it run until I see a btrfs transaction commit
 (via a printk at the bottom of btrfs_commit_transaction).  Then 'reboot -f
 -n'.  After remounting, all files exist but are 0 bytes, and debug-tree
 shows a bunch of empty files.  I would expect to see either the sizes when
 the commit happened (a few hundred KB in my case), or no files at all;
 there was actually no point in time when any of the files were 0 bytes.

 Similarly, if I do the same but wait for a few commits to happen, after
 remount the file sizes reflect the size from around the next-to-last
 commit, not the last commit.

 This is probably more information than you need, but my original test was
 a bit more complicated, with weirder results.  Append to each file, then
 write its size to an xattr on another file.  Wrap both operations in a
 transaction.  Start it up, run 'sync', then reboot -f -n.  When I remount,
 the size and xattr are out of sync by exactly one iteration: the xattr
 reflects the size that resulted from _two_ writes back, not the
 immediately preceding write.  If anything I would expect to see a larger
 actual size than xattr value (for example if the start/end transaction
 ioctls weren't working)...

 sage



 #include <stdio.h>
 #include <stdlib.h>
 #include <unistd.h>
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <fcntl.h>

 int main(int argc, char **argv)
 {
	while (1) {
		int r, fd, pos, i = rand() % 10;
		char a[20];

		sprintf(a, "%d.log", i);
		fd = open(a, O_CREAT|O_APPEND|O_WRONLY, 0600);
		r = write(fd, "foobarfoo\n", 10);
		pos = lseek(fd, 0, SEEK_CUR);
		printf("write %s = %d, size = %d\n", a, r, pos);
		close(fd);
	}
 }


This is the desired behaviour of data=ordered.  A btrfs transaction commit
doesn't flush data, and metadata won't get updated until the data IO
completes.

http://article.gmane.org/gmane.comp.file-systems.btrfs/869/match=new+data+ordered+code

Regards
Yan Zheng


Re: weird bash autocomplete issue

2008-12-18 Thread Kay Sievers
On Fri, Dec 19, 2008 at 01:59,  devz...@web.de wrote:
 I see the same issue on 32-bit x86, with an additional __llseek()
 between the getdents64() calls, and the last entry returned by readdir
 ignored.

 Confirmed - it's readdir which assumes 32-bit offsets.

 Attached is a sample program which shows the issue on my system.
 
 If compiled with -D_FILE_OFFSET_BITS=64, the problem goes away.

 old posting from around 2001:

http://sourceware.org/ml/libc-alpha/2001-01/msg00216.html

This is why everybody will have to compile programs with
_FILE_OFFSET_BITS=64.  Did you ever notice that all GNU programs
already do this?

 As 32-bit systems can use 64-bit filesystems, I think btrfs is correct and
 bash is wrong, as it isn't LFS-aware.  I think all 32-bit code should be
 LFS-aware nowadays.

 To be exact, it's not bash but the readline library which comes with bash.
 The bash configure script correctly checks the _FILE_OFFSET_BITS value, but
 the readline configure script doesn't.  This explains why I could not
 reproduce the issue when I built bash without readline support.

 Does it make sense to file a ticket in the Novell bugzilla?

Sure, would be good to have that fixed. Cc: kasiev...@novell.com in
the bug, and I will move it directly to the right guy. :)

Thanks,
Kay


Re: [PATCH] fix wrong value returned from btrfs_listxattr when buffer is too small

2008-12-18 Thread Chris Mason
On Thu, 2008-12-18 at 16:25 -0800, Sage Weil wrote:
 BTW it looks like the commit log made it into git, but not the patch 
 itself... :)
 

Ugh.  The patch was malformed and I thought I fixed it.  I think guilt
is making it easy for me to mess this up ;)  Thanks, will fix.

-chris




Re: inode data not getting included in commits?

2008-12-18 Thread Sage Weil
On Fri, 19 Dec 2008, Yan Zheng wrote:
  I noticed some data and metadata getting out of sync on disk, despite
  wrapping my writes with btrfs transactions.  After digging into it a bit,
  it appears to be a larger problem with inode size/data getting written
  during a regular commit.
  [...]
 
 This is the desired behaviour of data=ordered.  A btrfs transaction commit
 doesn't flush data, and metadata won't get updated until the data IO
 completes.
 
 http://article.gmane.org/gmane.comp.file-systems.btrfs/869/match=new+data+ordered+code

Ah, right, so it is.

I think what I'm looking for then is a mount mode to get the old behavior, 
such that each commit flushes previously written data.  Probably a call to 
btrfs_wait_ordered_extents() in btrfs_commit_transaction(), or something 
along those lines...

sage