Re: corruption of active mmapped files in btrfs snapshots

2013-03-25 Thread Chris Mason
Quoting Chris Mason (2013-03-22 16:31:42) Going through the code here, when I change the test to truncate once in the very beginning, I still get errors. So, it isn't an interaction between mmap and truncate. It must be a problem between lzo and mmap. With compression off, we use

Re: corruption of active mmapped files in btrfs snapshots

2013-03-22 Thread Chris Mason
Quoting Alexandre Oliva (2013-03-22 01:27:42) On Mar 21, 2013, Chris Mason chris.ma...@fusionio.com wrote: Quoting Chris Mason (2013-03-21 14:06:14) With mmap the kernel can pick any given time to start writing out dirty pages. The idea is that if the application makes more changes

Re: corruption of active mmapped files in btrfs snapshots

2013-03-22 Thread Chris Mason
Quoting Alexandre Oliva (2013-03-22 10:17:30) On Mar 22, 2013, Chris Mason clma...@fusionio.com wrote: Are you using compression in btrfs or just in leveldb? btrfs lzo compression. Perfect, I'll focus on that part of things. I'd like to take snapshots out of the picture for a minute

Re: corruption of active mmapped files in btrfs snapshots

2013-03-22 Thread Chris Mason
is initiated so there should be no in-progress writes to leveldb other than leveldb compaction (though that might be something to investigate). -Sam On Fri, Mar 22, 2013 at 7:26 AM, Chris Mason clma...@fusionio.com wrote: Quoting Alexandre Oliva (2013-03-22 10:17:30) On Mar 22, 2013, Chris

Re: corruption of active mmapped files in btrfs snapshots

2013-03-22 Thread Chris Mason
[ mmap corruptions with leveldb and btrfs compression ] I ran this a number of times with compression off and wasn't able to trigger problems. With compress=lzo, I see errors on every run. Compile: gcc -Wall -o mmap-trunc mmap-trunc.c Run: ./mmap-trunc file_name The basic idea is to create a
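The snippet above is cut off before it describes the test, and the source of mmap-trunc.c is not shown here. As a rough illustration only, the following Python sketch shows the general shape of a write-through-mmap-then-verify check. It is not Chris Mason's program: the function name write_and_verify, the page count, and the fill pattern are all assumptions made for this example. It sizes the file once with truncate, writes a compressible pattern through a shared mapping, flushes, and then compares what a plain read() returns against what was written.

```python
import mmap
import os
import zlib

PAGE = mmap.PAGESIZE


def write_and_verify(path, pages=16):
    """Hypothetical check: write through mmap, re-read with read(), compare CRCs."""
    size = pages * PAGE
    # Highly compressible pattern: every page is filled with a single byte value.
    expected = b"".join(bytes([i % 251]) * PAGE for i in range(pages))

    with open(path, "w+b") as f:
        f.truncate(size)  # size the file once, before mapping it
        with mmap.mmap(f.fileno(), size) as m:
            m[:] = expected  # dirty every page through the shared mapping
            m.flush()        # ask the kernel to write the dirty pages back
        os.fsync(f.fileno())

    # Re-read through the normal file interface rather than the mapping.
    with open(path, "rb") as f:
        actual = f.read()

    return zlib.crc32(actual) == zlib.crc32(expected)
```

On a filesystem that is behaving correctly this should return True. A single pass like this is unlikely to reproduce the problem discussed in the thread, which involved writes racing with writeback on a compress=lzo mount; it only shows what "application level crc" verification of mmap'd data looks like.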

Re: corruption of active mmapped files in btrfs snapshots

2013-03-22 Thread Chris Mason
Quoting Chris Mason (2013-03-22 14:07:05) [ mmap corruptions with leveldb and btrfs compression ] I ran this a number of times with compression off and wasn't able to trigger problems. With compress=lzo, I see errors on every run. Compile: gcc -Wall -o mmap-trunc mmap-trunc.c Run: ./mmap

Re: corruption of active mmapped files in btrfs snapshots

2013-03-21 Thread Chris Mason
Quoting Chris Mason (2013-03-21 14:06:14) Quoting Alexandre Oliva (2013-03-21 03:14:02) On Mar 19, 2013, Alexandre Oliva ol...@gnu.org wrote: On Mar 19, 2013, Alexandre Oliva ol...@gnu.org wrote: that is being processed inside the snapshot. This doesn't explain why the master

Re: corruption of active mmapped files in btrfs snapshots

2013-03-19 Thread Chris Mason
Quoting Alexandre Oliva (2013-03-19 01:20:10) On Mar 18, 2013, Chris Mason chris.ma...@fusionio.com wrote: A few questions. Does leveldb use O_DIRECT and mmap together? No, it doesn't use O_DIRECT at all. Its I/O interface is very simplified: it just opens each new file (database chunks

Re: corruption of active mmapped files in btrfs snapshots

2013-03-18 Thread Chris Mason
A few questions. Does leveldb use O_DIRECT and mmap together? (the source of a write being pages that are mmap'd from somewhere else) That's the most likely place for this kind of problem. Also, you mention crc errors. Are those reported by btrfs or are they application level crcs. Thanks for

Re: ceph-on-btrfs inline-cow regression fix for 3.4.3

2012-06-13 Thread Chris Mason
On Tue, Jun 12, 2012 at 09:46:26PM -0600, Alexandre Oliva wrote: Hi, Greg, There's a btrfs regression in 3.4 that's causing a lot of grief to ceph-on-btrfs users like myself. This small and nice patch cures it. It's in Linus' master already. I've been running it on top of 3.4.2, and it

Re: Btrfs slowdown with ceph (how to reproduce)

2012-01-24 Thread Chris Mason
On Tue, Jan 24, 2012 at 08:15:58PM +0100, Martin Mailand wrote: Hi, I tried the branch on one of my ceph OSDs, and there is a big difference in the performance. The average request size stayed high, but after around an hour the kernel crashed. IOstat http://pastebin.com/xjuriJ6J Kernel

Re: Btrfs slowdown with ceph (how to reproduce)

2012-01-23 Thread Chris Mason
On Mon, Jan 23, 2012 at 01:19:29PM -0500, Josef Bacik wrote: On Fri, Jan 20, 2012 at 01:13:37PM +0100, Christian Brunner wrote: As you might know, I have been seeing btrfs slowdowns in our ceph cluster for quite some time. Even with the latest btrfs code for 3.3 I'm still seeing these

Re: ceph on btrfs [was Re: ceph on non-btrfs file systems]

2011-10-26 Thread Chris Mason
On Tue, Oct 25, 2011 at 04:22:48PM -0400, Josef Bacik wrote: On Tue, Oct 25, 2011 at 04:15:45PM -0400, Chris Mason wrote: On Tue, Oct 25, 2011 at 11:05:12AM -0400, Josef Bacik wrote: On Tue, Oct 25, 2011 at 04:25:02PM +0200, Christian Brunner wrote: Attached is a perf-report. I have

Re: ceph on btrfs [was Re: ceph on non-btrfs file systems]

2011-10-25 Thread Chris Mason
On Tue, Oct 25, 2011 at 11:05:12AM -0400, Josef Bacik wrote: On Tue, Oct 25, 2011 at 04:25:02PM +0200, Christian Brunner wrote: Attached is a perf-report. I have included the whole report, so that you can see the difference between the good and the bad btrfs-endio-wri. We also

Re: ceph on btrfs [was Re: ceph on non-btrfs file systems]

2011-10-24 Thread Chris Mason
On Mon, Oct 24, 2011 at 03:51:47PM -0400, Josef Bacik wrote: On Mon, Oct 24, 2011 at 10:06:49AM -0700, Sage Weil wrote: [adding linux-btrfs to cc] Josef, Chris, any ideas on the below issues? On Mon, 24 Oct 2011, Christian Brunner wrote: Thanks for explaining this. I don't have any

Re: Btrfs slowdown

2011-07-25 Thread Chris Mason
Excerpts from Christian Brunner's message of 2011-07-25 03:54:47 -0400: Hi, we are running a ceph cluster with btrfs as its base filesystem (kernel 3.0). At the beginning everything worked very well, but after a few days (2-3) things are getting very slow. When I look at the object store

Re: 3.0-rcX BUG at fs/btrfs/ioctl.c:432 - bisected

2011-06-10 Thread Chris Mason
Excerpts from Jim Schutt's message of 2011-06-10 13:06:22 -0400: [ two different btrfs crashes ] I think your two crashes in btrfs were from the uninit variables and those should be fixed in rc2. When I did my bisection, my criterion for success/failure was "did mkcephfs succeed?". When I apply