On Sat, Mar 23, 2013 at 06:48:38AM -0300, Alexandre Oliva wrote:
On Mar 22, 2013, David Sterba dste...@suse.cz wrote:
I've reproduced this without compression, with autodefrag on.
I don't have autodefrag on, unless it's enabled by default on 3.8.3 or
on the for-linus tree.
It's not on by
Quoting Chris Mason (2013-03-22 16:31:42)
Going through the code here, when I change the test to truncate once in
the very beginning, I still get errors. So, it isn't an interaction
between mmap and truncate. It must be a problem between lzo and mmap.
With compression off, we use
On Mar 22, 2013, Chris Mason clma...@fusionio.com wrote:
Quoting Samuel Just (2013-03-22 13:06:41)
Incomplete writes for leveldb should just result in lost updates, not
corruption.
In this case, I think Alexandre is scanning for zeros in the file.
Yup, the symptom is zeros at the end of a
Quoting Alexandre Oliva (2013-03-22 01:27:42)
On Mar 21, 2013, Chris Mason chris.ma...@fusionio.com wrote:
Quoting Chris Mason (2013-03-21 14:06:14)
With mmap the kernel can pick any given time to start writing out dirty
pages. The idea is that if the application makes more changes the
On Mar 22, 2013, Chris Mason clma...@fusionio.com wrote:
Are you using compression in btrfs or just in leveldb?
btrfs lzo compression.
I'd like to take snapshots out of the picture for a minute.
That's understandable, I guess, but I don't know that anyone has ever
got the problem without
Quoting Alexandre Oliva (2013-03-22 10:17:30)
On Mar 22, 2013, Chris Mason clma...@fusionio.com wrote:
Are you using compression in btrfs or just in leveldb?
btrfs lzo compression.
Perfect, I'll focus on that part of things.
I'd like to take snapshots out of the picture for a minute.
Incomplete writes for leveldb should just result in lost updates, not
corruption. Also, we do stop writes before the snapshot is initiated
so there should be no in-progress writes to leveldb other than leveldb
compaction (though that might be something to investigate).
-Sam
On Fri, Mar 22, 2013
On Fri, Mar 22, 2013 at 10:26:59AM -0400, Chris Mason wrote:
Quoting Alexandre Oliva (2013-03-22 10:17:30)
On Mar 22, 2013, Chris Mason clma...@fusionio.com wrote:
Are you using compression in btrfs or just in leveldb?
btrfs lzo compression.
Perfect, I'll focus on that part of
In this case, I think Alexandre is scanning for zeros in the file. The
incomplete writes will definitely show that.
-chris
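
The symptom described here (zeros at the end of a file) can be checked with a small scanner. What follows is only a minimal sketch, not the tool Alexandre actually used: the helper name `count_trailing_zeros` is hypothetical, and it assumes the file is small enough to read byte by byte.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from the thread): returns the number of
 * consecutive zero bytes at the end of the file at `path`, or -1 if the
 * file cannot be opened or read. */
long count_trailing_zeros(const char *path)
{
    assert(path != NULL);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    long run = 0;   /* length of the current run of zero bytes */
    int c;
    while ((c = fgetc(f)) != EOF) {
        if (c == 0)
            run++;
        else
            run = 0;    /* any non-zero byte resets the run */
    }

    int failed = ferror(f);
    fclose(f);
    return failed ? -1 : run;
}
```

A non-zero result on a file that should end in real data is only a hint of an incomplete write, not proof of corruption, since some files legitimately end in zero bytes.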
Quoting Samuel Just (2013-03-22 13:06:41)
Incomplete writes for leveldb should just result in lost updates, not
corruption. Also, we do stop writes before the snapshot
On Fri, 22 Mar 2013, Chris Mason wrote:
Quoting Alexandre Oliva (2013-03-22 10:17:30)
On Mar 22, 2013, Chris Mason clma...@fusionio.com wrote:
Are you using compression in btrfs or just in leveldb?
btrfs lzo compression.
Perfect, I'll focus on that part of things.
I'd like
[ mmap corruptions with leveldb and btrfs compression ]
I ran this a number of times with compression off and wasn't able to
trigger problems. With compress=lzo, I see errors on every run.
Compile: gcc -Wall -o mmap-trunc mmap-trunc.c
Run: ./mmap-trunc file_name
The basic idea is to create a
Quoting Chris Mason (2013-03-22 14:07:05)
[ mmap corruptions with leveldb and btrfs compression ]
I ran this a number of times with compression off and wasn't able to
trigger problems. With compress=lzo, I see errors on every run.
Compile: gcc -Wall -o mmap-trunc mmap-trunc.c
Run:
On Mar 19, 2013, Alexandre Oliva ol...@gnu.org wrote:
On Mar 19, 2013, Alexandre Oliva ol...@gnu.org wrote:
that is being processed inside the snapshot.
This doesn't explain why the master database occasionally gets similarly
corrupted, does it?
Actually, scratch this bit for now. I don't
Quoting Chris Mason (2013-03-21 14:06:14)
Quoting Alexandre Oliva (2013-03-21 03:14:02)
On Mar 19, 2013, Alexandre Oliva ol...@gnu.org wrote:
On Mar 19, 2013, Alexandre Oliva ol...@gnu.org wrote:
that is being processed inside the snapshot.
This doesn't explain why the master
On Mar 21, 2013, Chris Mason chris.ma...@fusionio.com wrote:
Quoting Chris Mason (2013-03-21 14:06:14)
With mmap the kernel can pick any given time to start writing out dirty
pages. The idea is that if the application makes more changes the page
becomes dirty again and the kernel writes it
Quoting Alexandre Oliva (2013-03-19 01:20:10)
On Mar 18, 2013, Chris Mason chris.ma...@fusionio.com wrote:
A few questions. Does leveldb use O_DIRECT and mmap together?
No, it doesn't use O_DIRECT at all. Its I/O interface is very
simplified: it just opens each new file (database chunks
On Tue, 19 Mar 2013, Chris Mason wrote:
Quoting Alexandre Oliva (2013-03-19 01:20:10)
On Mar 18, 2013, Chris Mason chris.ma...@fusionio.com wrote:
A few questions. Does leveldb use O_DIRECT and mmap together?
No, it doesn't use O_DIRECT at all. Its I/O interface is very
On Mar 19, 2013, Chris Mason clma...@fusionio.com wrote:
My guess is the truncate is creating an orphan item
Would it, even though the truncate is used to grow rather than to shrink
the file?
that is being processed inside the snapshot.
This doesn't explain why the master database
On Mar 19, 2013, Sage Weil s...@inktank.com wrote:
There is a set of unit tests in the leveldb source tree that ought to do
the trick:
git clone https://code.google.com/p/leveldb/
But these don't create btrfs snapshots.
--
Alexandre Oliva, freedom fighter
On Mar 19, 2013, Alexandre Oliva ol...@gnu.org wrote:
that is being processed inside the snapshot.
This doesn't explain why the master database occasionally gets similarly
corrupted, does it?
Actually, scratch this bit for now. I don't really have proof that the
master database actually
For quite a while, I've experienced oddities with snapshotted Firefox
_CACHE_00?_ files, whose checksums (and contents) would change after the
btrfs snapshot was taken, and would even change depending on how the
file was brought to memory (e.g., rsyncing it to backup storage vs
checking its md5sum
While I wrote the previous email, a smoking gun formed in one of my
servers: a snapshot that had passed a database consistency check turned
out to be corrupted when I tried to roll back to it! Since the snapshot
was not modified in any way between the initial scripted check and the
later manual
A few questions. Does leveldb use O_DIRECT and mmap together? (the
source of a write being pages that are mmap'd from somewhere else)
That's the most likely place for this kind of problem. Also, you
mention CRC errors. Are those reported by btrfs or are they
application-level CRCs?
Thanks for
On Mar 18, 2013, Chris Mason chris.ma...@fusionio.com wrote:
A few questions. Does leveldb use O_DIRECT and mmap together?
No, it doesn't use O_DIRECT at all. Its I/O interface is very
simplified: it just opens each new file (database chunks limited to 2MB)
with O_CREAT|O_RDWR|O_TRUNC, and
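
The snippet above is cut off after the open flags, so the remaining steps are not confirmed by it. Other messages in the thread mention a truncate that grows the file, and mmap. On that assumption only, the access pattern under discussion might look roughly like the sketch below. The function name `write_via_mmap` is hypothetical, and this is not leveldb's actual code.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical illustration: create a file, grow it with ftruncate(),
 * fill it through a shared mapping, and flush it. Returns 0 on success
 * and -1 on failure. */
int write_via_mmap(const char *path, const void *data, size_t len)
{
    assert(path != NULL && data != NULL && len > 0);

    /* Same flags as quoted in the thread. */
    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* Assumption: the file is grown (not shrunk) before it is mapped. */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    void *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* Dirty the mapped pages; the kernel may write them out at any time. */
    memcpy(map, data, len);

    /* Ask for the dirty pages to be written before unmapping. */
    int rc = msync(map, len, MS_SYNC);
    if (munmap(map, len) != 0)
        rc = -1;
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

The explicit `msync` is there so the sketch is deterministic. Without it, the time at which dirty pages reach the disk is up to the kernel, which is the behaviour Chris describes elsewhere in the thread.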