Hi,
Today we had a catastrophic fs corruption in one of our virtual
machines; after fsck, ~100 MB was in lost+found :-(
So I think we hit the same bug (ceph 0.45.2, sparse rbd images).
Is there any progress on this topic? Any hint on how we could help
with this would be appreciated.
Greetings
Stefan
Short version: you should set 'filestore fiemap = false' for your osds.
I was able to reproduce the crash with all the debugging I needed
yesterday via test_librbd_fsx, and the problem looks like a bug in
fiemap. Even though we call fsync before each fiemap call, we were
getting different [...]
Since Guido was seeing this problem on btrfs as well, I'm going to try
tracking down more precisely where it was introduced.
Josh
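
For reference, that workaround is a single line in the [osd] section of
ceph.conf (a minimal sketch; the osds typically need a restart to pick
it up):

[osd]
    filestore fiemap = false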
On Monday, 11 June 2012, 09:30:42, Sage Weil wrote:
If you can reproduce it with 'debug filestore = 20' too, that will be
better, as it will tell us what the FIEMAP ioctl is returning.
I ran another test run with 'debug filestore = 20'.
Also, if
you can attach/post the contents of the [...]
On Monday, 11 June 2012, 09:30:42, Sage Weil wrote:
Are you guys able to reproduce the corruption with 'debug osd = 20' and
'debug ms = [...]
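
Pulling those together, the logging requested in this thread would look
something like this in the [osd] section of ceph.conf (a sketch; the
'debug ms' level is cut off above and is assumed here to be 1, the
value commonly used for message-level tracing):

[osd]
    debug osd = 20
    debug ms = 1
    debug filestore = 20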
On Friday, 8 June 2012, 14:55:44, Guido Winkelmann wrote:
I did not change anything else in the setup. In particular, the OSDs still
use btrfs. One of the OSDs has been restarted, though. I will run another
test with a VM without rbd caching, to make sure it wasn't by random chance [...]
Hi Guido,
yeah, there is something weird going on. I just started to establish
some test VMs, freshly imported from running *.qcow2 images.
Kernel panic with INIT, seg-faults and other funny stuff.
Just added rbd_cache=true in my config, and voila, all is
fast-n-up-n-running...
All my [...]
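
As an illustration of where that option lives (both forms below are
sketches; exact syntax and placement depend on the qemu and librbd
versions in use): it can be passed in qemu's drive spec, e.g.

-drive file=rbd:pool/image:rbd_cache=true,if=virtio

or set as 'rbd cache = true' in the [client] section of ceph.conf.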
Well then,
I've been quite busy with some other stuff too, but...
Hi,
I'm using Ceph with RBD to provide network-transparent disk images for KVM-
based virtual servers. For the last two days, I've been hunting a weird, elusive
bug where data in the virtual machines would be corrupted in strange ways. It
usually manifests in files having some random data - usually [...]
On Thursday, 7 June 2012, 20:18:52, Stefan Priebe wrote:
I think the test script would help a lot so others can test too.
Okay, I've attached the program. It's barely 2 KB. You need Boost 1.45+, CMake
2.6+ and Crypto++ to compile it.
Warning: This will fill up your hard disk completely, [...]
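
A build file for those dependencies would only be a few lines; here is
a sketch (the iotester target/source names and the Boost components are
assumptions, since the program itself wasn't quoted in this thread):

cmake_minimum_required(VERSION 2.6)
project(iotester)
# Boost 1.45+ and Crypto++, as stated above
find_package(Boost 1.45 REQUIRED COMPONENTS filesystem system)
find_library(CRYPTOPP_LIB cryptopp)
include_directories(${Boost_INCLUDE_DIRS})
add_executable(iotester iotester.cpp)
target_link_libraries(iotester ${Boost_LIBRARIES} ${CRYPTOPP_LIB})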
Hi Guido,
unfortunately this sounds very familiar to me. We have been on a long road with
similarly weird errors.
Our setup is something like: start a couple of VMs (qemu-*), let them create
a 1G file each and randomly seek and write 4MB blocks filled with md5sums of
the block as payload, to be [...]
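
The actual test programs weren't posted in full, but a minimal sketch
of that pattern (one 1G file, 4MB blocks written in random order, each
block carrying a payload that can be recomputed and verified later)
could look like the following. This is an illustration only: it fills
blocks with a cheap deterministic 64-bit mix derived from the block
index rather than Crypto++ md5sums, so it needs no external libraries.

// iotest_sketch.cpp - illustrative stand-in for the iotester discussed above.
// Writes 4MB blocks into a 1G file in random order, each block filled with a
// deterministic pattern derived from its index, then re-reads and verifies
// every block.
#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <random>
#include <vector>

static const std::size_t BLOCK_SIZE = 4UL * 1024 * 1024;  // 4MB blocks
static const std::size_t NUM_BLOCKS = 256;                // 256 * 4MB = 1G

// Expected contents of a block are a pure function of its index, so the
// verify pass can recompute them without storing any metadata on disk.
static std::uint64_t pattern_for(std::size_t block) {
    std::uint64_t x = 0xcbf29ce484222325ULL ^ block;  // FNV-style mix
    x *= 0x100000001b3ULL;
    x ^= x >> 33;
    return x;
}

static void fill_block(std::vector<std::uint64_t>& buf, std::size_t block) {
    std::uint64_t p = pattern_for(block);
    for (std::size_t i = 0; i < buf.size(); ++i)
        buf[i] = p + i;
}

int main(int argc, char** argv) {
    const char* path = (argc > 1) ? argv[1] : "iotest.dat";
    std::vector<std::uint64_t> buf(BLOCK_SIZE / sizeof(std::uint64_t));
    std::vector<std::uint64_t> expect(buf.size());

    std::fstream f(path, std::ios::in | std::ios::out |
                         std::ios::binary | std::ios::trunc);
    if (!f) { std::fprintf(stderr, "cannot open %s\n", path); return 2; }

    // Write pass: touch every block exactly once, in random order, like the
    // random-seek writers described above (fixed seed for reproducibility).
    std::vector<std::size_t> order(NUM_BLOCKS);
    for (std::size_t i = 0; i < NUM_BLOCKS; ++i) order[i] = i;
    std::shuffle(order.begin(), order.end(), std::mt19937(42));
    for (std::size_t b : order) {
        fill_block(buf, b);
        f.seekp(static_cast<std::streamoff>(b) * BLOCK_SIZE);
        f.write(reinterpret_cast<const char*>(buf.data()), BLOCK_SIZE);
    }
    f.flush();

    // Verify pass: re-read each block and compare against the recomputed
    // pattern; any mismatch is the kind of corruption discussed here.
    std::size_t bad = 0;
    for (std::size_t b = 0; b < NUM_BLOCKS; ++b) {
        f.seekg(static_cast<std::streamoff>(b) * BLOCK_SIZE);
        f.read(reinterpret_cast<char*>(buf.data()), BLOCK_SIZE);
        fill_block(expect, b);
        if (buf != expect) {
            std::fprintf(stderr, "corruption in block %zu\n", b);
            ++bad;
        }
    }
    std::printf("%zu of %zu blocks corrupted\n", bad, NUM_BLOCKS);
    return bad ? 1 : 0;
}

Because the expected contents are a pure function of the block index,
the verify pass needs no bookkeeping, which keeps the write and read
phases independent.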
Hmm, can't reproduce that (phew!). Qemu 1.1 release, 0.47.2, guest/host
mainly Debian wheezy. The only main difference between my setup and
yours is the underlying fs - I'm tired of btrfs's unpredictable load
issues and moved back to xfs.
BTW, you calculate sha1 in the test suite, not sha256 as you mentioned [...]
Maybe I did something wrong with your iotester, but I had to mkdir
./iotest to get it to run. I straced and found that it died on 'no
such file'.
On Thursday 07 June 2012 15:53:18 Marcus Sorensen wrote:
Maybe I did something wrong with your iotester, but I had to mkdir
./iotest to get it to run. [...]
It's a bit quick and dirty... You are supposed to pass the directory where it
is to put [...]
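
Presumably, then, the invocation is something along these lines (the
binary name and argument form are assumptions pieced together from this
thread; the tool evidently does not create the directory itself):

mkdir ./iotest
./iotester ./iotest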
On Thu, Jun 7, 2012 at 2:36 PM, Guido Winkelmann
guido-c...@thisisnotatest.de wrote:
Again, I'll try that tomorrow. BTW, I could use some advice on how to go about
that. Right now I would stop one osd process (not the whole machine), reformat and
remount its btrfs devices as XFS, delete the [...]
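
In outline, that conversion would look something like the sketch below,
per OSD (assumptions: one data device per OSD; osd.N, /dev/sdX and
$OSD_DATA are placeholders, and any auth/keyring steps a given setup
needs are omitted):

# stop only this osd daemon, leave the rest of the node running
service ceph stop osd.N
umount $OSD_DATA
# reformat the former btrfs device as XFS and remount it
mkfs.xfs -f /dev/sdX
mount /dev/sdX $OSD_DATA
# recreate the (now empty) osd data directory, then let the cluster
# backfill it after restart
ceph-osd -i N --mkfs
service ceph start osd.N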