Re: [ceph-users] Reproducible Data Corruption with cephfs kernel driver

2014-12-18 Thread Thomas Lemarchand
I too find Ceph fuse more stable.

However, you really should run your tests with a much more recent
kernel! 3.10 is old. There have been CephFS client improvements in
nearly every kernel release for a long time now.
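
For reference, a quick way to confirm what is actually in play on a
node (standard commands; the version strings in the comments are only
examples):

    # Running kernel used by the cephfs kernel client:
    uname -r                 # e.g. 3.10.x
    # Userspace Ceph version (0.87 is giant):
    ceph --version
    # Metadata of the in-kernel cephfs module, where available:
    modinfo ceph | head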

-- 
Thomas Lemarchand
Cloud Solutions SAS - Head of Information Systems



On Thu, 2014-12-18 at 14:52 +1000, Lindsay Mathieson wrote:
 I've been experimenting with CephFS for running KVM images (Proxmox).
 
 cephfs fuse version - 0.87
 
 cephfs kernel module - kernel version 3.10
 
 
 Part of my testing involves bringing up a Windows 7 VM and running
 CrystalDiskMark to check the I/O in the VM. It's surprisingly good with
 both the fuse and the kernel driver; seq reads & writes are actually
 faster than the underlying disk, so I presume the FS is aggressively
 caching.
 
 With the fuse driver I have no problems.
 
 With the kernel driver, the benchmark runs fine, but when I reboot the
 VM the drive is corrupted and unreadable, every time. Rolling back to
 a snapshot fixes the disk. This does not happen unless I run the
 benchmark, which I presume is writing a lot of data.
 
 No problems with the same test for Ceph rbd, or NFS.
 
 -- 
 Lindsay
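
To check the caching theory from the quoted message, one could also run
a direct-I/O benchmark on the host against the CephFS mount, bypassing
the page cache (a sketch; the mount point and file path below are
assumptions, not Proxmox defaults):

    # Sequential write with O_DIRECT so the host page cache is bypassed;
    # /mnt/pve/cephfs is a guessed mount point.
    fio --name=seqwrite --filename=/mnt/pve/cephfs/fio-test \
        --rw=write --bs=1M --size=1G --direct=1 --ioengine=libaio
    # Repeat with --rw=read; if these numbers drop to (or below) raw
    # disk speed, the earlier results were indeed cache-inflated.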
 





Re: [ceph-users] Reproducible Data Corruption with cephfs kernel driver

2014-12-18 Thread Gregory Farnum
On Wed, Dec 17, 2014 at 8:52 PM, Lindsay Mathieson
lindsay.mathie...@gmail.com wrote:
 I've been experimenting with CephFS for running KVM images (Proxmox).

 cephfs fuse version - 0.87

 cephfs kernel module - kernel version 3.10


 Part of my testing involves bringing up a Windows 7 VM and running
 CrystalDiskMark to check the I/O in the VM. It's surprisingly good with
 both the fuse and the kernel driver; seq reads & writes are actually
 faster than the underlying disk, so I presume the FS is aggressively
 caching.

 With the fuse driver I have no problems.

 With the kernel driver, the benchmark runs fine, but when I reboot the
 VM the drive is corrupted and unreadable, every time. Rolling back to
 a snapshot fixes the disk. This does not happen unless I run the
 benchmark, which I presume is writing a lot of data.

 No problems with the same test for Ceph rbd, or NFS.

Do you have any information about *how* the drive is corrupted; what
part Win7 is unhappy with? I don't know how Proxmox configures it, but
I assume you're storing the disk images as single files on the FS?

I'm really not sure what the kernel client could even do here, since
if you're not rebooting the host as well as the VM then it can't be
losing any of the data it's given. :/
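
One way to narrow that down would be to check whether the qcow2
container itself is damaged, as opposed to the guest filesystem inside
it (a sketch; the image path below is only an example):

    # Consistency check of the qcow2 metadata; a clean image reports
    # "No errors were found":
    qemu-img check /mnt/pve/cephfs/images/100/vm-100-disk-1.qcow2
    # Basic header info (format, virtual size, cluster size):
    qemu-img info /mnt/pve/cephfs/images/100/vm-100-disk-1.qcow2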
-Greg


Re: [ceph-users] Reproducible Data Corruption with cephfs kernel driver

2014-12-18 Thread Udo Lembke
Hi Lindsay,
have you tried the different cache options (no cache, writethrough,
...) that Proxmox offers for the drive?
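
In qemu terms those map to the -drive cache= modes; in Proxmox the mode
can be set per disk, e.g. (a sketch; the VM id, storage name, and
volume name are assumptions):

    # Proxmox: set the cache mode on an existing virtio disk of VM 100
    # (the storage/volume name here is hypothetical):
    qm set 100 -virtio0 cephfs-store:100/vm-100-disk-1.qcow2,cache=writethrough
    # Plain qemu/KVM equivalent:
    # -drive file=vm-100-disk-1.qcow2,if=virtio,cache=writethrough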


Udo

On 18.12.2014 05:52, Lindsay Mathieson wrote:
 I've been experimenting with CephFS for running KVM images (Proxmox).

 cephfs fuse version - 0.87

 cephfs kernel module - kernel version 3.10


 Part of my testing involves bringing up a Windows 7 VM and running
 CrystalDiskMark to check the I/O in the VM. It's surprisingly good with
 both the fuse and the kernel driver; seq reads & writes are actually
 faster than the underlying disk, so I presume the FS is aggressively
 caching.

 With the fuse driver I have no problems.

 With the kernel driver, the benchmark runs fine, but when I reboot the
 VM the drive is corrupted and unreadable, every time. Rolling back to
 a snapshot fixes the disk. This does not happen unless I run the
 benchmark, which I presume is writing a lot of data.

 No problems with the same test for Ceph rbd, or NFS.





Re: [ceph-users] Reproducible Data Corruption with cephfs kernel driver

2014-12-18 Thread Lindsay Mathieson
On Thu, 18 Dec 2014 08:41:21 PM Udo Lembke wrote:
 have you tried the different cache options (no cache, writethrough,
 ...) that Proxmox offers for the drive?


I tried with writeback and it didn't corrupt.
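
That is a useful data point, because qemu's cache modes differ in which
host-side write path they exercise (a summary from qemu's behaviour;
which mode the failing runs used is an assumption):

    # How qemu's cache= modes hit the host filesystem:
    #   cache=none          O_DIRECT (bypasses host page cache)
    #   cache=writeback     host page cache, guest flushes honoured
    #   cache=writethrough  host page cache + O_DSYNC on every write
    #   cache=directsync    O_DIRECT + O_DSYNC
    #   cache=unsafe        page cache, guest flushes ignored
    # If the failing runs used Proxmox's default (no cache, i.e.
    # cache=none / O_DIRECT), that would point at the direct-I/O
    # write path of the cephfs kernel client.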
-- 
Lindsay



Re: [ceph-users] Reproducible Data Corruption with cephfs kernel driver

2014-12-18 Thread Lindsay Mathieson
On Thu, 18 Dec 2014 11:23:42 AM Gregory Farnum wrote:
 Do you have any information about *how* the drive is corrupted; what
 part Win7 is unhappy with? 

Failure to find the boot sector, I think; I'll run it again and take a
screenshot.

 I don't know how Proxmox configures it, but
 I assume you're storing the disk images as single files on the FS?

It's a single KVM qcow2 file.
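
Since it's qcow2, the boot sector can also be inspected directly from
the host to confirm that (a sketch; the nbd device and image path are
assumptions, and the VM must be stopped first):

    # Expose the qcow2 image as a block device:
    modprobe nbd max_part=8
    qemu-nbd --connect=/dev/nbd0 /mnt/pve/cephfs/images/100/vm-100-disk-1.qcow2
    # Dump the MBR; a valid boot sector ends with the 55 aa signature
    # at offset 0x1fe:
    dd if=/dev/nbd0 bs=512 count=1 2>/dev/null | xxd | tail -2
    qemu-nbd --disconnect /dev/nbd0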

-- 
Lindsay



Re: [ceph-users] Reproducible Data Corruption with cephfs kernel driver

2014-12-18 Thread John Spray
On Thu, Dec 18, 2014 at 8:40 PM, Lindsay Mathieson
lindsay.mathie...@gmail.com wrote:
 I don't know how Proxmox configures it, but
 I assume you're storing the disk images as single files on the FS?

 It's a single KVM qcow2 file.

Like the cache mode, the image format might be an interesting thing to
experiment with. There are bugs in all layers of the I/O stack; it's
entirely possible that you're seeing a bug elsewhere in the stack that
is only triggered when using Ceph.
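
For example, one could convert a copy of the image to raw and rerun the
benchmark against that (a sketch; the file names are assumptions):

    # Convert the qcow2 image to raw; -p shows progress. Work on a
    # copy, then point the VM at the raw file for the next test run.
    qemu-img convert -p -O raw vm-100-disk-1.qcow2 vm-100-disk-1.raw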

This probably goes without saying, but make sure you're using the
latest/greatest versions of everything, including
kvm/qemu/proxmox/kernel/guest drivers.

John