Hi Sage,

As you advised, we switched to ceph release 0.19 and ran into another bug 
in the ceph client. When writing to a file opened with the O_SYNC flag, 
write() always returns 0 even though the data is written to disk.
This is a problem for our benchmark, which uses the return value as the 
number of bytes written. The behavior also appears to violate the POSIX 
write() contract, under which a successful write returns the number of bytes 
actually written.

Attached is a small unit test in C++.
The unit test creates 2 identical files, both filled with random 
digits 0-9.
Both files are then closed.
Then one file is reopened and filled with 1's.

Running the test:
$ g++ temp.cc
$ ./a.out 100  (this is the number of bytes in the files)
Run the executable a.out from within a directory on a ceph file system.
Each time write() returns 0, an error is printed to the screen.

After the program finishes you will find 2 files:
./test  - filled with ones
./test.start - filled with random numeric data

If you run this test on NFS and ceph you will see that no errors are printed 
out on the NFS file system, and 100 errors are printed out on ceph.
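
For readers without the attachment, the check can be sketched roughly as 
below. This is not the attached temp.cc, only a minimal illustration of the 
same idea. The helper name count_zero_returns is hypothetical, and writing 
one byte per write() call is an assumption made to match the "100 errors 
for 100 bytes" result described above.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper (not the attached temp.cc): opens `path` with O_SYNC,
// writes `nbytes` '1' characters one byte at a time, and counts how many
// write() calls returned 0. Returns -1 if the open or any write fails.
long count_zero_returns(const char* path, std::size_t nbytes) {
    assert(path != nullptr);
    int fd = open(path, O_WRONLY | O_CREAT | O_SYNC, 0644);
    if (fd < 0) {
        return -1;
    }
    long zeros = 0;
    const char one = '1';
    for (std::size_t i = 0; i < nbytes; ++i) {
        ssize_t n = write(fd, &one, 1);
        if (n < 0) {          // genuine error reported by the file system
            close(fd);
            return -1;
        }
        if (n == 0) {         // success reported, but 0 bytes claimed written
            ++zeros;
        }
    }
    close(fd);
    return zeros;
}
```

Calling count_zero_returns("test", 100) from a directory on the file system 
under test should give 0 wherever write() reports the byte count correctly; 
a result of 100 would correspond to the behavior we see on ceph.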

Thanks,

Roman & Roman

-----Original Message-----
From: Sage Weil [mailto:s...@newdream.net] 
Sent: Friday, February 19, 2010 8:39 PM
To: Talyansky, Roman
Cc: ceph-devel@lists.sourceforge.net
Subject: Re: [ceph-devel] Write operation is stuck

Hi Roman,

On Fri, 19 Feb 2010, Talyansky, Roman wrote:
> Since I test several ceph versions simultaneously I could confuse the error 
> checking at different nodes.
> I'll double check this and let you know.

Thanks.  If you haven't switched to the just-released 0.19, now might be 
the time to do that.

> > It also looks like the IO is synchronous, which may have something 
> > to do with your performance.  Are you mounting with -o sync or using 
> > direct IO, or are multiple clients reading and writing to the same file or 
> > something?
>
> The IO is indeed synchronous. However the performance under ceph is much 
> worse than even under nfs, which looks strange. I do not mount with -o 
> sync. And in our experiments multiple clients read and write the same 
> file.

If you are accessing the same file from multiple clients, then any 
comparison with nfs is going to be misleading.  NFS provides only 
close-to-open consistency, so IO will be buffered and inconsistent.  Ceph provides 
fully consistent semantics by switching to synchronous IO when there are 
multiple clients.  Ceph will be slower, but correct; nfs will be fast, but 
incorrect.

If your application is smart enough to handle its own consistency (each 
client is writing to a different region of the file) then you probably 
want something along the lines of O_LAZY [1], so that the application can 
tell the FS not to worry about consistency and stick with buffered IO.  
Unfortunately O_LAZY doesn't exist in Linux at this point.  There is some 
preliminary support for it in Ceph... if that's what you're looking for, 
we can cook up some patches for you.

If you can find us in #ceph on irc.oftc.net that might be a quicker way to 
diagnose the performance problems with your workload.

Thanks!
sage

[1] http://www.pdl.cmu.edu/posix/docs/posix_lazy_io.pdf

Attachment: temp.cc
Description: temp.cc
