Hi Sage,

Thanks for the answer.

> It looks like dmesg shows it trying to connect to the monitor at .70, but you 
> tested .83?
Since I am testing several ceph versions simultaneously, I may have mixed up 
the nodes when checking the error.
I'll double-check this and let you know.
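
A minimal sketch of such a cross-check, in Python: pull the monitor addresses 
that the kernel client reports as failing out of the dmesg text, then try a 
plain TCP connect to each of them (the same thing the nc test does). The helper 
names failed_monitors and probe are hypothetical and not part of ceph; the 
pattern only assumes the "connection failed" line format quoted below.

```python
import re
import socket

# Matches lines such as:
#   [17015.888143] ceph: mon0 10.55.147.70:6789 connection failed
MON_FAIL = re.compile(
    r"ceph: mon\d+ (\d+\.\d+\.\d+\.\d+):(\d+) connection failed"
)


def failed_monitors(dmesg_text):
    """Return the set of (host, port) pairs the client failed to reach."""
    return {(m.group(1), int(m.group(2))) for m in MON_FAIL.finditer(dmesg_text)}


def probe(host, port, timeout=3.0):
    """Try a TCP connect; return the first bytes the peer sends, or None on failure."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            s.settimeout(timeout)
            try:
                return s.recv(64)
            except socket.timeout:
                return b""  # connected, but no banner arrived in time
    except OSError:
        return None  # refused, unreachable, or timed out while connecting
```

Running probe on every address returned by failed_monitors, on the same client 
node that produced the dmesg output, would show whether the address the client 
is actually using (.70 here) answers at all, rather than the one tested by hand.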

> It also looks like the IO is synchronous, which may have something 
> to do with your performance.  Are you mounting with -o sync or using 
> direct IO, or are multiple clients reading and writing to the same file or 
> something?
The IO is indeed synchronous. However, the performance under ceph is much worse 
than even under nfs, which looks strange. I do not mount with -o sync. In our 
experiments, multiple clients do read and write the same file.
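
To put a number on the cost of synchronous IO by itself, a rough sketch like 
the following could be run against the same path on each file system. It 
assumes Linux (os.O_SYNC is not available everywhere); timed_write is a 
hypothetical helper, and it does not reproduce the multi-client case.

```python
import os
import time


def timed_write(path, block, count, sync=False):
    """Write `count` copies of `block` to `path`; return elapsed seconds."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    if sync:
        # With O_SYNC each write returns only after the data is on stable storage.
        flags |= os.O_SYNC
    fd = os.open(path, flags, 0o644)
    start = time.monotonic()
    try:
        for _ in range(count):
            # Assumes the whole block is written in one call,
            # which is the usual case for small writes to regular files.
            os.write(fd, block)
    finally:
        os.close(fd)
    return time.monotonic() - start
```

Comparing timed_write(path, block, count) with timed_write(path, block, count, 
sync=True) on a ceph mount and on an nfs mount would separate the penalty of 
synchronous writes from anything specific to ceph.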

Thanks,
Roman


-----Original Message-----
From: Sage Weil [mailto:s...@newdream.net] 
Sent: Tuesday, February 16, 2010 8:35 PM
To: Talyansky, Roman
Cc: ceph-devel@lists.sourceforge.net
Subject: Re: [ceph-devel] Write operation is stuck

On Tue, 16 Feb 2010, Talyansky, Roman wrote:

> Hi Sage,
> 
> I am trying to reproduce the hang with the latest client and servers.
> I am able to start the servers, however mount fails with input/output error 
> 5. The dmesg listing shows the following info:
> 
> [17008.244739] ceph: loaded 0.18.0 (mon/mds/osd proto 15/30/22)
> [17015.888143] ceph: mon0 10.55.147.70:6789 connection failed
> [17025.880170] ceph: mon0 10.55.147.70:6789 connection failed
> [17035.880121] ceph: mon0 10.55.147.70:6789 connection failed
> [17045.880189] ceph: mon0 10.55.147.70:6789 connection failed
> [17055.880130] ceph: mon0 10.55.147.70:6789 connection failed
> [17065.880113] ceph: mon0 10.55.147.70:6789 connection failed
> [17075.880170] ceph: mon0 10.55.147.70:6789 connection failed
> 
> The server is reachable, as the following command output shows:
> 
> $ nc 10.55.147.83 6789
> ceph v027

It looks like dmesg shows it trying to connect to the monitor at .70, but 
you tested .83?

> I started running the experiments with ceph 0.18 using the 
> configuration, where clients and servers run on separate nodes. It turns 
> out that the performance is extremely bad. Looking at dmesg trace I see 
> ceph-related faults (the partial trace is attached to the email).

The oops in the attached trace.txt was fixed last week in the unstable 
code.  It also looks like the IO is synchronous, which may have something 
to do with your performance.  Are you mounting with -o sync or using 
direct IO, or are multiple clients reading and writing to the same file or 
something?

Thanks-
sage

