2013-10-31 Hefty, Sean sean.he...@intel.com:
Can you please try the attached patch in place of all previous patches?
Any updates on ceph with rsockets?
Sorry it took so long to get to this.
after some more analysis and debugging, I found
workarounds for my problems; I have added these workarounds
to the last version of the patch for the poll problem by Sean;
see the attachment to this posting.
The shutdown() operations below are all
2013/9/12 Andreas Bluemle andreas.blue...@itxperts.de:
I have not yet done any performance testing.
The next step I have to take is more related to setting up
a larger cluster with something like 150 OSDs without hitting any
resource limitations.
How do you manage failover? Will you use
2013/9/10 Andreas Bluemle andreas.blue...@itxperts.de:
Since I have added these workarounds to my version of the librdmacm
library, I can at least start up ceph using LD_PRELOAD and end up in
a healthy ceph cluster state.
Have you seen any performance improvement by using LD_PRELOAD with ceph?
Hi,
after some more analysis and debugging, I found
workarounds for my problems; I have added these workarounds
to the last version of the patch for the poll problem by Sean;
see the attachment to this posting.
The shutdown() operations below are all SHUT_RDWR.
1. shutdown() on side A of a
I tested out the patch and unfortunately had the same results as
Andreas. About 50% of the time the rpoll() thread in Ceph still hangs
when rshutdown() is called. I saw a similar behaviour when increasing
the poll time on the pre-patched version if that's of any relevance.
I'm not optimistic,
Hi Sean,
I tested out the patch and unfortunately had the same results as
Andreas. About 50% of the time the rpoll() thread in Ceph still hangs
when rshutdown() is called. I saw a similar behaviour when increasing
the poll time on the pre-patched version if that's of any relevance.
Thanks
On
Hi Sean,
I will re-check until the end of the week; there is
some test scheduling issue with our test system, which
affects my access times.
Thanks
Andreas
On Mon, 19 Aug 2013 17:10:11 +
Hefty, Sean sean.he...@intel.com wrote:
Can you see if the patch below fixes the hang?
Hi,
I have added the patch and re-tested: I still encounter
hangs of my application. I am not quite sure whether
I hit the same error on the shutdown, because now I don't hit
the error always, but only every now and then.
When adding the patch to my code base (git tag v1.0.17) I notice
an
I have added the patch and re-tested: I still encounter
hangs of my application. I am not quite sure whether
I hit the same error on the shutdown, because now I don't hit
the error always, but only every now and then.
I guess this is at least some progress... :/
When adding the patch to
Can you see if the patch below fixes the hang?
Signed-off-by: Sean Hefty sean.he...@intel.com
---
src/rsocket.c | 11 ++-
1 files changed, 10 insertions(+), 1 deletions(-)
diff --git a/src/rsocket.c b/src/rsocket.c
index d544dd0..e45b26d 100644
--- a/src/rsocket.c
+++ b/src/rsocket.c
I am looking at a multithreaded application here, and I believe that
the race is between thread A calling rpoll() for a POLLIN event and
thread B calling shutdown(SHUT_RDWR), i.e. shutting the (r)socket down
for both reading and writing, almost immediately afterwards.
I modified a test program, and I can
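A minimal sketch of that scenario (not the actual test program from this
thread), written against the rsockets API in rdma/rsocket.h: thread A
blocks in rpoll() waiting for POLLIN while the main thread calls
rshutdown(SHUT_RDWR) on the same descriptor shortly afterwards. The
server address and port are placeholders; build with -lrdmacm -lpthread.

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <pthread.h>
#include <poll.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <rdma/rsocket.h>

static void *poller(void *arg)
{
        int rs = *(int *) arg;
        struct pollfd fds = { .fd = rs, .events = POLLIN };

        /* Thread A: wait up to 30s for data or hangup on the rsocket. */
        int ret = rpoll(&fds, 1, 30000);
        printf("rpoll returned %d, revents 0x%x\n", ret, (unsigned) fds.revents);
        return NULL;
}

int main(void)
{
        struct sockaddr_in dst;
        pthread_t thr;
        int rs;

        rs = rsocket(AF_INET, SOCK_STREAM, 0);
        if (rs < 0)
                return 1;

        memset(&dst, 0, sizeof(dst));
        dst.sin_family = AF_INET;
        dst.sin_port = htons(7000);                        /* placeholder port */
        inet_pton(AF_INET, "192.168.0.1", &dst.sin_addr);  /* placeholder peer */
        if (rconnect(rs, (struct sockaddr *) &dst, sizeof(dst)) < 0)
                return 1;

        pthread_create(&thr, NULL, poller, &rs);
        usleep(1000);           /* give thread A time to enter rpoll() */

        /* Thread B: shut the rsocket down for both reading and writing. */
        rshutdown(rs, SHUT_RDWR);

        /* With the hang, rpoll() never wakes up and this join blocks forever. */
        pthread_join(thr, NULL);
        rclose(rs);
        return 0;
}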
Hi,
maybe some information about the environment I am
working in:
- CentOS 6.4 with custom kernel 3.8.13
- librdmacm / librspreload from git, tag 1.0.17
- application started with librspreload in LD_PRELOAD environment
Currently, I have increased the value of the spin time by setting the
On Aug 14, 2013, at 3:21 AM, Andreas Bluemle andreas.blue...@itxperts.de
wrote:
Hi,
maybe some information about the environment I am
working in:
- CentOS 6.4 with custom kernel 3.8.13
- librdmacm / librspreload from git, tag 1.0.17
- application started with librspreload in LD_PRELOAD
The first question I would have is: why is the rpoll() split into
these two pieces? There must have been some reason to do a busy
loop on some local state information rather than just call the
real poll() directly.
As Scott mentioned in his email, this is done for performance reasons. The
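To make the two pieces concrete, here is a rough paraphrase of the
structure being discussed (this is not the actual rsocket.c code;
rs_ready() is a hypothetical stand-in for the library's check of its
locally cached rsocket state, and the real implementation blocks on the
completion-channel fds rather than on the caller's fds):

#include <poll.h>
#include <stdint.h>
#include <sys/time.h>

static int polling_time = 10;   /* busy-poll budget, in microseconds */

int rs_ready(struct pollfd *fds, nfds_t nfds);  /* hypothetical helper */

static uint64_t now_us(void)
{
        struct timeval tv;
        gettimeofday(&tv, NULL);
        return (uint64_t) tv.tv_sec * 1000000 + tv.tv_usec;
}

int rpoll_sketch(struct pollfd *fds, nfds_t nfds, int timeout)
{
        uint64_t start = now_us();
        int ret;

        /* Piece 1: spin on local state so that events which arrive almost
         * immediately are reported without a kernel transition - the
         * performance motivation mentioned above. */
        do {
                ret = rs_ready(fds, nfds);
                if (ret)
                        return ret;
        } while (now_us() - start < (uint64_t) polling_time);

        /* Piece 2: give up spinning and block in a real poll(). */
        return poll(fds, nfds, timeout);
}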
Hi Matthew,
I found a workaround for my (our) problem: in the librdmacm
code, rsocket.c, there is a global constant polling_time, which
is set to 10 microseconds at the moment.
I raise this to 1 - and all of a sudden things work nicely.
I think we are looking at two issues here:
1. the
I found a workaround for my (our) problem: in the librdmacm
code, rsocket.c, there is a global constant polling_time, which
is set to 10 microseconds at the moment.
I raise this to 1 - and all of a sudden things work nicely.
I am adding the linux-rdma list to CC so Sean might see
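For reference, the workaround quoted above boils down to enlarging a
single tunable in librdmacm's src/rsocket.c and rebuilding the library.
Assuming the declaration looks roughly like this:

static int polling_time = 10;      /* default: spin for 10 microseconds */

it is raised to a much larger value; the exact number is not clear from
the quoted snippet, so the one below is only illustrative:

static int polling_time = 10000;   /* illustrative larger spin budget */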
Moving this conversation to ceph-devel where the devs might be able
to shed some light on this.
I've added some additional debug to my code to narrow the issue down a
bit and the reader thread appears to be getting locked by
tcp_read_wait() because rpoll never returns an event when the socket
is
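Roughly what such a reader loop looks like (a simplified C sketch, not
the actual Ceph tcp_read_wait() code): the thread sits in poll() - which
the preload library turns into rpoll() - until the socket is readable or
hung up. If rpoll() never reports POLLIN/POLLHUP/POLLERR after the
socket is shut down, the loop below never exits and the reader thread
looks locked up.

#include <poll.h>

int read_wait_sketch(int sd, int timeout_ms)
{
        struct pollfd pfd = { .fd = sd, .events = POLLIN };

        for (;;) {
                int ret = poll(&pfd, 1, timeout_ms);  /* rpoll() under LD_PRELOAD */
                if (ret < 0)
                        return -1;                    /* poll error */
                if (ret == 0)
                        continue;                     /* timeout: keep waiting (simplified) */
                if (pfd.revents & (POLLHUP | POLLERR))
                        return -1;                    /* peer closed or error */
                if (pfd.revents & POLLIN)
                        return 0;                     /* data is ready to read */
        }
}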
Hi Matthew,
I can confirm the behaviour which you describe.
I too believe that the problem is on the client side (ceph command).
My log files show the very same symptom, i.e. the client side
not being able to shut down the pipes properly.
(Q: I had problems yesterday sending a mail to ceph-users