On Mon, Apr 15, 2013 at 1:30 PM, Sanne Grinovero <sa...@infinispan.org> wrote:
> I've attached the logs to the JIRA.
>
> Some replies inline:
>
> On 15 April 2013 11:04, Dan Berindei <dan.berin...@gmail.com> wrote:
> >
> > On Sat, Apr 13, 2013 at 2:42 PM, Sanne Grinovero <sa...@infinispan.org>
> > wrote:
> >>
> >> On 13 April 2013 11:20, Bela Ban <b...@redhat.com> wrote:
> >> >
> >> > On 4/13/13 2:02 AM, Sanne Grinovero wrote:
> >> >
> >> >> @All, the performance problem seemed to be caused by a problem in
> >> >> JGroups, which I've logged here:
> >> >> https://issues.jboss.org/browse/JGRP-1617
> >> >
> >> > Almost no information attached to the case :-( If it wasn't you,
> >> > Sanne, I'd outright reject the case ...
> >>
> >> I wouldn't blame you, and I'm sorry for the lack of details: as I said
> >> it was very late, but I still preferred to share the observations we
> >> had made so far.
> >>
> >> From all the experiments we made - and some good logs I'll clean up
> >> for sharing - it's clear that the thread is not woken up even though
> >> the ACK was already received.
> >> And of course I wouldn't expect this to fail in a simple test, as it
> >> wouldn't have escaped you ;-) or at least you would have had earlier
> >> reports.
> >>
> >> There are lots of complex moving parts in this scenario: a muxed
> >> JGroups Channel, the Application Server responsible for initializing
> >> the stack, and some added magic from CapeDwarf itself: it's not clear
> >> to me what configuration is exactly being used, for one.
> >>
> >
> > Does CD also change the JGroups configuration? I thought it only tweaks
> > the Infinispan cache configuration on deployment, and the JGroups
> > channel is already started by the time the CD application is deployed.
>
> CD uses a custom AS build and a custom AS configuration, so anything
> could be different.
> On top of that, some things are reconfigured programmatically by it.

Ales already cleared this up: CD doesn't change the JGroups config at all.

> >> Without a testcase we might not be 100% sure, but it seems likely to
> >> be unexpected behaviour in JGroups, at least under some very specific
> >> setup.
> >>
> >> I'm glad to help track down more details of what could trigger this,
> >> but I'm not too eager to write a full unit test for it, as it involves
> >> a lot of other components; even after mocking my own components out I
> >> could still reproduce it, so it's not Hibernate Search: I'll need help
> >> from the field experts.
> >>
> >> Also I suspect a test would need to depend on many more components:
> >> does JGroups have an easy way to manage dependencies nowadays?
> >>
> >> some more inline:
> >>
> >> >
> >> > The MessageDispatcher will *not* wait until the timeout kicks in,
> >> > it'll return as soon as it has acks from all members of the target
> >> > set. This works and is covered by a bunch of unit tests, so a
> >> > regression would have been caught immediately.
> >>
> >> I don't doubt the "vanilla scenario", but this is what happens in the
> >> more complex case of the CapeDwarf setup.
> >>
> >
> > My first guess would be that the MuxRpcDispatcher on the second node
> > hasn't started yet by the time you call castMessage on the first node.
> > It could be that your workaround just delayed the message a little bit,
> > until the MuxRpcDispatcher on the other node actually started (because
> > the JChannel is already started on both nodes, but as long as the
> > MuxRpcDispatcher isn't started on the 2nd node it won't send any
> > responses back).
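To make the discussion concrete: the call we're debating is the blocking
form of castMessage. Here is a minimal sketch against the plain, non-muxed
API (the cluster name, payload and handler are made up; this is not the
actual Search code):

    import org.jgroups.JChannel;
    import org.jgroups.Message;
    import org.jgroups.blocks.MessageDispatcher;
    import org.jgroups.blocks.RequestHandler;
    import org.jgroups.blocks.RequestOptions;
    import org.jgroups.blocks.ResponseMode;
    import org.jgroups.util.RspList;

    public class SyncCastExample {
        public static void main(String[] args) throws Exception {
            JChannel channel = new JChannel(); // default udp.xml stack
            // Create the dispatcher before connect() so it sees all
            // requests; every member simply acks whatever it receives.
            MessageDispatcher dispatcher = new MessageDispatcher(channel, null, null,
                    new RequestHandler() {
                        public Object handle(Message msg) throws Exception {
                            return "ack";
                        }
                    });
            channel.connect("demo-cluster");

            // GET_ALL + timeout: castMessage() blocks, but should return
            // as soon as all current members have replied, not when the
            // timeout expires.
            RequestOptions opts = new RequestOptions(ResponseMode.GET_ALL, 10000);
            RspList<Object> rsps = dispatcher.castMessage(null,
                    new Message(null, null, "ping"), opts);
            System.out.println("responses: " + rsps);
            channel.close();
        }
    }

If the symptom is what the logs suggest, the equivalent call in the
CapeDwarf setup blocks for the full timeout even though the acks were
already received.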
>
> Before the point in which Search uses the dispatcher, many more
> operations happened successfully and with reasonable timing: in
> particular, some transactions on Infinispan stored entries quickly and
> without trouble.
>
> Besides, if such a race condition were possible, I would consider it a
> critical bug.
>

I looked at the muxer code and I think they actually take care of this
already: MuxUpHandler returns a NoMuxHandler response when it can't find
an appropriate MuxedRpcDispatcher, and MessageDispatcher counts that
response against the number of expected responses. So it must be
something else...

> >> > I attached a test program to JGRP-1617 which shows that this
> >> > feature works correctly.
> >> >
> >> > Of course, if you lose an ack (e.g. due to a maxed out incoming /
> >> > OOB thread pool), the unicast protocol will have to retransmit the
> >> > ack until it has been received. Depending on the unicast protocol
> >> > you use, this will be immediate (UNICAST, UNICAST3), or based on a
> >> > stability interval (UNICAST2).
> >>
> >> Right, it's totally possible this is a stack configuration problem in
> >> the AS.
> >> I wouldn't be the best person to ask about that though; I don't even
> >> understand the configuration format.
> >>
> >
> > You can get the actual JGroups configuration with
> > channel.getProtocolStack().printProtocolSpecAsXml(), but I wouldn't
> > expect you to find any surprises there: they should use pretty much
> > the JGroups defaults.
>
> Nice tip. I'll add this as a logging option.
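Something like this hypothetical helper is what that logging option could
look like (just a sketch, the class and method names are made up); it
would also let us double-check the timer values discussed below:

    import org.jgroups.JChannel;
    import org.jgroups.logging.Log;
    import org.jgroups.logging.LogFactory;

    public class StackDumper {
        private static final Log log = LogFactory.getLog(StackDumper.class);

        // Dump the stack that is actually running, rather than guessing
        // from the AS configuration files.
        public static void dumpConfiguration(JChannel channel) {
            if (log.isDebugEnabled()) {
                log.debug("Effective JGroups stack:\n"
                        + channel.getProtocolStack().printProtocolSpecAsXml());
            }
        }
    }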
>
> > By default STABLE.desired_avg_gossip is 20s and STABLE.stability_delay
> > is 6s, so even if the message was lost it should take < 30s for the
> > message to be resent.
>
> The delay is actually ~10 seconds per RPC, so still < 30s.
> The reason the overall test takes 60 seconds is that there are 6
> operations being performed.
>

Ok, this means my hunch definitely wasn't true: that would have explained
only the first request timing out.

> >> >> For the record, the first operation was indeed triggering some
> >> >> lazy initialization of indexes, which in turn would trigger a
> >> >> Lucene Directory being started, triggering 3 Cache starts which in
> >> >> turn would trigger 6 state transfer processes: so indeed the first
> >> >> operation would not be exactly "cheap" performance-wise, still it
> >> >> would complete in about 120 milliseconds.
> >> >
> >> > This sounds very low for the work you describe above. I don't think
> >> > 6 state transfers can be completed in 120ms, unless they're async
> >> > (but then that means they're not done when you return). Also, cache
> >> > starts (wrt JGroups) will definitely take more than a few seconds
> >> > if you're the first cluster node...
> >>
> >> It's a unit test: the caches are initially empty and networking is
> >> loopback;
> >> on the second round some ~6 elements are in the cache, no larger than
> >> ~10-character strings.
> >> Should be reasonable?
> >>
> >
> > Yes, I think it's reasonable, if the JChannel was already started
> > before the CD application was deployed. Starting the first JChannel
> > would take at least 3s, which is the default PING.timeout.
> >
> >>
> >> >> Not being sure about the options of depending on a newer JGroups
> >> >> release or the complexity of a fix, I'll implement a workaround in
> >> >> HSearch in the scope of HSEARCH-1296.
> >> >
> >> > If you add more information to JGRP-1617, I'll take a look. This
> >> > would be a critical bug in JGroups *if* you can prove that the
> >> > MessageDispatcher always runs into the timeout (I don't think you
> >> > can though!).
> >>
> >> Considering the easy workaround, and that this definitely needs
> >> something special in the configuration, I wouldn't consider it too
> >> critical? As far as we know now, it's entirely possible the
> >> configuration being used is illegal. But this is exactly where I need
> >> your help ;-)
> >>
> >
> > I'm not sure that your workaround is 100% effective: even if it
> > doesn't happen in this test, it's always possible to have the app
> > deployed on some of the nodes in the cluster, but not all.
>
> That's right. I would prefer not to apply anything like that, but for
> the sake of the experiment it was useful for isolating the problem.

Looking at your workaround, I think you actually set the response mode to
GET_NONE (because that's the default value in RequestOptions), so you're
back to sending an asynchronous request.
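In code, the difference is roughly this (an illustrative sketch, not the
actual Hibernate Search workaround):

    import org.jgroups.blocks.RequestOptions;
    import org.jgroups.blocks.ResponseMode;

    public class ResponseModeCheck {
        public static void main(String[] args) {
            // What I suspect the workaround ends up with: the no-arg
            // constructor leaves the response mode at its default,
            // GET_NONE, i.e. fire-and-forget, so the timeout is moot.
            RequestOptions async = new RequestOptions().setTimeout(10000);

            // A synchronous request has to ask for responses explicitly:
            // block until every target has replied, or until the timeout.
            RequestOptions sync = new RequestOptions(ResponseMode.GET_ALL, 10000);

            System.out.println("workaround mode: " + async.getMode()); // GET_NONE
            System.out.println("sync mode:       " + sync.getMode());  // GET_ALL
        }
    }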