Your benchmark is giving me the creeps! First, which version of JGroups / Infinispan does this use? Second, is there a way to start this in an IDE rather than through maven? Third, I don't think bench-jgroups.xml is picked up at all! How do I make a change to bench-jgroups.xml and have Transactional *use* it?
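For the record, here is how I'd expect the file to get picked up -- a sketch, assuming the benchmark reads a standard Infinispan XML configuration (the cluster name below is made up). Unless something like this points the transport at bench-jgroups.xml, the file will never be used:

   <infinispan>
      <global>
         <!-- Tell the JGroups transport which stack file to load; the
              file has to be on the classpath, e.g. under
              src/main/resources -->
         <transport clusterName="benchmark-cluster">
            <properties>
               <property name="configurationFile" value="bench-jgroups.xml"/>
            </properties>
         </transport>
      </global>
   </infinispan>

If the benchmark instead builds its cache manager programmatically, the equivalent is setting the same "configurationFile" property on the transport's properties in the global configuration.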
This maven crap hides so much, I never know what's going on under the covers!@$@ Do I have to run mvn install or mvn package when I make a change to bench-jgroups.xml? I don't think I'll ever switch to this f*cking piece of shit!

Please ping me as soon as you've calmed down... :-)

Cheers,

On 2/1/12 4:52 PM, Sanne Grinovero wrote:
> On 1 February 2012 15:18, Bela Ban<b...@redhat.com> wrote:
>>
>> On 2/1/12 10:25 AM, Dan Berindei wrote:
>>
>>>> That's not the way it works; at startup of F, it sends its IP address
>>>> with the discovery request. Everybody returns its IP address with the
>>>> discovery response, so even though we have F only talking to A (the
>>>> coordinator) initially, F will also know the IP addresses of A, B, C,
>>>> D and E.
>>>
>>> Ok, I stand corrected... since we start all the nodes on the same
>>> thread, each of them should reply to the discovery request of the
>>> next nodes.
>>
>> Hmm, can you reproduce this every time? If so, can you send me the
>> program so I can run it here?
>>
>>> However, num_initial_members was set to 3 (the Infinispan default).
>>> Could that make PING not wait for all the responses? If it's like
>>> that, then I suggest we set a (much) higher num_initial_members and a
>>> lower timeout in the default configuration.
>>
>> Yes, the discovery could return quickly, but the responses would still
>> be processed even if they were received later, so I don't think that's
>> the issue.
>>
>> The initial discovery should discover *all* IP addresses; later
>> triggering a discovery because an IP address wasn't found should
>> always be the exceptional case!
>>
>> If you start members in turn, then they should easily form a cluster
>> and not even merge. Here's what can happen on a merge:
>> - The view is A|1={A,B}, both A and B have IP addresses for A and B
>> - The view splits into A|2={A} and B|2={B}
>> - A now marks B's IP address as removable and B marks A's IP address
>>   as removable
>> - If the cache grows to over 500 entries
>>   (TP.logical_addr_cache_max_size) or TP.logical_addr_cache_expiration
>>   milliseconds elapse (whichever comes first), the entries marked as
>>   removable are removed
>> - If, *before that*, the merge view A|3={A,B} is installed, A unmarks
>>   B and B unmarks A, so the entries won't get removed
>>
>> So a hypothesis of how those IP addresses get removed could be that
>> the cluster had a couple of merges that didn't heal for 2 minutes (?).
>> Hard to believe, though...
>>
>> We have to get to the bottom of this, so it would be great if you had
>> a program that reproduced this, that I could run myself. The main
>> question is why the IP address for the target is gone and/or why the
>> IP address wasn't received in the first place.
>>
>> In any case, replacing MERGE2 with MERGE3 might help a bit, as MERGE3
>> [1] periodically broadcasts IP address/logical name and logical
>> address: "An INFO message carries the logical name and physical
>> address of a member. Compared to MERGE2, this allows us to immediately
>> send messages to newly merged members, and not have to solicit this
>> information first." (copied from the documentation)
>>
>>>>> Note that everything is blocked at this point, we
>>>>> won't send another message in the entire cluster until we got the
>>>>> physical address.
>>
>> Understood. Let me see if I can block sending of the message for a max
>> time (say 2 seconds) until I get the IP address. Not very nice, and I
>> prefer a different approach (plus we need to see why this happens in
>> the first place anyway)...
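An aside, to make the knobs discussed above concrete: this is roughly what the relevant fragment of a UDP stack would look like. The attribute names are from JGroups 3.x; the values are illustrative only, not a recommendation.

   <!-- TP keeps the UUID -> physical address mappings here; entries
        marked as removable are evicted once the cache exceeds the max
        size or the expiration (in ms) has elapsed.
        (Other UDP attributes omitted.) -->
   <UDP logical_addr_cache_max_size="500"
        logical_addr_cache_expiration="120000"/>

   <!-- Dan's suggestion: wait for more discovery responses (e.g. sized
        for the 12-node test here) but with a lower timeout (ms) -->
   <PING timeout="1000" num_initial_members="12"/>

   <!-- MERGE3 instead of MERGE2: its INFO messages periodically carry
        logical name plus physical address, so newly merged members are
        immediately addressable -->
   <MERGE3 min_interval="10000" max_interval="30000"/>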
>>>> As I said; this is an exceptional case, probably caused by Sanne
>>>> starting 12 channels inside the same JVM, at the same time,
>>>> therefore causing a traffic spike, which results in dropped
>>>> discovery requests or responses.
>>>
>>> Bela, we create the caches on a single thread, so we never have more
>>> than one node joining at the same time.
>>> At most we could have some extra activity if one node can't join the
>>> existing cluster and starts a separate partition, but hardly enough
>>> to cause congestion.
>>
>> Hmm, that indeed doesn't sound like an issue...
>>
>>>> After that, when F wants to talk to C, it asks the cluster for C's
>>>> IP address, and that should take a few ms at most.
>>>
>>> Ok, so when F wanted to send the ClusteredGetCommand request to C,
>>> PING got the physical address right away. But the ClusteredGetCommand
>>> had to wait for STABLE to kick in and for C to ask for retransmission
>>> (because we didn't send any other messages).
>>
>> Yep. Before I implement some blocking until we have the IP address, or
>> a timeout elapses, I'd like to try to get to the bottom of this
>> problem first!
>>
>>> Maybe *we* should use RSVP for our ClusteredGetCommands, since those
>>> can never block... Actually, we don't want to retransmit the request
>>> if we already got a response from another node, so it would be best
>>> if we could ask for retransmission of a particular request
>>> explicitly ;-)
>>
>> I'd rather implement the blocking approach above! :-)
>>
>>> I wonder if we could also decrease desired_avg_gossip and
>>> stability_delay in STABLE. After all, an extra STABLE round can't
>>> slow us down when we're not doing anything, and when we are busy
>>> we're going to hit the max_bytes limit much sooner than the
>>> desired_avg_gossip time limit anyway.
>>
>> I don't think this is a good idea, as it will generate more traffic.
>> The stable task is not skipped when we have a lot of traffic, so this
>> will compound the issue.
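The same aside for STABLE: the knobs Dan refers to look like this. The values are in the ballpark of the stock defaults and shown for illustration only; his proposal amounts to lowering desired_avg_gossip and stability_delay.

   <!-- STABLE agrees cluster-wide on which messages everyone has
        delivered, so retransmission tables can be purged: a round runs
        on average every desired_avg_gossip ms, or as soon as max_bytes
        of traffic have accumulated, whichever comes first;
        stability_delay (ms) staggers the responses to avoid a reply
        storm -->
   <pbcast.STABLE stability_delay="1000"
                  desired_avg_gossip="50000"
                  max_bytes="4000000"/>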
>>>>> I'm also not sure what to make of these lines:
>>>>>
>>>>>>>> [org.jgroups.protocols.UDP] sanne-55119: no physical address for
>>>>>>>> sanne-53650, dropping message
>>>>>>>> [org.jgroups.protocols.pbcast.GMS] JOIN(sanne-55119) sent to
>>>>>>>> sanne-53650 timed out (after 3000 ms), retrying
>>>>>
>>>>> It appears that sanne-55119 knows the logical name of sanne-53650,
>>>>> and the fact that it's the coordinator, but not its physical
>>>>> address. Shouldn't all of this information have arrived at the same
>>>>> time?
>>>>
>>>> Hmm, correct. However, the logical names are kept in (a static)
>>>> UUID.cache and the IP addresses in TP.logical_addr_cache.
>>>
>>> Ah, so if we have 12 nodes in the same VM they automatically know
>>> each other's logical name - they don't need PING at all!
>>
>> Yes. Note that logical names are not the problem; even if we evict
>> some logical name from the cache (and we do this only for removed
>> members), JGroups will still work, as it only needs UUIDs and IP
>> addresses.
>>
>>> Does the logical cache get cleared on channel stop? I think that
>>> would explain another weird thing I was seeing in the test suite
>>> logs: sometimes everyone in a cluster would suddenly forget everyone
>>> else's logical name and start logging UUIDs.
>>
>> On a view change, we remove all entries which are *not* in the new
>> view. However, 'removing' is again simply marking those members as
>> 'removable', and only if the cache grows beyond 500
>> (-Djgroups.uuid_cache.max_entries=500) entries will all entries older
>> than 5 seconds (-Djgroups.uuid_cache.max_age=5000) be removed. (There
>> is no separate reaper task running for this.)
>>
>> So, yes, this can happen, but on the next discovery round we'll have
>> the correct values. Again, as I said, UUID.cache is not as important
>> as TP.logical_addr_cache.
>>
>>> This is running the Transactional benchmark, so it would be simpler
>>> if we enabled PING trace in the configuration and disabled it before
>>> the actual benchmark starts. I'm going to try it myself :)
>>
>> How do you run 12 instances? Did you change something in the config?
>> I'd be interested in trying the *exact* same config you're running, to
>> see what's going on!
>
> Pushed it for you, committing the exact configuration changes as well:
>
> git clone git://github.com/Sanne/InfinispanStartupBenchmark.git
> cd InfinispanStartupBenchmark
> git co my
> sh bench.sh
>
> If you look into bench.sh (as it is at
> https://github.com/Sanne/InfinispanStartupBenchmark/blob/my/bench.sh),
> lines 9, 15 and 28 should be the most interesting.
>
> You need to run with Infinispan's default configuration to reproduce
> the issue, but I wouldn't mind you commenting on
> https://github.com/Sanne/InfinispanStartupBenchmark/blob/my/benchmark/src/main/resources/bench-jgroups.xml
> as well. That's what I'm now using as the default for further
> performance tests.
>
> Cheers,
> Sanne

>> [1] http://www.jgroups.org/manual-3.x/html/protlist.html#MERGE3

--
Bela Ban
Lead JGroups (http://www.jgroups.org)
JBoss / Red Hat

_______________________________________________
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev