On Tue, Dec 18, 2012 at 08:13:14PM +0100, Mario Camou wrote:
>
> We're having some performance issues in a specific workflow, where Ruote
> will sometimes start going very slow (i.e., 1-2 seconds between messages
> with RUOTE_NOISY=true.
Hello Mario,

is that specific workflow long? How many lines in the process definitions? Do
the workitems have a big payload?

> Part of the problem seems to be memory consumption (we're running on JRuby
> 1.7 on top of JDK 7). With this single workflow running, we were running
> out of heap space (set to 640M). We upped the heap to 768M and moved
> from DefaultHistory to StorageHistory (we're using RuoteMon for storage).
> That allowed us to complete the workflow without having an
> OutOfMemoryError. However, using JConsole to monitor memory in real time I
> see that memory usage will spike to almost all the heap, and then it seems
> like the JVM is thrashing, doing a GC (which takes it down to ~500M) and
> then filling back up, every 4 seconds.
>
> With the Eclipse Heap Analyzer I've seen that the biggest culprit is the
> WaitLogger. The comments in the code say that it stores the last 147
> messages for this particular worker, but it isn't clear exactly what it
> does and what might happen if we bump that number down.

Yes, feel free to cut down the 'wait_logger_max' to 0. It should have no bad
effect, unless you use Ruote::Dashboard#wait_for in production (outside of
tests).

To avoid such bad surprises I cut the default from 147 to 56 on ruote master:

  https://github.com/jmettraux/ruote/commit/327a64f6f7afed685d108ffe425c6651ecb83914

Please tell me how it goes.

> Another thing that would be useful is to have a description of the columns
> that appear in the noisy log. So, for example:
>
> 9 00:23.101 20 di * 20121218-1829-jimisuzo-hetsujemo 60311 0_10_1_2
> fill_form wi: [0_10_1_2!60311...!, 7], part:
> [Abstra::RuoteProject::Participants::PubSubAuthStorageParticipant, {}]
>
> What is the timestamp? The number after it?
> The abbreviation seems to be
> some sort of code specifying which part of the processing is going on (I've
> seen ap which I think is something like "accepted", re which I assume is
> "reply", di which might mean "dispatch" but there's also rc, dd, re, ce,
> la, etc. Then comes the WF name, some hex number, the expid, and a bunch of
> parameters. I'd like to know a bit more about this to dig in deeper to see
> where exactly the performance problems lie.

I've started putting a page together about that:

  http://ruote.rubyforge.org/noisy.html

Feedback is welcome.

Feliz Navidad,

--
John Mettraux - http://lambda.io/jmettraux
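
As a rough illustration of what a cap like 'wait_logger_max' means for memory,
here is a minimal sketch of a log that remembers only the last N messages.
The BoundedLog class and its methods are hypothetical names made up for this
example. This is not ruote's WaitLogger code, only the general technique of a
capped buffer, assuming each message is a plain Hash.

```ruby
# Hypothetical stand-in for a logger that keeps only the last `max` messages.
# Illustrative only; not ruote's actual WaitLogger implementation.
class BoundedLog
  attr_reader :entries

  def initialize(max)
    @max = max
    @entries = []
  end

  # Record a message, then drop the oldest entries beyond the cap.
  def <<(msg)
    @entries << msg
    @entries.shift while @entries.size > @max
    self
  end
end

# With a cap of 3, only the three most recent messages stay referenced.
log = BoundedLog.new(3)
(1..5).each { |i| log << { 'action' => 'dispatch', 'n' => i } }
p log.entries.map { |m| m['n'] }

# With a cap of 0, nothing is retained at all.
off = BoundedLog.new(0)
off << { 'action' => 'dispatch', 'n' => 1 }
p off.entries.size
```

In a buffer like this, every retained message keeps its whole payload
reachable, so the memory held grows with both the cap and the message size.
Lowering the cap only gives up the message history, which, per the reply
above, should matter only to callers of Ruote::Dashboard#wait_for.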
