We have a "whole bunch" (~75) of tests written in JRuby that test portions of 
our J2EE app which runs on JBoss and WebLogic (simultaneously).  Most of 
these take the form of tests against our REST API.

We get intermittent failures out of these.  Nearly always, the failure seems 
to have to do with XML parsing (REXML).  REXML failures tend to come in one 
of several different ways.  Two of the most common are an error saying you 
can only have one root node, or that some XPath is not present (even though 
it is).  I've seen a few other REXML problems, but those are the two I really 
remember.
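
To tell whether the XML handed to REXML was actually malformed or whether 
the parser misbehaved on good input, it would help to capture the raw text 
whenever a parse fails.  Below is a minimal sketch of that idea; 
parse_with_capture and the sample documents are hypothetical and not taken 
from our test suite.

```ruby
require 'rexml/document'

# Hypothetical helper (not from the real suite): parse a response body and,
# on any failure, record the error together with a copy of the raw text so
# the two can be compared afterwards.
def parse_with_capture(body, failures)
  REXML::Document.new(body)
rescue StandardError => e
  failures << { :error => e.class.name, :message => e.message, :body => body.dup }
  nil
end

failures = []

# Well-formed input: a single root element.
good = parse_with_capture("<items><item id='1'/></items>", failures)

# Genuinely malformed input: two root elements, which is one way to get a
# "second root element" style error legitimately.
bad = parse_with_capture("<items/><items/>", failures)

failures.each do |f|
  puts "#{f[:error]}: #{f[:message]}"
  puts "raw body was: #{f[:body].inspect}"
end
```

If the captured body turns out to be well-formed when a failure is logged, 
that would point at the parser or the interpreter rather than at the 
response data.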

The frequency of failure seems to vary by machine.  The machines that have the 
most problems seem to be our fastest (2.7 GHz) and hyper-threaded.  
On my 2 GHz machine I see 1-3 failures with almost every test run.  On one 
guy's 1.7 GHz laptop, he rarely sees them.

I generally have to run our whole suite to see the failures, but the guys 
with the fastest machines can usually get one with just a single test case.

I am *VERY* confident this is not a problem in our app, but instead a problem 
in JRuby.  I've reached my tolerance for it and would like to find it.  It is 
in 0.8.2, 0.8.3 and 0.9.0 (and may be more frequent in 0.9.0).

Running the suite on my system takes over 10 minutes, and it fails in many 
different locations in different ways (and REXML is not simple to debug in 
Ruby, let alone in Java).  So, the question is, what *might* be happening 
here to cause this?  Is there a specific area of JRuby that I might want to 
look at?
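
Since the full suite takes so long, a standalone loop that parses one 
known-good document over and over might be a quicker way to provoke the 
problem.  The following is only a sketch: stress_parse, the sample XML and 
the iteration count are made up for illustration, and I have no evidence yet 
that it reproduces the failures we see.

```ruby
require 'rexml/document'

# Parse the same known-good document `iterations` times and collect every
# iteration that either raises or fails to find an XPath that should exist.
def stress_parse(xml, xpath, iterations)
  problems = []
  iterations.times do |i|
    begin
      doc = REXML::Document.new(xml)
      node = REXML::XPath.first(doc, xpath)
      problems << [i, "XPath #{xpath} not found"] if node.nil?
    rescue StandardError => e
      problems << [i, "#{e.class}: #{e.message}"]
    end
  end
  problems
end

# Hypothetical sample document; a real response from our API would be better.
sample_xml = "<response><status code='200'/><item id='1'>a</item></response>"

problems = stress_parse(sample_xml, "/response/item", 500)
puts "#{problems.size} unexpected results out of 500 parses"
problems.each { |i, msg| puts "iteration #{i}: #{msg}" }
```

On a correctly working interpreter this should report no unexpected 
results, so any non-zero count on the fast machines would be a much smaller 
test case than the whole suite.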

For some time I've thought the hyper-threading indicated a synchronization 
problem.  However, we do not have multiple JRuby threads going (I checked by 
debugging with a breakpoint on the Thread#start call).  We are acting as a 
client to WebLogic and JBoss, and WebLogic does start up 10-15 threads, but 
none of that code *should* be affecting JRuby.
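
As a cross-check on the breakpoint approach, the Ruby-level thread list can 
also be dumped from inside a test at the moment a failure is detected.  This 
is a rough sketch; ruby_thread_report is a hypothetical helper, and I am 
assuming that Thread.list only reflects threads the Ruby runtime knows 
about, so threads started in Java by the WebLogic client would presumably 
not appear in it.

```ruby
# Hypothetical helper: summarise the threads visible to the Ruby runtime.
# Assumption: threads created purely on the Java side are not listed here.
def ruby_thread_report
  Thread.list.map do |t|
    { :id => t.object_id, :alive => t.alive?, :main => (t == Thread.main) }
  end
end

report = ruby_thread_report
puts "#{report.size} Ruby-level thread(s) visible"
report.each do |r|
  puts "thread #{r[:id]} alive=#{r[:alive]} main=#{r[:main]}"
end
```

If this ever shows more than the main thread at failure time, the 
single-threaded assumption would need another look.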

But given the different frequencies we're seeing on non-hyper-threaded 
systems, I wonder if it is not somehow related to "pure speed".  We have one 
place where we put in a sleep, and that seems to reduce the frequency of 
failure in that specific location.  I'm not sure how speed could matter 
here, but it seems to be a common problem.

Please, share your thoughts with me. Ask all the questions you like.

David
