I agree entirely: the problem is with IBM's JITC, not Lucene. I'm just not sure how likely IBM is to fix this JDK bug, or how quickly.
I've searched for details on IBM 1.3 JIT bugs, and there seem to be a few. The usual 'fix' seems to be to disable JIT for the class/method giving problems (there don't seem to be many cases of 'this will be fixed soon').

I've spent some time debugging this and narrowed the problem down to a JIT inlining problem with:

  org.apache.lucene.store.OutputStream#writeInt(int)

This method looks like:

  public final void writeInt(int i) throws IOException {
    writeByte((byte)(i >> 24));
    writeByte((byte)(i >> 16));
    writeByte((byte)(i >>  8));
    writeByte((byte) i);
  }

And I've come up with two workarounds for this problem under an IBM 1.3.1 JVM:

1) disable JIT compilation for this method by setting this environment variable:

  JITC_COMPILEOPT=SKIP{org/apache/lucene/store/OutputStream}{writeInt}

2) change the method to explicitly inline the method calls:

  public final void writeInt(int i) throws IOException {
    if (bufferPosition >= BUFFER_SIZE) flush();
    buffer[bufferPosition++] = (byte)(i >> 24);
    if (bufferPosition >= BUFFER_SIZE) flush();
    buffer[bufferPosition++] = (byte)(i >> 16);
    if (bufferPosition >= BUFFER_SIZE) flush();
    buffer[bufferPosition++] = (byte)(i >>  8);
    if (bufferPosition >= BUFFER_SIZE) flush();
    buffer[bufferPosition++] = (byte) i;
  }

I know that's ugly, and I know that this class didn't change when the .tis code changes were made.

I'll try and open a PMR for this.

regards,

jamie

--- Robert Engels <[EMAIL PROTECTED]> wrote:
> Would the proper course not be to get IBM to produce a patch for
> their JDK to fix the JIT bug - I have to assume there are other
> products where the JDK bug will surface...
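On getting IBM to look at it: a standalone check might be worth attaching to a PMR, so here is a minimal sketch of one. It is not Lucene's code. The class name WriteIntCheck, the BUFFER_SIZE of 1024, the in-memory sink and the test values are all assumptions made up for illustration; only the two writeInt bodies follow the delegating and hand-inlined versions shown above. Whether this small class actually triggers the inlining problem on an IBM 1.3.1 JVM is untested.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical, simplified stand-in for a buffered output stream.
// NOT Lucene's OutputStream; names and sizes here are assumptions.
class WriteIntCheck {
    static final int BUFFER_SIZE = 1024; // assumed value, for illustration only

    private final byte[] buffer = new byte[BUFFER_SIZE];
    private int bufferPosition = 0;
    private final ByteArrayOutputStream sink = new ByteArrayOutputStream();

    // Copy the buffered bytes to the in-memory sink and reset the buffer.
    void flush() {
        sink.write(buffer, 0, bufferPosition);
        bufferPosition = 0;
    }

    void writeByte(byte b) {
        if (bufferPosition >= BUFFER_SIZE) flush();
        buffer[bufferPosition++] = b;
    }

    // Delegating form: four writeByte calls, high byte first.
    void writeIntDelegating(int i) {
        writeByte((byte) (i >> 24));
        writeByte((byte) (i >> 16));
        writeByte((byte) (i >> 8));
        writeByte((byte) i);
    }

    // Hand-inlined form, as in workaround 2.
    void writeIntInlined(int i) {
        if (bufferPosition >= BUFFER_SIZE) flush();
        buffer[bufferPosition++] = (byte) (i >> 24);
        if (bufferPosition >= BUFFER_SIZE) flush();
        buffer[bufferPosition++] = (byte) (i >> 16);
        if (bufferPosition >= BUFFER_SIZE) flush();
        buffer[bufferPosition++] = (byte) (i >> 8);
        if (bufferPosition >= BUFFER_SIZE) flush();
        buffer[bufferPosition++] = (byte) i;
    }

    // Write `count` ints with one of the two variants and return all bytes.
    static byte[] encode(boolean inlined, int count) {
        WriteIntCheck out = new WriteIntCheck();
        for (int i = 0; i < count; i++) {
            int value = i * 7919; // arbitrary test values; overflow is fine
            if (inlined) {
                out.writeIntInlined(value);
            } else {
                out.writeIntDelegating(value);
            }
        }
        out.flush();
        return out.sink.toByteArray();
    }

    public static void main(String[] args) {
        // Enough iterations to cross the buffer boundary many times
        // and to give a JIT a chance to compile the hot methods.
        byte[] a = encode(false, 100000);
        byte[] b = encode(true, 100000);
        System.out.println("identical output: " + Arrays.equals(a, b));
    }
}
```

Both variants should produce the same bytes on a correctly behaving JVM, so a mismatch with the JIT enabled that disappears with the JIT disabled would point at the compiler rather than at the calling code.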
>
> -----Original Message-----
> From: Jamie M [mailto:[EMAIL PROTECTED]
> Sent: Sunday, March 07, 2004 5:39 PM
> To: Lucene Developers List
> Subject: Re: New FieldSortedHitQueue uses Java 1.4 feature
>
>
> Tim,
>
> I think there are two issues with 1.4 which affect java 1.3
> compatibility:
> - this regexp stuff
> - the recent .tis file related changes
>   (thread:
>   http://www.mail-archive.com/[EMAIL PROTECTED]/msg04383.html,
>   bug:
>   http://issues.apache.org/bugzilla/show_bug.cgi?id=27408)
>
> As you suggest, the regexp stuff can be worked around, making the
> code usable with a 1.3 jvm. However, the recent .tis related changes
> cause problems when used with IBM's 1.3.1 JVM (with jit enabled,
> under windows.. I've not tested other OSes) which means you can't
> build indexes (except tiny ones) using such a jvm.
>
> This doesn't appear to be Lucene's fault - it seems to be an IBM jdk
> bug, but annoyingly IBM still uses 1.3.1 jvms in all its biggest
> java products (WebSphere Application Developer 4 & 5, WebSphere
> Application Server 4 & 5 (and WebSphere Portal etc which are built
> on top)) - v5.x of these products is the latest version, and they
> have this problem with the .tis related changes. IBM has a big chunk
> of the java market, so unfortunately IBM 1.3.1 jvms are
> (surprisingly) widely used, certainly in larger commercial
> environments. And I'm not sure how quickly IBM would release a fix
> for its jvm.
>
> So, if lucene 1.4 was released as 'java 1.3 compatible' then that
> will probably result in ibm-product-using people reporting bugs
> relating to building indexes. Maybe lucene 1.4 should no longer
> officially support java 1.3. Or, a workaround in the new .tis
> related code could be sought to preserve (ibm) java 1.3 support.
> Either way, I think this issue may be more of a deciding factor in
> whether lucene 1.4 should support java 1.3 or not.
>
> My personal preference is that a workaround is found for the ibm
> 1.3.1 bug, but I've not found one yet.
>
> regards,
>
> jamie
>
> --- Tim Jones <[EMAIL PROTECTED]> wrote:
> > Yes, generating the exception is more expensive than applying the
> > regular expression. However, the code isn't run that often
> > (relatively) and the results are cached, so it shouldn't make a
> > significant difference.
> >
> > Just to verify the numbers, I put together a small test. I ran
> > 100000 iterations of integer, float and string values through the
> > two methods:
> >
> > +-----------------------------------+
> > | milliseconds to execute iteration |
> > +-----------------------------------+
> > | value type | regex    | exception |
> > +------------|----------|-----------+
> > | integer    | 0.0484   | 0.0094    |
> > | float      | 0.1641   | 0.311     |
> > | string     | 0.1125   | 0.4796    |
> > +------------+----------+-----------+
> >
> > The timing values were calculated by taking the total time and
> > dividing by the number of iterations. The only case where using
> > the exception method is better is if the values are integers (in
> > which case, no exception is thrown since integer is the first
> > test).
> >
> > Profiling the program confirms the results (java
> > -Xrunhprof:cpu=samples):
> >
> > CPU SAMPLES BEGIN (total = 299) Fri Mar 05 08:18:27 2004
> > rank   self  accum   count trace method
> >    1 40.80% 40.80%     122    37 java.lang.Throwable.fillInStackTrace
> >    2  8.03% 48.83%      24    50 java.lang.StringBuffer.<init>
> >    3  7.02% 55.85%      21    51 java.lang.StringBuffer.expandCapacity
> >    4  3.01% 58.86%       9    49 java.lang.StringBuffer.expandCapacity
> >    5  2.68% 61.54%       8    41 java.lang.NumberFormatException.forInputString
> >    6  2.34% 63.88%       7    31 java.util.regex.Pattern.matcher
> > ...
> >
> > But, if this is the only code depending on java 1.4, it seems like
> > it would be better to remove it for better version compatibility.
> > Perhaps what would be best would be to have the code detect which
> > version it's running under and act appropriately.
> >
> > Tim
> >
> >
> > > From: Mario Ivankovits [mailto:[EMAIL PROTECTED]
> > >
> > > I don't know where and how often this piece of code gets called
> > > and how often a wrong value will be passed, but to throw an
> > > exception might be more expensive.
> > > Think of the stacktrace which needs to be filled.
> > >
> > > Maybe the type of the term should then be cached (per field).
> > >
> > >
> > > Mario
> > >

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]