While drooling over MappedBigByteBuffer, which we'll (hopefully) see in JDK7, I revisited my own Directory code and noticed a peculiarity it shares with the Lucene core classes: each and every IndexInput implementation implements only readByte() and readBytes(), and never tries to override the readInt/readVInt/readLong/etc. methods.
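As a rough illustration of that pattern, here is a minimal sketch. It is not Lucene source: SketchIndexInput and SingleArrayInput are made-up names, and the big-endian byte order is an assumption about how the base class composes its multi-byte reads. The first class builds readInt/readLong out of readByte() calls; the second shows what an array-backed override might look like.

```java
import java.io.IOException;

// Hypothetical stand-in for an IndexInput-style base class (not the real Lucene class):
// multi-byte reads are composed from virtual readByte() calls.
abstract class SketchIndexInput {
    public abstract byte readByte() throws IOException;

    // Four readByte() calls, assumed big-endian.
    public int readInt() throws IOException {
        return ((readByte() & 0xFF) << 24) | ((readByte() & 0xFF) << 16)
             | ((readByte() & 0xFF) << 8)  |  (readByte() & 0xFF);
    }

    // Two readInt() calls, i.e. ten calls in total below this one.
    public long readLong() throws IOException {
        return (((long) readInt()) << 32) | (readInt() & 0xFFFFFFFFL);
    }
}

// Hypothetical array-backed input that overrides the multi-byte reads
// so that each one touches the array directly, with no nested calls.
final class SingleArrayInput extends SketchIndexInput {
    private final byte[] buffer;
    private int position;

    SingleArrayInput(byte[] buffer) {
        this.buffer = buffer;
    }

    @Override
    public byte readByte() {
        // No explicit bounds check; an out-of-range read throws
        // ArrayIndexOutOfBoundsException.
        return buffer[position++];
    }

    @Override
    public int readInt() {
        final byte[] b = buffer;
        final int p = position;
        int v = ((b[p] & 0xFF) << 24) | ((b[p + 1] & 0xFF) << 16)
              | ((b[p + 2] & 0xFF) << 8) | (b[p + 3] & 0xFF);
        position = p + 4; // advanced only after all four bytes were read
        return v;
    }

    @Override
    public long readLong() {
        final byte[] b = buffer;
        final int p = position;
        long v = ((long) (b[p]     & 0xFF) << 56) | ((long) (b[p + 1] & 0xFF) << 48)
               | ((long) (b[p + 2] & 0xFF) << 40) | ((long) (b[p + 3] & 0xFF) << 32)
               | ((long) (b[p + 4] & 0xFF) << 24) | ((long) (b[p + 5] & 0xFF) << 16)
               | ((long) (b[p + 6] & 0xFF) << 8)  |  (long) (b[p + 7] & 0xFF);
        position = p + 8;
        return v;
    }
}
```

Whether this actually wins anything would have to be measured; the sketch only shows that the overrides can return the same values as the composed versions without going through readByte().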
Currently RAMDirectory uses a list of byte arrays as its backing store, and I got some speedup when I switched to a custom version that knows each file's size beforehand and can therefore allocate a single byte array of exactly the needed length (deliberately accepting a 2 GB file size limit). Nothing strange here: the readByte()/readBytes() methods are easily the most frequently called ones in a Lucene app, and they were greatly simplified. readByte became merely:

    public byte readByte() throws IOException {
        // I dropped bounds checking, relying on the natural
        // ArrayIndexOutOfBoundsException; we can't easily catch
        // and recover from it anyway.
        return buffer[position++];
    }

But now, readInt is four readByte calls, readLong is two readInts (ten calls in total), and readString is god knows how many. Unless you use a single type of Directory throughout the lifetime of your application, these readByte calls are never inlined, and the JIT's invokevirtual short-circuit optimization (which skips the method lookup when a given call site always resolves to the same method) cannot be applied either.

There are three cases where we can override the readNNN methods and provide implementations with zero or minimal method invocations: RAMDirectory, MMapDirectory, and BufferedIndexInput for FSDirectory/CompoundFileReader.

Has anybody tried this?

--
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785