I finally found time to look into your compound file implementation and to try it out. It works and I like it. However, I wondered, why you are using clones of the base input stream in CompoundFileReader.CSInputStream. The "problem" (actually it is not a real problem) I see is that you are double buffering your input. Note that the clone has a buffer of 1024 bytes and that CompoundFileReader.CSInputStream, which extends InputStream, also has such a buffer. I think in the way you implemented it your input gets copied into the base buffer and from there into the CompoundFileReader.CSInputStream buffer before it is actually used. Please have a look at my patch. At first one might think that my implementation is simpler but less efficient. This is not the case. Actually it is even a little bit more efficient. Points to think about:
*) stream is private to CompoundFileReader, so nobody else can use it, if not explicitly granted access. CompoundFileReader uses stream only in its constructor and when calling openFile.
*) Therefore, CompoundFileReader.CSInputStream can share the same instance of stream if they take care of synchronization and seek.
*) An InputStream not necessarily has to implement seekInternal if readInternal takes care of seeking on the real stream (see FSInputStream).
*) Synchronization of CompoundFileReader.CSInputStream.readInternal on base does no harm since it is called only when the real file has to be accessed and thi is synchronized anyway. However, this synchronizytion is necessary!
I tested my patch with your TestCompoundFile and everything seems fine. I also tried it on one of my indices and searched on this index. It also works. My impression from my tests is that my patch is a little bit faster than your original version but there seems to be not much difference in efficiency.
Christoph
Index: CompoundFileReader.java =================================================================== RCS file: /home/cvs/jakarta-lucene/src/java/org/apache/lucene/index/CompoundFileReader.java,v retrieving revision 1.2 diff -u -r1.2 CompoundFileReader.java --- CompoundFileReader.java 13 Oct 2003 14:18:04 -0000 1.2 +++ CompoundFileReader.java 16 Oct 2003 18:58:02 -0000 @@ -240,10 +240,9 @@ final long length) throws IOException { - this.base = (InputStream) base.clone(); + this.base = base; this.fileOffset = fileOffset; this.length = length; // variable in the superclass - seekInternal(0); // position to the adjusted 0th byte } /** Expert: implements buffer refill. Reads bytes from the current @@ -255,7 +254,10 @@ protected void readInternal(byte[] b, int offset, int len) throws IOException { - base.readBytes(b, offset, len); + synchronized (base) { + base.seek(fileOffset + getFilePointer()); + base.readBytes(b, offset, len); + } } /** Expert: implements seek. Sets current position in this file, where @@ -263,35 +265,11 @@ * @see #readInternal(byte[],int,int) */ protected void seekInternal(long pos) throws IOException - { - if (pos > 0 && pos >= length) - throw new IOException("Seek past the end of file"); - - if (pos < 0) - throw new IOException("Seek to a negative offset"); - - base.seek(fileOffset + pos); - } + {} /** Closes the stream to futher operations. */ public void close() throws IOException - { - base.close(); - } + {} - /** Returns a clone of this stream. - * - * <p>Clones of a stream access the same data, and are positioned at the same - * point as the stream they were cloned from. - * - * <p>Expert: Subclasses must ensure that clones may be positioned at - * different points in the input from each other and from the stream they - * were cloned from. - */ - public Object clone() { - CSInputStream other = (CSInputStream) super.clone(); - other.base = (InputStream) base.clone(); - return other; - } } }
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]