I am not too concerned about the index bound checks. They are an additional
safety and I don´t think they matter in terms of efficiency. The only point
is that I forgot them in my patch and without them the tests fail. So if you
decide to apply my patch, the checks have to be added. If you give your ok,
I can do the commit too.

Christoph

Dmitry Serebrennikov schrieb:
I put those in mostly to assure myself that I got things right. I think the key question is whether it possible to read part of another file. If not, I think that's fine. If yes, I think that's a problem.

Dmitry.

Christoph Goller wrote:

Hi Dmitry,

Now I tried all test cases. They all work except for Russian analyser/stemmer
and occational fails of TestIndexReader (the timestamp problem). So I think
it should be ok as far as CoumpoundFile is concerned. Off course we still
have to find a good solution for the timestamp problem.


However,I stumbled over a problem that I had missed last time. TestCompoundFile
only succedds with your index bound tests in CSInputStream.seekInternal.
On Thursday I had deleted them after trying your test cases because the other
implementations don´t do these tests either. I did not go too deep into your
tests, but do you think the bahaviour of throwing an exception if the seek
index is out of bound is required? Its not part of the contract of the other
implementations of InputStream. Maybe I am missing something here.


Dmitry Serebrennikov schrieb:

Dear Christoph,

Sounds like an excellent enhancement. From a quick look, it appears that you are right and everything should work just fine but use less memory. One question: have you tried the other test cases also or just the TestCompoundFile. There are quite a few conditions that TestCompoundFile does not cover.

At first I thought that the synchronization around readBytes would cause too much performance degradation when a lot of concurrent queries were executing. But after I looked at it some more, I convinced myself that it should be ok. Have you ran any multi-threaded tests / benchmarks? I think it might also be a good idea before making this change.

Christoph, do you think it is possible to just call readInternal on the base stream instead of the readBytes? The main difference is that we would bypass the buffering in the base stream. I think the buffering done by the superclass of the CSInputStream would be quite enough (which is your point to begin with, right)? Perhaps it would be worthwhile to make InputStream.readInternal() public instead of protected?



In CSInputStream.readInternal I call:


synchronized (base) {
  base.seek(fileOffset + getFilePointer());
  base.readBytes(b, offset, len);
}

Calling base.seek does nothing more than setting the file pointer
(bufferStart + bufferPosition) of base correctly.

base.readBytes(b, offset, len) in this case does not use the buffer of
base (at least in most cases). Look into InputStream.readBytes.
If len >= BUFFER_SIZE the base buffer is skipped and the buffer b is
used directly.

I think synchronized in our case does not much more than synchronizing
on the actual file in FSInputStream.readInternal.

Christoph



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]





--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]



-- ************************************************************* * Dr. Christoph Goller Tel. : +49 89 203 45734 * * Managing Director Email: [EMAIL PROTECTED] * * Detego Software GmbH Mail : Keuslinstr. 13, * * 80798 München, Germany * *************************************************************


--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]



Reply via email to