> I'm sorry, I should have been more specific. The file handle is only in
> the picture when FSInputStream is cloned. From what I can tell after a
> quick look, InputStream is responsible for buffering and it delegates to
> subclasses (via a call to readInternal) to refill the buffer from the
> underlying data store. When cloned, the InputStream clones the buffer
> (in the hope that the next read will still hit the buffered data I
> suppose), but after that it has its own seek position and its own
> buffer. In the case of FSInputStream, there is a Descriptor object that
> is shared between the clones. In the case of RAMInputStream - RAMFile is
> the shared object.

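
To make the cloning behaviour described above concrete, here is a minimal sketch in Java. It is not Lucene's code: SharedStore, BufferedStream and CloneSketchDemo are hypothetical names, and the in-memory byte array only stands in for the shared object (RAMFile or the FS Descriptor). The private refill() method plays the role that the quoted text assigns to readInternal.

```java
/** Shared backing store; hypothetical stand-in for RAMFile or the FS Descriptor. */
final class SharedStore {
    final byte[] data;
    int reads = 0; // how many times any stream had to refill from the store

    SharedStore(byte[] data) { this.data = data; }
}

/** Buffered stream; a simplified assumption, not Lucene's InputStream. */
final class BufferedStream implements Cloneable {
    private final SharedStore store; // shared between clones
    private byte[] buffer;           // private to each clone
    private long bufferStart = 0;    // offset in the store of buffer[0]
    private int bufferLength = 0;    // number of valid bytes in buffer
    private int bufferPosition = 0;  // next byte to hand out

    BufferedStream(SharedStore store, int bufferSize) {
        this.store = store;
        this.buffer = new byte[bufferSize];
    }

    byte readByte() {
        if (bufferPosition >= bufferLength) {
            refill();
        }
        return buffer[bufferPosition++];
    }

    void seek(long pos) {
        if (pos >= bufferStart && pos < bufferStart + bufferLength) {
            bufferPosition = (int) (pos - bufferStart); // still inside the buffer
        } else {
            bufferStart = pos;  // force a refill on the next read
            bufferLength = 0;
            bufferPosition = 0;
        }
    }

    /** Refills the buffer from the shared store. */
    private void refill() {
        long start = bufferStart + bufferPosition;
        int len = (int) Math.min(buffer.length, store.data.length - start);
        if (len <= 0) {
            throw new IllegalStateException("read past end of store");
        }
        System.arraycopy(store.data, (int) start, buffer, 0, len);
        store.reads++;
        bufferStart = start;
        bufferLength = len;
        bufferPosition = 0;
    }

    /** The clone copies the buffer and position but keeps the same store. */
    @Override
    public BufferedStream clone() {
        try {
            BufferedStream c = (BufferedStream) super.clone();
            c.buffer = buffer.clone();
            return c;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }
}

class CloneSketchDemo {
    public static void main(String[] args) {
        SharedStore store = new SharedStore(new byte[] {10, 20, 30, 40, 50, 60, 70, 80});
        BufferedStream a = new BufferedStream(store, 4);
        a.readByte();                   // first read fills a's buffer
        BufferedStream b = a.clone();   // b starts with a copy of that buffer
        b.seek(6);                      // moves only b's position
        System.out.println(b.readByte() + " " + a.readByte() + " " + store.reads);
    }
}
```

In this sketch the clone's first read is served from the copied buffer without touching the store, and a seek on the clone leaves the original's position alone. It also shows the cost raised in the reply below: for a RAM-backed store, every clone carries one more copy of data that is already in memory.
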
What is the reason for having a buffer in RAMInputStream? To have another
copy of the same data?

> Perhaps a factory pattern would be more flexible, but it looks like the
> existing code does a pretty good job for the RAM and FS cases. Would the
> factory pattern allow a better database implementation?

It might. If you use an embedded database like JDataStore, you should not
cache data internally, because the database already does this. The buffer
and the cache then simply introduce additional memory consumption.

> I don't know, I have not heard many complaints about that code recently.

OK, I will try it "as is" with JDataStore, and if it works, fine.

> There is activity in terms of creating a crawler / content handler
> framework. There is also a need to handle "update" better, I think. For
> example, I think it would be great to have deletes go through
> IndexWriter and get "cached" in the new segment, to be later applied to
> the prior segments during optimization. This would make deletes and adds
> transactional.

OK, I will have a look, but I have almost no experience with Lucene.

> Another thing on my wish / todo list is to reduce the number of OS files
> that must be open. Once you get a lot of indexes, with a number of
> stored fields, and keep re-indexing them, the number of open files grows
> rather quickly. And if Lucene is part of another program that already
> has other file IO needs, you end up quickly pushing into the max open
> files limit of the OS. The idea I have for this one is to implement a
> different kind of segment - one that is composed of a single file. Once
> a segment is created by IndexWriter, it never changes (besides the
> deletes), so it could easily be stored as a single file.

I will check this with JDataStore. Maybe we could borrow a couple of
ideas from them (like the built-in file system). This would simplify
life: one file for all indices, tx support?, backup, etc.

Thanks!
Roman Rokytskyy