Re: Lucene index integrity... or lack of :-(

2002-04-26 Thread Karl Øie
There are some strange problems with FSDirectory. I have found that building chunks in a RAMDirectory and then merging them into an FSDirectory is more stable than indexing directly into the FSDirectory. I ran into your problem, and the dreaded "too many open files" problem, when indexing large

Re: Lucene index integrity... or lack of :-(

2002-04-26 Thread petite_abeille
Using a RAMDir as a middleman solved my problems... Thanks. What is your heuristic for flushing the RAMDirectory? Also, how do you deal with System.exit() or application death? E.g., you are indexing something and the application dies or is killed. Thanks for any input. R.

Re: Lucene index integrity... or lack of :-(

2002-04-26 Thread petite_abeille
Thanks. What is your heuristic for flushing the RAMDirectory? please explain this because i don't understand english that well :-( That's OK, I don't really understand English either :-) Simply put, when do you flush the RAMDirectory into the FSDirectory? Every five documents? Ten? A thousand?

Re: Lucene index integrity... or lack of :-(

2002-04-26 Thread Karl Øie
Ah, now I see. I have a server with 512 MB of RAM, and I have used two different approaches, both of which work OK: 1 - I index a fixed number of documents into a RAMDir, say 10 (each is an XML document of about 1.5-2 MB), then I optimize the RAMDir and merge it into the FSDir and then
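
The fixed-count heuristic described above can be sketched roughly as follows. This is only an illustration in plain Java: BatchingIndexer and its two lists are hypothetical stand-ins for the RAMDir and the FSDir, not the Lucene API, and the batch size of 10 is just the figure mentioned in the message.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for RAMDir-to-FSDir batching; this is NOT the Lucene API. */
class BatchingIndexer {
    private final int batchSize;
    // Plays the role of the RAMDirectory: documents buffered in memory.
    private final List<String> ramBuffer = new ArrayList<>();
    // Plays the role of the FSDirectory: documents already merged to "disk".
    private final List<String> diskIndex = new ArrayList<>();
    private int flushCount = 0;

    BatchingIndexer(int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
    }

    /** Buffer one document; merge the buffer once it holds batchSize documents. */
    void add(String doc) {
        ramBuffer.add(doc);
        if (ramBuffer.size() >= batchSize) {
            flush();
        }
    }

    /** Merge everything buffered into the "disk" index and start a fresh buffer. */
    void flush() {
        if (ramBuffer.isEmpty()) {
            return;
        }
        diskIndex.addAll(ramBuffer);
        ramBuffer.clear();
        flushCount++;
    }

    int buffered() { return ramBuffer.size(); }
    int persisted() { return diskIndex.size(); }
    int flushCount() { return flushCount; }
}
```

With a batch size of 10, adding 25 documents triggers two merges and leaves 5 documents in memory until flush() is called once more at the end of the run. Anything still in the buffer is lost if the process is killed before that final flush, so a smaller batch size trades merge overhead for a smaller window of loss.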

Re: Lucene index integrity... or lack of :-(

2002-04-26 Thread Karl Øie
Forgot this: it's a bit hard to determine a good number to balance on while indexing XML documents, because the internal relations of a DOM can make an XML document nearly 21 times as big in memory as on disk (I am not lying, I have seen it myself)... Also, the RAMDir must be kept in

Re: Lucene index integrity... or lack of :-(

2002-04-26 Thread Otis Gospodnetic
Morning, I'm starting to wonder how bulletproof Lucene indexes are. Do they get corrupted easily? If so, is there a way to rebuild them? There is no tool for detecting index corruption, fixing indexes, or rebuilding indexes. The last one anyone can/has to do on their own. I'm starting to

Re: Lucene index integrity... or lack of :-(

2002-04-26 Thread petite_abeille
Hello again, There is no tool for detecting index corruption, fixing indexes, or rebuilding indexes. The last one anyone can/has to do on their own. :-( Well, that's *very* sad to say the least... How do I know if my indexes are not corrupted even if everything seems to be working fine? Don't

Re: Lucene index integrity... or lack of :-(

2002-04-26 Thread Otis Gospodnetic
Hello, There is no tool for detecting index corruption, fixing indexes, or rebuilding indexes. The last one anyone can/has to do on their own. :-( Well, that's *very* sad to say the least... How do I know if my indexes are not corrupted even if everything seems to be working fine?