Re: Lucene index integrity... or lack of :-(
There are some strange problems with FSDirectory. I have found that building chunks in a RAMDirectory and then merging them into an FSDirectory is more stable than indexing directly into the FSDirectory. I ran into your problem, and the dreaded "too many open files" problem, when indexing large documents with many fields. Using a RAMDir as a middle man solved my problems...

mvh karl øie

On Friday 26 April 2002 13:54, petite_abeille wrote:
> Hello,
>
> I'm starting to wonder how bulletproof Lucene indexes are. Do they get corrupted easily? If so, is there a way to rebuild them?
>
> I've started to get the following exception left and right...
>
> 04/25 18:34:39 (Warning) Indexer.indexObjectWithValues: java.io.IOException: _91.fnm already exists
>
> I built a little app (http://homepage.mac.com/zoe_info/) that uses Lucene quite extensively, and I would like to keep it that way. However, I'm starting to have second thoughts about Lucene's reliability... :-(
>
> I'm sure I'm doing something wrong somewhere, but I really cannot see what... Any help or insight greatly appreciated.
>
> Thanks.
>
> PA.

--
To unsubscribe, e-mail: mailto:[EMAIL PROTECTED]
For additional commands, e-mail: mailto:[EMAIL PROTECTED]
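The "RAMDir as a middle man" idea can be modeled without any Lucene dependency. What follows is only a rough sketch under stated assumptions: `StagedIndexer`, `Sink`, and the batch size are hypothetical names, not Lucene classes. The in-memory list stands in for the RAMDirectory, and the sink stands in for the merge into the FSDirectory.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of "stage in memory, then merge to disk in batches". */
public class StagedIndexer {

    /** Stand-in for the on-disk index; receives one merged batch at a time. */
    public interface Sink {
        void merge(List<String> batch);
    }

    private final List<String> staged = new ArrayList<>(); // plays the RAMDirectory role
    private final Sink sink;                                // plays the FSDirectory role
    private final int batchSize;

    public StagedIndexer(Sink sink, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.sink = sink;
        this.batchSize = batchSize;
    }

    /** Stage one document; merge the whole batch once it is full. */
    public void add(String doc) {
        staged.add(doc);
        if (staged.size() >= batchSize) {
            flush();
        }
    }

    /** Merge whatever is staged into the sink and start a fresh buffer. */
    public void flush() {
        if (staged.isEmpty()) {
            return;
        }
        sink.merge(new ArrayList<>(staged)); // copy so the sink owns its batch
        staged.clear();
    }

    /** Number of documents currently held only in memory. */
    public int pending() {
        return staged.size();
    }
}
```

In this model the disk side is touched once per batch rather than once per document. Anything still counted by `pending()` exists only in memory, so a final `flush()` is needed before the process exits.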
Re: Lucene index integrity... or lack of :-(
> using a RAMDir as a middle man solved my problems...

Thanks. What is your heuristic for flushing the RAMDirectory? Also, how do you deal with System.exit() or application death? E.g., you are indexing something and the application dies or is killed.

Thanks for any input.

R.
Re: Lucene index integrity... or lack of :-(
> > Thanks. What is your heuristic for flushing the RAMDirectory?
>
> please explain this because i don't understand english that good :-(

That's OK, I don't really understand English either :-)

Simply put, when do you flush the RAMDirectory into the FSDirectory? Every five documents? Ten? A thousand? What is a good balance between RAM and FS?

Thanks.

PA.
Re: Lucene index integrity... or lack of :-(
Ah, now I see. What I have is a server with 512 MB of RAM, and I have used two different approaches; both work OK:

1 - I index a fixed number of documents into a RAMDir, say 10 (each doc is an XML document of about 1.5-2 MB), then I optimize the RAMDir, merge it into the FSDir, and then optimize the FSDir...

2 - I use Runtime.freeMemory() and Runtime.totalMemory() to see whether I have used more than 80% of the available memory. If so, I optimize the RAMDir, merge it, and optimize the FSDir... If not, I just add more documents to the RAMDir.

As far as I have tested, I have never experienced a failure while merging a RAMDir into an FSDir, regardless of size, so it's my system's memory that is the problem.

mvh karl øie

On Friday 26 April 2002 15:33, petite_abeille wrote:
> > > Thanks. What is your heuristic for flushing the RAMDirectory?
> >
> > please explain this because i don't understand english that good :-(
>
> That's OK, I don't really understand English either :-)
>
> Simply put, when do you flush the RAMDirectory into the FSDirectory? Every five documents? Ten? A thousand? What is a good balance between RAM and FS?
>
> Thanks.
>
> PA.
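The memory-based check in approach 2 could look roughly like the sketch below. It is an illustration only: `MemoryFlushPolicy` and its method names are hypothetical, and the 80% figure is simply the threshold mentioned in the message. The only real API used is `Runtime.freeMemory()` and `Runtime.totalMemory()`.

```java
/** Hypothetical helper for the "flush when more than 80% of memory is used" heuristic. */
public class MemoryFlushPolicy {

    private final double threshold; // used-memory fraction above which a flush is due

    public MemoryFlushPolicy(double threshold) {
        if (threshold <= 0.0 || threshold > 1.0) {
            throw new IllegalArgumentException("threshold must be in (0, 1]");
        }
        this.threshold = threshold;
    }

    /** Fraction of the currently allocated heap that is in use. */
    public static double usedFraction(long freeBytes, long totalBytes) {
        if (totalBytes <= 0) {
            throw new IllegalArgumentException("totalBytes must be > 0");
        }
        return (double) (totalBytes - freeBytes) / totalBytes;
    }

    /** Pure decision function, so it can be checked with fixed numbers. */
    public boolean shouldFlush(long freeBytes, long totalBytes) {
        return usedFraction(freeBytes, totalBytes) > threshold;
    }

    /** Same decision, using the live JVM figures. */
    public boolean shouldFlushNow() {
        Runtime rt = Runtime.getRuntime();
        return shouldFlush(rt.freeMemory(), rt.totalMemory());
    }
}
```

An indexing loop would call `shouldFlushNow()` after each added document and merge into the FSDir when it returns true. One caveat: `totalMemory()` reports the heap the JVM has allocated so far, not the maximum it may grow to, so the check can trigger earlier than strictly necessary.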
Re: Lucene index integrity... or lack of :-(
Forgot this: it's a bit hard to determine a good balance when indexing XML documents, because the internal relations of a DOM can make an XML document nearly 21 times as big in memory as on disk (I am not lying, I have seen it myself)...

Also, the RAMDir must be kept in memory while indexing and merging, so checking the system's free memory is easier than trying to calculate memory usage.

mvh karl øie
Re: Lucene index integrity... or lack of :-(
Morning,

> I'm starting to wonder how bulletproof Lucene indexes are. Do they get corrupted easily? If so, is there a way to rebuild them?

There is no tool for detecting index corruption, fixing an index, or rebuilding one. The last of these is something everyone can, and has to, do on their own.

> I've started to get the following exception left and right...
>
> 04/25 18:34:39 (Warning) Indexer.indexObjectWithValues: java.io.IOException: _91.fnm already exists

I've seen people asking about this on the list, but I have never encountered this particular exception.

> I built a little app (http://homepage.mac.com/zoe_info/) that uses Lucene quite extensively, and I would like to keep it that way. However, I'm starting to have second thoughts about Lucene's reliability... :-(
>
> I'm sure I'm doing something wrong somewhere, but I really cannot see what...

Maybe it's not a Lucene issue then, although I've seen this mentioned so often that the documentation could probably be improved to prevent people from making the same mistakes that others have already made.

Otis
Re: Lucene index integrity... or lack of :-(
Hello again,

> There is no tool to detect index corruption, fixing of indexing, nor index rebuilding. The last one anyone can/has to do on their own.

:-(

Well, that's *very* sad, to say the least... How do I know that my indexes are not corrupted, even if everything seems to be working fine? Don't tell me I'm the first one to run into this kind of issue?!? How can I trust an index if there is *no* way of checking its integrity? And even if you happen to notice that something is fishy, there is no way to rebuild the index, short of re-indexing everything from scratch? That does not sound like a very healthy situation to me. "Fragile" would be a kind way of describing it...

> I've seen people asking about this on the list, but I never encountered this particular exception.

Lucky you...

> Maybe it's not a Lucene issue then, although I've seen this mentioned so often, which means that documentation could be improved to prevent people from making the same mistakes that others have already made.

Maybe, maybe not. And most likely I'm doing something odd. In any case, could you point me to the mistakes that others have already made? Or did I miss something obvious here?

Thanks.

PA
Re: Lucene index integrity... or lack of :-(
Hello,

> > There is no tool to detect index corruption, fixing of indexing, nor index rebuilding. The last one anyone can/has to do on their own.
>
> :-(
>
> Well, that's *very* sad, to say the least... How do I know that my indexes are not corrupted, even if everything seems to be working fine? Don't tell me I'm the first one to run into this kind of issue?!? How can I trust an index if there is *no* way of checking its integrity? And even if you happen to notice that something is fishy, there is no way to rebuild the index, short of re-indexing everything from scratch? That does not sound like a very healthy situation to me. "Fragile" would be a kind way of describing it...

Yes, that's all unfortunate. If you come up with anything, please share it. Or you can use the Lucene Sandbox and develop stuff there.

> > I've seen people asking about this on the list, but I never encountered this particular exception.
>
> Lucky you...

:)

> > Maybe it's not a Lucene issue then, although I've seen this mentioned so often, which means that documentation could be improved to prevent people from making the same mistakes that others have already made.
>
> Maybe, maybe not. And most likely I'm doing something odd. In any case, could you point me to the mistakes that others have already made? Or did I miss something obvious here?

Nah, the only thing I can suggest is to check the lists' archives; that is where the mistakes of others would be recorded.

Otis