Ian, the point where the OOM exception is thrown varies; it isn't fixed. It can happen anywhere once memory usage passes a certain point. I have allocated 1 GB of memory to the JVM. I haven't used a profiler. When I said it fails after 70K docs, I meant approximately 70K documents, but if I reduce the memory it will OOM before 70K, so it isn't specific to any particular document. To add each document I first search and then do an update, so I am wondering whether Lucene loads all the indices into memory for the search and that is why it is going OOM? I am not sure how the search operation works in Lucene.
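In case it helps, this is roughly what happens per word (a simplified sketch of what I described, not my exact code -- the field names "word" and "context" are the ones from my first mail, the class and method names are just for illustration, I'm assuming the Lucene 3.0 API, and I've left out when/whether the searcher gets reopened to see recent updates):

  import java.io.File;
  import java.io.IOException;
  import org.apache.lucene.analysis.standard.StandardAnalyzer;
  import org.apache.lucene.document.Document;
  import org.apache.lucene.document.Field;
  import org.apache.lucene.index.IndexWriter;
  import org.apache.lucene.index.Term;
  import org.apache.lucene.search.IndexSearcher;
  import org.apache.lucene.search.TermQuery;
  import org.apache.lucene.search.TopDocs;
  import org.apache.lucene.store.Directory;
  import org.apache.lucene.store.FSDirectory;
  import org.apache.lucene.util.Version;

  public class ContextIndexer {
      private final IndexWriter writer;
      private final IndexSearcher searcher;

      public ContextIndexer(File indexDir) throws IOException {
          Directory dir = FSDirectory.open(indexDir);
          writer = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_30),
                                   IndexWriter.MaxFieldLength.UNLIMITED);
          // read-only searcher; it only sees the index as of the time it was opened
          searcher = new IndexSearcher(dir, true);
      }

      // Search for the word; if found, append to its stored context, then update.
      public void addContext(String word, String newContext) throws IOException {
          TopDocs hits = searcher.search(new TermQuery(new Term("word", word)), 1);
          String context = newContext;
          if (hits.totalHits > 0) {
              Document existing = searcher.doc(hits.scoreDocs[0].doc);
              context = existing.get("context") + " " + newContext;
          }
          Document doc = new Document();
          doc.add(new Field("word", word, Field.Store.YES, Field.Index.NOT_ANALYZED));
          doc.add(new Field("context", context, Field.Store.YES, Field.Index.ANALYZED));
          // updateDocument = delete-by-term + add, so each word keeps one document
          writer.updateDocument(new Term("word", word), doc);
      }
  }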
Thanks
Ajay

Ian Lea wrote:
>
> Where exactly are you hitting the OOM exception? Have you got a stack
> trace? How much memory are you allocating to the JVM? Have you run a
> profiler to find out what is using the memory?
>
> If it runs OK for 70K docs and then fails, two possibilities come to
> mind: either the 70K + 1 doc is particularly large, or you or Lucene
> (unlikely) are holding on to something that you shouldn't be.
>
> --
> Ian.
>
> On Tue, Mar 2, 2010 at 1:48 PM, ajay_gupta <ajay...@gmail.com> wrote:
>>
>> Hi Erick,
>> I tried setting setRAMBufferSizeMB to 200-500 MB as well, but it still
>> hits the OOM error. I thought that with file-based indexing memory
>> shouldn't be an issue, but you might be right that searching is using a
>> lot of memory. Is there a way to load documents in chunks, or some
>> other way to make this scalable?
>>
>> Thanks in advance
>> Ajay
>>
>> Erick Erickson wrote:
>>>
>>> I'm not following this entirely, but these docs may be huge by the
>>> time you add context for every word in them. You say that you
>>> "search the existing indices then I get the content and append....".
>>> So is it possible that after 70K documents your additions become
>>> so huge that you're blowing up? Have you taken any measurements
>>> to determine how big the docs get as you index more and more
>>> of them?
>>>
>>> If the above is off base, have you tried setting
>>> IndexWriter.setRAMBufferSizeMB?
>>>
>>> HTH
>>> Erick
>>>
>>> On Tue, Mar 2, 2010 at 8:27 AM, ajay_gupta <ajay...@gmail.com> wrote:
>>>>
>>>> Hi,
>>>> It might be a general question, but I couldn't find the answer yet.
>>>> I have around 90K documents totalling around 350 MB. Each document
>>>> contains a record with some text content. For each word in this text
>>>> I want to store and index that word's context, so I read each
>>>> document and, for each word in it, append a fixed number of
>>>> surrounding words. To do that, I first search the existing indices
>>>> for the word; if it already exists, I get the content, append the
>>>> new context, and update the document. If no context exists yet, I
>>>> create a document with the fields "word" and "context" and add those
>>>> two fields with the word value and context value.
>>>>
>>>> I tried this in RAM, but after a certain number of docs it gave an
>>>> out-of-memory error, so I switched to the FSDirectory method.
>>>> Surprisingly, after 70K documents it also gave an OOM error. I have
>>>> enough disk space, yet I still get this error, and I am not sure why
>>>> disk-based indexing would give it. I expected disk-based indexing to
>>>> be slow but at least scalable. Could someone suggest what the issue
>>>> might be?
>>>>
>>>> Thanks
>>>> Ajay
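PS - for completeness, this is how I am setting the buffer Erick suggested (again just a sketch, not my exact code; 256 is one of the values I tried between 200 and 500 MB, and the index path is made up):

  // writer setup with an explicit RAM buffer; buffered docs flush to disk
  // once the buffer fills up
  IndexWriter writer = new IndexWriter(
      FSDirectory.open(new File("/path/to/index")),
      new StandardAnalyzer(Version.LUCENE_30),
      IndexWriter.MaxFieldLength.UNLIMITED);
  writer.setRAMBufferSizeMB(256.0);
  // flush by RAM usage only, not by document count
  writer.setMaxBufferedDocs(IndexWriter.DISABLE_AUTO_FLUSH);

The JVM is started with -Xmx1024m, which is the 1 GB I mentioned above, so this buffer plus whatever the search side holds all has to fit in that heap.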