So could that potentially explain our use of more RAM on indexing? Or is this a rare edge case?

--
Jeff Newburn
Software Engineer, Zappos.com
jnewb...@zappos.com - 702-943-7562
> From: Mark Miller <markrmil...@gmail.com>
> Reply-To: <solr-user@lucene.apache.org>
> Date: Tue, 06 Oct 2009 15:30:50 -0400
> To: <solr-user@lucene.apache.org>
> Subject: Re: Solr Trunk Heap Space Issues
>
> This is looking like it's just a Lucene oddity you get when adding a
> single doc, due to some changes with the NRT stuff.
>
> Mark Miller wrote:
>> Okay - I'm sorry - serves me right for working sick.
>>
>> Now that I have put on my glasses and correctly tagged my two eclipse tests:
>>
>> It still appears that trunk likes to use more RAM.
>>
>> I switched both tests to one million iterations and watched the heap.
>>
>> The test from the build around May 5th (I promise :) ) regularly GC's
>> down to about 70-80MB after a fair time of running. It doesn't appear
>> to climb - it keeps GC'ing back to 70-80MB (after starting out by
>> GC'ing down to 40MB for a bit).
>>
>> The test from trunk, after a fair time of running, keeps GC'ing down
>> to about 120-150MB - 150 at the end, slowly working its way up from
>> 90-110 at the beginning.
>>
>> Don't know what that means yet - but it appears trunk likes to use a
>> bit more RAM while indexing. Odd that it's so much more, because these
>> docs are tiny:
>>
>> String[] fields = {"text", "simple"
>>     , "text", "test"
>>     , "text", "how now brown cow"
>>     , "text", "what's that?"
>>     , "text", "radical!"
>>     , "text", "what's all this about, anyway?"
>>     , "text", "just how fast is this text indexing?"
>> };
>>
>> Mark Miller wrote:
>>> Okay, I juggled the tests in eclipse and flipped the results, so they
>>> make sense.
>>>
>>> Sorry - goose chase on this one.
>>>
>>> Yonik Seeley wrote:
>>>> I don't see this with trunk... I just tried TestIndexingPerformance
>>>> with 1M docs, and it seemed to work fine.
>>>> Memory use stabilized at 40MB.
>>>> Most memory use was for indexing (not analysis).
>>>> char[] topped out at 4.5MB
>>>>
>>>> -Yonik
>>>> http://www.lucidimagination.com
>>>>
>>>> On Tue, Oct 6, 2009 at 12:31 PM, Mark Miller <markrmil...@gmail.com> wrote:
>>>>> Yeah - I was wondering about that ... not sure how these guys are
>>>>> stacking up ...
>>>>>
>>>>> Yonik Seeley wrote:
>>>>>> TestIndexingPerformance?
>>>>>> What the heck... that's not even multi-threaded!
>>>>>>
>>>>>> -Yonik
>>>>>> http://www.lucidimagination.com
>>>>>>
>>>>>> On Tue, Oct 6, 2009 at 12:17 PM, Mark Miller <markrmil...@gmail.com> wrote:
>>>>>>> Darnit - didn't finish that email. This is after running your old
>>>>>>> short-doc perf test for 10,000 iterations. You see the same thing
>>>>>>> with 1,000 iterations, but much less pronounced - i.e. it's
>>>>>>> gettin' worse with more iterations.
>>>>>>>
>>>>>>> Mark Miller wrote:
>>>>>>>> A little before and after. The before is from around May 5th-ish;
>>>>>>>> the after is trunk.
>>>>>>>>
>>>>>>>> http://myhardshadow.com/memanalysis/before.png
>>>>>>>> http://myhardshadow.com/memanalysis/after.png
>>>>>>>>
>>>>>>>> Mark Miller wrote:
>>>>>>>>> Took a peek at the checkout from around the time he says he's
>>>>>>>>> using.
>>>>>>>>>
>>>>>>>>> CharTokenizer appears to be holding onto much larger char[]
>>>>>>>>> arrays now than before. Same with snowball.Among - it used to be
>>>>>>>>> almost nothing, now it's largio.
>>>>>>>>>
>>>>>>>>> The new TokenStream stuff appears to be clinging. Needs to find
>>>>>>>>> some inner peace.
>
> --
> - Mark
>
> http://www.lucidimagination.com
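
[Editor's note] The retention pattern being discussed - a reused stream "holding onto" large char[] arrays - can be sketched with a toy example. This is a hypothetical illustration of the general mechanism, not Lucene's actual CharTokenizer code: a reusable component grows its internal buffer to fit the largest input it has ever seen and never shrinks it, so one big document inflates the heap retained by every cached/reused stream for its whole lifetime.

```java
// Hypothetical sketch: a reusable buffer that grows to its high-water
// mark and never shrinks, so large inputs keep heap retained across reuse.
class ReusableBuffer {
    private char[] buf = new char[16];
    private int len;

    void set(String text) {
        if (text.length() > buf.length) {
            // Grow to the next power of two that fits - but never shrink,
            // so the old capacity is retained after small inputs return.
            buf = new char[Integer.highestOneBit(text.length() - 1) << 1];
        }
        text.getChars(0, text.length(), buf, 0);
        len = text.length();
    }

    int capacity() {
        return buf.length;
    }
}

public class BufferRetentionDemo {
    public static void main(String[] args) {
        ReusableBuffer b = new ReusableBuffer();
        b.set("small doc");                    // tiny input: capacity stays 16
        int before = b.capacity();
        b.set(new String(new char[1 << 20])); // one 1M-char input forces growth
        b.set("small again");                 // buffer stays at 1M chars
        System.out.println(before + " " + b.capacity()); // prints: 16 1048576
    }
}
```

If many such streams are cached per thread or per field, each one pinned at its high-water mark, aggregate heap use climbs even though every individual document is tiny - consistent with the symptom described above, though whether this is the actual cause in trunk is what the thread is trying to establish.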