So could that potentially explain our use of more RAM during indexing? Or
is this a rare edge case?
-- 
Jeff Newburn
Software Engineer, Zappos.com
jnewb...@zappos.com - 702-943-7562


> From: Mark Miller <markrmil...@gmail.com>
> Reply-To: <solr-user@lucene.apache.org>
> Date: Tue, 06 Oct 2009 15:30:50 -0400
> To: <solr-user@lucene.apache.org>
> Subject: Re: Solr Trunk Heap Space Issues
> 
> This is looking like it's just a Lucene oddity you get when adding a
> single doc, due to some changes with the NRT stuff.
> 
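For context, the "NRT stuff" above is presumably the near-real-time search
work on 2.9-era trunk, where IndexWriter.getReader() hands back a reader
that sees just-added docs without a commit. A minimal sketch of that call
pattern - the RAMDirectory, analyzer, and field name are placeholder
choices, not anything the oddity is known to depend on:

    import org.apache.lucene.analysis.WhitespaceAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.RAMDirectory;

    public class SingleDocNrt {
        public static void main(String[] args) throws Exception {
            IndexWriter writer = new IndexWriter(new RAMDirectory(),
                    new WhitespaceAnalyzer(), IndexWriter.MaxFieldLength.UNLIMITED);
            Document doc = new Document();
            doc.add(new Field("text", "simple", Field.Store.NO, Field.Index.ANALYZED));
            writer.addDocument(doc);
            // NRT: a reader opened straight off the writer sees the
            // just-added doc without any commit having happened
            IndexReader nrtReader = writer.getReader();
            System.out.println("numDocs=" + nrtReader.numDocs());
            nrtReader.close();
            writer.close();
        }
    }
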
> Mark Miller wrote:
>> Okay - I'm sorry - serves me right for working sick.
>> 
>> Now that I have put on my glasses and correctly tagged my two Eclipse tests:
>> 
>> It still appears that trunk likes to use more RAM.
>> 
>> I switched both tests to one million iterations and watched the heap.
>> 
>> The test from the build from around May 5th (I promise :) ) regularly
>> GC's down to about 70-80MB after a fair time of running. It doesn't
>> appear to climb - it keeps GC'ing back to 70-80MB (after starting out by
>> GC'ing down to 40MB for a bit).
>> 
>> The test from trunk, after a fair time of running, keeps GC'ing down to
>> about 120-150MB - 150MB at the end, slowly working its way up from
>> 90-110MB at the beginning.
>> 
>> Don't know what that means yet - but it appears trunk likes to use a bit
>> more RAM while indexing. Odd that it's so much more, because these docs
>> are tiny:
>> 
>>     String[] fields = {"text","simple"
>>             ,"text","test"
>>             ,"text","how now brown cow"
>>             ,"text","what's that?"
>>             ,"text","radical!"
>>             ,"text","what's all this about, anyway?"
>>             ,"text","just how fast is this text indexing?"
>>     };
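For anyone wanting to reproduce this kind of run: a rough sketch of the
shape of the test described above - a tight loop adding those tiny docs,
with a crude heap readout every so often. This is not the actual
TestIndexingPerformance code; the 2.9-era API calls, the on-disk index
path, and the interval constants are my assumptions:

    import java.io.File;

    import org.apache.lucene.analysis.WhitespaceAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.FSDirectory;

    public class TinyDocHeapWatch {
        public static void main(String[] args) throws Exception {
            String[] fields = {"text", "simple", "text", "test",
                    "text", "how now brown cow", "text", "what's that?"};
            // FSDirectory, so the index itself doesn't inflate the heap numbers
            IndexWriter writer = new IndexWriter(
                    FSDirectory.open(new File("/tmp/heapwatch")),
                    new WhitespaceAnalyzer(), IndexWriter.MaxFieldLength.UNLIMITED);
            Runtime rt = Runtime.getRuntime();
            for (int i = 0; i < 1000000; i++) {
                Document doc = new Document();
                for (int f = 0; f < fields.length; f += 2) {
                    // the array holds alternating field name/value pairs
                    doc.add(new Field(fields[f], fields[f + 1],
                            Field.Store.NO, Field.Index.ANALYZED));
                }
                writer.addDocument(doc);
                if (i % 100000 == 0) {
                    System.gc(); // crude; a profiler's heap graph is more honest
                    long usedMb = (rt.totalMemory() - rt.freeMemory()) >> 20;
                    System.out.println(i + " docs, ~" + usedMb + "MB after GC");
                }
            }
            writer.close();
        }
    }

If trunk really does settle 50-70MB higher than the May build on the same
loop, the difference should show up in readouts like these as well.
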
>> 
>> Mark Miller wrote:
>>> Okay, I had juggled the tests in Eclipse and flipped the results, so
>>> they make sense now.
>>> 
>>> Sorry - goose chase on this one.
>>> 
>>> Yonik Seeley wrote:
>>>> I don't see this with trunk... I just tried TestIndexingPerformance
>>>> with 1M docs, and it seemed to work fine.
>>>> Memory use stabilized at 40MB.
>>>> Most memory use was for indexing (not analysis).
>>>> char[] topped out at 4.5MB
>>>> 
>>>> -Yonik
>>>> http://www.lucidimagination.com
>>>> 
>>>> 
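(To get per-class numbers like that char[] figure, one approach is a
live-objects heap dump pulled mid-run and opened in a profiler. A sketch
using the standard HotSpot diagnostic MXBean - the dump path here is made
up:)

    import java.lang.management.ManagementFactory;

    import com.sun.management.HotSpotDiagnosticMXBean;

    public class DumpHeap {
        public static void main(String[] args) throws Exception {
            HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                    ManagementFactory.getPlatformMBeanServer(),
                    "com.sun.management:type=HotSpotDiagnostic",
                    HotSpotDiagnosticMXBean.class);
            // live=true dumps only reachable objects, so the per-class
            // histogram shows what GC cannot reclaim (e.g. retained char[])
            bean.dumpHeap("/tmp/indexing.hprof", true);
        }
    }
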
>>>> On Tue, Oct 6, 2009 at 12:31 PM, Mark Miller <markrmil...@gmail.com> wrote:
>>>>> Yeah - I was wondering about that ... not sure how these guys are
>>>>> stacking up ...
>>>>> 
>>>>> Yonik Seeley wrote:
>>>>>> TestIndexingPerformance?
>>>>>> What the heck... that's not even multi-threaded!
>>>>>> 
>>>>>> -Yonik
>>>>>> http://www.lucidimagination.com
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Tue, Oct 6, 2009 at 12:17 PM, Mark Miller <markrmil...@gmail.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> Darnit - didn't finish that email. This is after running your old
>>>>>>> short-doc perf test for 10,000 iterations. You see the same thing
>>>>>>> with 1,000 iterations, but it's much less pronounced - i.e., it gets
>>>>>>> worse with more iterations.
>>>>>>> 
>>>>>>> Mark Miller wrote:
>>>>>>> 
>>>>>>>> A little before and after. The "before" is from around May 5th-ish -
>>>>>>>> the "after" is trunk.
>>>>>>>> 
>>>>>>>> http://myhardshadow.com/memanalysis/before.png
>>>>>>>> http://myhardshadow.com/memanalysis/after.png
>>>>>>>> 
>>>>>>>> Mark Miller wrote:
>>>>>>>>> Took a peek at the checkout from around the time he says he's using.
>>>>>>>>> 
>>>>>>>>> CharTokenizer appears to be holding onto much larger char[] arrays now
>>>>>>>>> than before. Same with snowball.Among - it used to be almost nothing,
>>>>>>>>> now it's large.
>>>>>>>>> 
>>>>>>>>> The new TokenStream stuff appears to be clinging. Needs to find some
>>>>>>>>> inner peace.
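The pattern being alluded to - reused token streams hanging onto their
largest-ever term buffers - looks roughly like the sketch below. This is an
illustration of the retention mechanism, not Lucene's actual CharTokenizer
or snowball code; all names here are made up:

    // A reused tokenizer whose term buffer grows but never shrinks; cached
    // per thread (as reusable token streams are), the grown buffer stays
    // reachable for the analyzer's lifetime.
    public class BufferRetention {
        static class ReusedTokenizer {
            private char[] termBuffer = new char[16];

            void copyTerm(String term) {
                if (term.length() > termBuffer.length) {
                    // grow-only: doubled, never shrunk back after a long token
                    termBuffer = new char[Math.max(term.length(), termBuffer.length * 2)];
                }
                term.getChars(0, term.length(), termBuffer, 0);
            }
        }

        // one cached instance per thread, alive as long as the thread is
        static final ThreadLocal<ReusedTokenizer> CACHED =
                ThreadLocal.withInitial(ReusedTokenizer::new);

        public static void main(String[] args) {
            ReusedTokenizer t = CACHED.get();
            t.copyTerm("short");
            t.copyTerm(new String(new char[1 << 20])); // one huge token...
            t.copyTerm("short"); // ...and the ~2MB buffer is still retained
        }
    }
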
>> 
> 
> 
> -- 
> - Mark
> 
> http://www.lucidimagination.com
> 
> 
> 
