I don't see this with trunk... I just tried TestIndexingPerformance
with 1M docs, and it seemed to work fine.
Memory use stabilized at 40MB.
Most memory use was for indexing (not analysis).
char[] topped out at 4.5MB

-Yonik
http://www.lucidimagination.com


On Tue, Oct 6, 2009 at 12:31 PM, Mark Miller <markrmil...@gmail.com> wrote:
> Yeah - I was wondering about that ... not sure how these guys are
> stacking up ...
>
> Yonik Seeley wrote:
>> TestIndexingPerformance?
>> What the heck... that's not even multi-threaded!
>>
>> -Yonik
>> http://www.lucidimagination.com
>>
>>
>>
>> On Tue, Oct 6, 2009 at 12:17 PM, Mark Miller <markrmil...@gmail.com> wrote:
>>
>>> Darnit - didn't finish that email. This is after running your old short
>>> doc perf test for 10,000 iterations. You see the same thing with 1000
>>> iterations, but it's much less pronounced; it gets worse with more iterations.
>>>
>>> Mark Miller wrote:
>>>
>>>> A little before and after. The before is from around May 5th-ish; the
>>>> after is trunk.
>>>>
>>>> http://myhardshadow.com/memanalysis/before.png
>>>> http://myhardshadow.com/memanalysis/after.png
>>>>
>>>> Mark Miller wrote:
>>>>
>>>>
>>>>> Took a peek at the checkout from around the time he says he's using.
>>>>>
>>>>> CharTokenizer appears to be holding onto much larger char[] arrays now
>>>>> than before. Same with snowball.Among - it used to be almost nothing; now
>>>>> it's large.
>>>>>
>>>>> The new TokenStream stuff appears to be clinging. Needs to find some
>>>>> inner peace.
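[Editor's note: the retention pattern Mark is describing — a reused token stream whose internal char[] buffer grows to fit the largest token ever seen and is never shrunk back — can be sketched as below. This is a hypothetical GrowOnlyBuffer class for illustration only, not actual Lucene CharTokenizer code; the point is just that one pathological token pins a large array for the lifetime of the reused object.]

```java
// Sketch of a grow-only term buffer (hypothetical, not Lucene source).
// The array only ever grows, so a single huge token keeps a large char[]
// alive for as long as the stream object is reused.
public class GrowOnlyBuffer {
    private char[] buf = new char[16];
    private int len;

    // Copy the token into the internal buffer, doubling capacity as needed.
    public void set(CharSequence token) {
        len = token.length();
        while (buf.length < len) {
            buf = new char[buf.length * 2]; // never shrinks afterwards
        }
        for (int i = 0; i < len; i++) {
            buf[i] = token.charAt(i);
        }
    }

    public int capacity() {
        return buf.length;
    }

    public static void main(String[] args) {
        GrowOnlyBuffer b = new GrowOnlyBuffer();
        b.set("x".repeat(1 << 20)); // one pathological 1M-char token
        b.set("short");             // capacity is still >= 1M chars
        System.out.println(b.capacity() >= (1 << 20));
    }
}
```

Whether a shrink-on-reuse policy is worth the extra allocations is a trade-off; the memory profiles linked above suggest the grow-only behavior is what changed between the two checkouts.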