[ 
https://issues.apache.org/jira/browse/LUCENE-4515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488643#comment-13488643
 ] 

Michael McCandless commented on LUCENE-4515:
--------------------------------------------

bq. the comment is on start but it says "end". I think, given the fact that we 
know the freq, we can read the slice without storing the end, but we'd need to 
change SliceReader for it, and I am not sure that is worth the trouble we 
could get into. Yet, 4 bytes per term though.

Ahh I see, right!  It's not needed.  You do need the "end" per term as you 
build up the slices, but once done you can rely entirely on freq.
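
A minimal sketch of that idea, under simplifying assumptions: it is not the 
MemoryIndex or SliceReader code. The class FreqBoundedSlices, its fixed-size 
chunks and all field names are hypothetical. Each term's values live in 
chained chunks; the per-term "end" cursor is needed to append, but after 
freeze() it is dropped and reads are bounded by freq alone.

```java
import java.util.Arrays;

/** Hypothetical, simplified slice store; names and layout are assumptions. */
final class FreqBoundedSlices {
    private static final int CHUNK = 4;   // 3 payload slots + 1 link slot
    private int[] pool = new int[CHUNK * 4];
    private int used = 0;                 // next free chunk offset
    private final int[] start;            // per term: first slot, -1 if none
    private final int[] freq;             // per term: number of values written
    private int[] end;                    // per term: write cursor, build-time only

    FreqBoundedSlices(int numTerms) {
        start = new int[numTerms];
        freq = new int[numTerms];
        end = new int[numTerms];
        Arrays.fill(start, -1);
    }

    private int newChunk() {
        if (used + CHUNK > pool.length) {
            pool = Arrays.copyOf(pool, pool.length * 2);
        }
        int offset = used;
        used += CHUNK;
        return offset;
    }

    /** Appending needs the per-term end cursor, since chunks are interleaved. */
    void add(int term, int value) {
        if (end == null) {
            throw new IllegalStateException("already frozen");
        }
        if (start[term] == -1) {
            start[term] = end[term] = newChunk();
        } else if (end[term] % CHUNK == CHUNK - 1) {
            int next = newChunk();
            pool[end[term]] = next;       // link slot points at the next chunk
            end[term] = next;
        }
        pool[end[term]++] = value;
        freq[term]++;
    }

    /** Once building is done, the end cursors can be dropped. */
    void freeze() {
        end = null;
    }

    /** Reading stops after freq[term] values; no end offset is consulted. */
    int[] read(int term) {
        int[] out = new int[freq[term]];
        int pos = start[term];
        for (int i = 0; i < out.length; i++) {
            if (pos % CHUNK == CHUNK - 1) {
                pos = pool[pos];          // follow the link to the next chunk
            }
            out[i] = pool[pos++];
        }
        return out;
    }
}
```

In this toy layout, dropping the cursor array after freeze() saves one int 
(4 bytes) per term.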

bq. we really rely on this in ByteBlockPool already, so it likely doesn't 
work at this time, but we don't run into it since we don't reuse in DWPT? I will 
add a test.

Hmm if we never reuse in DWPT then we don't need to clear...

bq. I think reuse is a special use case and I guess we should allow it. Yet, I 
totally agree this is risky. I suggest making this possible if you subclass, 
and exposing this stuff via a protected API, so if you really, really want to 
do this you can if you subclass?

I think if we remove reset(), and then have a protected ctor that can pass in 
the allocator ... maybe that's OK?  Still makes me nervous ... we should mark 
that ctor experimental ...
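
Roughly the shape I have in mind, as a sketch only. None of these names come 
from the patch: SketchIndex, BlockAllocator, FreshAllocator, 
RecyclingAllocator and ReusingIndex are all hypothetical stand-ins. The public 
ctor always allocates fresh blocks, and reuse is only reachable by subclassing 
and going through the protected ctor.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

/** Hypothetical allocator abstraction; illustrative only. */
interface BlockAllocator {
    int[] allocate();
}

/** Default: always hand out a fresh block, so nothing ever needs clearing. */
final class FreshAllocator implements BlockAllocator {
    public int[] allocate() {
        return new int[8];
    }
}

/** Recycling allocator: must zero reused blocks to avoid stale data. */
final class RecyclingAllocator implements BlockAllocator {
    private final Deque<int[]> free = new ArrayDeque<>();

    public int[] allocate() {
        int[] block = free.poll();
        if (block == null) {
            return new int[8];
        }
        Arrays.fill(block, 0);
        return block;
    }

    void recycle(int[] block) {
        free.push(block);
    }
}

/** Stand-in for the index class under discussion (an assumption). */
class SketchIndex {
    private final BlockAllocator allocator;

    public SketchIndex() {
        this(new FreshAllocator());
    }

    /** Expert and experimental: only subclasses can supply an allocator. */
    protected SketchIndex(BlockAllocator allocator) {
        this.allocator = allocator;
    }

    int[] newBlock() {
        return allocator.allocate();
    }
}

/** Reuse is opt-in: the subclass owns the risk that comes with recycling. */
final class ReusingIndex extends SketchIndex {
    private final RecyclingAllocator recycler;

    ReusingIndex() {
        this(new RecyclingAllocator());
    }

    private ReusingIndex(RecyclingAllocator recycler) {
        super(recycler);
        this.recycler = recycler;
    }

    void release(int[] block) {
        recycler.recycle(block);
    }
}
```

With no reset() on the base class, the default path never sees a recycled 
block, so the clearing question only arises for whoever subclasses.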
                
> Make MemoryIndex more memory efficient
> --------------------------------------
>
>                 Key: LUCENE-4515
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4515
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/other
>    Affects Versions: 4.0, 4.1, 5.0
>            Reporter: Simon Willnauer
>             Fix For: 4.1, 5.0
>
>         Attachments: LUCENE-4515.patch, LUCENE-4515.patch
>
>
> Currently MemoryIndex uses BytesRef objects to represent terms and holds an 
> int[] per term per field to represent postings. For highlighting this creates 
> a ton of objects for each search that 1. need to be GCed and 2. can't be 
> reused.
