[ https://issues.apache.org/jira/browse/LUCENE-4515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487830#comment-13487830 ]
Michael McCandless commented on LUCENE-4515:
--------------------------------------------

Cool!: you used the same slice idea that we use to hold postings in RAM in shared byte[]s, but with int[]s instead. This should be a huge reduction on GC load for MemoryIndex.

I agree that DocFieldProcessor.docBoost is unused...

synchronizedAllocator looks unused? I guess you added that after removing all sync from RecyclingByteBlockAllocator ... but I think we can just add synchronizedAllocator back later if/when we need it? Separately can you call out that RecyclingByteBlockAllocator is not thread safe in its javadocs?

{quote}
int[] start; // nocommit maybe we can safe the end array and just check freq - need to change the SliceReader for this
{quote}

I think you need the start ... because if you used more than one slice you won't know how to read "backwards" to get to the starting slice?

{quote}
intBlockPool = new IntBlockPool(); // nocommit expose allocator and impl a recycling one
{quote}

If we do that we have to make sure that allocator clears each int[] before returning it, in getIntBlock().

The added MemoryIndex.reset method is sort of ... spooky? Like, do we really need/want to reuse a MemoryIndex? (I guess this is because we added passing in an allocator to the ctor ... so you want the byte[]'s returned to it ... but that also makes me nervous: should we really pass in an external allocator...?).

> Make MemoryIndex more memory efficient
> --------------------------------------
>
>                 Key: LUCENE-4515
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4515
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/other
>    Affects Versions: 4.0, 4.1, 5.0
>            Reporter: Simon Willnauer
>             Fix For: 4.1, 5.0
>
>         Attachments: LUCENE-4515.patch
>
>
> Currently MemoryIndex uses BytesRef objects to represent terms and holds an
> int[] per term per field to represent postings. For highlighting this creates
> a ton of objects for each search that 1.
> need to be GCed and 2. can't be reused.
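
To make the getIntBlock() concern from the comment concrete, here is a minimal sketch of a recycling int[] allocator that zeroes every recycled block before handing it out again. Everything in it is an assumption for illustration: the class name RecyclingIntAllocatorSketch, the recycle() and freeBlocks() methods, and the tiny BLOCK_SIZE are hypothetical and are not taken from the patch or from Lucene's API. Like RecyclingByteBlockAllocator as described above, it is not thread safe.

```java
import java.util.ArrayDeque;
import java.util.Arrays;

/** Hypothetical sketch, not Lucene's allocator. Not thread safe. */
class RecyclingIntAllocatorSketch {
    // Assumed block size, kept tiny so the example is easy to follow.
    static final int BLOCK_SIZE = 8;

    // Blocks that were handed back and may be reused.
    private final ArrayDeque<int[]> free = new ArrayDeque<>();

    /** Returns a zeroed block, reusing a recycled one when available. */
    int[] getIntBlock() {
        int[] block = free.pollFirst();
        if (block == null) {
            // Freshly allocated Java arrays are already all zeros.
            return new int[BLOCK_SIZE];
        }
        // Clear stale data from the previous user before returning the block.
        Arrays.fill(block, 0);
        return block;
    }

    /** Hands a block back for later reuse; blocks of the wrong size are dropped. */
    void recycle(int[] block) {
        if (block.length == BLOCK_SIZE) {
            free.addLast(block);
        }
    }

    /** Number of blocks currently waiting for reuse. */
    int freeBlocks() {
        return free.size();
    }
}
```

The sketch clears in getIntBlock() rather than in recycle(), matching where the comment says the clearing has to happen. The likely reason clearing matters at all is that a slice-based pool treats unused entries as zero, so a recycled block with leftover values could be misread; that reasoning is an inference from the comment, not something the patch states.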