[
https://issues.apache.org/jira/browse/LUCENE-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933911#action_12933911
]
Uwe Schindler edited comment on LUCENE-2771 at 11/19/10 1:49 PM:
-----------------------------------------------------------------
Here the relevant comments:
{quote}
*Robert Muir added a comment - 19/Nov/10 04:28 AM*
here is a hack patch for Uwe's idea about the norms.
we need to change SegmentMerger to not call norms on the top-level IR but
populate its normBuffer from the subs.
in my opinion it seems crazy we are currently creating these big arrays this
way (yeah there is the hairy code for re-open that re-uses the big merged cache
for the NRT case, but still).
Maybe i am missing something.
*Robert Muir added a comment - 19/Nov/10 04:28 AM*
here is a hack patch for Uwe's idea about the norms. we need to change
SegmentMerger to not call norms on the top-level IR but populate its normBuffer
from the subs. in my opinion it seems crazy we are currently creating these big
arrays this way (yeah there is the hairy code for re-open that re-uses the big
merged cache for the NRT case, but still). Maybe i am missing something.
*Robert Muir added a comment - 19/Nov/10 04:58 AM*
here's another hacky update: but still a few tests explicitly check these norms
and need to be fixed.
maybe we could add an uncached "MultiNorms" or something at least in src/test
for convenience,
just to fill the byte arrays so these tests can assertEquals
otherwise we are going to have to put a lot of SlowMultiReaderWrappers in these
tests.
*Robert Muir added a comment - 19/Nov/10 04:58 AM*
here's another hacky update: but still a few tests explicitly check these norms
and need to be fixed. maybe we could add an uncached "MultiNorms" or something
at least in src/test for convenience, just to fill the byte arrays so these
tests can assertEquals otherwise we are going to have to put a lot of
SlowMultiReaderWrappers in these tests.
*Robert Muir added a comment - 19/Nov/10 07:45 AM*
here is an updated patch, with core/contrib/solr tests passing.
For ParallelReader i forced it to require non-composite readers only (e.g.
SlowMultiReaderWrap them if thats not the case).
TODO:
- ParallelReader shouldnt need multifields etc anymore
- there are 5 @Ignore'd ParallelReader-related tests, because of things like
reopen/isOptimized/isCurrent
- merge in Uwe's improved SegmentsMerger
- clean up code.
*Robert Muir added a comment - 19/Nov/10 07:45 AM*
here is an updated patch, with core/contrib/solr tests passing. For
ParallelReader i forced it to require non-composite readers only (e.g.
SlowMultiReaderWrap them if thats not the case). TODO:
ParallelReader shouldnt need multifields etc anymore
there are 5 @Ignore'd ParallelReader-related tests, because of things like
reopen/isOptimized/isCurrent
merge in Uwe's improved SegmentsMerger
clean up code.
{quote}
was (Author: thetaphi):
Here the relevant comments:
{quote}
*Robert Muir added a comment - 19/Nov/10 04:28 AM*
here is a hack patch for Uwe's idea about the norms.
we need to change SegmentMerger to not call norms on the top-level IR but
populate its normBuffer from the subs.
in my opinion it seems crazy we are currently creating these big arrays this
way (yeah there is the hairy code for re-open that re-uses the big merged cache
for the NRT case, but still).
Maybe i am missing something.
*Robert Muir added a comment - 19/Nov/10 04:28 AM*
here is a hack patch for Uwe's idea about the norms. we need to change
SegmentMerger to not call norms on the top-level IR but populate its normBuffer
from the subs. in my opinion it seems crazy we are currently creating these big
arrays this way (yeah there is the hairy code for re-open that re-uses the big
merged cache for the NRT case, but still). Maybe i am missing something.
*Robert Muir added a comment - 19/Nov/10 04:58 AM*
here's another hacky update: but still a few tests explicitly check these norms
and need to be fixed.
maybe we could add an uncached "MultiNorms" or something at least in src/test
for convenience,
just to fill the byte arrays so these tests can assertEquals
otherwise we are going to have to put a lot of SlowMultiReaderWrappers in these
tests.
*Robert Muir added a comment - 19/Nov/10 04:58 AM*
here's another hacky update: but still a few tests explicitly check these norms
and need to be fixed. maybe we could add an uncached "MultiNorms" or something
at least in src/test for convenience, just to fill the byte arrays so these
tests can assertEquals otherwise we are going to have to put a lot of
SlowMultiReaderWrappers in these tests.
*Uwe Schindler added a comment - 19/Nov/10 07:17 AM*
Here a better patch for the segment merger. We should even apply this if we not
remove top-level norms!
It saves lots of memory during merging by using ReaderUtil to go down to
segment level! Don't wonder about BytesRef, but we need a reference here
because of the anonymous inner class.
*Uwe Schindler added a comment - 19/Nov/10 07:17 AM*
Here a better patch for the segment merger. We should even apply this if we not
remove top-level norms! It saves lots of memory during merging by using
ReaderUtil to go down to segment level! Don't wonder about BytesRef, but we
need a reference here because of the anonymous inner class.
*Robert Muir added a comment - 19/Nov/10 07:45 AM*
here is an updated patch, with core/contrib/solr tests passing.
For ParallelReader i forced it to require non-composite readers only (e.g.
SlowMultiReaderWrap them if thats not the case).
TODO:
- ParallelReader shouldnt need multifields etc anymore
- there are 5 @Ignore'd ParallelReader-related tests, because of things like
reopen/isOptimized/isCurrent
- merge in Uwe's improved SegmentsMerger
- clean up code.
*Robert Muir added a comment - 19/Nov/10 07:45 AM*
here is an updated patch, with core/contrib/solr tests passing. For
ParallelReader i forced it to require non-composite readers only (e.g.
SlowMultiReaderWrap them if thats not the case). TODO:
ParallelReader shouldnt need multifields etc anymore
there are 5 @Ignore'd ParallelReader-related tests, because of things like
reopen/isOptimized/isCurrent
merge in Uwe's improved SegmentsMerger
clean up code.
{quote}
> Remove norms() support from non-atomic IndexReaders
> ---------------------------------------------------
>
> Key: LUCENE-2771
> URL: https://issues.apache.org/jira/browse/LUCENE-2771
> Project: Lucene - Java
> Issue Type: Improvement
> Reporter: Uwe Schindler
> Fix For: 4.0
>
>
> Spin-off from LUCENE-2769:
> Currently all IndexReaders support norms(), but the core of Lucene never uses
> it and its even dangerous because of memory usage. We should do the same like
> with MultiFields and factor it out and throw UOE on non-atomic readers.
> The SlowMultiReaderWrapper can then manage the norms. Also ParallelReader
> needs to be fixed.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]