[jira] [Commented] (LUCENE-6777) Switch GeoPointTermsEnum range list to use a reusable BytesRef

Michael McCandless (JIRA) Thu, 03 Sep 2015 10:00:08 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-6777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14729394#comment-14729394
 ]


Michael McCandless commented on LUCENE-6777:
--------------------------------------------

Thanks [~nknize]!  I don't think we have to perf test before committing ... but 
can we try to simplify the code a bit?  E.g. things like:

{noformat}
-    if (term.compareTo(rangeBounds.get(0).cell) < 0) {
+    if (term.compareTo(this.nextSubRange = 
rangeBounds.get(0).asBytesRef(this.nextSubRange)) < 0) {
{noformat}

break that inner assignment out into its own line before?

Another example, instead of:
{noformat}
    final int result = Short.compare(this.shift, other.shift);
    return (result == 0) ? Long.compare(this.start, other.start) : result;
{noformat}

can we do e.g.:

{noformat}
    final int result = Short.compare(this.shift, other.shift);
    if (result == 0) {
      return result;
    }
    return Long.compare(this.start, other.start);
{noformat}

Instead of cloning the somewhat hairy
{{NumericUtils.longToPrefixCodedBytes}}, is it possible to e.g. reuse
a shared {{BytesRefBuilder}}?

Instead of the lazy alloc of reusable can we change the {{asBytesRef}} to
{{fillBytesRef}}, and it takes the non-null {{BytesRef}} in, and
returns void?  It's confusing to both take a parameter and return a result that
are sort of doing the same thing ...


> Switch GeoPointTermsEnum range list to use a reusable BytesRef 
> ---------------------------------------------------------------
>
>                 Key: LUCENE-6777
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6777
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Nicholas Knize
>         Attachments: LUCENE-6777.patch
>
>
> GeoPointTermsEnum currently constructs a BytesRef for every computed range, 
> then sorts on this BytesRef.  This adds an unnecessary memory overhead since 
> the TermsEnum only requires BytesRef on calls to nextSeekTerm and accept and 
> the ranges only need to be sorted by their long representation. This issue 
> adds the following two improvements:
> 1. Lazily compute the BytesRef on demand only when its needed
> 2. Add a single, transient BytesRef to GeoPointTermsEnum
> This will further cut back on heap usage when constructing ranges across 
> every segment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (LUCENE-6777) Switch GeoPointTermsEnum range list to use a reusable BytesRef

Reply via email to