[ https://issues.apache.org/jira/browse/LUCENE-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14342639#comment-14342639 ]
Robert Muir commented on LUCENE-6199:
-------------------------------------
{quote}
This gives a 2.4x reduction (137 MB to 56 MB) in heap usage in a
simple test that creates 100K indexed fields in a single-segment
index.
{quote}
Do you have this test? I am unhappy with several of the changes remaining in
this patch, e.g. doing more object creation at runtime, which hurts the typical
case, just to save a few bytes for abusers by avoiding a BytesRef per field.
The FIS changes in the subtasks (LUCENE-6317 and LUCENE-6318) are safe and
should save the most per-field memory with the default configuration (hundreds
of bytes per field).
IMO we should not introduce performance regressions for typical cases to save
an abuser 8 bytes, 8 megabytes, or even 8 gigabytes. But if we split out these
controversial things we can examine them individually and maybe some can be
improved without bad tradeoffs.
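
To make the tradeoff concrete, here is a minimal sketch of the two shapes being
argued about: holding a wrapper object per field versus holding only the byte[]
and wrapping it on each access. It is not taken from the patch. `Ref` is a
hypothetical stand-in for BytesRef, and `EagerFieldStats`, `LazyFieldStats` and
`PerFieldHeapSketch` are made-up names used only for illustration.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for BytesRef: a (bytes, offset, length) view over a byte[]. */
final class Ref {
    final byte[] bytes;
    final int offset;
    final int length;

    Ref(byte[] bytes, int offset, int length) {
        this.bytes = bytes;
        this.offset = offset;
        this.length = length;
    }
}

/** Variant A: one wrapper per stored term, allocated once and kept for the field's lifetime. */
final class EagerFieldStats {
    private final Ref minTerm;
    private final Ref maxTerm;

    EagerFieldStats(byte[] min, byte[] max) {
        this.minTerm = new Ref(min, 0, min.length);
        this.maxTerm = new Ref(max, 0, max.length);
    }

    // No allocation per call: the same wrapper instance is handed out every time.
    Ref getMin() { return minTerm; }
    Ref getMax() { return maxTerm; }
}

/** Variant B: only the raw byte[] is retained; a wrapper is allocated on every access. */
final class LazyFieldStats {
    private final byte[] minTerm;
    private final byte[] maxTerm;

    LazyFieldStats(byte[] min, byte[] max) {
        this.minTerm = min;
        this.maxTerm = max;
    }

    // Smaller retained heap per field, but each call creates a short-lived object.
    Ref getMin() { return new Ref(minTerm, 0, minTerm.length); }
    Ref getMax() { return new Ref(maxTerm, 0, maxTerm.length); }
}

class PerFieldHeapSketch {
    public static void main(String[] args) {
        byte[] min = "aardvark".getBytes(StandardCharsets.UTF_8);
        byte[] max = "zebra".getBytes(StandardCharsets.UTF_8);

        EagerFieldStats eager = new EagerFieldStats(min, max);
        LazyFieldStats lazy = new LazyFieldStats(min, max);

        // Identity comparison shows whether a call allocated a new wrapper.
        System.out.println("eager reuses wrapper: " + (eager.getMin() == eager.getMin()));
        System.out.println("lazy reuses wrapper:  " + (lazy.getMin() == lazy.getMin()));
    }
}
```

Variant B retains less per field, which only matters with very many fields;
variant A allocates nothing on access, which matters on every call. Which cost
dominates for real workloads is exactly what the requested test would need to
show.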
> Reduce per-field heap usage for indexed fields
> ----------------------------------------------
>
> Key: LUCENE-6199
> URL: https://issues.apache.org/jira/browse/LUCENE-6199
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Fix For: Trunk, 5.1
>
> Attachments: LUCENE-6199.patch, LUCENE-6199.patch
>
>
> Lucene uses a non-trivial baseline amount of heap for each indexed
> field. I know it's abusive for an app to create 100K indexed fields,
> but I still think we can and should make some effort to reduce
> heap usage per unique field.
> E.g. in block tree we store 3 BytesRefs per field, when 3 byte[]s
> would do...