[
https://issues.apache.org/jira/browse/LUCENE-7826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008210#comment-16008210
]
Michael McCandless commented on LUCENE-7826:
--------------------------------------------
You could do this in your own custom codec, but I don't think we should expose
this behavior in Lucene's default codec. Typically the RAM usage for .tip is
manageable.
Can you describe the 10s of billions of documents that you are indexing? You
need to use sharding for that (a single Lucene index can hold at most ~2.1
billion documents). What is the total unique term count?
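To make the sharding arithmetic concrete, here is a minimal sketch of the lower bound on shard count implied by the per-index limit. The class name, method name, and the 20 billion figure are hypothetical; the constant mirrors {{IndexWriter.MAX_DOCS}} ({{Integer.MAX_VALUE - 128}}). In practice you would likely want far more shards than this minimum, since heap and search latency usually bite long before the hard limit.

```java
final class ShardEstimate {
    // Hard per-index document limit; mirrors Lucene's IndexWriter.MAX_DOCS.
    static final long MAX_DOCS_PER_INDEX = Integer.MAX_VALUE - 128L;

    /** Smallest number of shards such that no shard exceeds the per-index limit. */
    static long minShards(long totalDocs) {
        if (totalDocs < 0) {
            throw new IllegalArgumentException("totalDocs must be >= 0");
        }
        // Ceiling division: round up so a partially filled shard still counts.
        return (totalDocs + MAX_DOCS_PER_INDEX - 1) / MAX_DOCS_PER_INDEX;
    }

    public static void main(String[] args) {
        long totalDocs = 20_000_000_000L; // hypothetical corpus size
        System.out.println(minShards(totalDocs) + " shards minimum");
    }
}
```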
Another thing you could do is increase the block size of the on-disk terms (see
the {{minTermBlockSize}} and {{maxTermBlockSize}} params to
{{Lucene50PostingsFormat}}). These are easy knobs to turn to decrease the heap
needed by .tip, but they increase term lookup cost (since more scanning will be
needed).
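A minimal sketch of turning those knobs, assuming a Lucene 6.x release where {{Lucene62Codec}} is the default codec (substitute the default codec class of your version). The class name {{LargeBlockCodecExample}} and the block sizes 100 and 200 are illustrative assumptions, not recommendations; the block-tree writer requires the max to be at least 2 * (min - 1). It needs lucene-core on the classpath.

```java
import org.apache.lucene.codecs.Codec;
import org.apache.lucene.codecs.PostingsFormat;
import org.apache.lucene.codecs.lucene50.Lucene50PostingsFormat;
import org.apache.lucene.codecs.lucene62.Lucene62Codec;
import org.apache.lucene.index.IndexWriterConfig;

final class LargeBlockCodecExample {
    // Illustrative values only; larger blocks mean fewer .tip entries on heap
    // but more scanning inside each block at lookup time.
    static final int MIN_TERM_BLOCK_SIZE = 100;
    static final int MAX_TERM_BLOCK_SIZE = 200;

    /** Default codec, except every field uses the larger term block sizes. */
    static Codec largeBlockCodec() {
        final PostingsFormat postings =
            new Lucene50PostingsFormat(MIN_TERM_BLOCK_SIZE, MAX_TERM_BLOCK_SIZE);
        return new Lucene62Codec() {
            @Override
            public PostingsFormat getPostingsFormatForField(String field) {
                return postings;
            }
        };
    }

    /** Applies the codec to an existing writer config and returns it. */
    static IndexWriterConfig configure(IndexWriterConfig iwc) {
        return iwc.setCodec(largeBlockCodec());
    }
}
```

The block sizes only affect newly written segments, so existing segments keep their old term blocks until they are merged or the index is rebuilt.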
> Support to unload FST's .tip into memory,make load or unload configuable!.
> --------------------------------------------------------------------------
>
> Key: LUCENE-7826
> URL: https://issues.apache.org/jira/browse/LUCENE-7826
> Project: Lucene - Core
> Issue Type: New Feature
> Components: core/FSTs, core/store
> Reporter: Xibao
> Priority: Minor
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> In real-world use, we index many documents with Lucene, but some machines do
> not have much memory. Once the document count reaches tens of billions, Lucene
> cannot start because there is not enough memory. Most of the memory cost is the
> FST's .tip content. So I want to contribute my change to Lucene core to make
> loading the FST's .tip into memory configurable.
> What do you think?