[
https://issues.apache.org/jira/browse/LUCENE-8321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16480519#comment-16480519
]
Robert Muir commented on LUCENE-8321:
-------------------------------------
Also, I think the IW accounting needs to stay. Considering we can reasonably
merge segments of ~1B docs, I think it makes sense to up the limit to 16B
or so, but any higher gets into trappy territory. I strongly feel it can't be
"unlimited" as long as a single segment is limited.
But I'm concerned about whether this small increase is worth the complexity
cost, both for users and for the code: it certainly won't make things any
simpler. Also, I can see people complaining about what seems like an
"arbitrary" limit in the code, even though it's no more arbitrary than 2B. But
we could try it out and see what it looks like?
> Allow composite readers to have more than 2B documents
> ------------------------------------------------------
>
> Key: LUCENE-8321
> URL: https://issues.apache.org/jira/browse/LUCENE-8321
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Minor
>
> I would like to start discussing removing the limit of ~2B documents that we
> have for indices, while still enforcing it at the segment level for practical
> reasons.
> Postings, stored fields, and all other codec APIs would keep working on
> integers to represent doc ids. Only top-level doc ids and numbers of
> documents would need to move to a long. I say "only" because we now mostly
> consume indices per-segment, but there are still a number of places where we
> identify documents by their top-level doc ID like {{IndexReader#document}},
> top-docs collectors, etc.
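
To make the proposal in the description concrete, here is a minimal sketch of how a long top-level doc ID could be resolved to a segment plus an int per-segment doc ID. The class `GlobalDocIds` and its methods are hypothetical names invented for illustration; they are not existing Lucene APIs, and this is not necessarily how the change would be implemented.

```java
/**
 * Hypothetical helper (not a Lucene API): maps long top-level doc IDs onto
 * (segment index, int segment-local doc ID) pairs.
 */
final class GlobalDocIds {
    // docBases[i] is the top-level ID of the first document of segment i;
    // the final entry holds the total number of documents.
    private final long[] docBases;

    GlobalDocIds(int[] segmentMaxDocs) {
        docBases = new long[segmentMaxDocs.length + 1];
        for (int i = 0; i < segmentMaxDocs.length; i++) {
            if (segmentMaxDocs[i] < 0) {
                throw new IllegalArgumentException("negative segment size");
            }
            // Each segment stays within the int range; only the running sum is a long.
            docBases[i + 1] = docBases[i] + segmentMaxDocs[i];
        }
    }

    /** Total document count across all segments; may exceed Integer.MAX_VALUE. */
    long totalDocs() {
        return docBases[docBases.length - 1];
    }

    /** Index of the segment that contains the given top-level doc ID. */
    int segmentOf(long globalDoc) {
        if (globalDoc < 0 || globalDoc >= totalDocs()) {
            throw new IndexOutOfBoundsException("doc ID out of range: " + globalDoc);
        }
        // Binary search for the last segment whose base is <= globalDoc.
        int lo = 0;
        int hi = docBases.length - 2;
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1;
            if (docBases[mid] <= globalDoc) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo;
    }

    /** Segment-local doc ID; always fits in an int because segments are int-bounded. */
    int localDoc(long globalDoc) {
        return (int) (globalDoc - docBases[segmentOf(globalDoc)]);
    }
}
```

With two segments of 1.5B documents each, `totalDocs()` is 3B, which no longer fits in an int, while every value returned by `localDoc` still does. That is the split the description argues for: codec-level APIs keep int doc IDs and only the top-level bookkeeping widens to long.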