> I have no idea whether you can successfully recover anything from that
> index now that it has broken the hard limit.

Theoretically, I think it's possible with some very surgical edits.
However, I've tried to do this in the past and abandoned it. The code to
split the index needs to be able to open the index first, so we had no way
to demonstrate correctness, and at that point restoring from a backup was
the best option.

Maybe somebody smarter or more determined will have better luck.

Mike

On Tue, Aug 8, 2017 at 10:21 AM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 8/7/2017 9:41 AM, Wael Kader wrote:
> > I faced an issue that is making me go crazy.
> > I am running Solr, storing data on HDFS, on a single-node setup with
> > an index that had been running fine until today.
> > I know that 2 billion documents is too much for a single node, but it had
> > been running fine for my requirements and it was pretty fast.
> >
> > I restarted Solr today and I am getting an error stating "Too many
> > documents, composite IndexReaders cannot exceed 2147483519."
> > The last backup I have is two weeks old, and I really need the index to
> > start so I can get my data back.
>
> You have run into what I think might be the only *hard* limit in the
> entire Lucene ecosystem.  Other limits can usually be broken with
> careful programming, but that one is set in stone.
>
> A Lucene index uses a 32-bit Java integer to track the internal document
> ID.  In Java, numeric variables are signed.  For that reason, an integer
> cannot exceed (2^31)-1.  That number is 2147483647.  It appears that
> Lucene cuts that off at a value that's smaller by 128.  Not sure why
> that is, but it's probably to prevent problems when a small offset is
> added to the value.
>
> SolrCloud is perfectly capable of running indexes with far more than two
> billion documents, but as Yago mentioned, the collection must be sharded
> for that to happen.
>
> I have no idea whether you can successfully recover anything from that
> index now that it has broken the hard limit.
>
> Thanks,
> Shawn
>
>
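The offset Shawn describes can be verified with quick arithmetic; in recent Lucene versions the cap is, I believe, exposed as `IndexWriter.MAX_DOCS`. A minimal sketch (the class name is mine, and 2147483519 is taken from the error message above):

```java
public class MaxDocsCheck {
    public static void main(String[] args) {
        // Java's signed 32-bit int maximum: (2^31) - 1
        int intMax = Integer.MAX_VALUE;   // 2147483647
        // The hard cap reported in Wael's error message
        int luceneCap = 2147483519;
        // Lucene stops 128 short of Integer.MAX_VALUE, presumably
        // headroom so small internal offsets can't overflow
        System.out.println(intMax - luceneCap);  // prints 128
    }
}
```

So the ceiling is not the full 2^31 - 1 = 2147483647 but 128 below it, which is why the index hit the wall slightly before the "2 billion and change" one might expect.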
