And that Lucene index document limit includes deleted and updated
documents: an update is internally a delete plus a re-add, so even if your
live document count stays under 2^31-1, deletes and updates can push the
internal count (maxDoc) over the limit unless you merge segments
aggressively enough to expunge the deleted documents.
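
To make that concrete, here's a minimal sketch against the plain Lucene
API showing the gap between the live document count and the internal count
that actually hits the ceiling. The index path is a placeholder and the
analyzer choice is illustrative; run it against an offline copy, since an
index Solr has open holds the write lock.

    import java.nio.file.Paths;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public class DeletedDocHeadroom {
      public static void main(String[] args) throws Exception {
        // "/path/to/index" is a placeholder for a real Lucene index dir.
        try (Directory dir = FSDirectory.open(Paths.get("/path/to/index"))) {
          try (DirectoryReader reader = DirectoryReader.open(dir)) {
            // numDocs() counts only live documents; maxDoc() also counts
            // deleted (and updated, i.e. deleted-then-readded) documents
            // that haven't been merged away yet. maxDoc() is the number
            // that runs into the 2^31-1 ceiling.
            System.out.println("live docs (numDocs):     " + reader.numDocs());
            System.out.println("internal count (maxDoc): " + reader.maxDoc());
            System.out.println("deleted docs:            " + reader.numDeletedDocs());
          }
          // Reclaiming the headroom means merging the deletes away:
          try (IndexWriter writer = new IndexWriter(dir,
                   new IndexWriterConfig(new StandardAnalyzer()))) {
            writer.forceMergeDeletes(); // expensive: rewrites affected segments
          }
        }
      }
    }

Solr exposes roughly the same merge via expungeDeletes=true on a commit.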

-- Jack Krupansky

On Mon, Dec 29, 2014 at 12:54 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> When you say 2B docs on a single Solr instance, are you talking about
> only one shard? Because if you are, you're very close to the absolute
> upper limit of a shard: internally the doc id is a signed int, so the
> ceiling is 2^31 - 1, and going past it will cause all sorts of problems.
>
> But yeah, your 100B documents are going to use up a lot of servers...
>
> Best,
> Erick
>
> On Mon, Dec 29, 2014 at 7:24 AM, Bram Van Dam <bram.van...@intix.eu>
> wrote:
> > Hi folks,
> >
> > I'm trying to get a feel for how large Solr can grow without slowing
> > down too much. We're looking into a use-case with up to 100 billion
> > documents (SolrCloud), and we're a little afraid that we'll end up
> > requiring 100 servers to pull it off.
> >
> > The largest index we currently have is ~2 billion documents in a
> > single Solr instance. Documents are smallish (5k each) and we have ~50
> > fields in the schema, with an index size of about 2TB. Performance is
> > mostly OK. Cold searchers take a while, but most queries are alright
> > after warming up. I wish I could provide more statistics, but I only
> > have very limited access to the data (...banks...).
> >
> > I'd be very grateful to anyone sharing statistics, especially on the
> > larger end of the spectrum -- with or without SolrCloud.
> >
> > Thanks,
> >
> >  - Bram
>
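
A rough back-of-envelope on Erick's per-shard ceiling, assuming the hard
2^31 - 1 limit (real deployments size shards well below it, and deletes
and updates eat into the headroom):

    2^31 - 1 = 2,147,483,647 docs/shard (absolute ceiling)
    100e9 / 2,147,483,647 ≈ 46.6  -> at least 47 shards just to fit
    at an illustrative 250M docs/shard (leaving headroom):
    100e9 / 250e6 = 400 shards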
