Note that the Lucene per-index document limit counts deleted and updated documents too. So even if your live document count stays under 2^31-1, deletes and updates can push the internal document count (maxDoc) over the limit unless you merge segments aggressively enough to expunge the deleted documents.
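As a rough sketch of that arithmetic (not from the thread itself): the hard limit applies to maxDoc, which is live documents plus deletions still sitting in un-merged segments. The helper names below are made up for illustration; `numDocs` and `deletedDocs` correspond to the per-core statistics Solr reports.

```python
# Sketch: Lucene's per-index ceiling applies to maxDoc, which includes
# deleted-but-not-yet-merged documents, not just live ones.
LUCENE_MAX_DOCS = 2**31 - 1  # Java Integer.MAX_VALUE, the hard per-index limit

def max_doc(num_docs: int, deleted_docs: int) -> int:
    """maxDoc as Lucene sees it: live documents plus deletions still in segments."""
    return num_docs + deleted_docs

def approaching_limit(num_docs: int, deleted_docs: int, headroom: float = 0.9) -> bool:
    """True once maxDoc has crossed `headroom` (default 90%) of the hard limit."""
    return max_doc(num_docs, deleted_docs) > headroom * LUCENE_MAX_DOCS

# An index with 1.9B live docs and 300M un-merged deletes is already over the limit:
print(max_doc(1_900_000_000, 300_000_000) > LUCENE_MAX_DOCS)  # True
```

This is why expunging deletes matters: merging segments reclaims the doc-id space those deletions still occupy.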
-- Jack Krupansky

On Mon, Dec 29, 2014 at 12:54 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> When you say 2B docs on a single Solr instance, are you talking only one
> shard? Because if you are, you're very close to the absolute upper limit of
> a shard; internally the doc id is an int, or 2^31. 2^31 + 1 will cause all
> sorts of problems.
>
> But yeah, your 100B documents are going to use up a lot of servers...
>
> Best,
> Erick
>
> On Mon, Dec 29, 2014 at 7:24 AM, Bram Van Dam <bram.van...@intix.eu> wrote:
> > Hi folks,
> >
> > I'm trying to get a feel of how large Solr can grow without slowing down
> > too much. We're looking into a use-case with up to 100 billion documents
> > (SolrCloud), and we're a little afraid that we'll end up requiring 100
> > servers to pull it off.
> >
> > The largest index we currently have is ~2 billion documents in a single
> > Solr instance. Documents are smallish (5k each) and we have ~50 fields in
> > the schema, with an index size of about 2TB. Performance is mostly OK.
> > Cold searchers take a while, but most queries are alright after warming
> > up. I wish I could provide more statistics, but I only have very limited
> > access to the data (...banks...).
> >
> > I'd be very grateful to anyone sharing statistics, especially on the
> > larger end of the spectrum -- with or without SolrCloud.
> >
> > Thanks,
> >
> > - Bram
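The back-of-the-envelope shard math behind "100B documents are going to use up a lot of servers" can be sketched as follows. The 500M-docs-per-shard target in the second example is an assumption for illustration, not a figure from the thread; only the 2^31-1 ceiling is a hard limit.

```python
import math

# Hard per-shard ceiling: Lucene doc ids are Java ints.
LUCENE_MAX_DOCS = 2**31 - 1

def min_shards(total_docs: int, docs_per_shard: int = LUCENE_MAX_DOCS) -> int:
    """Minimum number of shards needed to hold total_docs at docs_per_shard each."""
    return math.ceil(total_docs / docs_per_shard)

# Theoretical floor for 100 billion documents, packing every shard to the limit:
print(min_shards(100_000_000_000))  # 47

# With an assumed, more comfortable target of ~500M docs per shard:
print(min_shards(100_000_000_000, 500_000_000))  # 200
```

In practice the working shard count is driven by query latency, heap, and merge behavior rather than the doc-id ceiling, so real deployments land well above the theoretical floor.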