FYI, there is SirenDB, built on top of Lucene, which addresses such concerns. It supports multi-level parent-child relationships and provides nice querying capabilities.
--
Ravi

On Thu, Jul 31, 2014 at 12:59 PM, Ravikumar Govindarajan <ravikumar.govindara...@gmail.com> wrote:

> We are planning to use block-indexing and ToChildBlockJoin queries...
>
> Each parent-doc can contain anywhere between 1-2000 children-docs and is
> highly variable.
>
> A sample user-stats is as follows
>
> 1. No.of. parent-docs = 500K
> 2. Children -per parent = 50
> 3. Total-docs = 25 Million
> 4. Size occupied = 7 GB
>
> A given index contains many such users but we are planning to limit the
> size to 32GB per-index. When exceeded, addDocuments() call moves to newer
> index.
>
> The number of docs in one 32GB index sounds very scary. Are queries
> affected by such high number of documents? Filters, AcceptDocs etc...
> could also become memory-heavy no?
>
> Is Block-Join the correct fit for the above scenario?
>
> --
> Ravi
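
For the doc-count and memory question in the quoted message, here is a rough back-of-the-envelope sketch in plain Java (no Lucene dependency). It rests on two assumptions that are not in the original mail: that index size scales roughly linearly with document count, and that a cached filter costs about one bit per document, as a FixedBitSet-style bitset would. The class and method names are made up for illustration.

```java
public class BlockIndexEstimate {
    // Figures from the sample user stats in the quoted message.
    static final long SAMPLE_DOCS = 25_000_000L;  // "Total-docs = 25 Million"
    static final double SAMPLE_SIZE_GB = 7.0;     // "Size occupied = 7 GB"
    static final double INDEX_LIMIT_GB = 32.0;    // planned per-index cap

    // Estimated documents in an index of the given size,
    // ASSUMING size grows linearly with document count.
    static long estimateDocs(double indexGb) {
        return Math.round(SAMPLE_DOCS / SAMPLE_SIZE_GB * indexGb);
    }

    // Bytes for a one-bit-per-document bitset, rounded up to whole 64-bit words.
    // ASSUMPTION: one cached filter is roughly one such bitset.
    static long bitsetBytes(long numDocs) {
        long words = (numDocs + 63) / 64;
        return words * 8;
    }

    public static void main(String[] args) {
        long docs = estimateDocs(INDEX_LIMIT_GB);
        long bytes = bitsetBytes(docs);
        System.out.printf("docs in a %.0f GB index: ~%d%n", INDEX_LIMIT_GB, docs);
        System.out.printf("one full bitset: ~%.1f MiB%n", bytes / (1024.0 * 1024.0));
    }
}
```

Under those assumptions a full 32GB index holds on the order of 114 million documents, and each full-index bitset is in the low tens of megabytes, so the cost depends mostly on how many such filters are kept cached at once.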