What Shawn said. 117 shards and 116 docs tells you absolutely nothing useful. I've never seen the number of docs on various shards be off by more than 2-3% when enough docs are indexed to be statistically valid.
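
If you want to convince yourself, here is a rough sketch that hashes sequential ids into 117 buckets and reports how lopsided the result is. It is not Solr's routing code: compositeId uses MurmurHash3 and assigns documents by hash range, whereas this stand-in uses MD5 and a plain modulo, and the names shard_for, distribution, summarize and expected_empty are made up for illustration. The statistical point should carry over to any reasonably uniform hash.

```python
import hashlib
from collections import Counter


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a document id to a bucket with a uniform hash.

    Assumption: MD5 plus modulo stands in for Solr's real router,
    which uses MurmurHash3 and per-shard hash ranges.
    """
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_shards


def distribution(num_docs: int, num_shards: int) -> list:
    """Return the document count for every shard, including empty ones."""
    counts = Counter(shard_for(str(i), num_shards) for i in range(num_docs))
    return [counts.get(s, 0) for s in range(num_shards)]


def summarize(num_docs: int, num_shards: int) -> dict:
    """Report empty shards and how far the fullest shard is above the mean."""
    per_shard = distribution(num_docs, num_shards)
    mean = num_docs / num_shards
    return {
        "empty": sum(1 for c in per_shard if c == 0),
        "min": min(per_shard),
        "max": max(per_shard),
        "max_over_mean": max(per_shard) / mean,
    }


def expected_empty(num_docs: int, num_shards: int) -> float:
    """Expected number of empty shards under an ideal uniform hash."""
    return num_shards * (1 - 1 / num_shards) ** num_docs


if __name__ == "__main__":
    for n in (117, 10_000, 200_000):
        print(n, summarize(n, 117), round(expected_empty(n, 117), 2))
```

With 117 ids and 117 buckets, an ideal uniform hash is expected to leave roughly 43 buckets empty, so 38 empty shards is about what chance alone would produce. The imbalance shrinks quickly as the document count grows.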

Best,
Erick

On Fri, Mar 16, 2018 at 5:34 AM, Shawn Heisey <apa...@elyograg.org> wrote:
> On 3/6/2018 11:53 AM, Nawab Zada Asad Iqbal wrote:
>>
>> I have 117 shards and i tried to use document ids from zero to 116. I find
>> that the distribution is very uneven, e.g., the largest bucket receives
>> total 5 documents; and around 38 shards will be empty. Is it expected?
>
>
> With such a small data set, this fits what I would expect.
>
> Choosing buckets by hashing (which is what compositeId does) is not perfect,
> but if you send it thousands or millions of documents, it will be
> *generally* balanced.
>
> Thanks,
> Shawn
>