On 3/4/2012 3:31 AM, Sphene Software wrote:
Folks,

I am planning to use DIH for an index of 10 million records.

I would like to know the following:
- Can DIH scale to an index of this size?
- If DIH is a bottleneck, what is the specific issue and how can it be
addressed?

My entire index is about 67 million documents. There are seven shards in total, and six of them have over 11 million documents each. I can do a full dataimport (from MySQL) of those six shards simultaneously in less than three hours. The seventh shard has fewer than 500,000 documents and builds after the others during a full rebuild. We rarely have to do a full rebuild; it mostly happens at schema change time.
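
To put those figures in terms of throughput, here is a rough back-of-the-envelope sketch. It treats 11 million documents per shard as a floor and three hours as a ceiling, so the rates it prints are lower bounds rather than measurements. The variable and function names are only for illustration.

```python
# Lower-bound throughput estimate from the figures quoted in this message.
DOCS_PER_SHARD = 11_000_000    # "over 11 million" per large shard, used as a floor
LARGE_SHARDS = 6               # shards imported simultaneously
IMPORT_SECONDS = 3 * 60 * 60   # "less than three hours", used as a ceiling


def docs_per_second(docs: int, seconds: int) -> float:
    """Average indexing rate for a given document count and duration."""
    return docs / seconds


# Rate for one shard, and for all six large shards running in parallel.
per_shard_rate = docs_per_second(DOCS_PER_SHARD, IMPORT_SECONDS)
aggregate_rate = docs_per_second(DOCS_PER_SHARD * LARGE_SHARDS, IMPORT_SECONDS)

print(f"per shard: at least {per_shard_rate:.0f} docs/s")
print(f"aggregate: at least {aggregate_rate:.0f} docs/s")
```

That works out to roughly a thousand documents per second per shard, which suggests that a single 10 million record import is within what DIH handled here.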

I use SolrJ for updates. My experience with it so far suggests that doing the full import with my SolrJ code would take significantly longer than three hours.

Thanks,
Shawn
