Minfeng, this issue gets tougher as the number of shards you have grows; you
can read Erick Erickson's post:
http://grokbase.com/t/lucene/solr-user/131p75p833/how-distributed-queries-works.
If you have 100M docs, I guess you are running into this issue.
The common way to deal with this issue is by filteri
SolrEntityProcessor is fine for small amounts of data but not useful for
such a large index. The problem is that deep paging in search results is
expensive. As the "start" value for a query increases, so does the cost of
the query. You are much better off just re-indexing the data.
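To make the deep-paging cost concrete, here is a rough Python sketch of a simplified cost model. It assumes that each shard has to collect and return its top (start + rows) candidates for every request, because any of them could end up on the requested page. The function names are made up for illustration and are not part of Solr. For a single, non-distributed index, use num_shards=1.

```python
def entries_merged_per_request(start: int, rows: int, num_shards: int) -> int:
    """Sorted entries the coordinating node receives for one page.

    Simplified model (an assumption): each shard returns its top
    (start + rows) candidates, since any of them could land on the page.
    """
    return num_shards * (start + rows)


def total_entries_for_full_scan(num_docs: int, rows: int, num_shards: int) -> int:
    """Sum the per-request cost over every page needed to read num_docs."""
    return sum(
        entries_merged_per_request(start, rows, num_shards)
        for start in range(0, num_docs, rows)
    )


if __name__ == "__main__":
    # First page versus a page one million documents in, rows=2000, 4 shards.
    print(entries_merged_per_request(0, 2000, 4))
    print(entries_merged_per_request(1_000_000, 2000, 4))
```

Under this model the first page costs 8,000 entries while the page at start=1,000,000 costs 4,008,000, so the total work for paging through the whole index grows roughly with the square of the document count rather than linearly.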
On Mon, Jun 10,
I am trying to migrate 100M documents from a Solr index (v3.6) to a SolrCloud
index (v4.1, 4 shards) by using SolrEntityProcessor. My data-config.xml is
like
http://10.64.35.117:8995/solr/"; query="*:*" rows="2000" fl=
"author_class,authorlink,author_location_text,author_text,author,category,date,