Time-Routed Alias Not Distributing Wrongly Placed Docs

John Nashorn Tue, 27 Nov 2018 09:42:31 -0800

Hello everyone,
I'm using "hive-solr" from Lucidworks to index my data into Solr (v7.5, cloud 
mode). According to the Solr manual, a Time Routed Alias (TRA) expects documents 
to be indexed through its alias name, not directly into the collections under 
it. Unfortunately, hive-solr doesn't accept a TRA name as an indexing target. So 
I index the data into the first collection created by the TRA and expect Solr 
to route each document to its respective collection under the hood. This works 
to some extent, but a large portion of the data stays where it was indexed, 
i.e. in the first collection of the TRA. For example (approximate numbers):


* coll_2018-07-01 => 800,000,000 docs
* coll_2018-08-01 => 0 docs
* coll_2018-09-01 => 0 docs
* coll_2018-10-01 => 150,000,000 docs
* coll_2018-11-01 => 0 docs

Here, coll_2018-07-01 contains data that should normally be in the other four 
collections.
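
To make "respective collection" concrete, here is a minimal Python sketch of 
the placement I would expect. It is not Solr's actual routing code. It assumes 
that the date suffix of each collection name marks the start of its time slice 
and that a document belongs to the latest slice starting at or before its 
timestamp. The helper names slice_start and expected_collection are 
hypothetical; only the collection names come from the listing above.

```python
from bisect import bisect_right
from datetime import datetime

# Collection names taken from the listing above. Assumption: the date
# suffix is the inclusive start of the time slice the collection covers.
COLLECTIONS = [
    "coll_2018-07-01",
    "coll_2018-08-01",
    "coll_2018-09-01",
    "coll_2018-10-01",
    "coll_2018-11-01",
]


def slice_start(name):
    """Parse the start date out of a name such as 'coll_2018-07-01'."""
    return datetime.strptime(name.split("_", 1)[1], "%Y-%m-%d")


def expected_collection(ts, collections=COLLECTIONS):
    """Return the collection whose slice should contain timestamp ts.

    Picks the latest collection whose start is <= ts. Raises ValueError
    when ts is earlier than the first slice.
    """
    ordered = sorted(collections, key=slice_start)
    starts = [slice_start(c) for c in ordered]
    idx = bisect_right(starts, ts) - 1
    if idx < 0:
        raise ValueError("timestamp precedes the first collection")
    return ordered[idx]


if __name__ == "__main__":
    # A document dated mid-August should not sit in coll_2018-07-01.
    print(expected_collection(datetime(2018, 8, 15)))
```

Under these assumptions, any document in coll_2018-07-01 for which 
expected_collection returns a different name is one of the misplaced 
documents I am asking about.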

Is there a way to make the TRA deliberately scan for misplaced documents and 
move them to their correct collections?
