Hi.

We are now rethinking our sharding strategy: we want to shard on the blog
entry publishdate instead of just a simple hash, because the index is
growing too large to be handled as a single index.

The idea is to select a core based on the publishdate of the entry.
Let's say we have one core per month.

A new core needs to be added by 00:00 on the first day of each month at
the very latest. How would I do that dynamically?

I am thinking something like this:

* A core dir which serves as a template (i.e. contains no data)

* A cronjob which:
  - Copies the tpl-dir -> $corename-$year-$month
  - Updates solr.xml
  - Restarts Solr (or can one reload only the solr.xml file?)
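
The cronjob steps above could be sketched roughly as follows. This is only
a sketch under assumptions: the helper names, the "blog" prefix and the
directory layout are hypothetical, and it assumes the Solr version in use
exposes the CoreAdmin handler at /admin/cores, whose CREATE action
registers a core in a running instance, so a restart may not be needed.
Whether the change is written back to solr.xml depends on the Solr version
and configuration.

```python
import shutil
from datetime import date
from pathlib import Path
from urllib.parse import urlencode


def core_name(prefix: str, month: date) -> str:
    """Build a name like blog-2009-03 (hypothetical naming scheme)."""
    return f"{prefix}-{month.year:04d}-{month.month:02d}"


def next_month(today: date) -> date:
    """First day of the month after `today`."""
    if today.month == 12:
        return date(today.year + 1, 1, 1)
    return date(today.year, today.month + 1, 1)


def prepare_core_dir(solr_home: Path, template: str, name: str) -> Path:
    """Copy the empty template core dir to the new core dir, if missing."""
    target = solr_home / name
    if not target.exists():
        shutil.copytree(solr_home / template, target)
    return target


def create_core_url(base_url: str, name: str) -> str:
    """URL for a CoreAdmin CREATE request (assumes instanceDir == name)."""
    query = urlencode({"action": "CREATE", "name": name, "instanceDir": name})
    return f"{base_url}/admin/cores?{query}"
```

Run from cron a few days before the month starts, the job would call
prepare_core_dir() for core_name(prefix, next_month(date.today())) and then
issue an HTTP GET against create_core_url(), for example with
urllib.request.urlopen. Creating the core early leaves a margin if the job
fails once.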

Of course we get issues on the client, since we need to figure out which
cores to search for each query...
The benefit, though, is that the indexes seem a lot easier to manage,
especially since we can choose which indices to search and so use memory
better. We probably need to distribute a file to the clients so they know
where each index resides (or keep the conf in a DB).
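
On the client side, picking the cores for a query could look something
like the sketch below. It assumes the one-core-per-month naming from above
and a hypothetical core_hosts mapping (core name -> "host:port"), which
stands in for the distributed file or DB conf. The result is meant for
Solr's distributed search "shards" request parameter, which takes a
comma-separated list of host:port/path entries.

```python
from datetime import date


def month_cores(prefix: str, start: date, end: date) -> list[str]:
    """Names of the monthly cores covering start..end (inclusive)."""
    names = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        names.append(f"{prefix}-{year:04d}-{month:02d}")
        # Step to the next month, rolling over the year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return names


def shards_param(core_hosts: dict[str, str], cores: list[str]) -> str:
    """Join the selected cores into a value for the shards parameter."""
    return ",".join(f"{core_hosts[c]}/solr/{c}" for c in cores)
```

A query restricted to a publishdate range then only touches the cores for
those months, and the remaining indexes can stay cold.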

Any thoughts?

Cheers

//Marcus





-- 
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
marcus.he...@tailsweep.com
http://www.tailsweep.com/
