We have built an ES system using time-based indexes (yearly, monthly, weekly)
for different clients. The index names follow a standard pattern (similar to
Logstash's, but more complex).

We fetch the list of all indexes from ES and filter it by name when searching.
We did not want to use aliases, because we keep records of the searches
together with other non-ES context information: we want to know which indexes
were queried at any given point in time, and we also want to limit the indexes
searched based on the time criteria of the search and other context information.
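
To make the filtering step concrete, here is a minimal sketch in Python. Our
real naming pattern is more complex, so the pattern used here
(<client>-<granularity>-<stamp>, e.g. acme-monthly-2014.07) and the function
names index_period and select_indexes are only assumptions for illustration.

```python
from datetime import date, timedelta


def index_period(name):
    """Parse a hypothetical index name into (client, start, end).

    Assumed pattern: <client>-<granularity>-<stamp>, for example
    'acme-yearly-2014', 'acme-monthly-2014.07' or 'acme-weekly-2014.30'
    (ISO week number). 'end' is exclusive. Raises ValueError otherwise.
    """
    client, granularity, stamp = name.rsplit("-", 2)
    parts = [int(p) for p in stamp.split(".")]
    if granularity == "yearly":
        (year,) = parts
        start, end = date(year, 1, 1), date(year + 1, 1, 1)
    elif granularity == "monthly":
        year, month = parts
        start = date(year, month, 1)
        end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    elif granularity == "weekly":
        year, week = parts
        start = date.fromisocalendar(year, week, 1)  # needs Python 3.8+
        end = start + timedelta(days=7)
    else:
        raise ValueError("unknown granularity: " + granularity)
    return client, start, end


def select_indexes(names, client, first_day, last_day):
    """Return the client's indexes whose period overlaps [first_day, last_day]."""
    selected = []
    for name in names:
        try:
            owner, start, end = index_period(name)
        except ValueError:
            continue  # name does not follow the assumed pattern
        if owner == client and start <= last_day and first_day < end:
            selected.append(name)
    return sorted(selected)
```

The returned list is what we would record alongside the search, since it is
exactly the set of indexes that the query is sent to.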

We found that http://<ip>:9200/_stats/indices/ does not include the indexes
on nodes that are down, whereas we need the search to fail in those cases.
The http://<ip>:9200/_aliases API, on the other hand, apparently does include
the indexes on nodes that are down.
What is the preferred API for getting all indexes?
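
If the behaviour described above holds, one option is to compare the two
responses and fail the search ourselves. The sketch below works on
already-parsed JSON and makes two assumptions: that the _aliases response is
an object keyed by index name, and that the stats response lists index names
under an "indices" key. The names IndexesUnavailable and
check_indexes_available are made up for this example.

```python
class IndexesUnavailable(Exception):
    """Raised when a search would touch unknown or unavailable indexes."""

    def __init__(self, unknown, down):
        self.unknown = sorted(unknown)  # not listed by _aliases
        self.down = sorted(down)        # listed by _aliases, absent from stats
        super().__init__(
            "unknown indexes: %s; unavailable indexes: %s"
            % (self.unknown, self.down)
        )


def check_indexes_available(aliases_response, stats_response, wanted):
    """Fail if any wanted index is missing from either response.

    aliases_response: parsed JSON from _aliases (assumed keyed by index name).
    stats_response:   parsed JSON from the stats API (assumed to carry the
                      index names under the "indices" key).
    wanted:           index names the search is about to query.
    """
    known = set(aliases_response)
    reporting = set(stats_response.get("indices", {}))
    wanted = set(wanted)
    unknown = wanted - known
    down = (wanted & known) - reporting
    if unknown or down:
        raise IndexesUnavailable(unknown, down)
```

Calling this before each search would turn a silently incomplete index list
into an explicit error.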

These APIs appear to be fast and to use a cache. Does it really make sense
for us to cache this information as well?

We have been observing the stats API (called through the native Java
TransportClient API) behave inconsistently: we do not get all the indexes.

We are thinking of maintaining the client-to-indexes mapping in a separate
datastore. It would be extra work to keep it in sync with ES.
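
As a rough idea of what that sync work would involve, here is a sketch in
Python of a reconciliation pass. The shape of the registry (client name mapped
to a set of index names), the owner_of callback and the function name
reconcile are all assumptions for illustration, not an existing API.

```python
def reconcile(registry, live_names, owner_of):
    """Compute the changes needed to bring the external registry in line with ES.

    registry:   dict mapping client name -> set of recorded index names.
    live_names: index names currently reported by ES.
    owner_of:   function mapping an index name to its client, or None when
                the name does not belong to any client.
    Returns a dict: client -> {"add": [...], "remove": [...]}, listing only
    the clients whose recorded indexes differ from the live ones.
    """
    live_by_client = {}
    for name in live_names:
        client = owner_of(name)
        if client is not None:
            live_by_client.setdefault(client, set()).add(name)

    changes = {}
    for client in set(registry) | set(live_by_client):
        recorded = registry.get(client, set())
        live = live_by_client.get(client, set())
        if recorded != live:
            changes[client] = {
                "add": sorted(live - recorded),
                "remove": sorted(recorded - live),
            }
    return changes
```

Such a pass would have to run periodically or after every index creation and
deletion, which is the extra work mentioned above.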

It would be nice if anyone could share best practices for dealing with such
large indexes.





