Hi, we are considering using ES as the primary data source for a new project. Our data is generated by millions of different users, *each having a relatively small number of documents, yet each with a different data schema.*
*We are considering several approaches:*

* Index per user - we are concerned about scaling the ES cluster to support millions of indexes, each holding a relatively small number of docs.
* All users colocated in a single index - we are concerned that an ES index will not support millions of different fields (as each user has a different data schema).
* A mix of the two above - having X users colocated in a single index, and Y such indexes to host our entire user population.
* Implementing some kind of "mapping layer" that maps each user's schema onto generic fields in one or more indexes. This would probably work, but is of course harder to implement and maintain.

*So my questions:*

* Are there production deployments out there with a million active indexes? What do they look like?
* How many different fields does it make sense to host in a single index? Would it scale to millions of fields in a single index?
* Are there other ways to go about this that we have overlooked?

Thanks!!
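To make the "mapping layer" option concrete, here is a minimal sketch of the idea: each user's ad-hoc field names are translated onto a fixed pool of generic, typed fields (`str_0`, `long_0`, ...) so that a single ES index mapping can serve every user. All names here (`FieldMapper`, `encode_doc`, the generic field naming scheme) are illustrative assumptions, not Elasticsearch APIs; in practice the per-user field map itself would need to be persisted somewhere.

```python
# Sketch of a per-user field-mapping layer (hypothetical, not ES API).
# User field names are rewritten to generic typed slots so one index
# mapping covers millions of users with different schemas.

class FieldMapper:
    def __init__(self):
        self._maps = {}      # per-user: {user_field_name: generic_field_name}
        self._counters = {}  # per (user, type): next free slot number

    def _generic_name(self, user_id, value):
        # Pick a typed slot pool based on the value's type (simplified
        # to long/str here; a real version would cover more ES types).
        kind = "long" if isinstance(value, int) else "str"
        key = (user_id, kind)
        n = self._counters.get(key, 0)
        self._counters[key] = n + 1
        return f"{kind}_{n}"

    def encode_doc(self, user_id, doc):
        """Translate a user's document into generic field names before indexing."""
        mapping = self._maps.setdefault(user_id, {})
        out = {"user_id": user_id}
        for field, value in doc.items():
            if field not in mapping:
                mapping[field] = self._generic_name(user_id, value)
            out[mapping[field]] = value
        return out

    def encode_field(self, user_id, field):
        """Translate a user's field name for use in a query."""
        return self._maps[user_id][field]
```

For example, `encode_doc("alice", {"age": 30, "city": "Paris"})` would produce `{"user_id": "alice", "long_0": 30, "str_0": "Paris"}`, and queries against `age` would be rewritten to target `long_0` filtered by `user_id`.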
