This will still mean less overhead than having those distinct fields in discrete indexes. I wouldn't worry about that.
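As a rough illustration of the trade-off (a sketch only, not a measurement of Elasticsearch itself): if every tenant's fields are distinct, the total number of mapped fields is the same whether they live in one shared index or in one index per tenant. What changes is how many indexes have to be maintained. The tenant names, the `F<tenant>_<n>` field naming, and the helper functions below are all made up for the example.

```python
def make_tenant_docs(num_tenants, fields_per_tenant):
    """Build one sample document per tenant, each with its own distinct fields.

    Field names follow a hypothetical F<tenant>_<n> pattern so that no two
    tenants share a field name.
    """
    return {
        f"tenant{t}": [{f"F{t}_{i}": i for i in range(1, fields_per_tenant + 1)}]
        for t in range(1, num_tenants + 1)
    }


def shared_index_fields(tenant_docs):
    """Union of field names that a single multi-tenant index would have to map."""
    fields = set()
    for docs in tenant_docs.values():
        for doc in docs:
            fields.update(doc)  # updating a set from a dict adds its keys
    return fields


def per_tenant_index_fields(tenant_docs):
    """Field names mapped by each index when every tenant gets its own index."""
    return {
        tenant: {name for doc in docs for name in doc}
        for tenant, docs in tenant_docs.items()
    }


tenant_docs = make_tenant_docs(num_tenants=3, fields_per_tenant=4)
shared = shared_index_fields(tenant_docs)
separate = per_tenant_index_fields(tenant_docs)

print("fields in one shared index:", len(shared))
print("number of per-tenant indexes:", len(separate))
print("fields across per-tenant indexes:", sum(len(f) for f in separate.values()))
```

With distinct fields the two field totals come out equal, so the comparison is really one index against one index per tenant, each of which carries its own fixed cost.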
--

Itamar Syn-Hershko
http://code972.com | @synhershko <https://twitter.com/synhershko>
Freelance Developer & Consultant
Author of RavenDB in Action <http://manning.com/synhershko/>

On Wed, Sep 17, 2014 at 4:35 PM, Ziv Shalev <[email protected]> wrote:

> thanks for the prompt reply!
> one thing though - when using a single multi-tenant index, my concerns are
> not around the number of fields per doc (which is small, less than 50),
> but rather the fact that since each tenant has different fields, the
> accumulated number of fields in such an index will be huge.
>
> i.e. tenant 1 has fields F11..F1n, tenant 2 has fields F21..F2n, ...
> these fields are distinct so the number of fields for the multi-tenant
> index will grow to millions quickly.
>
> will such an indexing methodology work in ES?
>
> thanks!
>
>
> On Wednesday, September 17, 2014 4:21:17 PM UTC+3, Itamar Syn-Hershko
> wrote:
>>
>> First, you should really read this:
>> http://aphyr.com/posts/317-call-me-maybe-elasticsearch
>> regarding using ES as a single source of truth
>>
>> Millions of indexes is not advisable, unless you plan on having millions
>> of servers. Depending on index size and write frequency to them, you don't
>> want to have more than a few dozen indexes per machine (including
>> replicas). This is because of concerns of memory, CPU, I/O and file
>> descriptors.
>>
>> One big single index may present its own problems due to the different
>> schemas, although it may be solvable using dynamic index templates
>> <http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/custom-dynamic-mapping.html#dynamic-templates>.
>> I will still expect you to have issues with number of shards (basically,
>> running out of shards at some point).
>>
>> Therefore I will try and find a middle way here, using probably some sort
>> of a mapping mechanism. Even also time based if its applicable.
>>
>> Re your questions:
>>
>> * are there production deployments out there that have a million active
>> indexes? what do they look like?
>>
>> I'm not aware of such
>>
>> * how many different fields does it make sense to host in a single index?
>> would it scale to millions of fields in a single index?
>>
>> You mean in a single document. I recall seeing Shay suggesting not to go
>> over the 100 threshold or so. Lucene really isn't optimized for scaling
>> vertically, especially in the document level.
>>
>> * are there other ways to go about this that we have overlooked?
>>
>> Maybe look at your data model and try to re-arrange it.
>>
>> --
>>
>> Itamar Syn-Hershko
>> http://code972.com | @synhershko <https://twitter.com/synhershko>
>> Freelance Developer & Consultant
>> Author of RavenDB in Action <http://manning.com/synhershko/>
>>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/23a8484a-dcfc-4c8a-bc9d-a02bc4280985%40googlegroups.com
> <https://groups.google.com/d/msgid/elasticsearch/23a8484a-dcfc-4c8a-bc9d-a02bc4280985%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAHTr4ZtjRiUhQPNL2i3TCr9ZNus%3DijMAZJ2J3P25vtHnxNGUag%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
