Thanks for the explanation, it is clear now. Now for the other part of my question. Let's assume that I am expecting this index to hold data for 1,000 users. Each user will have 500,000 documents, and each document will be 512 KB. These documents are pure text files, and my query will only search the field that holds the file contents and will only return the file names.
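Just to put the raw numbers in perspective, here is my rough back-of-the-envelope arithmetic (a sketch only; it assumes 512 KB is the raw text size before indexing and ignores analysis overhead, replicas, and compression):

```python
# Rough capacity estimate for the scenario above
# (raw text only, no index overhead, no replicas).
users = 1_000
docs_per_user = 500_000
doc_size_kb = 512

total_docs = users * docs_per_user   # 500,000,000 documents
total_kb = total_docs * doc_size_kb
total_tb = total_kb / 1024 ** 3      # KB -> TB

print(f"{total_docs:,} documents, ~{total_tb:.0f} TB of raw text")
# -> 500,000,000 documents, ~238 TB of raw text
```

So even before any indexing overhead, this is on the order of a few hundred terabytes.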
Let's assume the cluster contains two nodes; each node has a quad-core CPU and 16 GB of RAM, and the heap size is set to 8 GB on each node. With that setup, how many shards would you say I need to get a relatively fast search? I know it's hard to calculate exactly, but I would love to find a way to at least estimate the shard count, since it cannot be increased in the future.
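For context, the reason I am asking up front is that the number of primary shards has to be chosen when the index is created, for example (a sketch using a recent elasticsearch-py client; the index name "files", the shard count of 8, and the field names are placeholders I made up, not a recommendation):

```python
# Sketch: the primary shard count is fixed at index creation time.
# "files", number_of_shards=8, and the field names are placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

es.indices.create(
    index="files",
    body={
        "settings": {
            "number_of_shards": 8,    # fixed once the index exists
            "number_of_replicas": 1,  # this one CAN be changed later
        },
        "mappings": {
            "properties": {
                "file_name": {"type": "keyword"},  # returned by the query
                "content": {"type": "text"},       # the only field searched
            }
        },
    },
)
```

Replicas can be adjusted on a live index, but the primary shard count cannot, which is why I want a reasonable estimate now.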