RAM is roughly 1000x faster than SSD. Elasticsearch loads index files into RAM when needed, so every search runs against index data that has been loaded into memory. As long as your index fits into virtual memory, you can assume that a search is served a) from RAM, b) if not from RAM, from the file cache, or c) on a file cache miss, from disk, which requires disk seeks by the OS virtual memory manager. With mlockall you avoid swap, so RAM pages are never swapped out to a slow disk device and performance stays in a steady state.
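To get a rough feeling for which of the cases a) to c) you will hit, you can compare the on-disk size of the index with the memory left for the file cache. The following is only a minimal sketch: the data path and the RAM and heap sizes are made-up values, and "RAM minus JVM heap" is a simplifying assumption about how much memory the OS can use for caching.

```python
import os


def directory_size_bytes(path):
    """Sum the sizes of all regular files below `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total


def fits_in_file_cache(index_bytes, ram_bytes, heap_bytes):
    """True if the index is no larger than the RAM left after the JVM heap.

    Assumption: everything that is not JVM heap is available to the
    OS file cache. Real systems have less, so treat this as optimistic.
    """
    return index_bytes <= ram_bytes - heap_bytes


if __name__ == "__main__":
    GIB = 1024 ** 3
    data_path = "/var/lib/elasticsearch"  # hypothetical path, adjust to your node
    index_bytes = directory_size_bytes(data_path)
    # Hypothetical machine: 64 GiB RAM, 16 GiB JVM heap.
    ok = fits_in_file_cache(index_bytes, ram_bytes=64 * GIB, heap_bytes=16 * GIB)
    print("index size: %.1f GiB, fits in file cache: %s" % (index_bytes / GIB, ok))
```

If the check fails, some searches will fall through to case c) and pay for disk seeks.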
I am not sure that is Nobel Prize material - I rather think this is due to my hardware and software setup - but on my ES node it is quite usual to see writes of >200 MB/sec to SSD while bulk indexing. I admit this is not the maximum drive performance, but there are other factors which are outside the scope of Elasticsearch (e.g. JVM buffers, file system journaling, random writes vs. sequential writes, interleaving of reads and writes, fair queue scheduling of the VM, bus speed, controller configuration, cache settings etc.), so with normal methods it cannot go much higher.

Your question was about the number of shards for SSD. Really, there is no different shard limit on an ES cluster, whether with HDD or with SSD. Maybe you wanted to ask about the performance gains of SSD; in that case I misunderstood the question completely.

Jörg

On Fri, Oct 10, 2014 at 12:34 AM, Kevin Burton wrote:
> On Wednesday, October 8, 2014 12:07:30 AM UTC-7, Jörg Prante wrote:
>> With ES, you can go up to the bandwidth limit the OS allows for writing
>> I/O (if you disable throttling etc.)
>>
>> This means, if you write to one shard, it can be as fast as writing to
>> thousands of shards in parallel in summary. There is an OS limit for file
>> system buffers so the more shards, the more RAM is recommended.
>>
>> If OS restricts file descriptor limits and you want to write to shards,
>> you can estimate you need a peak of 100-200 file descriptors for active
>> merges etc. (this varies from version to version and from setting to
>> setting). ES does not impose a limit here.
>>
>> These resource demands and custom configurations are not related to the
>> choice SSD or HDD.
>
> There's no way that's true... for example, if you are on HDD, and your
> index is not in memory, AND you're serving queries (which is a realistic
> use case) then there is NO way you can write at the full IO of the disk.
> It's just physically impossible.
> If ES has been able to solve that problem then they could win a Nobel
> Prize :-p
>
> SSD is 2-3 orders of magnitude faster than HDD ... so yes, it is related
> to the choice of SSD or HDD.
>
> ... it's entirely possible I'm misinterpreting what you're saying though.
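
The file descriptor estimate quoted above (a peak of 100-200 for active merges etc.) can be turned into a quick sanity check against the OS limit. This is only a sketch under assumptions: it reads the estimate as "per actively written shard", which the thread does not state explicitly, and it inspects the limit of the Python process itself, not of the running ES process.

```python
# Upper end of the 100-200 estimate from the thread.
# Applying it per active shard is an assumption made for this sketch.
PEAK_FDS_PER_SHARD = 200


def estimated_peak_fds(active_shards, per_shard=PEAK_FDS_PER_SHARD):
    """Rough peak number of file descriptors while writing to `active_shards`."""
    return active_shards * per_shard


def enough_descriptors(active_shards, soft_limit, per_shard=PEAK_FDS_PER_SHARD):
    """True if the estimated peak stays within the given soft limit."""
    return estimated_peak_fds(active_shards, per_shard) <= soft_limit


if __name__ == "__main__":
    import resource  # Unix only

    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    shards = 20  # hypothetical number of shards being written on this node
    if soft == resource.RLIM_INFINITY:
        print("no file descriptor limit set")
    else:
        print("estimated peak: %d, soft limit: %d, enough: %s"
              % (estimated_peak_fds(shards), soft, enough_descriptors(shards, soft)))
```

As the quoted text says, the real numbers vary from version to version and from setting to setting, so the constant is only a starting point.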
