RAM is roughly 1000x faster than SSD. Elasticsearch loads index files into
RAM when needed, so every search runs against index data that has been
loaded into memory. Unless your index does not fit into virtual memory, you
can assume that a search a) is served from RAM, b) if not in RAM, hits the
file cache, or c) misses the file cache, which requires disk seeks by the OS
virtual memory manager. With mlockall you avoid swap, so RAM pages are never
paged out to a slow disk device and performance stays in a steady state.
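
To make the three cases above a bit more concrete, here is a minimal sketch
of the average read cost as a weighted sum over where a read is served from.
The function name, the latencies and the hit fractions are all assumptions
chosen for illustration (only the 1000x RAM-to-SSD ratio comes from the
text above); they are not measurements from an ES node.

```python
def effective_read_ns(p_ram, p_file_cache,
                      ram_ns=100, file_cache_ns=1_000, disk_ns=100_000):
    """Average read cost in nanoseconds.

    p_ram         -- fraction of reads served directly from RAM (case a)
    p_file_cache  -- fraction served from the OS file cache (case b)
    The remaining fraction misses both and goes to disk (case c).

    All default latencies are assumed values; disk_ns is simply
    1000 * ram_ns to mirror the "1000x" ratio.
    """
    if p_ram < 0 or p_file_cache < 0 or p_ram + p_file_cache > 1:
        raise ValueError("fractions must be non-negative and sum to at most 1")
    p_disk = 1 - p_ram - p_file_cache
    return p_ram * ram_ns + p_file_cache * file_cache_ns + p_disk * disk_ns


if __name__ == "__main__":
    # 90% RAM, 9% file cache, 1% disk (assumed mix)
    print(f"{effective_read_ns(0.90, 0.09):.0f} ns average")
```

With these assumed numbers, the 1% of reads that go to disk account for most
of the average cost, which is why keeping pages in RAM matters more than the
raw speed of the device behind them.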

I am not sure that is worth a Nobel Prize - I rather think it is due to my
hardware and software setup - but on my ES node it is quite usual to see
writes of >200 MB/sec to SSD while bulk indexing. I admit this is not the
maximum drive performance, but there are other factors which are outside
the scope of Elasticsearch (e.g. JVM buffers, file system journaling,
random vs. sequential writes, interleaving of reads and writes, fair
queue scheduling of the VM, bus speed, controller configuration, cache
settings etc.), so with normal methods it cannot go much higher.

Your question was about the number of shards for SSD. Really - the shard
limit on an ES cluster is the same whether it runs on HDD or on SSD.

Maybe you wanted to ask about the performance gains of SSD, in which case I
misunderstood the question completely.

Jörg


On Fri, Oct 10, 2014 at 12:34 AM, Kevin Burton wrote:

>
>
> On Wednesday, October 8, 2014 12:07:30 AM UTC-7, Jörg Prante wrote:
>>
>> With ES, you can go up to the bandwidth limit the OS allows for writing
>> I/O (if you disable throttling etc.)
>>
>> This means that writing to one shard can be as fast, in aggregate, as
>> writing to thousands of shards in parallel. There is an OS limit for file
>> system buffers, so the more shards, the more RAM is recommended.
>>
>> If the OS restricts file descriptor limits and you want to write to shards,
>> you can estimate you need a peak of 100-200 file descriptors for active
>> merges etc. (this varies from version to version and from setting to
>> setting). ES does not impose a limit here.
>>
>> These resource demands and custom configurations are not related to the
>> choice of SSD or HDD.
>>
>>
> There's no way that's true... for example, if you are on HDD, and your
> index is not in memory, AND you're serving queries (which is a realistic
> use case) then there is NO way you can write at the full IO of the disk.
> It's just physically impossible.
>
> If ES has been able to solve that problem then they could win a Nobel
> Prize :-p
>
> SSD is 2-3 orders of magnitude faster than HDD ... so yes, it is related
> to the choice of SSD or HDD.
>
> ... it's entirely possible I'm misinterpreting what you're saying though.
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/3b40a167-8f82-4f60-9833-8e3bce77370f%40googlegroups.com
> <https://groups.google.com/d/msgid/elasticsearch/3b40a167-8f82-4f60-9833-8e3bce77370f%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
> For more options, visit https://groups.google.com/d/optout.
>
