There is not much to add. Estimating ES process memory really depends on your individual requirements (bulk indexing, field cache, filters/facets, concurrent queries). Take a representative portion of your data, measure memory/CPU/disk I/O, and extrapolate; if resources get tight, the best remedy is to add nodes. The rule of thumb is to give 50% of RAM to the ES heap.
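As a quick sketch of that rule of thumb (the 16 GB figure is taken from the mail below; ES_HEAP_SIZE is the environment variable the ES 1.x startup scripts read):

```shell
# Apply the 50%-of-RAM rule of thumb for the ES heap.
# The 16 GB value is from the original post; adjust for your box.
RAM_MB=16384
HEAP_MB=$((RAM_MB / 2))             # half of RAM for the ES heap
echo "ES_HEAP_SIZE=${HEAP_MB}m"     # -> ES_HEAP_SIZE=8192m
export ES_HEAP_SIZE="${HEAP_MB}m"   # picked up by bin/elasticsearch (ES 1.x)
```

Then watch heap usage under a realistic load before settling on a final value.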
- You are correct: primarycache=all may buffer more data than required (it is useful for maximum ZFS performance), and you have already limited the ARC size. Use mmapfs for the ES store; this should work best with primarycache=metadata.
- With ES, the important thing is to match the ZFS recordsize to the kernel page size and the sector size of the drive, so there is no skew in the number of I/O operations; for a JVM app like ES using mmapfs, that usually means 4k (note the ZFS default is 128K). Check for yourself whether higher values like 8k / 16k / 64k / 256k give better throughput on the ES data folder. On certain striped HW RAID devices that may be the case, but I doubt it (ZFS internal buffering compensates for this effect, and write throughput will suffer if the recordsize is too high).
- You should also switch off atime on the ES data folder.

Jörg

On Tue, May 13, 2014 at 7:39 AM, Patrick Proniewski <[email protected]> wrote:

> Hello,
>
> I'm running an Elasticsearch node on a FreeBSD server, on top of ZFS
> storage. For now I've considered that ES is smart and manages its own
> cache, so I've disabled primary cache for data, leaving only metadata
> being cacheable. The last thing I want is to have data cached twice, one
> time in the ZFS ARC and a second time in the application's own cache.
> I've also disabled compression:
>
> $ zfs get compression,primarycache,recordsize zdata/elasticsearch
> NAME                 PROPERTY      VALUE     SOURCE
> zdata/elasticsearch  compression   off       local
> zdata/elasticsearch  primarycache  metadata  local
> zdata/elasticsearch  recordsize    128K      default
>
> It's a general purpose server (web, mysql, mail, ELK, etc.). I'm not
> looking for absolute best ES performance, I'm looking for the best use
> of my resources.
> I have 16 GB RAM, and I plan to put a limit on the ARC size (currently
> consuming 8.2 GB RAM) so I can mlockall ES memory. But I don't think
> I'll go the RAM-only storage route
> (<http://jprante.github.io/applications/2012/07/26/Mmap-with-Lucene.html>)
> as I'm running only one node.
>
> How can I estimate the amount of memory I must allocate to the ES
> process?
>
> Should I switch primarycache=all back on despite ES already caching
> data?
>
> What is the best ZFS record/block size to accommodate
> Elasticsearch/Lucene I/Os?
>
> Thanks,
> Patrick
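To summarize, the settings discussed in this thread could be applied roughly like this (a sketch only; the dataset name zdata/elasticsearch is taken from Patrick's mail, so verify each step against your own ZFS and ES versions):

```shell
# ZFS side: cache only metadata (ES/Lucene caches data itself),
# no access-time updates, recordsize matched to the 4k kernel page size.
zfs set primarycache=metadata zdata/elasticsearch
zfs set atime=off zdata/elasticsearch
zfs set recordsize=4k zdata/elasticsearch   # only affects newly written files

# ES side (ES 1.x), in config/elasticsearch.yml:
#   index.store.type: mmapfs
#   bootstrap.mlockall: true   # requires a fixed heap via ES_HEAP_SIZE
```

Note that changing recordsize does not rewrite existing files, so benchmark throughput on freshly indexed data when comparing 4k against higher values.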
