There is not much to add

- estimating ES process memory really depends on your individual workload
(bulk indexing, field cache, filters/facets, concurrent queries). Just take
a portion of your data, measure memory/CPU/disk I/O, and extrapolate. The
best option is to add nodes when resources get tight. The rule of thumb is
to give 50% of RAM to the ES heap
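As a quick back-of-the-envelope sketch of the 50% rule, assuming the 16 GB machine from this thread (the variable names are just for illustration):

```shell
# hypothetical sizing helper: give the ES heap half of physical RAM
total_mb=16384                 # 16 GB machine from this thread
heap_mb=$((total_mb / 2))      # 50% rule of thumb
echo "ES_HEAP_SIZE=${heap_mb}m"
```

Remember this is only a starting point to refine with the measurements above.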

- you are correct, primarycache=all may buffer more data than required
(though it is useful for maximum ZFS read performance). You have already
limited the ARC size. Use mmapfs for the ES store; it should work best with
primarycache=metadata
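For reference, that combination could be applied roughly like this (assuming the zdata/elasticsearch dataset from your output; index.store.type is the setting name in the ES 1.x config):

```shell
# keep only metadata in the ARC; let mmapfs map the index files directly
zfs set primarycache=metadata zdata/elasticsearch

# in config/elasticsearch.yml:
#   index.store.type: mmapfs
```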

- for JVM apps like ES, the ZFS recordsize should be reduced from the 128K
default; 4k is a good starting point. What matters with ES is matching the
ZFS recordsize to the kernel page size and the sector size of the drive, so
there is no skew in the number of I/O operations. Check for yourself
whether higher values like 8k / 16k / 64k / 256k give better throughput on
the ES data folder. On certain striped HW RAID devices that may be the
case, but I doubt it (ZFS internal buffering compensates for this effect,
and write throughput will suffer if the recordsize is too high)
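The I/O skew argument can be sketched with simple arithmetic: when the recordsize is larger than the 4k kernel page size, each random 4k read forces ZFS to fetch a whole record (ignoring caching and compression, and assuming the drive's sector size is not larger than a page):

```shell
# rough read amplification per random 4k read at various recordsizes
page_kb=4
for rs_kb in 4 8 16 64 128; do
  echo "recordsize=${rs_kb}k -> $((rs_kb / page_kb))x data read per 4k read"
done
```

This is why a larger recordsize can hurt random-read-heavy Lucene workloads even if sequential benchmarks look fine.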

- and you should switch off atime on the ES data folder
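That is a one-liner (again assuming your dataset name):

```shell
# stop updating access times on every read of the index files
zfs set atime=off zdata/elasticsearch
```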

Jörg



On Tue, May 13, 2014 at 7:39 AM, Patrick Proniewski <
[email protected]> wrote:

> Hello,
>
> I'm running an Elasticsearch node on a FreeBSD server, on top of ZFS
> storage. For now I've considered that ES is smart and manages its own
> cache, so I've disabled primary cache for data, leaving only metadata being
> cacheable. Last thing I want is to have data cached twice, one time is ZFS
> ARC and a second time in application's own cache. I've also disabled
> compression:
>
> $ zfs get compression,primarycache,recordsize  zdata/elasticsearch
> NAME                 PROPERTY      VALUE         SOURCE
> zdata/elasticsearch  compression   off           local
> zdata/elasticsearch  primarycache  metadata      local
> zdata/elasticsearch  recordsize    128K          default
>
> It's a general purpose server (web, mysql, mail, ELK, etc.). I'm not
> looking for absolute best ES performance, I'm looking for best use of my
> resources.
> I have 16 GB RAM, and I plan to put a limit to ARC size (currently
> consuming 8.2 GB RAM) so I can mlockall ES memory. But I don't think I'll
> go the RAM-only storage route (<
> http://jprante.github.io/applications/2012/07/26/Mmap-with-Lucene.html>)
> as I'm running only one node.
>
> How can I estimate the amount of memory I must allocate to ES process?
>
> Should I switch primarycache=all back on despite ES already caching data?
>
> What is the best ZFS record/block size to accommodate Elasticsearch/Lucene
> IOs?
>
> Thanks,
> Patrick
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/FBBA84AE-D610-4060-AFBC-FC7D5BA0803F%40patpro.net
> .
> For more options, visit https://groups.google.com/d/optout.
>
