While it is possible to create an ES cluster with dedicated reader/writer nodes, this is not the default, and in many cases dedicating nodes is not required at all. ES has heuristics built in to relieve the admin of tedious jobs like setting up dedicated nodes.
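As a rough illustration of why the two-node setup in the quoted question does not separate reading from writing, here is a minimal Python sketch. It assumes the Elasticsearch 1.x settings node.master and node.data from elasticsearch.yml; node_role is a hypothetical helper written for this example, not part of any Elasticsearch client, and the role descriptions are simplified.

```python
# Sketch only: maps the node.master / node.data flags (Elasticsearch 1.x
# elasticsearch.yml settings) to the kinds of work a node can take on.
# node_role is a hypothetical helper, not an Elasticsearch API.

def node_role(master: bool, data: bool) -> dict:
    return {
        "master_eligible": master,        # may be elected to manage cluster state
        "holds_shards": data,             # stores index data
        "indexes_documents": data,        # indexing runs where the shards live
        "runs_shard_level_search": data,  # per-shard query work runs there too
        "coordinates_requests": True,     # simplification: any node can accept
                                          # a request and forward it
    }

# The two nodes described in the quoted question.
writer = node_role(master=True, data=True)
reader = node_role(master=False, data=False)

for name, role in (("writer", writer), ("reader", reader)):
    print(name, role)
```

In this simplified model the "reader" only coordinates, so both indexing and shard-level search end up on the one node that holds the shards.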

So I wonder what you mean by a reader or a writer node. Note that if you connect a client to a node, this node does not necessarily do the heavy work; it automatically forwards the requests to the nodes that hold the shards.

You should use a replica level > 0 to distribute the query load. Replicas duplicate shards for exactly this reason: to allow search requests to be distributed across more nodes, and to provide some resilience in case of node failures.

A 256m heap is very small for the massive Elasticsearch filter queries that Kibana uses.

Jörg

On Fri, Mar 21, 2014 at 1:04 PM, Rujuta Deshpande <[email protected]> wrote:

> Hi,
>
> I am setting up a system consisting of elasticsearch-logstash-kibana for
> log analysis. I am using one machine (2 GB RAM, 2 CPUs) running logstash,
> kibana and two instances of elasticsearch. Two other machines, each
> running logstash-forwarder, are pumping logs into the ELK system.
>
> The reasoning behind using two ES instances was this - I needed one
> uninterrupted instance to index the incoming logs and I also needed to
> query the currently existing indices. However, I didn't want any complex
> querying to result in loss of events owing to Out of Memory Errors because
> of excessive querying.
>
> So, one elasticsearch node was master = true and data = true which did
> the indexing (called the writer node) and the other node was master =
> false and data = false (this was the workhorse or reader node).
>
> I assumed that, in cases of excessive querying, although the data is
> stored on the writer node, the reader node will query the data and all the
> processing will take place on the reader, as a result of which issues like
> out of memory errors etc. will be avoided and uninterrupted indexing will
> take place.
>
> However, while testing this, I realized that the reader hardly uses the
> heap memory (checked this in Marvel) and when I fire a complex search
> query - which was a search request using the python API where the 'size'
> parameter was set to 10000 - the writer node throws an out of memory error,
> indicating that the processing also takes place on the writer node only. My
> min and max heap size was set to 256m for this test. I also ensured that I
> was firing the search query to the port on which the reader node was
> listening (Port 9200). The writer node was running on Port 9201.
>
> Was my previous understanding of the problem incorrect - i.e. having one
> reader and one writer node doesn't help in uninterrupted indexing of
> documents? If this is so, what is the use of having a separate workhorse or
> reader node?
>
> My eventual aim is to be able to query elasticsearch and fetch large
> amounts of data at a time without interrupting/slowing down the indexing of
> documents.
>
> Thank you.
>
> Rujuta

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAKdsXoHZZXV2xm6JrUuV8V-Sg1uLhehqQ68Bn_2SRpJ1ZAvuVg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
