One of the main uses of a data-less node is to act as a coordinator between the other nodes. It gathers the responses from the other nodes/shards and reduces them into one.
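To make the gather-and-reduce step concrete, here is a toy sketch in Python. It is not Elasticsearch's actual code: the function names and the (score, doc_id) tuples are made up for illustration, and real shards do far more work than a sort. It only shows where the work lands: each shard builds its own top-`size` list on the data node, and the coordinator merely merges those lists.

```python
import heapq
import itertools
from typing import Dict, List, Tuple

Hit = Tuple[float, str]  # (score, doc_id), hypothetical shape for illustration


def shard_top_hits(shard_docs: Dict[str, float], size: int) -> List[Hit]:
    """Runs on the data node: rank the shard's local docs, keep the top `size`."""
    return heapq.nlargest(size, ((score, doc_id) for doc_id, score in shard_docs.items()))


def coordinator_reduce(per_shard_hits: List[List[Hit]], size: int) -> List[Hit]:
    """Runs on the coordinating node: merge the shard lists, keep the top `size`.

    Each input list must already be sorted from highest to lowest score.
    """
    merged = heapq.merge(*per_shard_hits, reverse=True)
    return list(itertools.islice(merged, size))


# Two shards: the coordinator has real merging to do.
shard_a = shard_top_hits({"a": 0.9, "b": 0.4}, size=3)
shard_b = shard_top_hits({"c": 0.7, "d": 0.1}, size=3)
two_shard_result = coordinator_reduce([shard_a, shard_b], size=3)

# One shard: the reduce step just hands back what the data node produced.
one_shard_result = coordinator_reduce([shard_a], size=3)
```

With a single source of hits, `coordinator_reduce` returns exactly the list it was given, so the ranking cost (and the memory for a large `size`) stays on the node that holds the data.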
In your case, the data-less node is gathering all the data from just one node. In other words, it is not doing much, since the reduce phase is basically a pass-through operation. With a two-node cluster, I would say you are better off having both machines act as full nodes.

Cheers,

Ivan

On Fri, Mar 21, 2014 at 5:04 AM, Rujuta Deshpande <[email protected]> wrote:

> Hi,
>
> I am setting up a system consisting of elasticsearch-logstash-kibana for log analysis. I am using one machine (2 GB RAM, 2 CPUs) running logstash, kibana and two instances of elasticsearch. Two other machines, each running logstash-forwarder, are pumping logs into the ELK system.
>
> The reasoning behind using two ES instances was this - I needed one uninterrupted instance to index the incoming logs and I also needed to query the currently existing indices. However, I didn't want any complex querying to result in loss of events owing to Out of Memory Errors because of excessive querying.
>
> So, one elasticsearch node was master = true and data = true, which did the indexing (called the writer node), and the other node was master = false and data = false (this was the workhorse or reader node).
>
> I assumed that, in cases of excessive querying, although the data is stored on the writer node, the reader node will query the data and all the processing will take place on the reader, as a result of which issues like out of memory errors etc. will be avoided and uninterrupted indexing will take place.
>
> However, while testing this, I realized that the reader hardly uses the heap memory (checked this in Marvel), and when I fire a complex search query - which was a search request using the python API where the 'size' parameter was set to 10000 - the writer node throws an out of memory error, indicating that the processing also takes place on the writer node only. My min and max heap size was set to 256m for this test.
> I also ensured that I was firing the search query to the port on which the reader node was listening (Port 9200). The writer node was running on Port 9201.
>
> Was my previous understanding of the problem incorrect - i.e. having one reader and one writer node doesn't help in uninterrupted indexing of documents? If this is so, what is the use of having a separate workhorse or reader node?
>
> My eventual aim is to be able to query elasticsearch and fetch large amounts of data at a time without interrupting/slowing down the indexing of documents.
>
> Thank you.
>
> Rujuta
