Hi Debraj,


A couple things you could try.



Given your design

https://chart.googleapis.com/chart?chl=digraph+G+%7B%0D%0A+++rankdir%3DLR%3B%0D%0A+++service1LSFWD+-%3E+LS+-%3E+Kafka+-%3E+LSELK+-%3E+ES+-%3E+Kibana%0D%0A+++service2LSFWD+-%3E+LS%0D%0A+++service3LSFWD+-%3E+LS%0D%0A+++service4LSFWD+-%3E+LS%0D%0A%7D&cht=gv







1)      Try splitting services apart.  I do this by tagging each input, then 
using a condition in the output to send the data to a different Kafka topic:



input {
        file {
                path => "/var/log/hadoop/hdfs/hadoop-hdfs-namenode-*.log"
                tags => ["namenode-log"]
                …
        }
}

input {
        file {
                path => "/var/log/hadoop/hdfs/hadoop-hdfs-namenode-*.out"
                tags => ["namenode-out"]
                …
        }
}

output {
        if "namenode-log" in [tags] {
                kafka {
                        topic_id => "hadoop-namenode-log"
                        …
                }
        }
}

output {
        if "namenode-out" in [tags] {
                kafka {
                        topic_id => "hadoop-namenode-out"
                        …
                }
        }
}





2)      Try having more partitions in Kafka, so that LS fans out more

3)      Try adding more workers to the kafka output module 
(https://www.elastic.co/guide/en/logstash/current/plugins-outputs-kafka.html#plugins-outputs-kafka-workers)

4)      Try having the service nodes write directly to Kafka, using 
full Logstash instead of Logstash Forwarder.  That removes the intermediate 
LS hop as a bottleneck.
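
To make points 1) and 2) a bit more concrete, here is a rough Python sketch 
(a toy model, not Logstash or Kafka client code).  The names TOPIC_BY_TAG, 
route_topic and assign_partitions are made up for illustration, and the 
round-robin assignment is a simplification, not Kafka's actual assignor.  
It relies only on the fact that within a consumer group each partition is 
read by at most one consumer.

```python
# Illustrative model only: not Logstash or Kafka client code.
# TOPIC_BY_TAG, route_topic and assign_partitions are hypothetical names.

# Mirrors the tag -> topic conditionals in the Logstash output section.
TOPIC_BY_TAG = {
    "namenode-log": "hadoop-namenode-log",
    "namenode-out": "hadoop-namenode-out",
}


def route_topic(tags):
    """Return the topic for the first known tag, or None if no tag matches."""
    for tag in tags:
        if tag in TOPIC_BY_TAG:
            return TOPIC_BY_TAG[tag]
    return None


def assign_partitions(num_partitions, num_consumers):
    """Spread partition ids over consumers round-robin (simplified model)."""
    assignment = {consumer: [] for consumer in range(num_consumers)}
    for partition in range(num_partitions):
        assignment[partition % num_consumers].append(partition)
    return assignment


if __name__ == "__main__":
    print(route_topic(["namenode-out"]))
    print(assign_partitions(1, 4))  # one partition: three consumers idle
    print(assign_partitions(8, 4))  # eight partitions: two per consumer
```

In this model, a topic with one partition keeps only one of four consumers 
busy, while eight partitions give each consumer two.  That is the sense in 
which more partitions let the consuming side fan out.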



Happy to try and help, as this stuff is currently on my mind.  Maybe some 
more details about which queues you’re seeing fill up?



Cheers,



Todd.



-----Original Message-----
From: D [mailto:subharaj.ma...@gmail.com]
Sent: Friday, November 27, 2015 4:55
To: users@kafka.apache.org
Subject: Fair Usage of Kafka Queues



Hi,



We have an ElasticSearch-Logstash-Kibana deployment in which multiple
logstash-forwarders (from multiple service logs) push logs to logstash,
which then sends them to Kafka, and then one more logstash pulls those logs
from Kafka and indexes them in the ElasticSearch cluster. Right now we are
using a single Kafka queue into which all the logs from all the services
go.



Can we configure Kafka in such a way that a single logstash-forwarder
(of a single service) does not hog the entire set-up? We want to ensure
that all services can use the set-up fairly. So basically I want to
ensure the Kafka queues are not always filled up by only one type of
message.



Thanks,

Debraj
