Hello,

After the recent discussion about rsyslog sending logs to ElasticSearch
through the bulk indexing API, I experimented with the current plugin. First,
let me say that I really appreciate the work that Nathan did on the
omelasticsearch plugin, and that it will work fine for many use cases.
However, the current omelasticsearch/rsyslog integration has a few
fundamental limitations:

- omelasticsearch uses curl to make the API calls to ES. The downside is that
you have to specify a hostname. ES supports cluster auto-discovery as well as
failover, so if the host omelasticsearch is using goes down, the cluster may
still be fully functional, but omelasticsearch won't be able to find it. Of
course, you could add other cluster members as failover actions, but that
would mean a config change every time your ES topology changes.

- By default, curl returns only 16KB of the HTTP response. This response
reports which messages were successfully inserted into ES and which failed.
For a large batch of messages, the response could easily exceed the 16KB
limit. Handling it would require running a custom-compiled version of curl
that raises this limit.

- "Pushing" to ES seems to work much less reliably than having ES "pull"
messages. For similarly small batches (~250 messages), ES would often take
6-8ms for the bulk insert. However, it would occasionally spike up to 6000ms,
which caused quite a backlog in the queue. Having ES "pull" messages instead
(more on this below) worked much more consistently.

- Finally, I'm a bit confused about how rsyslog receives commit errors with
the new transactional plugin system. If there's a batch of 5 messages, and
only message 4 is successfully committed during endTransaction, how would one
convey that information back to rsyslog? I know Radu mentioned calling a
program with omprog and sending messages to ES from there, but in my setup
data integrity is paramount, and I don't want to re-implement rsyslog's
reliable message delivery and failover systems.
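
To make the per-message outcome concrete, here is a minimal Python sketch of
how a sender could work out which messages in a batch were rejected. It
assumes the bulk response is a JSON object with an "items" list in the same
order as the request, and that a rejected item carries an "error" field. The
function name and the sample data are hypothetical; this is not how
omelasticsearch does it.

```python
import json


def failed_bulk_items(response_body, batch):
    """Return (position, message, error) for each item the bulk call rejected.

    Assumes one entry in "items" per submitted message, in request order,
    and an "error" field on entries that were not indexed.
    """
    response = json.loads(response_body)
    failures = []
    for position, item in enumerate(response.get("items", [])):
        # Each item is a one-key object such as {"index": {...}}.
        (result,) = item.values()
        if "error" in result:
            failures.append((position, batch[position], result["error"]))
    return failures


# Hypothetical three-message batch in which the second message is rejected.
sample_batch = ["msg one", "msg two", "msg three"]
sample_response = json.dumps({
    "took": 7,
    "items": [
        {"index": {"_id": "1"}},
        {"index": {"_id": "2", "error": "mapping failure"}},
        {"index": {"_id": "3"}},
    ],
})

for position, message, error in failed_bulk_items(sample_response, sample_batch):
    print(position, message, error)
```

Only the rejected positions would need to be handed back for retry, which is
the information the whole response has to be read to recover.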

The method that I'm currently stress-testing is using the ElasticSearch
River[1] with a RabbitMQ[2] type. With this setup, rsyslog sends messages to a
RabbitMQ queue. ElasticSearch is configured with the queue's information, and
it periodically pulls messages from that queue. Once it has the messages, it
bulk indexes them. If the master ES node goes down, the new master starts
pulling messages from the queue. Overall, it seems to work well, and the
indexing throughput seems higher, because messages aren't pushed to ES while
it's very busy.
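
As a rough illustration of what a producer would put on the queue, the Python
sketch below serializes a batch of log events into the bulk API's
newline-delimited format (one action line followed by one source line per
event). It assumes the river consumes queue messages in that format; the
index name "logs", the type "syslog", and the event fields are hypothetical.

```python
import json


def to_bulk_payload(events, index, doc_type):
    """Serialize events as newline-delimited action/source line pairs."""
    if not events:
        return ""
    lines = []
    for event in events:
        # Action line: tells ES to index the document that follows.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        # Source line: the log event itself.
        lines.append(json.dumps(event))
    # The bulk format expects a newline after the last line as well.
    return "\n".join(lines) + "\n"


# Hypothetical events; a real producer would publish this string to RabbitMQ.
payload = to_bulk_payload(
    [
        {"host": "web1", "message": "login ok"},
        {"host": "web2", "message": "disk full"},
    ],
    index="logs",
    doc_type="syslog",
)
print(payload, end="")
```

Publishing the resulting string to the queue is left out here, since it
depends on the AMQP client in use.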

Unfortunately, I can't find any rsyslog plugin for RabbitMQ, so I'm currently
bouncing my messages through a logstash[3] server. Does anyone know of such a
plugin? I suspect the zeromq plugins might be a good starting point, though
I'm not sure how much would have to be rewritten to send to RabbitMQ instead.

Those were my experiences. I hope some of this proves useful to others looking
into ElasticSearch.

-- 
Vlad Grigorescu | IT Security Engineer
University of Illinois at Urbana-Champaign
Office of Privacy and Information Assurance

[1] - <http://www.elasticsearch.org/guide/reference/river/>
[2] - <https://github.com/elasticsearch/elasticsearch-river-rabbitmq>
[3] - <http://logstash.net/docs/1.1.0/outputs/elasticsearch_river>