I agree. I thought elasticsearch_http was actually the recommended route. Also, I have seen no reported issues with different client/server versions since 1.0. My current Logstash setup (which is not production level, simply a dev logging tool) uses Elasticsearch 1.2.1 with Logstash 1.4.1 over the non-HTTP interface.
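For reference, a minimal sketch of the two output styles in a Logstash 1.4-era config; the hostname, cluster name, and index pattern below are placeholders, not values taken from this thread:

    # Node/transport protocol: uses the ES client library embedded in Logstash,
    # so it is sensitive to version skew between Logstash and the cluster.
    output {
      elasticsearch {
        host    => "es-node-1"       # placeholder hostname
        cluster => "my-es-cluster"   # placeholder cluster name
      }
    }

    # HTTP protocol: plain REST calls to port 9200, far less sensitive to
    # client/server version differences.
    output {
      elasticsearch_http {
        host  => "es-node-1"                  # placeholder hostname
        index => "logstash-%{+YYYY.MM.dd}"    # the usual daily index pattern
      }
    }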
--
Ivan

On Fri, Jun 20, 2014 at 3:29 PM, Mark Walkom <[email protected]> wrote:

> I wasn't aware that the elasticsearch_http output wasn't recommended? When I
> spoke to a few of the ELK devs a few months ago, they indicated that there was
> minimal performance difference, with the added benefit of not being locked to
> specific LS+ES versioning.
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: [email protected]
> web: www.campaignmonitor.com
>
>
> On 21 June 2014 02:43, Brian <[email protected]> wrote:
>
>> Thomas,
>>
>> Thanks for your insights and experiences. As someone who has explored and used
>> ES for over a year but is relatively new to the ELK stack, I find your data
>> points extremely valuable. Let me offer some of my own views.
>>
>> Re: double the storage. I strongly recommend that ELK users disable the _all
>> field. The entire text of a log event generated by Logstash ends up in the
>> message field (not @message, as many people incorrectly post), so the _all
>> field is just redundant overhead with no added value. Disabling it gives a
>> dramatic drop in database file sizes and a dramatic increase in load
>> performance. Of course, you then need to configure ES to use the message field
>> as the default field for Lucene queries from Kibana (a sketch of an index
>> template that does both is included below).
>>
>> During the year that I've used ES and watched this group, I have been on the
>> front line of a brand-new product with a smart and dedicated development team
>> working steadily to improve it. Six months ago the ELK stack eluded me, and
>> reports weren't encouraging (with the sole exception of the Kibana web site's
>> marketing pitch). But ES has come a long way since then, and the ELK stack is
>> much more closely integrated.
>>
>> The Splunk UI is carefully crafted to isolate users from each other and to
>> prevent external users (external to the Splunk DB itself, not to our company)
>> from causing harm to data. But Kibana seems to be meant for a small cadre of
>> trusted users. What if I write a dashboard with the same name as someone
>> else's? Kibana doesn't even begin to discuss user isolation. But I am confident
>> that it will.
>>
>> How can I tell Kibana to set the default Lucene query operator to AND instead
>> of OR? Google is not my friend: I keep getting references to the Ruby versions
>> of Kibana, and that's ancient history by now. Kibana is cool and promising, but
>> it has a long way to go before it can be rolled out to all of the folks in our
>> company who currently have access to Splunk.
>>
>> Logstash has a nice book that's been very helpful, and Logstash itself has been
>> an excellent tool for prototyping. The book has been invaluable in helping me
>> extract dates from log events and handle all of our different multiline events.
>> But it still doesn't explain why the date filter needs its own array of match
>> strings to parse the date that the grok filter has already matched and isolated
>> (see the filter sketch below). And recommendations to avoid the
>> elasticsearch_http output and use the elasticsearch output (via the Node
>> client) directly contradict the fact that Logstash's bundled 1.1.1 version of
>> the ES client library is not compatible with the most recent 1.2.1 version of
>> ES.
>>
>> And Logstash is also a resource hog, so we eventually plan to replace it with
>> Perl and Apache Flume (already in use) and pipe the data into my Java bulk load
>> tool (which is always kept up to date with the versions of ES we deploy!).
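A minimal sketch of the _all/default-field change described above, assuming an ES 1.x index template applied to the usual logstash-* indices (the template name "logstash_noall" and the localhost endpoint are placeholders):

    curl -XPUT 'http://localhost:9200/_template/logstash_noall' -d '
    {
      "template": "logstash-*",
      "settings": {
        "index.query.default_field": "message"
      },
      "mappings": {
        "_default_": {
          "_all": { "enabled": false }
        }
      }
    }'

And a sketch of the grok-then-date handoff mentioned above; the pattern and the field name "logts" are illustrative, not taken from this thread:

    filter {
      grok {
        # grok isolates the timestamp text into its own field
        match => [ "message", "%{TIMESTAMP_ISO8601:logts} %{GREEDYDATA:msgbody}" ]
      }
      date {
        # the date filter then parses that string into @timestamp,
        # which is why it needs its own format pattern(s)
        match => [ "logts", "ISO8601" ]
      }
    }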
>> Because we send the data via Flume to our data warehouse, any losses in ES
>> will be annoying but won't be catastrophic. And the front-end following of
>> rotated log files will be done with GNU tail and its uppercase -F option, which
>> follows rotated log files perfectly. I doubt that Logstash can do the same, and
>> we currently see that neither can Splunk (so we sporadically lose log events in
>> Splunk too). So GNU tail -F piped into Logstash with the stdin input works
>> perfectly in my evaluation setup and will likely form the first stage of any
>> log forwarder we end up deploying (a sketch of that pipeline is below).
>>
>> Brian
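A minimal sketch of that tail -F pipeline, assuming Logstash 1.4 and a config file named tail.conf; the log path and the hostname are placeholders:

    # tail.conf -- read events from stdin, one line per event
    input {
      stdin { }
    }
    output {
      elasticsearch_http {
        host => "es-node-1"   # placeholder hostname
      }
    }

invoked roughly as:

    # uppercase -F keeps following the file across rotation, unlike lowercase -f
    tail -F /var/log/myapp/app.log | bin/logstash agent -f tail.conf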
>> On Thursday, June 19, 2014 8:48:34 AM UTC-4, Thomas Paulsen wrote:
>>>
>>> We had a 2.2 TB/day installation of Splunk and ran it on VMware with 12
>>> indexers and 2 search heads. Each indexer had 1,000 IOPS guaranteed. The
>>> system is slow but OK to use.
>>>
>>> We tried Elasticsearch and were able to get the same performance with the
>>> same number of machines. Unfortunately, with Elasticsearch you need almost
>>> double the amount of storage, plus a LOT of patience to make it run. It took
>>> us six months to set it up properly, and even now the system is quite buggy
>>> and unstable, and from time to time we lose data with Elasticsearch.
>>>
>>> I don't recommend ELK for a critical production system; for dev work it is
>>> OK, if you don't mind the hassle of setting it up and operating it. The costs
>>> you save by not buying a Splunk license you have to invest in consultants to
>>> get it up and running. Our dev teams hate Elasticsearch and prefer Splunk.
>>>
>>> On Saturday, 19 April 2014 00:07:44 UTC+2, Mark Walkom wrote:
>>>>
>>>> That's a lot of data! I don't know of any installations that big, but
>>>> someone else might.
>>>>
>>>> What sort of infrastructure are you running Splunk on now, and what's your
>>>> current and expected retention?
>>>>
>>>> Regards,
>>>> Mark Walkom
>>>>
>>>> Infrastructure Engineer
>>>> Campaign Monitor
>>>> email: [email protected]
>>>> web: www.campaignmonitor.com
>>>>
>>>>
>>>> On 19 April 2014 07:33, Frank Flynn <[email protected]> wrote:
>>>>
>>>>> We have a large Splunk instance. We load about 1.25 TB of logs a day. We
>>>>> have about 1,300 loaders (servers that collect and load logs - they may do
>>>>> other things too).
>>>>>
>>>>> As I look at Elasticsearch / Logstash / Kibana, does anyone know of a
>>>>> performance comparison guide? Should I expect to run on very similar
>>>>> hardware? More? Or less?
>>>>>
>>>>> Sure, it depends on exactly what we're doing, the exact queries and the
>>>>> frequency we'd run them, but I'm trying to get some kind of idea before we
>>>>> start.
>>>>>
>>>>> Are there any white papers or other documents about switching? It seems an
>>>>> obvious choice, but I can find very few performance comparisons (I did see
>>>>> that Elasticsearch just hired "the former VP of Products at Splunk, Gaurav
>>>>> Gupta" - but there were few numbers in that article either).
>>>>>
>>>>> Thanks,
>>>>> Frank
