Re: Performance comparison between Grok and Java regex

2018-07-11 Thread Muhammed Irshad
Thanks Simon. Let me try these out and benchmark performance. On Wed, Jul 11, 2018 at 9:07 PM, Simon Elliston Ball < si...@simonellistonball.com> wrote: > A streaming token parser might well get you good performance for that > format... maybe something like an antlr grammar or even a simple

Re: Performance comparison between Grok and Java regex

2018-07-11 Thread Simon Elliston Ball
A streaming token parser might well get you good performance for that format... maybe something like an antlr grammar or even a simple scanner. Regex is not the only pattern :) It would also be great to see such a parser contributed back to the community if possible, and I am sure we would be
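
To illustrate the "simple scanner" idea, here is a minimal sketch of a single-pass tokenizer that uses no regex. The class name LineScanner, the whitespace-delimited layout, and the double-quote handling are assumptions made for illustration only; they are not the actual AD record format from this thread, and the sample line in main is made up.

```java
import java.util.ArrayList;
import java.util.List;

public class LineScanner {

    // Splits a line into tokens in one pass over its characters.
    // Assumed (hypothetical) format: fields separated by spaces or tabs,
    // with double quotes grouping a field that contains whitespace.
    static List<String> scan(String line) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        boolean hasToken = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                // Toggle quoted mode; the quote itself is not kept.
                inQuotes = !inQuotes;
                hasToken = true;
            } else if (!inQuotes && (c == ' ' || c == '\t')) {
                // Unquoted whitespace ends the current token, if any.
                if (hasToken) {
                    tokens.add(current.toString());
                    current.setLength(0);
                    hasToken = false;
                }
            } else {
                current.append(c);
                hasToken = true;
            }
        }
        if (hasToken) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Made-up sample line, not a real AD record.
        System.out.println(scan("12/02/2017 LOGON \"John Smith\""));
    }
}
```

A scanner like this touches each character once and does no backtracking, which is where a hand-written parser can gain over a general pattern matcher. Whether it does in practice would still need to be benchmarked against the real records.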

Re: Performance comparison between Grok and Java regex

2018-07-11 Thread Muhammed Irshad
Otto Fowler, Yes, I am OK with the trade-offs. In the case of Active Directory log records, can I parse them using a non-regex custom parser? I think we would still need a pattern-matching library, right, since it is plain text? A dummy AD record for my use case would look like the one below. 12/02/2017

Re: Performance comparison between Grok and Java regex

2018-07-11 Thread Otto Fowler
I am not saying it is faster, just giving some info. Also, that part of the documentation is not referring to regex vs. grok, but grok versus a custom non-regex parser, at least as I read it. If you have the ability to build, deploy, test and maintain a custom parser (unless you will be

Re: Performance comparison between Grok and Java regex

2018-07-11 Thread Muhammed Irshad
Otto Fowler, Thanks for the reply. I saw that it uses the same Java regex under the hood. I became a bit sceptical after seeing this open issue in java-grok, which says grok is much slower than pure regex. The fix is not available yet in Metron, as it

Re: Performance comparison between Grok and Java regex

2018-07-11 Thread Otto Fowler
Java-Grok IS Java regex. It is just a DSL over Java regex. It takes grok expressions (which can reference other expressions and be compound), parses/resolves them, and then builds one big regex out of them. Also, Groks, once parsed/used, are re-used, so at that point they are like compiled
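
The expansion described above can be sketched roughly as follows. This is not the java-grok implementation or its API: the class MiniGrok, its pattern dictionary, and the cache are hypothetical names chosen to show the general idea of resolving %{NAME:field} references into one java.util.regex.Pattern and reusing the compiled result.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MiniGrok {

    // Hypothetical pattern dictionary; definitions may reference each other.
    private static final Map<String, String> DEFINITIONS = new HashMap<>();
    static {
        DEFINITIONS.put("INT", "[0-9]+");
        DEFINITIONS.put("WORD", "\\w+");
        DEFINITIONS.put("DATE", "%{INT}/%{INT}/%{INT}");
    }

    // Matches %{NAME} or %{NAME:field}.
    private static final Pattern REFERENCE =
            Pattern.compile("%\\{(\\w+)(?::(\\w+))?\\}");

    // Compiled patterns are cached so each expression is compiled only once.
    private static final Map<String, Pattern> CACHE = new ConcurrentHashMap<>();

    // Recursively replaces every reference with its definition,
    // producing a single plain Java regex string.
    static String expand(String expression) {
        Matcher m = REFERENCE.matcher(expression);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1);
            String field = m.group(2);
            String definition = DEFINITIONS.get(name);
            if (definition == null) {
                throw new IllegalArgumentException("Unknown pattern: " + name);
            }
            String body = expand(definition);
            // A field name becomes a named capturing group; otherwise non-capturing.
            String replacement = (field != null)
                    ? "(?<" + field + ">" + body + ")"
                    : "(?:" + body + ")";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    static Pattern compile(String expression) {
        return CACHE.computeIfAbsent(expression, e -> Pattern.compile(expand(e)));
    }

    public static void main(String[] args) {
        Pattern p = compile("%{DATE:date} %{WORD:user}");
        // Made-up input line for illustration.
        Matcher m = p.matcher("12/02/2017 alice");
        if (m.matches()) {
            System.out.println(m.group("date") + " " + m.group("user"));
        }
    }
}
```

Once the expression has been expanded and compiled, matching is ordinary Java regex matching, so any difference from a hand-written regex would come from the shape of the generated pattern and from per-match bookkeeping, not from a different engine.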

Re: Performance comparison between Grok and Java regex

2018-07-11 Thread Muhammed Irshad
Thanks a lot Kevin for replying. Which thread are you referring to? The Stack Overflow link? I could not see any such option. On Wed, Jul 11, 2018 at 3:04 PM, Kevin Waterson wrote: > Like the thread says, the two regex engines are wildly different, however.. > you can increase the threads using

Re: Performance comparison between Grok and Java regex

2018-07-11 Thread Kevin Waterson
Like the thread says, the two regex engines are wildly different; however, you can increase the number of threads using the -w option in grok. Kevin On Wed, Jul 11, 2018 at 5:35 PM Muhammed Irshad wrote: > Hi All, > > I am trying to write Java custom parser for parsing AD logs. I am

Performance comparison between Grok and Java regex

2018-07-11 Thread Muhammed Irshad
Hi All, I am trying to write a custom Java parser for AD logs. I am expecting a log flow of 10 million AD events per second. Does parsing with Java regex offer a performance benefit over using the Grok parser? Are there any performance benchmarks or insights on this? I found this