Re: HBaseDao and IndexDao abstraction

2018-10-23 Thread Muhammed Irshad
Hi All,

I have found a solution for this using SHEW (Simple HBase Enrichment Writer),
which is documented in Confluence
<https://cwiki.apache.org/confluence/display/METRON/2016/06/16/Metron+Tutorial+-+Fundamentals+Part+6%3A+Streaming+Enrichment>
but not in the current Metron book documentation
<https://metron.apache.org/current-book/index.html>. I am going to give
this a try and see how it goes. Thanks a lot to Simon Elliston Ball
and the Metron Slack channel :)

On Thu, Oct 18, 2018 at 10:51 AM Muhammed Irshad 
wrote:

> Mike,
>
> Thanks for replying. I had already gone through it, and we are indexing our
> Active Directory logs to HDFS by streaming from Splunk. But I have a
> requirement to maintain an Active Directory asset inventory (just a list of
> assets and their status, not historic data) along with AD event indexing. So
> I thought of using HBase and was wondering where best to put this logic:
> in enrichment, by writing a custom Stellar function which populates an HBase
> column family for assets, or in the indexing layer. Then I saw the HBaseDao in
> the documentation and wanted to understand what it is and whether it can be
> used to meet my use case.
>
> On Tue, Oct 16, 2018 at 7:41 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
>> Hi Muhammed,
>>
>> I think you probably want to start with our parser infrastructure rather
>> than the DAOs for what you're doing. This series of blog posts gives a
>> use-case-driven walkthrough that should help shed some light on things:
>> Part 1 (start here) -
>>
>> https://cwiki.apache.org/confluence/display/METRON/2016/04/25/Metron+Tutorial+-+Fundamentals+Part+1%3A+Creating+a+New+Telemetry
>> TOC of the 7-part series -
>>
>> https://cwiki.apache.org/confluence/display/METRON/2016/06/22/Metron+Tutorial+-+Fundamentals+Part+7%3A+Dashboarding+with+Kibana
>>
>> Here are some details about our parser infrastructure -
>>
>> https://github.com/apache/metron/tree/master/metron-platform/metron-parsers
>> ...which feeds into the data enrichment topology -
>>
>> https://github.com/apache/metron/tree/master/metron-platform/metron-enrichment
>> ...which feeds into the indexing topology, which you've already found
>>
>> Hope this helps for a start!
>>
>> Best,
>> Mike Miklavcic
>>
>>
>> On Tue, Oct 16, 2018 at 12:05 AM Muhammed Irshad 
>> wrote:
>>
>> > Hi all,
>> >
>> > What is the actual use of the HBaseDao documented in the Metron indexing
>> > documentation
>> > <https://metron.apache.org/current-book/metron-platform/metron-indexing/index.html>
>> > under the section 'The IndexDao Abstraction'? From my reading, I understand
>> > it to be an HBase indexing implementation which can be combined with HDFS
>> > for updated data. What is the use of it, given that we cannot choose to
>> > index in HBase / HDFS dynamically? Can someone give an example of how to
>> > configure and use it (a link to more documentation or a reference is
>> > fine)? I have a use case where I need to maintain an Active Directory
>> > inventory using the AD event logs being indexed via Metron. Can HBaseDao
>> > be used for this use case?
>> >
>> > --
>> > Muhammed Irshad K T
>> > Senior Software Engineer
>> > +919447946359
>> > irshadkt@gmail.com
>> > Skype : muhammed.irshad.k.t
>> >
>>
>
>
> --
> Muhammed Irshad K T
> Senior Software Engineer
> +919447946359
> irshadkt@gmail.com
> Skype : muhammed.irshad.k.t
>


-- 
Muhammed Irshad K T
Senior Software Engineer
+919447946359
irshadkt@gmail.com
Skype : muhammed.irshad.k.t
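
The requirement quoted in this thread (a list of assets and their current
status, not historic data) amounts to a keyed upsert: each new event for an
asset overwrites that asset's previous status. A minimal sketch of that idea
follows, assuming an in-memory map as a stand-in for an HBase table. The class
AssetInventory and its methods are hypothetical names for illustration; this
is not the SHEW or HBase client API.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only: an in-memory stand-in for a key-value table.
final class AssetInventory {
    // Row key: normalised computer name. Value: latest known status.
    private final Map<String, String> statusByAsset = new ConcurrentHashMap<>();

    // Normalise the key so "DC1" and "dc1" refer to the same asset.
    private static String rowKey(String computerName) {
        return computerName.toLowerCase(Locale.ROOT);
    }

    // Upsert: a later event for the same asset replaces the earlier status.
    void onEvent(String computerName, String status) {
        statusByAsset.put(rowKey(computerName), status);
    }

    // Returns the latest status, or null if the asset has never been seen.
    String statusOf(String computerName) {
        return statusByAsset.get(rowKey(computerName));
    }

    int size() {
        return statusByAsset.size();
    }
}
```

Because only the latest value per key is kept, the inventory stays the size of
the asset list regardless of how many events flow through.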


Re: Invite to Slack Channel

2018-10-22 Thread Muhammed Irshad
Could someone send me the Slack channel link as well?
Thanks,
Muhammed Irshad
QBurst
www.qburst.com


On Wed, Oct 17, 2018 at 7:33 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> Sent
>
> On Wed, Oct 17, 2018 at 7:23 AM Tibor Meller 
> wrote:
>
> > Hi Guys,
> > Can you add me to the Apache Metron Slack channel?
> >
> > Thanks,
> >
> > On Thu, Oct 4, 2018 at 1:14 PM Otto Fowler 
> > wrote:
> >
> > > Done
> > >
> > >
> > > On October 4, 2018 at 05:35:06, Tamás Fodor (ftamas.m...@gmail.com)
> > wrote:
> > >
> > > Hello,
> > >
> > > Michael, can you add me as well?
> > >
> > > Thank you in advance!
> > >
> > > Tamas
> > >
> > > On Wed, Oct 3, 2018 at 4:27 PM Michael Miklavcic <
> > > michael.miklav...@gmail.com> wrote:
> > >
> > > > Sent
> > > >
> > > > On Wed, Oct 3, 2018 at 8:17 AM Shane Ardell <
> shane.m.ard...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hello everyone,
> > > > >
> > > > > Is it possible for someone to send me an invite to the Metron Slack
> > > > > channel?
> > > > >
> > > > > Regards,
> > > > > Shane
> > > > >
> > > >
> > >
> >
>


Re: HBaseDao and IndexDao abstraction

2018-10-17 Thread Muhammed Irshad
Mike,

Thanks for replying. I had already gone through it, and we are indexing our
Active Directory logs to HDFS by streaming from Splunk. But I have a
requirement to maintain an Active Directory asset inventory (just a list of
assets and their status, not historic data) along with AD event indexing. So
I thought of using HBase and was wondering where best to put this logic:
in enrichment, by writing a custom Stellar function which populates an HBase
column family for assets, or in the indexing layer. Then I saw the HBaseDao in
the documentation and wanted to understand what it is and whether it can be
used to meet my use case.

On Tue, Oct 16, 2018 at 7:41 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> Hi Muhammed,
>
> I think you probably want to start with our parser infrastructure rather
> than the DAOs for what you're doing. This series of blog posts gives a
> use-case-driven walkthrough that should help shed some light on things:
> Part 1 (start here) -
>
> https://cwiki.apache.org/confluence/display/METRON/2016/04/25/Metron+Tutorial+-+Fundamentals+Part+1%3A+Creating+a+New+Telemetry
> TOC of the 7-part series -
>
> https://cwiki.apache.org/confluence/display/METRON/2016/06/22/Metron+Tutorial+-+Fundamentals+Part+7%3A+Dashboarding+with+Kibana
>
> Here are some details about our parser infrastructure -
> https://github.com/apache/metron/tree/master/metron-platform/metron-parsers
> ...which feeds into the data enrichment topology -
>
> https://github.com/apache/metron/tree/master/metron-platform/metron-enrichment
> ...which feeds into the indexing topology, which you've already found
>
> Hope this helps for a start!
>
> Best,
> Mike Miklavcic
>
>
> On Tue, Oct 16, 2018 at 12:05 AM Muhammed Irshad 
> wrote:
>
> > Hi all,
> >
> > What is the actual use of the HBaseDao documented in the Metron indexing
> > documentation
> > <https://metron.apache.org/current-book/metron-platform/metron-indexing/index.html>
> > under the section 'The IndexDao Abstraction'? From my reading, I understand
> > it to be an HBase indexing implementation which can be combined with HDFS
> > for updated data. What is the use of it, given that we cannot choose to index
> > in HBase / HDFS dynamically? Can someone give an example of how to configure
> > and use it (a link to more documentation or a reference is fine)? I have a
> > use case where I need to maintain an Active Directory inventory using the AD
> > event logs being indexed via Metron. Can HBaseDao be used for this use case?
> >
> > --
> > Muhammed Irshad K T
> > Senior Software Engineer
> > +919447946359
> > irshadkt@gmail.com
> > Skype : muhammed.irshad.k.t
> >
>


-- 
Muhammed Irshad K T
Senior Software Engineer
+919447946359
irshadkt@gmail.com
Skype : muhammed.irshad.k.t


HBaseDao and IndexDao abstraction

2018-10-16 Thread Muhammed Irshad
Hi all,

What is the actual use of the HBaseDao documented in the Metron indexing
documentation
<https://metron.apache.org/current-book/metron-platform/metron-indexing/index.html>
under the section 'The IndexDao Abstraction'? From my reading, I understand it
to be an HBase indexing implementation which can be combined with HDFS for
updated data. What is the use of it, given that we cannot choose to index in
HBase / HDFS dynamically? Can someone give an example of how to configure and
use it (a link to more documentation or a reference is fine)? I have a use case
where I need to maintain an Active Directory inventory using the AD event logs
being indexed via Metron. Can HBaseDao be used for this use case?

-- 
Muhammed Irshad K T
Senior Software Engineer
+919447946359
irshadkt@gmail.com
Skype : muhammed.irshad.k.t


Custom parser using Jackson instead of json-simple

2018-10-05 Thread Muhammed Irshad
Hi All,

Is it not possible to use any JSON library other than json-simple
<https://code.google.com/archive/p/json-simple/> while writing custom
parsers? I can see from the documentation that we should implement the custom
parser interface MessageParser. What if I need to use Jackson
instead of json-simple? Jackson performs better than json-simple in
a few aspects in some benchmark studies. I tried writing a custom
parser implementing MessageParser (Jackson's JsonNode). There were no
compiler errors, but I get the error below when I deploy and run this parser
in HCP.

2018-10-05 04:42:09.829 o.a.s.d.executor Thread-12-parserBolt-executor[5 5] [ERROR]
java.lang.ClassCastException: org.codehaus.jackson.node.ObjectNode cannot be cast to org.json.simple.JSONObject
at org.apache.metron.parsers.bolt.ParserBolt.execute(ParserBolt.java:187) [stormjar.jar:?]
at org.apache.storm.daemon.executor$fn__10252$tuple_action_fn__10254.invoke(executor.clj:735) [storm-core-1.1.0.2.6.5.0-292.jar:1.1.0.2.6.5.0-292]
at org.apache.storm.daemon.executor$mk_task_receiver$fn__10171.invoke(executor.clj:466) [storm-core-1.1.0.2.6.5.0-292.jar:1.1.0.2.6.5.0-292]
at org.apache.storm.disruptor$clojure_handler$reify__9685.onEvent(disruptor.clj:40) [storm-core-1.1.0.2.6.5.0-292.jar:1.1.0.2.6.5.0-292]
at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:472) [storm-core-1.1.0.2.6.5.0-292.jar:1.1.0.2.6.5.0-292]
at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:451) [storm-core-1.1.0.2.6.5.0-292.jar:1.1.0.2.6.5.0-292]
at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73) [storm-core-1.1.0.2.6.5.0-292.jar:1.1.0.2.6.5.0-292]
at org.apache.storm.daemon.executor$fn__10252$fn__10265$fn__10320.invoke(executor.clj:855) [storm-core-1.1.0.2.6.5.0-292.jar:1.1.0.2.6.5.0-292]
at org.apache.storm.util$async_loop$fn__553.invoke(util.clj:484) [storm-core-1.1.0.2.6.5.0-292.jar:1.1.0.2.6.5.0-292]
at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
2018-10-05 04:42:09.850 o.a.s.d.executor Thread-12-parserBolt-executor[5 5]
[ERROR]


-- 
Muhammed Irshad K T
Senior Software Engineer
+919447946359
irshadkt@gmail.com
Skype : muhammed.irshad.k.t
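
One plausible reading of the trace above is that the calling code was written
against one concrete JSON type, so a parser producing another type compiles
(generic type arguments are erased) but fails when an element is first used.
The sketch below reproduces that pattern with stand-in types only. Parser,
ExpectedJson, OtherJson, OtherParser, AdaptedParser and ErasureDemo are all
hypothetical names; none of this is Metron's, Jackson's or json-simple's real
API. It also shows one possible workaround under that assumption: parse with
any library internally, then convert to the expected type at the parser
boundary.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a generic parser interface (not Metron's MessageParser).
interface Parser<T> {
    List<T> parse(byte[] raw);
}

// Stand-in for the JSON type the calling framework expects.
final class ExpectedJson extends HashMap<String, Object> {}

// Stand-in for a different JSON tree type, such as a node from another library.
final class OtherJson {
    final Map<String, Object> fields = new HashMap<>();
}

// Compiles cleanly: the interface itself accepts any T.
final class OtherParser implements Parser<OtherJson> {
    @Override
    public List<OtherJson> parse(byte[] raw) {
        OtherJson node = new OtherJson();
        node.fields.put("original_string", new String(raw, StandardCharsets.UTF_8));
        return Collections.singletonList(node);
    }
}

// Adapter: uses the other parser internally, converts at the boundary.
final class AdaptedParser implements Parser<ExpectedJson> {
    private final Parser<OtherJson> delegate = new OtherParser();

    @Override
    public List<ExpectedJson> parse(byte[] raw) {
        List<ExpectedJson> out = new ArrayList<>();
        for (OtherJson node : delegate.parse(raw)) {
            ExpectedJson converted = new ExpectedJson();
            converted.putAll(node.fields);
            out.add(converted);
        }
        return out;
    }
}

final class ErasureDemo {
    // Stand-in for framework code written against ExpectedJson.
    @SuppressWarnings("unchecked")
    static ExpectedJson execute(Parser<?> parser, byte[] raw) {
        // Unchecked cast: type arguments are erased, so nothing is verified here.
        Parser<ExpectedJson> typed = (Parser<ExpectedJson>) parser;
        // The compiler-inserted cast on the element is where a mismatch surfaces.
        ExpectedJson first = typed.parse(raw).get(0);
        return first;
    }

    // True if running the parser through execute() throws ClassCastException.
    static boolean failsWithClassCast(Parser<?> parser) {
        try {
            execute(parser, "x".getBytes(StandardCharsets.UTF_8));
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```

In this sketch, OtherParser fails inside execute() while AdaptedParser passes,
because only the latter hands back the type the caller casts to.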


Re: Performance comparison between Grok and Java regex

2018-07-11 Thread Muhammed Irshad
Thanks Simon. Let me try these out and benchmark performance.

On Wed, Jul 11, 2018 at 9:07 PM, Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> A streaming token parser might well get you good performance for that
> format... maybe something like an antlr grammar or even a simple scanner.
> Regex is not the only pattern :)
>
> It would also be great to see such a parser contributed back to the
> community if possible, and I'm sure we would be happy to help maintain and
> improve it in the open source.
>
> Simon
>
> > On 11 Jul 2018, at 16:26, Muhammed Irshad 
> wrote:
> >
> > Otto Fowler,
> >
> > Yes, I am OK with the trade-offs. In the case of Active Directory log
> > records, can I parse them using a non-regex custom parser? I think we need
> > some pattern-matching library, right, as it is plain text? A dummy AD
> > record for my use case would look like the one below.
> >
> >
> > 12/02/2017 05:14:43 PM LogName=Security SourceName=Microsoft Windows
> > security auditing. EventCode=4625 EventType=0 Type=Information
> ComputerName=
> > dc1.ad.ecorp.com TaskCategory=Logon OpCode=Info
> > RecordNumber=95055509895231650867 Keywords=Audit Success Message=An
> account
> > failed to log on. Subject: Security ID: NULL SID Account Name: - Account
> > Domain: - Logon ID: 0x0 Logon Type: 3 Account For Which Logon Failed:
> > Security ID: NULL SID Account Name: K1560365938U$ Account Domain: ECORP
> > Failure Information: Failure Reason: Unknown user name or bad password.
> > Status: 0xC06D Sub Status: 0xC06A Network Information:
> Workstation
> > Name: K1560365938U Source Network Address: 192.168.151.95 Source Port:
> > 53176 Detailed Authentification Information: Logon Process: NtLmSsp
> > Authentification Package: NTLM Transited Services: - Package Name (NTLM
> > ONLY): - Key Length: 0 This event is generated when a logon request
> fails.
> > It is generated on the computer where access was attempted. The Subject
> > fields indicate the account on the local system which requested the
> logon.
> > This is most commonly a service such as the Server service, or a local
> > process such as Winlogon.exe or Services.exe. The Logon Type field
> > indicates the kind of logon that was requested. The most common types
> are 2
> > (interactive) and 3 (network). The Process Information fields indicate
> > which account and process on the system requested the logon. The Network
> > Information fields indicate where a remote logon request originated.
> > Workstation name is not always available and may be left blank in some
> > cases. The authentication information fields provide detailed information
> > about this specific logon request. Transited services indicate which
> > intermediate services have participated in this logon request. Package
> name
> > indicates which sub-protocol was used among the NTLM protocols
> >
> > On Wed, Jul 11, 2018 at 8:44 PM, Otto Fowler 
> > wrote:
> >
> >> I am not saying it is faster, just giving some info.
> >>
> >> Also, that part of the documentation is not referring to regex vs. grok,
> >> but grok versus a custom non-regex parser, at least as I read it.
> >>
> >> If you have the ability to build, deploy, test and maintain a custom
> >> parser ( unless you will be submitting it to the project? ), then in
> most
> >> cases where performance
> >> is the top issue ( or rather throughput ) you are most likely going to
> get
> >> better results that way.  Accepting that you are ok with the tradeoffs.
> >>
> >> If you have 10M mps, parsing might not be your bottleneck.
> >>
> >>
> >>
> >>
> >>
> >> On July 11, 2018 at 11:01:19, Muhammed Irshad (irshadkt@gmail.com)
> >> wrote:
> >>
> >> Otto Fowler,
> >>
> >> Thanks for the reply. I saw that it uses the same Java regex under the
> >> hood. I got a bit sceptical on seeing this open issue
> >> <https://github.com/thekrakken/java-grok/issues/75> in java-grok, which
> >> says grok is much slower compared with pure regex. The fix is not
> >> available yet in Metron, as it needs a few changes in the API and the
> >> issue to be closed. As the data volume is so huge in my requirement, I had
> >> to double-check and confirm before I go with one. Also, the Metron
> >> documentation
> >> <https://metron.apache.org/current-book/metron-platform/metron-parsers/index.html>
> >> itself says the below statement under 

Re: Performance comparison between Grok and Java regex

2018-07-11 Thread Muhammed Irshad
Otto Fowler,

Yes, I am OK with the trade-offs. In the case of Active Directory log records,
can I parse them using a non-regex custom parser? I think we need some
pattern-matching library, right, as it is plain text? A dummy AD
record for my use case would look like the one below.


12/02/2017 05:14:43 PM LogName=Security SourceName=Microsoft Windows
security auditing. EventCode=4625 EventType=0 Type=Information ComputerName=
dc1.ad.ecorp.com TaskCategory=Logon OpCode=Info
RecordNumber=95055509895231650867 Keywords=Audit Success Message=An account
failed to log on. Subject: Security ID: NULL SID Account Name: - Account
Domain: - Logon ID: 0x0 Logon Type: 3 Account For Which Logon Failed:
Security ID: NULL SID Account Name: K1560365938U$ Account Domain: ECORP
Failure Information: Failure Reason: Unknown user name or bad password.
Status: 0xC06D Sub Status: 0xC06A Network Information: Workstation
Name: K1560365938U Source Network Address: 192.168.151.95 Source Port:
53176 Detailed Authentification Information: Logon Process: NtLmSsp
Authentification Package: NTLM Transited Services: - Package Name (NTLM
ONLY): - Key Length: 0 This event is generated when a logon request fails.
It is generated on the computer where access was attempted. The Subject
fields indicate the account on the local system which requested the logon.
This is most commonly a service such as the Server service, or a local
process such as Winlogon.exe or Services.exe. The Logon Type field
indicates the kind of logon that was requested. The most common types are 2
(interactive) and 3 (network). The Process Information fields indicate
which account and process on the system requested the logon. The Network
Information fields indicate where a remote logon request originated.
Workstation name is not always available and may be left blank in some
cases. The authentication information fields provide detailed information
about this specific logon request. Transited services indicate which
intermediate services have participated in this logon request. Package name
indicates which sub-protocol was used among the NTLM protocols
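
The leading part of a record like the one above is a run of key=value pairs
whose values may contain spaces, which can be split with plain index scanning
and no regex. A minimal sketch follows, assuming the whole record is on one
line and that every key is a single token directly followed by '='. The class
name KeyValueScanner is hypothetical, and this is not Metron parser code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: a regex-free scanner for "key=value with spaces" runs.
final class KeyValueScanner {
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        int eq = line.indexOf('=');
        while (eq >= 0) {
            // The key is the token immediately before '='.
            int keyStart = line.lastIndexOf(' ', eq) + 1;
            // Find the next '=' that starts a new key, i.e. one with a space
            // somewhere after the current '='. An '=' inside the current
            // value token (no space in between) is treated as part of the value.
            int nextEq = line.indexOf('=', eq + 1);
            while (nextEq >= 0 && line.lastIndexOf(' ', nextEq) <= eq) {
                nextEq = line.indexOf('=', nextEq + 1);
            }
            // The value runs up to the space before the next key, or to end of line.
            int valueEnd = (nextEq < 0) ? line.length() : line.lastIndexOf(' ', nextEq);
            fields.put(line.substring(keyStart, eq), line.substring(eq + 1, valueEnd).trim());
            eq = nextEq;
        }
        return fields;
    }
}
```

With this approach the free-text tail after Message= ends up as one long
value; its "Name: value" pairs would need a second pass, which is left out
here.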

On Wed, Jul 11, 2018 at 8:44 PM, Otto Fowler 
wrote:

> I am not saying it is faster, just giving some info.
>
> Also, that part of the documentation is not referring to regex vs. grok,
> but grok versus a custom non-regex parser, at least as I read it.
>
> If you have the ability to build, deploy, test and maintain a custom
> parser ( unless you will be submitting it to the project? ), then in most
> cases where performance
> is the top issue ( or rather throughput ) you are most likely going to get
> better results that way.  Accepting that you are ok with the tradeoffs.
>
> If you have 10M mps, parsing might not be your bottleneck.
>
>
>
>
>
> On July 11, 2018 at 11:01:19, Muhammed Irshad (irshadkt@gmail.com)
> wrote:
>
> Otto Fowler,
>
> Thanks for the reply. I saw that it uses the same Java regex under the hood.
> I got a bit sceptical on seeing this open issue
> <https://github.com/thekrakken/java-grok/issues/75> in java-grok, which
> says grok is much slower compared with pure regex. The fix is not available
> yet in Metron, as it needs a few changes in the API and the issue to be
> closed. As the data volume is so huge in my requirement, I had to
> double-check and confirm before I go with one. Also, the Metron documentation
> <https://metron.apache.org/current-book/metron-platform/metron-parsers/index.html>
> itself makes the statement below under the Parser Adapter section.
>
> "Grok parser adapters are designed primarily for someone who is not a Java
> coder for quickly standing up a parser adapter for lower velocity
> topologies. Grok relies on Regex for message parsing, which is much slower
> than purpose-built Java parsers, but is more extensible. Grok parsers are
> defined via a config file and the topology does not need to be recompiled
> in order to make changes to them."
>
> On Wed, Jul 11, 2018 at 8:01 PM, Otto Fowler 
> wrote:
>
> > Java-Grok IS java regex. It is just a DSL over Java regex. It takes grok
> > expressions ( that can reference other expressions and be compound ) and
> > parses/resolves them and then builds one big regex out of them.
> > Also, Groks, once parsed / used, are re-used, so at that point they are
> > like compiled regexes.
> >
> > That is not to say that that takes 0 time, but it may help you to
> > understand.
> >
> > https://github.com/thekrakken/java-grok/blob/master/src/main/java/io/krakens/grok/api/Grok.java
> > https://github.com/thekrakken/java-grok/blob/master/src/main/java/io/krakens/grok/api/GrokCompiler.java
> >
> > On July 11, 2018 at 07:13:38, Muhammed Irshad (irshadkt@gmail.com)
> > wrote:
> >
> > Thanks a l

Re: Performance comparison between Grok and Java regex

2018-07-11 Thread Muhammed Irshad
Otto Fowler,

Thanks for the reply. I saw that it uses the same Java regex under the hood.
I got a bit sceptical on seeing this open issue
<https://github.com/thekrakken/java-grok/issues/75> in java-grok, which says
grok is much slower compared with pure regex. The fix is not available
yet in Metron, as it needs a few changes in the API and the issue to be closed.
As the data volume is so huge in my requirement, I had to double-check and
confirm before I go with one. Also, the Metron documentation
<https://metron.apache.org/current-book/metron-platform/metron-parsers/index.html>
itself makes the statement below under the Parser Adapter section.

"Grok parser adapters are designed primarily for someone who is not a Java
coder for quickly standing up a parser adapter for lower velocity
topologies. Grok relies on Regex for message parsing, which is much slower
than purpose-built Java parsers, but is more extensible. Grok parsers are
defined via a config file and the topology does not need to be recompiled
in order to make changes to them."

On Wed, Jul 11, 2018 at 8:01 PM, Otto Fowler 
wrote:

> Java-Grok IS java regex.  It is just a DSL over Java regex.  It takes grok
> expressions ( that can reference other expressions and be compound ) and
> parses/resolves them and then builds one big regex out of them.
> Also, Groks, once parsed / used, are re-used, so at that point they are
> like compiled regexes.
>
> That is not to say that that takes 0 time, but it may help you to
> understand.
>
> https://github.com/thekrakken/java-grok/blob/master/src/main/java/io/krakens/grok/api/Grok.java
> https://github.com/thekrakken/java-grok/blob/master/src/main/java/io/krakens/grok/api/GrokCompiler.java
>
> On July 11, 2018 at 07:13:38, Muhammed Irshad (irshadkt@gmail.com)
> wrote:
>
> Thanks a lot, Kevin, for replying. Which thread are you referring to? The
> Stack Overflow link? I could not see any such option.
>
> On Wed, Jul 11, 2018 at 3:04 PM, Kevin Waterson 
>
> wrote:
>
> > Like the thread says, the two regex engines are wildly different; however,
> > you can increase the number of threads using the -w option in grok.
> >
> > Kevin
> >
> > On Wed, Jul 11, 2018 at 5:35 PM Muhammed Irshad 
>
> > wrote:
> >
> > > Hi All,
> > >
> > > I am trying to write a custom Java parser for AD logs. I am expecting a
> > > log flow of 10 million AD events per second. Does parsing with Java
> > > regex have a performance benefit over using the Grok parser? Are there
> > > any performance benchmarks or insights on this?
> > >
> > > I found this Stack Overflow
> > > <https://stackoverflow.com/questions/43222863/logstash-grok-filter-is-slower-than-java-regex-pattern-matching>
> > > question, which inspired this post.
> > >
> > > --
> > > Muhammed Irshad K T
> > > Senior Software Engineer
> > > +919447946359
> > > irshadkt@gmail.com
> > > Skype : muhammed.irshad.k.t
> > >
> >
>
>
>
> --
> Muhammed Irshad K T
> Senior Software Engineer
> +919447946359
> irshadkt@gmail.com
> Skype : muhammed.irshad.k.t
>
>


-- 
Muhammed Irshad K T
Senior Software Engineer
+919447946359
irshadkt@gmail.com
Skype : muhammed.irshad.k.t


Re: Performance comparison between Grok and Java regex

2018-07-11 Thread Muhammed Irshad
Thanks a lot, Kevin, for replying. Which thread are you referring to? The
Stack Overflow link? I could not see any such option.

On Wed, Jul 11, 2018 at 3:04 PM, Kevin Waterson 
wrote:

> Like the thread says, the two regex engines are wildly different; however,
> you can increase the number of threads using the -w option in grok.
>
> Kevin
>
> On Wed, Jul 11, 2018 at 5:35 PM Muhammed Irshad 
> wrote:
>
> > Hi All,
> >
> > I am trying to write a custom Java parser for AD logs. I am expecting a
> > log flow of 10 million AD events per second. Does parsing with Java regex
> > have a performance benefit over using the Grok parser? Are there any
> > performance benchmarks or insights on this?
> >
> > I found this Stack Overflow
> > <https://stackoverflow.com/questions/43222863/logstash-grok-filter-is-slower-than-java-regex-pattern-matching>
> > question, which inspired this post.
> >
> > --
> > Muhammed Irshad K T
> > Senior Software Engineer
> > +919447946359
> > irshadkt@gmail.com
> > Skype : muhammed.irshad.k.t
> >
>



-- 
Muhammed Irshad K T
Senior Software Engineer
+919447946359
irshadkt@gmail.com
Skype : muhammed.irshad.k.t


Performance comparison between Grok and Java regex

2018-07-11 Thread Muhammed Irshad
Hi All,

I am trying to write a custom Java parser for AD logs. I am expecting a
log flow of 10 million AD events per second. Does parsing with Java regex
have a performance benefit over using the Grok parser? Are there any
performance benchmarks or insights on this?

I found this Stack Overflow
<https://stackoverflow.com/questions/43222863/logstash-grok-filter-is-slower-than-java-regex-pattern-matching>
question, which inspired this post.

-- 
Muhammed Irshad K T
Senior Software Engineer
+919447946359
irshadkt@gmail.com
Skype : muhammed.irshad.k.t
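
For the plain Java regex side of this comparison, a minimal sketch follows,
assuming a simplified record made only of single-token key=value pairs. The
class name RegexKeyValueParser and the pattern are illustrative assumptions,
not part of Metron or java-grok, and no timing claim is made here. The one
design point it shows is compiling the Pattern once and reusing it for every
record, rather than recompiling per message.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of a hand-written java.util.regex parser.
final class RegexKeyValueParser {
    // Compiled once and shared; Pattern instances are immutable and thread-safe.
    private static final Pattern KEY_VALUE = Pattern.compile("(\\w+)=(\\S+)");

    // Extracts key=value pairs; a value ends at the first whitespace.
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        // Matcher is not thread-safe, so a fresh one is created per call.
        Matcher m = KEY_VALUE.matcher(line);
        while (m.find()) {
            fields.put(m.group(1), m.group(2));
        }
        return fields;
    }
}
```

Values containing spaces are cut at the first space by this pattern, so real
AD records would need a richer expression or a different approach.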


Metron docker compose fails

2018-07-05 Thread Muhammed Irshad
Hi All,

I was trying to set up Metron Docker on my local machine for development
purposes. I followed the steps mentioned here
<https://metron.apache.org/current-book/metron-contrib/metron-docker/index.html>,
but the command `docker-compose up -d` fails with the error below. I have
tried with both the latest master and the 0.5.0 release tag source code, with
the same result. What am I doing wrong? Is the documentation up to date?

`Step 3/7 : ADD ./es_templates /es_templates

ERROR: Service 'elasticsearch' failed to build: ADD failed: stat /mnt/sda1/var/lib/docker/tmp/docker-builder890591068/es_templates: no such file or directory`

When I commented out the line `ADD ./es_templates /es_templates` in the
elasticsearch Dockerfile, this error went away and a new one appeared, related
to storm.

`Step 13/32 : RUN sed -i -e "s/kafka.zk=.*:/kafka.zk=kafkazk:/g" /usr/metron/$METRON_VERSION/config/enrichment.properties

 ---> Running in 34fe9681481f
sed: can't read /usr/metron/0.5.0/config/enrichment.properties: No such file or directory

ERROR: Service 'storm' failed to build: The command '/bin/sh -c sed -i -e "s/kafka.zk=.*:/kafka.zk=kafkazk:/g" /usr/metron/$METRON_VERSION/config/enrichment.properties' returned a non-zero code: 2`




-- 
Muhammed Irshad K T
Senior Software Engineer
+919447946359
irshadkt@gmail.com
Skype : muhammed.irshad.k.t