You're better off using our management UI, since it runs the same code that
the parser topology does. I would start small with just a couple of
expressions (something like "blah_DELIMITED %{MONTH:month}") and make sure
you're at least getting back the month. Then you can add expressions
incrementally until you find where the problem is.
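
If the UI is not handy, the same incremental approach can be sketched
offline. The snippet below is only an illustration under loud assumptions:
the regexes in `PRIMITIVES` are rough stand-ins for the grok primitives (not
the definitions Metron or grok actually ship), and `expand`, `first_failure`,
and `PIECES` are hypothetical names. It grows the pattern one piece at a time
against one of your sample lines and reports the first piece that stops
matching.

```python
import re

# Rough regex stand-ins for a few grok primitives. These are assumptions
# for illustration, NOT the definitions that grok or Metron ship.
PRIMITIVES = {
    "MONTH": r"[A-Z][a-z]{2}",
    "MONTHDAY": r"\d{1,2}",
    "HOUR": r"\d{2}",
    "MINUTE": r"\d{2}",
    "SECOND": r"\d{2}",
    "WORD": r"\w+",
    "NUMBER": r"\d+(?:\.\d+)?",
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
}


def expand(grok):
    """Rewrite each %{NAME:field} token as a named regex group."""
    def repl(match):
        name, field = match.group(1), match.group(2)
        return "(?P<%s>%s)" % (field, PRIMITIVES[name])
    return re.sub(r"%\{(\w+):(\w+)\}", repl, grok)


def first_failure(pieces, line):
    """Return the index of the first piece whose addition breaks the
    match against the start of `line`, or None if every prefix matches."""
    for i in range(1, len(pieces) + 1):
        if re.match(expand("".join(pieces[:i])), line) is None:
            return i - 1
    return None


# The pattern from the original message, split into pieces to add one by one.
PIECES = [
    "%{MONTH:month}",
    " %{MONTHDAY:day}",
    " %{HOUR:hour}:%{MINUTE:minute}:%{SECOND:seconds}",
    " %{WORD:hostname}",
    " %{WORD:script}.%{WORD:extension}:",
    r" %{WORD:connection}\|%{NUMBER:number}",
    r"\|%{WORD:type}\|%{WORD:transport}\|%{WORD:protocol}",
    r"\|%{NUMBER:timestamp}",
    r"\|%{IP:ip_src_addr}\|%{NUMBER:src_port}",
    r"\|%{IP:ip_dst_addr}\|%{NUMBER:dst_port}",
]

# First sample line from the original message.
LINE = ("Jul 28 00:13:24 device1 devicelogger.py: "
        "connection|5287|accept|tcp|httpd|1501200799.33|"
        "10.2.1.83|1084|10.2.1.99|80")

if __name__ == "__main__":
    print(first_failure(PIECES, LINE))
```

A `None` result only means every prefix matched under these stand-in
regexes; an index points at the piece to look at first. The real primitives
may differ, so this does not replace checking the pattern in the UI.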
Ryan
On Wed, Sep 13, 2017 at 9:36 AM, Frank Horsfall <[email protected]> wrote:
> Morning all,
>
> Is anyone else seeing this error?
>
> After successfully going through the telemetry tutorial with squid
> ( https://cwiki.apache.org/confluence/display/METRON/Adding+a+New+Telemetry+Data+Source )
> I started the exercise of creating a new telemetry based on a data set I
> wish to use.
>
> Test data
>
> Jul 28 00:13:24 device1 devicelogger.py: connection|5287|accept|tcp|httpd|1501200799.33|10.2.1.83|1084|10.2.1.99|80
> Jul 28 00:13:44 device1 devicelogger.py: connection|5289|accept|tcp|httpd|1501200814.55|10.2.1.83|1126|10.2.1.99|80
> Jul 28 00:13:44 device1 devicelogger.py: connection|5288|accept|tcp|httpd|1501200808.64|10.2.1.83|1116|10.2.1.99|80
> Jul 28 00:59:49 device1 devicelogger.py: connection|5296|accept|tcp|httpd|1501203587.76|10.2.1.83|1556|10.2.1.99|80
>
> Grok statement
>
> blah_DELIMITED %{MONTH:month} %{MONTHDAY:day} %{HOUR:hour}:%{MINUTE:minute}:%{SECOND:seconds} %{WORD:hostname} %{WORD:script}.%{WORD:extension}: %{WORD:connection}\|%{NUMBER:number}\|%{WORD:type}\|%{WORD:transport}\|%{WORD:protocol}\|%{NUMBER:timestamp}\|%{IP:ip_src_addr}\|%{NUMBER:src_port}\|%{IP:ip_dst_addr}\|%{NUMBER:dst_port}
>
> I tested the pattern at the Grok site:
>
> http://grokconstructor.appspot.com/do/match#result
>
> Added the pattern to HDFS:
>
> [hdfs@metn1 ~]$ hadoop fs -cat /apps/metron/patterns/blah
> blah_DELIMITED %{MONTH:month} %{MONTHDAY:day} %{HOUR:hour}:%{MINUTE:minute}:%{SECOND:seconds} %{WORD:hostname} %{WORD:script}.%{WORD:extension}: %{WORD:connection}\|%{NUMBER:number}\|%{WORD:type}\|%{WORD:transport}\|%{WORD:protocol}\|%{NUMBER:timestamp}\|%{IP:ip_src_addr}\|%{NUMBER:src_port}\|%{IP:ip_dst_addr}\|%{NUMBER:dst_port}
>
> *Dump of zookeeper*
>
> PARSER Config: blah
> {
>   "parserClassName": "org.apache.metron.parsers.GrokParser",
>   "sensorTopic": "blah",
>   "parserConfig": {
>     "grokPath": "/apps/metron/patterns/dionaea",
>     "patternLabel": "blah_DELIMITED",
>     "timestampField": "timestamp"
>   }
> }
>
> INDEXING Config: blah
> {
>   "elasticsearch": {
>     "index": "blah",
>     "batchSize": 5,
>     "enabled": true
>   },
>   "hdfs": {
>     "index": "blah",
>     "batchSize": 5,
>     "enabled": true
>   },
>   "solr": {
>     "index": "blah",
>     "batchSize": 5,
>     "enabled": false
>   }
> }
>
> ENRICHMENT Config: blah
> {
>   "enrichment": {
>     "fieldMap": {
>       "geo": ["ip_dst_addr", "ip_src_addr"],
>       "host": ["host"]
>     }
>   },
>   "threatIntel": {
>     "fieldMap": {
>       "hbaseThreatIntel": ["ip_src_addr", "ip_dst_addr"]
>     },
>     "fieldToTypeMap": {
>       "ip_src_addr": ["malicious_ip"],
>       "ip_dst_addr": ["malicious_ip"]
>     }
>   }
> }
>
> *NiFi is set up and passes the data along correctly, but an error occurs
> when it reaches the ParserBolt.*
>
> java.lang.IllegalStateException: Grok parser Error: Grok statement produced a null message. Original message was: Jul 28 00:13:24 device1 devicelogger.py: connection|5287|accept|tcp|httpd|1501200799.33|10.2.1.83|1084|10.2.1.99|80 and the parsed message was: {} . Check the pattern at: /apps/metron/patterns/dionaea on Jul 28 00:13:24 device1 devicelogger.py: connection|5287|accept|tcp|httpd|1501200799.33|10.2.1.83|1084|10.2.1.99|80
>     at org.apache.metron.parsers.GrokParser.parse(GrokParser.java:174)
>     at org.apache.metron.parsers.interfaces.MessageParser.parseOptional(MessageParser.java:45)
>     at org.apache.metron.parsers.bolt.ParserBolt.execute(ParserBolt.java:133)
>     at org.apache.storm.daemon.executor$fn__6573$tuple_action_fn__6575.invoke(executor.clj:734)
>     at org.apache.storm.daemon.executor$mk_task_receiver$fn__6494.invoke(executor.clj:466)
>     at org.apache.storm.disruptor$clojure_handler$reify__6007.onEvent(disruptor.clj:40)
>     at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451)
>     at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430)
>     at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73)
>     at org.apache.storm.daemon.executor$fn__6573$fn__6586$fn__6639.invoke(executor.clj:853)
>     at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484)
>     at clojure.lang.AFn.run(AFn.java:22)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Grok statement produced a null message. Original message was: Jul 28 00:13:24 device1 devicelogger.py: connection|5287|accept|tcp|httpd|1501200799.33|10.2.1.83|1084|10.2.1.99|80 and the parsed message was: {} . Check the pattern at: /apps/metron/patterns/dionaea
>     at org.apache.metron.parsers.GrokParser.parse(GrokParser.java:152)
>     ... 12 more
>
> Any ideas?
>
> Kindest regards,
>
> Frank