Hi Metron-users,

I'm looking for help on how to get more detail on why our Storm worker 
processes are dying. Below is a section of 
workers-artifacts/enrichment-3-1503063017/6700/worker.log, captured after 
setting the log level to TRACE in /usr/hdp/2.6.0.3-8/storm/log4j2/worker.xml:

...
2017-08-18 14:06:21.450 o.a.z.ClientCnxn [DEBUG] Reading reply 
sessionid:0x15dec16c436f197, packet:: clientPath:/metron/topology/indexing 
serverPath:/metron/topology/indexing finished:false header:: 18,4  
replyHeader:: 18,4295724211,0  request:: '/metron/topology/indexing,T  response:: 
,s{4294967729,4294967729,1502906451501,1502906451501,0,7,0,0,0,7,4295003746} 
2017-08-18 14:06:21.450 o.a.z.ClientCnxn [DEBUG] Reading reply 
sessionid:0x15dec16c436f197, packet:: clientPath:/metron/topology/indexing 
serverPath:/metron/topology/indexing finished:false header:: 19,12  
replyHeader:: 19,4295724211,0  request:: '/metron/topology/indexing,T  response:: 
v{'websphere,'squid,'error,'asa,'bro,'snort,'yaf},s{4294967729,4294967729,1502906451501,1502906451501,0,7,0,0,0,7,4295003746}
 
2017-08-18 14:06:21.450 o.a.s.s.o.a.z.ClientCnxn [DEBUG] Reading reply 
sessionid:0x15dec16c436f196, packet:: clientPath:null serverPath:null 
finished:false header:: 192,2  replyHeader:: 192,4295724211,0  request:: 
'/storm/errors/enrichment-3-1503063017/hostEnrichmentBolt/e0000000113,-1  response:: null
2017-08-18 14:06:21.450 o.a.c.u.DefaultTracerDriver [TRACE] Trace: 
GetDataBuilderImpl-Background - 13 ms
2017-08-18 14:06:21.450 o.a.c.u.DefaultTracerDriver [TRACE] Trace: 
GetChildrenBuilderImpl-Background - 13 ms
2017-08-18 14:06:21.451 o.a.s.s.o.a.c.u.DefaultTracerDriver [TRACE] Trace: 
DeleteBuilderImpl-Foreground - 40 ms
2017-08-18 14:06:21.453 o.a.s.util [ERROR] Halting process: ("Worker died")
java.lang.RuntimeException: ("Worker died")
        at org.apache.storm.util$exit_process_BANG_.doInvoke(util.clj:341) 
[storm-core-1.1.0.2.6.0.3-8.jar:1.1.0.2.6.0.3-8]
        at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.7.0.jar:?]
        at 
org.apache.storm.daemon.worker$fn__11035$fn__11036.invoke(worker.clj:763) 
[storm-core-1.1.0.2.6.0.3-8.jar:1.1.0.2.6.0.3-8]
        at 
org.apache.storm.daemon.executor$mk_executor_data$fn__10250$fn__10251.invoke(executor.clj:274)
 [storm-core-1.1.0.2.6.0.3-8.jar:1.1.0.2.6.0.3-8]
        at org.apache.storm.util$async_loop$fn__553.invoke(util.clj:494) 
[storm-core-1.1.0.2.6.0.3-8.jar:1.1.0.2.6.0.3-8]
        at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
2017-08-18 14:06:21.453 o.a.z.ClientCnxn [DEBUG] Reading reply 
sessionid:0x25df0f24ef579fe, packet:: clientPath:/metron/topology/enrichments 
serverPath:/metron/topology/enrichments finished:false header:: 14,4  
replyHeader:: 14,4295724211,0  request:: '/metron/topology/enrichments,T  response:: 
,s{4294967743,4294967743,1502906451970,1502906451970,0,6,0,0,0,6,4295003747} 
2017-08-18 14:06:21.454 o.a.z.ClientCnxn [DEBUG] Reading reply 
sessionid:0x25df0f24ef579fe, packet:: clientPath:/metron/topology/enrichments 
serverPath:/metron/topology/enrichments finished:false header:: 15,12  
replyHeader:: 15,4295724211,0  request:: '/metron/topology/enrichments,T  response:: 
v{'websphere,'squid,'asa,'bro,'snort,'yaf},s{4294967743,4294967743,1502906451970,1502906451970,0,6,0,0,0,6,4295003747}
 
2017-08-18 14:06:21.454 o.a.c.u.DefaultTracerDriver [TRACE] Trace: 
GetDataBuilderImpl-Background - 16 ms
2017-08-18 14:06:21.454 o.a.c.u.DefaultTracerDriver [TRACE] Trace: 
GetChildrenBuilderImpl-Background - 16 ms
2017-08-18 14:06:21.454 o.a.c.u.DefaultTracerDriver [TRACE] Trace: 
GetDataBuilderImpl-Background - 16 ms
...

This is Metron 1_2_0_0_61 - a new install in a new environment, so it's 
possible we have made mistakes in the firewall configuration or similar, 
although I believe one of our guys has tried turning off the rules this 
environment uses (by flushing the iptables rules).

If I understand the traceback correctly, the halt is raised from Storm's 
Clojure code. Is there a way to see which Clojure expression was being 
evaluated when the worker died?
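
A minimal sketch of one way to hunt for the underlying cause in worker.log. It 
assumes that the exception that triggered the halt was itself logged at ERROR 
level somewhere before the "Halting process" entry, and that every entry 
starts with the timestamp/logger/[LEVEL] prefix seen in the excerpt above. The 
function names (split_entries, errors_before_halt) are made up for 
illustration and are not part of Storm or Metron.

```python
import re
import sys
from collections import deque

# Assumed entry prefix, copied from the excerpt above, e.g.
# "2017-08-18 14:06:21.453 o.a.s.util [ERROR] Halting process: ..."
ENTRY_START = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3} (\S+) \[(\w+)\] "
)


def split_entries(lines):
    """Group raw lines into log entries. Lines without the timestamp prefix
    (stack frames, wrapped text) are attached to the entry they follow."""
    entry = None
    for line in lines:
        if ENTRY_START.match(line):
            if entry is not None:
                yield entry
            entry = [line.rstrip("\n")]
        elif entry is not None:
            entry.append(line.rstrip("\n"))
    if entry is not None:
        yield entry


def errors_before_halt(lines, keep=5):
    """For each 'Halting process' entry, collect up to `keep` ERROR entries
    that were logged before it (and after the previous halt, if any)."""
    recent = deque(maxlen=keep)
    results = []
    for entry in split_entries(lines):
        level = ENTRY_START.match(entry[0]).group(2)
        if level != "ERROR":
            continue
        if "Halting process" in entry[0]:
            results.append(list(recent))
            recent.clear()
        else:
            recent.append(entry)
    return results


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python find_halt_cause.py worker.log
    with open(sys.argv[1], errors="replace") as fh:
        for n, errors in enumerate(errors_before_halt(fh), 1):
            print(f"--- halt #{n}: {len(errors)} preceding ERROR entries ---")
            for entry in errors:
                print("\n".join(entry))
```

If this prints nothing before a halt, the assumption does not hold for that 
log and the cause would have to be looked for elsewhere, for example in the 
errors the Storm UI shows for each bolt.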

From consuming Kafka topics with the Kafka command-line tools, I see 
content on 'squid' and 'enrichments', but nothing coming from 'indexing'. I 
expect that is because the 'enrichment' bolts are failing. The Storm UI shows 
that all of the following have thrown various exceptions:

geoEnrichmentBolt
hostEnrichmentBolt
simpleHBaseEnrichmentBolt
simpleHBaseThreatIntelBolt
threatIntelJoinBolt

Apparently these have not:
stellarEnrichmentBolt
stellarThreatIntelBolt
(and more)


Many thanks for tips,

 Nick
