Metron Enrichment Error

2019-11-06 Thread Gonçalo Pedras
Hi,
I've built Metron and installed alongside the current Ambari version with 
HDP-3.1 support provided by the GitHub project 
(https://github.com/apache/metron/tree/feature/METRON-2088-support-hdp-3.1).

I've followed the documentation and installed everything successfully. Although 
when I'm starting the services on my cluster, everything runs fine with no 
errors except for Metron Enrichment, which gives me an odd error that I can't 
find any fix for.
The error is:

resource_management.core.exceptions.ExecutionFailed: Execution of 'echo 
"disable 'enrichment'" | hbase shell -n' returned 1.
...
Took 0.3556 seconds
ERROR ArgumentError: Table enrichment does not exist.
Does this imply that I must create the tables in HBase manually?
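If the tables really are missing, one possible workaround (a sketch only — it 
assumes Metron's default table names and the default column family 't'; verify 
both against your Ambari Metron configuration before running) is to pre-create 
them and restart the service:

```shell
# Sketch, assuming default Metron table/column-family names; adjust
# the names to whatever your Ambari Metron configuration specifies.
echo "create 'enrichment', 't'"  | hbase shell -n
echo "create 'threatintel', 't'" | hbase shell -n
```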

Thanks


Re: Metron Enrichment Error

2019-11-06 Thread Gonçalo Pedras
Resolved.

Deleted the Metron service and added it again, but this time I configured REST 
by setting up JDBC with MySQL.





RE: Score not being issued by ThreatIntel Enrichment

2019-11-21 Thread Gonçalo Pedras
Hi,
It works. Thanks for the help. Really appreciated.

Thanks



Score not being issued by ThreatIntel Enrichment

2019-11-21 Thread Gonçalo Pedras
Hi,
I've deployed Metron alongside the current Ambari version using the Metron 
HDP3.1 support provided by a branch in the GitHub project.

Fast forward, I'm testing Metron:

1.   I've deployed a custom CSV parser with 3 fields (2 dummy fields and an IP 
field). The parser works fine.

2.   Created a custom template for my sensor with the required fields 
(guid, ip_src_addr, ip_dst_addr, ...) for the Elasticsearch index pattern. 
Works fine; Metron even recognizes the indexes.

3.   Created a custom Threat Intel source (extractor and enrichment config 
JSON files, and the CSV content file). Also works fine; I've tested it in 
Stellar with the ENRICHMENT_GET function, which returns the content I wrote in 
the CSV file.

4.   Configured Threat Triage for the sensor with the rule "ip_src_addr == 
''" and a score of 5. Doesn't work... The data in the Elasticsearch index is 
still being written without the threat score.

The enrichment config of the threat intel source:
{
  "zkQuorum" : ":",
  "sensorToFieldList": {
    "xcsvtest": {
      "type": "THREAT_INTEL",
      "fieldToEnrichmentTypes": {
        "ip_src_addr" : ["testList"]
      }
    }
  }
}

My enrichment configuration:

{
  "enrichment": {
    "fieldMap": {
      "geo": [
        "ip_src_addr"
      ]
    },
    "fieldToTypeMap": {},
    "config": {}
  },
  "threatIntel": {
    "fieldMap": {},
    "fieldToTypeMap": {
      "ip_src_addr": [
        "testList"
      ]
    },
    "config": {},
    "triageConfig": {
      "riskLevelRules": [
        {
          "name": "All_threat",
          "comment": "",
          "rule": "ip_src_addr == '8.8.8.8' ",
          "reason": null,
          "score": "5"
        }
      ],
      "aggregator": "MAX",
      "aggregationConfig": {}
    }
  },
  "configuration": {}
}
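One thing worth checking (hedged — this is my reading of the Metron docs, 
verify for your version): the triage rules are only applied to messages the 
threat intel enrichment has flagged as alerts (is_alert), so if the testList 
lookup never matches, no score is attached no matter what the rule says. The 
rule evaluation itself can be sketched as:

```python
# Hedged sketch (not Metron's actual code) of how threat triage applies
# riskLevelRules: each rule is a predicate over the parsed message, and
# the scores of all matching rules are combined with the configured
# aggregator (MAX in the config above).
def triage(message, rules, aggregator=max):
    scores = [float(r["score"]) for r in rules if r["rule"](message)]
    return aggregator(scores) if scores else None

rules = [{"name": "All_threat",
          "rule": lambda m: m.get("ip_src_addr") == "8.8.8.8",
          "score": "5"}]

print(triage({"ip_src_addr": "8.8.8.8"}, rules))  # → 5.0
print(triage({"ip_src_addr": "1.2.3.4"}, rules))  # → None (no rule matched)
```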



Appreciate any help.
Thanks


Profiler debug

2020-03-06 Thread Gonçalo Pedras
Hi,
I'm hitting GC overhead limit errors in my profiler. The profiler works fine 
when tested in the Stellar environment, but it fails most of the time as a 
running topology. My profiler has only 1 profile.
My profile has the following configuration:

· 5 integer counters: when it finds a specific string, it increments the 
respective counter.

· A unique list of strings: when it finds a new string, it adds it to the 
list.

· A fixed string field (stays the same throughout the profiler 
duration).
My profile flushes:

· Profiler: a hardcoded JSON map of the counter results, e.g. 
{"counter1": variable1, "counter2": variable2, ...}

· Triage: the integer counters, the fixed string value, and the size of 
the unique string list.
The only reason I keep a fixed value in the profile is that I don't want to 
enrich the same data twice, so I just pass it through the profiler.
Is there any other way to debug the profiler while it runs as a topology? I 
can't understand why it ends in error, and the Storm logs don't help much. 
I've tweaked options like the profile period, time to live, and worker 
childopts.
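If the unique string list grows large, it may be what's exhausting the worker 
heap. One commonly suggested alternative (a sketch, not the poster's method — 
the HLLP function names come from the Stellar function docs, and the profile 
and field names here are hypothetical) is to track an approximate distinct 
count with a HyperLogLog sketch instead of keeping the full list:

```json
{
  "profile": "my_profile",
  "foreach": "ip_src_addr",
  "init":    { "uniq": "HLLP_INIT(5, 6)" },
  "update":  { "uniq": "HLLP_ADD(uniq, some_string_field)" },
  "result":  "HLLP_CARDINALITY(uniq)"
}
```

This trades exactness for bounded memory, which only helps if the distinct 
count (not the exact strings) is what the triage expression needs.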

Thanks


RE: Profiler debug

2020-03-06 Thread Gonçalo Pedras
Thanks for replying. One more question: how do I enable DEBUG logging for 
“org.apache.metron.profiler”?
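One way (assuming the topology is named "profiler" — check `storm list` for 
the actual name on your cluster) is Storm's dynamic log level CLI, which needs 
no topology restart:

```shell
# Sketch: raise org.apache.metron.profiler to DEBUG on the running
# topology for 900 seconds, after which Storm reverts it automatically.
# The topology name "profiler" is an assumption.
storm set_log_level profiler -l org.apache.metron.profiler=DEBUG:900
```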
Thanks



Re: Metron MaaS issue

2020-03-06 Thread Gonçalo Pedras
Hi,

I have the same issue when it comes to redeploying new versions; I usually 
just shut down the process by its ID. It's not really a problem for me because 
my model does something very simple.

As for the process not being able to read from the existing folder: that 
happened to me when my Python app was split into multiple files, but once I 
merged them everything worked fine. Besides that, check that the folder you're 
uploading the MaaS service from doesn't have any subfolders.

I hope this helps. I'm not a developer, but maybe this gives you a workaround 
for your issue.



On 2020/02/27 13:43:00, Hema malini wrote:

> Hi all,
>
> In Metron, can I deploy a model live in MaaS? For example, if I deploy a
> model in the pipeline as v1 and redeploy another version by replacing the
> existing one, do I have to kill the existing YARN job?
>
> Also, in MaaS, if the model has multiple files, it is not able to read from
> the existing folder. In HDFS I can see all the files have been uploaded, but
> the request seems to be null; in the YARN logs I see only "null request
> received".
>
> Thanks and Regards,
> Hema


RE: Profiler doubt

2020-01-28 Thread Gonçalo Pedras
Hi again,
I found something in the profiler Storm logs that confirms the delay:
“2020-01-28 09:46:37.061 o.a.m.p.s.FixedFrequencyFlushSignal 
watermark-event-generator-0 [WARN] Timestamp out-of-order by -54968000 ms. This 
may indicate a problem in the data. timestamp=1580202172000, 
maxKnown=1580147204000, flushFreq=90 ms”

The profiler is delayed by about fifteen and a half hours.
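The figure in that warning checks out against the two epoch-millisecond 
timestamps it reports:

```python
# Delta between the reported timestamp and maxKnown values from the
# WARN line, both epoch milliseconds.
delta_ms = 1580202172000 - 1580147204000
print(delta_ms)                        # → 54968000, matching the warning
print(round(delta_ms / 3_600_000, 2))  # → 15.27 (hours)
```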


RE: Profiler doubt

2020-01-27 Thread Gonçalo Pedras
Hi Allen,

Thanks for the reply by the way.
I’ve been checking my profiler, tweaking options here and there. I’ve set the 
“timestampField” and that solved half the issue.
I ran the Spark batch profiler and it rectified the counts. Then I started the 
Storm profiler once again. Now the profiler is delayed by an hour and a half 
and counts the same record twice. I’ve restarted it and reinstalled it, and it 
somehow stays delayed. I even dropped the HBase table and created a new one, 
and the Storm profiler is still delayed.
I’m sending the records myself, so it’s just 2 to 10 records at a time, and 
the topology logs in the Storm UI tell me it’s actually doing its job. The 
system time is synchronized and the ASA records are generated in the same 
timezone. I’m running out of options here.

Thanks


Profiler consumer stuck

2020-01-31 Thread Gonçalo Pedras
Hi, again
I found a problem with my profiler's consumer. For some reason my profiler 
won't consume new records from the "indexing" topic. I checked the Kafka 
consumer groups and the current offset was 4 records behind, stuck. And 
whenever it does consume from "indexing", it reports the data as old, because 
it obviously didn't consume it right away.


RE: Profiler doubt

2020-01-28 Thread Gonçalo Pedras
Hi,
This profiler is really inconsistent; I’m going crazy right now.
I’ve investigated further and this is really bugging my mind:

1.   I’m not expecting to receive 15-hour-old messages. In fact I’m the one 
picking the messages at the current time and sending them to Kafka. For 
instance, say it’s 15h33 GMT; I would pick a message like this one: 
“<182>Jan 28 2020 15:33:14 # : %ASA-6-305011: Built dynamic TCP 
translation from ###/48678 to /48678” and send it to Kafka.

2.   These messages are successfully parsed, because I can find them in the 
“enrichments” topic in Kafka, and they have the right “timestamp” field when 
parsed (the syslog timestamp is the value of the timestamp). So the problem is 
not in the messages themselves.

3.   The results of the Profiler Client are really off.

I ran a test:

· I sent 4 messages at 14h18 and 5 messages at 14h25; all the messages 
have the same syslog severity.
If my profiler runs every 15 minutes, then for the range 14h15 to 14h30 the 
result must be 9:

{period.start=158022090, period=1755801, 
profile=ClientA_syslog_severety_count, period.end=158022180, groups=[], 
value=9, entity=info}

Surprisingly, it’s right. Then I ran a second test:

· I sent 4 messages at 14h41 and 3 messages at 14h48; all the messages 
have the same syslog severity.
With that said, the result should be 7. Here’s the result:

{period.start=158022180, period=1755802, 
profile=ClientA_syslog_severety_count, period.end=158022270, groups=[], 
value=9, entity=info}
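One hedged observation about the second test: if the 15-minute periods are 
aligned to the epoch (which is how integer-dividing a timestamp by the period 
duration behaves), 14h41 and 14h48 fall on opposite sides of the 14h45 
boundary, so those 4+3 messages would land in two different periods rather 
than one. The timestamps below are illustrative:

```python
from datetime import datetime, timezone

PERIOD_MS = 15 * 60 * 1000  # a 15-minute profile period

def period_id(ts_ms: int) -> int:
    # Integer-dividing the epoch timestamp by the period duration gives
    # periods aligned to :00/:15/:30/:45.
    return ts_ms // PERIOD_MS

def ts(hour: int, minute: int) -> int:
    return int(datetime(2020, 1, 28, hour, minute,
                        tzinfo=timezone.utc).timestamp() * 1000)

print(period_id(ts(14, 18)) == period_id(ts(14, 25)))  # → True: same period
print(period_id(ts(14, 41)) == period_id(ts(14, 48)))  # → False: straddles 14:45
```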

I ran a third test:

· Sent 3 messages at 15h51.
The profiler returned none:

{period.start=158022630, period=1755807, 
profile=ClientA_syslog_severety_count, period.end=158022720, groups=[], 
value=0, entity=info}

I checked the Kafka topics to make sure there weren’t more messages than there 
should have been. Everything is consistent except the profiler. I’m about to 
lose my mind.

Thanks



RE: Profiler doubt

2020-01-28 Thread Gonçalo Pedras
I only restarted before running the first test, since all the configurations 
are the same across the three tests.



3rd party stellar functions

2020-02-21 Thread Gonçalo Pedras
Hi,
I was following this tutorial on custom Stellar functions, 
https://metron.apache.org/current-book/metron-stellar/stellar-common/3rdPartyStellar.html
 , and it just doesn't work. I changed the global configs, uploaded the jar 
file to HDFS, and then ran the Stellar environment to test it. No errors at 
all, but the function isn't even listed when I run "%functions".
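For reference, the global config change that tutorial describes is the 
"stellar.function.paths" property pointing at the jar in HDFS; a sketch (the 
namenode host, port, path, and jar name here are hypothetical — substitute 
your own):

```json
{
  "stellar.function.paths": "hdfs://namenode:8020/apps/metron/stellar/my-stellar-functions.jar"
}
```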

Thanks


RE: 3rd party stellar functions

2020-02-21 Thread Gonçalo Pedras
Yes I did. I pushed the configs using zk_load_configs.sh, restarted ZooKeeper, 
and started Stellar with the “-z” flag. No errors.


RE: 3rd party stellar functions

2020-02-21 Thread Gonçalo Pedras
It’s working now. I don’t know if it was related, but I ran a Maven “clean 
package” and it started working for some reason.



Profiler doubt

2020-01-20 Thread Gonçalo Pedras
Hi,
I've deployed Apache Metron with HDP 3.1 support provided by the GitHub 
repository 
(https://github.com/apache/metron/blob/feature/METRON-2088-support-hdp-3.1).
I have some questions about the Profiler, which has me somewhat confused. I'm 
testing the ASA parser and I've deployed two profiles:

1.   Counting ip_src_addr.

2.   Counting syslog_severity.
The profiler properties have the default settings.
I ran the parser last Friday for a couple of seconds and it generated about 
three thousand records. Today I ran 'PROFILER_GET' in Stellar for a 
'PROFILE_FIXED' of 72 hours and checked it against the Elasticsearch index, 
and I realised the counts don't match. For example, for a specific IP source 
"a" in that period I got 21 hits, but 'PROFILER_GET' returned a stream of 
results that make no sense to me. My ASA source wasn't sending any records to 
Kafka, yet the profiler somehow kept counting beyond that period. Where it 
should be something like [21], it returned [27, 27, 27, 54, 27, 27, ...]. My 
questions are:

· Is the Profiler working correctly? If it is, can someone explain this 
behaviour to me?

· If it is not working correctly, what is the problem, and how do I fix it?

Thanks



RE: Metron MaaS issue

2020-03-09 Thread Gonçalo Pedras
Have you tried increasing your service's memory? If you have and it doesn't 
work either, there's a workaround until you find a fix for your MaaS service.
Forget about MaaS and launch your application as a separate service. There's a 
Stellar function, “REST_GET”, for making requests to REST services. Its 
definition is REST_GET(url, configs, args), and an example call looks like 
this: REST_GET(“http://hostname:8832/apply?”, null, {“ip”:”1.2.3.4”}). The 
only difference from MAAS_MODEL_APPLY is that the MaaS function will block 
until it receives a response from your application, while REST_GET has a 
default (and configurable) timeout. More information about REST_GET is at 
https://metron.apache.org/current-book/metron-stellar/stellar-common/index.html#REST_GET
 .
Hope it helps.



RE: Workers restarts randomly

2020-03-16 Thread Gonçalo Pedras
Hi,
Thanks for replying, by the way.
Not really; it doesn't tell me anything more. The last messages were:

“2020-03-16 09:14:06.983 o.a.m.p.DefaultProfileBuilder 
watermark-event-generator-0 [DEBUG] Applying message to profile; profile=A, 
entity=, 
timestamp=1584349869000”
“2020-03-16 09:14:06.983 o.a.m.p.DefaultProfileBuilder 
watermark-event-generator-0 [DEBUG] Initializing profile; profile=A, 
entity=XX, 
timestamp=1584349869000”
“2020-03-16 09:14:06.983 o.a.m.p.s.ProfileBuilderBolt 
watermark-event-generator-0 [DEBUG] Message distributed: profile=A, 
entity,
 timestamp=1584349869000”
And then restarted.


From: Nick Allen 
Sent: 13 March 2020 21:05
To: user@metron.apache.org
Subject: Re: Workers restarts randomly

Turn logging up to DEBUG.  Does that provide any more information?





Workers restarts randomly

2020-03-13 Thread Gonçalo Pedras
Hi,
I've deployed a profiler with 2 profiles. When I start the profiler, everything 
works fine until the first flush. After that, my profiler keeps restarting for 
no reason. The topology logs before the restart:


"2020-03-13 15:17:56.878 o.a.s.k.s.KafkaSpout Thread-24-kafkaSpout-executor[10 
10] [INFO] Initialization complete"
"2020-03-13 15:19:24.036 
o.a.h.m.s.t.a.MetricSinkWriteShardHostnameHashingStrategy Thread-31 [INFO] 
Calculated collector shard XXX based on hostname: 
XX"
"2020-03-13 15:21:55.370 o.a.k.c.Metadata kafka-producer-network-thread | 
producer-1 [INFO] Cluster ID: I8DxrzeSQJWslsnwP8Dlrw"

That's it. After a while another worker is assigned. Just like that. No 
errors, nothing.

What could the cause be?

Thanks


Profiler topology zookeeper client timed out

2020-03-09 Thread Gonçalo Pedras
Hi,

I've set up a profiler with a 15-minute period, although it doesn't really 
flush every 15 minutes. I've been monitoring the topology in the Storm UI and 
reached the conclusion that when my profiler flushes, the worker restarts a 
few minutes later. I checked the logs and found that the worker restarts when 
the client session times out:
"2020-03-09 17:11:02.923 o.a.s.s.o.a.z.ClientCnxn 
main-SendThread(XX) [INFO] Client session timed out, have 
not heard from server in 21341ms for sessionid 0x170aa100eba035e, closing 
socket connection and attempting reconnect"
And when I checked it against the zookeeper logs, it shows that the session 
expired:
"2020-03-09 17:11:36,003 - INFO  [SessionTracker:ZooKeeperServer@357] - 
Expiring session 0x170aa100eba035e, timeout of 3ms exceeded"
"2020-03-09 17:11:36,003 - INFO  [ProcessThread(sid:3 
cport:-1)::PrepRequestProcessor@491] - Processed session termination for 
sessionid: 0x170aa100eba035e"

Therefore my profiler doesn't flush every 15 minutes: when it restarts, I have 
to wait 15 minutes plus the time it wasted after the previous flush.

What should I do?
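A GC pause in the worker longer than the ZooKeeper session timeout would 
produce exactly this pattern (no error, just a session expiry followed by a 
restart). One hedged direction — property names are standard Storm settings, 
values are illustrative and must be tuned for your cluster — is to give the 
worker more heap and/or a longer session timeout in the profiler's topology 
configuration:

```yaml
# Illustrative Storm topology overrides, not verified on this cluster:
# more worker heap reduces long GC pauses; a longer ZK session timeout
# tolerates the pauses that remain.
topology.worker.childopts: "-Xmx2g -XX:+UseG1GC"
storm.zookeeper.session.timeout: 30000
```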

Thanks