Mohan created METRON-1133:
-----------------------------
Summary: Entity value for a profiled data written wrongly to Hbase
Key: METRON-1133
URL: https://issues.apache.org/jira/browse/METRON-1133
Project: Metron
Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Mohan
I have created a profile using the Profiler's "Group By" functionality. It
operates over any incoming telemetry that has an `ip_src_addr` and a
`timestamp` field, and produces a profile that segments the data by day of
week. It does so by using a 'groupBy' expression to extract the day of week
from the telemetry's `timestamp` field.
My Kafka messages are:
{code:java}
{ "ip_src_addr": "10.0.0.1", "protocol": "HTTPS", "length": "10",
"bytes_in": 234, "timestamp": "1503657089000" }
{ "ip_src_addr": "10.0.0.2", "protocol": "HTTP", "length": "20",
"bytes_in": 390, "timestamp": "1503657089000" }
{ "ip_src_addr": "10.0.0.3", "protocol": "DNS", "length": "30",
"bytes_in": 560, "timestamp": "1503657089000" }
{code}
My profile config is:
{code:java}
{
"profiles": [
{
"profile": "calender-effects",
"onlyif": "exists(ip_src_addr) and exists(timestamp)",
"foreach": "ip_src_addr",
"init":{ "count": 0 },
"update":{ "count": "count + 1" },
"result": "count",
"groupBy": ["DAY_OF_WEEK(start)"]
}]
}
{code}
After pushing each of the above messages 8 times, I scanned the profiler
table in HBase:
{code:java}
hbase(main):003:0> scan 'profiler'
ROW
COLUMN+CELL
\x00\x00\x03Pcalender-effects10.0.0.16\x00\x00\x00\x00\x00\xBF3.
column=P:value, timestamp=1503657430993, value=\x02\x10
\x00\x00\x03Pcalender-effects10.0.0.26\x00\x00\x00\x00\x00\xBF3.
column=P:value, timestamp=1503657430993, value=\x02\x10
\x00\x00\x03Pcalender-effects10.0.0.36\x00\x00\x00\x00\x00\xBF3.
column=P:value, timestamp=1503657430993, value=\x02\x10
{code}
I see that an extra digit '6' is getting appended to each entity value.
When retrieving profile data using the Stellar shell, I wasn't able to retrieve
data from the same day of week to account for any calendar effects.
The following example tries to retrieve profile data over the past 10 days and
returns an empty result:
{code:java}
[Stellar]>>> PROFILE_GET( "calender-effects", "10.0.0.1", PROFILE_FIXED(10,
"DAYS",{'profiler.client.period.duration' : '2',
'profiler.client.period.duration.units' : 'MINUTES'}), [] )
[]
{code}
I was able to retrieve the data by changing the entity value to "10.0.0.16"
instead of "10.0.0.1":
{code:java}
[Stellar]>>> PROFILE_GET( "calender-effects", "10.0.0.16", PROFILE_FIXED(10,
"DAYS",{'profiler.client.period.duration' : '2',
'profiler.client.period.duration.units' : 'MINUTES'}), [] )
[8]
{code}
Retrieving profile data over the past 10 days for Fridays only also fails:
{code:java}
[Stellar]>>> PROFILE_GET( "calender-effects", "10.0.0.16", PROFILE_FIXED(10,
"DAYS",{'profiler.client.period.duration' : '2',
'profiler.client.period.duration.units' : 'MINUTES'}), [friday] )
[]
{code}
So where is this value '6' getting appended to the entity value? It looks to
me like the group-by value, i.e. the day of the week, is getting appended to
the entity value.
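The ambiguity seen in the scan output can be illustrated with a minimal sketch. It assumes a hypothetical row key layout in which the profile name, the entity, and the group value are concatenated with no delimiter; `RowKeySketch` and `keyPrefix` are illustrative names, not Metron's actual row key builder. Under that assumption, entity "10.0.0.1" with group "6" is indistinguishable from entity "10.0.0.16" with no group.

```java
public class RowKeySketch {

    // Hypothetical layout (an assumption inferred from the scan output):
    // profile name, entity, and group value joined with no delimiter.
    static String keyPrefix(String profile, String entity, String group) {
        return profile + entity + group;
    }

    public static void main(String[] args) {
        // Entity 10.0.0.1 grouped under day-of-week 6.
        String withGroup = keyPrefix("calender-effects", "10.0.0.1", "6");
        // Entity 10.0.0.16 with no group value at all.
        String withoutGroup = keyPrefix("calender-effects", "10.0.0.16", "");

        System.out.println(withGroup);
        // Both prefixes are byte-for-byte identical, so a lookup for
        // "10.0.0.16" without groups can match rows written for "10.0.0.1".
        System.out.println(withGroup.equals(withoutGroup));
    }
}
```

If the layout really is as assumed, this would also explain why a lookup for "10.0.0.1" with an empty group list returns nothing while a lookup for "10.0.0.16" returns the count.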
To confirm this, I changed the timestamp value in the messages to
"timestamp": "1503583870672" and I see that the value '5' got appended to the
entity value instead.
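The two appended digits can be checked against the timestamps with a short sketch. It assumes the day of week is numbered as in `java.util.Calendar` (Sunday = 1 through Saturday = 7) and is evaluated in UTC; both are assumptions about how `DAY_OF_WEEK` behaves, and `DayOfWeekCheck` is an illustrative name.

```java
import java.util.Calendar;
import java.util.TimeZone;

public class DayOfWeekCheck {

    // Day of week for an epoch-millisecond timestamp, using
    // java.util.Calendar numbering (SUNDAY = 1 ... SATURDAY = 7).
    // Evaluating in UTC is an assumption made for this sketch.
    static int dayOfWeek(long epochMillis) {
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
        cal.setTimeInMillis(epochMillis);
        return cal.get(Calendar.DAY_OF_WEEK);
    }

    public static void main(String[] args) {
        // Timestamp from the original messages (Friday, 2017-08-25 UTC).
        System.out.println(dayOfWeek(1503657089000L)); // prints 6
        // Timestamp from the modified messages (Thursday, 2017-08-24 UTC).
        System.out.println(dayOfWeek(1503583870672L)); // prints 5
    }
}
```

Under these assumptions the results match the digits '6' and '5' seen after the entity values, which is consistent with the group-by value being written next to the entity.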