[jira] [Commented] (METRON-2284) Metron Profiler for Spark doesn't work as expected

2019-11-07 Thread Nick Allen (Jira)


[ 
https://issues.apache.org/jira/browse/METRON-2284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969431#comment-16969431
 ] 

Nick Allen commented on METRON-2284:


This definitely looks like a bug.  The Profiler should behave identically in 
the REPL and in the Batch Profiler on Spark, but it seems that it does not. 

What are you trying to do with this profile?  Maybe I can help you with a 
workaround until we can fix the problem.

BTW, thank you for providing such a clear bug report with the exact steps to 
replicate.  Very helpful!

> Metron Profiler for Spark doesn't work as expected
> --
>
> Key: METRON-2284
> URL: https://issues.apache.org/jira/browse/METRON-2284
> Project: Metron
> Issue Type: Bug
> Affects Versions: 0.7.1
> Reporter: Maxim Dashenko
> Priority: Major
>
> Used command:
> {code}
> /usr/hdp/current/spark2-client/bin/spark-submit --class 
> org.apache.metron.profiler.spark.cli.BatchProfilerCLI --properties-file 
> /usr/hcp/current/metron/config/batch-profiler.properties 
> ~/metron-profiler-spark-0.7.1.1.9.1.0-6.jar --config 
> /usr/hcp/current/metron/config/batch-profiler.properties --profiles 
> ~/profiler.json
> {code}
>  cat /usr/hcp/current/metron/config/batch-profiler.properties
> {code}
> profiler.batch.input.path=/tmp/test_data.logs
> profiler.batch.input.format=json
> profiler.period.duration=15
> profiler.period.duration.units=MINUTES
> {code}
>  
> cat ~/profiler.json
> {code}
> {
>   "profiles":[
>     {
>       "profile":"batchtest5",
>       "onlyif":"source.type == 'testsource' and devicehostname == 'windows9.something.com'",
>       "foreach":"devicehostname",
>       "init":{
>         "val":"SET_INIT()"
>       },
>       "update":{
>         "val":"SET_ADD(val, IS_EMPTY(devicehostname))"
>       },
>       "result":{
>         "profile":"val"
>       }
>     }
>   ],
>   "timestampField":"timestamp"
> }
> {code}
>  cat test_data.logs
> {code}
> {"devicehostname": "windows9.something.com", "timestamp": 1567241981000, "source.type": "testsource"}
> {code}
> Stellar statement
> {code}
> PROFILE_GET('batchtest5', 'windows9.something.com', PROFILE_FIXED(100, 
> 'DAYS'))
> {code}
> Returns:
> {code}
> [[true]]
> {code}
> Expected result:
> {code}
> [[false]]
> {code}
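
For reference, the expected `[[false]]` can be traced by hand. Below is a minimal sketch in Python (not Stellar) that mimics the profile's `onlyif`, `foreach`, `init`, `update` and `result` steps for the single test message. The helpers `set_init`, `set_add`, `is_empty` and `run_profile` are hypothetical stand-ins written for this illustration, under the assumption that `IS_EMPTY` is true only for a null or zero-length value. They are not Metron APIs and say nothing about where the batch job goes wrong.

```python
import json


def set_init():
    # Hypothetical stand-in for Stellar's SET_INIT(): an empty set.
    return set()


def set_add(current, *values):
    # Hypothetical stand-in for SET_ADD(set, values...): returns the set with values added.
    updated = set(current)
    updated.update(values)
    return updated


def is_empty(value):
    # Assumption: IS_EMPTY is true only for a null or zero-length value.
    return value is None or len(value) == 0


def run_profile(messages):
    # Walk the messages through the steps of the "batchtest5" profile.
    state = {}
    for msg in messages:
        # onlyif: skip messages that do not match both conditions.
        if not (msg.get("source.type") == "testsource"
                and msg.get("devicehostname") == "windows9.something.com"):
            continue
        entity = msg["devicehostname"]  # foreach
        val = state.get(entity)
        if val is None:
            val = set_init()  # init
        # update: add IS_EMPTY(devicehostname) to the set
        state[entity] = set_add(val, is_empty(msg.get("devicehostname")))
    # result: the accumulated set per entity, as a list for readability
    return {entity: sorted(val) for entity, val in state.items()}


message = json.loads(
    '{"devicehostname": "windows9.something.com", '
    '"timestamp": 1567241981000, "source.type": "testsource"}'
)
result = run_profile([message])
print(result)  # {'windows9.something.com': [False]}
```

Since `devicehostname` is a non-empty string in the test message, the set can only ever contain `false`, which is why `[[true]]` from the batch job is unexpected.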



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (METRON-2284) Metron Profiler for Spark doesn't work as expected

2019-10-18 Thread Maxim Dashenko (Jira)


[ 
https://issues.apache.org/jira/browse/METRON-2284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954666#comment-16954666
 ] 

Maxim Dashenko commented on METRON-2284:


So the issue reproduces only with metron-profiler-spark-0.7.1.1.9.1.0-6.jar



[jira] [Commented] (METRON-2284) Metron Profiler for Spark doesn't work as expected

2019-10-18 Thread Maxim Dashenko (Jira)


[ 
https://issues.apache.org/jira/browse/METRON-2284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954578#comment-16954578
 ] 

Maxim Dashenko commented on METRON-2284:


Here is the result:

{code}
[Stellar]>>> val := SET_INIT()
[]
[Stellar]>>> devicehostname := 'windows9.something.com'
windows9.something.com
[Stellar]>>> val := SET_ADD(val, IS_EMPTY(devicehostname))
[false]
[Stellar]>>> conf := SHELL_EDIT()
{
  "profiles":[
    {
      "profile":"batchtest5",
      "onlyif":"devicehostname == 'windows9.something.com'",
      "foreach":"devicehostname",
      "init":{
        "val":"SET_INIT()"
      },
      "update":{
        "val":"SET_ADD(val, IS_EMPTY(devicehostname))"
      },
      "result":{
        "profile":"val"
      }
    }
  ],
  "timestampField":"timestamp"
}
[Stellar]>>>  
[Stellar]>>> profiler := PROFILER_INIT(conf)
Profiler{1 profile(s), 0 messages(s), 0 route(s)}
[Stellar]>>> msg := SHELL_EDIT()
{"devicehostname": "windows9.something.com", "timestamp": 1567241981000}
[Stellar]>>>  
[Stellar]>>> PROFILER_APPLY(msg, profiler)
Profiler{1 profile(s), 1 messages(s), 1 route(s)}
[Stellar]>>> values := PROFILER_FLUSH(profiler)
[{period={duration=900000, period=1741379, start=1567241100000, 
end=1567242000000}, profile=batchtest5, groups=[], value=[false], 
entity=windows9.something.com}]
{code}
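
The `period=1741379` in the flush output is what you get by dividing the message timestamp by a 15-minute period length. A minimal Python sketch of that arithmetic follows. `period_for` is a hypothetical helper, and treating the period end as start plus duration is an assumption made for illustration, not a statement about Metron's implementation.

```python
# 15 minutes expressed in milliseconds, matching the epoch-millisecond timestamps.
PERIOD_MS = 15 * 60 * 1000


def period_for(timestamp_ms, duration_ms=PERIOD_MS):
    # Hypothetical helper: bucket a timestamp into a fixed-length period.
    period = timestamp_ms // duration_ms  # index of the period since the epoch
    start = period * duration_ms          # first millisecond of that period
    return {"period": period, "start": start, "end": start + duration_ms}


info = period_for(1567241981000)
print(info)
```

The same arithmetic applies to the batch run, whose properties file also sets a 15-minute period, so the REPL and the batch job should place the test message in the same period.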



[jira] [Commented] (METRON-2284) Metron Profiler for Spark doesn't work as expected

2019-10-17 Thread Michael Miklavcic (Jira)


[ 
https://issues.apache.org/jira/browse/METRON-2284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954000#comment-16954000
 ] 

Michael Miklavcic commented on METRON-2284:
---

Can we have you try the following exercise from the REPL using your profile?

https://github.com/apache/metron/blob/master/metron-analytics/metron-profiler-repl/README.md#getting-started


{code:java}
[Stellar]>>> val := SET_INIT()
[]
[Stellar]>>> devicehostname := 'windows9.something.com'
windows9.something.com
[Stellar]>>> val := SET_ADD(val, IS_EMPTY(devicehostname))
[false]
[Stellar]>>> conf := SHELL_EDIT()
# add the following profile contents in the vi editor that comes up:
{
  "profiles":[
    {
      "profile":"batchtest5",
      "onlyif":"devicehostname == 'windows9.something.com'",
      "foreach":"devicehostname",
      "init":{
        "val":"SET_INIT()"
      },
      "update":{
        "val":"SET_ADD(val, IS_EMPTY(devicehostname))"
      },
      "result":{
        "profile":"val"
      }
    }
  ],
  "timestampField":"timestamp"
}
[Stellar]>>> profiler := PROFILER_INIT(conf)
Profiler{1 profile(s), 0 messages(s), 0 route(s)}
[Stellar]>>> msg := SHELL_EDIT()
# add this record
{"devicehostname": "windows9.something.com", "timestamp": 1567241981000}
[Stellar]>>> PROFILER_APPLY(msg, profiler)
Profiler{1 profile(s), 1 messages(s), 1 route(s)}
[Stellar]>>> values := PROFILER_FLUSH(profiler)
[{period={duration=900000, period=1741379, start=1567241100000, 
end=1567242000000}, profile=batchtest5, groups=[], value=[false], 
entity=windows9.something.com}]
{code}


I'm seeing "value" set to false, as expected, at least from the REPL. Let's see 
if we can verify that part of the functionality matches up as expected and go 
from there.
