Github user cestella commented on the issue:
https://github.com/apache/incubator-metron/pull/435
Testing Instructions beyond the normal smoke test (i.e. letting data
flow through to the indices and checking them).
## Preliminaries
* Set an environment variable to indicate `METRON_HOME`:
`export METRON_HOME=/usr/metron/0.3.0`
* Create the profiler hbase table
`echo "create 'profiler', 'P'" | hbase shell`
* Open `~/rand_gen.py` and paste the following:
```
#!/usr/bin/python
import random
import sys
import time

def main():
    # Usage: rand_gen.py <mu> <sigma> <seconds between messages>
    mu = float(sys.argv[1])
    sigma = float(sys.argv[2])
    freq_s = int(sys.argv[3])
    while True:
        # Emit one JSON map per line with a Gaussian-distributed "value"
        out = '{ "value" : ' + str(random.gauss(mu, sigma)) + ' }'
        print(out)
        sys.stdout.flush()
        time.sleep(freq_s)

if __name__ == '__main__':
    main()
```
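This will emit one JSON map every `freq_s` seconds, each with a numeric field called `value` drawn from a Gaussian distribution with the given mean and standard deviation.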
* Set the profiler to use 1 minute tick durations:
  * Edit `$METRON_HOME/config/profiler.properties` to adjust the capture duration by changing `profiler.period.duration=15` to `profiler.period.duration=1`
  * Edit `$METRON_HOME/config/zookeeper/global.json` and add the following properties:
```
"profiler.client.period.duration" : "1",
"profiler.client.period.duration.units" : "MINUTES"
```
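As a rough illustration of the `global.json` change, here is a minimal sketch that merges the two client properties into an existing set of global settings. The helper name `with_profiler_client_settings` is hypothetical, and the starting values are simply the ones echoed by the Stellar REPL later in this comment; your `global.json` may contain other keys.

```python
import json

def with_profiler_client_settings(global_config, duration="1", units="MINUTES"):
    """Return a copy of the global config with the profiler client period set.

    Hypothetical helper for illustration; it only manipulates a dict and does
    not touch ZooKeeper or any Metron tooling.
    """
    updated = dict(global_config)
    updated["profiler.client.period.duration"] = duration
    updated["profiler.client.period.duration.units"] = units
    return updated

# Assumed starting point: the settings printed by the REPL in this comment.
existing = {
    "es.clustername": "metron",
    "es.ip": "node1",
    "es.port": "9300",
    "es.date.format": "yyyy.MM.dd.HH",
}

merged = with_profiler_client_settings(existing)
print(json.dumps(merged, indent=2, sort_keys=True))
```

The client period should match `profiler.period.duration` in `profiler.properties` (1 minute here), since both describe the same period length.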
## Free Up Space on the virtual machine
First, let's free up some headroom on the virtual machine. If you are running this on a multinode cluster, you would not have to do this.
* Kill monit via `service monit stop`
* Kill tcpreplay via `for i in $(ps -ef | grep tcpreplay | awk '{print $2}');do kill -9 $i;done`
* Kill existing parser topologies via
  * `storm kill snort`
  * `storm kill bro`
* We won't need the enrichment or indexing topologies for this test, so you can kill them via:
  * `storm kill enrichment`
  * `storm kill indexing`
* Kill yaf via `for i in $(ps -ef | grep yaf | awk '{print $2}');do kill -9 $i;done`
* Kill bro via `for i in $(ps -ef | grep bro | awk '{print $2}');do kill -9 $i;done`
## Start the profiler
* `$METRON_HOME/bin/start_profiler_topology.sh`
## Test Case
* Set up a profile to accept some synthetic data with a numeric `value` field and persist a stats summary of the data
  * Edit `$METRON_HOME/config/zookeeper/profiler.json` and paste in the following:
```
{
  "profiles": [
    {
      "profile": "stat",
      "foreach": "'global'",
      "onlyif": "true",
      "init" : {
      },
      "update": {
        "s": "STATS_ADD(s, value)"
      },
      "result": "s"
    }
  ]
}
```
* Send some synthetic data (mean 0, standard deviation 1, one message per second) directly to the profiler:
`python ~/rand_gen.py 0 1 1 | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list node1:6667 --topic indexing`
* Wait for at least 10 minutes and execute the following via the Stellar REPL:
```
# Grab the last 10 minutes worth of timestamps
PROFILE_FIXED( 10, 'MINUTES')
# Looks like 10 were returned, great. Now, validate that I get 10 profile measurements back
PROFILE_GET('stat', 'global', PROFILE_FIXED( 10, 'MINUTES' ) )
# Ok, now look at the mean across the distribution
STATS_MEAN( STATS_MERGE(PROFILE_GET('stat', 'global', PROFILE_FIXED( 10, 'MINUTES' ) )))
```
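To make the first command a little more concrete, here is a minimal sketch of how a fixed lookback could map to period identifiers. It assumes (this is my reading, not something confirmed in this comment) that a period id is the epoch time in milliseconds integer-divided by the period length, and that both the oldest and the current period are included. The function `period_ids` and the sample clock value are hypothetical.

```python
def period_ids(now_ms, lookback_periods, period_ms=60 * 1000):
    """List the period ids covering the lookback window, oldest first.

    Assumption: id = epoch millis // period length, with both endpoints
    included. This is an illustration, not Metron's implementation.
    """
    end = now_ms // period_ms
    start = (now_ms - lookback_periods * period_ms) // period_ms
    return list(range(start, end + 1))

# Hypothetical clock reading: 30 seconds into epoch minute 24767782.
now_ms = 24767782 * 60 * 1000 + 30 * 1000
ids = period_ids(now_ms, 10)
print("{} ids from {} to {}".format(len(ids), ids[0], ids[-1]))
```

Under those assumptions, a 10 minute lookback with 1 minute periods spans 11 ids rather than 10, because the window touches both the starting minute and the current one.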
For me, the following was the result:
```
Stellar, Go!
Please note that functions are loading lazily in the background and will be
unavailable until loaded fully.
{es.clustername=metron, es.ip=node1, es.port=9300,
es.date.format=yyyy.MM.dd.HH, profiler.client.period.duration=1,
profiler.client.period.duration.units=MINUTES}
[Stellar]>>> # Grab the last 10 minutes worth of timestamps
[Stellar]>>> PROFILE_FIXED( 10, 'MINUTES')
Functions loaded, you may refer to functions now...
[24767772, 24767773, 24767774, 24767775, 24767776, 24767777, 24767778,
24767779, 24767780, 24767781, 24767782]
[Stellar]>>> # Looks like 10 were returned, great. Now, validate that I
get 10 profile measurements back
[Stellar]>>> PROFILE_GET('stat', 'global', PROFILE_FIXED( 10, 'MINUTES' ) )
[org.apache.metron.statistics.OnlineStatisticsProvider@44749031,
org.apache.metron.statistics.OnlineStatisticsProvider@d2a7fbb9,
org.apache.metron.statistics.OnlineStatisticsProvider@a217cfd7,
org.apache.metron.statistics.OnlineStatisticsProvider@c5e42aed,
org.apache.metron.statistics.OnlineStatisticsProvider@c4f4753d,
org.apache.metron.statistics.OnlineStatisticsProvider@87a1606a,
org.apache.metron.statistics.OnlineStatisticsProvider@e1b4c8dc,
org.apache.metron.statistics.OnlineStatisticsProvider@fdb7b8d8]
[Stellar]>>> # Ok, now look at the mean across the distribution
[Stellar]>>> STATS_MEAN( STATS_MERGE(PROFILE_GET('stat', 'global',
PROFILE_FIXED( 10, 'MINUTES' ) )))
-0.0077433441069769265
[Stellar]>>>
```
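A mean of roughly -0.0077 is in line with the generator being started with a mean of 0 and a standard deviation of 1. The sketch below illustrates the general idea of merging per-period summaries before asking for a mean. `RunningStats` is a hypothetical stand-in that tracks only a count and a sum; it is not Metron's `OnlineStatisticsProvider`, and the period and sample counts are assumptions chosen to resemble this test (10 one-minute periods, one message per second).

```python
import random

class RunningStats(object):
    """Hypothetical mergeable summary: count and sum are enough for a mean."""

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def add(self, value):
        # Fold one observation into the summary (loosely like STATS_ADD).
        self.count += 1
        self.total += value
        return self

    def merge(self, other):
        # Combine two summaries without revisiting the raw values.
        merged = RunningStats()
        merged.count = self.count + other.count
        merged.total = self.total + other.total
        return merged

    def mean(self):
        return self.total / self.count if self.count else float("nan")

def simulate(periods=10, per_period=60, mu=0.0, sigma=1.0, seed=0):
    """Build one summary per period from Gaussian samples."""
    rng = random.Random(seed)
    summaries = []
    for _ in range(periods):
        stats = RunningStats()
        for _ in range(per_period):
            stats.add(rng.gauss(mu, sigma))
        summaries.append(stats)
    return summaries

summaries = simulate()
merged = RunningStats()
for summary in summaries:
    merged = merged.merge(summary)
print("samples={} mean={}".format(merged.count, merged.mean()))
```

Because the merged mean is weighted by each period's count, it equals the mean over all raw values, so it should land near `mu` once enough samples have accumulated.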