[ 
https://issues.apache.org/jira/browse/METRON-684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15850429#comment-15850429
 ] 

ASF GitHub Bot commented on METRON-684:
---------------------------------------

Github user cestella commented on the issue:

    https://github.com/apache/incubator-metron/pull/435
  
    Testing Instructions beyond the normal smoke test (i.e. letting data
    flow through to the indices and checking them).
    
    ## Preliminaries
    * Set an environment variable to indicate `METRON_HOME`:
    `export METRON_HOME=/usr/metron/0.3.0` 
    
    * Create the profiler HBase table:
    `echo "create 'profiler', 'P'" | hbase shell`
    
    * Open `~/rand_gen.py` and paste the following:
    ```
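    #!/usr/bin/python
    # Emit one JSON map per interval with a Gaussian-distributed "value" field.
    # Usage: rand_gen.py <mu> <sigma> <freq_s>
    import random
    import sys
    import time

    def main():
      mu = float(sys.argv[1])      # mean of the distribution
      sigma = float(sys.argv[2])   # standard deviation
      freq_s = int(sys.argv[3])    # seconds to sleep between messages
      while True:
        out = '{ "value" : ' + str(random.gauss(mu, sigma)) + ' }'
        # Parenthesized so it runs under both Python 2 and Python 3
        print(out)
        # Flush so a downstream pipe sees each line immediately
        sys.stdout.flush()
        time.sleep(freq_s)
    
    if __name__ == '__main__':
      main()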
    ```
    This will generate random JSON maps with a numeric field called `value`,
    drawn from a Gaussian with the given mean and standard deviation, one
    every `freq_s` seconds.
    
    * Set the profiler to use 1-minute tick durations:
      * Edit `$METRON_HOME/config/profiler.properties` to adjust the capture
        duration by changing `profiler.period.duration=15` to
        `profiler.period.duration=1`
      * Edit `$METRON_HOME/config/zookeeper/global.json` and add the following
        properties:
    ```
    "profiler.client.period.duration" : "1",
    "profiler.client.period.duration.units" : "MINUTES"
    ```
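If you would rather script the `global.json` edit than do it by hand, the following is a minimal sketch. The helper names (`with_profiler_client_settings`, `update_global_json`) are hypothetical and not part of Metron; the sketch only assumes that `global.json` is a flat JSON map and that existing keys should be preserved.

```python
import json

# The two properties listed in the step above.
PROFILER_CLIENT_SETTINGS = {
    "profiler.client.period.duration": "1",
    "profiler.client.period.duration.units": "MINUTES",
}


def with_profiler_client_settings(global_config):
    """Return a copy of global_config with the profiler client settings added."""
    merged = dict(global_config)
    merged.update(PROFILER_CLIENT_SETTINGS)
    return merged


def update_global_json(path):
    """Hypothetical helper: rewrite the JSON file at path, keeping other keys."""
    with open(path) as f:
        config = json.load(f)
    with open(path, "w") as f:
        json.dump(with_profiler_client_settings(config), f, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Example input only; a real global.json will have its own keys.
    example = {"es.clustername": "metron", "es.ip": "node1"}
    print(json.dumps(with_profiler_client_settings(example), indent=2, sort_keys=True))
```

Calling `update_global_json` with the path from the step above would apply the change in place; take a backup of the file first.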
    
    ## Free Up Space on the virtual machine
    
    First, let's free up some headroom on the virtual machine.  If you are
    running this on a multinode cluster, you would not have to do this.
    * Kill monit via `service monit stop`
    * Kill tcpreplay via `for i in $(ps -ef | grep tcpreplay | awk '{print $2}');do kill -9 $i;done`
    * Kill existing parser topologies via 
       * `storm kill snort`
       * `storm kill bro`
    * We won't need the enrichment or indexing topologies for this test, so you
      can kill them via:
       * `storm kill enrichment`
       * `storm kill indexing`
    * Kill yaf via `for i in $(ps -ef | grep yaf | awk '{print $2}');do kill -9 $i;done`
    * Kill bro via `for i in $(ps -ef | grep bro | awk '{print $2}');do kill -9 $i;done`
    
    ## Start the profiler
    * `$METRON_HOME/bin/start_profiler_topology.sh`
    
    ## Test Case
    
    * Set up a profile to accept some synthetic data with a numeric `value`
      field and persist a stats summary of the data
      * Edit `$METRON_HOME/config/zookeeper/profiler.json` and paste in the
        following:
    ```
    {
      "profiles": [
        {
          "profile": "stat",
          "foreach": "'global'",
          "onlyif": "true",
          "init": {},
          "update": {
            "s": "STATS_ADD(s, value)"
          },
          "result": "s"
        }
      ]
    }
    ```
    
    * Send some synthetic data directly to the profiler:
    `python ~/rand_gen.py 0 1 1 | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list node1:6667 --topic indexing`
    * Wait for at least 10 minutes and execute the following via the Stellar
      REPL:
    ```
    # Grab the last 10 minutes worth of timestamps
    PROFILE_FIXED( 10, 'MINUTES')
    # You should see roughly 10 period timestamps.  Now, validate that profile measurements come back for those periods
    PROFILE_GET('stat', 'global', PROFILE_FIXED( 10, 'MINUTES' ) )
    # Ok, now look at the mean across the distribution
    STATS_MEAN( STATS_MERGE(PROFILE_GET('stat', 'global', PROFILE_FIXED( 10, 'MINUTES' ) )))
    ```
    For me, the following was the result:
    ```
    Stellar, Go!
    Please note that functions are loading lazily in the background and will be unavailable until loaded fully.
    {es.clustername=metron, es.ip=node1, es.port=9300, es.date.format=yyyy.MM.dd.HH, profiler.client.period.duration=1, profiler.client.period.duration.units=MINUTES}
    [Stellar]>>> # Grab the last 10 minutes worth of timestamps
    [Stellar]>>> PROFILE_FIXED( 10, 'MINUTES')
    Functions loaded, you may refer to functions now...
    [24767772, 24767773, 24767774, 24767775, 24767776, 24767777, 24767778, 24767779, 24767780, 24767781, 24767782]
    [Stellar]>>> # Looks like 10 were returned, great.  Now, validate that I get 10 profile measurements back
    [Stellar]>>> PROFILE_GET('stat', 'global', PROFILE_FIXED( 10, 'MINUTES' ) )
    [org.apache.metron.statistics.OnlineStatisticsProvider@44749031, org.apache.metron.statistics.OnlineStatisticsProvider@d2a7fbb9, org.apache.metron.statistics.OnlineStatisticsProvider@a217cfd7, org.apache.metron.statistics.OnlineStatisticsProvider@c5e42aed, org.apache.metron.statistics.OnlineStatisticsProvider@c4f4753d, org.apache.metron.statistics.OnlineStatisticsProvider@87a1606a, org.apache.metron.statistics.OnlineStatisticsProvider@e1b4c8dc, org.apache.metron.statistics.OnlineStatisticsProvider@fdb7b8d8]
    [Stellar]>>> # Ok, now look at the mean across the distribution
    [Stellar]>>> STATS_MEAN( STATS_MERGE(PROFILE_GET('stat', 'global', PROFILE_FIXED( 10, 'MINUTES' ) )))
    -0.0077433441069769265
    [Stellar]>>>
    ```
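As a rough sanity check on that last number: the generator was started with mean 0 and standard deviation 1 at one value per second, and 8 per-period summaries came back, so the merged mean covers on the order of 480 values and should land within a few standard errors (about 1/sqrt(480), roughly 0.046) of 0. The sketch below illustrates the merge-then-mean idea with a toy summary class. `PeriodSummary`, `merge`, and `simulate` are hypothetical names for illustration; this is not how Metron's `OnlineStatisticsProvider` is implemented.

```python
import math
import random


class PeriodSummary:
    """Toy per-period summary (count and sum only); a stand-in, not Metron's class."""

    def __init__(self, count=0, total=0.0):
        self.count = count
        self.total = total

    def add(self, value):
        # Roughly the role STATS_ADD plays in the profile's "update" step
        self.count += 1
        self.total += value
        return self

    def mean(self):
        return self.total / self.count


def merge(summaries):
    """Combine per-period summaries into one, as STATS_MERGE does conceptually."""
    merged = PeriodSummary()
    for s in summaries:
        merged.count += s.count
        merged.total += s.total
    return merged


def simulate(periods=8, per_period=60, mu=0.0, sigma=1.0, seed=684):
    """Build one summary per period from Gaussian draws; also return raw values."""
    rng = random.Random(seed)
    values, summaries = [], []
    for _ in range(periods):
        summary = PeriodSummary()
        for _ in range(per_period):
            v = rng.gauss(mu, sigma)
            values.append(v)
            summary.add(v)
        summaries.append(summary)
    return summaries, values


if __name__ == "__main__":
    summaries, values = simulate()
    merged = merge(summaries)
    print("merged mean   :", merged.mean())
    print("direct mean   :", sum(values) / len(values))
    print("standard error:", 1.0 / math.sqrt(merged.count))
```

The merged mean matches the mean over all raw values, which is the property that makes per-period summaries useful for lookbacks spanning several periods.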



> Decouple Timestamp calculation from PROFILE_GET
> -----------------------------------------------
>
>                 Key: METRON-684
>                 URL: https://issues.apache.org/jira/browse/METRON-684
>             Project: Metron
>          Issue Type: Improvement
>            Reporter: Casey Stella
>            Assignee: Casey Stella
>
> Currently PROFILE_GET only supports a static lookback of a fixed duration.  
> As we have more complicated, potentially sparse, lookbacks (e.g. the same 
> time slice every Tuesday for a month), it would be nice to decouple the 
> construction of timestamps from PROFILE_GET into its own set of functions.
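
To make the idea concrete, here is a minimal sketch of what decoupled lookback functions could look like. It assumes, as the REPL output in the comment above suggests, that a period id is the epoch time divided by the period duration. All function names (`fixed_lookback`, `weekly_slices`, `profile_get`) and the dictionary-backed store are hypothetical and are not Metron's API.

```python
MINUTE_MS = 60 * 1000
WEEK_MS = 7 * 24 * 60 * MINUTE_MS


def fixed_lookback(now_ms, lookback_ms, period_ms=MINUTE_MS):
    """Period ids from (now - lookback) through now, inclusive of both ends."""
    first = (now_ms - lookback_ms) // period_ms
    last = now_ms // period_ms
    return list(range(first, last + 1))


def weekly_slices(now_ms, weeks, slice_ms, period_ms=MINUTE_MS):
    """Sparse lookback: the same trailing time slice for each of the last N weeks."""
    periods = []
    for k in range(weeks - 1, -1, -1):  # oldest week first
        end_ms = now_ms - k * WEEK_MS
        periods.extend(fixed_lookback(end_ms, slice_ms, period_ms))
    return periods


def profile_get(store, profile, entity, periods):
    """Hypothetical lookup: return stored values for whichever periods exist."""
    keys = [(profile, entity, p) for p in periods]
    return [store[k] for k in keys if k in store]


if __name__ == "__main__":
    now_ms = 24767782 * MINUTE_MS + 30 * 1000  # 30 seconds into the last period
    print(fixed_lookback(now_ms, 10 * MINUTE_MS))
    print(weekly_slices(now_ms, weeks=2, slice_ms=2 * MINUTE_MS))
```

Because the lookup only takes a list of period ids, any function that produces such a list (fixed, weekly, or otherwise sparse) can be combined with it without changing the lookup itself.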



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
