Added: 
release/metron/0.4.1/site-book/metron-analytics/metron-maas-service/index.html
==============================================================================
--- 
release/metron/0.4.1/site-book/metron-analytics/metron-maas-service/index.html 
(added)
+++ 
release/metron/0.4.1/site-book/metron-analytics/metron-maas-service/index.html 
Fri Sep 15 23:37:46 2017
@@ -0,0 +1,523 @@
+<!DOCTYPE html>
+<!--
+ | Generated by Apache Maven Doxia at 2017-09-08
+ | Rendered using Apache Maven Fluido Skin 1.3.0
+-->
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
+  <head>
+    <meta charset="UTF-8" />
+    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+    <meta name="Date-Revision-yyyymmdd" content="20170908" />
+    <meta http-equiv="Content-Language" content="en" />
+    <title>Metron &#x2013; Model Management Infrastructure</title>
+    <link rel="stylesheet" href="../../css/apache-maven-fluido-1.3.0.min.css" 
/>
+    <link rel="stylesheet" href="../../css/site.css" />
+    <link rel="stylesheet" href="../../css/print.css" media="print" />
+
+      
+    <script type="text/javascript" 
src="../../js/apache-maven-fluido-1.3.0.min.js"></script>
+
+                          
+        
+<script type="text/javascript">$( document ).ready( function() { $( 
'.carousel' ).carousel( { interval: 3500 } ) } );</script>
+          
+            </head>
+        <body class="topBarDisabled">
+          
+                
+                    
+    
+        <div class="container-fluid">
+          <div id="banner">
+        <div class="pull-left">
+                                    <a href="http://metron.apache.org/" id="bannerLeft">
+                                                                               
                 <img src="../../images/metron-logo.png"  alt="Apache Metron" 
width="148px" height="48px"/>
+                </a>
+                      </div>
+        <div class="pull-right">  </div>
+        <div class="clear"><hr/></div>
+      </div>
+
+      <div id="breadcrumbs">
+        <ul class="breadcrumb">
+                
+                    
+                              <li class="">
+                    <a href="http://www.apache.org" class="externalLink" title="Apache">
+        Apache</a>
+        </li>
+      <li class="divider ">/</li>
+            <li class="">
+                    <a href="http://metron.apache.org/" class="externalLink" title="Metron">
+        Metron</a>
+        </li>
+      <li class="divider ">/</li>
+            <li class="">
+                    <a href="../../index.html" title="Documentation">
+        Documentation</a>
+        </li>
+      <li class="divider ">/</li>
+        <li class="">Model Management Infrastructure</li>
+        
+                
+                    
+                  <li id="publishDate" class="pull-right">Last Published: 
2017-09-08</li> <li class="divider pull-right">|</li>
+              <li id="projectVersion" class="pull-right">Version: 0.4.1</li>
+            
+                            </ul>
+      </div>
+
+            
+      <div class="row-fluid">
+        <div id="leftColumn" class="span3">
+          <div class="well sidebar-nav">
+                
+                    
+                <ul class="nav nav-list">
+                    <li class="nav-header">User Documentation</li>
+                                                                               
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                          
+      <li>
+    
+                          <a href="../../index.html" title="Metron">
+          <i class="icon-chevron-down"></i>
+        Metron</a>
+                    <ul class="nav nav-list">
+                      
+      <li>
+    
+                          <a href="../../Upgrading.html" title="Upgrading">
+          <i class="none"></i>
+        Upgrading</a>
+            </li>
+                                                                               
                                                                                
 
+      <li>
+    
+                          <a href="../../metron-analytics/index.html" 
title="Analytics">
+          <i class="icon-chevron-down"></i>
+        Analytics</a>
+                    <ul class="nav nav-list">
+                      
+      <li class="active">
+    
+            <a href="#"><i class="none"></i>Maas-service</a>
+          </li>
+                      
+      <li>
+    
+                          <a 
href="../../metron-analytics/metron-profiler/index.html" title="Profiler">
+          <i class="none"></i>
+        Profiler</a>
+            </li>
+                      
+      <li>
+    
+                          <a 
href="../../metron-analytics/metron-profiler-client/index.html" 
title="Profiler-client">
+          <i class="none"></i>
+        Profiler-client</a>
+            </li>
+                                                                        
+      <li>
+    
+                          <a 
href="../../metron-analytics/metron-statistics/index.html" title="Statistics">
+          <i class="icon-chevron-right"></i>
+        Statistics</a>
+                  </li>
+              </ul>
+        </li>
+                      
+      <li>
+    
+                          <a 
href="../../metron-contrib/metron-docker/index.html" title="Docker">
+          <i class="none"></i>
+        Docker</a>
+            </li>
+                                                                               
                                                                                
                                                                                
                                                                                
                                                                                
                                 
+      <li>
+    
+                          <a href="../../metron-deployment/index.html" 
title="Deployment">
+          <i class="icon-chevron-right"></i>
+        Deployment</a>
+                  </li>
+                      
+      <li>
+    
+                          <a 
href="../../metron-interface/metron-alerts/index.html" title="Alerts">
+          <i class="none"></i>
+        Alerts</a>
+            </li>
+                      
+      <li>
+    
+                          <a 
href="../../metron-interface/metron-config/index.html" title="Config">
+          <i class="none"></i>
+        Config</a>
+            </li>
+                      
+      <li>
+    
+                          <a 
href="../../metron-interface/metron-rest/index.html" title="Rest">
+          <i class="none"></i>
+        Rest</a>
+            </li>
+                                                                               
                                                                                
                                                                                
                   
+      <li>
+    
+                          <a href="../../metron-platform/index.html" 
title="Platform">
+          <i class="icon-chevron-right"></i>
+        Platform</a>
+                  </li>
+                                                                               
                             
+      <li>
+    
+                          <a href="../../metron-sensors/index.html" 
title="Sensors">
+          <i class="icon-chevron-right"></i>
+        Sensors</a>
+                  </li>
+                                                                        
+      <li>
+    
+                          <a 
href="../../metron-stellar/stellar-common/index.html" title="Stellar-common">
+          <i class="icon-chevron-right"></i>
+        Stellar-common</a>
+                  </li>
+                                                                        
+      <li>
+    
+                          <a href="../../use-cases/index.html" 
title="Use-cases">
+          <i class="icon-chevron-right"></i>
+        Use-cases</a>
+                  </li>
+              </ul>
+        </li>
+            </ul>
+                
+                    
+                
+          <hr class="divider" />
+
+           <div id="poweredBy">
+                            <div class="clear"></div>
+                            <div class="clear"></div>
+                            <div class="clear"></div>
+                             <a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy">
+        <img class="builtBy" alt="Built by Maven" 
src="../../images/logos/maven-feather.png" />
+      </a>
+                  </div>
+          </div>
+        </div>
+        
+                
+        <div id="bodyColumn"  class="span9" >
+                                  
+            <h1>Model Management Infrastructure</h1>
+<p><a name="Model_Management_Infrastructure"></a></p>
+<div class="section">
+<h2><a name="Introduction"></a>Introduction</h2>
+<p>One of the main features envisioned and requested is the ability to augment the threat intelligence and enrichment processes with insights derived from machine learning or statistical models. The challenges with this sort of infrastructure are:</p>
+
+<ul>
+  
+<li>Applying the model may be sufficiently computationally or resource intensive that we need to support scaling via load balancing, which will require service discovery and management.</li>
+  
+<li>Models require frequent, out-of-band training to react to growing threats and newly emerging patterns.</li>
+  
+<li>Models should be as language and environment agnostic as possible. Supported environments should include both small-data and big-data libraries and languages.</li>
+</ul>
+<p>To support a manageable, high-throughput environment, it is evident that:</p>
+
+<ul>
+  
+<li>Multiple versions of models will need to be exposed</li>
+  
+<li>Deployment should happen using Yarn to manage resources</li>
+  
+<li>Clients should have new model endpoints pushed to them</li>
+</ul></div>
+<div class="section">
+<h2><a name="Architecture"></a>Architecture</h2>
+<p><img src="../../images/maas_arch.png" alt="Architecture" /></p>
+<p>To support these requirements, the following components have been 
created:</p>
+
+<ul>
+  
+<li>A Yarn application which listens for model deployment requests and, upon executing them, registers the model endpoints in zookeeper. Each request specifies the following:
+  
+<ul>
+    
+<li>Operation type: ADD, REMOVE, LIST</li>
+    
+<li>Model Name</li>
+    
+<li>Model Version</li>
+    
+<li>Memory requirements (in megabytes)</li>
+    
+<li>Number of instances</li>
+  </ul></li>
+  
+<li>A command line deployment client which will localize the model payload 
onto HDFS and submit a model request</li>
+  
+<li>A Java client which will interact with zookeeper and receive updates about 
model state changes (new deployments, removals, etc.)</li>
+  
+<li>A series of Stellar functions for interacting with models deployed via the 
Model as a Service infrastructure.</li>
+</ul></div>
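+<p>As a rough illustration of the request fields listed above, the following Python sketch models a deployment request as a small data structure. The class, field and method names are hypothetical and chosen only for this example; they are not taken from the Metron implementation.</p>

```python
from dataclasses import dataclass
from enum import Enum


class Operation(Enum):
    """Operation types named in the component list."""
    ADD = "ADD"
    REMOVE = "REMOVE"
    LIST = "LIST"


@dataclass(frozen=True)
class ModelRequest:
    """Hypothetical container for the fields of a model request."""
    operation: Operation
    name: str
    version: str
    memory_mb: int       # memory requirement, in megabytes
    num_instances: int   # number of model instances to run

    def validate(self) -> None:
        # Basic sanity checks; the real service may enforce different rules.
        if not self.name:
            raise ValueError("model name must not be empty")
        if self.memory_mb <= 0:
            raise ValueError("memory must be a positive number of megabytes")
        if self.num_instances <= 0:
            raise ValueError("number of instances must be positive")


# One instance of the "dga" model, version 1.0, with 512 MB of memory.
request = ModelRequest(Operation.ADD, "dga", "1.0", 512, 1)
request.validate()
```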
+<div class="section">
+<h2><a name="maas_service.sh"></a><tt>maas_service.sh</tt></h2>
+<p>The <tt>maas_service.sh</tt> script starts the Yarn application which will listen for requests. Currently, the request queue is a distributed queue stored in <a class="externalLink" href="http://curator.apache.org/curator-recipes/distributed-queue.html">zookeeper</a> for convenience.</p>
+
+<div class="source">
+<div class="source">
+<pre>./maas_service.sh
+usage: MaaSClient
+ -c,--create                          Flag to indicate whether to create
+                                      the domain specified with -domain.
+ -d,--domain &lt;arg&gt;                    ID of the timeline domain where the
+                                      timeline entities will be put
+ -e,--shell_env &lt;arg&gt;                 Environment for shell script.
+                                      Specified as env_key=env_val pairs
+ -h,--help                            This screen
+ -j,--jar &lt;arg&gt;                       Jar file containing the application
+                                      master
+ -l,--log4j &lt;arg&gt;                     The log4j properties file to load
+ -ma,--modify_acls &lt;arg&gt;              Users and groups that allowed to
+                                      modify the timeline entities in the
+                                      given domain
+ -mc,--master_vcores &lt;arg&gt;            Amount of virtual cores to be
+                                      requested to run the application
+                                      master
+ -mm,--master_memory &lt;arg&gt;            Amount of memory in MB to be
+                                      requested to run the application
+                                      master
+ -nle,--node_label_expression &lt;arg&gt;   Node label expression to determine
+                                      the nodes where all the containers
+                                      of this application will be
+                                      allocated, &quot;&quot; means containers can
+                                      be allocated anywhere, if you don't
+                                      specify the option, default
+                                      node_label_expression of queue will
+                                      be used.
+ -q,--queue &lt;arg&gt;                     RM Queue in which this application
+                                      is to be submitted
+ -t,--timeout &lt;arg&gt;                   Application timeout in milliseconds
+ -va,--view_acls &lt;arg&gt;                Users and groups that allowed to
+                                      view the timeline entities in the
+                                      given domain
+ -zq,--zk_quorum &lt;arg&gt;                Zookeeper Quorum
+ -zr,--zk_root &lt;arg&gt;                  Zookeeper Root
+</pre></div></div></div>
+<div class="section">
+<h2><a name="maas_deploy.sh"></a><tt>maas_deploy.sh</tt></h2>
+<p>The <tt>maas_deploy.sh</tt> script allows users to deploy models and their collateral from their local disk to the cluster. It is assumed that:</p>
+
+<ul>
+  
+<li>The collateral contains exactly one <tt>.sh</tt> script capable of starting the endpoint</li>
+  
+<li>The model service executable will expose itself as a URL endpoint (e.g. as 
a REST interface, but not necessarily)</li>
+  
+<li>The model service executable will write out to local disk a JSON blob indicating the endpoint (see <a class="externalLink" href="https://gist.github.com/cestella/cba10aff0f970078a4c2c8cade3a4d1a#file-dga-py-L21">here</a> for an example mock service using Python and Flask).</li>
+</ul>
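+<p>The last assumption can be sketched in Python as follows: a service picks a free port and writes a JSON blob describing its endpoint to local disk. This is only a minimal sketch. The file name <tt>endpoint.dat</tt>, the <tt>url</tt> key and the helper names are assumptions made for illustration; consult the linked example for the format actually expected.</p>

```python
import json
import os
import socket
import tempfile


def find_free_port() -> int:
    """Ask the operating system for a currently unused TCP port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("", 0))
        return sock.getsockname()[1]


def write_endpoint(directory: str, host: str, port: int) -> str:
    """Write a JSON blob describing the endpoint and return the file path.

    The file name "endpoint.dat" and the "url" key are assumptions made
    for this sketch.
    """
    path = os.path.join(directory, "endpoint.dat")
    with open(path, "w") as handle:
        json.dump({"url": "http://%s:%d" % (host, port)}, handle)
    return path


workdir = tempfile.mkdtemp()
port = find_free_port()
endpoint_file = write_endpoint(workdir, "localhost", port)
```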
+
+<div class="source">
+<div class="source">
+<pre>./maas_deploy.sh
+usage: ModelSubmission
+ -h,--help                       This screen
+ -hmp,--hdfs_model_path &lt;arg&gt;    Model Path (HDFS)
+ -lmp,--local_model_path &lt;arg&gt;   Model Path (local)
+ -l,--log4j &lt;arg&gt;                The log4j properties file to load
+ -m,--memory &lt;arg&gt;               Memory for container
+ -mo,--mode &lt;arg&gt;                ADD, LIST or REMOVE
+ -n,--name &lt;arg&gt;                 Model Name
+ -ni,--num_instances &lt;arg&gt;       Number of model instances
+ -v,--version &lt;arg&gt;              Model version
+ -zq,--zk_quorum &lt;arg&gt;           Zookeeper Quorum
+ -zr,--zk_root &lt;arg&gt;             Zookeeper Root
+</pre></div></div></div>
+<div class="section">
+<h2><a name="Kerberos_Support"></a>Kerberos Support</h2>
+<p>Model as a Service will run on a kerberized cluster (see <a href="../../metron-deployment/vagrant/Kerberos-setup.html">here</a> for instructions for vagrant), with one caveat: the user who submits the service is also the user who executes the models on the cluster. In other words, user impersonation for deployed models is not currently supported.</p></div>
+<div class="section">
+<h2><a name="Stellar_Integration"></a>Stellar Integration</h2>
+<p>Two Stellar functions have been added to provide the ability to call out to models deployed via Model as a Service. The first retrieves a load-balanced endpoint of a deployed model given its name and, optionally, its version. The second calls that endpoint, assuming that it is exposed as a REST endpoint.</p>
+
+<ul>
+  
+<li><tt>MAAS_MODEL_APPLY(endpoint, function?, model_args)</tt> : Returns the output of a model deployed via Model as a Service at the given endpoint. <tt>endpoint</tt> is a map containing the <tt>name</tt>, <tt>version</tt> and <tt>url</tt> of the REST endpoint, <tt>function</tt> is the endpoint path and is optional, and <tt>model_args</tt> is a dictionary of arguments for the model (these become request params).</li>
+  
+<li><tt>MAAS_GET_ENDPOINT(model_name, model_version?)</tt> : Inspects zookeeper and returns a map containing the <tt>name</tt>, <tt>version</tt> and <tt>url</tt> for the model referred to by <tt>model_name</tt> and <tt>model_version</tt>. If <tt>model_version</tt> is not specified, the most current model associated with <tt>model_name</tt> is returned. If more than one matching model instance is deployed, one is selected at random with uniform probability.</li>
+</ul>
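+<p>To make the behaviour of the two functions more concrete, here is a small Python sketch of the same ideas. It is not the Stellar implementation. The in-memory <tt>registry</tt> stands in for the state kept in zookeeper, the helper names are hypothetical, "most current" is simplified to the lexicographically greatest version string, and the default path <tt>apply</tt> is an assumption.</p>

```python
import random
from urllib.parse import urlencode


def get_endpoint(registry, model_name, model_version=None, rng=random):
    """Pick one endpoint for a model, or None if nothing matches.

    `registry` is a hypothetical list of maps with name, version and url
    keys, standing in for the endpoints registered in zookeeper.
    """
    candidates = [e for e in registry if e["name"] == model_name]
    if not candidates:
        return None
    if model_version is None:
        # Simplification: treat the greatest version string as most current.
        model_version = max(e["version"] for e in candidates)
    candidates = [e for e in candidates if e["version"] == model_version]
    if not candidates:
        return None
    # Uniform random choice among the matching endpoints.
    return rng.choice(candidates)


def build_request_url(endpoint, model_args, function="apply"):
    """Build the URL for a REST call; the arguments become request params."""
    base = endpoint["url"].rstrip("/")
    return "%s/%s?%s" % (base, function.lstrip("/"), urlencode(model_args))


registry = [
    {"name": "dga", "version": "1.0", "url": "http://node1:36161"},
    {"name": "dga", "version": "1.0", "url": "http://node2:41234"},
]
endpoint = get_endpoint(registry, "dga")
url = build_request_url(endpoint, {"host": "caseystella.com"})
```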
+<p><a name="Example"></a></p>
+<h1>Example</h1>
+<p>Let&#x2019;s augment the <tt>squid</tt> proxy sensor to use a model that will determine whether the destination host was produced by a domain generating algorithm (DGA). For the purposes of demonstration, this model is deliberately simple and is implemented using Python with a REST interface exposed via the Flask python library.</p></div>
+<div class="section">
+<h2><a name="Install_Prerequisites_and_Mock_DGA_Service"></a>Install 
Prerequisites and Mock DGA Service</h2>
+<p>Now let&#x2019;s install some prerequisites:</p>
+
+<ul>
+  
+<li>Flask via <tt>yum install python-flask</tt></li>
+  
+<li>Jinja2 via <tt>yum install python-jinja2</tt></li>
+  
+<li>Squid client via <tt>yum install squid</tt></li>
+  
+<li>ES Head plugin via <tt>/usr/share/elasticsearch/bin/plugin install 
mobz/elasticsearch-head</tt></li>
+</ul>
+<p>Start Squid via <tt>service squid start</tt></p>
+<p>Now that we have Flask and Jinja2, we can create a mock DGA service to deploy with MaaS:</p>
+
+<ul>
+  
+<li>Download the files in <a class="externalLink" href="https://gist.github.com/cestella/cba10aff0f970078a4c2c8cade3a4d1a">this</a> gist into the <tt>$HOME/mock_dga</tt> directory</li>
+  
+<li>Make <tt>rest.sh</tt> executable via <tt>chmod +x 
$HOME/mock_dga/rest.sh</tt></li>
+</ul>
+<p>This service will treat <tt>yahoo.com</tt> and <tt>amazon.com</tt> as legit and everything else as malicious. The contract is that the REST service exposes an endpoint <tt>/apply</tt> and returns a JSON map with a single key, <tt>is_malicious</tt>, whose value is either <tt>malicious</tt> or <tt>legit</tt>.</p></div>
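+<p>The contract described above can be sketched with nothing but the Python standard library. This is an illustrative stand-in and not the Flask service from the gist: the handler class, the <tt>classify</tt> and <tt>query</tt> helpers, and the use of a <tt>host</tt> query parameter (as in the curl check later in this example) are assumptions of the sketch.</p>

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlencode, urlparse

LEGIT_HOSTS = {"yahoo.com", "amazon.com"}


def classify(host):
    """Return the single-key JSON map described by the contract."""
    verdict = "legit" if host in LEGIT_HOSTS else "malicious"
    return {"is_malicious": verdict}


class ApplyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        parsed = urlparse(self.path)
        if parsed.path != "/apply":
            self.send_error(404)
            return
        host = parse_qs(parsed.query).get("host", [""])[0]
        body = json.dumps(classify(host)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


def query(port, host):
    """Call /apply on the local mock service and decode the JSON reply."""
    conn = HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", "/apply?" + urlencode({"host": host}))
        return json.loads(conn.getresponse().read().decode("utf-8"))
    finally:
        conn.close()


server = HTTPServer(("127.0.0.1", 0), ApplyHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
result_legit = query(server.server_port, "yahoo.com")
result_other = query(server.server_port, "caseystella.com")
server.shutdown()
server.server_close()
```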
+<div class="section">
+<h2><a name="Deploy_Mock_DGA_Service_via_MaaS"></a>Deploy Mock DGA Service via 
MaaS</h2>
+<p>The following presumes that you are logged in as a user who has a home directory in HDFS under <tt>/user/$USER</tt>. If you do not, please create one and ensure the permissions are set appropriately:</p>
+
+<div class="source">
+<div class="source">
+<pre>su - hdfs -c &quot;hadoop fs -mkdir /user/$USER&quot;
+su - hdfs -c &quot;hadoop fs -chown $USER:$USER /user/$USER&quot;
+</pre></div></div>
+<p>Or, in the common case for the <tt>metron</tt> user:</p>
+
+<div class="source">
+<div class="source">
+<pre>su - hdfs -c &quot;hadoop fs -mkdir /user/metron&quot;
+su - hdfs -c &quot;hadoop fs -chown metron:metron /user/metron&quot;
+</pre></div></div>
+<p>Now let&#x2019;s start MaaS and deploy the Mock DGA Service:</p>
+
+<ul>
+  
+<li>Start MaaS via <tt>$METRON_HOME/bin/maas_service.sh -zq 
node1:2181</tt></li>
+  
+<li>Start one instance of the mock DGA model with 512M of memory via 
<tt>$METRON_HOME/bin/maas_deploy.sh -zq node1:2181 -lmp $HOME/mock_dga -hmp 
/user/$USER/models -mo ADD -m 512 -n dga -v 1.0 -ni 1</tt></li>
+  
+<li>As a sanity check:
+  
+<ul>
+    
+<li>Ensure that the model is running via <tt>$METRON_HOME/bin/maas_deploy.sh -zq node1:2181 -mo LIST</tt>. You should see <tt>Model dga @ 1.0</tt> displayed and, beneath it, a URL similar to (but not exactly) <tt>http://node1:36161</tt></li>
+    
+<li>Try to hit the model via curl, substituting the port reported by the previous step: <tt>curl 'http://localhost:36161/apply?host=caseystella.com'</tt>. Ensure that it returns a JSON map indicating the domain is malicious.</li>
+  </ul></li>
+</ul></div>
+<div class="section">
+<h2><a name="Adjust_Configurations_for_Squid_to_Call_Model"></a>Adjust 
Configurations for Squid to Call Model</h2>
+<p>Now that we have a deployed model, let&#x2019;s adjust the configurations 
for the Squid topology to annotate the messages with the output of the 
model.</p>
+
+<ul>
+  
+<li>Edit the squid parser configuration at 
<tt>$METRON_HOME/config/zookeeper/parsers/squid.json</tt> in your favorite text 
editor and add a new FieldTransformation to indicate a threat alert based on 
the model (note the addition of <tt>is_malicious</tt> and 
<tt>is_alert</tt>):</li>
+</ul>
+
+<div class="source">
+<div class="source">
+<pre>{
+  &quot;parserClassName&quot;: &quot;org.apache.metron.parsers.GrokParser&quot;,
+  &quot;sensorTopic&quot;: &quot;squid&quot;,
+  &quot;parserConfig&quot;: {
+    &quot;grokPath&quot;: &quot;/patterns/squid&quot;,
+    &quot;patternLabel&quot;: &quot;SQUID_DELIMITED&quot;,
+    &quot;timestampField&quot;: &quot;timestamp&quot;
+  },
+  &quot;fieldTransformations&quot;: [
+    {
+      &quot;transformation&quot;: &quot;STELLAR&quot;,
+      &quot;output&quot;: [ &quot;full_hostname&quot;, &quot;domain_without_subdomains&quot;, &quot;is_malicious&quot;, &quot;is_alert&quot; ],
+      &quot;config&quot;: {
+        &quot;full_hostname&quot;: &quot;URL_TO_HOST(url)&quot;,
+        &quot;domain_without_subdomains&quot;: &quot;DOMAIN_REMOVE_SUBDOMAINS(full_hostname)&quot;,
+        &quot;is_malicious&quot;: &quot;MAP_GET('is_malicious', MAAS_MODEL_APPLY(MAAS_GET_ENDPOINT('dga'), {'host' : domain_without_subdomains}))&quot;,
+        &quot;is_alert&quot;: &quot;if is_malicious == 'malicious' then 'true' else null&quot;
+      }
+    }
+  ]
+}
+</pre></div></div>
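+<p>For readers less familiar with Stellar, the chain of transformations in this parser configuration can be approximated in Python as shown below. This is a rough sketch and not how Metron evaluates the expressions: the helper functions are hypothetical, the subdomain removal naively keeps the last two labels of the hostname, the model call is replaced by a local function, and <tt>is_alert</tt> is simply omitted when the expression yields <tt>null</tt>.</p>

```python
from urllib.parse import urlparse

LEGIT_HOSTS = {"yahoo.com", "amazon.com"}


def mock_model(host):
    """Local stand-in for the deployed mock DGA model."""
    return {"is_malicious": "legit" if host in LEGIT_HOSTS else "malicious"}


def url_to_host(url):
    """Extract the hostname from a URL."""
    return urlparse(url).hostname


def remove_subdomains(hostname):
    # Naive simplification: keep only the last two labels of the hostname.
    return ".".join(hostname.split(".")[-2:])


def transform(message, model=mock_model):
    """Return a copy of the message with the output fields added."""
    out = dict(message)
    out["full_hostname"] = url_to_host(out["url"])
    out["domain_without_subdomains"] = remove_subdomains(out["full_hostname"])
    out["is_malicious"] = model(out["domain_without_subdomains"]).get("is_malicious")
    if out["is_malicious"] == "malicious":
        out["is_alert"] = "true"  # otherwise the field is left unset
    return out


legit_message = transform({"url": "http://www.yahoo.com/index.html"})
alert_message = transform({"url": "http://cnn.com"})
```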
+
+<ul>
+  
+<li>Edit the squid enrichment configuration at 
<tt>$METRON_HOME/config/zookeeper/enrichments/squid.json</tt> (this file will 
not exist, so create a new one) to make the threat triage adjust the level of 
risk based on the model output:</li>
+</ul>
+
+<div class="source">
+<div class="source">
+<pre>{
+  &quot;enrichment&quot; : {
+    &quot;fieldMap&quot;: {}
+  },
+  &quot;threatIntel&quot; : {
+    &quot;fieldMap&quot;:{},
+    &quot;triageConfig&quot; : {
+      &quot;riskLevelRules&quot; : [
+        {
+          &quot;rule&quot; : &quot;is_malicious == 'malicious'&quot;,
+          &quot;score&quot; : 100
+        }
+      ],
+      &quot;aggregator&quot; : &quot;MAX&quot;
+    }
+  }
+}
+</pre></div></div>
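+<p>The triage configuration above pairs a rule with a score and combines the scores of matching rules with the <tt>MAX</tt> aggregator. A minimal Python sketch of that idea follows. The rule is written as a Python predicate rather than a Stellar expression, the function name is hypothetical, and returning 0 when no rule matches is an assumption of this sketch.</p>

```python
def triage(message, rules, aggregator=max):
    """Aggregate the scores of all rules whose predicate matches the message."""
    scores = [score for predicate, score in rules if predicate(message)]
    # Assumption: a message that matches no rule gets a score of 0.
    return aggregator(scores) if scores else 0


# One rule, mirroring the configuration: score 100 when the model says malicious.
rules = [
    (lambda message: message.get("is_malicious") == "malicious", 100),
]

malicious_score = triage({"is_malicious": "malicious"}, rules)
legit_score = triage({"is_malicious": "legit"}, rules)
```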
+
+<ul>
+  
+<li>Upload new configs via <tt>$METRON_HOME/bin/zk_load_configs.sh --mode PUSH 
-i $METRON_HOME/config/zookeeper -z node1:2181</tt></li>
+  
+<li>Create the squid topic in kafka via <tt>/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --zookeeper node1:2181 --create --topic squid --partitions 1 --replication-factor 1</tt></li>
+</ul></div>
+<div class="section">
+<h2><a name="Start_Topologies_and_Send_Data"></a>Start Topologies and Send 
Data</h2>
+<p>Now we need to start the topologies and send some data:</p>
+
+<ul>
+  
+<li>Start the squid topology via <tt>$METRON_HOME/bin/start_parser_topology.sh 
-k node1:6667 -z node1:2181 -s squid</tt></li>
+  
+<li>Generate some data via the squid client:
+  
+<ul>
+    
+<li>Generate a legit example: <tt>squidclient http://yahoo.com</tt></li>
+    
+<li>Generate a malicious example: <tt>squidclient http://cnn.com</tt></li>
+  </ul></li>
+  
+<li>Send the data to kafka via <tt>cat /var/log/squid/access.log | 
/usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list 
node1:6667 --topic squid</tt></li>
+  
+<li>Browse the data in elasticsearch via the ES Head plugin at <a class="externalLink" href="http://node1:9200/_plugin/head/">http://node1:9200/_plugin/head/</a> and verify that the squid index contains two documents:
+  
+<ul>
+    
+<li>One from <tt>yahoo.com</tt> which does not have <tt>is_alert</tt> set and 
does have <tt>is_malicious</tt> set to <tt>legit</tt></li>
+    
+<li>One from <tt>cnn.com</tt> which does have <tt>is_alert</tt> set to 
<tt>true</tt>, <tt>is_malicious</tt> set to <tt>malicious</tt> and 
<tt>threat:triage:level</tt> set to 100</li>
+  </ul></li>
+</ul></div>
+                  </div>
+            </div>
+          </div>
+
+    <hr/>
+
+    <footer>
+            <div class="container-fluid">
+              <div class="row span12">Copyright &copy;                    2017
+                        <a href="https://www.apache.org">The Apache Software Foundation</a>.
+            All Rights Reserved.      
+                    
+      </div>
+
+                          
+        
+                </div>
+    </footer>
+  </body>
+</html>

Added: 
release/metron/0.4.1/site-book/metron-analytics/metron-profiler-client/index.html
==============================================================================
--- 
release/metron/0.4.1/site-book/metron-analytics/metron-profiler-client/index.html
 (added)
+++ 
release/metron/0.4.1/site-book/metron-analytics/metron-profiler-client/index.html
 Fri Sep 15 23:37:46 2017
@@ -0,0 +1,913 @@
+<!DOCTYPE html>
+<!--
+ | Generated by Apache Maven Doxia at 2017-09-08
+ | Rendered using Apache Maven Fluido Skin 1.3.0
+-->
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
+  <head>
+    <meta charset="UTF-8" />
+    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+    <meta name="Date-Revision-yyyymmdd" content="20170908" />
+    <meta http-equiv="Content-Language" content="en" />
+    <title>Metron &#x2013; Metron Profiler Client</title>
+    <link rel="stylesheet" href="../../css/apache-maven-fluido-1.3.0.min.css" 
/>
+    <link rel="stylesheet" href="../../css/site.css" />
+    <link rel="stylesheet" href="../../css/print.css" media="print" />
+
+      
+    <script type="text/javascript" 
src="../../js/apache-maven-fluido-1.3.0.min.js"></script>
+
+                          
+        
+<script type="text/javascript">$( document ).ready( function() { $( 
'.carousel' ).carousel( { interval: 3500 } ) } );</script>
+          
+            </head>
+        <body class="topBarDisabled">
+          
+                
+                    
+    
+        <div class="container-fluid">
+          <div id="banner">
+        <div class="pull-left">
+                                    <a href="http://metron.apache.org/" id="bannerLeft">
+                                                                               
                 <img src="../../images/metron-logo.png"  alt="Apache Metron" 
width="148px" height="48px"/>
+                </a>
+                      </div>
+        <div class="pull-right">  </div>
+        <div class="clear"><hr/></div>
+      </div>
+
+      <div id="breadcrumbs">
+        <ul class="breadcrumb">
+                
+                    
+                              <li class="">
+                    <a href="http://www.apache.org" class="externalLink" title="Apache">
+        Apache</a>
+        </li>
+      <li class="divider ">/</li>
+            <li class="">
+                    <a href="http://metron.apache.org/" class="externalLink" title="Metron">
+        Metron</a>
+        </li>
+      <li class="divider ">/</li>
+            <li class="">
+                    <a href="../../index.html" title="Documentation">
+        Documentation</a>
+        </li>
+      <li class="divider ">/</li>
+        <li class="">Metron Profiler Client</li>
+        
+                
+                    
+                  <li id="publishDate" class="pull-right">Last Published: 
2017-09-08</li> <li class="divider pull-right">|</li>
+              <li id="projectVersion" class="pull-right">Version: 0.4.1</li>
+            
+                            </ul>
+      </div>
+
+            
+      <div class="row-fluid">
+        <div id="leftColumn" class="span3">
+          <div class="well sidebar-nav">
+                
+                    
+                <ul class="nav nav-list">
+                    <li class="nav-header">User Documentation</li>
+                                                                               
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                          
+      <li>
+    
+                          <a href="../../index.html" title="Metron">
+          <i class="icon-chevron-down"></i>
+        Metron</a>
+                    <ul class="nav nav-list">
+                      
+      <li>
+    
+                          <a href="../../Upgrading.html" title="Upgrading">
+          <i class="none"></i>
+        Upgrading</a>
+            </li>
+                                                                               
                                                                                
 
+      <li>
+    
+                          <a href="../../metron-analytics/index.html" 
title="Analytics">
+          <i class="icon-chevron-down"></i>
+        Analytics</a>
+                    <ul class="nav nav-list">
+                      
+      <li>
+    
+                          <a 
href="../../metron-analytics/metron-maas-service/index.html" 
title="Maas-service">
+          <i class="none"></i>
+        Maas-service</a>
+            </li>
+                      
+      <li>
+    
+                          <a 
href="../../metron-analytics/metron-profiler/index.html" title="Profiler">
+          <i class="none"></i>
+        Profiler</a>
+            </li>
+                      
+      <li class="active">
+    
+            <a href="#"><i class="none"></i>Profiler-client</a>
+          </li>
+                                                                        
+      <li>
+    
+                          <a 
href="../../metron-analytics/metron-statistics/index.html" title="Statistics">
+          <i class="icon-chevron-right"></i>
+        Statistics</a>
+                  </li>
+              </ul>
+        </li>
+                      
+      <li>
+    
+                          <a 
href="../../metron-contrib/metron-docker/index.html" title="Docker">
+          <i class="none"></i>
+        Docker</a>
+            </li>
+                                                                               
                                                                                
                                                                                
                                                                                
                                                                                
                                 
+      <li>
+    
+                          <a href="../../metron-deployment/index.html" 
title="Deployment">
+          <i class="icon-chevron-right"></i>
+        Deployment</a>
+                  </li>
+                      
+      <li>
+    
+                          <a 
href="../../metron-interface/metron-alerts/index.html" title="Alerts">
+          <i class="none"></i>
+        Alerts</a>
+            </li>
+                      
+      <li>
+    
+                          <a 
href="../../metron-interface/metron-config/index.html" title="Config">
+          <i class="none"></i>
+        Config</a>
+            </li>
+                      
+      <li>
+    
+                          <a 
href="../../metron-interface/metron-rest/index.html" title="Rest">
+          <i class="none"></i>
+        Rest</a>
+            </li>
+                                                                               
                                                                                
                                                                                
                   
+      <li>
+    
+                          <a href="../../metron-platform/index.html" 
title="Platform">
+          <i class="icon-chevron-right"></i>
+        Platform</a>
+                  </li>
+                                                                               
                             
+      <li>
+    
+                          <a href="../../metron-sensors/index.html" 
title="Sensors">
+          <i class="icon-chevron-right"></i>
+        Sensors</a>
+                  </li>
+                                                                        
+      <li>
+    
+                          <a 
href="../../metron-stellar/stellar-common/index.html" title="Stellar-common">
+          <i class="icon-chevron-right"></i>
+        Stellar-common</a>
+                  </li>
+                                                                        
+      <li>
+    
+                          <a href="../../use-cases/index.html" 
title="Use-cases">
+          <i class="icon-chevron-right"></i>
+        Use-cases</a>
+                  </li>
+              </ul>
+        </li>
+            </ul>
+                
+                    
+                
+          <hr class="divider" />
+
+           <div id="poweredBy">
+                            <div class="clear"></div>
+                            <div class="clear"></div>
+                            <div class="clear"></div>
+                             <a href="http://maven.apache.org/" title="Built 
by Maven" class="poweredBy">
+        <img class="builtBy" alt="Built by Maven" 
src="../../images/logos/maven-feather.png" />
+      </a>
+                  </div>
+          </div>
+        </div>
+        
+                
+        <div id="bodyColumn"  class="span9" >
+                                  
+            <h1>Metron Profiler Client</h1>
+<p><a name="Metron_Profiler_Client"></a></p>
+<p>This project provides a client API for accessing the profiles generated by 
the <a href="../metron-profiler/index.html">Metron Profiler</a>. This includes 
both a Java API and Stellar API for accessing the profile data. The primary use 
case is to extract profile data for use during model scoring.</p>
+<div class="section">
+<h2><a name="Stellar_Client_API"></a>Stellar Client API</h2>
+<div class="section">
+<h3><a name="PROFILE_GET"></a><tt>PROFILE_GET</tt></h3>
+<p>The <tt>PROFILE_GET</tt> command allows you to select all of the profile 
measurements written. This command takes the following arguments:</p>
+
+<div class="source">
+<div class="source">
+<pre>REQUIRED:
+    profile - The name of the profile
+    entity - The name of the entity
+    periods - The list of profile periods to grab.  These are ProfilePeriod 
objects.
+OPTIONAL:
+    groups_list - Optional, must correspond to the 'groupBy' list used in 
profile creation - List (in square brackets) of 
+            groupBy values used to filter the profile. Default is the empty 
list, meaning groupBy was not used when 
+            creating the profile.
+    config_overrides - Optional - Map (in curly braces) of name:value pairs, 
each overriding the global config parameter
+            of the same name. Default is the empty Map, meaning no overrides.
+</pre></div></div>
+<p>There is an older calling format where <tt>groups_list</tt> is specified as 
a sequence of group names, &#x201c;varargs&#x201d; style, instead of a List 
object. This format is still supported for backward compatibility, but it is 
deprecated, and it is disallowed if the optional <tt>config_overrides</tt> 
argument is used.</p>
+<p>The <tt>periods</tt> field is (likely) the output of another Stellar 
function which defines the times to include.</p>
+<div class="section">
+<h4><a name="Groups_list_argument"></a>Groups_list argument</h4>
+<p>The <tt>groups_list</tt> argument in the client must exactly correspond to 
the <a href="../metron-profiler/index.html#groupBy"><tt>groupBy</tt></a> 
configuration in the profile definition. If <tt>groupBy</tt> was not used in 
the profile, <tt>groups_list</tt> must be empty in the client. If 
<tt>groupBy</tt> was used in the profile, then the client <tt>groups_list</tt> 
is <b>not</b> optional; it must be the same length as the <tt>groupBy</tt> 
list, and specify exactly one selected group value for each <tt>groupBy</tt> 
criterion, in the same order. For example:</p>
+
+<div class="source">
+<div class="source">
+<pre>If in Profile, the groupBy criteria are:  [ 
&#x201c;DAY_OF_WEEK()&#x201d;, &#x201c;URL_TO_PORT()&#x201d; ]
+Then in PROFILE_GET, an allowed groups value would be:  [ &#x201c;3&#x201d;, 
&#x201c;8080&#x201d; ]
+which will select only records from Tuesdays with port number 8080.
+</pre></div></div></div>
+<div class="section">
+<h4><a 
name="Configuration_and_the_config_overrides_argument"></a>Configuration and 
the config_overrides argument</h4>
+<p>By default, the Profiler creates profiles with a period duration of 15 
minutes. This means that data is accumulated, summarized and flushed every 15 
minutes. The Client API must also have knowledge of this duration to correctly 
retrieve the profile data. If the Client is expecting 15 minute periods, it 
will not be able to read data generated by a Profiler that was configured for 1 
hour periods, and will return zero results. </p>
+<p>Similarly, all six Client configuration parameters listed in the table 
below must match the Profiler configuration parameter settings from the time 
the profile was created. The period duration and other configuration parameters 
of the Profiler topology are stored on the local filesystem at 
<tt>$METRON_HOME/config/profiler.properties</tt>. The Stellar Client API can be 
configured correspondingly by setting the following properties in 
Metron&#x2019;s global configuration, on the local filesystem at 
<tt>$METRON_HOME/config/zookeeper/global.json</tt>, and then uploading it to Zookeeper 
(at <tt>/metron/topology/global</tt>) using <tt>zk_load_configs.sh</tt>:</p>
+
+<div class="source">
+<div class="source">
+<pre>    $ cd $METRON_HOME
+    $ bin/zk_load_configs.sh -m PUSH -i config/zookeeper/ -z node1:2181
+</pre></div></div>
+<p>Any of these six Client configuration parameters may be overridden at run 
time using the <tt>config_overrides</tt> Map argument in PROFILE_GET. The 
primary use case is when historical profiles have been created with a different 
Profiler configuration than is currently configured, and the analyst needing to 
access them does not want to change the global Client configuration so as not 
to disrupt the work of other analysts working with current profiles.</p>
+
+<table border="0" class="table table-striped">
+  <thead>
+    
+<tr class="a">
+      
+<th>Key </th>
+      
+<th>Description </th>
+      
+<th>Required </th>
+      
+<th>Default </th>
+    </tr>
+  </thead>
+  <tbody>
+    
+<tr class="b">
+      
+<td>profiler.client.period.duration </td>
+      
+<td>The duration of each profile period. This value should be defined along 
with <tt>profiler.client.period.duration.units</tt>. </td>
+      
+<td>Optional </td>
+      
+<td>15 </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>profiler.client.period.duration.units </td>
+      
+<td>The units used to specify the profile period duration. This value should 
be defined along with <tt>profiler.client.period.duration</tt>. </td>
+      
+<td>Optional </td>
+      
+<td>MINUTES </td>
+    </tr>
+    
+<tr class="b">
+      
+<td>profiler.client.hbase.table </td>
+      
+<td>The name of the HBase table used to store profile data. </td>
+      
+<td>Optional </td>
+      
+<td>profiler </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>profiler.client.hbase.column.family </td>
+      
+<td>The name of the HBase column family used to store profile data. </td>
+      
+<td>Optional </td>
+      
+<td>P </td>
+    </tr>
+    
+<tr class="b">
+      
+<td>profiler.client.salt.divisor </td>
+      
+<td>The salt divisor used to store profile data. </td>
+      
+<td>Optional </td>
+      
+<td>1000 </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>hbase.provider.impl </td>
+      
+<td>The name of the HBaseTableProvider implementation class. </td>
+      
+<td>Optional </td>
+      
+<td> </td>
+    </tr>
+  </tbody>
+</table></div></div>
+<div class="section">
+<h3><a name="Profile_Selectors"></a>Profile Selectors</h3>
+<p>You will notice that the third argument for <tt>PROFILE_GET</tt> is a list 
of <tt>ProfilePeriod</tt> objects. This list is expected to be produced by 
another Stellar function. There are a couple options available.</p>
+<div class="section">
+<h4><a name="PROFILE_FIXED"></a><tt>PROFILE_FIXED</tt></h4>
+<p>The profiler periods associated with a fixed lookback starting from now. 
These are ProfilePeriod objects.</p>
+
+<div class="source">
+<div class="source">
+<pre>REQUIRED:
+    durationAgo - How long ago should values be retrieved from?
+    units - The units of 'durationAgo'.
+OPTIONAL:
+    config_overrides - Optional - Map (in curly braces) of name:value pairs, 
each overriding the global config parameter
+            of the same name. Default is the empty Map, meaning no overrides.
+
+e.g. To retrieve all the profiles for the last 5 hours.  
PROFILE_GET('profile', 'entity', PROFILE_FIXED(5, 'HOURS'))
+</pre></div></div>
+<p>Note that the <tt>config_overrides</tt> parameter operates exactly as the 
<tt>config_overrides</tt> argument in <tt>PROFILE_GET</tt>. The only available 
parameters for override are:</p>
+
+<ul>
+  
+<li><tt>profiler.client.period.duration</tt></li>
+  
+<li><tt>profiler.client.period.duration.units</tt></li>
+</ul></div>
+<div class="section">
+<h4><a name="PROFILE_WINDOW"></a><tt>PROFILE_WINDOW</tt></h4>
+<p><tt>PROFILE_WINDOW</tt> is intended to provide a finer level of control 
over selecting windows for profiles:</p>
+
+<ul>
+  
+<li>Specify windows relative to the data timestamp (see the optional 
<tt>now</tt> parameter below)</li>
+  
+<li>Specify non-contiguous windows to better handle seasonal data (e.g. the 
last hour for every day for the last month)</li>
+  
+<li>Specify profile output excluding holidays</li>
+  
+<li>Specify only profile output on a specific day of the week</li>
+</ul>
+<p>It does this through a domain-specific language, mimicking natural language, 
that defines which windows are selected.</p>
+
+<div class="source">
+<div class="source">
+<pre>REQUIRED:
+    windowSelector - The statement specifying the window to select.
+OPTIONAL:
+    now - Optional - The timestamp to use for now.
+    config_overrides - Optional - Map (in curly braces) of name:value pairs, 
each overriding the global config parameter
+            of the same name. Default is the empty Map, meaning no overrides.
+
+e.g. To retrieve all the measurements written for 'profile' and 'entity' for 
the last hour 
+on the same weekday excluding weekends and US holidays across the last 14 
days: 
+PROFILE_GET('profile', 'entity', PROFILE_WINDOW('1 hour window every 24 hours 
starting from 14 days ago including the current day of the week excluding 
weekends, holidays:us'))
+</pre></div></div>
+<p>Note that the <tt>config_overrides</tt> parameter operates exactly as the 
<tt>config_overrides</tt> argument in <tt>PROFILE_GET</tt>. The only available 
parameters for override are:</p>
+
+<ul>
+  
+<li><tt>profiler.client.period.duration</tt></li>
+  
+<li><tt>profiler.client.period.duration.units</tt></li>
+</ul>
+<div class="section">
+<h5><a name="The_Profile_Selector_Language"></a>The Profile Selector 
Language</h5>
+<p>The domain-specific language can be broken into a series of clauses, some 
of them optional:</p>
+
+<ul>
+  
+<li><a href="#Total_Temporal_Duration"><span style="color:blue">Total Temporal 
Duration</span></a> - The total range of time in which windows may be 
specified</li>
+  
+<li><a href="#Temporal_Window_Width"><span style="color:red">Temporal 
Window Width</span></a> - How large each temporal window is</li>
+  
+<li><a href="#Skip_distance"><span style="color:green">Skip 
distance</span></a> (optional) - How far to skip between when one window starts 
and when the next begins</li>
+  
+<li><a href="#InclusionExclusion_specifiers"><span 
style="color:purple">Inclusion/Exclusion specifiers</span></a> (optional) - The 
set of specifiers to further filter the window</li>
+</ul>
+<p>One <i>must</i> specify either a total temporal duration or a temporal 
window width. The remaining clauses are optional. During the course of the 
following discussion, we will color code the clauses in the examples and link 
them to the relevant section for more detail.</p>
+<p>From a high level, the language fits the following three forms, which are 
composed of the clauses above:</p>
+
+<ul>
+  
+<li><a href="#Temporal_Window_Width"><span style="color:red">time_interval 
WINDOW?</span></a> <a href="#InclusionExclusion_specifiers"><span 
style="color:purple">(INCLUDING specifier_list)? (EXCLUDING 
specifier_list)?</span></a></li>
+  
+<li><a href="#Temporal_Window_Width"><span style="color:red">time_interval 
WINDOW?</span></a> <a href="#Skip_distance"><span style="color:green">EVERY 
time_interval</span></a> <a href="#Total_Temporal_Duration"><span 
style="color:blue">FROM time_interval (TO time_interval)?</span></a> <a 
href="#InclusionExclusion_specifiers"><span style="color:purple">(INCLUDING 
specifier_list)? (EXCLUDING specifier_list)?</span></a></li>
+  
+<li><a href="#Total_Temporal_Duration"><span style="color:blue">FROM 
time_interval (TO time_interval)?</span></a></li>
+</ul>
+<div class="section">
+<h6><a name="Total_Temporal_Duration"></a><span style="color:blue">Total 
Temporal Duration</span></h6>
+<p>Total temporal duration is specified by a phrase: <tt>FROM time_interval 
AGO TO time_interval AGO</tt> This indicates the beginning and ending of a time 
interval. This is an inclusive duration.</p>
+
+<ul>
+  
+<li><tt>FROM</tt> - Can be the words &#x201c;from&#x201d; or &#x201c;starting 
from&#x201d;</li>
+  
+<li><tt>time_interval</tt> - A time amount followed by a unit (e.g. 1 hour). 
Fractional amounts are not supported. The unit may be &#x201c;minute&#x201d;, 
&#x201c;day&#x201d;, &#x201c;hour&#x201d; with any pluralization.</li>
+  
+<li><tt>TO</tt> - Can be the words &#x201c;until&#x201d; or 
&#x201c;to&#x201d;</li>
+  
+<li><tt>AGO</tt> - Optionally the word &#x201c;ago&#x201d;</li>
+</ul>
+<p>The <tt>TO time_interval AGO</tt> portion is optional. If unspecified then 
it is expected that the time interval ends now.</p>
+<p>Due to the vagaries of the English language, the from and the to portions, 
if both are specified, are interchangeable with regard to which one specifies the 
start and which specifies the end. </p>
+<p>In other words &#x201c;<a href="#Total_Temporal_Duration"><span 
style="color:blue">starting from 1 hour ago to 30 minutes 
ago</span></a>&#x201d; and &#x201c;<a href="#Total_Temporal_Duration"><span 
style="color:blue">starting from 30 minutes ago to 1 hour 
ago</span></a>&#x201d; specify the same temporal duration.</p>
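+<p>The interchangeability of the from and to portions can be illustrated with a 
minimal Python sketch. This is not the Metron implementation; the function name 
<tt>total_duration</tt> and its parameters are hypothetical and only model the 
behavior described above.</p>

```python
from datetime import datetime, timedelta, timezone


def total_duration(now, from_ago, to_ago=timedelta(0)):
    """Hypothetical model of 'FROM x AGO TO y AGO'.

    Returns (start, end). The earlier instant is always the start, so the
    order in which the two offsets are given does not matter. If the second
    offset is omitted, the duration ends at 'now'.
    """
    first = now - from_ago
    second = now - to_ago
    return (min(first, second), max(first, second))


now = datetime(2017, 9, 8, 12, 0, tzinfo=timezone.utc)
a = total_duration(now, timedelta(hours=1), timedelta(minutes=30))
b = total_duration(now, timedelta(minutes=30), timedelta(hours=1))
print(a == b)  # both phrasings describe the same duration
```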
+<p><b>Examples</b></p>
+
+<ul>
+  
+<li>A duration starting 1 hour ago and ending now
+  
+<ul>
+    
+<li><a href="#Total_Temporal_Duration"><span style="color:blue">from 1 hour 
ago</span></a></li>
+    
+<li><a href="#Total_Temporal_Duration"><span style="color:blue">from 1 
hour</span></a></li>
+    
+<li><a href="#Total_Temporal_Duration"><span style="color:blue">starting from 
1 hour ago</span></a></li>
+    
+<li><a href="#Total_Temporal_Duration"><span style="color:blue">starting from 
1 hour</span></a></li>
+  </ul></li>
+  
+<li>A duration starting 1 hour ago and ending 30 minutes ago:
+  
+<ul>
+    
+<li><a href="#Total_Temporal_Duration"><span style="color:blue">from 1 hour 
ago until 30 minutes ago</span></a></li>
+    
+<li><a href="#Total_Temporal_Duration"><span style="color:blue">from 30 
minutes ago until 1 hour ago</span></a></li>
+    
+<li><a href="#Total_Temporal_Duration"><span style="color:blue">starting from 
1 hour ago to 30 minutes ago</span></a></li>
+    
+<li><a href="#Total_Temporal_Duration"><span style="color:blue">starting from 
1 hour to 30 minutes</span></a></li>
+  </ul></li>
+</ul></div>
+<div class="section">
+<h6><a name="Temporal_Window_Width"></a><span style="color:red">Temporal 
Window Width</span></h6>
+<p>Temporal window width is the specification of a window. A window may 
either repeat within the total temporal duration or fill the total temporal 
duration. This is an inclusive window. A window is specified by the phrase: 
<tt>time_interval WINDOW</tt></p>
+
+<ul>
+  
+<li><tt>time_interval</tt> - A time amount followed by a unit (e.g. 1 hour). 
Fractional amounts are not supported. The unit may be &#x201c;minute&#x201d;, 
&#x201c;day&#x201d;, &#x201c;hour&#x201d; with any pluralization.</li>
+  
+<li><tt>WINDOW</tt> - Optionally the word &#x201c;window&#x201d;</li>
+</ul>
+<p><b>Examples</b></p>
+
+<ul>
+  
+<li>A fixed window starting 2 hours ago and going until now
+  
+<ul>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">2 
hour</span></a></li>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">2 
hours</span></a></li>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">2 hours 
window</span></a></li>
+  </ul></li>
+  
+<li>A repeating 30 minute window starting 2 hours ago and repeating every hour 
until now. This would result in 2 30-minute wide windows: 2 hours ago and 1 
hour ago
+  
+<ul>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minute 
window</span></a> <a href="#Skip_distance"><span style="color:green">every 1 
hour</span></a> <a href="#Total_Temporal_Duration"><span 
style="color:blue">starting from 2 hours ago</span></a></li>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minute 
windows</span></a> <a href="#Skip_distance"><span style="color:green">every 1 
hour</span></a> <a href="#Total_Temporal_Duration"><span 
style="color:blue">from 2 hours ago</span></a></li>
+  </ul></li>
+  
+<li>A repeating 30 minute window starting 2 hours ago and repeating every hour 
until 30 minutes ago. This would result in 2 30-minute wide windows: 2 hours 
ago and 1 hour ago
+  
+<ul>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minute 
window</span></a> <a href="#Skip_distance"><span style="color:green">every 1 
hour</span></a> <a href="#Total_Temporal_Duration"><span 
style="color:blue">starting from 2 hours ago until 30 minutes 
ago</span></a></li>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minutes 
window</span></a> <a href="#Skip_distance"><span style="color:green">every 1 
hour</span></a> <a href="#Total_Temporal_Duration"><span 
style="color:blue">from 2 hours ago to 30 minutes ago</span></a></li>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minutes 
window</span></a> <a href="#Skip_distance"><span style="color:green">for every 
1 hour</span></a> <a href="#Total_Temporal_Duration"><span 
style="color:blue">from 30 minutes ago to 2 hours ago</span></a></li>
+  </ul></li>
+</ul></div>
+<div class="section">
+<h6><a name="Skip_distance"></a><span style="color:green">Skip 
distance</span></h6>
+<p>Skip distance is the amount of time between the beginning of one temporal 
window and the beginning of the next. It is, in effect, the window period. </p>
+<p>It is specified by the phrase <tt>EVERY time_interval</tt></p>
+
+<ul>
+  
+<li><tt>time_interval</tt> - A time amount followed by a unit (e.g. 1 hour). 
Fractional amounts are not supported. The unit may be &#x201c;minute&#x201d;, 
&#x201c;day&#x201d;, &#x201c;hour&#x201d; with any pluralization.</li>
+  
+<li><tt>EVERY</tt> - The word/phrase &#x201c;every&#x201d; or &#x201c;for 
every&#x201d;</li>
+</ul>
+<p><b>Examples</b></p>
+
+<ul>
+  
+<li>A repeating 30 minute window starting 2 hours ago and repeating every hour 
until now. This would result in 2 30-minute wide windows: 2 hours ago and 1 
hour ago
+  
+<ul>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minute 
window</span></a> <a href="#Skip_distance"><span style="color:green">every 1 
hour</span></a> <a href="#Total_Temporal_Duration"><span 
style="color:blue">starting from 2 hours ago </span></a></li>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minutes 
window</span></a> <a href="#Skip_distance"><span style="color:green">every 1 
hour</span></a> <a href="#Total_Temporal_Duration"><span 
style="color:blue">from 2 hours ago </span></a></li>
+  </ul></li>
+  
+<li>A repeating 30 minute window starting 2 hours ago and repeating every hour 
until 30 minutes ago. This would result in 2 30-minute wide windows: 2 hours 
ago and 1 hour ago
+  
+<ul>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minute 
window</span></a> <a href="#Skip_distance"><span style="color:green">every 1 
hour</span></a> <a href="#Total_Temporal_Duration"><span 
style="color:blue">starting from 2 hours ago until 30 minutes 
ago</span></a></li>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minutes 
window</span></a> <a href="#Skip_distance"><span style="color:green">every 1 
hour</span></a> <a href="#Total_Temporal_Duration"><span 
style="color:blue">from 2 hours ago to 30 minutes ago</span></a></li>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minutes 
window</span></a> <a href="#Skip_distance"><span style="color:green">for every 
1 hour</span></a> <a href="#Total_Temporal_Duration"><span 
style="color:blue">from 30 minutes ago to 2 hours ago</span></a></li>
+  </ul></li>
+</ul></div>
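+<p>The repeating window examples above can be reproduced with a short Python 
sketch. It is only a model under stated assumptions, not the Metron 
implementation: the function <tt>repeating_windows</tt> is hypothetical, and it 
assumes a window is kept only when it fits entirely inside the total temporal 
duration, which is consistent with the two windows listed in the examples.</p>

```python
from datetime import datetime, timedelta, timezone


def repeating_windows(now, width, skip, from_ago, to_ago=timedelta(0)):
    """Hypothetical model of 'width WINDOW EVERY skip FROM x AGO TO y AGO'.

    Assumption: a window is emitted only if it ends at or before the end of
    the total temporal duration.
    """
    start, end = sorted((now - from_ago, now - to_ago))
    windows = []
    begin = start
    while begin + width <= end:
        windows.append((begin, begin + width))
        begin += skip  # the skip distance separates window beginnings
    return windows


now = datetime(2017, 9, 8, 12, 0, tzinfo=timezone.utc)

# "30 minute window every 1 hour starting from 2 hours ago"
until_now = repeating_windows(
    now, timedelta(minutes=30), timedelta(hours=1), timedelta(hours=2))

# "30 minute window every 1 hour from 2 hours ago until 30 minutes ago"
until_half_hour_ago = repeating_windows(
    now, timedelta(minutes=30), timedelta(hours=1),
    timedelta(hours=2), timedelta(minutes=30))

for begin, finish in until_now:
    print(begin.isoformat(), "to", finish.isoformat())
```

+<p>Under these assumptions both phrases yield two windows, beginning 2 hours 
ago and 1 hour ago.</p>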
+<div class="section">
+<h6><a name="InclusionExclusion_specifiers"></a><span 
style="color:purple">Inclusion/Exclusion specifiers</span></h6>
+<p>Inclusion and Exclusion specifiers operate as filters on the set of 
windows. They operate on the window beginning timestamp.</p>
+<p>For inclusion specifiers, windows that are passed by <i>any</i> of the set 
of inclusion specifiers are included. Similarly, 
windows that are passed by <i>any</i> of the set of exclusion specifiers are 
excluded. Exclusion specifiers trump inclusion specifiers.</p>
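+<p>The filtering rule can be sketched in Python as follows. This is an 
illustration, not the Metron implementation: <tt>keep_window</tt>, 
<tt>is_saturday</tt> and <tt>is_weekend</tt> are hypothetical names, and the 
sketch assumes that an empty inclusion list lets every window through.</p>

```python
from datetime import datetime, timedelta, timezone


def is_saturday(ts):
    return ts.weekday() == 5


def is_weekend(ts):
    return ts.weekday() >= 5


def keep_window(window_start, includes=(), excludes=()):
    """Hypothetical filter applied to a window's beginning timestamp.

    Assumption: with no inclusion specifiers, every window is a candidate.
    A window passing any exclusion specifier is dropped, even if it also
    passes an inclusion specifier.
    """
    included = not includes or any(spec(window_start) for spec in includes)
    excluded = any(spec(window_start) for spec in excludes)
    return included and not excluded


# Seven daily window beginnings at noon, Friday 2017-09-01 through Thursday 2017-09-07.
first = datetime(2017, 9, 1, 12, 0, tzinfo=timezone.utc)
starts = [first + timedelta(days=n) for n in range(7)]

weekdays_only = [s for s in starts if keep_window(s, excludes=[is_weekend])]
contradictory = [s for s in starts
                 if keep_window(s, includes=[is_saturday], excludes=[is_weekend])]
print(len(weekdays_only), len(contradictory))
```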
+<p>Specifiers follow one of the following formats, depending on whether it is an 
inclusion or exclusion specifier:</p>
+
+<ul>
+  
+<li><tt>INCLUSION specifier, specifier, ...</tt>
+  
+<ul>
+    
+<li><tt>INCLUSION</tt> can be &#x201c;include&#x201d;, 
&#x201c;includes&#x201d; or &#x201c;including&#x201d;</li>
+  </ul></li>
+  
+<li><tt>EXCLUSION specifier, specifier, ...</tt>
+  
+<ul>
+    
+<li><tt>EXCLUSION</tt> can be &#x201c;exclude&#x201d;, 
&#x201c;excludes&#x201d; or &#x201c;excluding&#x201d;</li>
+  </ul></li>
+</ul>
+<p>The specifiers are a set of fixed specifiers available as part of the 
language:</p>
+
+<ul>
+  
+<li>Fixed day of week-based specifiers - includes or excludes if the window is 
on the specified day of the week
+  
+<ul>
+    
+<li>&#x201c;monday&#x201d; or &#x201c;mondays&#x201d;</li>
+    
+<li>&#x201c;tuesday&#x201d; or &#x201c;tuesdays&#x201d;</li>
+    
+<li>&#x201c;wednesday&#x201d; or &#x201c;wednesdays&#x201d;</li>
+    
+<li>&#x201c;thursday&#x201d; or &#x201c;thursdays&#x201d;</li>
+    
+<li>&#x201c;friday&#x201d; or &#x201c;fridays&#x201d;</li>
+    
+<li>&#x201c;saturday&#x201d; or &#x201c;saturdays&#x201d;</li>
+    
+<li>&#x201c;sunday&#x201d; or &#x201c;sundays&#x201d;</li>
+    
+<li>&#x201c;weekday&#x201d; or &#x201c;weekdays&#x201d;</li>
+    
+<li>&#x201c;weekend&#x201d; or &#x201c;weekends&#x201d;</li>
+  </ul></li>
+  
+<li>Relative day of week-based specifiers - includes or excludes based on the 
day of week relative to now
+  
+<ul>
+    
+<li>&#x201c;current day of the week&#x201d;</li>
+    
+<li>&#x201c;current day of week&#x201d;</li>
+    
+<li>&#x201c;this day of the week&#x201d;</li>
+    
+<li>&#x201c;this day of week&#x201d;</li>
+  </ul></li>
+  
+<li>Specified date - includes or excludes based on the specified date
+  
+<ul>
+    
+<li>&#x201c;date&#x201d; - Takes up to 2 arguments
+    
+<ul>
+      
+<li>The day in <tt>yyyy/MM/dd</tt> format if no second argument is 
provided</li>
+      
+<li>Optionally the format to specify the first argument in</li>
+      
+<li>Example: <tt>date:2017/12/25</tt> would include or exclude December 25, 
2017</li>
+      
+<li>Example: <tt>date:20171225:yyyyMMdd</tt> would include or exclude December 
25, 2017</li>
+    </ul></li>
+  </ul></li>
+  
+<li>Holidays - includes or excludes based on if the window starts during a 
holiday
+  
+<ul>
+    
+<li>&#x201c;holiday&#x201d; or &#x201c;holidays&#x201d;
+    
+<ul>
+      
+<li>Arguments form the jollyday hierarchy of holidays. e.g. 
&#x201c;us:nyc&#x201d; would be holidays for New York City, USA</li>
+      
+<li>If none is specified, it will choose based on locale.</li>
+      
+<li>Countries supported are those supported in <a class="externalLink" 
href="https://github.com/svendiedrichsen/jollyday/tree/master/src/main/resources/holidays">jollyday</a></li>
+      
+<li>Example: <tt>holiday:us:nyc</tt> would be the holidays of New York City, 
USA</li>
+      
+<li>Example: <tt>holiday:hu</tt> would be the holidays of Hungary</li>
+    </ul></li>
+  </ul></li>
+</ul>
+<p><b>WARNING: Daylight Saving Time effects</b></p>
+<p>While Universal Time (UTC) is nice and constant, many servers are set to 
local timezones that enable Daylight Saving Time (DST). This means that twice 
a year, on DST transition weekends, &#x201c;Sunday&#x201d; is either 23 or 25 
hours long. However, durations specified as &#x201c;7 days ago&#x201d; are 
always interpreted as &#x201c;7*24 hours ago&#x201d;. This can lead to some 
surprising effects when using days of the week as inclusion or exclusion 
specifiers.</p>
+<p>For example, the profile window specified by the phrase &#x201c;30 minute 
window every 24 hours from 7 days ago&#x201d; will always have 7 thirty-minute 
intervals, and these will normally occur on 5 weekdays and 2 weekend days. 
However, if you invoke this window at 12:15am any day during the week following 
the start of DST, you will get these intervals (supposing you start early on a 
Wednesday morning):</p>
+
+<div class="source">
+<div class="source">
+<pre>Tuesday 12:15am-12:45am (yesterday)
+Monday 12:15am-12:45am
+Saturday 11:15pm-11:45pm (skipped Sunday!)
+Friday 11:15pm-11:45pm
+Thursday 11:15pm-11:45pm
+Wednesday 11:15pm-11:45pm
+Tuesday 11:15pm-11:45pm
+</pre></div></div>
+<p>Sunday got skipped over because it was only 23 hours long; that is, there 
were 24 hours between Saturday 11:15pm and Monday 12:15am. So if you specified 
&#x201c;excluding weekends&#x201d;, you would get 6 days&#x2019; intervals 
instead of the expected 5. There are multiple variations on this theme.</p>
+<p>Remember that the underlying time is kept in UTC, so the data is always 
correct. It is only when attempting to interpret UTC as local time, date, and 
day, that these confusions may occur. They may be eliminated by setting your 
server timezone to UTC, or otherwise disabling DST.</p>
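+<p>The skipped Sunday can be reproduced with the Python standard library. The 
sketch below is only an illustration under assumptions: it picks the 
<tt>America/New_York</tt> timezone and the 2017 start of DST (Sunday, March 12), 
neither of which is named in the text above, and it steps back in fixed 24 hour 
intervals in UTC before converting to local time.</p>

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# Assumed server timezone and date; DST began on Sunday 2017-03-12 in this zone.
tz = ZoneInfo("America/New_York")
now = datetime(2017, 3, 15, 0, 15, tzinfo=tz)  # Wednesday, 12:15am local time
now_utc = now.astimezone(timezone.utc)

# "30 minute window every 24 hours from 7 days ago": window beginnings are
# exact multiples of 24 hours before now, computed in UTC.
starts = [(now_utc - timedelta(hours=24 * n)).astimezone(tz) for n in range(1, 8)]
labels = [s.strftime("%A %H:%M") for s in starts]
for label in labels:
    print(label)

# Windows that survive "excluding weekends" when judged by local day of week.
weekday_starts = [s for s in starts if s.weekday() < 5]
print(len(weekday_starts))
```

+<p>With these assumed values no window begins on a Sunday, so excluding 
weekends removes only the Saturday window and leaves six.</p>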
+<p><b>Examples</b></p>
+<p>Assume these are executed at noon.</p>
+
+<ul>
+  
+<li>A 1 hour window for the past 8 &#x2018;current day of the week&#x2019;
+  
+<ul>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">1 hour 
window</span></a> <a href="#Skip_distance"><span style="color:green">every 24 
hours</span></a> <a href="#Total_Temporal_Duration"><span 
style="color:blue">from 56 days ago</span></a> <a 
href="#InclusionExclusion_specifiers"><span style="color:purple">including this 
day of the week</span></a></li>
+  </ul></li>
+  
+<li>A 1 hour window for the past 8 tuesdays
+  
+<ul>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">1 hour 
window</span></a> <a href="#Skip_distance"><span style="color:green">every 24 
hours</span></a> <a href="#Total_Temporal_Duration"><span 
style="color:blue">from 56 days ago</span></a> <a 
href="#InclusionExclusion_specifiers"><span style="color:purple">including 
tuesdays</span></a></li>
+  </ul></li>
+  
+<li>A 30 minute window every tuesday at noon starting 14 days ago until now
+  
+<ul>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minute 
window</span></a> <a href="#Skip_distance"><span style="color:green">every 24 
hours</span></a> <a href="#Total_Temporal_Duration"><span 
style="color:blue">from 14 days ago</span></a> <a 
href="#InclusionExclusion_specifiers"><span style="color:purple">including 
tuesdays</span></a></li>
+  </ul></li>
+  
+<li>A 30 minute window every day except holidays and weekends at noon starting 
14 days ago until now
+  
+<ul>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minute 
window</span></a> <a href="#Skip_distance"><span style="color:green">every 24 
hours</span></a> <a href="#Total_Temporal_Duration"><span 
style="color:blue">from 14 days ago</span></a> <a 
href="#InclusionExclusion_specifiers"><span style="color:purple">excluding 
holidays:us, weekends</span></a></li>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minute 
window</span></a> <a href="#Skip_distance"><span style="color:green">every 24 
hours</span></a> <a href="#Total_Temporal_Duration"><span 
style="color:blue">from 14 days ago</span></a> <a 
href="#InclusionExclusion_specifiers"><span style="color:purple">including 
weekdays excluding holidays:us, weekends</span></a></li>
+  </ul></li>
+  
+<li>A 30 minute window at noon every day from 7 days ago including saturdays 
and excluding weekends. Because exclusions trump inclusions, the following will 
never yield any windows
+  
+<ul>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minute 
window</span></a> <a href="#Skip_distance"><span style="color:green">every 24 
hours</span></a> <a href="#Total_Temporal_Duration"><span 
style="color:blue">from 7 days ago</span></a> <a 
href="#InclusionExclusion_specifiers"><span style="color:purple">including 
saturdays excluding weekends</span></a></li>
+  </ul></li>
+</ul></div></div></div></div>
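+<p>Each selector above is passed to the client as a single string. As a minimal sketch, assuming the <tt>PROFILE_WINDOW</tt> function is available in your deployment, and using the illustrative profile name &#x2018;snort-alerts&#x2019; and entity &#x2018;10.0.0.1&#x2019;, the Tuesday selector might be used as follows.</p>
+
+<div class="source">
+<div class="source">
+<pre>PROFILE_GET('snort-alerts', '10.0.0.1', PROFILE_WINDOW('30 minute window every 24 hours from 14 days ago including tuesdays'))
+</pre></div></div>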
+<div class="section">
+<h3><a name="Errors"></a>Errors</h3>
+<p>The most common result of incorrect <tt>PROFILE_GET</tt> arguments or 
Client configuration parameters is an empty result set, rather than an error. 
The Client cannot effectively validate the arguments, because the Profiler 
configuration parameters may be changed and the profile itself does not store 
them. The person doing the querying must carry forward the knowledge of the 
Profiler configuration parameters from the time of profile creation, and use 
corresponding <tt>PROFILE_GET</tt> arguments and Client configuration 
parameters when querying the data.</p></div>
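+<p>As an illustration, assume a hypothetical &#x2018;snort-alerts&#x2019; profile that the Profiler wrote using 15 minute periods. A query such as the following, which overrides the Client period duration to 2 minutes, would be expected to return an empty list rather than report the mismatch. The profile name, entity and durations are assumptions chosen for the sketch.</p>
+
+<div class="source">
+<div class="source">
+<pre>PROFILE_GET('snort-alerts', '10.0.0.1', PROFILE_FIXED(4, 'HOURS', {'profiler.client.period.duration' : '2', 'profiler.client.period.duration.units' : 'MINUTES'}))
+</pre></div></div>
+<p>If a query like this unexpectedly returns nothing, comparing the period duration used here with the one the Profiler was configured with at profile creation is a reasonable first check.</p>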
+<div class="section">
+<h3><a name="Examples"></a>Examples</h3>
+<p>The following are usage examples that show how the Stellar API can be used 
to read profiles generated by the <a 
href="../metron-profiler/index.html">Metron Profiler</a>. This API would be 
used in conjunction with other Stellar functions like <a 
href="../../metron-platform/metron-common/index.html#MAAS_MODEL_APPLY"><tt>MAAS_MODEL_APPLY</tt></a>
 to perform model scoring on streaming data.</p>
+<p>These examples assume a profile has been defined called 
&#x2018;snort-alerts&#x2019; that tracks the number of Snort alerts associated 
with an IP address over time. The profile definition might look similar to the 
following.</p>
+
+<div class="source">
+<div class="source">
+<pre>{
+  &quot;profiles&quot;: [
+    {
+      &quot;profile&quot;: &quot;snort-alerts&quot;,
+      &quot;foreach&quot;: &quot;ip_src_addr&quot;,
+      &quot;onlyif&quot;:  &quot;source.type == 'snort'&quot;,
+      &quot;update&quot;:  { &quot;s&quot;: &quot;STATS_ADD(s, 1)&quot; },
+      &quot;result&quot;:  &quot;STATS_MEAN(s)&quot;
+    }
+  ]
+}
+</pre></div></div>
+<p>During model scoring the entity being scored, in this case a particular IP 
address, will be known. The following examples show how this profile data 
might be retrieved. Retrieve all values of &#x2018;snort-alerts&#x2019; from 
&#x2018;10.0.0.1&#x2019; over the past 4 hours.</p>
+
+<div class="source">
+<div class="source">
+<pre>PROFILE_GET('snort-alerts', '10.0.0.1', PROFILE_FIXED(4, 'HOURS'))
+</pre></div></div>
+<p>Retrieve all values of &#x2018;snort-alerts&#x2019; from 
&#x2018;10.0.0.1&#x2019; over the past 2 days.</p>
+
+<div class="source">
+<div class="source">
+<pre>PROFILE_GET('snort-alerts', '10.0.0.1', PROFILE_FIXED(2, 'DAYS'))
+</pre></div></div>
+<p>If the profile had been defined to group the data by weekday versus 
weekend, then the following example would apply:</p>
+<p>Retrieve all values of &#x2018;snort-alerts&#x2019; from 
&#x2018;10.0.0.1&#x2019; that occurred on &#x2018;weekdays&#x2019; over the 
past 30 days.</p>
+
+<div class="source">
+<div class="source">
+<pre>PROFILE_GET('snort-alerts', '10.0.0.1', PROFILE_FIXED(30, 'DAYS'), 
['weekdays'] )
+</pre></div></div>
+<p>The client may need to use a configuration different from the current 
Client configuration settings. For example, perhaps you are on a cluster shared 
with other analysts, and need to access a profile that was constructed 2 months 
ago using a different period duration, while they are accessing more recent 
profiles constructed with the currently configured period duration. For this 
situation, you may use the <tt>config_overrides</tt> argument:</p>
+<p>Retrieve all values of &#x2018;snort-alerts&#x2019; from 
&#x2018;10.0.0.1&#x2019; over the past 2 days, with no <tt>groupBy</tt>, and 
overriding the usual global client configuration parameters for period 
duration.</p>
+
+<div class="source">
+<div class="source">
+<pre>PROFILE_GET('snort-alerts', '10.0.0.1', PROFILE_FIXED(2, 'DAYS', 
{'profiler.client.period.duration' : '2', 
'profiler.client.period.duration.units' : 'MINUTES'}), [])
+</pre></div></div>
+<p>Retrieve all values of &#x2018;snort-alerts&#x2019; from 
&#x2018;10.0.0.1&#x2019; that occurred on &#x2018;weekdays&#x2019; over the 
past 30 days, overriding the usual global client configuration parameters for 
period duration.</p>
+
+<div class="source">
+<div class="source">
+<pre>PROFILE_GET('snort-alerts', '10.0.0.1', PROFILE_FIXED(30, 'DAYS', 
{'profiler.client.period.duration' : '2', 
'profiler.client.period.duration.units' : 'MINUTES'}), ['weekdays'] )
+</pre></div></div></div></div>
+<div class="section">
+<h2><a name="Getting_Started"></a>Getting Started</h2>
+<p>These instructions step through the process of using the Stellar Client API 
on a live cluster. These instructions assume that the &#x2018;Getting 
Started&#x2019; instructions included with the <a 
href="../metron-profiler/index.html">Metron Profiler</a> have been followed. 
This will create a Profile called &#x2018;test&#x2019; whose data will be 
retrieved with the Stellar Client API.</p>
+<p>To validate that everything is working, log in to the server hosting Metron. 
We will use the Stellar Shell to replicate the execution environment of Stellar 
running in a Storm topology, like Metron&#x2019;s Parser or Enrichment 
topology. Replace &#x2018;node1:2181&#x2019; with the host and port of a 
Zookeeper server.</p>
+
+<div class="source">
+<div class="source">
+<pre>[root@node1 0.4.1]# bin/stellar -z node1:2181
+Stellar, Go!
+Please note that functions are loading lazily in the background and will be 
unavailable until loaded fully.
+{es.clustername=metron, es.ip=node1, es.port=9300, 
es.date.format=yyyy.MM.dd.HH}
+
+[Stellar]&gt;&gt;&gt; ?PROFILE_GET
+Functions loaded, you may refer to functions now...
+PROFILE_GET
+Description: Retrieves a series of values from a stored profile.
+
+Arguments:
+       profile - The name of the profile.
+       entity - The name of the entity.
+       durationAgo - How long ago should values be retrieved from?
+       units - The units of 'durationAgo'.
+       groups_list - Optional, must correspond to the 'groupBy' list used in 
profile creation - List (in square brackets) of 
+            groupBy values used to filter the profile. Default is the empty 
list, meaning groupBy was not used when 
+            creating the profile.
+       config_overrides - Optional - Map (in curly braces) of name:value 
pairs, each overriding the global config parameter
+            of the same name. Default is the empty Map, meaning no overrides.
+
+Returns: The selected profile measurements.
+
+[Stellar]&gt;&gt;&gt; PROFILE_GET('test','192.168.138.158', 1, 'HOURS')
+[12078.0, 8921.0, 12131.0]
+</pre></div></div>
+<p>The client API call above has retrieved the past hour of the 
&#x2018;test&#x2019; profile for the entity 
&#x2018;192.168.138.158&#x2019;.</p></div>
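+<p>The same lookup can likely also be written in the form used in the Examples section, where the time range is built by <tt>PROFILE_FIXED</tt>. This is a sketch that assumes the &#x2018;test&#x2019; profile and entity from the call above, and that <tt>PROFILE_FIXED</tt> is available in the shell. No output is shown because the values depend on the live data.</p>
+
+<div class="source">
+<div class="source">
+<pre>[Stellar]&gt;&gt;&gt; PROFILE_GET('test', '192.168.138.158', PROFILE_FIXED(1, 'HOURS'))
+</pre></div></div>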
+<div class="section">
+<h2><a name="Developing_Profiles"></a>Developing Profiles</h2>
+<p>Troubleshooting issues when programming against a live stream of data can 
be difficult. The Stellar REPL is a powerful tool to help work out the kinds of 
enrichments and transformations that are needed. The Stellar REPL can also be 
used to help when developing profiles for the Profiler.</p>
+<p>Follow these steps in the Stellar REPL to see how it can be used to help 
create profiles.</p>
+
+<ol style="list-style-type: decimal">
+  
+<li>
+<p>Take a first pass at defining your profile. As an example, in the editor 
copy/paste the basic &#x201c;Hello, World&#x201d; profile below.</p>
+  
+<div class="source">
+<div class="source">
+<pre>[Stellar]&gt;&gt;&gt; conf := SHELL_EDIT()
+[Stellar]&gt;&gt;&gt; conf
+{
+  &quot;profiles&quot;: [
+    {
+      &quot;profile&quot;: &quot;hello-world&quot;,
+      &quot;onlyif&quot;:  &quot;exists(ip_src_addr)&quot;,
+      &quot;foreach&quot;: &quot;ip_src_addr&quot;,
+      &quot;init&quot;:    { &quot;count&quot;: &quot;0&quot; },
+      &quot;update&quot;:  { &quot;count&quot;: &quot;count + 1&quot; },
+      &quot;result&quot;:  &quot;count&quot;
+    }
+  ]
+}
+</pre></div></div></li>
+  
+<li>
+<p>Initialize the Profiler.</p>
+  
+<div class="source">
+<div class="source">
+<pre>[Stellar]&gt;&gt;&gt; profiler := PROFILER_INIT(conf)
+[Stellar]&gt;&gt;&gt; profiler
+Profiler{1 profile(s), 0 messages(s), 0 route(s)}
+</pre></div></div>
+<p>The profiler itself will show the number of profiles defined, the number of 
messages applied, and the number of routes taken. </p>
+<p>A route is defined when a message is applied to a specific profile. If a 
message is applied and not needed by any profile, then there are no routes. If 
a message is needed by one profile, then one route has been defined. If a 
message is needed by two profiles, then two routes have been defined. </p></li>
+  
+<li>
+<p>Create a message to simulate the type of telemetry that you expect to be 
profiled. As an example, in the editor copy/paste the JSON below.</p>
+  
+<div class="source">
+<div class="source">
+<pre>[Stellar]&gt;&gt;&gt; message := SHELL_EDIT()
+[Stellar]&gt;&gt;&gt; message
+{
+  &quot;ip_src_addr&quot;: &quot;10.0.0.1&quot;,
+  &quot;protocol&quot;: &quot;HTTPS&quot;,
+  &quot;length&quot;: &quot;10&quot;,
+  &quot;bytes_in&quot;: &quot;234&quot;
+}
+</pre></div></div></li>
+  
+<li>
+<p>Apply some telemetry messages to your profiles. The following applies the 
same message 3 times.</p>
+  
+<div class="source">
+<div class="source">
+<pre>[Stellar]&gt;&gt;&gt; PROFILER_APPLY(message, profiler)
+Profiler{1 profile(s), 1 messages(s), 1 route(s)}
+</pre></div></div>
+  
+<div class="source">
+<div class="source">
+<pre>[Stellar]&gt;&gt;&gt; PROFILER_APPLY(message, profiler)
+Profiler{1 profile(s), 2 messages(s), 2 route(s)}
+</pre></div></div>
+  
+<div class="source">
+<div class="source">
+<pre>[Stellar]&gt;&gt;&gt; PROFILER_APPLY(message, profiler)
+Profiler{1 profile(s), 3 messages(s), 3 route(s)}
+</pre></div></div>
+<p>It is also possible to apply multiple messages at once. This is useful when 
testing against a larger set of data. To do this, create a string that contains 
a JSON array of messages and pass that to the <tt>PROFILER_APPLY</tt> 
function.</p></li>
+  
+<li>
+<p>Flush the Profiler to see what has been calculated. A flush is what occurs 
at the end of each 15 minute period in the Profiler. The result is a list of 
profile measurements. Each measurement is a map containing detailed information 
about the profile data that has been generated.</p>
+  
+<div class="source">
+<div class="source">
+<pre>[Stellar]&gt;&gt;&gt; values := PROFILER_FLUSH(profiler)
+[Stellar]&gt;&gt;&gt; values
+[{period={duration=900000, period=1669628, start=1502665200000, 
end=1502666100000}, 
+   profile=hello-world, groups=[], value=3, entity=10.0.0.1}]
+</pre></div></div>
+<p>This profile simply counts the number of messages by IP source address. 
Notice that the value is &#x2018;3&#x2019; for the entity 
&#x2018;10.0.0.1&#x2019; as we applied 3 messages with an 
&#x2018;ip_src_addr&#x2019; of &#x2018;10.0.0.1&#x2019;. There will always be 
one measurement for each [profile, entity] pair.</p></li>
+  
+<li>
+<p>If you are unhappy with the data that has been generated, then 
&#x2018;wash, rinse and repeat&#x2019; this process. Once you are happy with 
the profile that was created, follow the <a 
href="../metron-profiler/index.html#Getting_Started">Getting Started</a> guide 
to use the profile against your live, streaming data in a Metron 
cluster.</p></li>
+</ol></div>
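+<p>Step 4 mentions that several messages can be applied at once. The following is a minimal sketch of that approach, assuming the <tt>profiler</tt> variable created in step 2. Both messages are made up for illustration, and the second source address is hypothetical.</p>
+
+<div class="source">
+<div class="source">
+<pre>[Stellar]&gt;&gt;&gt; messages := SHELL_EDIT()
+[Stellar]&gt;&gt;&gt; messages
+[
+  {
+    &quot;ip_src_addr&quot;: &quot;10.0.0.1&quot;,
+    &quot;protocol&quot;: &quot;HTTPS&quot;,
+    &quot;length&quot;: &quot;10&quot;,
+    &quot;bytes_in&quot;: &quot;234&quot;
+  },
+  {
+    &quot;ip_src_addr&quot;: &quot;10.0.0.2&quot;,
+    &quot;protocol&quot;: &quot;HTTP&quot;,
+    &quot;length&quot;: &quot;20&quot;,
+    &quot;bytes_in&quot;: &quot;390&quot;
+  }
+]
+[Stellar]&gt;&gt;&gt; PROFILER_APPLY(messages, profiler)
+</pre></div></div>
+<p>Because there is one measurement for each [profile, entity] pair, a flush after this call would be expected to contain a separate measurement for each source address.</p>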
+                  </div>
+            </div>
+          </div>
+
+    <hr/>
+
+    <footer>
+            <div class="container-fluid">
+              <div class="row span12">Copyright &copy;                    2017
+                        <a href="https://www.apache.org";>The Apache Software 
Foundation</a>.
+            All Rights Reserved.      
+                    
+      </div>
+
+                          
+        
+                </div>
+    </footer>
+  </body>
+</html>

