Repository: metron
Updated Branches:
  refs/heads/master 0b5a2e7a8 -> ac0e05f01


METRON-1194 Add Profiler Debug Functions to Profiler README (nickwallen via 
ottobackwards) closes apache/metron#765


Project: http://git-wip-us.apache.org/repos/asf/metron/repo
Commit: http://git-wip-us.apache.org/repos/asf/metron/commit/ac0e05f0
Tree: http://git-wip-us.apache.org/repos/asf/metron/tree/ac0e05f0
Diff: http://git-wip-us.apache.org/repos/asf/metron/diff/ac0e05f0

Branch: refs/heads/master
Commit: ac0e05f0168d8431b9e0a45c0be93f8179ee9e45
Parents: 0b5a2e7
Author: nickwallen <n...@nickallen.org>
Authored: Sat Sep 30 10:37:00 2017 -0400
Committer: otto <o...@apache.org>
Committed: Sat Sep 30 10:37:00 2017 -0400

----------------------------------------------------------------------
 metron-analytics/metron-profiler/README.md | 238 ++++++++++++++++++++----
 1 file changed, 199 insertions(+), 39 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/metron/blob/ac0e05f0/metron-analytics/metron-profiler/README.md
----------------------------------------------------------------------
diff --git a/metron-analytics/metron-profiler/README.md 
b/metron-analytics/metron-profiler/README.md
index 9b908f6..e95fd93 100644
--- a/metron-analytics/metron-profiler/README.md
+++ b/metron-analytics/metron-profiler/README.md
@@ -7,15 +7,27 @@ This is achieved by summarizing the streaming telemetry data 
consumed by Metron
 Any field contained within a message can be used to generate a profile.  A 
profile can even be produced by combining fields that originate in different 
data sources.  A user has considerable power to transform the data used in a 
profile by leveraging the Stellar language. A user only need configure the 
desired profiles and ensure that the Profiler topology is running.
 
 * [Installation](#installation)
-* [Getting Started](#getting-started)
 * [Creating Profiles](#creating-profiles)
+* [Deploying Profiles](#deploying-profiles)
+* [Anatomy of a Profile](#anatomy-of-a-profile)
 * [Configuring the Profiler](#configuring-the-profiler)
 * [Examples](#examples)
 * [Implementation](#implementation)
 
 ## Installation
 
-Follow these instructions to install the Profiler.  This assumes that core 
Metron has already been installed and validated.  
+The Profiler can be installed with either of these two methods.
+
+ * [Ambari Installation](#ambari-installation)
+ * [Manual Installation](#manual-installation)
+
+### Ambari Installation
+
+The Metron Profiler is installed automatically when installing Metron using 
the Ambari MPack.  You can skip the [Installation](#installation) section and 
move ahead to [Creating Profiles](#creating-profiles) should this be the case.
+
+### Manual Installation
+
+This section will describe the steps necessary to manually install the 
Profiler on an RPM-based Linux distribution.  This assumes that core Metron has 
already been installed and validated.  If you installed Metron using the 
[Ambari MPack](#ambari-installation), then the Profiler has already been installed and 
you can skip this section.
 
 1. Build the Metron RPMs (see Building the 
[RPMs](../../metron-deployment#rpms)).  
 
@@ -58,6 +70,14 @@ Follow these instructions to install the Profiler.  This 
assumes that core Metro
     /usr/metron/0.4.1/lib/metron-profiler-0.4.0-uber.jar
     ```
 
+1. Edit the configuration file located at 
`$METRON_HOME/config/profiler.properties`.  
+    ```
+    kafka.zk=node1:2181
+    kafka.broker=node1:6667
+    ```
+    * Change `kafka.zk` to refer to Zookeeper in your environment.  
+    * Change `kafka.broker` to refer to a Kafka Broker in your environment.
+
 1. Create a table within HBase that will store the profile data. By default, 
the table is named `profiler` with a column family `P`.  The table name and 
column family must match the Profiler's configuration (see [Configuring the 
Profiler](#configuring-the-profiler)).  
 
     ```
@@ -65,14 +85,6 @@ Follow these instructions to install the Profiler.  This 
assumes that core Metro
     hbase(main):001:0> create 'profiler', 'P'
     ```
 
-1. Edit the configuration file located at 
`$METRON_HOME/config/profiler.properties`.  
-    ```
-    kafka.zk=node1:2181
-    kafka.broker=node1:6667
-    ```
-    Change `kafka.zk` to refer to Zookeeper in your environment.  
-    Change `kafka.broker` to refer to a Kafka Broker in your environment.
-
 1. Start the Profiler topology.
     ```
     $ cd $METRON_HOME
@@ -81,9 +93,157 @@ Follow these instructions to install the Profiler.  This 
assumes that core Metro
 
 At this point the Profiler is running and consuming telemetry messages.  We 
have not defined any profiles yet, so it is not doing anything very useful.  
The next section walks you through the steps to create your very first "Hello, 
World!" profile.
 
-## Getting Started
+## Creating Profiles
 
-This section will describe the steps required to get your first "Hello, 
World!"" profile running.  This assumes that you have a successful Profiler 
[Installation](#installation) and have it running.
+This section will describe how to create your very first "Hello, World" 
profile.  It will also outline a useful workflow for creating, testing, and 
deploying profiles.
+
+Creating and refining profiles is an iterative process.  Iterating against a 
live stream of data is slow, difficult and error prone.  The Profile Debugger 
was created to provide a controlled and isolated execution environment to 
create, refine and troubleshoot profiles.
+
+1. Launch the Stellar Shell.  We will leverage the Profiler Debugger from 
within the Stellar Shell.  
+       ```
+       [root@node1 ~]# $METRON_HOME/bin/stellar
+       Stellar, Go!
+       [Stellar]>>> %functions PROFILER
+       PROFILER_APPLY, PROFILER_FLUSH, PROFILER_INIT
+       ```  
+       
+1. Create a simple `hello-world` profile that will count the number of 
messages for each `ip_src_addr`.  The `SHELL_EDIT` function will open an editor 
in which you can copy/paste the following Profiler configuration.
+       ```
+       [Stellar]>>> conf := SHELL_EDIT()
+       [Stellar]>>> conf
+       {
+         "profiles": [
+           {
+             "profile": "hello-world",
+             "onlyif":  "exists(ip_src_addr)",
+             "foreach": "ip_src_addr",
+             "init":    { "count": "0" },
+             "update":  { "count": "count + 1" },
+             "result":  "count"
+           }
+         ]
+       }
+       ```
+
+1. Create a Profile execution environment; the Profile Debugger. 
+
+       The Profiler will output the number of profiles that have been defined, 
the number of messages that have been applied and the number of routes that 
have been followed.  
+
+       A route is defined when a message is applied to a specific profile.
+       * If a message is not needed by any profile, then there are no routes. 
+       * If a message is needed by one profile, then one route has been 
followed.
+       * If a message is needed by two profiles, then two routes have been 
followed.
+
+       ```
+       [Stellar]>>> p := PROFILER_INIT(conf)
+       [Stellar]>>> p
+       Profiler{1 profile(s), 0 messages(s), 0 route(s)}
+       ```
+
+1. Create a message that mimics the telemetry that your profile will consume. 
+
+       This message can be as simple or complex as you like.  For the 
`hello-world` profile, all you need is a message containing an `ip_src_addr` 
field.
+
+       ```
+       [Stellar]>>> msg := SHELL_EDIT()
+       [Stellar]>>> msg
+       {
+               "ip_src_addr": "10.0.0.1"
+       }
+       ```
+
+1. Apply the message to your Profiler, as many times as you like.
+
+       ```
+       [Stellar]>>> PROFILER_APPLY(msg, p)
+       Profiler{1 profile(s), 1 messages(s), 1 route(s)}
+       ```
+       ```
+       [Stellar]>>> PROFILER_APPLY(msg, p)
+       Profiler{1 profile(s), 2 messages(s), 2 route(s)}
+       ```
+
+1. Flush the Profiler.  
+       
+       A flush is what occurs at the end of each 15 minute period in the 
Profiler.  The result is a list of Profile Measurements. Each measurement is a 
map containing detailed information about the profile data that has been 
generated. The `value` field is what is written to HBase when running this 
profile in the Profiler topology. 
+       
+       There will always be one measurement for each [profile, entity] pair.  
This profile simply counts the number of messages by IP source address. Notice 
that the value is '3' for the entity '10.0.0.1', which is the result after 
applying 3 messages with an 'ip_src_addr' of '10.0.0.1'.
+       
+       ```
+       [Stellar]>>> values := PROFILER_FLUSH(p)
+       [Stellar]>>> values
+       [{period={duration=900000, period=1669628, start=1502665200000, 
end=1502666100000},
+       profile=hello-world, groups=[], value=3, entity=10.0.0.1}]
+       ```
+
+1. Apply real, live telemetry to your profile.
+
+       Once you are happy with your profile against a controlled data set, it 
can be useful to introduce more complex, live data.  This example extracts 10 
messages of live, enriched telemetry to test your profile(s).
+       ```
+       [Stellar]>>> %define bootstrap.servers := "node1:6667" 
+       node1:6667
+       [Stellar]>>> msgs := KAFKA_GET("indexing", 10) 
+       [Stellar]>>> LENGTH(msgs)
+       10
+       ```
+       Apply those 10 messages to your profile(s).
+       ```
+       [Stellar]>>> PROFILER_APPLY(msgs, p)
+       Profiler{1 profile(s), 10 messages(s), 10 route(s)}
+       ```
+
+
+## Deploying Profiles
+
+This section will describe the steps required to get your first "Hello, 
World!" profile running.  This assumes that you have a successful Profiler 
[Installation](#installation) and have it running.  You can deploy profiles in 
two different ways.
+
+* [Deploying Profiles with the Stellar 
Shell](#deploying-profiles-with-the-stellar-shell)
+* [Deploying Profiles from the Command 
Line](#deploying-profiles-from-the-command-line)
+
+### Deploying Profiles with the Stellar Shell
+
+Continuing the previous running example, at this point, you have seen how your 
profile behaves against real, live telemetry in a controlled execution 
environment.  The next step is to deploy your profile to the live, actively 
running Profiler topology.
+
+1.  Start the Stellar Shell with the `-z ZK:2181` command line argument.  This 
is required when deploying a new profile to the active Profiler topology.  
Replace `ZK:2181` with a URL that is appropriate to your environment.
+       ```
+       [root@node1 ~]# $METRON_HOME/bin/stellar -z ZK:2181
+       Stellar, Go!
+       [Stellar]>>>
+       [Stellar]>>> %functions CONFIG CONFIG_GET, CONFIG_PUT
+       ```
+       
+1. If you haven't already, define your profile.
+       ```
+       [Stellar]>>> conf := SHELL_EDIT()
+       [Stellar]>>> conf
+       {
+         "profiles": [
+           {
+             "profile": "hello-world",
+             "onlyif":  "exists(ip_src_addr)",
+             "foreach": "ip_src_addr",
+             "init":    { "count": "0" },
+             "update":  { "count": "count + 1" },
+             "result":  "count"
+           }
+         ]
+       }
+       ```
+
+1. Check what is already deployed.  
+
+       Pushing a new profile configuration is destructive.  It will overwrite 
any existing configuration.  Check what you have out there.  Manually merge the 
existing configuration with your new profile definition.
+       
+       ```
+       [Stellar]>>> existing := CONFIG_GET("PROFILER")
+       ```
+
+1. Deploy your profile.  This will push the configuration to the live, 
actively running Profiler topology.  This will overwrite any existing profile 
definitions.
+       ```
+       [Stellar]>>> CONFIG_PUT("PROFILER", conf)
+       ```
+
+### Deploying Profiles from the Command Line 
 
 1. Create the profile definition in a file located at 
`$METRON_HOME/config/zookeeper/profiler.json`.  This file will likely not 
exist, if you have never created Profiles before.
 
@@ -149,12 +309,9 @@ This section will describe the steps required to get your 
first "Hello, World!""
 
     It is assumed that the `PROFILE_GET` client is correctly configured to 
match the Profile configuration before using it to read that Profile.  More 
information on configuring and using the Profiler client can be found 
[here](../metron-profiler-client).  
 
+## Anatomy of a Profile
 
-## Creating Profiles
-
-The Profiler specification requires a JSON-formatted set of elements, many of 
which can contain Stellar code.  The specification contains the following 
elements.  (For the impatient, skip ahead to the [Examples](#examples).)
-The specification for the Profiler topology is stored in Zookeeper at  
`/metron/topology/profiler`.  These properties also exist in the local 
filesystem at `$METRON_HOME/config/zookeeper/profiler.json`.
-The values can be changed on disk and then uploaded to Zookeeper using 
`$METRON_HOME/bin/zk_load_configs.sh`.
+A profile definition requires a JSON-formatted set of elements, many of which 
can contain Stellar code.  The specification contains the following elements.  
(For the impatient, skip ahead to the [Examples](#examples).)
 
 | Name                          |               | Description
 |---                            |---            |---
@@ -392,29 +549,32 @@ The maximum number of seconds between batch writes to 
HBase.
 
 ## Examples
 
-The following examples are intended to highlight the functionality provided by 
the Profiler. Each shows the configuration that would be required to generate 
the profile.  
-
-These examples assume a fictitious input message stream that looks something 
like the following.
+The following examples are intended to highlight the functionality provided by 
the Profiler. Try out these examples easily in the Stellar Shell as described 
in the [Creating Profiles](#creating-profiles) section.
 
+These examples assume a fictitious input message stream that looks like the 
following.  
 ```
-{
-  "ip_src_addr": "10.0.0.1",
-  "protocol": "HTTPS",
-  "length": "10",
-  "bytes_in": "234"
-},
-{
-  "ip_src_addr": "10.0.0.2",
-  "protocol": "HTTP",
-  "length": "20",
-  "bytes_in": "390"
-},
-{
-  "ip_src_addr": "10.0.0.3",
-  "protocol": "DNS",
-  "length": "30",
-  "bytes_in": "560"
-}
+[Stellar]>>> msgs := SHELL_EDIT()
+[Stellar]>>> msgs
+[
+  {
+    "ip_src_addr": "10.0.0.1",
+    "protocol": "HTTPS",
+    "length": "10",
+    "bytes_in": "234"
+  },
+  {
+    "ip_src_addr": "10.0.0.2",
+    "protocol": "HTTP",
+    "length": "20",
+    "bytes_in": "390"
+  },
+  {
+    "ip_src_addr": "10.0.0.3",
+    "protocol": "DNS",
+    "length": "30",
+    "bytes_in": "560"
+  }
+]
 ```
 
 
@@ -585,4 +745,4 @@ The Profiler is implemented as a Storm topology using the 
following bolts and sp
 
 * `ProfileBuilderBolt` - This bolt maintains all of the state required to 
build a profile.  When the window period expires, the data is summarized as a 
`ProfileMeasurement`, all state is flushed, and the `ProfileMeasurement` is 
emitted.  Each instance of this bolt is responsible for maintaining the state 
for a single Profile-Entity pair.
 
-* `HBaseBolt` - A bolt that is responsible for writing to HBase.  Most 
profiles will be flushed every 15 minutes or so.  If each `ProfileBuilderBolt` 
were responsible for writing to HBase itself, there would be little to no 
opportunity to optimize these writes.  By aggregating the writes from multiple 
Profile-Entity pairs these writes can be batched, for example.
+* `HBaseBolt` - A bolt that is responsible for writing to HBase.  Most 
profiles will be flushed every 15 minutes or so.  If each `ProfileBuilderBolt` 
were responsible for writing to HBase itself, there would be little to no 
opportunity to optimize these writes.  By aggregating the writes from multiple 
Profile-Entity pairs these writes can be batched, for example.
\ No newline at end of file
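
For readers of the patch above, the `hello-world` walkthrough (init, apply, flush, and the route counting rules) can be approximated outside of Metron. The following is a minimal Python sketch, not the Profiler's implementation: `MiniProfiler` and `HELLO_WORLD` are hypothetical names, the Stellar expressions are replaced by plain Python callables, and the choice to clear state on flush while keeping the message and route counters is an assumption.

```python
# Hypothetical stand-in for the "hello-world" profile from the README.
# Each Stellar expression is approximated by a Python callable.
HELLO_WORLD = {
    "profile": "hello-world",
    "onlyif": lambda msg: "ip_src_addr" in msg,      # exists(ip_src_addr)
    "foreach": lambda msg: msg["ip_src_addr"],       # the entity
    "init": lambda: {"count": 0},                    # count := 0
    "update": lambda state, msg: {"count": state["count"] + 1},
    "result": lambda state: state["count"],
}


class MiniProfiler:
    """Illustrative only: mimics the apply/flush flow described in the README."""

    def __init__(self, profiles):
        self.profiles = profiles
        self.messages = 0   # messages applied so far
        self.routes = 0     # (message, profile) pairs that matched
        self.state = {}     # (profile name, entity) -> profile state

    def apply(self, msg):
        self.messages += 1
        for profile in self.profiles:
            if not profile["onlyif"](msg):
                continue    # message not needed by this profile: no route
            self.routes += 1
            key = (profile["profile"], profile["foreach"](msg))
            if key not in self.state:
                self.state[key] = profile["init"]()
            self.state[key] = profile["update"](self.state[key], msg)

    def flush(self):
        # One measurement per (profile, entity) pair, then clear the state.
        by_name = {p["profile"]: p for p in self.profiles}
        measurements = [
            {"profile": name, "entity": entity,
             "value": by_name[name]["result"](state)}
            for (name, entity), state in self.state.items()
        ]
        self.state = {}
        return measurements


if __name__ == "__main__":
    profiler = MiniProfiler([HELLO_WORLD])
    for _ in range(3):
        profiler.apply({"ip_src_addr": "10.0.0.1"})
    print(profiler.messages, profiler.routes)
    print(profiler.flush())
```

Applying the same message three times yields a single measurement with a value of 3 for the entity `10.0.0.1`, which mirrors the flush output shown in the README text.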

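
The `period` map in the sample `PROFILER_FLUSH` output above is internally consistent with simple integer arithmetic on a 15 minute (900000 ms) window. The sketch below reproduces those numbers in Python. The formula is inferred from the sample values only and is an assumption, not a statement of how the Profiler computes periods; `period_for` is a hypothetical helper.

```python
def period_for(timestamp_ms, duration_ms=15 * 60 * 1000):
    """Assumed relationship: period = floor(timestamp / duration)."""
    period = timestamp_ms // duration_ms
    start = period * duration_ms
    return {
        "duration": duration_ms,
        "period": period,
        "start": start,
        "end": start + duration_ms,
    }


if __name__ == "__main__":
    # Any timestamp inside the window maps to the same period.
    print(period_for(1502665200000))
    print(period_for(1502665500000)["period"])
```

With the README's `start` value of 1502665200000 as input, this returns period 1669628 and end 1502666100000, matching the sample measurement.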