[GitHub] metron issue #961: METRON-1487 Define Performance Benchmarks for Enrichment ...

2018-03-16 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/961
  
@JonZeolla @cestella Are you guys good with this?


---


[GitHub] metron pull request #966: METRON-1493 Unhelpful Error Message When Assignmen...

2018-03-16 Thread nickwallen
GitHub user nickwallen opened a pull request:

https://github.com/apache/metron/pull/966

METRON-1493 Unhelpful Error Message When Assignment Expressions Fail

When an assignment expression fails, the error message does not say what 
went wrong.  Prior to this PR the error message looks something like 
this.

```
[Stellar]>>> p := 0/0
[!] Assignment expression failed
java.lang.IllegalArgumentException: Assignment expression failed
    at org.apache.metron.stellar.common.shell.StellarResult.error(StellarResult.java:115)
    at org.apache.metron.stellar.common.shell.specials.AssignmentCommand.execute(AssignmentCommand.java:82)
    at org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.execute(DefaultStellarShellExecutor.java:252)
    at org.apache.metron.stellar.common.shell.cli.StellarShell.execute(StellarShell.java:357)
    at org.jboss.aesh.console.AeshProcess.run(AeshProcess.java:53)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
```

This has now been fixed so that the error message reports the underlying cause of the failure.

```
[Stellar]>>> x := 0/0
[!] / by zero
java.lang.ArithmeticException: / by zero
    at org.apache.metron.stellar.common.evaluators.ArithmeticEvaluator$ArithmeticEvaluatorFunctions.lambda$division$3(ArithmeticEvaluator.java:98)
    at org.apache.metron.stellar.common.evaluators.ArithmeticEvaluator.evaluate(ArithmeticEvaluator.java:39)
    at org.apache.metron.stellar.common.StellarCompiler.lambda$exitArithExpr_div$2(StellarCompiler.java:316)
    at org.apache.metron.stellar.common.StellarCompiler$Expression.apply(StellarCompiler.java:190)
    at org.apache.metron.stellar.common.BaseStellarProcessor.parse(BaseStellarProcessor.java:145)
    at org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.executeStellar(DefaultStellarShellExecutor.java:401)
    at org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.execute(DefaultStellarShellExecutor.java:257)
    at org.apache.metron.stellar.common.shell.specials.AssignmentCommand.execute(AssignmentCommand.java:66)
    at org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.execute(DefaultStellarShellExecutor.java:252)
    at org.apache.metron.stellar.common.shell.cli.StellarShell.execute(StellarShell.java:357)
    at org.jboss.aesh.console.AeshProcess.run(AeshProcess.java:53)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
```

To test this, launch the REPL and try it out.
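
The difference between the two behaviors can be sketched in a few lines of Python. This is only an illustration of the idea, not the Java change made to `AssignmentCommand`; the names `evaluate`, `assign_generic`, `assign_with_cause` and `describe_failure` are hypothetical.

```python
def evaluate(expression, variables):
    """Stand-in for the Stellar evaluator: evaluates a tiny arithmetic expression."""
    # eval() keeps the sketch short; never use it on untrusted input.
    return eval(expression, {"__builtins__": {}}, dict(variables))


def assign_generic(name, expression, variables):
    """Old behavior: the root cause is replaced with a generic message."""
    try:
        variables[name] = evaluate(expression, variables)
    except Exception:
        raise ValueError("Assignment expression failed") from None


def assign_with_cause(name, expression, variables):
    """New behavior: the original exception surfaces unchanged."""
    variables[name] = evaluate(expression, variables)


def describe_failure(assign, name, expression):
    """Return the message a user would see for this assignment."""
    try:
        assign(name, expression, {})
    except Exception as e:
        return f"{type(e).__name__}: {e}"
    return "ok"


print(describe_failure(assign_generic, "p", "0/0"))
print(describe_failure(assign_with_cause, "x", "0/0"))
```

The first call reports only the generic failure, while the second reports the division by zero that actually caused it.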

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/nickwallen/metron METRON-1493

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/966.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #966


commit 88ffa5be69e0901ac6c79952ffe164d2de4ef2a7
Author: Nick Allen 
Date:   2018-03-16T13:15:59Z

METRON-1493 Unhelpful Error Message When Assignment Expressions Fail




---


[GitHub] metron pull request #965: METRON-590 Enable Use of Event Time in Profiler

2018-03-15 Thread nickwallen
GitHub user nickwallen opened a pull request:

https://github.com/apache/metron/pull/965

METRON-590 Enable Use of Event Time in Profiler

This enables the use of event time processing in the Profiler.

By default, the Profiler will still use processing time.  If you configure 
the profiler with a `timestampField` then it will extract the timestamps from 
that field contained within the incoming telemetry.
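
As a rough sketch of the choice described above, the following Python picks a timestamp for a message. It is not the Profiler's implementation. The function name `resolve_timestamp`, the use of epoch milliseconds, and returning `None` when the field is missing are all assumptions made for illustration.

```python
import time


def resolve_timestamp(message, timestamp_field=None, clock=time.time):
    """Pick the timestamp (assumed epoch millis) to use for a message.

    With no timestamp_field configured, fall back to processing time.
    """
    if timestamp_field is None:
        # processing time: whatever the wall clock says right now
        return int(clock() * 1000)
    if timestamp_field not in message:
        # assumption: a message without the field has no usable event time
        return None
    # event time: taken from the telemetry itself
    return int(message[timestamp_field])


message = {"source.type": "bro", "timestamp": 1521072000000}
print(resolve_timestamp(message, "timestamp"))  # event time from the message
print(resolve_timestamp(message))               # processing time from the clock
```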

## Manual Testing



1. Launch a development environment.  Shut down Indexing, Elasticsearch, 
Kibana, YARN, and MapReduce2 to avoid any resource issues.

1. Using Ambari, change the following settings and restart the Profiler.

Set the "Period Duration" to 1 minute.
Set the "Window Duration" to 15 seconds.
Set the "Window Lag" to 30 seconds.

1. Replace `/opt/sensor-stubs/bin/start-bro-stub` with the following.

Instead of adding the current time into each Bro message, this will add 
a timestamp from 1 day ago.
```
#
# how long to delay between each 'batch' in seconds.
#
DELAY=${1:-2}

#
# how many messages to send in each 'batch'.  the messages are drawn randomly
# from the entire set of canned data.
#
COUNT=${2:-10}

INPUT="/opt/sensor-stubs/data/bro.out"
PRODUCER="/usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh"
TOPIC="bro"

while true; do

  # transform the bro timestamp and push to kafka
  SEARCH="\"ts\"\:[0-9]\+\."
  REPLACE="\"ts\"\:`date -d '1 day ago' +'%s'`\."
  shuf -n $COUNT $INPUT | sed -e "s/$SEARCH/$REPLACE/g" | $PRODUCER --broker-list node1:6667 --topic $TOPIC

  sleep $DELAY
done
```
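
For reference, the substitution that the `sed` expression performs can be written out in Python. This is a sketch for checking the transformation by hand, not part of the stub. The function name `backdate` and the sample record are made up, and one day is taken to be exactly 86400 seconds.

```python
import re
import time

# matches the integer part of the Bro 'ts' field, up to the decimal point
TS_PATTERN = re.compile(r'"ts":[0-9]+\.')


def backdate(line, now=None, seconds_ago=86400):
    """Replace the integer part of 'ts' with a timestamp from the past."""
    now = int(time.time()) if now is None else int(now)
    return TS_PATTERN.sub(f'"ts":{now - seconds_ago}.', line)


record = '{"ts":1467011157.401,"uid":"x"}'
print(backdate(record, now=1521200000))
```

Only the whole seconds are rewritten; the fractional part of the original timestamp is left in place, just as in the shell version.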

1. Restart the Bro Sensor Stub.

```
service sensor-stubs stop
service sensor-stubs start bro
```

1. Open up the REPL and configure the Profiler like so.

Notice that we are setting the 'timestampField' within the Profiler 
configuration.  This will tell the Profiler to extract the timestamp from this 
field rather than using system time.
```
[Stellar]>>> conf := SHELL_EDIT(conf)
{
  "profiles": [
    {
      "profile": "hello-world",
      "onlyif": "source.type == 'bro'",
      "foreach": "'global'",
      "init":    { "count": "0" },
      "update":  { "count": "count + 1" },
      "result":  "count"
    }
  ],
  "timestampField": "timestamp"
}
[Stellar]>>>
[Stellar]>>>
[Stellar]>>> CONFIG_PUT("PROFILER",conf)
```

1. Query the Profiler data store.  This will take a minute or so until you 
see a value written.

```
[Stellar]>>> PROFILE_GET("hello-world", "global", PROFILE_FIXED(2, "DAYS"))
[]
[Stellar]>>> PROFILE_GET("hello-world", "global", PROFILE_FIXED(2, "DAYS"))
[200]
```

1. Now query back just a couple of hours instead.  Notice that you should get 
no results.  This indicates that the Profiler successfully used the timestamp 
from the Bro data, which contained day-old values.

```
[Stellar]>>> PROFILE_GET("hello-world", "global", PROFILE_FIXED(2, "HOURS"))
[]
```
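
The reasoning behind the last two queries comes down to simple lookback arithmetic, sketched here in Python. This is not how `PROFILE_FIXED` is implemented; `in_lookback` and the sample timestamps are made up for the example.

```python
DAY = 24 * 60 * 60
HOUR = 60 * 60


def in_lookback(event_ts, now, lookback_seconds):
    """True if an event-time timestamp falls inside [now - lookback, now]."""
    return now - lookback_seconds <= event_ts <= now


now = 1521200000       # hypothetical "current" time, in epoch seconds
event_ts = now - DAY   # the stub stamps each message one day in the past

print(in_lookback(event_ts, now, 2 * DAY))   # inside the 2 day window
print(in_lookback(event_ts, now, 2 * HOUR))  # outside the 2 hour window
```

A profile written with day-old event timestamps is visible to the 2 day query but falls outside the 2 hour one.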

1. Now change the Profiler configuration to remove the "timestampField" 
setting.  This will switch the Profiler back to using system time, also known 
as processing time.

```
[Stellar]>>> conf := SHELL_EDIT(conf)
{
  "profiles": [
    {
      "profile": "hello-world",
      "onlyif": "source.type == 'bro'",
      "foreach": "'global'",
      "init":    { "count": "0" },
      "update":  { "count": "count + 1" },
      "result":  "count"
    }
  ]
}
[Stellar]>>>
[Stellar]>>> CONFIG_PUT("PROFILER",conf)
```

1. The Profiler will pick up the change after the next flush event.  Query 
for profile data in the past few minutes.  This shows that the Profiler has 
switched bac

[GitHub] metron issue #963: METRON-1490: Better error message when user specifies an ...

2018-03-15 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/963
  
+1 LGTM


---


[GitHub] metron issue #962: METRON-1488: user_settings hbase table does not have acls...

2018-03-15 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/962
  
+1 looks good. thanks


---


[GitHub] metron pull request #961: METRON-1487 Define Performance Benchmarks for Enri...

2018-03-15 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/961#discussion_r174772238
  
--- Diff: metron-platform/metron-enrichment/Performance.md ---
@@ -0,0 +1,527 @@
+
+
+# Enrichment Performance
+
+This guide defines a set of benchmarks used to measure the performance of the Enrichment topology.  The guide also provides detailed steps on how to execute those benchmarks along with advice for tuning the Unified Enrichment topology.
+
+* [Benchmarks](#benchmarks)
+* [Benchmark Execution](#benchmark-execution)
+* [Performance Tuning](#performance-tuning)
+* [Benchmark Results](#benchmark-results)
+
+## Benchmarks
+
+The following section describes a set of enrichments that will be used to benchmark the performance of the Enrichment topology.
+
+* [Geo IP Enrichment](#geo-ip-enrichment)
+* [HBase Enrichment](#hbase-enrichment)
+* [Stellar Enrichment](#stellar-enrichment)
+
+### Geo IP Enrichment
+
+This benchmark measures the performance of executing a Geo IP enrichment.  Given a valid IP address, the enrichment will append detailed location information for that IP.  The location information is sourced from an external Geo IP data source like [MaxMind](https://github.com/maxmind/GeoIP2-java).
+
+#### Configuration
+
+Adding the following Stellar expression to the Enrichment topology configuration will define a Geo IP enrichment.
+```
+geo := GEO_GET(ip_dst_addr)
+```
+
+After the enrichment process completes, the telemetry message will contain a set of fields with location information for the given IP address.
+```
+{
+   "ip_dst_addr":"151.101.129.140",
+   ...
+   "geo.city":"San Francisco",
+   "geo.country":"US",
+   "geo.dmaCode":"807",
+   "geo.latitude":"37.7697",
+   "geo.location_point":"37.7697,-122.3933",
+   "geo.locID":"5391959",
+   "geo.longitude":"-122.3933",
+   "geo.postalCode":"94107"
+}
+```
+
+### HBase Enrichment
+
+This benchmark measures the performance of executing an enrichment that retrieves data from an external HBase table. This type of enrichment is useful for enriching telemetry from an Asset Database or other source of relatively static data.
+
+#### Configuration
+
+Adding the following Stellar expression to the Enrichment topology configuration will define an HBase enrichment.  This looks up the 'ip_dst_addr' within an HBase table 'top-1m' and returns a hostname.
+```
+top1m := ENRICHMENT_GET('top-1m', ip_dst_addr, 'top-1m', 't')
+```
+
+After the telemetry has been enriched, it will contain the host and IP elements that were retrieved from the HBase table.
+```
+{
+   "ip_dst_addr":"151.101.2.166",
+   ...
+   "top1m.host":"earther.com",
+   "top1m.ip":"151.101.2.166"
+}
+```
+
+### Stellar Enrichment
+
+This benchmark measures the performance of executing a basic Stellar expression.  In this benchmark, the enrichment is purely a computational task that has no dependence on an external system like a database.
+
+#### Configuration
+
+Adding the following Stellar expression to the Enrichment topology configuration will define a basic Stellar enrichment.  The following returns true if the IP is in the given subnet and false otherwise.
+```
+local := IN_SUBNET(ip_dst_addr, '192.168.0.0/24')
+```
+
+After the telemetry has been enriched, it will contain a field with a boolean value indicating whether the IP was within the given subnet.
+```
+{
+   "ip_dst_addr":"151.101.2.166",
+   ...
+   "local":false
+}
+```
+
+## Benchmark Execution
+
+This section describes the steps necessary to execute the performance benchmarks for the Enrichment topology.
+
+* [Prepare Enrichment Data](#prepare-enrichment-data)
+* [Load HBase with Enrichment Data](#load-hbase-with-enrichment-data)
+* [Configure the Enrichments](#configure-the-enrichments)
+* [Create Input Telemetry](#create-input-telemetry)
+* [Cluster Setup](#cluster-setup)
+* [Monitoring](#monitoring)
+
+### Prepare Enrichment Data
+
+The Alexa Top 1 Million was used as a data source for these benchmarks.
+
+1. Download the [Alexa Top 1 Million](http://s3.amazonaws.com/alexa-static/top-1m.csv.zip).
+
+2. For each hostname, query DNS to retrieve an associated IP address.  
+
+   A script like the

[GitHub] metron-bro-plugin-kafka issue #6: METRON-1469: Kafka Plugin for Bro - Config...

2018-03-13 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron-bro-plugin-kafka/pull/6
  
+1 Looks great.  Thanks @dcode 


---


[GitHub] metron pull request #961: METRON-1487 Define Performance Benchmarks for Enri...

2018-03-12 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/961#discussion_r173963284
  
--- Diff: metron-platform/metron-enrichment/Performance.md ---
@@ -0,0 +1,522 @@
+
+
+# Enrichment Performance
+
+This guide defines a set of benchmarks used to measure the performance of the Enrichment topology.  The guide also provides detailed steps on how to execute those benchmarks along with advice for tuning the Enrichment topology.
+
+* [Benchmarks](#benchmarks)
+* [Benchmark Execution](#benchmark-execution)
+* [Performance Tuning](#performance-tuning)
+* [Benchmark Results](#benchmark-results)
+
+## Benchmarks
+
+* [Geo IP Enrichment](#geo-ip-enrichment)
+* [HBase Enrichment](#hbase-enrichment)
+* [Stellar Enrichment](#stellar-enrichment)
+
+### Geo IP Enrichment
+
+This benchmark measures the performance of executing a Geo IP enrichment.  Given a valid IP address, the enrichment will append detailed location information for that IP.  The location information is sourced from an external Geo IP data source like [MaxMind](https://github.com/maxmind/GeoIP2-java).
+
+#### Configuration
+
+Adding the following Stellar expression to the Enrichment topology configuration will define a Geo IP enrichment.
+```
+geo := GEO_GET(ip_dst_addr)
+```
+
+After the enrichment process completes, the telemetry message will contain a set of fields with location information for the given IP address.
+```
+{
+   "ip_dst_addr":"151.101.129.140",
+   ...
+   "geo.city":"San Francisco",
+   "geo.country":"US",
+   "geo.dmaCode":"807",
+   "geo.latitude":"37.7697",
+   "geo.location_point":"37.7697,-122.3933",
+   "geo.locID":"5391959",
+   "geo.longitude":"-122.3933",
+   "geo.postalCode":"94107"
+}
+```
+
+### HBase Enrichment
+
+This benchmark measures the performance of executing an enrichment that retrieves data from an external HBase table. This type of enrichment is useful for enriching telemetry from an Asset Database or other source of relatively static data.
+
+#### Configuration
+
+Adding the following Stellar expression to the Enrichment topology configuration will define an HBase enrichment.  This looks up the 'ip_dst_addr' within an HBase table 'top-1m' and returns a hostname.
+```
+top1m := ENRICHMENT_GET('top-1m', ip_dst_addr, 'top-1m', 't')
+```
+
+After the telemetry has been enriched, it will contain the host and IP elements that were retrieved from the HBase table.
+```
+{
+   "ip_dst_addr":"151.101.2.166",
+   ...
+   "top1m.host":"earther.com",
+   "top1m.ip":"151.101.2.166"
+}
+```
+
+### Stellar Enrichment
+
+This benchmark measures the performance of executing a basic Stellar expression.  In this benchmark, the enrichment is purely a computational task that has no dependence on an external system like a database.
+
+#### Configuration
+
+Adding the following Stellar expression to the Enrichment topology configuration will define a basic Stellar enrichment.  The following returns true if the IP is in the given subnet and false otherwise.
+```
+local := IN_SUBNET(ip_dst_addr, '192.168.0.0/24')
+```
+
+After the telemetry has been enriched, it will contain a field with a boolean value indicating whether the IP was within the given subnet.
+```
+{
+   "ip_dst_addr":"151.101.2.166",
+   ...
+   "local":false
+}
+```
+   
+## Benchmark Execution
+
+* [Prepare Enrichment Data](#prepare-enrichment-data)
+* [Load HBase with Enrichment Data](#load-hbase-with-enrichment-data)
+* [Configure the Enrichments](#configure-the-enrichments)
+* [Create Input Telemetry](#create-input-telemetry)
+* [Cluster Setup](#cluster-setup)
+* [Monitoring](#monitoring)
+
+### Prepare Enrichment Data
+
+The Alexa Top 1 Million was used as a data source for these benchmarks.
+
+1. Download the [Alexa Top 1 Million](http://s3.amazonaws.com/alexa-static/top-1m.csv.zip).
+
+2. For each hostname, query DNS to retrieve an associated IP address.  
+
+   A script like the following can be used for this.  There is no need to do this for all 1 million entries in the data set. Doing this for around 10,000 records is sufficient.
+
+   ```python
+   import dns.resolver
+   import csv
+
+   resolver = dns.res

[GitHub] metron pull request #961: METRON-1487 Define Performance Benchmarks for Enri...

2018-03-12 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/961#discussion_r173963096
  

[GitHub] metron pull request #961: METRON-1487 Define Performance Benchmarks for Enri...

2018-03-12 Thread nickwallen
GitHub user nickwallen opened a pull request:

https://github.com/apache/metron/pull/961

METRON-1487 Define Performance Benchmarks for Enrichment Topology

I created a markdown document that defines a set of performance benchmarks 
for the Enrichment topology.  These benchmarks should be repeatable to help 
detect performance regressions that might occur over time.

This PR creates a new markdown document under 
`metron-platform/metron-enrichment` that does the following.
(1) Defines performance benchmarks for the Enrichment topology
(2) Describes how the benchmarks can be executed
(3) Describes how to tune the topology when executing the benchmarks 
(4) Describes actual benchmark results and tuned parameters

## Pull Request Checklist

- [ ] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [ ] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?
- [ ] Have you ensured that the format looks appropriate for the output in 
which it is rendered by building and verifying the site-book? If not then run 
the following commands and then verify changes via 
`site-book/target/site/index.html`:



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/nickwallen/metron METRON-1487

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/961.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #961


commit 706b06c01d0109fe4b1ce4e332f736864413435f
Author: Nick Allen 
Date:   2018-03-12T16:58:53Z

METRON-1487 Define Performance Benchmarks for Enrichment Topology




---


[GitHub] metron pull request #947: METRON-1467: Replace guava caches in places where ...

2018-03-07 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/947#discussion_r172888378
  
--- Diff: metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/bolt/JoinBolt.java ---
@@ -89,29 +91,25 @@ public void prepare(Map map, TopologyContext topologyContext, OutputCollector ou
 if (this.maxTimeRetain == null) {
   throw new IllegalStateException("maxTimeRetain must be specified");
 }
-loader = new CacheLoader>() {
-  @Override
-  public Map load(String key) throws Exception {
-return new HashMap<>();
-  }
-};
-cache = CacheBuilder.newBuilder().maximumSize(maxCacheSize)
-.expireAfterWrite(maxTimeRetain, TimeUnit.MINUTES).removalListener(new JoinRemoveListener())
-.build(loader);
+loader = s -> new HashMap<>();
+cache = Caffeine.newBuilder().maximumSize(maxCacheSize)
+ .expireAfterWrite(maxTimeRetain, TimeUnit.MINUTES)
+ .removalListener(new JoinRemoveListener())
--- End diff --

Yes, it is pre-existing.  We can address at a later time.

I remember now, maxing out this cache causes the Split/Join to fail, which 
is a major problem for the Split/Join topology.  And this cache here is only 
for the Split/Join, not the Unified topology.

We should probably look at adding similar logging (only when ERROR enabled) 
for the other places where we use the cache.  Or just some mechanism to 
periodically log cache stats.  Anywho, down the road.
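
One way to periodically log cache stats is sketched below in Python. It is not tied to Caffeine or to any Metron class. `StatsLoggingCache`, the logger name, and the `log_every` parameter are all hypothetical, and a real implementation would more likely lean on the statistics that the cache library already records.

```python
import logging
from collections import OrderedDict

LOG = logging.getLogger("cache-stats-sketch")


class StatsLoggingCache:
    """A small LRU cache that counts hits, misses and evictions."""

    def __init__(self, max_size, loader, log_every=1000):
        self.max_size = max_size
        self.loader = loader
        self.log_every = log_every
        self.entries = OrderedDict()
        self.hits = self.misses = self.evictions = 0

    def get(self, key):
        if key in self.entries:
            self.hits += 1
            self.entries.move_to_end(key)  # mark as most recently used
            value = self.entries[key]
        else:
            self.misses += 1
            value = self.loader(key)
            self.entries[key] = value
            if len(self.entries) > self.max_size:
                self.entries.popitem(last=False)  # evict least recently used
                self.evictions += 1
        # emit the counters once every log_every requests
        if (self.hits + self.misses) % self.log_every == 0:
            LOG.info("cache stats: hits=%d misses=%d evictions=%d",
                     self.hits, self.misses, self.evictions)
        return value


cache = StatsLoggingCache(max_size=2, loader=str.upper, log_every=2)
for key in ["a", "a", "b", "c"]:
    cache.get(key)
print(cache.hits, cache.misses, cache.evictions)
```

A steadily climbing eviction count in the log would be the signal that the cache is being maxed out.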


---


[GitHub] metron issue #947: METRON-1467: Replace guava caches in places where the key...

2018-03-07 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/947
  
+1 LGTM


---


[GitHub] metron pull request #947: METRON-1467: Replace guava caches in places where ...

2018-03-07 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/947#discussion_r172866140
  
--- Diff: metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/bolt/JoinBolt.java ---
@@ -89,29 +91,25 @@ public void prepare(Map map, TopologyContext topologyContext, OutputCollector ou
 if (this.maxTimeRetain == null) {
   throw new IllegalStateException("maxTimeRetain must be specified");
 }
-loader = new CacheLoader>() {
-  @Override
-  public Map load(String key) throws Exception {
-return new HashMap<>();
-  }
-};
-cache = CacheBuilder.newBuilder().maximumSize(maxCacheSize)
-.expireAfterWrite(maxTimeRetain, TimeUnit.MINUTES).removalListener(new JoinRemoveListener())
-.build(loader);
+loader = s -> new HashMap<>();
+cache = Caffeine.newBuilder().maximumSize(maxCacheSize)
+ .expireAfterWrite(maxTimeRetain, TimeUnit.MINUTES)
+ .removalListener(new JoinRemoveListener())
--- End diff --

It seems like we only want to be notified of a full cache when ERROR logging is 
set. Is that the case? In the `JoinRemoveListener` we end up doing some work 
that we probably don't need to do unless ERROR logging is set.  One easy fix 
would be to only add the "remove listener" if `LOG.isDebugEnabled()`.
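
The general shape of that fix, registering the listener only when its output would actually be logged, looks roughly like this in Python. It is a sketch of the pattern, not the Java in `JoinBolt`. `on_removal`, `build_cache_options` and the option names are hypothetical, and the level to check is left as a parameter.

```python
import logging

LOG = logging.getLogger("join-cache-sketch")


def on_removal(key, value):
    """Hypothetical listener: reports an entry removed from the cache."""
    LOG.debug("removed %s from the join cache (%d fields)", key, len(value))


def build_cache_options(max_size, log=LOG, level=logging.DEBUG):
    """Assemble cache settings, adding the listener only if it would log."""
    options = {"maximum_size": max_size}
    if log.isEnabledFor(level):
        # pay for the listener's work only when someone will see the output
        options["removal_listener"] = on_removal
    return options


quiet = logging.getLogger("quiet-sketch")
quiet.setLevel(logging.ERROR)
print("removal_listener" in build_cache_options(10, log=quiet))
```

With the logger set to ERROR and the check made at DEBUG, no listener is registered, so the cache does no extra work on eviction.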


---


[GitHub] metron issue #940: METRON-1460: Create a complementary non-split-join enrich...

2018-03-06 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/940
  
+1 The unified topology works great.


---


[GitHub] metron pull request #940: METRON-1460: Create a complementary non-split-join...

2018-03-06 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/940#discussion_r172694248
  
--- Diff: metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/parallel/ParallelEnricher.java ---
@@ -0,0 +1,281 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.enrichment.parallel;
+
+import com.github.benmanes.caffeine.cache.stats.CacheStats;
+import org.apache.metron.common.Constants;
+import org.apache.metron.common.configuration.enrichment.SensorEnrichmentConfig;
+import org.apache.metron.common.configuration.enrichment.handler.ConfigHandler;
+import org.apache.metron.common.performance.PerformanceLogger;
+import org.apache.metron.common.utils.MessageUtils;
+import org.apache.metron.enrichment.bolt.CacheKey;
+import org.apache.metron.enrichment.interfaces.EnrichmentAdapter;
+import org.apache.metron.enrichment.utils.EnrichmentUtils;
+import org.json.simple.JSONObject;
+
+import java.util.AbstractMap;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.EnumMap;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutionException;
+import java.util.function.BinaryOperator;
+import java.util.function.Supplier;
+
+/**
+ * This is an independent component which will accept a message and a set of enrichment adapters as well as a config which defines
+ * how those enrichments should be performed and fully enrich the message.  The result will be the enriched message
+ * unified together and a list of errors which happened.
+ */
+public class ParallelEnricher {
+
+  private Map> enrichmentsByType = new HashMap<>();
+  private EnumMap cacheStats = new EnumMap<>(EnrichmentStrategies.class);
+
+  /**
+   * The result of an enrichment.
+   */
+  public static class EnrichmentResult {
+private JSONObject result;
+private List> enrichmentErrors;
+
+public EnrichmentResult(JSONObject result, List> enrichmentErrors) {
+  this.result = result;
+  this.enrichmentErrors = enrichmentErrors;
+}
+
+/**
+ * The unified fully enriched result.
+ * @return
+ */
+public JSONObject getResult() {
+  return result;
+}
+
+/**
+ * The errors that happened in the course of enriching.
+ * @return
+ */
+public List> getEnrichmentErrors() {
+  return enrichmentErrors;
+}
+  }
+
+  private ConcurrencyContext concurrencyContext;
+
+  /**
+   * Construct a parallel enricher with a set of enrichment adapters associated with their enrichment types.
+   * @param enrichmentsByType
+   */
+  public ParallelEnricher( Map> enrichmentsByType
+ , ConcurrencyContext concurrencyContext
+ , boolean logStats
+ )
+  {
+this.enrichmentsByType = enrichmentsByType;
+this.concurrencyContext = concurrencyContext;
+if(logStats) {
+  for(EnrichmentStrategies s : EnrichmentStrategies.values()) {
+cacheStats.put(s, null);
+  }
+}
+  }
+
+  /**
+   * Fully enriches a message.  Each enrichment is done in parallel via a threadpool.
+   * Each enrichment is fronted with an LRU cache.
+   *
+   * @param message the message to enrich
+   * @param strategy The enrichment strategy to use (e.g. enrichment or threat intel)
+   * @param config The sensor enrichment config
+   * @param perfLog The performance logger.  We log the performance for this call, the split portion and the enrichment portion.
+   * @return the enrichment result
+   */
+  pu

[GitHub] metron pull request #940: METRON-1460: Create a complementary non-split-join...

2018-03-06 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/940#discussion_r172595029
  
--- Diff: metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/parallel/ParallelEnricher.java ---
@@ -0,0 +1,281 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.enrichment.parallel;
+
+import com.github.benmanes.caffeine.cache.stats.CacheStats;
+import org.apache.metron.common.Constants;
+import 
org.apache.metron.common.configuration.enrichment.SensorEnrichmentConfig;
+import 
org.apache.metron.common.configuration.enrichment.handler.ConfigHandler;
+import org.apache.metron.common.performance.PerformanceLogger;
+import org.apache.metron.common.utils.MessageUtils;
+import org.apache.metron.enrichment.bolt.CacheKey;
+import org.apache.metron.enrichment.interfaces.EnrichmentAdapter;
+import org.apache.metron.enrichment.utils.EnrichmentUtils;
+import org.json.simple.JSONObject;
+
+import java.util.AbstractMap;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.EnumMap;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutionException;
+import java.util.function.BinaryOperator;
+import java.util.function.Supplier;
+
+/**
+ * This is an independent component which will accept a message and a set 
of enrichment adapters as well as a config which defines
+ * how those enrichments should be performed and fully enrich the message. 
 The result will be the enriched message
+ * unified together and a list of errors which happened.
+ */
+public class ParallelEnricher {
+
+  private Map> enrichmentsByType = new 
HashMap<>();
+  private EnumMap cacheStats = new 
EnumMap<>(EnrichmentStrategies.class);
+
+  /**
+   * The result of an enrichment.
+   */
+  public static class EnrichmentResult {
+private JSONObject result;
+private List> enrichmentErrors;
+
+public EnrichmentResult(JSONObject result, List> enrichmentErrors) {
+  this.result = result;
+  this.enrichmentErrors = enrichmentErrors;
+}
+
+/**
+ * The unified fully enriched result.
+ * @return
+ */
+public JSONObject getResult() {
+  return result;
+}
+
+/**
+ * The errors that happened in the course of enriching.
+ * @return
+ */
+public List> getEnrichmentErrors() {
+  return enrichmentErrors;
+}
+  }
+
+  private ConcurrencyContext concurrencyContext;
+
+  /**
+   * Construct a parallel enricher with a set of enrichment adapters 
associated with their enrichment types.
+   * @param enrichmentsByType
+   */
+  public ParallelEnricher( Map> 
enrichmentsByType
+ , ConcurrencyContext concurrencyContext
+ , boolean logStats
+ )
+  {
+this.enrichmentsByType = enrichmentsByType;
+this.concurrencyContext = concurrencyContext;
+if(logStats) {
+  for(EnrichmentStrategies s : EnrichmentStrategies.values()) {
+cacheStats.put(s, null);
+  }
+}
+  }
+
+  /**
+   * Fully enriches a message.  Each enrichment is done in parallel via a 
threadpool.
+   * Each enrichment is fronted with a LRU cache.
+   *
+   * @param message the message to enrich
+   * @param strategy The enrichment strategy to use (e.g. enrichment or 
threat intel)
+   * @param config The sensor enrichment config
+   * @param perfLog The performance logger.  We log the performance for 
this call, the split portion and the enrichment portion.
+   * @return the enrichment result
+   */
+  pu

[GitHub] metron issue #948: METRON-1468: Add support for apache/metron-bro-plugin-kaf...

2018-03-06 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/948
  
+1 


---


[GitHub] metron issue #942: METRON-1461: Modify the MIN, MAX Stellar methods to take ...

2018-03-06 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/942
  
The first point here is around semantics.  I am assuming the semantics of 
this would be a "max of maxes".  So if I have a list of stats objects, I 
compare the max of each one.  Whichever max is the greatest, that stats object 
gets returned.  The MIN function would just be a "min of mins".

Does that match your use case?  Is this the right approach?

Given those semantics, I think the 'Comparable' approach could work, but 
with a twist.  You can't just make the Stats objects Comparable.  Because how 
do you compare them?  By the average, median, min or max?  There is not one way 
to do it that is broadly applicable.

The means of comparison for the MAX function should use the max of a stats 
object.  The means of comparison for the MIN function should use the min of the 
stats object. 

One way is to create a class that wraps a Stats object and implements the 
Comparable interface.  In the case of MAX, the wrapper will compare using the 
max of the underlying stats object.  In the case of MIN function, the wrapper 
will compare using the min of the underlying stats object. 
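
To make the wrapper idea concrete, here is a minimal sketch.  It assumes a hypothetical `Stats` interface with `getMin()` and `getMax()`; the names `Stats`, `SimpleStats`, `ByMax`, `ByMin` and `MaxOfMaxes` are made up for illustration and are not the actual Metron classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a Stellar statistics object; not the real Metron type.
interface Stats {
  double getMin();
  double getMax();
}

// Trivial immutable implementation, used only for this illustration.
final class SimpleStats implements Stats {
  private final double min;
  private final double max;
  SimpleStats(double min, double max) { this.min = min; this.max = max; }
  @Override public double getMin() { return min; }
  @Override public double getMax() { return max; }
}

// Wrapper for MAX: orders stats objects by their maximum.
final class ByMax implements Comparable<ByMax> {
  final Stats stats;
  ByMax(Stats stats) { this.stats = stats; }
  @Override public int compareTo(ByMax other) {
    return Double.compare(stats.getMax(), other.stats.getMax());
  }
}

// Wrapper for MIN: orders stats objects by their minimum.
final class ByMin implements Comparable<ByMin> {
  final Stats stats;
  ByMin(Stats stats) { this.stats = stats; }
  @Override public int compareTo(ByMin other) {
    return Double.compare(stats.getMin(), other.stats.getMin());
  }
}

final class MaxOfMaxes {
  // "max of maxes": returns the stats object whose max is greatest.
  static Stats max(List<Stats> all) {
    List<ByMax> wrapped = new ArrayList<>();
    for (Stats s : all) { wrapped.add(new ByMax(s)); }
    return Collections.max(wrapped).stats;
  }

  // "min of mins": returns the stats object whose min is smallest.
  static Stats min(List<Stats> all) {
    List<ByMin> wrapped = new ArrayList<>();
    for (Stats s : all) { wrapped.add(new ByMin(s)); }
    return Collections.min(wrapped).stats;
  }

  public static void main(String[] args) {
    Stats a = new SimpleStats(1, 10);
    Stats b = new SimpleStats(-5, 7);
    System.out.println(max(Arrays.asList(a, b)) == a); // prints true
    System.out.println(min(Arrays.asList(a, b)) == b); // prints true
  }
}
```

Keeping the ordering in the wrapper, rather than on the stats class itself, means MAX and MIN can each pick their own means of comparison.  Note that `Collections.max` and `Collections.min` throw on an empty list, so a real implementation would need to decide what to return in that case.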





---


[GitHub] metron issue #942: METRON-1461: Modify the MIN, MAX Stellar methods to take ...

2018-03-06 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/942
  
I would like to address the issue somehow or at least garner more community 
feedback on this change.

As it stands, usage of the function is not very consistent.  For example, I 
can pass a list of numbers, but I can't pass a list of Stats objects.  I can 
have a list of mixed numeric types, but I can't have a Stats object in a mixed 
list.  That is inconsistent IMO.






---


[GitHub] metron issue #948: METRON-1468: Add support for apache/metron-bro-plugin-kaf...

2018-03-06 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/948
  
Other than that little nit, it works great though.  Thanks @JonZeolla 


---


[GitHub] metron issue #948: METRON-1468: Add support for apache/metron-bro-plugin-kaf...

2018-03-06 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/948
  
When I got to the point of selecting the repo, I was unsure of what to type 
for the Bro Plugin repo.

```
$ ./metron-pr948/dev-utilities/committer-utils/prepare-commit
  ...using settings from /Users/nallen/.metron-prepare-commit
  which repo? [metron]:
```

Do you think we could make this more obvious somehow?  Maybe something like 
the following?

```
$ ./metron-pr948/dev-utilities/committer-utils/prepare-commit
  ...using settings from /Users/nallen/.metron-prepare-commit
  which repo? [1]:
[1] metron
[2] metron-bro-plugin-kafka
```

This would, as it does now, default to `[1] metron`.


---


[GitHub] metron-bro-plugin-kafka issue #6: Configurable JSON timestamps and default a...

2018-03-06 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron-bro-plugin-kafka/pull/6
  
@dcode 

1. The JIRA created for this is 
https://issues.apache.org/jira/browse/METRON-1469.

1. Please change the PR title to "METRON-1469: Kafka Plugin for Bro - 
Configurable JSON Timestamps".

1. Please update your PR description to remove references to "send all 
logs".  I just want to avoid any future confusion.


---


[GitHub] metron-bro-plugin-kafka issue #7: METRON-1324: Increment metron-bro-plugin-k...

2018-03-06 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron-bro-plugin-kafka/pull/7
  
+1 Thanks, @JonZeolla !


---


[GitHub] metron issue #936: METRON-1450:Add rest endpoint documentation for splitting...

2018-03-06 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/936
  
Thanks @MohanDV .  Will merge this now.


---


[GitHub] metron issue #942: METRON-1461: Modify the MIN, MAX Stellar methods to take ...

2018-03-06 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/942
  
@MohanDV - It looks like some care was taken previously so that getting the 
max of a list of mixed elements will just work.  For example `MAX([1, 2d, 3f]) 
== 3f`.  

Did you consider an implementation such that this would continue to work 
with STATS objects?  I imagine that something like the following could be made 
to work: `MAX([1, 2d, 3f, stats])`?  

What do you think?
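
As a rough sketch of how a mixed list could be handled: reduce every element to a double before comparing, with numbers contributing their value and a stats object contributing its max.  `HasMax` and `MixedMax` are hypothetical names used only for this example, not part of Stellar.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a statistics object; only the maximum is needed here.
interface HasMax {
  double getMax();
}

final class MixedMax {
  // Reduce an element to a double: numbers by value, stats objects by their max.
  static double toComparable(Object o) {
    if (o instanceof Number) {
      return ((Number) o).doubleValue();
    }
    if (o instanceof HasMax) {
      return ((HasMax) o).getMax();
    }
    throw new IllegalArgumentException("Unsupported element: " + o);
  }

  // Returns the element with the greatest comparable value.
  // Null elements are skipped; an empty list yields null.
  static Object max(List<?> values) {
    Object best = null;
    for (Object v : values) {
      if (v == null) {
        continue;
      }
      if (best == null || toComparable(v) > toComparable(best)) {
        best = v;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    HasMax stats = () -> 42.0;
    List<Object> mixed = Arrays.asList(1, 2d, 3f, stats);
    System.out.println(max(mixed) == stats);                          // prints true
    System.out.println(max(Arrays.<Object>asList(1, 2d, 3f)));        // prints 3.0
  }
}
```

This returns the winning element itself, so a mixed list can yield either a number or a stats object.  Whether that is the right return type is exactly the kind of semantic question raised above.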




---


[GitHub] metron issue #940: METRON-1460: Create a complementary non-split-join enrich...

2018-03-06 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/940
  
That's great @cestella .  Many thanks.  I will run it up in the lab. No 
problem.


---


[GitHub] metron pull request #940: METRON-1460: Create a complementary non-split-join...

2018-03-05 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/940#discussion_r172359339
  
--- Diff: 
metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/parallel/Strategy.java
 ---
@@ -0,0 +1,47 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.enrichment.parallel;
+
+import org.apache.metron.common.Constants;
+import 
org.apache.metron.common.configuration.enrichment.SensorEnrichmentConfig;
+import 
org.apache.metron.common.configuration.enrichment.handler.ConfigHandler;
+import org.json.simple.JSONObject;
+import org.slf4j.Logger;
+
+import java.util.Map;
+
+/**
+ * Enrichment strategy.  This interface provides a mechanism to interface 
with the enrichment config and any
+ * post processing steps that are needed to be done after-the-fact.
+ *
+ * The reasoning behind this is that the key difference between 
enrichments and threat intel is that they pull
+ * their configurations from different parts of the SensorEnrichmentConfig 
object and as a post-join step, they differ
+ * slightly.
+ *
+ */
+public interface Strategy {
+  Constants.ErrorType getErrorType();
--- End diff --

Can we javadoc each method?  This seems like an important interface.


---


[GitHub] metron pull request #940: METRON-1460: Create a complementary non-split-join...

2018-03-05 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/940#discussion_r172353404
  
--- Diff: 
metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/bolt/UnifiedEnrichmentBolt.java
 ---
@@ -0,0 +1,415 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.enrichment.bolt;
+
+import org.apache.metron.common.Constants;
+import org.apache.metron.common.bolt.ConfiguredEnrichmentBolt;
+import org.apache.metron.common.configuration.ConfigurationType;
+import 
org.apache.metron.common.configuration.enrichment.SensorEnrichmentConfig;
+import org.apache.metron.common.error.MetronError;
+import org.apache.metron.common.performance.PerformanceLogger;
+import org.apache.metron.common.utils.ErrorUtils;
+import org.apache.metron.common.utils.MessageUtils;
+import org.apache.metron.enrichment.adapters.geo.GeoLiteDatabase;
+import org.apache.metron.enrichment.configuration.Enrichment;
+import org.apache.metron.enrichment.interfaces.EnrichmentAdapter;
+import org.apache.metron.enrichment.parallel.EnrichmentContext;
+import org.apache.metron.enrichment.parallel.EnrichmentStrategies;
+import org.apache.metron.enrichment.parallel.ParallelEnricher;
+import org.apache.metron.enrichment.parallel.WorkerPoolStrategy;
+import org.apache.metron.stellar.dsl.Context;
+import org.apache.metron.stellar.dsl.StellarFunction;
+import org.apache.metron.stellar.dsl.StellarFunctions;
+import org.apache.storm.task.OutputCollector;
+import org.apache.storm.task.TopologyContext;
+import org.apache.storm.topology.OutputFieldsDeclarer;
+import org.apache.storm.tuple.Fields;
+import org.apache.storm.tuple.Tuple;
+import org.apache.storm.tuple.Values;
+import org.json.simple.JSONObject;
+import org.json.simple.parser.JSONParser;
+import org.json.simple.parser.ParseException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.UnsupportedEncodingException;
+import java.lang.invoke.MethodHandles;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.UUID;
+
+/**
+ * This bolt is a unified enrichment/threat intel bolt.  In contrast to 
the split/enrich/join
+ * bolts above, this handles the entire enrichment lifecycle in one bolt 
using a threadpool to
+ * enrich in parallel.
+ *
+ * From an architectural perspective, this is a divergence from the 
polymorphism based strategy we have
+ * used in the split/join bolts.  Rather, this bolt is provided a strategy 
to use, either enrichment or threat intel,
+ * through composition.  This allows us to move most of the implementation 
into components independent
+ * from Storm.  This will greater facilitate reuse.
+ */
+public class UnifiedEnrichmentBolt extends ConfiguredEnrichmentBolt {
+
+  public static class Perf {} // used for performance logging
+  private PerformanceLogger perfLog; // not static bc multiple bolts may 
exist in same worker
+
+  protected static final Logger LOG = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  public static final String STELLAR_CONTEXT_CONF = "stellarContext";
+
+  /**
+   * The number of threads in the threadpool.  One threadpool is created 
per process.
+   * This is a topology-level configuration
+   */
+  public static final String THREADPOOL_NUM_THREADS_TOPOLOGY_CONF = 
"metron.threadpool.size";
+  /**
+   * The type of threadpool to create. This is a topology-level 
configuration.
+   */
+  public static final String THREADPOOL_TYPE_TOPOLOGY_CONF = 
"metron.threadpool.type";
+
+  /**
+   * The enricher implementation to use.  This will do the parallel 
enrichment via a thread pool.
+   */
+  protected ParallelEnricher enricher;
+
+  /**
+   * The strategy to use for this enrichment bolt.  Practically speak

[GitHub] metron pull request #940: METRON-1460: Create a complementary non-split-join...

2018-03-05 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/940#discussion_r172363362
  
--- Diff: 
metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/parallel/EnrichmentStrategies.java
 ---
@@ -0,0 +1,79 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.enrichment.parallel;
+
+import org.apache.metron.common.Constants;
+import 
org.apache.metron.common.configuration.enrichment.SensorEnrichmentConfig;
+import 
org.apache.metron.common.configuration.enrichment.handler.ConfigHandler;
+import org.apache.metron.enrichment.bolt.CacheKey;
+import org.json.simple.JSONObject;
+import org.slf4j.Logger;
+
+import java.util.Map;
+import java.util.concurrent.Executor;
+
+public enum EnrichmentStrategies implements Strategy {
--- End diff --

I don't understand the purpose of this class.  Why have an 
`EnrichmentStrategy`, a `ThreatIntelStrategy`, and `EnrichmentStrategies`?


---


[GitHub] metron issue #940: METRON-1460: Create a complementary non-split-join enrich...

2018-03-05 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/940
  
I completed some fairly extensive performance testing comparing this new 
Unified topology against the existing Split-Join implementation.  The 
difference was dramatic. 

- The Unified topology _performed roughly 3.4 times faster than Split-Join._

Both topologies in this side-by-side test included the same fixes, 
including the Guava cache problem fixed in #947. The tests included two 
enrichments:
* GeoIP enrichment; `geo := GEO_GET(ip_dst_addr)`
* Compute-only Stellar enrichment; `local := IN_SUBNET(ip_dst_addr, 
'192.168.0.0/24')`

The number one driver of performance is the cache hit rate, which is 
heavily dependent on what your data looks like.  With these enrichments, that's 
driven by how varied the `ip_dst_addr` is in the data.  

I tested both of these topologies with different sets of data intended to 
either increase or decrease that cache hit rate.  The differences between the 
two topologies were fairly consistent across the different data sets. 

When running these topologies, reasonably well-tuned, on the same data, I 
was able to consistently maintain 70,000 events per second with the Split/Join 
topology.  In the same environment, I was able to maintain 312,000 events per 
second using the Unified topology.  

The raw throughput numbers are relative and depend on how much hardware you 
are willing to throw at the problem.  I was running on 3 nodes dedicated to 
running the Enrichment topology only.  But with the same data, on the same 
hardware, the difference was 3.4 times.  That's big.

Pushing as much as you can into a single executor and avoiding network hops 
is definitely the way to go here.
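
For anyone checking the arithmetic, here is a small sketch of the two usual ways to compare the throughput figures quoted above.  The class and method names are illustrative only.  The 3.4 figure appears to line up with the relative increase over the Split/Join baseline rather than the raw ratio.

```java
import java.util.Locale;

final class ThroughputComparison {
  // Raw ratio: how many times the baseline throughput the new figure is.
  static double ratio(double unified, double splitJoin) {
    return unified / splitJoin;
  }

  // Relative increase: how much faster than the baseline, in multiples of the baseline.
  static double relativeIncrease(double unified, double splitJoin) {
    return (unified - splitJoin) / splitJoin;
  }

  public static void main(String[] args) {
    double splitJoin = 70_000;  // events per second, Split/Join topology
    double unified = 312_000;   // events per second, Unified topology
    System.out.println(String.format(Locale.ROOT, "%.2f", ratio(unified, splitJoin)));            // prints 4.46
    System.out.println(String.format(Locale.ROOT, "%.2f", relativeIncrease(unified, splitJoin))); // prints 3.46
  }
}
```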



---


[GitHub] metron-bro-plugin-kafka pull request #6: Configurable JSON timestamps and de...

2018-03-05 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:


https://github.com/apache/metron-bro-plugin-kafka/pull/6#discussion_r172240860
  
--- Diff: src/KafkaWriter.cc ---
@@ -54,20 +66,49 @@ KafkaWriter::KafkaWriter(WriterFrontend* frontend): 
WriterBackend(frontend), for
 }
 
 KafkaWriter::~KafkaWriter()
-{}
+{
+
+// Cleanup all the things
+delete topic;
+delete producer;
+delete formatter;
+delete conf;
+delete topic_conf;
+
+}
 
 bool KafkaWriter::DoInit(const WriterInfo& info, int num_fields, const 
threading::Field* const* fields)
 {
+// Timeformat object, default to TS_EPOCH
+threading::formatter::JSON::TimeFormat tf = 
threading::formatter::JSON::TS_EPOCH;
+
 // if no global 'topic_name' is defined, use the log stream's 'path'
 if(topic_name.empty()) {
 topic_name = info.path;
 }
 
+// format timestamps
+if ( strcmp(json_timestamps.c_str(), "JSON::TS_EPOCH") == 0 ) {
--- End diff --

Ah, I see.  Thanks for clarifying.  Let's work with what you have.  I agree 
there is little documentation in these parts.


---


[GitHub] metron-bro-plugin-kafka pull request #6: Configurable JSON timestamps and de...

2018-03-05 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:


https://github.com/apache/metron-bro-plugin-kafka/pull/6#discussion_r172187861
  
--- Diff: README.md ---
@@ -37,10 +37,11 @@ The following examples highlight different ways that 
the plugin can be used.  Si
 
 ### Example 1
 
-The goal in this example is to send all HTTP and DNS records to a Kafka 
topic named `bro`. 
+The goal in this example is to send all HTTP and DNS records to a Kafka 
topic named `bro`.
  * Any configuration value accepted by librdkafka can be added to the 
`kafka_conf` configuration table.  
  * By defining `topic_name` all records will be sent to the same Kafka 
topic.
- * Defining `logs_to_send` will ensure that only HTTP and DNS records are 
sent.
+ * Defining `logs_to_send` will ensure that only HTTP and DNS records are 
sent. An empty set will default to all logs being
--- End diff --

We should remove this edit in the README since we removed the "send all by 
default" change in your PR.




---


[GitHub] metron-bro-plugin-kafka pull request #6: Configurable JSON timestamps and de...

2018-03-05 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:


https://github.com/apache/metron-bro-plugin-kafka/pull/6#discussion_r172192869
  
--- Diff: src/KafkaWriter.cc ---
@@ -54,20 +66,49 @@ KafkaWriter::KafkaWriter(WriterFrontend* frontend): 
WriterBackend(frontend), for
 }
 
 KafkaWriter::~KafkaWriter()
-{}
+{
+
+// Cleanup all the things
+delete topic;
+delete producer;
+delete formatter;
+delete conf;
+delete topic_conf;
+
+}
 
 bool KafkaWriter::DoInit(const WriterInfo& info, int num_fields, const 
threading::Field* const* fields)
 {
+// Timeformat object, default to TS_EPOCH
+threading::formatter::JSON::TimeFormat tf = 
threading::formatter::JSON::TS_EPOCH;
+
 // if no global 'topic_name' is defined, use the log stream's 'path'
 if(topic_name.empty()) {
 topic_name = info.path;
 }
 
+// format timestamps
+if ( strcmp(json_timestamps.c_str(), "JSON::TS_EPOCH") == 0 ) {
--- End diff --

Just curious, why do we treat `json_timestamps` as a string?  Why not just 
treat it as a `JSON::TimestampFormat`?  Wouldn't that simplify a lot of this 
logic and remove all of the string comparison and string copy logic?


---


[GitHub] metron-bro-plugin-kafka pull request #6: Configurable JSON timestamps and de...

2018-03-05 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:


https://github.com/apache/metron-bro-plugin-kafka/pull/6#discussion_r172193204
  
--- Diff: src/KafkaWriter.cc ---
@@ -54,20 +66,49 @@ KafkaWriter::KafkaWriter(WriterFrontend* frontend): 
WriterBackend(frontend), for
 }
 
 KafkaWriter::~KafkaWriter()
-{}
+{
+
+// Cleanup all the things
+delete topic;
+delete producer;
+delete formatter;
+delete conf;
+delete topic_conf;
+
+}
 
 bool KafkaWriter::DoInit(const WriterInfo& info, int num_fields, const 
threading::Field* const* fields)
 {
+// Timeformat object, default to TS_EPOCH
+threading::formatter::JSON::TimeFormat tf = 
threading::formatter::JSON::TS_EPOCH;
+
 // if no global 'topic_name' is defined, use the log stream's 'path'
 if(topic_name.empty()) {
 topic_name = info.path;
 }
 
+// format timestamps
+if ( strcmp(json_timestamps.c_str(), "JSON::TS_EPOCH") == 0 ) {
+  tf = threading::formatter::JSON::TS_EPOCH;
+}
+else if ( strcmp(json_timestamps.c_str(), "JSON::TS_MILLIS") == 0 ) {
+  tf = threading::formatter::JSON::TS_MILLIS;
+}
+else if ( strcmp(json_timestamps.c_str(), "JSON::TS_ISO8601") == 0 ) {
+  tf = threading::formatter::JSON::TS_ISO8601;
+}
+else
+{
--- End diff --

Small nit: Can we join the open brace to the line above, just to match the 
rest of the code style?  Gracias.


---


[GitHub] metron issue #944: METRON-1463: Adjust the groupings and shuffles in enrichm...

2018-02-27 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/944
  
+1 These corrections should greatly improve performance. 


---


[GitHub] metron issue #940: METRON-1460: Create a complementary non-split-join enrich...

2018-02-27 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/940
  
I'd hold on merging this until we can get this tested at some decent scale. 
 Unless it already has been?  Otherwise, I don't see a need to merge this until 
we know it actually addresses a problem.


---


[GitHub] metron issue #936: METRON-1450:Added documentation for random access and bat...

2018-02-27 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/936
  
+1 Thanks for correcting this @MohanDV.  This looks good.  

Let's give @JonZeolla a chance to respond before we merge this.


---


[GitHub] metron issue #933: METRON-1452 Rebase Dev Environment on Latest CentOS 6

2018-02-26 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/933
  
FYI - After the last commit, I spun up the CentOS environment again, 
validated that the Alerts UI is receiving data, and ran the Metron Service 
Check successfully.  All is well.


---


[GitHub] metron pull request #942: METRON-1461: Modify the MIN, MAX Stellar methods t...

2018-02-26 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/942#discussion_r170597762
  
--- Diff: 
metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/dsl/functions/OrdinalFunctions.java
 ---
@@ -37,17 +35,23 @@
* Return the maximum value of a list of input values in a Stellar list
*/
   @Stellar(name = "MAX"
-  , description = "Returns the maximum value of a list of input 
values"
-  , params = {"list - List of arguments. The list may only contain 
objects that are mutually comparable / ordinal (implement java.lang.Comparable 
interface)" +
+  , description = "Returns the maximum value of a list of input 
values or from a statistics object"
+  , params = {"stats - The Stellar statistics object"
+  ,"list - List of arguments. The list may only contain objects 
that are mutually comparable / ordinal (implement java.lang.Comparable 
interface)" +
   " Multi type numeric comparisons are supported: 
MAX([10,15L,15.3]) would return 15.3, but MAX(['23',25]) will fail and return 
null as strings and numbers can't be compared."}
-  , returns = "The maximum value in the list, or null if the list 
is empty or the input values were not comparable.")
+  , returns = "The maximum value in the list or from stats, or 
null if the list is empty or the input values were not comparable.")
   public static class Max extends BaseStellarFunction {
 
 @Override
 public Object apply(List args) {
   if (args.size() < 1 || args.get(0) == null) {
 throw new IllegalStateException("MAX function requires at least a 
Stellar list of values");
   }
+  Object firstArg = args.get(0);
+  if(firstArg instanceof Ordinal) {
+Ordinal stats = convert(firstArg, Ordinal.class);
+return stats.getMax();
+  }
   Iterable list = (Iterable) args.get(0);
--- End diff --

It would make sense to wrap the existing "iterable" handling code in an 
"else if".  And also handle the possibility that the argument is neither an 
Iterable nor an Ordinal. Perhaps like so...

```
Object firstArg = args.get(0);
if(firstArg instanceof Ordinal) {
  // handle the Ordinal (stats) argument here

} else if(firstArg instanceof Iterable) {
  // existing Iterable handling goes here

} else {
   throw new IllegalStateException("MAX function expects either  ");

}
```
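
A fuller, self-contained sketch of that three-way dispatch is below.  It is only an illustration: the `Ordinal` interface here is a hypothetical mirror of the one in the PR (only `getMax()` is visible in the diff, `getMin()` is assumed), the error messages are placeholder wording, and the list handling is simplified to numeric elements.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical mirror of the PR's Ordinal interface; getMin() is an assumption.
interface Ordinal {
  double getMin();
  double getMax();
}

final class MaxDispatch {
  static Object max(List<Object> args) {
    // Only checks that an argument was supplied, so the message says just that.
    if (args.size() < 1 || args.get(0) == null) {
      throw new IllegalStateException("MAX function expects one argument");
    }
    Object firstArg = args.get(0);
    if (firstArg instanceof Ordinal) {
      // A stats-like object reports its own maximum.
      return ((Ordinal) firstArg).getMax();
    } else if (firstArg instanceof Iterable) {
      // Simplified list handling: numeric elements only, compared as doubles.
      Number best = null;
      for (Object o : (Iterable<?>) firstArg) {
        if (!(o instanceof Number)) {
          return null; // elements that cannot be compared yield null
        }
        Number n = (Number) o;
        if (best == null || n.doubleValue() > best.doubleValue()) {
          best = n;
        }
      }
      return best; // null when the list is empty
    } else {
      // Neither an Ordinal nor an Iterable: fail with a clear message.
      throw new IllegalStateException("MAX function expects either an Ordinal or an Iterable");
    }
  }

  public static void main(String[] args) {
    Object list = Arrays.asList(10, 15L, 15.3);
    System.out.println(max(Arrays.<Object>asList(list))); // prints 15.3
  }
}
```

The explicit final `else` means a bad argument produces a descriptive error instead of a `ClassCastException` from the unchecked cast to `Iterable`.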


---


[GitHub] metron pull request #942: METRON-1461: Modify the MIN, MAX Stellar methods t...

2018-02-26 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/942#discussion_r170595688
  
--- Diff: 
metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/dsl/functions/OrdinalFunctions.java
 ---
@@ -37,17 +35,23 @@
* Return the maximum value of a list of input values in a Stellar list
*/
   @Stellar(name = "MAX"
-  , description = "Returns the maximum value of a list of input 
values"
-  , params = {"list - List of arguments. The list may only contain 
objects that are mutually comparable / ordinal (implement java.lang.Comparable 
interface)" +
+  , description = "Returns the maximum value of a list of input 
values or from a statistics object"
+  , params = {"stats - The Stellar statistics object"
+  ,"list - List of arguments. The list may only contain objects 
that are mutually comparable / ordinal (implement java.lang.Comparable 
interface)" +
   " Multi type numeric comparisons are supported: 
MAX([10,15L,15.3]) would return 15.3, but MAX(['23',25]) will fail and return 
null as strings and numbers can't be compared."}
-  , returns = "The maximum value in the list, or null if the list 
is empty or the input values were not comparable.")
+  , returns = "The maximum value in the list or from stats, or 
null if the list is empty or the input values were not comparable.")
   public static class Max extends BaseStellarFunction {
 
 @Override
 public Object apply(List args) {
   if (args.size() < 1 || args.get(0) == null) {
 throw new IllegalStateException("MAX function requires at least a 
Stellar list of values");
--- End diff --

With your changes, this error message is now incorrect.  Can you update 
this? 

This only checks the number of args, so the error message should probably 
just say that we expect one argument or something to that effect.


---


[GitHub] metron pull request #942: METRON-1461: Modify the MIN, MAX Stellar methods t...

2018-02-26 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/942#discussion_r170598354
  
--- Diff: 
metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/dsl/functions/Ordinal.java
 ---
@@ -0,0 +1,24 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.stellar.dsl.functions;
+
+public interface Ordinal {
--- End diff --

Can you add javadocs for the class and each method?


---


[GitHub] metron issue #933: METRON-1452 Rebase Dev Environment on Latest CentOS 6

2018-02-26 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/933
  
When creating the Ubuntu environment a while back, I created an Ansible 
role that enables swap space.  (In the base Ubuntu image, swap space is not 
enabled.)  It was easy enough to reuse that in the CentOS environment also.

With the latest commit, the CentOS environment has a larger swap space, as 
before, but without the burden of maintaining an image in Vagrant Cloud/Atlas.  
I think this is the best of both worlds.

Let me know what you guys think.  Would like to get reaffirmation on the 
+1s before merging this. @mmiklavc @cestella 




---


[GitHub] metron-bro-plugin-kafka pull request #6: Configurable JSON timestamps and de...

2018-02-25 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:


https://github.com/apache/metron-bro-plugin-kafka/pull/6#discussion_r170469201
  
--- Diff: src/KafkaWriter.cc ---
@@ -54,20 +66,51 @@ KafkaWriter::KafkaWriter(WriterFrontend* frontend): 
WriterBackend(frontend), for
 }
 
 KafkaWriter::~KafkaWriter()
-{}
+{
+// Cleanup Kafka resources
+while (producer->outq_len() > 0) {
--- End diff --

Waiting for the queue to clear is already performed in 
[DoFinish](https://github.com/rocknsm/metron-bro-plugin-kafka/blob/3bf94c7cfd06995280476c7c62f63616ce82ac3f/src/KafkaWriter.cc#L187-L191).
  Why do this again?




---


[GitHub] metron-bro-plugin-kafka pull request #6: Configurable JSON timestamps and de...

2018-02-25 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:


https://github.com/apache/metron-bro-plugin-kafka/pull/6#discussion_r170470107
  
--- Diff: src/KafkaWriter.cc ---
@@ -54,20 +66,51 @@ KafkaWriter::KafkaWriter(WriterFrontend* frontend): 
WriterBackend(frontend), for
 }
 
 KafkaWriter::~KafkaWriter()
-{}
+{
+// Cleanup Kafka resources
+while (producer->outq_len() > 0) {
+producer->poll(1000);
+}
+producer->poll(1000);
+
+// Cleanup all the things
+delete topic;
+delete producer;
+delete formatter;
+delete conf;
+delete topic_conf;
+
+}
 
 bool KafkaWriter::DoInit(const WriterInfo& info, int num_fields, const 
threading::Field* const* fields)
 {
+// Timeformat object, default to TS_EPOCH
+threading::formatter::JSON::TimeFormat tf = 
threading::formatter::JSON::TS_EPOCH;
+
 // if no global 'topic_name' is defined, use the log stream's 'path'
 if(topic_name.empty()) {
 topic_name = info.path;
 }
 
+// format timestamps
+if ( strcmp(json_timestamps.c_str(), "JSON::TS_EPOCH") == 0 )
--- End diff --

Small nit: I would prefer if you added brackets around all the if 
statements here, which would more closely match the style used in the rest of 
the source code.

```
if (true) {
  foo();
}
```


---


[GitHub] metron-bro-plugin-kafka pull request #6: Configurable JSON timestamps and de...

2018-02-25 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:


https://github.com/apache/metron-bro-plugin-kafka/pull/6#discussion_r170471879
  
--- Diff: scripts/Apache/Kafka/logs-to-kafka.bro ---
@@ -22,7 +22,7 @@ event bro_init() &priority=-5
 {
for (stream_id in Log::active_streams)
{
-   if (stream_id in Kafka::logs_to_send)
+   if ((|logs_to_send| == 0) || stream_id in Kafka::logs_to_send)
--- End diff --

As @JonZeolla mentioned, defaulting to "all logs on" is something that 
we've talked through before.  We should handle this as a separate PR with its 
own discussion.  I would take this out of the current PR.


---


[GitHub] metron issue #933: METRON-1452 Rebase Dev Environment on Latest CentOS 6

2018-02-21 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/933
  
I did not notice any issues, but I'll spin it up again and compare the 
difference in swap space just so we know what we're getting into.

Thanks for the info @dlyle65535 !


---


[GitHub] metron issue #933: METRON-1452 Rebase Dev Environment on Latest CentOS 6

2018-02-21 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/933
  
> what are we doing in the current special metron cut of centos 6? I'm not 
familiar enough with why we forked to understand what we're possibly giving up 
or exchanging by switching to the main centos Vagrant image.

Sure, I'll try to explain what I remember and justify the change.  Better 
to do our due diligence here.

Back then, we had some issues where the CentOS image would be updated and 
our Ansible deployment scripts would no longer work.  A couple times we'd wake 
up in the morning with a broken dev environment when we hadn't changed anything 
in Metron.  

As I remember it, this was back when we were just getting started.  All of 
Metron was deployed via Ansible, different Ansible versions would have 
different behaviors and break things, and it was generally a very painful 
experience.  

Back then we valued a stable dev environment over more rigorous testing.  
The Ansible scripts themselves have always just been a means to deploy Metron 
in a dev environment and not necessarily something that we want to support as 
part of Metron.  We didn't care all that much if the Ansible scripts didn't 
work in all CentOS environments; they are just for our dev environment.

Fast forward to now and most of the deployment process is part of the 
MPack. The MPack is something that we expect our users to actually use in their 
own environments.  Today, the MPack is a core part of Metron itself.  

If a patch in CentOS occurs that breaks our MPack, then I definitely want 
to know about that.  Given that, today I think we want to prioritize rigorous 
testing over a stable dev environment.  And that is why I think we should use 
the centos/6 image as it stands.





---


[GitHub] metron issue #619: METRON-939 Elasticsearch ES5 with Xshield client support

2018-02-10 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/619
  
This functionality was completed in #840.  As mentioned in #840, this PR 
inspired much of that work.  Is there anything else needed from this PR?  If 
not, can you close this PR @wardbekker ?

Thanks




---


[GitHub] metron pull request #933: METRON-1452 Rebase Dev Environment on Latest CentO...

2018-02-09 Thread nickwallen
GitHub user nickwallen opened a pull request:

https://github.com/apache/metron/pull/933

METRON-1452 Rebase Dev Environment on Latest CentOS 6

Currently the CentOS development environment 
(`metron-deployment/development/centos6`) is based on an image 
[metron/centos_base](https://app.vagrantup.com/metron/boxes/centos_base) that 
has not been updated in 11 months.  This image is really just a snapshot of 
[bento/centos6.7](https://app.vagrantup.com/bento/boxes/centos-6.7) from 11 
months ago. The 
[bento/centos6.7](https://app.vagrantup.com/bento/boxes/centos-6.7) image has 
not been updated in quite some time also.

On the other hand, the 
[centos/6](https://app.vagrantup.com/centos/boxes/6) image was updated 23 days 
ago. Presumably these images are receiving critical patches for long term 
support.

We should base the CentOS development environment 
`metron-deployment/development/centos6` on the 
[centos/6](https://app.vagrantup.com/centos/boxes/6) image so that we can be 
confident that Metron continues to work on the latest patches for the CentOS 6 
series.

This would match what we do for the Ubuntu development environment which is 
based on  [ubuntu/trusty64](https://app.vagrantup.com/ubuntu/boxes/trusty64).  
This image continues to receive updates regularly despite the age of the Ubuntu 
14 release.  It was updated just 3 days ago.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/nickwallen/metron METRON-1452

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/933.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #933


commit 9a6f92878e472aba27db9f000a894a8c82349ce9
Author: Nick Allen 
Date:   2018-02-09T14:11:31Z

METRON-1452 Rebase Dev Environment on Latest CentOS 6




---


[GitHub] metron issue #930: METRON-1318 updated MacOS instructions and explain AWS de...

2018-02-08 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/930
  
+1 Thanks @wardbekker !


---


[GitHub] metron issue #932: METRON-1451: On Centos full dev, Metron Indexing shows up...

2018-02-08 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/932
  
+1 Works great.  This was an annoying one to track down.  Great detective 
work @anandsubbu 


---


[GitHub] metron issue #932: METRON-1451: On Centos full dev, Metron Indexing shows up...

2018-02-08 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/932
  
Thanks @anandsubbu .  This looks like a solid fix.  Spinning it up now.


---


[GitHub] metron pull request #931: METRON-1449 Set Zookeeper URL for Stellar Running ...

2018-02-07 Thread nickwallen
GitHub user nickwallen opened a pull request:

https://github.com/apache/metron/pull/931

METRON-1449 Set Zookeeper URL for Stellar Running in Zeppelin Notebook

## Changes

* This enhances the Stellar interpreter for Zeppelin to allow the user to 
set a `zookeeper.url` property.  

* If the property is defined, a Zk client connection is accessible by 
the Stellar functions executed within Zeppelin.  If no `zookeeper.url` is 
defined, then the behavior remains the same as before.

* Multiple lines of Stellar can now be executed within the same block 
within a Zeppelin notebook.

* Updates to the README simplify installation of the interpreter in 
Zeppelin.

## Testing

1. Follow the README to build the interpreter and install it in Zeppelin.

1. Launch a Zookeeper instance.

1. Load up some basic configuration.  Create the standard Metron settings 
using `zkCli`.  It would look something like the following.
```
create /metron ""
create /metron/topology ""
create /metron/topology/global "{}"
create /metron/topology/parsers "{}"
create /metron/topology/enrichments "{}"
create /metron/topology/indexing "{}"
```

1. Set the zookeeper URL.  Go to the Interpreters > Stellar page and define 
a `zookeeper.url` property.

```
zookeeper.url = localhost:2181
```

1. Add the following dependencies to the interpreter so that we can access 
the metron-management functions.

Yes, this is ugly.  We need to fix some of our dependencies in the 
metron-management project (and others.)

| artifact                                  | exclude                          |
|-------------------------------------------|----------------------------------|
| org.apache.metron:metron-management:0.4.3 |                                  |
| org.apache.metron:metron-common:0.4.3     |                                  |
| io.thekraken:grok:0.1.0                   | org.apache.commons:commons-lang3 |
| org.apache.commons:commons-lang3:3.2      |                                  |
   
1. Save the interpreter changes, then open a notebook and execute the 
following. 

```
CONFIG_GET("GLOBAL")
```

An empty set of globals should be returned from Zookeeper.

1.  Run multiple expressions in a single Zeppelin code block.

## Known Problems

1. There is a problem that occurs with the Zk cache for some of the 
metron-management functions.  Calling `CONFIG_GET` will only ever return the 
first value that it gets from Zk.  If you change the globals via a 
`CONFIG_PUT`, the new values will NOT be reflected by calling `CONFIG_GET` 
until you restart the interpreter.  

This behavior differs from the CLI REPL.  I have been unable to 
determine why this is. I do not see anything in stellar-zeppelin that would 
cause this and am a little suspicious of dependency issues in metron-management.  

   Have an idea what the problem might be?  Let me know!

   


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/nickwallen/metron METRON-1449

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/931.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #931


commit 9cc831fc7176c2a6674c204791659a5c4ac24f52
Author: Nick Allen 
Date:   2018-01-16T16:34:33Z

METRON-1449 Set Zookeeper URL for Stellar Running in Zeppelin Notebook




---


[GitHub] metron issue #927: METRON-1447 Heap Size Not Set Correctly by MPack for ES 5...

2018-02-07 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/927
  
Thanks for the review @mmiklavc and making sure the merge was solid.


---


[GitHub] metron issue #928: METRON-1444: Add Ubuntu Repositories for Elasticsearch to...

2018-02-07 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/928
  
+1 Ran this up on Ubuntu.  Thanks for the fix!


---


[GitHub] metron issue #928: METRON-1444: Add Ubuntu Repositories for Elasticsearch to...

2018-02-07 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/928
  
I disagree @ottobackwards .  The core problem is that there is a bug in 
Ambari that prevented us from loading the repos directly in the Mpack.  The 
only way to fix this is to upgrade.

This looks solid. I'm going to spin this up now.  Glad this one will soon 
be in the rearview mirror.


---


[GitHub] metron pull request #622: METRON-1005 Create Decodable Row Key for Profiler

2018-02-06 Thread nickwallen
Github user nickwallen closed the pull request at:

https://github.com/apache/metron/pull/622


---


[GitHub] metron pull request #927: METRON-1447 Heap Size Not Set Correctly by MPack f...

2018-02-06 Thread nickwallen
GitHub user nickwallen opened a pull request:

https://github.com/apache/metron/pull/927

METRON-1447 Heap Size Not Set Correctly by MPack for ES 5.x

The preferred way in which the heap size and other JVM options are set 
changed between ES 2.x and ES 5.x.  The project upgraded to ES 5.x as part of 
#840 , but the way the heap size is set by the Mpack was not changed.

This resulted in the heap size for Elasticsearch being set incorrectly.  
It also allowed Elasticsearch to use up to 2G of heap when launched in the 
development environments, which is too much for a constrained single VM.

## Changes

The user can set the heap size by populating the "heap_size" field under 
"Advanced elastic-jvm-options" in Ambari.  

Elasticsearch also exposes a large number of other settings in this file.  
The entire content of the file was exposed in Ambari to allow users to also 
alter any other JVM options as needed.

![screen shot 2018-02-06 at 11 27 34 
am](https://user-images.githubusercontent.com/2475409/35870877-c3d310ce-0b30-11e8-9b07-e77ae3b7074c.png)

## Testing

1. Launch a development environment; either Ubuntu or CentOS.  
* Ensure that telemetry reaches the Alerts UI.
* Run the Metron Service Check
* Run the Elasticsearch Service Check

1. Login to the node and ensure that only a single `-Xms` and `-Xmx` option 
was passed to the JVM when launching Elasticsearch.  Ensure these are both set 
to the default heap size of 512mb.

```
root@node1:/etc/elasticsearch# ps -ef | grep Elastic
root  1084 31038  0 16:09 pts/400:00:00 grep --color=auto 
Elastic
elastic+ 30048 1 23 16:08 ?00:00:17 
/usr/jdk64/jdk1.8.0_112/bin/java -Xms512m -Xmx512m -XX:+UseConcMarkSweepGC 
-XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly 
-XX:+AlwaysPreTouch -server -Xss1m -Djava.awt.headless=true 
-Dfile.encoding=UTF-8 -Djna.nosys=true 
-Djdk.io.permissionsUseCanonicalPath=true -Dio.netty.noUnsafe=true 
-Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 
-Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true 
-Dlog4j.skipJansi=true -XX:+HeapDumpOnOutOfMemoryError 
-Des.path.home=/usr/share/elasticsearch -cp /usr/share/elasticsearch/lib/* 
org.elasticsearch.bootstrap.Elasticsearch -d -p 
/var/run/elasticsearch/elasticsearch.pid 
-Edefault.path.logs=/var/log/elasticsearch 
-Edefault.path.data=/var/lib/elasticsearch/ 
-Edefault.path.conf=/etc/elasticsearch/
```

1. Alter the JVM options template, save the settings, restart 
Elasticsearch, and ensure that the changes are reflected in the 
`/etc/elasticsearch/jvm.options` file.
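
The check in the second testing step (exactly one `-Xms` and one `-Xmx`, both at the expected size) can also be scripted rather than eyeballed. A rough sketch, assuming the `ps` output line has already been captured as a string; the `HeapFlagCheck` class and its method names are hypothetical helpers and not part of Metron.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper that inspects a captured JVM command line. */
final class HeapFlagCheck {

  /** Returns every whitespace-separated argument that starts with the given prefix. */
  static List<String> optionsWithPrefix(String commandLine, String prefix) {
    return Arrays.stream(commandLine.trim().split("\\s+"))
        .filter(arg -> arg.startsWith(prefix))
        .collect(Collectors.toList());
  }

  /**
   * True only when exactly one -Xms and exactly one -Xmx option are present
   * and both carry the expected size, for example "512m".
   */
  static boolean hasSingleHeapSetting(String commandLine, String expectedSize) {
    List<String> xms = optionsWithPrefix(commandLine, "-Xms");
    List<String> xmx = optionsWithPrefix(commandLine, "-Xmx");
    return xms.equals(Collections.singletonList("-Xms" + expectedSize))
        && xmx.equals(Collections.singletonList("-Xmx" + expectedSize));
  }
}
```

Calling `HeapFlagCheck.hasSingleHeapSetting(line, "512m")` on the Elasticsearch process line would fail if a stale second pair of heap options were still being passed.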

## Pull Request Checklist
- [ ] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [ ] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?
- [ ] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
- [ ] Have you included steps or a guide to how the change may be verified 
and tested manually?
- [ ] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
- [ ] Have you written or updated unit tests and or integration tests to 
verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] Have you verified the basic functionality of the build by building 
and running locally with Vagrant full-dev environment or the equivalent?
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered by building and verifying the site-book? If not then run 
the following commands and the verify changes via 
`site-book/target/site/index.html`:


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/nickwallen/metron METRON-1447

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/927.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #927


commit 4224bfaa44ac0ad1512e7b4e13f77769c8708f32
Author: Nick Allen 
Date:   2018-02-06T16:14:15Z

METRON-1447 Heap Size Not Set Correctly by MPack for ES 5.x




---


[GitHub] metron issue #926: METRON-1446: Fix openjdk issue with Ubuntu

2018-02-06 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/926
  
+1 LGTM.  I have not been able to replicate this problem, which really 
confuses me.  That said, this seems like a harmless enough change.


---


[GitHub] metron issue #926: METRON-1446: Fix openjdk issue with Ubuntu

2018-02-06 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/926
  
Was this tested against Vagrant 2.0.2 or 1.8.1 ?


---


[GitHub] metron issue #925: METRON-1443 Missing Critical MPack Install Instruction fo...

2018-02-05 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/925
  
https://issues.apache.org/jira/browse/METRON-1444


---


[GitHub] metron pull request #925: METRON-1443 Missing Critical MPack Install Instruc...

2018-02-02 Thread nickwallen
GitHub user nickwallen opened a pull request:

https://github.com/apache/metron/pull/925

METRON-1443 Missing Critical MPack Install Instruction for Ubuntu

When installing Elasticsearch with the MPack on Ubuntu, you must manually 
install the Elasticsearch repositories.  Unlike on CentOS, the Mpack itself 
does not do this. 

When the development environment on Ubuntu is spun up, this step is 
performed within Ansible as a prerequisite to the Mpack install.  Until this 
can be fixed so that it matches what happens on CentOS, this needs to be at 
least documented.

I should have documented this in #903 , but did not do so.  Oops.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/nickwallen/metron METRON-1443

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/925.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #925


commit 0c40178494d2a12e8e4abbd43ba4f85338aa05da
Author: Nick Allen 
Date:   2018-02-02T22:12:23Z

METRON-1443 Missing Critical MPack Install Instruction for Ubuntu




---


[GitHub] metron issue #920: METRON-1438 Move SHELL functions from metron-management t...

2018-02-02 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/920
  
+1 Thanks @ottobackwards 


---


[GitHub] metron pull request #920: METRON-1438 Move SHELL functions from metron-manag...

2018-02-02 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/920#discussion_r165663581
  
--- Diff: 
metron-stellar/stellar-common/src/test/java/org/apache/metron/stellar/dsl/functions/ShellFunctionsTest.java
 ---
@@ -40,8 +45,8 @@
   );
 
   Context context = new Context.Builder()
-.with(Context.Capabilities.SHELL_VARIABLES , () -> variables)
-.build();
+.with(Context.Capabilities.SHELL_VARIABLES , () -> 
variables).build();
--- End diff --

Right now, I never let my IDE reformat for me.  Like you said, if we get 
the code base matching check style and I can load that style into my IDE, then 
I'd gladly let it do most of the work for me.

Maybe I'll open a discuss thread.  I don't know how to handle this kind of 
thing and it happens all the time.

But for this specific scenario in your PR, it really doesn't matter either 
way.  I think you're good to go either way.


---


[GitHub] metron pull request #920: METRON-1438 Move SHELL functions from metron-manag...

2018-02-02 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/920#discussion_r165660358
  
--- Diff: 
metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/common/shell/cli/PausableInput.java
 ---
@@ -36,8 +37,8 @@
  *
  */
 public class PausableInput extends InputStream {
-  InputStream in = System.in;
-  boolean paused = false;
+  private InputStream in = System.in;
+  private AtomicBoolean paused = new AtomicBoolean(false);
--- End diff --

Good find!  I'm sure that was frustrating to dig into.


---


[GitHub] metron pull request #920: METRON-1438 Move SHELL functions from metron-manag...

2018-02-02 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/920#discussion_r165654991
  
--- Diff: 
metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/common/shell/cli/PausableInput.java
 ---
@@ -36,8 +37,8 @@
  *
  */
 public class PausableInput extends InputStream {
-  InputStream in = System.in;
-  boolean paused = false;
+  private InputStream in = System.in;
+  private AtomicBoolean paused = new AtomicBoolean(false);
--- End diff --

What problem were you solving here @ottobackwards ?  Is this bit accessed by 
multiple threads?
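
For context on why the answer matters: if one thread flips a plain `boolean` while another thread polls it, the Java memory model does not guarantee that the polling thread ever sees the update unless the field is `volatile` or an atomic type. A minimal sketch, assuming one polling thread and one controlling thread; `PauseFlag` and `PauseFlagDemo` are hypothetical names and this is not the actual `PausableInput` code.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical pause flag shared between a polling thread and a controlling thread. */
final class PauseFlag {
  // AtomicBoolean makes a write by one thread visible to reads by another.
  private final AtomicBoolean paused = new AtomicBoolean(false);

  void pause() {
    paused.set(true);
  }

  void unpause() {
    paused.set(false);
  }

  boolean isPaused() {
    return paused.get();
  }
}

final class PauseFlagDemo {

  /** Returns true if the worker thread observed the unpause and exited in time. */
  static boolean workerSeesUnpause() {
    PauseFlag flag = new PauseFlag();
    flag.pause();

    // The worker spins for as long as the flag is set.
    Thread worker = new Thread(() -> {
      while (flag.isPaused()) {
        Thread.yield();
      }
    });
    worker.setDaemon(true);
    worker.start();

    // The controlling thread clears the flag, then waits for the worker to exit.
    flag.unpause();
    try {
      worker.join(2000);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    return !worker.isAlive();
  }
}
```

If the flag is only ever touched from a single thread, a plain `boolean` would be sufficient and the atomic type is just extra ceremony.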


---


[GitHub] metron pull request #920: METRON-1438 Move SHELL functions from metron-manag...

2018-02-02 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/920#discussion_r165653517
  
--- Diff: 
metron-stellar/stellar-common/src/test/java/org/apache/metron/stellar/dsl/functions/ShellFunctionsTest.java
 ---
@@ -40,8 +45,8 @@
   );
 
   Context context = new Context.Builder()
-.with(Context.Capabilities.SHELL_VARIABLES , () -> variables)
-.build();
+.with(Context.Capabilities.SHELL_VARIABLES , () -> 
variables).build();
--- End diff --

I am actually interested in what direction as a project we should be taking 
with these types of fluent, chained statements.  I run across this all the time 
and I want to know the 'right' way that I should be doing it for the project.

IMHO, the way it was (separated by a line break) is more readable.  
Meaning, a long set of chained statements should be separated by line breaks.  
For example...
```
  result = new ProfileMeasurement()
  .withProfileName(profileName)
  .withEntity(entity)
  .withGroups(groups)
  .withPeriod(period)
  .withProfileValue(profileValue)
  .withTriageValues(triageValues)
  .withDefinition(definition);
```

But, of course, in terms of code style my opinion doesn't matter.  It is 
all about our style guidelines. 
 What do the Google code style guidelines say?   

Doesn't 
[this](https://google.github.io/styleguide/javaguide.html#s4.5.1-line-wrapping-where-to-break)
 support what I have said above about line breaks in this case?  




---


[GitHub] metron issue #919: METRON-1439: Turn off git pager in platform-info script

2018-02-01 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/919
  
> Sidenote, do we want to move this script to the dev-utilities dir?

I do think it would be good to move that script.  Not sure where under 
dev-utilities though.  And we can tackle as separate PR, if you like.




---


[GitHub] metron issue #919: METRON-1439: Turn off git pager in platform-info script

2018-02-01 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/919
  
+1 works great for me.  Thanks


---


[GitHub] metron issue #907: METRON-1427: Add support for storm 1.1 and hdp 2.6

2018-01-30 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/907
  
+1 Ran it up with all of our fixes in the Ubuntu dev environment.  Works 
great.  Thanks!


---


[GitHub] metron issue #907: METRON-1427: Add support for storm 1.1 and hdp 2.6

2018-01-30 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/907
  
I also just merged #913 so I will run this up just to be sure the whole 
package is working in the Ubuntu dev environment.


---


[GitHub] metron issue #907: METRON-1427: Add support for storm 1.1 and hdp 2.6

2018-01-29 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/907
  
FYI @cestella I submitted a fix against your PR branch that should address 
the issue with the embedded handlebars in the Ambari response. 


---


[GitHub] metron issue #907: METRON-1427: Add support for storm 1.1 and hdp 2.6

2018-01-29 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/907
  
For (2), the problem is that the HDFS configuration in Ambari has embedded 
'mustache handlebars' (see `{{major_stack_version}}`) that when returned from 
an API call confuses Ansible.  We need some way to strip that out or ignore it.

![screen shot 2018-01-29 at 2 31 44 
pm](https://user-images.githubusercontent.com/2475409/35530186-5072f3ae-0501-11e8-8bd9-93096da51f02.png)
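
To illustrate the "strip that out" option, here is a small sketch of the idea in Java. The real deployment code is Ansible, so this only shows the pattern matching involved; `HandlebarsStripper` is a hypothetical name, and whether an empty replacement is acceptable for a given property is an assumption.

```java
import java.util.regex.Pattern;

/** Hypothetical helper that removes {{ ... }} placeholders from a string. */
final class HandlebarsStripper {

  // Matches an opening "{{", any run of characters other than "}", then "}}".
  private static final Pattern HANDLEBARS = Pattern.compile("\\{\\{[^}]*\\}\\}");

  /** Returns the text with every {{ ... }} placeholder removed. */
  static String strip(String text) {
    return HANDLEBARS.matcher(text).replaceAll("");
  }

  /** Returns true if the text contains at least one {{ ... }} placeholder. */
  static boolean containsHandlebars(String text) {
    return HANDLEBARS.matcher(text).find();
  }
}
```

Applied to the value in question, `strip("/hdp/ext/{{major_stack_version}}/hadoop")` leaves `/hdp/ext//hadoop`, which is fine when the value is only being passed through but not when the path itself is needed.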



---


[GitHub] metron pull request #913: METRON-1432 JDK Install Fails on Ubuntu Developmen...

2018-01-29 Thread nickwallen
GitHub user nickwallen opened a pull request:

https://github.com/apache/metron/pull/913

METRON-1432 JDK Install Fails on Ubuntu Development Environment

The Ansible role used to install the JDK does not work correctly on Ubuntu. 
 This fixes the problem and ensures that the JDK can be installed on either 
Ubuntu or CentOS.

## Testing

1. Launch the Ubuntu development environment. 
* Run the Metron Service Check
* Ensure data is visible within the Alerts UI

1. Launch the CentOS development environment.
* Run the Metron Service Check
* Ensure data is visible within the Alerts UI

## Pull Request Checklist
- [ ] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [ ] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?
- [ ] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
- [ ] Have you included steps or a guide to how the change may be verified 
and tested manually?
- [ ] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
- [ ] Have you written or updated unit tests and or integration tests to 
verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] Have you verified the basic functionality of the build by building 
and running locally with Vagrant full-dev environment or the equivalent?



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/nickwallen/metron METRON-1432

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/913.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #913


commit bbc27bf6337067d51e2d6d7d03bcb19203e35f59
Author: Nick Allen 
Date:   2018-01-29T18:15:32Z

METRON-1432 JDK Install Fails on Ubuntu Development Environment




---


[GitHub] metron issue #907: METRON-1427: Add support for storm 1.1 and hdp 2.6

2018-01-29 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/907
  
Running this up on Ubuntu, I ran into two issues.

(1) JDK Install Failed on Ubuntu - I do not think this was caused by this 
PR.  This is something that I should have hit in #903, but the changes there 
never brought this bug to light.  Fortunately, it is an easy fix and I will 
open it as a separate PR.

(2) After the Ambari deployment completes, it begins to install the 
sensors.  It queries Ambari for information to configure the sensors and in one 
of those queries it hits this problem.  I am not sure exactly what the problem 
is yet.
```
TASK [ambari_gather_facts : Ask Ambari: hdfs_url] 
**
ok: [node1]

TASK [ambari_gather_facts : set_fact] 
**
fatal: [node1]: FAILED! =>{  
   "failed":true,
   "msg":"the field 'args' has an invalid value, which appears to include a 
variable that is undefined. The error was: {u'status': 200, u'content_type': 
u'text/plain', u'set_cookie': 
u'AMBARISESSIONID=5tlv0y7btgc24krm5ugdpdgj;Path=/;HttpOnly', u'expires': u'Thu, 
01 Jan 1970 00:00:00 GMT', u'vary': u'Accept-Encoding, User-Agent', u'user': 
u'admin', u'pragma': u'no-cache', u'x_frame_options': u'DENY', 
u'x_xss_protection': u'1; mode=block', u'url': 
u'http://node1:8080/api/v1/clusters/metron_cluster/configurations?type=core-site&tag=TOPOLOGY_RESOLVED',
 u'changed': False, u'x_content_type_options': u'nosniff', u'content': u'{\\n  
\"href\" : 
\"http://node1:8080/api/v1/clusters/metron_cluster/configurations?type=core-site&tag=TOPOLOGY_RESOLVED\",\\n
  \"items\" : [\\n{\\n  \"href\" : 
\"http://node1:8080/api/v1/clusters/metron_cluster/configurations?type=core-site&tag=TOPOLOGY_RESOLVED\",\\n
  \"tag\" : \"TOPOLOGY_RESOLVED\",\\n  \"type\" : \"core-site\",\\n 
 \"version\
 " : 2,\\n  \"Config\" : {\\n\"cluster_name\" : 
\"metron_cluster\",\\n\"stack_id\" : \"HDP-2.6\"\\n  },\\n  
\"properties\" : {\\n\"fs.defaultFS\" : \"hdfs://node1:8020\",\\n   
 \"fs.trash.interval\" : \"360\",\\n
\"ha.failover-controller.active-standby-elector.zk.op.retries\" : \"120\",\\n   
 \"hadoop.custom-extensions.root\" : 
\"/hdp/ext/{{major_stack_version}}/hadoop\",\\n
\"hadoop.http.authentication.simple.anonymous.allowed\" : \"true\",\\n
\"hadoop.proxyuser.hbase.groups\" : \"*\",\\n
\"hadoop.proxyuser.hbase.hosts\" : \"*\",\\n
\"hadoop.security.auth_to_local\" : \"DEFAULT\",\\n
\"hadoop.security.authentication\" : \"simple\",\\n
\"hadoop.security.authorization\" : \"false\",\\n
\"hadoop.security.key.provider.path\" : \"\",\\n
\"io.compression.codecs\" : 
\"org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.Snapp
 yCodec\",\\n\"io.file.buffer.size\" : \"131072\",\\n
\"io.serializations\" : 
\"org.apache.hadoop.io.serializer.WritableSerialization\",\\n
\"ipc.client.connect.max.retries\" : \"50\",\\n
\"ipc.client.connection.maxidletime\" : \"3\",\\n
\"ipc.client.idlethreshold\" : \"8000\",\\n\"ipc.server.tcpnodelay\" : 
\"true\",\\n\"mapreduce.jobtracker.webinterface.trusted\" : 
\"false\",\\n\"net.topology.script.file.name\" : 
\"/etc/hadoop/conf/topology_script.py\"\\n  },\\n  
\"properties_attributes\" : {\\n\"final\" : {\\n  
\"fs.defaultFS\" : \"true\"\\n}\\n  }\\n}\\n  ]\\n}', 
u'connection': u'close', u'msg': u'OK (unknown bytes)', u'redirected': False, 
u'cache_control': u'no-store'}: 'major_stack_version' is undefined\n\nThe error 
appears to have been in 
'/Users/nallen/tmp/metron-pr907/metron-deployment/ansible/roles/ambari_gather_facts/tasks/main.yml':
 line 82, column 3, but may\nbe elsewhere in the file depending on the exact 
syntax problem.\n\nThe offending line 
appears to be:\n\n\n- set_fact:\n  ^ here\n"
}
```




---


[GitHub] metron issue #903: METRON-1370 Create Full Dev Equivalent for Ubuntu

2018-01-26 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/903
  
Ok, I renamed the development environments.  

I went with a slightly different name than I previously mentioned, but it 
still matches the suggestions that I received earlier.  I thought this made 
more sense.  Feel free to tell me if you don't like it.

Instead of `vagrant/full-dev-environment` or `vagrant/metron-on-centos` or 
`vagrant/dev-on-centos6`, we have `development/centos6` which is concise and 
very clear as to the intended purpose. 

* `metron-deployment/development/centos6`
* `metron-deployment/development/ubuntu14`
* `metron-deployment/development/fastcapa`

I also added `metron-deployment/ansible/README.md` to clarify the purpose 
and use of those shared Ansible assets.  I really do not want to see people 
trying to use those for anything outside of the development environments.

I edited `metron-deployment/development/README.md` to describe the various 
development environments.

Let me know if this works for everyone: @ottobackwards, @cestella, @lvets, 
etc.


---


[GitHub] metron issue #903: METRON-1370 Create Full Dev Equivalent for Ubuntu

2018-01-25 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/903
  
I am liking a combination of the suggestions from @lvets and @cestella.  
Something like this maybe?
* `dev-on-centos6`
* `dev-on-ubuntu14`

I like the name because of points made by others...
* The name identifies these as development environments
* Allows us to support multiple versions of the same platform; Ubuntu 16, 
CentOS 7, etc.


---


[GitHub] metron issue #903: METRON-1370 Create Full Dev Equivalent for Ubuntu

2018-01-25 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/903
  
> @lvets: Just for my understanding, but why Ubuntu Trusty? In April that 
will be 2 full Ubuntu LTS versions behind the then current one...

Because that's the requirement that I need to support.  All the work on the 
DEBs, the MPack, and the Ansible setup was driven by that.  

If you or anyone else wants to add support for a newer version that can 
also be done, but someone will have to put in the effort to do so.


---


[GitHub] metron issue #903: METRON-1370 Create Full Dev Equivalent for Ubuntu

2018-01-25 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/903
  
Thanks @ottobackwards .  I'll see if we can get any more reviewers before I 
merge this.


---


[GitHub] metron issue #903: METRON-1370 Create Full Dev Equivalent for Ubuntu

2018-01-25 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/903
  
> @ottobackwards : Do we expect there to be issues with 2.6? Is this PR and 
Casey's 2.6 pr going to conflict or have issues?

Yes, we will need to retest one or the other.  I am open to helping retest 
either this or #907, whichever goes in last.  If you or @cestella see an 
advantage to one or the other going first, I am all ears.

Fortunately, I do not expect there to be a problem.  Portions of this PR 
were previously tested against Ambari 2.6.  But of course, anything can happen, 
so we will need to retest either way.






---


[GitHub] metron issue #903: METRON-1370 Create Full Dev Equivalent for Ubuntu

2018-01-25 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/903
  
I spun this up again on both Ubuntu and CentOS.  Both worked successfully.  
I am happy with it now @ottobackwards .  Give her another go when you can.  
Thanks.


---


[GitHub] metron issue #902: METRON-1413 Add Metron Commit Tool

2018-01-25 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/902
  
I merged with master, so I expect Travis to be happy now.  Just need +1s 
and I'll get this in to allow for any follow-ons.


---


[GitHub] metron issue #901: METRON-1410 [MPACK] Check for existing HBASE tables befor...

2018-01-25 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/901
  
+1 Looks good @ottobackwards.  Thanks for fixing this!

I have not tested this myself, but it looks solid.  Let me know if you'd 
prefer me to spin this up to get a second test run in.  Otherwise, I assume 
that what you've done is sufficient.


---


[GitHub] metron issue #903: METRON-1370 Create Full Dev Equivalent for Ubuntu

2018-01-25 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/903
  
> @ottobackwards : Failure during vagrant up for metron-on-ubuntu

Thanks, Otto. Yep, I messed that up.  I pushed the fix, but I am going to 
run through full CentOS and Ubuntu deployments just to be sure.




---


[GitHub] metron issue #902: METRON-1413 Add Metron Commit Tool

2018-01-24 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/902
  
> @ottobackwards: I think of there being two users for these scripts...

That makes perfect sense to me.  Can we tackle that in a follow-on?


---


[GitHub] metron issue #905: METRON-1417: Disable pcap-service by default in Monit

2018-01-24 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/905
  
I should also say, #903 certainly doesn't preclude this.   This has a +1 
from me.  



---


[GitHub] metron issue #905: METRON-1417: Disable pcap-service by default in Monit

2018-01-24 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/905
  
I one-upped you in #903 by just removing Monit altogether.  There is 
really no need for it any longer.  It was useful before the MPack; now not so 
much.


---


[GitHub] metron issue #888: METRON-1389: Zeppelin notebook import does not work with ...

2018-01-24 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/888
  
+1 by inspection.  Nice work @anandsubbu .

We'll need to figure out this intermittent test failure (impacting all PRs, 
not just yours) before we merge.





---


[GitHub] metron issue #903: METRON-1370 Create Full Dev Equivalent for Ubuntu

2018-01-23 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/903
  
I am continuing a final round of testing to make sure these changes are 
solid.  I wanted to get the PR open so I could start getting feedback as soon 
as possible. 


---


[GitHub] metron pull request #903: METRON-1370 Create Full Dev Equivalent for Ubuntu

2018-01-23 Thread nickwallen
GitHub user nickwallen opened a pull request:

https://github.com/apache/metron/pull/903

METRON-1370 Create Full Dev Equivalent for Ubuntu



This PR adds a single node, VM based deployment mechanism based on Ubuntu 
Trusty.  This has feature parity with the existing CentOS single node VM.  This 
will help us continue to improve and support the Ubuntu-based DEB packages that 
were added in #868.

### Changes

* The Metron, Elasticsearch, and Kibana MPacks were updated to support 
installation on Ubuntu.  This included making the following changes.

* Adding repo definitions for Ubuntu
* Changing the default location of the `/etc/sysconfig/elasticsearch` 
configuration to work across platforms
* Adding an additional configuration in the MPack to allow 
Elasticsearch to perform a memory lock on Ubuntu
* Supporting configuration of Elasticsearch on a platform running Systemd, 
such as Ubuntu 

* The Elasticsearch MPack was enhanced to correctly perform a service 
check.  Previously the hostname and port were hard coded and the service check 
was not functional in most environments.

* The Ubuntu DEBs were enhanced to allow for the addition of "maintainer 
scripts".  This allows the Management UI and Alerts UI service scripts to be 
installed with the DEBs.  This was needed to reach feature parity with the RPMs.

* The name `full-dev-platform` no longer makes sense IMHO.  We do not have 
a `quick-dev-platform` any longer to distinguish it from.  Now the main 
distinguisher is the underlying operating system.  I renamed our "Full Dev" 
development environment to `metron-on-centos` and `metron-on-ubuntu`.  I am 
completely open to community suggestions on what these should be named.

```
metron-deployment/vagrant/
├── README.md
├── fastcapa
├── metron-on-centos
└── metron-on-ubuntu
```

* The organization of `metron-deployment` had to change so that assets 
could be reused across both the CentOS and Ubuntu deployments.  All shared 
Ansible assets were moved to `metron-deployment/ansible`.  This includes the 
Ansible modules, playbooks and roles that are all used across multiple 
environments.

```
metron-deployment/ansible/
├── extra_modules
├── playbooks
└── roles
```

* The following Ansible roles had to be updated to support both CentOS and 
Ubuntu based on small differences in package names and conventions.

* `ambari-common` 
* `ambari-config`
* `ambari-master`
* `ambari-slave`
* `libselinux-python`
* `ntp`

* A new role was added to enable swap space; `enable-swap`.  This is 
required for the Ubuntu deployment as the underlying Ubuntu Trusty image does 
not have swap space enabled by default.

* The `metron-builder` role was changed to selectively build either the 
DEBs or the RPMs as needed.

* The `metron-rpms` role was renamed to `metron-packages` and was also 
enhanced to create the local repository on an Ubuntu host.  This also 
selectively copies either RPMs or DEBs to the VM as needed.

* Monit has been removed from all VM deployments.  It was added prior to 
the existence of our MPack installer and is no longer needed.

* Removed the `metron-streaming` role which is no longer applicable since 
these functions are now performed by the MPack.
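
The service check fix listed above (replacing a hard-coded hostname and port) can be sketched roughly as follows.  This is only an illustration, not the MPack's actual code.  The names `health_url`, `service_check` and `fetch` are hypothetical, and the sketch assumes the check queries Elasticsearch's `_cluster/health` endpoint and treats both `green` and `yellow` as passing, since a single node VM will commonly report `yellow`.

```python
import json
import urllib.request


def health_url(host, port):
    """Build the health URL from configured values instead of hard-coding them."""
    return "http://{}:{}/_cluster/health".format(host, port)


def service_check(host, port, fetch=None):
    """Return True when the cluster reports a 'green' or 'yellow' status.

    `fetch` is a hypothetical hook (URL -> response body as str) so the
    check can be exercised without a running cluster.
    """
    if fetch is None:
        def fetch(url):
            with urllib.request.urlopen(url, timeout=10) as response:
                return response.read().decode("utf-8")
    status = json.loads(fetch(health_url(host, port))).get("status")
    return status in ("green", "yellow")
```

Passing the host and port in from the cluster configuration is what would let the same check run in environments other than the one it was written on.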

### Testing

I have performed the following testing based on these changes.

- [ ] Run Metron on CentOS.  

```
cd metron-deployment/vagrant/metron-on-centos
vagrant up
```

* Validate that alerts are visible within the Alerts UI.
* Validate that the Metron Service Check completes successfully.
* Validate that the Elasticsearch service check completes successfully.

Be sure to halt or destroy this VM before moving on to the 
next step.

- [ ] Run Metron on Ubuntu.

```
cd metron-deployment/vagrant/metron-on-ubuntu
vagrant up
```

* Validate that alerts are visible within the Alerts UI.
* Validate that the Metron Service Check completes successfully.
* Validate that the Elasticsearch service check completes successfully.

Be sure to halt or destroy this VM before moving on to the 
next step.

- [ ] Run one of the Fastcapa test environments.

```
cd metron-deployment/vagrant/fastcapa/centos-7.1/
vagrant up
```

* If the process fails at the task "fastcapa : Restart for modified 
kernel params" simply run `vagrant provision` again.
* Ensure that the deployment process reports succ

[GitHub] metron pull request #902: METRON-1413 Add Metron Commit Tool

2018-01-22 Thread nickwallen
GitHub user nickwallen reopened a pull request:

https://github.com/apache/metron/pull/902

METRON-1413 Add Metron Commit Tool

This PR contributes the `prepare-commit` tool that many (some?) contributors 
use.  Up until now, it has been managed in a separate repo.

I didn't have a logical place to put this tool, so I had to reorganize a 
bit.  Since our tooling has been growing, a reorganization shouldn't be too 
unexpected.

* Creates top level directory called `dev-utilities`.
* Moves existing `build_utils` to `dev-utilities/build-utils`
* Moves existing `build_utils/release-utils` to 
`dev-utilities/release-utils`
* Creates `dev-utilities/committer-utils`
* Adds the `prepare-commit` script to `committer-utils`.

It is a bit easier to see by just looking at it.
```
dev-utilities/
├── build-utils
│   ├── README.md
│   ├── create_bundled_licenses.sh
│   ├── generate_license.py
│   ├── list_dependencies.sh
│   ├── verify_license.py
│   └── verify_licenses.sh
├── committer-utils
│   ├── README.md
│   └── prepare-commit
└── release-utils
    ├── metron-rc-check
    └── validate-jira-for-release

3 directories, 10 files
```

## Pull Request Checklist

- [x] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
 
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?
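
The title convention in the checklist above is easy to check mechanically.  The following is a minimal sketch, assuming the convention is simply `METRON-` followed by the JIRA number at the start of the title; the names `TITLE_PATTERN` and `jira_number` are hypothetical and are not part of any Metron tooling.

```python
import re

# Assumed convention from the checklist: "METRON-" followed by the JIRA number.
TITLE_PATTERN = re.compile(r"^METRON-(\d+)\b")


def jira_number(title):
    """Return the JIRA number from a PR title, or None if the prefix is missing."""
    match = TITLE_PATTERN.match(title)
    return int(match.group(1)) if match else None
```

A title that uses a space instead of the hyphen, such as `METRON 1413 ...`, returns `None`, which is the mistake the checklist warns about.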


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/nickwallen/metron METRON-1413

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/902.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #902


commit 44494212f1fd78e212d15a72f142c9b7534b46f8
Author: Nick Allen 
Date:   2018-01-19T17:28:42Z

METRON-1413 Add Metron Commit Tool

commit 098f90bc7fee5e2ab016470b02fee28c06f0f71e
Author: Nick Allen 
Date:   2018-01-19T17:57:52Z

Added license to README

commit 3ba733ca497f74b21625419e54757e6ac95c0bdd
Author: Nick Allen 
Date:   2018-01-19T18:06:38Z

Fixed typo

commit 9835faac9b814cb7e3c7f3142c79b55bf41bfec1
Author: Nick Allen 
Date:   2018-01-19T19:10:52Z

Renamed to dev-utilities

commit 8e981bb6ee246a45c07c93b41494b484ff66b0e3
Author: Nick Allen 
Date:   2018-01-19T20:18:46Z

Fixed-up references to build_utils




---


[GitHub] metron pull request #902: METRON-1413 Add Metron Commit Tool

2018-01-22 Thread nickwallen
Github user nickwallen closed the pull request at:

https://github.com/apache/metron/pull/902


---


[GitHub] metron issue #902: METRON-1413 Add Metron Commit Tool

2018-01-20 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/902
  
> @JonZeolla: ... the curl command HTML encodes the JIRA description but I 
don't know of a clean bash-native way to decode it.

Yes, I have noticed, especially with apostrophes.  When that happens, I 
usually just manually override it as a work around.  

My goal was just to get the tool into Apache as-is (warts and all).  We can 
try and fix annoyances like that on subsequent PRs (IMHO).  

I know you have some enhancements that you would like to make also.  
Looking forward to those. :)
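
As one possible follow-on for the encoding annoyance discussed above, the HTML entities can be decoded with Python's standard library rather than in bash.  This is a sketch only; `decode_description` is a hypothetical helper, and it is not bash-native, so it assumes a Python 3 interpreter is available wherever the script runs.

```python
import html


def decode_description(text):
    """Decode HTML entities such as &#39; or &amp; back to plain characters."""
    return html.unescape(text)
```

A description that arrives as `Don&#39;t panic` would come back as `Don't panic`, which covers the apostrophe case without a manual override.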



---


[GitHub] metron pull request #902: METRON-1413 Add Metron Commit Tool

2018-01-20 Thread nickwallen
Github user nickwallen closed the pull request at:

https://github.com/apache/metron/pull/902


---


[GitHub] metron issue #902: METRON-1413 Add Metron Commit Tool

2018-01-20 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/902
  
Also, I've never liked the name `prepare-commit`.  If you guys can think of 
something better, please let me know.


---


[GitHub] metron issue #902: METRON-1413 Add Metron Commit Tool

2018-01-20 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/902
  
Travis had a hiccup... Doing the "close/reopen" dance to trigger the CI 
build.


---


[GitHub] metron pull request #902: METRON-1413 Add Metron Commit Tool

2018-01-20 Thread nickwallen
GitHub user nickwallen reopened a pull request:

https://github.com/apache/metron/pull/902

METRON-1413 Add Metron Commit Tool

This PR contributes the `prepare-commit` tool that many (some?) contributors 
use.  Up until now, it has been managed in a separate repo.

I didn't have a logical place to put this tool, so I had to reorganize a 
bit.  Since our tooling has been growing, a reorganization shouldn't be too 
unexpected.

* Creates top level directory called `dev-utilities`.
* Moves existing `build_utils` to `dev-utilities/build-utils`
* Moves existing `build_utils/release-utils` to 
`dev-utilities/release-utils`
* Creates `dev-utilities/committer-utils`
* Adds the `prepare-commit` script to `committer-utils`.

It is a bit easier to see by just looking at it.
```
dev-utilities/
├── build-utils
│   ├── README.md
│   ├── create_bundled_licenses.sh
│   ├── generate_license.py
│   ├── list_dependencies.sh
│   ├── verify_license.py
│   └── verify_licenses.sh
├── committer-utils
│   ├── README.md
│   └── prepare-commit
└── release-utils
    ├── metron-rc-check
    └── validate-jira-for-release

3 directories, 10 files
```

## Pull Request Checklist

- [x] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
 
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/nickwallen/metron METRON-1413

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/902.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #902


commit 44494212f1fd78e212d15a72f142c9b7534b46f8
Author: Nick Allen 
Date:   2018-01-19T17:28:42Z

METRON-1413 Add Metron Commit Tool

commit 098f90bc7fee5e2ab016470b02fee28c06f0f71e
Author: Nick Allen 
Date:   2018-01-19T17:57:52Z

Added license to README

commit 3ba733ca497f74b21625419e54757e6ac95c0bdd
Author: Nick Allen 
Date:   2018-01-19T18:06:38Z

Fixed typo

commit 9835faac9b814cb7e3c7f3142c79b55bf41bfec1
Author: Nick Allen 
Date:   2018-01-19T19:10:52Z

Renamed to dev-utilities

commit 8e981bb6ee246a45c07c93b41494b484ff66b0e3
Author: Nick Allen 
Date:   2018-01-19T20:18:46Z

Fixed-up references to build_utils




---


[GitHub] metron issue #902: METRON-1413 Add Metron Commit Tool

2018-01-20 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/902
  
The Apache ID isn't really necessary to use `prepare-commit`.  What if I 
just changed the docs to note that it is not required?  I could document how 
you could use the script to review a PR.

I would like to try and fill the need that you've identified, 
@ottobackwards, but I really want to keep the amount of code that we have to 
maintain and support as small as possible.  


 


---


[GitHub] metron issue #902: METRON-1413 Add Metron Commit Tool

2018-01-19 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/902
  
I don't see the value in `checkout-pr` any longer.  You can just use 
`prepare-commit`, which gives you the exact view of what the code would 
look like when it's merged.


---


[GitHub] metron issue #902: METRON-1413 Add Metron Commit Tool

2018-01-19 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/902
  
Thanks @mattf-horton.  Either of those suggestions works for me; 
'dev-support' or 'dev-utilities'.  I'll let others chime in with their 
preference and then update it accordingly.


---

