[GitHub] metron pull request #1281: METRON-1895: Add Knox SSO as an option in Metron

2018-11-26 Thread merrimanr
GitHub user merrimanr opened a pull request:

https://github.com/apache/metron/pull/1281

METRON-1895:  Add Knox SSO as an option in Metron

## Contributor Comments
This PR enables Metron to use Knox SSO for authentication.  This PR depends 
on https://github.com/apache/metron/pull/1275 and, in it's current state, is 
meant only as a way to demonstrate this capability.

### Testing

1. Follow the instructions in the "Testing" section of 
https://github.com/apache/metron/pull/1275 to add Metron as a Knox service.  
Once you can access REST and the UIs through Knox you are ready for the next 
steps.

2. Add the Metron REST host to the Knox SSO white list.  Navigate to the 
`Advanced knoxsso-topology` setting in Ambari at `Services > Knox > Configs > 
Advanced knoxsso-topology`.  This property contains the complete Knox SSO 
topology xml.  Update the `knoxsso.redirect.whitelist.regex` param value to 
include `node1`.  The complete xml should look like this:
```

  
  
  webappsec
  WebAppSec
  true
  
xframe.options.enabledtrue
  

  
  authentication
  ShiroProvider
  true
  
  sessionTimeout
  30
  
  
  redirectToUrl
  /gateway/knoxsso/knoxauth/login.html
  
  
  restrictedCookies
  rememberme,WWW-Authenticate
  
  
  main.ldapRealm
  
org.apache.hadoop.gateway.shirorealm.KnoxLdapRealm
  
  
  main.ldapContextFactory
  
org.apache.hadoop.gateway.shirorealm.KnoxLdapContextFactory
  
  
  main.ldapRealm.contextFactory
  $ldapContextFactory
  
  
  main.ldapRealm.userDnTemplate
  
uid={0},ou=people,dc=hadoop,dc=apache,dc=org
  
  
  main.ldapRealm.contextFactory.url
  ldap://localhost:33389
  
  
  
main.ldapRealm.authenticationCachingEnabled
  false
  
  
  
main.ldapRealm.contextFactory.authenticationMechanism
  simple
  
  
  urls./**
  authcBasic
  
  

  
  identity-assertion
  Default
  true
  
  

  
knoxauth
  

  
  KNOXSSO
  
  knoxsso.cookie.secure.only
  false
  
  
  knoxsso.token.ttl
  3
  
  
 knoxsso.redirect.whitelist.regex
 
^https?:\/\/(localhost|127\.0\.0\.1|0:0:0:0:0:0:0:1|::1|node1):[0-9].*$
  
  

  
``` 
Save and restart Knox.

3. Update the `metron.xml` topology file at 
`/usr/hdp/current/knox-server/conf/topologies/metron.xml` to use Knox SSO 
authentication.  Replace this:
```

authentication
ShiroProvider
true

sessionTimeout
30


main.ldapRealm

org.apache.hadoop.gateway.shirorealm.KnoxLdapRealm


main.ldapRealm.userDnTemplate
uid={0},ou=people,dc=hadoop,dc=apache,dc=org


main.ldapRealm.contextFactory.url
ldap://localhost:33389



main.ldapRealm.contextFactory.authenticationMechanism
simple


urls./**
authcBasic


```
With this:
```

federation
SSOCookieProvider
true

sso.authentication.provider.url
https://node1:8443/gateway/knoxsso/api/v1/websso


```

4. Extract the Knox public ke

[GitHub] metron issue #1265: METRON-1874 Create a Parser Debugger

2018-11-20 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1265
  
Love it.  I tested it in full dev and worked great.  +1


---


[GitHub] metron pull request #1265: METRON-1874 Create a Parser Debugger

2018-11-20 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1265#discussion_r235005842
  
--- Diff: 
metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/ParserRunnerImpl.java
 ---
@@ -87,13 +87,13 @@ public boolean isError() {
   protected transient Consumer onSuccess;
   protected transient Consumer onError;
 
-  private HashSet sensorTypes;
+  private Set sensorTypes;
--- End diff --

Shouldn't this be serializable?


---


[GitHub] metron pull request #1275: METRON-1878: Add Metron as a Knox service

2018-11-19 Thread merrimanr
GitHub user merrimanr opened a pull request:

https://github.com/apache/metron/pull/1275

METRON-1878: Add Metron as a Knox service

## Contributor Comments
This PR adds REST and the Alerts UI as services in Knox.  Currently this is 
incomplete and meant as a starting point to give reviewers an idea of the 
changes involved.  The outstanding items are detailed below.

### Changed Included

- Fixed minor path issues in various UI related code.  Files include:
  - app.component.html - path should be relative
  - authentication.service.ts - logout should match other endpoints
  - alert-search.directive.ts - path should be relative
  - index.html - a relative base href means Knox does not have to rewrite 
links
- Created Metron Knox topology and service definition files for REST and 
Alerts UI.  The service definition files are relatively simple and should look 
similar to other services.
- Created a script to deploy Metron files to Knox
- Added a Spring "knox" profile that configures Swagger with the right 
endpoint paths
- Added service definition files to rpm spec

### Testing
This feature can be tested in full dev with some manual configuration.  
After spinning up full dev:

1. Add Knox as a Service in Ambari (Admin > Stacks and Versions > Add 
Service next to Knox)
2. Follow the instructions 
[here](https://github.com/apache/metron/tree/master/metron-deployment/development#knox-demo-ldap)
 to enable the Knox LDAP in full dev.  You should be able to authentication 
with REST using guest/guest-password after this is done.
3. Install the Metron topology and service definition files by running the 
script at `/usr/metron/0.6.1/bin/install_metron_knox.sh`.  This script copies 
the metron.xml topology file to `/usr/hdp/current/knox-server/conf/topologies` 
and the service definition files to 
`/usr/hdp/current/knox-server/data/services/metron-rest/0.6.1` and 
`/usr/hdp/current/knox-server/data/services/metron-alerts/0.6.1`
4. Update the REST API path for the Alerts UI.  Currently expressjs proxies 
REST requests from the Alerts UI but this will now be done by Knox instead.  
The `apiRoot` setting in `app-config.json` is used to construct the url for 
REST requests.  Change this setting in 
`/usr/metron/0.6.1/web/alerts-ui/assets/app-config.json` from:
```
{
  "apiRoot": "/api/v1"
}
```
to:
```
{
  "apiRoot": "/gateway/metron/metron-rest/api/v1"
}
```
5. Update the REST API path for Swagger.  The Swagger interface is served 
by the REST application and provides links for the various endpoints.  Swagger 
must be configured with the new Knox root path for REST.  Navigate to `Ambari > 
Metron > Configs > REST`.   Change the `Active Spring profiles` setting from 
`dev` to `dev,knox` and change the `Metron Spring options` to 
`--knox.root=/gateway/metron/metron-rest`.  This configures Swagger to prepend 
the endpoint paths with the `knox.root` Spring property.
6. Restart REST with Ambari and you should now be able to access both 
Swagger and the Alerts UI through Knox:
- Swagger - https://node1:8443/gateway/metron/metron-rest/swagger-ui.html
- Alerts UI - https://node1:8443/gateway/metron/metron-alerts/alerts-list
Everything should function normally including login and logout.

### Outstanding Items

 Installing Metron in Knox

Installing a service in Knox involves 2 steps:
1. Deploy the service definition files mentioned earlier by copying them to 
Knox directories on the machine where Knox is installed
2. Either update an existing Knox topology file or create a new topology 
file.  This file configures a collection of services.  In our case this 
includes authentication and urls to the services exposed.

What is the best way to install these files in Knox?  The approach in this 
PR only works if Metron is installed on the same machine as the Knox gateway.

I also opted to create a dedicated Metron topology file.  I think this will 
give us more control and allow us to expose properties in a more user-friendly 
way.  Knox has a default topology that you can configure in Ambari but it's 
exposed as is:  a big xml file.  Does anyone think we should use the default 
topology file instead?

 Adding Knox to the stack

This process requires that we add Knox separately through Ambari.  Do we 
want to make Knox a dependency for Metron similar to Kafka, Storm, etc?  

Does it make sense to require that Knox and Metron are colocated?  Doing 
this simplifies installation and configuration and I would vote we make this a 
requirement unless there is a good reason not to.  Does anyone think there are 
cases where users would need to install them on different machines?  If we do 
that, how do we install the service definition fi

[GitHub] metron issue #1266: METRON-1875: Expose configurable global settings in the ...

2018-11-16 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1266
  
I addressed the style issues and added a comment about APP_INITIALIZER and 
promises.  Let me know what you think.


---


[GitHub] metron pull request #1266: METRON-1875: Expose configurable global settings ...

2018-11-16 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1266#discussion_r234293206
  
--- Diff: 
metron-interface/metron-alerts/src/app/service/app-config.service.ts ---
@@ -0,0 +1,40 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import { Injectable } from '@angular/core';
+import { HttpClient } from '@angular/common/http';
+
+@Injectable()
+export class AppConfigService {
+
+  private appConfig;
+
+  constructor(private http: HttpClient) { }
+
+  loadAppConfig() {
--- End diff --

The approach in this PR was inspired by 
https://juristr.com/blog/2018/01/ng-app-runtime-config/.  It mentions promises 
are not supported in APP_INITIALIZER.  I tried your suggestion and it did not 
work.  I can add a comment explaining that.


---


[GitHub] metron pull request #1266: METRON-1875: Expose configurable global settings ...

2018-11-15 Thread merrimanr
GitHub user merrimanr opened a pull request:

https://github.com/apache/metron/pull/1266

METRON-1875: Expose configurable global settings in the Alerts UI

## Contributor Comments
This PR exposes a JSON file that can be used to configure the Alerts UI.  
Properties in this file are read on startup and available through an 
`AppConfigService` that can be injected.  The use case in this PR is a 
configurable REST API path.  This path was hardcoded throughout the Alerts UI 
code but can now be changed in the configuration file directly.  A restart or 
rebuild is not necessary, the change is visible as soon as the Alerts UI is 
refreshed in the browser.  I think this could be generally useful for other use 
cases in the future.

### Changes Included

- An `AppConfigService` was added that loads configuration settings by 
requesting the config file as a static asset.
- The various services were updated to use the `apiRoot` configuration 
setting instead of a hardcoded url.
- We currently load column names from local storage on startup.  I don't 
believe this is used anymore so I switched the initialization call to 
`AppConfigService` and removed the unused code in `ColumnNameService`.
- Updated the unit tests to inject a mock `AppConfigService` instance.

### Testing
To test this feature, spin up full dev and verify that the UI continues to 
function normally.  Edit the file at 
`/usr/metron/0.6.1/web/alerts-ui/assets/app-config.json' and set the `apiRoot` 
property to a different value.  Refresh the Alerts UI and verify the api path 
reflects the change in `app-config.json`.  For example if I changed 
`app-config.json` to be:
```
{
  "apiRoot": "/some/api/path"
}
```
I would expect to see the search request url (in the network tab of Chrome 
developer tools for example) to be 
`http://node1:4201/some/api/path/search/search`.

## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.  
Please refer to our [Development 
Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235)
 for the complete guide to follow for contributions.  
Please refer also to our [Build Verification 
Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview)
 for complete smoke testing guides.  


In order to streamline the review of the contribution we ask you follow 
these guidelines and ask you to double check the following:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON- where  is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?


### For code changes:
- [x] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified 
and tested manually?
- [x] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
  ```
  mvn -q clean integration-test install && 
dev-utilities/build-utils/verify_licenses.sh 
  ```

- [x] Have you written or updated unit tests and or integration tests to 
verify your changes?
- [x] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [x] Have you verified the basic functionality of the build by building 
and running locally with Vagrant full-dev environment or the equivalent?

### For documentation related changes:
- [x] Have you ensured that format looks appropriate for the output in 
which it is rendered by building and verifying the site-book? If not then run 
the following commands and the verify changes via 
`site-book/target/site/index.html`:

  ```
  cd site-book
  mvn site
  ```

 Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.
It is also recommended that [travis-ci](https://travis-ci.org) is set up 
for your personal repository such that your branches are built there before 
submitting a pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/merrimanr/incubator-metron METRON-1875

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/1266.p

[GitHub] metron pull request #1263: METRON-1870: Intermittent Stellar REST test failu...

2018-11-14 Thread merrimanr
GitHub user merrimanr opened a pull request:

https://github.com/apache/metron/pull/1263

METRON-1870: Intermittent Stellar REST test failures

## Contributor Comments
This PR attempts to fix the intermittent test failures reported in the Jira.

For this failure:
```

restGetShouldTimeoutWithSuppliedTimeout(org.apache.metron.stellar.dsl.functions.RestFunctionsTest)
 Time elapsed: 0.336 sec <<< ERROR! 
org.apache.metron.stellar.dsl.ParseException: Unable to parse: 
REST_GET('http://localhost:41396/get', { "timeout": 1 } ) due to: Connection is 
not open
```
I believe the root cause is that we're getting an IllegalStateException in 
some cases where we're expecting an IOException.  The HttpGet.abort() method is 
called within an ScheduledExecutorService so it's unpredictable when exactly 
the request is aborted.  I changed the caught exception to be general so 
hopefully this gives us the result we expect.

For this failure:
```

restGetShouldHandleIOException(org.apache.metron.stellar.dsl.functions.RestFunctionsTest)
 Time elapsed: 0.314 sec <<< FAILURE! java.lang.AssertionError: expected null, 
but was:<{get=success}>
``` 
An IOException is being simulated by setting the socket timeout really low 
and executing a request against the mock server.  Looks like this is not enough 
to make it consistently fail.  I changed the test to spy on the doGet method 
and throw an IOException so that it fails every time.

I was not able to reproduce these problems locally but I think they will 
resolve the issues.

## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.  
Please refer to our [Development 
Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235)
 for the complete guide to follow for contributions.  
Please refer also to our [Build Verification 
Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview)
 for complete smoke testing guides.  


In order to streamline the review of the contribution we ask you follow 
these guidelines and ask you to double check the following:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON- where  is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?


### For code changes:
- [ ] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
- [ ] Have you included steps or a guide to how the change may be verified 
and tested manually?
- [ ] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
  ```
  mvn -q clean integration-test install && 
dev-utilities/build-utils/verify_licenses.sh 
  ```

- [x] Have you written or updated unit tests and or integration tests to 
verify your changes?
- [x] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] Have you verified the basic functionality of the build by building 
and running locally with Vagrant full-dev environment or the equivalent?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered by building and verifying the site-book? If not then run 
the following commands and the verify changes via 
`site-book/target/site/index.html`:

  ```
  cd site-book
  mvn site
  ```

 Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.
It is also recommended that [travis-ci](https://travis-ci.org) is set up 
for your personal repository such that your branches are built there before 
submitting a pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/merrimanr/incubator-metron METRON-1870

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/1263.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1263


commit b2dd4232e997eef23d0ba7350caa49d52ce5e929
Author: merrimanr 
Date:   2018-11-14T15:09:21Z

initial commit




---


[GitHub] metron issue #1262: METRON-1871 Cannot Run Elasticsearch Integration Tests i...

2018-11-14 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1262
  
I find myself commenting these out every time I run ES integration tests in 
my IDE.  +1


---


[GitHub] metron pull request #1250: METRON-1850: Stellar REST function

2018-11-08 Thread merrimanr
Github user merrimanr closed the pull request at:

https://github.com/apache/metron/pull/1250


---


[GitHub] metron pull request #1250: METRON-1850: Stellar REST function

2018-11-08 Thread merrimanr
GitHub user merrimanr reopened a pull request:

https://github.com/apache/metron/pull/1250

METRON-1850: Stellar REST function

## Contributor Comments
This PR adds a Stellar REST function that can be used to enrich messages 
with data from 3rd party REST services.  This function leverages the Apache 
HttpComponents library for Http requests.

The function call follows this format:  `REST_GET(rest uri, optional rest 
settings)`

There are a handful of settings including basic authentication credentials, 
proxy support, and timeouts.  These settings are included in the `RestConfig` 
class.  Any of these settings can be defined in the global config under the 
`stellar.rest.settings` property and will override any default values.  The 
global config settings can also be overridden by passing in a config object to 
the expression.  This will allow support for multiple REST services that may 
have different requirements.

Responses are expected to be in JSON format.  Successful requests will 
return a MAP object in Stellar.  Errors will be logged and null will be 
returned by default.  There are ways to override this behavior by configuring a 
list of acceptable status codes and/or values to be returned on error or empty 
responses.
 
Considering this will be used on streaming data, how we handle timeouts is 
important.  This function exposes HttpClient timeout settings but those are not 
enough to keep unwanted latency from being introduced.  A hard timeout is 
implemented in the function to abort a request if the timeout is exceeded.  The 
idea is to guarantee the total request time will not exceed a configured value.

### Changes Included

- HttpClient capability added to Stellar
- HttpClient setup added to the various bolts and Stellar REPL
- Utility added for setting up a pooling HttpClient (and possibly other 
types of clients in the future)
- Configuration mechanism added for Stellar REST function including 
settings for authentication, proxy support, timeouts and other settings
- Function implementation and appropriate unit/integration tests (both unit 
tests and integration tests are included)

### Testing

There are several different ways to test this feature and I would encourage 
reviewers to get creative and look for cases I may not have thought of.  For my 
testing, I used an online Http service that provides simple endpoints for 
simulating different use cases:  `http://httpbin.org/#/`.  Feel free to try 
your own or use this one.

I tested this in full dev using the Stellar REPL and the parser and 
enrichment topologies.  First you need to perform a couple setup steps:

1. Spin up full dev and ensure everything comes up and data is being indexed
2. Ssh to full dev and install the Squid proxy server:
```
yum -y install squid
```
3. Create a password file that Squid can use for basic authentication
```
yum -y install httpd-tools
touch /etc/squid/passwd && chown squid /etc/squid/passwd
htpasswd /etc/squid/passwd user # (Will prompt for a password)
```
4. Configure Squid for basic authentication by adding these lines to 
`/etc/squid/squid.conf`, under the lines with `acl Safe_ports*`:
```
auth_param basic program /usr/lib64/squid/ncsa_auth /etc/squid/passwd
auth_param basic children 5
auth_param basic realm Squid Basic Authentication
auth_param basic credentialsttl 2 hours
acl auth_users proxy_auth REQUIRED
http_access allow auth_users
```
5. Start Squid and verify it is working correctly:
```
service squid restart
curl --proxy-user user:password -x node1:3128 http://www.google.com/
```
6. Next create password files in HDFS:
```
su hdfs
cd ~
echo -n 'passwd' > basicPassword.txt
hdfs dfs -put basicPassword.txt /apps/metron
echo -n 'password' > proxyPassword.txt
hdfs dfs -put proxyPassword.txt /apps/metron
exit
```

To test with the Stellar REPL, follow these steps:

1. Start the Stellar REPL and verify the `REST_GET` function is available:
```
/usr/metron/0.6.1/bin/stellar --zookeeper node1:2181
[Stellar]>>> %functions REST
REST_GET
```
2. Test a simple get request:
```
[Stellar]>>> REST_GET('http://httpbin.org/get')
{args={}, headers={Accept=application/json, Accept-Encoding=gzip,deflate, 
Cache-Control=max-age=259200, Connection=close, Host=httpbin.org, 
User-Agent=Apache-HttpClient/4.3.2 (java 1.5)}, origin=127.0.0.1, 
136.62.241.236, url=http://httpbin.org/get}
```
3. Test a get request with basic authentication:
```
[Stellar]>>> config := 
{'basic.auth.user':'user','basic.auth.password.path':'/apps/metron/basicPassword.txt'}
{basic.auth.user=user, 
basic.auth.password.path=/apps/metron/basicPassword.txt}
[Stellar]>>> REST_GET('

[GitHub] metron issue #1250: METRON-1850: Stellar REST function

2018-11-07 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1250
  
The latest commit incorporates the changes from 
https://github.com/apache/metron/pull/1251 and moves the httpclient inside the 
function.  I was able to verify proper closing in the REPL and the 
EnrichmentIntegrationTest.  I was not able to verify in the Storm topologies 
because Storm does not guarantee these methods will be called in production.  I 
added code comments in the bolts where we are calling 
`StellarFunctions.close()`.

I spun this up in full dev again and ran through the test plan.   
Everything continued to work as expected.


---


[GitHub] metron issue #1251: METRON-1853: Add shutdown hook to Stellar BaseFunctionRe...

2018-11-06 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1251
  
+1


---


[GitHub] metron issue #1253: METRON-1857 Fix Metaalert Nested Alert Field Name in Ind...

2018-11-06 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1253
  
I tested this in full dev.  +1


---


[GitHub] metron issue #1253: METRON-1857 Fix Metaalert Nested Alert Field Name in Ind...

2018-11-05 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1253
  
I'm not sure if you know the answer to this @nickwallen because it predates 
this PR, but is the intention to convert all `metron_alert.*` fields to keyword 
types?  I can see the  motivation behind doing this because we may not be aware 
of all field types in the various sensors that could be added to a metaalert.  
Maybe @justinleet knows?

If my assumption is true then we need another small change to make that 
happen.  Currently the `match_mapping_type` attribute is set to `string` which 
will only convert string types.  If we want to convert all fields, it needs to 
be:
```
"dynamic_templates": [
{
  "alert_template": {
  "path_match": "metron_alert.*",
  "match_mapping_type": "*",
  "mapping": {
"type": "keyword"
  }
}
```

Notice `string` has been changed to `*`.


---


[GitHub] metron issue #1249: METRON-1815: Separate metron-parsers into metron-parsers...

2018-11-05 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1249
  
Can you give some examples of code that fits into that category?


---


[GitHub] metron pull request #1251: METRON-1853: Add shutdown hook to Stellar BaseFun...

2018-11-02 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1251#discussion_r230484974
  
--- Diff: 
metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/dsl/StellarFunctions.java
 ---
@@ -30,4 +30,8 @@ public static FunctionResolver FUNCTION_RESOLVER() {
   public static void initialize(Context context) {
 SingletonFunctionResolver.getInstance().initialize(context);
   }
+
+  public static void teardown(Context context) {
--- End diff --

Should this be:
```
public static void teardown() {
SingletonFunctionResolver.getInstance().teardown();
  }
```


---


[GitHub] metron pull request #1250: METRON-1850: Stellar REST function

2018-11-02 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1250#discussion_r230468297
  
--- Diff: 
metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/dsl/functions/RestFunctions.java
 ---
@@ -0,0 +1,351 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.stellar.dsl.functions;
+
+import org.apache.commons.io.IOUtils;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.http.HttpEntity;
+import org.apache.http.HttpHost;
+import org.apache.http.auth.AuthScope;
+import org.apache.http.auth.UsernamePasswordCredentials;
+import org.apache.http.client.CredentialsProvider;
+import org.apache.http.client.config.RequestConfig;
+import org.apache.http.client.methods.CloseableHttpResponse;
+import org.apache.http.client.methods.HttpGet;
+import org.apache.http.client.protocol.HttpClientContext;
+import org.apache.http.impl.client.BasicCredentialsProvider;
+import org.apache.http.impl.client.CloseableHttpClient;
+import org.apache.http.util.EntityUtils;
+import org.apache.metron.stellar.common.utils.ConversionUtils;
+import org.apache.metron.stellar.common.utils.JSONUtils;
+import org.apache.metron.stellar.dsl.Context;
+import org.apache.metron.stellar.dsl.ParseException;
+import org.apache.metron.stellar.dsl.Stellar;
+import org.apache.metron.stellar.dsl.StellarFunction;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.ByteArrayInputStream;
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.nio.charset.StandardCharsets;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.Executors;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.ScheduledFuture;
+import java.util.concurrent.TimeUnit;
+
+import static java.lang.String.format;
+import static 
org.apache.metron.stellar.dsl.Context.Capabilities.GLOBAL_CONFIG;
+import static 
org.apache.metron.stellar.dsl.functions.RestConfig.STELLAR_REST_SETTINGS;
+
+/**
+ * Defines functions that enable REST requests with proper result and 
error handling.  Depends on an
+ * Apache HttpComponents client being supplied as a Stellar HTTP_CLIENT 
capability.  Exposes various Http settings
+ * including authentication, proxy and timeouts through the global config 
with the option to override any settings
+ * through a config object supplied in the expression.
+ */
+public class RestFunctions {
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  /**
+   * Retrieves the ClosableHttpClient from the execution context.
+   *
+   * @param context The execution context.
+   * @return A ClosableHttpClient, if one exists.  Otherwise, an exception 
is thrown.
+   */
+  private static CloseableHttpClient getHttpClient(Context context) {
+Optional clientOpt = 
context.getCapability(Context.Capabilities.HTTP_CLIENT);
+if(clientOpt.isPresent()) {
+  return (CloseableHttpClient) clientOpt.get();
+} else {
+  throw new IllegalStateException("Missing HTTP_CLIENT; http 
connection required");
+}
+  }
+
+  /**
+   * Get an argument from a list of arguments.
+   *
+   * @param index The index within the list of arguments.
+   * @param clazz The type expected.
+   * @param args All of the arguments.
+   * @param  The type of the argument expected.
+   */
+  public static  T getArg(int index, Class clazz, List args) 
{
+
+if(index >= args.size()) {
+  throw new IllegalArgumentException(format("Expec

[GitHub] metron issue #1250: METRON-1850: Stellar REST function

2018-11-02 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1250
  
I think we should close it, docs are pretty clear about that.  I think it's 
possible connections could be left open on the server side.  

Moving the code out of the bolt is an organization/style issue.  I think 
there should be a functional justification like there is for closing a client.


---


[GitHub] metron issue #1250: METRON-1850: Stellar REST function

2018-11-02 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1250
  
We do shut down Zookeeper connections here:  
https://github.com/apache/metron/blob/master/metron-platform/metron-common/src/main/java/org/apache/metron/common/bolt/ConfiguredBolt.java#L129.
  The Apache HttpComponents docs recommend closing clients:  
http://hc.apache.org/httpcomponents-client-4.5.x/tutorial/html/fundamentals.html#d5e217.
  


---


[GitHub] metron pull request #1250: METRON-1850: Stellar REST function

2018-11-02 Thread merrimanr
Github user merrimanr closed the pull request at:

https://github.com/apache/metron/pull/1250


---


[GitHub] metron pull request #1250: METRON-1850: Stellar REST function

2018-11-02 Thread merrimanr
GitHub user merrimanr reopened a pull request:

https://github.com/apache/metron/pull/1250

METRON-1850: Stellar REST function

## Contributor Comments
This PR adds a Stellar REST function that can be used to enrich messages 
with data from 3rd party REST services.  This function leverages the Apache 
HttpComponents library for Http requests.

The function call follows this format:  `REST_GET(rest uri, optional rest 
settings)`

There are a handful of settings including basic authentication credentials, 
proxy support, and timeouts.  These settings are included in the `RestConfig` 
class.  Any of these settings can be defined in the global config under the 
`stellar.rest.settings` property and will override any default values.  The 
global config settings can also be overridden by passing in a config object to 
the expression.  This will allow support for multiple REST services that may 
have different requirements.

Responses are expected to be in JSON format.  Successful requests will 
return a MAP object in Stellar.  Errors will be logged and null will be 
returned by default.  There are ways to override this behavior by configuring a 
list of acceptable status codes and/or values to be returned on error or empty 
responses.
 
Considering this will be used on streaming data, how we handle timeouts is 
important.  This function exposes HttpClient timeout settings but those are not 
enough to keep unwanted latency from being introduced.  A hard timeout is 
implemented in the function to abort a request if the timeout is exceeded.  The 
idea is to guarantee the total request time will not exceed a configured value.

### Changes Included

- HttpClient capability added to Stellar
- HttpClient setup added to the various bolts and Stellar REPL
- Utility added for setting up a pooling HttpClient (and possibly other 
types of clients in the future)
- Configuration mechanism added for Stellar REST function including 
settings for authentication, proxy support, timeouts and other settings
- Function implementation and appropriate unit/integration tests (both unit 
tests and integration tests are included)

### Testing

There are several different ways to test this feature and I would encourage 
reviewers to get creative and look for cases I may not have thought of.  For my 
testing, I used an online Http service that provides simple endpoints for 
simulating different use cases:  `http://httpbin.org/#/`.  Feel free to try 
your own or use this one.

I tested this in full dev using the Stellar REPL and the parser and 
enrichment topologies.  First you need to perform a couple setup steps:

1. Spin up full dev and ensure everything comes up and data is being indexed
2. Ssh to full dev and install the Squid proxy server:
```
yum -y install squid
```
3. Create a password file that Squid can use for basic authentication
```
yum -y install httpd-tools
touch /etc/squid/passwd && chown squid /etc/squid/passwd
htpasswd /etc/squid/passwd user # (Will prompt for a password)
```
4. Configure Squid for basic authentication by adding these lines to 
`/etc/squid/squid.conf`, under the lines with `acl Safe_ports*`:
```
auth_param basic program /usr/lib64/squid/ncsa_auth /etc/squid/passwd
auth_param basic children 5
auth_param basic realm Squid Basic Authentication
auth_param basic credentialsttl 2 hours
acl auth_users proxy_auth REQUIRED
http_access allow auth_users
```
5. Start Squid and verify it is working correctly:
```
service squid restart
curl --proxy-user user:password -x node1:3128 http://www.google.com/
```
6. Next create password files in HDFS:
```
su hdfs
cd ~
echo passwd > basicPassword.txt
hdfs dfs -put basicPassword.txt /apps/metron
echo password > proxyPassword.txt
hdfs dfs -put proxyPassword.txt /apps/metron
exit
```

To test with the Stellar REPL, follow these steps:

1. Start the Stellar REPL and verify the `REST_GET` function is available:
```
/usr/metron/0.6.1/bin/stellar --zookeeper node1:2181
[Stellar]>>> %functions REST
REST_GET
```
2. Test a simple get request:
```
[Stellar]>>> REST_GET('http://httpbin.org/get')
{args={}, headers={Accept=application/json, Accept-Encoding=gzip,deflate, 
Cache-Control=max-age=259200, Connection=close, Host=httpbin.org, 
User-Agent=Apache-HttpClient/4.3.2 (java 1.5)}, origin=127.0.0.1, 
136.62.241.236, url=http://httpbin.org/get}
```
3. Test a get request with basic authentication:
```
[Stellar]>>> config := 
{'basic.auth.user':'user','basic.auth.password.path':'/apps/metron/basicPassword.txt'}
{basic.auth.user=user, 
basic.auth.password.path=/apps/metron/basicPassword.txt}
[Stellar]>>> REST_GET('

[GitHub] metron pull request #1250: METRON-1850: Stellar REST function

2018-11-02 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1250#discussion_r230367152
  
--- Diff: 
metron-platform/metron-common/src/main/java/org/apache/metron/common/bolt/ConfiguredEnrichmentBolt.java
 ---
@@ -17,18 +17,41 @@
  */
 package org.apache.metron.common.bolt;
 
+import java.io.IOException;
 import java.lang.invoke.MethodHandles;
+import java.util.Map;
+
+import org.apache.http.impl.client.CloseableHttpClient;
 import org.apache.metron.common.configuration.EnrichmentConfigurations;
+import org.apache.metron.stellar.common.utils.HttpClientUtils;
+import org.apache.storm.task.OutputCollector;
+import org.apache.storm.task.TopologyContext;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
 public abstract class ConfiguredEnrichmentBolt extends 
ConfiguredBolt {
 
   private static final Logger LOG = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
 
+  protected CloseableHttpClient httpClient;
 
   public ConfiguredEnrichmentBolt(String zookeeperUrl) {
 super(zookeeperUrl, "ENRICHMENT");
   }
 
+  @Override
+  public void prepare(Map stormConf, TopologyContext context, 
OutputCollector collector) {
+super.prepare(stormConf, context, collector);
--- End diff --

Adding a shutdown hook in Stellar is not trivial and out of scope for this 
PR in my opinion.  I would prefer that be a follow on.


---


[GitHub] metron pull request #1250: METRON-1850: Stellar REST function

2018-11-02 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1250#discussion_r230366857
  
--- Diff: 
metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/dsl/functions/RestFunctions.java
 ---
@@ -0,0 +1,351 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.stellar.dsl.functions;
+
+import org.apache.commons.io.IOUtils;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.http.HttpEntity;
+import org.apache.http.HttpHost;
+import org.apache.http.auth.AuthScope;
+import org.apache.http.auth.UsernamePasswordCredentials;
+import org.apache.http.client.CredentialsProvider;
+import org.apache.http.client.config.RequestConfig;
+import org.apache.http.client.methods.CloseableHttpResponse;
+import org.apache.http.client.methods.HttpGet;
+import org.apache.http.client.protocol.HttpClientContext;
+import org.apache.http.impl.client.BasicCredentialsProvider;
+import org.apache.http.impl.client.CloseableHttpClient;
+import org.apache.http.util.EntityUtils;
+import org.apache.metron.stellar.common.utils.ConversionUtils;
+import org.apache.metron.stellar.common.utils.JSONUtils;
+import org.apache.metron.stellar.dsl.Context;
+import org.apache.metron.stellar.dsl.ParseException;
+import org.apache.metron.stellar.dsl.Stellar;
+import org.apache.metron.stellar.dsl.StellarFunction;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.ByteArrayInputStream;
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.nio.charset.StandardCharsets;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.Executors;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.ScheduledFuture;
+import java.util.concurrent.TimeUnit;
+
+import static java.lang.String.format;
+import static 
org.apache.metron.stellar.dsl.Context.Capabilities.GLOBAL_CONFIG;
+import static 
org.apache.metron.stellar.dsl.functions.RestConfig.STELLAR_REST_SETTINGS;
+
+/**
+ * Defines functions that enable REST requests with proper result and 
error handling.  Depends on an
+ * Apache HttpComponents client being supplied as a Stellar HTTP_CLIENT 
capability.  Exposes various Http settings
+ * including authentication, proxy and timeouts through the global config 
with the option to override any settings
+ * through a config object supplied in the expression.
+ */
+public class RestFunctions {
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  /**
+   * Retrieves the ClosableHttpClient from the execution context.
+   *
+   * @param context The execution context.
+   * @return A ClosableHttpClient, if one exists.  Otherwise, an exception 
is thrown.
+   */
+  private static CloseableHttpClient getHttpClient(Context context) {
+Optional clientOpt = 
context.getCapability(Context.Capabilities.HTTP_CLIENT);
+if(clientOpt.isPresent()) {
+  return (CloseableHttpClient) clientOpt.get();
+} else {
+  throw new IllegalStateException("Missing HTTP_CLIENT; http 
connection required");
+}
+  }
+
+  /**
+   * Get an argument from a list of arguments.
+   *
+   * @param index The index within the list of arguments.
+   * @param clazz The type expected.
+   * @param args All of the arguments.
+   * @param  The type of the argument expected.
+   */
+  public static  T getArg(int index, Class clazz, List args) 
{
+
+if(index >= args.size()) {
+  throw new IllegalArgumentException(format("Expec

[GitHub] metron pull request #1250: METRON-1850: Stellar REST function

2018-11-02 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1250#discussion_r230366579
  
--- Diff: 
metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/dsl/functions/RestFunctions.java
 ---
@@ -0,0 +1,351 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.stellar.dsl.functions;
+
+import org.apache.commons.io.IOUtils;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.http.HttpEntity;
+import org.apache.http.HttpHost;
+import org.apache.http.auth.AuthScope;
+import org.apache.http.auth.UsernamePasswordCredentials;
+import org.apache.http.client.CredentialsProvider;
+import org.apache.http.client.config.RequestConfig;
+import org.apache.http.client.methods.CloseableHttpResponse;
+import org.apache.http.client.methods.HttpGet;
+import org.apache.http.client.protocol.HttpClientContext;
+import org.apache.http.impl.client.BasicCredentialsProvider;
+import org.apache.http.impl.client.CloseableHttpClient;
+import org.apache.http.util.EntityUtils;
+import org.apache.metron.stellar.common.utils.ConversionUtils;
+import org.apache.metron.stellar.common.utils.JSONUtils;
+import org.apache.metron.stellar.dsl.Context;
+import org.apache.metron.stellar.dsl.ParseException;
+import org.apache.metron.stellar.dsl.Stellar;
+import org.apache.metron.stellar.dsl.StellarFunction;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.ByteArrayInputStream;
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.nio.charset.StandardCharsets;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.Executors;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.ScheduledFuture;
+import java.util.concurrent.TimeUnit;
+
+import static java.lang.String.format;
+import static 
org.apache.metron.stellar.dsl.Context.Capabilities.GLOBAL_CONFIG;
+import static 
org.apache.metron.stellar.dsl.functions.RestConfig.STELLAR_REST_SETTINGS;
+
+/**
+ * Defines functions that enable REST requests with proper result and 
error handling.  Depends on an
+ * Apache HttpComponents client being supplied as a Stellar HTTP_CLIENT 
capability.  Exposes various Http settings
+ * including authentication, proxy and timeouts through the global config 
with the option to override any settings
+ * through a config object supplied in the expression.
+ */
+public class RestFunctions {
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  /**
+   * Retrieves the ClosableHttpClient from the execution context.
+   *
+   * @param context The execution context.
+   * @return A ClosableHttpClient, if one exists.  Otherwise, an exception 
is thrown.
+   */
+  private static CloseableHttpClient getHttpClient(Context context) {
+Optional clientOpt = 
context.getCapability(Context.Capabilities.HTTP_CLIENT);
+if(clientOpt.isPresent()) {
+  return (CloseableHttpClient) clientOpt.get();
+} else {
+  throw new IllegalStateException("Missing HTTP_CLIENT; http 
connection required");
+}
+  }
+
+  /**
+   * Get an argument from a list of arguments.
+   *
+   * @param index The index within the list of arguments.
+   * @param clazz The type expected.
+   * @param args All of the arguments.
+   * @param  The type of the argument expected.
+   */
+  public static  T getArg(int index, Class clazz, List args) 
{
+
+if(index >= args.size()) {
+  throw new IllegalArgumentException(format("Expec

[GitHub] metron pull request #1250: METRON-1850: Stellar REST function

2018-11-01 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1250#discussion_r230227817
  
--- Diff: 
metron-platform/metron-common/src/main/java/org/apache/metron/common/bolt/ConfiguredEnrichmentBolt.java
 ---
@@ -17,18 +17,41 @@
  */
 package org.apache.metron.common.bolt;
 
+import java.io.IOException;
 import java.lang.invoke.MethodHandles;
+import java.util.Map;
+
+import org.apache.http.impl.client.CloseableHttpClient;
 import org.apache.metron.common.configuration.EnrichmentConfigurations;
+import org.apache.metron.stellar.common.utils.HttpClientUtils;
+import org.apache.storm.task.OutputCollector;
+import org.apache.storm.task.TopologyContext;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
 public abstract class ConfiguredEnrichmentBolt extends 
ConfiguredBolt {
 
   private static final Logger LOG = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
 
+  protected CloseableHttpClient httpClient;
 
   public ConfiguredEnrichmentBolt(String zookeeperUrl) {
 super(zookeeperUrl, "ENRICHMENT");
   }
 
+  @Override
+  public void prepare(Map stormConf, TopologyContext context, 
OutputCollector collector) {
+super.prepare(stormConf, context, collector);
--- End diff --

I tried that initially but realized there is no shutdown hook in Stellar.  
HttpClient objects need to be closed and the only hook that exists is in the 
bolts.  

If we did have a way to close it, I would be in favor of managing it in the 
function.  If we decide it's worth adding that hook in Stellar (I think we 
should as a follow on) we can easily move this inside the function.


---


[GitHub] metron pull request #1250: METRON-1850: Stellar REST function

2018-11-01 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1250#discussion_r230209150
  
--- Diff: 
metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/dsl/functions/RestConfig.java
 ---
@@ -0,0 +1,147 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.stellar.dsl.functions;
--- End diff --

I looked at 2 different patterns that already exist.  

The first was the pattern you mentioned, the ConfigOptions interface.  The 
problem is that this class lives in metron-common which depends on 
stellar-common.  Regardless I don't think pattern is a great fit because it 
doesn't serialize/deserialize easily and contains a lot of features that are 
not needed here (functional interfaces for example).  

The second was the PROFILER_INIT function in the ProfilerFunctions class.  
This approach uses a java bean (ProfilerConfig) to hold configuration which 
allows easy serialization/deserialization.  This is actually what I used at 
first until I made the change to accept a config as an argument expression.

I actually went with option 2 at first but found it cumbersome to merge 
configs together.  I ended up with the pattern I did because a map supports 
applying properties on top of each other well and serialization/deserialization 
works without issue.  We ended having to do this for PcapConfig and PcapOptions 
in the REST layer by creating the PcapRequest class which extends a Map.


---


[GitHub] metron pull request #1250: METRON-1850: Stellar REST function

2018-11-01 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1250#discussion_r230203860
  
--- Diff: 
metron-platform/metron-common/src/main/java/org/apache/metron/common/bolt/ConfiguredEnrichmentBolt.java
 ---
@@ -17,18 +17,41 @@
  */
 package org.apache.metron.common.bolt;
 
+import java.io.IOException;
 import java.lang.invoke.MethodHandles;
+import java.util.Map;
+
+import org.apache.http.impl.client.CloseableHttpClient;
 import org.apache.metron.common.configuration.EnrichmentConfigurations;
+import org.apache.metron.stellar.common.utils.HttpClientUtils;
+import org.apache.storm.task.OutputCollector;
+import org.apache.storm.task.TopologyContext;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
 public abstract class ConfiguredEnrichmentBolt extends 
ConfiguredBolt {
 
   private static final Logger LOG = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
 
+  protected CloseableHttpClient httpClient;
 
   public ConfiguredEnrichmentBolt(String zookeeperUrl) {
 super(zookeeperUrl, "ENRICHMENT");
   }
 
+  @Override
+  public void prepare(Map stormConf, TopologyContext context, 
OutputCollector collector) {
+super.prepare(stormConf, context, collector);
--- End diff --

I can add a property to configure this.  Should it be included by default 
or not ?  

Do you think instantiating this object but never using it would be harmful? 
 It might make things simpler for users if there is one less setting to worry 
about.


---


[GitHub] metron issue #1246: METRON-1844: Allow for LDAP to be used for authenticatio...

2018-11-01 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1246
  
I tested this in full dev using the instructions in the README and with the 
Knox sample ldap.  Both worked fine.  I also tested using legacy JDBC 
authentication and it continued to work.  Nice job!  +1


---


[GitHub] metron pull request #1246: METRON-1844: Allow for LDAP to be used for authen...

2018-11-01 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1246#discussion_r230138602
  
--- Diff: 
metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/configuration/metron-rest-env.xml
 ---
@@ -35,32 +35,29 @@
 
 
 metron_spring_profiles_active
-Active Spring profiles
+Active Spring profiles. 'ldap' is used to enable 
authentication via LDAP.
 Active Spring profiles
-
-
-true
-
+jdbc
--- End diff --

Are we using a "jdbc" profile anywhere?  From what I can tell jdbc 
authentication is used by default but we don't actually use a profile for that.


---


[GitHub] metron pull request #1250: METRON-1850: Stellar REST function

2018-10-31 Thread merrimanr
GitHub user merrimanr opened a pull request:

https://github.com/apache/metron/pull/1250

METRON-1850: Stellar REST function

## Contributor Comments
This PR adds a Stellar REST function that can be used to enrich messages 
with data from 3rd party REST services.  This function leverages the Apache 
HttpComponents library for Http requests.

The function call follows this format:  `REST_GET(rest uri, optional rest 
settings)`

There are a handful of settings including basic authentication credentials, 
proxy support, and timeouts.  These settings are included in the `RestConfig` 
class.  Any of these settings can be defined in the global config under the 
`stellar.rest.settings` property and will override any default values.  The 
global config settings can also be overridden by passing in a config object to 
the expression.  This will allow support for multiple REST services that may 
have different requirements.

Responses are expected to be in JSON format.  Successful requests will 
return a MAP object in Stellar.  Errors will be logged and null will be 
returned by default.  There are ways to override this behavior by configuring a 
list of acceptable status codes and/or values to be returned on error or empty 
responses.
 
Considering this will be used on streaming data, how we handle timeouts is 
important.  This function exposes HttpClient timeout settings but those are not 
enough to keep unwanted latency from being introduced.  A hard timeout is 
implemented in the function to abort a request if the timeout is exceeded.  The 
idea is to guarantee the total request time will not exceed a configured value.

### Changes Included

- HttpClient capability added to Stellar
- HttpClient setup added to the various bolts and Stellar REPL
- Utility added for setting up a pooling HttpClient (and possibly other 
types of clients in the future)
- Configuration mechanism added for Stellar REST function including 
settings for authentication, proxy support, timeouts and other settings
- Function implementation and appropriate unit/integration tests (both unit 
tests and integration tests are included)

### Testing

There are several different ways to test this feature and I would encourage 
reviewers to get creative and look for cases I may not have thought of.  For my 
testing, I used an online Http service that provides simple endpoints for 
simulating different use cases:  `http://httpbin.org/#/`.  Feel free to try 
your own or use this one.

I tested this in full dev using the Stellar REPL and the parser and 
enrichment topologies.  First you need to perform a couple setup steps:

1. Spin up full dev and ensure everything comes up and data is being indexed
2. Ssh to full dev and install the Squid proxy server:
```
yum -y install squid
```
3. Create a password file that Squid can use for basic authentication
```
yum -y install httpd-tools
touch /etc/squid/passwd && chown squid /etc/squid/passwd
htpasswd /etc/squid/passwd user # (Will prompt for a password)
```
4. Configure Squid for basic authentication by adding these lines to 
`/etc/squid/squid.conf`, under the lines with `acl Safe_ports*`:
```
auth_param basic program /usr/lib64/squid/ncsa_auth /etc/squid/passwd
auth_param basic children 5
auth_param basic realm Squid Basic Authentication
auth_param basic credentialsttl 2 hours
acl auth_users proxy_auth REQUIRED
http_access allow auth_users
```
5. Start Squid and verify it is working correctly:
```
service squid restart
curl --proxy-user user:password -x node1:3128 http://www.google.com/
```
6. Next create password files in HDFS:
```
su hdfs
cd ~
echo passwd > basicPassword.txt
hdfs dfs -put basicPassword.txt /apps/metron
echo password > proxyPassword.txt
hdfs dfs -put proxyPassword.txt /apps/metron
exit
```

To test with the Stellar REPL, follow these steps:

1. Start the Stellar REPL and verify the `REST_GET` function is available:
```
/usr/metron/0.6.1/bin/stellar --zookeeper node1:2181
[Stellar]>>> %functions REST
REST_GET
```
2. Test a simple get request:
```
[Stellar]>>> REST_GET('http://httpbin.org/get')
{args={}, headers={Accept=application/json, Accept-Encoding=gzip,deflate, 
Cache-Control=max-age=259200, Connection=close, Host=httpbin.org, 
User-Agent=Apache-HttpClient/4.3.2 (java 1.5)}, origin=127.0.0.1, 
136.62.241.236, url=http://httpbin.org/get}
```
3. Test a get request with basic authentication:
```
[Stellar]>>> config := 
{'basic.auth.user':'user','basic.auth.password.path':'/apps/metron/basicPassword.txt'}
{basic.auth.user=user, 
basic.auth.password.path=/apps/metron/basicPassword.txt}
[Stellar]>>> REST_GET('

[GitHub] metron issue #1240: METRON-1830: Re-implement Alerts dialog box without jQue...

2018-10-25 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1240
  
Looks good to me @sardell.  +1


---


[GitHub] metron issue #1241: METRON-1833: Management UI incorrectly displaying sensor...

2018-10-22 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1241
  
+1


---


[GitHub] metron pull request #1239: METRON-1829: Large Error Message Causes Slow Sear...

2018-10-18 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1239#discussion_r226479283
  
--- Diff: 
metron-platform/metron-writer/src/main/java/org/apache/metron/writer/BulkWriterComponent.java
 ---
@@ -118,12 +118,15 @@ public void commit(BulkWriterResponse response) {
 
   public void error(String sensorType, Throwable e, Iterable 
tuples, MessageGetStrategy messageGetStrategy) {
 LOG.error(format("Failing %d tuple(s); sensorType=%s", 
Iterables.size(tuples), sensorType), e);
-MetronError error = new MetronError()
-.withSensorType(Collections.singleton(sensorType))
-.withErrorType(Constants.ErrorType.INDEXING_ERROR)
-.withThrowable(e);
-tuples.forEach(t -> error.addRawMessage(messageGetStrategy.get(t)));
-handleError(tuples, error);
+tuples.forEach(t -> {
--- End diff --

Done.


---


[GitHub] metron pull request #1239: METRON-1829: Large Error Message Causes Slow Sear...

2018-10-18 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1239#discussion_r226460423
  
--- Diff: 
metron-platform/metron-writer/src/main/java/org/apache/metron/writer/BulkWriterComponent.java
 ---
@@ -118,12 +118,15 @@ public void commit(BulkWriterResponse response) {
 
   public void error(String sensorType, Throwable e, Iterable 
tuples, MessageGetStrategy messageGetStrategy) {
 LOG.error(format("Failing %d tuple(s); sensorType=%s", 
Iterables.size(tuples), sensorType), e);
-MetronError error = new MetronError()
-.withSensorType(Collections.singleton(sensorType))
-.withErrorType(Constants.ErrorType.INDEXING_ERROR)
-.withThrowable(e);
-tuples.forEach(t -> error.addRawMessage(messageGetStrategy.get(t)));
-handleError(tuples, error);
+tuples.forEach(t -> {
--- End diff --

You're right again.  I agree, we shouldn't be reporting the same exception 
for every message in the batch.  I changed it to what you suggested in the 
initial comment and updated the tests.

For the other issue you bring up with `error(Throwable, Iterable)`, 
it looks like it gets called from only one place:  
`BulkMessageWriterBolt.handleMissingMessage`.  This method only deals with a 
single tuple so it is misleading that the `error` method accepts multiple 
tuples.  Would changing that to accept a single tuple make it less confusing?


---


[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...

2018-10-18 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1213
  
Latest commit adds some logging around parser initialization.  Let me know 
what you think.


---


[GitHub] metron pull request #1233: METRON-1816: Date format Stellar function

2018-10-18 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1233#discussion_r226383027
  
--- Diff: 
metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/dsl/functions/DateFunctions.java
 ---
@@ -109,6 +110,13 @@ public static long getEpochTime(String date, String 
format, Optional tim
 return sdf.parse(date).getTime();
   }
 
+  public static String getDateFormat(String format, Optional 
epochTime, Optional timezone) {
+Long time = epochTime.orElseGet(System::currentTimeMillis);
+TimezonedFormat fmt = timezone.map(s -> new TimezonedFormat(format, 
s)).orElseGet(() -> new TimezonedFormat(format));
+SimpleDateFormat sdf = formatCache.get(fmt).get();
--- End diff --

There are no exceptions thrown by any calls in this method.  What are you 
thinking we should do?


---


[GitHub] metron pull request #1233: METRON-1816: Date format Stellar function

2018-10-18 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1233#discussion_r226382248
  
--- Diff: 
metron-stellar/stellar-common/src/test/java/org/apache/metron/stellar/dsl/functions/DateFunctionsTest.java
 ---
@@ -225,4 +226,36 @@ public void testDayOfYearNow() {
   public void testDayOfYearNull() {
 Object result = run("DAY_OF_YEAR(nada)");
   }
+
+  @Test
+  public void testDateFormat() {
+Object result = run("DATE_FORMAT('EEE MMM dd  hh:mm:ss zzz', 
epoch, 'EST')");
+assertEquals("Thu Aug 25 2016 08:27:10 EST", result);
+  }
+
+  @Test
+  public void testDateFormatDefault() {
+Object result = run("DATE_FORMAT('EEE MMM dd  hh:mm:ss ')");
+
assertTrue(result.toString().endsWith(TimeZone.getDefault().getDisplayName(true,
 1)));
+  }
+
+  @Test
+  public void testDateFormatNow() {
+Object result = run("DATE_FORMAT('EEE MMM dd  hh:mm:ss zzz', 
'GMT')");
+assertTrue(result.toString().endsWith("GMT"));
+  }
+
+  @Test
+  public void testDateFormatDefaultTimezone() {
+Object result = run("DATE_FORMAT('EEE MMM dd  hh:mm:ss ', 
epoch)");
+
assertTrue(result.toString().endsWith(TimeZone.getDefault().getDisplayName(true,
 1)));
+  }
+
+  /**
+   * If refer to variable that does not exist, expect ParseException.
+   */
+  @Test(expected = ParseException.class)
+  public void testDateFormatNull() {
+Object result = run("DATE_FORMAT('EEE MMM dd  hh:mm:ss zzz', nada, 
'EST')");
+  }
--- End diff --

Done


---


[GitHub] metron pull request #1239: METRON-1829: Large Error Message Causes Slow Sear...

2018-10-18 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1239#discussion_r226377917
  
--- Diff: 
metron-platform/metron-writer/src/main/java/org/apache/metron/writer/BulkWriterComponent.java
 ---
@@ -118,12 +118,15 @@ public void commit(BulkWriterResponse response) {
 
   public void error(String sensorType, Throwable e, Iterable 
tuples, MessageGetStrategy messageGetStrategy) {
 LOG.error(format("Failing %d tuple(s); sensorType=%s", 
Iterables.size(tuples), sensorType), e);
-MetronError error = new MetronError()
-.withSensorType(Collections.singleton(sensorType))
-.withErrorType(Constants.ErrorType.INDEXING_ERROR)
-.withThrowable(e);
-tuples.forEach(t -> error.addRawMessage(messageGetStrategy.get(t)));
-handleError(tuples, error);
+tuples.forEach(t -> {
--- End diff --

You're right!  My mistake.  I changed it to pass in just the current tuple 
to `handleError`.  Let me know if that works.


---


[GitHub] metron pull request #1239: METRON-1829: Large Error Message Causes Slow Sear...

2018-10-18 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1239#discussion_r226378008
  
--- Diff: 
metron-platform/metron-writer/src/test/java/org/apache/metron/writer/BulkWriterComponentTest.java
 ---
@@ -161,9 +165,12 @@ public void 
writeShouldThrowExceptionWhenHandleErrorIsFalse() throws Exception {
   @Test
   public void writeShouldProperlyHandleWriterException() throws Exception {
 Throwable e = new Exception("test exception");
-MetronError error = new MetronError()
+MetronError expectedError1 = new MetronError()
 .withSensorType(Collections.singleton(sensorType))
-
.withErrorType(Constants.ErrorType.INDEXING_ERROR).withThrowable(e).withRawMessages(Arrays.asList(message1,
 message2));
+
.withErrorType(Constants.ErrorType.INDEXING_ERROR).withThrowable(e).withRawMessages(Collections.singletonList(message1));
--- End diff --

I agree.  Done.


---


[GitHub] metron issue #1233: METRON-1816: Date format Stellar function

2018-10-17 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1233
  
The existing Stellar date functions use SimpleDateFormat and already have a 
caching layer built in so I decided to reuse what was already there.  Happy to 
switch to DateTimeFormatter here or in a follow on.


---


[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...

2018-10-17 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1213
  
The latest commit adds the Deprecated annotations to parse and 
parseOptional and removes MultilineMessageParser.  I also updated the 
`MessageParserTest` with more tests to ensure we're handling the deprecated 
methods appropriately.  As a result of that, I changed the logic in 
MessageParser.parseOptionalResult slightly to return an empty Optional rather 
than an empty List to match the existing test.  Let me know if anyone has a 
problem with this.

Once we finish this I will make an announcement on the user and dev lists.


---


[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...

2018-10-16 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1213
  
I merged the latest master (which includes 
https://github.com/apache/metron/pull/1234) into this PR.  I experimented with 
moving the methods in MultilineMessageParser back into MessageParser and I 
think it turned out well.  Makes things clearer and easier to understand in my 
opinion (and also made the merge much easier).

What do you think @mmiklavc and @ottobackwards?  It reverts some changes in 
https://github.com/apache/metron/pull/1234 so I want to get your thoughts and 
buy in.


---


[GitHub] metron pull request #1239: METRON-1829: Large Error Message Causes Slow Sear...

2018-10-16 Thread merrimanr
GitHub user merrimanr opened a pull request:

https://github.com/apache/metron/pull/1239

METRON-1829: Large Error Message Causes Slow Search Performance

## Contributor Comments
When a failure happens in the BulkWriterComponent, all pending messages in 
a batch are combined into a single error message.  This results in a single 
large objects being written to ES and causes performance problems.  This PR 
attempts to solve this issue by instead creating separate error messages for 
each message in the batch. 

### Testing
This has been tested and verified in full dev:

1. Spin up full dev and verify data is being indexed into ES
2. Stop HDFS in Ambari.  This will cause a failure in the 
BulkWriterComponent.  By default the batch size for OOTB sensors is set to 5.
3. After HDFS has stopped, verify that documents are being written to an ES 
error index
4. Retrieve a single document from the ES error index.  There should only 
be a single `raw_message` field.  Subsequent documents in ES should also only 
contain a single `raw_message` field.  Before there would have been 5 
`raw_message_*` fields in a single document.
5.  Verify error documents are displayed properly in Kibana.

## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.  
Please refer to our [Development 
Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235)
 for the complete guide to follow for contributions.  
Please refer also to our [Build Verification 
Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview)
 for complete smoke testing guides.  


In order to streamline the review of the contribution we ask you follow 
these guidelines and ask you to double check the following:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON- where  is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?


### For code changes:
- [x] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified 
and tested manually?
- [x] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
  ```
  mvn -q clean integration-test install && 
dev-utilities/build-utils/verify_licenses.sh 
  ```

- [x] Have you written or updated unit tests and or integration tests to 
verify your changes?
- [x] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [x] Have you verified the basic functionality of the build by building 
and running locally with Vagrant full-dev environment or the equivalent?

### For documentation related changes:
- [x] Have you ensured that format looks appropriate for the output in 
which it is rendered by building and verifying the site-book? If not then run 
the following commands and the verify changes via 
`site-book/target/site/index.html`:

  ```
  cd site-book
  mvn site
  ```

 Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.
It is also recommended that [travis-ci](https://travis-ci.org) is set up 
for your personal repository such that your branches are built there before 
submitting a pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/merrimanr/incubator-metron METRON-1829

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/1239.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1239


commit d674d9afbf7f5a230737dbcfe4ab10a70106f4ed
Author: merrimanr 
Date:   2018-10-15T20:46:31Z

initial commit




---


[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...

2018-10-12 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1213#discussion_r224780152
  
--- Diff: 
metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/ParserRunnerImpl.java
 ---
@@ -137,11 +208,29 @@ private void 
initializeParsers(Supplier parserConfigSuppli
 }
   }
 
+  /**
+   * Post-processes parsed messages by:
+   * 
+   *   Applying field transformations defined in the sensor parser 
config
+   *   Filtering messages using the configured MessageFilter class
+   *   Validating messages using the MessageParser validate method
+   * 
+   * If a message is successfully processed a message is returned in a 
ProcessResult.  If a message fails
+   * validation, a MetronError object is created and returned in a 
ProcessResult.  If a message is
+   * filtered out an empty Optional is returned.
+   *
+   * @param sensorType Sensor type of the message
+   * @param message Message parsed by the MessageParser
+   * @param rawMessage Raw message including metadata
+   * @param parser MessageParser for the sensor type
+   * @param parserConfigurations Parser configurations
+   */
   @SuppressWarnings("unchecked")
-  protected Optional processMessage(String sensorType, 
JSONObject message, RawMessage rawMessage,
+  protected Optional processMessage(String sensorType, 
JSONObject message, RawMessage rawMessage,
   
MessageParser parser,
--- End diff --

Yes parser is used.  Line 248, `parser.validate`.


---


[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...

2018-10-12 Thread merrimanr
GitHub user merrimanr reopened a pull request:

https://github.com/apache/metron/pull/1213

METRON-1681: Decouple the ParserBolt from the Parse execution logic

## Contributor Comments
The primary purpose of this PR is to create a parser container abstraction 
that is decoupled from Storm.  This parser container (I call it ParserRunner, 
anyone have a better name?) is responsible for:

- Instantiating the MessageParser and MessageFilter objects for each sensor.
- Initializing the Stellar environment.
- Accepting a raw binary message along with parser configurations and 
calling the MessageParser.parseOptional method for the appropriate sensor type
- Process each message.  Most of this logic was migrated from the 
ParserBolt.execute method.  
- Execute callbacks depending on the processing result of each message

Configuration is external to this abstraction.  A configuration supplier is 
passed in when initializing and a configuration object is passed in when 
processing each message.  I believe this was originally done because we want 
message processing to be atomic without the configuration unexpectedly 
changing.  We can easily change to a message supplier during execute if 
necessary.  A CuratorFramework object is also required for setting up Stellar 
but we could easily make this optional.  

I decided to keep writing out of the abstraction in this PR.  I can 
envision different platforms having different requirements or needs for sending 
along messages after they are parsed.  Therefore the ParserBolt still handles 
writing messages to Kafka.  If we do decide we want to add writing to our 
abstraction we could do it in a follow on PR to keep this from becoming even 
bigger.

Since all of the post parsing logic was in the Parserbolt, messages could 
be written as they were processed rather than having to wait for all messages 
to be processed. To maintain this behavior I added 2 callback functions in the 
form of Java Consumers:  onSuccess and OnError.  The other option would be to 
make message processing synchronous and just return a list of results.

This lifecycle of this container looks like:

1. Container is created from a collection of sensor types 
2. Container is initialized with the init method that accepts a Curator 
client and a configuration Supplier.  This in turn sets up Stellar and 
instantiates MessageParser and MessageFilter classes.
3. Container is ready and accepts messages for processing and calls the 
appropriate callbacks.

Because this splits the ParserBolt into 2 different classes much of the 
ParserBoltTest unit test didn't make sense anymore.  I ended up essentially 
rewriting it and also creating a unit test for ParserRunner.  I tried to 
represent all the original tests in ParserBoltTest and have 95%+ coverage.  If 
I missed any cases or made undesirable style changes, let me know.

### Changes Included

- ParserRunner abstraction that is decoupled from the ParserBolt.  The 
ParserBolt now initializes the ParserRunner and defers parsing to that class.  
The only thing required is Metron configuration and a Curator client (which 
could be optional)
- Refactored ParserBolt that sets up the ParserRunner, passes message to it 
for processing, and writes results to Kafka.
- MessageParser and MessageFilter objects are now created when Storm calls 
prepare, avoiding serialization issues 
(https://issues.apache.org/jira/browse/METRON-1793)
- Updated unit and integration tests

### Testing
I have done basic testing in full dev:

1. Spin up full dev and verify bro and snort alerts are indexed in ES and 
the data looks correct.

2. Test for proper message parser error handling by producing an 
unparseable message to the bro topic:
```
echo 'bad message' | 
/usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list 
node1:6667 --topic bro
```
You should see a corresponding error message in the ES error index:
```
{
"_index" : "error_index_2018.09.25.20",
"_type" : "error_doc",
"_id" : "df83aebc-506d-404b-aba6-c5a740d07c57",
"_score" : 1.0,
"_source" : {
  "exception" : "java.lang.IllegalStateException: Unable to parse 
Message: test",
  "failed_sensor_type" : "bro",
  "stack" : "java.lang.IllegalStateException: Unable to parse 
Message: bad message
  ...
  "hostname" : "node1",
  "source:type" : "error",
  "raw_message" : "test",
  "error_hash" : 
"9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08",

[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...

2018-10-12 Thread merrimanr
Github user merrimanr closed the pull request at:

https://github.com/apache/metron/pull/1213


---


[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...

2018-10-11 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1213
  
The latest commit resolves the conflicts introduced by 
https://github.com/apache/metron/pull/1184.  The changes in all classes except 
ParserRunnerImpl and ParserRunnerImplTest were fairly trivial, mainly just 
adjusting the return types to match the new MessageParserResult and 
ParserRunnerResults interfaces.

Most of the work was done in ParserRunnerImpl (the execute method 
specifically).  Since the MessageParser interface can now return multiple 
messages and exceptions (and a master exception in some cases), there needs to 
be a way to map them to a ParserRunnerResults object (which is a list of 
messages and errors).   Now the primary job of this execute method is to:
- call MessageParser.parseOptionalResult
- post process messages returned by the MessageParser
- consolidate the various errors and messages into a single 
ParserRunnerResults object

Let me know if anyone thinks this should work differently.  I also added 
more javadocs to the various classes involved, primarily ParserRunnerImpl since 
that's where the magic happens.  This merge was fairly significant so I am 
planning on testing in full dev again.


---


[GitHub] metron issue #1218: METRON-1801 Allow Customization of Elasticsearch Documen...

2018-10-11 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1218
  
Looks good to me.  +1


---


[GitHub] metron pull request #1218: METRON-1801 Allow Customization of Elasticsearch ...

2018-10-11 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1218#discussion_r224548764
  
--- Diff: metron-platform/metron-elasticsearch/pom.xml ---
@@ -206,24 +206,7 @@
 test-jar
 test
 
-
--- End diff --

I have also noticed this and I end up commenting these out every time I run 
a test in my IDE.  Thanks for investigating and removing them.


---


[GitHub] metron pull request #1218: METRON-1801 Allow Customization of Elasticsearch ...

2018-10-11 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1218#discussion_r224541191
  
--- Diff: metron-platform/metron-elasticsearch/pom.xml ---
@@ -206,24 +206,7 @@
 test-jar
 test
 
-
--- End diff --

Why were these dependencies removed?  Just curious.


---


[GitHub] metron issue #1218: METRON-1801 Allow Customization of Elasticsearch Documen...

2018-10-11 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1218
  
@mraliagha, that's a good suggestion.  I believe we can functionally 
achieve that be creating a custom id field in the format you suggest (with a 
Stellar field transform) and set that field to be the ES id with the Ambari 
property exposed in this PR.  Do you feel it's worth documenting as an 
optimization?

I spun this up in full dev and ran through all the testing instructions.  
Everything worked as advertised.  I think there are just a couple open 
questions but this is pretty close in my opinion.


---


[GitHub] metron pull request #1218: METRON-1801 Allow Customization of Elasticsearch ...

2018-10-11 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1218#discussion_r224538137
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/ElasticsearchWriterConfig.java
 ---
@@ -0,0 +1,134 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+package org.apache.metron.elasticsearch.writer;
+
+import org.apache.metron.common.Constants;
+import org.apache.metron.stellar.common.utils.ConversionUtils;
+
+import java.util.Map;
+
+/**
+ * Configuration settings that customize the behavior of the {@link 
ElasticsearchWriter}.
+ */
+public enum ElasticsearchWriterConfig {
+
+  /**
+   * Defines which message field, the document identifier is set to.
+   *
+   * If defined, the value of the specified message field is set as the 
Elasticsearch doc ID. If
+   * this field is undefined or blank, then the document identifier is not 
set.
+   */
+  DOC_ID_SOURCE_FIELD("es.document.id", "", String.class);
--- End diff --

This does leave things in an inconsistent state and will makes things 
confusing for anyone working with these classes.  However I think a follow up 
PR would be fine.

Are you aware of the 
[ConfigOption](https://github.com/apache/metron/blob/master/metron-platform/metron-common/src/main/java/org/apache/metron/common/configuration/ConfigOption.java)
 interface?  It looks like there may be some duplication with this class.


---


[GitHub] metron issue #1184: METRON-1761, allow application of grok statement multipl...

2018-10-10 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1184
  
+1 from me too.  I will merge this into 
https://github.com/apache/metron/pull/1213 once it's in master and we can 
continue moving forward there.  Parsing is getting an upgrade!


---


[GitHub] metron pull request #1233: METRON-1816: Date format Stellar function

2018-10-10 Thread merrimanr
GitHub user merrimanr opened a pull request:

https://github.com/apache/metron/pull/1233

METRON-1816: Date format Stellar function

## Contributor Comments
This PR adds a Stellar function that wraps the Java SimpleDateFormat class. 
 The first argument is a date format string and is required.  The second 
argument is epoch time and is optional (defaults to current time).  The third 
argument is a timezone and is also option (default time zone of the environment 
is used if missing).

### Testing
This has been tested in full dev:

1. Spin up full dev and launch the Stellar REPL.  The `DATE_FORMAT` 
function should be available:
```
Stellar, Go!
Functions are loading lazily in the background and will be unavailable 
until loaded fully.
{}
[Stellar]>>> %functions DATE
DATE_FORMAT, IS_DATE
```
2. Create a variable and set it to an epoch time:
```
[Stellar]>>> epoch := 1472131630748L
1472131630748
```
3. Use the `DATE_FORMAT` function to format the value in the epoch field 
created in the previous step.  The result should be a correctly formatted date 
that matches the format string, epoch time and timezone:
```
[Stellar]>>> DATE_FORMAT('EEE MMM dd  hh:mm:ss zzz', epoch, 'EST')
Thu Aug 25 2016 08:27:10 EST
```
4. Format the date using the default timezone.  The result should match the 
default timezone of your environment:
```
[Stellar]>>> DATE_FORMAT('EEE MMM dd  hh:mm:ss ', epoch)
Thu Aug 25 2016 01:27:10 Coordinated Universal Time
```
5. Format the date using the current time:
```
[Stellar]>>> DATE_FORMAT('EEE MMM dd  hh:mm:ss zzz', 'GMT')
Tue Oct 09 2018 10:13:24 GMT
```
6. Format the date using the current time and default time zone:
```
[Stellar]>>> DATE_FORMAT('EEE MMM dd  hh:mm:ss ')
Tue Oct 09 2018 10:12:54 Coordinated Universal Time
```


## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.  
Please refer to our [Development 
Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235)
 for the complete guide to follow for contributions.  
Please refer also to our [Build Verification 
Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview)
 for complete smoke testing guides.  


In order to streamline the review of the contribution we ask you follow 
these guidelines and ask you to double check the following:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON- where  is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?


### For code changes:
- [x] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified 
and tested manually?
- [x] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
  ```
  mvn -q clean integration-test install && 
dev-utilities/build-utils/verify_licenses.sh 
  ```

- [x] Have you written or updated unit tests and or integration tests to 
verify your changes?
- [x] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [x] Have you verified the basic functionality of the build by building 
and running locally with Vagrant full-dev environment or the equivalent?

### For documentation related changes:
- [x] Have you ensured that format looks appropriate for the output in 
which it is rendered by building and verifying the site-book? If not then run 
the following commands and the verify changes via 
`site-book/target/site/index.html`:

  ```
  cd site-book
  mvn site
  ```

 Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.
It is also recommended that [travis-ci](https://travis-ci.org) is set up 
for your personal repository such that your branches are built there before 
submitting a pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/merrimanr/incubator-metron METRON-1816

Alternatively you can review and apply these change

[GitHub] metron issue #1221: METRON-1805: Provide a default value for the Storm topol...

2018-10-09 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1221
  
I added a default to the profiler topology.  We're probably less likely to 
run into a problem with the parser topologies I would think.  Should we provide 
defaults for those?


---


[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...

2018-10-09 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1213#discussion_r223887299
  
--- Diff: 
metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/ParserRunnerImpl.java
 ---
@@ -0,0 +1,216 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.parsers;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.metron.common.Constants;
+import org.apache.metron.common.configuration.FieldTransformer;
+import org.apache.metron.common.configuration.FieldValidator;
+import org.apache.metron.common.configuration.ParserConfigurations;
+import org.apache.metron.common.configuration.SensorParserConfig;
+import org.apache.metron.common.error.MetronError;
+import org.apache.metron.common.message.metadata.RawMessage;
+import org.apache.metron.common.utils.ReflectionUtils;
+import org.apache.metron.parsers.filters.Filters;
+import org.apache.metron.parsers.interfaces.MessageFilter;
+import org.apache.metron.parsers.interfaces.MessageParser;
+import org.apache.metron.parsers.topology.ParserComponent;
+import org.apache.metron.stellar.dsl.Context;
+import org.json.simple.JSONObject;
+
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Set;
+import java.util.UUID;
+import java.util.function.Consumer;
+import java.util.function.Supplier;
+import java.util.stream.Collectors;
+
+public class ParserRunnerImpl implements ParserRunner, Serializable {
+
+  protected transient Consumer onSuccess;
+  protected transient Consumer onError;
+
+  private HashSet sensorTypes;
+  private Map sensorToParserComponentMap;
+
+  // Stellar variables
+  private transient Context stellarContext;
+
+  public ParserRunnerImpl(HashSet sensorTypes) {
+this.sensorTypes = sensorTypes;
+  }
+
+  public Map getSensorToParserComponentMap() {
+return sensorToParserComponentMap;
+  }
+
+  public void setSensorToParserComponentMap(Map 
sensorToParserComponentMap) {
+this.sensorToParserComponentMap = sensorToParserComponentMap;
+  }
+
+  public Context getStellarContext() {
+return stellarContext;
+  }
+
+  @Override
+  public Set getSensorTypes() {
+return sensorTypes;
+  }
+
+  @Override
+  public void init(Supplier parserConfigSupplier, 
Context stellarContext) {
+if (parserConfigSupplier == null) {
+  throw new IllegalStateException("A parser config supplier must be 
set before initializing the ParserRunner.");
+}
+if (stellarContext == null) {
+  throw new IllegalStateException("A stellar context must be set 
before initializing the ParserRunner.");
+}
+this.stellarContext = stellarContext;
+initializeParsers(parserConfigSupplier);
+  }
+
+  @Override
+  public List execute(String sensorType, RawMessage 
rawMessage, ParserConfigurations parserConfigurations) {
+List parserResults;
+SensorParserConfig sensorParserConfig = 
parserConfigurations.getSensorParserConfig(sensorType);
+if (sensorParserConfig != null) {
+  MessageParser parser = 
sensorToParserComponentMap.get(sensorType).getMessageParser();
+  List messages = 
parser.parseOptional(rawMessage.getMessage()).orElse(Collections.emptyList());
+  parserResults = messages.stream()
+  .map(message -> processMessage(sensorType, message, 
rawMessage, parser, parserConfigurations))
+  .filter(Optional::isPresent)
+  .map(Optional::get).collect(Collectors.toList());
+} else {
+  throw new IllegalStateException(String.format("Could not execute 

[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...

2018-10-09 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1213#discussion_r223886964
  
--- Diff: 
metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/ParserResult.java
 ---
@@ -0,0 +1,75 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.parsers;
+
+import org.apache.metron.common.error.MetronError;
+import org.json.simple.JSONObject;
+
+import java.util.Arrays;
+import java.util.Objects;
+
+public class ParserResult {
+
+  private String sensorType;
+  private JSONObject message;
+  private MetronError error;
+  private byte[] originalMessage;
+
+  public ParserResult(String sensorType, JSONObject message, byte[] 
originalMessage) {
+this.sensorType = sensorType;
+this.message = message;
+this.originalMessage = originalMessage;
+  }
+
+  public ParserResult(String sensorType, MetronError error, byte[] 
originalMessage) {
+this.sensorType = sensorType;
+this.error = error;
+this.originalMessage = originalMessage;
+  }
+
+  public String getSensorType() {
+return sensorType;
+  }
+
+  public JSONObject getMessage() {
+return message;
+  }
+
+  public MetronError getError() {
+return error;
+  }
+
+  public byte[] getOriginalMessage() {
+return originalMessage;
+  }
+
+  public boolean isError() {
+return error != null;
+  }
+
+  @Override
+  public boolean equals(Object o) {
--- End diff --

Done


---


[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...

2018-10-09 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1213#discussion_r223885124
  
--- Diff: 
metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/ParserRunnerImpl.java
 ---
@@ -101,26 +92,28 @@ public void init(Supplier 
parserConfigSupplier, Context st
   }
 
   @Override
-  public void execute(String sensorType, RawMessage rawMessage, 
ParserConfigurations parserConfigurations) {
-if (onSuccess == null) {
-  throw new IllegalStateException("An onSuccess function must be set 
before parsing a message.");
-}
-if (onError == null) {
-  throw new IllegalStateException("An onError function must be set 
before parsing a message.");
-}
+  public List execute(String sensorType, RawMessage 
rawMessage, ParserConfigurations parserConfigurations) {
+List parserResults;
 SensorParserConfig sensorParserConfig = 
parserConfigurations.getSensorParserConfig(sensorType);
 if (sensorParserConfig != null) {
   MessageParser parser = 
sensorToParserComponentMap.get(sensorType).getMessageParser();
   List messages = 
parser.parseOptional(rawMessage.getMessage()).orElse(Collections.emptyList());
-  messages.forEach(message -> processMessage(sensorType, message, 
rawMessage, parser, parserConfigurations));
+  parserResults = messages.stream()
+  .map(message -> processMessage(sensorType, message, 
rawMessage, parser, parserConfigurations))
+  .filter(Optional::isPresent)
--- End diff --

I believe not present means the message was filtered out (our FilterClass 
mechanism).  Otherwise there will either be a message or error in the 
ParserResult.


---


[GitHub] metron issue #1231: METRON-1811: Alert Search Fails When Sorting by Alert St...

2018-10-09 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1231
  
The latest commit changes the SearchIntegrationTest to use the ES templates 
that ship with Metron.  There are a couple different tests that verify certain 
fields are defined with the correct types and I made sure the internal fields 
(including `alert_status`) are covered.  The one potentially confusing change 
is that a couple test fields are added to the OOTB ES templates before these 
tests are run.  This was necessary for the existing tests to continue working 
without major changes.  

I think this should cover it.  Let me know what you think.


---


[GitHub] metron issue #1231: METRON-1811: Alert Search Fails When Sorting by Alert St...

2018-10-09 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1231
  
I don't know that a test would have caught this.  The issue is that a UI 
feature was implemented and added a new field to sensor indices in ES.  This 
new field was not added to the ES templates but the feature still worked 
because ES has a very flexible schema.  It was a separate, unrelated operation 
(sorting on the new field) that exposed a problem.

What do you think?  Is there a way we can catch this kind of thing?  I can 
add a test that simulates a sort but it would only guard against a field being 
removed from a template and not added.


---


[GitHub] metron pull request #1231: METRON-1811: Alert Search Fails When Sorting by A...

2018-10-08 Thread merrimanr
GitHub user merrimanr reopened a pull request:

https://github.com/apache/metron/pull/1231

METRON-1811: Alert Search Fails When Sorting by Alert Status

## Contributor Comments
This PR fixes sorting on the `alert_status` field in the Alerts UI by 
defining the field in ES templates as a `keyword` type.  The change was applied 
to the sensor templates that ship with Metron:  bro, snort and yaf.  This field 
was added to the Solr schemas as well.

I also updated our documentation to give users guidance when defining their 
own templates or upgrading their templates.  I expanded this to include other 
internal fields like `source:type` and `metron_alert`.  I did not include 
dynamic fields but I can add documentation for that here if it makes sense.

### Testing

This has been tested in full dev:

1. Spin up full dev and navigate to the Alerts UI.
2. Change the status of a couple alerts by opening up their details panel 
and clicking a different status (OPEN for example).
3. Sort by `alert_status`.  The Alerts UI should properly display alerts by 
`alert_status` and no errors should be reported in the console.
4. Enable Solr and verify data is visible in the Alerts UI.  Repeat steps 2 
and 3.

## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.  
Please refer to our [Development 
Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235)
 for the complete guide to follow for contributions.  
Please refer also to our [Build Verification 
Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview)
 for complete smoke testing guides.  


In order to streamline the review of the contribution we ask you follow 
these guidelines and ask you to double check the following:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON- where  is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?


### For code changes:
- [x] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified 
and tested manually?
- [x] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
  ```
  mvn -q clean integration-test install && 
dev-utilities/build-utils/verify_licenses.sh 
  ```

- [x] Have you written or updated unit tests and or integration tests to 
verify your changes?
- [x] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [x] Have you verified the basic functionality of the build by building 
and running locally with Vagrant full-dev environment or the equivalent?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered by building and verifying the site-book? If not then run 
the following commands and the verify changes via 
`site-book/target/site/index.html`:

  ```
  cd site-book
  mvn site
  ```

 Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.
It is also recommended that [travis-ci](https://travis-ci.org) is set up 
for your personal repository such that your branches are built there before 
submitting a pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/merrimanr/incubator-metron METRON-1811

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/1231.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1231


commit 7a707bfbb1c6339f5891763c82e611eb080c4af7
Author: merrimanr 
Date:   2018-10-08T14:40:37Z

initial commit

commit 7016c1fb1b0cede0191edd32ac1dddf9b191eb13
Author: merrimanr 
Date:   2018-10-08T19:36:38Z

typo

commit 0920cdd0b83ccae7ceb7fc5b0ebfc5284a777e5d
Author: merrimanr 
Date:   2018-10-08T19:57:54Z

fixed test




---


[GitHub] metron pull request #1231: METRON-1811: Alert Search Fails When Sorting by A...

2018-10-08 Thread merrimanr
Github user merrimanr closed the pull request at:

https://github.com/apache/metron/pull/1231


---


[GitHub] metron issue #1230: METRON-1812: Fix dependencies_with_url.csv

2018-10-08 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1230
  
+1 for the clever history reference and this PR.


---


[GitHub] metron pull request #1231: METRON-1811: Alert Search Fails When Sorting by A...

2018-10-08 Thread merrimanr
GitHub user merrimanr opened a pull request:

https://github.com/apache/metron/pull/1231

METRON-1811: Alert Search Fails When Sorting by Alert Status

## Contributor Comments
This PR fixes sorting on the `alert_status` field in the Alerts UI by 
defining the field in ES templates as a `keyword` type.  The change was applied 
to the sensor templates that ship with Metron:  bro, snort and yaf.  This field 
was added to the Solr schemas as well.

I also updated our documentation to give users guidance when defining their 
own templates or upgrading their templates.  I expanded this to include other 
internal fields like `source:type` and `metron_alert`.  I did not include 
dynamic fields but I can add documentation for that here if it makes sense.

### Testing

This has been tested in full dev:

1. Spin up full dev and navigate to the Alerts UI.
2. Change the status of a couple alerts by opening up their details panel 
and clicking a different status (OPEN for example).
3. Sort by `alert_status`.  The Alerts UI should properly display alerts by 
`alert_status` and no errors should be reported in the console.
4. Enable Solr and verify data is visible in the Alerts UI.  Repeat steps 2 
and 3.

## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.  
Please refer to our [Development 
Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235)
 for the complete guide to follow for contributions.  
Please refer also to our [Build Verification 
Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview)
 for complete smoke testing guides.  


In order to streamline the review of the contribution we ask you follow 
these guidelines and ask you to double check the following:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON- where  is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?


### For code changes:
- [x] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified 
and tested manually?
- [x] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
  ```
  mvn -q clean integration-test install && 
dev-utilities/build-utils/verify_licenses.sh 
  ```

- [x] Have you written or updated unit tests and or integration tests to 
verify your changes?
- [x] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [x] Have you verified the basic functionality of the build by building 
and running locally with Vagrant full-dev environment or the equivalent?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered by building and verifying the site-book? If not then run 
the following commands and the verify changes via 
`site-book/target/site/index.html`:

  ```
  cd site-book
  mvn site
  ```

 Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.
It is also recommended that [travis-ci](https://travis-ci.org) is set up 
for your personal repository such that your branches are built there before 
submitting a pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/merrimanr/incubator-metron METRON-1811

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/1231.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1231


commit 7a707bfbb1c6339f5891763c82e611eb080c4af7
Author: merrimanr 
Date:   2018-10-08T14:40:37Z

initial commit




---


[GitHub] metron issue #1227: METRON-1807: Auto populate the recommended values to som...

2018-10-08 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1227
  
+1 by inspection.  Thanks @MohanDV.


---


[GitHub] metron issue #1221: METRON-1805: Provide a default value for the Storm topol...

2018-10-05 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1221
  
As far as I know there was not an issue with other topologies.  Writing 
data tends to cause back pressure so that's why these changes were suggested 
for indexing topologies.


---


[GitHub] metron issue #1184: METRON-1761, allow application of grok statement multipl...

2018-10-04 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1184
  
I prefer @mmiklavc's suggestion several comments back:  return a List of 
something other than a List.  Could we use a wrapper that can 
contain results and errors?  The calling class could then determine the best 
way to handle errors without having to fail the whole batch.


---


[GitHub] metron pull request #1190: METRON-1771: Update REST endpoints to support eve...

2018-10-04 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1190#discussion_r222823946
  
--- Diff: 
metron-platform/metron-indexing/src/main/java/org/apache/metron/indexing/dao/MultiIndexDao.java
 ---
@@ -282,4 +276,27 @@ public Document getLatest(final String guid, String 
sensorType) throws IOExcepti
   public List getIndices() {
 return indices;
   }
+
+  private Document getDocument(List documentContainers) 
throws IOException {
+Document ret = null;
+List error = new ArrayList<>();
+for(DocumentContainer dc : documentContainers) {
+  if(dc.getException().isPresent()) {
+Throwable e = dc.getException().get();
+error.add(e.getMessage() + "\n" + ExceptionUtils.getStackTrace(e));
+  }
+  else {
+if(dc.getDocument().isPresent()) {
+  Document d = dc.getDocument().get();
+  if(ret == null || ret.getTimestamp() < d.getTimestamp()) {
+ret = d;
+  }
+}
+  }
+}
--- End diff --

Looks good to me.  Latest commit includes this change.


---


[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...

2018-10-04 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1213
  
As far as I know it's been that way for a really long time.


---


[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...

2018-10-04 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1213
  
The latest commit removes the callbacks.  Now the execute method returns a 
list of ParserResult objects.  I had to add a MetronError field to the 
ParserResult class so that both successful messages and errors are passed back.

I also noticed we are currently skipping a tuple if no configuration is 
found for that sensor.  I think skipping the tuple is fine but we should still 
log an error message.  I made a change in the ParserRunner class to throw an 
error on this condition.  The parser bolt will catch it and handle it similar 
to a parser error by creating a MetronError object and sending it to the error 
topic.  Does anyone think we should keep as it is now?


---


[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...

2018-10-04 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1213
  
I don't feel that strongly about either approach so I'll switch it to 
return a list and remove the callbacks.


---


[GitHub] metron pull request #1221: METRON-1805: Provide a default value for the Stor...

2018-10-03 Thread merrimanr
GitHub user merrimanr opened a pull request:

https://github.com/apache/metron/pull/1221

METRON-1805: Provide a default value for the Storm 
topology.max.spout.pending setting

## Contributor Comments
Users have reported problems in the random and batch indexing topologies 
when this setting is not set.  The primary purpose of this PR is to start a 
discussion around providing a default value for the Storm 
topology.max.spout.pending setting in our topologies and implement the changes 
if we decide to do it.  

1. Should we provide defaults or continue to defer to the Storm default 
where a maximum isn't enforced?
2. Which topologies should we provide defaults for?
3. What should the default values be for the topologies we include?

## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.  
Please refer to our [Development 
Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235)
 for the complete guide to follow for contributions.  
Please refer also to our [Build Verification 
Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview)
 for complete smoke testing guides.  


In order to streamline the review of the contribution we ask you follow 
these guidelines and ask you to double check the following:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON- where  is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?


### For code changes:
- [ ] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
- [ ] Have you included steps or a guide to how the change may be verified 
and tested manually?
- [ ] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
  ```
  mvn -q clean integration-test install && 
dev-utilities/build-utils/verify_licenses.sh 
  ```

- [ ] Have you written or updated unit tests and or integration tests to 
verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] Have you verified the basic functionality of the build by building 
and running locally with Vagrant full-dev environment or the equivalent?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered by building and verifying the site-book? If not then run 
the following commands and the verify changes via 
`site-book/target/site/index.html`:

  ```
  cd site-book
  mvn site
  ```

 Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.
It is also recommended that [travis-ci](https://travis-ci.org) is set up 
for your personal repository such that your branches are built there before 
submitting a pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/merrimanr/incubator-metron METRON-1805

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/1221.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1221


commit 6988e0607afc1cd402063e0696a1220cc134d9d1
Author: merrimanr 
Date:   2018-10-03T21:45:32Z

initial commit




---


[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...

2018-10-03 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1213
  
I just pushed a commit out based on some feedback from Nick: 

- The stellar context is now passed into the ParserRunner abstraction as a 
dependency.  I imagine different environments will require different Stellar 
initialization strategies so this made sense to me.  I also like it because it 
makes the abstraction simpler.
-  Changed ParserRunner to an interface with a single implementation.  I 
could change this to ParserRunnerStrategy in case we think we might want to use 
that pattern in the future.  Then all that would be missing is a ParserContext 
class (and possibly a ParserRunnerStrategies enum) which we could add now or 
when we need it.


---


[GitHub] metron pull request #1190: METRON-1771: Update REST endpoints to support eve...

2018-10-03 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1190#discussion_r222470180
  
--- Diff: 
metron-platform/metron-indexing/src/main/java/org/apache/metron/indexing/dao/MultiIndexDao.java
 ---
@@ -282,4 +276,27 @@ public Document getLatest(final String guid, String 
sensorType) throws IOExcepti
   public List getIndices() {
 return indices;
   }
+
+  private Document getDocument(List documentContainers) 
throws IOException {
+Document ret = null;
+List error = new ArrayList<>();
+for(DocumentContainer dc : documentContainers) {
+  if(dc.getException().isPresent()) {
+Throwable e = dc.getException().get();
+error.add(e.getMessage() + "\n" + ExceptionUtils.getStackTrace(e));
+  }
+  else {
+if(dc.getDocument().isPresent()) {
+  Document d = dc.getDocument().get();
+  if(ret == null || ret.getTimestamp() < d.getTimestamp()) {
+ret = d;
+  }
+}
+  }
+}
--- End diff --

No problem, don't mind working it out here and now.  The case that causes 
the test to fail (although I'm not sure that was the intention of the test) is 
where 2 DAOs are contained in a MultiIndexDao and the first DAO returns an 
invalid result (no document or exception) while the second returns a valid 
result.  I don't know what would cause a document to be missing both a document 
and an exception.  

The question is what should happen when this unexpected condition does 
happen?  Should it blow up and throw and exception or return a document if at 
least one DAO returns a valid one.


---


[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...

2018-10-03 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1213
  
Thanks for pointing me to that @mmiklavc, I had not looked at it yet.  I am 
in agreement that we should do our best to match other classes and patterns we 
already have.  I will continue studying that to see where we can make things 
more consistent.  If you can provide some examples that demonstrate your ideas 
that would help me understand it better.  

From what I can tell, the strategy pattern was used in the 
UnifiedEnrichmentBolt because there are 2 separate strategies for enriching a 
message:  enrichment and threat intel.  Do we need the strategy pattern here 
since there is only one parser running strategy?  We could implement this as a 
strategy pattern in anticipation of one day needing another parser running 
strategy.  Do you think it's worth doing now or later when we need it?

The other difference I see is that Future objects are used to join the 
different messages after they are processed in parallel.  I believe this is 
done because all enrichments/threat intel must be done before the message can 
be pieced back together and sent along.  In this case we don't have that 
limitation.  The documents that are parsed do not depend on each other and can 
be passed along as soon as they are processed.  We are set up for 
post-processing these documents in parallel but I see that as a low-latency 
operation and I'm not sure it's worth the extra overhead.




---


[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...

2018-10-03 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1213#discussion_r222440594
  
--- Diff: 
metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/ParserRunner.java
 ---
@@ -0,0 +1,234 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.parsers;
+
+import com.github.benmanes.caffeine.cache.Cache;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.curator.framework.CuratorFramework;
+import org.apache.metron.common.Constants;
+import org.apache.metron.common.configuration.FieldTransformer;
+import org.apache.metron.common.configuration.FieldValidator;
+import org.apache.metron.common.configuration.ParserConfigurations;
+import org.apache.metron.common.configuration.SensorParserConfig;
+import org.apache.metron.common.error.MetronError;
+import org.apache.metron.common.message.metadata.RawMessage;
+import org.apache.metron.common.utils.ReflectionUtils;
+import org.apache.metron.parsers.filters.Filters;
+import org.apache.metron.parsers.interfaces.MessageFilter;
+import org.apache.metron.parsers.interfaces.MessageParser;
+import org.apache.metron.parsers.topology.ParserComponent;
+import org.apache.metron.stellar.common.CachingStellarProcessor;
+import org.apache.metron.stellar.dsl.Context;
+import org.apache.metron.stellar.dsl.StellarFunctions;
+import org.json.simple.JSONObject;
+
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.UUID;
+import java.util.function.Consumer;
+import java.util.function.Supplier;
+import java.util.stream.Collectors;
+
+public class ParserRunner implements Serializable {
+
+  protected transient Consumer onSuccess;
+  protected transient Consumer onError;
+
+  private HashSet sensorTypes;
+  private Map sensorToParserComponentMap;
+
+  // Stellar variables
+  private transient Context stellarContext;
--- End diff --

Context is not serializable.


---


[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...

2018-10-03 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1213
  
That sounds correct to me.  Specifically decoupling the Kafka writing step 
since our writing implementation requires a tuple.


---


[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...

2018-10-02 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1213
  
The MessageParser.parseOptional call returns a list of JSONObjects which is 
then processed (field transformations applied, messages validated, etc).  The 
difference is WHEN each processed JSONObject is written to Kafka.  In the 
current ParserBolt each one is sent as soon as it is processed because 
everything runs in ParserBolt.execute.  Since we are extracting that logic 
outside of the ParserBolt, we either need to write them all after processing is 
done by the abstraction or expose callbacks that the abstraction calls as it 
finishes processing each one.

I don't think it matters much either way.  I chose to use callbacks because 
it more closely matches the flow of the current ParserBolt.  I will assume 
everyone is ok with callbacks and will continue work on the suggestions made in 
the comments.


---


[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...

2018-10-02 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1213
  
I think the travis failure is unrelated.  I have to make changes based on 
feedback so I'll keep on eye on it after the next run.

Before I go implementing changes we need to decide on callback vs list.  I 
don't see how we can design an interface that supports both (am I wrong there?) 
so we need to choose.  Seems to be split at the moment.  Thoughts?


---


[GitHub] metron pull request #1190: METRON-1771: Update REST endpoints to support eve...

2018-10-02 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1190#discussion_r222000643
  
--- Diff: 
metron-platform/metron-indexing/src/main/java/org/apache/metron/indexing/dao/MultiIndexDao.java
 ---
@@ -282,4 +276,27 @@ public Document getLatest(final String guid, String 
sensorType) throws IOExcepti
   public List getIndices() {
 return indices;
   }
+
+  private Document getDocument(List documentContainers) 
throws IOException {
+Document ret = null;
+List error = new ArrayList<>();
+for(DocumentContainer dc : documentContainers) {
+  if(dc.getException().isPresent()) {
+Throwable e = dc.getException().get();
+error.add(e.getMessage() + "\n" + ExceptionUtils.getStackTrace(e));
+  }
+  else {
+if(dc.getDocument().isPresent()) {
+  Document d = dc.getDocument().get();
+  if(ret == null || ret.getTimestamp() < d.getTimestamp()) {
+ret = d;
+  }
+}
+  }
+}
--- End diff --

Upon further review, I'm not sure this is what we want.  This caused an 
integration test to fail because the result from the first DAO did not contain 
a document or exception while the result from the second DAO did contain a 
document.  Before this change, the document from the second DAO would have been 
returned.  I think we want to return a document if possible, even if all DAOs 
are not consistent.  Do you disagree?


---


[GitHub] metron issue #1190: METRON-1771: Update REST endpoints to support eventually...

2018-10-02 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1190
  
@nickwallen, the latest commit should address your comments.  Let me know 
what you think.


---


[GitHub] metron issue #1212: METRON-1794 Include User Details When Escalating Alerts

2018-09-28 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1212
  
Looks good to me.  +1


---


[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...

2018-09-27 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1213#discussion_r221023741
  
--- Diff: 
metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/ParserRunner.java
 ---
@@ -0,0 +1,234 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.parsers;
+
+import com.github.benmanes.caffeine.cache.Cache;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.curator.framework.CuratorFramework;
+import org.apache.metron.common.Constants;
+import org.apache.metron.common.configuration.FieldTransformer;
+import org.apache.metron.common.configuration.FieldValidator;
+import org.apache.metron.common.configuration.ParserConfigurations;
+import org.apache.metron.common.configuration.SensorParserConfig;
+import org.apache.metron.common.error.MetronError;
+import org.apache.metron.common.message.metadata.RawMessage;
+import org.apache.metron.common.utils.ReflectionUtils;
+import org.apache.metron.parsers.filters.Filters;
+import org.apache.metron.parsers.interfaces.MessageFilter;
+import org.apache.metron.parsers.interfaces.MessageParser;
+import org.apache.metron.parsers.topology.ParserComponent;
+import org.apache.metron.stellar.common.CachingStellarProcessor;
+import org.apache.metron.stellar.dsl.Context;
+import org.apache.metron.stellar.dsl.StellarFunctions;
+import org.json.simple.JSONObject;
+
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.UUID;
+import java.util.function.Consumer;
+import java.util.function.Supplier;
+import java.util.stream.Collectors;
+
+public class ParserRunner implements Serializable {
+
+  protected transient Consumer onSuccess;
+  protected transient Consumer onError;
+
+  private HashSet sensorTypes;
+  private Map sensorToParserComponentMap;
+
+  // Stellar variables
+  private transient Context stellarContext;
+
+  public ParserRunner(HashSet sensorTypes) {
+this.sensorTypes = sensorTypes;
+  }
+
+  public Map getSensorToParserComponentMap() {
+return sensorToParserComponentMap;
+  }
+
+  public void setSensorToParserComponentMap(Map 
sensorToParserComponentMap) {
+this.sensorToParserComponentMap = sensorToParserComponentMap;
+  }
+
+  public Context getStellarContext() {
+return stellarContext;
+  }
+
+  public void setOnSuccess(Consumer onSuccess) {
+this.onSuccess = onSuccess;
+  }
+
+  public void setOnError(Consumer onError) {
+this.onError = onError;
+  }
+
+  public void init(CuratorFramework client, Supplier 
parserConfigSupplier) {
+if (parserConfigSupplier == null) {
+  throw new IllegalStateException("A parser config supplier must be 
set before initializing the ParserRunner.");
+}
+initializeStellar(client, parserConfigSupplier);
+initializeParsers(parserConfigSupplier);
+  }
+
+  protected void initializeStellar(CuratorFramework client, 
Supplier parserConfigSupplier) {
+Map cacheConfig = new HashMap<>();
+for (String sensorType: sensorTypes) {
+  SensorParserConfig config = 
parserConfigSupplier.get().getSensorParserConfig(sensorType);
+
+  if (config != null) {
+cacheConfig.putAll(config.getCacheConfig());
+  }
+}
+Cache cache = 
CachingStellarProcessor.createCache(cacheConfig);
+
+Context.Builder builder = new Context.Builder()
+.with(Context.Capabilities.ZOOKEEPER_CLIENT, () -> client)
+.with(Context.Capabilities.GLOBAL_CONFIG, () -> 
parserConfigSupplier.get().getGlobalConfig())
+.with(Context.Capabilities.STELL

[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...

2018-09-27 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1213#discussion_r221017124
  
--- Diff: 
metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/ParserRunner.java
 ---
@@ -0,0 +1,234 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.parsers;
+
+import com.github.benmanes.caffeine.cache.Cache;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.curator.framework.CuratorFramework;
+import org.apache.metron.common.Constants;
+import org.apache.metron.common.configuration.FieldTransformer;
+import org.apache.metron.common.configuration.FieldValidator;
+import org.apache.metron.common.configuration.ParserConfigurations;
+import org.apache.metron.common.configuration.SensorParserConfig;
+import org.apache.metron.common.error.MetronError;
+import org.apache.metron.common.message.metadata.RawMessage;
+import org.apache.metron.common.utils.ReflectionUtils;
+import org.apache.metron.parsers.filters.Filters;
+import org.apache.metron.parsers.interfaces.MessageFilter;
+import org.apache.metron.parsers.interfaces.MessageParser;
+import org.apache.metron.parsers.topology.ParserComponent;
+import org.apache.metron.stellar.common.CachingStellarProcessor;
+import org.apache.metron.stellar.dsl.Context;
+import org.apache.metron.stellar.dsl.StellarFunctions;
+import org.json.simple.JSONObject;
+
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.UUID;
+import java.util.function.Consumer;
+import java.util.function.Supplier;
+import java.util.stream.Collectors;
+
+public class ParserRunner implements Serializable {
+
+  protected transient Consumer onSuccess;
+  protected transient Consumer onError;
+
+  private HashSet sensorTypes;
+  private Map sensorToParserComponentMap;
+
+  // Stellar variables
+  private transient Context stellarContext;
+
+  public ParserRunner(HashSet sensorTypes) {
+this.sensorTypes = sensorTypes;
+  }
+
+  public Map getSensorToParserComponentMap() {
+return sensorToParserComponentMap;
+  }
+
+  public void setSensorToParserComponentMap(Map 
sensorToParserComponentMap) {
+this.sensorToParserComponentMap = sensorToParserComponentMap;
+  }
+
+  public Context getStellarContext() {
+return stellarContext;
+  }
+
+  public void setOnSuccess(Consumer onSuccess) {
+this.onSuccess = onSuccess;
+  }
+
+  public void setOnError(Consumer onError) {
+this.onError = onError;
+  }
+
+  public void init(CuratorFramework client, Supplier 
parserConfigSupplier) {
+if (parserConfigSupplier == null) {
+  throw new IllegalStateException("A parser config supplier must be 
set before initializing the ParserRunner.");
+}
+initializeStellar(client, parserConfigSupplier);
+initializeParsers(parserConfigSupplier);
+  }
+
+  protected void initializeStellar(CuratorFramework client, 
Supplier parserConfigSupplier) {
--- End diff --

Ok sounds good.  Are you ok if I leave the Stellar stuff in there as a 
default that only gets called if one hasn't been provided?


---


[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...

2018-09-27 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1213#discussion_r220995913
  
--- Diff: 
metron-platform/metron-parsers/src/test/java/org/apache/metron/parsers/integration/ParserDriver.java
 ---
@@ -83,49 +86,21 @@ public void close() throws Exception {
 List errors = new ArrayList<>();
 
 public ShimParserBolt(List output) {
-  super(null
-  , Collections.singletonMap(
-  sensorType == null ? config.getSensorTopic() : sensorType,
-  new ParserComponents(
-  ReflectionUtils.createInstance(config.getParserClassName()),
-  null,
-  new WriterHandler(new CollectingWriter(output))
-  )
- )
-  );
+  super(null, parserRunner, Collections.singletonMap(sensorType, new 
WriterHandler(new CollectingWriter(output;
--- End diff --

Not sure this ShimParserBolt is necessary anymore.


---


[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...

2018-09-27 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1213#discussion_r220994410
  
--- Diff: 
metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/topology/ParserTopologyBuilder.java
 ---
@@ -268,29 +266,14 @@ private static ParserBolt createParserBolt( String 
zookeeperUrl,
   Optional 
securityProtocol,
   ParserConfigurations configs,
   Optional 
outputTopic) {
-
-Map parserBoltConfigs = new HashMap<>();
+Map writerConfigs = new HashMap<>();
 for( Entry entry : 
sensorTypeToParserConfig.entrySet()) {
   String sensorType = entry.getKey();
   SensorParserConfig parserConfig = entry.getValue();
-  // create message parser
--- End diff --

This should not be done here because it can potentially cause serialization 
problems:  https://issues.apache.org/jira/browse/METRON-1793.  MessageParser 
and MessageFilter objects should be created during initialization and not when 
the container is created (and before the ParserBolt is created).  We should be 
doing this when prepare is called on the ParserBolt.


---


[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...

2018-09-27 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1213#discussion_r220993330
  
--- Diff: 
metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/topology/ParserComponent.java
 ---
@@ -17,26 +17,23 @@
  */
 package org.apache.metron.parsers.topology;
 
-import java.io.Serializable;
-import org.apache.metron.parsers.bolt.WriterHandler;
 import org.apache.metron.parsers.interfaces.MessageFilter;
 import org.apache.metron.parsers.interfaces.MessageParser;
 import org.json.simple.JSONObject;
 
-public class ParserComponents implements Serializable {
+import java.io.Serializable;
+
+public class ParserComponent implements Serializable {
   private static final long serialVersionUID = 7880346740026374665L;
 
   private MessageParser messageParser;
   private MessageFilter filter;
-  private WriterHandler writer;
 
-  public ParserComponents(
+  public ParserComponent(
   MessageParser messageParser,
-  MessageFilter filter,
--- End diff --

WriterHandler was moved out of this class because it's handled by Storm.  
MessageParser and MessageFilter is abstracted away from the streaming platform 
in this PR.


---


[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...

2018-09-27 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1213#discussion_r220993022
  
--- Diff: 
metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/filters/BroMessageFilter.java
 ---
@@ -67,7 +67,7 @@ else if(protocolsObj instanceof List) {
*/
 
   @Override
-  public boolean emitTuple(JSONObject message, Context context) {
+  public boolean emit(JSONObject message, Context context) {
--- End diff --

Renamed this because it should not be specific to Storm.


---


[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...

2018-09-27 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1213#discussion_r220991435
  
--- Diff: 
metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserBolt.java
 ---
@@ -242,169 +226,81 @@ public void prepare(Map stormConf, TopologyContext 
context, OutputCollector coll
 }
   }
 
-  protected void initializeStellar() {
-Context.Builder builder = new Context.Builder()
-
.with(Context.Capabilities.ZOOKEEPER_CLIENT, () -> client)
-.with(Context.Capabilities.GLOBAL_CONFIG, 
() -> getConfigurations().getGlobalConfig())
-.with(Context.Capabilities.STELLAR_CONFIG, 
() -> getConfigurations().getGlobalConfig())
-;
-if(cache != null) {
-  builder = builder.with(Context.Capabilities.CACHE, () -> cache);
-}
-this.stellarContext = builder.build();
-StellarFunctions.initialize(stellarContext);
-  }
-
 
   @SuppressWarnings("unchecked")
   @Override
   public void execute(Tuple tuple) {
 if (TupleUtils.isTick(tuple)) {
-  try {
-for (Entry entry : 
sensorToComponentMap.entrySet()) {
-  entry.getValue().getWriter().flush(getConfigurations(), 
messageGetStrategy);
-}
-  } catch (Exception e) {
-throw new RuntimeException(
-"This should have been caught in the writerHandler.  If you 
see this, file a JIRA", e);
-  } finally {
-collector.ack(tuple);
-  }
+  handleTickTuple(tuple);
   return;
 }
-
+numWritten = 0;
--- End diff --

This counter is implemented as a member variable and reset for each tuple.  


---


[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...

2018-09-27 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1213#discussion_r220988891
  
--- Diff: 
metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserBolt.java
 ---
@@ -242,169 +226,81 @@ public void prepare(Map stormConf, TopologyContext 
context, OutputCollector coll
 }
   }
 
-  protected void initializeStellar() {
-Context.Builder builder = new Context.Builder()
-
.with(Context.Capabilities.ZOOKEEPER_CLIENT, () -> client)
-.with(Context.Capabilities.GLOBAL_CONFIG, 
() -> getConfigurations().getGlobalConfig())
-.with(Context.Capabilities.STELLAR_CONFIG, 
() -> getConfigurations().getGlobalConfig())
-;
-if(cache != null) {
-  builder = builder.with(Context.Capabilities.CACHE, () -> cache);
-}
-this.stellarContext = builder.build();
-StellarFunctions.initialize(stellarContext);
-  }
-
 
   @SuppressWarnings("unchecked")
   @Override
   public void execute(Tuple tuple) {
 if (TupleUtils.isTick(tuple)) {
-  try {
-for (Entry entry : 
sensorToComponentMap.entrySet()) {
-  entry.getValue().getWriter().flush(getConfigurations(), 
messageGetStrategy);
-}
-  } catch (Exception e) {
-throw new RuntimeException(
-"This should have been caught in the writerHandler.  If you 
see this, file a JIRA", e);
-  } finally {
-collector.ack(tuple);
-  }
+  handleTickTuple(tuple);
   return;
 }
-
+numWritten = 0;
 byte[] originalMessage = (byte[]) messageGetStrategy.get(tuple);
+String topic = 
tuple.getStringByField(FieldsConfiguration.TOPIC.getFieldName());
+String sensorType = topicToSensorMap.get(topic);
 try {
-  SensorParserConfig sensorParserConfig;
-  MessageParser parser;
-  String sensor;
-  Map metadata;
-  if (sensorToComponentMap.size() == 1) {
-// There's only one parser, so grab info directly
-Entry sensorParser = 
sensorToComponentMap.entrySet().iterator()
-.next();
-sensor = sensorParser.getKey();
-parser = sensorParser.getValue().getMessageParser();
-sensorParserConfig = getSensorParserConfig(sensor);
-  } else {
-// There's multiple parsers, so pull the topic from the Tuple and 
look up the sensor
-String topic = 
tuple.getStringByField(FieldsConfiguration.TOPIC.getFieldName());
-sensor = topicToSensorMap.get(topic);
-parser = sensorToComponentMap.get(sensor).getMessageParser();
-sensorParserConfig = getSensorParserConfig(sensor);
-  }
+  ParserConfigurations parserConfigurations = getConfigurations();
+  SensorParserConfig sensorParserConfig = 
parserConfigurations.getSensorParserConfig(sensorType);
+  RawMessage rawMessage = RawMessageUtil.INSTANCE.getRawMessage( 
sensorParserConfig.getRawMessageStrategy()
+  , tuple
+  , originalMessage
+  , sensorParserConfig.getReadMetadata()
+  , sensorParserConfig.getRawMessageStrategyConfig()
+  );
+  parserRunner.setOnSuccess(parserResult -> onSuccess(parserResult, 
tuple));
--- End diff --

Setting onSuccess here with the current tuple.  We could also store the 
current tuple in a member variable if we wanted to move this call to prepare.


---


[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...

2018-09-27 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1213#discussion_r220988475
  
--- Diff: 
metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserBolt.java
 ---
@@ -242,169 +226,81 @@ public void prepare(Map stormConf, TopologyContext 
context, OutputCollector coll
 }
   }
 
-  protected void initializeStellar() {
-Context.Builder builder = new Context.Builder()
-
.with(Context.Capabilities.ZOOKEEPER_CLIENT, () -> client)
-.with(Context.Capabilities.GLOBAL_CONFIG, 
() -> getConfigurations().getGlobalConfig())
-.with(Context.Capabilities.STELLAR_CONFIG, 
() -> getConfigurations().getGlobalConfig())
-;
-if(cache != null) {
-  builder = builder.with(Context.Capabilities.CACHE, () -> cache);
-}
-this.stellarContext = builder.build();
-StellarFunctions.initialize(stellarContext);
-  }
-
 
   @SuppressWarnings("unchecked")
   @Override
   public void execute(Tuple tuple) {
 if (TupleUtils.isTick(tuple)) {
-  try {
-for (Entry entry : 
sensorToComponentMap.entrySet()) {
-  entry.getValue().getWriter().flush(getConfigurations(), 
messageGetStrategy);
-}
-  } catch (Exception e) {
-throw new RuntimeException(
-"This should have been caught in the writerHandler.  If you 
see this, file a JIRA", e);
-  } finally {
-collector.ack(tuple);
-  }
+  handleTickTuple(tuple);
   return;
 }
-
+numWritten = 0;
 byte[] originalMessage = (byte[]) messageGetStrategy.get(tuple);
+String topic = 
tuple.getStringByField(FieldsConfiguration.TOPIC.getFieldName());
+String sensorType = topicToSensorMap.get(topic);
 try {
-  SensorParserConfig sensorParserConfig;
-  MessageParser parser;
-  String sensor;
-  Map metadata;
-  if (sensorToComponentMap.size() == 1) {
--- End diff --

I refactored this to always look up the sensor from the topic.  I think 
this makes it easier to read and test.


---


[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...

2018-09-27 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1213#discussion_r220987950
  
--- Diff: 
metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserBolt.java
 ---
@@ -242,169 +226,81 @@ public void prepare(Map stormConf, TopologyContext 
context, OutputCollector coll
 }
   }
 
-  protected void initializeStellar() {
-Context.Builder builder = new Context.Builder()
-
.with(Context.Capabilities.ZOOKEEPER_CLIENT, () -> client)
-.with(Context.Capabilities.GLOBAL_CONFIG, 
() -> getConfigurations().getGlobalConfig())
-.with(Context.Capabilities.STELLAR_CONFIG, 
() -> getConfigurations().getGlobalConfig())
-;
-if(cache != null) {
-  builder = builder.with(Context.Capabilities.CACHE, () -> cache);
-}
-this.stellarContext = builder.build();
-StellarFunctions.initialize(stellarContext);
-  }
-
 
   @SuppressWarnings("unchecked")
   @Override
   public void execute(Tuple tuple) {
 if (TupleUtils.isTick(tuple)) {
-  try {
--- End diff --

Slight refactor here.  I moved this to it's own method to make execute more 
readable.


---


[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...

2018-09-27 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1213#discussion_r220987391
  
--- Diff: 
metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/ParserRunner.java
 ---
@@ -0,0 +1,234 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.parsers;
+
+import com.github.benmanes.caffeine.cache.Cache;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.curator.framework.CuratorFramework;
+import org.apache.metron.common.Constants;
+import org.apache.metron.common.configuration.FieldTransformer;
+import org.apache.metron.common.configuration.FieldValidator;
+import org.apache.metron.common.configuration.ParserConfigurations;
+import org.apache.metron.common.configuration.SensorParserConfig;
+import org.apache.metron.common.error.MetronError;
+import org.apache.metron.common.message.metadata.RawMessage;
+import org.apache.metron.common.utils.ReflectionUtils;
+import org.apache.metron.parsers.filters.Filters;
+import org.apache.metron.parsers.interfaces.MessageFilter;
+import org.apache.metron.parsers.interfaces.MessageParser;
+import org.apache.metron.parsers.topology.ParserComponent;
+import org.apache.metron.stellar.common.CachingStellarProcessor;
+import org.apache.metron.stellar.dsl.Context;
+import org.apache.metron.stellar.dsl.StellarFunctions;
+import org.json.simple.JSONObject;
+
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.UUID;
+import java.util.function.Consumer;
+import java.util.function.Supplier;
+import java.util.stream.Collectors;
+
+public class ParserRunner implements Serializable {
+
+  protected transient Consumer onSuccess;
+  protected transient Consumer onError;
+
+  private HashSet sensorTypes;
+  private Map sensorToParserComponentMap;
+
+  // Stellar variables
+  private transient Context stellarContext;
+
+  public ParserRunner(HashSet sensorTypes) {
+this.sensorTypes = sensorTypes;
+  }
+
+  public Map getSensorToParserComponentMap() {
+return sensorToParserComponentMap;
+  }
+
+  public void setSensorToParserComponentMap(Map 
sensorToParserComponentMap) {
+this.sensorToParserComponentMap = sensorToParserComponentMap;
+  }
+
+  public Context getStellarContext() {
+return stellarContext;
+  }
+
+  public void setOnSuccess(Consumer onSuccess) {
+this.onSuccess = onSuccess;
+  }
+
+  public void setOnError(Consumer onError) {
+this.onError = onError;
+  }
+
+  public void init(CuratorFramework client, Supplier 
parserConfigSupplier) {
+if (parserConfigSupplier == null) {
+  throw new IllegalStateException("A parser config supplier must be 
set before initializing the ParserRunner.");
+}
+initializeStellar(client, parserConfigSupplier);
+initializeParsers(parserConfigSupplier);
+  }
+
+  protected void initializeStellar(CuratorFramework client, 
Supplier parserConfigSupplier) {
--- End diff --

We could make the client optional here.  The only required dependency would 
then be ParserConfigurations.


---


[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...

2018-09-26 Thread merrimanr
GitHub user merrimanr opened a pull request:

https://github.com/apache/metron/pull/1213

METRON-1681: Decouple the ParserBolt from the Parse execution logic

## Contributor Comments
The primary purpose of this PR is to create a parser container abstraction 
that is decoupled from Storm.  This parser container (I call it ParserRunner, 
anyone have a better name?) is responsible for:

- Instantiating the MessageParser and MessageFilter objects for each sensor.
- Initializing the Stellar environment.
- Accepting a raw binary message along with parser configurations and 
calling the MessageParser.parseOptional method for the appropriate sensor type
- Process each message.  Most of this logic was migrated from the 
ParserBolt.execute method.  
- Execute callbacks depending on the processing result of each message

Configuration is external to this abstraction.  A configuration supplier is 
passed in when initializing and a configuration object is passed in when 
processing each message.  I believe this was originally done because we want 
message processing to be atomic without the configuration unexpectedly 
changing.  We can easily change to a message supplier during execute if 
necessary.  A CuratorFramework object is also required for setting up Stellar 
but we could easily make this optional.  

I decided to keep writing out of the abstraction in this PR.  I can 
envision different platforms having different requirements or needs for sending 
along messages after they are parsed.  Therefore the ParserBolt still handles 
writing messages to Kafka.  If we do decide we want to add writing to our 
abstraction we could do it in a follow on PR to keep this from becoming even 
bigger.

Since all of the post parsing logic was in the Parserbolt, messages could 
be written as they were processed rather than having to wait for all messages 
to be processed. To maintain this behavior I added 2 callback functions in the 
form of Java Consumers:  onSuccess and OnError.  The other option would be to 
make message processing synchronous and just return a list of results.

This lifecycle of this container looks like:

1. Container is created from a collection of sensor types 
2. Container is initialized with the init method that accepts a Curator 
client and a configuration Supplier.  This in turn sets up Stellar and 
instantiates MessageParser and MessageFilter classes.
3. Container is ready and accepts messages for processing and calls the 
appropriate callbacks.

Because this splits the ParserBolt into 2 different classes much of the 
ParserBoltTest unit test didn't make sense anymore.  I ended up essentially 
rewriting it and also creating a unit test for ParserRunner.  I tried to 
represent all the original tests in ParserBoltTest and have 95%+ coverage.  If 
I missed any cases or made undesirable style changes, let me know.

### Changes Included

- ParserRunner abstraction that is decoupled from the ParserBolt.  The 
ParserBolt now initializes the ParserRunner and defers parsing to that class.  
The only thing required is Metron configuration and a Curator client (which 
could be optional)
- Refactored ParserBolt that sets up the ParserRunner, passes message to it 
for processing, and writes results to Kafka.
- MessageParser and MessageFilter objects are now created when Storm calls 
prepare, avoiding serialization issues 
(https://issues.apache.org/jira/browse/METRON-1793)
- Updated unit and integration tests

### Testing
I have done basic testing in full dev and will add a testing plan soon.  
You should be able to spin up full dev and see bro and snort data in ES.

 ### Next Steps
This PR contains working code but is still needs documentation.  I am 
planning on testing a couple different parsers in full dev (grok for example) 
in addition to what I've already tested.  I will be adding inline comments for 
some of the less obvious changes or refactors to make it easier to review.  My 
plan is for any discussion around specific parts of the code to get added as 
javadocs eventually.  I also think we should add some developer documentation 
to make it easier for maintaining and integrating this into other platforms.  I 
imagine a lot of info in this description would make it in there as well.

The intention for now is to get some feedback on the overall approach and 
get people thinking about it.  I'm still working on documentation and will add 
that soon.  Let me know what you think!

## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.  
Please refer to our [Development 
Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235)
 for the complete guide to follow for contributions.  
Please refer also to our [Build Verification 
Guidelines](https://cwiki.apache.org/confluence

[GitHub] metron pull request #1190: METRON-1771: Update REST endpoints to support eve...

2018-09-13 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1190#discussion_r217462174
  
--- Diff: 
metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/dao/ElasticsearchUpdateDaoTest.java
 ---
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.metron.elasticsearch.dao;
+
+import org.apache.metron.indexing.dao.AccessConfig;
+import org.apache.metron.indexing.dao.UpdateDaoTest;
+import org.apache.metron.indexing.dao.update.UpdateDao;
+import org.elasticsearch.client.transport.TransportClient;
+import org.junit.Before;
+
+import static org.mockito.Mockito.mock;
+
+public class ElasticsearchUpdateDaoTest extends UpdateDaoTest {
--- End diff --

No problem


---


[GitHub] metron pull request #1190: METRON-1771: Update REST endpoints to support eve...

2018-09-13 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1190#discussion_r217462087
  
--- Diff: 
metron-platform/metron-indexing/src/test/java/org/apache/metron/indexing/InMemoryMetaAlertRetrieveLatestDao.java
 ---
@@ -0,0 +1,45 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.indexing;
+
+import org.apache.metron.indexing.dao.IndexDao;
+import org.apache.metron.indexing.dao.metaalert.MetaAlertRetrieveLatestDao;
+import org.apache.metron.indexing.dao.search.GetRequest;
+import org.apache.metron.indexing.dao.update.Document;
+
+import java.io.IOException;
+import java.util.List;
+
+public class InMemoryMetaAlertRetrieveLatestDao implements 
MetaAlertRetrieveLatestDao {
--- End diff --

Sure no problem.


---


[GitHub] metron pull request #1190: METRON-1771: Update REST endpoints to support eve...

2018-09-13 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1190#discussion_r217459943
  
--- Diff: 
metron-platform/metron-indexing/src/main/java/org/apache/metron/indexing/dao/MultiIndexDao.java
 ---
@@ -282,4 +276,27 @@ public Document getLatest(final String guid, String 
sensorType) throws IOExcepti
   public List getIndices() {
 return indices;
   }
+
+  private Document getDocument(List documentContainers) 
throws IOException {
+Document ret = null;
+List error = new ArrayList<>();
+for(DocumentContainer dc : documentContainers) {
+  if(dc.getException().isPresent()) {
+Throwable e = dc.getException().get();
+error.add(e.getMessage() + "\n" + ExceptionUtils.getStackTrace(e));
+  }
+  else {
+if(dc.getDocument().isPresent()) {
+  Document d = dc.getDocument().get();
+  if(ret == null || ret.getTimestamp() < d.getTimestamp()) {
+ret = d;
+  }
+}
+  }
+}
--- End diff --

I like it.  Will make that change.


---


[GitHub] metron pull request #1190: METRON-1771: Update REST endpoints to support eve...

2018-09-13 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/1190#discussion_r217459852
  
--- Diff: 
metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/dao/ElasticsearchUpdateDaoTest.java
 ---
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.metron.elasticsearch.dao;
+
+import org.apache.metron.indexing.dao.AccessConfig;
+import org.apache.metron.indexing.dao.UpdateDaoTest;
+import org.apache.metron.indexing.dao.update.UpdateDao;
+import org.elasticsearch.client.transport.TransportClient;
+import org.junit.Before;
+
+import static org.mockito.Mockito.mock;
+
+public class ElasticsearchUpdateDaoTest extends UpdateDaoTest {
--- End diff --

Check out UpdateDaoTest.  As I created tests for ElasticsearchUpdateDao, 
SolrUpdateDao, and HBaseDao I found myself copy/pasting the various tests.  So 
I consolidated them into a single test.  I believe there is value in having a 
single sets of tests for the UpdateDao interface methods that all the Daos must 
satisfy.  This class sets up it's Dao implementation to be used in the tests.


---


[GitHub] metron issue #1190: METRON-1771: Update REST endpoints to support eventually...

2018-09-12 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1190
  
The latest commit updates the various places where looking up a document 
that doesn't exist returns null.  Now an IOException is thrown with a helpful 
message and guid of the missing object.  I added tests to cover these cases.  I 
also added changes that will throw an exception when alerts to be add/removed 
don't exist.  Previously that wasn't covered and the operation would succeed 
even though the missing alert wasn't added/removed.

For the parallel stream issue, I made it consistent with the rest of the 
class.  It now uses the same approach as the getLatest method and uses the 
DocumentContainer class.  As part of this I moved common code to a separate 
method.

Let me know what you think about these changes.


---


  1   2   3   4   5   >