[GitHub] metron pull request #1281: METRON-1895: Add Knox SSO as an option in Metron
GitHub user merrimanr opened a pull request: https://github.com/apache/metron/pull/1281 METRON-1895: Add Knox SSO as an option in Metron ## Contributor Comments This PR enables Metron to use Knox SSO for authentication. This PR depends on https://github.com/apache/metron/pull/1275 and, in it's current state, is meant only as a way to demonstrate this capability. ### Testing 1. Follow the instructions in the "Testing" section of https://github.com/apache/metron/pull/1275 to add Metron as a Knox service. Once you can access REST and the UIs through Knox you are ready for the next steps. 2. Add the Metron REST host to the Knox SSO white list. Navigate to the `Advanced knoxsso-topology` setting in Ambari at `Services > Knox > Configs > Advanced knoxsso-topology`. This property contains the complete Knox SSO topology xml. Update the `knoxsso.redirect.whitelist.regex` param value to include `node1`. The complete xml should look like this: ``` webappsec WebAppSec true xframe.options.enabledtrue authentication ShiroProvider true sessionTimeout 30 redirectToUrl /gateway/knoxsso/knoxauth/login.html restrictedCookies rememberme,WWW-Authenticate main.ldapRealm org.apache.hadoop.gateway.shirorealm.KnoxLdapRealm main.ldapContextFactory org.apache.hadoop.gateway.shirorealm.KnoxLdapContextFactory main.ldapRealm.contextFactory $ldapContextFactory main.ldapRealm.userDnTemplate uid={0},ou=people,dc=hadoop,dc=apache,dc=org main.ldapRealm.contextFactory.url ldap://localhost:33389 main.ldapRealm.authenticationCachingEnabled false main.ldapRealm.contextFactory.authenticationMechanism simple urls./** authcBasic identity-assertion Default true knoxauth KNOXSSO knoxsso.cookie.secure.only false knoxsso.token.ttl 3 knoxsso.redirect.whitelist.regex ^https?:\/\/(localhost|127\.0\.0\.1|0:0:0:0:0:0:0:1|::1|node1):[0-9].*$ ``` Save and restart Knox. 3. Update the `metron.xml` topology file at `/usr/hdp/current/knox-server/conf/topologies/metron.xml` to use Knox SSO authentication. Replace this: ``` authentication ShiroProvider true sessionTimeout 30 main.ldapRealm org.apache.hadoop.gateway.shirorealm.KnoxLdapRealm main.ldapRealm.userDnTemplate uid={0},ou=people,dc=hadoop,dc=apache,dc=org main.ldapRealm.contextFactory.url ldap://localhost:33389 main.ldapRealm.contextFactory.authenticationMechanism simple urls./** authcBasic ``` With this: ``` federation SSOCookieProvider true sso.authentication.provider.url https://node1:8443/gateway/knoxsso/api/v1/websso ``` 4. Extract the Knox public ke
[GitHub] metron issue #1265: METRON-1874 Create a Parser Debugger
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1265 Love it. I tested it in full dev and worked great. +1 ---
[GitHub] metron pull request #1265: METRON-1874 Create a Parser Debugger
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1265#discussion_r235005842 --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/ParserRunnerImpl.java --- @@ -87,13 +87,13 @@ public boolean isError() { protected transient Consumer onSuccess; protected transient Consumer onError; - private HashSet sensorTypes; + private Set sensorTypes; --- End diff -- Shouldn't this be serializable? ---
[GitHub] metron pull request #1275: METRON-1878: Add Metron as a Knox service
GitHub user merrimanr opened a pull request: https://github.com/apache/metron/pull/1275 METRON-1878: Add Metron as a Knox service ## Contributor Comments This PR adds REST and the Alerts UI as services in Knox. Currently this is incomplete and meant as a starting point to give reviewers an idea of the changes involved. The outstanding items are detailed below. ### Changed Included - Fixed minor path issues in various UI related code. Files include: - app.component.html - path should be relative - authentication.service.ts - logout should match other endpoints - alert-search.directive.ts - path should be relative - index.html - a relative base href means Knox does not have to rewrite links - Created Metron Knox topology and service definition files for REST and Alerts UI. The service definition files are relatively simple and should look similar to other services. - Created a script to deploy Metron files to Knox - Added a Spring "knox" profile that configures Swagger with the right endpoint paths - Added service definition files to rpm spec ### Testing This feature can be tested in full dev with some manual configuration. After spinning up full dev: 1. Add Knox as a Service in Ambari (Admin > Stacks and Versions > Add Service next to Knox) 2. Follow the instructions [here](https://github.com/apache/metron/tree/master/metron-deployment/development#knox-demo-ldap) to enable the Knox LDAP in full dev. You should be able to authentication with REST using guest/guest-password after this is done. 3. Install the Metron topology and service definition files by running the script at `/usr/metron/0.6.1/bin/install_metron_knox.sh`. This script copies the metron.xml topology file to `/usr/hdp/current/knox-server/conf/topologies` and the service definition files to `/usr/hdp/current/knox-server/data/services/metron-rest/0.6.1` and `/usr/hdp/current/knox-server/data/services/metron-alerts/0.6.1` 4. Update the REST API path for the Alerts UI. Currently expressjs proxies REST requests from the Alerts UI but this will now be done by Knox instead. The `apiRoot` setting in `app-config.json` is used to construct the url for REST requests. Change this setting in `/usr/metron/0.6.1/web/alerts-ui/assets/app-config.json` from: ``` { "apiRoot": "/api/v1" } ``` to: ``` { "apiRoot": "/gateway/metron/metron-rest/api/v1" } ``` 5. Update the REST API path for Swagger. The Swagger interface is served by the REST application and provides links for the various endpoints. Swagger must be configured with the new Knox root path for REST. Navigate to `Ambari > Metron > Configs > REST`. Change the `Active Spring profiles` setting from `dev` to `dev,knox` and change the `Metron Spring options` to `--knox.root=/gateway/metron/metron-rest`. This configures Swagger to prepend the endpoint paths with the `knox.root` Spring property. 6. Restart REST with Ambari and you should now be able to access both Swagger and the Alerts UI through Knox: - Swagger - https://node1:8443/gateway/metron/metron-rest/swagger-ui.html - Alerts UI - https://node1:8443/gateway/metron/metron-alerts/alerts-list Everything should function normally including login and logout. ### Outstanding Items Installing Metron in Knox Installing a service in Knox involves 2 steps: 1. Deploy the service definition files mentioned earlier by copying them to Knox directories on the machine where Knox is installed 2. Either update an existing Knox topology file or create a new topology file. This file configures a collection of services. In our case this includes authentication and urls to the services exposed. What is the best way to install these files in Knox? The approach in this PR only works if Metron is installed on the same machine as the Knox gateway. I also opted to create a dedicated Metron topology file. I think this will give us more control and allow us to expose properties in a more user-friendly way. Knox has a default topology that you can configure in Ambari but it's exposed as is: a big xml file. Does anyone think we should use the default topology file instead? Adding Knox to the stack This process requires that we add Knox separately through Ambari. Do we want to make Knox a dependency for Metron similar to Kafka, Storm, etc? Does it make sense to require that Knox and Metron are colocated? Doing this simplifies installation and configuration and I would vote we make this a requirement unless there is a good reason not to. Does anyone think there are cases where users would need to install them on different machines? If we do that, how do we install the service definition fi
[GitHub] metron issue #1266: METRON-1875: Expose configurable global settings in the ...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1266 I addressed the style issues and added a comment about APP_INITIALIZER and promises. Let me know what you think. ---
[GitHub] metron pull request #1266: METRON-1875: Expose configurable global settings ...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1266#discussion_r234293206 --- Diff: metron-interface/metron-alerts/src/app/service/app-config.service.ts --- @@ -0,0 +1,40 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +import { Injectable } from '@angular/core'; +import { HttpClient } from '@angular/common/http'; + +@Injectable() +export class AppConfigService { + + private appConfig; + + constructor(private http: HttpClient) { } + + loadAppConfig() { --- End diff -- The approach in this PR was inspired by https://juristr.com/blog/2018/01/ng-app-runtime-config/. It mentions promises are not supported in APP_INITIALIZER. I tried your suggestion and it did not work. I can add a comment explaining that. ---
[GitHub] metron pull request #1266: METRON-1875: Expose configurable global settings ...
GitHub user merrimanr opened a pull request: https://github.com/apache/metron/pull/1266 METRON-1875: Expose configurable global settings in the Alerts UI ## Contributor Comments This PR exposes a JSON file that can be used to configure the Alerts UI. Properties in this file are read on startup and available through an `AppConfigService` that can be injected. The use case in this PR is a configurable REST API path. This path was hardcoded throughout the Alerts UI code but can now be changed in the configuration file directly. A restart or rebuild is not necessary, the change is visible as soon as the Alerts UI is refreshed in the browser. I think this could be generally useful for other use cases in the future. ### Changes Included - An `AppConfigService` was added that loads configuration settings by requesting the config file as a static asset. - The various services were updated to use the `apiRoot` configuration setting instead of a hardcoded url. - We currently load column names from local storage on startup. I don't believe this is used anymore so I switched the initialization call to `AppConfigService` and removed the unused code in `ColumnNameService`. - Updated the unit tests to inject a mock `AppConfigService` instance. ### Testing To test this feature, spin up full dev and verify that the UI continues to function normally. Edit the file at `/usr/metron/0.6.1/web/alerts-ui/assets/app-config.json' and set the `apiRoot` property to a different value. Refresh the Alerts UI and verify the api path reflects the change in `app-config.json`. For example if I changed `app-config.json` to be: ``` { "apiRoot": "/some/api/path" } ``` I would expect to see the search request url (in the network tab of Chrome developer tools for example) to be `http://node1:4201/some/api/path/search/search`. ## Pull Request Checklist Thank you for submitting a contribution to Apache Metron. Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides. In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following: ### For all changes: - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [x] Does your PR title start with METRON- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For code changes: - [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed? - [x] Have you included steps or a guide to how the change may be verified and tested manually? - [x] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via: ``` mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh ``` - [x] Have you written or updated unit tests and or integration tests to verify your changes? - [x] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent? ### For documentation related changes: - [x] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and the verify changes via `site-book/target/site/index.html`: ``` cd site-book mvn site ``` Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/merrimanr/incubator-metron METRON-1875 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/metron/pull/1266.p
[GitHub] metron pull request #1263: METRON-1870: Intermittent Stellar REST test failu...
GitHub user merrimanr opened a pull request: https://github.com/apache/metron/pull/1263 METRON-1870: Intermittent Stellar REST test failures ## Contributor Comments This PR attempts to fix the intermittent test failures reported in the Jira. For this failure: ``` restGetShouldTimeoutWithSuppliedTimeout(org.apache.metron.stellar.dsl.functions.RestFunctionsTest) Time elapsed: 0.336 sec <<< ERROR! org.apache.metron.stellar.dsl.ParseException: Unable to parse: REST_GET('http://localhost:41396/get', { "timeout": 1 } ) due to: Connection is not open ``` I believe the root cause is that we're getting an IllegalStateException in some cases where we're expecting an IOException. The HttpGet.abort() method is called within an ScheduledExecutorService so it's unpredictable when exactly the request is aborted. I changed the caught exception to be general so hopefully this gives us the result we expect. For this failure: ``` restGetShouldHandleIOException(org.apache.metron.stellar.dsl.functions.RestFunctionsTest) Time elapsed: 0.314 sec <<< FAILURE! java.lang.AssertionError: expected null, but was:<{get=success}> ``` An IOException is being simulated by setting the socket timeout really low and executing a request against the mock server. Looks like this is not enough to make it consistently fail. I changed the test to spy on the doGet method and throw an IOException so that it fails every time. I was not able to reproduce these problems locally but I think they will resolve the issues. ## Pull Request Checklist Thank you for submitting a contribution to Apache Metron. Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides. In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following: ### For all changes: - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [x] Does your PR title start with METRON- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For code changes: - [ ] Have you included steps to reproduce the behavior or problem that is being changed or addressed? - [ ] Have you included steps or a guide to how the change may be verified and tested manually? - [ ] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via: ``` mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh ``` - [x] Have you written or updated unit tests and or integration tests to verify your changes? - [x] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [ ] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent? ### For documentation related changes: - [ ] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and the verify changes via `site-book/target/site/index.html`: ``` cd site-book mvn site ``` Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/merrimanr/incubator-metron METRON-1870 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/metron/pull/1263.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1263 commit b2dd4232e997eef23d0ba7350caa49d52ce5e929 Author: merrimanr Date: 2018-11-14T15:09:21Z initial commit ---
[GitHub] metron issue #1262: METRON-1871 Cannot Run Elasticsearch Integration Tests i...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1262 I find myself commenting these out every time I run ES integration tests in my IDE. +1 ---
[GitHub] metron pull request #1250: METRON-1850: Stellar REST function
Github user merrimanr closed the pull request at: https://github.com/apache/metron/pull/1250 ---
[GitHub] metron pull request #1250: METRON-1850: Stellar REST function
GitHub user merrimanr reopened a pull request: https://github.com/apache/metron/pull/1250 METRON-1850: Stellar REST function ## Contributor Comments This PR adds a Stellar REST function that can be used to enrich messages with data from 3rd party REST services. This function leverages the Apache HttpComponents library for Http requests. The function call follows this format: `REST_GET(rest uri, optional rest settings)` There are a handful of settings including basic authentication credentials, proxy support, and timeouts. These settings are included in the `RestConfig` class. Any of these settings can be defined in the global config under the `stellar.rest.settings` property and will override any default values. The global config settings can also be overridden by passing in a config object to the expression. This will allow support for multiple REST services that may have different requirements. Responses are expected to be in JSON format. Successful requests will return a MAP object in Stellar. Errors will be logged and null will be returned by default. There are ways to override this behavior by configuring a list of acceptable status codes and/or values to be returned on error or empty responses. Considering this will be used on streaming data, how we handle timeouts is important. This function exposes HttpClient timeout settings but those are not enough to keep unwanted latency from being introduced. A hard timeout is implemented in the function to abort a request if the timeout is exceeded. The idea is to guarantee the total request time will not exceed a configured value. ### Changes Included - HttpClient capability added to Stellar - HttpClient setup added to the various bolts and Stellar REPL - Utility added for setting up a pooling HttpClient (and possibly other types of clients in the future) - Configuration mechanism added for Stellar REST function including settings for authentication, proxy support, timeouts and other settings - Function implementation and appropriate unit/integration tests (both unit tests and integration tests are included) ### Testing There are several different ways to test this feature and I would encourage reviewers to get creative and look for cases I may not have thought of. For my testing, I used an online Http service that provides simple endpoints for simulating different use cases: `http://httpbin.org/#/`. Feel free to try your own or use this one. I tested this in full dev using the Stellar REPL and the parser and enrichment topologies. First you need to perform a couple setup steps: 1. Spin up full dev and ensure everything comes up and data is being indexed 2. Ssh to full dev and install the Squid proxy server: ``` yum -y install squid ``` 3. Create a password file that Squid can use for basic authentication ``` yum -y install httpd-tools touch /etc/squid/passwd && chown squid /etc/squid/passwd htpasswd /etc/squid/passwd user # (Will prompt for a password) ``` 4. Configure Squid for basic authentication by adding these lines to `/etc/squid/squid.conf`, under the lines with `acl Safe_ports*`: ``` auth_param basic program /usr/lib64/squid/ncsa_auth /etc/squid/passwd auth_param basic children 5 auth_param basic realm Squid Basic Authentication auth_param basic credentialsttl 2 hours acl auth_users proxy_auth REQUIRED http_access allow auth_users ``` 5. Start Squid and verify it is working correctly: ``` service squid restart curl --proxy-user user:password -x node1:3128 http://www.google.com/ ``` 6. Next create password files in HDFS: ``` su hdfs cd ~ echo -n 'passwd' > basicPassword.txt hdfs dfs -put basicPassword.txt /apps/metron echo -n 'password' > proxyPassword.txt hdfs dfs -put proxyPassword.txt /apps/metron exit ``` To test with the Stellar REPL, follow these steps: 1. Start the Stellar REPL and verify the `REST_GET` function is available: ``` /usr/metron/0.6.1/bin/stellar --zookeeper node1:2181 [Stellar]>>> %functions REST REST_GET ``` 2. Test a simple get request: ``` [Stellar]>>> REST_GET('http://httpbin.org/get') {args={}, headers={Accept=application/json, Accept-Encoding=gzip,deflate, Cache-Control=max-age=259200, Connection=close, Host=httpbin.org, User-Agent=Apache-HttpClient/4.3.2 (java 1.5)}, origin=127.0.0.1, 136.62.241.236, url=http://httpbin.org/get} ``` 3. Test a get request with basic authentication: ``` [Stellar]>>> config := {'basic.auth.user':'user','basic.auth.password.path':'/apps/metron/basicPassword.txt'} {basic.auth.user=user, basic.auth.password.path=/apps/metron/basicPassword.txt} [Stellar]>>> REST_GET('
[GitHub] metron issue #1250: METRON-1850: Stellar REST function
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1250 The latest commit incorporates the changes from https://github.com/apache/metron/pull/1251 and moves the httpclient inside the function. I was able to verify proper closing in the REPL and the EnrichmentIntegrationTest. I was not able to verify in the Storm topologies because Storm does not guarantee these methods will be called in production. I added code comments in the bolts where we are calling `StellarFunctions.close()`. I spun this up in full dev again and ran through the test plan. Everything continued to work as expected. ---
[GitHub] metron issue #1251: METRON-1853: Add shutdown hook to Stellar BaseFunctionRe...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1251 +1 ---
[GitHub] metron issue #1253: METRON-1857 Fix Metaalert Nested Alert Field Name in Ind...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1253 I tested this in full dev. +1 ---
[GitHub] metron issue #1253: METRON-1857 Fix Metaalert Nested Alert Field Name in Ind...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1253 I'm not sure if you know the answer to this @nickwallen because it predates this PR, but is the intention to convert all `metron_alert.*` fields to keyword types? I can see the motivation behind doing this because we may not be aware of all field types in the various sensors that could be added to a metaalert. Maybe @justinleet knows? If my assumption is true then we need another small change to make that happen. Currently the `match_mapping_type` attribute is set to `string` which will only convert string types. If we want to convert all fields, it needs to be: ``` "dynamic_templates": [ { "alert_template": { "path_match": "metron_alert.*", "match_mapping_type": "*", "mapping": { "type": "keyword" } } ``` Notice `string` has been changed to `*`. ---
[GitHub] metron issue #1249: METRON-1815: Separate metron-parsers into metron-parsers...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1249 Can you give some examples of code that fits into that category? ---
[GitHub] metron pull request #1251: METRON-1853: Add shutdown hook to Stellar BaseFun...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1251#discussion_r230484974 --- Diff: metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/dsl/StellarFunctions.java --- @@ -30,4 +30,8 @@ public static FunctionResolver FUNCTION_RESOLVER() { public static void initialize(Context context) { SingletonFunctionResolver.getInstance().initialize(context); } + + public static void teardown(Context context) { --- End diff -- Should this be: ``` public static void teardown() { SingletonFunctionResolver.getInstance().teardown(); } ``` ---
[GitHub] metron pull request #1250: METRON-1850: Stellar REST function
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1250#discussion_r230468297 --- Diff: metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/dsl/functions/RestFunctions.java --- @@ -0,0 +1,351 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.stellar.dsl.functions; + +import org.apache.commons.io.IOUtils; +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.FSDataInputStream; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; +import org.apache.http.HttpEntity; +import org.apache.http.HttpHost; +import org.apache.http.auth.AuthScope; +import org.apache.http.auth.UsernamePasswordCredentials; +import org.apache.http.client.CredentialsProvider; +import org.apache.http.client.config.RequestConfig; +import org.apache.http.client.methods.CloseableHttpResponse; +import org.apache.http.client.methods.HttpGet; +import org.apache.http.client.protocol.HttpClientContext; +import org.apache.http.impl.client.BasicCredentialsProvider; +import org.apache.http.impl.client.CloseableHttpClient; +import org.apache.http.util.EntityUtils; +import org.apache.metron.stellar.common.utils.ConversionUtils; +import org.apache.metron.stellar.common.utils.JSONUtils; +import org.apache.metron.stellar.dsl.Context; +import org.apache.metron.stellar.dsl.ParseException; +import org.apache.metron.stellar.dsl.Stellar; +import org.apache.metron.stellar.dsl.StellarFunction; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.ByteArrayInputStream; +import java.io.IOException; +import java.lang.invoke.MethodHandles; +import java.net.URI; +import java.net.URISyntaxException; +import java.nio.charset.StandardCharsets; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.concurrent.Executors; +import java.util.concurrent.ScheduledExecutorService; +import java.util.concurrent.ScheduledFuture; +import java.util.concurrent.TimeUnit; + +import static java.lang.String.format; +import static org.apache.metron.stellar.dsl.Context.Capabilities.GLOBAL_CONFIG; +import static org.apache.metron.stellar.dsl.functions.RestConfig.STELLAR_REST_SETTINGS; + +/** + * Defines functions that enable REST requests with proper result and error handling. Depends on an + * Apache HttpComponents client being supplied as a Stellar HTTP_CLIENT capability. Exposes various Http settings + * including authentication, proxy and timeouts through the global config with the option to override any settings + * through a config object supplied in the expression. + */ +public class RestFunctions { + + private static final Logger LOG = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + + /** + * Retrieves the ClosableHttpClient from the execution context. + * + * @param context The execution context. + * @return A ClosableHttpClient, if one exists. Otherwise, an exception is thrown. + */ + private static CloseableHttpClient getHttpClient(Context context) { +Optional clientOpt = context.getCapability(Context.Capabilities.HTTP_CLIENT); +if(clientOpt.isPresent()) { + return (CloseableHttpClient) clientOpt.get(); +} else { + throw new IllegalStateException("Missing HTTP_CLIENT; http connection required"); +} + } + + /** + * Get an argument from a list of arguments. + * + * @param index The index within the list of arguments. + * @param clazz The type expected. + * @param args All of the arguments. + * @param The type of the argument expected. + */ + public static T getArg(int index, Class clazz, List args) { + +if(index >= args.size()) { + throw new IllegalArgumentException(format("Expec
[GitHub] metron issue #1250: METRON-1850: Stellar REST function
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1250 I think we should close it, docs are pretty clear about that. I think it's possible connections could be left open on the server side. Moving the code out of the bolt is an organization/style issue. I think there should be a functional justification like there is for closing a client. ---
[GitHub] metron issue #1250: METRON-1850: Stellar REST function
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1250 We do shut down Zookeeper connections here: https://github.com/apache/metron/blob/master/metron-platform/metron-common/src/main/java/org/apache/metron/common/bolt/ConfiguredBolt.java#L129. The Apache HttpComponents docs recommend closing clients: http://hc.apache.org/httpcomponents-client-4.5.x/tutorial/html/fundamentals.html#d5e217. ---
[GitHub] metron pull request #1250: METRON-1850: Stellar REST function
Github user merrimanr closed the pull request at: https://github.com/apache/metron/pull/1250 ---
[GitHub] metron pull request #1250: METRON-1850: Stellar REST function
GitHub user merrimanr reopened a pull request: https://github.com/apache/metron/pull/1250 METRON-1850: Stellar REST function ## Contributor Comments This PR adds a Stellar REST function that can be used to enrich messages with data from 3rd party REST services. This function leverages the Apache HttpComponents library for Http requests. The function call follows this format: `REST_GET(rest uri, optional rest settings)` There are a handful of settings including basic authentication credentials, proxy support, and timeouts. These settings are included in the `RestConfig` class. Any of these settings can be defined in the global config under the `stellar.rest.settings` property and will override any default values. The global config settings can also be overridden by passing in a config object to the expression. This will allow support for multiple REST services that may have different requirements. Responses are expected to be in JSON format. Successful requests will return a MAP object in Stellar. Errors will be logged and null will be returned by default. There are ways to override this behavior by configuring a list of acceptable status codes and/or values to be returned on error or empty responses. Considering this will be used on streaming data, how we handle timeouts is important. This function exposes HttpClient timeout settings but those are not enough to keep unwanted latency from being introduced. A hard timeout is implemented in the function to abort a request if the timeout is exceeded. The idea is to guarantee the total request time will not exceed a configured value. ### Changes Included - HttpClient capability added to Stellar - HttpClient setup added to the various bolts and Stellar REPL - Utility added for setting up a pooling HttpClient (and possibly other types of clients in the future) - Configuration mechanism added for Stellar REST function including settings for authentication, proxy support, timeouts and other settings - Function implementation and appropriate unit/integration tests (both unit tests and integration tests are included) ### Testing There are several different ways to test this feature and I would encourage reviewers to get creative and look for cases I may not have thought of. For my testing, I used an online Http service that provides simple endpoints for simulating different use cases: `http://httpbin.org/#/`. Feel free to try your own or use this one. I tested this in full dev using the Stellar REPL and the parser and enrichment topologies. First you need to perform a couple setup steps: 1. Spin up full dev and ensure everything comes up and data is being indexed 2. Ssh to full dev and install the Squid proxy server: ``` yum -y install squid ``` 3. Create a password file that Squid can use for basic authentication ``` yum -y install httpd-tools touch /etc/squid/passwd && chown squid /etc/squid/passwd htpasswd /etc/squid/passwd user # (Will prompt for a password) ``` 4. Configure Squid for basic authentication by adding these lines to `/etc/squid/squid.conf`, under the lines with `acl Safe_ports*`: ``` auth_param basic program /usr/lib64/squid/ncsa_auth /etc/squid/passwd auth_param basic children 5 auth_param basic realm Squid Basic Authentication auth_param basic credentialsttl 2 hours acl auth_users proxy_auth REQUIRED http_access allow auth_users ``` 5. Start Squid and verify it is working correctly: ``` service squid restart curl --proxy-user user:password -x node1:3128 http://www.google.com/ ``` 6. Next create password files in HDFS: ``` su hdfs cd ~ echo passwd > basicPassword.txt hdfs dfs -put basicPassword.txt /apps/metron echo password > proxyPassword.txt hdfs dfs -put proxyPassword.txt /apps/metron exit ``` To test with the Stellar REPL, follow these steps: 1. Start the Stellar REPL and verify the `REST_GET` function is available: ``` /usr/metron/0.6.1/bin/stellar --zookeeper node1:2181 [Stellar]>>> %functions REST REST_GET ``` 2. Test a simple get request: ``` [Stellar]>>> REST_GET('http://httpbin.org/get') {args={}, headers={Accept=application/json, Accept-Encoding=gzip,deflate, Cache-Control=max-age=259200, Connection=close, Host=httpbin.org, User-Agent=Apache-HttpClient/4.3.2 (java 1.5)}, origin=127.0.0.1, 136.62.241.236, url=http://httpbin.org/get} ``` 3. Test a get request with basic authentication: ``` [Stellar]>>> config := {'basic.auth.user':'user','basic.auth.password.path':'/apps/metron/basicPassword.txt'} {basic.auth.user=user, basic.auth.password.path=/apps/metron/basicPassword.txt} [Stellar]>>> REST_GET('
[GitHub] metron pull request #1250: METRON-1850: Stellar REST function
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1250#discussion_r230367152 --- Diff: metron-platform/metron-common/src/main/java/org/apache/metron/common/bolt/ConfiguredEnrichmentBolt.java --- @@ -17,18 +17,41 @@ */ package org.apache.metron.common.bolt; +import java.io.IOException; import java.lang.invoke.MethodHandles; +import java.util.Map; + +import org.apache.http.impl.client.CloseableHttpClient; import org.apache.metron.common.configuration.EnrichmentConfigurations; +import org.apache.metron.stellar.common.utils.HttpClientUtils; +import org.apache.storm.task.OutputCollector; +import org.apache.storm.task.TopologyContext; import org.slf4j.Logger; import org.slf4j.LoggerFactory; public abstract class ConfiguredEnrichmentBolt extends ConfiguredBolt { private static final Logger LOG = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + protected CloseableHttpClient httpClient; public ConfiguredEnrichmentBolt(String zookeeperUrl) { super(zookeeperUrl, "ENRICHMENT"); } + @Override + public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) { +super.prepare(stormConf, context, collector); --- End diff -- Adding a shutdown hook in Stellar is not trivial and out of scope for this PR in my opinion. I would prefer that be a follow on. ---
[GitHub] metron pull request #1250: METRON-1850: Stellar REST function
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1250#discussion_r230366857 --- Diff: metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/dsl/functions/RestFunctions.java --- @@ -0,0 +1,351 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.stellar.dsl.functions; + +import org.apache.commons.io.IOUtils; +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.FSDataInputStream; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; +import org.apache.http.HttpEntity; +import org.apache.http.HttpHost; +import org.apache.http.auth.AuthScope; +import org.apache.http.auth.UsernamePasswordCredentials; +import org.apache.http.client.CredentialsProvider; +import org.apache.http.client.config.RequestConfig; +import org.apache.http.client.methods.CloseableHttpResponse; +import org.apache.http.client.methods.HttpGet; +import org.apache.http.client.protocol.HttpClientContext; +import org.apache.http.impl.client.BasicCredentialsProvider; +import org.apache.http.impl.client.CloseableHttpClient; +import org.apache.http.util.EntityUtils; +import org.apache.metron.stellar.common.utils.ConversionUtils; +import org.apache.metron.stellar.common.utils.JSONUtils; +import org.apache.metron.stellar.dsl.Context; +import org.apache.metron.stellar.dsl.ParseException; +import org.apache.metron.stellar.dsl.Stellar; +import org.apache.metron.stellar.dsl.StellarFunction; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.ByteArrayInputStream; +import java.io.IOException; +import java.lang.invoke.MethodHandles; +import java.net.URI; +import java.net.URISyntaxException; +import java.nio.charset.StandardCharsets; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.concurrent.Executors; +import java.util.concurrent.ScheduledExecutorService; +import java.util.concurrent.ScheduledFuture; +import java.util.concurrent.TimeUnit; + +import static java.lang.String.format; +import static org.apache.metron.stellar.dsl.Context.Capabilities.GLOBAL_CONFIG; +import static org.apache.metron.stellar.dsl.functions.RestConfig.STELLAR_REST_SETTINGS; + +/** + * Defines functions that enable REST requests with proper result and error handling. Depends on an + * Apache HttpComponents client being supplied as a Stellar HTTP_CLIENT capability. Exposes various Http settings + * including authentication, proxy and timeouts through the global config with the option to override any settings + * through a config object supplied in the expression. + */ +public class RestFunctions { + + private static final Logger LOG = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + + /** + * Retrieves the ClosableHttpClient from the execution context. + * + * @param context The execution context. + * @return A ClosableHttpClient, if one exists. Otherwise, an exception is thrown. + */ + private static CloseableHttpClient getHttpClient(Context context) { +Optional clientOpt = context.getCapability(Context.Capabilities.HTTP_CLIENT); +if(clientOpt.isPresent()) { + return (CloseableHttpClient) clientOpt.get(); +} else { + throw new IllegalStateException("Missing HTTP_CLIENT; http connection required"); +} + } + + /** + * Get an argument from a list of arguments. + * + * @param index The index within the list of arguments. + * @param clazz The type expected. + * @param args All of the arguments. + * @param The type of the argument expected. + */ + public static T getArg(int index, Class clazz, List args) { + +if(index >= args.size()) { + throw new IllegalArgumentException(format("Expec
[GitHub] metron pull request #1250: METRON-1850: Stellar REST function
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1250#discussion_r230366579 --- Diff: metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/dsl/functions/RestFunctions.java --- @@ -0,0 +1,351 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.stellar.dsl.functions; + +import org.apache.commons.io.IOUtils; +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.FSDataInputStream; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; +import org.apache.http.HttpEntity; +import org.apache.http.HttpHost; +import org.apache.http.auth.AuthScope; +import org.apache.http.auth.UsernamePasswordCredentials; +import org.apache.http.client.CredentialsProvider; +import org.apache.http.client.config.RequestConfig; +import org.apache.http.client.methods.CloseableHttpResponse; +import org.apache.http.client.methods.HttpGet; +import org.apache.http.client.protocol.HttpClientContext; +import org.apache.http.impl.client.BasicCredentialsProvider; +import org.apache.http.impl.client.CloseableHttpClient; +import org.apache.http.util.EntityUtils; +import org.apache.metron.stellar.common.utils.ConversionUtils; +import org.apache.metron.stellar.common.utils.JSONUtils; +import org.apache.metron.stellar.dsl.Context; +import org.apache.metron.stellar.dsl.ParseException; +import org.apache.metron.stellar.dsl.Stellar; +import org.apache.metron.stellar.dsl.StellarFunction; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.ByteArrayInputStream; +import java.io.IOException; +import java.lang.invoke.MethodHandles; +import java.net.URI; +import java.net.URISyntaxException; +import java.nio.charset.StandardCharsets; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.concurrent.Executors; +import java.util.concurrent.ScheduledExecutorService; +import java.util.concurrent.ScheduledFuture; +import java.util.concurrent.TimeUnit; + +import static java.lang.String.format; +import static org.apache.metron.stellar.dsl.Context.Capabilities.GLOBAL_CONFIG; +import static org.apache.metron.stellar.dsl.functions.RestConfig.STELLAR_REST_SETTINGS; + +/** + * Defines functions that enable REST requests with proper result and error handling. Depends on an + * Apache HttpComponents client being supplied as a Stellar HTTP_CLIENT capability. Exposes various Http settings + * including authentication, proxy and timeouts through the global config with the option to override any settings + * through a config object supplied in the expression. + */ +public class RestFunctions { + + private static final Logger LOG = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + + /** + * Retrieves the ClosableHttpClient from the execution context. + * + * @param context The execution context. + * @return A ClosableHttpClient, if one exists. Otherwise, an exception is thrown. + */ + private static CloseableHttpClient getHttpClient(Context context) { +Optional clientOpt = context.getCapability(Context.Capabilities.HTTP_CLIENT); +if(clientOpt.isPresent()) { + return (CloseableHttpClient) clientOpt.get(); +} else { + throw new IllegalStateException("Missing HTTP_CLIENT; http connection required"); +} + } + + /** + * Get an argument from a list of arguments. + * + * @param index The index within the list of arguments. + * @param clazz The type expected. + * @param args All of the arguments. + * @param The type of the argument expected. + */ + public static T getArg(int index, Class clazz, List args) { + +if(index >= args.size()) { + throw new IllegalArgumentException(format("Expec
[GitHub] metron pull request #1250: METRON-1850: Stellar REST function
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1250#discussion_r230227817 --- Diff: metron-platform/metron-common/src/main/java/org/apache/metron/common/bolt/ConfiguredEnrichmentBolt.java --- @@ -17,18 +17,41 @@ */ package org.apache.metron.common.bolt; +import java.io.IOException; import java.lang.invoke.MethodHandles; +import java.util.Map; + +import org.apache.http.impl.client.CloseableHttpClient; import org.apache.metron.common.configuration.EnrichmentConfigurations; +import org.apache.metron.stellar.common.utils.HttpClientUtils; +import org.apache.storm.task.OutputCollector; +import org.apache.storm.task.TopologyContext; import org.slf4j.Logger; import org.slf4j.LoggerFactory; public abstract class ConfiguredEnrichmentBolt extends ConfiguredBolt { private static final Logger LOG = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + protected CloseableHttpClient httpClient; public ConfiguredEnrichmentBolt(String zookeeperUrl) { super(zookeeperUrl, "ENRICHMENT"); } + @Override + public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) { +super.prepare(stormConf, context, collector); --- End diff -- I tried that initially but realized there is no shutdown hook in Stellar. HttpClient objects need to be closed and the only hook that exists is in the bolts. If we did have a way to close it, I would be in favor of managing it in the function. If we decide it's worth adding that hook in Stellar (I think we should as a follow on) we can easily move this inside the function. ---
[GitHub] metron pull request #1250: METRON-1850: Stellar REST function
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1250#discussion_r230209150 --- Diff: metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/dsl/functions/RestConfig.java --- @@ -0,0 +1,147 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.stellar.dsl.functions; --- End diff -- I looked at 2 different patterns that already exist. The first was the pattern you mentioned, the ConfigOptions interface. The problem is that this class lives in metron-common which depends on stellar-common. Regardless I don't think pattern is a great fit because it doesn't serialize/deserialize easily and contains a lot of features that are not needed here (functional interfaces for example). The second was the PROFILER_INIT function in the ProfilerFunctions class. This approach uses a java bean (ProfilerConfig) to hold configuration which allows easy serialization/deserialization. This is actually what I used at first until I made the change to accept a config as an argument expression. I actually went with option 2 at first but found it cumbersome to merge configs together. I ended up with the pattern I did because a map supports applying properties on top of each other well and serialization/deserialization works without issue. We ended having to do this for PcapConfig and PcapOptions in the REST layer by creating the PcapRequest class which extends a Map. ---
[GitHub] metron pull request #1250: METRON-1850: Stellar REST function
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1250#discussion_r230203860 --- Diff: metron-platform/metron-common/src/main/java/org/apache/metron/common/bolt/ConfiguredEnrichmentBolt.java --- @@ -17,18 +17,41 @@ */ package org.apache.metron.common.bolt; +import java.io.IOException; import java.lang.invoke.MethodHandles; +import java.util.Map; + +import org.apache.http.impl.client.CloseableHttpClient; import org.apache.metron.common.configuration.EnrichmentConfigurations; +import org.apache.metron.stellar.common.utils.HttpClientUtils; +import org.apache.storm.task.OutputCollector; +import org.apache.storm.task.TopologyContext; import org.slf4j.Logger; import org.slf4j.LoggerFactory; public abstract class ConfiguredEnrichmentBolt extends ConfiguredBolt { private static final Logger LOG = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + protected CloseableHttpClient httpClient; public ConfiguredEnrichmentBolt(String zookeeperUrl) { super(zookeeperUrl, "ENRICHMENT"); } + @Override + public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) { +super.prepare(stormConf, context, collector); --- End diff -- I can add a property to configure this. Should it be included by default or not ? Do you think instantiating this object but never using it would be harmful? It might make things simpler for users if there is one less setting to worry about. ---
[GitHub] metron issue #1246: METRON-1844: Allow for LDAP to be used for authenticatio...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1246 I tested this in full dev using the instructions in the README and with the Knox sample ldap. Both worked fine. I also tested using legacy JDBC authentication and it continued to work. Nice job! +1 ---
[GitHub] metron pull request #1246: METRON-1844: Allow for LDAP to be used for authen...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1246#discussion_r230138602 --- Diff: metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/configuration/metron-rest-env.xml --- @@ -35,32 +35,29 @@ metron_spring_profiles_active -Active Spring profiles +Active Spring profiles. 'ldap' is used to enable authentication via LDAP. Active Spring profiles - - -true - +jdbc --- End diff -- Are we using a "jdbc" profile anywhere? From what I can tell jdbc authentication is used by default but we don't actually use a profile for that. ---
[GitHub] metron pull request #1250: METRON-1850: Stellar REST function
GitHub user merrimanr opened a pull request: https://github.com/apache/metron/pull/1250 METRON-1850: Stellar REST function ## Contributor Comments This PR adds a Stellar REST function that can be used to enrich messages with data from 3rd party REST services. This function leverages the Apache HttpComponents library for Http requests. The function call follows this format: `REST_GET(rest uri, optional rest settings)` There are a handful of settings including basic authentication credentials, proxy support, and timeouts. These settings are included in the `RestConfig` class. Any of these settings can be defined in the global config under the `stellar.rest.settings` property and will override any default values. The global config settings can also be overridden by passing in a config object to the expression. This will allow support for multiple REST services that may have different requirements. Responses are expected to be in JSON format. Successful requests will return a MAP object in Stellar. Errors will be logged and null will be returned by default. There are ways to override this behavior by configuring a list of acceptable status codes and/or values to be returned on error or empty responses. Considering this will be used on streaming data, how we handle timeouts is important. This function exposes HttpClient timeout settings but those are not enough to keep unwanted latency from being introduced. A hard timeout is implemented in the function to abort a request if the timeout is exceeded. The idea is to guarantee the total request time will not exceed a configured value. ### Changes Included - HttpClient capability added to Stellar - HttpClient setup added to the various bolts and Stellar REPL - Utility added for setting up a pooling HttpClient (and possibly other types of clients in the future) - Configuration mechanism added for Stellar REST function including settings for authentication, proxy support, timeouts and other settings - Function implementation and appropriate unit/integration tests (both unit tests and integration tests are included) ### Testing There are several different ways to test this feature and I would encourage reviewers to get creative and look for cases I may not have thought of. For my testing, I used an online Http service that provides simple endpoints for simulating different use cases: `http://httpbin.org/#/`. Feel free to try your own or use this one. I tested this in full dev using the Stellar REPL and the parser and enrichment topologies. First you need to perform a couple setup steps: 1. Spin up full dev and ensure everything comes up and data is being indexed 2. Ssh to full dev and install the Squid proxy server: ``` yum -y install squid ``` 3. Create a password file that Squid can use for basic authentication ``` yum -y install httpd-tools touch /etc/squid/passwd && chown squid /etc/squid/passwd htpasswd /etc/squid/passwd user # (Will prompt for a password) ``` 4. Configure Squid for basic authentication by adding these lines to `/etc/squid/squid.conf`, under the lines with `acl Safe_ports*`: ``` auth_param basic program /usr/lib64/squid/ncsa_auth /etc/squid/passwd auth_param basic children 5 auth_param basic realm Squid Basic Authentication auth_param basic credentialsttl 2 hours acl auth_users proxy_auth REQUIRED http_access allow auth_users ``` 5. Start Squid and verify it is working correctly: ``` service squid restart curl --proxy-user user:password -x node1:3128 http://www.google.com/ ``` 6. Next create password files in HDFS: ``` su hdfs cd ~ echo passwd > basicPassword.txt hdfs dfs -put basicPassword.txt /apps/metron echo password > proxyPassword.txt hdfs dfs -put proxyPassword.txt /apps/metron exit ``` To test with the Stellar REPL, follow these steps: 1. Start the Stellar REPL and verify the `REST_GET` function is available: ``` /usr/metron/0.6.1/bin/stellar --zookeeper node1:2181 [Stellar]>>> %functions REST REST_GET ``` 2. Test a simple get request: ``` [Stellar]>>> REST_GET('http://httpbin.org/get') {args={}, headers={Accept=application/json, Accept-Encoding=gzip,deflate, Cache-Control=max-age=259200, Connection=close, Host=httpbin.org, User-Agent=Apache-HttpClient/4.3.2 (java 1.5)}, origin=127.0.0.1, 136.62.241.236, url=http://httpbin.org/get} ``` 3. Test a get request with basic authentication: ``` [Stellar]>>> config := {'basic.auth.user':'user','basic.auth.password.path':'/apps/metron/basicPassword.txt'} {basic.auth.user=user, basic.auth.password.path=/apps/metron/basicPassword.txt} [Stellar]>>> REST_GET('
[GitHub] metron issue #1240: METRON-1830: Re-implement Alerts dialog box without jQue...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1240 Looks good to me @sardell. +1 ---
[GitHub] metron issue #1241: METRON-1833: Management UI incorrectly displaying sensor...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1241 +1 ---
[GitHub] metron pull request #1239: METRON-1829: Large Error Message Causes Slow Sear...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1239#discussion_r226479283 --- Diff: metron-platform/metron-writer/src/main/java/org/apache/metron/writer/BulkWriterComponent.java --- @@ -118,12 +118,15 @@ public void commit(BulkWriterResponse response) { public void error(String sensorType, Throwable e, Iterable tuples, MessageGetStrategy messageGetStrategy) { LOG.error(format("Failing %d tuple(s); sensorType=%s", Iterables.size(tuples), sensorType), e); -MetronError error = new MetronError() -.withSensorType(Collections.singleton(sensorType)) -.withErrorType(Constants.ErrorType.INDEXING_ERROR) -.withThrowable(e); -tuples.forEach(t -> error.addRawMessage(messageGetStrategy.get(t))); -handleError(tuples, error); +tuples.forEach(t -> { --- End diff -- Done. ---
[GitHub] metron pull request #1239: METRON-1829: Large Error Message Causes Slow Sear...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1239#discussion_r226460423 --- Diff: metron-platform/metron-writer/src/main/java/org/apache/metron/writer/BulkWriterComponent.java --- @@ -118,12 +118,15 @@ public void commit(BulkWriterResponse response) { public void error(String sensorType, Throwable e, Iterable tuples, MessageGetStrategy messageGetStrategy) { LOG.error(format("Failing %d tuple(s); sensorType=%s", Iterables.size(tuples), sensorType), e); -MetronError error = new MetronError() -.withSensorType(Collections.singleton(sensorType)) -.withErrorType(Constants.ErrorType.INDEXING_ERROR) -.withThrowable(e); -tuples.forEach(t -> error.addRawMessage(messageGetStrategy.get(t))); -handleError(tuples, error); +tuples.forEach(t -> { --- End diff -- You're right again. I agree, we shouldn't be reporting the same exception for every message in the batch. I changed it to what you suggested in the initial comment and updated the tests. For the other issue you bring up with `error(Throwable, Iterable)`, it looks like it gets called from only one place: `BulkMessageWriterBolt.handleMissingMessage`. This method only deals with a single tuple so it is misleading that the `error` method accepts multiple tuples. Would changing that to accept a single tuple make it less confusing? ---
[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1213 Latest commit adds some logging around parser initialization. Let me know what you think. ---
[GitHub] metron pull request #1233: METRON-1816: Date format Stellar function
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1233#discussion_r226383027 --- Diff: metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/dsl/functions/DateFunctions.java --- @@ -109,6 +110,13 @@ public static long getEpochTime(String date, String format, Optional tim return sdf.parse(date).getTime(); } + public static String getDateFormat(String format, Optional epochTime, Optional timezone) { +Long time = epochTime.orElseGet(System::currentTimeMillis); +TimezonedFormat fmt = timezone.map(s -> new TimezonedFormat(format, s)).orElseGet(() -> new TimezonedFormat(format)); +SimpleDateFormat sdf = formatCache.get(fmt).get(); --- End diff -- There are no exceptions thrown by any calls in this method. What are you thinking we should do? ---
[GitHub] metron pull request #1233: METRON-1816: Date format Stellar function
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1233#discussion_r226382248 --- Diff: metron-stellar/stellar-common/src/test/java/org/apache/metron/stellar/dsl/functions/DateFunctionsTest.java --- @@ -225,4 +226,36 @@ public void testDayOfYearNow() { public void testDayOfYearNull() { Object result = run("DAY_OF_YEAR(nada)"); } + + @Test + public void testDateFormat() { +Object result = run("DATE_FORMAT('EEE MMM dd hh:mm:ss zzz', epoch, 'EST')"); +assertEquals("Thu Aug 25 2016 08:27:10 EST", result); + } + + @Test + public void testDateFormatDefault() { +Object result = run("DATE_FORMAT('EEE MMM dd hh:mm:ss ')"); + assertTrue(result.toString().endsWith(TimeZone.getDefault().getDisplayName(true, 1))); + } + + @Test + public void testDateFormatNow() { +Object result = run("DATE_FORMAT('EEE MMM dd hh:mm:ss zzz', 'GMT')"); +assertTrue(result.toString().endsWith("GMT")); + } + + @Test + public void testDateFormatDefaultTimezone() { +Object result = run("DATE_FORMAT('EEE MMM dd hh:mm:ss ', epoch)"); + assertTrue(result.toString().endsWith(TimeZone.getDefault().getDisplayName(true, 1))); + } + + /** + * If refer to variable that does not exist, expect ParseException. + */ + @Test(expected = ParseException.class) + public void testDateFormatNull() { +Object result = run("DATE_FORMAT('EEE MMM dd hh:mm:ss zzz', nada, 'EST')"); + } --- End diff -- Done ---
[GitHub] metron pull request #1239: METRON-1829: Large Error Message Causes Slow Sear...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1239#discussion_r226377917 --- Diff: metron-platform/metron-writer/src/main/java/org/apache/metron/writer/BulkWriterComponent.java --- @@ -118,12 +118,15 @@ public void commit(BulkWriterResponse response) { public void error(String sensorType, Throwable e, Iterable tuples, MessageGetStrategy messageGetStrategy) { LOG.error(format("Failing %d tuple(s); sensorType=%s", Iterables.size(tuples), sensorType), e); -MetronError error = new MetronError() -.withSensorType(Collections.singleton(sensorType)) -.withErrorType(Constants.ErrorType.INDEXING_ERROR) -.withThrowable(e); -tuples.forEach(t -> error.addRawMessage(messageGetStrategy.get(t))); -handleError(tuples, error); +tuples.forEach(t -> { --- End diff -- You're right! My mistake. I changed it to pass in just the current tuple to `handleError`. Let me know if that works. ---
[GitHub] metron pull request #1239: METRON-1829: Large Error Message Causes Slow Sear...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1239#discussion_r226378008 --- Diff: metron-platform/metron-writer/src/test/java/org/apache/metron/writer/BulkWriterComponentTest.java --- @@ -161,9 +165,12 @@ public void writeShouldThrowExceptionWhenHandleErrorIsFalse() throws Exception { @Test public void writeShouldProperlyHandleWriterException() throws Exception { Throwable e = new Exception("test exception"); -MetronError error = new MetronError() +MetronError expectedError1 = new MetronError() .withSensorType(Collections.singleton(sensorType)) - .withErrorType(Constants.ErrorType.INDEXING_ERROR).withThrowable(e).withRawMessages(Arrays.asList(message1, message2)); + .withErrorType(Constants.ErrorType.INDEXING_ERROR).withThrowable(e).withRawMessages(Collections.singletonList(message1)); --- End diff -- I agree. Done. ---
[GitHub] metron issue #1233: METRON-1816: Date format Stellar function
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1233 The existing Stellar date functions use SimpleDateFormat and already have a caching layer built in so I decided to reuse what was already there. Happy to switch to DateTimeFormatter here or in a follow on. ---
[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1213 The latest commit adds the Deprecated annotations to parse and parseOptional and removes MultilineMessageParser. I also updated the `MessageParserTest` with more tests to ensure we're handling the deprecated methods appropriately. As a result of that, I changed the logic in MessageParser.parseOptionalResult slightly to return an empty Optional rather than an empty List to match the existing test. Let me know if anyone has a problem with this. Once we finish this I will make an announcement on the user and dev lists. ---
[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1213 I merged the latest master (which includes https://github.com/apache/metron/pull/1234) into this PR. I experimented with moving the methods in MultilineMessageParser back into MessageParser and I think it turned out well. Makes things clearer and easier to understand in my opinion (and also made the merge much easier). What do you think @mmiklavc and @ottobackwards? It reverts some changes in https://github.com/apache/metron/pull/1234 so I want to get your thoughts and buy in. ---
[GitHub] metron pull request #1239: METRON-1829: Large Error Message Causes Slow Sear...
GitHub user merrimanr opened a pull request: https://github.com/apache/metron/pull/1239 METRON-1829: Large Error Message Causes Slow Search Performance ## Contributor Comments When a failure happens in the BulkWriterComponent, all pending messages in a batch are combined into a single error message. This results in a single large objects being written to ES and causes performance problems. This PR attempts to solve this issue by instead creating separate error messages for each message in the batch. ### Testing This has been tested and verified in full dev: 1. Spin up full dev and verify data is being indexed into ES 2. Stop HDFS in Ambari. This will cause a failure in the BulkWriterComponent. By default the batch size for OOTB sensors is set to 5. 3. After HDFS has stopped, verify that documents are being written to an ES error index 4. Retrieve a single document from the ES error index. There should only be a single `raw_message` field. Subsequent documents in ES should also only contain a single `raw_message` field. Before there would have been 5 `raw_message_*` fields in a single document. 5. Verify error documents are displayed properly in Kibana. ## Pull Request Checklist Thank you for submitting a contribution to Apache Metron. Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides. In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following: ### For all changes: - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [x] Does your PR title start with METRON- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For code changes: - [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed? - [x] Have you included steps or a guide to how the change may be verified and tested manually? - [x] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via: ``` mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh ``` - [x] Have you written or updated unit tests and or integration tests to verify your changes? - [x] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent? ### For documentation related changes: - [x] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and the verify changes via `site-book/target/site/index.html`: ``` cd site-book mvn site ``` Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/merrimanr/incubator-metron METRON-1829 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/metron/pull/1239.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1239 commit d674d9afbf7f5a230737dbcfe4ab10a70106f4ed Author: merrimanr Date: 2018-10-15T20:46:31Z initial commit ---
[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1213#discussion_r224780152 --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/ParserRunnerImpl.java --- @@ -137,11 +208,29 @@ private void initializeParsers(Supplier parserConfigSuppli } } + /** + * Post-processes parsed messages by: + * + * Applying field transformations defined in the sensor parser config + * Filtering messages using the configured MessageFilter class + * Validating messages using the MessageParser validate method + * + * If a message is successfully processed a message is returned in a ProcessResult. If a message fails + * validation, a MetronError object is created and returned in a ProcessResult. If a message is + * filtered out an empty Optional is returned. + * + * @param sensorType Sensor type of the message + * @param message Message parsed by the MessageParser + * @param rawMessage Raw message including metadata + * @param parser MessageParser for the sensor type + * @param parserConfigurations Parser configurations + */ @SuppressWarnings("unchecked") - protected Optional processMessage(String sensorType, JSONObject message, RawMessage rawMessage, + protected Optional processMessage(String sensorType, JSONObject message, RawMessage rawMessage, MessageParser parser, --- End diff -- Yes parser is used. Line 248, `parser.validate`. ---
[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...
GitHub user merrimanr reopened a pull request: https://github.com/apache/metron/pull/1213 METRON-1681: Decouple the ParserBolt from the Parse execution logic ## Contributor Comments The primary purpose of this PR is to create a parser container abstraction that is decoupled from Storm. This parser container (I call it ParserRunner, anyone have a better name?) is responsible for: - Instantiating the MessageParser and MessageFilter objects for each sensor. - Initializing the Stellar environment. - Accepting a raw binary message along with parser configurations and calling the MessageParser.parseOptional method for the appropriate sensor type - Process each message. Most of this logic was migrated from the ParserBolt.execute method. - Execute callbacks depending on the processing result of each message Configuration is external to this abstraction. A configuration supplier is passed in when initializing and a configuration object is passed in when processing each message. I believe this was originally done because we want message processing to be atomic without the configuration unexpectedly changing. We can easily change to a message supplier during execute if necessary. A CuratorFramework object is also required for setting up Stellar but we could easily make this optional. I decided to keep writing out of the abstraction in this PR. I can envision different platforms having different requirements or needs for sending along messages after they are parsed. Therefore the ParserBolt still handles writing messages to Kafka. If we do decide we want to add writing to our abstraction we could do it in a follow on PR to keep this from becoming even bigger. Since all of the post parsing logic was in the Parserbolt, messages could be written as they were processed rather than having to wait for all messages to be processed. To maintain this behavior I added 2 callback functions in the form of Java Consumers: onSuccess and OnError. The other option would be to make message processing synchronous and just return a list of results. This lifecycle of this container looks like: 1. Container is created from a collection of sensor types 2. Container is initialized with the init method that accepts a Curator client and a configuration Supplier. This in turn sets up Stellar and instantiates MessageParser and MessageFilter classes. 3. Container is ready and accepts messages for processing and calls the appropriate callbacks. Because this splits the ParserBolt into 2 different classes much of the ParserBoltTest unit test didn't make sense anymore. I ended up essentially rewriting it and also creating a unit test for ParserRunner. I tried to represent all the original tests in ParserBoltTest and have 95%+ coverage. If I missed any cases or made undesirable style changes, let me know. ### Changes Included - ParserRunner abstraction that is decoupled from the ParserBolt. The ParserBolt now initializes the ParserRunner and defers parsing to that class. The only thing required is Metron configuration and a Curator client (which could be optional) - Refactored ParserBolt that sets up the ParserRunner, passes message to it for processing, and writes results to Kafka. - MessageParser and MessageFilter objects are now created when Storm calls prepare, avoiding serialization issues (https://issues.apache.org/jira/browse/METRON-1793) - Updated unit and integration tests ### Testing I have done basic testing in full dev: 1. Spin up full dev and verify bro and snort alerts are indexed in ES and the data looks correct. 2. Test for proper message parser error handling by producing an unparseable message to the bro topic: ``` echo 'bad message' | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list node1:6667 --topic bro ``` You should see a corresponding error message in the ES error index: ``` { "_index" : "error_index_2018.09.25.20", "_type" : "error_doc", "_id" : "df83aebc-506d-404b-aba6-c5a740d07c57", "_score" : 1.0, "_source" : { "exception" : "java.lang.IllegalStateException: Unable to parse Message: test", "failed_sensor_type" : "bro", "stack" : "java.lang.IllegalStateException: Unable to parse Message: bad message ... "hostname" : "node1", "source:type" : "error", "raw_message" : "test", "error_hash" : "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08",
[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...
Github user merrimanr closed the pull request at: https://github.com/apache/metron/pull/1213 ---
[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1213 The latest commit resolves the conflicts introduced by https://github.com/apache/metron/pull/1184. The changes in all classes except ParserRunnerImpl and ParserRunnerImplTest were fairly trivial, mainly just adjusting the return types to match the new MessageParserResult and ParserRunnerResults interfaces. Most of the work was done in ParserRunnerImpl (the execute method specifically). Since the MessageParser interface can now return multiple messages and exceptions (and a master exception in some cases), there needs to be a way to map them to a ParserRunnerResults object (which is a list of messages and errors). Now the primary job of this execute method is to: - call MessageParser.parseOptionalResult - post process messages returned by the MessageParser - consolidate the various errors and messages into a single ParserRunnerResults object Let me know if anyone thinks this should work differently. I also added more javadocs to the various classes involved, primarily ParserRunnerImpl since that's where the magic happens. This merge was fairly significant so I am planning on testing in full dev again. ---
[GitHub] metron issue #1218: METRON-1801 Allow Customization of Elasticsearch Documen...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1218 Looks good to me. +1 ---
[GitHub] metron pull request #1218: METRON-1801 Allow Customization of Elasticsearch ...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1218#discussion_r224548764 --- Diff: metron-platform/metron-elasticsearch/pom.xml --- @@ -206,24 +206,7 @@ test-jar test - --- End diff -- I have also noticed this and I end up commenting these out every time I run a test in my IDE. Thanks for investigating and removing them. ---
[GitHub] metron pull request #1218: METRON-1801 Allow Customization of Elasticsearch ...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1218#discussion_r224541191 --- Diff: metron-platform/metron-elasticsearch/pom.xml --- @@ -206,24 +206,7 @@ test-jar test - --- End diff -- Why were these dependencies removed? Just curious. ---
[GitHub] metron issue #1218: METRON-1801 Allow Customization of Elasticsearch Documen...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1218 @mraliagha, that's a good suggestion. I believe we can functionally achieve that be creating a custom id field in the format you suggest (with a Stellar field transform) and set that field to be the ES id with the Ambari property exposed in this PR. Do you feel it's worth documenting as an optimization? I spun this up in full dev and ran through all the testing instructions. Everything worked as advertised. I think there are just a couple open questions but this is pretty close in my opinion. ---
[GitHub] metron pull request #1218: METRON-1801 Allow Customization of Elasticsearch ...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1218#discussion_r224538137 --- Diff: metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/ElasticsearchWriterConfig.java --- @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * + */ + +package org.apache.metron.elasticsearch.writer; + +import org.apache.metron.common.Constants; +import org.apache.metron.stellar.common.utils.ConversionUtils; + +import java.util.Map; + +/** + * Configuration settings that customize the behavior of the {@link ElasticsearchWriter}. + */ +public enum ElasticsearchWriterConfig { + + /** + * Defines which message field, the document identifier is set to. + * + * If defined, the value of the specified message field is set as the Elasticsearch doc ID. If + * this field is undefined or blank, then the document identifier is not set. + */ + DOC_ID_SOURCE_FIELD("es.document.id", "", String.class); --- End diff -- This does leave things in an inconsistent state and will makes things confusing for anyone working with these classes. However I think a follow up PR would be fine. Are you aware of the [ConfigOption](https://github.com/apache/metron/blob/master/metron-platform/metron-common/src/main/java/org/apache/metron/common/configuration/ConfigOption.java) interface? It looks like there may be some duplication with this class. ---
[GitHub] metron issue #1184: METRON-1761, allow application of grok statement multipl...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1184 +1 from me too. I will merge this into https://github.com/apache/metron/pull/1213 once it's in master and we can continue moving forward there. Parsing is getting an upgrade! ---
[GitHub] metron pull request #1233: METRON-1816: Date format Stellar function
GitHub user merrimanr opened a pull request: https://github.com/apache/metron/pull/1233 METRON-1816: Date format Stellar function ## Contributor Comments This PR adds a Stellar function that wraps the Java SimpleDateFormat class. The first argument is a date format string and is required. The second argument is epoch time and is optional (defaults to current time). The third argument is a timezone and is also option (default time zone of the environment is used if missing). ### Testing This has been tested in full dev: 1. Spin up full dev and launch the Stellar REPL. The `DATE_FORMAT` function should be available: ``` Stellar, Go! Functions are loading lazily in the background and will be unavailable until loaded fully. {} [Stellar]>>> %functions DATE DATE_FORMAT, IS_DATE ``` 2. Create a variable and set it to an epoch time: ``` [Stellar]>>> epoch := 1472131630748L 1472131630748 ``` 3. Use the `DATE_FORMAT` function to format the value in the epoch field created in the previous step. The result should be a correctly formatted date that matches the format string, epoch time and timezone: ``` [Stellar]>>> DATE_FORMAT('EEE MMM dd hh:mm:ss zzz', epoch, 'EST') Thu Aug 25 2016 08:27:10 EST ``` 4. Format the date using the default timezone. The result should match the default timezone of your environment: ``` [Stellar]>>> DATE_FORMAT('EEE MMM dd hh:mm:ss ', epoch) Thu Aug 25 2016 01:27:10 Coordinated Universal Time ``` 5. Format the date using the current time: ``` [Stellar]>>> DATE_FORMAT('EEE MMM dd hh:mm:ss zzz', 'GMT') Tue Oct 09 2018 10:13:24 GMT ``` 6. Format the date using the current time and default time zone: ``` [Stellar]>>> DATE_FORMAT('EEE MMM dd hh:mm:ss ') Tue Oct 09 2018 10:12:54 Coordinated Universal Time ``` ## Pull Request Checklist Thank you for submitting a contribution to Apache Metron. Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides. In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following: ### For all changes: - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [x] Does your PR title start with METRON- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For code changes: - [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed? - [x] Have you included steps or a guide to how the change may be verified and tested manually? - [x] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via: ``` mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh ``` - [x] Have you written or updated unit tests and or integration tests to verify your changes? - [x] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent? ### For documentation related changes: - [x] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and the verify changes via `site-book/target/site/index.html`: ``` cd site-book mvn site ``` Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/merrimanr/incubator-metron METRON-1816 Alternatively you can review and apply these change
[GitHub] metron issue #1221: METRON-1805: Provide a default value for the Storm topol...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1221 I added a default to the profiler topology. We're probably less likely to run into a problem with the parser topologies I would think. Should we provide defaults for those? ---
[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1213#discussion_r223887299 --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/ParserRunnerImpl.java --- @@ -0,0 +1,216 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.parsers; + +import org.apache.commons.lang3.StringUtils; +import org.apache.metron.common.Constants; +import org.apache.metron.common.configuration.FieldTransformer; +import org.apache.metron.common.configuration.FieldValidator; +import org.apache.metron.common.configuration.ParserConfigurations; +import org.apache.metron.common.configuration.SensorParserConfig; +import org.apache.metron.common.error.MetronError; +import org.apache.metron.common.message.metadata.RawMessage; +import org.apache.metron.common.utils.ReflectionUtils; +import org.apache.metron.parsers.filters.Filters; +import org.apache.metron.parsers.interfaces.MessageFilter; +import org.apache.metron.parsers.interfaces.MessageParser; +import org.apache.metron.parsers.topology.ParserComponent; +import org.apache.metron.stellar.dsl.Context; +import org.json.simple.JSONObject; + +import java.io.Serializable; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.Set; +import java.util.UUID; +import java.util.function.Consumer; +import java.util.function.Supplier; +import java.util.stream.Collectors; + +public class ParserRunnerImpl implements ParserRunner, Serializable { + + protected transient Consumer onSuccess; + protected transient Consumer onError; + + private HashSet sensorTypes; + private Map sensorToParserComponentMap; + + // Stellar variables + private transient Context stellarContext; + + public ParserRunnerImpl(HashSet sensorTypes) { +this.sensorTypes = sensorTypes; + } + + public Map getSensorToParserComponentMap() { +return sensorToParserComponentMap; + } + + public void setSensorToParserComponentMap(Map sensorToParserComponentMap) { +this.sensorToParserComponentMap = sensorToParserComponentMap; + } + + public Context getStellarContext() { +return stellarContext; + } + + @Override + public Set getSensorTypes() { +return sensorTypes; + } + + @Override + public void init(Supplier parserConfigSupplier, Context stellarContext) { +if (parserConfigSupplier == null) { + throw new IllegalStateException("A parser config supplier must be set before initializing the ParserRunner."); +} +if (stellarContext == null) { + throw new IllegalStateException("A stellar context must be set before initializing the ParserRunner."); +} +this.stellarContext = stellarContext; +initializeParsers(parserConfigSupplier); + } + + @Override + public List execute(String sensorType, RawMessage rawMessage, ParserConfigurations parserConfigurations) { +List parserResults; +SensorParserConfig sensorParserConfig = parserConfigurations.getSensorParserConfig(sensorType); +if (sensorParserConfig != null) { + MessageParser parser = sensorToParserComponentMap.get(sensorType).getMessageParser(); + List messages = parser.parseOptional(rawMessage.getMessage()).orElse(Collections.emptyList()); + parserResults = messages.stream() + .map(message -> processMessage(sensorType, message, rawMessage, parser, parserConfigurations)) + .filter(Optional::isPresent) + .map(Optional::get).collect(Collectors.toList()); +} else { + throw new IllegalStateException(String.format("Could not execute
[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1213#discussion_r223886964 --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/ParserResult.java --- @@ -0,0 +1,75 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.parsers; + +import org.apache.metron.common.error.MetronError; +import org.json.simple.JSONObject; + +import java.util.Arrays; +import java.util.Objects; + +public class ParserResult { + + private String sensorType; + private JSONObject message; + private MetronError error; + private byte[] originalMessage; + + public ParserResult(String sensorType, JSONObject message, byte[] originalMessage) { +this.sensorType = sensorType; +this.message = message; +this.originalMessage = originalMessage; + } + + public ParserResult(String sensorType, MetronError error, byte[] originalMessage) { +this.sensorType = sensorType; +this.error = error; +this.originalMessage = originalMessage; + } + + public String getSensorType() { +return sensorType; + } + + public JSONObject getMessage() { +return message; + } + + public MetronError getError() { +return error; + } + + public byte[] getOriginalMessage() { +return originalMessage; + } + + public boolean isError() { +return error != null; + } + + @Override + public boolean equals(Object o) { --- End diff -- Done ---
[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1213#discussion_r223885124 --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/ParserRunnerImpl.java --- @@ -101,26 +92,28 @@ public void init(Supplier parserConfigSupplier, Context st } @Override - public void execute(String sensorType, RawMessage rawMessage, ParserConfigurations parserConfigurations) { -if (onSuccess == null) { - throw new IllegalStateException("An onSuccess function must be set before parsing a message."); -} -if (onError == null) { - throw new IllegalStateException("An onError function must be set before parsing a message."); -} + public List execute(String sensorType, RawMessage rawMessage, ParserConfigurations parserConfigurations) { +List parserResults; SensorParserConfig sensorParserConfig = parserConfigurations.getSensorParserConfig(sensorType); if (sensorParserConfig != null) { MessageParser parser = sensorToParserComponentMap.get(sensorType).getMessageParser(); List messages = parser.parseOptional(rawMessage.getMessage()).orElse(Collections.emptyList()); - messages.forEach(message -> processMessage(sensorType, message, rawMessage, parser, parserConfigurations)); + parserResults = messages.stream() + .map(message -> processMessage(sensorType, message, rawMessage, parser, parserConfigurations)) + .filter(Optional::isPresent) --- End diff -- I believe not present means the message was filtered out (our FilterClass mechanism). Otherwise there will either be a message or error in the ParserResult. ---
[GitHub] metron issue #1231: METRON-1811: Alert Search Fails When Sorting by Alert St...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1231 The latest commit changes the SearchIntegrationTest to use the ES templates that ship with Metron. There are a couple different tests that verify certain fields are defined with the correct types and I made sure the internal fields (including `alert_status`) are covered. The one potentially confusing change is that a couple test fields are added to the OOTB ES templates before these tests are run. This was necessary for the existing tests to continue working without major changes. I think this should cover it. Let me know what you think. ---
[GitHub] metron issue #1231: METRON-1811: Alert Search Fails When Sorting by Alert St...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1231 I don't know that a test would have caught this. The issue is that a UI feature was implemented and added a new field to sensor indices in ES. This new field was not added to the ES templates but the feature still worked because ES has a very flexible schema. It was a separate, unrelated operation (sorting on the new field) that exposed a problem. What do you think? Is there a way we can catch this kind of thing? I can add a test that simulates a sort but it would only guard against a field being removed from a template and not added. ---
[GitHub] metron pull request #1231: METRON-1811: Alert Search Fails When Sorting by A...
GitHub user merrimanr reopened a pull request: https://github.com/apache/metron/pull/1231 METRON-1811: Alert Search Fails When Sorting by Alert Status ## Contributor Comments This PR fixes sorting on the `alert_status` field in the Alerts UI by defining the field in ES templates as a `keyword` type. The change was applied to the sensor templates that ship with Metron: bro, snort and yaf. This field was added to the Solr schemas as well. I also updated our documentation to give users guidance when defining their own templates or upgrading their templates. I expanded this to include other internal fields like `source:type` and `metron_alert`. I did not include dynamic fields but I can add documentation for that here if it makes sense. ### Testing This has been tested in full dev: 1. Spin up full dev and navigate to the Alerts UI. 2. Change the status of a couple alerts by opening up their details panel and clicking a different status (OPEN for example). 3. Sort by `alert_status`. The Alerts UI should properly display alerts by `alert_status` and no errors should be reported in the console. 4. Enable Solr and verify data is visible in the Alerts UI. Repeat steps 2 and 3. ## Pull Request Checklist Thank you for submitting a contribution to Apache Metron. Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides. In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following: ### For all changes: - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [x] Does your PR title start with METRON- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For code changes: - [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed? - [x] Have you included steps or a guide to how the change may be verified and tested manually? - [x] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via: ``` mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh ``` - [x] Have you written or updated unit tests and or integration tests to verify your changes? - [x] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent? ### For documentation related changes: - [ ] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and the verify changes via `site-book/target/site/index.html`: ``` cd site-book mvn site ``` Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/merrimanr/incubator-metron METRON-1811 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/metron/pull/1231.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1231 commit 7a707bfbb1c6339f5891763c82e611eb080c4af7 Author: merrimanr Date: 2018-10-08T14:40:37Z initial commit commit 7016c1fb1b0cede0191edd32ac1dddf9b191eb13 Author: merrimanr Date: 2018-10-08T19:36:38Z typo commit 0920cdd0b83ccae7ceb7fc5b0ebfc5284a777e5d Author: merrimanr Date: 2018-10-08T19:57:54Z fixed test ---
[GitHub] metron pull request #1231: METRON-1811: Alert Search Fails When Sorting by A...
Github user merrimanr closed the pull request at: https://github.com/apache/metron/pull/1231 ---
[GitHub] metron issue #1230: METRON-1812: Fix dependencies_with_url.csv
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1230 +1 for the clever history reference and this PR. ---
[GitHub] metron pull request #1231: METRON-1811: Alert Search Fails When Sorting by A...
GitHub user merrimanr opened a pull request: https://github.com/apache/metron/pull/1231 METRON-1811: Alert Search Fails When Sorting by Alert Status ## Contributor Comments This PR fixes sorting on the `alert_status` field in the Alerts UI by defining the field in ES templates as a `keyword` type. The change was applied to the sensor templates that ship with Metron: bro, snort and yaf. This field was added to the Solr schemas as well. I also updated our documentation to give users guidance when defining their own templates or upgrading their templates. I expanded this to include other internal fields like `source:type` and `metron_alert`. I did not include dynamic fields but I can add documentation for that here if it makes sense. ### Testing This has been tested in full dev: 1. Spin up full dev and navigate to the Alerts UI. 2. Change the status of a couple alerts by opening up their details panel and clicking a different status (OPEN for example). 3. Sort by `alert_status`. The Alerts UI should properly display alerts by `alert_status` and no errors should be reported in the console. 4. Enable Solr and verify data is visible in the Alerts UI. Repeat steps 2 and 3. ## Pull Request Checklist Thank you for submitting a contribution to Apache Metron. Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides. In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following: ### For all changes: - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [x] Does your PR title start with METRON- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For code changes: - [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed? - [x] Have you included steps or a guide to how the change may be verified and tested manually? - [x] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via: ``` mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh ``` - [x] Have you written or updated unit tests and or integration tests to verify your changes? - [x] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent? ### For documentation related changes: - [ ] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and the verify changes via `site-book/target/site/index.html`: ``` cd site-book mvn site ``` Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/merrimanr/incubator-metron METRON-1811 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/metron/pull/1231.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1231 commit 7a707bfbb1c6339f5891763c82e611eb080c4af7 Author: merrimanr Date: 2018-10-08T14:40:37Z initial commit ---
[GitHub] metron issue #1227: METRON-1807: Auto populate the recommended values to som...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1227 +1 by inspection. Thanks @MohanDV. ---
[GitHub] metron issue #1221: METRON-1805: Provide a default value for the Storm topol...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1221 As far as I know there was not an issue with other topologies. Writing data tends to cause back pressure so that's why these changes were suggested for indexing topologies. ---
[GitHub] metron issue #1184: METRON-1761, allow application of grok statement multipl...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1184 I prefer @mmiklavc's suggestion several comments back: return a List of something other than a List. Could we use a wrapper that can contain results and errors? The calling class could then determine the best way to handle errors without having to fail the whole batch. ---
[GitHub] metron pull request #1190: METRON-1771: Update REST endpoints to support eve...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1190#discussion_r222823946 --- Diff: metron-platform/metron-indexing/src/main/java/org/apache/metron/indexing/dao/MultiIndexDao.java --- @@ -282,4 +276,27 @@ public Document getLatest(final String guid, String sensorType) throws IOExcepti public List getIndices() { return indices; } + + private Document getDocument(List documentContainers) throws IOException { +Document ret = null; +List error = new ArrayList<>(); +for(DocumentContainer dc : documentContainers) { + if(dc.getException().isPresent()) { +Throwable e = dc.getException().get(); +error.add(e.getMessage() + "\n" + ExceptionUtils.getStackTrace(e)); + } + else { +if(dc.getDocument().isPresent()) { + Document d = dc.getDocument().get(); + if(ret == null || ret.getTimestamp() < d.getTimestamp()) { +ret = d; + } +} + } +} --- End diff -- Looks good to me. Latest commit includes this change. ---
[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1213 As far as I know it's been that way for a really long time. ---
[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1213 The latest commit removes the callbacks. Now the execute method returns a list of ParserResult objects. I had to add a MetronError field to the ParserResult class so that both successful messages and errors are passed back. I also noticed we are currently skipping a tuple if no configuration is found for that sensor. I think skipping the tuple is fine but we should still log an error message. I made a change in the ParserRunner class to throw an error on this condition. The parser bolt will catch it and handle it similar to a parser error by creating a MetronError object and sending it to the error topic. Does anyone think we should keep as it is now? ---
[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1213 I don't feel that strongly about either approach so I'll switch it to return a list and remove the callbacks. ---
[GitHub] metron pull request #1221: METRON-1805: Provide a default value for the Stor...
GitHub user merrimanr opened a pull request: https://github.com/apache/metron/pull/1221 METRON-1805: Provide a default value for the Storm topology.max.spout.pending setting ## Contributor Comments Users have reported problems in the random and batch indexing topologies when this setting is not set. The primary purpose of this PR is to start a discussion around providing a default value for the Storm topology.max.spout.pending setting in our topologies and implement the changes if we decide to do it. 1. Should we provide defaults or continue to defer to the Storm default where a maximum isn't enforced? 2. Which topologies should we provide defaults for? 3. What should the default values be for the topologies we include? ## Pull Request Checklist Thank you for submitting a contribution to Apache Metron. Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides. In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following: ### For all changes: - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [x] Does your PR title start with METRON- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For code changes: - [ ] Have you included steps to reproduce the behavior or problem that is being changed or addressed? - [ ] Have you included steps or a guide to how the change may be verified and tested manually? - [ ] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via: ``` mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh ``` - [ ] Have you written or updated unit tests and or integration tests to verify your changes? - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [ ] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent? ### For documentation related changes: - [ ] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and the verify changes via `site-book/target/site/index.html`: ``` cd site-book mvn site ``` Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/merrimanr/incubator-metron METRON-1805 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/metron/pull/1221.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1221 commit 6988e0607afc1cd402063e0696a1220cc134d9d1 Author: merrimanr Date: 2018-10-03T21:45:32Z initial commit ---
[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1213 I just pushed a commit out based on some feedback from Nick: - The stellar context is now passed into the ParserRunner abstraction as a dependency. I imagine different environments will require different Stellar initialization strategies so this made sense to me. I also like it because it makes the abstraction simpler. - Changed ParserRunner to an interface with a single implementation. I could change this to ParserRunnerStrategy in case we think we might want to use that pattern in the future. Then all that would be missing is a ParserContext class (and possibly a ParserRunnerStrategies enum) which we could add now or when we need it. ---
[GitHub] metron pull request #1190: METRON-1771: Update REST endpoints to support eve...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1190#discussion_r222470180 --- Diff: metron-platform/metron-indexing/src/main/java/org/apache/metron/indexing/dao/MultiIndexDao.java --- @@ -282,4 +276,27 @@ public Document getLatest(final String guid, String sensorType) throws IOExcepti public List getIndices() { return indices; } + + private Document getDocument(List documentContainers) throws IOException { +Document ret = null; +List error = new ArrayList<>(); +for(DocumentContainer dc : documentContainers) { + if(dc.getException().isPresent()) { +Throwable e = dc.getException().get(); +error.add(e.getMessage() + "\n" + ExceptionUtils.getStackTrace(e)); + } + else { +if(dc.getDocument().isPresent()) { + Document d = dc.getDocument().get(); + if(ret == null || ret.getTimestamp() < d.getTimestamp()) { +ret = d; + } +} + } +} --- End diff -- No problem, don't mind working it out here and now. The case that causes the test to fail (although I'm not sure that was the intention of the test) is where 2 DAOs are contained in a MultiIndexDao and the first DAO returns an invalid result (no document or exception) while the second returns a valid result. I don't know what would cause a document to be missing both a document and an exception. The question is what should happen when this unexpected condition does happen? Should it blow up and throw and exception or return a document if at least one DAO returns a valid one. ---
[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1213 Thanks for pointing me to that @mmiklavc, I had not looked at it yet. I am in agreement that we should do our best to match other classes and patterns we already have. I will continue studying that to see where we can make things more consistent. If you can provide some examples that demonstrate your ideas that would help me understand it better. From what I can tell, the strategy pattern was used in the UnifiedEnrichmentBolt because there are 2 separate strategies for enriching a message: enrichment and threat intel. Do we need the strategy pattern here since there is only one parser running strategy? We could implement this as a strategy pattern in anticipation of one day needing another parser running strategy. Do you think it's worth doing now or later when we need it? The other difference I see is that Future objects are used to join the different messages after they are processed in parallel. I believe this is done because all enrichments/threat intel must be done before the message can be pieced back together and sent along. In this case we don't have that limitation. The documents that are parsed do not depend on each other and can be passed along as soon as they are processed. We are set up for post-processing these documents in parallel but I see that as a low-latency operation and I'm not sure it's worth the extra overhead. ---
[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1213#discussion_r222440594 --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/ParserRunner.java --- @@ -0,0 +1,234 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.parsers; + +import com.github.benmanes.caffeine.cache.Cache; +import org.apache.commons.lang3.StringUtils; +import org.apache.curator.framework.CuratorFramework; +import org.apache.metron.common.Constants; +import org.apache.metron.common.configuration.FieldTransformer; +import org.apache.metron.common.configuration.FieldValidator; +import org.apache.metron.common.configuration.ParserConfigurations; +import org.apache.metron.common.configuration.SensorParserConfig; +import org.apache.metron.common.error.MetronError; +import org.apache.metron.common.message.metadata.RawMessage; +import org.apache.metron.common.utils.ReflectionUtils; +import org.apache.metron.parsers.filters.Filters; +import org.apache.metron.parsers.interfaces.MessageFilter; +import org.apache.metron.parsers.interfaces.MessageParser; +import org.apache.metron.parsers.topology.ParserComponent; +import org.apache.metron.stellar.common.CachingStellarProcessor; +import org.apache.metron.stellar.dsl.Context; +import org.apache.metron.stellar.dsl.StellarFunctions; +import org.json.simple.JSONObject; + +import java.io.Serializable; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.Set; +import java.util.UUID; +import java.util.function.Consumer; +import java.util.function.Supplier; +import java.util.stream.Collectors; + +public class ParserRunner implements Serializable { + + protected transient Consumer onSuccess; + protected transient Consumer onError; + + private HashSet sensorTypes; + private Map sensorToParserComponentMap; + + // Stellar variables + private transient Context stellarContext; --- End diff -- Context is not serializable. ---
[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1213 That sounds correct to me. Specifically decoupling the Kafka writing step since our writing implementation requires a tuple. ---
[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1213 The MessageParser.parseOptional call returns a list of JSONObjects which is then processed (field transformations applied, messages validated, etc). The difference is WHEN each processed JSONObject is written to Kafka. In the current ParserBolt each one is sent as soon as it is processed because everything runs in ParserBolt.execute. Since we are extracting that logic outside of the ParserBolt, we either need to write them all after processing is done by the abstraction or expose callbacks that the abstraction calls as it finishes processing each one. I don't think it matters much either way. I chose to use callbacks because it more closely matches the flow of the current ParserBolt. I will assume everyone is ok with callbacks and will continue work on the suggestions made in the comments. ---
[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1213 I think the travis failure is unrelated. I have to make changes based on feedback so I'll keep on eye on it after the next run. Before I go implementing changes we need to decide on callback vs list. I don't see how we can design an interface that supports both (am I wrong there?) so we need to choose. Seems to be split at the moment. Thoughts? ---
[GitHub] metron pull request #1190: METRON-1771: Update REST endpoints to support eve...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1190#discussion_r222000643 --- Diff: metron-platform/metron-indexing/src/main/java/org/apache/metron/indexing/dao/MultiIndexDao.java --- @@ -282,4 +276,27 @@ public Document getLatest(final String guid, String sensorType) throws IOExcepti public List getIndices() { return indices; } + + private Document getDocument(List documentContainers) throws IOException { +Document ret = null; +List error = new ArrayList<>(); +for(DocumentContainer dc : documentContainers) { + if(dc.getException().isPresent()) { +Throwable e = dc.getException().get(); +error.add(e.getMessage() + "\n" + ExceptionUtils.getStackTrace(e)); + } + else { +if(dc.getDocument().isPresent()) { + Document d = dc.getDocument().get(); + if(ret == null || ret.getTimestamp() < d.getTimestamp()) { +ret = d; + } +} + } +} --- End diff -- Upon further review, I'm not sure this is what we want. This caused an integration test to fail because the result from the first DAO did not contain a document or exception while the result from the second DAO did contain a document. Before this change, the document from the second DAO would have been returned. I think we want to return a document if possible, even if all DAOs are not consistent. Do you disagree? ---
[GitHub] metron issue #1190: METRON-1771: Update REST endpoints to support eventually...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1190 @nickwallen, the latest commit should address your comments. Let me know what you think. ---
[GitHub] metron issue #1212: METRON-1794 Include User Details When Escalating Alerts
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1212 Looks good to me. +1 ---
[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1213#discussion_r221023741 --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/ParserRunner.java --- @@ -0,0 +1,234 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.parsers; + +import com.github.benmanes.caffeine.cache.Cache; +import org.apache.commons.lang3.StringUtils; +import org.apache.curator.framework.CuratorFramework; +import org.apache.metron.common.Constants; +import org.apache.metron.common.configuration.FieldTransformer; +import org.apache.metron.common.configuration.FieldValidator; +import org.apache.metron.common.configuration.ParserConfigurations; +import org.apache.metron.common.configuration.SensorParserConfig; +import org.apache.metron.common.error.MetronError; +import org.apache.metron.common.message.metadata.RawMessage; +import org.apache.metron.common.utils.ReflectionUtils; +import org.apache.metron.parsers.filters.Filters; +import org.apache.metron.parsers.interfaces.MessageFilter; +import org.apache.metron.parsers.interfaces.MessageParser; +import org.apache.metron.parsers.topology.ParserComponent; +import org.apache.metron.stellar.common.CachingStellarProcessor; +import org.apache.metron.stellar.dsl.Context; +import org.apache.metron.stellar.dsl.StellarFunctions; +import org.json.simple.JSONObject; + +import java.io.Serializable; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.Set; +import java.util.UUID; +import java.util.function.Consumer; +import java.util.function.Supplier; +import java.util.stream.Collectors; + +public class ParserRunner implements Serializable { + + protected transient Consumer onSuccess; + protected transient Consumer onError; + + private HashSet sensorTypes; + private Map sensorToParserComponentMap; + + // Stellar variables + private transient Context stellarContext; + + public ParserRunner(HashSet sensorTypes) { +this.sensorTypes = sensorTypes; + } + + public Map getSensorToParserComponentMap() { +return sensorToParserComponentMap; + } + + public void setSensorToParserComponentMap(Map sensorToParserComponentMap) { +this.sensorToParserComponentMap = sensorToParserComponentMap; + } + + public Context getStellarContext() { +return stellarContext; + } + + public void setOnSuccess(Consumer onSuccess) { +this.onSuccess = onSuccess; + } + + public void setOnError(Consumer onError) { +this.onError = onError; + } + + public void init(CuratorFramework client, Supplier parserConfigSupplier) { +if (parserConfigSupplier == null) { + throw new IllegalStateException("A parser config supplier must be set before initializing the ParserRunner."); +} +initializeStellar(client, parserConfigSupplier); +initializeParsers(parserConfigSupplier); + } + + protected void initializeStellar(CuratorFramework client, Supplier parserConfigSupplier) { +Map cacheConfig = new HashMap<>(); +for (String sensorType: sensorTypes) { + SensorParserConfig config = parserConfigSupplier.get().getSensorParserConfig(sensorType); + + if (config != null) { +cacheConfig.putAll(config.getCacheConfig()); + } +} +Cache cache = CachingStellarProcessor.createCache(cacheConfig); + +Context.Builder builder = new Context.Builder() +.with(Context.Capabilities.ZOOKEEPER_CLIENT, () -> client) +.with(Context.Capabilities.GLOBAL_CONFIG, () -> parserConfigSupplier.get().getGlobalConfig()) +.with(Context.Capabilities.STELL
[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1213#discussion_r221017124 --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/ParserRunner.java --- @@ -0,0 +1,234 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.parsers; + +import com.github.benmanes.caffeine.cache.Cache; +import org.apache.commons.lang3.StringUtils; +import org.apache.curator.framework.CuratorFramework; +import org.apache.metron.common.Constants; +import org.apache.metron.common.configuration.FieldTransformer; +import org.apache.metron.common.configuration.FieldValidator; +import org.apache.metron.common.configuration.ParserConfigurations; +import org.apache.metron.common.configuration.SensorParserConfig; +import org.apache.metron.common.error.MetronError; +import org.apache.metron.common.message.metadata.RawMessage; +import org.apache.metron.common.utils.ReflectionUtils; +import org.apache.metron.parsers.filters.Filters; +import org.apache.metron.parsers.interfaces.MessageFilter; +import org.apache.metron.parsers.interfaces.MessageParser; +import org.apache.metron.parsers.topology.ParserComponent; +import org.apache.metron.stellar.common.CachingStellarProcessor; +import org.apache.metron.stellar.dsl.Context; +import org.apache.metron.stellar.dsl.StellarFunctions; +import org.json.simple.JSONObject; + +import java.io.Serializable; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.Set; +import java.util.UUID; +import java.util.function.Consumer; +import java.util.function.Supplier; +import java.util.stream.Collectors; + +public class ParserRunner implements Serializable { + + protected transient Consumer onSuccess; + protected transient Consumer onError; + + private HashSet sensorTypes; + private Map sensorToParserComponentMap; + + // Stellar variables + private transient Context stellarContext; + + public ParserRunner(HashSet sensorTypes) { +this.sensorTypes = sensorTypes; + } + + public Map getSensorToParserComponentMap() { +return sensorToParserComponentMap; + } + + public void setSensorToParserComponentMap(Map sensorToParserComponentMap) { +this.sensorToParserComponentMap = sensorToParserComponentMap; + } + + public Context getStellarContext() { +return stellarContext; + } + + public void setOnSuccess(Consumer onSuccess) { +this.onSuccess = onSuccess; + } + + public void setOnError(Consumer onError) { +this.onError = onError; + } + + public void init(CuratorFramework client, Supplier parserConfigSupplier) { +if (parserConfigSupplier == null) { + throw new IllegalStateException("A parser config supplier must be set before initializing the ParserRunner."); +} +initializeStellar(client, parserConfigSupplier); +initializeParsers(parserConfigSupplier); + } + + protected void initializeStellar(CuratorFramework client, Supplier parserConfigSupplier) { --- End diff -- Ok sounds good. Are you ok if I leave the Stellar stuff in there as a default that only gets called if one hasn't been provided? ---
[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1213#discussion_r220995913 --- Diff: metron-platform/metron-parsers/src/test/java/org/apache/metron/parsers/integration/ParserDriver.java --- @@ -83,49 +86,21 @@ public void close() throws Exception { List errors = new ArrayList<>(); public ShimParserBolt(List output) { - super(null - , Collections.singletonMap( - sensorType == null ? config.getSensorTopic() : sensorType, - new ParserComponents( - ReflectionUtils.createInstance(config.getParserClassName()), - null, - new WriterHandler(new CollectingWriter(output)) - ) - ) - ); + super(null, parserRunner, Collections.singletonMap(sensorType, new WriterHandler(new CollectingWriter(output; --- End diff -- Not sure this ShimParserBolt is necessary anymore. ---
[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1213#discussion_r220994410 --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/topology/ParserTopologyBuilder.java --- @@ -268,29 +266,14 @@ private static ParserBolt createParserBolt( String zookeeperUrl, Optional securityProtocol, ParserConfigurations configs, Optional outputTopic) { - -Map parserBoltConfigs = new HashMap<>(); +Map writerConfigs = new HashMap<>(); for( Entry entry : sensorTypeToParserConfig.entrySet()) { String sensorType = entry.getKey(); SensorParserConfig parserConfig = entry.getValue(); - // create message parser --- End diff -- This should not be done here because it can potentially cause serialization problems: https://issues.apache.org/jira/browse/METRON-1793. MessageParser and MessageFilter objects should be created during initialization and not when the container is created (and before the ParserBolt is created). We should be doing this when prepare is called on the ParserBolt. ---
[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1213#discussion_r220993330 --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/topology/ParserComponent.java --- @@ -17,26 +17,23 @@ */ package org.apache.metron.parsers.topology; -import java.io.Serializable; -import org.apache.metron.parsers.bolt.WriterHandler; import org.apache.metron.parsers.interfaces.MessageFilter; import org.apache.metron.parsers.interfaces.MessageParser; import org.json.simple.JSONObject; -public class ParserComponents implements Serializable { +import java.io.Serializable; + +public class ParserComponent implements Serializable { private static final long serialVersionUID = 7880346740026374665L; private MessageParser messageParser; private MessageFilter filter; - private WriterHandler writer; - public ParserComponents( + public ParserComponent( MessageParser messageParser, - MessageFilter filter, --- End diff -- WriterHandler was moved out of this class because it's handled by Storm. MessageParser and MessageFilter is abstracted away from the streaming platform in this PR. ---
[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1213#discussion_r220993022 --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/filters/BroMessageFilter.java --- @@ -67,7 +67,7 @@ else if(protocolsObj instanceof List) { */ @Override - public boolean emitTuple(JSONObject message, Context context) { + public boolean emit(JSONObject message, Context context) { --- End diff -- Renamed this because it should not be specific to Storm. ---
[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1213#discussion_r220991435 --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserBolt.java --- @@ -242,169 +226,81 @@ public void prepare(Map stormConf, TopologyContext context, OutputCollector coll } } - protected void initializeStellar() { -Context.Builder builder = new Context.Builder() - .with(Context.Capabilities.ZOOKEEPER_CLIENT, () -> client) -.with(Context.Capabilities.GLOBAL_CONFIG, () -> getConfigurations().getGlobalConfig()) -.with(Context.Capabilities.STELLAR_CONFIG, () -> getConfigurations().getGlobalConfig()) -; -if(cache != null) { - builder = builder.with(Context.Capabilities.CACHE, () -> cache); -} -this.stellarContext = builder.build(); -StellarFunctions.initialize(stellarContext); - } - @SuppressWarnings("unchecked") @Override public void execute(Tuple tuple) { if (TupleUtils.isTick(tuple)) { - try { -for (Entry entry : sensorToComponentMap.entrySet()) { - entry.getValue().getWriter().flush(getConfigurations(), messageGetStrategy); -} - } catch (Exception e) { -throw new RuntimeException( -"This should have been caught in the writerHandler. If you see this, file a JIRA", e); - } finally { -collector.ack(tuple); - } + handleTickTuple(tuple); return; } - +numWritten = 0; --- End diff -- This counter is implemented as a member variable and reset for each tuple. ---
[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1213#discussion_r220988891 --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserBolt.java --- @@ -242,169 +226,81 @@ public void prepare(Map stormConf, TopologyContext context, OutputCollector coll } } - protected void initializeStellar() { -Context.Builder builder = new Context.Builder() - .with(Context.Capabilities.ZOOKEEPER_CLIENT, () -> client) -.with(Context.Capabilities.GLOBAL_CONFIG, () -> getConfigurations().getGlobalConfig()) -.with(Context.Capabilities.STELLAR_CONFIG, () -> getConfigurations().getGlobalConfig()) -; -if(cache != null) { - builder = builder.with(Context.Capabilities.CACHE, () -> cache); -} -this.stellarContext = builder.build(); -StellarFunctions.initialize(stellarContext); - } - @SuppressWarnings("unchecked") @Override public void execute(Tuple tuple) { if (TupleUtils.isTick(tuple)) { - try { -for (Entry entry : sensorToComponentMap.entrySet()) { - entry.getValue().getWriter().flush(getConfigurations(), messageGetStrategy); -} - } catch (Exception e) { -throw new RuntimeException( -"This should have been caught in the writerHandler. If you see this, file a JIRA", e); - } finally { -collector.ack(tuple); - } + handleTickTuple(tuple); return; } - +numWritten = 0; byte[] originalMessage = (byte[]) messageGetStrategy.get(tuple); +String topic = tuple.getStringByField(FieldsConfiguration.TOPIC.getFieldName()); +String sensorType = topicToSensorMap.get(topic); try { - SensorParserConfig sensorParserConfig; - MessageParser parser; - String sensor; - Map metadata; - if (sensorToComponentMap.size() == 1) { -// There's only one parser, so grab info directly -Entry sensorParser = sensorToComponentMap.entrySet().iterator() -.next(); -sensor = sensorParser.getKey(); -parser = sensorParser.getValue().getMessageParser(); -sensorParserConfig = getSensorParserConfig(sensor); - } else { -// There's multiple parsers, so pull the topic from the Tuple and look up the sensor -String topic = tuple.getStringByField(FieldsConfiguration.TOPIC.getFieldName()); -sensor = topicToSensorMap.get(topic); -parser = sensorToComponentMap.get(sensor).getMessageParser(); -sensorParserConfig = getSensorParserConfig(sensor); - } + ParserConfigurations parserConfigurations = getConfigurations(); + SensorParserConfig sensorParserConfig = parserConfigurations.getSensorParserConfig(sensorType); + RawMessage rawMessage = RawMessageUtil.INSTANCE.getRawMessage( sensorParserConfig.getRawMessageStrategy() + , tuple + , originalMessage + , sensorParserConfig.getReadMetadata() + , sensorParserConfig.getRawMessageStrategyConfig() + ); + parserRunner.setOnSuccess(parserResult -> onSuccess(parserResult, tuple)); --- End diff -- Setting onSuccess here with the current tuple. We could also store the current tuple in a member variable if we wanted to move this call to prepare. ---
[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1213#discussion_r220988475 --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserBolt.java --- @@ -242,169 +226,81 @@ public void prepare(Map stormConf, TopologyContext context, OutputCollector coll } } - protected void initializeStellar() { -Context.Builder builder = new Context.Builder() - .with(Context.Capabilities.ZOOKEEPER_CLIENT, () -> client) -.with(Context.Capabilities.GLOBAL_CONFIG, () -> getConfigurations().getGlobalConfig()) -.with(Context.Capabilities.STELLAR_CONFIG, () -> getConfigurations().getGlobalConfig()) -; -if(cache != null) { - builder = builder.with(Context.Capabilities.CACHE, () -> cache); -} -this.stellarContext = builder.build(); -StellarFunctions.initialize(stellarContext); - } - @SuppressWarnings("unchecked") @Override public void execute(Tuple tuple) { if (TupleUtils.isTick(tuple)) { - try { -for (Entry entry : sensorToComponentMap.entrySet()) { - entry.getValue().getWriter().flush(getConfigurations(), messageGetStrategy); -} - } catch (Exception e) { -throw new RuntimeException( -"This should have been caught in the writerHandler. If you see this, file a JIRA", e); - } finally { -collector.ack(tuple); - } + handleTickTuple(tuple); return; } - +numWritten = 0; byte[] originalMessage = (byte[]) messageGetStrategy.get(tuple); +String topic = tuple.getStringByField(FieldsConfiguration.TOPIC.getFieldName()); +String sensorType = topicToSensorMap.get(topic); try { - SensorParserConfig sensorParserConfig; - MessageParser parser; - String sensor; - Map metadata; - if (sensorToComponentMap.size() == 1) { --- End diff -- I refactored this to always look up the sensor from the topic. I think this makes it easier to read and test. ---
[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1213#discussion_r220987950 --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserBolt.java --- @@ -242,169 +226,81 @@ public void prepare(Map stormConf, TopologyContext context, OutputCollector coll } } - protected void initializeStellar() { -Context.Builder builder = new Context.Builder() - .with(Context.Capabilities.ZOOKEEPER_CLIENT, () -> client) -.with(Context.Capabilities.GLOBAL_CONFIG, () -> getConfigurations().getGlobalConfig()) -.with(Context.Capabilities.STELLAR_CONFIG, () -> getConfigurations().getGlobalConfig()) -; -if(cache != null) { - builder = builder.with(Context.Capabilities.CACHE, () -> cache); -} -this.stellarContext = builder.build(); -StellarFunctions.initialize(stellarContext); - } - @SuppressWarnings("unchecked") @Override public void execute(Tuple tuple) { if (TupleUtils.isTick(tuple)) { - try { --- End diff -- Slight refactor here. I moved this to it's own method to make execute more readable. ---
[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1213#discussion_r220987391 --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/ParserRunner.java --- @@ -0,0 +1,234 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.parsers; + +import com.github.benmanes.caffeine.cache.Cache; +import org.apache.commons.lang3.StringUtils; +import org.apache.curator.framework.CuratorFramework; +import org.apache.metron.common.Constants; +import org.apache.metron.common.configuration.FieldTransformer; +import org.apache.metron.common.configuration.FieldValidator; +import org.apache.metron.common.configuration.ParserConfigurations; +import org.apache.metron.common.configuration.SensorParserConfig; +import org.apache.metron.common.error.MetronError; +import org.apache.metron.common.message.metadata.RawMessage; +import org.apache.metron.common.utils.ReflectionUtils; +import org.apache.metron.parsers.filters.Filters; +import org.apache.metron.parsers.interfaces.MessageFilter; +import org.apache.metron.parsers.interfaces.MessageParser; +import org.apache.metron.parsers.topology.ParserComponent; +import org.apache.metron.stellar.common.CachingStellarProcessor; +import org.apache.metron.stellar.dsl.Context; +import org.apache.metron.stellar.dsl.StellarFunctions; +import org.json.simple.JSONObject; + +import java.io.Serializable; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.Set; +import java.util.UUID; +import java.util.function.Consumer; +import java.util.function.Supplier; +import java.util.stream.Collectors; + +public class ParserRunner implements Serializable { + + protected transient Consumer onSuccess; + protected transient Consumer onError; + + private HashSet sensorTypes; + private Map sensorToParserComponentMap; + + // Stellar variables + private transient Context stellarContext; + + public ParserRunner(HashSet sensorTypes) { +this.sensorTypes = sensorTypes; + } + + public Map getSensorToParserComponentMap() { +return sensorToParserComponentMap; + } + + public void setSensorToParserComponentMap(Map sensorToParserComponentMap) { +this.sensorToParserComponentMap = sensorToParserComponentMap; + } + + public Context getStellarContext() { +return stellarContext; + } + + public void setOnSuccess(Consumer onSuccess) { +this.onSuccess = onSuccess; + } + + public void setOnError(Consumer onError) { +this.onError = onError; + } + + public void init(CuratorFramework client, Supplier parserConfigSupplier) { +if (parserConfigSupplier == null) { + throw new IllegalStateException("A parser config supplier must be set before initializing the ParserRunner."); +} +initializeStellar(client, parserConfigSupplier); +initializeParsers(parserConfigSupplier); + } + + protected void initializeStellar(CuratorFramework client, Supplier parserConfigSupplier) { --- End diff -- We could make the client optional here. The only required dependency would then be ParserConfigurations. ---
[GitHub] metron pull request #1213: METRON-1681: Decouple the ParserBolt from the Par...
GitHub user merrimanr opened a pull request: https://github.com/apache/metron/pull/1213 METRON-1681: Decouple the ParserBolt from the Parse execution logic ## Contributor Comments The primary purpose of this PR is to create a parser container abstraction that is decoupled from Storm. This parser container (I call it ParserRunner, anyone have a better name?) is responsible for: - Instantiating the MessageParser and MessageFilter objects for each sensor. - Initializing the Stellar environment. - Accepting a raw binary message along with parser configurations and calling the MessageParser.parseOptional method for the appropriate sensor type - Process each message. Most of this logic was migrated from the ParserBolt.execute method. - Execute callbacks depending on the processing result of each message Configuration is external to this abstraction. A configuration supplier is passed in when initializing and a configuration object is passed in when processing each message. I believe this was originally done because we want message processing to be atomic without the configuration unexpectedly changing. We can easily change to a message supplier during execute if necessary. A CuratorFramework object is also required for setting up Stellar but we could easily make this optional. I decided to keep writing out of the abstraction in this PR. I can envision different platforms having different requirements or needs for sending along messages after they are parsed. Therefore the ParserBolt still handles writing messages to Kafka. If we do decide we want to add writing to our abstraction we could do it in a follow on PR to keep this from becoming even bigger. Since all of the post parsing logic was in the Parserbolt, messages could be written as they were processed rather than having to wait for all messages to be processed. To maintain this behavior I added 2 callback functions in the form of Java Consumers: onSuccess and OnError. The other option would be to make message processing synchronous and just return a list of results. This lifecycle of this container looks like: 1. Container is created from a collection of sensor types 2. Container is initialized with the init method that accepts a Curator client and a configuration Supplier. This in turn sets up Stellar and instantiates MessageParser and MessageFilter classes. 3. Container is ready and accepts messages for processing and calls the appropriate callbacks. Because this splits the ParserBolt into 2 different classes much of the ParserBoltTest unit test didn't make sense anymore. I ended up essentially rewriting it and also creating a unit test for ParserRunner. I tried to represent all the original tests in ParserBoltTest and have 95%+ coverage. If I missed any cases or made undesirable style changes, let me know. ### Changes Included - ParserRunner abstraction that is decoupled from the ParserBolt. The ParserBolt now initializes the ParserRunner and defers parsing to that class. The only thing required is Metron configuration and a Curator client (which could be optional) - Refactored ParserBolt that sets up the ParserRunner, passes message to it for processing, and writes results to Kafka. - MessageParser and MessageFilter objects are now created when Storm calls prepare, avoiding serialization issues (https://issues.apache.org/jira/browse/METRON-1793) - Updated unit and integration tests ### Testing I have done basic testing in full dev and will add a testing plan soon. You should be able to spin up full dev and see bro and snort data in ES. ### Next Steps This PR contains working code but is still needs documentation. I am planning on testing a couple different parsers in full dev (grok for example) in addition to what I've already tested. I will be adding inline comments for some of the less obvious changes or refactors to make it easier to review. My plan is for any discussion around specific parts of the code to get added as javadocs eventually. I also think we should add some developer documentation to make it easier for maintaining and integrating this into other platforms. I imagine a lot of info in this description would make it in there as well. The intention for now is to get some feedback on the overall approach and get people thinking about it. I'm still working on documentation and will add that soon. Let me know what you think! ## Pull Request Checklist Thank you for submitting a contribution to Apache Metron. Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence
[GitHub] metron pull request #1190: METRON-1771: Update REST endpoints to support eve...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1190#discussion_r217462174 --- Diff: metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/dao/ElasticsearchUpdateDaoTest.java --- @@ -0,0 +1,48 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.metron.elasticsearch.dao; + +import org.apache.metron.indexing.dao.AccessConfig; +import org.apache.metron.indexing.dao.UpdateDaoTest; +import org.apache.metron.indexing.dao.update.UpdateDao; +import org.elasticsearch.client.transport.TransportClient; +import org.junit.Before; + +import static org.mockito.Mockito.mock; + +public class ElasticsearchUpdateDaoTest extends UpdateDaoTest { --- End diff -- No problem ---
[GitHub] metron pull request #1190: METRON-1771: Update REST endpoints to support eve...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1190#discussion_r217462087 --- Diff: metron-platform/metron-indexing/src/test/java/org/apache/metron/indexing/InMemoryMetaAlertRetrieveLatestDao.java --- @@ -0,0 +1,45 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.indexing; + +import org.apache.metron.indexing.dao.IndexDao; +import org.apache.metron.indexing.dao.metaalert.MetaAlertRetrieveLatestDao; +import org.apache.metron.indexing.dao.search.GetRequest; +import org.apache.metron.indexing.dao.update.Document; + +import java.io.IOException; +import java.util.List; + +public class InMemoryMetaAlertRetrieveLatestDao implements MetaAlertRetrieveLatestDao { --- End diff -- Sure no problem. ---
[GitHub] metron pull request #1190: METRON-1771: Update REST endpoints to support eve...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1190#discussion_r217459943 --- Diff: metron-platform/metron-indexing/src/main/java/org/apache/metron/indexing/dao/MultiIndexDao.java --- @@ -282,4 +276,27 @@ public Document getLatest(final String guid, String sensorType) throws IOExcepti public List getIndices() { return indices; } + + private Document getDocument(List documentContainers) throws IOException { +Document ret = null; +List error = new ArrayList<>(); +for(DocumentContainer dc : documentContainers) { + if(dc.getException().isPresent()) { +Throwable e = dc.getException().get(); +error.add(e.getMessage() + "\n" + ExceptionUtils.getStackTrace(e)); + } + else { +if(dc.getDocument().isPresent()) { + Document d = dc.getDocument().get(); + if(ret == null || ret.getTimestamp() < d.getTimestamp()) { +ret = d; + } +} + } +} --- End diff -- I like it. Will make that change. ---
[GitHub] metron pull request #1190: METRON-1771: Update REST endpoints to support eve...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/1190#discussion_r217459852 --- Diff: metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/dao/ElasticsearchUpdateDaoTest.java --- @@ -0,0 +1,48 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.metron.elasticsearch.dao; + +import org.apache.metron.indexing.dao.AccessConfig; +import org.apache.metron.indexing.dao.UpdateDaoTest; +import org.apache.metron.indexing.dao.update.UpdateDao; +import org.elasticsearch.client.transport.TransportClient; +import org.junit.Before; + +import static org.mockito.Mockito.mock; + +public class ElasticsearchUpdateDaoTest extends UpdateDaoTest { --- End diff -- Check out UpdateDaoTest. As I created tests for ElasticsearchUpdateDao, SolrUpdateDao, and HBaseDao I found myself copy/pasting the various tests. So I consolidated them into a single test. I believe there is value in having a single sets of tests for the UpdateDao interface methods that all the Daos must satisfy. This class sets up it's Dao implementation to be used in the tests. ---
[GitHub] metron issue #1190: METRON-1771: Update REST endpoints to support eventually...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1190 The latest commit updates the various places where looking up a document that doesn't exist returns null. Now an IOException is thrown with a helpful message and guid of the missing object. I added tests to cover these cases. I also added changes that will throw an exception when alerts to be add/removed don't exist. Previously that wasn't covered and the operation would succeed even though the missing alert wasn't added/removed. For the parallel stream issue, I made it consistent with the rest of the class. It now uses the same approach as the getLatest method and uses the DocumentContainer class. As part of this I moved common code to a separate method. Let me know what you think about these changes. ---