[GitHub] metron issue #1107: METRON-1673: Fix Javadoc errors
Github user cestella commented on the issue: https://github.com/apache/metron/pull/1107 +1 by inspection, thanks! ---
[jira] [Commented] (METRON-1673) Fix Javadoc errors
[ https://issues.apache.org/jira/browse/METRON-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545605#comment-16545605 ]

ASF GitHub Bot commented on METRON-1673:

Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1107

+1 by inspection, thanks @justinleet.

> Fix Javadoc errors
> --
>
> Key: METRON-1673
> URL: https://issues.apache.org/jira/browse/METRON-1673
> Project: Metron
> Issue Type: Bug
> Reporter: Justin Leet
> Assignee: Justin Leet
> Priority: Major
>
> Our Javadocs have errors. They should be fixed. Ideally, building Javadoc
> should be part of the build. However, we've held off because of time constraints in
> the build. [https://github.com/apache/metron/pull/1099] proposes a build
> matrix for our build, and we could slot it into one of the sub builds without
> these time constraints.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] metron pull request #1099: METRON-1657: Parser aggregation in storm
Github user cestella commented on a diff in the pull request:

https://github.com/apache/metron/pull/1099#discussion_r202801374

--- Diff: metron-platform/metron-parsers/README.md ---
@@ -82,6 +82,12 @@ topology in kafka. Errors are collected with the context of the error (e.g. stacktrace) and original message causing the error and sent to an `error` queue. Invalid messages as determined by global validation functions are also treated as errors and sent to an `error` queue.
+
+Multiple sensors can be aggregated into a single Storm topology. When this is done, there will be
+multiple Kafka spouts, but only a single parser bolt which will handle delegating to the correct
--- End diff --

@ottobackwards I think that's essentially the parser chaining stuff I added earlier, am I misunderstanding? The [use-case](https://github.com/apache/metron/tree/master/use-cases/parser_chaining) is using the "we get a ton of types of data in syslog" example.
---
[jira] [Assigned] (METRON-1614) Create job status abstraction
[ https://issues.apache.org/jira/browse/METRON-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Miklavcic reassigned METRON-1614:
-
Assignee: Michael Miklavcic

> Create job status abstraction
> -
>
> Key: METRON-1614
> URL: https://issues.apache.org/jira/browse/METRON-1614
> Project: Metron
> Issue Type: Sub-task
> Reporter: Ryan Merriman
> Assignee: Michael Miklavcic
> Priority: Major
>
> It is possible to use different job engines such as MR or Spark. There should
> be an abstraction that allows us to track status independent of the
> underlying job engine. Initially we will use YARN/MR to run pcap query jobs.
> We will also need a way to persist this information.
> Pcap job submission should be asynchronous. Some kind of id should be
> returned upon successful job submission rather than blocking and waiting on
> the job to complete.
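The requirement in this ticket (status tracking that does not depend on the job engine, plus some way to persist it) could be sketched roughly as follows. All names here (`JobState`, `JobStatus`, `JobStatusStore`, `InMemoryJobStatusStore`) are hypothetical and are not taken from the Metron code base; the in-memory map only stands in for whatever persistence is eventually chosen.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical engine-independent job states; not Metron's actual API.
enum JobState { NOT_RUNNING, RUNNING, SUCCEEDED, FAILED, KILLED }

// Immutable snapshot of a job's status, with no reference to MR, Spark, etc.
final class JobStatus {
  final String jobId;
  final JobState state;
  final double percentComplete;

  JobStatus(String jobId, JobState state, double percentComplete) {
    this.jobId = jobId;
    this.state = state;
    this.percentComplete = percentComplete;
  }
}

// Persistence is hidden behind an interface so the backing store can change.
interface JobStatusStore {
  void save(JobStatus status);
  Optional<JobStatus> find(String jobId);
}

// Assumption: an in-memory map as a placeholder for a real persistent store.
final class InMemoryJobStatusStore implements JobStatusStore {
  private final Map<String, JobStatus> statuses = new ConcurrentHashMap<>();

  @Override
  public void save(JobStatus status) {
    statuses.put(status.jobId, status);
  }

  @Override
  public Optional<JobStatus> find(String jobId) {
    return Optional.ofNullable(statuses.get(jobId));
  }
}
```

An engine-specific adapter (for YARN/MR, say) would translate the engine's own status into a `JobStatus` and call `save`, so callers only ever see the engine-neutral type.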
[jira] [Assigned] (METRON-1557) PcapJob should be asynchronous
[ https://issues.apache.org/jira/browse/METRON-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Miklavcic reassigned METRON-1557:
-
Assignee: Michael Miklavcic

> PcapJob should be asynchronous
> --
>
> Key: METRON-1557
> URL: https://issues.apache.org/jira/browse/METRON-1557
> Project: Metron
> Issue Type: Sub-task
> Reporter: Ryan Merriman
> Assignee: Michael Miklavcic
> Priority: Major
>
> Pcap job submission should be asynchronous. The PcapJob class should submit
> the job and return the job id as soon as possible rather than blocking and
> waiting on the job to complete.
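To illustrate the "submit, return an id, do not block" behaviour the ticket asks for, here is a minimal sketch built on the JDK's `CompletableFuture`. The class `AsyncJobSubmitter` and its methods are hypothetical illustrations, not the real `PcapJob` API, and a real implementation would hand the work to YARN/MR rather than a local thread pool.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical submitter, not Metron's PcapJob: returns an id immediately
// and lets the caller poll or wait for the result later.
final class AsyncJobSubmitter<T> {
  private final Map<String, CompletableFuture<T>> jobs = new ConcurrentHashMap<>();

  // Starts the work on a background thread and returns without blocking.
  String submit(Supplier<T> work) {
    String jobId = UUID.randomUUID().toString();
    jobs.put(jobId, CompletableFuture.supplyAsync(work));
    return jobId;
  }

  // True once the job has finished, whether normally or exceptionally.
  boolean isDone(String jobId) {
    CompletableFuture<T> future = jobs.get(jobId);
    return future != null && future.isDone();
  }

  // Blocks until the job completes; call only when the result is needed.
  T awaitResult(String jobId) {
    CompletableFuture<T> future = jobs.get(jobId);
    if (future == null) {
      throw new IllegalArgumentException("Unknown job id: " + jobId);
    }
    return future.join();
  }
}
```

The caller keeps only the returned id and can check `isDone` at its leisure, which is the property the ticket describes.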
[GitHub] metron pull request #1099: METRON-1657: Parser aggregation in storm
Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/metron/pull/1099#discussion_r202802349

--- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserBolt.java ---
@@ -182,40 +185,61 @@ public void prepare(Map stormConf, TopologyContext context, OutputCollector coll
     super.prepare(stormConf, context, collector);
     messageGetStrategy = MessageGetters.DEFAULT_BYTES_FROM_POSITION.get();
     this.collector = collector;
-    if(getSensorParserConfig() != null) {
-      cache = CachingStellarProcessor.createCache(getSensorParserConfig().getCacheConfig());
-    }
-    initializeStellar();
-    if(getSensorParserConfig() != null && filter == null) {
-      getSensorParserConfig().getParserConfig().putIfAbsent("stellarContext", stellarContext);
-      if (!StringUtils.isEmpty(getSensorParserConfig().getFilterClassName())) {
-        filter = Filters.get(getSensorParserConfig().getFilterClassName()
-                , getSensorParserConfig().getParserConfig()
-        );
+
+    // Build the Stellar cache
+    Map cacheConfig = new HashMap<>();
+    for (Map.Entry entry: sensorToComponentMap.entrySet()) {
+      String sensor = entry.getKey();
+      SensorParserConfig config = getSensorParserConfig(sensor);
+
+      if (config != null) {
+        cacheConfig.putAll(config.getCacheConfig());
       }
     }
+    cache = CachingStellarProcessor.createCache(cacheConfig);

-    parser.init();
+    // Need to prep all sensors
+    for (Map.Entry entry: sensorToComponentMap.entrySet()) {
+      String sensor = entry.getKey();
+      MessageParser parser = entry.getValue().getMessageParser();
--- End diff --

I'm still not sure sharing the stellar Context is a good thing. @cestella
---
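One consequence of the `putAll` loop in the diff above is worth spelling out: when several sensors contribute to one shared config map, a key set by more than one sensor ends up with the value from whichever sensor is iterated last. The sketch below is a standalone illustration of that behaviour only; the class `CacheConfigMerge` and the key `cache.maxSize` are hypothetical and are not taken from Metron.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class CacheConfigMerge {
  // Merges per-sensor cache settings into one map, in iteration order.
  // On a key collision the sensor visited last wins, because putAll overwrites.
  static Map<String, Object> merge(Map<String, Map<String, Object>> perSensor) {
    Map<String, Object> merged = new HashMap<>();
    for (Map.Entry<String, Map<String, Object>> entry : perSensor.entrySet()) {
      Map<String, Object> config = entry.getValue();
      if (config != null) {
        merged.putAll(config);
      }
    }
    return merged;
  }

  public static void main(String[] args) {
    // LinkedHashMap keeps insertion order, so "yaf" is merged after "bro".
    Map<String, Map<String, Object>> perSensor = new LinkedHashMap<>();
    Map<String, Object> bro = new HashMap<>();
    bro.put("cache.maxSize", 1000); // hypothetical key
    Map<String, Object> yaf = new HashMap<>();
    yaf.put("cache.maxSize", 50);   // same key, different value
    perSensor.put("bro", bro);
    perSensor.put("yaf", yaf);
    System.out.println(merge(perSensor)); // prints {cache.maxSize=50}
  }
}
```

Whether a shared cache (and a shared Stellar context) is desirable is exactly the open question raised in the review comment; the sketch only shows what the merge does, not what it should do.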
[GitHub] metron pull request #1106: METRON-1672: Add metron-alerts's UI unit tests to...
GitHub user justinleet opened a pull request:

https://github.com/apache/metron/pull/1106

METRON-1672: Add metron-alerts's UI unit tests to travis build process

## Contributor Comments
Couple things happen here:
- Added a mvn test goal for the metron-alert UI tests
- Made the tests use Chrome headless in order to actually have them run properly in Travis. Also added the Chrome add-on in Travis for this.
- Updated node/npm versions to fix some issues with running the tests. Matched the version to https://github.com/apache/metron/pull/1096
- The build started going over time, so I resurrected the build matrix from https://github.com/apache/metron/pull/854. I left out the Maven wrapper stuff. Given that Apache's been pretty good with Travis and other projects use this capability, I'm in favor of using it now that we're cramming even more in.

### Build Matrix
As stated, a build matrix was added to the Travis build. It breaks things into 4 sections, the same as the PR it's based on.
1. Unit Tests
2. Integration Tests
3. UI tests
4. License validation

Javadoc is skipped, because it's broken (sigh). I have the fixes in a branch, and I'd like to turn it on as follow-on.

### Testing
Make sure that `mvn test` works from metron-interface/metron-alerts. It should not spin up an actual Chrome window and should be headless.

If you choose to push into Travis, you should see something like the following in the output. This can be seen at https://travis-ci.org/justinleet/metron/jobs/404500911. The overall build matrix job can be seen at https://travis-ci.org/justinleet/metron/builds/404500908.
```
[INFO] > metron-alerts@0.5.1 test /home/travis/build/justinleet/metron/metron-interface/metron-alerts
[INFO] > karma start --single-run --browsers ChromeHeadless karma.conf.js
[INFO]
[INFO] 16 07 2018 16:01:20.477:INFO [karma]: Karma v1.4.1 server started at http://0.0.0.0:9876/
[INFO] 16 07 2018 16:01:20.480:INFO [launcher]: Launching browser ChromeHeadless with unlimited concurrency
[INFO] 16 07 2018 16:01:20.492:INFO [launcher]: Starting browser Chrome
[INFO] 16 07 2018 16:01:44.456:INFO [HeadlessChrome 0.0.0 (Linux 0.0.0)]: Connected on socket Ts9_772t05A32Ohr with id 78239511
[INFO] HeadlessChrome 0.0.0 (Linux 0.0.0): Executed 0 of 23 SUCCESS (0 secs / 0 secs)
[INFO] e 0.0.0 (Linux 0.0.0): Executed 1 of 23 SUCCESS (0 secs / 0.205 secs)
[INFO] 16 07 2018 16:01:48.411:WARN [web-server]: 404: /api/v1/global/config
[INFO] 16 07 2018 16:01:48.418:WARN [web-server]: 404: /api/v1/global/config
[INFO] e 0.0.0 (Linux 0.0.0): Executed 2 of 23 SUCCESS (0 secs / 0.41 secs)
[INFO] 16 07 2018 16:01:48.611:WARN [web-server]: 404: /api/v1/global/config
[INFO] e 0.0.0 (Linux 0.0.0): Executed 3 of 23 SUCCESS (0 secs / 0.581 secs)
[INFO] e 0.0.0 (Linux 0.0.0): Executed 4 of 23 SUCCESS (0 secs / 0.738 secs)
[INFO] e 0.0.0 (Linux 0.0.0): Executed 5 of 23 SUCCESS (0 secs / 0.783 secs)
[INFO] e 0.0.0 (Linux 0.0.0): Executed 6 of 23 SUCCESS (0 secs / 0.853 secs)
[INFO] e 0.0.0 (Linux 0.0.0): Executed 7 of 23 SUCCESS (0 secs / 0.884 secs)
[INFO] e 0.0.0 (Linux 0.0.0): Executed 8 of 23 SUCCESS (0 secs / 0.911 secs)
[INFO] e 0.0.0 (Linux 0.0.0): Executed 9 of 23 SUCCESS (0 secs / 0.936 secs)
[INFO] e 0.0.0 (Linux 0.0.0): Executed 10 of 23 SUCCESS (0 secs / 0.952 secs)
[INFO] e 0.0.0 (Linux 0.0.0): Executed 11 of 23 SUCCESS (0 secs / 1.003 secs)
[INFO] e 0.0.0 (Linux 0.0.0): Executed 12 of 23 SUCCESS (0 secs / 1.039 secs)
[INFO] e 0.0.0 (Linux 0.0.0): Executed 13 of 23 SUCCESS (0 secs / 1.04 secs)
[INFO] e 0.0.0 (Linux 0.0.0): Executed 14 of 23 SUCCESS (0 secs / 1.04 secs)
[INFO] e 0.0.0 (Linux 0.0.0): Executed 15 of 23 SUCCESS (0 secs / 1.073 secs)
[INFO] e 0.0.0 (Linux 0.0.0): Executed 16 of 23 SUCCESS (0 secs / 1.074 secs)
[INFO] e 0.0.0 (Linux 0.0.0): Executed 17 of 23 SUCCESS (0 secs / 1.075 secs)
[INFO] e 0.0.0 (Linux 0.0.0): Executed 18 of 23 SUCCESS (0 secs / 1.1 secs)
[INFO] e 0.0.0 (Linux 0.0.0): Executed 19 of 23 SUCCESS (0 secs / 1.101 secs)
[INFO] e 0.0.0 (Linux 0.0.0): Executed 20 of 23 SUCCESS (0 secs / 1.101 secs)
[INFO] e 0.0.0 (Linux 0.0.0): Executed 21 of 23 SUCCESS (0 secs / 1.101 secs)
[INFO] e 0.0.0 (Linux 0.0.0): Executed 22 of 23 SUCCESS (0 secs / 1.126 secs)
[INFO] e 0.0.0 (Linux 0.0.0): Executed 23 of 23 SUCCESS (0 secs / 1.214 secs)
[INFO] e 0.0.0 (Linux 0.0.0): Executed 23 of 23 SUCCESS (1.249 secs / 1.214 secs)
```

## Pull Request Checklist
Thank you for submitting a contribution to Apache Metron.
Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions.
Please refer also to our [Build Verification
[GitHub] metron pull request #1099: METRON-1657: Parser aggregation in storm
Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/metron/pull/1099#discussion_r202797418

--- Diff: metron-platform/metron-parsers/README.md ---
@@ -82,6 +82,12 @@ topology in kafka. Errors are collected with the context of the error (e.g. stacktrace) and original message causing the error and sent to an `error` queue. Invalid messages as determined by global validation functions are also treated as errors and sent to an `error` queue.
+
+Multiple sensors can be aggregated into a single Storm topology. When this is done, there will be
+multiple Kafka spouts, but only a single parser bolt which will handle delegating to the correct
--- End diff --

There is another, more likely use case where we have a transport wrapper on another message, and 1 topic split into many parsers as well. How can we handle that?

Specifically -> Syslog (Many Msg types) -> kafka -> bolt -> Split per message

I expect to add the ability for syslog parsing later, so set that aside. The issue is we *will* have more than one use case wrt topics.

I am not going to say this PR needs to address it, but I would want us to understand our path forward and minimize the churn. It would be best if we did not have to redo this work when accounting for that.
---
[jira] [Commented] (METRON-1657) Parser aggregation in storm
[ https://issues.apache.org/jira/browse/METRON-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545654#comment-16545654 ]

ASF GitHub Bot commented on METRON-1657:

Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/metron/pull/1099#discussion_r202797418

--- Diff: metron-platform/metron-parsers/README.md ---
@@ -82,6 +82,12 @@ topology in kafka. Errors are collected with the context of the error (e.g. stacktrace) and original message causing the error and sent to an `error` queue. Invalid messages as determined by global validation functions are also treated as errors and sent to an `error` queue.
+
+Multiple sensors can be aggregated into a single Storm topology. When this is done, there will be
+multiple Kafka spouts, but only a single parser bolt which will handle delegating to the correct
--- End diff --

There is another, more likely use case where we have a transport wrapper on another message, and 1 topic split into many parsers as well. How can we handle that?

Specifically -> Syslog (Many Msg types) -> kafka -> bolt -> Split per message

I expect to add the ability for syslog parsing later, so set that aside. The issue is we *will* have more than one use case wrt topics.

I am not going to say this PR needs to address it, but I would want us to understand our path forward and minimize the churn. It would be best if we did not have to redo this work when accounting for that.

> Parser aggregation in storm
> ---
>
> Key: METRON-1657
> URL: https://issues.apache.org/jira/browse/METRON-1657
> Project: Metron
> Issue Type: Bug
> Reporter: Justin Leet
> Assignee: Justin Leet
> Priority: Major
>
> Currently our parsing solution requires one storm topology per sensor. There have
> been complaints that this may be wasteful of resources and that, rather than
> one storm topology per sensor, it would be advantageous to have multiple
> sensors in the same topology. The benefit to this is that it would require
> fewer storm slots.
> The issue with this is that whenever we've aggregated functionality like this
> before, we've run into issues scaling storm appropriately (e.g.
> batch vs random access indexing in the same topology). The main point in
> addressing this is to recommend that parsers with similar velocities and
> complexity are grouped together.
> Particularly for a first cut, leave the configuration mostly as-is, while
> allowing for comma separated lists of sensors in start_parser_topology.sh
> (e.g. bro,yaf creates an aggregated parser consisting of those two).
[GitHub] metron pull request #1099: METRON-1657: Parser aggregation in storm
Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/metron/pull/1099#discussion_r202803106

--- Diff: metron-platform/metron-parsers/README.md ---
@@ -82,6 +82,12 @@ topology in kafka. Errors are collected with the context of the error (e.g. stacktrace) and original message causing the error and sent to an `error` queue. Invalid messages as determined by global validation functions are also treated as errors and sent to an `error` queue.
+
+Multiple sensors can be aggregated into a single Storm topology. When this is done, there will be
+multiple Kafka spouts, but only a single parser bolt which will handle delegating to the correct
--- End diff --

I think I'm misunderstanding between your diagram and this implementation. There will be one kafka topic monitored by the bolt, then routed to the right parser, then output to a different spout per parser?
---
[jira] [Commented] (METRON-1657) Parser aggregation in storm
[ https://issues.apache.org/jira/browse/METRON-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545684#comment-16545684 ]

ASF GitHub Bot commented on METRON-1657:

Github user cestella commented on a diff in the pull request:

https://github.com/apache/metron/pull/1099#discussion_r202805609

--- Diff: metron-platform/metron-parsers/README.md ---
@@ -82,6 +82,12 @@ topology in kafka. Errors are collected with the context of the error (e.g. stacktrace) and original message causing the error and sent to an `error` queue. Invalid messages as determined by global validation functions are also treated as errors and sent to an `error` queue.
+
+Multiple sensors can be aggregated into a single Storm topology. When this is done, there will be
+multiple Kafka spouts, but only a single parser bolt which will handle delegating to the correct
--- End diff --

This PR gives us the ability to group the parsers into a single topology if we so desire. You would still write through to kafka. So, the topology in the example would have 3 kafka spouts:
* One for monitoring `pix_syslog_router` (the syslog parser aka the routing parser)
* One for monitoring `cisco-5-304`
* One for monitoring `cisco-6-302`

There would be one parser bolt, though, which would handle parsing all 3 sensor types. That is the contribution of this PR: the ability to determine the parser, filter, and field transformations from the input kafka topic and use the appropriate one to parse the messages.

There is not, however, any code here that would bypass the intermediate kafka write (e.g. from the router topology to the individual `cisco-5-304` or `cisco-6-302` topics). That's a current gap.

> Parser aggregation in storm
> ---
>
> Key: METRON-1657
> URL: https://issues.apache.org/jira/browse/METRON-1657
> Project: Metron
> Issue Type: Bug
> Reporter: Justin Leet
> Assignee: Justin Leet
> Priority: Major
>
> Currently our parsing solution requires one storm topology per sensor. There have
> been complaints that this may be wasteful of resources and that, rather than
> one storm topology per sensor, it would be advantageous to have multiple
> sensors in the same topology. The benefit to this is that it would require
> fewer storm slots.
> The issue with this is that whenever we've aggregated functionality like this
> before, we've run into issues scaling storm appropriately (e.g.
> batch vs random access indexing in the same topology). The main point in
> addressing this is to recommend that parsers with similar velocities and
> complexity are grouped together.
> Particularly for a first cut, leave the configuration mostly as-is, while
> allowing for comma separated lists of sensors in start_parser_topology.sh
> (e.g. bro,yaf creates an aggregated parser consisting of those two).
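The idea described in this comment, a single bolt that picks the parser from the Kafka topic a message arrived on, can be sketched in plain Java as below. This is only an illustration under stated assumptions: `TopicRoutingParser` is a hypothetical class, the parsers are reduced to simple functions, and no Storm or Metron types are used. The topic names are the ones from the example in the comment.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a single bolt serving several sensors.
final class TopicRoutingParser {
  private final Map<String, Function<byte[], String>> parsersByTopic = new HashMap<>();

  // Associates a parser with the topic (sensor) it is responsible for.
  void register(String topic, Function<byte[], String> parser) {
    parsersByTopic.put(topic, parser);
  }

  // Picks the parser from the topic the message arrived on.
  String parse(String topic, byte[] rawMessage) {
    Function<byte[], String> parser = parsersByTopic.get(topic);
    if (parser == null) {
      throw new IllegalArgumentException("No parser registered for topic: " + topic);
    }
    return parser.apply(rawMessage);
  }

  public static void main(String[] args) {
    TopicRoutingParser bolt = new TopicRoutingParser();
    // Placeholder "parsers" that just tag the message with the sensor name.
    bolt.register("pix_syslog_router", raw -> "router:" + new String(raw, StandardCharsets.UTF_8));
    bolt.register("cisco-5-304", raw -> "5-304:" + new String(raw, StandardCharsets.UTF_8));
    bolt.register("cisco-6-302", raw -> "6-302:" + new String(raw, StandardCharsets.UTF_8));

    byte[] message = "example".getBytes(StandardCharsets.UTF_8);
    System.out.println(bolt.parse("cisco-5-304", message)); // prints 5-304:example
  }
}
```

Note that the sketch mirrors the gap called out in the comment: routing happens on the input topic only, so a message still has to be written back to Kafka before a different parser can pick it up.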
[jira] [Commented] (METRON-1673) Fix Javadoc errors
[ https://issues.apache.org/jira/browse/METRON-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545493#comment-16545493 ]

ASF GitHub Bot commented on METRON-1673:

GitHub user justinleet opened a pull request:

https://github.com/apache/metron/pull/1107

METRON-1673: Fix Javadoc errors

## Contributor Comments
Javadoc blew up when we ran `mvn javadoc:javadoc`. Now it should complete successfully, although with a ton of warnings.

To test, just run `mvn javadoc:javadoc` from the root dir.

If we're good with the build matrix from https://github.com/apache/metron/pull/1106, I propose we add the javadoc to the build as part of this PR, in order to avoid this in the future.

I didn't alter `-Xdoclint:none` in the pom, because we use the multiline string lib and it'll blow up. I'm unsure if there's a way around this.

## Pull Request Checklist
Thank you for submitting a contribution to Apache Metron.
Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions.
Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides.

In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
- [x] Has your PR been rebased against the latest commit within the target branch (typically master)?

### For code changes:
- [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified and tested manually?
- [x] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:
  ```
  mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh
  ```
- [x] Have you written or updated unit tests and or integration tests to verify your changes?
- [x] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?

### For documentation related changes:
- [x] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and then verify changes via `site-book/target/site/index.html`:
  ```
  cd site-book
  mvn site
  ```

Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/justinleet/metron javadocFixing

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/metron/pull/1107.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1107

commit dba5984a493a8d6e11ea0e6de5a235f779aa35ee
Author: justinjleet
Date: 2018-07-16T16:33:54Z

    Javadoc error fixes

> Fix Javadoc errors
> --
>
> Key: METRON-1673
> URL: https://issues.apache.org/jira/browse/METRON-1673
> Project: Metron
> Issue Type: Bug
> Reporter: Justin Leet
> Assignee: Justin Leet
> Priority: Major
>
> Our Javadocs have errors. They should be fixed. Ideally, building Javadoc
> should be part of the build. However, we've held off because of time constraints in
> the build. [https://github.com/apache/metron/pull/1099] proposes a build
> matrix for our build, and we could slot it into one of the sub builds without
> these time constraints.
[jira] [Commented] (METRON-1673) Fix Javadoc errors
[ https://issues.apache.org/jira/browse/METRON-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545517#comment-16545517 ]

ASF GitHub Bot commented on METRON-1673:

Github user cestella commented on the issue:

https://github.com/apache/metron/pull/1107

+1 by inspection, thanks!

> Fix Javadoc errors
> --
>
> Key: METRON-1673
> URL: https://issues.apache.org/jira/browse/METRON-1673
> Project: Metron
> Issue Type: Bug
> Reporter: Justin Leet
> Assignee: Justin Leet
> Priority: Major
>
> Our Javadocs have errors. They should be fixed. Ideally, building Javadoc
> should be part of the build. However, we've held off because of time constraints in
> the build. [https://github.com/apache/metron/pull/1099] proposes a build
> matrix for our build, and we could slot it into one of the sub builds without
> these time constraints.
[jira] [Commented] (METRON-1657) Parser aggregation in storm
[ https://issues.apache.org/jira/browse/METRON-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545667#comment-16545667 ]

ASF GitHub Bot commented on METRON-1657:

Github user ottobackwards commented on the issue:

https://github.com/apache/metron/pull/1099

@justinleet the main things I saw that I would think of cutting down, or that I thought about looking into (the idea may turn out to be bad), are places where the bolt 'knows' a lot of weird or complicated initialization logic around the configurations or classes it uses, like what we do initializing Stellar, or in getComponentConfiguration.

I'm ok with a follow on, I don't think this PR is creating that issue.

> Parser aggregation in storm
> ---
>
> Key: METRON-1657
> URL: https://issues.apache.org/jira/browse/METRON-1657
> Project: Metron
> Issue Type: Bug
> Reporter: Justin Leet
> Assignee: Justin Leet
> Priority: Major
>
> Currently our parsing solution requires one storm topology per sensor. There have
> been complaints that this may be wasteful of resources and that, rather than
> one storm topology per sensor, it would be advantageous to have multiple
> sensors in the same topology. The benefit to this is that it would require
> fewer storm slots.
> The issue with this is that whenever we've aggregated functionality like this
> before, we've run into issues scaling storm appropriately (e.g.
> batch vs random access indexing in the same topology). The main point in
> addressing this is to recommend that parsers with similar velocities and
> complexity are grouped together.
> Particularly for a first cut, leave the configuration mostly as-is, while
> allowing for comma separated lists of sensors in start_parser_topology.sh
> (e.g. bro,yaf creates an aggregated parser consisting of those two).
[GitHub] metron issue #1099: METRON-1657: Parser aggregation in storm
Github user ottobackwards commented on the issue:

https://github.com/apache/metron/pull/1099

@justinleet the main things I saw that I would think of cutting down, or that I thought about looking into (the idea may turn out to be bad), are places where the bolt 'knows' a lot of weird or complicated initialization logic around the configurations or classes it uses, like what we do initializing Stellar, or in getComponentConfiguration.

I'm ok with a follow on, I don't think this PR is creating that issue.
---
[jira] [Commented] (METRON-1657) Parser aggregation in storm
[ https://issues.apache.org/jira/browse/METRON-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545675#comment-16545675 ] ASF GitHub Bot commented on METRON-1657: Github user ottobackwards commented on a diff in the pull request: https://github.com/apache/metron/pull/1099#discussion_r202803106 --- Diff: metron-platform/metron-parsers/README.md --- @@ -82,6 +82,12 @@ topology in kafka. Errors are collected with the context of the error (e.g. stacktrace) and original message causing the error and sent to an `error` queue. Invalid messages as determined by global validation functions are also treated as errors and sent to an `error` queue. + +Multiple sensors can be aggregated into a single Storm topology. When this is done, there will be +multiple Kafka spouts, but only a single parser bolt which will handle delegating to the correct --- End diff -- I think I'm misunderstanding between your diagram and this implementation. There will be one kafka topic monitored by the bolt, then routed to the right parser, then output to a different spout per parser? > Parser aggregation in storm > --- > > Key: METRON-1657 > URL: https://issues.apache.org/jira/browse/METRON-1657 > Project: Metron > Issue Type: Bug >Reporter: Justin Leet >Assignee: Justin Leet >Priority: Major > > Currently our parsing solution requires one storm topology per sensor. It has > been complained that this may be wasteful of resources and that, rather than > one storm topology per sensor, it would be advantageous to have multiple > sensors in the same topology. The benefit to this is that it would require > fewer storm slots. > The issue with this is that whenever we've aggregated functionality like this > before, we've run into issues appropriately being able to scale storm (e.g. > batch vs random access indexing in the same topology). The main point in > addressing this is to recommend that parsers with similar velocities and > complexity are grouped together. 
> Particularly for a first cut, leave the configuration mostly as-is, while > allowing for comma separated lists of sensors in start_parser_topology.sh > (e.g. bro,yaf creates an aggregated parser consisting of those two). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (METRON-1672) Add metron-alerts's UI unit tests to travis build process
Justin Leet created METRON-1672:
-----------------------------------
Summary: Add metron-alerts's UI unit tests to travis build process
Key: METRON-1672
URL: https://issues.apache.org/jira/browse/METRON-1672
Project: Metron
Issue Type: Bug
Reporter: Justin Leet
Assignee: Justin Leet

The tests for metron-alerts don't run as part of Travis. They should run as part of Travis.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] metron issue #1107: METRON-1673: Fix Javadoc errors
Github user mmiklavc commented on the issue: https://github.com/apache/metron/pull/1107 +1 by inspection, thanks @justinleet. ---
[jira] [Commented] (METRON-1620) Fixes for forensic clustering use case example
[ https://issues.apache.org/jira/browse/METRON-1620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545617#comment-16545617 ] ASF GitHub Bot commented on METRON-1620: Github user asfgit closed the pull request at: https://github.com/apache/metron/pull/1065 > Fixes for forensic clustering use case example > -- > > Key: METRON-1620 > URL: https://issues.apache.org/jira/browse/METRON-1620 > Project: Metron > Issue Type: Bug >Reporter: Michael Miklavcic >Assignee: Michael Miklavcic >Priority: Major > > ES mapping needed some adjustments. Change to dynamic template mapping so it > will work for non-existent indexes yet to be created. Make work with ES 5.6.x > data types. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] metron pull request #1065: METRON-1620: Fixes for forensic clustering use ca...
Github user asfgit closed the pull request at: https://github.com/apache/metron/pull/1065 ---
[jira] [Commented] (METRON-1657) Parser aggregation in storm
[ https://issues.apache.org/jira/browse/METRON-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545670#comment-16545670 ]

ASF GitHub Bot commented on METRON-1657:

Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/metron/pull/1099#discussion_r202802349

--- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserBolt.java ---
@@ -182,40 +185,61 @@ public void prepare(Map stormConf, TopologyContext context, OutputCollector coll
     super.prepare(stormConf, context, collector);
     messageGetStrategy = MessageGetters.DEFAULT_BYTES_FROM_POSITION.get();
     this.collector = collector;
-    if(getSensorParserConfig() != null) {
-      cache = CachingStellarProcessor.createCache(getSensorParserConfig().getCacheConfig());
-    }
-    initializeStellar();
-    if(getSensorParserConfig() != null && filter == null) {
-      getSensorParserConfig().getParserConfig().putIfAbsent("stellarContext", stellarContext);
-      if (!StringUtils.isEmpty(getSensorParserConfig().getFilterClassName())) {
-        filter = Filters.get(getSensorParserConfig().getFilterClassName()
-                , getSensorParserConfig().getParserConfig()
-        );
+
+    // Build the Stellar cache
+    Map cacheConfig = new HashMap<>();
+    for (Map.Entry entry: sensorToComponentMap.entrySet()) {
+      String sensor = entry.getKey();
+      SensorParserConfig config = getSensorParserConfig(sensor);
+
+      if (config != null) {
+        cacheConfig.putAll(config.getCacheConfig());
       }
     }
+    cache = CachingStellarProcessor.createCache(cacheConfig);
 
-    parser.init();
+    // Need to prep all sensors
+    for (Map.Entry entry: sensorToComponentMap.entrySet()) {
+      String sensor = entry.getKey();
+      MessageParser parser = entry.getValue().getMessageParser();
--- End diff --

I'm still not sure sharing the stellar Context is a good thing.
@cestella
[GitHub] metron pull request #1107: METRON-1673: Fix Javadoc errors
GitHub user justinleet opened a pull request:

https://github.com/apache/metron/pull/1107

METRON-1673: Fix Javadoc errors

## Contributor Comments

Javadoc blew up when we ran `mvn javadoc:javadoc`. Now it should complete successfully, although with a ton of warnings.

To test, just run `mvn javadoc:javadoc` from the root dir.

If we're good with the build matrix from https://github.com/apache/metron/pull/1106, I propose we add the javadoc to the build as part of this PR, in order to avoid this in the future.

I didn't alter `-Xdoclint:none` in the pom, because we use the multiline string lib and it'll blow up. I'm unsure if there's a way around this.

## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.
Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions.
Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides.

In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
- [x] Has your PR been rebased against the latest commit within the target branch (typically master)?

### For code changes:
- [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified and tested manually?
- [x] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:
  ```
  mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh
  ```
- [x] Have you written or updated unit tests and or integration tests to verify your changes?
- [x] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?

### For documentation related changes:
- [x] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and then verify changes via `site-book/target/site/index.html`:
  ```
  cd site-book
  mvn site
  ```

Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.
It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/justinleet/metron javadocFixing

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/metron/pull/1107.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1107

commit dba5984a493a8d6e11ea0e6de5a235f779aa35ee
Author: justinjleet
Date: 2018-07-16T16:33:54Z

    Javadoc error fixes

---
[GitHub] metron pull request #1099: METRON-1657: Parser aggregation in storm
Github user ottobackwards commented on a diff in the pull request: https://github.com/apache/metron/pull/1099#discussion_r202755740 --- Diff: metron-platform/metron-parsers/README.md --- @@ -82,6 +82,12 @@ topology in kafka. Errors are collected with the context of the error (e.g. stacktrace) and original message causing the error and sent to an `error` queue. Invalid messages as determined by global validation functions are also treated as errors and sent to an `error` queue. + +Multiple sensors can be aggregated into a single Storm topology. When this is done, there will be +multiple Kafka spouts, but only a single parser bolt which will handle delegating to the correct --- End diff -- So, this means that there is a kafka topic/spout per parser? ---
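Editorial sketch of the delegation described in the README text above: one bolt-like component holds a parser per sensor and picks the right one for each incoming message. Everything here is an assumption for illustration. `DelegatingParserSketch`, `register`, and `handle` are hypothetical names, parsers are modeled as plain functions rather than Metron's `MessageParser`, and the sensor name is assumed to be known from whichever spout delivered the message. It is not the code in the pull request.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a single parser bolt serving several sensors.
public class DelegatingParserSketch {
  // One parser per sensor name; a parser is modeled as bytes -> parsed string.
  private final Map<String, Function<byte[], String>> parsersBySensor = new HashMap<>();

  void register(String sensor, Function<byte[], String> parser) {
    parsersBySensor.put(sensor, parser);
  }

  // Looks up the parser for the sensor that produced the message and applies it.
  String handle(String sensor, byte[] rawMessage) {
    Function<byte[], String> parser = parsersBySensor.get(sensor);
    if (parser == null) {
      throw new IllegalArgumentException("No parser registered for sensor: " + sensor);
    }
    return parser.apply(rawMessage);
  }

  public static void main(String[] args) {
    DelegatingParserSketch bolt = new DelegatingParserSketch();
    bolt.register("bro", raw -> "bro:" + new String(raw, StandardCharsets.UTF_8));
    bolt.register("yaf", raw -> "yaf:" + new String(raw, StandardCharsets.UTF_8));
    System.out.println(bolt.handle("yaf", "flow-record".getBytes(StandardCharsets.UTF_8)));
  }
}
```

Keying the lookup on the sensor name mirrors the `sensorToComponentMap` visible in the diff, but how the real bolt determines the sensor for a tuple is not shown in this thread.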
[jira] [Commented] (METRON-1657) Parser aggregation in storm
[ https://issues.apache.org/jira/browse/METRON-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545504#comment-16545504 ]

ASF GitHub Bot commented on METRON-1657:

Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/metron/pull/1099#discussion_r202758396

--- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserBolt.java ---
@@ -182,40 +185,61 @@ public void prepare(Map stormConf, TopologyContext context, OutputCollector coll
     super.prepare(stormConf, context, collector);
     messageGetStrategy = MessageGetters.DEFAULT_BYTES_FROM_POSITION.get();
     this.collector = collector;
-    if(getSensorParserConfig() != null) {
-      cache = CachingStellarProcessor.createCache(getSensorParserConfig().getCacheConfig());
-    }
-    initializeStellar();
-    if(getSensorParserConfig() != null && filter == null) {
-      getSensorParserConfig().getParserConfig().putIfAbsent("stellarContext", stellarContext);
-      if (!StringUtils.isEmpty(getSensorParserConfig().getFilterClassName())) {
-        filter = Filters.get(getSensorParserConfig().getFilterClassName()
-                , getSensorParserConfig().getParserConfig()
-        );
+
+    // Build the Stellar cache
+    Map cacheConfig = new HashMap<>();
+    for (Map.Entry entry: sensorToComponentMap.entrySet()) {
+      String sensor = entry.getKey();
+      SensorParserConfig config = getSensorParserConfig(sensor);
+
+      if (config != null) {
+        cacheConfig.putAll(config.getCacheConfig());
       }
     }
+    cache = CachingStellarProcessor.createCache(cacheConfig);
 
-    parser.init();
+    // Need to prep all sensors
+    for (Map.Entry entry: sensorToComponentMap.entrySet()) {
+      String sensor = entry.getKey();
+      MessageParser parser = entry.getValue().getMessageParser();
--- End diff --

I don't believe this is correct. We want to initialize Stellar PER parser. Each should have its own Stellar instance and cache.
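Editorial sketch of what the quoted `prepare` change implies for cache settings: every sensor's cache config is folded into one map with `putAll`, so when two sensors set the same key, the sensor visited later wins. The class name, the `cache.size` key, and the values are hypothetical and only illustrate `Map.putAll` semantics; they are not Metron's actual configuration keys.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CacheConfigMergeSketch {
  // Merges per-sensor cache settings in iteration order.
  // On a key collision the later sensor's value replaces the earlier one.
  static Map<String, Object> merge(List<Map<String, Object>> perSensorConfigs) {
    Map<String, Object> merged = new LinkedHashMap<>();
    for (Map<String, Object> config : perSensorConfigs) {
      if (config != null) {
        merged.putAll(config);
      }
    }
    return merged;
  }

  public static void main(String[] args) {
    Map<String, Object> bro = Map.of("cache.size", 1000); // hypothetical key and value
    Map<String, Object> yaf = Map.of("cache.size", 50);   // hypothetical key and value
    // yaf is merged last, so its value is the one that survives.
    System.out.println(merge(List.of(bro, yaf)));
  }
}
```

This is the practical side of the per-parser versus per-topology question raised in the comment: with a single merged map there is one effective setting per key for the whole topology.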
[jira] [Commented] (METRON-1657) Parser aggregation in storm
[ https://issues.apache.org/jira/browse/METRON-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545611#comment-16545611 ] ASF GitHub Bot commented on METRON-1657: Github user justinleet commented on the issue: https://github.com/apache/metron/pull/1099 @ottobackwards Is there anything we want to do in this PR about the ParserBolt? I agree that it's getting unwieldy, and if there's easy wins it's not a bad opportunity to fix it up.
[jira] [Commented] (METRON-1657) Parser aggregation in storm
[ https://issues.apache.org/jira/browse/METRON-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545668#comment-16545668 ] ASF GitHub Bot commented on METRON-1657: Github user cestella commented on a diff in the pull request: https://github.com/apache/metron/pull/1099#discussion_r202801374 --- Diff: metron-platform/metron-parsers/README.md --- @@ -82,6 +82,12 @@ topology in kafka. Errors are collected with the context of the error (e.g. stacktrace) and original message causing the error and sent to an `error` queue. Invalid messages as determined by global validation functions are also treated as errors and sent to an `error` queue. + +Multiple sensors can be aggregated into a single Storm topology. When this is done, there will be +multiple Kafka spouts, but only a single parser bolt which will handle delegating to the correct --- End diff -- @ottobackwards I think that's essentially the parser chaining stuff I added earlier, am I misunderstanding? The [use-case](https://github.com/apache/metron/tree/master/use-cases/parser_chaining) is using the "we get a ton of types of data in syslog" example.
[jira] [Commented] (METRON-1657) Parser aggregation in storm
[ https://issues.apache.org/jira/browse/METRON-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545679#comment-16545679 ]

ASF GitHub Bot commented on METRON-1657:

Github user cestella commented on a diff in the pull request:

https://github.com/apache/metron/pull/1099#discussion_r202803869

--- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserBolt.java ---
@@ -182,40 +185,61 @@ public void prepare(Map stormConf, TopologyContext context, OutputCollector coll
     super.prepare(stormConf, context, collector);
     messageGetStrategy = MessageGetters.DEFAULT_BYTES_FROM_POSITION.get();
     this.collector = collector;
-    if(getSensorParserConfig() != null) {
-      cache = CachingStellarProcessor.createCache(getSensorParserConfig().getCacheConfig());
-    }
-    initializeStellar();
-    if(getSensorParserConfig() != null && filter == null) {
-      getSensorParserConfig().getParserConfig().putIfAbsent("stellarContext", stellarContext);
-      if (!StringUtils.isEmpty(getSensorParserConfig().getFilterClassName())) {
-        filter = Filters.get(getSensorParserConfig().getFilterClassName()
-                , getSensorParserConfig().getParserConfig()
-        );
+
+    // Build the Stellar cache
+    Map cacheConfig = new HashMap<>();
+    for (Map.Entry entry: sensorToComponentMap.entrySet()) {
+      String sensor = entry.getKey();
+      SensorParserConfig config = getSensorParserConfig(sensor);
+
+      if (config != null) {
+        cacheConfig.putAll(config.getCacheConfig());
       }
     }
+    cache = CachingStellarProcessor.createCache(cacheConfig);
 
-    parser.init();
+    // Need to prep all sensors
+    for (Map.Entry entry: sensorToComponentMap.entrySet()) {
+      String sensor = entry.getKey();
+      MessageParser parser = entry.getValue().getMessageParser();
--- End diff --

So, the consequences of this decision are as follows:
* You share an expression cache (i.e. the statement -> abstract syntax tree cache; distinct from the expression -> evaluated return cache)
* You share a Stellar value cache (expression -> evaluated return)
* You share the state in the Context (e.g. hbase connections, zookeeper connections).

On the whole, anything shared in the context is intended to be shared across users and sensors by virtue of Stellar being used in the enrichment topology (where it's not sensor-by-sensor), so we should be ok there.

The real question is whether users would prefer to have one knob per topology for Stellar cache sizing or one knob per sensor. I'd say that I'm ok with how this PR is doing it, because it's easier to reason about resources, IMO, from a per-topology perspective.
[jira] [Commented] (METRON-1657) Parser aggregation in storm
[ https://issues.apache.org/jira/browse/METRON-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545696#comment-16545696 ] ASF GitHub Bot commented on METRON-1657: Github user cestella commented on a diff in the pull request: https://github.com/apache/metron/pull/1099#discussion_r202809295 --- Diff: metron-platform/metron-parsers/README.md --- @@ -82,6 +82,12 @@ topology in kafka. Errors are collected with the context of the error (e.g. stacktrace) and original message causing the error and sent to an `error` queue. Invalid messages as determined by global validation functions are also treated as errors and sent to an `error` queue. + +Multiple sensors can be aggregated into a single Storm topology. When this is done, there will be +multiple Kafka spouts, but only a single parser bolt which will handle delegating to the correct --- End diff -- Yeah, in order to do that, we'd need to execute the DAG without the intermediate kafka step. That'd be a follow-on. This is the first step in that. We have all the information here (the input -> output mapping of kafka topics), so it shouldn't be so hard to intervene and cut the intermediate kafka writing step out.
[GitHub] metron pull request #1099: METRON-1657: Parser aggregation in storm
Github user ottobackwards commented on a diff in the pull request: https://github.com/apache/metron/pull/1099#discussion_r202808756 --- Diff: metron-platform/metron-parsers/README.md --- @@ -82,6 +82,12 @@ topology in kafka. Errors are collected with the context of the error (e.g. stacktrace) and original message causing the error and sent to an `error` queue. Invalid messages as determined by global validation functions are also treated as errors and sent to an `error` queue. + +Multiple sensors can be aggregated into a single Storm topology. When this is done, there will be +multiple Kafka spouts, but only a single parser bolt which will handle delegating to the correct --- End diff -- The bolt is actually executing the parser, not sending it to kafka though. Let's say that I route all my syslog stuff (multiple message types) through 1 kafka topic (not tied to _ANY_ sensor). I would want to point this bolt at that one topic, and have it route to multiple parsers. I think then they all just go to enrichment? ---
[jira] [Created] (METRON-1674) Create REST endpoint for job status abstraction
Ryan Merriman created METRON-1674:
-------------------------------------
Summary: Create REST endpoint for job status abstraction
Key: METRON-1674
URL: https://issues.apache.org/jira/browse/METRON-1674
Project: Metron
Issue Type: Sub-task
Reporter: Ryan Merriman

We need a REST endpoint that will enable us to get the status of a running job.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] metron pull request #1109: METRON-1674: Create REST endpoint for job status ...
GitHub user merrimanr opened a pull request: https://github.com/apache/metron/pull/1109

METRON-1674: Create REST endpoint for job status abstraction

## Contributor Comments

This PR is built on top of https://github.com/apache/metron/pull/1108 and should be merged, with final review done, after that PR is accepted into the feature branch. This exposes the job manager and job status abstraction through a REST endpoint. A summary of the changes included:

- Adjustments to the existing PcapService to accommodate the new JobManager abstraction
- JobManager is now a Spring bean and job submission/status is done through that
- New properties were added to application.yml and PcapServiceImpl
- PcapJobSupplier was added that allows switching to a mock pcap job during integration testing
- Unit and integration tests were adjusted to match
- Time parameters are now `startTimeMs` and `endTimeMs`
- FixedPcapRequest now matches the pattern used in PcapRequest
- Addition of a PcapStatus object to return status in a simple, consumable structure for the UI
- Endpoint to get job status was added
- ConfigOption was adjusted to automatically handle type conversions in cases where Jackson deserialization is used (Integer to Long for example)
- InMemoryJobManager now throws a JobNotFoundException when jobs don't exist for a username/job id combination
- PcapJob will automatically convert PcapOptions.START_TIME_MS to PcapOptions.START_TIME_NS when START_TIME_NS is not set (same goes for END_TIME_NS)
- New unit and integration tests added for get status endpoint and service

This has been lightly tested in full dev. The HDFS paths must be created manually for the paths specified in application.yml. For now, only job submission and subsequent status retrieval have been tested. This PR should strictly be used for testing at this point.

To test in full dev, create the HDFS directories mentioned above and put pcap data in `/apps/metron/pcap/input`. Submit a fixed pcap query:

```
curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '{ "endTime": 1458240269424, "startTime": 1458240269419 }' 'http://node1:8082/api/v1/pcap/fixed'
```

A job id should be returned in the response:

```
{
  "jobId": "job_1531258337010_0021",
  "jobStatus": "RUNNING",
  "description": "map: 0.0%, reduce: 0.0%",
  "percentComplete": 0,
  "size": 0
}
```

Job status can now be retrieved using the get job status endpoint:

```
curl -X GET --header 'Accept: application/json' 'http://node1:8082/api/v1/pcap/job_1531258337010_0021'
```

A full, comprehensive review should be done after https://github.com/apache/metron/pull/1108 has been merged.

## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron. Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides.

In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following:

### For all changes:

- [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
- [x] Has your PR been rebased against the latest commit within the target branch (typically master)?

### For code changes:

- [ ] Have you included steps to reproduce the behavior or problem that is being changed or addressed?
- [ ] Have you included steps or a guide to how the change may be verified and tested manually?
- [ ] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:

  ```
  mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh
  ```

- [x] Have you written or updated unit tests and or integration tests to verify your changes?
- [x] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?

### For documentation
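The `START_TIME_MS` to `START_TIME_NS` fallback listed in the change summary of PR #1109 above could look roughly like the sketch below. It is not the Metron implementation: the class `PcapTimeOptionsSketch`, its string keys, and the `normalize` helper are hypothetical, and a plain `Map` stands in for the real options object. Reading the value through `Number` also illustrates the Integer-to-Long tolerance the summary mentions for Jackson-deserialized values.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the option handling described in the PR summary.
class PcapTimeOptionsSketch {
    // Key names are assumptions that only mirror the PcapOptions constants by name.
    static final String START_TIME_MS = "START_TIME_MS";
    static final String START_TIME_NS = "START_TIME_NS";
    static final String END_TIME_MS = "END_TIME_MS";
    static final String END_TIME_NS = "END_TIME_NS";

    // Fill in the nanosecond option from the millisecond one, but only when the
    // nanosecond option has not been set explicitly.
    static void fillNanos(Map<String, Object> options, String msKey, String nsKey) {
        Object ms = options.get(msKey);
        if (options.get(nsKey) == null && ms != null) {
            // Number covers both Integer and Long, as produced by JSON deserialization.
            long millis = ((Number) ms).longValue();
            options.put(nsKey, TimeUnit.MILLISECONDS.toNanos(millis));
        }
    }

    // Return a copy of the options with both nanosecond values populated where possible.
    static Map<String, Object> normalize(Map<String, Object> options) {
        Map<String, Object> copy = new HashMap<>(options);
        fillNanos(copy, START_TIME_MS, START_TIME_NS);
        fillNanos(copy, END_TIME_MS, END_TIME_NS);
        return copy;
    }

    public static void main(String[] args) {
        Map<String, Object> options = new HashMap<>();
        // Millisecond values taken from the curl example in the PR description.
        options.put(START_TIME_MS, 1458240269419L);
        options.put(END_TIME_MS, 1458240269424L);
        Map<String, Object> normalized = normalize(options);
        System.out.println(normalized.get(START_TIME_NS));
        System.out.println(normalized.get(END_TIME_NS));
    }
}
```

An explicitly supplied nanosecond value is left untouched, which matches the "when START_TIME_NS is not set" wording in the summary.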
[jira] [Commented] (METRON-1657) Parser aggregation in storm
[ https://issues.apache.org/jira/browse/METRON-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545717#comment-16545717 ] ASF GitHub Bot commented on METRON-1657: Github user justinleet commented on a diff in the pull request: https://github.com/apache/metron/pull/1099#discussion_r202818595 --- Diff: metron-platform/metron-parsers/README.md --- @@ -82,6 +82,12 @@ topology in kafka. Errors are collected with the context of the error (e.g. stacktrace) and original message causing the error and sent to an `error` queue. Invalid messages as determined by global validation functions are also treated as errors and sent to an `error` queue. + +Multiple sensors can be aggregated into a single Storm topology. When this is done, there will be +multiple Kafka spouts, but only a single parser bolt which will handle delegating to the correct --- End diff -- Yeah, I'll go ahead and add a diagram to the doc, and flesh out that explanation > Parser aggregation in storm > --- > > Key: METRON-1657 > URL: https://issues.apache.org/jira/browse/METRON-1657 > Project: Metron > Issue Type: Bug >Reporter: Justin Leet >Assignee: Justin Leet >Priority: Major > > Currently our parsing solution requires one storm topology per sensor. It has > been complained that this may be wasteful of resources and that, rather than > one storm topology per sensor, it would be advantageous to have multiple > sensors in the same topology. The benefit to this is that it would require > fewer storm slots. > The issue with this is that whenever we've aggregated functionality like this > before, we've run into issues appropriately being able to scale storm (e.g. > batch vs random access indexing in the same topology). The main point in > addressing this is to recommend that parsers with similar velocities and > complexity are grouped together. 
> Particularly for a first cut, leave the configuration mostly as-is, while > allowing for comma separated lists of sensors in start_parser_topology.sh > (e.g. bro,yaf creates an aggregated parser consisting of those two).
[GitHub] metron pull request #1099: METRON-1657: Parser aggregation in storm
Github user cestella commented on a diff in the pull request: https://github.com/apache/metron/pull/1099#discussion_r202813546 --- Diff: metron-platform/metron-parsers/README.md --- @@ -82,6 +82,12 @@ topology in kafka. Errors are collected with the context of the error (e.g. stacktrace) and original message causing the error and sent to an `error` queue. Invalid messages as determined by global validation functions are also treated as errors and sent to an `error` queue. + +Multiple sensors can be aggregated into a single Storm topology. When this is done, there will be +multiple Kafka spouts, but only a single parser bolt which will handle delegating to the correct --- End diff -- Well, parser chaining allows for DAGs of parsers, not just one level. Also, you might not want to group parsers based on chained units, but rather based on velocity or some other metric (e.g. I don't want to group a high velocity sensor with a bunch of low velocity sensors in the syslog case). In that case, you would need the intermediate kafka topics. ---
[jira] [Commented] (METRON-1657) Parser aggregation in storm
[ https://issues.apache.org/jira/browse/METRON-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545710#comment-16545710 ] ASF GitHub Bot commented on METRON-1657: Github user cestella commented on a diff in the pull request: https://github.com/apache/metron/pull/1099#discussion_r202814185 --- Diff: metron-platform/metron-parsers/README.md --- @@ -82,6 +82,12 @@ topology in kafka. Errors are collected with the context of the error (e.g. stacktrace) and original message causing the error and sent to an `error` queue. Invalid messages as determined by global validation functions are also treated as errors and sent to an `error` queue. + +Multiple sensors can be aggregated into a single Storm topology. When this is done, there will be +multiple Kafka spouts, but only a single parser bolt which will handle delegating to the correct --- End diff -- You also might want 2 different parsers that source from the same kafka topic (think: 1 parser to send the data to enrichment and 1 parser to send to hbase as a streaming enrichment for authentication data) > Parser aggregation in storm > --- > > Key: METRON-1657 > URL: https://issues.apache.org/jira/browse/METRON-1657 > Project: Metron > Issue Type: Bug >Reporter: Justin Leet >Assignee: Justin Leet >Priority: Major > > Currently our parsing solution requires one storm topology per sensor. It has > been complained that this may be wasteful of resources and that, rather than > one storm topology per sensor, it would be advantageous to have multiple > sensors in the same topology. The benefit to this is that it would require > fewer storm slots. > The issue with this is that whenever we've aggregated functionality like this > before, we've run into issues appropriately being able to scale storm (e.g. > batch vs random access indexing in the same topology). The main point in > addressing this is to recommend that parsers with similar velocities and > complexity are grouped together. 
> Particularly for a first cut, leave the configuration mostly as-is, while > allowing for comma separated lists of sensors in start_parser_topology.sh > (e.g. bro,yaf creates an aggregated parser consisting of those two).
[GitHub] metron pull request #1099: METRON-1657: Parser aggregation in storm
Github user ottobackwards commented on a diff in the pull request: https://github.com/apache/metron/pull/1099#discussion_r202812681 --- Diff: metron-platform/metron-parsers/README.md --- @@ -82,6 +82,12 @@ topology in kafka. Errors are collected with the context of the error (e.g. stacktrace) and original message causing the error and sent to an `error` queue. Invalid messages as determined by global validation functions are also treated as errors and sent to an `error` queue. + +Multiple sensors can be aggregated into a single Storm topology. When this is done, there will be +multiple Kafka spouts, but only a single parser bolt which will handle delegating to the correct --- End diff -- What *is* the intermediate kafka step? That is what is confusing me. My understanding is that for each sensor you reference it will build a spout for that sensor parser topic, and then pass everything from those to this bolt, which will call the right parser and output to... I'm not sure where. Why do we have to have a sensor-specific topic at all? ---
[GitHub] metron pull request #1099: METRON-1657: Parser aggregation in storm
Github user cestella commented on a diff in the pull request: https://github.com/apache/metron/pull/1099#discussion_r202817242 --- Diff: metron-platform/metron-parsers/README.md --- @@ -82,6 +82,12 @@ topology in kafka. Errors are collected with the context of the error (e.g. stacktrace) and original message causing the error and sent to an `error` queue. Invalid messages as determined by global validation functions are also treated as errors and sent to an `error` queue. + +Multiple sensors can be aggregated into a single Storm topology. When this is done, there will be +multiple Kafka spouts, but only a single parser bolt which will handle delegating to the correct --- End diff --

@justinleet can you maybe create a data flow diagram or sequence diagram that shows a syslog record from the use-case flowing through this topology and add it to the use-case around parser chaining? It'd be something like, given a `cisco-6-302` record, it'll go:

* From NiFi to the `pix_syslog_router` kafka topic
* From the `pix_syslog_router` kafka topic to the `pix_syslog_router` spout in the aggregated storm topology
* From the `pix_syslog_router` kafka spout to the parser bolt, which will run the `pix_syslog_router` Grok parser and write out to the `cisco-6-302` kafka topic
* From the `cisco-6-302` kafka topic to the `cisco-6-302` spout in the aggregated storm topology
* From the `cisco-6-302` kafka spout to the `cisco-6-302` Grok parser and write out to the `enrichments` kafka topic, where it's picked up by the enrichment topology.

Eventually, we should consider taking out the writing to the `cisco-6-302` topic (optionally), but even eventually there may be value in those intermediate kafka topics due to how users may want to group sensors (e.g. grouping may be done via velocity or scalability requirements, rather than logical connection).

---
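The hops listed above can be traced with a small simulation. This is a sketch under assumptions, not Metron code: `ParserChainFlowSketch`, `outputTopics`, and `trace` are hypothetical, and the router's output topic is hard-wired to `cisco-6-302`, whereas a real router parser would choose the topic from the record's content.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical trace of one record through an aggregated, chained parser topology.
class ParserChainFlowSketch {

    // Sensor (and input topic) name mapped to the topic its parser writes to.
    static Map<String, String> outputTopics() {
        Map<String, String> out = new HashMap<>();
        out.put("pix_syslog_router", "cisco-6-302"); // assumed fixed for this record
        out.put("cisco-6-302", "enrichments");
        return out;
    }

    // Follow the record from its first topic until it reaches a topic that no
    // sensor in the aggregated topology reads from.
    static List<String> trace(String firstTopic, Map<String, String> outputTopics) {
        List<String> hops = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        String topic = firstTopic;
        while (outputTopics.containsKey(topic)) {
            if (!visited.add(topic)) {
                // Chains are expected to form a DAG; a repeat means a misconfiguration.
                throw new IllegalStateException("Cycle detected at topic: " + topic);
            }
            hops.add("kafka topic " + topic);
            hops.add("kafka spout " + topic);
            hops.add("parser bolt runs " + topic + " parser");
            topic = outputTopics.get(topic);
        }
        hops.add("kafka topic " + topic);
        return hops;
    }

    public static void main(String[] args) {
        for (String hop : trace("pix_syslog_router", outputTopics())) {
            System.out.println(hop);
        }
    }
}
```

Each pass through the loop is one round trip through Kafka, which is the intermediate step questioned in the review thread; dropping the write to the `cisco-6-302` topic would collapse two iterations into one.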
[GitHub] metron pull request #1103: METRON-1554: Initial PCAP UI
GitHub user tiborm reopened a pull request: https://github.com/apache/metron/pull/1103

METRON-1554: Initial PCAP UI

## Contributor Comments

This PR contains the initial cut of PCAP UI.

https://user-images.githubusercontent.com/2437400/42747095-28d510bc-88db-11e8-8501-98c82f6ec521.png

https://user-images.githubusercontent.com/2437400/42747099-2cbd9db6-88db-11e8-8be8-7f2bb2971fe3.png

## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron. Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides.

In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following:

### For all changes:

- [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
- [x] Has your PR been rebased against the latest commit within the target branch (typically master)?

### For code changes:

- [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified and tested manually?
- [x] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:

  ```
  mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh
  ```

- [x] Have you written or updated unit tests and or integration tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?

### For documentation related changes:

- [ ] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and then verify changes via `site-book/target/site/index.html`:

  ```
  cd site-book
  mvn site
  ```

Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tiborm/metron feature/METRON-1554-pcap-query-panel

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/metron/pull/1103.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1103

commit 872d1b1ee13e358c18956945d71d3667d19fca8a
Author: merrimanr
Date: 2018-04-12T14:57:48Z

    Merge branch 'pcap-front' of https://github.com/simonellistonball/metron into pcaprest

    Conflicts:
    metron-interface/metron-alerts/src/app/app.module.ts

commit b1b6a7dabea1a1d0d132482c8d97af29c0ac2683
Author: merrimanr
Date: 2018-04-13T15:00:15Z

    initial commit

    Conflicts:
    metron-interface/metron-rest/src/main/java/org/apache/metron/rest/MetronRestApplication.java
    metron-interface/metron-rest/src/main/java/org/apache/metron/rest/controller/PcapQueryController.java
    metron-interface/metron-rest/src/main/java/org/apache/metron/rest/util/pcapQueryThread.java

commit 55cf2d945a4fcff1e7e2e47a234037ed6f394b2e
Author: merrimanr
Date: 2018-04-18T15:52:56Z

    added license headers

---
[GitHub] metron pull request #1104: METRON-1670 Stellar WEEK_OF_YEAR test is locale s...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/1104#discussion_r202600375 --- Diff: metron-stellar/stellar-common/src/test/java/org/apache/metron/stellar/dsl/functions/DateFunctionsTest.java --- @@ -182,7 +182,8 @@ public void testDayOfMonthNull() { @Test public void testWeekOfYear() { Object result = run("WEEK_OF_YEAR(epoch)"); -assertEquals(35, result); +calendar.setTimeInMillis(AUG2016); --- End diff -- It is perfectly safe: The instance is created in the @Before annotated method, so in fact, I am already doing exactly what you suggest. See: https://github.com/simonellistonball/metron/blob/bafd3827d273c5c33b4293fd338595b8de3a75b2/metron-stellar/stellar-common/src/test/java/org/apache/metron/stellar/dsl/functions/DateFunctionsTest.java#L61 So there is no sharing of this object between tests, since a new instance is acquired before each test method is run. The second issue you raise should be handled by a separate unit of change and a separate ticket if it is really necessary (not sure why it would be, but perhaps you could raise a JIRA and explain). ---
[jira] [Commented] (METRON-1554) Pcap Query Panel
[ https://issues.apache.org/jira/browse/METRON-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544895#comment-16544895 ] ASF GitHub Bot commented on METRON-1554: Github user tiborm closed the pull request at: https://github.com/apache/metron/pull/1103 > Pcap Query Panel > > > Key: METRON-1554 > URL: https://issues.apache.org/jira/browse/METRON-1554 > Project: Metron > Issue Type: New Feature >Reporter: Ryan Merriman >Priority: Major > > Legacy OpenSOC included a panel in Kibana that allowed users to query for > pcap data. We would like to add this feature back into Metron. There are 2 > discussions happening on the dev list where we are gathering user > requirements: > [http://mail-archives.apache.org/mod_mbox/metron-dev/201805.mbox/%3CCAEVkqPYxfe3Q65mX7Mkuk_FKUCV420yb6hcLmf+FF=1ozer...@mail.gmail.com%3E] > and working through the backend architecture: > [http://mail-archives.apache.org/mod_mbox/metron-dev/201805.mbox/%3ccaevkqpbxzjnu_wgrbfwnz-mvqnkb7mthedveq9plyhwfit7...@mail.gmail.com%3E] > Forthcoming sub tasks will be based on the outcome of these discussions. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (METRON-1670) Stellar WEEK_OF_YEAR test is locale sensitive
[ https://issues.apache.org/jira/browse/METRON-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544932#comment-16544932 ] ASF GitHub Bot commented on METRON-1670: Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/1104#discussion_r202600375 --- Diff: metron-stellar/stellar-common/src/test/java/org/apache/metron/stellar/dsl/functions/DateFunctionsTest.java --- @@ -182,7 +182,8 @@ public void testDayOfMonthNull() { @Test public void testWeekOfYear() { Object result = run("WEEK_OF_YEAR(epoch)"); -assertEquals(35, result); +calendar.setTimeInMillis(AUG2016); --- End diff -- It is perfectly safe: The instance is created in the @Before annotated method, so in fact, I am already doing exactly what you suggest. See: https://github.com/simonellistonball/metron/blob/bafd3827d273c5c33b4293fd338595b8de3a75b2/metron-stellar/stellar-common/src/test/java/org/apache/metron/stellar/dsl/functions/DateFunctionsTest.java#L61 So there is no sharing of this object between tests, since a new instance is acquired before each test method is run. The second issue you raise should be handled by a separate unit of change and a separate ticket if it is really necessary (not sure why it would be, but perhaps you could raise a JIRA and explain). > Stellar WEEK_OF_YEAR test is locale sensitive > - > > Key: METRON-1670 > URL: https://issues.apache.org/jira/browse/METRON-1670 > Project: Metron > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Simon Elliston Ball >Priority: Trivial > > The Stellar WEEK_OF_YEAR(epoch) function is sensitive to the locale of the > machine it is running on. The tests in DateFunctionsTest are not, this leads > to test failures on machine locales that differ in their first day of week > definition or days in first week definition. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
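The locale sensitivity described in METRON-1670 can be reproduced with plain `java.util.Calendar`. The following is a minimal sketch, not code from the pull request: the class name `WeekOfYearSketch` and its helper are hypothetical, and the two week rules are set explicitly (rather than taken from a `Locale`) so the result does not depend on the JVM's locale data. It shows that the same instant falls in a different week of the year depending on the first-day-of-week and minimal-days-in-first-week settings.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical illustration; not part of Metron or of PR #1104.
public class WeekOfYearSketch {

  // Returns WEEK_OF_YEAR for noon UTC on the given date under explicit week rules.
  static int weekOfYear(int year, int month, int day,
                        int firstDayOfWeek, int minimalDaysInFirstWeek) {
    Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
    cal.clear();
    // These two settings are what normally varies between machine locales.
    cal.setFirstDayOfWeek(firstDayOfWeek);
    cal.setMinimalDaysInFirstWeek(minimalDaysInFirstWeek);
    cal.set(year, month, day, 12, 0, 0);
    return cal.get(Calendar.WEEK_OF_YEAR);
  }

  public static void main(String[] args) {
    // 1 January 2016 was a Friday.
    int usStyle = weekOfYear(2016, Calendar.JANUARY, 1, Calendar.SUNDAY, 1);
    int isoStyle = weekOfYear(2016, Calendar.JANUARY, 1, Calendar.MONDAY, 4);
    System.out.println("Sunday-first, 1 minimal day:  " + usStyle);
    System.out.println("Monday-first, 4 minimal days: " + isoStyle);
  }
}
```

A test that hard-codes one of these values passes only on machines whose locale yields the matching rules, which is why computing the expected value from a `Calendar` configured the same way as the code under test is one way to make the assertion portable.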
[jira] [Updated] (METRON-1662) PCAP UI: Downloading PCAP page files
[ https://issues.apache.org/jira/browse/METRON-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tibor Meller updated METRON-1662: - Summary: PCAP UI: Downloading PCAP page files (was: PCAP UI: Downloading PCAP file) > PCAP UI: Downloading PCAP page files > > > Key: METRON-1662 > URL: https://issues.apache.org/jira/browse/METRON-1662 > Project: Metron > Issue Type: New Feature >Reporter: Tibor Meller >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (METRON-1236) Add mpack support for ambari-agent running as non-root user
[ https://issues.apache.org/jira/browse/METRON-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544912#comment-16544912 ] ASF GitHub Bot commented on METRON-1236: GitHub user Condla opened a pull request: https://github.com/apache/metron/pull/1105 METRON-1236 Add start/stop/restart commands that execute successfully… …, when ambari agents run as non-root user ## Contributor Comments Fix for issue https://issues.apache.org/jira/browse/METRON-1236 * In order to make it work, the non-root user - in the following example "ambari" - additionally needs the following permission given in a sudoers file: ``` ambari ALL=(ALL) NOPASSWD:SETENV: /usr/sbin/service metron-alerts-ui * ambari ALL=(ALL) NOPASSWD:SETENV: /usr/sbin/service metron-management-ui * ``` ## Pull Request Checklist Thank you for submitting a contribution to Apache Metron. Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides. In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following: ### For all changes: - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [x] Does your PR title start with METRON- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For code changes: - [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed? 
- [x] Have you included steps or a guide to how the change may be verified and tested manually? - [ ] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via: ``` mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh ``` not applicable - [ ] Have you written or updated unit tests and or integration tests to verify your changes? not applicable - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? not applicable - [ ] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent? not applicable ### For documentation related changes: - [ ] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and the verify changes via `site-book/target/site/index.html`: ``` cd site-book mvn site ``` not applicable Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request. 
You can merge this pull request into a Git repository by running: $ git pull https://github.com/Condla/metron fix-METRON-1236 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/metron/pull/1105.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1105 commit bcc01d3f5cf996d1a2c5cc175c2141518dd71d3b Author: Stefan Kupstaitis-Dunkler Date: 2018-07-16T07:41:33Z METRON-1236 Add start/stop/restart commands that execute successfully, when ambari agents run as non-root user > Add mpack support for ambari-agent running as non-root user > --- > > Key: METRON-1236 > URL: https://issues.apache.org/jira/browse/METRON-1236 > Project: Metron > Issue Type: Improvement >Reporter: Kyle Richardson >Priority: Minor > Labels: mpack > > The current service start/stop/status/restart commands do not utilize `sudo` > and therefore the ambari-agent must be running as root on each node. -- This message was sent by
[GitHub] metron pull request #1105: METRON-1236 Add start/stop/restart commands that ...
GitHub user Condla opened a pull request: https://github.com/apache/metron/pull/1105 METRON-1236 Add start/stop/restart commands that execute successfully… …, when ambari agents run as non-root user ## Contributor Comments Fix for issue https://issues.apache.org/jira/browse/METRON-1236 * In order to make it work, the non-root user - in the following example "ambari" - additionally needs the following permission given in a sudoers file: ``` ambari ALL=(ALL) NOPASSWD:SETENV: /usr/sbin/service metron-alerts-ui * ambari ALL=(ALL) NOPASSWD:SETENV: /usr/sbin/service metron-management-ui * ``` ## Pull Request Checklist Thank you for submitting a contribution to Apache Metron. Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides. In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following: ### For all changes: - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [x] Does your PR title start with METRON- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For code changes: - [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed? - [x] Have you included steps or a guide to how the change may be verified and tested manually? 
- [ ] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via: ``` mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh ``` not applicable - [ ] Have you written or updated unit tests and or integration tests to verify your changes? not applicable - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? not applicable - [ ] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent? not applicable ### For documentation related changes: - [ ] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and the verify changes via `site-book/target/site/index.html`: ``` cd site-book mvn site ``` not applicable Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/Condla/metron fix-METRON-1236 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/metron/pull/1105.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1105 commit bcc01d3f5cf996d1a2c5cc175c2141518dd71d3b Author: Stefan Kupstaitis-Dunkler Date: 2018-07-16T07:41:33Z METRON-1236 Add start/stop/restart commands that execute successfully, when ambari agents run as non-root user ---
[jira] [Created] (METRON-1673) Fix Javadoc errors
Justin Leet created METRON-1673: --- Summary: Fix Javadoc errors Key: METRON-1673 URL: https://issues.apache.org/jira/browse/METRON-1673 Project: Metron Issue Type: Bug Reporter: Justin Leet Assignee: Justin Leet Our Javadocs have errors. They should be fixed. Ideally, building Javadoc should be part of the build. However, we've held off because of time constraints in the build. [https://github.com/apache/metron/pull/1099] proposes a build matrix for our build, and we could slot it into one of the sub builds without these time constraints. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] metron pull request #1099: METRON-1657: Parser aggregation in storm
Github user ottobackwards commented on a diff in the pull request: https://github.com/apache/metron/pull/1099#discussion_r202758396 --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserBolt.java --- @@ -182,40 +185,61 @@ public void prepare(Map stormConf, TopologyContext context, OutputCollector coll super.prepare(stormConf, context, collector); messageGetStrategy = MessageGetters.DEFAULT_BYTES_FROM_POSITION.get(); this.collector = collector; -if(getSensorParserConfig() != null) { - cache = CachingStellarProcessor.createCache(getSensorParserConfig().getCacheConfig()); -} -initializeStellar(); -if(getSensorParserConfig() != null && filter == null) { - getSensorParserConfig().getParserConfig().putIfAbsent("stellarContext", stellarContext); - if (!StringUtils.isEmpty(getSensorParserConfig().getFilterClassName())) { -filter = Filters.get(getSensorParserConfig().getFilterClassName() -, getSensorParserConfig().getParserConfig() -); + +// Build the Stellar cache +Map cacheConfig = new HashMap<>(); +for (Map.Entry entry: sensorToComponentMap.entrySet()) { + String sensor = entry.getKey(); + SensorParserConfig config = getSensorParserConfig(sensor); + + if (config != null) { +cacheConfig.putAll(config.getCacheConfig()); } } +cache = CachingStellarProcessor.createCache(cacheConfig); -parser.init(); +// Need to prep all sensors +for (Map.Entry entry: sensorToComponentMap.entrySet()) { + String sensor = entry.getKey(); + MessageParser parser = entry.getValue().getMessageParser(); --- End diff -- I don't believe this is correct. We want to initialize stellar PER parser. Each should have it's own stellar instance and cache. ---
[jira] [Commented] (METRON-1657) Parser aggregation in storm
[ https://issues.apache.org/jira/browse/METRON-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545512#comment-16545512 ] ASF GitHub Bot commented on METRON-1657: Github user justinleet commented on a diff in the pull request: https://github.com/apache/metron/pull/1099#discussion_r202761519 --- Diff: metron-platform/metron-parsers/README.md --- @@ -82,6 +82,12 @@ topology in kafka. Errors are collected with the context of the error (e.g. stacktrace) and original message causing the error and sent to an `error` queue. Invalid messages as determined by global validation functions are also treated as errors and sent to an `error` queue. + +Multiple sensors can be aggregated into a single Storm topology. When this is done, there will be +multiple Kafka spouts, but only a single parser bolt which will handle delegating to the correct --- End diff -- Correct, it's aggregating existing topics / sensors into a single topology (with multiple spouts). It's pulling from each individual topic/sensor as a spout, and then passing to a single parser bolt which handles parsing and output as needed. > Parser aggregation in storm > --- > > Key: METRON-1657 > URL: https://issues.apache.org/jira/browse/METRON-1657 > Project: Metron > Issue Type: Bug >Reporter: Justin Leet >Assignee: Justin Leet >Priority: Major > > Currently our parsing solution requires one storm topology per sensor. It has > been complained that this may be wasteful of resources and that, rather than > one storm topology per sensor, it would be advantageous to have multiple > sensors in the same topology. The benefit to this is that it would require > fewer storm slots. > The issue with this is that whenever we've aggregated functionality like this > before, we've run into issues appropriately being able to scale storm (e.g. > batch vs random access indexing in the same topology). 
The main point in > addressing this is to recommend that parsers with similar velocities and > complexity are grouped together. > Particularly for a first cut, leave the configuration mostly as-is, while > allowing for comma separated lists of sensors in start_parser_topology.sh > (e.g. bro,yaf creates an aggregated parser consisting of those two). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
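The aggregation described above (several spouts, one per sensor topic, feeding a single bolt that delegates to the right parser) can be sketched in plain Java without any Storm dependency. This is only an illustration under stated assumptions: `AggregatedParserSketch`, `register`, `parse`, and the `"sensor"` field name are hypothetical and are not the `ParserBolt` API from PR #1099.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a single bolt serving several sensors.
public class AggregatedParserSketch {

  // One parser per sensor, keyed by sensor name (e.g. "bro", "yaf").
  private final Map<String, Function<byte[], Map<String, Object>>> sensorToParser =
      new HashMap<>();

  void register(String sensor, Function<byte[], Map<String, Object>> parser) {
    sensorToParser.put(sensor, parser);
  }

  // Delegates the raw message to the parser registered for the originating sensor.
  Map<String, Object> parse(String sensor, byte[] raw) {
    Function<byte[], Map<String, Object>> parser = sensorToParser.get(sensor);
    if (parser == null) {
      throw new IllegalArgumentException("No parser registered for sensor: " + sensor);
    }
    Map<String, Object> message = new HashMap<>(parser.apply(raw));
    message.put("sensor", sensor); // hypothetical field recording the origin
    return message;
  }

  public static void main(String[] args) {
    AggregatedParserSketch bolt = new AggregatedParserSketch();
    bolt.register("bro", raw -> {
      Map<String, Object> m = new HashMap<>();
      m.put("raw", new String(raw, StandardCharsets.UTF_8));
      return m;
    });
    System.out.println(bolt.parse("bro", "hello".getBytes(StandardCharsets.UTF_8)));
  }
}
```

Because every sensor shares the one dispatching component, a slow or high-volume parser affects its neighbours, which is the reason the ticket recommends grouping parsers with similar velocity and complexity.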
[jira] [Commented] (METRON-1657) Parser aggregation in storm
[ https://issues.apache.org/jira/browse/METRON-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545610#comment-16545610 ] ASF GitHub Bot commented on METRON-1657: Github user justinleet commented on a diff in the pull request: https://github.com/apache/metron/pull/1099#discussion_r202785248 --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserBolt.java --- @@ -182,40 +185,61 @@ public void prepare(Map stormConf, TopologyContext context, OutputCollector coll super.prepare(stormConf, context, collector); messageGetStrategy = MessageGetters.DEFAULT_BYTES_FROM_POSITION.get(); this.collector = collector; -if(getSensorParserConfig() != null) { - cache = CachingStellarProcessor.createCache(getSensorParserConfig().getCacheConfig()); -} -initializeStellar(); -if(getSensorParserConfig() != null && filter == null) { - getSensorParserConfig().getParserConfig().putIfAbsent("stellarContext", stellarContext); - if (!StringUtils.isEmpty(getSensorParserConfig().getFilterClassName())) { -filter = Filters.get(getSensorParserConfig().getFilterClassName() -, getSensorParserConfig().getParserConfig() -); + +// Build the Stellar cache +Map cacheConfig = new HashMap<>(); +for (Map.Entry entry: sensorToComponentMap.entrySet()) { + String sensor = entry.getKey(); + SensorParserConfig config = getSensorParserConfig(sensor); + + if (config != null) { +cacheConfig.putAll(config.getCacheConfig()); } } +cache = CachingStellarProcessor.createCache(cacheConfig); -parser.init(); +// Need to prep all sensors +for (Map.Entry entry: sensorToComponentMap.entrySet()) { + String sensor = entry.getKey(); + MessageParser parser = entry.getValue().getMessageParser(); --- End diff -- I left it as a single shared cache on purpose. I don't believe that there'd be any incorrect evictions by sharing the cache, and I think evicting based on the overall usage in the aggregated parser is the appropriate place to handle it. 
Since the cache is (mostly) LRU, I'd prefer to drop the least recently used entry of all parsers rather than dropping for each parser. LRU of the overall flow seems better than LRU of each of the sensors. Assuming a single cache, you'd bump up the cache configs to account for this, rather than having to optimize each config individually (and potentially as a group afterwards). Caching is also off by default for the parsers, so this is a case that's only hit if the user explicitly chooses to do so. Having said that, I do think I need to shore up the documentation around that logic, assuming we choose to go forward with it. Let me know what you think, and I can adjust appropriately. > Parser aggregation in storm > --- > > Key: METRON-1657 > URL: https://issues.apache.org/jira/browse/METRON-1657 > Project: Metron > Issue Type: Bug >Reporter: Justin Leet >Assignee: Justin Leet >Priority: Major > > Currently our parsing solution requires one storm topology per sensor. It has > been complained that this may be wasteful of resources and that, rather than > one storm topology per sensor, it would be advantageous to have multiple > sensors in the same topology. The benefit to this is that it would require > fewer storm slots. > The issue with this is that whenever we've aggregated functionality like this > before, we've run into issues appropriately being able to scale storm (e.g. > batch vs random access indexing in the same topology). The main point in > addressing this is to recommend that parsers with similar velocities and > complexity are grouped together. > Particularly for a first cut, leave the configuration mostly as-is, while > allowing for comma separated lists of sensors in start_parser_topology.sh > (e.g. bro,yaf creates a aggregated parser consisting of those two). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] metron pull request #1099: METRON-1657: Parser aggregation in storm
Github user justinleet commented on a diff in the pull request: https://github.com/apache/metron/pull/1099#discussion_r202785248 --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserBolt.java --- @@ -182,40 +185,61 @@ public void prepare(Map stormConf, TopologyContext context, OutputCollector coll super.prepare(stormConf, context, collector); messageGetStrategy = MessageGetters.DEFAULT_BYTES_FROM_POSITION.get(); this.collector = collector; -if(getSensorParserConfig() != null) { - cache = CachingStellarProcessor.createCache(getSensorParserConfig().getCacheConfig()); -} -initializeStellar(); -if(getSensorParserConfig() != null && filter == null) { - getSensorParserConfig().getParserConfig().putIfAbsent("stellarContext", stellarContext); - if (!StringUtils.isEmpty(getSensorParserConfig().getFilterClassName())) { -filter = Filters.get(getSensorParserConfig().getFilterClassName() -, getSensorParserConfig().getParserConfig() -); + +// Build the Stellar cache +Map cacheConfig = new HashMap<>(); +for (Map.Entry entry: sensorToComponentMap.entrySet()) { + String sensor = entry.getKey(); + SensorParserConfig config = getSensorParserConfig(sensor); + + if (config != null) { +cacheConfig.putAll(config.getCacheConfig()); } } +cache = CachingStellarProcessor.createCache(cacheConfig); -parser.init(); +// Need to prep all sensors +for (Map.Entry entry: sensorToComponentMap.entrySet()) { + String sensor = entry.getKey(); + MessageParser parser = entry.getValue().getMessageParser(); --- End diff -- I left it as a single shared cache on purpose. I don't believe that there'd be any incorrect evictions by sharing the cache, and I think evicting based on the overall usage in the aggregated parser is the appropriate place to handle it. Since the cache is (mostly) LRU, I'd prefer to drop the least recently used entry of all parsers rather than dropping for each parser. LRU of the overall flow seems better than LRU of each of the sensors. 
Assuming a single cache, you'd bump up the cache configs to account for this, rather than having to optimize each config individually (and potentially as a group afterwards). Caching is also off by default for the parsers, so this is a case that's only hit if the user explicitly chooses to do so. Having said that, I do think I need to shore up the documentation around that logic, assuming we choose to go forward with it. Let me know what you think, and I can adjust appropriately. ---
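The eviction argument above can be illustrated with a small sketch. It assumes a strict LRU policy built on `LinkedHashMap` in access order, which is a simplification chosen for the example and not the cache that `CachingStellarProcessor` actually creates; the class name, the key format, and the sensor names are likewise hypothetical. With one cache shared by all sensors, the entry that is dropped is the least recently used one across the whole flow, regardless of which sensor wrote it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of a cache shared by several sensors.
public class SharedLruSketch {

  // Access-ordered LinkedHashMap that evicts the eldest entry beyond maxSize.
  static <K, V> Map<K, V> lruCache(final int maxSize) {
    return new LinkedHashMap<K, V>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;
      }
    };
  }

  // Fills a 3-entry shared cache from two sensors and returns it.
  static Map<String, String> demo() {
    Map<String, String> shared = lruCache(3);
    shared.put("bro:a", "1");
    shared.put("bro:b", "2");
    shared.put("yaf:c", "3");
    shared.get("bro:a");      // touch: bro:a becomes most recently used
    shared.put("yaf:d", "4"); // evicts bro:b, the least recently used overall
    return shared;
  }

  public static void main(String[] args) {
    System.out.println(demo().keySet());
  }
}
```

With per-sensor caches, each sensor would evict only among its own entries and each size would have to be tuned separately. The shared variant needs a single, larger size setting, which matches the "bump up the cache configs" suggestion in the comment. Note also that merging the per-sensor settings with `Map.putAll`, as the diff does, lets a later sensor's value overwrite an earlier one for the same key.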
[jira] [Commented] (METRON-1657) Parser aggregation in storm
[ https://issues.apache.org/jira/browse/METRON-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545655#comment-16545655 ] ASF GitHub Bot commented on METRON-1657: Github user ottobackwards commented on a diff in the pull request: https://github.com/apache/metron/pull/1099#discussion_r202798006 --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserBolt.java --- @@ -182,40 +185,61 @@ public void prepare(Map stormConf, TopologyContext context, OutputCollector coll super.prepare(stormConf, context, collector); messageGetStrategy = MessageGetters.DEFAULT_BYTES_FROM_POSITION.get(); this.collector = collector; -if(getSensorParserConfig() != null) { - cache = CachingStellarProcessor.createCache(getSensorParserConfig().getCacheConfig()); -} -initializeStellar(); -if(getSensorParserConfig() != null && filter == null) { - getSensorParserConfig().getParserConfig().putIfAbsent("stellarContext", stellarContext); - if (!StringUtils.isEmpty(getSensorParserConfig().getFilterClassName())) { -filter = Filters.get(getSensorParserConfig().getFilterClassName() -, getSensorParserConfig().getParserConfig() -); + +// Build the Stellar cache +Map cacheConfig = new HashMap<>(); +for (Map.Entry entry: sensorToComponentMap.entrySet()) { + String sensor = entry.getKey(); + SensorParserConfig config = getSensorParserConfig(sensor); + + if (config != null) { +cacheConfig.putAll(config.getCacheConfig()); } } +cache = CachingStellarProcessor.createCache(cacheConfig); -parser.init(); +// Need to prep all sensors +for (Map.Entry entry: sensorToComponentMap.entrySet()) { + String sensor = entry.getKey(); + MessageParser parser = entry.getValue().getMessageParser(); --- End diff -- We may need a test then. 
> Parser aggregation in storm > --- > > Key: METRON-1657 > URL: https://issues.apache.org/jira/browse/METRON-1657 > Project: Metron > Issue Type: Bug >Reporter: Justin Leet >Assignee: Justin Leet >Priority: Major > > Currently our parsing solution requires one storm topology per sensor. It has > been complained that this may be wasteful of resources and that, rather than > one storm topology per sensor, it would be advantageous to have multiple > sensors in the same topology. The benefit to this is that it would require > fewer storm slots. > The issue with this is that whenever we've aggregated functionality like this > before, we've run into issues appropriately being able to scale storm (e.g. > batch vs random access indexing in the same topology). The main point in > addressing this is to recommend that parsers with similar velocities and > complexity are grouped together. > Particularly for a first cut, leave the configuration mostly as-is, while > allowing for comma separated lists of sensors in start_parser_topology.sh > (e.g. bro,yaf creates a aggregated parser consisting of those two). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (METRON-1614) Create job status abstraction
[ https://issues.apache.org/jira/browse/METRON-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545656#comment-16545656 ] ASF GitHub Bot commented on METRON-1614: GitHub user mmiklavc opened a pull request: https://github.com/apache/metron/pull/1108 METRON-1614: Create job status abstraction ## Contributor Comments https://issues.apache.org/jira/browse/METRON-1614 ### DO NOT MERGE until follow-on PR created/reviewed/+1'ed This PR requires a separate PR that will round out the failing/final bits in the REST endpoints - the build will be broken in this PR because the follow-up will fix the rest layer and accompanying tests. Once that PR is created, all testing should be done via that branch for e2e testing via the REST app and pcap cli. This branch/PR is provided as a subset of the final deliverable in this feature branch in order to make review and attribution easier. The main thrust in this PR is to provide a `metron-job` module that contains a job manager abstraction for interacting with `Statusable` jobs. The scope in this round is limited to making the `PcapJob` a `Statusable` that will enable asynchronous calls for managing the underlying MR job as well as for the finalization routine. A `TimerTask` is kicked off in the `PcapJob` for finalizing the results from the MR job when it completes successfully. As we expand the feature set to include other job types, e.g. Spark, the thread/timer management should be refactored into the `JobManager` implementation, but we'll have a better sense of what those api changes should look like once we start to add those new job types. This version will not reload jobs upon restarting the REST api as the UI will only deal with the current query for a logged in user. PCAP MR job detail will be contained in the output paths that are configured via the REST application or PCAP CLI application. 
The CLI job will continue to write to the local FS runtime exec directory while the REST app will write its final raw PCAP output to HDFS. Also per the mailing list, this PR completely removes the existing `metron-api` module, which was the existing REST server wrapping the pcap query functionality. This is now deprecated/removed in favor of an implementation being added to the newer REST api in `metron-interface/metron-rest.` ## Pull Request Checklist ### For all changes: - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [x] Does your PR title start with METRON- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For code changes: - [ ] Have you included steps to reproduce the behavior or problem that is being changed or addressed? - [ ] Have you included steps or a guide to how the change may be verified and tested manually? - [ ] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via: ``` mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh ``` - [ ] Have you written or updated unit tests and or integration tests to verify your changes? - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [ ] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent? ### For documentation related changes: - [ ] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? 
If not, then run the following commands and then verify the changes via `site-book/target/site/index.html`: ``` cd site-book mvn site ``` Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/mmiklavc/metron pcap-job-manager Alternatively you can review and apply these changes as the patch at: https://github.com/apache/metron/pull/1108.patch To close this pull request, make a commit to your master/trunk branch with (at
[GitHub] metron pull request #1108: METRON-1614: Create job status abstraction
GitHub user mmiklavc opened a pull request: https://github.com/apache/metron/pull/1108 METRON-1614: Create job status abstraction ## Contributor Comments https://issues.apache.org/jira/browse/METRON-1614 ### DO NOT MERGE until follow-on PR created/reviewed/+1'ed This PR requires a separate PR that will round out the failing/final bits in the REST endpoints - the build will be broken in this PR because the follow-up will fix the rest layer and accompanying tests. Once that PR is created, all testing should be done via that branch for e2e testing via the REST app and pcap cli. This branch/PR is provided as a subset of the final deliverable in this feature branch in order to make review and attribution easier. The main thrust in this PR is to provide a `metron-job` module that contains a job manager abstraction for interacting with `Statusable` jobs. The scope in this round is limited to making the `PcapJob` a `Statusable` that will enable asynchronous calls for managing the underlying MR job as well as for the finalization routine. A `TimerTask` is kicked off in the `PcapJob` for finalizing the results from the MR job when it completes successfully. As we expand the feature set to include other job types, e.g. Spark, the thread/timer management should be refactored into the `JobManager` implementation, but we'll have a better sense of what those api changes should look like once we start to add those new job types. This version will not reload jobs upon restarting the REST api as the UI will only deal with the current query for a logged in user. PCAP MR job detail will be contained in the output paths that are configured via the REST application or PCAP CLI application. The CLI job will continue to write to the local FS runtime exec directory while the REST app will write its final raw PCAP output to HDFS. Also per the mailing list, this PR completely removes the existing `metron-api` module, which was the existing REST server wrapping the pcap query functionality. 
This is now deprecated/removed in favor of an implementation being added to the newer REST api in `metron-interface/metron-rest`. ## Pull Request Checklist ### For all changes: - [x] Is there a JIRA ticket associated with this PR? If not, one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For code changes: - [ ] Have you included steps to reproduce the behavior or problem that is being changed or addressed? - [ ] Have you included steps or a guide to how the change may be verified and tested manually? - [ ] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via: ``` mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh ``` - [ ] Have you written or updated unit tests and/or integration tests to verify your changes? - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [ ] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent? ### For documentation related changes: - [ ] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not, then run the following commands and then verify the changes via `site-book/target/site/index.html`: ``` cd site-book mvn site ``` Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. 
It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/mmiklavc/metron pcap-job-manager Alternatively you can review and apply these changes as the patch at: https://github.com/apache/metron/pull/1108.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1108 commit 41ecf36cd7c2da0399d03a37fee5916a9bfa87e7 Author: Michael Miklavcic Date: 2018-06-13T01:48:41Z Add metron-job project. Update pcap to be Statusable. commit
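The asynchronous pattern described in this PR (a job that reports status while a `TimerTask` watches the underlying MR job and runs the finalization routine once it completes) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the `metron-job` API: the class `StatusableJobSketch`, its `State` enum, and the `submit`/`awaitCompletion` methods are hypothetical names, and the MR job is replaced by a simple `BooleanSupplier`.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BooleanSupplier;

// Hypothetical sketch only; names do not come from the metron-job module.
class StatusableJobSketch {
  enum State { NOT_RUNNING, RUNNING, FINALIZING, SUCCEEDED }

  private final AtomicReference<State> state = new AtomicReference<>(State.NOT_RUNNING);
  private final CountDownLatch done = new CountDownLatch(1);
  private final Timer timer = new Timer("job-status-poller", true); // daemon thread

  /** Returns immediately; a TimerTask polls the backend and finalizes once it finishes. */
  void submit(BooleanSupplier backendFinished, Runnable finalizer, long pollMillis) {
    state.set(State.RUNNING);
    timer.schedule(new TimerTask() {
      @Override
      public void run() {
        if (!backendFinished.getAsBoolean()) {
          return; // backend still running; check again on the next tick
        }
        state.set(State.FINALIZING);
        finalizer.run();
        state.set(State.SUCCEEDED);
        timer.cancel(); // cancelling from inside run() makes this the last execution
        done.countDown();
      }
    }, 0L, pollMillis);
  }

  State getState() {
    return state.get();
  }

  /** Blocks until finalization is done or the timeout elapses. */
  boolean awaitCompletion(long timeout, TimeUnit unit) {
    try {
      return done.await(timeout, unit);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    AtomicInteger polls = new AtomicInteger();
    AtomicBoolean finalized = new AtomicBoolean(false);
    StatusableJobSketch job = new StatusableJobSketch();
    // The stand-in backend reports completion on the third poll.
    job.submit(() -> polls.incrementAndGet() >= 3, () -> finalized.set(true), 10L);
    boolean completed = job.awaitCompletion(5, TimeUnit.SECONDS);
    System.out.println(completed + " " + job.getState() + " finalized=" + finalized.get());
  }
}
```

Keeping the timer inside the job mirrors what the PR describes for this round; moving the polling into a shared manager, as the PR suggests for later job types, would mostly mean lifting the `Timer` out of this class.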
[GitHub] metron pull request #1099: METRON-1657: Parser aggregation in storm
Github user ottobackwards commented on a diff in the pull request: https://github.com/apache/metron/pull/1099#discussion_r202798006 --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserBolt.java --- @@ -182,40 +185,61 @@ public void prepare(Map stormConf, TopologyContext context, OutputCollector coll super.prepare(stormConf, context, collector); messageGetStrategy = MessageGetters.DEFAULT_BYTES_FROM_POSITION.get(); this.collector = collector; -if(getSensorParserConfig() != null) { - cache = CachingStellarProcessor.createCache(getSensorParserConfig().getCacheConfig()); -} -initializeStellar(); -if(getSensorParserConfig() != null && filter == null) { - getSensorParserConfig().getParserConfig().putIfAbsent("stellarContext", stellarContext); - if (!StringUtils.isEmpty(getSensorParserConfig().getFilterClassName())) { -filter = Filters.get(getSensorParserConfig().getFilterClassName() -, getSensorParserConfig().getParserConfig() -); + +// Build the Stellar cache +Map cacheConfig = new HashMap<>(); +for (Map.Entry entry: sensorToComponentMap.entrySet()) { + String sensor = entry.getKey(); + SensorParserConfig config = getSensorParserConfig(sensor); + + if (config != null) { +cacheConfig.putAll(config.getCacheConfig()); } } +cache = CachingStellarProcessor.createCache(cacheConfig); -parser.init(); +// Need to prep all sensors +for (Map.Entry entry: sensorToComponentMap.entrySet()) { + String sensor = entry.getKey(); + MessageParser parser = entry.getValue().getMessageParser(); --- End diff -- We may need a test then. ---
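For reviewers thinking about what such a test might cover: the diff builds one cache configuration by calling `putAll` for every sensor's config. The sketch below, which uses plain maps and illustrative keys (`maxSize`, `maxTimeRetain`) rather than Metron's `SensorParserConfig`, shows the consequence in isolation: when two sensors set the same key, the value from whichever sensor is visited last wins.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; plain maps stand in for per-sensor parser configs.
class CacheConfigMergeSketch {
  /** Merges per-sensor cache settings; later sensors overwrite earlier ones on key collisions. */
  static Map<String, Object> merge(Map<String, Map<String, Object>> cacheConfigBySensor) {
    Map<String, Object> merged = new LinkedHashMap<>();
    for (Map<String, Object> sensorConfig : cacheConfigBySensor.values()) {
      if (sensorConfig != null) { // a sensor without a cache config contributes nothing
        merged.putAll(sensorConfig);
      }
    }
    return merged;
  }

  public static void main(String[] args) {
    Map<String, Object> bro = new LinkedHashMap<>();
    bro.put("maxSize", 1000);
    bro.put("maxTimeRetain", 10);
    Map<String, Object> yaf = new LinkedHashMap<>();
    yaf.put("maxSize", 5000);

    Map<String, Map<String, Object>> bySensor = new LinkedHashMap<>();
    bySensor.put("bro", bro);
    bySensor.put("yaf", yaf);
    bySensor.put("snort", null);

    System.out.println(merge(bySensor)); // prints {maxSize=5000, maxTimeRetain=10}
  }
}
```

The sketch uses a `LinkedHashMap` so the visiting order is fixed; with an unordered map the winner of a collision would depend on iteration order, which is one thing a test could pin down.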
[GitHub] metron pull request #1099: METRON-1657: Parser aggregation in storm
Github user cestella commented on a diff in the pull request: https://github.com/apache/metron/pull/1099#discussion_r202805609 --- Diff: metron-platform/metron-parsers/README.md --- @@ -82,6 +82,12 @@ topology in kafka. Errors are collected with the context of the error (e.g. stacktrace) and original message causing the error and sent to an `error` queue. Invalid messages as determined by global validation functions are also treated as errors and sent to an `error` queue. + +Multiple sensors can be aggregated into a single Storm topology. When this is done, there will be +multiple Kafka spouts, but only a single parser bolt which will handle delegating to the correct --- End diff -- This PR gives us the ability to group the parsers into a single topology if we so desire. You would still write through to kafka. So, the topology in the example would have 3 kafka spouts: * One for monitoring `pix_syslog_router` (the syslog parser aka the routing parser) * One for monitoring `cisco-5-304` * One for monitoring `cisco-6-302` There would be one parser bolt, though, which would handle parsing all 3 sensor types. That is the contribution of this PR, the ability to determine the parser and filter and field transformations from the input kafka topic and use the appropriate one to parse the messages. There is not, however, any code here that would bypass the intermediate kafka write (e.g. from the router topology to the individual `cisco-5-304` or `cisco-6-302` topics). That's a current gap. ---
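The core idea in this comment, one bolt choosing the parser from the Kafka topic a message arrived on, can be sketched as a lookup table keyed by topic. This is a simplified illustration only: `TopicDispatchSketch` and its methods are hypothetical, parsers are reduced to `Function<String, String>`, and the filter and field transformations that the real bolt also selects per sensor are left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; not the ParserBolt implementation.
class TopicDispatchSketch {
  private final Map<String, Function<String, String>> parserByTopic = new HashMap<>();

  void register(String topic, Function<String, String> parser) {
    parserByTopic.put(topic, parser);
  }

  /** Picks the parser from the source topic and applies it to the raw message. */
  String parse(String sourceTopic, String rawMessage) {
    Function<String, String> parser = parserByTopic.get(sourceTopic);
    if (parser == null) {
      throw new IllegalArgumentException("No parser registered for topic " + sourceTopic);
    }
    return parser.apply(rawMessage);
  }

  public static void main(String[] args) {
    TopicDispatchSketch bolt = new TopicDispatchSketch();
    // Topic names are taken from the example in the comment; the parsers are placeholders.
    bolt.register("pix_syslog_router", raw -> "routed:" + raw);
    bolt.register("cisco-5-304", raw -> "5-304:" + raw);
    bolt.register("cisco-6-302", raw -> "6-302:" + raw);

    System.out.println(bolt.parse("cisco-5-304", "denied tcp")); // prints 5-304:denied tcp
  }
}
```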
[jira] [Commented] (METRON-1657) Parser aggregation in storm
[ https://issues.apache.org/jira/browse/METRON-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545683#comment-16545683 ] ASF GitHub Bot commented on METRON-1657: Github user justinleet commented on a diff in the pull request: https://github.com/apache/metron/pull/1099#discussion_r202805243 --- Diff: metron-platform/metron-parsers/README.md --- @@ -82,6 +82,12 @@ topology in kafka. Errors are collected with the context of the error (e.g. stacktrace) and original message causing the error and sent to an `error` queue. Invalid messages as determined by global validation functions are also treated as errors and sent to an `error` queue. + +Multiple sensors can be aggregated into a single Storm topology. When this is done, there will be +multiple Kafka spouts, but only a single parser bolt which will handle delegating to the correct --- End diff -- Essentially the flow in the syslog case is: 1. Syslog message comes in through the wrapper's topic. 2. The spout for syslog passes it through to the aggregate bolt. 3. The aggregated bolt does its job unwrapping and outputs the inner messages to the appropriate topics for each. 4. The spouts for the inner messages then pick up from the now populated topics. 5. All spouts are now passing messages as they receive them to the aggregate bolt. 6. All messages get delegated and handled as needed. This implementation doesn't do anything clever around avoiding output topics if the aggregate parser is already contained here. > Parser aggregation in storm > --- > > Key: METRON-1657 > URL: https://issues.apache.org/jira/browse/METRON-1657 > Project: Metron > Issue Type: Bug >Reporter: Justin Leet >Assignee: Justin Leet >Priority: Major > > Currently our parsing solution requires one storm topology per sensor. It has > been complained that this may be wasteful of resources and that, rather than > one storm topology per sensor, it would be advantageous to have multiple > sensors in the same topology. 
The benefit to this is that it would require > fewer storm slots. > The issue with this is that whenever we've aggregated functionality like this > before, we've run into issues appropriately being able to scale storm (e.g. > batch vs random access indexing in the same topology). The main point in > addressing this is to recommend that parsers with similar velocities and > complexity are grouped together. > Particularly for a first cut, leave the configuration mostly as-is, while > allowing for comma-separated lists of sensors in start_parser_topology.sh > (e.g. bro,yaf creates an aggregated parser consisting of those two). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] metron pull request #1099: METRON-1657: Parser aggregation in storm
Github user justinleet commented on a diff in the pull request: https://github.com/apache/metron/pull/1099#discussion_r202805243 --- Diff: metron-platform/metron-parsers/README.md --- @@ -82,6 +82,12 @@ topology in kafka. Errors are collected with the context of the error (e.g. stacktrace) and original message causing the error and sent to an `error` queue. Invalid messages as determined by global validation functions are also treated as errors and sent to an `error` queue. + +Multiple sensors can be aggregated into a single Storm topology. When this is done, there will be +multiple Kafka spouts, but only a single parser bolt which will handle delegating to the correct --- End diff -- Essentially the flow in the syslog case is: 1. Syslog message comes in through the wrapper's topic. 2. The spout for syslog passes it through to the aggregate bolt. 3. The aggregated bolt does its job unwrapping and outputs the inner messages to the appropriate topics for each. 4. The spouts for the inner messages then pick up from the now populated topics. 5. All spouts are now passing messages as they receive them to the aggregate bolt. 6. All messages get delegated and handled as needed. This implementation doesn't do anything clever around avoiding output topics if the aggregate parser is already contained here. ---
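The six steps above can be traced with a small in-memory simulation. It is only a sketch under stated assumptions: queues stand in for Kafka topics, a single `handle` method stands in for the aggregate bolt, and the `topic|payload` wrapper format is invented for the example. As in the comment, the router still writes inner messages back to their own topics rather than bypassing them.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch; queues replace Kafka topics and one method replaces the bolt.
class AggregatedFlowSketch {
  final Map<String, Queue<String>> topics = new LinkedHashMap<>();
  final List<String> parsed = new ArrayList<>();

  AggregatedFlowSketch(String... topicNames) {
    for (String name : topicNames) {
      topics.put(name, new ArrayDeque<>());
    }
  }

  void publish(String topic, String message) {
    topics.get(topic).add(message);
  }

  /** The single aggregate bolt: behaviour depends on the topic the message came from. */
  void handle(String topic, String message) {
    if (topic.equals("pix_syslog_router")) {
      // Steps 1-3: unwrap and write the inner message to its own topic.
      String[] parts = message.split("\\|", 2); // invented "innerTopic|payload" format
      publish(parts[0], parts[1]);
    } else {
      // Steps 4-6: inner messages come back through their own spouts and get parsed.
      parsed.add(topic + ":" + message);
    }
  }

  /** Stand-in for the spouts: keep draining every topic until nothing is left. */
  void run() {
    boolean progressed = true;
    while (progressed) {
      progressed = false;
      for (Map.Entry<String, Queue<String>> entry : topics.entrySet()) {
        String message;
        while ((message = entry.getValue().poll()) != null) {
          handle(entry.getKey(), message);
          progressed = true;
        }
      }
    }
  }

  public static void main(String[] args) {
    AggregatedFlowSketch topology =
        new AggregatedFlowSketch("pix_syslog_router", "cisco-5-304", "cisco-6-302");
    topology.publish("pix_syslog_router", "cisco-5-304|denied");
    topology.publish("pix_syslog_router", "cisco-6-302|built");
    topology.run();
    System.out.println(topology.parsed); // prints [cisco-5-304:denied, cisco-6-302:built]
  }
}
```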
[jira] [Commented] (METRON-1554) Pcap Query Panel
[ https://issues.apache.org/jira/browse/METRON-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545168#comment-16545168 ] ASF GitHub Bot commented on METRON-1554: Github user ottobackwards commented on the issue: https://github.com/apache/metron/pull/1103 I think we should rename from alert ui to investigate or something > Pcap Query Panel > > > Key: METRON-1554 > URL: https://issues.apache.org/jira/browse/METRON-1554 > Project: Metron > Issue Type: New Feature >Reporter: Ryan Merriman >Priority: Major > > Legacy OpenSOC included a panel in Kibana that allowed users to query for > pcap data. We would like to add this feature back into Metron. There are 2 > discussions happening on the dev list where we are gathering user > requirements: > [http://mail-archives.apache.org/mod_mbox/metron-dev/201805.mbox/%3CCAEVkqPYxfe3Q65mX7Mkuk_FKUCV420yb6hcLmf+FF=1ozer...@mail.gmail.com%3E] > and working through the backend architecture: > [http://mail-archives.apache.org/mod_mbox/metron-dev/201805.mbox/%3ccaevkqpbxzjnu_wgrbfwnz-mvqnkb7mthedveq9plyhwfit7...@mail.gmail.com%3E] > Forthcoming sub tasks will be based on the outcome of these discussions. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] metron issue #1103: METRON-1554: Initial PCAP UI
Github user ottobackwards commented on the issue: https://github.com/apache/metron/pull/1103 I think we should rename from alert ui to investigate or something ---
[jira] [Created] (METRON-1671) Create PCAP UI
Tibor Meller created METRON-1671: Summary: Create PCAP UI Key: METRON-1671 URL: https://issues.apache.org/jira/browse/METRON-1671 Project: Metron Issue Type: Sub-task Reporter: Tibor Meller -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] metron issue #1105: METRON-1236 Add start/stop/restart commands that execute...
Github user cestella commented on the issue: https://github.com/apache/metron/pull/1105 +1 by inspection; thanks! ---
[jira] [Commented] (METRON-1236) Add mpack support for ambari-agent running as non-root user
[ https://issues.apache.org/jira/browse/METRON-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545235#comment-16545235 ] ASF GitHub Bot commented on METRON-1236: Github user cestella commented on the issue: https://github.com/apache/metron/pull/1105 +1 by inspection; thanks! > Add mpack support for ambari-agent running as non-root user > --- > > Key: METRON-1236 > URL: https://issues.apache.org/jira/browse/METRON-1236 > Project: Metron > Issue Type: Improvement >Reporter: Kyle Richardson >Priority: Minor > Labels: mpack > > The current service start/stop/status/restart commands do not utilize `sudo` > and therefore the ambari-agent must be running as root on each node. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (METRON-1671) Create PCAP UI
[ https://issues.apache.org/jira/browse/METRON-1671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tibor Meller updated METRON-1671: - Description: The initial feature set of PCAP UI is the following: - Filtering by - IP Source Address - IP Source Port - IP Dest Address - IP Dest Port - Protocol - Include Reverse Traffic - Free text filtering - Showing PDML result https://user-images.githubusercontent.com/2437400/42747095-28d510bc-88db-11e8-8501-98c82f6ec521.png https://user-images.githubusercontent.com/2437400/42747099-2cbd9db6-88db-11e8-8be8-7f2bb2971fe3.png was: The initial feature set of PCAP UI is the following: - Filtering by - IP Source Address - IP Source Port - IP Dest Address - IP Dest Port - Protocol - Include Reverse Traffic - Free text filtering - Showing PDML result https://user-images.githubusercontent.com/2437400/42747095-28d510bc-88db-11e8-8501-98c82f6ec521.png https://user-images.githubusercontent.com/2437400/42747099-2cbd9db6-88db-11e8-8be8-7f2bb2971fe3.png > Create PCAP UI > -- > > Key: METRON-1671 > URL: https://issues.apache.org/jira/browse/METRON-1671 > Project: Metron > Issue Type: Sub-task >Reporter: Tibor Meller >Priority: Major > > The initial feature set of PCAP UI is the following: > - Filtering by > - IP Source Address > - IP Source Port > - IP Dest Address > - IP Dest Port > - Protocol > - Include Reverse Traffic > - Free text filtering > - Showing PDML result > https://user-images.githubusercontent.com/2437400/42747095-28d510bc-88db-11e8-8501-98c82f6ec521.png > https://user-images.githubusercontent.com/2437400/42747099-2cbd9db6-88db-11e8-8be8-7f2bb2971fe3.png -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (METRON-1671) Create PCAP UI
[ https://issues.apache.org/jira/browse/METRON-1671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tibor Meller updated METRON-1671: - Description: The initial feature set of PCAP UI is the following: - Filtering by - IP Source Address - IP Source Port - IP Dest Address - IP Dest Port - Protocol - Include Reverse Traffic - Free text filtering - Showing PDML result https://user-images.githubusercontent.com/2437400/42747095-28d510bc-88db-11e8-8501-98c82f6ec521.png https://user-images.githubusercontent.com/2437400/42747099-2cbd9db6-88db-11e8-8be8-7f2bb2971fe3.png > Create PCAP UI > -- > > Key: METRON-1671 > URL: https://issues.apache.org/jira/browse/METRON-1671 > Project: Metron > Issue Type: Sub-task >Reporter: Tibor Meller >Priority: Major > > The initial feature set of PCAP UI is the following: > - Filtering by > - IP Source Address > - IP Source Port > - IP Dest Address > - IP Dest Port > - Protocol > - Include Reverse Traffic > - Free text filtering > - Showing PDML result > https://user-images.githubusercontent.com/2437400/42747095-28d510bc-88db-11e8-8501-98c82f6ec521.png > https://user-images.githubusercontent.com/2437400/42747099-2cbd9db6-88db-11e8-8be8-7f2bb2971fe3.png -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (METRON-1671) Create PCAP UI
[ https://issues.apache.org/jira/browse/METRON-1671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tibor Meller updated METRON-1671: - Description: The initial feature set of PCAP UI is the following: - Filtering by - IP Source Address - IP Source Port - IP Dest Address - IP Dest Port - Protocol - Include Reverse Traffic - Free text filtering - Showing PDML result was: The initial feature set of PCAP UI is the following: - Filtering by - IP Source Address - IP Source Port - IP Dest Address - IP Dest Port - Protocol - Include Reverse Traffic - Free text filtering - Showing PDML result https://user-images.githubusercontent.com/2437400/42747095-28d510bc-88db-11e8-8501-98c82f6ec521.png https://user-images.githubusercontent.com/2437400/42747099-2cbd9db6-88db-11e8-8be8-7f2bb2971fe3.png > Create PCAP UI > -- > > Key: METRON-1671 > URL: https://issues.apache.org/jira/browse/METRON-1671 > Project: Metron > Issue Type: Sub-task >Reporter: Tibor Meller >Priority: Major > > The initial feature set of PCAP UI is the following: > - Filtering by > - IP Source Address > - IP Source Port > - IP Dest Address > - IP Dest Port > - Protocol > - Include Reverse Traffic > - Free text filtering > - Showing PDML result -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (METRON-1476) Update angular
[ https://issues.apache.org/jira/browse/METRON-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545316#comment-16545316 ] ASF GitHub Bot commented on METRON-1476: Github user sardell commented on the issue: https://github.com/apache/metron/pull/1096 Per @simonellistonball's comment, I removed the compiled Angular code from the pom.xml excludes, and instead added them to the prepend_license_header.sh script. In addition, I've added .nvmrc files to both UI projects and updated the documentation accordingly. > Update angular > -- > > Key: METRON-1476 > URL: https://issues.apache.org/jira/browse/METRON-1476 > Project: Metron > Issue Type: Improvement >Reporter: Daniel Toth >Assignee: Daniel Toth >Priority: Major > > Update angular to speed up development -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] metron pull request #1095: METRON-1651: Fixing failing protractor e2e tests
Github user justinleet commented on a diff in the pull request: https://github.com/apache/metron/pull/1095#discussion_r202722662 --- Diff: metron-interface/metron-alerts/e2e/alerts-list/meta-alerts/meta-alert.e2e-spec.ts --- @@ -137,14 +135,16 @@ describe('Test spec for meta alerts workflow', function() { expect(await metaAlertPage.getAvailableMetaAlerts()).toEqualBcoz('e2e-meta-alert (22)', 'Meta alert should be present'); await metaAlertPage.selectRadio(); await metaAlertPage.addToMetaAlert(); -expect(await tablePage.getCellValue(0, 2, '(22')).toContain('(23)', 'alert count should be incremented'); +// FIXME: line below will fail because the following: https://hortonworks.jira.com/browse/BUG-106815 --- End diff -- Can you make this and the other FIXMEs reference the Apache Jira (https://issues.apache.org/jira/browse/METRON-1654) instead of the Hortonworks Jira? ---
[jira] [Commented] (METRON-1651) Fixing failing protractor e2e test
[ https://issues.apache.org/jira/browse/METRON-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545327#comment-16545327 ] ASF GitHub Bot commented on METRON-1651: Github user justinleet commented on a diff in the pull request: https://github.com/apache/metron/pull/1095#discussion_r202722662 --- Diff: metron-interface/metron-alerts/e2e/alerts-list/meta-alerts/meta-alert.e2e-spec.ts --- @@ -137,14 +135,16 @@ describe('Test spec for meta alerts workflow', function() { expect(await metaAlertPage.getAvailableMetaAlerts()).toEqualBcoz('e2e-meta-alert (22)', 'Meta alert should be present'); await metaAlertPage.selectRadio(); await metaAlertPage.addToMetaAlert(); -expect(await tablePage.getCellValue(0, 2, '(22')).toContain('(23)', 'alert count should be incremented'); +// FIXME: line below will fail because the following: https://hortonworks.jira.com/browse/BUG-106815 --- End diff -- Can you make this and the other FIXMEs reference the Apache Jira (https://issues.apache.org/jira/browse/METRON-1654) instead of the Hortonworks Jira? > Fixing failing protractor e2e test > -- > > Key: METRON-1651 > URL: https://issues.apache.org/jira/browse/METRON-1651 > Project: Metron > Issue Type: Bug >Reporter: Tibor Meller >Priority: Major > Attachments: e2e-assertion-errors.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (METRON-1651) Fixing failing protractor e2e test
[ https://issues.apache.org/jira/browse/METRON-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545349#comment-16545349 ] ASF GitHub Bot commented on METRON-1651: Github user justinleet commented on the issue: https://github.com/apache/metron/pull/1095 +1, assuming @merrimanr is good. I'm glad to see the tests getting fixed up and improved! > Fixing failing protractor e2e test > -- > > Key: METRON-1651 > URL: https://issues.apache.org/jira/browse/METRON-1651 > Project: Metron > Issue Type: Bug >Reporter: Tibor Meller >Priority: Major > Attachments: e2e-assertion-errors.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] metron issue #1104: METRON-1670 Stellar WEEK_OF_YEAR test is locale sensitiv...
Github user cestella commented on the issue: https://github.com/apache/metron/pull/1104 +1 Great catch! ---
[jira] [Commented] (METRON-1670) Stellar WEEK_OF_YEAR test is locale sensitive
[ https://issues.apache.org/jira/browse/METRON-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545366#comment-16545366 ] ASF GitHub Bot commented on METRON-1670: Github user cestella commented on the issue: https://github.com/apache/metron/pull/1104 +1 Great catch! > Stellar WEEK_OF_YEAR test is locale sensitive > - > > Key: METRON-1670 > URL: https://issues.apache.org/jira/browse/METRON-1670 > Project: Metron > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Simon Elliston Ball >Priority: Trivial > > The Stellar WEEK_OF_YEAR(epoch) function is sensitive to the locale of the > machine it is running on. The tests in DateFunctionsTest do not account for > this, which leads to test failures on machine locales that differ in their > first day of week definition or days in first week definition. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
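To make the locale dependence concrete, the following sketch computes the week number of one date under two week definitions from `java.time`. It does not use the Stellar function or `DateFunctionsTest`; the class `WeekOfYearSketch` is hypothetical, and the two fixed rule sets (`WeekFields.ISO` and `WeekFields.SUNDAY_START`) stand in for what different machine locales would select.

```java
import java.time.LocalDate;
import java.time.temporal.WeekFields;

// Hypothetical sketch; illustrates week-numbering rules, not Metron's implementation.
class WeekOfYearSketch {
  /** Week number of the date under the given first-day-of-week and minimal-days rules. */
  static int weekOfYear(LocalDate date, WeekFields rules) {
    return date.get(rules.weekOfWeekBasedYear());
  }

  public static void main(String[] args) {
    LocalDate newYearsDay = LocalDate.of(2017, 1, 1); // a Sunday

    // ISO rules: weeks start on Monday and week 1 needs at least 4 days in the new year.
    System.out.println(weekOfYear(newYearsDay, WeekFields.ISO)); // prints 52

    // Sunday-start rules: weeks start on Sunday and week 1 needs only 1 day.
    System.out.println(weekOfYear(newYearsDay, WeekFields.SUNDAY_START)); // prints 1
  }
}
```

A test that hard-codes either 52 or 1 for this date passes only under the matching rules, which is the kind of failure the issue describes; pinning the rules or the locale in the test removes the dependence on the machine.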
[GitHub] metron pull request #1104: METRON-1670 Stellar WEEK_OF_YEAR test is locale s...
Github user asfgit closed the pull request at: https://github.com/apache/metron/pull/1104 ---
[jira] [Commented] (METRON-1670) Stellar WEEK_OF_YEAR test is locale sensitive
[ https://issues.apache.org/jira/browse/METRON-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545391#comment-16545391 ] ASF GitHub Bot commented on METRON-1670: Github user asfgit closed the pull request at: https://github.com/apache/metron/pull/1104 > Stellar WEEK_OF_YEAR test is locale sensitive > - > > Key: METRON-1670 > URL: https://issues.apache.org/jira/browse/METRON-1670 > Project: Metron > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Simon Elliston Ball >Priority: Trivial > > The Stellar WEEK_OF_YEAR(epoch) function is sensitive to the locale of the > machine it is running on. The tests in DateFunctionsTest do not account for > this, which leads to test failures on machine locales that differ in their > first day of week definition or days in first week definition. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (METRON-1236) Add mpack support for ambari-agent running as non-root user
[ https://issues.apache.org/jira/browse/METRON-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545404#comment-16545404 ] ASF GitHub Bot commented on METRON-1236: Github user asfgit closed the pull request at: https://github.com/apache/metron/pull/1105 > Add mpack support for ambari-agent running as non-root user > --- > > Key: METRON-1236 > URL: https://issues.apache.org/jira/browse/METRON-1236 > Project: Metron > Issue Type: Improvement >Reporter: Kyle Richardson >Priority: Minor > Labels: mpack > > The current service start/stop/status/restart commands do not utilize `sudo` > and therefore the ambari-agent must be running as root on each node. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] metron pull request #1105: METRON-1236 Add start/stop/restart commands that ...
Github user asfgit closed the pull request at: https://github.com/apache/metron/pull/1105 ---
[jira] [Commented] (METRON-1658) Upgrade bro to 2.5.4
[ https://issues.apache.org/jira/browse/METRON-1658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545410#comment-16545410 ] ASF GitHub Bot commented on METRON-1658: Github user asfgit closed the pull request at: https://github.com/apache/metron/pull/1101 > Upgrade bro to 2.5.4 > > > Key: METRON-1658 > URL: https://issues.apache.org/jira/browse/METRON-1658 > Project: Metron > Issue Type: Improvement >Reporter: Jon Zeolla >Assignee: Jon Zeolla >Priority: Minor > > We're currently running Bro 2.5.2, and two releases have come out, both > fixing some security issues, and 2.5.4 also contains a couple of bugfixes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)