[GitHub] metron issue #946: METRON-1465:Support for Elasticsearch X-pack
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/946 @ottobackwards in it's current state, sort of, but you're not required to turn it on. In the desired (reflection based nifi style state) no, it should load it and use it if present, but otherwise just use the vanilla transport client. Of course someday they're going to force us to REST, which will bring all this up again no doubt. ---
[GitHub] metron pull request #946: METRON-1465:Support for Elasticsearch X-pack
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/946#discussion_r173416554 --- Diff: metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/package/scripts/metron_service.py --- @@ -70,6 +70,11 @@ def build_global_config_patch(params, patch_file): "path": "/es.date.format", "value": "{{es_date_format}}" }, +{ +"op": "add", +"path": "/es.xpack.user", +"value": "{{es_xpack_user}}" --- End diff -- could we lean on the hadoop credentials apis? ---
[GitHub] metron issue #945: METRON-1464: Convert schemas to be compatible with Solr 5...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/945 To be fair, my question is probably just as appropriate on a discuss thread and a separate ticket out of said thread if it comes to it. ---
[GitHub] metron issue #955: METRON-1474 Add normalizecss
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/955 Have the dependencies, License and Notices files been updated? ---
[GitHub] metron issue #953: Metron-1472 Add stylelint support
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/953 Have the dependencies, License and Notices files been updated? ---
[GitHub] metron issue #952: Metron-1480 Add yarn as default build tool for the fronte...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/952 Have the dependencies, License and Notices files been updated? ---
[GitHub] metron issue #942: METRON-1461: Modify the MIN, MAX Stellar methods to take ...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/942 I would say performance trumps complexity of functionality here. ---
[GitHub] metron issue #946: METRON-1465:Support for Elasticsearch X-pack
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/946 Should we consider a dual client in the same project similar to the approach in https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-elasticsearch-bundle/nifi-elasticsearch-5-processors/src/main/java/org/apache/nifi/processors/elasticsearch/AbstractElasticsearch5TransportClientProcessor.java ? We can then switch based on the presence of the es.xpack.user setting in the ambari. ---
[GitHub] metron issue #945: METRON-1464: Convert schemas to be compatible with Solr 5...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/945 Are we losing anything by moving the scheme from Range to Trie types?, repeating my comment on https://github.com/apache/metron/pull/922: Given that our use case is heavily dependant on sorting, I wonder why not the Trie based indices for numeric fields. I may be completely wrong on their advantages but would love to hear the logic behind the choice of Point indices. If there is a good reason, maybe we should consider retaining those for 6.6 in addition to the 5.5 clusters. Either way it would be be good to understand the basis for the type decision. ---
[GitHub] metron issue #579: METRON-941 fix PaloAltoParser
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/579 I don't believe I have any further things to add. I'm +1 and keen to see this get in. ---
[GitHub] metron issue #895: METRON-1394:Create Rest endpoint to add the ACL for curre...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/895 One option here, that would make me less grumpy would be to incorporate the acl actions with the topic creation actions, which at least prevents nefarious insiders from using this endpoint to give themselves access to arbitrary topics they didn't create. ---
[GitHub] metron issue #934: METRON-1423: Ambari work to handle Solr configuration
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/934 Given that we're happy to accept limiting ourselves to a Hadoop distribution, it doesn't seem unfair to limit ourselves to a Solr distribution. While sticking pure raw Apache has some appeal, it may well make sense to stick to Ambari mpack based installs in the distro @ottobackwards suggest. ---
[GitHub] metron issue #895: METRON-1394:Create Rest endpoint to add the ACL for curre...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/895 Ok, so we have some authentication, with clear text passwords, but we don't have any authorization on the end points, which causes compliance issues with things like access change request. ---
[GitHub] metron issue #895: METRON-1394:Create Rest endpoint to add the ACL for curre...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/895 -1 (non-binding) This is a pen-tester's dream. We currently have no authentication around this endpoint, and allowing it to actually set acls make it a serious security hole. That may be out of scope in the case of this PR, but I wouldn't want this in a release until we had some security. ---
[GitHub] metron issue #922: METRON-1441: Create complementary Solr schemas for the ma...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/922 @cestella much neater. Thank you! I'll put my data schema OCD away now. ---
[GitHub] metron pull request #922: METRON-1441: Create complementary Solr schemas for...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/922#discussion_r166908797 --- Diff: metron-platform/metron-solr/src/main/config/schema/error/solrconfig.xml --- @@ -0,0 +1,1601 @@ + --- End diff -- Is this just basic stock solr? The /browse endpoint configs seem like they come from the tutorial sample, and could be mis-leading for example. ---
[GitHub] metron issue #914: METRON-1397 Support for JSON Path and complex documents i...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/914 My bad: the dependencies file is updated, but not the NOTICES. ---
[GitHub] metron issue #914: METRON-1397 Support for JSON Path and complex documents i...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/914 And in fairness to you, we don't do the NOTICES properly for anything else either I see. That's probably a separate scope. ---
[GitHub] metron issue #914: METRON-1397 Support for JSON Path and complex documents i...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/914 And in more topical an relevant points... we're introducing a net new dependency here... Do we need to also update the NOTICES file (and the dependencies_with_url.csv list)? ---
[GitHub] metron issue #914: METRON-1397 Support for JSON Path and complex documents i...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/914 @ottobackwards Really want that myth to be a reality... as soon as we can get the config overwrite problem solved... but this really isn't the ticket to discuss that on :) I'm kinda keen on this parser looking at things like CloudTrail logs and such like which appear with merged JSON right now. The other thing that I think this would link up to is the idea that's been floating around about chaining parsers. This parser could well be a split parser, which then routes to other parsers with different transformations for example depending on the message, but again, that's deep future and should probably be on a discuss thread about parser chaining. ---
[GitHub] metron issue #922: METRON-1441: Create complementary Solr schemas for the ma...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/922 Should we tidy up the ordering of the schema files for better legibility (uniquekey next to the field, dynamic catch alls in a consistent location, some semantic ordering of the key elements)? ---
[GitHub] metron issue #915: METRON-1433: Only emit debugging timing fields in enrichm...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/915 Does anyone anticipate using (or even use) these fields to do things like monitor the flow and latency of their pipelines? I can imagine people wanting to build data quality monitoring reports on these fields, but would like to verify that assumption. ---
[GitHub] metron issue #873: METRON-1367 Stellar should have some instrumentation of f...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/873 One question: why %timing? I would vote for %time, but that might just be because I use too much jupyter and like the consistency. Am I missing another reference for the choice of timing? ---
[GitHub] metron pull request #867: METRON-1350: Add reservoir sampling functions to S...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/867#discussion_r156794990 --- Diff: metron-analytics/metron-statistics/README.md --- @@ -53,6 +53,32 @@ functions can be used from everywhere where Stellar is used. * bounds - A list of value bounds (excluding min and max) in sorted order. * Returns: Which bin N the value falls in such that bound(N-1) < value <= bound(N). No min and max bounds are provided, so values smaller than the 0'th bound go in the 0'th bin, and values greater than the last bound go in the M'th bin. +### Sampling Functions + + `SAMPLE_ADD` +* Description: Add a value or collection of values to a sampler. +* Input: --- End diff -- Recency would surely be more relevant for merged resampling in a profile context? ---
[GitHub] metron pull request #867: METRON-1350: Add reservoir sampling functions to S...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/867#discussion_r156791019 --- Diff: metron-analytics/metron-statistics/src/main/java/org/apache/metron/statistics/sampling/UniformSampler.java --- @@ -0,0 +1,91 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.statistics.sampling; + +import java.util.ArrayList; +import java.util.List; +import java.util.Random; + +public class UniformSampler implements Sampler { + private List reservoir; + private int seen = 0; + private int size; + private Random rng = new Random(0); + + public UniformSampler() { +this(DEFAULT_SIZE); + } + + public UniformSampler(int size) { +this.size = size; +reservoir = new ArrayList<>(size); + } + + @Override + public Iterable get() { +return reservoir; + } + + /** + * Add an object to the reservoir + * @param o + */ + public void add(Object o) { +if(o == null) { + return; +} +if (reservoir.size() < size) { + reservoir.add(o); +} else { + int rIndex = rng.nextInt(seen + 1); --- End diff -- you are 100% right, that's what I get for skim reading. ---
[GitHub] metron issue #867: METRON-1350: Add reservoir sampling functions to Stellar
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/867 Should the size limit on the sample really be a cut off? In a likely usage scenario a users would sample over a window in a profile. Limiting the size is likely to skew to time at the beginning of the window rather than being genuinely uniform. Would a random replacement strategy make more sense when over the limit? This could be a lot heavier in terms of performance, but may be more mathematically sound. ---
[GitHub] metron pull request #863: METRON-1347: Indexing Topology should fail tuples ...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/863#discussion_r156676868 --- Diff: metron-platform/metron-writer/src/main/java/org/apache/metron/writer/bolt/BulkMessageWriterBolt.java --- @@ -229,17 +239,30 @@ public void execute(Tuple tuple) { LOG.trace("Writing enrichment message: {}", message); WriterConfiguration writerConfiguration = configurationTransformation.apply( new IndexingWriterConfiguration(bulkMessageWriter.getName(), getConfigurations())); - if(writerConfiguration.isDefault(sensorType)) { -//want to warn, but not fail the tuple -collector.reportError(new Exception("WARNING: Default and (likely) unoptimized writer config used for " + bulkMessageWriter.getName() + " writer and sensor " + sensorType)); + if(sensorType == null) { --- End diff -- Strictly speaking that's true, but by convention original_string should be required. There is a broader topic about what should be required, but that certainly doesn't belong in a comment on a PR. ---
[GitHub] metron issue #861: METRON-1341 Implemented SELECT transformer to project fie...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/861 Good catch, my test included the source.type in the select list, and generalised badly in the instructions. We should include the source.type in the protected system fields, let me update the test and implementation. ---
[GitHub] metron issue #861: METRON-1341 Implemented SELECT transformer to project fie...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/861 Actually, to follow up on that... I have a proxy feed, and some proxy use cases (enrichment, profile, etc). I want to keep my data clean and be explicit about which fields I pass on, so I have a 'library' field set that means "proxy like stuff", these are the fields I push into the output here. ---
[GitHub] metron issue #861: METRON-1341 Implemented SELECT transformer to project fie...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/861 A typical case might be something like the CEF parser. You could potentially kick out a lot of fields you really don't care about, which at scale can produce huge amounts of ES and HDFS storage (in addition to the original_string representation. The goal for this is to focus on just outputting fields which match active use cases in the rest of the flow to control data storage costs and data clarity. This also allows you to map an explicit data model and provide some governability to the data model. ---
[GitHub] metron issue #861: METRON-1341 Implemented SELECT transformer to project fie...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/861 If they want to select most, and remove the ones they don't want, then I would recommend using the remove transformation, or a set null in stellar. Perhaps regex support might be a nice follow on, but it breaks the mental model of people who are used to any other language that handles projection, such as SQL. The goal for this was to allow user to explicitly select only a defined set of fields. To be honest, people have lived with the idea of explicitly choosing fields for decades in SQL and quite liked it, so I suspect adding something that is pattern based might make it less usable than more. ---
[GitHub] metron pull request #861: METRON-1341 Implemented SELECT transformer to proj...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/861#discussion_r155698109 --- Diff: metron-platform/metron-parsers/README.md --- @@ -216,6 +216,23 @@ whenever `field2` exists and whose corresponding equal to 'foo': } ``` +* `SELECT`: This transformation filters the fields in the message to include only the configured output fields, and drops any not explicitly included. + +For example: +``` +{ +... +"fieldTransformations" : [ + { +"output" : ["field1", "field2" ] + , "transformation" : "SELECT" + } + ] +} +``` + +when applied to a message containing keys field1, field2 and field3, will only output the first two. + --- End diff -- agreed, and added. ---
[GitHub] metron pull request #861: METRON-1341 Implemented SELECT transformer to proj...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/861#discussion_r155659293 --- Diff: metron-platform/metron-common/src/test/java/org/apache/metron/common/field/transformation/SelectTransformationTest.java --- @@ -0,0 +1,75 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.metron.common.field.transformation; + +import java.util.HashMap; + +import org.apache.hadoop.hbase.util.Bytes; +import org.apache.metron.common.configuration.FieldTransformer; +import org.apache.metron.common.configuration.SensorParserConfig; +import org.apache.metron.stellar.dsl.Context; +import org.json.simple.JSONObject; +import org.junit.Assert; +import org.junit.Test; + +import com.google.common.collect.Iterables; + --- End diff -- It's a pretty broad reaching concern when you account for the impact of remove, stellar, and the other field transforms. I'm happy to look at it, but I expect it will just need to be reviewed again as part of a follow on. ---
[GitHub] metron issue #861: METRON-1341 Implemented SELECT transformer to project fie...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/861 Right, that should cover it. ---
[GitHub] metron pull request #861: METRON-1341 Implemented SELECT transformer to proj...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/861#discussion_r155649024 --- Diff: metron-platform/metron-parsers/README.md --- @@ -216,6 +216,23 @@ whenever `field2` exists and whose corresponding equal to 'foo': } ``` +* `SELECT`: This transformation filters the fields in the message to include only the configured output fields, and drops any not explicitly included. + +For example: +``` +{ +... +"fieldTransformations" : [ + { +"output" : ["field1", "field2" ] + , "transformation" : "SELECT" + } + ] +} +``` + +when applied to a message containing keys field1, field2 and field3, will only output the first two. + --- End diff -- Yes, that would break. I'd say that was flat out user error, and not something to guard against, or indeed something it's possible to guard against. I guess the key thing here is that it requires maintaining order in field transformations. ---
[GitHub] metron pull request #861: METRON-1341 Implemented SELECT transformer to proj...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/861#discussion_r155646268 --- Diff: metron-platform/metron-common/src/main/java/org/apache/metron/common/field/transformation/SelectTransformation.java --- @@ -0,0 +1,52 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.metron.common.field.transformation; + +import java.util.HashMap; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.Map.Entry; + +import org.apache.metron.stellar.dsl.Context; + --- End diff -- I've added to relevant readme (in parsers docs) which would seem a more likely place than javadoc (if we want to add javadoc, which we don't publish anyway, we should also look at adding this to everything in this area, which feels out of scope). Not even the stellar version gets javadoc! ---
[GitHub] metron pull request #861: METRON-1341 Implemented SELECT transformer to proj...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/861#discussion_r155645745 --- Diff: metron-platform/metron-common/src/test/java/org/apache/metron/common/field/transformation/SelectTransformationTest.java --- @@ -0,0 +1,75 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.metron.common.field.transformation; + +import java.util.HashMap; + +import org.apache.hadoop.hbase.util.Bytes; +import org.apache.metron.common.configuration.FieldTransformer; +import org.apache.metron.common.configuration.SensorParserConfig; +import org.apache.metron.stellar.dsl.Context; +import org.json.simple.JSONObject; +import org.junit.Assert; +import org.junit.Test; + +import com.google.common.collect.Iterables; + --- End diff -- yeah, makes sense, I suspect on a different ticket. ---
[GitHub] metron pull request #861: METRON-1341 Implemented SELECT transformer to proj...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/861#discussion_r155645303 --- Diff: metron-platform/metron-parsers/README.md --- @@ -216,6 +216,23 @@ whenever `field2` exists and whose corresponding equal to 'foo': } ``` +* `SELECT`: This transformation filters the fields in the message to include only the configured output fields, and drops any not explicitly included. + +For example: +``` +{ +... +"fieldTransformations" : [ + { +"output" : ["field1", "field2" ] + , "transformation" : "SELECT" + } + ] +} +``` + +when applied to a message containing keys field1, field2 and field3, will only output the first two. + --- End diff -- And order doesn't really matter, you might have this first, or last, or sandwiched between two stellar blocks. That said, I would expect the most common use case to be have this one at the end of a set. ---
[GitHub] metron pull request #861: METRON-1341 Implemented SELECT transformer to proj...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/861#discussion_r155645129 --- Diff: metron-platform/metron-parsers/README.md --- @@ -216,6 +216,23 @@ whenever `field2` exists and whose corresponding equal to 'foo': } ``` +* `SELECT`: This transformation filters the fields in the message to include only the configured output fields, and drops any not explicitly included. + +For example: +``` +{ +... +"fieldTransformations" : [ + { +"output" : ["field1", "field2" ] + , "transformation" : "SELECT" + } + ] +} +``` + +when applied to a message containing keys field1, field2 and field3, will only output the first two. + --- End diff -- This is the opposite of remove. ---
[GitHub] metron issue #861: METRON-1341 Implemented SELECT transformer to project fie...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/861 It suddenly occurs to me that we should probably whitelist the original_string and timestamp fields, so that these are always kept by this transformation. Does that make sense? ---
[GitHub] metron issue #861: METRON-1341 Implemented SELECT transformer to project fie...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/861 Somehow I knew you were going to say something about docs will add. ---
[GitHub] metron pull request #861: METRON-1341 Implemented SELECT transformer to proj...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/861#discussion_r155641855 --- Diff: metron-platform/metron-common/src/test/java/org/apache/metron/common/field/transformation/SelectTransformationTest.java --- @@ -0,0 +1,75 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.metron.common.field.transformation; + +import java.util.HashMap; + +import org.apache.hadoop.hbase.util.Bytes; +import org.apache.metron.common.configuration.FieldTransformer; +import org.apache.metron.common.configuration.SensorParserConfig; +import org.apache.metron.stellar.dsl.Context; +import org.json.simple.JSONObject; +import org.junit.Assert; +import org.junit.Test; + +import com.google.common.collect.Iterables; + --- End diff -- that feels a lot more like an integration test, or a broader test than belongs in this one piece. Otherwise every single test of transformations is going to end up with combinatorial complexity. ---
[GitHub] metron pull request #861: Implemented SELECT transformer to project fields f...
GitHub user simonellistonball opened a pull request: https://github.com/apache/metron/pull/861 Implemented SELECT transformer to project fields from parser ## Contributor Comments This is a simple PR to add FieldTransformation capabilities to the parsers allowing basic projection. There are definitely better ways to implement this as reflected in the comments, but those would need a wide ranging refactor of FieldTransformations, so I've decided to shelve that for now. ## Pull Request Checklist Thank you for submitting a contribution to Apache Metron. Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides. In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following: ### For all changes: - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [x] Does your PR title start with METRON- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For code changes: - [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed? - [x] Have you included steps or a guide to how the change may be verified and tested manually? - [x] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via: ``` mvn -q clean integration-test install && build_utils/verify_licenses.sh ``` - [x] Have you written or updated unit tests and or integration tests to verify your changes? - [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent? You can merge this pull request into a Git repository by running: $ git pull https://github.com/simonellistonball/metron METRON-1341 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/metron/pull/861.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #861 commit 7674c7283b5cefd2a40bb1329fc227e0b3a2227d Author: Simon Elliston Ball Date: 2017-12-05T01:07:46Z Implemented SELECT transformer to project fields from parser ---
[GitHub] metron issue #856: METRON-1339 Stellar Shell functionality to verify stored ...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/856 @cestella I would say that proposed validate function has to be very much in a namespace. It feels like a name that would be much more useful for a function replacing our current approach to global validation in the future than config validation, other than that it sounds like a good idea. ---
[GitHub] metron pull request #800: METRON-1251: Typo and formatting fixes for metron-...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/800#discussion_r145197280 --- Diff: metron-interface/metron-rest/README.md --- @@ -112,42 +112,42 @@ The following configures the application for MySQL: 1. Install MySQL if not already available (this example uses version 5.7, installation instructions can be found [here](https://dev.mysql.com/doc/refman/5.7/en/linux-installation-yum-repo.html)) 1. Create a metron user and REST database and permission the user for that database: - ``` -CREATE USER 'metron'@'node1' IDENTIFIED BY 'Myp@ssw0rd'; -CREATE DATABASE IF NOT EXISTS metronrest; -GRANT ALL PRIVILEGES ON metronrest.* TO 'metron'@'node1'; - ``` +``` +CREATE USER 'metron'@'node1' IDENTIFIED BY 'Myp@ssw0rd'; +CREATE DATABASE IF NOT EXISTS metronrest; +GRANT ALL PRIVILEGES ON metronrest.* TO 'metron'@'node1'; +``` 1. Install the MySQL JDBC client onto the REST application host and configurate the METRON_JDBC_CLIENT_PATH variable: - ``` -cd $METRON_HOME/lib -wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.41.tar.gz -tar xf mysql-connector-java-5.1.41.tar.gz - ``` +``` +cd $METRON_HOME/lib +wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.41.tar.gz +tar xf mysql-connector-java-5.1.41.tar.gz +``` 1. Edit these variables in `/etc/sysconfig/metron` to configure the REST application for MySQL: - ``` -METRON_JDBC_DRIVER="com.mysql.jdbc.Driver" -METRON_JDBC_URL="jdbc:mysql://mysql_host:3306/metronrest" -METRON_JDBC_USERNAME="metron" -METRON_JDBC_PLATFORM="mysql" -METRON_JDBC_CLIENT_PATH=$METRON_HOME/lib/mysql-connector-java-5.1.41/mysql-connector-java-5.1.41-bin.jar - ``` +``` +METRON_JDBC_DRIVER="com.mysql.jdbc.Driver" +METRON_JDBC_URL="jdbc:mysql://mysql_host:3306/metronrest" +METRON_JDBC_USERNAME="metron" +METRON_JDBC_PLATFORM="mysql" + METRON_JDBC_CLIENT_PATH=$METRON_HOME/lib/mysql-connector-java-5.1.41/mysql-connector-java-5.1.41-bin.jar +``` 1. Switch to the metron user - ``` -sudo su - metron - ``` +``` +sudo su - metron +``` 1. Start the REST API. Adjust the password as necessary. - ``` -set -o allexport; -source /etc/metron/sysconfig; -set +o allexport; -export METRON_JDBC_PASSWORD='Myp@ssw0rd'; -$METRON_HOME/bin/metron-rest.sh -unset METRON_JDBC_PASSWORD; - ``` +``` +set -o allexport; +source /etc/sysconfig/metron; --- End diff -- Didn't all this move to /etc/default/metron for ubuntu compat? ---
[GitHub] metron issue #682: METRON-1081: Fix Alerts and Ops UI Notices file
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/682 @ottobackwards I just re-ran this. We're not fixing versions in the UIs at the moment, so I would expect this kind of minor variance. This is mainly to ensure licenses are at least covered. The solution to the version fixing is the yarn stuff we're discussing elsewhere. ---
[GitHub] metron pull request #788: METRON-1223: Support for adding comments to alerts
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/788#discussion_r143470175 --- Diff: metron-interface/metron-alerts/src/app/alerts/alert-details/alert-details.component.ts --- @@ -133,6 +173,40 @@ export class AlertDetailsComponent implements OnInit { }); } + onAddComment() { +let alertComment = new AlertComment(this.alertCommentStr, this.authenticationService.getCurrentUserName(), new Date().getTime()); +let tAlertComments = this.alertCommentsWrapper.map(alertsWrapper => alertsWrapper.alertComment); +tAlertComments.unshift(alertComment); +this.patchAlert(new Patch('add', '/comments', tAlertComments)); + } + + patchAlert(patch: Patch) { +let patchRequest = new PatchRequest(); +patchRequest.guid = this.alertSource.guid; +patchRequest.index = this.alertIndex; +patchRequest.patch = [patch]; +patchRequest.sensorType = this.alertSourceType; + +this.updateService.patch(patchRequest).subscribe(() => { + this.getData(); +}); + } + + onDeleteComment(index: number) { +let commentText = 'Do you wish to delete the comment '; +if (this.alertCommentsWrapper[index].alertComment.comment.length > 25 ) { + commentText += ' \'' + this.alertCommentsWrapper[index].alertComment.comment.substr(0, 25) + '...\''; +} else { + commentText += ' \'' + this.alertCommentsWrapper[index].alertComment.comment + '\''; +} + + this.metronDialogBox.showConfirmationMessage(commentText).subscribe(response => { + if (response) { +this.alertCommentsWrapper.splice(index, 1); +this.patchAlert(new Patch('add', '/comments', this.alertCommentsWrapper.map(alertsWrapper => alertsWrapper.alertComment))); --- End diff -- @justinleet makes sense, the goal is to get something in fast and iterate, but it sounds like getting the field nested is not going to be much more work than the alerts work we already have to do, so can piggyback on that and minimise the upgrade effort. There is I believe some separate work going on making reindexing easier, which may help here (though we should also ensure that that work is DAO aware when reindexing commented stuff from HDFS) ---
[GitHub] metron issue #682: METRON-1081: Fix Alerts and Ops UI Notices file
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/682 Yes, and needs to be done as an ongoing thing really. We have to keep on top of our Node licenses. I note that recent version of angular cli actually seem to generate licence and notice files for you, but not had chance to verify their depth and correctness yet. This may make life easier in future, but for now we're on older angular-cli, so have to use this method to keep on top of things. ---
[GitHub] metron pull request #788: METRON-1223: Support for adding comments to alerts
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/788#discussion_r143158180 --- Diff: metron-interface/metron-alerts/src/app/alerts/alert-details/alert-details.component.ts --- @@ -133,6 +173,40 @@ export class AlertDetailsComponent implements OnInit { }); } + onAddComment() { +let alertComment = new AlertComment(this.alertCommentStr, this.authenticationService.getCurrentUserName(), new Date().getTime()); +let tAlertComments = this.alertCommentsWrapper.map(alertsWrapper => alertsWrapper.alertComment); +tAlertComments.unshift(alertComment); +this.patchAlert(new Patch('add', '/comments', tAlertComments)); + } + + patchAlert(patch: Patch) { +let patchRequest = new PatchRequest(); +patchRequest.guid = this.alertSource.guid; +patchRequest.index = this.alertIndex; +patchRequest.patch = [patch]; +patchRequest.sensorType = this.alertSourceType; + +this.updateService.patch(patchRequest).subscribe(() => { + this.getData(); +}); + } + + onDeleteComment(index: number) { +let commentText = 'Do you wish to delete the comment '; +if (this.alertCommentsWrapper[index].alertComment.comment.length > 25 ) { + commentText += ' \'' + this.alertCommentsWrapper[index].alertComment.comment.substr(0, 25) + '...\''; +} else { + commentText += ' \'' + this.alertCommentsWrapper[index].alertComment.comment + '\''; +} + + this.metronDialogBox.showConfirmationMessage(commentText).subscribe(response => { + if (response) { +this.alertCommentsWrapper.splice(index, 1); +this.patchAlert(new Patch('add', '/comments', this.alertCommentsWrapper.map(alertsWrapper => alertsWrapper.alertComment))); --- End diff -- In the short term, we could just consider this an unanalysed field, and not have comments searchable, no? In terms of doing it properly, we already have the nested field issue with the meta-alerts concept, so reusing the same coping mechanisms seems viable. ---
[GitHub] metron pull request #779: METRON-1218: Metron REST should return better erro...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/779#discussion_r141869253 --- Diff: metron-interface/metron-rest/src/main/java/org/apache/metron/rest/controller/RestExceptionHandler.java --- @@ -45,4 +45,14 @@ private HttpStatus getStatus(HttpServletRequest request) { } return HttpStatus.valueOf(statusCode); } + + private String getFullMessage(Throwable ex) { +String fullMessage = ex.getMessage(); +Throwable cause = ex.getCause(); +while(cause != null) { + fullMessage = cause.getMessage(); + cause = cause.getCause(); +} +return fullMessage; --- End diff -- @merrimanr I'm suggesting we log it somewhere, right now there is no apparent indication in the rest logs, so people end up hunting. When I hit the missing alter: nested template the other day, it took a PCAP of the transport protocol to find out what the actual root error was. ---
[GitHub] metron issue #771: METRON-1204: UI does not time out after being idle, but s...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/771 @ottobackwards should we push that as a follow up issue for now rather than expanding the scope of this PR? ---
[GitHub] metron pull request #779: METRON-1218: Metron REST should return better erro...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/779#discussion_r141844398 --- Diff: metron-interface/metron-rest/src/main/java/org/apache/metron/rest/controller/RestExceptionHandler.java --- @@ -45,4 +45,14 @@ private HttpStatus getStatus(HttpServletRequest request) { } return HttpStatus.valueOf(statusCode); } + + private String getFullMessage(Throwable ex) { +String fullMessage = ex.getMessage(); +Throwable cause = ex.getCause(); +while(cause != null) { + fullMessage = cause.getMessage(); + cause = cause.getCause(); +} +return fullMessage; --- End diff -- Perhaps a good alternative would be to show the error to the endpoint (the UI) and at least log the exception in the metron-rest.log ---
[GitHub] metron issue #776: METRON-1215: Fix link to RPMs chapter
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/776 Yes, it is, not sure how that happened. Probably too many tabs. ---
[GitHub] metron issue #775: [METRON-1214] rpm build fails due to npm absence in Docke...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/775 Do we need to ensure we're picking this from the right repos? We should probably use the nodesource rpm repo rather than relying on the centos build, which is a very very old version, and could cause issues. Adding a the repo file in might be a good call. ---
[GitHub] metron issue #776: METRON-1215: Fix link to RPMs chapter
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/776 Do we need to ensure we're picking this from the right repos? We should probably use the nodesource rpm repo rather than relying on the centos build, which is a very very old version, and could cause issues. Adding a the repo file in might be a good call. ---
[GitHub] metron issue #711: METRON-1127: Add ability to escalate alerts for external ...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/711 @ottobackwards agreed, this is very separate from the management ui (won't touch, or be used by anything in the management ui). Also agreed this is a separate entity, but one that will be used by the alerts ui. Honestly, I'm not sure what else this needs to do. If we want to expand the scope, maybe we can do that as a follow on. @nickwallen thinking about it, I'm inclined to agree that the path of least surprise for the soc user is the one to go for here, so you're right. The representation passed should be that shown in the UI. The challenge here is things like meta-alerts, where the actual alert to pass may not be fully present on the ui due to paging on its member alerts. (I can certainly see use cases where a meta-alert might contain O(1000s) of alerts). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #722: METRON-1139 Fixed service advisor profilerHost var...
GitHub user simonellistonball opened a pull request: https://github.com/apache/metron/pull/722 METRON-1139 Fixed service advisor profilerHost variable ## Contributor Comments This is a simple variable mis-naming error which doesn't cause any immediate problems, but prevents the stack advisor running properly, which can prevent subsequent component installs. To recreate, install a metron cluster, then attempt to install another component. The next button in the ambari wizard will remain disabled on master selection, and the request history will show a 500 error on the stack advisor step. This fix solves the problem. ## Pull Request Checklist Thank you for submitting a contribution to Apache Metron. Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides. In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following: ### For all changes: - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [x] Does your PR title start with METRON- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For code changes: - [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed? - [x] Have you included steps or a guide to how the change may be verified and tested manually? - [x] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via: ``` mvn -q clean integration-test install && build_utils/verify_licenses.sh ``` - [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent? You can merge this pull request into a Git repository by running: $ git pull https://github.com/simonellistonball/incubator-metron METRON-1139 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/metron/pull/722.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #722 commit 942c7a518463f117e77ba0111b65a40203db74f5 Author: Simon Elliston Ball Date: 2017-08-30T00:48:48Z Fixed service advisor profilerHost variable --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron issue #711: METRON-1127: Add ability to escalate alerts for external ...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/711 I'd say the docs belong with the UI docs, since this is pretty much an endpoint to drive the UI buttons, no? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #711: METRON-1127: Add ability to escalate alerts for ex...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/711#discussion_r135813488 --- Diff: metron-interface/metron-rest/src/main/java/org/apache/metron/rest/service/impl/AlertServiceImpl.java --- @@ -0,0 +1,57 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.rest.service.impl; + +import com.fasterxml.jackson.core.JsonProcessingException; +import java.util.Map; +import org.apache.metron.common.utils.JSONUtils; +import org.apache.metron.rest.MetronRestConstants; +import org.apache.metron.rest.RestException; +import org.apache.metron.rest.service.AlertService; +import org.apache.metron.rest.service.KafkaService; +import org.springframework.beans.factory.annotation.Autowired; +import org.springframework.core.env.Environment; +import org.springframework.stereotype.Service; + +/** + * The default service layer implementation of {@link AlertService}. + * + * @see AlertService + */ +@Service +public class AlertServiceImpl implements AlertService { + + private Environment environment; + private final KafkaService kafkaService; + + @Autowired + public AlertServiceImpl(final KafkaService kafkaService, + final Environment environment) { +this.kafkaService = kafkaService; +this.environment = environment; + } + --- End diff -- My understanding is that this is for manual escalation, and should just expose everything in the alert (or meta-alert) that is being manually escalated. The user would essentially add comments and make modifications before escalating, which would be persisted The alert which is pushed onto the queue should be the current alert state, i.e. with all those modifications. This was done elsewhere (sorry, can't remember the ticket number). So the alert data being pushed should come off the cluster side, and it should have all the content already sent from the client by a separate process. I think that should achieve pretty much what you're talking about @ottobackwards. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #709: METRON-1122: Add support for the profiler in the m...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/709#discussion_r134235349 --- Diff: metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/configuration/metron-profiler-env.xml --- @@ -0,0 +1,98 @@ + + + + + +profiler_kafka_start +UNCOMMITTED_EARLIEST +One of EARLIEST, LATEST, UNCOMMITTED_EARLIEST, UNCOMMITTED_LATEST +Input Topic Start + + +profiler_period_duration +15 +The duration of each profile period. This value should be defined along with profiler.period.duration.units +Period Duration + + +profiler_period_units +MINUTES +The units used to specify the profiler.period.duration. This value should be defined along with profiler.period.duration. +Period Units + + +profiler_ttl +30 +If a message has not been applied to a Profile in this period of time, the Profile will be terminated and its resources will be cleaned up. This value should be defined along with profiler.ttl.units. + This time-to-live does not affect the persisted Profile data in HBase. It only affects the state stored in memory during the execution of the latest profile period. This state will be deleted if the time-to-live is exceeded. + +Time to Live + + +profiler_ttl_units +MINUTES --- End diff -- Should constrain choice here too --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #709: METRON-1122: Add support for the profiler in the m...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/709#discussion_r134235333 --- Diff: metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/configuration/metron-profiler-env.xml --- @@ -0,0 +1,98 @@ + + + + + +profiler_kafka_start +UNCOMMITTED_EARLIEST +One of EARLIEST, LATEST, UNCOMMITTED_EARLIEST, UNCOMMITTED_LATEST +Input Topic Start + + +profiler_period_duration +15 +The duration of each profile period. This value should be defined along with profiler.period.duration.units +Period Duration + + --- End diff -- Should constrain choice here too --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #709: METRON-1122: Add support for the profiler in the m...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/709#discussion_r134235256 --- Diff: metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/configuration/metron-profiler-env.xml --- @@ -0,0 +1,98 @@ + + + + + +profiler_kafka_start --- End diff -- Should this be a select-option choice like the enrichments section? (https://github.com/apache/metron/blob/ae1dc3a4aa31cfd6b4af7558dbb2fc0314f80a29/metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/configuration/metron-enrichment-env.xml#L36) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #709: METRON-1122: Add support for the profiler in the m...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/709#discussion_r134233234 --- Diff: metron-deployment/packaging/ambari/metron-mpack/src/main/resources/addon-services/METRON/CURRENT/role_command_order.json --- @@ -5,11 +5,13 @@ "_comment" : "dependencies for all cases", "METRON_INDEXING-INSTALL" : ["METRON_PARSERS-INSTALL"], "METRON_ENRICHMENT-INSTALL": ["METRON_INDEXING-INSTALL"], +"METRON_PROFILER-INSTALL": ["METRON_ENRICHMENT-INSTALL"], "METRON_REST-INSTALL": ["METRON_PARSERS-INSTALL"], "METRON_PARSERS-START" : ["NAMENODE-START", "ZOOKEEPER_SERVER-START", "KAFKA_BROKER-START", "STORM_REST_API-START","METRON_ENRICHMENT_MASTER-START"], "METRON_ENRICHMENT_MASTER-START" : ["NAMENODE-START", "ZOOKEEPER_SERVER-START", "KAFKA_BROKER-START", "STORM_REST_API-START", "HBASE_MASTER-START", "HBASE_REGIONSERVER-START"], "METRON_ENRICHMENT_SERVICE_CHECK-SERVICE_CHECK" : ["METRON_ENRICHMENT_MASTER-START"], "METRON_INDEXING-START" : ["NAMENODE-START", "ZOOKEEPER_SERVER-START", "KAFKA_BROKER-START", "STORM_REST_API-START","METRON_PARSERS-START"], +"METRON_PROFILER-START" : ["NAMENODE-START", "ZOOKEEPER_SERVER-START", "KAFKA_BROKER-START", "STORM_REST_API-START","METRON_INDEXING-START"], --- End diff -- does it really need indexing running to start? (to work, yes, but to start?) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron issue #672: METRON-1047: REST should use core-site.xml for Hadoop con...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/672 @cestella @merrimanr Any updates on this? Seems like a good thing to have. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #689: METRON-1102: Add support for ingesting cybox URI o...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/689#discussion_r132347288 --- Diff: metron-platform/metron-data-management/src/main/java/org/apache/metron/dataloads/extractor/stix/StixExtractor.java --- @@ -38,6 +39,7 @@ import java.util.Map; public class StixExtractor implements Extractor { +private static final Logger LOG = Logger.getLogger(StixExtractor.class); --- End diff -- Didn't we change how logging worked in METRON-975? @mmiklavc see also #599 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron issue #620: Metron-988: UI for viewing alerts generated by Metron
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/620 +1 I'm good with this. My one niggle will be dealt with by other follow on issues. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron issue #674: METRON-681 CSVConverter should trim key and values
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/674 +1, looks good to me --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron issue #599: METRON-975: Normalize logging and switch to common idiom ...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/599 +1 (non-binding) the sooner we get this in the better. It has performance benefits, and the longer we wait the more work and trouble it will create. No reason not to get it done, lots to get it in. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #674: METRON-681 CSVConverter should trim key and values
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/674#discussion_r130342209 --- Diff: metron-platform/metron-parsers/src/test/java/org/apache/metron/parsers/csv/CSVParserTest.java --- @@ -87,15 +87,15 @@ public void test() throws IOException { Assert.assertEquals("grok", o.get("col3")); } { - String line = "foo, bar, grok"; + String line = " foo , bar , grok "; --- End diff -- This should really be implemented as a separate test function (additional) since we're adding functionality here. you can definitely use the same form, but we're testing a different unit of functionality. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron issue #530: METRON-777 Metron Extension System and Parser Extensions
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/530 @ottobackwards I didn't steal _that_ much of it. Still it shouldn't cause a problem, as long as we get all the updates since this PR started ported in. That one has moved a bit from things found in the wild. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron issue #619: METRON-939 Elasticsearch ES5 with Xshield client support
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/619 Seems like this is failing on some un-related temporary test failures. Can we get Travis kicked, and see what's left to do on this? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #643: METRON-1026: threatintel_taxii_load.sh throws exce...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/643#discussion_r129352107 --- Diff: metron-platform/metron-data-management/src/main/java/org/apache/metron/dataloads/nonbulk/taxii/TaxiiLoader.java --- @@ -165,6 +167,19 @@ public static Options getOptions() { public static final long DEFAULT_TIME_BETWEEN_POLLS = ONE_HR_IN_MS; + public static boolean isStixExtractor(Extractor e) { +if(e instanceof StixExtractor) { + return true; +} +else if(e instanceof ExtractorDecorator) { + ExtractorDecorator decorator = (ExtractorDecorator)e; + if(decorator.getUnderlyingExtractor() != null && decorator.getUnderlyingExtractor() instanceof StixExtractor) { +return true; + } +} +return false; + } + --- End diff -- why not just? `e instanceof StixExtractor || (ExtractorDecorator)e.getUnderlyingExtractor() instanceof StixExtractor` this is java after all and has shortcutting... even stellar does that now too :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron issue #650: METRON-1038: Stellar should have a better collection of b...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/650 I would like it to be committed to make it much easier for me to port my PR to it so +1 (non-binding) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #662: METRON-1056: Get field types from Elasticsearch
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/662#discussion_r129241629 --- Diff: metron-interface/metron-rest/src/main/java/org/apache/metron/rest/controller/SearchController.java --- @@ -45,4 +49,18 @@ ResponseEntity search(final @ApiParam(name = "searchRequest", value = "Search request", required = true) @RequestBody SearchRequest searchRequest) throws RestException { return new ResponseEntity<>(searchService.search(searchRequest), HttpStatus.OK); } + --- End diff -- Feels like any asynchronicity should be at the service layer anyway for other things to potentially benefit from, of course that would bubble up here, but do we really expect the REST server to have to worry about high load? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #622: METRON-1005 Create Decodable Row Key for Profiler
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/622#discussion_r129221553 --- Diff: metron-analytics/metron-profiler-common/src/main/java/org/apache/metron/profiler/hbase/DecodableRowKeyBuilder.java --- @@ -0,0 +1,402 @@ +/* + * + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * + */ + +package org.apache.metron.profiler.hbase; + +import org.apache.hadoop.hbase.util.Bytes; +import org.apache.metron.profiler.ProfileMeasurement; +import org.apache.metron.profiler.ProfilePeriod; + +import java.nio.BufferUnderflowException; +import java.nio.ByteBuffer; +import java.nio.ByteOrder; +import java.security.MessageDigest; +import java.security.NoSuchAlgorithmException; +import java.util.ArrayList; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.concurrent.TimeUnit; + +import static org.apache.metron.profiler.ProfilerClientConfig.PROFILER_PERIOD; +import static org.apache.metron.profiler.ProfilerClientConfig.PROFILER_PERIOD_UNITS; +import static org.apache.metron.profiler.ProfilerClientConfig.PROFILER_SALT_DIVISOR; + +/** + * Responsible for building the row keys used to store profile data in HBase. + * + * This builder generates decodable row keys. A decodable row key is one that can be interrogated to extract + * the constituent components of that row key. Given a previously generated row key this builder + * can extract the profile name, entity name, group name(s), period duration, and period. + * + * The row key is composed of the following fields. + * + * magic number - Helps to validate the row key. + * version - The version number of the row key. + * salt - A salt that helps prevent hot-spotting. + * profile - The name of the profile. + * entity - The name of the entity being profiled. + * group(s) - The group(s) used to sort the data in HBase. For example, a group may distinguish between weekends and weekdays. + * period - The period in which the measurement was taken. The first period starts at the epoch and increases monotonically. + * + */ +public class DecodableRowKeyBuilder implements RowKeyBuilder { + + /** + * Defines the byte order when encoding and decoding the row keys. + * + * Making this configurable is likely not necessary and is left as a practice exercise for the reader. :) + */ + private static final ByteOrder byteOrder = ByteOrder.BIG_ENDIAN; + + /** + * Defines some level of sane max field length to avoid any shenanigans with oddly encoded row keys. + */ + private static final int MAX_FIELD_LENGTH = 1000; + + /** + * A magic number embedded in each row key to help validate the row key and byte ordering when decoding. + */ + protected static final short MAGIC_NUMBER = 77; + + /** + * The version number of the row keys supported by this builder. + */ + protected static final byte VERSION = (byte) 1; + + /** + * A salt can be prepended to the row key to help prevent hot-spotting. The salt + * divisor is used to generate the salt. The salt divisor should be roughly equal + * to the number of nodes in the Hbase cluster. + */ + private int saltDivisor; + + /** + * The duration of each profile period in milliseconds. + */ + private long periodDurationMillis; + + public DecodableRowKeyBuilder() { +this(PROFILER_SALT_DIVISOR.getDefault(Integer.class), +PROFILER_PERIOD.getDefault(Long.class), + TimeUnit.valueOf(PROFILER_PERIOD_UNITS.getDefault(String.class))); + } + + public DecodableRowKeyBuilder(int saltDivisor, long duration, TimeUnit units) { +this.saltDivisor = saltDivisor; +this.periodDurationMillis = units.toMillis(duration); + } + + /** + * Builds a list o
[GitHub] metron pull request #656: METRON-1050 Improve Docs of 'profile.period.durati...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/656#discussion_r128132389 --- Diff: metron-analytics/metron-profiler/README.md --- @@ -137,15 +137,20 @@ This section will describe the steps required to get your first "Hello, World!"" hbase(main):001:0> count 'profiler' ``` -1. Use the Profiler Client to read the profile data. The below example `PROFILE_GET` command will read data written by the sample profile given above, if 10.0.0.1 is one of the input values for `ip_src_addr`. -More information on configuring and using the client can be found [here](../metron-profiler-client). -It is assumed that the `PROFILE_GET` client is correctly configured before using it. +1. Use the [Profiler Client]((../metron-profiler-client)) to read the profile data. The following `PROFILE_GET` command will read the data written by the "Hello, World" profile. This assumes that `10.0.0.1` is one of the values for `ip_src_addr` contained within the telemetry consumed by the Profiler. --- End diff -- Hello, world vs hello-world? Should the naming be consistent here to avoid confusion? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #647: METRON-1031: Management UI Cannot Start Topologies...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/647#discussion_r127474849 --- Diff: metron-interface/metron-rest/src/main/java/org/apache/metron/rest/service/impl/StormCLIWrapper.java --- @@ -75,37 +81,50 @@ public int stopIndexingTopology(boolean stopNow) throws RestException { protected int runCommand(String[] command) throws RestException { ProcessBuilder pb = getProcessBuilder(command); pb.inheritIO(); -Process process = null; +LOG.debug("Running command: cmd={}", String.join(" ", command)); + +Process process; try { process = pb.start(); process.waitFor(); + } catch (Exception e) { throw new RestException(e); } -return process.exitValue(); + +int exitValue = process.exitValue(); +LOG.debug("Command completed: cmd={}, exit={}", String.join(" ", command), exitValue); + +return exitValue; } protected String[] getParserStartCommand(String name) { -String[] command = new String[7]; +String[] command = new String[9]; command[0] = environment.getProperty(MetronRestConstants.PARSER_SCRIPT_PATH_SPRING_PROPERTY); command[1] = "-k"; command[2] = environment.getProperty(MetronRestConstants.KAFKA_BROKER_URL_SPRING_PROPERTY); command[3] = "-z"; command[4] = environment.getProperty(MetronRestConstants.ZK_URL_SPRING_PROPERTY); command[5] = "-s"; command[6] = name; +command[7] = "-ksp"; +command[8] = environment.getProperty(MetronRestConstants.KAFKA_SECURITY_PROTOCOL_SPRING_PROPERTY); return command; } protected String[] getEnrichmentStartCommand() { -String[] command = new String[1]; +String[] command = new String[3]; --- End diff -- right, my bad, it was just parser script I had to change --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #647: METRON-1031: Management UI Cannot Start Topologies...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/647#discussion_r127472746 --- Diff: metron-interface/metron-rest/src/main/java/org/apache/metron/rest/service/impl/StormCLIWrapper.java --- @@ -75,37 +81,50 @@ public int stopIndexingTopology(boolean stopNow) throws RestException { protected int runCommand(String[] command) throws RestException { ProcessBuilder pb = getProcessBuilder(command); pb.inheritIO(); -Process process = null; +LOG.debug("Running command: cmd={}", String.join(" ", command)); + +Process process; try { process = pb.start(); process.waitFor(); + } catch (Exception e) { throw new RestException(e); } -return process.exitValue(); + +int exitValue = process.exitValue(); +LOG.debug("Command completed: cmd={}, exit={}", String.join(" ", command), exitValue); + +return exitValue; } protected String[] getParserStartCommand(String name) { -String[] command = new String[7]; +String[] command = new String[9]; command[0] = environment.getProperty(MetronRestConstants.PARSER_SCRIPT_PATH_SPRING_PROPERTY); command[1] = "-k"; command[2] = environment.getProperty(MetronRestConstants.KAFKA_BROKER_URL_SPRING_PROPERTY); command[3] = "-z"; command[4] = environment.getProperty(MetronRestConstants.ZK_URL_SPRING_PROPERTY); command[5] = "-s"; command[6] = name; +command[7] = "-ksp"; +command[8] = environment.getProperty(MetronRestConstants.KAFKA_SECURITY_PROTOCOL_SPRING_PROPERTY); return command; } protected String[] getEnrichmentStartCommand() { -String[] command = new String[1]; +String[] command = new String[3]; --- End diff -- Has this changed since 0.4.0 RC1? Didn't work when I tried it until ksp added in the start script. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #620: Metron-988: UI for viewing alerts generated by Met...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/620#discussion_r127438525 --- Diff: metron-interface/metron-alerts/src/app/utils/elasticsearch-utils.ts --- @@ -0,0 +1,74 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +import {ColumnMetadata} from '../model/column-metadata'; +import {AlertsSearchResponse} from '../model/alerts-search-response'; + +export class ElasticsearchUtils { + + public static excludeIndexName = 'kibana'; + + private static createColumMetaData(properties: any, columnMetadata: ColumnMetadata[], seen: string[]) { + try { + let columnNames = Object.keys(properties); + for (let columnName of columnNames) { + if (seen.indexOf(columnName) === -1) { + seen.push(columnName); + columnMetadata.push( + new ColumnMetadata(columnName, (properties[columnName].type ? properties[columnName].type.toUpperCase() : '')) --- End diff -- Forcing upper case in this model prevents sorting from working later... e.g. sorting by threat.triage.level leads to failed searches with: Caused by: java.lang.IllegalArgumentException: No mapper found for type [DOUBLE] The type mapping in elastic appears to be case sensitive, so squashing case here is going to break things. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #620: Metron-988: UI for viewing alerts generated by Met...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/620#discussion_r127432908 --- Diff: metron-interface/metron-alerts/src/app/service/elasticsearch-localstorage-impl.ts --- @@ -0,0 +1,294 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +import {Observable} from 'rxjs/Rx'; +import {Headers, RequestOptions} from '@angular/http'; + +import {HttpUtil} from '../utils/httpUtil'; +import {DataSource} from './data-source'; +import {Alert} from '../model/alert'; +import {ColumnMetadata} from '../model/column-metadata'; +import {ElasticsearchUtils} from '../utils/elasticsearch-utils'; +import { + ALERTS_COLUMN_NAMES, ALERTS_TABLE_METADATA, ALERTS_RECENT_SEARCH, + ALERTS_SAVED_SEARCH, NUM_SAVED_SEARCH +} from '../utils/constants'; +import {ColumnNames} from '../model/column-names'; +import {ColumnNamesService} from './column-names.service'; +import {TableMetadata} from '../model/table-metadata'; +import {SaveSearch} from '../model/save-search'; +import {AlertsSearchResponse} from '../model/alerts-search-response'; +import {SearchRequest} from '../model/search-request'; + +export class ElasticSearchLocalstorageImpl extends DataSource { + + private defaultColumnMetadata = [ +new ColumnMetadata('_id', 'string'), +new ColumnMetadata('timestamp', 'date'), +new ColumnMetadata('source:type', 'string'), +new ColumnMetadata('ip_src_addr', 'ip'), +new ColumnMetadata('enrichments:geo:ip_dst_addr:country', 'string'), +new ColumnMetadata('ip_dst_addr', 'ip'), +new ColumnMetadata('host', 'string'), +new ColumnMetadata('alert_status', 'string') + ]; + + getAlerts(searchRequest: SearchRequest): Observable { +let url = '/search/*,-*' + ElasticsearchUtils.excludeIndexName + '/_search'; +let request: any = searchRequest; +request.query = { query_string: { query: searchRequest.query } }; + +return this.http.post(url, request, new RequestOptions({headers: new Headers(this.defaultHeaders)})) + .map(HttpUtil.extractData) + .map(ElasticsearchUtils.extractAlertsData) + .catch(HttpUtil.handleError) + .onErrorResumeNext(); + } + + getAlert(index: string, type: string, alertId: string): Observable { +return this.http.get('/search/' + index + '/' + type + '/' + alertId, new RequestOptions({headers: new Headers(this.defaultHeaders)})) + .map(HttpUtil.extractData); + } + + updateAlertState(request: any) { +return this.http.post('/search/_bulk', request, new RequestOptions({headers: new Headers(this.defaultHeaders)})) + .map(HttpUtil.extractData) + .catch(HttpUtil.handleError); + } + + getDefaultAlertTableColumnNames(): Observable { +return Observable.create(observer => { + observer.next(JSON.parse(JSON.stringify(this.defaultColumnMetadata))); + observer.complete(); +}); + } + + getAllFieldNames(): Observable { +let url = '_cluster/state'; --- End diff -- using cluster state here to enumerate indexes adds a lot of overhead, and brings back a lot of unnecessary meta data. Of course really this should ultimately be fronted with its own API from the REST service, but it might also be worth considering a simple '*' as url, which will just bring back index details without all the node and replica details. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron issue #651: METRON-1037 Added POWER function
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/651 Yes, I think I'll just refactor this around #650 when that's committed to keep the workflow simple. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #651: METRON-1037 Added POWER function
Github user simonellistonball closed the pull request at: https://github.com/apache/metron/pull/651 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #651: METRON-1037 Added POWER function
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/651#discussion_r127264326 --- Diff: metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/dsl/functions/MathFunctions.java --- @@ -60,4 +60,38 @@ public boolean isInitialized() { } } + @Stellar(name="POWER" --- End diff -- I went with POWER to be consistent with SQL and Excel, rather than python, since that's a more likely familiar ground for security and BI people. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron issue #651: METRON-1037 Added POWER function
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/651 I get the point @mattf-horton and thought about it, but went this way for familiarity's sake, a lot of the likely authors of stellar statements are security analysts who will be more familiar with languages like SQL or Excel functions, or even python than languages which use the ^ operator (off the top of my head: urgh... just tried python, scala, java, c, c++, javascript, can't actually think of one that does, though some have **) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #652: METRON-1039: Add ZIP function to Stellar
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/652#discussion_r127244811 --- Diff: metron-stellar/stellar-common/src/test/java/org/apache/metron/stellar/dsl/functions/FunctionalFunctionsTest.java --- @@ -24,13 +24,124 @@ import org.junit.Assert; import org.junit.Test; +import java.util.ArrayList; +import java.util.HashMap; import java.util.List; +import java.util.Map; import static org.apache.metron.stellar.common.utils.StellarProcessorUtils.run; public class FunctionalFunctionsTest { @Test + public void testZipLongest_boundary() { +for (String expr : ImmutableList.of( "ZIP_LONGEST()" +, "ZIP_LONGEST( null, null )" +, "ZIP_LONGEST( [], null )" +, "ZIP_LONGEST( [], [] )" +, "ZIP_LONGEST( null, [] )" +) +) +{ + List> o = (List>) run(expr, new HashMap<>()); + Assert.assertEquals(0, o.size()); +} + } + + @Test + public void testZip_longest() { +Map variables = ImmutableMap.of( +"list1" , ImmutableList.of(1, 2, 3) +,"list2", ImmutableList.of(4, 5, 6, 7) +); +for (String expr : ImmutableList.of( "ZIP_LONGEST(list1, list2)" +, "ZIP_LONGEST( [1, 2, 3], [4, 5, 6, 7] )" +) +) +{ + List> o = (List>) run(expr, variables); + Assert.assertEquals(4, o.size()); + for (int i = 0; i < 3; ++i) { +List l = o.get(i); +Assert.assertEquals(2, l.size()); +Assert.assertEquals(i+1, l.get(0)); +Assert.assertEquals(i+4, l.get(1)); + } + { +int i = 3; +List l = o.get(i); +Assert.assertEquals(2, l.size()); +Assert.assertNull(l.get(0)); +Assert.assertEquals(i+4, l.get(1)); + } +} + + +for (String expr : ImmutableList.of( + "REDUCE(ZIP_LONGEST(list2, list1), (s, x) -> s + GET_FIRST(x) * GET_LAST(x), 0)" +, "REDUCE(ZIP_LONGEST( [1, 2, 3], [4, 5, 6, 7] ), (s, x) -> s + GET_FIRST(x) * GET_LAST(x), 0)" +, "REDUCE(ZIP_LONGEST(list1, list2), (s, x) -> s + GET_FIRST(x) * GET_LAST(x), 0)" //this works because stellar treats nulls as 0 in arithmetic operations. +, "REDUCE(ZIP_LONGEST(list1, list2), (s, x) -> s + (GET_FIRST(x) == null?0:GET_FIRST(x)) * (GET_LAST(x) == null?0:GET_LAST(x)), 0)" //with proper guarding NOT assuming stellar peculiarities +) +) +{ + int o = (int) run(expr, variables); + Assert.assertEquals(1*4 + 2*5 + 3*6, o, 1e-7); --- End diff -- good point, i should probably have looked at the numbers. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #652: METRON-1039: Add ZIP function to Stellar
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/652#discussion_r127239254 --- Diff: metron-stellar/stellar-common/src/test/java/org/apache/metron/stellar/dsl/functions/FunctionalFunctionsTest.java --- @@ -24,13 +24,124 @@ import org.junit.Assert; import org.junit.Test; +import java.util.ArrayList; +import java.util.HashMap; import java.util.List; +import java.util.Map; import static org.apache.metron.stellar.common.utils.StellarProcessorUtils.run; public class FunctionalFunctionsTest { @Test + public void testZipLongest_boundary() { +for (String expr : ImmutableList.of( "ZIP_LONGEST()" +, "ZIP_LONGEST( null, null )" +, "ZIP_LONGEST( [], null )" +, "ZIP_LONGEST( [], [] )" +, "ZIP_LONGEST( null, [] )" +) +) +{ + List> o = (List>) run(expr, new HashMap<>()); + Assert.assertEquals(0, o.size()); +} + } + + @Test + public void testZip_longest() { +Map variables = ImmutableMap.of( +"list1" , ImmutableList.of(1, 2, 3) +,"list2", ImmutableList.of(4, 5, 6, 7) +); +for (String expr : ImmutableList.of( "ZIP_LONGEST(list1, list2)" +, "ZIP_LONGEST( [1, 2, 3], [4, 5, 6, 7] )" +) +) +{ + List> o = (List>) run(expr, variables); + Assert.assertEquals(4, o.size()); + for (int i = 0; i < 3; ++i) { +List l = o.get(i); +Assert.assertEquals(2, l.size()); +Assert.assertEquals(i+1, l.get(0)); +Assert.assertEquals(i+4, l.get(1)); + } + { +int i = 3; +List l = o.get(i); +Assert.assertEquals(2, l.size()); +Assert.assertNull(l.get(0)); +Assert.assertEquals(i+4, l.get(1)); + } +} + + +for (String expr : ImmutableList.of( + "REDUCE(ZIP_LONGEST(list2, list1), (s, x) -> s + GET_FIRST(x) * GET_LAST(x), 0)" +, "REDUCE(ZIP_LONGEST( [1, 2, 3], [4, 5, 6, 7] ), (s, x) -> s + GET_FIRST(x) * GET_LAST(x), 0)" +, "REDUCE(ZIP_LONGEST(list1, list2), (s, x) -> s + GET_FIRST(x) * GET_LAST(x), 0)" //this works because stellar treats nulls as 0 in arithmetic operations. +, "REDUCE(ZIP_LONGEST(list1, list2), (s, x) -> s + (GET_FIRST(x) == null?0:GET_FIRST(x)) * (GET_LAST(x) == null?0:GET_LAST(x)), 0)" //with proper guarding NOT assuming stellar peculiarities +) +) +{ + int o = (int) run(expr, variables); + Assert.assertEquals(1*4 + 2*5 + 3*6, o, 1e-7); --- End diff -- surely this does not actually test the structure is correct. I could get the same asserted result from zip(a, reversed(b)). I think we should have tests around defined ordering of pairs. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #652: METRON-1039: Add ZIP function to Stellar
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/652#discussion_r127231248 --- Diff: metron-stellar/stellar-common/README.md --- @@ -711,6 +713,18 @@ In the core language functions, we support basic functional programming primitiv * dateTime - The datetime as a long representing the milliseconds since unix epoch * Returns: The current year +### `ZIP` + * Description: Zips lists into a single list where the ith element is an list containing the ith items from the constituent lists. --- End diff -- might be kinder to link to the python docs on zip, or at least an example that isn't in lisp :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #651: METRON-1037 Added POWER function
GitHub user simonellistonball opened a pull request: https://github.com/apache/metron/pull/651 METRON-1037 Added POWER function ## Contributor Comments This is a quick addition to the Math functions. It may be worth revising following the work @cestella did this morning to add a more general implementation on top of #650 I will probably add 2 argument capabilities and port this work to #650 when it's in once that is in. ### For all changes: - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [x] Does your PR title start with METRON- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For code changes: - [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed? - [x] Have you included steps or a guide to how the change may be verified and tested manually? - [x] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via: ``` mvn -q clean integration-test install && build_utils/verify_licenses.sh ``` - [x] Have you written or updated unit tests and or integration tests to verify your changes? - [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent? ### For documentation related changes: - [x] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and the verify changes via `site-book/target/site/index.html`: ``` cd site-book mvn site ``` You can merge this pull request into a Git repository by running: $ git pull https://github.com/simonellistonball/incubator-metron METRON-1037 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/metron/pull/651.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #651 commit 3e784245576d28ba275b1d4b3cf98347ea13f14f Author: Simon Elliston Ball Date: 2017-07-13T00:51:44Z Added POWER function --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #649: METRON-1035 Added SUM to the rules triage aggregat...
GitHub user simonellistonball opened a pull request: https://github.com/apache/metron/pull/649 METRON-1035 Added SUM to the rules triage aggregation docs ## Contributor Comments Quick doc fix verified against code. ## Pull Request Checklist ### For all changes: - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [x] Does your PR title start with METRON- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For documentation related changes: - [x] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and the verify changes via `site-book/target/site/index.html`: ``` cd site-book mvn site ``` Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/simonellistonball/incubator-metron METRON-1035 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/metron/pull/649.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #649 commit d2604086c52f00656ffab23b8b4eab1d30aeecee Author: Simon Elliston Ball Date: 2017-07-12T22:12:31Z Added SUM to the rules triage aggregation docs --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #648: METRON-1033 Corrected profiler docs units on expir...
GitHub user simonellistonball opened a pull request: https://github.com/apache/metron/pull/648 METRON-1033 Corrected profiler docs units on expires field Minor change to update profiler docs ## Contributor Comments [Please place any comments here. A description of the problem/enhancement, how to reproduce the issue, your testing methodology, etc.] ## Pull Request Checklist Thank you for submitting a contribution to Apache Metron. Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides. In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following: ### For all changes: - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [x] Does your PR title start with METRON- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For documentation related changes: - [x] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and the verify changes via `site-book/target/site/index.html`: You can merge this pull request into a Git repository by running: $ git pull https://github.com/simonellistonball/incubator-metron METRON-1033 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/metron/pull/648.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #648 commit 880e10cfc31f49cb11a3f5639c2c0a217743c044 Author: Simon Elliston Ball Date: 2017-07-12T16:30:25Z Corrected profiler docs units on expires field --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron issue #599: METRON-975: Normalize logging and switch to common idiom ...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/599 Is it worth us getting this in sooner rather than later, before we get too many other bits of logging that will need to be backported? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #617: METRON-996 Performance improvement for ASA parser
GitHub user simonellistonball reopened a pull request: https://github.com/apache/metron/pull/617 METRON-996 Performance improvement for ASA parser Moved the compilation of Grok into initialisation and created a map of Grok instances for each ASA message type. No functional changes, but a significant performance improvement according to perfidix benchmarks (see JIRA). All functional tests related to the parser have been run, performance tests were run manually with a simple Perfidix harness (not included) running the parse method against sample messages taken from the existing unit tests. ## Pull Request Checklist Thank you for submitting a contribution to Apache Metron. Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides. In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following: ### For all changes: - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [x] Does your PR title start with METRON- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For code changes: - [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed? - [x] Have you included steps or a guide to how the change may be verified and tested manually? - [x] Have you ensured that the full suite of tests and checks have been executed in the root incubating-metron folder via: ``` mvn -q clean integration-test install && build_utils/verify_licenses.sh ``` - [x] Have you written or updated unit tests and or integration tests to verify your changes? - [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent? You can merge this pull request into a Git repository by running: $ git pull https://github.com/simonellistonball/incubator-metron METRON-996 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/metron/pull/617.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #617 commit 49c4bc63cca8d31e27b5cdee0d2b8787ebefb9be Author: Simon Elliston Ball Date: 2017-06-11T12:45:18Z Moved compilation of Grok statements into init, and produced a hashmap of Grok instances for each ASA tyype for performance gains commit a095b9ecccaf07442d2bcd767b2a7198ff4dffbc Author: Simon Elliston Ball Date: 2017-06-11T12:49:26Z Fixed style conventions --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #617: METRON-996 Performance improvement for ASA parser
Github user simonellistonball closed the pull request at: https://github.com/apache/metron/pull/617 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron issue #602: METRON-906: Rest service storm configuration does not all...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/602 Is there anything preventing this getting merged? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #617: METRON-996 Performance improvement for ASA parser
GitHub user simonellistonball opened a pull request: https://github.com/apache/metron/pull/617 METRON-996 Performance improvement for ASA parser Moved the compilation of Grok into initialisation and created a map of Grok instances for each ASA message type. No functional changes, but a significant performance improvement according to perfidix benchmarks (see JIRA). All functional tests related to the parser have been run, performance tests were run manually with a simple Perfidix harness (not included) running the parse method against sample messages taken from the existing unit tests. ## Pull Request Checklist Thank you for submitting a contribution to Apache Metron. Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides. In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following: ### For all changes: - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [x] Does your PR title start with METRON- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For code changes: - [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed? - [x] Have you included steps or a guide to how the change may be verified and tested manually? - [x] Have you ensured that the full suite of tests and checks have been executed in the root incubating-metron folder via: ``` mvn -q clean integration-test install && build_utils/verify_licenses.sh ``` - [x] Have you written or updated unit tests and or integration tests to verify your changes? - [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent? You can merge this pull request into a Git repository by running: $ git pull https://github.com/simonellistonball/incubator-metron METRON-996 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/metron/pull/617.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #617 commit 49c4bc63cca8d31e27b5cdee0d2b8787ebefb9be Author: Simon Elliston Ball Date: 2017-06-11T12:45:18Z Moved compilation of Grok statements into init, and produced a hashmap of Grok instances for each ASA tyype for performance gains commit a095b9ecccaf07442d2bcd767b2a7198ff4dffbc Author: Simon Elliston Ball Date: 2017-06-11T12:49:26Z Fixed style conventions --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #614: METRON-992: Create performance tuning guide
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/614#discussion_r120961228 --- Diff: metron-platform/Performance-tuning-guide.md --- @@ -0,0 +1,326 @@ +# Metron Performance Tunining Guide + +## Overview + +This document provides guidance from our experiences tuning the Apache Metron Storm topologies for maximum performance. You'll find +suggestions for optimum configurations under a 1 gbps load along with some guidance around the tooling we used to monitor and assess +our throughput. + +In the simplest terms, Metron is a streaming architecture created on top of Kafka and three main types of Storm topologies: parsers, +enrichment, and indexing. Each parser has it's own topology and there is also a highly performant, specialized spout-only topology +for streaming PCAP data to HDFS. We found that the architecture can be tuned almost exclusively through using a few primary Storm and +Kafka parameters along with a few Metron-specific options. You can think of the data flow as being similar to water flowing through a +pipe, and the majority of these options assist in tweaking the various pipe widths in the system. + +## General Suggestions + +Note that there is currently no method for specifying the number of tasks from the number of executors in Flux topologies (enrichment, + indexing). By default, the number of tasks will equal the number of executors. Logically, setting the number of tasks equal to the number +of executors is sensible. Storm enforces # executors <= # tasks. The reason you might set the number of tasks higher than the number of +executors is for future performance tuning and rebalancing without the need to bring down your topologies. The number of tasks is fixed +at topology startup time whereas the number of executors can be increased up to a maximum value equal to the number of tasks. + +We found that the default values for poll.timeout.ms, offset.commit.period.ms, and max.uncommitted.offsets worked well in nearly all cases. +As a general rule, it was optimal to set spout parallelism equal to the number of partitions used in your Kafka topic. Any greater +parallelism will leave you with idle consumers since Kafka limits the max number of consumers to the number of partitions. This is +important because Kafka has certain ordering guarantees for message delivery per partition that would not be possible if more than +one consumer in a given consumer group were able to read from that partition. + +## Tooling + +Before we get to the actual tooling used to monitor performance, it helps to describe what we might actually want to monitor and potential +pain points. Prior to switching over to the new Storm Kafka client, which leverages the new Kafka consumer API under the hood, offsets +were stored in Zookeeper. While the broker hosts are still stored in Zookeeper, this is no longer true for the offsets which are now +stored in Kafka itself. This is a configurable option, and you may switch back to Zookeeper if you choose, but Metron is currently using +the new defaults. This is useful to know as you're investigating both correctness as well as throughput performance. + +First we need to setup some environment variables +``` +export BROKERLIST= +export ZOOKEEPER= +export KAFKA_HOME= +export METRON_HOME= +export HDP_HOME= +``` + +If you have Kerberos enabled, setup the security protocol +``` +$ cat /tmp/consumergroup.config +security.protocol=SASL_PLAINTEXT +``` + +Now run the following command for a running topology's consumer group. In this example we are using enrichments. +``` +${KAFKA_HOME}/bin/kafka-consumer-groups.sh \ +--command-config=/tmp/consumergroup.config \ +--describe \ +--group enrichments \ +--bootstrap-server $BROKERLIST \ +--new-consumer +``` + +This will return a table with the following output depicting offsets for all partitions and consumers associated with the specified +consumer group: +``` +GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG OWNER +enrichmentsenrichments9 29746066 297460671 consumer-2_/xxx.xxx.xxx.xxx +enrichmentsenrichments3 29754325 297543261 consumer-1_/xxx.xxx.xxx.xxx +enrichmentsenrichments43 29754331 297543321 consumer-6_/xxx.xxx.xxx.xxx +... +``` + +_Note_: You won't see any output until a topology is actually running because the consumer groups only exist while cons
[GitHub] metron pull request #614: METRON-992: Create performance tuning guide
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/614#discussion_r120961066 --- Diff: metron-platform/Performance-tuning-guide.md --- @@ -0,0 +1,326 @@ +# Metron Performance Tunining Guide + +## Overview + +This document provides guidance from our experiences tuning the Apache Metron Storm topologies for maximum performance. You'll find +suggestions for optimum configurations under a 1 gbps load along with some guidance around the tooling we used to monitor and assess +our throughput. + +In the simplest terms, Metron is a streaming architecture created on top of Kafka and three main types of Storm topologies: parsers, +enrichment, and indexing. Each parser has it's own topology and there is also a highly performant, specialized spout-only topology +for streaming PCAP data to HDFS. We found that the architecture can be tuned almost exclusively through using a few primary Storm and +Kafka parameters along with a few Metron-specific options. You can think of the data flow as being similar to water flowing through a +pipe, and the majority of these options assist in tweaking the various pipe widths in the system. + +## General Suggestions + +Note that there is currently no method for specifying the number of tasks from the number of executors in Flux topologies (enrichment, + indexing). By default, the number of tasks will equal the number of executors. Logically, setting the number of tasks equal to the number +of executors is sensible. Storm enforces # executors <= # tasks. The reason you might set the number of tasks higher than the number of +executors is for future performance tuning and rebalancing without the need to bring down your topologies. The number of tasks is fixed +at topology startup time whereas the number of executors can be increased up to a maximum value equal to the number of tasks. + +We found that the default values for poll.timeout.ms, offset.commit.period.ms, and max.uncommitted.offsets worked well in nearly all cases. +As a general rule, it was optimal to set spout parallelism equal to the number of partitions used in your Kafka topic. Any greater +parallelism will leave you with idle consumers since Kafka limits the max number of consumers to the number of partitions. This is +important because Kafka has certain ordering guarantees for message delivery per partition that would not be possible if more than +one consumer in a given consumer group were able to read from that partition. + +## Tooling + +Before we get to the actual tooling used to monitor performance, it helps to describe what we might actually want to monitor and potential +pain points. Prior to switching over to the new Storm Kafka client, which leverages the new Kafka consumer API under the hood, offsets +were stored in Zookeeper. While the broker hosts are still stored in Zookeeper, this is no longer true for the offsets which are now +stored in Kafka itself. This is a configurable option, and you may switch back to Zookeeper if you choose, but Metron is currently using +the new defaults. This is useful to know as you're investigating both correctness as well as throughput performance. + +First we need to setup some environment variables +``` +export BROKERLIST= +export ZOOKEEPER= +export KAFKA_HOME= +export METRON_HOME= +export HDP_HOME= +``` + +If you have Kerberos enabled, setup the security protocol +``` +$ cat /tmp/consumergroup.config +security.protocol=SASL_PLAINTEXT +``` + +Now run the following command for a running topology's consumer group. In this example we are using enrichments. +``` +${KAFKA_HOME}/bin/kafka-consumer-groups.sh \ +--command-config=/tmp/consumergroup.config \ +--describe \ +--group enrichments \ +--bootstrap-server $BROKERLIST \ +--new-consumer +``` + +This will return a table with the following output depicting offsets for all partitions and consumers associated with the specified +consumer group: +``` +GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG OWNER +enrichmentsenrichments9 29746066 297460671 consumer-2_/xxx.xxx.xxx.xxx +enrichmentsenrichments3 29754325 297543261 consumer-1_/xxx.xxx.xxx.xxx +enrichmentsenrichments43 29754331 297543321 consumer-6_/xxx.xxx.xxx.xxx +... +``` + +_Note_: You won't see any output until a topology is actually running because the consumer groups only exist while cons
[GitHub] metron issue #530: METRON-777 Metron Extension System and Parser Extensions
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/530 Awesome, I guess this should be covered by the integration test suite as well, which has been kept reasonably up to date with the recent changes. Anything there you think might deserve a little tightening, or any specific risk areas you see that we should cover in integration-tests? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron issue #530: METRON-777 Metron Extension System and Parser Extensions
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/530 @ottobackwards there have been a number of tweaks to the parsers since this first went in, do you anticipate any need to port any of those by hand, or will this structure pick up changes with a master rebase? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] metron pull request #607: METRON-982 add new rest api for storm supervisor s...
Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/607#discussion_r120112633 --- Diff: metron-interface/metron-rest/src/main/java/org/apache/metron/rest/service/impl/StormStatusServiceImpl.java --- @@ -49,6 +44,11 @@ public StormStatusServiceImpl(Environment environment, RestTemplate restTemplate } @Override + public SupervisorSummary getSupervisorSummary(){ +return restTemplate.getForObject("http://"; + environment.getProperty(STORM_UI_SPRING_PROPERTY) + SUPERVISOR_SUMMARY_URL, SupervisorSummary.class); --- End diff -- see https://issues.apache.org/jira/browse/METRON-906 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---