[GitHub] metron issue #861: METRON-1341 Implemented SELECT transformer to project fie...
Github user ottobackwards commented on the issue: https://github.com/apache/metron/pull/861 Thank you for the contribution @simonellistonball, please be sure to assign and close METRON-1341 in jira. Cheers! #ynwa ---
[GitHub] metron issue #861: METRON-1341 Implemented SELECT transformer to project fie...
Github user cestella commented on the issue: https://github.com/apache/metron/pull/861 Yeah, this is good. +1 by inspection. ---
[GitHub] metron issue #861: METRON-1341 Implemented SELECT transformer to project fie...
Github user ottobackwards commented on the issue: https://github.com/apache/metron/pull/861 OK, I re-ran my scenario after the latest changes and verified in Kibana that only the msg field was indexed (apart from Metron system fields and fields added later in the pipeline). +1 ---
[GitHub] metron issue #861: METRON-1341 Implemented SELECT transformer to project fie...
Github user ottobackwards commented on the issue: https://github.com/apache/metron/pull/861 I ran the following test. I modified the default snort parser configuration to:
```json
{
  "parserClassName": "org.apache.metron.parsers.snort.BasicSnortParser",
  "sensorTopic": "snort",
  "parserConfig": {},
  "fieldTransformations": [
    {
      "output": ["msg"],
      "transformation": "SELECT"
    }
  ]
}
```
And the default snort enrichment configuration to:
```json
{
  "enrichment": {},
  "threatIntel": {}
}
```
I got the following:
```
2.168.138.158,49189,62.75.195.236,80,00:00:00:00:00:00,00:00:00:00:00:00,0x3C,***A,0x9DFB1927,0xF1BD72CC,,0xFAF0,128,0,2360,40,40960","enrichmentsplitterbolt.splitter.end.ts":"1512763453749","enrichmentsplitterbolt.splitter.begin.ts":"1512763453749","guid":"08a84757-bf05-431b-9d81-5fa95fb99938","timestamp":1512763452000}
	at org.apache.metron.enrichment.bolt.EnrichmentJoinBolt.getStreamIds(EnrichmentJoinBolt.java:53) ~[stormjar.jar:?]
	at org.apache.metron.enrichment.bolt.EnrichmentJoinBolt.getStreamIds(EnrichmentJoinBolt.java:33) ~[stormjar.jar:?]
	at org.apache.metron.enrichment.bolt.JoinBolt.execute(JoinBolt.java:138) [stormjar.jar:?]
	at org.apache.storm.daemon.executor$fn__6573$tuple_action_fn__6575.invoke(executor.clj:734) [storm-core-1.0.1.2.5.3.0-37.jar:1.0.1.2.5.3.0-37]
	at org.apache.storm.daemon.executor$mk_task_receiver$fn__6494.invoke(executor.clj:466) [storm-core-1.0.1.2.5.3.0-37.jar:1.0.1.2.5.3.0-37]
	at org.apache.storm.disruptor$clojure_handler$reify__6007.onEvent(disruptor.clj:40) [storm-core-1.0.1.2.5.3.0-37.jar:1.0.1.2.5.3.0-37]
	at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451) [storm-core-1.0.1.2.5.3.0-37.jar:1.0.1.2.5.3.0-37]
	at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430) [storm-core-1.0.1.2.5.3.0-37.jar:1.0.1.2.5.3.0-37]
	at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73) [storm-core-1.0.1.2.5.3.0-37.jar:1.0.1.2.5.3.0-37]
	at org.apache.storm.daemon.executor$fn__6573$fn__6586$fn__6639.invoke(executor.clj:853) [storm-core-1.0.1.2.5.3.0-37.jar:1.0.1.2.5.3.0-37]
	at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484) [storm-core-1.0.1.2.5.3.0-37.jar:1.0.1.2.5.3.0-37]
	at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]
2017-12-08 20:04:17.171 o.a.m.e.b.EnrichmentSplitterBolt [ERROR] Trying to retrieve a field map with sensor type of null
2017-12-08 20:04:17.171 o.a.m.e.b.EnrichmentSplitterBolt [ERROR] Trying to retrieve a field map with sensor type of null
2017-12-08 20:04:17.171 o.a.m.e.b.EnrichmentSplitterBolt [ERROR] Trying to retrieve a field map with sensor type of null
2017-12-08 20:04:17.171 o.a.m.e.b.EnrichmentSplitterBolt [ERROR] Trying to retrieve a field map with sensor type of null
2017-12-08 20:04:17.171 o.a.m.e.b.EnrichmentSplitterBolt [ERROR] Trying to retrieve a field map with sensor type of null
2017-12-08 20:04:17.171 o.a.m.e.b.EnrichmentSplitterBolt [ERROR] Trying to retrieve a field map with sensor type of null
2017-12-08 20:04:17.171 o.a.m.e.b.EnrichmentSplitterBolt [ERROR] Trying to retrieve a field map with sensor type of null
2017-12-08 20:04:17.171 o.a.m.e.b.EnrichmentSplitterBolt [ERROR] Trying to retrieve a field map with sensor type of null
2017-12-08 20:04:17.171 o.a.m.e.b.EnrichmentSplitterBolt [ERROR] Trying to retrieve a field map with sensor type of null
```
So it looks like there are more fields to protect. ---
[GitHub] metron issue #861: METRON-1341 Implemented SELECT transformer to project fie...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/861 Good catch. My test included source.type in the select list, and I generalised badly when writing the instructions. We should include source.type in the protected system fields; let me update the test and implementation. ---
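For readers following along, a select list that also names source.type (the condition under which the original test passed) might look roughly like the fragment below. This is only a sketch: the exact field list used in that test is not shown in this thread, and once source.type is treated as a protected system field, listing it explicitly should no longer be needed.

```json
{
  "parserClassName": "org.apache.metron.parsers.snort.BasicSnortParser",
  "sensorTopic": "snort",
  "parserConfig": {},
  "fieldTransformations": [
    {
      "output": ["msg", "source.type"],
      "transformation": "SELECT"
    }
  ]
}
```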
[GitHub] metron issue #861: METRON-1341 Implemented SELECT transformer to project fie...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/861 Actually, to follow up on that... I have a proxy feed, and some proxy use cases (enrichment, profile, etc). I want to keep my data clean and be explicit about which fields I pass on, so I have a 'library' field set that means "proxy like stuff". These are the fields I push into the output here. ---
[GitHub] metron issue #861: METRON-1341 Implemented SELECT transformer to project fie...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/861 If they want to select most fields and remove the ones they don't want, then I would recommend using the REMOVE transformation, or setting the unwanted fields to null in Stellar. Regex support might be a nice follow-on, but it breaks the mental model of people who are used to any other language that handles projection, such as SQL. The goal here was to allow users to explicitly select only a defined set of fields. To be honest, people have lived with the idea of explicitly choosing fields for decades in SQL and quite liked it, so I suspect adding something pattern-based might make it less usable rather than more. ---
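As a rough illustration of the REMOVE alternative mentioned above, a parser configuration that keeps most fields and drops a few might look like the sketch below. The names `unwanted_field_1` and `unwanted_field_2` are hypothetical placeholders, not fields from the snort parser; the rest mirrors the snort configuration quoted elsewhere in this thread.

```json
{
  "parserClassName": "org.apache.metron.parsers.snort.BasicSnortParser",
  "sensorTopic": "snort",
  "parserConfig": {},
  "fieldTransformations": [
    {
      "input": ["unwanted_field_1", "unwanted_field_2"],
      "transformation": "REMOVE"
    }
  ]
}
```

Note the asymmetry: REMOVE names the fields to drop under `input`, whereas SELECT names the fields to keep under `output`.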
[GitHub] metron issue #861: METRON-1341 Implemented SELECT transformer to project fie...
Github user ottobackwards commented on the issue: https://github.com/apache/metron/pull/861 I am trying to run this up in full dev. I have two issues: 1. expressJS is failing to download, obviously outside this PR. 2. The usability of this transform. If a user just wants a couple of fields, then no problem, but if they want to select *most* of the fields, they have to list them all out, which seems like a bit of a chore. I wonder if we need regex support or something. Thoughts? ---
[GitHub] metron issue #861: METRON-1341 Implemented SELECT transformer to project fie...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/861 Right, that should cover it. ---
[GitHub] metron issue #861: METRON-1341 Implemented SELECT transformer to project fie...
Github user ottobackwards commented on the issue: https://github.com/apache/metron/pull/861 That is an excellent catch @simonellistonball, a test to go with it too ;) ---
[GitHub] metron issue #861: METRON-1341 Implemented SELECT transformer to project fie...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/861 It suddenly occurs to me that we should probably whitelist the original_string and timestamp fields, so that these are always kept by this transformation. Does that make sense? ---
[GitHub] metron issue #861: METRON-1341 Implemented SELECT transformer to project fie...
Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/861 Somehow I knew you were going to say something about docs. Will add. ---
[GitHub] metron issue #861: METRON-1341 Implemented SELECT transformer to project fie...
Github user ottobackwards commented on the issue: https://github.com/apache/metron/pull/861 Thanks Simon. Before a code review, a couple of questions: I see a check next to steps to verify manually, but I don't see the steps. This is also missing documentation. ---