Re: [DISCUSS] Persisting user data

2017-08-02 Thread Simon Elliston Ball
Agreed on Postgres. It's a lot easier to work with license-wise in apache projects, and has a lot of the capability we need here, especially if we can find a sensible ORM. Anyone got any thoughts on what would work there? Simon > On 2 Aug 2017, at 21:21, Matt Foley wrote: >

Re: [DISCUSS] Persisting user data

2017-08-03 Thread Simon Elliston Ball
y wars by being agnostic. > > On Wed, Aug 2, 2017 at 9:36 PM, Ryan Merriman <merrim...@gmail.com> wrote: > >> Spring supports a variety of databases including Postgres. I have no >> problem with using Postgres instead of MySQL. >> >> On Wed, Aug 2, 2017 a

Re: Elasticsearch 5.x upgrade

2017-07-17 Thread Simon Elliston Ball
I assume you're talking about METRON-939. The upgrade is not a huge change, and unlikely to touch any of the bulk writing pieces. The main thing outstanding on the PR is log4j dependency conflicts from the new ES client. There's no work in that PR to do anything around mpack based install either

Re: Post-parsing and Enrichment test framework

2017-07-04 Thread Simon Elliston Ball
You should probably use the Stellar REPL (../metron/bin/stellar -z $ZK) which gives you a kind of Stellar playground. Simon > On 4 Jul 2017, at 15:02, Ali Nazemian wrote: > > Hi all, > > I was wondering if there is a test framework we can use for Stellar > post-parsing

Re: [DISCUSS] Metron Rest to Install Parser (METRON-258)

2017-04-27 Thread Simon Elliston Ball
Otto, Happy to help around this. Couple of questions and things to maybe think about… REST seems fine for sending the payload, but the unpacking job should probably be async (triggered by the job, but not responsible for replying, i.e. the initial call gets an ACCEPTED response immediately
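The asynchronous pattern suggested in this message (reply ACCEPTED at once, do the unpacking in the background) can be sketched roughly as follows. This is only an illustration, not Metron REST code: the in-memory job registry, the function names and the placeholder "unpacking" step are all assumptions made for the example.

```python
import threading
import uuid

# Hypothetical in-memory job registry; a real service would persist this.
_jobs = {}
_lock = threading.Lock()


def _unpack(job_id, payload):
    # Placeholder for the slow unpacking work (assumption: upper-casing
    # stands in for whatever the real job would do).
    result = payload.upper()
    with _lock:
        _jobs[job_id] = {"state": "DONE", "result": result}


def submit(payload):
    """Register the job, start the work in the background, reply at once."""
    job_id = str(uuid.uuid4())
    with _lock:
        _jobs[job_id] = {"state": "RUNNING", "result": None}
    worker = threading.Thread(target=_unpack, args=(job_id, payload))
    worker.start()
    # 202 is the HTTP "Accepted" status: the request was received,
    # but processing has not necessarily finished.
    return 202, job_id, worker


def status(job_id):
    """Return a copy of the current job record for polling clients."""
    with _lock:
        return dict(_jobs[job_id])
```

The caller keeps the returned job id and polls `status(job_id)` until the state changes; the worker handle is returned here only so the example can be exercised deterministically.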

Re: Normalization topology or separate normalization bolt for parsing topology

2017-04-27 Thread Simon Elliston Ball
e parsing. This is only a single example of cases that might > affect the production data. Unless Stellar transformation is something > that can be done at pre-parse and for the entire message. > > > On Thu, Apr 27, 2017 at 11:14 AM, Simon Elliston Ball < > si...@simonellistonball.c

Re: [DISCUSS] Synopsis of Community Meeting on 8/22/2017

2017-08-23 Thread Simon Elliston Ball
Hi Jon, Good points all. Some of the versions we have been using are certainly aging. There are some PRs and tickets out for ES 5 upgrade, and some fine work has already been done on Centos 7 installation. I think it would be well worth us making some community decisions about bumping up some

Re: Parser Docs

2017-05-11 Thread Simon Elliston Ball
>>> > There is a readme.md PER parser in 777. >>> > I only stubbed them out however. >>> > >>> > Each parser created by the archetype has one as well. >>> > >>> > What I was hoping to do was to

Re: Parser Docs

2017-05-11 Thread Simon Elliston Ball
>> >> What I was hoping to do was to include the parser docs in the package >> assembly so the UI could load it. >> >> >> >> On May 8, 2017 at 19:35:41, Simon Elliston Ball >> (si...@simonellistonball.com) wrote: >> >> Quic

Re: Question about metron config licensing

2017-05-11 Thread Simon Elliston Ball
That would be an oversight. It should be Apache. I’ll fix that now. Simon > On 11 May 2017, at 12:59, Otto Fowler wrote: > > Hey, in the package.json for metron config, it is showing the lic. as MIT, > but the LICENSE file shows Apache 2.0. > That doesn’t seem right.

Re: [DISCUSS] Enrichment Split/Join issues

2017-05-16 Thread Simon Elliston Ball
Nick, I’d tend to agree with you there. How about: If an enrichment fails / effectively times out, the join bolt emits the message before cache eviction (as Nick’s point 2), but also adds a field stub to indicate failed enrichment. This is then an indicator to an operator or investigator as
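A rough sketch of the idea in this message: when the join gives up waiting, emit the message anyway and add a stub field naming the enrichments that never arrived. The function, the `enrichment_failed` field name and the shape of the inputs are hypothetical; this is not the join bolt's actual code.

```python
def join_enrichments(message, results, expected):
    """Merge the enrichment results that arrived; flag the ones that did not."""
    joined = dict(message)
    failed = []
    for name in expected:
        if name in results:
            joined.update(results[name])
        else:
            failed.append(name)
    if failed:
        # Hypothetical stub field: tells an operator or investigator
        # which enrichments are missing from this message.
        joined["enrichment_failed"] = sorted(failed)
    return joined
```

A message with no failures comes out without the stub field, so its presence alone can be used as a filter.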

Re: [DISCUSS] Enrichment Split/Join issues

2017-05-16 Thread Simon Elliston Ball
Would you then parallelise within Stellar to handle things like multiple lookups? This feels like it would be breaking the storm model somewhat, and could lead to bad things with threads for example. Or would you think of doing something like the grouping Stellar uses today to parallelise

Re: Why bro parser allows periods in keys?

2017-05-09 Thread Simon Elliston Ball
When we encounter this problem in things like enrichment, we’ve generally re-written the dots to ‘:’ at the indexing stage, no? That should be fine, which would suggest the cleaner is being over enthusiastic in this limit, may be worth validating that before changing the parser. Simon > On 9
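The rewrite of dots to ':' mentioned in this message amounts to something like the sketch below. It is an illustration only; the function name is made up and it handles only top-level keys, which is an assumption about the flat message shape.

```python
def rewrite_dots(message, replacement=":"):
    """Return a copy of the message with '.' in field names replaced."""
    return {key.replace(".", replacement): value for key, value in message.items()}
```

Values are left untouched, so an IP address such as `10.0.0.1` stored as a value keeps its dots.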

Re: Why bro parser allows periods in keys?

2017-05-09 Thread Simon Elliston Ball
Jon, What would your use case be for comparison? Reconciliation of the sources? In theory both should be identical since they’re indexed from the same source. There should never be any reason to combine ES and HDFS indexing, unless there is a use case I’m missing... Simon > On 9 May 2017, at

Re: performance benchmarks on the asa parser

2017-06-08 Thread Simon Elliston Ball
low hanging fruit. Simon > On 9 Jun 2017, at 01:52, Otto Fowler <ottobackwa...@gmail.com> wrote: > > Are these changes that all grok parsers can benefit from? Are your changes > to the base classes that they use or asa only? > > > > On June 8, 2017 at 20:

Re: performance benchmarks on the asa parser

2017-06-08 Thread Simon Elliston Ball
can compile on first use of a grok and > then hold in memory? Avoids the up front burden but should also boost > performance. > > -Kyle > >> On Jun 8, 2017, at 8:56 PM, Simon Elliston Ball >> <si...@simonellistonball.com> wrote: >> >> The changes are

Re: [DISCUSS] Metadata Ingest

2017-06-21 Thread Simon Elliston Ball
I really like this idea. A good use case I imagine would be to have something like asa data, tagged with some custom meta data (e.g. Tenant ID in a multi-tenant install) but not have to mess with the actual parser. To that extent it makes sense to expose said meta data via stellar so users can

Re: [DISCUSS] Mutation of Indexed Data

2017-06-21 Thread Simon Elliston Ball
I'd say that was an excellent set of requirements (very similar to the one we arrived on with the last discuss thread on this). My vote remains a transaction log in hbase given the relatively low volume (human scale). I would not expect this to need anything fancy like compaction into hdfs

Re: Hello world of Metron !

2017-05-24 Thread Simon Elliston Ball
Welcome Geoff! Really looking forward to your contributions. There have been a few discussions around data models and standards, and I expect quite a few more to come, so it is certainly a great time to have you involved with that experience. Simon > On 24 May 2017, at 13:40, Geoff M

Re: Storm Slots - creating and deploying parsers

2017-06-02 Thread Simon Elliston Ball
Correct, you need to make sure you have sufficient slots in Ambari. Personally I tend to just add a bunch of ports at install, and create slots across the supervisors, but you can always add more supervisors too. Simon > On 2 Jun 2017, at 18:10, Otto Fowler wrote: >

Re: Question about the customization of Metron with my machine learining algo.

2017-06-05 Thread Simon Elliston Ball
Hi Simone, and welcome to the community. There are a number of extension points in Metron, the key ones being around machine learning. I suggest taking a look at https://github.com/apache/metron/tree/master/metron-analytics/metron-maas-service

Re: [DISCUSS] REST + ambari

2017-05-08 Thread Simon Elliston Ball
t? > > Right now, we have env parameters in ambari, that rest should honor. I don’t > understand how moving rest config into ambari gets my rest service access to > those parameters > > > > On May 8, 2017 at 08:56:21, Simon Elliston Ball (si...@simonellistonb

Re: [DISCUSS] REST + ambari

2017-05-08 Thread Simon Elliston Ball
t caught up :) >> I’ll look for an example where we are reading application configuration >> variables out in the rest service. >> >> Thanks! >> >> >> >> On May 8, 2017 at 09:08:46, Simon Elliston Ball (si...@simonellistonball.com >>

Re: [GitHub] metron pull request #:

2017-09-17 Thread Simon Elliston Ball
Did you also install the metron-maas rpm? The ambari install does not currently install Model as a Service. yum install -y metron-maas on the node that has the parsers / indexing / enrichment service should do the trick. Simon > On 18 Sep 2017, at 14:36, HitsuYaga

Re: [DISUCUSS] [CALL FOR COMMENT] Metron parsers as actual extensions

2017-09-20 Thread Simon Elliston Ball
Otto, Can you just clarify what you mean by parsers in this instance. To my mind parsers in metron are classes, and do not have any configuration settings. Instances of parsers are referred to in the ui as sensors, and are essentially concrete instances of parsers and as such do have

[DISCUSS] Dropping support for elastic 2.x

2017-10-04 Thread Simon Elliston Ball
A number of people are currently working on upgrading the ES support in Metron to 5.x (including the clients, and the mpack managed install). Would anyone have any objections to dropping formal support for 2.x as a result of this work? In theory the clients should be backward compatible against

Re: [DISCUSS] Dropping support for elastic 2.x

2017-10-04 Thread Simon Elliston Ball
oped for. >> >> The 5.6 client can communicate with any 5.6.x Elasticsearch node. Previous >> 5.x minor versions like 5.5.x, 5.4.x etc. are not (fully) supported. >> " >> >> Best, >> Mike >> >> >> On Wed, Oct 4, 2017 at 10:45 AM, Sim

Re: Suricata parser

2017-10-17 Thread Simon Elliston Ball
Suricata will quite happily produce JSON (http://suricata.readthedocs.io/en/latest/output/eve/eve-json-output.html), which works nicely in the JSONMapParser. You can then use simple field transformations from that

Re: Ambari Metrics Collector failing...

2017-09-07 Thread Simon Elliston Ball
Correct, it’s not critical. Metrics can help a bit with debugging things like node hotspots in Metron and in HDP generally, but it’s certainly not required to run. Simon > On 7 Sep 2017, at 16:13, zeo...@gmail.com wrote: > > I wouldn't consider it a show stopper myself,

Re: [DISCUSS] Community meeting on Tuesday, Sept.23 10AM PST

2017-09-26 Thread Simon Elliston Ball
Hi Otto, This is a great demo, nice and clear, many thanks. Two questions remain for me: 1. how I would change configuration outside of the bundle? i.e. I install a bundle and that gives me enrichment and indexing config, but I then want to tune indexing for the characteristics of the

Re: [DISCUSS] Community meeting on Tuesday, Sept.23 10AM PST

2017-09-26 Thread Simon Elliston Ball
I was always expecting > refactoring around the configuration management and install part, so please, > let me know what you think and I can take a shot at it. Don’t forget about > irc. > > > > > On September 26, 2017 at 05:17:32, Simon Elliston Ball > (si...@simo

Re: sensor-parser-config-history

2017-08-31 Thread Simon Elliston Ball
That looks like it's the front for when there was versioning of config in the rest api. The versioning was based on hibernate envers so had licence issues, and stored change history into the rest rdbms. That functionality should probably come back one day once we have a proper approach to

Shuffle methods in indexing and enrichment topologies

2017-09-03 Thread Simon Elliston Ball
Is there a reason we use shuffle instead of local or shuffle in the indexing topology? It seems like LOCAL_OR_SHUFFLE would help us get better pipelines through Storm and reduce the storm to storm shuffle traffic. Did I miss something here about balancing, or is it worth changing? Simon

Re: Quick Dev

2017-10-06 Thread Simon Elliston Ball
+1 we see a lot of people struggling with the profusion of install and run methods as it is, if we can reduce that surface area, life will be a lot easier on the user list. > On 6 Oct 2017, at 13:28, zeo...@gmail.com wrote: > > I say we kill it and repoint the site. That

Re: Quick Dev

2017-10-06 Thread Simon Elliston Ball
tform" >>>> >>>> On Fri, Oct 6, 2017 at 8:39 AM, Nick Allen <n...@nickallen.org> wrote: >>>> >>>>> +1 To killing Quick Dev and updating the Wiki. Quick Dev has been >>> broken >>>>> for eons. Simon's poin

Re: ASA ciscotag error messages

2017-10-17 Thread Simon Elliston Ball
We certainly don’t parse every type of asa message at present. The challenge is getting hold of good samples from the wild to extend the range. If you have samples that can be anonymised of the missing tags, it would be easy to extend the patterns library to pull those in. What we need to get

Re: Sizing of components proportional to EPS

2017-10-17 Thread Simon Elliston Ball
To an extent it very much depends on the use case. I have seen over a million EPS on a six node cluster for pcap and basic net flow. If you add a lot of complex enrichment and profiling that will obviously increase the load. Tuning the components for the workload can also make a significant

Re: [DISCUSS] NPM / Node Problems

2017-11-27 Thread Simon Elliston Ball
of node & npm but > we can surely suggest a min version required to build UI successfully. > > -Raghu > > > > On Fri, Nov 24, 2017 at 10:21 PM, Simon Elliston Ball > <si...@simonellistonball.com> wrote: >> Agreeing with Nick, it seems like the main reason peop

Re: [DISCUSS] NPM / Node Problems

2017-11-27 Thread Simon Elliston Ball
os in all these cases. > > > > On November 27, 2017 at 07:02:51, Simon Elliston Ball > (si...@simonellistonball.com) wrote: > >> Thinking about this, doesn’t our build plugin explicitly install its own >> node? So actuall

Re: Using Storm Resource Aware Scheduler

2017-11-26 Thread Simon Elliston Ball
I/Ambari, it will be overwritten. > > > Cheers, > Ali > > On Sat, Nov 25, 2017 at 3:36 AM, Simon Elliston Ball < > si...@simonellistonball.com> wrote: > >> Implementing the resource aware scheduler would be decidedly non-trivial. >> Every topology will need

DISCUSS: Quick change to parser config

2017-11-30 Thread Simon Elliston Ball
I’m looking at the way parser config works, and transformation of field from their native names in, for example the ASA or CEF parsers, into a standard data model. At the moment I would do something like this: assuming I have fields [ipSrc, ipDst, pointlessExtraStuff, message] I might have:
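To make the rename-then-drop idea concrete, here is a small sketch using the field names from this message. It only illustrates the data flow; the two functions are hypothetical helpers, not Metron transformations, and the `ip_src_addr` / `ip_dst_addr` target names are taken from snippets later in this thread.

```python
def rename_fields(message, mapping):
    """Rename fields; mapping is new name -> old name, e.g. "ip_dst_addr": "ipDst"."""
    renamed = dict(message)
    for new_name, old_name in mapping.items():
        if old_name in renamed:
            renamed[new_name] = renamed.pop(old_name)
    return renamed


def select_fields(message, keep):
    """Keep only the listed fields and drop everything else."""
    return {key: value for key, value in message.items() if key in keep}
```

Applying `rename_fields` and then `select_fields` with `["ip_src_addr", "ip_dst_addr", "message"]` leaves a message without `pointlessExtraStuff`.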

Re: DISCUSS: Quick change to parser config

2017-11-30 Thread Simon Elliston Ball
dr": "ipDst" > } , > { > "transformation": "STELLAR", > "operation": "SomeOtherThing", > "output": ["foo", "bar"], > "config": { > "foo": "TO_UPPER(foo)", > "bar": "TO_LOWER(bar)"

Re: DISCUSS: Quick change to parser config

2017-11-30 Thread Simon Elliston Ball
R", >> "operation": "SomeOtherThing", >> "output": ["foo", "bar"], >> "config": { >> "foo": "TO_UPPER(foo)", >> "bar": "TO_LOWER(bar)" >> } >> } >> ] >> } >> >

Re: Using Storm Resource Aware Scheduler

2017-11-24 Thread Simon Elliston Ball
Implementing the resource aware scheduler would be decidedly non-trivial. Every topology will need additional configuration to tune for things like memory sizes, which is not going to buy you much change. So, at the micro-tuning level of parser this doesn’t make a lot of sense. However, it

Re: DISCUSS: Quick change to parser config

2017-12-04 Thread Simon Elliston Ball
" > ] >}, > { > "transformation": "COMPLETE", > "output" : [ "ip_src_addr", "ip_dst_addr", "message"] >} > ] > } > > I think having these two treated separately makes sense beca

Re: [DISCUSS] Community Meetings

2017-12-13 Thread Simon Elliston Ball
be made in the community meeting itself - this gives > others in other timezones and commitments review and voice in the decisions. > > If it didn't happen on the mailing lists then it didn't happen. :) > > > On Tue, Dec 12, 2017 at 1:39 PM, Simon Elliston Ball < > si..

Re: [DISCUSS] Community Meetings

2017-12-12 Thread Simon Elliston Ball
> > On December 12, 2017 at 13:19:55, Simon Elliston Ball ( > si...@simonellistonball.com) wrote: > > Happy to volunteer a zoom room. That seems to have worked for most in the > past. > > Simon > >> On 12 Dec 2017, at 18:09, Otto Fowler <ottobackwa...@gmail.com>

Re: [DISCUSS] Community Meetings

2017-12-12 Thread Simon Elliston Ball
Happy to volunteer a zoom room. That seems to have worked for most in the past. Simon > On 12 Dec 2017, at 18:09, Otto Fowler wrote: > > Thanks! I think I’d like something hosted though. > > > On December 12, 2017 at 11:18:52, Ahmed Shah

Re: Metron - Emailing Alerts

2017-12-13 Thread Simon Elliston Ball
re that I would want to discuss and flesh out > > Thanks, > James > > 13.12.2017, 14:26, "Simon Elliston Ball" <si...@simonellistonball.com>: >> We can already do that with profiles I would have thought. Create a profile >> that only picks alerts and

Re: Metron - Emailing Alerts

2017-12-13 Thread Simon Elliston Ball
ch is more manageable. This is probably a feature worthy of > consideration for Metron. > > 13.12.2017, 12:19, "Simon Elliston Ball" <si...@simonellistonball.com>: >> Metron generates alerts onto a Kafka queue, which can be used to integrate >> with Alert managem

Re: analytics exchange platform

2017-11-15 Thread Simon Elliston Ball
The analytics exchange concept is not really part of Apache Metron, but some commercial offerings include it. In terms of Metron itself, are you maybe thinking about Model as a Service: http://metron.apache.org/current-book/metron-analytics/metron-maas-service/index.html

Wiki Docs links seem wrong

2017-12-07 Thread Simon Elliston Ball
https://cwiki.apache.org/confluence/display/METRON/Metron+User+Guide+-+per+release The links don’t seem to correspond to the versions on this page. Would be happy to fix, but I don’t have wiki perms. Simon

Re: Wiki Docs links seem wrong

2017-12-07 Thread Simon Elliston Ball
Awesome, many thanks! > On 7 Dec 2017, at 13:08, Kyle Richardson <kylerichards...@gmail.com> wrote: > > Fixed. > > -Kyle > > On Thu, Dec 7, 2017 at 7:20 AM, Simon Elliston Ball < > si...@simonellistonball.com> wrote: > >> https://cwiki.apache.

Re: [DISCUSS] Release?

2018-05-09 Thread Simon Elliston Ball
Is it about time for a release? I know we got some substantial > performance > > changes in since the last release. I think we might have a justification > > for a release. > > > > Casey > > > -- -- simon elliston ball @sireb

Re: Streaming Machine Learning use case

2018-05-08 Thread Simon Elliston Ball
ery close from the integration point of view with Metron, so I wanted to > see if anyone had tried SAMOA in practice and especially with Metron use > cases. > > Regards, > Ali > -- -- simon elliston ball @sireb

Re: [DISCUSS] Pcap panel architecture

2018-05-08 Thread Simon Elliston Ball
> (Youhouuu my first reply on this kind of mail chain^^)
>
> If I may, I would like to share my view on the following 3 points.
>
> - Backend:
>
> The current metron-api is totally separate; it would be logical to me to have it in the same place as the other rest apis. Especially when more security is added, it will not be necessary to do the job twice. The current implementation sends back a pcap object which still needs to be decoded. In opensoc, the decoding was done with tshark on the frontend. It would be good to have this decoding happen directly on the backend so as not to create load on the frontend. An option would be to install tshark on the rest server and use it to convert the pcap to xml and then to json that is sent to the frontend.
>
> I tried to start the map/reduce job to search over all the pcap data directly from the rest server and, as Ryan mentioned, we had trouble. I will try to find the error again.
>
> Then in the POC, what we tried was to use the pcap_query script, and this works fine. I just modified it so that it sends back the yarn job_id directly and does not wait for the job to finish. This allows the UI and the rest server to know the status of the search by querying the yarn rest api, so the UI and the rest server can be async without any blocking phase. What do you think about that?
>
> Having the job submitted directly from the code of the rest server would be perfect, but it will need a lot of investigation I think (but I'm not the expert so I might be completely wrong ^^).
>
> We know that the pcap_query script works fine, so why not call it? Is it that bad? (maybe a stupid question, but I really don’t see a lot of drawbacks)
>
> - Front end:
>
> Adding the pcap search to the alert UI is, I think, the easiest way to move forward. But indeed, it will then be the “Alert UI and pcapquery”. Maybe the name of the UI should just change to something like “Monitoring & Investigation UI”?
>
> Is there any roadmap or plan for the different UIs? I mean, have you already had a discussion on how you see the ui evolving with the new features that will come in the future?
>
> - Microservices:
>
> What do you mean exactly by microservices? Is it to separate all the features into different projects? Or something like having the different components in containers, like Kubernetes? (again maybe a stupid question, but I don’t clearly understand what you mean :) )
>
> Michel
-- -- simon elliston ball @sireb

Re: [DISCUSS] parser ES + Solr schema abstraction

2018-05-23 Thread Simon Elliston Ball
ith…. > > > > >> On May 22, 2018 at 13:56:23, Simon Elliston Ball >> (si...@simonellistonball.com) wrote: >> >> Absolutely. I would agree with that as an approach. >> >> I would also suggest we discuss where schemas and versions should be sto

Re: [DISCUSS] Field conversions

2018-06-05 Thread Simon Elliston Ball
gt; > > The > > > >> others only apply to a single field which does not scale well. Now > we > > > >> have > > > >> an issue with another field in > > > >> https://issues.apache.org/jira/browse/METRON-1600. Rather than > > > >> continuing > > > >> with a patchwork of different fixes I want to attempt to design a > > > >> system-wide solution. > > > >> > > > >> My first thought is to expand > > > https://github.com/apache/metron/pull/1022 > > > >> to > > > >> apply globally. However this is not trivial and would require > > > significant > > > >> changes. It would also make https://github.com/apache/ > > metron/pull/1010 > > > >> obsolete and we might end up having to revert all of it. > > > >> > > > >> Does anyone have any ideas or opinions? I am still researching > > > solutions > > > >> but would love some guidance from the community. > > > >> > > > > > > > > > > -- -- simon elliston ball @sireb

Re: [DISCUSS] Field conversions

2018-06-05 Thread Simon Elliston Ball
ion of ES they are running). If I am wrong and there is a >> better approach that works, then we should just revert #1022. >> >> On Tue, Jun 5, 2018 at 9:37 AM, Simon Elliston Ball < >> si...@simonellistonball.com> wrote: >> >>> I would definitely agree

Re: Writing enrichment data directly from NiFi with PutHBaseJSON

2018-06-05 Thread Simon Elliston Ball
Do you mean in the sense of a separate module, or are you suggesting we go as far as a sub-project? On 5 June 2018 at 10:08, Otto Fowler wrote: > If we do that, we should have it as a separate component maybe. > > > On June 5, 2018 at 12:42:57, Simon Elliston

Re: Writing enrichment data directly from NiFi with PutHBaseJSON

2018-06-05 Thread Simon Elliston Ball
in strong support of that, Simon. I think we should have some other > NiFi components in Metron to enable users to interact with our > infrastructure from NiFi (e.g. being able to transform via stellar, etc). > > On Tue, Jun 5, 2018 at 10:32 AM Simon Elliston Ball < > si...@simon

Re: Writing enrichment data directly from NiFi with PutHBaseJSON

2018-06-05 Thread Simon Elliston Ball
>> >> On June 5, 2018 at 14:07:22, Simon Elliston Ball ( >> si...@simonellistonball.com) wrote: >> >> To be honest, I would expect this to be heavily linked to the Metron >> releases, since it's going to use other metron classes and dependencies to >&

Re: Architectural reason to split in 4 topologies / impact on the kafka ressources

2018-06-25 Thread Simon Elliston Ball
architectural reason to split the ingestion in metron into 4 different topologies that all read/write to kafka? > > For example, why have the parsing and enrichment topologies not been merged? Would it not be possible, when you parse the message, to enrich it directly? > > I'm asking because splitting into several topologies means that all of the topologies read/write to Kafka, which produces a bigger load on the kafka cluster and then a need for far more infrastructure/servers. The cost is especially high when we speak about TBs of data ingested every day. > > I'm sure there was a very good reason, I was just curious. > > Thanks, > Michel > > --- > Thank you, > > James Sirota > PMC- Apache Metron > jsirota AT apache DOT org > -- -- simon elliston ball @sireb

Re: new committer: Raghu Mitra

2017-10-20 Thread Simon Elliston Ball
Congratulations Raghu. Well deserved with all that awesome UI work that’s coming in. Simon > On 20 Oct 2017, at 17:10, James Sirota wrote: > > > > The Project Management Committee (PMC) for Apache Metron > has invited Raghu Mitra to become a committer and we are pleased

Re: When things change in hdfs, how do we know

2018-01-26 Thread Simon Elliston Ball
Should we consider using the Inotify interface to trigger reconfiguration, in the same way we trigger config changes in curator? We also need to fix caching and lifecycle in the Grok parser to make the zookeeper changes propagate pattern changes while we’re at it. Simon > On 26 Jan 2018, at

Re: When things change in hdfs, how do we know

2018-01-26 Thread Simon Elliston Ball
13:27, Otto Fowler <ottobackwa...@gmail.com> wrote: > > https://github.com/ottobackwards/hdfs-inotify-zookeeper > > Working on a poc > > > > On January 26, 2018 at 07:41:44, Simon Elliston Ball

Re: [DISCUSS] Update Metron Elasticsearch index names to metron_

2018-01-26 Thread Simon Elliston Ball
+1 on this. The idea of a default broad matching template should also include an order entry to avoid conflicts with more specific templates, and we should then document the need for a higher order value in all per-source index templates. In terms of production migration, I think we may want
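As a rough illustration of why the order value matters: when several index templates match a new index, Elasticsearch applies them from lowest to highest order, so the higher-order (more specific) template overrides the broad default. The sketch below imitates that with plain dictionaries and a shallow merge; the `metron_*` patterns, the setting names and the function are assumptions for the example, not Metron's actual templates or the Elasticsearch client API.

```python
from fnmatch import fnmatch


def effective_settings(index_name, templates):
    """Merge the settings of all matching templates, lowest order first."""
    matching = [t for t in templates if fnmatch(index_name, t["pattern"])]
    merged = {}
    for template in sorted(matching, key=lambda t: t["order"]):
        # Later (higher-order) templates override earlier ones.
        merged.update(template["settings"])
    return merged


# Hypothetical templates: one broad default, one per-source override.
TEMPLATES = [
    {"pattern": "metron_*", "order": 0, "settings": {"shards": 1, "refresh": "5s"}},
    {"pattern": "metron_bro_*", "order": 10, "settings": {"shards": 4}},
]
```

With these templates an index matching only the broad pattern gets the defaults, while a bro index gets four shards and still inherits the default refresh setting.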

Re: Metron User Community Meeting Call

2018-01-26 Thread Simon Elliston Ball
This is going to be a really exciting call. Looking forward to seeing how the GCR Canary sings :) I’m going to volunteer https://hortonworks.zoom.us/my/simonellistonball as a location for the meeting. I would also support the idea of a quick poll on what people are doing with Metron, and

Re: Metron nested object

2018-01-11 Thread Simon Elliston Ball
> Keen to hear your thoughts > > > Cheers > > > > [1] I appreciate the architecture is flexible... > [-] Apologies for the delay but I suspect my previous message got stuck in > moderation > > On Fri, Dec 22, 2017 at 3:59 AM, Simon Elliston Ball < > si

Re: [DISCUSS] Persistence store for user profile settings

2018-02-02 Thread Simon Elliston Ball
shared dashboards or queries vs. personal version in jira. Would RDBMS help > with that? > > > > On February 2, 2018 at 07:17:04, Simon Elliston Ball > (si...@simonellistonball.com) wrote: > >> Introducing an RDBMS to the stack

Re: [DISCUSS] Persistence store for user profile settings

2018-02-02 Thread Simon Elliston Ball
eady very familiar with RDBMS > solutions and have the infrastructure in place to manage those. For users > that don't need HA/DR, just use the DB that gets spun-up with Ambari. > > > > > > On Fri, Feb 2, 2018 at 7:17 AM Simon Elliston Ball < > si...@simonellist

Re: [DISCUSS] Persistence store for user profile settings

2018-02-02 Thread Simon Elliston Ball
Introducing an RDBMS to the stack seems unnecessary for this. If we consider the data access patterns for user profiles, we are unlikely to query into them, or indeed do anything other than look them up, or write them out by a username key. To that end, using an ORM to translate a nested

Re: [DISCUSS] Persistence store for user profile settings

2018-02-02 Thread Simon Elliston Ball
ly if we are going to > be having permissions, grouping and crud around that, and preloading, before > just throwing everything in RDBMS -or- HBASE. > > > > On February 2, 2018 at 08:08:24, Simon Elliston Ball > (si...@simonellistonball.com <mailto:si...

Re: Enrichment and indexing routing mechanism

2018-01-29 Thread Simon Elliston Ball
Yes, it is. Sent from my iPhone > On 29 Jan 2018, at 09:33, Ali Nazemian wrote: > > Hi All, > > I was wondering how the routing mechanism works in Metron currently. Can > somebody please explain how Enrichment Storm topology understands a single > event is related to

Re: Enrichment and indexing routing mechanism

2018-01-29 Thread Simon Elliston Ball
rser > and post-parser Stellar implementation? I am trying to understand If I > change it in post-parser Stellar, will it be overwritten at the last step > of Parser topology or not? > > Cheers, > Ali > > On Mon, Jan 29, 2018 at 8:55 PM, Simon Elliston Ball < > si...@simonel

Re: When things change in hdfs, how do we know

2018-01-31 Thread Simon Elliston Ball
I take it your service would just be a thin daemon along the lines of the PoC you linked, which makes a lot of sense, delegating the actual notification to the zookeeper bits we already have. That makes sense to me. One other question would be around the availability of that service (which is

Re: Disable Metron parser output writer entirely

2018-02-05 Thread Simon Elliston Ball
I expect the performance would be dire. If you really wanted to do something like this, a custom writer might make sense. KAFKA_PUT is really meant for debugging use cases only. It’s a very non-stellar construct (non-expression, no return, side-effect dependent…) Also, it creates a producer for

Re: [DISCUSS] community view/roadmap of threat intel

2018-02-14 Thread Simon Elliston Ball
We used to install soltra edge in the old ansible builds (which have thankfully now been pared back in the interests of stability in full dev). Soltra has not been a good option since they went proprietary, so since then we’ve included opentaxii (BSD 3) as a discovery and aggregator. Most of

Re: [DISCUSS] community view/roadmap of threat intel

2018-02-19 Thread Simon Elliston Ball
regator > X or aggregator Y, we can integrate it with Metron based on integration > points. > > Cheers, > Ali > > On Wed, Feb 14, 2018 at 11:28 PM, Simon Elliston Ball < > si...@simonellistonball.com> wrote: > >> We used to install soltra edge in the o

Re: [DISCUSS] community view/roadmap of threat intel

2018-02-19 Thread Simon Elliston Ball
3. Atemporal matching - Given the use of big data technologies it seems to > me Metron should be able to look into past enrichment data in order to > classify traffic. I am not sure this is possible today? > > > Cheers > > > On Mon, Feb 19, 2018 at 8:48 PM, Simon Elli

Re: Metron nested object

2017-12-21 Thread Simon Elliston Ball
Correct, nested objects in Lucene indexes lead to sub-documents, which leads to a massive drop in ingest and query rates. This is why the JSONMap parser, for example, deliberately flattens the Metron JSON object. Before this decision was made, very early versions of OpenSOC nested enrichments for
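To illustrate the flattening described above, here is a minimal sketch of collapsing a nested message into a single level of keys. It is not Metron's JSONMap parser code; the function name `flatten`, the `sep` parameter, and the sample field names are assumptions made for illustration only.

```python
from typing import Any, Dict


def flatten(obj: Dict[str, Any], sep: str = ".", prefix: str = "") -> Dict[str, Any]:
    """Collapse nested dicts into one level, joining key paths with `sep`.

    Hypothetical helper: non-dict values (including lists) are kept as-is.
    """
    flat: Dict[str, Any] = {}
    for key, value in obj.items():
        path = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse so that every leaf ends up as a top-level field.
            flat.update(flatten(value, sep, path))
        else:
            flat[path] = value
    return flat


# Sample message with made-up field names.
nested = {"ip_src_addr": "10.0.0.1", "geo": {"country": "GB", "location": {"lat": 51.5}}}
flat = flatten(nested)
```

With a flat document like this, the index never needs nested sub-documents; the cost is that the key path has to carry the structure, so the choice of separator matters to downstream stores.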

Re: [DISCUSS] Generating and Interacting with serialized summary objects

2018-01-03 Thread Simon Elliston Ball
There is some really cool stuff happening here, if only I’d been allowed to see the lists over Christmas... :) A few thoughts... I like Otto’s generalisation of the problem to include specific local stellar objects in a cache loaded from a store (HDFS seems a natural, but not only place,

Re: [DISCUSS] Batch Profiler

2018-07-30 Thread Simon Elliston Ball
thread.html/d28d18cc9358f5d9c276c7c304ff4e > e601041fb47bfc97acb6825083@%3Cdev... > > < > https://lists.apache.org/thread.html/d28d18cc9358f5d9c276c7c304ff4e > e601041fb47bfc97acb6825083@%3Cdev.metron.apache.org%3E> > > [2] https://issues.apache.org/jira/browse/METRON-1699 > -- -- simon elliston ball @sireb

Knox SSO feature branch PRs: a quick demo

2018-08-01 Thread Simon Elliston Ball
I've recently put in a number of PRs on the Knox feature branch, and thought it might be useful to post a quick 'sprint demo' style explanation of what the various PRs and functionality entail: https://youtu.be/9OJz6hg0N1I Hope this helps with the review process. There are a couple of areas where

Re: [ANNOUNCE] - Apache Metron Slack channel

2018-08-15 Thread Simon Elliston Ball
3. Use your Apache email for your login > >4. Click "Channels" and look for #metron (Created by ottO June 15, > 2018) > > > > Best > > Mike Miklavcic > > > -- -- simon elliston ball @sireb

Re: [DISCUSS] Getting to a 1.0 release

2018-08-15 Thread Simon Elliston Ball
it is on the roadmap”. > > Regardless of the implementation, conceptually, security of data at rest is > important, and is a major outstanding item or the core metron proposition. > > > > >> On August 15, 2018 at 16:03:19, Simon Elliston Ball >> (si...@simonelli

Re: [DISCUSS] Getting to a 1.0 release

2018-08-15 Thread Simon Elliston Ball
and closing it > > > >> On August 15, 2018 at 15:53:02, Otto Fowler (ottobackwa...@gmail.com) wrote: >> >> https://issues.apache.org/jira/browse/METRON-343 >> >>> On August 15, 2018 at 15:47:24, Simon Elliston Ball >>> (si...@simonellistonba

Re: [DISCUSS] Getting to a 1.0 release

2018-08-15 Thread Simon Elliston Ball
What would you see as secure? I’ve seen people use TDE for the HDFS store, but it’s harder to encrypt storage with solr / es. Something I was thinking of doing to follow up on the Knox Feature was to add Ranger integration for securing and auditing configs, and potentially extending to the

Re: [DISCUSS] Getting to a 1.0 release

2018-08-15 Thread Simon Elliston Ball
ote: > > https://issues.apache.org/jira/browse/METRON-343 > >> On August 15, 2018 at 15:47:24, Simon Elliston Ball >> (si...@simonellistonball.com) wrote: >> >> What would you see as secure? I’ve seen people use TDE for the HDFS store, >> but it’s harder to e

Re: [DISCUSS] Metron Parsers in Nifi

2018-08-13 Thread Simon Elliston Ball
used by Processors.

> There's friction involved there in terms of schemas, but also in terms of
> access to ZK configs and things like parser chaining. We might be able to
> leverage it, but it seems like it'd be fairly shoehorned in without getting
> the schema and other benefits.
>
> We won’t have to provide our ‘no schema processors’ ( grok, csv, json ).
>
> All the remaining processors DO have schemas that we know about. We can
> just provide the avro schemas the same way we provide the ES schemas.
>
> The “parsing” should not be conflated with the transform/stellar in NiFi.
> We should make that separate. Running Stellar over Records would be the
> best thing.
>
> - This Processor would work similarly to Storm: bytes[] in -> JSON out.
> - There is a Processor
>   <https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/JoltTransformJSON.java>
>   that handles loading other JARs that we can model a MetronParserProcessor
>   off of that handles classpath/classloader issues (basically just sets up a
>   classloader specific to what's being loaded and swaps out the Thread's
>   loader when it calls to outside resources).
>
> There should be no reason to load modules outside the NAR. Why do you
> expect to? If each Metron Processor equiv of a Metron Storm Parser is just
> parsing to json it shouldn’t need much. And we could package them in the
> NAR. I would suggest we have a Processor per Parser to allow for
> specialization. It should all be in the nar.
>
> The Stellar Processor, if you would support the works would possibly need
> this.
>
> 3. Create a MetronZkControllerService to supply our configs to our
>    processors.
> - This is a pretty established NiFi pattern for being able to provide
>   access to other services needed by a Processor (e.g. databases or large
>   configurations files).
> - The same controller service can be used by all Processors to manage
>   configs in a consistent manner.
>
> I think controller services would make sense where needed, I’m just not
> sure what you imagine them being needed for?
>
> If the user has NiFi, and a Registry etc, are you saying you imagine them
> using Metron + ZK to manage configurations? Or to be using BOTH storm
> processors and Nifi Processors?
>
> At that point, we can just NAR our controller service and parser processor
> up as needed, deploy them to NiFi, and let the user provide a config for
> where their custom parsers can be provided (i.e. their parser jar). This
> would be 3 nars (processor, controller-service, and controller-service-api
> in order to bind the other two together).
>
> Once deployed, our ability to use parsers should fit well into the standard
> NiFi workflow:
>
> 1. Create a MetronZkControllerService.
> 2. Configure the service to point at zookeeper.
> 3. Create a MetronParser.
> 4. Configure it to use the controller service + parser jar location + any
>    other needed configs.
> 5. Use the outputs as needed downstream (either writing out to Kafka or
>    feeding into more MetronParsers, etc.)
>
> Chaining parsers should ideally become a matter of chaining MetronParsers
> (and making sure the enveloping configs carry through properly). For parser
> aggregation, I'd just avoid it entirely until we know it's needed in NiFi.
>
> Justin
>
> ---
> Thank you,
>
> James Sirota
> PMC- Apache Metron
> jsirota AT apache DOT org

-- -- simon elliston ball @sireb

Re: [DISCUSS] Metron Parsers in Nifi

2018-08-13 Thread Simon Elliston Ball
s approach will > make that not possible, as another consideration. > > > > On August 13, 2018 at 06:50:09, Simon Elliston Ball ( > si...@simonellistonball.com) wrote: > > Maybe the edge use case will clarify the config issue a little. The reason > I would want to be able

Re: Change field separator in Metron to make it Hive and ORC friendly

2018-08-13 Thread Simon Elliston Ball
Elasticsearch > and HDFS. > > https://github.com/apache/metron/pull/1022 > > Cheers, > Ali > -- -- simon elliston ball @sireb

Re: Change field separator in Metron to make it Hive and ORC friendly

2018-08-14 Thread Simon Elliston Ball
on separator. Maybe it would be nice to have an ability to >> change the separator to any other character and let users decide what they >> want to use. >> >> Cheers, >> Ali >> >> On Tue, Aug 14, 2018 at 12:14 AM Simon Elliston Ball < >> si...@si

Re: [DISCUSS] Contributing a General Purpose Regex Parser

2018-08-27 Thread Simon Elliston Ball
regex. The >expression that is evaluated is based on the output of the > recordTypeRegex >- Note: recordTypeRegex and messageHeaderRegex could be specified as >lists also (as a JSON array), where the list will be evaluated in order >until a matching regular expression is found. > > > > > > If there are no objections to having this type of Parser within Metron, we > will open a JIRA/PR for code review. > > *Jagdeep Singh* > -- -- simon elliston ball @sireb
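The two-stage matching described in the quoted proposal can be sketched roughly as follows. This is not the contributed parser: only the idea of a record-type regex selecting which field expression to evaluate, and of a list of regexes tried in order until one matches, comes from the message. The names `RECORD_TYPE_REGEXES`, `FIELD_REGEXES`, `first_match` and `parse`, the patterns, and the sample log lines are all hypothetical.

```python
import re
from typing import Dict, List, Optional

# Hypothetical config: record-type patterns given as a list, tried in order.
RECORD_TYPE_REGEXES: List[str] = [
    r"\b(?P<record_type>sshd)\[\d+\]:",
    r"\b(?P<record_type>kernel):",
]

# Hypothetical config: the field expression to evaluate for each record type.
FIELD_REGEXES: Dict[str, str] = {
    "sshd": r"Accepted \w+ for (?P<user>\S+) from (?P<ip_src_addr>\S+)",
    "kernel": r"(?P<interface>\w+) link (?P<state>up|down)",
}


def first_match(patterns: List[str], text: str) -> Optional[re.Match]:
    """Return the first successful match, evaluating patterns in list order."""
    for pattern in patterns:
        match = re.search(pattern, text)
        if match:
            return match
    return None


def parse(line: str) -> Optional[Dict[str, str]]:
    """Find the record type first, then apply that type's field expression."""
    type_match = first_match(RECORD_TYPE_REGEXES, line)
    if type_match is None:
        return None  # no record type recognised
    record_type = type_match.group("record_type")
    fields = {"record_type": record_type}
    field_match = re.search(FIELD_REGEXES[record_type], line)
    if field_match:
        fields.update(field_match.groupdict())
    return fields


ssh_event = parse("Aug 27 10:00:01 host sshd[123]: Accepted password for alice from 10.0.0.5 port 22 ssh2")
kernel_event = parse("Aug 27 10:00:02 host kernel: eth0 link up")
```

Keeping the record-type step separate means only one field expression runs per message, rather than every expression being tried against every line.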

Re: Performance comparison between Grok and Java regex

2018-07-11 Thread Simon Elliston Ball
A streaming token parser might well get you good performance for that format... maybe something like an antlr grammar or even a simple scanner. Regex is not the only pattern :) It would also be great to see such a parser contributed back to the community if possible, and I'm sure we would be
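As a rough idea of what a "simple scanner" could look like, the sketch below walks a line once from left to right without any regular expressions. The log format under discussion is not shown in this snippet, so the `key=value` layout with optional double-quoted values is purely an assumption, as is the function name `scan_key_values`.

```python
from typing import Dict


def scan_key_values(line: str) -> Dict[str, str]:
    """Single left-to-right pass over `key=value` pairs (assumed format).

    Values may be double-quoted to include spaces; bare tokens are ignored.
    """
    fields: Dict[str, str] = {}
    i, n = 0, len(line)
    while i < n:
        # Skip whitespace between pairs.
        while i < n and line[i] == " ":
            i += 1
        # Read the key up to '=' (or stop at a space for a bare token).
        start = i
        while i < n and line[i] not in "= ":
            i += 1
        key = line[start:i]
        if i >= n or line[i] != "=":
            continue  # bare token without a value: skip it
        i += 1  # consume '='
        if i < n and line[i] == '"':
            # Quoted value: read up to the closing quote.
            i += 1
            start = i
            while i < n and line[i] != '"':
                i += 1
            value = line[start:i]
            i += 1  # consume the closing quote
        else:
            # Unquoted value: read up to the next space.
            start = i
            while i < n and line[i] != " ":
                i += 1
            value = line[start:i]
        if key:
            fields[key] = value
    return fields


event = scan_key_values('src=10.0.0.1 dst=10.0.0.2 msg="login failed" action=deny')
```

Each character is visited once and nothing backtracks, which is the property that tends to make this style of parser attractive when regex or Grok evaluation is the bottleneck.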

Re: Security Feature Branch?

2018-07-12 Thread Simon Elliston Ball
ast on such things is to require that they are broken > into small reviewable chunks on a feature branch, even if the end to end > working version was more ‘usable’. > > > > On July 12, 2018 at 10:51:30, Simon Elliston Ball ( > si...@simonellistonball.com) wrote: > > I've b

Re: [DISCUSS] Time to remove github updates from dev?

2018-04-04 Thread Simon Elliston Ball
I would say we should also update our website with subscription information. Simon > On 4 Apr 2018, at 18:51, Nick Allen wrote: > > https://lists.apache.org/list.html?iss...@metron.apache.org​ > > On Tue, Mar 20, 2018 at 5:06 PM, Otto Fowler >

Re: GeoLite deprecating legacy DBs

2018-04-13 Thread Simon Elliston Ball
Don’t we already use the GeoLite2 database? Mine are all /apps/metron/geo/default/GeoLite2-City.mmdb.gz downloaded from http://geolite.maxmind.com/download/geoip/database/GeoLite2-City.mmdb.gz which seems to match the new format page. Am I missing something Jon, or are you referring to the

Re: [DISCUSS] Time to remove github updates from dev?

2018-03-19 Thread Simon Elliston Ball
Should we not add the new lists to the website? Simon > On 19 Mar 2018, at 14:02, Casey Stella wrote: > > +1 > > > On Mon, Mar 19, 2018 at 8:16 AM Andre wrote: > >> Folks, >> >> All rejoice. This has been finally implemented. >> >> Cheers >> >>

Re: [DISCUSS] Generic Syslog Parsing capability for parsers

2018-03-20 Thread Simon Elliston Ball
It seems like parser chaining is becoming a hot topic on the repo too with https://github.com/apache/metron/pull/969#partial-pull-merging I would like to discuss the option of, and how we might architect, configuring parsers to
