Re: Architectural reason to split in 4 topologies / impact on the kafka resources

2018-06-27 Thread Carolyn Duby
Another reason for the original string is that you may not want to extract all 
components of the original event into JSON.  If you look at Windows events you 
will want to have the original event but you will not want to extract 
everything because they are very verbose.   

You should have a choice on the sensor type whether you want to include the 
original string in the index or not.
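
A minimal sketch of that per-sensor choice, in Python for illustration only. The `keep_original_string` flag, the `SENSOR_INDEX_CONFIG` table, and the `prepare_for_index` helper are hypothetical names, not existing Metron settings or APIs; `original_string` is assumed to be the field holding the raw event.

```python
# Hypothetical per-sensor setting; NOT an actual Metron configuration key.
SENSOR_INDEX_CONFIG = {
    "windows_event": {"keep_original_string": False},  # verbose raw events
    "bro": {"keep_original_string": True},
}


def prepare_for_index(sensor_type, message, config=SENSOR_INDEX_CONFIG):
    """Return a copy of the message, dropping the raw event if the sensor opts out.

    Sensors with no entry in the config keep the original string by default.
    """
    keep = config.get(sensor_type, {}).get("keep_original_string", True)
    if keep:
        return dict(message)
    return {k: v for k, v in message.items() if k != "original_string"}
```

The input message is never modified, so an unfiltered copy could still be written elsewhere (for example to a batch store) for forensic use.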

Thanks  

Carolyn Duby
Solutions Engineer, Northeast
cd...@hortonworks.com
+1.508.965.0584

On 6/25/18, 8:02 PM, "Simon Elliston Ball"  wrote:

>The original string serves purposes well beyond debugging. Many users will
>need to be able to prove provenance to the raw logs in order to prove or
>prosecute an attack from an internal threat, or provide evidence to law
>enforcement of an external threat. As such, the original string is
>important.
>
>It also provides a valuable source for the free text search where parsing
>has not extracted all the necessary tokens for a hunt use case, so it can
>be a valuable field to have in Elastic or Solr for text rather than keyword
>indexing.
>
>That said, it may make sense to remove a heavyweight processing and
>storage field like this from the Lucene store. We have been talking for a
>while about filtering some of the data out of the realtime index, and
>preserving full copies in the batch index, which could meet the forensic
>use cases above, and would make it a matter of user choice. That would
>probably be configured through indexing config to filter fields.
>
>Simon
>
>On 25 June 2018 at 23:43, Michel Sumbul  wrote:
>
>> Depending on the source of data, it might be interesting to bypass a step
>> that the user considers useless.
>> For example, you may have a source of data that doesn't need profiling but
>> that you want ingested like the other sources, so that the SOC analysts can
>> use it in their analysis and have everything in the same place.
>>
>> How can we bypass it for a specific sensor?
>>
>> 2018-06-25 23:38 GMT+01:00 James Sirota :
>>
>> > There is a way to wire the system to bypass enrichment and profiling, but
>> > you would then bypass a lot of key features of the system.  It would be
>> > unwise to do that.
>> >
>> > 25.06.2018, 15:13, "Michel Sumbul" :
>> > > Hi Casey,
>> > >
>> > > That makes complete sense.
>> > > Short question: if there is no enrichment or no profiling, does the
>> > > message still pass through the enrichment/profiling topic?
>> > >
>> > > If yes, do you think it's possible to imagine a way for messages that
>> > > don't need enrichment or profiling to skip the topic and go directly
>> > > to the next one? This is again to avoid in/out in Kafka.
>> > >
>> > > Thanks for the explanation,
>> > > Michel
>> > >
>> > > 2018-06-23 3:58 GMT+01:00 Casey Stella :
>> > >
>> > >>  Hey Michel,
>> > >>
>> > >>  Those are good questions and there were some reasons surrounding that. In
>> > >>  fact, historically, we had fewer topologies (e.g. indexing and enrichment
>> > >>  were merged). Even earlier on, we had just one giant topology per parser
>> > >>  that enriched and indexed. The long story short is that we moved this way
>> > >>  because we saw how people were using Metron and we gained more insight
>> > >>  tuning Metron. That led us down this architectural path.
>> > >>
>> > >>  Some of the reasons that we went this way:
>> > >>
>> > >>  - Fewer large topologies were a nightmare to tune
>> > >>     - Enrichment would have different memory requirements than, say,
>> > >>       parsers or indexing
>> > >>     - You can adjust the Kafka topic params per topology to adjust the
>> > >>       number of partitions, etc.
>> > >>  - Having the separate topologies gives a natural set of extension points
>> > >>    for customization and enhancement (e.g. you want a phase between parsing
>> > >>    and enrichment).
>> > >>  - Decoupling the topologies lets us spin up and down parts of Metron
>> > >>    without affecting others (e.g. you don't have to take down enrichments
>> > >>    to add a parser, even for a moment)
>> > >>  - The movement to Flux meant we were limited in how much we could adjust
>> > >>    the topology at runtime (e.g. colocating parsers and enrichment would
>> > >>    mean moving away from Flux essentially, as the topology changes its
>> > >>    structure)
>> > >>
>> > >>  Best,
>> > >>
>> > >>  Casey
>> > >>
>> > >>  On Fri, Jun 22, 2018 at 5:25 PM Michel Sumbul <michelsum...@gmail.com>
>> > >>  wrote:
>> > >>
>> > >>  > Hi Everyone,
>> > >>  >
>> > >>  > I was asking myself what the architectural reason was to split the
>> > >>  > ingestion in Metron into 4 different 

Re: CVE-2018-1273 fixed in Metron 0.5.0

2018-06-27 Thread Jon Zeolla
[Pulling this back to the Metron dev and security lists]

I started poking around to see how someone may mitigate this prior to doing
the 0.5.0 upgrade (since that effort is definitely non-trivial), and I came
up with this file change, which *seems* to be the relevant change.

That being said, in order to mitigate an older version of Metron as an
interim solution, would shutting down the Metron REST services in Ambari
(breaking all of the things that depend on/use them) be sufficient, or is
there a better interim solution?

Also, has anyone discussed deploying a 0.4.3 that just has this patch on
top of 0.4.2?  Since the process to do an upgrade to 0.5.0 is such a big
deal, I would be in favor of rolling out a patch, assuming it doesn't take
as much effort, or nearly as much, as an upgrade to 0.5.0.

Thanks,

Jon

On Tue, Jun 26, 2018 at 3:33 PM James Sirota  wrote:

>
> The following CVE was fixed in Metron 0.5.0:
>
> [CVEID]: CVE-2018-1273
> [PRODUCT]: Spring Data Commons
> [VERSION]: versions 1.13 to 1.13.10, 2.0 to 2.0.5, and older unsupported
> versions
> [PROBLEMTYPE]: remote code execution attack
> [REFERENCES]: https://pivotal.io/security/cve-2018-1273
> [DESCRIPTION]:
>
> Spring Data Commons, versions 1.13 to 1.13.10, 2.0 to 2.0.5, and
> older unsupported versions, contain a property binder vulnerability caused
> by improper neutralization of special elements. An unauthenticated remote
> malicious user (or attacker) can supply specially crafted request
> parameters against Spring Data REST backed HTTP resources or using Spring
> Data’s projection-based request payload binding that can lead to a remote
> code execution attack.
>
> --

Jon


Re: [DISCUSS] Merging Solr feature branch (METRON-1416) into master

2018-06-27 Thread James Sirota
Thank you, Justin.  Great job on the merge 

26.06.2018, 14:20, "Justin Leet" :
> The Solr feature branch is now in master. Note that there is no
> METRON-1416 commit in the logs because all subtasks are committed under
> their own JIRA and are in the history to maintain attribution.
>
> On Tue, Jun 26, 2018 at 1:26 PM Otto Fowler  wrote:
>
>>  +1
>>
>>  On June 26, 2018 at 11:43:39, Justin Leet (justinjl...@gmail.com) wrote:
>>
>>  The PR has two +1's at this point (and I'm implicitly +1). In the interest
>>  of full disclosure, both are from people who made contributions of varying
>>  degrees to the branch.
>>
>>  Are there any objections to merging the feature branch into master at this
>>  point?
>>
>>  On Fri, Jun 22, 2018 at 1:12 PM Justin Leet  wrote:
>>
>>>  That's more or less why I didn't flesh out testing. Might be worth
>>>  spinning up full dev and the site-book to smoke test, but the branch should
>>>  be in a good state. I figured if we get a couple +1's on the PR, it's
>>>  essentially voting anyway, but this is pretty new in terms of process.
>>>
>>>  On Fri, Jun 22, 2018 at 12:53 PM Otto Fowler 
>>>  wrote:
>>>
  If all the PRs are on master->feature branch, why do we need testing?
  This is almost a vote situation.

  On June 22, 2018 at 12:01:11, Justin Leet (justinjl...@gmail.com) wrote:

  The (formerly) active PRs are now merged in and closed.

  We don't seem to have a defined way to merge a feature branch into master
  (unless I missed it), so I went ahead and opened a PR against the parent
  ticket. Please see #1076.

  I haven't fleshed out testing and so on for the PR description, although if
  we'd like it compiled from the various child PRs against the branch, I can
  certainly do so.

  Justin

  On Thu, Jun 21, 2018 at 6:46 PM Michael Miklavcic <
  michael.miklav...@gmail.com> wrote:

  > +1 let's do it.
  >
  > On Thu, Jun 21, 2018, 2:01 PM Nick Allen  wrote:
  >
  > > +1 I think we should merge ASAP and kill the feature branch. I think the
  > > work has well surpassed the level required to get it into master.
  > >
  > > On Thu, Jun 21, 2018 at 1:20 PM, Justin Leet 
  > > wrote:
  > >
  > > > Hi All,
  > > >
  > > > The Solr branch (/feature/METRON-1416-upgrade-solr
  > > > <https://github.com/apache/metron/tree/feature/METRON-1416-upgrade-solr>),
  > > > has been progressing for a while now. I'd like to open up discussion
  > > > around what it takes to get it into master.
  > > >
  > > > The JIRA for tracking this feature branch is METRON-1416.
  > > >
  > > > As shown in the JIRA, the majority of tasks are complete, with a few
  > > > outstanding issues. Of these, I believe these are the main ones of
  > > > interest to this discussion.
  > > >
  > > > - METRON-1629 - There is an active PR #1072
  > > >   <https://github.com/apache/metron/pull/1072>
  > > > - METRON-1609 - There is an active PR #1056
  > > >   <https://github.com/apache/metron/pull/1056>
  > > > - METRON-1602 - Full dev can run with Solr without this, it would
  > > >   simply be more convenient.
  > > > - METRON-1632 - Causes a metaalert specific issue where UI filtering on
  > > >   source.type:metaalert fails. More detail is on the Jira.
  > > > - Two validation tickets. It's been run up on multinode, and manual
  > > >   testing has happened (and I'm sure will be seen a bit more on the
  > > >   final PR by various reviewers), so I'm inclined to just leave these
  > > >   open until we're good to go. Let me know if we want to handle this
  > > >   differently.
  > > >
  > > > I'm of the opinion both of the active PRs need to be merged before we
  > > > merge this into master, especially the documentation one. The other two
  > > > tickets can be done in the future; one can be worked around and one is a
  > > > metaalert specific issue that primarily affects the alerts UI.
  > > >
  > > > As the branch has grown and diverged from master, it's gotten
  > > > increasingly unwieldy to maintain (and I think it's worth a follow-on
  > > > discussion about how we manage refactorings that happen in these sorts
  > > > of branches). I know there's been at least a couple merges from master
  > > > that have been