Re: Metron User Community Meeting Call

2018-01-26 Thread Casey Stella
I can't wait!  This is going to be really cool :)

On Fri, Jan 26, 2018 at 5:25 PM, James Sirota  wrote:

> Yeah very interested in the presentation as well
>
> 26.01.2018, 15:15, "Simon Elliston Ball" :
> > This is going to be a really exciting call. Looking forward to seeing
> how the GCR Canary sings :)
> >
> > I’m going to volunteer https://hortonworks.zoom.us/my/simonellistonball
> as a location for the meeting.
> >
> > I would also support the idea of a quick poll on what people are doing
> with Metron, and maybe if anyone wants to volunteer at the end of the
> meeting it would be great to have an open mic of use cases.
> >
> > Talk to you all Wednesday.
> >
> > Simon
> >
> >>  On 26 Jan 2018, at 22:10, Seal, Steve  wrote:
> >>
> >>  HI all,
> >>
> >>  I have several people on my team that are looking forward to hearing
> about Ahmed’s work.
> >>
> >>  Steve
> >>
> >>  From: Daniel Schafer [mailto:daniel.scha...@sstech.us]
> >>  Sent: Friday, January 26, 2018 5:05 PM
> >>  To: u...@metron.apache.org; dev@metron.apache.org
> >>  Subject: Re: Metron User Community Meeting Call
> >>
> >>  My team members and I would like to join as well.
> >>  We can provide Zoom Meeting login if necessary.
> >>
> >>  Thanks
> >>
> >>  Daniel
> >>  7134806608
> >>
> >>  From: Ahmed Shah  carleton.ca>>
> >>  Reply-To: "u...@metron.apache.org" <u...@metron.apache.org>
> >>  Date: Friday, January 26, 2018 at 2:06 PM
> >>  To: "dev@metron.apache.org" <dev@metron.apache.org>,
> >>  "u...@metron.apache.org" <u...@metron.apache.org>
> >>  Subject: Re: Metron User Community Meeting Call
> >>
> >>  Looking forward to presenting!
> >>
> >>  Just a thought...
> >>  In advance, should we create a Google Form to collect survey data on
> who is using Metron, how they are using it, etc., and present the results
> to the group?
> >>
> >>  -Ahmed
> >>  ___
> >>  Ahmed Shah (PMP, M. Eng.)
> >>  Cybersecurity Analyst & Developer
> >>  GCR - Cybersecurity Operations Center
> >>  Carleton University - cugcr.com
> >>
> >>  From: Andrew Psaltis
> >>  Sent: January 26, 2018 1:53 PM
> >>  To: dev@metron.apache.org 
> >>  Subject: Re: Metron User Community Meeting Call
> >>
> >>  Count me in. Very interested to hear about Ahmed's journey.
> >>
> >>  On Fri, Jan 26, 2018 at 8:58 AM, Kyle Richardson <kylerichards...@gmail.com>
> >>  wrote:
> >>
> >>  > Thanks! I'll be there. Excited to hear Ahmed's successes and
> challenges.
> >>  >
> >>  > -Kyle
> >>  >
> >>  > On Thu, Jan 25, 2018 at 7:44 PM zeo...@gmail.com <zeo...@gmail.com> wrote:
> >>  >
> >>  > > Thanks Otto, I'm in to attend at that time/place.
> >>  > >
> >>  > > Jon
> >>  > >
> >>  > > On Thu, Jan 25, 2018, 14:45 Otto Fowler wrote:
> >>  > >
> >>  > >> I would like to propose a Metron user community meeting. I
> propose that
> >>  > >> we set the meeting next week, and will throw out Wednesday,
> January
> >>  > 31st at
> >>  > >> 09:30AM PST, 12:30 on the East Coast and 5:30 in London Towne.
> This
> >>  > meeting
> >>  > >> will be held over a web-ex, the details of which will be included
> in the
> >>  > >> actual meeting notice.
> >>  > >> Topics
> >>  > >>
> >>  > >> We have a volunteer for a community member presentation:
> >>  > >>
> >>  > >> Ahmed Shah (PMP, M. Eng.) Cybersecurity Analyst & Developer GCR -
> >>  > >> Cybersecurity Operations Center Carleton University - cugcr.com
> >>  > >>
> >>  > >> Ahmed would like to talk to the community about
> >>  > >>
> >>  > >> - Who the GCR group is
> >>  > >> - How they use Metron 0.4.1
> >>  > >> - Walk through their dashboards, UI management screen, nifi
> >>  > >> - Challenges we faced up until now
> >>  > >>
> >>  > >> I would like to thank Ahmed for stepping forward for this meeting.
> >>  > >>
> >>  > >> If you have something you would like to present or 

[GitHub] metron pull request #911: METRON-1419: Create a SolrDao

2018-01-26 Thread merrimanr
GitHub user merrimanr opened a pull request:

https://github.com/apache/metron/pull/911

METRON-1419: Create a SolrDao

## Contributor Comments
This PR is an initial attempt at creating a SolrDao that implements the 
IndexDao interface, is functionally equivalent to ElasticsearchDao and passes 
all tests in SearchIntegrationTest and UpdateIntegrationTest.  

A high-level summary of the changes:

- Upgraded the Solr client version to 6.6.0
- Updated the SolrComponent to work with Solr 6.6 and added a couple 
convenience methods similar to ElasticsearchComponent
- Added a new SolrDao implementation with all IndexDao methods implemented
- Refactored the SearchIntegrationTest to work for both Solr and 
Elasticsearch and added a Solr implementation (more detail below)
- Created an abstract UpdateIntegrationTest and added a Solr implementation
- Added Solr schemas for test data sets
- Added new tests to SearchIntegrationTest including filtering on fields 
with different types, faceting on fields with different types, and faceting on 
fields with missing types.
- Broke the IndexDao implementation in SolrDao down into smaller, 
easier-to-understand classes.  The ElasticsearchDao class has become very large 
so I attempted to make the SolrDao more readable.

There were a couple of areas where Elasticsearch and Solr behave slightly 
differently.  I attempted to accommodate that through the SolrDao 
implementation, by adjusting existing tests, and by splitting out 
engine-specific tests:

- Column metadata is different between the 2 search engines so each 
implementation has its own tests
- There are cases where the clients will return different types in search 
results.  I am handling this in SearchIntegrationTest by first converting the 
types to strings and then comparing (other ideas?).  For example, the ES client 
returns an Integer for timestamp while Solr returns a Long.
- There are cases where ES throws an error under certain conditions while 
Solr does not (and vice-versa).  These were moved to either ES or Solr 
SearchIntegrationTest implementations.
- There is no support in Solr for sorting group results so I am sorting 
them client-side instead.
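
As a rough illustration of the last two workarounds in the list above (comparing values by their string form, and sorting group results client-side), here is a minimal Python sketch. It is not code from this PR: the function names and the `key`/`total`/`children` fields are hypothetical, and because Python has a single integer type, a numeric string stands in for the Integer/Long mismatch.

```python
def same_value(expected, actual):
    """Compare two field values by their string form, ignoring their types."""
    return str(expected) == str(actual)


def sort_groups(groups, field="total", descending=True):
    """Return groups sorted by `field`; nested groups are sorted the same way.

    Every returned group carries a `children` list (empty if it had none).
    """
    ordered = sorted(groups, key=lambda g: g[field], reverse=descending)
    return [
        {**g, "children": sort_groups(g.get("children", []), field, descending)}
        for g in ordered
    ]
```

One caveat with the string comparison: it treats `1` and `"1"` as equal, so a test written this way cannot detect a field that was indexed as text instead of a number.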

At this point the scope is limited to getting the tests to pass.  There are 
other PRs in progress that are needed before automated testing with full dev 
can be done.  I am still actively working on manual testing in full dev and on 
adding documentation, but this should get us started.

This PR is intended to be merged into the upgrade-solr feature branch but I 
have it set to master temporarily so review is easier.  We will need to merge 
master into the feature branch to get rid of the extra commits since this PR 
is up to date with master.

I'm expecting a lengthy review and would request multiple reviewers and +1s.

## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.  
Please refer to our [Development 
Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235)
 for the complete guide to follow for contributions.  
Please refer also to our [Build Verification 
Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview)
 for complete smoke testing guides.  


In order to streamline the review of the contribution we ask you follow 
these guidelines and ask you to double check the following:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?


### For code changes:
- [x] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified 
and tested manually?
- [x] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
  ```
  mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh
  ```

- [x] Have you written or updated unit tests and or integration tests to 
verify your changes?
- [x] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] Have you verified the basic functionality of the build by building 
and running locally with Vagrant full-dev environment or the equivalent?

### For 

Re: [DISCUSS] Update Metron Elasticsearch index names to metron_

2018-01-26 Thread James Sirota
I am +1 on it then.

26.01.2018, 15:56, "Michael Miklavcic" :
> Just checked on the length issue - we should be good -
> https://github.com/elastic/elasticsearch/issues/8079
>
> On Fri, Jan 26, 2018 at 3:37 PM, James Sirota  wrote:
>
>>  Seems reasonable to me. The only thing is that it may make the index
>>  names too long. Not sure if that matters to ES or not
>>
>>  26.01.2018, 15:32, "Simon Elliston Ball" :
>>  > +1 on this. The idea of a default broad matching template should also
>>  include an order entry to avoid conflicts with more specific templates, and
>>  we should then document the need for a higher order value in all per-source
>>  index templates.
>>  >
>>  > In terms of production migration, I think we may want to provide some
>>  detailed documentation in the upgrade guide on this, because there will be
>>  people with a lot of existing indices that will be difficult to handle. We
>>  may also need some tooling, but I expect docs would do the job. What do
>>  people think about migration?
>>  >
>>  > Simon
>>  >
>>  >> One other benefit of this revised approach - we can more effectively
>>  use
>>  >> index template patterns to specify our base set of Metron property
>>  types.
>>  >> Call me crazy, but I think we should be able to do something like:
>>  >>
>>  >> 
>>  >>
>>  >> {
>>  >> "template": "metron_*",
>>  >> "mappings": {
>>  >> "metron_doc": {
>>  >> "dynamic_templates": [
>>  >> {
>>  >> "geo_location_point": {
>>  >> "match": "enrichments:geo:*:location_point",
>>  >> "match_mapping_type": "*",
>>  >> "mapping": {
>>  >> "type": "geo_point"
>>  >> }
>>  >> }
>>  >> },
>>  >> {
>>  >> "geo_country": {
>>  >> "match": "enrichments:geo:*:country",
>>  >> "match_mapping_type": "*",
>>  >> "mapping": {
>>  >> "type": "keyword"
>>  >> }
>>  >> }
>>  >> },
>>  >> {
>>  >> "geo_city": {
>>  >> "match": "enrichments:geo:*:city",
>>  >> "match_mapping_type": "*",
>>  >> "mapping": {
>>  >> "type": "keyword"
>>  >> }
>>  >> }
>>  >> },
>>  >> {
>>  >> "geo_location_id": {
>>  >> "match": "enrichments:geo:*:locID",
>>  >> "match_mapping_type": "*",
>>  >> "mapping": {
>>  >> "type": "keyword"
>>  >> }
>>  >> }
>>  >> },
>>  >> {
>>  >> "geo_dma_code": {
>>  >> "match": "enrichments:geo:*:dmaCode",
>>  >> "match_mapping_type": "*",
>>  >> "mapping": {
>>  >> "type": "keyword"
>>  >> }
>>  >> }
>>  >> },
>>  >> {
>>  >> "geo_postal_code": {
>>  >> "match": "enrichments:geo:*:postalCode",
>>  >> "match_mapping_type": "*",
>>  >> "mapping": {
>>  >> "type": "keyword"
>>  >> }
>>  >> }
>>  >> },
>>  >> {
>>  >> "geo_latitude": {
>>  >> "match": "enrichments:geo:*:latitude",
>>  >> "match_mapping_type": "*",
>>  >> "mapping": {
>>  >> "type": "float"
>>  >> }
>>  >> }
>>  >> },
>>  >> {
>>  >> "geo_longitude": {
>>  >> "match": "enrichments:geo:*:longitude",
>>  >> "match_mapping_type": "*",
>>  >> "mapping": {
>>  >> "type": "float"
>>  >> }
>>  >> }
>>  >> },
>>  >> {
>>  >> "timestamps": {
>>  >> "match": "*:ts",
>>  >> "match_mapping_type": "*",
>>  >> "mapping": {
>>  >> "type": "date",
>>  >> "format": "epoch_millis"
>>  >> }
>>  >> }
>>  >> },
>>  >> {
>>  >> "threat_triage_score": {
>>  >> "mapping": {
>>  >> "type": "float"
>>  >> },
>>  >> "match": "threat:triage:*score",
>>  >> "match_mapping_type": "*"
>>  >> }
>>  >> },
>>  >> {
>>  >> "threat_triage_reason": {
>>  >> "mapping": {
>>  >> "type": "text",
>>  >> "fielddata": "true"
>>  >> },
>>  >> "match": "threat:triage:rules:*:reason",
>>  >> "match_mapping_type": "*"
>>  >> }
>>  >> },
>>  >> {
>>  >> "threat_triage_name": {
>>  >> "mapping": {
>>  >> "type": "text",
>>  >> "fielddata": "true"
>>  >> },
>>  >> "match": "threat:triage:rules:*:name",
>>  >> "match_mapping_type": "*"
>>  >> }
>>  >> }
>>  >>
>>  >> ]}}}
>>  >>
>>  >> That means that for every new sensor we bring on board we can skip
>>  >> adding that boiler plate mapping config to every new template.
>>  >>
>>  >> On Wed, Jan 24, 2018 at 6:34 PM, Michael Miklavcic <
>>  >> michael.miklav...@gmail.com> wrote:
>>  >>
>>  >>> I hear you Ali. I think this type of change would actually ease issues
>>  >>> with downtime because it offers an easy path to migrating existing
>>  indices.
>>  >>> I'd have to review the specifics in the ES docs again, but I believe
>>  you
>>  >>> could duplicate the old indexes and migrate them to "metron_" in
>>  advance of
>>  >>> the upgrade, and then consume new data to the new index pattern/name
>>  after
>>  >>> the upgrade. That should be pretty seamless, I think. I guess it
>>  depends on
>>  >>> how you're using ES.
>>  >>>
>>  >>> On Wed, Jan 24, 2018 at 4:08 PM, Ali Nazemian 
>>  >>> wrote:
>>  >>>
>>   Hi All,
>>  
>>   I just wanted to say it would be great if we can be careful with
>>  these
>>   type
>>   of changes. From the development point of view, it is just a few
>>  lines of
>>   code which can 

Re: [DISCUSS] Update Metron Elasticsearch index names to metron_

2018-01-26 Thread Michael Miklavcic
Just checked on the length issue - we should be good -
https://github.com/elastic/elasticsearch/issues/8079
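
For anyone who wants to guard against the length concern in tooling, a minimal Python sketch follows. It assumes the 255-byte limit on index names that the Elasticsearch documentation states; the function name and the sample index name are hypothetical, not Metron code.

```python
MAX_INDEX_NAME_BYTES = 255  # documented Elasticsearch limit, counted in bytes


def prefixed_index_name(index_name, prefix="metron_"):
    """Prepend the prefix and reject names that exceed the byte limit."""
    name = prefix + index_name
    size = len(name.encode("utf-8"))  # multi-byte characters count more than once
    if size > MAX_INDEX_NAME_BYTES:
        raise ValueError(f"index name is {size} bytes, limit is {MAX_INDEX_NAME_BYTES}")
    return name
```

A seven-character prefix leaves 248 bytes for the rest of the name, which is far more than a sensor name plus a date suffix needs.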

On Fri, Jan 26, 2018 at 3:37 PM, James Sirota  wrote:

> Seems reasonable to me.  The only thing is that it may make the index
> names too long. Not sure if that matters to ES or not
>
> 26.01.2018, 15:32, "Simon Elliston Ball" :
> > +1 on this. The idea of a default broad matching template should also
> include an order entry to avoid conflicts with more specific templates, and
> we should then document the need for a higher order value in all per-source
> index templates.
> >
> > In terms of production migration, I think we may want to provide some
> detailed documentation in the upgrade guide on this, because there will be
> people with a lot of existing indices that will be difficult to handle. We
> may also need some tooling, but I expect docs would do the job. What do
> people think about migration?
> >
> > Simon
> >
> >>  One other benefit of this revised approach - we can more effectively
> use
> >>  index template patterns to specify our base set of Metron property
> types.
> >>  Call me crazy, but I think we should be able to do something like:
> >>
> >>  
> >>
> >>  {
> >>   "template": "metron_*",
> >>   "mappings": {
> >> "metron_doc": {
> >>   "dynamic_templates": [
> >>   {
> >> "geo_location_point": {
> >>   "match": "enrichments:geo:*:location_point",
> >>   "match_mapping_type": "*",
> >>   "mapping": {
> >> "type": "geo_point"
> >>   }
> >> }
> >>   },
> >>   {
> >> "geo_country": {
> >>   "match": "enrichments:geo:*:country",
> >>   "match_mapping_type": "*",
> >>   "mapping": {
> >> "type": "keyword"
> >>   }
> >> }
> >>   },
> >>   {
> >> "geo_city": {
> >>   "match": "enrichments:geo:*:city",
> >>   "match_mapping_type": "*",
> >>   "mapping": {
> >> "type": "keyword"
> >>   }
> >> }
> >>   },
> >>   {
> >> "geo_location_id": {
> >>   "match": "enrichments:geo:*:locID",
> >>   "match_mapping_type": "*",
> >>   "mapping": {
> >> "type": "keyword"
> >>   }
> >> }
> >>   },
> >>   {
> >> "geo_dma_code": {
> >>   "match": "enrichments:geo:*:dmaCode",
> >>   "match_mapping_type": "*",
> >>   "mapping": {
> >> "type": "keyword"
> >>   }
> >> }
> >>   },
> >>   {
> >> "geo_postal_code": {
> >>   "match": "enrichments:geo:*:postalCode",
> >>   "match_mapping_type": "*",
> >>   "mapping": {
> >> "type": "keyword"
> >>   }
> >> }
> >>   },
> >>   {
> >> "geo_latitude": {
> >>   "match": "enrichments:geo:*:latitude",
> >>   "match_mapping_type": "*",
> >>   "mapping": {
> >> "type": "float"
> >>   }
> >> }
> >>   },
> >>   {
> >> "geo_longitude": {
> >>   "match": "enrichments:geo:*:longitude",
> >>   "match_mapping_type": "*",
> >>   "mapping": {
> >> "type": "float"
> >>   }
> >> }
> >>   },
> >>   {
> >> "timestamps": {
> >>   "match": "*:ts",
> >>   "match_mapping_type": "*",
> >>   "mapping": {
> >> "type": "date",
> >> "format": "epoch_millis"
> >>   }
> >> }
> >>   },
> >>   {
> >> "threat_triage_score": {
> >>   "mapping": {
> >> "type": "float"
> >>   },
> >>   "match": "threat:triage:*score",
> >>   "match_mapping_type": "*"
> >> }
> >>   },
> >>   {
> >> "threat_triage_reason": {
> >>   "mapping": {
> >> "type": "text",
> >> "fielddata": "true"
> >>   },
> >>   "match": "threat:triage:rules:*:reason",
> >>   "match_mapping_type": "*"
> >> }
> >>   },
> >>   {
> >> "threat_triage_name": {
> >>   "mapping": {
> >> "type": "text",
> >> "fielddata": "true"
> >>   },
> >>   "match": "threat:triage:rules:*:name",
> >>   "match_mapping_type": "*"
> >> }
> >>   }
> >>
> >>  ]}}}
> >>
> >>  That means that for every new sensor we bring on board we can skip
> >>  adding that boiler plate mapping config to every new template.
> >>
> >>  On Wed, Jan 24, 2018 at 6:34 PM, Michael Miklavcic <
> >>  michael.miklav...@gmail.com> wrote:
> >>
> >>>  I hear you Ali. I think this type of change would actually ease issues
> >>>  with downtime because it offers an easy path to migrating existing
> indices.
> >>>  I'd have to review the specifics in the ES docs again, but I believe
> you
> >>>  could 

Re: [DISCUSS] Time to remove github updates from dev?

2018-01-26 Thread James Sirota
Should we file an infra ticket on this? 

19.01.2018, 13:56, "zeo...@gmail.com" :
> I would give that +1 as well.
>
> Jon
>
> On Fri, Jan 19, 2018 at 3:32 PM Casey Stella  wrote:
>
>>  I could get behind that.
>>
>>  On Fri, Jan 19, 2018 at 3:31 PM, Andre  wrote:
>>
>>  > Folks,
>>  >
>>  > May I suggest Metron follows the NiFi mailing list strategy (we got
>>  > inspired by another project but I don't recall the name) and remove the
>>  > github comments from the dev list?
>>  >
>>  > Within NiFi we have both the dev and the issues lists. dev is for humans,
>>  > issues is for JIRA and github commits.[1]
>>  >
>>  > This allows the list thread list to be cleaner and is particularly
>>  helpful
>>  > for those reading the list from a list aggregation service.
>>  >
>>  > Cheers
>>  >
>>  >
>>  > [1] https://lists.apache.org/list.html?iss...@nifi.apache.org
>>  >
>
> --
>
> Jon

--- 
Thank you,

James Sirota
PMC- Apache Metron
jsirota AT apache DOT org



Re: [DISCUSS] Update Metron Elasticsearch index names to metron_

2018-01-26 Thread James Sirota
Seems reasonable to me.  The only thing is that it may make the index names too 
long. Not sure if that matters to ES or not 

26.01.2018, 15:32, "Simon Elliston Ball" :
> +1 on this. The idea of a default broad matching template should also include 
> an order entry to avoid conflicts with more specific templates, and we should 
> then document the need for a higher order value in all per-source index 
> templates.
>
> In terms of production migration, I think we may want to provide some 
> detailed documentation in the upgrade guide on this, because there will be 
> people with a lot of existing indices that will be difficult to handle. We 
> may also need some tooling, but I expect docs would do the job. What do 
> people think about migration?
>
> Simon
>
>>  One other benefit of this revised approach - we can more effectively use
>>  index template patterns to specify our base set of Metron property types.
>>  Call me crazy, but I think we should be able to do something like:
>>
>>  
>>
>>  {
>>   "template": "metron_*",
>>   "mappings": {
>> "metron_doc": {
>>   "dynamic_templates": [
>>   {
>> "geo_location_point": {
>>   "match": "enrichments:geo:*:location_point",
>>   "match_mapping_type": "*",
>>   "mapping": {
>> "type": "geo_point"
>>   }
>> }
>>   },
>>   {
>> "geo_country": {
>>   "match": "enrichments:geo:*:country",
>>   "match_mapping_type": "*",
>>   "mapping": {
>> "type": "keyword"
>>   }
>> }
>>   },
>>   {
>> "geo_city": {
>>   "match": "enrichments:geo:*:city",
>>   "match_mapping_type": "*",
>>   "mapping": {
>> "type": "keyword"
>>   }
>> }
>>   },
>>   {
>> "geo_location_id": {
>>   "match": "enrichments:geo:*:locID",
>>   "match_mapping_type": "*",
>>   "mapping": {
>> "type": "keyword"
>>   }
>> }
>>   },
>>   {
>> "geo_dma_code": {
>>   "match": "enrichments:geo:*:dmaCode",
>>   "match_mapping_type": "*",
>>   "mapping": {
>> "type": "keyword"
>>   }
>> }
>>   },
>>   {
>> "geo_postal_code": {
>>   "match": "enrichments:geo:*:postalCode",
>>   "match_mapping_type": "*",
>>   "mapping": {
>> "type": "keyword"
>>   }
>> }
>>   },
>>   {
>> "geo_latitude": {
>>   "match": "enrichments:geo:*:latitude",
>>   "match_mapping_type": "*",
>>   "mapping": {
>> "type": "float"
>>   }
>> }
>>   },
>>   {
>> "geo_longitude": {
>>   "match": "enrichments:geo:*:longitude",
>>   "match_mapping_type": "*",
>>   "mapping": {
>> "type": "float"
>>   }
>> }
>>   },
>>   {
>> "timestamps": {
>>   "match": "*:ts",
>>   "match_mapping_type": "*",
>>   "mapping": {
>> "type": "date",
>> "format": "epoch_millis"
>>   }
>> }
>>   },
>>   {
>> "threat_triage_score": {
>>   "mapping": {
>> "type": "float"
>>   },
>>   "match": "threat:triage:*score",
>>   "match_mapping_type": "*"
>> }
>>   },
>>   {
>> "threat_triage_reason": {
>>   "mapping": {
>> "type": "text",
>> "fielddata": "true"
>>   },
>>   "match": "threat:triage:rules:*:reason",
>>   "match_mapping_type": "*"
>> }
>>   },
>>   {
>> "threat_triage_name": {
>>   "mapping": {
>> "type": "text",
>> "fielddata": "true"
>>   },
>>   "match": "threat:triage:rules:*:name",
>>   "match_mapping_type": "*"
>> }
>>   }
>>
>>  ]}}}
>>
>>  That means that for every new sensor we bring on board we can skip
>>  adding that boiler plate mapping config to every new template.
>>
>>  On Wed, Jan 24, 2018 at 6:34 PM, Michael Miklavcic <
>>  michael.miklav...@gmail.com> wrote:
>>
>>>  I hear you Ali. I think this type of change would actually ease issues
>>>  with downtime because it offers an easy path to migrating existing indices.
>>>  I'd have to review the specifics in the ES docs again, but I believe you
>>>  could duplicate the old indexes and migrate them to "metron_" in advance of
>>>  the upgrade, and then consume new data to the new index pattern/name after
>>>  the upgrade. That should be pretty seamless, I think. I guess it depends on
>>>  how you're using ES.
>>>
>>>  On Wed, Jan 24, 2018 at 4:08 PM, Ali Nazemian 
>>>  wrote:
>>>
  Hi All,

  I just wanted to say it would be great if we can be careful with these
  type
  of changes. 

Re: [DISCUSS] Update Metron Elasticsearch index names to metron_

2018-01-26 Thread Simon Elliston Ball
+1 on this. The idea of a default broad matching template should also include 
an order entry to avoid conflicts with more specific templates, and we should 
then document the need for a higher order value in all per-source index 
templates. 
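
To make the ordering point concrete, here is a simplified Python model of how matching templates combine: lower order values are applied first and higher ones override them. The flat field-to-type mappings, the template patterns, and the function name are illustrative assumptions, not Elasticsearch or Metron code; real template merging works on nested mapping documents.

```python
from fnmatch import fnmatchcase


def effective_mappings(index_name, templates):
    """Merge the mappings of every template whose pattern matches index_name.

    Templates are applied in ascending `order`, so a higher order wins
    whenever two templates define the same field.
    """
    matching = [t for t in templates if fnmatchcase(index_name, t["template"])]
    merged = {}
    for template in sorted(matching, key=lambda t: t.get("order", 0)):
        merged.update(template["mappings"])
    return merged


TEMPLATES = [
    # Hypothetical broad default template for all Metron indices.
    {"template": "metron_*", "order": 0,
     "mappings": {"timestamp": "date", "source:type": "keyword"}},
    # Hypothetical per-source template; it needs a higher order to override.
    {"template": "metron_bro_*", "order": 10,
     "mappings": {"source:type": "text"}},
]
```

With these two templates, an index named metron_bro_2018.01.26 would get the per-source type for source:type and inherit timestamp from the default, while any other metron_ index would get only the defaults.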

In terms of production migration, I think we may want to provide some detailed 
documentation in the upgrade guide on this, because there will be people with a 
lot of existing indices that will be difficult to handle. We may also need some 
tooling, but I expect docs would do the job. What do people think about 
migration?
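
As a possible starting point for such tooling, the Python sketch below builds one request body per existing index in the shape accepted by the Elasticsearch _reindex API (a source index and a destination index). The placement of the metron_ prefix and the sample index names are assumptions, since the final naming scheme is still under discussion in this thread.

```python
def reindex_requests(existing_indices, prefix="metron_"):
    """Build a _reindex request body for each index that lacks the prefix."""
    return [
        {"source": {"index": name}, "dest": {"index": prefix + name}}
        for name in existing_indices
        if not name.startswith(prefix)  # already migrated, leave it alone
    ]
```

Each body would be sent as a POST to _reindex ahead of the upgrade; indices that already carry the prefix are skipped so the script can be re-run safely.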

Simon

> 
> One other benefit of this revised approach - we can more effectively use
> index template patterns to specify our base set of Metron property types.
> Call me crazy, but I think we should be able to do something like:
> 
> 
> 
> {
>  "template": "metron_*",
>  "mappings": {
>"metron_doc": {
>  "dynamic_templates": [
>  {
>"geo_location_point": {
>  "match": "enrichments:geo:*:location_point",
>  "match_mapping_type": "*",
>  "mapping": {
>"type": "geo_point"
>  }
>}
>  },
>  {
>"geo_country": {
>  "match": "enrichments:geo:*:country",
>  "match_mapping_type": "*",
>  "mapping": {
>"type": "keyword"
>  }
>}
>  },
>  {
>"geo_city": {
>  "match": "enrichments:geo:*:city",
>  "match_mapping_type": "*",
>  "mapping": {
>"type": "keyword"
>  }
>}
>  },
>  {
>"geo_location_id": {
>  "match": "enrichments:geo:*:locID",
>  "match_mapping_type": "*",
>  "mapping": {
>"type": "keyword"
>  }
>}
>  },
>  {
>"geo_dma_code": {
>  "match": "enrichments:geo:*:dmaCode",
>  "match_mapping_type": "*",
>  "mapping": {
>"type": "keyword"
>  }
>}
>  },
>  {
>"geo_postal_code": {
>  "match": "enrichments:geo:*:postalCode",
>  "match_mapping_type": "*",
>  "mapping": {
>"type": "keyword"
>  }
>}
>  },
>  {
>"geo_latitude": {
>  "match": "enrichments:geo:*:latitude",
>  "match_mapping_type": "*",
>  "mapping": {
>"type": "float"
>  }
>}
>  },
>  {
>"geo_longitude": {
>  "match": "enrichments:geo:*:longitude",
>  "match_mapping_type": "*",
>  "mapping": {
>"type": "float"
>  }
>}
>  },
>  {
>"timestamps": {
>  "match": "*:ts",
>  "match_mapping_type": "*",
>  "mapping": {
>"type": "date",
>"format": "epoch_millis"
>  }
>}
>  },
>  {
>"threat_triage_score": {
>  "mapping": {
>"type": "float"
>  },
>  "match": "threat:triage:*score",
>  "match_mapping_type": "*"
>}
>  },
>  {
>"threat_triage_reason": {
>  "mapping": {
>"type": "text",
>"fielddata": "true"
>  },
>  "match": "threat:triage:rules:*:reason",
>  "match_mapping_type": "*"
>}
>  },
>  {
>"threat_triage_name": {
>  "mapping": {
>"type": "text",
>"fielddata": "true"
>  },
>  "match": "threat:triage:rules:*:name",
>  "match_mapping_type": "*"
>}
>  }
> 
> ]}}}
> 
> That means that for every new sensor we bring on board we can skip
> adding that boiler plate mapping config to every new template.
> 
> 
> 
> On Wed, Jan 24, 2018 at 6:34 PM, Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
> 
>> I hear you Ali. I think this type of change would actually ease issues
>> with downtime because it offers an easy path to migrating existing indices.
>> I'd have to review the specifics in the ES docs again, but I believe you
>> could duplicate the old indexes and migrate them to "metron_" in advance of
>> the upgrade, and then consume new data to the new index pattern/name after
>> the upgrade. That should be pretty seamless, I think. I guess it depends on
>> how you're using ES.
>> 
>> On Wed, Jan 24, 2018 at 4:08 PM, Ali Nazemian 
>> wrote:
>> 
>>> Hi All,
>>> 
>>> I just wanted to say it would be great if we can be careful with these
>>> type
>>> of changes. From the development point of view, it is just a few lines of
>>> code which can provide multiple advantages, but for live large-scale
>>> Metron
>>> platforms, some of these changes might be really expensive to address with
>>> zero-downtime.
>>> 
>>> Cheers,
>>> Ali
>>> 
>>> On Thu, Jan 25, 2018 at 9:29 AM, Otto Fowler 
>>> wrote:
>>> 
 +1
 
 
 On January 24, 2018 at 16:28:42, Nick Allen (n...@nickallen.org) wrote:
 
 +1 to a standard 

Re: Metron User Community Meeting Call

2018-01-26 Thread James Sirota
Yeah very interested in the presentation as well

26.01.2018, 15:15, "Simon Elliston Ball" :
> This is going to be a really exciting call. Looking forward to seeing how the 
> GCR Canary sings :)
>
> I’m going to volunteer https://hortonworks.zoom.us/my/simonellistonball as a 
> location for the meeting.
>
> I would also support the idea of a quick poll on what people are doing with 
> Metron, and maybe if anyone wants to volunteer at the end of the meeting it 
> would be great to have an open mic of use cases.
>
> Talk to you all Wednesday.
>
> Simon
>
>>  On 26 Jan 2018, at 22:10, Seal, Steve  wrote:
>>
>>  HI all,
>>
>>  I have several people on my team that are looking forward to hearing about 
>> Ahmed’s work.
>>
>>  Steve
>>
>>  From: Daniel Schafer [mailto:daniel.scha...@sstech.us]
>>  Sent: Friday, January 26, 2018 5:05 PM
>>  To: u...@metron.apache.org; dev@metron.apache.org
>>  Subject: Re: Metron User Community Meeting Call
>>
>>  My team members and I would like to join as well.
>>  We can provide Zoom Meeting login if necessary.
>>
>>  Thanks
>>
>>  Daniel
>>  7134806608
>>
>>  From: Ahmed Shah > >
>>  Reply-To: "u...@metron.apache.org " 
>> >
>>  Date: Friday, January 26, 2018 at 2:06 PM
>>  To: "dev@metron.apache.org " 
>> >, 
>> "u...@metron.apache.org " 
>> >
>>  Subject: Re: Metron User Community Meeting Call
>>
>>  Looking forward to presenting!
>>
>>  Just a thought...
>>  In advance, should we create a Google Form to collect survey data on who 
>> is using Metron, how they are using it, etc., and present the results to the 
>> group?
>>
>>  -Ahmed
>>  ___
>>  Ahmed Shah (PMP, M. Eng.)
>>  Cybersecurity Analyst & Developer
>>  GCR - Cybersecurity Operations Center
>>  Carleton University - cugcr.com 
>> 
>>
>>  From: Andrew Psaltis > >
>>  Sent: January 26, 2018 1:53 PM
>>  To: dev@metron.apache.org 
>>  Subject: Re: Metron User Community Meeting Call
>>
>>  Count me in. Very interested to hear about Ahmed's journey.
>>
>>  On Fri, Jan 26, 2018 at 8:58 AM, Kyle Richardson > >
>>  wrote:
>>
>>  > Thanks! I'll be there. Excited to hear Ahmed's successes and challenges.
>>  >
>>  > -Kyle
>>  >
>>  > On Thu, Jan 25, 2018 at 7:44 PM zeo...@gmail.com 
>>  > wrote:
>>  >
>>  > > Thanks Otto, I'm in to attend at that time/place.
>>  > >
>>  > > Jon
>>  > >
>>  > > On Thu, Jan 25, 2018, 14:45 Otto Fowler > > wrote:
>>  > >
>>  > >> I would like to propose a Metron user community meeting. I propose that
>>  > >> we set the meeting next week, and will throw out Wednesday, January
>>  > 31st at
>>  > >> 09:30AM PST, 12:30 on the East Coast and 5:30 in London Towne. This
>>  > meeting
>>  > >> will be held over a web-ex, the details of which will be included in 
>> the
>>  > >> actual meeting notice.
>>  > >> Topics
>>  > >>
>>  > >> We have a volunteer for a community member presentation:
>>  > >>
>>  > >> Ahmed Shah (PMP, M. Eng.) Cybersecurity Analyst & Developer GCR -
>>  > >> Cybersecurity Operations Center Carleton University - cugcr.com 
>> 
>>  > >>
>>  > >> Ahmed would like to talk to the community about
>>  > >>
>>  > >> - Who the GCR group is
>>  > >> - How they use Metron 0.4.1
>>  > >> - Walk through their dashboards, UI management screen, nifi
>>  > >> - Challenges we faced up until now
>>  > >>
>>  > >> I would like to thank Ahmed for stepping forward for this meeting.
>>  > >>
>>  > >> If you have something you would like to present or talk about please
>>  > >> reply here! Maybe we can have people ask for “A better explanation of
>>  > >> feature X” type things?
>>  > >> Metron User Community Meetings
>>  > >>
>>  > >> User Community Meetings are a means for realtime discussion of
>>  > >> experiences with Apache Metron, or demonstration of how 

Re: Metron User Community Meeting Call

2018-01-26 Thread Simon Elliston Ball
This is going to be a really exciting call. Looking forward to seeing how the 
GCR Canary sings :) 

I’m going to volunteer https://hortonworks.zoom.us/my/simonellistonball as a 
location for the meeting.

I would also support the idea of a quick poll on what people are doing with 
Metron, and maybe if anyone wants to volunteer at the end of the meeting it 
would be great to have an open mic of use cases. 

Talk to you all Wednesday. 

Simon

> On 26 Jan 2018, at 22:10, Seal, Steve  wrote:
> 
> Hi all,
>  
> I have several people on my team that are looking forward to hearing about 
> Ahmed’s work. 
>  
> Steve
>  
>  
> From: Daniel Schafer [mailto:daniel.scha...@sstech.us] 
> Sent: Friday, January 26, 2018 5:05 PM
> To: u...@metron.apache.org; dev@metron.apache.org
> Subject: Re: Metron User Community Meeting Call
>  
> My team members and I would like to join as well.
> We can provide Zoom Meeting login if necessary.
>  
> Thanks
>  
> Daniel
> 7134806608 
>  
> From: Ahmed Shah
> Reply-To: "u...@metron.apache.org"
> Date: Friday, January 26, 2018 at 2:06 PM
> To: "dev@metron.apache.org", "u...@metron.apache.org"
> Subject: Re: Metron User Community Meeting Call
>  
> Looking forward to presenting!
>  
> Just a thought...
> In advance, should we create a Google Form to collect survey data on who is
> using Metron, how they are using it, etc., and present the results to the
> group?
>  
> -Ahmed
> ___
> Ahmed Shah (PMP, M. Eng.)
> Cybersecurity Analyst & Developer 
> GCR - Cybersecurity Operations Center
> Carleton University - cugcr.com 
> 
>  
> 
> From: Andrew Psaltis
> Sent: January 26, 2018 1:53 PM
> To: dev@metron.apache.org 
> Subject: Re: Metron User Community Meeting Call
>  
> Count me in. Very interested to hear about Ahmed's journey.
> 
> On Fri, Jan 26, 2018 at 8:58 AM, Kyle Richardson wrote:
> 
> > Thanks! I'll be there. Excited to hear Ahmed's successes and challenges.
> >
> > -Kyle
> >
> > On Thu, Jan 25, 2018 at 7:44 PM zeo...@gmail.com wrote:
> >
> > > Thanks Otto, I'm in to attend at that time/place.
> > >
> > > Jon
> > >
> > > On Thu, Jan 25, 2018, 14:45 Otto Fowler wrote:
> > >
> > >> I would like to propose a Metron user community meeting. I propose that
> > >> we set the meeting next week, and will throw out Wednesday, January
> > >> 31st at 09:30AM PST, 12:30 on the East Coast and 5:30 in London Towne.
> > >> This meeting will be held over a web-ex, the details of which will be
> > >> included in the actual meeting notice.
> > >> Topics
> > >>
> > >> We have a volunteer for a community member presentation:
> > >>
> > >> Ahmed Shah (PMP, M. Eng.) Cybersecurity Analyst & Developer GCR -
> > >> Cybersecurity Operations Center Carleton University - cugcr.com
> > >>
> > >> Ahmed would like to talk to the community about
> > >>
> > >> - Who the GCR group is
> > >> - How they use Metron 0.4.1
> > >> - Walk through their dashboards, UI management screen, nifi
> > >> - Challenges we faced up until now
> > >>
> > >> I would like to thank Ahmed for stepping forward for this meeting.
> > >>
> > >> If you have something you would like to present or talk about please
> > >> reply here! Maybe we can have people ask for “A better explanation of
> > >> feature X” type things?
> > >> Metron User Community Meetings
> > >>
> > >> User Community Meetings are a means for realtime discussion of
> > >> experiences with Apache Metron, or demonstration of how the community is
> > >> using or will be using Apache Metron.
> > >>
> > >> These meetings are geared towards:
> > >>
> > >>-
> > >>
> > >>Demonstrations and knowledge sharing as opposed to technical
> > >>discussion or implementation details from members 

[GitHub] metron issue #903: METRON-1370 Create Full Dev Equivalent for Ubuntu

2018-01-26 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/metron/pull/903
  
Looks good to me!


---


[GitHub] metron issue #907: METRON-1427: Add support for storm 1.1 and hdp 2.6

2018-01-26 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/907
  
I'm testing now as well for good measure since this is our main dev testing 
environment. We should probably take a little extra care with the breadth of 
changes in this and https://github.com/apache/metron/pull/903


---


[GitHub] metron issue #901: METRON-1410 [MPACK] Check for existing HBASE tables befor...

2018-01-26 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/901
  
Thanks for fixing this @ottobackwards. +1 via inspection from me as well.


---


[GitHub] metron issue #907: METRON-1427: Add support for storm 1.1 and hdp 2.6

2018-01-26 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/metron/pull/907
  
ok, I ran this guy up in kerberos and regular and tooled around a bit; 
ensured the alerts UI worked in both.


---


[GitHub] metron issue #890: METRON-1391 Fix for README.md in Metron Management

2018-01-26 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/890
  
+1, lgtm via inspection. Thanks for the contribution!


---


[GitHub] metron pull request #910: METRON-1430: Isolate jackson from being used as ar...

2018-01-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/metron/pull/910


---


[GitHub] metron issue #903: METRON-1370 Create Full Dev Equivalent for Ubuntu

2018-01-26 Thread ottobackwards
Github user ottobackwards commented on the issue:

https://github.com/apache/metron/pull/903
  
+1 still stands



---


[GitHub] metron issue #903: METRON-1370 Create Full Dev Equivalent for Ubuntu

2018-01-26 Thread JonZeolla
Github user JonZeolla commented on the issue:

https://github.com/apache/metron/pull/903
  
+1 to that latest round of naming


---


Re: Metron User Community Meeting Call

2018-01-26 Thread Ahmed Shah
Looking forward to presenting!


Just a thought...

In advance, should we create a Google Form to collect survey data on who is
using Metron, how they are using it, etc., and present the results to the group?


-Ahmed
___
Ahmed Shah (PMP, M. Eng.)
Cybersecurity Analyst & Developer
GCR - Cybersecurity Operations Center
Carleton University - cugcr.com



From: Andrew Psaltis 
Sent: January 26, 2018 1:53 PM
To: dev@metron.apache.org
Subject: Re: Metron User Community Meeting Call

Count me in. Very interested to hear about Ahmed's journey.

On Fri, Jan 26, 2018 at 8:58 AM, Kyle Richardson 
wrote:

> Thanks! I'll be there. Excited to hear Ahmed's successes and challenges.
>
> -Kyle
>
> On Thu, Jan 25, 2018 at 7:44 PM zeo...@gmail.com  wrote:
>
> > Thanks Otto, I'm in to attend at that time/place.
> >
> > Jon
> >
> > On Thu, Jan 25, 2018, 14:45 Otto Fowler  wrote:
> >
> >> I would like to propose a Metron user community meeting. I propose that
> >> we set the meeting next week, and will throw out Wednesday, January
> >> 31st at 09:30AM PST, 12:30 on the East Coast and 5:30 in London Towne.
> >> This meeting will be held over a web-ex, the details of which will be
> >> included in the actual meeting notice.
> >> Topics
> >>
> >> We have a volunteer for a community member presentation:
> >>
> >> Ahmed Shah (PMP, M. Eng.) Cybersecurity Analyst & Developer GCR -
> >> Cybersecurity Operations Center Carleton University - cugcr.com
> >>
> >> Ahmed would like to talk to the community about
> >>
> >> - Who the GCR group is
> >> - How they use Metron 0.4.1
> >> - Walk through their dashboards, UI management screen, nifi
> >> - Challenges we faced up until now
> >>
> >> I would like to thank Ahmed for stepping forward for this meeting.
> >>
> >> If you have something you would like to present or talk about please
> >> reply here! Maybe we can have people ask for “A better explanation of
> >> feature X” type things?
> >> Metron User Community Meetings
> >>
> >> User Community Meetings are a means for realtime discussion of
> >> experiences with Apache Metron, or demonstration of how the community is
> >> using or will be using Apache Metron.
> >>
> >> These meetings are geared towards:
> >>
> >> - Demonstrations and knowledge sharing as opposed to technical
> >>   discussion or implementation details from members of the Apache
> >>   Metron Community
> >> - Existing Feature demonstrations
> >> - Proposed Feature demonstrations
> >> - Community feedback
> >>
> >> These meetings are *not* for:
> >>
> >> - Support discussions. Those are best left to the mailing lists.
> >> - Development discussions. There is another type of meeting for that.
> >>
> >>
> >>
> >>
> >
> > --
> >
> > Jon
> >
>



--
Thanks,
Andrew

Subscribe to my book: Streaming Data 

twitter: @itmdata


Re: When things change in hdfs, how do we know

2018-01-26 Thread Otto Fowler
https://github.com/ottobackwards/hdfs-inotify-zookeeper
Has the basics framed out, short of pushing to zookeeper, which I mocked
out at this time.
I’ll add pushing to zk and a cache notification listener to the test soon.


On January 26, 2018 at 08:48:54, Otto Fowler (ottobackwa...@gmail.com)
wrote:

In the future, when the ‘filter paths on the name node side for inotify’
change lands in HDFS (there is a JIRA from the summer that is not making
progress), we can just use the paths to register.


On January 26, 2018 at 08:47:11, Otto Fowler (ottobackwa...@gmail.com)
wrote:

In the end, what I’m thinking is this:

We have an Ambari service that runs the notification -> ZooKeeper.
It reads the ‘registration area’ from ZooKeeper to get its state and what
to watch.
Post 777, when parsers are installed and registered, it is trivial to have my
installer also register the files to watch.

The notification service also has a notification from ZooKeeper for new
registrations.

On a notify event, the ‘notification node’ has its content set to the event
details and time,
which the parser would pick up…. causing the reload
???
profit


This would work for the future script parser etc etc.


On January 26, 2018 at 08:30:32, Simon Elliston Ball (
si...@simonellistonball.com) wrote:

Interesting, so you have an INotify listener to filter events, and then on
given changes, propagate a notification to zookeeper, which then triggers
the reconfiguration event via the curator client in Metron. I kinda like it
given our existing zookeeper methods.

Simon

On 26 Jan 2018, at 13:27, Otto Fowler  wrote:

https://github.com/ottobackwards/hdfs-inotify-zookeeper

Working on a poc



On January 26, 2018 at 07:41:44, Simon Elliston Ball (
si...@simonellistonball.com) wrote:

Should we consider using the Inotify interface to trigger reconfiguration,
in the same way we trigger config changes in curator? We also need to fix
caching and lifecycle in the Grok parser to make the zookeeper changes
propagate pattern changes while we’re at it.

Simon

> On 26 Jan 2018, at 03:16, Casey Stella  wrote:
>
> Right now you have to restart the parser topology.
>
> On Thu, Jan 25, 2018 at 10:15 PM, Otto Fowler 
> wrote:
>
>> At the moment, when a grok file or something changes in HDFS, how do we
>> know? Do we have to restart the parser topology to pick it up?
>> Just trying to clarify for myself.
>>
>> ottO
>>


Re: Metron User Community Meeting Call

2018-01-26 Thread Andrew Psaltis
Count me in. Very interested to hear about Ahmed's journey.

On Fri, Jan 26, 2018 at 8:58 AM, Kyle Richardson 
wrote:

> Thanks! I'll be there. Excited to hear Ahmed's successes and challenges.
>
> -Kyle
>
> On Thu, Jan 25, 2018 at 7:44 PM zeo...@gmail.com  wrote:
>
> > Thanks Otto, I'm in to attend at that time/place.
> >
> > Jon
> >
> > On Thu, Jan 25, 2018, 14:45 Otto Fowler  wrote:
> >
> >> I would like to propose a Metron user community meeting. I propose that
> >> we set the meeting next week, and will throw out Wednesday, January
> >> 31st at 09:30AM PST, 12:30 on the East Coast and 5:30 in London Towne.
> >> This meeting will be held over a web-ex, the details of which will be
> >> included in the actual meeting notice.
> >> Topics
> >>
> >> We have a volunteer for a community member presentation:
> >>
> >> Ahmed Shah (PMP, M. Eng.) Cybersecurity Analyst & Developer GCR -
> >> Cybersecurity Operations Center Carleton University - cugcr.com
> >>
> >> Ahmed would like to talk to the community about
> >>
> >> - Who the GCR group is
> >> - How they use Metron 0.4.1
> >> - Walk through their dashboards, UI management screen, nifi
> >> - Challenges we faced up until now
> >>
> >> I would like to thank Ahmed for stepping forward for this meeting.
> >>
> >> If you have something you would like to present or talk about please
> >> reply here! Maybe we can have people ask for “A better explanation of
> >> feature X” type things?
> >> Metron User Community Meetings
> >>
> >> User Community Meetings are a means for realtime discussion of
> >> experiences with Apache Metron, or demonstration of how the community is
> >> using or will be using Apache Metron.
> >>
> >> These meetings are geared towards:
> >>
> >> - Demonstrations and knowledge sharing as opposed to technical
> >>   discussion or implementation details from members of the Apache
> >>   Metron Community
> >> - Existing Feature demonstrations
> >> - Proposed Feature demonstrations
> >> - Community feedback
> >>
> >> These meetings are *not* for:
> >>
> >> - Support discussions. Those are best left to the mailing lists.
> >> - Development discussions. There is another type of meeting for that.
> >>
> >>
> >>
> >>
> >
> > --
> >
> > Jon
> >
>



-- 
Thanks,
Andrew

Subscribe to my book: Streaming Data 

twitter: @itmdata


Re: [DISCUSS] Using JSON Path to support more complex documents with the JSONMap Parser

2018-01-26 Thread Laurens Vets

On 2018-01-25 07:57, Otto Fowler wrote:

While it would be preferred if all data streamed into the parsers is
already in ‘stream’ form, as opposed to ‘batched’ form, it may not always
be possible, or possible at every step of system development.

I was wondering if it would be worth adding optional support to the JSONMap
Parser to support more complex documents, and split them in the parser into
multiple messages. This is similar in function to the JSON Splitter
processor in NiFi.
So, a document would come into the JSONMap Parser from Kafka, with some
embedded set of the real message content, such as in this simplified
example:

{
  "messages" : [
    { message1 },
    { message2 },
    ….
    { messageN }
  ]
}

The JSONMap Parser would have a new configuration item for message
selection, which would be a JSON Path expression:

"messageSelector" : "$.messages"

Inside the JSONMap Parser, it would evaluate the expression, and do the
same processing on each item in the list returned by the expression.

The Parser interface already supports returning multiple message objects
from a single byte[] input.

There is a performance penalty to be paid here, and it is more than just
doing more than one message, due to the JSONPath evaluation.

I can see this being useful in a couple of circumstances:

   - You want to work with some document format with Metron but do not have
     NiFi or the equivalent available or set up yet
   - You want to prototype with Metron before you get the ‘preprocessing’
     set up
   - You are not going to be able to use NiFi and are ok with the performance

I have something on GitHub to look at for more detail:
ottobackwards/json-path-play


Thoughts?


I like this, it's the exact reason why we use the NiFi splitter right now.
We get 'batched' CloudTrail events which need to be split into individual
events...
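
For readers following the thread, the splitting step proposed above can be sketched roughly as follows. This is only an illustration in Python, not Metron's JSONMap Parser: `select` and `split_messages` are hypothetical names, and the tiny `$.a.b` resolver stands in for a real JSON Path library, which would support far more of the expression syntax.

```python
import json


def select(document, selector):
    """Resolve a simple '$.a.b' style path; NOT a full JSON Path implementation."""
    selector = selector.strip()
    if not selector.startswith("$"):
        raise ValueError("selector must start with '$'")
    node = document
    for key in selector.lstrip("$").split("."):
        if key:  # skip the empty segment produced by the leading '.'
            node = node[key]
    return node


def split_messages(raw_bytes, message_selector):
    """Return one dict per embedded message in a batched JSON document."""
    document = json.loads(raw_bytes.decode("utf-8"))
    selected = select(document, message_selector)
    # A selector may resolve to a single object; normalise to a list.
    if isinstance(selected, dict):
        selected = [selected]
    return [dict(item) for item in selected]


# One batched document from Kafka becomes several individual messages.
batched = b'{"messages": [{"id": 1, "src": "a"}, {"id": 2, "src": "b"}]}'
messages = split_messages(batched, "$.messages")
```

Each returned dict would then go through the same per-message processing a single, unbatched message gets today.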


[GitHub] metron issue #903: METRON-1370 Create Full Dev Equivalent for Ubuntu

2018-01-26 Thread lvets
Github user lvets commented on the issue:

https://github.com/apache/metron/pull/903
  
@nickwallen Fine for me :) This might be even better if you ever decide to 
make production ready pre-made ansible deployments (if that makes sense?)


---


[GitHub] metron issue #903: METRON-1370 Create Full Dev Equivalent for Ubuntu

2018-01-26 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/903
  
Ok, I renamed the development environments.  

I went with a slightly different name than I previously mentioned, but it 
still matches the suggestions that I received earlier.  I thought this made 
more sense.  Feel free to tell me if you don't like it.

Instead of `vagrant/full-dev-environment` or `vagrant/metron-on-centos` or 
`vagrant/dev-on-centos6`, we have `development/centos6` which is concise and 
very clear as to the intended purpose. 

* `metron-deployment/development/centos6`
* `metron-deployment/development/ubuntu14`
* `metron-deployment/development/fastcapa`

I also added `metron-deployment/ansible/README.md` to clarify the purpose 
and use of those shared Ansible assets.  I really do not want to see people 
trying to use those for anything outside of the development environments.

I edited `metron-deployment/development/README.md` to describe the various 
development environments.

Let me know if this jibes for everyone; @ottobackwards, @cestella, @lvets, 
etc.


---


[GitHub] metron issue #910: METRON-1430: Isolate jackson from being used as arguments...

2018-01-26 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/910
  
Ah, ok I didn't pick that up from the context, but that makes sense. +1 
stands.


---


[GitHub] metron pull request #898: METRON-1398:Exclude the basic-error-controller fro...

2018-01-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/metron/pull/898


---


[GitHub] metron pull request #892: METRON-1392 Fix a test case to expect an Exception...

2018-01-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/metron/pull/892


---


[GitHub] metron issue #892: METRON-1392 Fix a test case to expect an Exception when r...

2018-01-26 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/892
  
+1 by inspection.  Thanks for fixing this @MohanDV.  The original test 
didn't even make sense.


---


[GitHub] metron issue #898: METRON-1398:Exclude the basic-error-controller from being...

2018-01-26 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/898
  
+1 by inspection.  Thanks @MohanDV!


---


Re: When things change in hdfs, how do we know

2018-01-26 Thread Otto Fowler
In the future, when the ‘filter paths on the name node side for inotify’
change lands in HDFS (there is a JIRA from the summer that is not making
progress), we can just use the paths to register.


On January 26, 2018 at 08:47:11, Otto Fowler (ottobackwa...@gmail.com)
wrote:

In the end, what I’m thinking is this:

We have an Ambari service that runs the notification -> ZooKeeper.
It reads the ‘registration area’ from ZooKeeper to get its state and what
to watch.
Post 777, when parsers are installed and registered, it is trivial to have my
installer also register the files to watch.

The notification service also has a notification from ZooKeeper for new
registrations.

On a notify event, the ‘notification node’ has its content set to the event
details and time,
which the parser would pick up…. causing the reload
???
profit


This would work for the future script parser etc etc.


On January 26, 2018 at 08:30:32, Simon Elliston Ball (
si...@simonellistonball.com) wrote:

Interesting, so you have an INotify listener to filter events, and then on
given changes, propagate a notification to zookeeper, which then triggers
the reconfiguration event via the curator client in Metron. I kinda like it
given our existing zookeeper methods.

Simon

On 26 Jan 2018, at 13:27, Otto Fowler  wrote:

https://github.com/ottobackwards/hdfs-inotify-zookeeper

Working on a poc



On January 26, 2018 at 07:41:44, Simon Elliston Ball (
si...@simonellistonball.com) wrote:

Should we consider using the Inotify interface to trigger reconfiguration,
in the same way we trigger config changes in curator? We also need to fix
caching and lifecycle in the Grok parser to make the zookeeper changes
propagate pattern changes while we’re at it.

Simon

> On 26 Jan 2018, at 03:16, Casey Stella  wrote:
>
> Right now you have to restart the parser topology.
>
> On Thu, Jan 25, 2018 at 10:15 PM, Otto Fowler 
> wrote:
>
>> At the moment, when a grok file or something changes in HDFS, how do we
>> know? Do we have to restart the parser topology to pick it up?
>> Just trying to clarify for myself.
>>
>> ottO
>>


Re: When things change in hdfs, how do we know

2018-01-26 Thread Otto Fowler
In the end, what I’m thinking is this:

We have an Ambari service that runs the notification -> ZooKeeper.
It reads the ‘registration area’ from ZooKeeper to get its state and what
to watch.
Post 777, when parsers are installed and registered, it is trivial to have my
installer also register the files to watch.

The notification service also has a notification from ZooKeeper for new
registrations.

On a notify event, the ‘notification node’ has its content set to the event
details and time,
which the parser would pick up…. causing the reload
???
profit


This would work for the future script parser etc etc.
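
A rough sketch of that flow, for illustration only. Everything here is an assumption made to keep the example self-contained: `FakeZooKeeper` is an in-memory stand-in rather than a real ZooKeeper or Curator client, `NotificationService` is a hypothetical name, and the `/hypothetical/...` node paths and `/patterns/squid` file are invented. The real PoC lives in the repository linked in this thread.

```python
import json
import time


class FakeZooKeeper:
    """In-memory stand-in for ZooKeeper: node data plus data watchers."""

    def __init__(self):
        self.nodes = {}
        self.watchers = {}

    def watch(self, path, callback):
        self.watchers.setdefault(path, []).append(callback)

    def set(self, path, data):
        self.nodes[path] = data
        for callback in self.watchers.get(path, []):
            callback(data)


class NotificationService:
    """Maps HDFS change events for registered files onto notification nodes."""

    def __init__(self, zk, registration_path="/hypothetical/registrations"):
        self.zk = zk
        self.watched = {}  # HDFS path -> notification node
        # New registrations arrive as notifications from ZooKeeper.
        zk.watch(registration_path, self.on_registration)

    def on_registration(self, data):
        entry = json.loads(data)
        self.watched[entry["hdfs_path"]] = entry["notification_node"]

    def on_hdfs_event(self, hdfs_path, event_type):
        node = self.watched.get(hdfs_path)
        if node is None:
            return False  # not a registered file, so ignore the event
        details = {"path": hdfs_path, "event": event_type, "time": time.time()}
        # Setting the node content is what a watching parser would pick up.
        self.zk.set(node, json.dumps(details))
        return True


zk = FakeZooKeeper()
service = NotificationService(zk)

# The "parser" side: watch the notification node and record reload requests.
reloads = []
zk.watch("/hypothetical/notify/grok", lambda data: reloads.append(json.loads(data)))

# The "installer" side: register a file to watch.
zk.set("/hypothetical/registrations", json.dumps({
    "hdfs_path": "/patterns/squid",
    "notification_node": "/hypothetical/notify/grok",
}))

handled = service.on_hdfs_event("/patterns/squid", "MODIFY")
ignored = service.on_hdfs_event("/tmp/other", "MODIFY")
```

Only the registered path results in a notification node update; the unregistered one is dropped before it reaches ZooKeeper.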


On January 26, 2018 at 08:30:32, Simon Elliston Ball (
si...@simonellistonball.com) wrote:

Interesting, so you have an INotify listener to filter events, and then on
given changes, propagate a notification to zookeeper, which then triggers
the reconfiguration event via the curator client in Metron. I kinda like it
given our existing zookeeper methods.

Simon

On 26 Jan 2018, at 13:27, Otto Fowler  wrote:

https://github.com/ottobackwards/hdfs-inotify-zookeeper

Working on a poc



On January 26, 2018 at 07:41:44, Simon Elliston Ball (
si...@simonellistonball.com) wrote:

Should we consider using the Inotify interface to trigger reconfiguration,
in the same way we trigger config changes in curator? We also need to fix
caching and lifecycle in the Grok parser to make the zookeeper changes
propagate pattern changes while we’re at it.

Simon

> On 26 Jan 2018, at 03:16, Casey Stella  wrote:
>
> Right now you have to restart the parser topology.
>
> On Thu, Jan 25, 2018 at 10:15 PM, Otto Fowler 
> wrote:
>
>> At the moment, when a grok file or something changes in HDFS, how do we
>> know? Do we have to restart the parser topology to pick it up?
>> Just trying to clarify for myself.
>>
>> ottO
>>


Re: When things change in hdfs, how do we know

2018-01-26 Thread Simon Elliston Ball
Interesting, so you have an INotify listener to filter events, and then on 
given changes, propagate a notification to zookeeper, which then triggers the 
reconfiguration event via the curator client in Metron. I kinda like it given 
our existing zookeeper methods. 

Simon

> On 26 Jan 2018, at 13:27, Otto Fowler  wrote:
> 
> https://github.com/ottobackwards/hdfs-inotify-zookeeper 
> 
> 
> Working on a poc
> 
> 
> 
> On January 26, 2018 at 07:41:44, Simon Elliston Ball
> (si...@simonellistonball.com) wrote:
> 
>> Should we consider using the Inotify interface to trigger reconfiguration, 
>> in the same way we trigger config changes in curator? We also need to fix
>> caching and lifecycle in the Grok parser to make the zookeeper changes 
>> propagate pattern changes while we’re at it.  
>> 
>> Simon 
>> 
>> > On 26 Jan 2018, at 03:16, Casey Stella wrote:
>> >  
>> > Right now you have to restart the parser topology. 
>> >  
>> > On Thu, Jan 25, 2018 at 10:15 PM, Otto Fowler wrote:
>> >  
>> >> At the moment, when a grok file or something changes in HDFS, how do we 
>> >> know? Do we have to restart the parser topology to pick it up? 
>> >> Just trying to clarify for myself. 
>> >>  
>> >> ottO 
>> >> 



Re: When things change in hdfs, how do we know

2018-01-26 Thread Otto Fowler
https://github.com/ottobackwards/hdfs-inotify-zookeeper

Working on a poc



On January 26, 2018 at 07:41:44, Simon Elliston Ball (
si...@simonellistonball.com) wrote:

Should we consider using the Inotify interface to trigger reconfiguration,
in the same way we trigger config changes in curator? We also need to fix
caching and lifecycle in the Grok parser to make the zookeeper changes
propagate pattern changes while we’re at it.

Simon

> On 26 Jan 2018, at 03:16, Casey Stella  wrote:
>
> Right now you have to restart the parser topology.
>
> On Thu, Jan 25, 2018 at 10:15 PM, Otto Fowler 
> wrote:
>
>> At the moment, when a grok file or something changes in HDFS, how do we
>> know? Do we have to restart the parser topology to pick it up?
>> Just trying to clarify for myself.
>>
>> ottO
>>


Re: When things change in hdfs, how do we know

2018-01-26 Thread Simon Elliston Ball
Should we consider using the Inotify interface to trigger reconfiguration, in 
the same way we trigger config changes in curator? We also need to fix caching and 
lifecycle in the Grok parser to make the zookeeper changes propagate pattern 
changes while we’re at it. 
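
To make the caching and lifecycle point concrete, here is a minimal sketch of a parser-side cache that is dropped when a change notification arrives. It is an assumption-laden illustration, not Metron's Grok parser: `ReloadablePattern` is a hypothetical class, Python's `re` named groups stand in for Grok patterns, and `invalidate` is whatever the ZooKeeper or inotify callback would end up calling.

```python
import re


class ReloadablePattern:
    """Caches a compiled pattern and recompiles only after a change notice."""

    def __init__(self, load_source):
        self._load_source = load_source  # callable returning the pattern text
        self._compiled = None

    def invalidate(self):
        # Called from the change notification; drops the cached pattern.
        self._compiled = None

    def match(self, line):
        if self._compiled is None:
            self._compiled = re.compile(self._load_source())
        found = self._compiled.match(line)
        return found.groupdict() if found else None


# 'store' plays the role of the pattern file in HDFS.
store = {"pattern": r"(?P<user>\w+) logged in"}
parser = ReloadablePattern(lambda: store["pattern"])

before = parser.match("alice logged in")

# The pattern file changes, but without a notification the cache is stale.
store["pattern"] = r"(?P<user>\w+) logged in from (?P<ip>\S+)"
stale = parser.match("alice logged in from 10.0.0.1")

# After the notification invalidates the cache, the new pattern is used.
parser.invalidate()
fresh = parser.match("alice logged in from 10.0.0.1")
```

The stale result is the behaviour described earlier in the thread, where a topology restart is needed to pick up the change; the invalidation hook is what a notification would drive instead.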

Simon

> On 26 Jan 2018, at 03:16, Casey Stella  wrote:
> 
> Right now you have to restart the parser topology.
> 
> On Thu, Jan 25, 2018 at 10:15 PM, Otto Fowler 
> wrote:
> 
>> At the moment, when a grok file or something changes in HDFS, how do we
>> know?  Do we have to restart the parser topology to pick it up?
>> Just trying to clarify for myself.
>> 
>> ottO
>>