GitHub user ottobackwards reopened a pull request:
https://github.com/apache/metron/pull/530
METRON-777 Metron Extension System and Parser Extensions
## Contributor Comments
This PR introduces an extension system for Metron and refactors
the Metron parsers on top of it. This is the base work for METRON-258 -
Sideloading parsers, which is to follow, as it enables the creation and
management of extensions outside of the Metron codebase. The work for enabling
side loading is the ability to install and deploy 3rd party extensions/parsers.
There is a lot that can be done with this, but I could nibble at it
forever, and I'd like to get feedback and improvements going. For example,
there is still more documentation work that can be done.
The areas of change:
### Travis
- Building requires a VM instead of a container, which requires sudo
### Metron Maven Bundle Plugin
- adaptation of the nifi plugin
- more configurable wrt file extension/dependency and metadata naming
- new pre-build step on clean systems to install plugin
### bundle-lib
My goal here was not to make any radical changes. If you want the diff, I
would suggest cloning the latest nifi and diffing this module with
[nifi-nar-utils](https://github.com/apache/nifi/tree/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-nar-utils)
- adaptation of nifi-nar-utils to be used outside of the nifi project
- rudimentary extensibility to allow configuration and injection of
service types and other things that were hard coded to nifi
- refactored from File based to VFS based
- introduced class cluster for FileUtils to allow for specialized HDFS
file functionality ( HDFS with VFS is read only )
- rebranding to Bundle from Nar ( although the lib and the plugin allow
that to be configured now )
- added capability to the properties class to write to stream, adapted to
uri from paths
- added integration tests for hdfs
- changed to be ClassIndex based instead of ServiceLoader based. ServiceLoader
is slower, and Casey's ClassIndex work is great. This also removes the manual
maintenance of the service file that NAR required.
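
To illustrate the service-file maintenance that the ClassIndex change removes, here is a minimal, stdlib-only sketch. `DemoParser` and `UnregisteredParser` are hypothetical names, not Metron or NiFi classes. The implementation is on the classpath, but because nothing lists it under `META-INF/services`, `ServiceLoader` does not find it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class ServiceLoaderSketch {
    /** Hypothetical stand-in for a parser interface; not Metron's MessageParser. */
    public interface DemoParser {
        String name();
    }

    /** On the classpath, but not registered in any META-INF/services file. */
    public static class UnregisteredParser implements DemoParser {
        @Override
        public String name() {
            return "unregistered";
        }
    }

    /** Returns the names of every DemoParser provider that ServiceLoader can see. */
    public static List<String> discover() {
        List<String> names = new ArrayList<>();
        for (DemoParser parser : ServiceLoader.load(DemoParser.class)) {
            names.add(parser.name());
        }
        return names;
    }

    public static void main(String[] args) {
        // With no provider file on the classpath, nothing is discovered.
        System.out.println("providers found: " + discover().size());
    }
}
```

With ServiceLoader, each new implementation needs a hand-written entry in the provider file. As I understand it, an index generated at build time (the approach ClassIndex takes) avoids that step.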
### Metron Maven Parser Extension Archetype
- locally installable archetype for creating Metron Parsers
- `<provided>` dependencies instead of shading
- builds Bundle
- creates an assembly with bundle and configuration
- configuration for parsers now includes all parser related configurations
( except ES and Logrotate )
- Includes sample data for testing, global config etc. ( such that you
don't need metron code to build and test ).
- Can be used with configuration-only parsers, so that you can still unit
and integration test them and use a common deployment
- Creates documentation readme.
### Metron-Platform/Metron-Extensions / Metron-Parser-Extensions/**
- Module area for extensions, first extension type is parsers
- All parsers re-based on archetype generated projects
- All parser test data/ configuration located with parser ( see above )
- Each parser has a readme that should be filled out, but I didn't do that
[note - during integration tests, the parsers are found in the system
classloader, not the bundles, because of the way the tests are tied together]
### Bundle Extension Test
- added a separate test to test loading parsers without any dependency
### Metron-Parsers
- Removed all parsers and their tests etc except CSV, JSON, GROK
- Still shaded, still the storm loaded jar
- extended or fixed tests so that they work when derived outside of the code tree
- Parser bolt no longer takes a MessageParser<> instance, it loads it from
the extension/bundle system
- MessageParser<> annotated for ClassIndex
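
The parser bolt change, where the parser is looked up by name instead of being handed in as an instance, can be pictured roughly as follows. This is a stdlib-only sketch and not the actual bolt code: `DemoParser`, `UpperCaseParser` and `load` are hypothetical, and the class loader passed in stands in for whatever loader the bundle system provides.

```java
import java.util.Locale;

public class ParserLoadingSketch {
    /** Hypothetical stand-in for a parser interface. */
    public interface DemoParser {
        String parse(String raw);
    }

    /** Hypothetical parser with a public no-arg constructor. */
    public static class UpperCaseParser implements DemoParser {
        @Override
        public String parse(String raw) {
            return raw.toUpperCase(Locale.ROOT);
        }
    }

    /** Loads a parser by class name through the given class loader. */
    public static DemoParser load(String className, ClassLoader loader) {
        try {
            Class<? extends DemoParser> type =
                Class.forName(className, true, loader).asSubclass(DemoParser.class);
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Cannot load parser " + className, e);
        }
    }

    public static void main(String[] args) {
        // Assumption: in a real deployment the loader would come from a bundle.
        DemoParser parser = load(UpperCaseParser.class.getName(),
                                 ParserLoadingSketch.class.getClassLoader());
        System.out.println(parser.parse("squid log line"));
    }
}
```

Passing the class loader explicitly keeps the lookup independent of where the parser jar lives, which is the point of moving parsers out of the shaded jar.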
### Metron-configuration
- changes to support new parser locations
- added functionality to load and store bundle.properties to zk
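
As a rough picture of what loading and storing bundle.properties as a stream involves, the sketch below round-trips a `java.util.Properties` object through a byte array, which is the kind of payload a ZooKeeper node holds. It uses only the JDK. The class name, the property key and the value are made up for illustration and are not the real bundle.properties contents.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Properties;

public class BundlePropertiesSketch {
    /** Serializes properties to bytes, e.g. for storing in a ZooKeeper node. */
    public static byte[] toBytes(Properties props) {
        try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            props.store(out, "bundle properties (sketch)");
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Rebuilds properties from bytes previously produced by toBytes. */
    public static Properties fromBytes(byte[] data) {
        Properties props = new Properties();
        try (ByteArrayInputStream in = new ByteArrayInputStream(data)) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties original = new Properties();
        // Hypothetical key and value, for illustration only.
        original.setProperty("demo.bundle.library.directory", "hdfs://example/extension_lib");
        Properties restored = fromBytes(toBytes(original));
        System.out.println("round trip ok: " + restored.equals(original));
    }
}
```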
### Metron Tests
- Extended to work with relative path / formatted paths
### RPM-Build
- Copy all the parser extensions
- Include in the spec
/usr/metron/V/ now has new directories for extensions:
- /extension_etc/PARSER_NAME/ -> that parser's configuration
- /extension_alt_etc/ -> location for 3rd party extension configuration
- /extension_lib -> location on disk for rpm to place bundles
- /extension_alt_lib -> location on disk for staging 3rd party bundles
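
For anyone scripting against this layout, the directory names above can be resolved as in the sketch below. The class and method names are hypothetical and only the directory names come from this PR. `V` is kept as the version placeholder used above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class ExtensionDirsSketch {
    /** Configuration directory for one parser shipped with Metron. */
    public static Path parserConfigDir(String metronHome, String parserName) {
        return Paths.get(metronHome, "extension_etc", parserName);
    }

    /** Bundle directory: rpm-installed bundles or staged 3rd party bundles. */
    public static Path bundleDir(String metronHome, boolean thirdParty) {
        return Paths.get(metronHome, thirdParty ? "extension_alt_lib" : "extension_lib");
    }

    public static void main(String[] args) {
        String home = "/usr/metron/V"; // V is the installed version, left as a placeholder
        System.out.println(parserConfigDir(home, "squid"));
        System.out.println(bundleDir(home, false));
        System.out.println(bundleDir(home, true));
    }
}
```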
### METRON-SERVICE ambari
- Load zk configurations for parsers from their location
- fills out the properties template and deploys to hdfs
- create HDFS directories
- deploy/copy bundles to HDFS
### the metron workflow that this enables
We need a new parser:
* create with archetype under metron-extensions/metron-parser-extensions
* implement including tests and test data, all configurations
* add to the copy-resources of RPM-Docker pom
* add to the spec file
* add to the all_parsers variable in params
  - this will get it installed but not started, no ES no log rotate
* add to parsers variable in the env.xml to get it to start as well ( still
no ES or Log rotate )
* other steps to get the ES template integrated with indexing scripts and
log rotate with ansible
I have been working in Full Dev to get this going, and I believe it is
working enough to get this started.
At the end of vagrant up with full dev, you should have data in kibana, as
if nothing had changed ;)
There are issues however:
I have not integrated this with the Metron Docker project, I'm not sure how
yet.
I have fixed Metron-Interface to get the test to run, but I think work
still needs to be done there.
The next steps here are follow-ons for installing parsers from the UI.
Testing:
- Build with tests works
- Start full_dev_platform and confirm parsers are installed and working
as expected
- Management UI confirm that all 14 parsers are installed and available
- Ensure the UI can spin up the squid parser
- Follow the
metron-platform/metron-extensions/metron-parser-extensions/ADDING_SYSTEM_PARSERS.md
guide
I am sure I missed many things, and that there are things that could be
better. Thank you in advance for your review.
## Pull Request Checklist
Thank you for submitting a contribution to Apache Metron (Incubating).
Please refer to our [Development
Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235)
for the complete guide to follow for contributions.
Please refer also to our [Build Verification
Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview)
for complete smoke testing guides.
In order to streamline the review of the contribution we ask you follow
these guidelines and ask you to double check the following:
### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not, one needs to
be created at [Metron
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA
number you are trying to resolve? Pay particular attention to the hyphen "-"
character.
- [x] Has your PR been rebased against the latest commit within the target
branch (typically master)?
### For code changes:
- [x] Have you included steps to reproduce the behavior or problem that is
being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified
and tested manually?
- [x] Have you ensured that the full suite of tests and checks have been
executed in the root incubating-metron folder via:
```
mvn -q clean integration-test install && build_utils/verify_licenses.sh
```
- [x] Have you written or updated unit tests and/or integration tests to
verify your changes?
- [x] If adding new dependencies to the code, are these dependencies
licensed in a way that is compatible for inclusion under [ASF
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [x] Have you verified the basic functionality of the build by building
and running locally with Vagrant full-dev environment or the equivalent?
### For documentation related changes:
- [x] Have you ensured that the format looks appropriate for the output in
which it is rendered by building and verifying the site-book? If not, then run
the following commands and then verify changes via
`site-book/target/site/index.html`:
```
cd site-book
bin/generate-md.sh
mvn site:site
```
#### Note:
Please ensure that once the PR is submitted, you check travis-ci for build
issues and submit an update to your PR as soon as possible.
It is also recommended that [travis-ci](https://travis-ci.org) is set up for
your personal repository such that your branches are built there before
submitting a pull request.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ottobackwards/metron METRON-777
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/metron/pull/530.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #530
----
commit 864d320d91c522dfc2eb63fc12341f316a3f8952
Author: Otto Fowler <[email protected]>
Date: 2017-03-17T04:56:49Z
Metron Extension system
Based on Apache Nifi Nars
NAR changes
* new lib , rebrand to bundles from NAR
* port to VFS/FileObject from File based
* ability to set property values
* Rework FileUtils so that you can derive and override
* added initializers to set 'classes' that we care about instead of hard
coding them, still needs defaults
* added components nec. for integration tests ( do not want dep. on
metron-* )
* VFSClassloader for NarClassLoader
* Hdfs based integration test version of unpacknars tests
* HDFS ( filesystem ) based fileutilities to cover for writes to hdfs,
since VFS is currently R/O HDFS
* modified plugin to support configuration of outputs
* use class index not service loader ( both subclass and annotated
supported )
Archetype
* Parser Extension archetype
* includes all configuration
* creates tar.gz with bundle and configuration
* class index support ( automatic generation )
Extensions
* new extensions modules
* parser
* archetype built module for each parser type
* support for configuration only parsers with tests
Parsers
* moved all but json, csv, grok to extensions
* Bolt now loads from bundle properties
Deployment
* rpms for parsers
* create extension directories
* ambari initializes zookeeper per parser
* ambari creates hdfs directories
* ISSUE: Writing to hdfs
Rest-API
* only test against parsers in metron-parsers
* still needs integration
commit 850154cb086bd9f825daa6797b575cde95392a87
Author: Otto Fowler <[email protected]>
Date: 2017-04-14T12:55:37Z
fix package names
commit e8caaf51e3f3d640e9ec1c3c548bd722d595687d
Author: Otto Fowler <[email protected]>
Date: 2017-04-17T12:19:02Z
Merge remote-tracking branch 'apache/master' into METRON-777
commit c18cbe3a8847fbed7b5f6ba9182b58f01d142c2c
Author: Otto Fowler <[email protected]>
Date: 2017-04-23T16:14:15Z
merge apache/master
commit 7760425b02142f01f37d58876aa09fb88b197c82
Author: Otto Fowler <[email protected]>
Date: 2017-04-27T10:45:37Z
Merge remote-tracking branch 'apache/master' into METRON-777
commit bc7cc2d823d54be58fae13ef94f671457d4c10f0
Author: Otto Fowler <[email protected]>
Date: 2017-04-29T04:05:41Z
use correct testing bundle
commit c25585d10529ca2048d2ed0df4d8dbd904d3db17
Author: Otto Fowler <[email protected]>
Date: 2017-04-29T04:07:11Z
use the configured extension list, do not hardcode
commit 158f463f121cbca2cf2c4d3c1cd9cf7b707e1d23
Author: Otto Fowler <[email protected]>
Date: 2017-04-29T12:58:41Z
Merge remote-tracking branch 'apache/master' into METRON-777
commit e21efb89a37d034e36cb82dab6c0b862cad3777b
Author: Otto Fowler <[email protected]>
Date: 2017-04-30T13:16:41Z
make dependency explicit, I think parent version is not what we think it is
when running from archetype, causing the extension version to be used for this
dependency
commit ac92d7b88404f8b8bc825c71a4aeee5aa015d757
Author: Otto Fowler <[email protected]>
Date: 2017-04-30T13:44:58Z
instead of hard-coding in metron-parsers-extensions, overload in the
archetype to use metronVersion
commit fd021e42a2cdb80676426b65ae227ac9f44fcd6a
Author: Otto Fowler <[email protected]>
Date: 2017-05-03T14:34:31Z
Merge remote-tracking branch 'apache/master' into METRON-777
commit 9cafe970b40d551a846fc442ac121591ab1d0d6d
Author: Otto Fowler <[email protected]>
Date: 2017-05-04T02:18:29Z
specify the plugin to fix dependency problem
commit de48845634ae1cecb258175326592a28f5b3f8fe
Author: Otto Fowler <[email protected]>
Date: 2017-05-04T02:40:55Z
second attempt to fix plugin errors with jacoco
commit 995a4d746fa21400a55e9571957878ef2d5b48d8
Author: Otto Fowler <[email protected]>
Date: 2017-05-04T13:46:58Z
Merge remote-tracking branch 'apache/master' into METRON-777
commit 5f74fc3a15fe5e56f9e3483fa4c25961ffb284dd
Author: Otto Fowler <[email protected]>
Date: 2017-05-04T20:25:18Z
be sure to clear out before and after, we cannot assume order of maven build
commit a3e63555e4e76486be58ddd6f818006ff42cdd1f
Author: Otto Fowler <[email protected]>
Date: 2017-05-05T03:42:54Z
Merge remote-tracking branch 'apache/master' into METRON-777
commit 1e0e30562b4b526d9512bb1dc06077674d6b3277
Author: Otto Fowler <[email protected]>
Date: 2017-05-05T18:18:32Z
update archetype based on changes to loading and configuration
commit e3b71aed74271cf255be5fad072a99506f553d2e
Author: Otto Fowler <[email protected]>
Date: 2017-05-06T16:08:58Z
use simple json and not the JSONUtil Instance, there are issues in storm
loading
commit 6c0201b853ffd079f02589011c57cb6a960475e8
Author: Otto Fowler <[email protected]>
Date: 2017-05-09T13:55:31Z
Merge remote-tracking branch 'apache/master' into METRON-777
commit ab0aad849319117dd3d964247861cf09bc3e7822
Author: Otto Fowler <[email protected]>
Date: 2017-05-10T12:57:21Z
Merge remote-tracking branch 'apache/master' into METRON-777
commit 5565ad661f4143234b67230f7cfd420ea1bce91a
Author: Otto Fowler <[email protected]>
Date: 2017-05-10T19:47:58Z
Merge remote-tracking branch 'apache/master' into METRON-777
commit 225fb4e38d6e5ec5eff877be2a84c8fd29af0c3e
Author: Otto Fowler <[email protected]>
Date: 2017-05-12T18:12:03Z
Merge remote-tracking branch 'apache/master' into METRON-777
commit 97c1be132c114a0f3f729e5f9887adb5aa49f582
Author: Otto Fowler <[email protected]>
Date: 2017-05-17T02:22:33Z
Merge remote-tracking branch 'apache/master' into METRON-777
commit b73f35fc25982556ad469de409e15c72b0305a8a
Author: Otto Fowler <[email protected]>
Date: 2017-05-19T15:33:30Z
Merge remote-tracking branch 'apache/master' into METRON-777
commit eb2062be4c14d0549a6ce2504dbbe29ea5510141
Author: Otto Fowler <[email protected]>
Date: 2017-05-22T11:58:50Z
Merge remote-tracking branch 'apache/master' into METRON-777
commit 382a29e18f9ab161f0a21d92c1ba3ce6f2601eeb
Author: Otto Fowler <[email protected]>
Date: 2017-05-22T12:01:38Z
Merge remote-tracking branch 'apache/master' into METRON-777
commit 90a51deb14078b57966f3ad8e2f7e000cbb74ca3
Author: Jon Zeolla <[email protected]>
Date: 2017-05-25T21:52:55Z
Merge branch 'METRON-777' of https://github.com/ottobackwards/metron into
METRON-777
commit 2e66ea397ffc82caf6a76d47ba3651754361dd51
Author: Jon Zeolla <[email protected]>
Date: 2017-05-25T22:07:23Z
Trivial documentation changes
commit 0dd8cf8c9ab89b43d543de8b5237010a948540a9
Author: Otto Fowler <[email protected]>
Date: 2017-05-26T01:58:05Z
Merge remote-tracking branch 'apache/master' into METRON-777
commit a2634f31d2b16f9ece00dbae49a8b0265f091cfd
Author: Otto Fowler <[email protected]>
Date: 2017-05-26T02:00:07Z
Merge branch 'METRON-777' of https://github.com/JonZeolla/metron into
jz-metron-777
----