[
https://issues.apache.org/jira/browse/METRON-733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877300#comment-15877300
]
ASF GitHub Bot commented on METRON-733:
---------------------------------------
GitHub user justinleet opened a pull request:
https://github.com/apache/incubator-metron/pull/461
METRON-733: Remove Geo database from ParserBolt
To reproduce the original problem, start a parser while there is no geo data
on HDFS (by default in /apps/metron/geo).
This PR removes geo from metron-parsers entirely, since parsers shouldn't need
it at all; it is only pulled in when a Stellar function actually uses it.
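The idea behind the change (no geo initialization at parser startup, with loading deferred until a geo lookup is actually made) can be illustrated with a minimal Java sketch. This is not the Metron code: `GeoDatabase`, `LazyGeoFunction`, and `Parser` are hypothetical stand-ins, the "file present" flag simulates the HDFS file, and the single IP-to-country entry is borrowed from the sample output in this PR.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

public class LazyGeoSketch {

    /** Hypothetical stand-in for the geo database; loading fails if the file is absent. */
    static final class GeoDatabase {
        private final Map<String, String> countryByIp;

        private GeoDatabase(Map<String, String> countryByIp) {
            this.countryByIp = countryByIp;
        }

        /** Simulates reading the database file; throws when the file is missing. */
        static GeoDatabase load(boolean filePresent) {
            if (!filePresent) {
                throw new IllegalStateException("geo database file not found");
            }
            // Single illustrative entry, taken from the sample output in this PR.
            return new GeoDatabase(Collections.singletonMap("151.101.192.73", "US"));
        }

        Optional<String> country(String ip) {
            return Optional.ofNullable(countryByIp.get(ip));
        }
    }

    /** Hypothetical GEO_GET-like function that loads the database on first use only. */
    static final class LazyGeoFunction {
        private final Supplier<GeoDatabase> loader;
        private GeoDatabase db; // stays null until the first lookup

        LazyGeoFunction(Supplier<GeoDatabase> loader) {
            this.loader = loader;
        }

        synchronized Optional<String> apply(String ip) {
            if (db == null) {
                db = loader.get(); // the only place the database is loaded
            }
            return db.country(ip);
        }
    }

    /** Hypothetical parser whose prepare() does not touch the geo database. */
    static final class Parser {
        private final LazyGeoFunction geo;
        private final boolean usesGeo;

        Parser(LazyGeoFunction geo, boolean usesGeo) {
            this.geo = geo;
            this.usesGeo = usesGeo;
        }

        void prepare() {
            // Intentionally empty: no eager geo initialization at startup.
        }

        String parse(String ip) {
            if (!usesGeo) {
                return ip;
            }
            return ip + " -> " + geo.apply(ip).orElse("unknown");
        }
    }

    public static void main(String[] args) {
        // Geo file missing and geo unused: the parser still starts and parses.
        Parser plain = new Parser(new LazyGeoFunction(() -> GeoDatabase.load(false)), false);
        plain.prepare();
        System.out.println(plain.parse("151.101.192.73"));

        // Geo file present and geo used: the database is loaded on the first lookup.
        Parser withGeo = new Parser(new LazyGeoFunction(() -> GeoDatabase.load(true)), true);
        withGeo.prepare();
        System.out.println(withGeo.parse("151.101.192.73"));
    }
}
```

In this sketch a missing file only becomes an error when a geo lookup is actually attempted, which mirrors the behavior the PR is aiming for.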
### Testing
To validate this, I ran through the squid demo with the geo file missing,
which works as expected (no exception thrown). In addition, ParserBoltTest
no longer references the test Geo DB data (and runs fine without it).
To ensure that Stellar GEO_GET works as expected in a parser, quick-dev was
spun up. The steps for squid were followed, but with a custom parser config:
```
{
  "parserClassName": "org.apache.metron.parsers.GrokParser",
  "sensorTopic": "squid",
  "parserConfig": {
    "grokPath": "/patterns/squid",
    "patternLabel": "SQUID_DELIMITED",
    "timestampField": "timestamp"
  },
  "fieldTransformations": [
    {
      "transformation": "STELLAR",
      "output": [ "geo_test" ],
      "config": {
        "geo_test": "GEO_GET(ip_dst_addr)"
      }
    }
  ]
}
```
Either update global.json with a valid geo.hdfs.file or run
`/usr/metron/0.3.1/bin/geo_enrichment_load.sh -z node1:2181 -r /apps/metron/geo/default/`
to place the file in the default spot (instead of a timestamped spot). This is
necessary to ensure that the push doesn't clobber geo configs.
The resulting data in the index includes:
```
{
  ...
  "geo_test": {
    "country": "US",
    "dmaCode": "807",
    "city": "San Francisco",
    "postalCode": "94107",
    "latitude": "37.7697",
    "location_point": "37.7697,-122.3933",
    "locID": "5391959",
    "longitude": "-122.3933"
  },
  ...
  "ip_dst_addr": "151.101.192.73",
  ...
}
```
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/justinleet/incubator-metron geo_profiler_fix
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-metron/pull/461.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #461
----
> Remove Geo database from ParserBolt
> -----------------------------------
>
> Key: METRON-733
> URL: https://issues.apache.org/jira/browse/METRON-733
> Project: Metron
> Issue Type: Bug
> Reporter: Justin Leet
> Assignee: Justin Leet
>
> The ParserBolt inits the Geo DB in its prepare() method. Parsers, unlike
> enrichments, do not use geo as a base capability. They should be able to run
> without the DB file existing in HDFS. This init causes issues if the file is
> missing, even if GEO_GET is unused in the parser definition.
> This change should preserve the ability of Parsers to employ Stellar's
> GEO_GET. The Stellar function already handles that init, so it shouldn't be
> an issue.