GitHub user justinleet opened a pull request:

    https://github.com/apache/incubator-metron/pull/461

    METRON-733: Remove Geo database from ParserBolt

    To reproduce the original problem, start a parser and make sure there's
    no geo data on HDFS (by default in /apps/metron/geo).
    
    This PR removes geo from metron-parsers entirely, since it shouldn't be
    necessary there at all; it is only pulled in when Stellar needs it.
    
    ### Testing
    To validate this, I ran through the squid demo with the geo file missing,
    which works as expected (no exception thrown).  In addition,
    ParserBoltTest is updated to no longer reference the test Geo DB data (and
    runs fine without it).
    
    To ensure that Stellar GEO_GET works as expected in a parser, quick-dev was
    spun up.  The steps for squid were followed, but with a custom parser
    config:
    ```
    {
      "parserClassName": "org.apache.metron.parsers.GrokParser",
      "sensorTopic": "squid",
      "parserConfig": {
        "grokPath": "/patterns/squid",
        "patternLabel": "SQUID_DELIMITED",
        "timestampField": "timestamp"
      },
      "fieldTransformations": [
        {
          "transformation": "STELLAR",
          "output": [ "geo_test" ],
          "config": {
            "geo_test": "GEO_GET(ip_dst_addr)"
          }
        }
      ]
    }
    ```
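
A quick structural check on a config like the one above, before pushing it, could look like the following sketch. It is not part of this PR: the helper name `missing_stellar_outputs` is hypothetical, and it assumes only the JSON layout shown above (a `fieldTransformations` list whose STELLAR entries carry `output` and `config`). It does not evaluate any Stellar expression.

```python
import json

# Parser config from the PR description, trimmed to the part being checked.
PARSER_CONFIG = """
{
  "parserClassName": "org.apache.metron.parsers.GrokParser",
  "sensorTopic": "squid",
  "fieldTransformations": [
    {
      "transformation": "STELLAR",
      "output": ["geo_test"],
      "config": {"geo_test": "GEO_GET(ip_dst_addr)"}
    }
  ]
}
"""


def missing_stellar_outputs(config_text):
    """Return STELLAR output fields that have no expression in "config".

    Hypothetical helper: it inspects the JSON structure only.
    """
    config = json.loads(config_text)
    missing = []
    for transformation in config.get("fieldTransformations", []):
        if transformation.get("transformation") != "STELLAR":
            continue
        expressions = transformation.get("config", {})
        for field in transformation.get("output", []):
            if field not in expressions:
                missing.append(field)
    return missing


if __name__ == "__main__":
    # An empty list means every declared output has an expression.
    print(missing_stellar_outputs(PARSER_CONFIG))
```

An empty result only says the config is internally consistent; whether GEO_GET resolves still depends on the geo file being present, as described in this PR.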
    
    Either update global.json with a valid geo.hdfs.file or run
    `/usr/metron/0.3.1/bin/geo_enrichment_load.sh -z node1:2181 -r /apps/metron/geo/default/`
    to place the file in the default spot (instead of a timestamped spot).
    This is necessary to ensure that the push doesn't clobber geo configs.
    
    The resulting data in the index includes:
    ```
    {
    ...
                   "geo_test": {
                      "country": "US",
                      "dmaCode": "807",
                      "city": "San Francisco",
                      "postalCode": "94107",
                      "latitude": "37.7697",
                      "location_point": "37.7697,-122.3933",
                      "locID": "5391959",
                      "longitude": "-122.3933"
                   },
    ...
                   "ip_dst_addr": "151.101.192.73",
    ...
    }
    ```
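
For anyone repeating this validation, the check on the indexed message can be sketched as below. This is an illustration only, not code from the PR: `check_geo_field` and `EXPECTED_GEO_KEYS` are hypothetical names, the key set is simply the one visible in the sample above (other IPs may legitimately lack some of these fields), and the sample dict reproduces only the non-elided parts of the message.

```python
# Keys of the GEO_GET result as they appear in the sample above (an
# assumption: not every IP will necessarily yield all of them).
EXPECTED_GEO_KEYS = {
    "country", "dmaCode", "city", "postalCode",
    "latitude", "location_point", "locID", "longitude",
}


def check_geo_field(message, field="geo_test"):
    """Return a list of problems found in message[field]; empty means OK."""
    geo = message.get(field)
    if not isinstance(geo, dict):
        return [field + " is missing or not an object"]
    problems = ["missing key: " + key
                for key in sorted(EXPECTED_GEO_KEYS - set(geo))]
    if not problems:
        # location_point is expected to be "latitude,longitude".
        expected_point = geo["latitude"] + "," + geo["longitude"]
        if geo["location_point"] != expected_point:
            problems.append("location_point does not match latitude,longitude")
    return problems


# The non-elided part of the indexed message from the PR description.
SAMPLE = {
    "geo_test": {
        "country": "US",
        "dmaCode": "807",
        "city": "San Francisco",
        "postalCode": "94107",
        "latitude": "37.7697",
        "location_point": "37.7697,-122.3933",
        "locID": "5391959",
        "longitude": "-122.3933",
    },
    "ip_dst_addr": "151.101.192.73",
}

if __name__ == "__main__":
    print(check_geo_field(SAMPLE))
```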

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/justinleet/incubator-metron geo_profiler_fix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-metron/pull/461.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #461
    
----

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
