[
https://issues.apache.org/jira/browse/METRON-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16446311#comment-16446311
]
ASF GitHub Bot commented on METRON-1533:
----------------------------------------
GitHub user nickwallen opened a pull request:
https://github.com/apache/metron/pull/1000
METRON-1533 Create KAFKA_FIND Stellar Function
I created a `KAFKA_FIND` function that allows you to provide a filter
expression so that only messages satisfying a condition are returned. For
example...
- Find a message that has been enriched with geolocation data.
```
KAFKA_FIND('indexing', m -> MAP_EXISTS('geo.city', m))
```
- Find a Bro message.
```
KAFKA_FIND('indexing', m -> MAP_GET('source.type', m) == 'bro')
```
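To make the filter semantics concrete, here is a minimal sketch in Python rather than Stellar. It is not the implementation in this PR. The function name `kafka_find`, the `count` parameter, and the sample messages are all assumptions for illustration, and an in-memory list stands in for the Kafka topic.

```python
import json

def kafka_find(raw_messages, predicate, count=1):
    """Return up to `count` parsed messages for which `predicate` is true.

    Hypothetical stand-in for the idea behind KAFKA_FIND: messages that
    do not satisfy the filter are skipped rather than returned.
    """
    found = []
    for raw in raw_messages:
        message = json.loads(raw)
        if predicate(message):
            found.append(message)
            if len(found) >= count:
                break
    return found

# Made-up sample data standing in for the 'indexing' topic.
indexing = [
    '{"source.type": "snort", "ip_src_addr": "10.0.0.1"}',
    '{"source.type": "bro", "ip_src_addr": "72.34.49.86"}',
    '{"source.type": "bro", "ip_src_addr": "72.34.49.86", "geo.city": "Los Angeles"}',
]

# Roughly: m -> MAP_GET('source.type', m) == 'bro'
bro = kafka_find(indexing, lambda m: m.get("source.type") == "bro")

# Roughly: m -> MAP_EXISTS('geo.city', m)
enriched = kafka_find(indexing, lambda m: "geo.city" in m)
```

In this sketch the first filter skips the Snort message and stops at the first Bro message, while the second skips everything that lacks a `geo.city` field.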
## Use Case
When creating enrichments, I often find that I want to validate that the
enrichment I just created was successful on the live, incoming stream of
telemetry. My workflow looks something like this.
1. Create and test the enrichment in the Stellar REPL.
```
[Stellar]>>> ip_src_addr := "72.34.49.86"
72.34.49.86
[Stellar]>>> geo := GEO_GET(ip_src_addr)
{country=US, dmaCode=803, city=Los Angeles, postalCode=90014,
latitude=34.0438, location_point=34.0438,-118.2512, locID=5368361,
longitude=-118.2512}
```
2. That looks good to me. Now let's add that to my Bro telemetry.
```
[Stellar]>>> conf := SHELL_EDIT(conf)
{
  "enrichment" : {
    "fieldMap": {
      "stellar": {
        "config": [
          "geo := GEO_GET(ip_src_addr)"
        ]
      }
    }
  },
  "threatIntel": {
  }
}
[Stellar]>>> CONFIG_PUT("ENRICHMENTS", conf, "bro")
```
3. It looks like that worked, but did that really work?
At this point, I would run `KAFKA_GET` as many times as it takes to
retrieve a Bro message. I just have to hope that the enrichment worked and
that I get lucky enough to pull down a Bro message (as opposed to a message
from a different sensor).
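A rough way to quantify "getting lucky" is to assume each pull is independent and matches with a fixed probability. The 10% share of Bro traffic used below is a made-up figure, not a measurement from any deployment.

```python
def expected_pulls(p_match):
    """Mean number of independent pulls until the first matching message."""
    if not 0 < p_match <= 1:
        raise ValueError("p_match must be in (0, 1]")
    return 1 / p_match

def chance_of_match_within(p_match, pulls):
    """Probability that at least one of `pulls` independent pulls matches."""
    return 1 - (1 - p_match) ** pulls

# Hypothetical: 1 in 10 messages on the topic comes from Bro.
average = expected_pulls(0.1)               # about 10 pulls on average
within_5 = chance_of_match_within(0.1, 5)   # about a 41% chance in 5 pulls
```

Under that assumption, five manual pulls would miss Bro entirely more often than not, which is the friction a filter removes.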
I would rather have a function that lets me pull back only the messages
that I care about. In this case, I could retrieve only Bro messages.
```
KAFKA_FIND('indexing', m -> MAP_GET('source.type', m) == 'bro')
```
Or I could look for messages that contain geolocation data.
```
KAFKA_FIND('indexing', m -> MAP_EXISTS('geo.city', m))
```
### Changes
* Created the `KAFKA_FIND` function along with unit tests.
* Defined the global property `bootstrap.servers` by default during the
MPack install. This allows all of the `KAFKA_*` functions to work
out-of-the-box. Previously, a user would have to manually define this value
before using any of the `KAFKA_*` functions.
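As an illustration of why a default matters, the sketch below shows one way client settings could be resolved from a global configuration. It is a hypothetical Python stand-in, not Metron's code. The function name `kafka_properties`, the `overrides` parameter, and the broker address `node1:6667` are assumptions.

```python
def kafka_properties(global_config, overrides=None):
    """Resolve Kafka client settings: global config first, then overrides.

    Hypothetical helper; it only illustrates that a missing
    `bootstrap.servers` leaves the client with nowhere to connect.
    """
    props = {}
    if "bootstrap.servers" in global_config:
        props["bootstrap.servers"] = global_config["bootstrap.servers"]
    props.update(overrides or {})
    if "bootstrap.servers" not in props:
        raise KeyError("bootstrap.servers is not defined in the global config")
    return props

# With a default in place (assumed address), nothing extra is needed.
defaults = kafka_properties({"bootstrap.servers": "node1:6667"})
```

Without the default, the same call with an empty global configuration raises an error, which corresponds to the manual step that was previously required.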
### Pull Request Checklist
- [ ] Is there a JIRA ticket associated with this PR? If not one needs to
be created at [Metron
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [ ] Does your PR title start with METRON-XXXX where XXXX is the JIRA
number you are trying to resolve? Pay particular attention to the hyphen "-"
character.
- [ ] Has your PR been rebased against the latest commit within the target
branch (typically master)?
- [ ] Have you included steps to reproduce the behavior or problem that is
being changed or addressed?
- [ ] Have you included steps or a guide to how the change may be verified
and tested manually?
- [ ] Have you ensured that the full suite of tests and checks have been
executed in the root metron folder via:
- [ ] Have you written or updated unit tests and/or integration tests to
verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies
licensed in a way that is compatible for inclusion under [ASF
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] Have you verified the basic functionality of the build by building
and running locally with Vagrant full-dev environment or the equivalent?
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/nickwallen/metron METRON-1533
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/metron/pull/1000.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1000
----
commit 5cc3cc5d541e95a5fcb9a49eed6b291a48e6cb59
Author: Nick Allen <nick@...>
Date: 2018-04-20T20:18:16Z
METRON-1533 Create KAFKA_FIND Stellar Function
----
> Create KAFKA_FIND Stellar Function
> ----------------------------------
>
> Key: METRON-1533
> URL: https://issues.apache.org/jira/browse/METRON-1533
> Project: Metron
> Issue Type: Improvement
> Reporter: Nick Allen
> Assignee: Nick Allen
> Priority: Minor
>
> When creating enrichments, I often find that I want to validate that the
> enrichment I just created was successful on the live, incoming stream of
> telemetry. My workflow looks something like this.
> 1. Create and test the enrichment in the Stellar REPL.
> {code:java}
> [Stellar]>>> ip_src_addr := "72.34.49.86"
> 72.34.49.86
> [Stellar]>>> geo := GEO_GET(ip_src_addr)
> {country=US, dmaCode=803, city=Los Angeles, postalCode=90014,
> latitude=34.0438, location_point=34.0438,-118.2512, locID=5368361,
> longitude=-118.2512}
> {code}
> 2. That looks good to me. Now let's add that to my Bro telemetry.
> {code:java}
> [Stellar]>>> conf := SHELL_EDIT(conf)
> {
>   "enrichment" : {
>     "fieldMap": {
>       "stellar": {
>         "config": [
>           "geo := GEO_GET(ip_src_addr)"
>         ]
>       }
>     }
>   },
>   "threatIntel": {
>   }
> }
> [Stellar]>>> CONFIG_PUT("ENRICHMENTS", conf, "bro")
> {code}
>
> 3. It looks like that worked, but did that really work?
> At this point, I would run KAFKA_GET as many times as it takes to retrieve a
> Bro message. I just have to hope that the enrichment worked and that I get
> lucky enough to pull down a Bro message (as opposed to a message from a
> different sensor).
>
> I would rather have a function that lets me pull back only the messages that
> I care about. In this case, I could retrieve only Bro messages.
> {code:java}
> KAFKA_FIND('indexing', m -> MAP_GET('source.type', m) == 'bro')
> {code}
> Or I could look for messages that contain geolocation data.
> {code:java}
> KAFKA_FIND('indexing', m -> MAP_EXISTS('geo.city', m))
> {code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)