[
https://issues.apache.org/jira/browse/METRON-712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15897394#comment-15897394
]
ASF GitHub Bot commented on METRON-712:
---------------------------------------
GitHub user cestella opened a pull request:
https://github.com/apache/incubator-metron/pull/473
METRON-712: Separate evaluation from parsing in Stellar
# Description
With the current implementation of Stellar, we cannot cache the parse tree
and then apply it after the fact. It's just an artifact of how we do the
parsing: we actually execute the statement as we parse rather than constructing
an AST that can then be evaluated later given a message. Essentially what I'm
proposing is that we build the equivalent of Pattern.compile() in Java except
for Stellar.
We should do this for multiple reasons:
* code clarity - decoupling the stellar language from the generated parser
code
* performance - saving lexing and parsing for every message. Also, the
resulting parse-stack may be much smaller than the somewhat complex.
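To make the `Pattern.compile()` analogy concrete, here is a minimal sketch of the compile-once, evaluate-many shape. It is not Stellar's implementation: the class `CompileOnceSketch`, the `Expression` interface, and the toy grammar (a quoted literal, a variable name, or a single-argument function call) are all hypothetical and exist only for illustration.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical toy, not Stellar: parse once, evaluate many times.
final class CompileOnceSketch {

    /** A compiled expression, evaluated later against a message's fields. */
    public interface Expression extends Function<Map<String, Object>, Object> {}

    /** Parses a toy grammar: 'literal', NAME(arg), or a bare variable name. */
    static Expression compile(String src) {
        String s = src.trim();
        // Quoted string literal: the value is fixed at compile time.
        if (s.length() >= 2 && s.startsWith("'") && s.endsWith("'")) {
            String literal = s.substring(1, s.length() - 1);
            return msg -> literal;
        }
        // Single-argument function call: compile the argument recursively.
        int open = s.indexOf('(');
        if (open > 0 && s.endsWith(")")) {
            String name = s.substring(0, open);
            Expression arg = compile(s.substring(open + 1, s.length() - 1));
            switch (name) {
                case "TO_UPPER":
                    return msg -> String.valueOf(arg.apply(msg)).toUpperCase(Locale.ROOT);
                case "TO_LOWER":
                    return msg -> String.valueOf(arg.apply(msg)).toLowerCase(Locale.ROOT);
                default:
                    throw new IllegalArgumentException("Unknown function: " + name);
            }
        }
        // Anything else is treated as a variable resolved from the message.
        return msg -> msg.get(s);
    }

    public static void main(String[] args) {
        // The parsing cost is paid once here...
        Expression upper = compile("TO_UPPER(name)");
        // ...and the compiled object is reused for every message.
        System.out.println(upper.apply(Collections.singletonMap("name", "casey")));
        System.out.println(upper.apply(Collections.singletonMap("name", "metron")));
    }
}
```

The point of the sketch is only that `compile` does all the string handling up front, so evaluating a message touches no parser code at all.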
In this PR, I have added a Google Guava cache that caches the resulting
compiled expression in `BaseStellarProcessor` for 10 minutes (by default). I
have also created a microbenchmarking suite and have evaluated this change on
a few representative expressions.
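As a rough illustration of the caching idea, the sketch below keeps compiled expressions keyed by their source text and recompiles an entry once it is older than a time-to-live. It is a standard-library stand-in, not the Google cache used in the PR and not the code in `BaseStellarProcessor`: the class name `ExpiringCompileCache`, the injected clock, and the choice to expire a fixed time after compilation (rather than after last access) are all assumptions made for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical stand-in for a time-bounded cache of compiled expressions.
final class ExpiringCompileCache<T> {

    /** A cached compiled value plus the time it was compiled. */
    private static final class Entry<V> {
        final V value;
        final long loadedAtNanos;

        Entry(V value, long loadedAtNanos) {
            this.value = value;
            this.loadedAtNanos = loadedAtNanos;
        }
    }

    private final ConcurrentHashMap<String, Entry<T>> entries = new ConcurrentHashMap<>();
    private final Function<String, T> compiler;
    private final long ttlNanos;
    private final LongSupplier clock; // injected so expiry can be tested

    ExpiringCompileCache(Function<String, T> compiler, long ttl, TimeUnit unit, LongSupplier clock) {
        this.compiler = compiler;
        this.ttlNanos = unit.toNanos(ttl);
        this.clock = clock;
    }

    /** Returns the cached compiled form, compiling if absent or expired. */
    T get(String expression) {
        long now = clock.getAsLong();
        Entry<T> entry = entries.compute(expression, (key, old) ->
                (old == null || now - old.loadedAtNanos >= ttlNanos)
                        ? new Entry<>(compiler.apply(key), now)
                        : old);
        return entry.value;
    }

    public static void main(String[] args) {
        // The "compiler" here is a placeholder that just upper-cases the text.
        ExpiringCompileCache<String> cache = new ExpiringCompileCache<>(
                src -> src.toUpperCase(), 10, TimeUnit.MINUTES, System::nanoTime);
        System.out.println(cache.get("to_upper(name)")); // compiled on first use
        System.out.println(cache.get("to_upper(name)")); // served from the cache
    }
}
```

Keying on the expression text means identical expressions in different configurations share one compiled form, which is where the per-message savings would come from.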
Results:
* `TO_UPPER('casey')`
  * Median ms before: `880.5`
  * Median ms after: `15`
  * Speedup: 58.6x faster
* `TO_LOWER(name)`
  * Median ms before: `497`
  * Median ms after: `3`
  * Speedup: 165.6x faster
* `1 + 2*(3 + int_num) / 10.0`
  * Median ms before: `676`
  * Median ms after: `4`
  * Speedup: 169x faster
* `1.5 + 2*(3 + double_num) / 10.0`
  * Median ms before: `634`
  * Median ms after: `1`
  * Speedup: 634x faster
* `if ('foo' in ['foo']) OR one == very_nearly_one then 'one' else 'two'`
  * Median ms before: `616`
  * Median ms after: `23`
  * Speedup: 26x faster
* `1 + 2*(3 + int_num) / 10.0`
  * Median ms before: `601`
  * Median ms after: `2`
  * Speedup: 300.5x faster
* `DOMAIN_TO_TLD(domain)`
  * Median ms before: `505`
  * Median ms after: `16`
  * Speedup: 32.5x faster
* `DOMAIN_REMOVE_SUBDOMAINS(domain)`
  * Median ms before: `496`
  * Median ms after: `11`
  * Speedup: 45x faster
# Testing Plan
Please refer to the METRON-744
[PR](https://github.com/apache/incubator-metron/pull/468#issue-210707129)
testing plan.
To streamline the review of the contribution, we ask that you follow
these guidelines and double-check
the following:
### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to
be created at [Metron
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA
number you are trying to resolve? Pay particular attention to the hyphen "-"
character.
- [x] Has your PR been rebased against the latest commit within the target
branch (typically master)?
### For code changes:
- [x] Have you included steps to reproduce the behavior or problem that is
being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified
and tested manually?
- [x] Have you ensured that the full suite of tests and checks have been
executed in the root incubating-metron folder via:
```
mvn -q clean integration-test install && build_utils/verify_licenses.sh
```
- [x] Have you written or updated unit tests and/or integration tests to
verify your changes?
- [x] If adding new dependencies to the code, are these dependencies
licensed in a way that is compatible for inclusion under [ASF
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] Have you verified the basic functionality of the build by building
and running locally with Vagrant full-dev environment or the equivalent?
### For documentation related changes:
- [x] Have you ensured that the format looks appropriate for the output in
which it is rendered by building and verifying the site-book? If not, then run
the following commands and then verify the changes via
site-book/target/site/index.html.
```
cd site-book
bin/generate-md.sh
mvn site:site
```
### Note:
Please ensure that once the PR is submitted, you check travis-ci for build
issues and submit an update to your PR as soon as possible.
It is also recommended that [travis-ci](https://travis-ci.org) is set up for
your personal repository such that your branches are built there before
submitting a pull request.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/cestella/incubator-metron stellar_optimize
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-metron/pull/473.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #473
----
commit 7c04584950c452b5f0dc786de8d91f3978bb92ec
Author: cstella <[email protected]>
Date: 2017-03-06T08:15:57Z
Renaming Compiler to Interpreter.
commit 7483361f395707a86031bc5eb7027259cc75786e
Author: cstella <[email protected]>
Date: 2017-03-06T08:16:25Z
Merge branch 'master' into stellar_optimize
commit c2e7eb08a9c001bc0eef62e5a0423cd17227ef46
Author: cstella <[email protected]>
Date: 2017-03-06T10:21:54Z
Added cache to speed up stellar
commit 3c416e7de9de7c96f80116884f264cab92f7f9e6
Author: cstella <[email protected]>
Date: 2017-03-06T10:24:36Z
Deleted StellarInterpreter.
commit fbad7f31e6171abf4a8e2e0031d207a2de41f5ac
Author: cstella <[email protected]>
Date: 2017-03-06T11:22:36Z
Adding microbenchmarking suite.
commit 7b9ce9a9a3b97059a25af05901f9077392a0e58a
Author: cstella <[email protected]>
Date: 2017-03-06T13:36:02Z
Updating tests to expect real exceptions, not wrapped exceptions.
----
> Separate evaluation from parsing in Stellar
> -------------------------------------------
>
> Key: METRON-712
> URL: https://issues.apache.org/jira/browse/METRON-712
> Project: Metron
> Issue Type: Improvement
> Reporter: Casey Stella
>
> With the current implementation of Stellar, we cannot cache the parse tree
> and then apply it after the fact. It's just an artifact of how we do the
> parsing: we actually execute the statement as we parse rather than
> constructing an AST that can then be evaluated later given a message.
> Essentially what I'm proposing is that we build the equivalent of
> Pattern.compile() in Java except for Stellar.
> We should do this for multiple reasons:
> * code clarity - decoupling the stellar language from the generated parser
> code
> * performance - saving lexing and parsing for every message