[GitHub] metron issue #940: METRON-1460: Create a complementary non-split-join enrich...

2018-03-02 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/metron/pull/940
  
Just FYI, as part of the performance experimentation in the lab here, we 
found that one major impediment to scale was the guava cache in this topology 
once the cache grows to a non-trivial size (e.g. 10k+).  Swapping in 
[Caffeine](https://github.com/ben-manes/caffeine) immediately had a 
substantial effect.  I created #947 to migrate the split/join infrastructure to 
use caffeine as well and will look at the performance impact of that change.  I 
wanted to separate that work from here as it may be that guava performance is 
fine outside of an explicit threadpool like we have here.


---


[GitHub] metron pull request #947: METRON-1467: Replace guava caches in places where ...

2018-03-02 Thread cestella
GitHub user cestella opened a pull request:

https://github.com/apache/metron/pull/947

METRON-1467: Replace guava caches in places where the keyspace might be 
large

## Contributor Comments
Based on the performance tuning exercise as part of METRON-1460, guava has 
difficulties with cache sizes over 10k.  We, unfortunately, are quite demanding 
of guava in this regard so we should transition a few uses of guava to Caffeine:

* Stellar processor cache
* The JoinBolt cache
* The Enrichment Bolt Cache

NOTE: This depends on METRON-1460 aka #940
Test plan pending

## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.  
Please refer to our [Development 
Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235)
 for the complete guide to follow for contributions.  
Please refer also to our [Build Verification 
Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview)
 for complete smoke testing guides.  


In order to streamline the review of the contribution we ask you follow 
these guidelines and ask you to double check the following:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?


### For code changes:
- [x] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified 
and tested manually?
- [x] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
  ```
  mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh
  ```

- [x] Have you written or updated unit tests and or integration tests to 
verify your changes?
- [x] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [x] Have you verified the basic functionality of the build by building 
and running locally with Vagrant full-dev environment or the equivalent?

### For documentation related changes:
- [x] Have you ensured that format looks appropriate for the output in 
which it is rendered by building and verifying the site-book? If not then run 
the following commands and the verify changes via 
`site-book/target/site/index.html`:

  ```
  cd site-book
  mvn site
  ```

 Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.
It is also recommended that [travis-ci](https://travis-ci.org) is set up 
for your personal repository such that your branches are built there before 
submitting a pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cestella/incubator-metron guava_cache_replacement

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/947.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #947


commit a4f618a3ad895d62772366e0e93e5b8b37c5c964
Author: cstella 
Date:   2018-02-21T23:59:16Z

Adding parallel enrichment bolt.

commit 99fe0b86005fe04294b3851a17ae3d88f228c5d2
Author: cstella 
Date:   2018-02-22T00:21:06Z

Updating to include trace statements.

commit 79736c6f3fab04d01dd1eb998b308f438003a0e1
Author: cstella 
Date:   2018-02-22T15:35:44Z

Updating with some cleanup

commit cb4a527c9146865dafad1d597ba93032ef398d94
Author: cstella 
Date:   2018-02-22T15:48:11Z

Updating spec.

commit fb4d4383f366776f446e33a422652c3ec1f56bfa
Author: cstella 
Date:   2018-02-22T18:00:36Z

Updating threadpool creation

commit 87ef6a72827c31f8adee42ee71272a32c350bc1f
Author: cstella 
Date:   2018-02-22T18:04:37Z

better docs

commit 6ae9594ee4ae2b4d33e0feca398b527077dac0d3
Author: cstella 
Date:   2018-02-22T18:41:20Z

Updating readme.

commit 82ebc9550d759ea0bd06b48c586fd5e53c6e553a
Author: cstella 
Date:   2018-02-22T20:53:31Z

Better documentation.

commit 235046d3d1fcda31690f4fc6b64cb38f050fc5af
Author: cstella 
Date:   

[GitHub] metron issue #946: METRON-1465:Support for Elasticsearch X-pack

2018-03-02 Thread wardbekker
Github user wardbekker commented on the issue:

https://github.com/apache/metron/pull/946
  
Steps to test:

- build centos vagrant dev build box
- vagrant ssh
- sudo /usr/share/elasticsearch/bin/elasticsearch-plugin install x-pack
- sudo /usr/share/elasticsearch/bin/x-pack/users useradd transport_client_user -p changeme -r superuser
- restart ES via ambari
- verify 401 status: curl http://localhost:9200
- verify new documents are indexed in ES: http://node1:9200/_cat/indices?v


---


[GitHub] metron issue #940: METRON-1460: Create a complementary non-split-join enrich...

2018-03-02 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/metron/pull/940
  
@arunmahadevan Thanks for chiming in, Arun.  I would say that most of the 
enrichment work is I/O bound and we try to avoid it whenever possible with a 
time-evicted LRU cache in front of the enrichments.  We don't always know a 
priori what enrichments users are doing, per se, as their individual 
enrichments may be expressed via stellar.  The threads here are entirely 
managed via the fixed threadpool service in storm and the threadpool is shared 
across all of the executors running in-process on the worker, so we try to 
keep the number of threads to a minimum.
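
To make the "time-evicted LRU cache in front of the enrichments" idea concrete, here is a minimal stdlib-only sketch. It is not Metron's actual cache (the thread discusses guava and Caffeine for that); the class name `TimedLruCache`, its constructor parameters, and the injectable clock are all hypothetical and chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical illustration only; not Metron's, guava's, or Caffeine's implementation.
class TimedLruCache<K, V> {
  // A cached value together with the time it was written.
  private static final class Timed<T> {
    final T value;
    final long writtenAt;
    Timed(T value, long writtenAt) {
      this.value = value;
      this.writtenAt = writtenAt;
    }
  }

  private final long ttlNanos;
  private final LongSupplier clock;
  private final LinkedHashMap<K, Timed<V>> entries;

  // clock is injectable so expiry can be exercised without sleeping; pass System::nanoTime normally.
  TimedLruCache(final int maxSize, long ttlNanos, LongSupplier clock) {
    this.ttlNanos = ttlNanos;
    this.clock = clock;
    // accessOrder=true keeps the least recently used entry at the head of the map.
    this.entries = new LinkedHashMap<K, Timed<V>>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, Timed<V>> eldest) {
        return size() > maxSize;
      }
    };
  }

  // Returns the cached value; the (possibly expensive, I/O bound) loader runs
  // only on a miss or once the entry is older than the time-to-live.
  synchronized V get(K key, Function<K, V> loader) {
    long now = clock.getAsLong();
    Timed<V> hit = entries.get(key);
    if (hit == null || now - hit.writtenAt > ttlNanos) {
      hit = new Timed<>(loader.apply(key), now);
      entries.put(key, hit);
    }
    return hit.value;
  }

  synchronized int entryCount() {
    return entries.size();
  }
}
```

A single lock around the whole map is the simplest way to stay correct when a threadpool shares the cache, but it serializes every lookup, which is one plausible reason a purpose-built concurrent cache is preferred at larger sizes.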


---


[GitHub] metron issue #940: METRON-1460: Create a complementary non-split-join enrich...

2018-03-02 Thread arunmahadevan
Github user arunmahadevan commented on the issue:

https://github.com/apache/metron/pull/940
  
Managing threadpools within a bolt isn't fundamentally wrong; we have seen 
some use cases where this is done. However, we have been putting effort into 
reducing the overall number of threads created internally within storm since 
the thread context switches were causing performance bottlenecks. I assume the 
threadpool threads are mostly IO/network bound so it should not cause too much 
harm.

Do you need multiple threads because the enrichments involve external DB 
lookups and are time consuming?  Maybe you could compare the performance of 
maintaining a thread pool vs. increasing the bolt's parallelism to achieve a 
similar effect.

Another option might be to prefetch the enrichment data and load it into 
each bolt so that you might not need separate threads to do the enrichment.

If you are able to manage without threads, that would be preferable. Even 
otherwise it's not that bad as long as you don't create too many threads and 
they are cleaned up properly (we have had some cases where the internal threads 
were causing workers to hang).
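
As a rough sketch of the pattern under discussion (a bounded pool that fans enrichments out in-process, joins the results, and is cleaned up on shutdown), something like the following could work. It uses only `java.util.concurrent`; the class `PooledEnricher` and its methods are hypothetical and are not Storm's or Metron's API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical names throughout; this is not Storm's or Metron's API.
class PooledEnricher implements AutoCloseable {
  private final ExecutorService pool;

  PooledEnricher(int threads) {
    // A fixed pool keeps the number of threads bounded.
    this.pool = Executors.newFixedThreadPool(threads);
  }

  // Runs every enrichment concurrently and merges the results in-process,
  // so no network hop is needed to split and re-join the message.
  // The message is shared across threads and must be treated as read-only.
  Map<String, Object> enrich(
      Map<String, Object> message,
      List<Function<Map<String, Object>, Map<String, Object>>> enrichments) {
    List<Callable<Map<String, Object>>> tasks = new ArrayList<>();
    for (Function<Map<String, Object>, Map<String, Object>> enrichment : enrichments) {
      tasks.add(() -> enrichment.apply(message));
    }
    Map<String, Object> joined = new LinkedHashMap<>(message);
    try {
      // invokeAll blocks until every task has completed; futures come back in task order.
      for (Future<Map<String, Object>> result : pool.invokeAll(tasks)) {
        joined.putAll(result.get());
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while enriching", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("enrichment failed", e.getCause());
    }
    return joined;
  }

  // Stops the pool so its threads cannot outlive the component that owns it.
  @Override
  public void close() {
    pool.shutdown();
    try {
      if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
        pool.shutdownNow();
      }
    } catch (InterruptedException e) {
      pool.shutdownNow();
      Thread.currentThread().interrupt();
    }
  }

  boolean isShutdown() {
    return pool.isShutdown();
  }
}
```

In a bolt, the natural place to call `close()` would be whatever cleanup hook the framework provides; which hook that is, and whether it is reliably invoked, is an assumption to verify against the Storm version in use.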


---


[GitHub] metron issue #940: METRON-1460: Create a complementary non-split-join enrich...

2018-03-02 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/metron/pull/940
  
@ottobackwards I haven't sent an email to the storm team, but I did run the 
PR past a storm committer that I know and asked his opinion prior to submitting 
the PR.  The general answer was something to the effect of `The overall goal 
should be to reduce the network shuffle unless its really required.`  Also, the 
notion of using an external threadpool didn't seem to be fundamentally 
offensive.


---


[GitHub] metron issue #940: METRON-1460: Create a complementary non-split-join enrich...

2018-03-02 Thread ottobackwards
Github user ottobackwards commented on the issue:

https://github.com/apache/metron/pull/940
  
Have we thought about sending a mail to the storm dev list to ask whether 
anyone has done this and whether there are potential issues?


---


[GitHub] metron issue #940: METRON-1460: Create a complementary non-split-join enrich...

2018-03-02 Thread ottobackwards
Github user ottobackwards commented on the issue:

https://github.com/apache/metron/pull/940
  
If we integrated storm with yarn this would also be a problem, as our 
resource management may be at odds with yarn's.  I think?

What would be nice is if storm could manage the pool and we could just use 
it.


---


[GitHub] metron issue #940: METRON-1460: Create a complementary non-split-join enrich...

2018-03-02 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/metron/pull/940
  
@mraliagha It's definitely a tradeoff.  This is why this is offered as a 
complement to the original split/join topology.  Keep in mind, also, that this 
architecture enables use-cases that the other would prevent or make extremely 
difficult and/or network intensive, such as multi-level stellar statements 
rather than the 2 levels we have now.  We are undergoing some preliminary 
testing in-lab right now, which @nickwallen alluded to, to compare the two 
approaches under at least synthetic load and will report back.

Ultimately this boils down to efficiencies gained by avoiding network hops 
and whether that's going to provide an outsized impact, I think.


---