Github user cestella commented on the issue:
https://github.com/apache/incubator-metron/pull/352
I also want to summarize a couple of thoughts I had in the course of
creating this PR. It appears to me that a number of these sorts of outlier
analysis models fit a common pattern:
* Gather state via the profiler.
* Operate on that state during scoring.
For these z-score based statistical outlier analysis techniques/models,
that state comes in the form of a statistical summary, or multiple statistical
summaries in the case of median absolute deviation (MAD). The scoring component
simply computes a modified z-score (scaled by MAD). The question arises as
to whether this model is better served in MaaS or as a Stellar function.
Where I have come down in this PR is that these very simple z-score or
deviation-from-a-central-moment style outlier models fit better within Stellar
because they are so cheap computationally. Furthermore, it would seem to be
less than ideal performance-wise to perform a network hop to a REST interface
in MaaS just to do simple subtraction and division.
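To make the "simple subtraction and division" concrete, here is a minimal Python sketch of the two-step pattern for the MAD case. It is not the code in this PR: the function names `summarize` and `modified_z_score` are hypothetical, and the 0.6745 constant is the conventional scaling for a modified z-score, assumed here rather than taken from the PR.

```python
import math
from statistics import median

# Conventional scaling constant for the modified z-score (approximately the
# 0.75 quantile of the standard normal). Assumed here, not taken from the PR.
CONSISTENCY_CONSTANT = 0.6745


def summarize(values):
    """Hypothetical profiler-side step: reduce a window of values to (median, MAD)."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return med, mad


def modified_z_score(x, med, mad):
    """Hypothetical scoring-side step: one subtraction, one multiply, one divide."""
    if mad == 0:
        # Every value in the window sat at the median, so any deviation is unbounded.
        return 0.0 if x == med else math.copysign(math.inf, x - med)
    return CONSISTENCY_CONSTANT * (x - med) / mad


# Gather state once, then score new observations against it.
med, mad = summarize([1, 2, 3, 4, 5])
score = modified_z_score(6, med, mad)
```

The scoring step needs only the stored median and MAD, which is why it is cheap enough to run inline rather than behind a REST call.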
An alternate architecture here would be to use MaaS and have the model pull
the appropriate statistical context from the profiler's store in HBase. I am
sensitive to the possibility that the placement of model application within
Metron may be a point of confusion, and that a consistent message is important. I
decided to go with simplicity and performance for the moment, but I'd love to
hear if there is opposition in the audience.