nickwallen opened a new pull request #1556: METRON-2284 Metron Profiler for Spark doesn't work as expected
URL: https://github.com/apache/metron/pull/1556

### The Problem

Some profile 'update' expressions execute incorrectly in the Batch Profiler. The bug report provides an example where a call to `IS_EMPTY` returns true when the profile is executed by the Batch Profiler, although it should return false. The same profile executed in the REPL or in the Streaming Profiler returns the expected result.

### Root Cause

The values contained within a telemetry message are exposed to a profile's 'update' expression at runtime. This allows the profile to refer to message fields by name.

After message routing occurs, the telemetry message is corrupted during serialization. It appears that all type information is lost when the corruption occurs. This causes variable resolution for message fields to fail when executing a profile's 'update' expression.

* This does not impact variables defined within the profile itself.
* This does not impact the 'onlyif', 'foreach', 'init', and 'result' expressions of a profile; only the 'update' expression.

The corruption occurs when a `MessageRoute` is serialized. It affects only the `JSONObject` representing the telemetry message, not the other fields also contained within a `MessageRoute`, such as the profile definition, entity, or timestamp.

The `MapVariableResolver` is passed the corrupted `JSONObject` so that variables can be resolved from the fields contained within the message. Because of the corruption, variables that refer to message fields are not resolved.

The corruption is caused by the use of Spark's bean encoder when serializing `MessageRoute` objects.

### Changes

* Rather than using the bean encoder, I opted to use the Kryo encoder, which correctly serializes the objects.
* Added test cases for the defect.

## Pull Request Checklist

- [ ] Is there a JIRA ticket associated with this PR? If not, one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [ ] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
- [ ] Has your PR been rebased against the latest commit within the target branch (typically master)?
- [ ] Have you included steps to reproduce the behavior or problem that is being changed or addressed?
- [ ] Have you included steps or a guide to how the change may be verified and tested manually?
- [ ] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:
- [ ] Have you written or updated unit tests and/or integration tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?
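
### Illustration of the symptom

The failure mode described under Root Cause can be sketched in plain Java. This is only an illustration, not Metron code: `resolve` and `isEmpty` are hypothetical stand-ins for a map-backed variable resolver and an `IS_EMPTY`-style check, the field name `ip_src_addr` is an example, and modelling the corrupted message as a map with no usable fields is an assumption about what the resolver ends up seeing.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

public class UpdateExpressionSketch {

    /** Hypothetical map-backed resolver: a name with no matching field resolves to null. */
    static Object resolve(Map<String, Object> message, String variable) {
        return message.get(variable);
    }

    /** Hypothetical IS_EMPTY-style check: null, "", and empty collections or maps count as empty. */
    static boolean isEmpty(Object value) {
        if (value == null) {
            return true;
        }
        if (value instanceof String) {
            return ((String) value).isEmpty();
        }
        if (value instanceof Collection) {
            return ((Collection<?>) value).isEmpty();
        }
        if (value instanceof Map) {
            return ((Map<?, ?>) value).isEmpty();
        }
        return false;
    }

    public static void main(String[] args) {
        // The message as the 'update' expression should see it.
        Map<String, Object> intact = new HashMap<>();
        intact.put("ip_src_addr", "192.168.1.1");

        // Assumption: after the lossy round trip, the field can no longer be looked up.
        Map<String, Object> corrupted = new HashMap<>();

        System.out.println(isEmpty(resolve(intact, "ip_src_addr")));    // false
        System.out.println(isEmpty(resolve(corrupted, "ip_src_addr"))); // true
    }
}
```

With the message intact, the field resolves and the check returns false. Once the field cannot be resolved, the same expression returns true, which matches the behavior reported for the Batch Profiler.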
