nickwallen opened a new pull request #1556: METRON-2284 Metron Profiler for 
Spark doesn't work as expected
URL: https://github.com/apache/metron/pull/1556
 
 
   ### The Problem 
   
   Some profile "update" expressions execute incorrectly in the Batch Profiler. 
The bug report provides an example in which a call to `IS_EMPTY` returns true 
when the profile is executed by the Batch Profiler, although it should return 
false.  The same profile returns the expected result when executed in the REPL 
or in the Streaming Profiler.
   
   ### Root Cause
   
   The values contained within a telemetry message are exposed to a profile's 
update expression at runtime.  This allows the profile to refer to message 
fields by name.  
   
   After message routing occurs, the telemetry message is corrupted during 
serialization. It appears that all type information is lost when the corruption 
occurs.  This causes variable resolution for message fields to fail when 
executing a profile's 'update' expression.  
   * This does not impact variables defined within the profile itself. 
   * This does not impact the 'onlyif', 'foreach', 'init', and 'result' 
expressions of a profile; only the 'update' expression.
   
   The corruption occurs when a `MessageRoute` is serialized. This only affects 
the `JSONObject` representing the telemetry message, rather than any other 
fields like the profile definition, entity, or timestamp data also contained 
within a `MessageRoute`. 
    
   The `MapVariableResolver` is passed the corrupted `JSONObject` so that 
variables can be resolved from the fields contained within the message.  Because 
of the corruption, variables that refer to message fields are not resolved.
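   
   To illustrate why an unresolved message field makes `IS_EMPTY` return true, 
here is a minimal sketch in plain Java.  It is not Metron code: `ResolverSketch`, 
its `resolve` and `isEmpty` methods, and the `ip_src_addr` field are hypothetical 
stand-ins, and the corruption is modeled (as an assumption) by a message whose 
fields are simply missing.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a map-backed variable resolver.
// This is NOT Metron's MapVariableResolver; it only mimics the idea.
class ResolverSketch {

    // Resolve a variable name against the message fields.
    // Returns null when the field cannot be found.
    static Object resolve(Map<String, Object> message, String name) {
        return message.get(name);
    }

    // Simplified emptiness check: null and empty strings count as empty.
    static boolean isEmpty(Object value) {
        if (value == null) {
            return true;
        }
        if (value instanceof String) {
            return ((String) value).isEmpty();
        }
        return false;
    }

    public static void main(String[] args) {
        // An intact telemetry message with one illustrative field.
        Map<String, Object> intact = new HashMap<>();
        intact.put("ip_src_addr", "10.0.0.1");

        // Assumption: the corrupted message is modeled as having lost its fields.
        Map<String, Object> corrupted = new HashMap<>();

        System.out.println(isEmpty(resolve(intact, "ip_src_addr")));    // false
        System.out.println(isEmpty(resolve(corrupted, "ip_src_addr"))); // true
    }
}
```

   In this sketch the emptiness check itself behaves correctly in both cases; the 
wrong answer comes entirely from the failed lookup, which matches the symptom 
described in the bug report.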
   
   The corruption is caused by the use of Spark's bean encoder when serializing 
`MessageRoute` objects. 
   
   ### Changes
   
   * Rather than using the bean encoder, I opted to use the Kryo encoder, which 
correctly serializes the objects.
   * Added test cases for the defect.
   
   ## Pull Request Checklist
   
   - [ ] Is there a JIRA ticket associated with this PR? If not one needs to be 
created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
   - [ ] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
   - [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?
   - [ ] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
   - [ ] Have you included steps or a guide to how the change may be verified 
and tested manually?
   - [ ] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
   - [ ] Have you written or updated unit tests and/or integration tests to 
verify your changes?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] Have you verified the basic functionality of the build by building and 
running locally with Vagrant full-dev environment or the equivalent?
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 


With regards,
Apache Git Services
