nickwallen commented on issue #1564: METRON-2285 Batch Profiler Cannot Persist Data Sketches
URL: https://github.com/apache/metron/pull/1564#issuecomment-557138214

I added the promised test case. Unfortunately, this test case would not have caught this specific bug. I still added it in hopes of catching potential future problems.

Now why can't our tests replicate this bug? It appears that the original problem is only triggered when running against the HDP flavor of Spark and its dependencies. This is the "flavor" that you would get when running Metron in a cluster deployed with Ambari and the MPack. The issue cannot be replicated in the integration tests because they run against the pure, open source version of Spark and its dependencies, which are slightly different.

The only way I could really prove this theory is to run all of our integration tests against the HDP flavor of each of our dependencies, which seems like a rather time-consuming task. It might be really useful to define a separate profile that builds against the HDP flavor of the dependencies, which we could use to catch issues like this, but that seems like overkill for this specific bug.

I am less concerned with the tests here because the fix included in this PR actually removes some of the ugly scaffolding (ProfileMeasurementAdapter) that I had to use to get this working originally. I see this (along with #1556) as the approach I would have preferred to take originally had I been able to get it working at the time.

Of course, I could be persuaded otherwise. Let me know if more due diligence is needed here.
