merrimanr commented on issue #1436: METRON-2149: Shaded jar classifier is not consistent
URL: https://github.com/apache/metron/pull/1436#issuecomment-499591873

As I worked through resolving the final failing test I realized there is a use case that is not properly handled by the original changes in this PR. There are 2 versions of guava required by different classes (Stellar and HBase testing utility) so we need a way to relocate one of them. It's not possible to do this without depending on a shaded module because transitive dependencies have already been resolved (meaning only 1 version remains) by the time the final shaded jar is built. Relocating at this point just relocates the single remaining version.

At this point I want to summarize my findings and present some options. Here are the requirements I see:

1. Transitive dependency resolution should be predictable and easy to troubleshoot. Maven configuration settings (excludes, etc) should work as expected.
2. Versions reported by mvn dependency:tree should match what's included in the uber jar.
3. There should be a well understood and robust strategy for relocating classes that conflict.

Using classifiers on all modules solves 1 and 2 but does not support 3 as described above. We currently support 3 but not 1 and 2 because lower level modules (like metron-common) do not use a classifier. This means other modules that depend on it inherit relocated classes. The problem is that transitive dependencies from these modules overwrite other dependencies, making it harder to determine which versions end up in the final uber jar.

To get to a point where we can satisfy all 3 requirements above, I can think of a couple options. Both of these options are based on the assumption that most class version conflicts involve Stellar classes. Stellar contains most of our business logic and contains a long list of dependencies, including several that commonly conflict with other projects (guava, log4j, jackson, etc).
The idea behind both of these options involves isolating Stellar from the rest of the project. Here they are:

1. Make Stellar the exception and remove the classifier on stellar-common. This module would be the only one that does this. This satisfies 3 as long as code requiring different class versions is located in this module. This means we may need to move classes into this module (or do this with other modules too). To satisfy 1 and 2 we would need to ensure we are rewriting ALL transitive dependencies or tolerate relocating classes as we run into issues. The advantage of this approach is that there would still be a single uber jar, so scripts and classpath setup would not need to change. The disadvantage is that there is still the risk of transitive dependencies leaking into the main uber jar.
2. Deploy Stellar code in a separate jar and add it to the classpath after the main uber jar, whatever that is (metron-data-management, metron-enrichment-storm, etc). This satisfies 3 because the separate Stellar jar can contain the relocated classes, but its other dependencies will not overwrite dependencies of the main uber jar (because it's listed after the main uber jar). 1 and 2 are not a concern when classifiers are used, which is the case here. The main disadvantage I see is that there will be work adding this extra jar in the various scripts or startup options, and we may have to reorganize some classes.

I tested both options and was able to get both working for these use cases:

- Generate a bloom filter and read it back (https://metron.apache.org/current-book/use-cases/typosquat_detection/index.html)
- Enrichment and parser topology regression test

This is all fairly complex so if anything is not clear I can elaborate. Are there other options I'm not thinking of? Thoughts?
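
To make the classpath-ordering argument behind option 2 a bit more concrete, here is a minimal Java sketch. It is only a toy model, not Metron code: the jar names, the `relocated.example` package prefix, and the labels are all hypothetical, and it assumes a flat classpath where the first jar that contains a class wins.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ClasspathOrderSketch {

    /** Toy model of a jar: a name plus the classes it bundles (class name -> origin label). */
    static final class Jar {
        final String name;
        final Map<String, String> classes = new LinkedHashMap<>();

        Jar(String name) {
            this.name = name;
        }

        Jar with(String className, String label) {
            classes.put(className, label);
            return this;
        }
    }

    /** On a flat classpath the first jar that contains the class wins; later copies are ignored. */
    static String resolve(List<Jar> classpath, String className) {
        for (Jar jar : classpath) {
            String label = jar.classes.get(className);
            if (label != null) {
                return jar.name + " -> " + label;
            }
        }
        return "not found";
    }

    /** Hypothetical main uber jar: bundles its own guava and jackson classes, unrelocated. */
    static Jar mainUberJar() {
        return new Jar("main-uber.jar")
            .with("com.google.common.base.Preconditions", "guava needed by main jar")
            .with("com.fasterxml.jackson.databind.ObjectMapper", "jackson from main jar");
    }

    /** Hypothetical Stellar jar: guava is relocated (made-up prefix), jackson is not. */
    static Jar stellarJar() {
        return new Jar("stellar-uber.jar")
            .with("relocated.example.com.google.common.base.Preconditions", "guava needed by Stellar")
            .with("com.fasterxml.jackson.databind.ObjectMapper", "jackson from Stellar jar");
    }

    public static void main(String[] args) {
        // Option 2: the Stellar jar is listed AFTER the main uber jar.
        List<Jar> classpath = Arrays.asList(mainUberJar(), stellarJar());
        String[] lookups = {
            "com.google.common.base.Preconditions",
            "relocated.example.com.google.common.base.Preconditions",
            "com.fasterxml.jackson.databind.ObjectMapper"
        };
        for (String className : lookups) {
            System.out.println(className + " : " + resolve(classpath, className));
        }
    }
}
```

In this model both guava copies stay reachable because they live under different package names, while the duplicated jackson class resolves to the main uber jar's copy. Swapping the two jars in the list makes the Stellar jar's copy win instead, which is the overwrite that listing it second is meant to avoid.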
