[ 
https://issues.apache.org/jira/browse/HADOOP-18500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Willi Raschkowski updated HADOOP-18500:
---------------------------------------
    Description: 
maven-shade-plugin rewrites class files when moving them into the 
{{hadoop-client}} JARs, even when it doesn't need to modify their bytecode 
(for relocation, say).

We use a tool that checks the classpath for duplicate classes whose bytecode 
differs. It flags classes brought in via Hadoop, where one JAR containing them 
is {{hadoop-client-api}} or {{hadoop-client-runtime}}, and the other is 
{{hadoop-common}} or {{hadoop-shaded-guava}}. The bytecode for the same class 
does indeed differ between the relocated and non-relocated JARs.
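
For illustration, a check of this kind could look roughly like the sketch below. It is not the tool we actually use: the class name {{ClasspathDuplicateCheck}} and both method names are made up for this example, and it assumes that comparing SHA-256 digests of the raw class bytes is an acceptable definition of "equal".

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper, not part of Hadoop or maven-shade-plugin.
public class ClasspathDuplicateCheck {

    /** Maps each .class entry name in the JAR to the SHA-256 of its bytes. */
    static Map<String, String> classDigests(String jarPath)
            throws IOException, NoSuchAlgorithmException {
        Map<String, String> digests = new TreeMap<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (entry.isDirectory() || !entry.getName().endsWith(".class")) {
                    continue;
                }
                try (InputStream in = jar.getInputStream(entry)) {
                    byte[] hash = MessageDigest.getInstance("SHA-256")
                            .digest(in.readAllBytes());
                    StringBuilder hex = new StringBuilder();
                    for (byte b : hash) {
                        hex.append(String.format("%02x", b));
                    }
                    digests.put(entry.getName(), hex.toString());
                }
            }
        }
        return digests;
    }

    /** Returns entry names present in both maps whose digests differ, sorted. */
    static List<String> conflictingClasses(Map<String, String> first,
                                           Map<String, String> second) {
        List<String> conflicts = new ArrayList<>();
        for (Map.Entry<String, String> e : new TreeMap<>(first).entrySet()) {
            String other = second.get(e.getKey());
            if (other != null && !other.equals(e.getValue())) {
                conflicts.add(e.getKey());
            }
        }
        return conflicts;
    }
}
```

Calling {{classDigests}} on, say, a {{hadoop-client-api}} JAR and a {{hadoop-common}} JAR and passing both results to {{conflictingClasses}} would list the classes that exist in both JARs with different bytes.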

This is because maven-shade-plugin, before 3.3.0, was rewriting class files 
even when the relocation was a "no-op". See MSHADE-391 and 
[apache/maven-shade-plugin#95|https://github.com/apache/maven-shade-plugin/pull/95].

{quote}
Maven Shade internally uses [ASM's 
{{ClassRemapper}}|https://asm.ow2.io/javadoc/org/objectweb/asm/commons/ClassRemapper.html]
 and defines a custom {{Remapper}} subclass, which takes care of relocation, 
partially doing the work by itself and partially delegating to the ASM parent 
class. An ASM {{ClassReader}} reads each class file from the original JAR and 
*unconditionally* writes it into a {{ClassWriter}}, plugging in the 
transformer.

This transformation, even if not a single relocation (package name mapping) 
takes place, often leads to binary differences between original class and 
transformed class, because constant pool or stack map frames have been 
adjusted, not changing the functionality of the class, but making it look like 
something changed when comparing class files before and after the relocation 
process.
{quote}

Upgrading to maven-shade-plugin 3.3.0 fixes the unnecessary rewrite of classes.
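
As a conceptual illustration only, the idea of skipping the rewrite can be sketched as below. This is not the plugin's code: {{NoOpRelocationSketch}}, {{ClassTransformer}}, {{Result}} and {{bytesToWrite}} are hypothetical names, and the sketch simply assumes the transformer can report whether any relocation was applied.

```java
// Hypothetical sketch, not taken from maven-shade-plugin.
public class NoOpRelocationSketch {

    /** Outcome of a transformation: the new bytes and whether anything was relocated. */
    static final class Result {
        final byte[] bytes;
        final boolean relocated;

        Result(byte[] bytes, boolean relocated) {
            this.bytes = bytes;
            this.relocated = relocated;
        }
    }

    /** Stand-in for a relocating class transformer (assumed interface). */
    interface ClassTransformer {
        Result transform(byte[] original);
    }

    /**
     * Returns the bytes that should end up in the shaded JAR: the transformed
     * bytes if a relocation was applied, otherwise the untouched original.
     */
    static byte[] bytesToWrite(byte[] original, ClassTransformer transformer) {
        Result result = transformer.transform(original);
        return result.relocated ? result.bytes : original;
    }
}
```

With a check like this, a class that no relocation touches is copied byte for byte, so it compares equal to the same class in the non-relocated JAR.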


> Upgrade maven-shade-plugin to 3.3.0
> -----------------------------------
>
>                 Key: HADOOP-18500
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18500
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: build
>            Reporter: Willi Raschkowski
>            Assignee: Ashutosh Gupta
>            Priority: Minor
>
> maven-shade-plugin rewrites class files when moving them into the 
> {{hadoop-client}} JARs, even when it doesn't need to modify their bytecode 
> (for relocation, say).
> We use a tool that checks the classpath for duplicate classes whose bytecode 
> differs. It flags classes brought in via Hadoop, where one JAR containing 
> them is {{hadoop-client-api}} or {{hadoop-client-runtime}}, and the other is 
> {{hadoop-common}} or {{hadoop-shaded-guava}}. The bytecode for the same 
> class does indeed differ between the relocated and non-relocated JARs.
> This is because maven-shade-plugin, before 3.3.0, was rewriting class files 
> even when the relocation was a "no-op". See MSHADE-391 and 
> [apache/maven-shade-plugin#95|https://github.com/apache/maven-shade-plugin/pull/95].
> {quote}
> Maven Shade internally uses [ASM's 
> {{ClassRemapper}}|https://asm.ow2.io/javadoc/org/objectweb/asm/commons/ClassRemapper.html]
>  and defines a custom {{Remapper}} subclass, which takes care of relocation, 
> partially doing the work by itself and partially delegating to the ASM parent 
> class. An ASM {{ClassReader}} reads each class file from the original JAR and 
> *unconditionally* writes it into a {{ClassWriter}}, plugging in the 
> transformer.
> This transformation, even if not a single relocation (package name mapping) 
> takes place, often leads to binary differences between original class and 
> transformed class, because constant pool or stack map frames have been 
> adjusted, not changing the functionality of the class, but making it look 
> like something changed when comparing class files before and after the 
> relocation process.
> {quote}
> Upgrading to maven-shade-plugin 3.3.0 fixes the unnecessary rewrite of 
> classes.



