[ 
https://issues.apache.org/jira/browse/SPARK-27786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Rosen updated SPARK-27786:
-------------------------------
    Description: 
When running a custom build of Spark which shades {{commons-codec}}, the 
{{sha1Hex}} expression generates code which doesn't compile:

{code}
org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator: failed to 
compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', 
Line 47, Column 93: A method named "sha1Hex" is not declared in any enclosing 
class nor any supertype, nor through a static import
{code}

This is caused by an interaction between Spark's code generator and 
shading: the code generator currently embeds the class name inside a larger 
string literal in a codegen template 
("org.apache.commons.codec.digest.DigestUtils.sha1Hex"), which prevents 
Jar Jar Links from replacing it with the shaded class's name. The generated 
code therefore refers to the unshaded name, but the unshaded dependency isn't 
on our classpath, which triggers the compilation error above.

To fix this problem and allow for proper shading, we can replace the hardcoded 
string literal with {{classOf[DigestUtils].getName}}.
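
The proposed fix can be sketched roughly as follows. This is a minimal 
illustration, not Spark's actual codegen: {{ShadingSafeCodegen}}, 
{{hardcoded}}, and {{viaClassOf}} are hypothetical names, and 
{{java.security.MessageDigest}} stands in for {{DigestUtils}} so that the 
snippet runs without commons-codec on the classpath.

```scala
object ShadingSafeCodegen {
  // Fragile: the class name is buried inside a larger string literal, so a
  // shading tool that relocates classes has no class reference to rewrite.
  def hardcoded(input: String): String =
    s"""java.security.MessageDigest.getInstance("SHA-1").digest($input)"""

  // More robust: classOf[...] compiles to a real class reference, which a
  // relocating shader can rewrite; getName then returns whatever name the
  // class has in the final build.
  def viaClassOf(input: String): String = {
    val className = classOf[java.security.MessageDigest].getName
    s"""$className.getInstance("SHA-1").digest($input)"""
  }
}
```

In an unshaded build both helpers produce the same generated source text; 
they should only diverge once the referenced class has been relocated.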

  was:
When running a custom build of Spark which shades {{commons-codec}}, the 
{{sha1Hex}} expression generates code which doesn't compile:

{code}
org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator: failed to 
compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', 
Line 47, Column 93: A method named "sha1Hex" is not declared in any enclosing 
class nor any supertype, nor through a static import
{code}

The problem here is not in the generated code itself but, rather, is an issue 
caused by an interaction between Spark's code generator and the shading: the 
current code generator embeds 
"org.apache.commons.codec.digest.DigestUtils.sha1Hex" into a larger codegen 
template, preventing JarJarLinks from being able to replace it with the shaded 
class's name. The generated code ends up using the unshaded name but the 
unshaded dependency isn't on our classpath, triggering the above compilation 
error.

To fix this problem and allow for proper shading, we can replace the hardcoded 
string literal with {{classOf[DigestUtils].getName}}


> SHA1, MD5, and Base64 expression codegen doesn't work when commons-codec is 
> shaded
> ----------------------------------------------------------------------------------
>
>                 Key: SPARK-27786
>                 URL: https://issues.apache.org/jira/browse/SPARK-27786
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Josh Rosen
>            Assignee: Josh Rosen
>            Priority: Minor
>
> When running a custom build of Spark which shades {{commons-codec}}, the 
> {{sha1Hex}} expression generates code which doesn't compile:
> {code}
> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator: failed to 
> compile: org.codehaus.commons.compiler.CompileException: File 
> 'generated.java', Line 47, Column 93: A method named "sha1Hex" is not 
> declared in any enclosing class nor any supertype, nor through a static import
> {code}
> This is caused by an interaction between Spark's code generator and 
> shading: the code generator currently embeds the class name inside a larger 
> string literal in a codegen template 
> ("org.apache.commons.codec.digest.DigestUtils.sha1Hex"), which prevents 
> Jar Jar Links from replacing it with the shaded class's name. The generated 
> code therefore refers to the unshaded name, but the unshaded dependency 
> isn't on our classpath, which triggers the compilation error above.
> To fix this problem and allow for proper shading, we can replace the 
> hardcoded string literal with {{classOf[DigestUtils].getName}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
