[
https://issues.apache.org/jira/browse/SPARK-6589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385499#comment-14385499
]
Benyi Wang commented on SPARK-6589:
-----------------------------------
I found a workaround for this issue, but I still think DataType should have a
better way to locate the correct class loader.
{code}
# Put the UDT jar on SPARK_CLASSPATH so that Launcher$AppClassLoader can find it.
export SPARK_CLASSPATH=myUDT.jar
spark-shell --jars myUDT.jar ...
{code}
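One possible direction for the class loader lookup is sketched below. It is only an illustration, not Spark's implementation: {{UdtClassLoading}} and {{loadClass}} are hypothetical names, and the sketch assumes that the thread context class loader in spark-shell can see jars passed with --jars, which has not been verified here.

```scala
object UdtClassLoading {
  // Hypothetical helper (not a Spark API): resolve a class by name through the
  // thread context class loader, falling back to the loader that defined this
  // object when no context loader is set.
  def loadClass(name: String): Class[_] = {
    val loader = Option(Thread.currentThread().getContextClassLoader)
      .getOrElse(getClass.getClassLoader)
    // The three-argument Class.forName takes the loader explicitly instead of
    // relying on the defining loader of the calling class.
    Class.forName(name, true, loader)
  }
}
```

With such a helper, the lookup would read {{UdtClassLoading.loadClass("com.bwang.MyTestUDT")}} instead of the one-argument {{Class.forName}} call.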
> SQLUserDefinedType failed in spark-shell
> ----------------------------------------
>
> Key: SPARK-6589
> URL: https://issues.apache.org/jira/browse/SPARK-6589
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.2.0
> Environment: CDH 5.3.2
> Reporter: Benyi Wang
>
> {{DataType.fromJson}} will fail in spark-shell if the schema includes "udt".
> It works when run in an application.
> As a result, I cannot read a Parquet file that includes a UDT field.
> {{DataType.fromCaseClass}} does not support UDT.
> I can load the class, which shows that my UDT is on the classpath.
> {code}
> scala> Class.forName("com.bwang.MyTestUDT")
> res6: Class[_] = class com.bwang.MyTestUDT
> {code}
> But DataType fails:
> {code}
> scala> DataType.fromJson(json)
>
> java.lang.ClassNotFoundException: com.bwang.MyTestUDT
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:190)
> at org.apache.spark.sql.catalyst.types.DataType$.parseDataType(dataTypes.scala:77)
> {code}
> The reason is that {{DataType.fromJson}} tries to load {{udtClass}} using this code:
> {code}
> case JSortedObject(
>     ("class", JString(udtClass)),
>     ("pyClass", _),
>     ("sqlType", _),
>     ("type", JString("udt"))) =>
>   Class.forName(udtClass).newInstance().asInstanceOf[UserDefinedType[_]]
> }
> {code}
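> Unfortunately, my UDT is loaded by {{SparkIMain$TranslatingClassLoader}}, but
> DataType is loaded by {{Launcher$AppClassLoader}}. The one-argument
> {{Class.forName}} resolves names through the class loader of the calling
> class, so the lookup goes through {{Launcher$AppClassLoader}}, which cannot
> see the UDT.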
> {code}
> scala> DataType.getClass.getClassLoader
> res2: ClassLoader = sun.misc.Launcher$AppClassLoader@6876fb1b
> scala> this.getClass.getClassLoader
> res3: ClassLoader =
> org.apache.spark.repl.SparkIMain$TranslatingClassLoader@63d36b29
> {code}
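To check which loader can actually see a class, a small diagnostic along the following lines may help. It is a sketch only: {{LoaderChain}}, {{chain}}, and {{canSee}} are hypothetical names introduced for illustration and are not part of Spark.

```scala
object LoaderChain {
  // List a class loader followed by its parents, stopping at the bootstrap
  // loader (which the JVM represents as null).
  def chain(loader: ClassLoader): List[ClassLoader] =
    Iterator.iterate(loader)(_.getParent).takeWhile(_ != null).toList

  // Report whether the given loader can resolve the class name. The class is
  // not initialized, so no static initializers run during the check.
  def canSee(loader: ClassLoader, name: String): Boolean =
    try {
      Class.forName(name, false, loader)
      true
    } catch {
      case _: ClassNotFoundException => false
    }
}
```

In spark-shell, comparing {{LoaderChain.canSee(DataType.getClass.getClassLoader, "com.bwang.MyTestUDT")}} with {{LoaderChain.canSee(this.getClass.getClassLoader, "com.bwang.MyTestUDT")}} should show which of the two loaders can resolve the UDT.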