[ https://issues.apache.org/jira/browse/SPARK-6589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-6589.
------------------------------
    Resolution: Not A Problem

I think this is, effectively, "not a problem" in the sense that this is just how the shell works. The shell necessarily puts the classes it compiles into its own classloader, which is a child of Spark's, so Spark cannot see classes defined there. To make a UDT work, you have to supply its class to Spark directly (e.g. on the application classpath) rather than define it in the shell; defining it in the shell basically won't work.

> SQLUserDefinedType failed in spark-shell
> ----------------------------------------
>
>                 Key: SPARK-6589
>                 URL: https://issues.apache.org/jira/browse/SPARK-6589
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.2.0
>         Environment: CDH 5.3.2
>            Reporter: Benyi Wang
>
> {{DataType.fromJson}} will fail in spark-shell if the schema includes "udt". It works when running in an application.
> As a result, I cannot read a Parquet file that includes a UDT field.
> {{DataType.fromCaseClass}} does not support UDT.
> I can load the class, which shows that my UDT is on the classpath:
> {code}
> scala> Class.forName("com.bwang.MyTestUDT")
> res6: Class[_] = class com.bwang.MyTestUDT
> {code}
> But DataType fails:
> {code}
> scala> DataType.fromJson(json)
> java.lang.ClassNotFoundException: com.bwang.MyTestUDT
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>         at java.lang.Class.forName0(Native Method)
>         at java.lang.Class.forName(Class.java:190)
>         at org.apache.spark.sql.catalyst.types.DataType$.parseDataType(dataTypes.scala:77)
> {code}
> The reason is that {{DataType.fromJson}} tries to load {{udtClass}} using this code:
> {code}
> case JSortedObject(
>     ("class", JString(udtClass)),
>     ("pyClass", _),
>     ("sqlType", _),
>     ("type", JString("udt"))) =>
>   Class.forName(udtClass).newInstance().asInstanceOf[UserDefinedType[_]]
> {code}
> Unfortunately, my UDT is loaded by {{SparkIMain$TranslatingClassLoader}}, but DataType is loaded by {{Launcher$AppClassLoader}}:
> {code}
> scala> DataType.getClass.getClassLoader
> res2: ClassLoader = sun.misc.Launcher$AppClassLoader@6876fb1b
> scala> this.getClass.getClassLoader
> res3: ClassLoader = org.apache.spark.repl.SparkIMain$TranslatingClassLoader@63d36b29
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
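The mismatch described above follows from JVM classloader delegation: lookups go from child to parent, never parent to child, so `AppClassLoader` (which defined `DataType`) can never find a class defined by the REPL's child `TranslatingClassLoader`. Below is a minimal Scala sketch of the usual workaround pattern, resolving the name against the thread context classloader via the three-argument `Class.forName` overload instead of the one-argument form. `loadUdtClass` is a hypothetical helper for illustration, not Spark's actual fix:

```scala
// Sketch: why one-arg Class.forName fails for REPL-defined classes, and the
// common workaround of resolving against the thread context classloader.
object LoaderSketch {
  def loadUdtClass(udtClass: String): Class[_] = {
    // Class.forName(name) resolves through the *caller's* defining loader --
    // for DataType that is AppClassLoader, which cannot see classes defined
    // by its child, the REPL's TranslatingClassLoader.
    // The three-arg overload takes an explicit loader; in spark-shell the
    // thread context classloader is the REPL loader, so shell-defined
    // classes resolve. Fall back to our own loader if none is set.
    val loader = Option(Thread.currentThread().getContextClassLoader)
      .getOrElse(getClass.getClassLoader)
    Class.forName(udtClass, /* initialize = */ true, loader)
  }

  def main(args: Array[String]): Unit = {
    // java.lang.String stands in for a UDT class name here.
    println(LoaderSketch.loadUdtClass("java.lang.String").getName)
  }
}
```

The same idea is why frameworks that instantiate classes by name generally prefer the context classloader over their own defining loader when running inside REPLs or containers.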