Benyi Wang created SPARK-6589:
---------------------------------

             Summary: SQLUserDefinedType failed in spark-shell
                 Key: SPARK-6589
                 URL: https://issues.apache.org/jira/browse/SPARK-6589
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.2.0
         Environment: CDH 5.3.2
            Reporter: Benyi Wang


{{DataType.fromJson}} fails in spark-shell if the schema includes a "udt" field, but the same call works when run from a compiled application.

As a result, I cannot read a parquet file that contains a UDT field, and 
{{DataType.fromCaseClass}} does not support UDT either.

I can load the class directly, which shows that my UDT is on the classpath:
{code}
scala> Class.forName("com.bwang.MyTestUDT")
res6: Class[_] = class com.bwang.MyTestUDT
{code}
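
For reference, a minimal UDT of this general shape (a hypothetical sketch, not the actual {{com.bwang.MyTestUDT}}) would look like:
{code}
// Hypothetical sketch of a UDT like com.bwang.MyTestUDT; package locations
// assume Spark 1.2 (they moved to org.apache.spark.sql.types in later releases).
import org.apache.spark.sql.catalyst.annotation.SQLUserDefinedType
import org.apache.spark.sql.catalyst.types._

@SQLUserDefinedType(udt = classOf[MyTestUDT])
class MyTest(val value: Double) extends Serializable

class MyTestUDT extends UserDefinedType[MyTest] {
  // Store the wrapped value as a plain DoubleType column.
  override def sqlType: DataType = DoubleType
  override def serialize(obj: Any): Any = obj match {
    case m: MyTest => m.value
  }
  override def deserialize(datum: Any): MyTest = datum match {
    case d: Double => new MyTest(d)
  }
  override def userClass: Class[MyTest] = classOf[MyTest]
}
{code}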

But {{DataType.fromJson}} fails:
{code}
scala> DataType.fromJson(json)
java.lang.ClassNotFoundException: com.bwang.MyTestUDT
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:190)
        at org.apache.spark.sql.catalyst.types.DataType$.parseDataType(dataTypes.scala:77)
{code}

The reason is that {{DataType.fromJson}} tries to load {{udtClass}} with this code:
{code}
    case JSortedObject(
        ("class", JString(udtClass)),
        ("pyClass", _),
        ("sqlType", _),
        ("type", JString("udt"))) =>
      Class.forName(udtClass).newInstance().asInstanceOf[UserDefinedType[_]]
  }
{code}
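
{{Class.forName(udtClass)}} with a single argument resolves the name against the defining classloader of the calling class, so the lookup here is effectively (a sketch of the equivalent call, not actual Spark code):
{code}
// Equivalent lookup: the one-argument Class.forName delegates to the defining
// loader of the caller, which for this code path is DataType's own loader
// (Launcher$AppClassLoader when running spark-shell).
Class.forName(udtClass, true, classOf[DataType].getClassLoader)
{code}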

Unfortunately, my UDT is loaded by {{SparkIMain$TranslatingClassLoader}}, while 
{{DataType}} is loaded by {{Launcher$AppClassLoader}}:

{code}
scala> DataType.getClass.getClassLoader
res2: ClassLoader = sun.misc.Launcher$AppClassLoader@6876fb1b

scala> this.getClass.getClassLoader
res3: ClassLoader = org.apache.spark.repl.SparkIMain$TranslatingClassLoader@63d36b29
{code}
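
A quick check in the same session should confirm the split, and it suggests a possible fix. The sketch below is only an illustration (it assumes spark-shell sets the thread context classloader to the REPL loader), not an actual patch:
{code}
// Hypothetical check: the REPL loader can see the UDT, DataType's loader cannot.
Class.forName("com.bwang.MyTestUDT", true, this.getClass.getClassLoader)      // expected to succeed
Class.forName("com.bwang.MyTestUDT", true, DataType.getClass.getClassLoader)  // expected to throw ClassNotFoundException

// Possible fix inside parseDataType: prefer the thread context classloader and
// fall back to DataType's own loader when no context loader is set.
val loader = Option(Thread.currentThread().getContextClassLoader)
  .getOrElse(classOf[DataType].getClassLoader)
Class.forName(udtClass, true, loader)
  .newInstance()
  .asInstanceOf[UserDefinedType[_]]
{code}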


