Hmm. I see. Yes, I guess that won't work then. I don't understand what you are proposing about UDFRegistration, though. I only see methods that take functions of various arities (1..22).
On Tue, May 17, 2016 at 1:00 PM, Michael Armbrust <mich...@databricks.com> wrote:

> I don't think that you will be able to do that. ScalaReflection is based
> on the TypeTag of the object, and thus the schema of any particular object
> won't be available to it.
>
> Instead I think you want to use the register functions in UDFRegistration
> that take a schema. Does that make sense?
>
> On Tue, May 17, 2016 at 11:48 AM, Andy Grove <andy.gr...@agildata.com> wrote:
>
>> Hi,
>>
>> I have a requirement to create types dynamically in Spark and then
>> instantiate those types from Spark SQL via a UDF.
>>
>> I tried doing the following:
>>
>> val addressType = StructType(List(
>>   new StructField("state", DataTypes.StringType),
>>   new StructField("zipcode", DataTypes.IntegerType)
>> ))
>>
>> sqlContext.udf.register("Address", (args: Seq[Any]) =>
>>   new GenericRowWithSchema(args.toArray, addressType))
>>
>> sqlContext.sql("SELECT Address('NY', 12345)").show(10)
>>
>> This seems reasonable to me but this fails with:
>>
>> Exception in thread "main" java.lang.UnsupportedOperationException:
>> Schema for type
>> org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema is not supported
>>   at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:755)
>>   at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:685)
>>   at org.apache.spark.sql.UDFRegistration.register(UDFRegistration.scala:130)
>>
>> It looks like it would be simple to update ScalaReflection to be able to
>> infer the schema from a GenericRowWithSchema, but before I file a JIRA and
>> submit a patch I wanted to see if there is already a way of achieving this.
>>
>> Thanks,
>>
>> Andy.
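For anyone landing on this thread later: the schema-taking register methods Michael refers to appear to be the Java-friendly overloads of UDFRegistration, which accept a UDF1..UDF22 instance plus an explicit DataType return type, so no schema inference via ScalaReflection is involved. A sketch of that approach (assuming an existing `sqlContext`; the `Address` name and fields are just the example from the original question):

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.api.java.UDF2
import org.apache.spark.sql.types._

val addressType = StructType(List(
  StructField("state", DataTypes.StringType),
  StructField("zipcode", DataTypes.IntegerType)
))

// The Java UDF overloads of register take the return DataType explicitly,
// sidestepping the ScalaReflection.schemaFor call that fails for
// GenericRowWithSchema.
sqlContext.udf.register("Address",
  new UDF2[String, Int, Row] {
    override def call(state: String, zipcode: Int): Row = Row(state, zipcode)
  },
  addressType)

sqlContext.sql("SELECT Address('NY', 12345)").show()
```

This covers fixed arities only; a truly variadic constructor UDF would still need something else, such as a custom Catalyst expression.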