Please see the last line of convertToCatalyst(a: Any):

    case other => other

FYI
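[Editor's note: a hand-rolled sketch of the dispatch shape that line sits in, not the actual Spark source; the UTF8String conversion and element-wise container handling are simplifications. The point is the final fallthrough: primitives such as Int, Short and Long are already valid Catalyst values, so they never need an explicit case.]

    import org.apache.spark.unsafe.types.UTF8String

    // Simplified sketch of the convertToCatalyst dispatch, not the real implementation.
    def convertToCatalyst(a: Any): Any = a match {
      case s: String   => UTF8String.fromString(s)   // types with a dedicated Catalyst form are converted
      case seq: Seq[_] => seq.map(convertToCatalyst) // containers are converted element-wise
      case other       => other                      // Int, Short, Long, Double, ... pass through unchanged
    }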
On Mon, Feb 15, 2016 at 12:09 AM, Fabian Böhnlein <fabian.boehnl...@gmail.com> wrote:

> Interesting, thanks.
>
> The (only) publicly accessible method seems to be *convertToCatalyst*:
>
> https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystTypeConverters.scala#L425
>
> It seems to be missing some types like Integer, Short, Long... I'll give it a try.
>
> Thanks,
> Fabian
>
> On 12/02/16 05:53, Yogesh Mahajan wrote:
>
>> Right, thanks Ted.
>>
>> On Fri, Feb 12, 2016 at 10:21 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>
>>> Minor correction: the class is CatalystTypeConverters.scala.
>>>
>>> On Thu, Feb 11, 2016 at 8:46 PM, Yogesh Mahajan <ymaha...@snappydata.io> wrote:
>>>
>>>> CatalystTypeConverters.scala has all kinds of utility methods to
>>>> convert from Scala types to Row and vice versa.
>>>>
>>>> On Fri, Feb 12, 2016 at 12:21 AM, Rishabh Wadhawan <rishabh...@gmail.com> wrote:
>>>>
>>>>> I had the same issue. I resolved it in Java, but I am pretty sure it
>>>>> would work with Scala too. It's kind of a gross hack, but here is what
>>>>> I did: say I had a table in MySQL with 1000 columns. I issued a JDBC
>>>>> query to extract the schema of the table, stored that schema, and
>>>>> wrote a map function to create StructFields using StructType and
>>>>> RowFactory. Then I loaded that table as a DataFrame (which had a
>>>>> schema) and converted it into an RDD, at which point it lost the
>>>>> schema. I performed some operations on that RDD and then converted it
>>>>> back using the StructFields.
>>>>>
>>>>> If your source is structured, it is better to load it directly as a
>>>>> DataFrame so you can preserve the schema. In your case, however, you
>>>>> should do something like this:
>>>>>
>>>>>     List<StructField> fields = new ArrayList<>();
>>>>>     for (String key : map.keySet()) {
>>>>>         fields.add(DataTypes.createStructField(key, DataTypes.StringType, true));
>>>>>     }
>>>>>
>>>>>     StructType schemaOfDataFrame = DataTypes.createStructType(fields);
>>>>>
>>>>>     sqlContext.createDataFrame(rdd, schemaOfDataFrame);
>>>>>
>>>>> This is how I would do it in Java; not sure about the Scala syntax.
>>>>> Please tell me if that helped.
>>>>>
>>>>> On Feb 11, 2016, at 7:20 AM, Fabian Böhnlein <fabian.boehnl...@gmail.com> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> is there a way to create a Spark SQL Row schema based on Scala data
>>>>>> types, without creating a manual mapping?
>>>>>>
>>>>>> This is the only example I can find which doesn't already require
>>>>>> spark.sql.types.DataType as input, but it requires defining the
>>>>>> types as strings:
>>>>>>
>>>>>>     val struct = (new StructType)
>>>>>>       .add("a", "int")
>>>>>>       .add("b", "long")
>>>>>>       .add("c", "string")
>>>>>>
>>>>>> Specifically, I have an RDD where each element is a Map of 100s of
>>>>>> variables with different data types, which I want to transform into
>>>>>> a DataFrame where the keys end up as the column names:
>>>>>>
>>>>>>     Map("Amean" -> 20.3, "Asize" -> 12, "Bmean" -> ...)
>>>>>>
>>>>>> Is there a different possibility than building a mapping from the
>>>>>> values' .getClass to the Spark SQL DataTypes?
>>>>>>
>>>>>> Thanks,
>>>>>> Fabian
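[Editor's note: the thread does not surface a built-in way around the manual mapping, so the Scala sketch below simply spells out the .getClass-to-DataType mapping Fabian describes. It assumes an rdd: RDD[Map[String, Any]] and a sqlContext in scope (both hypothetical names), and that every element of the RDD carries the same keys; the dataTypeOf helper is an illustration to be extended with whatever value classes actually occur.]

    import org.apache.spark.sql.Row
    import org.apache.spark.sql.types._

    // Hypothetical helper: map a runtime value to a Spark SQL DataType.
    // Extend the match with whatever value classes your maps contain.
    def dataTypeOf(v: Any): DataType = v match {
      case _: Int     => IntegerType
      case _: Long    => LongType
      case _: Short   => ShortType
      case _: Double  => DoubleType
      case _: Boolean => BooleanType
      case _: String  => StringType
      case other      => throw new IllegalArgumentException(s"no mapping for ${other.getClass}")
    }

    // Derive the schema from one representative element, then emit Rows
    // in the same column order for every element.
    val first  = rdd.first()                 // assumes rdd: RDD[Map[String, Any]]
    val keys   = first.keys.toSeq
    val schema = StructType(keys.map(k => StructField(k, dataTypeOf(first(k)), nullable = true)))
    val rows   = rdd.map(m => Row.fromSeq(keys.map(m(_))))
    val df     = sqlContext.createDataFrame(rows, schema)

createDataFrame then pushes each value through CatalystTypeConverters, which is exactly where the case other => other fallthrough discussed above applies: the Int, Long and Double values pass through as-is.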