Hi!

I create a DataFrame as follows:
JavaRDD<Row> rows = ...
StructType structType = ...
Then I call sqlContext.createDataFrame(rows, structType).

I have a fairly complex schema:
root
 |-- Id: long (nullable = true)
 |-- attributes: struct (nullable = true)
 |    |-- FirstName: array (nullable = true)
 |    |    |-- element: string (containsNull = true)
 |    |-- Identifiers: array (nullable = true)
 |    |    |-- element: struct (containsNull = true)
 |    |    |    |-- Type: array (nullable = true)
 |    |    |    |    |-- element: string (containsNull = true)
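
For reference, a StructType matching the printed schema above could be built roughly like this. It is only a sketch: the class and method names (SchemaSketch, buildSchema) are hypothetical, and only the field names, types and nullability are taken from the schema as printed.

```java
import java.util.Arrays;

import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

// Hypothetical helper class; only the field names and types come from the printed schema.
class SchemaSketch {

    static StructType buildSchema() {
        // Element type of attributes.Identifiers: a struct with one field, Type: array<string>.
        StructType identifier = DataTypes.createStructType(Arrays.asList(
            DataTypes.createStructField(
                "Type", DataTypes.createArrayType(DataTypes.StringType, true), true)));

        // attributes: struct<FirstName: array<string>, Identifiers: array<struct<...>>>.
        StructType attributes = DataTypes.createStructType(Arrays.asList(
            DataTypes.createStructField(
                "FirstName", DataTypes.createArrayType(DataTypes.StringType, true), true),
            DataTypes.createStructField(
                "Identifiers", DataTypes.createArrayType(identifier, true), true)));

        // Top level: Id plus the nested attributes struct.
        return DataTypes.createStructType(Arrays.asList(
            DataTypes.createStructField("Id", DataTypes.LongType, true),
            DataTypes.createStructField("attributes", attributes, true)));
    }

    public static void main(String[] args) {
        // Prints the schema as a tree, in the same layout as printSchema().
        System.out.println(buildSchema().treeString());
    }
}
```

Building the nested types bottom-up keeps each level readable and makes it easy to compare against the printSchema() output.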

The problem is that when I explode the attributes.Identifiers column, one more 
field appears in the schema:
|-- Identifiers: string (nullable = true)
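
For comparison, one common way to explode an array column from Java is the functions.explode column function. The sketch below is an assumption about how the explode could be expressed, not necessarily the call that produced the string-typed field above; the class and method names (ExplodeSketch, explodedIdentifiers) are hypothetical.

```java
import org.apache.spark.sql.Column;
import org.apache.spark.sql.functions;

// Hypothetical helper; not necessarily the call used in this post.
class ExplodeSketch {

    // Builds a column expression that yields one output row per element of
    // attributes.Identifiers and names the resulting column "Identifiers".
    static Column explodedIdentifiers() {
        return functions.explode(functions.col("attributes.Identifiers")).as("Identifiers");
    }

    // Intended usage, assuming df is the DataFrame created above:
    //   df.select(df.col("Id"), ExplodeSketch.explodedIdentifiers()).printSchema();
}
```

With this form the exploded column would be expected to keep the element type of the array (a struct here) rather than become a string.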

The question is: why is the type of Identifiers string? Is it possible to make 
it a non-string type?
In the given example it’s clear that the type must be 
struct<array<string>>. Unfortunately, it’s not possible to cast this column, 
because casting a string to a struct is not allowed.

Are there any workarounds to get the correct schema?
Thanks in advance.

Eugene Morozov
fathers...@list.ru



