Brian Rice created SEDONA-133:
---------------------------------

             Summary: Allow user-defined schemas in Adapter.toDf()
                 Key: SEDONA-133
                 URL: https://issues.apache.org/jira/browse/SEDONA-133
             Project: Apache Sedona
          Issue Type: Improvement
            Reporter: Brian Rice


Hello!

I would like to propose a new overloaded method that supports user-defined 
schemas in {{Adapter.toDf()}} (for both SpatialRDD and JavaPairRDD). Currently, 
all fields are coerced to StringType, which does not work for every use case 
(specifically, I have structs that lose all their nested columns when cast to 
StringType). I can work around this, but it would be nice to have it available 
out of the box. Some sample code from Adapter.scala:

{{cols = cols ++ fieldNames.map(f => StructField(f, {+}StringType{+}))}}
 
{{...}}
 
{{cols = cols ++ leftFieldnames.map(fName => StructField(fName, {+}StringType{+}))}}
{{cols = cols ++ rightFieldNames.map(fName => StructField(fName, {+}StringType{+}))}}
 
My thinking is that the user could provide the schema directly in the form of a 
StructType object. Users who choose to provide a schema would be responsible 
for supplying the correct field names and data types.
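
A minimal sketch of what such a user-supplied schema might look like, using only 
the Spark SQL type classes. The field names ({{id}}, {{address}}, {{street}}, 
{{zip}}) and the helper object are hypothetical and purely illustrative; the 
overload mentioned in the final comment does not exist yet, and whether the 
geometry column belongs in the StructType is left open here.

```scala
import org.apache.spark.sql.types._

object UserSchemaSketch {
  // Hypothetical nested struct: its sub-columns would be lost if the
  // whole field were coerced to StringType.
  val addressType: StructType = StructType(Seq(
    StructField("street", StringType, nullable = true),
    StructField("zip", IntegerType, nullable = true)
  ))

  // Hypothetical user-supplied schema for the non-geometry attributes.
  val userSchema: StructType = StructType(Seq(
    StructField("id", LongType, nullable = false),
    StructField("address", addressType, nullable = true)
  ))

  // Current behavior, as quoted from Adapter.scala: every field name
  // becomes a StringType column.
  def stringFields(fieldNames: Seq[String]): Seq[StructField] =
    fieldNames.map(f => StructField(f, StringType))

  // Proposed behavior (assumption): use the caller's fields unchanged,
  // trusting the caller to match names and types to the data.
  def userFields(schema: StructType): Seq[StructField] =
    schema.fields.toSeq

  // A hypothetical overload might then be called along the lines of
  //   Adapter.toDf(spatialRDD, userSchema, sparkSession)
  // The exact signature is an assumption, not an existing API.
}
```

With this approach the {{address}} column keeps its StructType, whereas the 
current code path would type it as StringType.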
 
I would be happy to work on a PR if it's deemed appropriate. What are your 
thoughts?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
