[ 
https://issues.apache.org/jira/browse/SEDONA-133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17571647#comment-17571647
 ] 

Brian Rice commented on SEDONA-133:
-----------------------------------

I got sick, so this will take me a little while. At its core, the limitation 
comes from representing userData in the Geometry data type as a String (TSV). 
So we can either (1) change the data type, which seems complex to me because 
of serialization, since I don't fully understand how that works in 
Sedona/Spark, or (2) keep it represented as a String and convert it to other 
data types, much as might be done when reading CSV data. So I'm mostly looking 
into option 2.
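
As a rough sketch of option 2, assuming userData is a tab-separated string 
whose tokens line up one-to-one with the fields of a user-supplied StructType, 
the conversion could look something like the code below. UserDataParser, 
convert and parse are hypothetical names for illustration, not Sedona APIs, 
and only a handful of primitive types are handled.

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.types._

// Hypothetical helper (not part of Sedona): turns a TSV userData string
// into a typed Row according to a user-supplied schema.
object UserDataParser {

  // Convert a single token to the JVM value Spark expects for `dataType`.
  // Empty tokens become null for non-string types.
  def convert(token: String, dataType: DataType): Any =
    if (token == null) null
    else dataType match {
      case StringType         => token
      case _ if token.isEmpty => null
      case IntegerType        => token.toInt
      case LongType           => token.toLong
      case DoubleType         => token.toDouble
      case BooleanType        => token.toBoolean
      case other =>
        throw new IllegalArgumentException(s"Unsupported type: $other")
    }

  // Split on tabs (keeping trailing empty tokens) and convert field by field.
  def parse(userData: String, schema: StructType): Row = {
    val tokens = userData.split("\t", -1)
    require(
      tokens.length == schema.fields.length,
      s"Expected ${schema.fields.length} fields but found ${tokens.length}")
    val values = tokens.zip(schema.fields).map { case (t, f) =>
      convert(t, f.dataType)
    }
    Row.fromSeq(values.toSeq)
  }
}
```

This keeps the String representation inside the geometry untouched and only 
interprets it at the point where rows are built, which is why it avoids the 
serialization question raised for option 1.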

> Allow user-defined schemas in Adapter.toDf()
> --------------------------------------------
>
>                 Key: SEDONA-133
>                 URL: https://issues.apache.org/jira/browse/SEDONA-133
>             Project: Apache Sedona
>          Issue Type: Improvement
>            Reporter: Brian Rice
>            Assignee: Brian Rice
>            Priority: Normal
>
> Hello!
> I would like to propose a new overloaded method for supporting user-defined 
> schemas in {{Adapter.toDf()}} (for both SpatialRDD and JavaPairRDD). 
> Currently fields are coerced to StringType, which does not work for all use 
> cases (specifically, I have structs that lose all their nested columns if 
> cast to StringType). I can use a workaround, but it would be nice to have 
> this off the shelf. Some sample code from Adapter.scala:
> {{cols = cols ++ fieldNames.map(f => StructField(f, {+}StringType{+}))}}
>  
> {{...}}
>  
> {{cols = cols ++ leftFieldnames.map(fName => StructField(fName, 
> {+}StringType{+}))}}
> {{cols = cols ++ rightFieldNames.map(fName => StructField(fName, 
> {+}StringType{+}))}}
>  
> My thinking is that the user could provide the schema directly in the form of 
> a StructType object. The expectation would be that they are responsible 
> enough to provide the correct field names and data types if they want to 
> provide the schema at all.
>  
> I would be happy to work on a PR if it's deemed appropriate. What are your 
> thoughts?
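
For illustration only, one possible shape of the workaround mentioned in the 
description is to take the all-StringType DataFrame that Adapter.toDf() 
produces today and cast the named columns to the types in a user-supplied 
StructType. SchemaCast and applySchema are hypothetical names, not existing 
Sedona methods, and this is a sketch under the assumption that the target 
types are ones Spark can cast a string to.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.StructType

// Hypothetical helper (not part of Sedona): applies a user-defined schema
// to a DataFrame whose attribute columns are all StringType.
object SchemaCast {

  def applySchema(df: DataFrame, schema: StructType): DataFrame = {
    // The caller is responsible for correct field names; fail early if not.
    val missing = schema.fieldNames.filterNot(n => df.columns.contains(n))
    require(
      missing.isEmpty,
      s"Columns missing from DataFrame: ${missing.mkString(", ")}")

    val typeByName = schema.fields.map(f => f.name -> f.dataType).toMap

    // Cast columns named in the schema; leave all others (for example the
    // geometry column) unchanged.
    val projected = df.columns.map { c =>
      typeByName.get(c) match {
        case Some(t) => col(c).cast(t).as(c)
        case None    => col(c)
      }
    }
    df.select(projected: _*)
  }
}
```

A plain cast from a string to a struct is not expected to work in Spark, so 
the nested-struct case from the description would still need separate parsing 
of the string (or a non-string representation of userData).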



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
