Brian Rice created SEDONA-133:
---------------------------------
Summary: Allow user-defined schemas in Adapter.toDf()
Key: SEDONA-133
URL: https://issues.apache.org/jira/browse/SEDONA-133
Project: Apache Sedona
Issue Type: Improvement
Reporter: Brian Rice
Hello!
I would like to propose a new overloaded method that supports user-defined
schemas in {{Adapter.toDf()}} (for both SpatialRDD and JavaPairRDD). Currently,
fields are coerced to StringType, which does not work for every use case
(specifically, I have structs that lose all their nested columns when cast to
StringType). I can work around this, but it would be nice to have it available
off the shelf. The relevant code from Adapter.scala:
{{cols = cols ++ fieldNames.map(f => StructField(f, {+}StringType{+}))}}
{{...}}
{{cols = cols ++ leftFieldnames.map(fName => StructField(fName, {+}StringType{+}))}}
{{cols = cols ++ rightFieldNames.map(fName => StructField(fName, {+}StringType{+}))}}
My thinking is that the user could provide the schema directly as a StructType
object. Users who choose to provide a schema would be responsible for supplying
the correct field names and data types.
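To make the idea concrete, here is a rough sketch of how a caller-supplied
schema could take precedence over the current all-StringType default. It uses
only Spark SQL's type classes; {{SchemaSketch}}, {{defaultSchema}},
{{resolveSchema}} and the example column names are hypothetical and are not
part of Sedona. The geometry column and the conversion of row values are
left out.

```scala
import org.apache.spark.sql.types._

object SchemaSketch {
  // Mirrors the behaviour quoted from Adapter.scala: every attribute
  // column is declared as StringType.
  def defaultSchema(fieldNames: Seq[String]): StructType =
    StructType(fieldNames.map(f => StructField(f, StringType)))

  // Hypothetical helper (not part of Sedona): prefer a caller-supplied
  // schema and fall back to the all-StringType default otherwise.
  def resolveSchema(fieldNames: Seq[String],
                    userSchema: Option[StructType] = None): StructType =
    userSchema.getOrElse(defaultSchema(fieldNames))

  // A schema the default cannot express: a column holding a nested struct.
  // Field names here are made up for illustration.
  val nested: StructType = StructType(Seq(
    StructField("id", LongType),
    StructField("address", StructType(Seq(
      StructField("city", StringType),
      StructField("zip", StringType)
    )))
  ))
}
```

An overload of {{Adapter.toDf()}} could accept such a StructType and pass it
through instead of building the StringType columns itself; the existing
signatures would keep their current behaviour.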
I would be happy to work on a PR if it's deemed appropriate. What are your
thoughts?