vbmacher opened a new issue, #1213: URL: https://github.com/apache/sedona/issues/1213
## Expected behavior

Maybe this is already possible somehow, but I haven't found it anywhere. I'm relatively new to Sedona and geo-processing.

I'd like to be able to save, and later load, a spatial RDD that has already been analyzed, partitioned, and possibly indexed. We have a use case where we use such a dataset in many jobs, and it's time-consuming to create the partitioning and build the index every time. I'm not sure whether it's possible, though.

For example:

```
// save:
val spatialRdd = Adapter.toSpatialRdd(df, ...)
spatialRdd.analyze()
spatialRdd.spatialPartitioning(GridType.KDBTREE, math.min(Integer.MAX_VALUE, df.count() / 2).toInt) // IllegalArgumentException: [Sedona] Number of partitions cannot be larger than half of total records num
spatialRdd.buildIndex(IndexType.RTREE, true)
SomeSedonaUtility.saveSpatialRdd(spatialRdd, path) // <-- hypothetical API: save with index and partitioning

// load:
val rdd = SomeSedonaUtility.loadSpatialRdd(path) // <-- hypothetical API

// and usage:
val otherRdd = Adapter.toSpatialRdd(otherDs, ...)
otherRdd.spatialPartitioning(rdd.getPartitioner)

val useIndex = true
val considerBoundaryIntersection = SpatialPredicate.COVERS
val params = new JoinQuery.JoinParams(useIndex, considerBoundaryIntersection, IndexType.RTREE, JoinBuildSide.LEFT)
val joined = JoinQuery.spatialJoin(rdd, otherRdd, params)
```

## Actual behavior

To my knowledge, the index and partitioning must be built at runtime in every job.

## Steps to reproduce the problem

The feature is missing, so there is nothing to reproduce.

## Settings

Sedona version = 1.5.1

Apache Spark version = 3.5

API type = Scala

Scala version = 2.12

JRE version = 1.8

Environment = EMR