swamirishi commented on a change in pull request #536:
URL: https://github.com/apache/incubator-sedona/pull/536#discussion_r695176630
##########
File path:
core/src/main/java/org/apache/sedona/core/formatMapper/ParquetReader.java
##########
@@ -0,0 +1,38 @@
+package org.apache.sedona.core.formatMapper;
+
+import org.apache.avro.generic.GenericRecord;
+import org.apache.sedona.core.enums.GeometryType;
+import org.apache.sedona.core.formatMapper.parquet.ParquetFormatMapper;
+import org.apache.sedona.core.geometryObjects.Circle;
+import org.apache.sedona.core.io.parquet.ParquetFileReader;
+import org.apache.sedona.core.spatialRDD.SpatialRDD;
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.api.java.JavaSparkContext;
+import org.locationtech.jts.geom.Geometry;
+import org.locationtech.jts.geom.GeometryFactory;
+import org.locationtech.jts.geom.LineString;
+import org.locationtech.jts.geom.Polygon;
+
+import java.io.IOException;
+import java.util.List;
+
+public class ParquetReader extends RddReader {
+ public static <T extends Geometry> SpatialRDD<T> createSpatialRDD(JavaRDD rawRDD,
+         ParquetFormatMapper<T> formatMapper,
+         GeometryType geometryType) {
+ SpatialRDD spatialRDD = new SpatialRDD<T>(geometryType);
+ spatialRDD.rawSpatialRDD = rawRDD.mapPartitions(formatMapper);
+ return spatialRDD;
+ }
+
+ public static <T extends Geometry> SpatialRDD<T> readToGeometryRDD(JavaSparkContext sc,
Review comment:
Currently we also need to support reading arbitrary Parquet files as input.
We depend on the user to know what data the Parquet file contains; otherwise
reading it doesn't really make sense. A user may also want to apply
projections, which again requires knowing the file's metadata. That said, we
could add a feature that identifies the geometry column in Parquet files
created by the Sedona framework.
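
To make the "user knows the data" assumption concrete, here is a rough sketch
of the kind of check I have in mind. It is not code from this PR:
`ColumnSelection` and `resolve` are hypothetical names, and the file schema is
modelled as a plain list of column names rather than a real Parquet schema
object. The caller names the geometry column and the columns to project, and
we fail early if any of them is missing from the file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of this PR. The schema is assumed to be
// available as a simple list of column names.
class ColumnSelection {

    // Returns the columns to read: the user-named geometry column first,
    // followed by the requested user columns, without duplicates.
    // Throws if any requested column is absent from the file.
    static List<String> resolve(List<String> fileColumns,
                                String geometryColumn,
                                List<String> userColumns) {
        Set<String> selected = new LinkedHashSet<>();
        selected.add(geometryColumn);
        selected.addAll(userColumns);
        for (String column : selected) {
            if (!fileColumns.contains(column)) {
                throw new IllegalArgumentException(
                        "Column not found in Parquet file: " + column);
            }
        }
        return new ArrayList<>(selected);
    }

    public static void main(String[] args) {
        List<String> fileColumns = Arrays.asList("id", "name", "geom", "population");
        System.out.println(resolve(fileColumns, "geom", Arrays.asList("id", "name")));
    }
}
```

For files written by Sedona itself, the geometry column name could later come
from metadata stored in the file instead of from the caller.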