krishnarb3 commented on code in PR #609: URL: https://github.com/apache/incubator-sedona/pull/609#discussion_r851517297
##########
core/src/main/java/org/apache/sedona/core/spatialOperator/DBScanQuery.java:
##########
@@ -0,0 +1,31 @@
+package org.apache.sedona.core.spatialOperator;
+
+import org.apache.sedona.core.dbscanJudgement.DBScanJudgement;
+import org.apache.sedona.core.knnJudgement.GeometryDistanceComparator;
+import org.apache.sedona.core.knnJudgement.KnnJudgementUsingIndex;
+import org.apache.sedona.core.spatialRDD.SpatialRDD;
+import org.apache.spark.api.java.JavaRDD;
+import org.locationtech.jts.geom.Geometry;
+
+import java.io.Serializable;
+import java.util.HashSet;
+import java.util.List;
+
+public class DBScanQuery
+        implements Serializable
+{
+    public static <T extends Geometry> List<Integer> SpatialDBScanQuery(SpatialRDD<T> spatialRDD, double eps, int minPoints, boolean useIndex)
+    {
+        if (useIndex) {
+            if (spatialRDD.indexedRawRDD == null) {
+                throw new NullPointerException("Need to invoke buildIndex() first, indexedRDDNoId is null");
+            }
+            JavaRDD<Integer> result = spatialRDD.getRawSpatialRDD().repartition(1).mapPartitions(new DBScanJudgement(eps, minPoints, new HashSet<>()), true);

Review Comment:
Since this implementation of DBScan is not parallel/distributed, it does not work when there is more than one partition, hence the `repartition(1)` here.
@jiayuasu Do you think this is acceptable? If so, I will try to come up with an implementation that can run on multiple partitions later.
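To illustrate the concern, here is a minimal, self-contained sketch in plain Java. It is not Sedona's `DBScanJudgement`; the class `DbscanPartitionSketch` and its methods are hypothetical names, and the clustering is a naive O(n^2) DBSCAN over 2D points written only for this example. It compares clustering all points at once with clustering each partition independently.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration only; not part of Sedona.
class DbscanPartitionSketch {
    static final int NOISE = -1;
    static final int UNVISITED = -2;

    /** Naive DBSCAN: returns one cluster id per point, or NOISE. */
    public static int[] dbscan(double[][] pts, double eps, int minPoints) {
        int[] labels = new int[pts.length];
        Arrays.fill(labels, UNVISITED);
        int cluster = 0;
        for (int i = 0; i < pts.length; i++) {
            if (labels[i] != UNVISITED) continue;
            List<Integer> seeds = neighbors(pts, i, eps);
            if (seeds.size() < minPoints) {
                labels[i] = NOISE; // may be relabeled later as a border point
                continue;
            }
            labels[i] = cluster;
            Deque<Integer> queue = new ArrayDeque<>(seeds);
            while (!queue.isEmpty()) {
                int j = queue.poll();
                if (labels[j] == NOISE) labels[j] = cluster; // border point
                if (labels[j] != UNVISITED) continue;
                labels[j] = cluster;
                List<Integer> nb = neighbors(pts, j, eps);
                if (nb.size() >= minPoints) queue.addAll(nb); // j is a core point
            }
            cluster++;
        }
        return labels;
    }

    /** Indices of all points within eps of point i (including i itself). */
    static List<Integer> neighbors(double[][] pts, int i, double eps) {
        List<Integer> out = new ArrayList<>();
        for (int j = 0; j < pts.length; j++) {
            double dx = pts[i][0] - pts[j][0];
            double dy = pts[i][1] - pts[j][1];
            if (dx * dx + dy * dy <= eps * eps) out.add(j);
        }
        return out;
    }

    /** Number of distinct non-noise cluster ids. */
    public static int countClusters(int[] labels) {
        return (int) Arrays.stream(labels).filter(l -> l != NOISE).distinct().count();
    }

    public static void main(String[] args) {
        // Six collinear points spaced 1.0 apart.
        double[][] all = {{0, 0}, {1, 0}, {2, 0}, {3, 0}, {4, 0}, {5, 0}};
        double eps = 1.5;
        int minPoints = 3;

        int global = countClusters(dbscan(all, eps, minPoints));

        // Simulate two partitions that are clustered independently.
        int partitioned =
                countClusters(dbscan(Arrays.copyOfRange(all, 0, 3), eps, minPoints))
              + countClusters(dbscan(Arrays.copyOfRange(all, 3, 6), eps, minPoints));

        System.out.println("clusters over all points: " + global);
        System.out.println("clusters summed over partitions: " + partitioned);
    }
}
```

With these inputs the single-pass run finds one cluster, while the two independent halves each report their own cluster, because the neighbor link between the points at x = 2 and x = 3 is never examined. A multi-partition version would need some way to merge clusters across partition boundaries, which this sketch does not attempt.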