PrathameshDhapodkar opened a new issue, #987: URL: https://github.com/apache/sedona/issues/987
## Expected behavior value| current_timestamp | network_operator_name | dl_load_date | isDishNrCell | isSimDish | nrNci | AOI_ID | Cluster_ID <nestedjsonvalue> | 8/22/2023 5:44:11 PM | Digicel | 8/22/2023 | FALSE | TRUE | 3569856325 | ALB | ALB-01-Downtown ## Actual behavior value| current_timestamp | network_operator_name | dl_load_date | isDishNrCell | isSimDish | nrNci | AOI_ID | Cluster_ID <nestedjsonvalue> | 8/22/2023 5:44:11 PM | Digicel | 8/22/2023 | FALSE | TRUE | 3569856325 | | ## Steps to reproduce the problem actual dataframe is a streaming dataset running on spark cluster. 1. Create spark session 2. get shape file from location(s3 here) code: def getAoiShapeDf: DataFrame = { val aoiShapefileLocation = "s3://ue-bronze-dish-wireless-source-data-np/opensource_loc/top_shp/aoi_oto/" val aoiShapeRdd = ShapefileReader.readToGeometryRDD(session.sparkContext, aoiShapefileLocation) aoiShapeRdd.CRSTransform("epsg:4326", "epsg:5070", false) val aoiShapeDf = Adapter.toDf(aoiShapeRdd, session) aoiShapeDf } 3. join shape file dataframe with actual dataframe on ST_Contains join condition. code: def enrichWithAoi(dataframe:DataFrame,clientLatColumn: String, clientLongColumn: String): DataFrame = { val networkAoiShape = broadcast(this.getAoiShapeDf.select("geometry","AOI_ID")) val ueDataWithGeom = dataframe.withColumn("aoiGeoPoint", expr(s"ST_TRANSFORM(ST_POINT(CAST($clientLatColumn AS DOUBLE), CAST($clientLongColumn AS DOUBLE)), 'EPSG:4326', 'EPSG:5070')")) val aoiShapeJoin = ueDataWithGeom.alias("roamingAoiData").join(networkAoiShape.alias("shapeData"), expr("ST_Contains(shapeData.geometry,roamingAoiData.aoiGeoPoint)"),"LeftOuter") aoiShapeJoin.drop("geometry","aoiGeoPoint") } I tried with schema for shape files as well. Still the same result. ## Settings - EMR Serverless 6.9.0 - spark 3.3.2 - scala 2.12 - jdk 11 Sedona version = ? implementation group: 'org.apache.sedona', name: 'sedona-python-adapter-3.0_2.12', version: '1.3.1-incubating' implementation group: 'org.apache.sedona', name: 'sedona-viz-3.0_2.12', version: '1.4.1' implementation group: 'org.apache.sedona', name: 'sedona-common', version: '1.4.1' implementation group: 'org.apache.sedona', name: 'sedona-sql-3.0_2.12', version: '1.4.1' Apache Spark version = ? 3.3.2 API type = Scala, Java, Python? Scala Scala version = 2.11, 2.12, 2.13? 2.12 JRE version = 1.8, 1.11? jdk11 Environment = Standalone, AWS EC2, EMR, Azure, Databricks? AWS EMR Serverless -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@sedona.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org