Hi Pedro,

You should use sedona.apache.org instead of sedona.staged.apache.org. The `staged` website is only for us to test the website template; we haven't updated it in more than a year.
Here is the doc for Martin's RasterUDT: https://sedona.apache.org/1.4.0/api/sql/Raster-loader/

Thanks,
Jia

On Tue, Mar 21, 2023 at 8:30 AM Pedro Mano Fernandes <[email protected]> wrote:
>
> Hi Martin,
>
> It's weird I don't see your new Raster features in the docs in
> https://sedona.staged.apache.org/api/sql/Raster-loader/. I thought the
> documentation was already up-to-date after the release of sedona-1.4.0.
>
> Best regards,
>
> On Wed, 1 Mar 2023 at 10:29, Pedro Mano Fernandes <[email protected]> wrote:
>
> > Hi Martin,
> >
> > Great news! I'll give it a go and will let you know.
> >
> > Thanks for letting me know.
> > Best regards,
> >
> > On Tue, 28 Feb 2023 at 14:53, Martin Andersson <[email protected]> wrote:
> >
> >> Hi again Pedro,
> >>
> >> Since https://github.com/apache/sedona/pull/773 got merged you should
> >> now be able to use Apache Sedona for your GeoTiff processing needs. It will
> >> be included in the next Sedona release.
> >>
> >> All feedback is welcome!
> >>
> >> Br
> >> Martin Andersson
> >>
> >>
> >> On Mon, 23 Jan 2023 at 10:45, Pedro Mano Fernandes <[email protected]> wrote:
> >>
> >>> Hi Martin,
> >>>
> >>> I've tested your proposal (reading binary and UDF getValue) and it works
> >>> fine. I've actually converted the code to Scala easily. Now it's a matter
> >>> of building/optimizing around it (spatial join, aggregate points per
> >>> geotiff).
> >>>
> >>> Best,
> >>>
> >>> On Fri, 20 Jan 2023 at 13:47, Martin Andersson <[email protected]> wrote:
> >>>
> >>>> Yes, there are lots of things to consider when processing large blobs
> >>>> in Spark. What I have come to learn:
> >>>> - Do the spatial join (points and the geotiff extent) with as few
> >>>> columns as possible. Ideally an id only for the geotiff. After that join
> >>>> you can join back the geotiff using the id.
> >>>> - Aggregate the points to an array of points per geotiff.
> >>>> Your getValue udf should take an array of points and return an array of
> >>>> values. That way each geotiff is only loaded once.
> >>>> - Parquet in Spark is not very good at handling large blobs. If
> >>>> reading parquet with geotiffs is slow you can repartition() with a very
> >>>> large number to force smaller row groups when writing, or use Avro
> >>>> instead.
> >>>> https://www.uber.com/en-SE/blog/hdfs-file-format-apache-spark/
> >>>>
> >>>> Good luck!
> >>>>
> >>>> Br,
> >>>> Martin Andersson
> >>>>
> >>>>
> >>>> On Fri, 20 Jan 2023 at 13:08, Pedro Mano Fernandes <[email protected]> wrote:
> >>>>
> >>>>> Thanks Martin, it sounds promising. I'll actually give it a try before
> >>>>> going with geotiff conversions.
> >>>>>
> >>>>> I'm foreseeing some concerns, though:
> >>>>>
> >>>>> - I'm afraid it won't be optimal for a big geotiff - I may have to
> >>>>> split the geotiff into smaller geotiffs
> >>>>> - I wonder how the spatial partitioning optimization will behave
> >>>>> in such an approach - I may have to load smaller geotiffs and use their
> >>>>> geometry to join (my coordinates against envelope boundaries) before
> >>>>> calculating the getValue for my coordinates
> >>>>>
> >>>>> Best,
> >>>>>
> >>>>> On Fri, 20 Jan 2023 at 08:49, Martin Andersson <[email protected]> wrote:
> >>>>>
> >>>>>> I would read the geotiff files as binary:
> >>>>>> https://spark.apache.org/docs/latest/sql-data-sources-binaryFile.html
> >>>>>>
> >>>>>> Then you can define a udf to extract values directly from the
> >>>>>> geotiffs. If you're on Python you can use rasterio to do that.
> >>>>>>
> >>>>>> In Java it would look something like this:
> >>>>>>
> >>>>>> // GeoTools package names may differ between GeoTools versions
> >>>>>> import java.awt.image.Raster;
> >>>>>> import java.io.ByteArrayInputStream;
> >>>>>> import java.io.IOException;
> >>>>>> import org.geotools.coverage.grid.GridCoordinates2D;
> >>>>>> import org.geotools.coverage.grid.GridCoverage2D;
> >>>>>> import org.geotools.coverage.grid.GridGeometry2D;
> >>>>>> import org.geotools.gce.geotiff.GeoTiffReader;
> >>>>>> import org.geotools.geometry.DirectPosition2D;
> >>>>>> import org.opengis.referencing.operation.TransformException;
> >>>>>>
> >>>>>> // Returns the first-band value at world coordinate (x, y), or null if outside
> >>>>>> Integer getValue(byte[] geotiff, double x, double y)
> >>>>>>     throws IOException, TransformException {
> >>>>>>   try (ByteArrayInputStream inputStream = new ByteArrayInputStream(geotiff)) {
> >>>>>>     GeoTiffReader geoTiffReader = new GeoTiffReader(inputStream);
> >>>>>>     GridCoverage2D grid = geoTiffReader.read(null);
> >>>>>>     Raster raster = grid.getRenderedImage().getData();
> >>>>>>     GridGeometry2D gridGeometry = grid.getGridGeometry();
> >>>>>>
> >>>>>>     // Convert the world coordinate to a pixel (column, row) position
> >>>>>>     DirectPosition2D directPosition2D = new DirectPosition2D(x, y);
> >>>>>>     GridCoordinates2D gridCoordinates2D =
> >>>>>>         gridGeometry.worldToGrid(directPosition2D);
> >>>>>>     try {
> >>>>>>       int[] pixel = raster.getPixel(gridCoordinates2D.x,
> >>>>>>           gridCoordinates2D.y, new int[1]);
> >>>>>>       return pixel[0];
> >>>>>>     } catch (ArrayIndexOutOfBoundsException exc) {
> >>>>>>       // point is outside the extent
> >>>>>>       return null;
> >>>>>>     }
> >>>>>>   }
> >>>>>> }
> >>>>>>
> >>>>>> Br,
> >>>>>> Martin Andersson
> >>>>>>
> >>>>>> On Wed, 18 Jan 2023 at 17:59, Pedro Mano Fernandes <[email protected]> wrote:
> >>>>>>
> >>>>>>> Thanks for the update, guys.
> >>>>>>>
> >>>>>>> I'm not ready to contribute yet.
> >>>>>>>
> >>>>>>> In the meantime, the solution could perhaps be to convert GeoTiff
> >>>>>>> to another format supported by Sedona. If anyone has had this use case
> >>>>>>> before or has any idea, please share.
> >>>>>>>
> >>>>>>> Best,
> >>>>>>>
> >>>>>>> On Wed, 18 Jan 2023 at 09:47, Martin Andersson <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> I think you are looking for something like this:
> >>>>>>>> https://postgis.net/docs/RT_ST_Value.html
> >>>>>>>>
> >>>>>>>> The raster support in Sedona is very limited at the moment. The
> >>>>>>>> lack of a proper raster type makes implementing st_value impossible.
> >>>>>>>> We had a brief discussion about that recently.
> >>>>>>>> https://lists.apache.org/thread/qdfcvxl6z5pb7m7ky5zsksyytyxqwv8c
> >>>>>>>>
> >>>>>>>> If you want to make a contribution and need some guidance, please
> >>>>>>>> let me know!
> >>>>>>>>
> >>>>>>>> Br,
> >>>>>>>> Martin Andersson
> >>>>>>>>
> >>>>>>>> On Wed, 18 Jan 2023 at 05:45, Jia Yu <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>>> Hi Pedro,
> >>>>>>>>>
> >>>>>>>>> I got your point. Unfortunately, we don't have this function yet
> >>>>>>>>> in Sedona.
> >>>>>>>>> But we welcome anyone who wants to contribute this to Sedona!
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Jia
> >>>>>>>>>
> >>>>>>>>> On Tue, Jan 17, 2023 at 9:11 AM Pedro Mano Fernandes <[email protected]> wrote:
> >>>>>>>>>
> >>>>>>>>> > Hi all,
> >>>>>>>>> >
> >>>>>>>>> > Any clue? Or any documentation I can refer to?
> >>>>>>>>> >
> >>>>>>>>> > Here goes a dummy example to better explain myself: in QGIS I can click a
> >>>>>>>>> > point (coordinates) of the geotiff and get the value in that point (in this
> >>>>>>>>> > case 231 of Band 1).
> >>>>>>>>> >
> >>>>>>>>> > [image: image.png]
> >>>>>>>>> >
> >>>>>>>>> > Thanks,
> >>>>>>>>> >
> >>>>>>>>> > On Sun, 15 Jan 2023 at 16:17, Pedro Mano Fernandes <[email protected]> wrote:
> >>>>>>>>> >
> >>>>>>>>> >> Hi Jia,
> >>>>>>>>> >>
> >>>>>>>>> >> Thanks for the fast response.
> >>>>>>>>> >>
> >>>>>>>>> >> With the regular spatial join I’ll get the array of data of the whole
> >>>>>>>>> >> geotiff polygon. I was hoping to get the data element for specific
> >>>>>>>>> >> coordinates inside that polygon. In other words: I guess the array of data
> >>>>>>>>> >> corresponds to all the positions in the polygon, but I want to fetch
> >>>>>>>>> >> specific positions.
> >>>>>>>>> >>
> >>>>>>>>> >> Thanks,
> >>>>>>>>> >>
> >>>>>>>>> >> On Sun, 15 Jan 2023 at 01:09, Jia Yu <[email protected]> wrote:
> >>>>>>>>> >>
> >>>>>>>>> >>> Hi Pedro,
> >>>>>>>>> >>>
> >>>>>>>>> >>> Once you use the Sedona geotiff reader to read those geotiffs, you will get
> >>>>>>>>> >>> a dataframe with the following schema:
> >>>>>>>>> >>>
> >>>>>>>>> >>>  |-- image: struct (nullable = true)
> >>>>>>>>> >>>  |    |-- origin: string (nullable = true)
> >>>>>>>>> >>>  |    |-- Geometry: string (nullable = true)
> >>>>>>>>> >>>  |    |-- height: integer (nullable = true)
> >>>>>>>>> >>>  |    |-- width: integer (nullable = true)
> >>>>>>>>> >>>  |    |-- nBands: integer (nullable = true)
> >>>>>>>>> >>>  |    |-- data: array (nullable = true)
> >>>>>>>>> >>>  |    |    |-- element: double (containsNull = true)
> >>>>>>>>> >>>
> >>>>>>>>> >>>
> >>>>>>>>> >>> You can fetch the geometry column in the following way and then perform
> >>>>>>>>> >>> the spatial join:
> >>>>>>>>> >>>
> >>>>>>>>> >>> geotiffDF = geotiffDF.selectExpr("image.origin as origin",
> >>>>>>>>> >>>     "ST_GeomFromWkt(image.geometry) as Geom", "image.height as height",
> >>>>>>>>> >>>     "image.width as width", "image.data as data", "image.nBands as bands")
> >>>>>>>>> >>> geotiffDF.createOrReplaceTempView("GeotiffDataframe")
> >>>>>>>>> >>> geotiffDF.show()
> >>>>>>>>> >>>
> >>>>>>>>> >>> More info can be found here:
> >>>>>>>>> >>> https://sedona.apache.org/1.3.1-incubating/api/sql/Raster-loader/#geotiff-dataframe-loader
> >>>>>>>>> >>>
> >>>>>>>>> >>> Thanks,
> >>>>>>>>> >>> Jia
> >>>>>>>>> >>>
> >>>>>>>>> >>> On Sat, Jan 14, 2023 at 9:10 AM Pedro Mano Fernandes <[email protected]> wrote:
> >>>>>>>>> >>>
> >>>>>>>>> >>>> Hi everyone!
> >>>>>>>>> >>>>
> >>>>>>>>> >>>> I'm trying to use elevation data in GeoTiff format.
> >>>>>>>>> >>>> I understand how to load the dataset, as described in
> >>>>>>>>> >>>> https://sedona.staged.apache.org/api/sql/Raster-loader/#geotiff-dataframe-loader.
> >>>>>>>>> >>>> Now I'm wondering how to join this dataframe with another one that
> >>>>>>>>> >>>> contains coordinates, in order to get the elevation data for those
> >>>>>>>>> >>>> coordinates.
> >>>>>>>>> >>>>
> >>>>>>>>> >>>> Something along these lines:
> >>>>>>>>> >>>>
> >>>>>>>>> >>>> pointsDF
> >>>>>>>>> >>>>   .join(geotiffDF, ...)
> >>>>>>>>> >>>>   .select("lon", "lat", "geotiff_data")
> >>>>>>>>> >>>>
> >>>>>>>>> >>>> Are there any examples or documentation I can follow to accomplish this?
> >>>>>>>>> >>>>
> >>>>>>>>> >>>> Thanks,
> >>>>>>>>> >>>>
> >>>>>>>>> >>>> --
> >>>>>>>>> >>>> Pedro Mano Fernandes
> >>>>>>>>> >>>>
> >>>>>>>>> >>> --
> >>>>>>>>> >> Pedro Mano Fernandes
> >>>>>>>>> >>
> >>>>>>>>> >
> >>>>>>>>> >
> >>>>>>>>> > --
> >>>>>>>>> > Pedro Mano Fernandes
> >>>>>>>>> >
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Regards,
> >>>>>>>> Martin
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Pedro Mano Fernandes
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>> --
> >>>>> Pedro Mano Fernandes
> >>>>>
> >>>>
> >>>
> >>> --
> >>> Pedro Mano Fernandes
> >>>
> >>
> >
> > --
> > Pedro Mano Fernandes
> >
>
>
> --
> Pedro Mano Fernandes
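
For anyone finding this thread later: Martin's 20 Jan advice (aggregate the points per geotiff and let getValue take an array of points, so each geotiff is decoded only once) could look roughly like the sketch below. It is only an illustration under stated assumptions, not code from Sedona or from anyone in this thread. The class name BatchedRasterLookup, the method getValues and its parameters are hypothetical. To keep it dependency-free it assumes a north-up raster without rotation and does the world-to-pixel arithmetic by hand; in a real job the Raster would come from GeoTools as in Martin's snippet, and gridGeometry.worldToGrid would replace that arithmetic.

```java
import java.awt.image.Raster;

/** Hypothetical helper: looks up first-band values for many points against one decoded raster. */
final class BatchedRasterLookup {

    private BatchedRasterLookup() {
    }

    /**
     * Assumes a north-up raster: (originX, originY) is the world coordinate of the
     * upper-left corner, pixelWidth and pixelHeight are positive world units per pixel.
     *
     * @param raster already decoded raster (decode the geotiff once, then call this)
     * @param points array of {x, y} world coordinates
     * @return one value per point, null where the point lies outside the raster
     */
    static Integer[] getValues(Raster raster, double originX, double originY,
                               double pixelWidth, double pixelHeight, double[][] points) {
        Integer[] values = new Integer[points.length];
        for (int i = 0; i < points.length; i++) {
            // Column grows eastwards, row grows southwards from the upper-left corner.
            int col = (int) Math.floor((points[i][0] - originX) / pixelWidth);
            int row = (int) Math.floor((originY - points[i][1]) / pixelHeight);
            boolean inside = col >= 0 && col < raster.getWidth()
                    && row >= 0 && row < raster.getHeight();
            if (inside) {
                // Band index 0 is the first band ("Band 1" in QGIS).
                values[i] = raster.getSample(raster.getMinX() + col, raster.getMinY() + row, 0);
            } else {
                values[i] = null;
            }
        }
        return values;
    }
}
```

In a Spark job this would sit inside the udf: decode the binary column once per row, call getValues with the collected points for that geotiff, and explode the result afterwards. Returning null for points outside the extent mirrors the out-of-bounds handling in Martin's single-point version.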
