Hi Pedro,

You should use sedona.apache.org instead of sedona.staged.apache.org.
The `staged` website is only for us to test the website template. We
haven't updated it in more than a year.

Here is the doc for Martin's RasterUDT:
https://sedona.apache.org/1.4.0/api/sql/Raster-loader/

Thanks,
Jia

On Tue, Mar 21, 2023 at 8:30 AM Pedro Mano Fernandes
<[email protected]> wrote:
>
> Hi Martin,
>
> It's weird I don't see your new Raster features in the docs in
> https://sedona.staged.apache.org/api/sql/Raster-loader/. I thought the
> documentation was already up-to-date after the release of sedona-1.4.0.
>
> Best regards,
>
> On Wed, 1 Mar 2023 at 10:29, Pedro Mano Fernandes <[email protected]>
> wrote:
>
> > Hi Martin,
> >
> > Great news! I'll give it a go and will let you know.
> >
> > Thanks for letting me know.
> > Best regards,
> >
> > On Tue, 28 Feb 2023 at 14:53, Martin Andersson <
> > [email protected]> wrote:
> >
> >> Hi again Pedro,
> >>
> >> Since https://github.com/apache/sedona/pull/773 got merged you should
> >> now be able to use Apache Sedona for your GeoTiff processing needs. It will
> >> be included in the next Sedona release.
> >>
> >> All feedback is welcome!
> >>
> >> Br
> >> Martin Andersson
> >>
> >>
> >> On Mon, 23 Jan 2023 at 10:45, Pedro Mano Fernandes <
> >> [email protected]> wrote:
> >>
> >>> Hi Martin,
> >>>
> >>> I've tested your proposal (reading binary and UDF getValue) and it works
> >>> fine. I've actually converted the code to Scala easily. Now it's a matter
> >>> of building/optimizing around it (spatial join, aggregate points per
> >>> geotiff).
> >>>
> >>> Best,
> >>>
> >>> On Fri, 20 Jan 2023 at 13:47, Martin Andersson <
> >>> [email protected]> wrote:
> >>>
> >>>> Yes, there are lots of things to consider when processing large blobs
> >>>> in Spark. What I have come to learn:
> >>>>  - Do the spatial join (points and the geotiff extent) with as few
> >>>> columns as possible. Ideally an id only for the geotiff. After that join
> >>>> you can join back the geotiff using the id.
> >>>>  - Aggregate the points to an array of points per geotiff. Your
> >>>> getValue udf should take an array of points and return an array of
> >>>> values. That way each geotiff is only loaded once.
> >>>>  - Parquet in Spark is not very good at handling large blobs. If
> >>>> reading parquet with geotiffs is slow you can repartition() with a very
> >>>> large number to force smaller row groups when writing or use Avro
> >>>> instead.
> >>>> https://www.uber.com/en-SE/blog/hdfs-file-format-apache-spark/
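
The second tip above (aggregate the points so that each geotiff is decoded only once) can be sketched in plain Java. This is a minimal, hypothetical illustration, not Sedona or GeoTools code: `ToyRaster`, its fields, and `getValues` are made-up names, and the in-memory grid stands in for a raster that a real UDF would decode from the geotiff bytes once per row. It also assumes a north-up raster with square pixels.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a decoded single-band, north-up geotiff. */
final class ToyRaster {
    private final int[][] band;      // band[row][col]; row 0 is the northern edge
    private final double originX;    // world x of the upper-left corner
    private final double originY;    // world y of the upper-left corner
    private final double pixelSize;  // edge length of a square pixel, in world units

    ToyRaster(int[][] band, double originX, double originY, double pixelSize) {
        this.band = band;
        this.originX = originX;
        this.originY = originY;
        this.pixelSize = pixelSize;
    }

    /**
     * Looks up many points against the one raster held by this object.
     * Each point is {x, y} in world coordinates. The result has one entry
     * per input point, with null for points outside the raster extent.
     */
    List<Integer> getValues(double[][] points) {
        List<Integer> result = new ArrayList<>(points.length);
        for (double[] p : points) {
            // World-to-grid conversion: x grows with col, y shrinks with row.
            int col = (int) Math.floor((p[0] - originX) / pixelSize);
            int row = (int) Math.floor((originY - p[1]) / pixelSize);
            boolean inside = row >= 0 && row < band.length
                    && col >= 0 && col < band[row].length;
            result.add(inside ? Integer.valueOf(band[row][col]) : null);
        }
        return result;
    }
}
```

For a 2x2 grid {{10, 20}, {30, 40}} with its upper-left corner at (0, 2) and a pixel size of 1, the points (0.5, 1.5), (1.5, 0.5) and (5, 5) come back as 10, 40 and null.
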
> >>>>
> >>>> Good luck!
> >>>>
> >>>> Br,
> >>>> Martin Andersson
> >>>>
> >>>>
> >>>> On Fri, 20 Jan 2023 at 13:08, Pedro Mano Fernandes <
> >>>> [email protected]> wrote:
> >>>>
> >>>>> Thanks Martin, it sounds promising. I'll actually give it a try before
> >>>>> going with geotiff conversions.
> >>>>>
> >>>>> I'm foreseeing some concerns, though:
> >>>>>
> >>>>>    - I'm afraid it won't be optimal for a big geotiff - I may have to
> >>>>>    split the geotiff into smaller geotiffs
> >>>>>    - I wonder how the spatial partitioning optimization will behave
> >>>>>    in such an approach - I may have to load smaller geotiffs and use their
> >>>>>    geometry to join (my coordinates against envelope boundaries) before
> >>>>>    calculating the getValue for my coordinates
> >>>>>
> >>>>> Best,
> >>>>>
> >>>>> On Fri, 20 Jan 2023 at 08:49, Martin Andersson <
> >>>>> [email protected]> wrote:
> >>>>>
> >>>>>> I would read the geotiff files as binary:
> >>>>>> https://spark.apache.org/docs/latest/sql-data-sources-binaryFile.html
> >>>>>>
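> >>>>>> Then you can define a udf to extract values directly from the
> >>>>>> geotiffs. If you're on Python you can use rasterio to do that.
> >>>>>>
> >>>>>> In Java it would look something like this:
> >>>>>>
> >>>>>>   import java.awt.image.Raster;
> >>>>>>   import java.io.ByteArrayInputStream;
> >>>>>>   import java.io.IOException;
> >>>>>>   import org.geotools.coverage.grid.GridCoordinates2D;
> >>>>>>   import org.geotools.coverage.grid.GridCoverage2D;
> >>>>>>   import org.geotools.coverage.grid.GridGeometry2D;
> >>>>>>   import org.geotools.gce.geotiff.GeoTiffReader;
> >>>>>>   import org.geotools.geometry.DirectPosition2D;
> >>>>>>   import org.opengis.referencing.operation.TransformException;
> >>>>>>
> >>>>>>   // Returns the pixel value at world coordinate (x, y), or null if the
> >>>>>>   // point is outside the raster. Assumes a single-band raster.
> >>>>>>   Integer getValue(byte[] geotiff, double x, double y)
> >>>>>>       throws IOException, TransformException {
> >>>>>>     try (ByteArrayInputStream inputStream =
> >>>>>>         new ByteArrayInputStream(geotiff)) {
> >>>>>>       GeoTiffReader geoTiffReader = new GeoTiffReader(inputStream);
> >>>>>>       GridCoverage2D grid = geoTiffReader.read(null);
> >>>>>>       Raster raster = grid.getRenderedImage().getData();
> >>>>>>       GridGeometry2D gridGeometry = grid.getGridGeometry();
> >>>>>>
> >>>>>>       DirectPosition2D directPosition2D = new DirectPosition2D(x, y);
> >>>>>>       GridCoordinates2D gridCoordinates2D =
> >>>>>>           gridGeometry.worldToGrid(directPosition2D);
> >>>>>>       try {
> >>>>>>         int[] pixel = raster.getPixel(gridCoordinates2D.x,
> >>>>>>             gridCoordinates2D.y, new int[1]);
> >>>>>>         return pixel[0];
> >>>>>>       } catch (ArrayIndexOutOfBoundsException exc) {
> >>>>>>         // point is outside the extent
> >>>>>>         return null;
> >>>>>>       }
> >>>>>>     }
> >>>>>>   }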
> >>>>>>
> >>>>>> Br,
> >>>>>> Martin Andersson
> >>>>>>
> >>>>>> On Wed, 18 Jan 2023 at 17:59, Pedro Mano Fernandes <
> >>>>>> [email protected]> wrote:
> >>>>>>
> >>>>>>> Thanks for the update, guys.
> >>>>>>>
> >>>>>>> I'm not ready to contribute yet.
> >>>>>>>
> >>>>>>> In the meantime, the solution could perhaps be to convert GeoTiff
> >>>>>>> to another format supported by Sedona. If anyone has had this use case
> >>>>>>> before or has any idea, please share.
> >>>>>>>
> >>>>>>> Best,
> >>>>>>>
> >>>>>>> On Wed, 18 Jan 2023 at 09:47, Martin Andersson <
> >>>>>>> [email protected]> wrote:
> >>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> I think you are looking for something like this:
> >>>>>>>> https://postgis.net/docs/RT_ST_Value.html
> >>>>>>>>
> >>>>>>>> The raster support in Sedona is very limited at the moment. The
> >>>>>>>> lack of a proper raster type makes implementing st_value impossible.
> >>>>>>>> We had a brief discussion about that recently:
> >>>>>>>> https://lists.apache.org/thread/qdfcvxl6z5pb7m7ky5zsksyytyxqwv8c
> >>>>>>>>
> >>>>>>>> If you want to make a contribution and need some guidance, please
> >>>>>>>> let me know!
> >>>>>>>>
> >>>>>>>> Br,
> >>>>>>>> Martin Andersson
> >>>>>>>>
> >>>>>>>> On Wed, 18 Jan 2023 at 05:45, Jia Yu <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>>> Hi Pedro,
> >>>>>>>>>
> >>>>>>>>> I got your point. Unfortunately, we don't have this function yet
> >>>>>>>>> in Sedona.
> >>>>>>>>> But we welcome anyone who wants to contribute this to Sedona!
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Jia
> >>>>>>>>>
> >>>>>>>>> On Tue, Jan 17, 2023 at 9:11 AM Pedro Mano Fernandes <
> >>>>>>>>> [email protected]>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>> > Hi all,
> >>>>>>>>> >
> >>>>>>>>> > Any clue? Or any documentation I can refer to?
> >>>>>>>>> >
> >>>>>>>>> > Here goes a dummy example to better explain myself: in QGIS I can
> >>>>>>>>> > click a point (coordinates) of the geotiff and get the value in that
> >>>>>>>>> > point (in this case 231 of Band 1).
> >>>>>>>>> >
> >>>>>>>>> > [image: image.png]
> >>>>>>>>> >
> >>>>>>>>> > Thanks,
> >>>>>>>>> >
> >>>>>>>>> > On Sun, 15 Jan 2023 at 16:17, Pedro Mano Fernandes <
> >>>>>>>>> > [email protected]> wrote:
> >>>>>>>>> >
> >>>>>>>>> >> Hi Jia,
> >>>>>>>>> >>
> >>>>>>>>> >> Thanks for the fast response.
> >>>>>>>>> >>
> >>>>>>>>> >> With the regular spatial join I’ll get the array of data of the whole
> >>>>>>>>> >> geotiff polygon. I was hoping to get the data element for specific
> >>>>>>>>> >> coordinates inside that polygon. In other words: I guess the array of
> >>>>>>>>> >> data corresponds to all the positions in the polygon, but I want to
> >>>>>>>>> >> fetch specific positions.
> >>>>>>>>> >>
> >>>>>>>>> >> Thanks,
> >>>>>>>>> >>
> >>>>>>>>> >> On Sun, 15 Jan 2023 at 01:09, Jia Yu <[email protected]> wrote:
> >>>>>>>>> >>
> >>>>>>>>> >>> Hi Pedro,
> >>>>>>>>> >>>
> >>>>>>>>> >>> Once you use the Sedona geotiff reader to read those geotiffs, you
> >>>>>>>>> >>> will get a dataframe with the following schema:
> >>>>>>>>> >>>
> >>>>>>>>> >>>  |-- image: struct (nullable = true)
> >>>>>>>>> >>>  |    |-- origin: string (nullable = true)
> >>>>>>>>> >>>  |    |-- Geometry: string (nullable = true)
> >>>>>>>>> >>>  |    |-- height: integer (nullable = true)
> >>>>>>>>> >>>  |    |-- width: integer (nullable = true)
> >>>>>>>>> >>>  |    |-- nBands: integer (nullable = true)
> >>>>>>>>> >>>  |    |-- data: array (nullable = true)
> >>>>>>>>> >>>  |    |    |-- element: double (containsNull = true)
> >>>>>>>>> >>>
> >>>>>>>>> >>>
> >>>>>>>>> >>> You can use the following to fetch the geometry column, which you
> >>>>>>>>> >>> can then use in the spatial join:
> >>>>>>>>> >>>
> >>>>>>>>> >>> geotiffDF = geotiffDF.selectExpr(
> >>>>>>>>> >>>     "image.origin as origin",
> >>>>>>>>> >>>     "ST_GeomFromWkt(image.geometry) as Geom",
> >>>>>>>>> >>>     "image.height as height",
> >>>>>>>>> >>>     "image.width as width",
> >>>>>>>>> >>>     "image.data as data",
> >>>>>>>>> >>>     "image.nBands as bands")
> >>>>>>>>> >>> geotiffDF.createOrReplaceTempView("GeotiffDataframe")
> >>>>>>>>> >>> geotiffDF.show()
> >>>>>>>>> >>>
> >>>>>>>>> >>> More info can be found here:
> >>>>>>>>> >>> https://sedona.apache.org/1.3.1-incubating/api/sql/Raster-loader/#geotiff-dataframe-loader
> >>>>>>>>> >>>
> >>>>>>>>> >>> Thanks,
> >>>>>>>>> >>> Jia
> >>>>>>>>> >>>
> >>>>>>>>> >>> On Sat, Jan 14, 2023 at 9:10 AM Pedro Mano Fernandes <
> >>>>>>>>> >>> [email protected]> wrote:
> >>>>>>>>> >>>
> >>>>>>>>> >>>> Hi everyone!
> >>>>>>>>> >>>>
> >>>>>>>>> >>>> I'm trying to use elevation data in GeoTiff format. I understand how
> >>>>>>>>> >>>> to load the dataset, as described here:
> >>>>>>>>> >>>> https://sedona.staged.apache.org/api/sql/Raster-loader/#geotiff-dataframe-loader
> >>>>>>>>> >>>>
> >>>>>>>>> >>>> Now I'm wondering how to join this dataframe with another one that
> >>>>>>>>> >>>> contains coordinates, in order to get the elevation data for those
> >>>>>>>>> >>>> coordinates.
> >>>>>>>>> >>>>
> >>>>>>>>> >>>> Something along these lines:
> >>>>>>>>> >>>>
> >>>>>>>>> >>>> pointsDF
> >>>>>>>>> >>>>   .join(geotiffDF, ...)
> >>>>>>>>> >>>>   .select("lon", "lat", "geotiff_data")
> >>>>>>>>> >>>>
> >>>>>>>>> >>>> Are there any examples or documentation I can follow to accomplish
> >>>>>>>>> >>>> this?
> >>>>>>>>> >>>>
> >>>>>>>>> >>>> Thanks,
> >>>>>>>>> >>>>
> >>>>>>>>> >>>> --
> >>>>>>>>> >>>> Pedro Mano Fernandes
> >>>>>>>>> >>>>
> >>>>>>>>> >>> --
> >>>>>>>>> >> Pedro Mano Fernandes
> >>>>>>>>> >>
> >>>>>>>>> >
> >>>>>>>>> >
> >>>>>>>>> > --
> >>>>>>>>> > Pedro Mano Fernandes
> >>>>>>>>> >
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Regards,
> >>>>>>>> Martin
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Pedro Mano Fernandes
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>> --
> >>>>> Pedro Mano Fernandes
> >>>>>
> >>>>
> >>>
> >>> --
> >>> Pedro Mano Fernandes
> >>>
> >>
> >
> > --
> > Pedro Mano Fernandes
> >
>
>
> --
> Pedro Mano Fernandes
