jiayuasu commented on code in PR #1014:
URL: https://github.com/apache/sedona/pull/1014#discussion_r1322280702
##########
common/src/test/java/org/apache/sedona/common/raster/RasterTestBase.java:
##########
@@ -50,13 +51,17 @@ public class RasterTestBase {
GridCoverage2D multiBandRaster;
byte[] geoTiff;
+ private final PrintStream standardOut = System.out;
+ private final ByteArrayOutputStream outputStreamCaptor = new ByteArrayOutputStream();
+
@Before
public void setup() throws IOException {
oneBandRaster = RasterConstructors.fromArcInfoAsciiGrid(arc.getBytes(StandardCharsets.UTF_8));
multiBandRaster = createMultibandRaster();
ByteArrayOutputStream bos = new ByteArrayOutputStream();
new GeoTiffWriter(bos).write(multiBandRaster, new GeneralParameterValue[]{});
geoTiff = bos.toByteArray();
+ //System.setOut(new PrintStream(outputStreamCaptor));
Review Comment:
This commented-out line should be removed.
##########
common/src/main/java/org/apache/sedona/common/raster/RasterOutputs.java:
##########
@@ -132,4 +132,70 @@ public static String asBase64(GridCoverage2D raster) throws IOException {
ImageIO.write(renderedImage, "png", out);
return Base64.getEncoder().encodeToString(out.toByteArray());
}
+
+ public static String asMatrix(GridCoverage2D raster, int band, int postDecimalPrecision) {
+ RasterUtils.ensureBand(raster, band);
+ Raster rasterData = RasterUtils.getRaster(raster.getRenderedImage());
+ int dataTypeCode = rasterData.getDataBuffer().getDataType();
+ int width = rasterData.getWidth(), height = rasterData.getHeight();
+ if (RasterUtils.isDataTypeIntegral(dataTypeCode)) {
+ int[] bandValues = rasterData.getSamples(0, 0, width, height, band - 1, (int[]) null);
+ return createPaddedMatrixStringFromInt(bandValues, width, height, postDecimalPrecision);
+ }else {
+ double[] bandValues = rasterData.getSamples(0, 0, width, height, band - 1, (double[]) null);
+ return createPaddedMatrixStringFromDouble(bandValues, width, height, postDecimalPrecision);
+ }
+ }
+
+ public static String asMatrix(GridCoverage2D raster, int band) {
+ return asMatrix(raster, band, 6);
+ }
+
+
+ public static String asMatrix(GridCoverage2D raster) {
+ return asMatrix(raster, 1);
+ }
+
+ private static String createPaddedMatrixStringFromDouble(double[] values, int width, int height, int decimalPrecision) {
+ StringBuilder res = new StringBuilder();
+ int maxColWidth = 0;
+ int maxDecimalPrecision = 0;
+ for (double value : values) {
+ String[] splitByDecimal = String.valueOf(value).split("\\.");
+ int preDecimal = splitByDecimal[0].length(), postDecimal = Math.min(decimalPrecision, splitByDecimal[1].length());
+ maxDecimalPrecision = Math.max(maxDecimalPrecision, postDecimal);
+ int currWidth = preDecimal + postDecimal + 1; //add 1 for space occupied for decimal point
+ maxColWidth = Math.max(maxColWidth, currWidth);
+ }
+ for (int i = 0; i < values.length; i++) {
+ int row= i / width, col = i % width;
+ String fmt = String.format("%s%%%d.%df%s",
+ col == 0 ? "|" : " ",
+ maxColWidth,
+ maxDecimalPrecision,
+ col < width - 1 ? "" : "|%n");
+ res.append(String.format(fmt, values[i]));
+ }
+
+ return res.toString();
+ }
+
+ private static String createPaddedMatrixStringFromInt(int[] values, int width, int height, int decimalPrecision) {
Review Comment:
Since `decimalPrecision` is not used in this function, why not remove it?
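To illustrate, here is a rough sketch of an integer formatter that takes only the parameters it actually needs. It is not the PR's implementation: the class name `IntMatrixSketch`, the two-argument signature, and the padding logic are assumptions for illustration, loosely modelled on the double version above.

```java
final class IntMatrixSketch {

    // Hypothetical variant without the unused decimalPrecision (and height) parameters.
    static String createPaddedMatrixStringFromInt(int[] values, int width) {
        // The widest printed value (including any minus sign) sets the column width.
        int maxColWidth = 1;
        for (int value : values) {
            maxColWidth = Math.max(maxColWidth, String.valueOf(value).length());
        }
        StringBuilder res = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            int col = i % width;
            // Open each row with '|', separate the remaining columns with a space.
            res.append(col == 0 ? "|" : " ");
            res.append(String.format("%" + maxColWidth + "d", values[i]));
            // Close the row after the last column.
            if (col == width - 1) {
                res.append("|").append(System.lineSeparator());
            }
        }
        return res.toString();
    }

    public static void main(String[] args) {
        int[] band = {1, 3, 4, 0, 2, 9, 10, 11, 3, 4, 5, 6};
        System.out.print(createPaddedMatrixStringFromInt(band, 4));
    }
}
```

With a signature like this, the call site in `asMatrix` would no longer need to pass `postDecimalPrecision` on the integral branch.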
##########
common/src/main/java/org/apache/sedona/common/utils/RasterUtils.java:
##########
@@ -347,4 +348,19 @@ public static int getDataTypeCode(String s) {
}
return 5; // defaulting to double
}
+
+ public static boolean isDataTypeIntegral(int dataTypeCode) {
+ //returns true if the datatype code refers to an int-like datatype (int, short, etc)
+ switch (dataTypeCode) {
+ case 3: //int
Review Comment:
I suspect this logic will fail when the raster's data type is short or byte.
Can you add test cases for short and byte rasters?
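For reference, a minimal sketch of a check that covers all the integer-like buffer types, using the named constants from `java.awt.image.DataBuffer` rather than numeric literals (`TYPE_BYTE` = 0, `TYPE_USHORT` = 1, `TYPE_SHORT` = 2, `TYPE_INT` = 3). The class name `IntegralTypeCheckSketch` is hypothetical, and this may or may not match what the rest of the switch in the PR already does.

```java
import java.awt.image.DataBuffer;

final class IntegralTypeCheckSketch {

    // Returns true for every integer-like DataBuffer type, false otherwise
    // (float, double, undefined).
    static boolean isDataTypeIntegral(int dataTypeCode) {
        switch (dataTypeCode) {
            case DataBuffer.TYPE_BYTE:
            case DataBuffer.TYPE_USHORT:
            case DataBuffer.TYPE_SHORT:
            case DataBuffer.TYPE_INT:
                return true;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("byte:   " + isDataTypeIntegral(DataBuffer.TYPE_BYTE));
        System.out.println("short:  " + isDataTypeIntegral(DataBuffer.TYPE_SHORT));
        System.out.println("double: " + isDataTypeIntegral(DataBuffer.TYPE_DOUBLE));
    }
}
```

Test cases along these lines, one per data type, would show whether short and byte rasters take the integer branch in `asMatrix`.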
##########
common/src/main/java/org/apache/sedona/common/Constants/RasterConstants.java:
##########
@@ -0,0 +1,6 @@
+package org.apache.sedona.common.Constants;
+
+public class RasterConstants {
+
+ public static final String INVALID_BAND = "Provided band index %d is not present in the raster";
Review Comment:
I don't think we need a centralized place for error messages for now.
Usually such a centralized place is for global parameters or hard-coded values.
##########
docs/api/sql/Raster-visualizer.md:
##########
@@ -0,0 +1,65 @@
+Sedona offers some APIs to aid in easy visualization of a raster object. Some
of these APIs might be specifically for Jupyter users and will not work in SQL.
Such APIs will have a note explicitly mentioning compatibility.
+
+## Visualize raster using base64 viewers
+
+Introduction: Returns a base64 encoded string of the given raster. This
function internally takes the first 4 bands as RGBA, and converts them to the
PNG format, finally produces a base64 string. To visualize other bands, please
use it together with `RS_Band`. You can take the resulting base64 string in [an
online viewer](https://base64-viewer.onrender.com/) to check how the image
looks like.
+
+Since: `v1.5.0`
+
+Format: `RS_AsBase64(raster: Raster)`
+
+Spark SQL Example:
+
+```sql
+SELECT RS_AsBase64(raster) from rasters
+```
+
+Output:
+
+```
+iVBORw0KGgoAAAA...
+```
+
+## Visualize a Raster band as a 2D matrix
+
+Introduction: Returns a string, that when printed, outputs the raster band as
a pretty printed 2D matrix. All the values of the raster are cast to double for
the string. RS_AsMatrix allows specifying the number of digits to be considered
after the decimal point.
+RS_AsMatrix expects a raster, and optionally a band (default: 1) and
postDecimalPrecision (default: 6). The band parameter is 1-indexed.
+
+!!!Note
+ If the provided band is not present in the raster, RS_AsMatrix throws an
IllegalArgumentException
+
+!!!Note
+ If the provided raster has integral values, postDecimalPrecision (if any)
is simply ignored and integers are printed in the resultant string
+
+Since: `1.5.0`
+
+Format: `RS_AsMatrix(raster: Raster, band: Int = 1, postDecimalPrecision: Int = 6)`
+
+SQL Example:
+
+```scala
+val inputDf = Seq(Seq(1, 3.333333, 4, 0.0001, 2.2222, 9, 10, 11.11111111, 3, 4, 5, 6)).toDF("band")
+inputDf.selectExpr("RS_AsMatrix(RS_AddBandFromArray(RS_MakeEmptyRaster(1, 'd', 4, 3, 0, 0, 1, -1, 0, 0, 0), band, 1, 0))").show()
Review Comment:
The examples should be pure SQL examples.
##########
docs/api/sql/Raster-visualizer.md:
##########
@@ -0,0 +1,65 @@
+Sedona offers some APIs to aid in easy visualization of a raster object. Some
of these APIs might be specifically for Jupyter users and will not work in SQL.
Such APIs will have a note explicitly mentioning compatibility.
+
+## Visualize raster using base64 viewers
+
+Introduction: Returns a base64 encoded string of the given raster. This
function internally takes the first 4 bands as RGBA, and converts them to the
PNG format, finally produces a base64 string. To visualize other bands, please
use it together with `RS_Band`. You can take the resulting base64 string in [an
online viewer](https://base64-viewer.onrender.com/) to check how the image
looks like.
+
+Since: `v1.5.0`
+
+Format: `RS_AsBase64(raster: Raster)`
+
+Spark SQL Example:
+
+```sql
+SELECT RS_AsBase64(raster) from rasters
+```
+
+Output:
+
+```
+iVBORw0KGgoAAAA...
+```
+
+## Visualize a Raster band as a 2D matrix
+
+Introduction: Returns a string, that when printed, outputs the raster band as
a pretty printed 2D matrix. All the values of the raster are cast to double for
the string. RS_AsMatrix allows specifying the number of digits to be considered
after the decimal point.
+RS_AsMatrix expects a raster, and optionally a band (default: 1) and
postDecimalPrecision (default: 6). The band parameter is 1-indexed.
+
+!!!Note
+ If the provided band is not present in the raster, RS_AsMatrix throws an
IllegalArgumentException
+
+!!!Note
+ If the provided raster has integral values, postDecimalPrecision (if any)
is simply ignored and integers are printed in the resultant string
+
+Since: `1.5.0`
+
+Format: `RS_AsMatrix(raster: Raster, band: Int = 1, postDecimalPrecision: Int = 6)`
+
+SQL Example:
+
+```scala
+val inputDf = Seq(Seq(1, 3.333333, 4, 0.0001, 2.2222, 9, 10, 11.11111111, 3, 4, 5, 6)).toDF("band")
+inputDf.selectExpr("RS_AsMatrix(RS_AddBandFromArray(RS_MakeEmptyRaster(1, 'd', 4, 3, 0, 0, 1, -1, 0, 0, 0), band, 1, 0))").show()
+```
+
+Output:
+```sql
+| 1.00000 3.33333 4.00000 0.00010|
+| 2.22220 9.00000 10.00000 11.11111|
+| 3.00000 4.00000 5.00000 6.00000|
+```
+
+SQL Example:
+
+```scala
+val inputDf = Seq(Seq(1, 3, 4, 0, 2, 9, 10, 11, 3, 4, 5, 6)).toDF("band")
Review Comment:
The examples should be pure SQL examples.