Csaba Ringhofer created HIVE-29323:
--------------------------------------
Summary: st_convexhull shouldn't return multipolygon
Key: HIVE-29323
URL: https://issues.apache.org/jira/browse/HIVE-29323
Project: Hive
Issue Type: Improvement
Reporter: Csaba Ringhofer
{code}
select ST_AsText(ST_ConvexHull(ST_GeomFromText('multipoint ((0 0), (2 0), (2 2), (0 2), (1 1))')));
result: MULTIPOLYGON (((0 0, 2 0, 2 2, 0 2, 0 0)))
{code}
This is unexpected, because the convex hull of a geometry is, in the general
case, a single convex polygon. PostGIS returns a Polygon except in a few
degenerate cases.
From https://postgis.net/docs/ST_ConvexHull.html :
"In the general case the convex hull is a Polygon. The convex hull of two or
more collinear points is a two-point LineString. The convex hull of one or more
identical points is a Point."
The issue was found while adding tests for ST_ConvexHull in Apache Impala
(which imports these UDFs from Hive):
https://gerrit.cloudera.org/#/c/23604/6/testdata/workloads/functional-query/queries/QueryTest/geospatial-esri.test
The current behavior originates in
https://github.com/Esri/spatial-framework-for-hadoop, and the related code
looks unchanged in Hive.
The type of the geometry is deduced by GeometryUtils.getInferredOGCType:
https://github.com/apache/hive/blob/e34e87faf32b0167f438e30344f0a04fe10f6575/ql/src/java/org/apache/hadoop/hive/ql/udf/esri/ST_ConvexHull.java#L101
https://github.com/apache/hive/blob/866cc7d2a69d2d8b018917488ab243244707aae8/ql/src/java/org/apache/hadoop/hive/ql/udf/esri/GeometryUtils.java#L197
The underlying https://github.com/Esri/geometry-api-java library uses the same
Polygon class for both polygons and multipolygons.
I think that getExteriorRingCount() should be used to decide whether the
result is a polygon or a multipolygon:
https://github.com/Esri/geometry-api-java/blob/020a1a9f6e1126663a0a669095e61d265ca95019/src/main/java/com/esri/core/geometry/Polygon.java#L144
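The proposed check could look roughly like the sketch below. It is not the actual Hive or Esri code: RingCounted is a hypothetical stand-in for com.esri.core.geometry.Polygon that exposes only getExteriorRingCount(), the OGCType enum is a local two-value stand-in, and inferPolygonType is a name invented for illustration. It assumes that a geometry with at most one exterior ring (including an empty one) should be reported as a polygon.

```java
public class InferredTypeSketch {

    // Local stand-in for the OGC type enum; only the two values relevant here.
    enum OGCType { ST_POLYGON, ST_MULTIPOLYGON }

    // Hypothetical stand-in for com.esri.core.geometry.Polygon.
    // Only the ring-count accessor mentioned in this issue is modeled.
    interface RingCounted {
        int getExteriorRingCount();
    }

    // Assumption: more than one exterior ring means multipolygon,
    // anything else (zero or one ring) is reported as a plain polygon.
    static OGCType inferPolygonType(RingCounted polygon) {
        return polygon.getExteriorRingCount() > 1
                ? OGCType.ST_MULTIPOLYGON
                : OGCType.ST_POLYGON;
    }

    public static void main(String[] args) {
        // A convex hull has a single exterior ring.
        System.out.println(inferPolygonType(() -> 1));
        // Several disjoint exterior rings.
        System.out.println(inferPolygonType(() -> 3));
    }
}
```

With such a check, the convex hull in the example query would have one exterior ring and would be reported as a polygon rather than a multipolygon.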
A related issue about ST_Aggr_ConvexHull:
https://github.com/Esri/spatial-framework-for-hadoop/pull/181
[~ayushsaxena]
--
This message was sent by Atlassian Jira
(v8.20.10#820010)