[
https://issues.apache.org/jira/browse/IMPALA-14577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18051092#comment-18051092
]
ASF subversion and git services commented on IMPALA-14577:
----------------------------------------------------------
Commit c96b7b082dff6bb2fd0062e63b508e3924e14297 in impala's branch
refs/heads/master from Csaba Ringhofer
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=c96b7b082 ]
IMPALA-14576, IMPALA-14577: add rewrite rules for geospatial relations
Apply rewrites to geospatial relations like st_intersects.
3 rewrites are added:
1. NormalizeGeospatialRelationsRule moves const arguments
to the first position (this can be useful as the current Java
implementation optimizes const first arguments, see IMPALA-14575):
st_intersects(geom_col, ST_Polygon(1,1, 1,4, 4,4, 4,1))
->
st_intersects(ST_Polygon(1,1, 1,4, 4,4, 4,1), geom_col)
2. AddEnvIntersectsRule adds st_envintersects() before
relations that can only be true when the bounding rectangles
intersect. This is useful as st_envintersects() has a native
implementation (IMPALA-14573):
st_intersects(geom1, geom2)
->
st_envintersects(geom1, geom2) AND
st_intersects(geom1, geom2)
3. PointEnvIntersectsRule replaces bounding rect (envelope)
intersection on geometries from st_point with predicates directly
on coordinates:
st_envintersects(CONST_GEOM, st_point(x, y))
->
x >= st_minx(CONST_GEOM) AND y >= st_miny(CONST_GEOM) AND
x <= st_maxx(CONST_GEOM) AND y <= st_maxy(CONST_GEOM)
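As a rough illustration of how the three rewrites above compose, the
following Python sketch applies them to a toy tuple-based expression tree.
The tree encoding and the helper names (call, is_const, to_sql, rewrite) are
hypothetical and are not Impala's frontend classes. Only the symmetric
st_intersects relation is handled here.

```python
# Toy expression nodes (all hypothetical):
#   ("col", name), ("const", text), ("call", name, args),
#   ("cmp", op, lhs, rhs), ("and", children)

def call(name, *args):
    return ("call", name, args)

def is_const(e):
    # Assumption: an expression is constant if it references no column.
    if e[0] == "col":
        return False
    if e[0] == "call":
        return all(is_const(a) for a in e[2])
    return True

def normalize(e):
    # Rule 1: move a constant argument to the first position.
    if e[0] == "call" and e[1] == "st_intersects":
        a, b = e[2]
        if is_const(b) and not is_const(a):
            return call(e[1], b, a)
    return e

def add_env_intersects(e):
    # Rule 2: put the bounding-rectangle check before the relation.
    if e[0] == "call" and e[1] == "st_intersects":
        return ("and", (call("st_envintersects", *e[2]), e))
    return e

def point_env_intersects(e):
    # Rule 3: st_envintersects(CONST, st_point(x, y)) -> coordinate predicates.
    if e[0] == "call" and e[1] == "st_envintersects":
        g, p = e[2]
        if is_const(g) and p[0] == "call" and p[1] == "st_point":
            x, y = p[2]
            return ("and", (
                ("cmp", ">=", x, call("st_minx", g)),
                ("cmp", ">=", y, call("st_miny", g)),
                ("cmp", "<=", x, call("st_maxx", g)),
                ("cmp", "<=", y, call("st_maxy", g)),
            ))
    return e

def to_sql(e):
    kind = e[0]
    if kind in ("col", "const"):
        return e[1]
    if kind == "call":
        return f"{e[1]}({', '.join(to_sql(a) for a in e[2])})"
    if kind == "cmp":
        return f"{to_sql(e[2])} {e[1]} {to_sql(e[3])}"
    if kind == "and":
        return " AND ".join(to_sql(c) for c in e[1])
    raise ValueError(kind)

def rewrite(e):
    # Apply rules 1, 2, 3 in order, each exactly once.
    e = add_env_intersects(normalize(e))
    if e[0] == "and":
        e = ("and", tuple(point_env_intersects(c) for c in e[1]))
    return e

expr = call("st_intersects",
            call("st_point", ("col", "x"), ("col", "y")),
            ("const", "CONST_GEOM"))
print(to_sql(rewrite(expr)))
```

The printed predicate starts with the four coordinate comparisons and ends
with the original relation, with CONST_GEOM moved to the first argument.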
Note that AddEnvIntersectsRule is only valid in planar geometry (the
relation functions in HIVE_ESRI are all planar).
Rules 2 and 3 are not applied if the cost of the child expression is
above some threshold.
AddEnvIntersectsRule needed a new type of expression rewrite
("non-idempotent rules") that runs rules only once to avoid
triggering the rules multiple times on the same input predicate.
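To show why a rule like this cannot be run to a fixpoint, here is a minimal
Python sketch. It uses a string-based stand-in for the rule and two
hypothetical drivers (run_to_fixpoint, run_once). It does not reflect how
Impala's rewriter is actually structured.

```python
import re

def add_env_intersects(expr: str) -> str:
    # Hypothetical stand-in for AddEnvIntersectsRule: its output still
    # contains st_intersects(...), so it would match again on a second pass.
    return re.sub(r"\bst_intersects\(([^()]*)\)",
                  r"st_envintersects(\1) AND st_intersects(\1)",
                  expr)

def run_to_fixpoint(expr, rules, max_iters=10):
    # Re-applies the rules until nothing changes (or max_iters is reached).
    for _ in range(max_iters):
        new = expr
        for rule in rules:
            new = rule(new)
        if new == expr:
            return expr
        expr = new
    return expr

def run_once(expr, rules):
    # Single pass: each rule sees the input predicate exactly once.
    for rule in rules:
        expr = rule(expr)
    return expr

print(run_once("st_intersects(a, b)", [add_env_intersects]))
print(run_to_fixpoint("st_intersects(a, b)", [add_env_intersects], max_iters=3))
```

With the fixpoint driver, every iteration adds another redundant
st_envintersects() conjunct. The single-pass driver adds exactly one.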
Other changes:
- Changed handling of malformed geometries in C++ functions from
error to warning. This is consistent with the handling in Java.
Change-Id: Id65f646db6f1c89a74253e9ff755c39c400328be
Reviewed-on: http://gerrit.cloudera.org:8080/23719
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Try bounding rect check on x/y cols directly in st_point(x,y)
> -------------------------------------------------------------
>
> Key: IMPALA-14577
> URL: https://issues.apache.org/jira/browse/IMPALA-14577
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: Csaba Ringhofer
> Assignee: Csaba Ringhofer
> Priority: Major
>
> Example:
> {code}
> st_intersects(CONST_GEOM, st_point(x, y))
> -> infer predicate:
> x >= st_minx(CONST_GEOM) AND y >= st_miny(CONST_GEOM) AND
> x <= st_maxx(CONST_GEOM) AND y <= st_maxy(CONST_GEOM) AND
> st_intersects(CONST_GEOM, st_point(x, y))
> {code}
> With IMPALA-14576 it would also be possible to do this by rewriting
> st_envIntersects:
> {code}
> st_intersects(CONST_GEOM, st_point(x, y))
> -> add st_envIntersects:
> st_envIntersects(CONST_GEOM, st_point(x, y)) AND
> st_intersects(CONST_GEOM, st_point(x, y))
> -> rewrite st_envIntersects:
> x >= st_minx(CONST_GEOM) AND y >= st_miny(CONST_GEOM) AND
> x <= st_maxx(CONST_GEOM) AND y <= st_maxy(CONST_GEOM) AND
> st_intersects(CONST_GEOM, st_point(x, y))
> {code}
> Doing <= / >= on coordinate columns directly allows using min/max
> filtering, so the bounding box check can also be done at file/page level.
> With IMPALA-14123 it is also possible to ensure that these predicates are
> always pushed down to Iceberg.
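
The min/max filtering mentioned in the issue description can be pictured
with a small Python sketch. It assumes each file or page exposes per-column
min/max statistics for x and y. The ColumnStats type, the may_match helper
and the sample statistics are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass

@dataclass
class ColumnStats:
    min: float
    max: float

def may_match(stats_x, stats_y, env):
    # env = (minx, miny, maxx, maxy) of the constant geometry.
    # "x >= minx" can only hold for some row if stats_x.max >= minx, etc.
    minx, miny, maxx, maxy = env
    return (stats_x.max >= minx and stats_x.min <= maxx and
            stats_y.max >= miny and stats_y.min <= maxy)

# Envelope of ST_Polygon(1,1, 1,4, 4,4, 4,1).
envelope = (1.0, 1.0, 4.0, 4.0)

# Made-up statistics for three units (files, row groups or pages).
units = [
    (ColumnStats(0.0, 0.5), ColumnStats(0.0, 10.0)),  # x range left of envelope
    (ColumnStats(2.0, 3.0), ColumnStats(2.0, 3.0)),   # overlaps the envelope
    (ColumnStats(3.0, 8.0), ColumnStats(5.0, 9.0)),   # y range above envelope
]

kept = [i for i, (sx, sy) in enumerate(units) if may_match(sx, sy, envelope)]
print(kept)
```

Only units whose x and y ranges both overlap the envelope need to be read.
The remaining rows are still checked by the exact relation.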