[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19429 ) Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12209/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19429 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77 Gerrit-Change-Number: 19429 Gerrit-PatchSet: 4 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 20 Jan 2023 02:53:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11626: Handle COMMIT_COMPACTION_EVENT from HMS
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/19155 ) Change subject: IMPALA-11626: Handle COMMIT_COMPACTION_EVENT from HMS .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/19155/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19155/6//COMMIT_MSG@7 PS6, Line 7: Handle > Nit: Handle Ack -- To view, visit http://gerrit.cloudera.org:8080/19155 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I464faedb4e3bbcd417bab2e3cb0d57e339d42605 Gerrit-Change-Number: 19155 Gerrit-PatchSet: 7 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Fri, 20 Jan 2023 02:43:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11626: Handle COMMIT_COMPACTION_EVENT from HMS
Hello Quanlong Huang, Daniel Becker, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/19155 to look at the new patch set (#7). Change subject: IMPALA-11626: Handle COMMIT_COMPACTION_EVENT from HMS .. IMPALA-11626: Handle COMMIT_COMPACTION_EVENT from HMS Since HIVE-24329 HMS emits an event when a compaction is committed, but Impala ignores it. Handling it would allow automatic refreshing of file metadata after commit compactions. Change-Id: I464faedb4e3bbcd417bab2e3cb0d57e339d42605 --- M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java 4 files changed, 107 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/19155/7 -- To view, visit http://gerrit.cloudera.org:8080/19155 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I464faedb4e3bbcd417bab2e3cb0d57e339d42605 Gerrit-Change-Number: 19155 Gerrit-PatchSet: 7 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/19429 ) Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/19429/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19429/3//COMMIT_MSG@40 PS3, Line 40: tri > Nit: tries Done http://gerrit.cloudera.org:8080/#/c/19429/3//COMMIT_MSG@50 PS3, Line 50: rooted tuple d > Could you explain why the first condition is true for tables/views? Is it b Yeah, needs the second point to better explain this. When it's a table/view STAR expansion, destType() of the STAR path is the type of the rooted tuple descriptor: https://github.com/apache/impala/blob/9baf790606073d88c3a2fd431110812140df0cb7/fe/src/main/java/org/apache/impala/analysis/Path.java#L354 The type of a tuple descriptor is always a StructType: https://github.com/apache/impala/blob/9baf790606073d88c3a2fd431110812140df0cb7/fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java#L88 -- To view, visit http://gerrit.cloudera.org:8080/19429 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77 Gerrit-Change-Number: 19429 Gerrit-PatchSet: 4 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 20 Jan 2023 02:32:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking
Hello Fang-Yu Rao, Daniel Becker, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/19429 to look at the new patch set (#4).

Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking ..

IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking

resolvePathWithMasking() is a wrapper on resolvePath() to further resolve nested columns inside the table masking view. When it was added, complex types in the select list were not yet supported, so the table masking view can't expose complex type columns directly in the select list. Any paths in nested types are further resolved inside the table masking view in resolvePathWithMasking(). Take the following query as an example:

  select id, nested_struct.* from complextypestbl;

If Ranger column-masking/row-filter policies are applied on the table, the query is rewritten as

  select id, nested_struct.* from (
    select mask(id) from complextypestbl
    where row-filtering-condition
  ) t;

Table masking view "t" can't expose the nested column "nested_struct". So we further resolve "nested_struct" inside the inlineView to use the masked table "complextypestbl". The underlying TableRef is expected to be a BaseTableRef.

Paths that don't reference nested columns should be resolved and returned directly (just like the original resolvePath() does). E.g.

  select v.* from masked_view v

is rewritten to

  select v.* from (
    select mask(c1), mask(c2), ..., mask(cn) from masked_view
    where row-filtering-condition
  ) v;

The STAR path "v.*" should be resolved directly. However, it is unexpectedly treated as a nested column. The code then tries to resolve it inside the table "masked_view", finds that "masked_view" is not a table, and throws an IllegalStateException.

These are the current conditions for identifying nested STAR paths:
 - The destType is STRUCT
 - And the resolved path is rooted at a valid tuple descriptor

They don't really recognize nested struct columns because STAR paths on a table/view also match these conditions. When the STAR path is an expansion on a catalog table/view, the rooted tuple descriptor is exactly the output tuple of the table/view. The destType is the type of the tuple descriptor, which is always a StructType. Note that STAR paths on other nested types, i.e. array/map, are invalid. So the first condition matches for all valid cases. The second condition also matches all valid cases since both table/view and struct STAR expansions have the path rooted at a valid tuple descriptor.

This patch fixes the check for nested struct STAR paths by checking the matched types instead. Note that if "v.*" is a table/view expansion, the matched type list is empty. If "v.*" is a struct column expansion, the matched type list contains the STRUCT column type.

Tests:
 - Add missing coverage on STAR paths (v.*) on masked views.
Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77 --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/Path.java M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_complex_types.test 4 files changed, 67 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/19429/4 -- To view, visit http://gerrit.cloudera.org:8080/19429 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77 Gerrit-Change-Number: 19429 Gerrit-PatchSet: 4 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
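The difference between the old and new checks described in the IMPALA-11845 commit message above can be illustrated with a small model. This is only a sketch under stated assumptions, not Impala's Java code: the class `ResolvedStarPath`, its fields, and both function names are hypothetical stand-ins for the resolved path, its destType, its rooted tuple descriptor, and its matched type list.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ResolvedStarPath:
    """Hypothetical, simplified stand-in for a resolved STAR path."""
    dest_type: str                      # e.g. "STRUCT"
    root_desc: Optional[str]            # name of the rooted tuple descriptor, if any
    matched_types: List[str] = field(default_factory=list)


def old_is_nested_struct_star(path: ResolvedStarPath) -> bool:
    # Old conditions: destType is STRUCT and the path is rooted at a
    # valid tuple descriptor. Table/view expansions satisfy both too.
    return path.dest_type == "STRUCT" and path.root_desc is not None


def new_is_nested_struct_star(path: ResolvedStarPath) -> bool:
    # New idea: look at the matched types. A table/view expansion has an
    # empty matched type list; a struct column expansion contains STRUCT.
    # (Checking the last matched type is an assumption of this sketch.)
    return bool(path.matched_types) and path.matched_types[-1] == "STRUCT"
```

With this model, a view expansion such as "v.*" (empty matched type list) passes the old check but not the new one, while a struct expansion such as "nested_struct.*" passes both.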
[Impala-ASF-CR] IMPALA-11013 (part 1): Support 'MIGRATE TABLE' for external Hdfs tables
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/19397 ) Change subject: IMPALA-11013 (part 1): Support 'MIGRATE TABLE' for external Hdfs tables .. Patch Set 6: (3 comments) This seems a very useful change. I have a few general comments/questions. http://gerrit.cloudera.org:8080/#/c/19397/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19397/6//COMMIT_MSG@7 PS6, Line 7: IMPALA-11013 (part 1): Support 'MIGRATE TABLE' for external Hdfs tables I know what you mean by an "Hdfs table" but maybe "Hive table" is clearer. Our new table will still be in hdfs, it will just have a different table format. http://gerrit.cloudera.org:8080/#/c/19397/6//COMMIT_MSG@14 PS6, Line 14: tables. Is it true to say "the data files themselves are not changed during this migration"? If so, it would be nice to state this explicitly. http://gerrit.cloudera.org:8080/#/c/19397/6//COMMIT_MSG@35 PS6, Line 35: - Child query 4: Drop the temporary Hdfs table. > What happens if there is an error at any step? It would be nice if we could I read that in Spark "When you migrate a Hive table to Iceberg, a backup of the table, named backup, is created." That could be a nice feature which might be easy to implement. Of course this could be deferred to future work. -- To view, visit http://gerrit.cloudera.org:8080/19397 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91e6a9cfe099c263f17b5506d6db459b79ad31a5 Gerrit-Change-Number: 19397 Gerrit-PatchSet: 6 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 20 Jan 2023 02:26:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19428 ) Change subject: IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol. .. Patch Set 5: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/12208/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/19428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7857eb5ec03eba32e06ec8d4133480f2e958ad2f Gerrit-Change-Number: 19428 Gerrit-PatchSet: 5 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 20 Jan 2023 00:24:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol.
Jason Fehr has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/19428 ) Change subject: IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol. .. IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol. When using the hs2 protocol with the http transport, include several tracing http headers by default. These headers are: * X-Request-Id -- client-defined string that identifies the http request, this string is meaningful only to the client * X-Impala-Session-Id -- session id generated by the Impala backend, will be omitted on http calls that occur before this id has been generated * X-Impala-Query-Id -- query id generated by the Impala backend, will be omitted on http calls that occur before this id has been generated The Impala shell includes these headers by default. Command line arguments have been added to remove these headers. The Impala backend logs out these headers if they are on the http request. Testing: - manual testing (verified using debugging proxy and impala logs) - new python test Change-Id: I7857eb5ec03eba32e06ec8d4133480f2e958ad2f --- M be/src/transport/THttpServer.cpp M be/src/transport/THttpServer.h M shell/ImpalaHttpClient.py M shell/impala_client.py M shell/impala_shell.py M shell/impala_shell_config_defaults.py M shell/option_parser.py M tests/common/test_dimensions.py M tests/shell/test_shell_commandline.py 9 files changed, 265 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/19428/5 -- To view, visit http://gerrit.cloudera.org:8080/19428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7857eb5ec03eba32e06ec8d4133480f2e958ad2f Gerrit-Change-Number: 19428 Gerrit-PatchSet: 5 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins
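The header rules in the IMPALA-11850 commit message above (always send a request id, omit the session and query ids until the backend has generated them) could look roughly like the following. This is a minimal sketch, not the code in shell/ImpalaHttpClient.py: the function name `build_tracing_headers` and its parameters are hypothetical, and using a UUID as the client-defined request id is an assumption. Only the three header names come from the commit message.

```python
import uuid
from typing import Dict, Optional


def build_tracing_headers(session_id: Optional[str] = None,
                          query_id: Optional[str] = None,
                          request_id: Optional[str] = None) -> Dict[str, str]:
    """Return the tracing headers for one http call (hypothetical helper)."""
    # The request id is defined by the client; a random UUID is assumed here.
    headers = {"X-Request-Id": request_id or str(uuid.uuid4())}
    # Session and query ids are omitted until they have been generated.
    if session_id:
        headers["X-Impala-Session-Id"] = session_id
    if query_id:
        headers["X-Impala-Query-Id"] = query_id
    return headers
```

A call made before a session is opened would carry only X-Request-Id; later calls would pick up the other two headers as the ids become known.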
[Impala-ASF-CR] IMPALA-11745: Add Hive's ESRI geospatial functions as builtins
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19425 ) Change subject: IMPALA-11745: Add Hive's ESRI geospatial functions as builtins .. Patch Set 13: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12207/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0ca02a70b4ba244778c9db6d14df4423072b225 Gerrit-Change-Number: 19425 Gerrit-PatchSet: 13 Gerrit-Owner: Peter Rozsa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Peter Rozsa Gerrit-Comment-Date: Thu, 19 Jan 2023 21:23:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11745: Add Hive's ESRI geospatial functions as builtins
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19425 ) Change subject: IMPALA-11745: Add Hive's ESRI geospatial functions as builtins .. Patch Set 12: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12206/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0ca02a70b4ba244778c9db6d14df4423072b225 Gerrit-Change-Number: 19425 Gerrit-PatchSet: 12 Gerrit-Owner: Peter Rozsa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Peter Rozsa Gerrit-Comment-Date: Thu, 19 Jan 2023 21:20:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11745: Add Hive's ESRI geospatial functions as builtins
Peter Rozsa has uploaded a new patch set (#13). ( http://gerrit.cloudera.org:8080/19425 ) Change subject: IMPALA-11745: Add Hive's ESRI geospatial functions as builtins .. IMPALA-11745: Add Hive's ESRI geospatial functions as builtins This change adds geospatial functions from Hive's ESRI library as builtin UDFs. Plain Hive UDFs are imported without changes, but the generic and varargs functions are handled differently; generic functions are added with all of the combinations of their parameters (cartesian product of the parameters), and varargs functions are unfolded as an nth parameter simple function. The varargs function wrappers are generated at build time and they can be configured in gen_geospatial_udf_wrappers.py. These additional steps are required because of the limitations in Impala's UDF Executor which could be further improved; in this case, the additional wrapping/mapping steps could be removed. Changes regarding function handling/creating are sourced from https://gerrit.cloudera.org/c/19177 A new backend flag was added to turn this feature on/off as "geospatial_library". The default value is "NONE" which means no geospatial function gets registered as builtin, "HIVE_ESRI" value enables this implementation. 
Known limitations: - ST_MultiLineString, ST_MultiPolygon only work with the WKT overload - ST_Polygon supports a maximum of 6 pairs of coordinates - ST_MultiPoint, ST_LineString support a maximum of 7 pairs of coordinates - ST_ConvexHull, ST_Union support a maximum of 6 geoms These limits can be increased in gen_geospatial_udf_wrappers.py and HiveEsriGeospatialBuiltins.java Tests: - test_geospatial_udfs.py added based on https://github.com/Esri/spatial-framework-for-hadoop Co-Authored-by: Csaba Ringhofer Change-Id: If0ca02a70b4ba244778c9db6d14df4423072b225 --- M be/src/common/global-flags.cc M be/src/exprs/hive-udf-call.cc M be/src/util/backend-gflag-util.cc M bin/start-impala-cluster.py M common/function-registry/CMakeLists.txt A common/function-registry/gen_geospatial_udf_wrappers.py M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/HiveEsriGeospatialBuiltins.java M fe/src/main/java/org/apache/impala/catalog/ScalarFunction.java A fe/src/main/java/org/apache/impala/hive/executor/BinaryToBinaryHiveLegacyFunctionExtractor.java M fe/src/main/java/org/apache/impala/hive/executor/HiveGenericJavaFunction.java M fe/src/main/java/org/apache/impala/hive/executor/HiveJavaFunction.java M fe/src/main/java/org/apache/impala/hive/executor/HiveJavaFunctionFactory.java M fe/src/main/java/org/apache/impala/hive/executor/HiveJavaFunctionFactoryImpl.java A fe/src/main/java/org/apache/impala/hive/executor/HiveLegacyFunctionExtractor.java M fe/src/main/java/org/apache/impala/hive/executor/HiveLegacyJavaFunction.java M fe/src/main/java/org/apache/impala/hive/executor/ImpalaDoubleWritable.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M
fe/src/test/java/org/apache/impala/hive/executor/TestHiveJavaFunctionFactory.java M java/CMakeLists.txt M java/shaded-deps/hive-exec/pom.xml M testdata/datasets/README A testdata/workloads/functional-query/queries/QueryTest/udf-esri-geospatial.test A tests/custom_cluster/test_geospatial_udfs.py 28 files changed, 3,527 insertions(+), 139 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/19425/13 -- To view, visit http://gerrit.cloudera.org:8080/19425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If0ca02a70b4ba244778c9db6d14df4423072b225 Gerrit-Change-Number: 19425 Gerrit-PatchSet: 13 Gerrit-Owner: Peter Rozsa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Peter Rozsa
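The two expansion strategies mentioned in the IMPALA-11745 commit message above (generic functions registered for every combination of their parameter types, varargs functions unfolded into fixed-arity overloads up to a limit) can be sketched as follows. This is an illustration under stated assumptions and not the contents of gen_geospatial_udf_wrappers.py or HiveEsriGeospatialBuiltins.java: the helper names, the function name "st_example", and the type names used in the usage note are all hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple

Signature = Tuple[str, Tuple[str, ...]]


def expand_generic_signatures(name: str,
                              param_type_options: Sequence[Sequence[str]]) -> List[Signature]:
    """One overload per element of the cartesian product of parameter types."""
    return [(name, combo) for combo in product(*param_type_options)]


def unfold_varargs(name: str, arg_type: str,
                   max_args: int, min_args: int = 1) -> List[Signature]:
    """One fixed-arity overload for each argument count up to max_args.

    Starting at min_args=1 is an assumption of this sketch.
    """
    return [(name, (arg_type,) * n) for n in range(min_args, max_args + 1)]
```

For example, `expand_generic_signatures("st_example", [["STRING", "BINARY"], ["STRING", "BINARY"]])` yields four overloads, and `unfold_varargs("st_example", "BINARY", 6)` yields six, which is one way a limit such as "a maximum of 6 geoms" could arise.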
[Impala-ASF-CR] IMPALA-11745: Add Hive's ESRI geospatial functions as builtins
Peter Rozsa has uploaded a new patch set (#12). ( http://gerrit.cloudera.org:8080/19425 ) Change subject: IMPALA-11745: Add Hive's ESRI geospatial functions as builtins .. IMPALA-11745: Add Hive's ESRI geospatial functions as builtins This change adds geospatial functions from Hive's ESRI library as builtin UDFs. Plain Hive UDFs are imported without changes, but the generic and varargs functions are handled differently; generic functions are added with all of the combinations of their parameters (cartesian product of the parameters), and varargs functions are unfolded as an nth parameter simple function. The varargs function wrappers are generated at build time and they can be configured in gen_geospatial_udf_wrappers.py. These additional steps are required because of the limitations in Impala's UDF Executor which could be further improved; in this case, the additional wrapping/mapping steps could be removed. Changes regarding function handling/creating are sourced from https://gerrit.cloudera.org/c/19177 A new backend flag was added to turn this feature on/off as "geospatial_library". The default value is "NONE" which means no geospatial function gets registered as builtin, "HIVE_ESRI" value enables this implementation. 
Known limitations: - ST_MultiLineString, ST_MultiPolygon only work with the WKT overload - ST_Polygon supports a maximum of 6 pairs of coordinates - ST_MultiPoint, ST_LineString support a maximum of 7 pairs of coordinates - ST_ConvexHull, ST_Union support a maximum of 6 geoms These limits can be increased in gen_geospatial_udf_wrappers.py and HiveEsriGeospatialBuiltins.java Tests: - test_geospatial_udfs.py added based on https://github.com/Esri/spatial-framework-for-hadoop Co-Authored-by: Csaba Ringhofer Change-Id: If0ca02a70b4ba244778c9db6d14df4423072b225 --- M be/src/common/global-flags.cc M be/src/exprs/hive-udf-call.cc M be/src/util/backend-gflag-util.cc M bin/start-impala-cluster.py M common/function-registry/CMakeLists.txt A common/function-registry/gen_geospatial_udf_wrappers.py M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/HiveEsriGeospatialBuiltins.java M fe/src/main/java/org/apache/impala/catalog/ScalarFunction.java A fe/src/main/java/org/apache/impala/hive/executor/BinaryToBinaryHiveLegacyFunctionExtractor.java M fe/src/main/java/org/apache/impala/hive/executor/HiveGenericJavaFunction.java M fe/src/main/java/org/apache/impala/hive/executor/HiveJavaFunction.java M fe/src/main/java/org/apache/impala/hive/executor/HiveJavaFunctionFactory.java M fe/src/main/java/org/apache/impala/hive/executor/HiveJavaFunctionFactoryImpl.java A fe/src/main/java/org/apache/impala/hive/executor/HiveLegacyFunctionExtractor.java M fe/src/main/java/org/apache/impala/hive/executor/HiveLegacyJavaFunction.java M fe/src/main/java/org/apache/impala/hive/executor/ImpalaDoubleWritable.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M
fe/src/test/java/org/apache/impala/hive/executor/TestHiveJavaFunctionFactory.java M java/CMakeLists.txt M java/shaded-deps/hive-exec/pom.xml M testdata/datasets/README A testdata/workloads/functional-query/queries/QueryTest/udf-esri-geospatial.test A tests/custom_cluster/test_geospatial_udfs.py 28 files changed, 3,527 insertions(+), 139 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/19425/12 -- To view, visit http://gerrit.cloudera.org:8080/19425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If0ca02a70b4ba244778c9db6d14df4423072b225 Gerrit-Change-Number: 19425 Gerrit-PatchSet: 12 Gerrit-Owner: Peter Rozsa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Peter Rozsa
[Impala-ASF-CR] IMPALA-11745: Add Hive's ESRI geospatial functions as builtins
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19425 ) Change subject: IMPALA-11745: Add Hive's ESRI geospatial functions as builtins .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/19425/12/tests/custom_cluster/test_geospatial_udfs.py File tests/custom_cluster/test_geospatial_udfs.py: http://gerrit.cloudera.org:8080/#/c/19425/12/tests/custom_cluster/test_geospatial_udfs.py@26 PS12, Line 26: class TestGeospatialUdfs(CustomClusterTestSuite): flake8: E302 expected 2 blank lines, found 1 -- To view, visit http://gerrit.cloudera.org:8080/19425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0ca02a70b4ba244778c9db6d14df4423072b225 Gerrit-Change-Number: 19425 Gerrit-PatchSet: 12 Gerrit-Owner: Peter Rozsa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Peter Rozsa Gerrit-Comment-Date: Thu, 19 Jan 2023 21:01:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19428 ) Change subject: IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol. .. Patch Set 4: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/12205/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/19428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7857eb5ec03eba32e06ec8d4133480f2e958ad2f Gerrit-Change-Number: 19428 Gerrit-PatchSet: 4 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 19 Jan 2023 20:42:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol.
Jason Fehr has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/19428 ) Change subject: IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol. .. IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol. When using the hs2 protocol with the http transport, include several tracing http headers by default. These headers are: * X-Request-Id -- client-defined string that identifies the http request, this string is meaningful only to the client * X-Impala-Session-Id -- session id generated by the Impala backend, will be omitted on http calls that occur before this id has been generated * X-Impala-Query-Id -- query id generated by the Impala backend, will be omitted on http calls that occur before this id has been generated The Impala shell includes these headers by default. Command line arguments have been added to remove these headers. The Impala backend logs out these headers if they are on the http request. Testing: - manual testing (verified using debugging proxy and impala logs) - new python test Change-Id: I7857eb5ec03eba32e06ec8d4133480f2e958ad2f --- M be/src/transport/THttpServer.cpp M be/src/transport/THttpServer.h M shell/ImpalaHttpClient.py M shell/impala_client.py M shell/impala_shell.py M shell/impala_shell_config_defaults.py M shell/option_parser.py M tests/common/test_dimensions.py M tests/shell/test_shell_commandline.py 9 files changed, 265 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/19428/4 -- To view, visit http://gerrit.cloudera.org:8080/19428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7857eb5ec03eba32e06ec8d4133480f2e958ad2f Gerrit-Change-Number: 19428 Gerrit-PatchSet: 4 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-11807: Rewrite iceberg metadata if not on hdfs
Noemi Pap-Takacs has posted comments on this change. ( http://gerrit.cloudera.org:8080/19432 ) Change subject: IMPALA-11807: Rewrite iceberg metadata if not on hdfs .. Patch Set 1: Code-Review+1 Thanks for the fix! -- To view, visit http://gerrit.cloudera.org:8080/19432 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic04c5abdd42cb0c1cf5abd310b06c39cf8cd64ba Gerrit-Change-Number: 19432 Gerrit-PatchSet: 1 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Comment-Date: Thu, 19 Jan 2023 17:25:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11807: Rewrite iceberg metadata if not on hdfs
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/19432 ) Change subject: IMPALA-11807: Rewrite iceberg metadata if not on hdfs .. Patch Set 1: Code-Review+1 (2 comments) http://gerrit.cloudera.org:8080/#/c/19432/1/testdata/bin/load-test-warehouse-snapshot.sh File testdata/bin/load-test-warehouse-snapshot.sh: http://gerrit.cloudera.org:8080/#/c/19432/1/testdata/bin/load-test-warehouse-snapshot.sh@120 PS1, Line 120: ${IMPALA_HOME}/testdata/bin/rewrite-iceberg-metadata.py "${WAREHOUSE_LOCATION_PREFIX}" \ So this removes the authority portion so we just have a path, even if WAREHOUSE_LOCATION_PREFIX is empty? That lets it work in S3. Makes sense to me. http://gerrit.cloudera.org:8080/#/c/19432/1/testdata/bin/load-test-warehouse-snapshot.sh@121 PS1, Line 121: $(find ${SNAPSHOT_STAGING_DIR}${TEST_WAREHOUSE_DIR}/iceberg_test -name "metadata") For future updates, I think it would be safe to search all files instead of just in iceberg_test. -- To view, visit http://gerrit.cloudera.org:8080/19432 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic04c5abdd42cb0c1cf5abd310b06c39cf8cd64ba Gerrit-Change-Number: 19432 Gerrit-PatchSet: 1 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Comment-Date: Thu, 19 Jan 2023 16:59:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol.
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/19428 ) Change subject: IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol. .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/19428/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19428/3//COMMIT_MSG@9 PS3, Line 9: When using the hs2 protocol with the http transport, include several tracing http > The commit message should be wrapped at 732 characters. I mean 72 -- To view, visit http://gerrit.cloudera.org:8080/19428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7857eb5ec03eba32e06ec8d4133480f2e958ad2f Gerrit-Change-Number: 19428 Gerrit-PatchSet: 3 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 19 Jan 2023 16:48:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11807: Rewrite iceberg metadata if not on hdfs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19432 ) Change subject: IMPALA-11807: Rewrite iceberg metadata if not on hdfs .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12204/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19432 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic04c5abdd42cb0c1cf5abd310b06c39cf8cd64ba Gerrit-Change-Number: 19432 Gerrit-PatchSet: 1 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Comment-Date: Thu, 19 Jan 2023 16:39:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11807: Rewrite iceberg metadata if not on hdfs
Gergely Fürnstáhl has uploaded this change for review. ( http://gerrit.cloudera.org:8080/19432 ) Change subject: IMPALA-11807: Rewrite iceberg metadata if not on hdfs .. IMPALA-11807: Rewrite iceberg metadata if not on hdfs Iceberg test tables are usually written on hdfs and the file paths start with "hdfs://localhost:20500/test-warehouse". Earlier we manually transformed the metadata so paths would start with "/test-warehouse". Since IMPALA-11821, testdata/bin/rewrite-iceberg-metadata.py supports not only a custom WAREHOUSE_LOCATION_PREFIX, but also the ability to trim the beginning of the file paths. This commit modifies the data load so that the metadata rewrite always executes if not on hdfs, even with an empty WAREHOUSE_LOCATION_PREFIX. Testing: - Ran iceberg tests on ozone and S3 Change-Id: Ic04c5abdd42cb0c1cf5abd310b06c39cf8cd64ba --- M testdata/bin/load-test-warehouse-snapshot.sh 1 file changed, 4 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/19432/1 -- To view, visit http://gerrit.cloudera.org:8080/19432 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ic04c5abdd42cb0c1cf5abd310b06c39cf8cd64ba Gerrit-Change-Number: 19432 Gerrit-PatchSet: 1 Gerrit-Owner: Gergely Fürnstáhl
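For readers following the IMPALA-11807 thread, the path trimming described in the commit message can be sketched roughly as below. This is only an illustration under assumptions: the helper name `rewrite_location` and its parameters are hypothetical and are not taken from testdata/bin/rewrite-iceberg-metadata.py, and the bucket name in the usage is made up.

```python
from urllib.parse import urlparse


def rewrite_location(location, warehouse_prefix=""):
    """Drop the scheme and authority from a URI, then prepend an optional prefix.

    Hypothetical helper for illustration; not the actual script's API.
    """
    # urlparse splits "hdfs://localhost:20500/test-warehouse/x" into
    # scheme="hdfs", netloc="localhost:20500", path="/test-warehouse/x".
    path = urlparse(location).path
    return warehouse_prefix.rstrip("/") + path


# With an empty prefix only the bare path remains.
print(rewrite_location("hdfs://localhost:20500/test-warehouse/iceberg_test/t"))
# With a prefix (assumed bucket name) the path is re-rooted under it.
print(rewrite_location("hdfs://localhost:20500/test-warehouse/t", "s3a://bucket"))
```

With an empty prefix the result is a plain path such as "/test-warehouse/iceberg_test/t", which matches the behaviour Michael Smith asks about in his review comment on line 120 of the script.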
[Impala-ASF-CR] IMPALA-11658: Implement Iceberg manifest caching config for Impala
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/19423 ) Change subject: IMPALA-11658: Implement Iceberg manifest caching config for Impala .. IMPALA-11658: Implement Iceberg manifest caching config for Impala Impala needs to supply Iceberg's catalog properties to enable the manifest caching feature. This commit implements the necessary config reading. Iceberg-related config is read from hadoop-conf.xml and supplied as a Map in catalog instantiation. Additionally, this patch also replaces the deprecated RuntimeIOException with its superclass, UncheckedIOException. Testing: - Pass core tests. - Checked that manifest caching works through debug logging. Change-Id: I5a60a700d2ae6302dfe395d1ef602e6b1d821888 Reviewed-on: http://gerrit.cloudera.org:8080/19423 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalog.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalogs.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHadoopCatalog.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHadoopTables.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.py 7 files changed, 65 insertions(+), 23 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/19423 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I5a60a700d2ae6302dfe395d1ef602e6b1d821888 Gerrit-Change-Number: 19423 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Yida Wu Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-11658: Implement Iceberg manifest caching config for Impala
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19423 ) Change subject: IMPALA-11658: Implement Iceberg manifest caching config for Impala .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/19423 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a60a700d2ae6302dfe395d1ef602e6b1d821888 Gerrit-Change-Number: 19423 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Yida Wu Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 19 Jan 2023 16:03:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/19429 ) Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/19429/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19429/3//COMMIT_MSG@40 PS3, Line 40: try Nit: tries http://gerrit.cloudera.org:8080/#/c/19429/3//COMMIT_MSG@50 PS3, Line 50: always matches Could you explain why the first condition is true for tables/views? Is it because their types are also represented as structs? -- To view, visit http://gerrit.cloudera.org:8080/19429 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77 Gerrit-Change-Number: 19429 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 19 Jan 2023 14:12:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11745: Add Hive's ESRI geospatial functions as builtins
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19425 ) Change subject: IMPALA-11745: Add Hive's ESRI geospatial functions as builtins .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12203/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0ca02a70b4ba244778c9db6d14df4423072b225 Gerrit-Change-Number: 19425 Gerrit-PatchSet: 10 Gerrit-Owner: Peter Rozsa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Peter Rozsa Gerrit-Comment-Date: Thu, 19 Jan 2023 13:44:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11745: Add Hive's ESRI geospatial functions as builtins
Peter Rozsa has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/19425 ) Change subject: IMPALA-11745: Add Hive's ESRI geospatial functions as builtins .. IMPALA-11745: Add Hive's ESRI geospatial functions as builtins This change adds geospatial functions from Hive's ESRI library as builtin UDFs. Plain Hive UDFs are imported without changes, but the generic and varargs functions are handled differently; generic functions are added with all of the combinations of their parameters (cartesian product of the parameters), and varargs functions are unfolded as an nth parameter simple function. The varargs function wrappers are generated at build time and they can be configured in gen_geospatial_udf_wrappers.py. These additional steps are required because of the limitations in Impala's UDFExecuter which could be further improved; in this case, the additional wrapping/mapping steps could be removed. A new backend flag was added to turn this feature on/off as "geospatial_library". The default value is "NONE" which means no geospatial function gets registered as builtin, "HIVE_ESRI" value enables this implementation. 
Known limitations: - ST_MultiLineString, ST_MultiPolygon only work with the WKT overload - ST_Polygon supports a maximum of 6 pairs of coordinates - ST_MultiPoint, ST_LineString support a maximum of 7 pairs of coordinates - ST_ConvexHull, ST_Union support a maximum of 6 geoms These limits can be increased in gen_geospatial_udf_wrappers.py and HiveEsriGeospatialBuiltins.java Tests: - test_geospatial_udfs.py added based on https://github.com/Esri/spatial-framework-for-hadoop/tree/master/hive/test Change-Id: If0ca02a70b4ba244778c9db6d14df4423072b225 --- M be/src/common/global-flags.cc M be/src/exprs/hive-udf-call.cc M be/src/util/backend-gflag-util.cc M common/function-registry/CMakeLists.txt A common/function-registry/gen_geospatial_udf_wrappers.py M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/HiveEsriGeospatialBuiltins.java M fe/src/main/java/org/apache/impala/catalog/ScalarFunction.java A fe/src/main/java/org/apache/impala/hive/executor/BinaryToBinaryHiveLegacyFunctionExtractor.java M fe/src/main/java/org/apache/impala/hive/executor/HiveGenericJavaFunction.java M fe/src/main/java/org/apache/impala/hive/executor/HiveJavaFunction.java M fe/src/main/java/org/apache/impala/hive/executor/HiveJavaFunctionFactory.java M fe/src/main/java/org/apache/impala/hive/executor/HiveJavaFunctionFactoryImpl.java A fe/src/main/java/org/apache/impala/hive/executor/HiveLegacyFunctionExtractor.java M fe/src/main/java/org/apache/impala/hive/executor/HiveLegacyJavaFunction.java M fe/src/main/java/org/apache/impala/hive/executor/ImpalaDoubleWritable.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/test/java/org/apache/impala/hive/executor/TestHiveJavaFunctionFactory.java M java/CMakeLists.txt M java/shaded-deps/hive-exec/pom.xml M testdata/datasets/README A testdata/workloads/functional-query/queries/QueryTest/udf-esri-geospatial.test A tests/custom_cluster/test_geospatial_udfs.py 27 files changed, 3,505 insertions(+), 139 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/19425/10 -- To view, visit http://gerrit.cloudera.org:8080/19425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If0ca02a70b4ba244778c9db6d14df4423072b225 Gerrit-Change-Number: 19425 Gerrit-PatchSet: 10 Gerrit-Owner: Peter Rozsa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Peter Rozsa
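The two expansion strategies in the IMPALA-11745 commit message (a cartesian product of parameter types for generic functions, and fixed-arity unfolding for varargs functions) can be sketched as below. This is a rough illustration, not the code in gen_geospatial_udf_wrappers.py: the function names `expand_generic` and `unfold_varargs`, the type names, the example UDF names, and the starting arity of 1 are all assumptions.

```python
from itertools import product


def expand_generic(name, param_type_options):
    """Return one signature per combination of the allowed parameter types.

    param_type_options holds, per parameter, the list of types it accepts.
    """
    return [(name, combo) for combo in product(*param_type_options)]


def unfold_varargs(name, arg_type, max_args):
    """Return fixed-arity signatures with 1..max_args parameters (assumed range)."""
    return [(name, (arg_type,) * n) for n in range(1, max_args + 1)]


# Hypothetical generic function taking two parameters, each STRING or BINARY.
for sig in expand_generic("ST_Example", [["STRING", "BINARY"], ["STRING", "BINARY"]]):
    print(sig)

# Hypothetical varargs function capped at 3 arguments.
for sig in unfold_varargs("ST_ExampleVarargs", "BINARY", 3):
    print(sig)
```

The cap passed as `max_args` plays the role of the limits listed under "Known limitations": raising it produces more fixed-arity wrappers.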
[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/19429 ) Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/19429/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19429/2//COMMIT_MSG@41 PS2, Line 41: it's not a table > Shouldn't this be "it's not a struct"? I think I should reword it to: "masked_view" is not a table Note that the underlying TableRef is expected to be a BaseTableRef. http://gerrit.cloudera.org:8080/#/c/19429/2//COMMIT_MSG@44 PS2, Line 44: These are the conditions for returning the STAR paths directly: > After looking at the code again, aren't the conditions for returning the pa Ah, my bad! I said it in the reverse way.. It's the condition for nested struct path. -- To view, visit http://gerrit.cloudera.org:8080/19429 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77 Gerrit-Change-Number: 19429 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 19 Jan 2023 13:08:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking
Hello Fang-Yu Rao, Daniel Becker, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/19429 to look at the new patch set (#3). Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking .. IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking resolvePathWithMasking() is a wrapper on resolvePath() to further resolve nested columns inside the table masking view. When it was added, complex types in the select list hadn't been supported yet. So the table masking view can't expose complex type columns directly in the select list. Any paths in nested types will be further resolved inside the table masking view in resolvePathWithMasking(). Take the following query as an example: select id, nested_struct.* from complextypestbl; If Ranger column-masking/row-filter policies applied on the table, the query is rewritten as select id, nested_struct.* from ( select mask(id) from complextypestbl where row-filtering-condition ) t; Table masking view "t" can't expose the nested column "nested_struct". So we further resolve "nested_struct" inside the inlineView to use the masked table "complextypestbl". The underlying TableRef is expected to be a BaseTableRef. Paths that don't reference nested columns should be resolved and returned directly (just like the original resolvePath() does). E.g. select v.* from masked_view v is rewritten to select v.* from ( select mask(c1), mask(c2), ..., mask(cn) from masked_view where row-filtering-condition ) v; The STAR path "v.*" should be resolved directly. However, it's treated as a nested column unexpectedly. The code then try to resolve it inside the table "masked_view" and found "masked_view" is not a table so throws the IllegalStateException. 
These are the conditions for nested STAR paths: - The type is STRUCT - And the resolved path is rooted at a valid tuple descriptor They don't really recognize the nested columns. In fact, all STAR paths match these conditions. Reason: - STAR expansion is only valid for paths to a struct type (or a table/view). So the first condition always matches. - The second condition also matches for STAR paths on table/view, i.e. paths of "v.*" when "v" is a catalog table/view. The rooted tuple descriptor is exactly the output tuple of the table/view. This patch fixes the check for nested struct STAR path by checking the matched types instead. Note that if "v.*" is a table/view expansion, the matched type list is empty. If "v.*" is a struct column expansion, the matched type list contains the STRUCT column type. Tests: - Add missing coverage on STAR paths (v.*) on masked views. Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77 --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/Path.java M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_complex_types.test 4 files changed, 67 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/19429/3 -- To view, visit http://gerrit.cloudera.org:8080/19429 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77 Gerrit-Change-Number: 19429 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
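To make the difference between the old and new checks in IMPALA-11845 concrete, here is a toy model of the reasoning in the commit message. It is a sketch under assumptions, not Impala's Java code: the `ResolvedPath` class, its fields, and both predicate names are hypothetical stand-ins for the behaviour the message describes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ResolvedPath:
    """Toy stand-in for a resolved STAR path; not Impala's Path class."""
    dest_is_struct: bool
    rooted_at_tuple: bool
    matched_types: List[str] = field(default_factory=list)


def old_is_nested_star(path):
    # Old conditions from the commit message: per its reasoning, these hold
    # for every STAR path, including "v.*" on a table or view.
    return path.dest_is_struct and path.rooted_at_tuple


def new_is_nested_star(path):
    # Assumed form of the fix: a struct column expansion leaves a STRUCT in
    # the matched type list, while a table/view expansion leaves it empty.
    return any(t == "STRUCT" for t in path.matched_types)


view_star = ResolvedPath(True, True, [])            # models "v.*" on a view
struct_star = ResolvedPath(True, True, ["STRUCT"])  # models "nested_struct.*"

print(old_is_nested_star(view_star), new_is_nested_star(view_star))
print(old_is_nested_star(struct_star), new_is_nested_star(struct_star))
```

In this model the old predicate cannot tell the two paths apart, whereas the matched-type check only flags the struct column expansion for further resolution inside the masking view.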
[Impala-ASF-CR] IMPALA-3120: Support Bucket Shuffle Join for bucketed table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19430 ) Change subject: IMPALA-3120: Support Bucket Shuffle Join for bucketed table .. Patch Set 6: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/8977/ -- To view, visit http://gerrit.cloudera.org:8080/19430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 Gerrit-Change-Number: 19430 Gerrit-PatchSet: 6 Gerrit-Owner: Baike Xia Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 19 Jan 2023 11:57:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3120: Support Bucket Shuffle Join for bucketed table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19430 ) Change subject: IMPALA-3120: Support Bucket Shuffle Join for bucketed table .. Patch Set 5: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/8976/ -- To view, visit http://gerrit.cloudera.org:8080/19430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 Gerrit-Change-Number: 19430 Gerrit-PatchSet: 5 Gerrit-Owner: Baike Xia Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 19 Jan 2023 11:28:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11658: Implement Iceberg manifest caching config for Impala
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/19423 ) Change subject: IMPALA-11658: Implement Iceberg manifest caching config for Impala .. Patch Set 2: Code-Review+2 Thanks for adding this feature! -- To view, visit http://gerrit.cloudera.org:8080/19423 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a60a700d2ae6302dfe395d1ef602e6b1d821888 Gerrit-Change-Number: 19423 Gerrit-PatchSet: 2 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Yida Wu Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 19 Jan 2023 10:47:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11658: Implement Iceberg manifest caching config for Impala
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19423 ) Change subject: IMPALA-11658: Implement Iceberg manifest caching config for Impala .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8978/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/19423 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a60a700d2ae6302dfe395d1ef602e6b1d821888 Gerrit-Change-Number: 19423 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Yida Wu Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 19 Jan 2023 10:47:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11658: Implement Iceberg manifest caching config for Impala
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19423 ) Change subject: IMPALA-11658: Implement Iceberg manifest caching config for Impala .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/19423 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a60a700d2ae6302dfe395d1ef602e6b1d821888 Gerrit-Change-Number: 19423 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Yida Wu Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 19 Jan 2023 10:47:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/19429 ) Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking .. Patch Set 2: (3 comments) Thanks http://gerrit.cloudera.org:8080/#/c/19429/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19429/1//COMMIT_MSG@12 PS1, Line 12: the table masking view can't expose complex type columns directly > Yeah, I filed IMPALA-11847 for the refactor since it's a broader change. It Great, thanks. http://gerrit.cloudera.org:8080/#/c/19429/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19429/2//COMMIT_MSG@41 PS2, Line 41: it's not a table Shouldn't this be "it's not a struct"? http://gerrit.cloudera.org:8080/#/c/19429/2//COMMIT_MSG@44 PS2, Line 44: These are the conditions for returning the STAR paths directly: After looking at the code again, aren't the conditions for returning the path directly that at least one of the mentioned statements is false? if (!resolvedPath.destType().isStructType() || !resolvedPath.isRootedAtTuple()) { return resolvedPath; } Shouldn't we say - the type is not struct OR - the resolved path is not rooted at a valid tuple descriptor ? -- To view, visit http://gerrit.cloudera.org:8080/19429 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77 Gerrit-Change-Number: 19429 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 19 Jan 2023 10:16:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11626: Handled COMMIT COMPACTION EVENT from HMS
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/19155 ) Change subject: IMPALA-11626: Handled COMMIT_COMPACTION_EVENT from HMS .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/19155/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19155/6//COMMIT_MSG@7 PS6, Line 7: Handled Nit: Handle -- To view, visit http://gerrit.cloudera.org:8080/19155 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I464faedb4e3bbcd417bab2e3cb0d57e339d42605 Gerrit-Change-Number: 19155 Gerrit-PatchSet: 6 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Thu, 19 Jan 2023 09:55:21 + Gerrit-HasComments: Yes