[jira] [Assigned] (IMPALA-13020) catalog-topic updates >2GB do not work due to Thrift's max message size
[ https://issues.apache.org/jira/browse/IMPALA-13020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe McDonnell reassigned IMPALA-13020: -- Assignee: Joe McDonnell > catalog-topic updates >2GB do not work due to Thrift's max message size > --- > > Key: IMPALA-13020 > URL: https://issues.apache.org/jira/browse/IMPALA-13020 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 4.2.0, Impala 4.3.0 >Reporter: Joe McDonnell >Assignee: Joe McDonnell >Priority: Critical > > Thrift 0.16.0 added a max message size to protect against malicious packets > that can consume a large amount of memory on the receiver side. This max > message size is a signed 32-bit integer, so it maxes out at 2GB (which we set > via thrift_rpc_max_message_size). > In catalog v1, the catalog-update statestore topic can become larger than 2GB > when there are a large number of tables / partitions / files. If this happens > and an Impala coordinator needs to start up (or needs a full topic update for > any other reason), it is expecting the statestore to send it the full topic > update, but the coordinator actually can't process the message. The > deserialization of the message hits the 2GB max message size limit and fails. > On the statestore side, it shows this message: > {noformat} > I0418 16:54:51.727290 3844140 statestore.cc:507] Preparing initial > catalog-update topic update for > impa...@mcdonnellthrift.vpc.cloudera.com:27000. 
Size = 2.27 GB > I0418 16:54:53.889446 3844140 thrift-util.cc:198] TSocket::write_partial() > send() : Broken pipe > I0418 16:54:53.889488 3844140 client-cache.cc:82] ReopenClient(): re-creating > client for mcdonnellthrift.vpc.cloudera.com:23000 > I0418 16:54:53.889493 3844140 thrift-util.cc:198] TSocket::write_partial() > send() : Broken pipe > I0418 16:54:53.889503 3844140 thrift-client.cc:116] Error closing connection > to: mcdonnellthrift.vpc.cloudera.com:23000, ignoring (write() send(): Broken > pipe) > I0418 16:54:56.052882 3844140 thrift-util.cc:198] TSocket::write_partial() > send() : Broken pipe > I0418 16:54:56.052932 3844140 client-cache.h:363] RPC Error: Client for > mcdonnellthrift.vpc.cloudera.com:23000 hit an unexpected exception: write() > send(): Broken pipe, type: N6apache6thrift9transport19TTransportExceptionE, > rpc: N6impala20TUpdateStateResponseE, send: not done > I0418 16:54:56.052937 3844140 client-cache.cc:174] Broken Connection, destroy > client for mcdonnellthrift.vpc.cloudera.com:23000{noformat} > On the Impala side, it doesn't give a good error, but we see this: > {noformat} > I0418 16:54:53.889683 3214537 TAcceptQueueServer.cpp:355] New connection to > server StatestoreSubscriber from client > I0418 16:54:54.080694 3214136 Frontend.java:1837] Waiting for local catalog > to be initialized, attempt: 110 > I0418 16:54:56.080920 3214136 Frontend.java:1837] Waiting for local catalog > to be initialized, attempt: 111 > I0418 16:54:58.081131 3214136 Frontend.java:1837] Waiting for local catalog > to be initialized, attempt: 112 > I0418 16:55:00.081358 3214136 Frontend.java:1837] Waiting for local catalog > to be initialized, attempt: 113{noformat} > With a patched Thrift that allows an int64_t max message size, and with that > set to a larger value, Impala was able to start up (even without restarting > the statestored). 
> Some clusters that upgrade to a newer version may hit this, as Thrift did not > previously enforce this limit, so this is something we should fix to avoid > upgrade issues. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
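The arithmetic behind the failure above can be sketched in a few lines. This is an illustrative model of a receiver-side size guard, not Impala or Thrift source; the constant and helper names are hypothetical:

```python
# Why a 2.27 GB catalog-topic update fails a signed-32-bit max-message-size
# check like the one Thrift 0.16.0 introduced.
INT32_MAX = 2**31 - 1  # upper bound of a signed 32-bit size field (~2 GB)

def check_message_size(size_bytes, max_message_size=INT32_MAX):
    """Mimics the receiver-side guard: reject messages above the limit."""
    if size_bytes > max_message_size:
        raise ValueError("MaxMessageSize reached: %d > %d"
                         % (size_bytes, max_message_size))
    return size_bytes

# ~2.27 GB, as reported in the statestore log above.
topic_update = int(2.27 * 1024**3)
assert topic_update > INT32_MAX  # the full topic update can never pass
```

So even though the statestore can build and send the message, the coordinator's deserializer rejects it before Impala code ever sees it, which matches the silent "Waiting for local catalog" loop in the log.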
[jira] [Resolved] (IMPALA-13020) catalog-topic updates >2GB do not work due to Thrift's max message size
[ https://issues.apache.org/jira/browse/IMPALA-13020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe McDonnell resolved IMPALA-13020. Fix Version/s: Impala 4.5.0 Resolution: Fixed
[jira] [Commented] (IMPALA-13020) catalog-topic updates >2GB do not work due to Thrift's max message size
[ https://issues.apache.org/jira/browse/IMPALA-13020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847419#comment-17847419 ] ASF subversion and git services commented on IMPALA-13020: -- Commit 13df8239d82a61afc3196295a7878ca2ffe91873 in impala's branch refs/heads/master from Joe McDonnell [ https://gitbox.apache.org/repos/asf?p=impala.git;h=13df8239d ] IMPALA-13020 (part 1): Change thrift_rpc_max_message_size to int64_t Thrift 0.16.0 introduced a max message size to protect receivers against a malicious message allocating large amounts of memory. That limit is a 32-bit signed integer, so the max value is 2GB. Impala introduced the thrift_rpc_max_message_size startup option to set that for Impala's thrift servers. There are times when Impala wants to send a message that is larger than 2GB. In particular, the catalog-update topic for the statestore can exceed 2GBs when there is a lot of metadata loaded using the old v1 catalog. When there is a 2GB max message size, the statestore can create and send a >2GB message, but the impalads will reject it. This can lead to impalads having stale metadata. This switches to a patched Thrift that uses an int64_t for the max message size for C++ code. It does not modify the limit. The MaxMessageSize error was being swallowed in TAcceptQueueServer.cpp, so this fixes that location to always print MaxMessageSize exceptions. This is only patching the Thrift C++ library. It does not patch the Thrift Java library. There are a few reasons for that: - This specific issue involves C++ to C++ communication and will be solved by patching the C++ library. - C++ is easy to patch as it is built via the native-toolchain. There is no corresponding mechanism for patching our Java dependencies (though one could be developed). - Java modifications have implications for other dependencies like Hive which use Thrift to communicate with HMS. 
For the Java code that uses max message size, this converts the 64-bit value to a 32-bit value by capping it at Integer.MAX_VALUE. Testing: - Added enough tables to produce a >2GB catalog-topic and restarted an impalad with a higher limit specified. Without the patch, the catalog-topic update would be rejected by the impalad. With the patch, it succeeds. Change-Id: I681b1849cc565dcb25de8c070c18776ce69cbb87 Reviewed-on: http://gerrit.cloudera.org:8080/21367 Reviewed-by: Michael Smith Reviewed-by: Joe McDonnell Tested-by: Joe McDonnell
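The capping behavior the part 1 commit describes for the Java side can be sketched like this (a hypothetical helper, not the actual Impala code):

```python
# A 64-bit max-message-size flag is clamped to Integer.MAX_VALUE before being
# handed to Java-side Thrift, which only accepts a 32-bit limit.
JAVA_INTEGER_MAX_VALUE = 2**31 - 1

def to_java_max_message_size(max_size_64):
    # Cap the configured 64-bit value at what a Java int can hold.
    return min(max_size_64, JAVA_INTEGER_MAX_VALUE)

assert to_java_max_message_size(64 * 1024**3) == JAVA_INTEGER_MAX_VALUE  # 64 GB caps
assert to_java_max_message_size(1024) == 1024  # small values pass through
```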
[jira] [Commented] (IMPALA-13020) catalog-topic updates >2GB do not work due to Thrift's max message size
[ https://issues.apache.org/jira/browse/IMPALA-13020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847420#comment-17847420 ] ASF subversion and git services commented on IMPALA-13020: -- Commit bcff4df6194b2f192d937bb9c031721feccb69df in impala's branch refs/heads/master from Joe McDonnell [ https://gitbox.apache.org/repos/asf?p=impala.git;h=bcff4df61 ] IMPALA-13020 (part 2): Split out external vs internal Thrift max message size The Thrift max message size is designed to protect against malicious messages that consume a lot of memory on the receiver. This is an important security measure for externally facing services, but it can interfere with internal communication within the cluster. Currently, the max message size is controlled by a single startup flag for both. This creates tension between having a low value to protect against malicious messages versus having a high value to avoid issues with internal communication (e.g. large statestore updates). This introduces a new flag thrift_external_rpc_max_message_size to specify the limit for externally-facing services. The current thrift_rpc_max_message_size now applies only to internal services. Splitting them apart allows setting a much higher value for internal services (64GB) while leaving the externally facing services at the current 2GB limit. This modifies various code locations that wrap a Thrift transport to pass in the original transport's TConfiguration. This also adds DCHECKs to make sure that the new transport inherits the max message size. This limits the locations where we actually need to set the max message size. ThriftServer/ThriftServerBuilder have a setting "is_external_facing" which can be specified on each ThriftServer. This modifies statestore and catalog to set is_external_facing to false. All other servers stay with the default of true. Testing: - This adds a test case to verify that is_external_facing uses the higher limit. 
- Ran through the steps in testdata/scale_test_metadata/README.md and updated the value in that doc. - Created many tables to push the catalog-update topic to be >2GB and verified that statestore successfully sends it when an impalad restarts. Change-Id: Ib9a649ef49a8a99c7bd9a1b73c37c4c621661311 Reviewed-on: http://gerrit.cloudera.org:8080/21420 Tested-by: Impala Public Jenkins Reviewed-by: Riza Suminto Reviewed-by: Michael Smith
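The external/internal split in the part 2 commit can be sketched as a simple limit-selection rule. The names below mirror the flags in the commit message, but the helper itself is an illustrative assumption, not the Impala API:

```python
# Internal (intra-cluster) services get a much higher max message size than
# externally facing ones, per the thrift_external_rpc_max_message_size split.
GB = 1024**3
THRIFT_EXTERNAL_RPC_MAX_MESSAGE_SIZE = 2 * GB   # untrusted, external clients
THRIFT_RPC_MAX_MESSAGE_SIZE = 64 * GB           # internal cluster traffic

def max_message_size_for(is_external_facing):
    # Statestore and catalog set is_external_facing=False; all other servers
    # keep the default of True.
    if is_external_facing:
        return THRIFT_EXTERNAL_RPC_MAX_MESSAGE_SIZE
    return THRIFT_RPC_MAX_MESSAGE_SIZE

# A >2GB catalog-topic update now fits under the internal limit,
# while external services keep the tighter bound:
assert int(2.27 * GB) < max_message_size_for(False)
assert int(2.27 * GB) > max_message_size_for(True)
```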
[jira] [Work started] (IMPALA-12277) metadata reload of INSERT failed by NullPointerException: Invalid partition name
[ https://issues.apache.org/jira/browse/IMPALA-12277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-12277 started by Sai Hemanth Gantasala. -- > metadata reload of INSERT failed by NullPointerException: Invalid partition > name > > > Key: IMPALA-12277 > URL: https://issues.apache.org/jira/browse/IMPALA-12277 > Project: IMPALA > Issue Type: Bug > Components: Catalog >Reporter: Quanlong Huang >Assignee: Sai Hemanth Gantasala >Priority: Major > > INSERT into a partition that exists in catalogd but doesn't exist in HMS will > fail in metadata reloading on the partition. The cause is that updateCatalog > doesn't create the partition in HMS (since catalogd is not aware of the > non-existence of the partition in HMS): > [https://github.com/apache/impala/blob/d0fe4c604f72d41019832513ebf65cfe8f469953/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L6697-L6699] > When reloading the partition, catalogd first removes it since it doesn't > exist in HMS: > [https://github.com/apache/impala/blob/d0fe4c604f72d41019832513ebf65cfe8f469953/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L1530-L1531] > It then tries to reload it, which hits a NullPointerException at: > [https://github.com/apache/impala/blob/d0fe4c604f72d41019832513ebf65cfe8f469953/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L1566] > To reproduce the issue, launch Impala with event processing disabled so > catalogd can be out of sync with HMS. 
Create a partitioned table in Impala with > one partition: > {noformat} > bin/start-impala-cluster.py --catalogd_args=--hms_event_polling_interval_s=0 > impala> create table my_part2 (id int) partitioned by (p int) stored as > textfile; > impala> insert into my_part2 partition(p=0) values (0); {noformat} > Drop the partition in Hive: > {code:sql} > hive> alter table my_part2 drop partition (p=0);{code} > Then insert the partition again in Impala > {code:sql} > impala> insert into my_part2 partition(p=0) values (1); > ERROR: TableLoadingException: Failed to load metadata for table: > default.my_part2 > CAUSED BY: NullPointerException: Invalid partition name: p=0 > {code} > The exception: > {noformat} > E0710 19:34:43.569339 4413 JniUtil.java:183] > bb4452d18eafe116:eaf16c40] Error in Update catalog for > default.my_part2. Time spent: 1s186ms > I0710 19:34:43.569918 4413 jni-util.cc:288] > bb4452d18eafe116:eaf16c40] > org.apache.impala.catalog.TableLoadingException: Failed to load metadata for > table: default.my_part2 > at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:1308) > at > org.apache.impala.service.CatalogOpExecutor.loadTableMetadata(CatalogOpExecutor.java:1521) > at > org.apache.impala.service.CatalogOpExecutor.updateCatalog(CatalogOpExecutor.java:6863) > at > org.apache.impala.service.JniCatalog.lambda$updateCatalog$16(JniCatalog.java:471) > at > org.apache.impala.service.JniCatalogOp.lambda$execAndSerialize$1(JniCatalogOp.java:90) > at org.apache.impala.service.JniCatalogOp.execOp(JniCatalogOp.java:58) > at > org.apache.impala.service.JniCatalogOp.execAndSerialize(JniCatalogOp.java:89) > at > org.apache.impala.service.JniCatalogOp.execAndSerialize(JniCatalogOp.java:100) > at > org.apache.impala.service.JniCatalog.execAndSerialize(JniCatalog.java:230) > at > org.apache.impala.service.JniCatalog.updateCatalog(JniCatalog.java:470) > Caused by: java.lang.NullPointerException: Invalid partition name: p=0 > at > 
com.google.common.base.Preconditions.checkNotNull(Preconditions.java:907) > at > org.apache.impala.catalog.HdfsTable.getPartitionsForNames(HdfsTable.java:1766) > at > org.apache.impala.catalog.HdfsTable$PartitionDeltaUpdater.apply(HdfsTable.java:1566) > at > org.apache.impala.catalog.HdfsTable.updatePartitionsFromHms(HdfsTable.java:1447) > at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:1282) > ... 9 more{noformat}
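The remove-then-reload sequence described in the report can be modeled in a few lines. This is an illustrative sketch with hypothetical names, not Impala code:

```python
# The reload first drops partitions missing from HMS, then a later lookup by
# name finds nothing and trips the not-null precondition.
hms_partitions = set()                   # HMS no longer knows p=0 (dropped in Hive)
catalog_partitions = {"p=0": object()}   # catalogd still caches it

# Step 1: reload removes partitions that no longer exist in HMS.
for name in list(catalog_partitions):
    if name not in hms_partitions:
        del catalog_partitions[name]

# Step 2: the delta updater looks the same partition up again by name.
def get_partition_for_name(name):
    part = catalog_partitions.get(name)
    if part is None:  # plays the role of Preconditions.checkNotNull(...)
        raise ValueError("Invalid partition name: %s" % name)
    return part
```

Calling `get_partition_for_name("p=0")` after step 1 raises, mirroring the `NullPointerException: Invalid partition name: p=0` in the stack trace above.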
[jira] [Updated] (IMPALA-13100) Document support for ROLLUP, CUBE and GROUPING SETS
[ https://issues.apache.org/jira/browse/IMPALA-13100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Smith updated IMPALA-13100: --- Summary: Document support for ROLLUP, CUBE and GROUPING SETS (was: Document support for new syntax) > Document support for ROLLUP, CUBE and GROUPING SETS > --- > > Key: IMPALA-13100 > URL: https://issues.apache.org/jira/browse/IMPALA-13100 > Project: IMPALA > Issue Type: Sub-task >Reporter: Michael Smith >Priority: Major >
[jira] [Created] (IMPALA-13100) Document support for new syntax
Michael Smith created IMPALA-13100: -- Summary: Document support for new syntax Key: IMPALA-13100 URL: https://issues.apache.org/jira/browse/IMPALA-13100 Project: IMPALA Issue Type: Sub-task Reporter: Michael Smith
[jira] [Created] (IMPALA-13099) ImpalaOperator getAllowedSignatures needs to be implemented
Steve Carlin created IMPALA-13099: - Summary: ImpalaOperator getAllowedSignatures needs to be implemented Key: IMPALA-13099 URL: https://issues.apache.org/jira/browse/IMPALA-13099 Project: IMPALA Issue Type: Sub-task Reporter: Steve Carlin This method is used to show what syntax is allowed for a given function.
[jira] [Created] (IMPALA-13098) computeEvalCost needs better implementation in Calcite planner
Steve Carlin created IMPALA-13098: - Summary: computeEvalCost needs better implementation in Calcite planner Key: IMPALA-13098 URL: https://issues.apache.org/jira/browse/IMPALA-13098 Project: IMPALA Issue Type: Sub-task Reporter: Steve Carlin Right now, computeEvalCost is always returning UNKNOWN_COST. The costing needs to be calculated.
[jira] [Created] (IMPALA-13097) Better exception for Analyze*Expr class in Calcite Planner
Steve Carlin created IMPALA-13097: - Summary: Better exception for Analyze*Expr class in Calcite Planner Key: IMPALA-13097 URL: https://issues.apache.org/jira/browse/IMPALA-13097 Project: IMPALA Issue Type: Sub-task Reporter: Steve Carlin Some of the Analyzed*Expr classes throw a RuntimeException. There should be a cleaner exception thrown in these cases.
[jira] [Created] (IMPALA-13095) Handle UDFs in Calcite planner
Steve Carlin created IMPALA-13095: - Summary: Handle UDFs in Calcite planner Key: IMPALA-13095 URL: https://issues.apache.org/jira/browse/IMPALA-13095 Project: IMPALA Issue Type: Sub-task Reporter: Steve Carlin
[jira] [Commented] (IMPALA-13055) Some Iceberg metadata table tests doesn't assert
[ https://issues.apache.org/jira/browse/IMPALA-13055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847380#comment-17847380 ] ASF subversion and git services commented on IMPALA-13055: -- Commit 3a8eb999cbc746c055708425e071c30e3c00422e in impala's branch refs/heads/master from Gabor Kaszab [ https://gitbox.apache.org/repos/asf?p=impala.git;h=3a8eb999c ] IMPALA-13055: Some Iceberg metadata table tests don't assert Some tests in the Iceberg metadata table suite use the following regex to verify numbers in the output: [1-9]\d*|0 However, if this format is given, the test unconditionally passes. This patch changes this format to \d+ and fixes the test results that incorrectly passed before due to the test not asserting. Opened IMPALA-13067 to investigate why the test framework works like this for |0 in the regexes. Change-Id: Ie47093f25a70253b3e6faca27d466d7cf6999fad Reviewed-on: http://gerrit.cloudera.org:8080/21394 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Some Iceberg metadata table tests doesn't assert > > > Key: IMPALA-13055 > URL: https://issues.apache.org/jira/browse/IMPALA-13055 > Project: IMPALA > Issue Type: Test >Reporter: Gabor Kaszab >Priority: Major > Labels: impala-iceberg > > Some tests in the Iceberg metadata table suite use the following regex to > verify numbers in the output: [1-9]\d*|0 > However, if this format is given, the test unconditionally passes. One could > put the regex within parentheses, or simply verify with \d+
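The precedence trap behind `[1-9]\d*|0` is easy to demonstrate with Python's `re` module (shown here as a stand-in; the Impala test framework's exact matching code is not reproduced):

```python
import re

# Alternation '|' has the lowest precedence, so anchors applied textually to
# 'PREFIX|0' bind to only one branch: '^abc|0$' means '(^abc)|(0$)', not
# '^(abc|0)$'. Grouping, or a plain '\d+', avoids the ambiguity.
assert re.fullmatch(r"[1-9]\d*|0", "42") is not None      # either branch, whole string
assert re.fullmatch(r"([1-9]\d*|0)", "0") is not None     # grouped workaround
assert re.fullmatch(r"\d+", "007") is not None            # the fix applied by the patch

# The trap, made explicit with anchors:
assert re.match(r"^abc|0$", "abcdef") is not None    # '^abc' branch matches; '$' never applies
assert re.match(r"^(abc|0)$", "abcdef") is None      # grouped version correctly rejects
```

This is consistent with the workaround in the commit: either parenthesize the alternation or replace it with `\d+`.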
[jira] [Commented] (IMPALA-13067) Some regex make the tests unconditionally pass
[ https://issues.apache.org/jira/browse/IMPALA-13067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847381#comment-17847381 ] ASF subversion and git services commented on IMPALA-13067: -- Commit 3a8eb999cbc746c055708425e071c30e3c00422e in impala's branch refs/heads/master from Gabor Kaszab [ https://gitbox.apache.org/repos/asf?p=impala.git;h=3a8eb999c ] IMPALA-13055: Some Iceberg metadata table tests don't assert Some tests in the Iceberg metadata table suite use the following regex to verify numbers in the output: [1-9]\d*|0 However, if this format is given, the test unconditionally passes. This patch changes this format to \d+ and fixes the test results that incorrectly passed before due to the test not asserting. Opened IMPALA-13067 to investigate why the test framework works like this for |0 in the regexes. Change-Id: Ie47093f25a70253b3e6faca27d466d7cf6999fad Reviewed-on: http://gerrit.cloudera.org:8080/21394 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Some regex make the tests unconditionally pass > -- > > Key: IMPALA-13067 > URL: https://issues.apache.org/jira/browse/IMPALA-13067 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Reporter: Gabor Kaszab >Priority: Major > Labels: test-framework > > This issue came out in the Iceberg metadata table tests where this regex was > used: > [1-9]\d*|0 > > The "|0" part for some reason confused the test framework, so the tests > passed regardless of what was provided as an expected result. One workaround > was to put the regex expression between parentheses, or simply to use > "\d+". https://issues.apache.org/jira/browse/IMPALA-13055 applied this second > workaround on the tests. > Some analysis would be great on why the test framework behaves this way, > and if it's indeed an issue in the framework, we should fix it. 
[jira] [Resolved] (IMPALA-12990) impala-shell broken if Iceberg delete deletes 0 rows
[ https://issues.apache.org/jira/browse/IMPALA-12990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Csaba Ringhofer resolved IMPALA-12990. -- Fix Version/s: Impala 4.4.0 Resolution: Fixed > impala-shell broken if Iceberg delete deletes 0 rows > > > Key: IMPALA-12990 > URL: https://issues.apache.org/jira/browse/IMPALA-12990 > Project: IMPALA > Issue Type: Bug > Components: Clients >Reporter: Csaba Ringhofer >Assignee: Csaba Ringhofer >Priority: Major > Labels: iceberg > Fix For: Impala 4.4.0 > > > Happens only with Python 3 > {code} > impala-python3 shell/impala_shell.py > create table icebergupdatet (i int, s string) stored as iceberg; > alter table icebergupdatet set tblproperties("format-version"="2"); > delete from icebergupdatet where i=0; > Unknown Exception : '>' not supported between instances of 'NoneType' and > 'int' > Traceback (most recent call last): > File "shell/impala_shell.py", line 1428, in _execute_stmt > if is_dml and num_rows == 0 and num_deleted_rows > 0: > TypeError: '>' not supported between instances of 'NoneType' and 'int' > {code} > The same errors should also happen when the delete removes > 0 rows, but the > impala server has an older version that doesn't set TDmlResult.rows_deleted
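The traceback above is the classic Python 2 to 3 migration hazard: Python 2 silently ordered `None` against integers, while Python 3 raises `TypeError`. A minimal sketch of the failure and one possible guard (the helper name is hypothetical, not the impala-shell fix itself):

```python
# When an older server never sets TDmlResult.rows_deleted, the field arrives
# as None, and 'num_deleted_rows > 0' blows up on Python 3.
def is_delete_with_rows(num_rows, num_deleted_rows):
    # Treat a missing rows_deleted field as 0 instead of comparing None > 0.
    return num_rows == 0 and (num_deleted_rows or 0) > 0

# The original comparison fails on Python 3:
try:
    None > 0
    raise AssertionError("unreachable on Python 3")
except TypeError:
    pass

assert is_delete_with_rows(0, None) is False   # missing field: no crash
assert is_delete_with_rows(0, 5) is True       # normal DML path unchanged
```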
[jira] [Work started] (IMPALA-13091) query_test.test_iceberg.TestIcebergV2Table.test_metadata_tables fails on an expected constant
[ https://issues.apache.org/jira/browse/IMPALA-13091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-13091 started by Daniel Becker. -- > query_test.test_iceberg.TestIcebergV2Table.test_metadata_tables fails on an > expected constant > - > > Key: IMPALA-13091 > URL: https://issues.apache.org/jira/browse/IMPALA-13091 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 4.5.0 >Reporter: Laszlo Gaal >Assignee: Daniel Becker >Priority: Critical > Labels: impala-iceberg > > This fails in various sanitizer builds (ASAN, UBSAN): > Failure report:{code} > query_test/test_iceberg.py:1527: in test_metadata_tables > '$OVERWRITE_SNAPSHOT_TS': str(overwrite_snapshot_ts.data[0])}) > common/impala_test_suite.py:820: in run_test_case > self.__verify_results_and_errors(vector, test_section, result, use_db) > common/impala_test_suite.py:627: in __verify_results_and_errors > replace_filenames_with_placeholder) > common/test_result_verifier.py:520: in verify_raw_results > VERIFIER_MAP[verifier](expected, actual) > common/test_result_verifier.py:313: in verify_query_result_is_equal > assert expected_results == actual_results > E assert Comparing QueryTestResults (expected vs actual): > E > 
[jira] [Updated] (IMPALA-13091) query_test.test_iceberg.TestIcebergV2Table.test_metadata_tables fails on an expected constant
[ https://issues.apache.org/jira/browse/IMPALA-13091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Becker updated IMPALA-13091: --- Labels: impala-iceberg (was: ) > query_test.test_iceberg.TestIcebergV2Table.test_metadata_tables fails on an > expected constant > - > > Key: IMPALA-13091 > URL: https://issues.apache.org/jira/browse/IMPALA-13091 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 4.5.0 >Reporter: Laszlo Gaal >Assignee: Daniel Becker >Priority: Critical > Labels: impala-iceberg > > This fails in various sanitizer builds (ASAN, UBSAN): > Failure report:{code} > query_test/test_iceberg.py:1527: in test_metadata_tables > '$OVERWRITE_SNAPSHOT_TS': str(overwrite_snapshot_ts.data[0])}) > common/impala_test_suite.py:820: in run_test_case > self.__verify_results_and_errors(vector, test_section, result, use_db) > common/impala_test_suite.py:627: in __verify_results_and_errors > replace_filenames_with_placeholder) > common/test_result_verifier.py:520: in verify_raw_results > VERIFIER_MAP[verifier](expected, actual) > common/test_result_verifier.py:313: in verify_query_result_is_equal > assert expected_results == actual_results > E assert Comparing QueryTestResults (expected vs actual): > E > 
0,regex:'.*\.parquet','PARQUET',0,3,3648,'{1:32,2:63,3:71,4:43,5:55,6:47,7:39,8:58,9:47,13:63,14:96,15:75,16:78}','{1:3,2:3,3:3,4:3,5:3,6:3,7:3,8:3,9:3,13:3,14:6,15:6,16:6}','{1:1,2:0,3:0,4:0,5:0,6:1,7:1,8:1,9:1,13:0,14:0,15:0,16:0}','{16:0,4:1,5:1,14:0}','{1:"AA==",2:"AQ==",3:"9v8=",4:"/+ZbLw==",5:"MAWO5C7/O6s=",6:"AFgLImsYBgA=",7:"kU0AAA==",8:"QSBzdHJpbmc=",9:"YmluMQ==",13:"av///w==",14:"fcOUJa1JwtQ=",16:"Pw=="}','{1:"AQ==",2:"BQ==",3:"lgA=",4:"qV/jWA==",5:"fcOUJa1JwlQ=",6:"AMhZw6A3BgA=",7:"Hk8AAA==",8:"U29tZSBzdHJpbmc=",9:"YmluMg==",13:"Cg==",14:"NEA=",16:"AAB6RA=="}','NULL','[4]','NULL',0,'{"arr.element":{"column_size":96,"value_count":6,"null_value_count":0,"nan_value_count":0,"lower_bound":-2e+100,"upper_bound":20},"b":{"column_size":32,"value_count":3,"null_value_count":1,"nan_value_count":null,"lower_bound":false,"upper_bound":true},"bn":{"column_size":47,"value_count":3,"null_value_count":1,"nan_value_count":null,"lower_bound":"YmluMQ==","upper_bound":"YmluMg=="},"d":{"column_size":55,"value_count":3,"null_value_count":0,"nan_value_count":1,"lower_bound":-2e-100,"upper_bound":2e+100},"dt":{"column_size":39,"value_count":3,"null_value_count":1,"nan_value_count":null,"lower_bound":"2024-05-14","upper_bound":"2025-06-15"},"f":{"column_size":43,"value_count":3,"null_value_count":0,"nan_value_count":1,"lower_bound":2.00026702864e-10,"upper_bound":199973982208},"i":{"column_size":63,"value_count":3,"null_value_count":0,"nan_value_count":null,"lower_bound":1,"upper_bound":5},"l":{"column_size":71,"value_count":3,"null_value_count":0,"nan_value_count":null,"lower_bound":-10,"upper_bound":150},"mp.key":{"column_size":75,"value_count":6,"null_value_count":0,"nan_value_count":null,"lower_bound":null,"upper_bound":null},"mp.value":{"column_size":78,"value_count":6,"null_value_count":0,"nan_value_count":0,"lower_bound":0.5,"upper_bound":1000},"s":{"column_size":58,"value_count":3,"null_value_count":1,"nan_value_count":null,"lower_bound":"A string","upper_bound":"Some string"},"strct.i":{"column_size":63,"value_count":3,"null_value_count":0,"nan_value_count":null,"lower_bound":-150,"upper_bound":10},"ts":{"column_size":47,"value_count":3,"null_value_count":1,"nan_value_count":null,"lower_bound":"2024-05-14 14:51:12","upper_bound":"2025-06-15 18:51:12"}}' != >
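The failure above comes from the framework's row-by-row comparison, where volatile cells (like the Parquet file path) use a regex: placeholder but the snapshot timestamp was asserted as a literal constant. A minimal sketch of that matching logic, assuming a simplified cell_matches helper (hypothetical name; the real logic lives in common/test_result_verifier.py):

```python
import re

def cell_matches(expected, actual):
    # A cell prefixed with "regex:" is treated as a regular expression
    # rather than a literal, so values that vary between runs (file
    # paths, snapshot ids) can still be verified.
    prefix = "regex:"
    if expected.startswith(prefix):
        return re.fullmatch(expected[len(prefix):], actual) is not None
    # Everything else must match exactly; a hard-coded timestamp here
    # fails whenever the data is regenerated, as in this bug.
    return expected == actual

# The file path uses a placeholder, so it matches across runs:
print(cell_matches("regex:'.*\\.parquet'", "'/warehouse/t/data/abc.parquet'"))  # True
# A literal timestamp does not:
print(cell_matches("'2024-05-14 14:51:12'", "'2025-06-15 18:51:12'"))           # False
```

The fix direction implied by the report is to substitute the volatile constant at test time (as done for $OVERWRITE_SNAPSHOT_TS) or switch the cell to a regex: placeholder.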
[jira] [Created] (IMPALA-13094) Query links in /admission page of admissiond doesn't work
Quanlong Huang created IMPALA-13094: --- Summary: Query links in /admission page of admissiond doesn't work Key: IMPALA-13094 URL: https://issues.apache.org/jira/browse/IMPALA-13094 Project: IMPALA Issue Type: Bug Components: Backend Reporter: Quanlong Huang Attachments: Selection_115.png, Selection_116.png In the /admission page, there are records for queued queries and running queries. The detail links for these queries use the hostname of the admissiond. Instead, they should point to the corresponding coordinators. Clicking a link jumps to the /query_plan endpoint of the admissiond, which doesn't exist, so the request fails with Error: No URI handler for '/query_plan'. The screenshots are attached for reference. CC [~arawat] -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
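The report implies the fix is to build each query's details link from the query's coordinator address instead of the admissiond's own hostname. A minimal sketch, assuming a hypothetical query_detail_link helper (the function name is an illustration, not Impala's actual code; 25000 is Impala's default coordinator webserver port):

```python
def query_detail_link(coordinator_host, webserver_port, query_id):
    # The bug: links were rendered relative to the admissiond, whose
    # webserver has no /query_plan handler. Pointing them at the
    # query's coordinator, which does serve /query_plan, fixes this.
    return "http://%s:%d/query_plan?query_id=%s" % (
        coordinator_host, webserver_port, query_id)

# A queued query admitted for coordinator impala-coord-1 should link
# there, not to the admissiond's hostname:
print(query_detail_link("impala-coord-1.example.com", 25000, "aa11bb22:0011"))
# -> http://impala-coord-1.example.com:25000/query_plan?query_id=aa11bb22:0011
```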
[jira] [Updated] (IMPALA-13094) Query links in /admission page of admissiond doesn't work
[ https://issues.apache.org/jira/browse/IMPALA-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Quanlong Huang updated IMPALA-13094: Attachment: Selection_116.png > Query links in /admission page of admissiond doesn't work > - > > Key: IMPALA-13094 > URL: https://issues.apache.org/jira/browse/IMPALA-13094 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Quanlong Huang >Priority: Critical > Attachments: Selection_115.png, Selection_116.png > > > In the /admission page, there are records for queued queries and running > queries. The detail links for these queries use the hostname of the > admissiond. Instead, they should point to the corresponding coordinators. > Clicking a link jumps to the /query_plan endpoint of the admissiond, > which doesn't exist, so the request fails with Error: No URI handler for '/query_plan'. > The screenshots are attached for reference. > CC [~arawat]
[jira] [Updated] (IMPALA-13094) Query links in /admission page of admissiond doesn't work
[ https://issues.apache.org/jira/browse/IMPALA-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Quanlong Huang updated IMPALA-13094: Attachment: Selection_115.png > Query links in /admission page of admissiond doesn't work > - > > Key: IMPALA-13094 > URL: https://issues.apache.org/jira/browse/IMPALA-13094 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Quanlong Huang >Priority: Critical > Attachments: Selection_115.png, Selection_116.png > > > In the /admission page, there are records for queued queries and running > queries. The detail links for these queries use the hostname of the > admissiond. Instead, they should point to the corresponding coordinators. > Clicking a link jumps to the /query_plan endpoint of the admissiond, > which doesn't exist, so the request fails with Error: No URI handler for '/query_plan'. > The screenshots are attached for reference. > CC [~arawat]
[jira] [Commented] (IMPALA-12266) Sporadic failure after migrating a table to Iceberg
[ https://issues.apache.org/jira/browse/IMPALA-12266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847169#comment-17847169 ] Gabor Kaszab commented on IMPALA-12266: --- [~laszlog] I see you increased the priority of this. Note that there is another Jira for the root cause: https://issues.apache.org/jira/browse/IMPALA-12712. If that's fixed, this one would be gone too. > Sporadic failure after migrating a table to Iceberg > --- > > Key: IMPALA-12266 > URL: https://issues.apache.org/jira/browse/IMPALA-12266 > Project: IMPALA > Issue Type: Bug > Components: fe >Affects Versions: Impala 4.2.0 >Reporter: Tamas Mate >Assignee: Gabor Kaszab >Priority: Critical > Labels: impala-iceberg > Attachments: > catalogd.bd40020df22b.invalid-user.log.INFO.20230704-181939.1, > impalad.6c0f48d9ce66.invalid-user.log.INFO.20230704-181940.1 > > > TestIcebergTable.test_convert_table test failed in a recent verify job's > dockerised tests: > https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/7629 > {code:none} > E ImpalaBeeswaxException: ImpalaBeeswaxException: > EINNER EXCEPTION: > EMESSAGE: AnalysisException: Failed to load metadata for table: > 'parquet_nopartitioned' > E CAUSED BY: TableLoadingException: Could not load table > test_convert_table_cdba7383.parquet_nopartitioned from catalog > E CAUSED BY: TException: > TGetPartialCatalogObjectResponse(status:TStatus(status_code:GENERAL, > error_msgs:[NullPointerException: null]), lookup_status:OK) > {code} > {code:none} > E0704 19:09:22.980131 833 JniUtil.java:183] > 7145c21173f2c47b:2579db55] Error in Getting partial catalog object of > TABLE:test_convert_table_cdba7383.parquet_nopartitioned. 
Time spent: 49ms > I0704 19:09:22.980309 833 jni-util.cc:288] > 7145c21173f2c47b:2579db55] java.lang.NullPointerException > at > org.apache.impala.catalog.CatalogServiceCatalog.replaceTableIfUnchanged(CatalogServiceCatalog.java:2357) > at > org.apache.impala.catalog.CatalogServiceCatalog.getOrLoadTable(CatalogServiceCatalog.java:2300) > at > org.apache.impala.catalog.CatalogServiceCatalog.doGetPartialCatalogObject(CatalogServiceCatalog.java:3587) > at > org.apache.impala.catalog.CatalogServiceCatalog.getPartialCatalogObject(CatalogServiceCatalog.java:3513) > at > org.apache.impala.catalog.CatalogServiceCatalog.getPartialCatalogObject(CatalogServiceCatalog.java:3480) > at > org.apache.impala.service.JniCatalog.lambda$getPartialCatalogObject$11(JniCatalog.java:397) > at > org.apache.impala.service.JniCatalogOp.lambda$execAndSerialize$1(JniCatalogOp.java:90) > at org.apache.impala.service.JniCatalogOp.execOp(JniCatalogOp.java:58) > at > org.apache.impala.service.JniCatalogOp.execAndSerialize(JniCatalogOp.java:89) > at > org.apache.impala.service.JniCatalogOp.execAndSerializeSilentStartAndFinish(JniCatalogOp.java:109) > at > org.apache.impala.service.JniCatalog.execAndSerializeSilentStartAndFinish(JniCatalog.java:238) > at > org.apache.impala.service.JniCatalog.getPartialCatalogObject(JniCatalog.java:396) > I0704 19:09:22.980324 833 status.cc:129] 7145c21173f2c47b:2579db55] > NullPointerException: null > @ 0x1012f9f impala::Status::Status() > @ 0x187f964 impala::JniUtil::GetJniExceptionMsg() > @ 0xfee920 impala::JniCall::Call<>() > @ 0xfccd0f impala::Catalog::GetPartialCatalogObject() > @ 0xfb55a5 > impala::CatalogServiceThriftIf::GetPartialCatalogObject() > @ 0xf7a691 > impala::CatalogServiceProcessorT<>::process_GetPartialCatalogObject() > @ 0xf82151 impala::CatalogServiceProcessorT<>::dispatchCall() > @ 0xee330f apache::thrift::TDispatchProcessor::process() > @ 0x1329246 > apache::thrift::server::TAcceptQueueServer::Task::run() > @ 0x1315a89 
impala::ThriftThread::RunRunnable() > @ 0x131773d > boost::detail::function::void_function_obj_invoker0<>::invoke() > @ 0x195ba8c impala::Thread::SuperviseThread() > @ 0x195c895 boost::detail::thread_data<>::run() > @ 0x23a03a7 thread_proxy > @ 0x7faaad2a66ba start_thread > @ 0x7f2c151d clone > E0704 19:09:23.006968 833 catalog-server.cc:278] > 7145c21173f2c47b:2579db55] NullPointerException: null > {code}