[jira] [Commented] (IMPALA-8271) Refactor the use of Thrift enums in query-options.cc
[ https://issues.apache.org/jira/browse/IMPALA-8271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784103#comment-16784103 ]

Fredy Wijaya commented on IMPALA-8271:
--------------------------------------
[~arodoni_cloudera] nope.

> Refactor the use of Thrift enums in query-options.cc
> Key: IMPALA-8271
> URL: https://issues.apache.org/jira/browse/IMPALA-8271
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Reporter: Fredy Wijaya
> Priority: Minor
> Labels: ramp-up
>
> Currently the logic for handling Thrift enums in query-options.cc is very
> error-prone: any change to a Thrift enum requires updating query-options.cc. For
> example:
> https://github.com/apache/impala/blob/master/be/src/service/query-options.cc#L276-L288
> This CR, https://gerrit.cloudera.org/c/12635/, is an attempt to fix this
> issue for compression_codec.
> This ticket aims to update the use of Thrift enums in a style similar to
> https://gerrit.cloudera.org/c/12635/ to make them less error-prone.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
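The refactoring the ticket describes can be sketched generically: Thrift's C++ generator emits a `<Enum>_VALUES_TO_NAMES` map for every enum, so a single templated parser over that map can replace per-enum if/else chains, and a newly added enum value then needs no handler change. The following is an illustrative sketch, not Impala's actual helper; the map type matches what the Thrift 0.9 C++ generator produces.

```cpp
#include <map>
#include <string>

// Generic option-to-enum parser driven by the Thrift-generated
// <Enum>_VALUES_TO_NAMES map (std::map<int, const char*>).
// Returns true and sets *result if 'value' names a valid enum member.
template <typename ENUM>
bool ParseEnumOption(const std::map<int, const char*>& values_to_names,
                     const std::string& value, ENUM* result) {
  for (const auto& entry : values_to_names) {
    if (value == entry.second) {
      *result = static_cast<ENUM>(entry.first);
      return true;
    }
  }
  // Unknown name: the caller can report an error that lists the map's
  // names, so the error message also stays in sync automatically.
  return false;
}
```

In query-options.cc this would let each enum-valued option be handled by one call site instead of a hand-maintained chain (real Impala option parsing is case-insensitive, which this sketch omits for brevity).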
[jira] [Comment Edited] (IMPALA-6326) segfault during impyla HiveServer2Cursor.cancel_operation() over SSL
[ https://issues.apache.org/jira/browse/IMPALA-6326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784082#comment-16784082 ] Tim Armstrong edited comment on IMPALA-6326 at 3/5/19 5:06 AM: --- I have a strong suspicion that the root cause of at least some of the issues is the way run_query forks off a thread in _hash_result(), because that could end up with two threads accessing the same underlying thrift connection. I might try to inject some failures there to see if the symptoms reproduce more frequently. was (Author: tarmstrong): I have a strong suspicious that the root cause of at least some of the issues is the way run_query forks off a thread in _hash_result(), because that could end up with two threads accessing the same underlying thrift connection. I might try to inject some failures there to see if the symptoms reproduce more frequently. > segfault during impyla HiveServer2Cursor.cancel_operation() over SSL > > > Key: IMPALA-6326 > URL: https://issues.apache.org/jira/browse/IMPALA-6326 > Project: IMPALA > Issue Type: Bug > Components: Clients >Affects Versions: Impala 2.10.0, Impala 2.11.0 >Reporter: Matthew Mulder >Priority: Major > Attachments: test_fork_crash.py > > > During a stress test on a secure cluster one of the clients crashed in > HiveServer2Cursor.cancel_operation(). 
> The stress test debug log shows{code}2017-12-13 16:50:52,624 21607 Query > Consumer DEBUG:concurrent_select[579]:Requesting memory reservation > 2017-12-13 16:50:52,624 21607 Query Consumer > DEBUG:concurrent_select[245]:Reserved 102 MB; 1455 MB available; 95180 MB > overcommitted > 2017-12-13 16:50:52,625 21607 Query Consumer > DEBUG:concurrent_select[581]:Received memory reservation > 2017-12-13 16:50:52,658 21607 Query Consumer > DEBUG:concurrent_select[865]:Using tpcds_300_decimal_parquet database > 2017-12-13 16:50:52,658 21607 Query Consumer DEBUG:db_connection[203]:IMPALA: > USE tpcds_300_decimal_parquet > 2017-12-13 16:50:52,825 21607 Query Consumer DEBUG:db_connection[203]:IMPALA: > SET ABORT_ON_ERROR=1 > 2017-12-13 16:50:53,060 21607 Query Consumer > DEBUG:concurrent_select[877]:Setting mem limit to 102 MB > 2017-12-13 16:50:53,060 21607 Query Consumer DEBUG:db_connection[203]:IMPALA: > SET MEM_LIMIT=102M > 2017-12-13 16:50:53,370 21607 Query Consumer > DEBUG:concurrent_select[881]:Running query with 102 MB mem limit at > vc0704.test with timeout secs 52: > select > dt.d_year, > item.i_category_id, > item.i_category, > sum(ss_ext_sales_price) > from > date_dim dt, > store_sales, > item > where > dt.d_date_sk = store_sales.ss_sold_date_sk > and store_sales.ss_item_sk = item.i_item_sk > and item.i_manager_id = 1 > and dt.d_moy = 11 > and dt.d_year = 2000 > group by > dt.d_year, > item.i_category_id, > item.i_category > order by > sum(ss_ext_sales_price) desc, > dt.d_year, > item.i_category_id, > item.i_category > limit 100; > 2017-12-13 16:51:08,491 21607 Query Consumer > DEBUG:concurrent_select[889]:Query id is b6425b84aa45f633:9ce7cad9 > 2017-12-13 16:51:15,337 21607 Query Consumer > DEBUG:concurrent_select[900]:Waiting for query to execute > 2017-12-13 16:51:22,316 21607 Query Consumer > DEBUG:concurrent_select[900]:Waiting for query to execute > 2017-12-13 16:51:27,266 21607 Fetch Results b6425b84aa45f633:9ce7cad9 > 
DEBUG:concurrent_select[1009]:Fetching result for query with id > b6425b84aa45f633:9ce7cad9 > 2017-12-13 16:51:44,625 21607 Query Consumer > DEBUG:concurrent_select[940]:Attempting cancellation of query with id > b6425b84aa45f633:9ce7cad9 > 2017-12-13 16:51:44,627 21607 Query Consumer INFO:hiveserver2[259]:Canceling > active operation{code}The impalad log shows{code}I1213 16:50:54.287511 136399 > admission-controller.cc:510] Schedule for > id=b6425b84aa45f633:9ce7cad9 in pool_name=root.systest > cluster_mem_needed=816.00 MB PoolConfig: max_requests=-1 max_queued=200 > max_mem=-1.00 B > I1213 16:50:54.289767 136399 admission-controller.cc:515] Stats: > agg_num_running=184, agg_num_queued=0, agg_mem_reserved=1529.63 GB, > local_host(local_mem_admitted=132.02 GB, num_admitted_running=21, > num_queued=0, backend_mem_reserved=194.58 GB) > I1213 16:50:54.291550 136399 admission-controller.cc:531] Admitted query > id=b6425b84aa45f633:9ce7cad9 > I1213 16:50:54.296922 136399 coordinator.cc:99] Exec() > query_id=b6425b84aa45f633:9ce7cad9 stmt=/* Mem: 102 MB. Coordinator: > vc0704.test. */ > select > dt.d_year, > item.i_category_id, > item.i_category, > sum(ss_ext_sales_price) > from > date_dim dt, > store_sales, > item > where > dt.d_date_sk = store_sales.ss_sold_date_sk > and
[jira] [Assigned] (IMPALA-6326) segfault during impyla HiveServer2Cursor.cancel_operation() over SSL
[ https://issues.apache.org/jira/browse/IMPALA-6326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong reassigned IMPALA-6326:
-------------------------------------
Assignee: Tim Armstrong

> segfault during impyla HiveServer2Cursor.cancel_operation() over SSL
> Key: IMPALA-6326
> URL: https://issues.apache.org/jira/browse/IMPALA-6326
> Project: IMPALA
> Issue Type: Bug
> Components: Clients
> Affects Versions: Impala 2.10.0, Impala 2.11.0
> Reporter: Matthew Mulder
> Assignee: Tim Armstrong
> Priority: Major
> Attachments: test_fork_crash.py
[jira] [Commented] (IMPALA-6326) segfault during impyla HiveServer2Cursor.cancel_operation() over SSL
[ https://issues.apache.org/jira/browse/IMPALA-6326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784082#comment-16784082 ]

Tim Armstrong commented on IMPALA-6326:
---------------------------------------
I have a strong suspicion that the root cause of at least some of the issues is the way run_query forks off a thread in _hash_result(), because that could end up with two threads accessing the same underlying thrift connection. I might try to inject some failures there to see if the symptoms reproduce more frequently.

> segfault during impyla HiveServer2Cursor.cancel_operation() over SSL
> Key: IMPALA-6326
> URL: https://issues.apache.org/jira/browse/IMPALA-6326
> Project: IMPALA
> Issue Type: Bug
> Components: Clients
> Affects Versions: Impala 2.10.0, Impala 2.11.0
> Reporter: Matthew Mulder
> Priority: Major
> Attachments: test_fork_crash.py
[jira] [Commented] (IMPALA-8256) ImpalaServicePool::RejectTooBusy() should print more meaningful message
[ https://issues.apache.org/jira/browse/IMPALA-8256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784051#comment-16784051 ]

ASF subversion and git services commented on IMPALA-8256:
---------------------------------------------------------
Commit 63d45d59bae3fb37571088c0a2418a9df7630c51 in impala's branch refs/heads/master from Michael Ho
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=63d45d5 ]

IMPALA-8256: Better error message for ImpalaServicePool::RejectTooBusy()

An incoming request to an RPC service can be rejected due to either exceeding the memory limit or the maximum allowed queue length. It's unclear from the current error message which of those factors contributed to the failure, as neither the actual queue length nor the memory consumption is printed. This patch fixes the problem by printing the estimated queue length and memory consumption when an RPC request is dropped.

Testing done: verified the new error message with test_rpc_timeout.py

Change-Id: If0297658acf2b23823dcb7d2bdff5d8e4475bb98
Reviewed-on: http://gerrit.cloudera.org:8080/12624
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins

> ImpalaServicePool::RejectTooBusy() should print more meaningful message
> Key: IMPALA-8256
> URL: https://issues.apache.org/jira/browse/IMPALA-8256
> Project: IMPALA
> Issue Type: Bug
> Components: Distributed Exec
> Affects Versions: Impala 2.12.0, Impala 3.1.0, Impala 3.2.0
> Reporter: Michael Ho
> Assignee: Michael Ho
> Priority: Major
>
> An RPC request to a service can be rejected either due to exceeding the memory limit or the maximum allowed queue length. It's unclear from the current error message which of those factors contributes to the failure, as neither the actual queue length nor the memory consumption is printed.
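The fix described above amounts to including both observed quantities in the rejection message so the cause is unambiguous. A minimal sketch of that idea, with illustrative names and thresholds (not Impala's actual ImpalaServicePool code):

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Build a rejection message that states *which* limit was hit and the
// observed value, mirroring the intent of the IMPALA-8256 fix.
// All names and limits here are hypothetical.
std::string RejectReason(int queue_len, int max_queue_len,
                         int64_t mem_consumed, int64_t mem_limit) {
  std::ostringstream msg;
  if (queue_len >= max_queue_len) {
    msg << "dropped request: service queue full (length " << queue_len
        << " >= limit " << max_queue_len << ")";
  } else if (mem_consumed >= mem_limit) {
    msg << "dropped request: service memory limit exceeded ("
        << mem_consumed << " bytes >= limit " << mem_limit << " bytes)";
  }
  return msg.str();  // empty string: request not rejected
}
```

Before the fix, an operator seeing only "service too busy" could not tell whether to raise the queue length or the memory limit; printing both metrics makes the tuning decision direct.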
[jira] [Commented] (IMPALA-6326) segfault during impyla HiveServer2Cursor.cancel_operation() over SSL
[ https://issues.apache.org/jira/browse/IMPALA-6326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784046#comment-16784046 ] Tim Armstrong commented on IMPALA-6326: --- {noformat} 18:01:56 2019-03-04 18:01:56,425 14376 Fetch Results 164d2c564b750b6c:2cd8d9e2 ERROR:hiveserver2[943]:Failed to open transport (tries_left=3) 18:01:56 Traceback (most recent call last): 18:01:56 File "/data0/jenkins/workspace/impala-cdh6.x-test-stress-secure-manual/Impala/infra/python/env/local/lib/python2.7/site-packages/impala/hiveserver2.py", line 940, in _execute 18:01:56 return func(request) 18:01:56 File "/data0/jenkins/workspace/impala-cdh6.x-test-stress-secure-manual/Impala/infra/python/env/local/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py", line 505, in GetOperationStatus 18:01:56 return self.recv_GetOperationStatus() 18:01:56 File "/data0/jenkins/workspace/impala-cdh6.x-test-stress-secure-manual/Impala/infra/python/env/local/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py", line 516, in recv_GetOperationStatus 18:01:56 (fname, mtype, rseqid) = self._iprot.readMessageBegin() 18:01:56 File "/data0/jenkins/workspace/impala-cdh6.x-test-stress-secure-manual/Impala/toolchain/thrift-0.9.3-p5/python/lib/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", line 126, in readMessageBegin 18:01:56 sz = self.readI32() 18:01:56 File "/data0/jenkins/workspace/impala-cdh6.x-test-stress-secure-manual/Impala/toolchain/thrift-0.9.3-p5/python/lib/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", line 206, in readI32 18:01:56 buff = self.trans.readAll(4) 18:01:56 File "/data0/jenkins/workspace/impala-cdh6.x-test-stress-secure-manual/Impala/toolchain/thrift-0.9.3-p5/python/lib/python2.7/site-packages/thrift/transport/TTransport.py", line 58, in readAll 18:01:56 chunk = self.read(sz - have) 18:01:56 File 
"/data0/jenkins/workspace/impala-cdh6.x-test-stress-secure-manual/Impala/infra/python/env/local/lib/python2.7/site-packages/thrift_sasl/__init__.py", line 159, in read 18:01:56 self._read_frame() 18:01:56 File "/data0/jenkins/workspace/impala-cdh6.x-test-stress-secure-manual/Impala/infra/python/env/local/lib/python2.7/site-packages/thrift_sasl/__init__.py", line 163, in _read_frame 18:01:56 header = read_all_compat(self._trans, 4) 18:01:56 File "/data0/jenkins/workspace/impala-cdh6.x-test-stress-secure-manual/Impala/infra/python/env/local/lib/python2.7/site-packages/thrift_sasl/six.py", line 31, in 18:01:56 read_all_compat = lambda trans, sz: trans.readAll(sz) 18:01:56 File "/data0/jenkins/workspace/impala-cdh6.x-test-stress-secure-manual/Impala/toolchain/thrift-0.9.3-p5/python/lib/python2.7/site-packages/thrift/transport/TTransport.py", line 58, in readAll 18:01:56 chunk = self.read(sz - have) 18:01:56 File "/data0/jenkins/workspace/impala-cdh6.x-test-stress-secure-manual/Impala/toolchain/thrift-0.9.3-p5/python/lib/python2.7/site-packages/thrift/transport/TSocket.py", line 105, in read 18:01:56 buff = self.handle.recv(sz) 18:01:56 File "/usr/lib/python2.7/ssl.py", line 341, in recv 18:01:56 return self.read(buflen) 18:01:56 File "/usr/lib/python2.7/ssl.py", line 260, in read 18:01:56 return self._sslobj.read(len) 18:01:56 SSLError: [Errno 1] _ssl.c:1429: error:1408F081:SSL routines:SSL3_GET_RECORD:block cipher pad is wrong 18:01:56 Process Process-36: 18:01:56 Traceback (most recent call last): 18:01:56 File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap 18:01:56 self.run() 18:01:56 File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run 18:01:56 self._target(*self._args, **self._kwargs) 18:01:56 File "tests/stress/concurrent_select.py", line 841, in _start_single_runner 18:01:56 mesg=error_msg)) 18:01:56 Exception: Query tpcds_300_decimal_parquet_q51a ID None failed: Bad version in readMessageBegin: -614891738 18:01:56 
Query runner (14376) exited with exit code 1 {noformat} {noformat} 18:00:15 2019-03-04 18:00:15,073 14414 Fetch Results e743fed952c6e11a:6c88df9c ERROR:hiveserver2[943]:Failed to open transport (tries_left=3) 18:00:15 Traceback (most recent call last): 18:00:15 File "/data0/jenkins/workspace/impala-cdh6.x-test-stress-secure-manual/Impala/infra/python/env/local/lib/python2.7/site-packages/impala/hiveserver2.py", line 940, in _execute 18:00:15 return func(request) 18:00:15 File "/data0/jenkins/workspace/impala-cdh6.x-test-stress-secure-manual/Impala/infra/python/env/local/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py", line 505, in GetOperationStatus 18:00:15 return self.recv_GetOperationStatus() 18:00:15 File
[jira] [Created] (IMPALA-8281) Implement SHOW GRANT GROUP
Fredy Wijaya created IMPALA-8281:
--------------------------------
Summary: Implement SHOW GRANT GROUP
Key: IMPALA-8281
URL: https://issues.apache.org/jira/browse/IMPALA-8281
Project: IMPALA
Issue Type: Sub-task
Components: Catalog, Frontend
Reporter: Fredy Wijaya

Syntax:
{noformat}
SHOW GRANT GROUP <group_name> [ON <object>]
{noformat}
The command shows the list of privileges for a given group, with an optional ON clause.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-8280) Implement SHOW GRANT USER
Fredy Wijaya created IMPALA-8280:
--------------------------------
Summary: Implement SHOW GRANT USER
Key: IMPALA-8280
URL: https://issues.apache.org/jira/browse/IMPALA-8280
Project: IMPALA
Issue Type: Sub-task
Components: Catalog, Frontend
Reporter: Fredy Wijaya

Syntax:
{noformat}
SHOW GRANT USER <user_name> [ON <object>]
{noformat}
The command shows the list of privileges for a given user, with an optional ON clause.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IMPALA-8278) Fix MetastoreEventsProcessorTest flakiness
[ https://issues.apache.org/jira/browse/IMPALA-8278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vihang Karajgaonkar updated IMPALA-8278:
----------------------------------------
Summary: Fix MetastoreEventsProcessorTest flakiness (was: Fix testEventProcessorFetchAfterHMSRestart)

> Fix MetastoreEventsProcessorTest flakiness
> Key: IMPALA-8278
> URL: https://issues.apache.org/jira/browse/IMPALA-8278
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: Vihang Karajgaonkar
> Assignee: Vihang Karajgaonkar
> Priority: Major
>
> The {{testEventProcessorFetchAfterHMSRestart}} test case in {{MetastoreEventsProcessorTest}} causes flakiness because it creates a new event processor pointing to the same catalog instance. This means that all the generated events are processed by two event processor instances, which both try to modify the state of catalogd, causing race conditions. The failures vary and depend heavily on timing. I see the following exception, which is related to this issue.
> The easiest way to confirm this is the problem is to look in the FeSupport logs of the test for an event ID that is processed twice (i.e. two identical log entries for a given event ID).
[jira] [Resolved] (IMPALA-8274) Missing update to index into profiles vector in Coordinator::BackendState::ApplyExecStatusReport()
[ https://issues.apache.org/jira/browse/IMPALA-8274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Ho resolved IMPALA-8274.
--------------------------------
Resolution: Fixed
Fix Version/s: Impala 3.2.0

> Missing update to index into profiles vector in Coordinator::BackendState::ApplyExecStatusReport()
> Key: IMPALA-8274
> URL: https://issues.apache.org/jira/browse/IMPALA-8274
> Project: IMPALA
> Issue Type: Bug
> Components: Distributed Exec
> Reporter: Michael Ho
> Assignee: Michael Ho
> Priority: Blocker
> Labels: crash
> Fix For: Impala 3.2.0
>
> {{idx}} isn't updated when we skip a duplicate or stale update of a fragment instance. As a result, we may end up passing the wrong profile to {{instance_stats->Update()}}. This may lead to random crashes in {{Coordinator::BackendState::InstanceStats::Update}}.
> {noformat}
> int idx = 0;
> const bool has_profile = thrift_profiles.profile_trees.size() > 0;
> TRuntimeProfileTree empty_profile;
> for (const FragmentInstanceExecStatusPB& instance_exec_status :
>     backend_exec_status.instance_exec_status()) {
>   int64_t report_seq_no = instance_exec_status.report_seq_no();
>   int instance_idx = GetInstanceIdx(instance_exec_status.fragment_instance_id());
>   DCHECK_EQ(instance_stats_map_.count(instance_idx), 1);
>   InstanceStats* instance_stats = instance_stats_map_[instance_idx];
>   int64_t last_report_seq_no = instance_stats->last_report_seq_no_;
>   DCHECK(instance_stats->exec_params_.instance_id ==
>       ProtoToQueryId(instance_exec_status.fragment_instance_id()));
>   // Ignore duplicate or out-of-order messages.
>   if (report_seq_no <= last_report_seq_no) {
>     VLOG_QUERY << Substitute("Ignoring stale update for query instance $0 with "
>         "seq no $1", PrintId(instance_stats->exec_params_.instance_id),
>         report_seq_no);
>     continue;  // <<--- XXX bad: idx is not advanced
>   }
>   DCHECK(!instance_stats->done_);
>   DCHECK(!has_profile || idx < thrift_profiles.profile_trees.size());
>   const TRuntimeProfileTree& profile =
>       has_profile ? thrift_profiles.profile_trees[idx++] : empty_profile;
>   instance_stats->Update(instance_exec_status, profile, exec_summary,
>       scan_range_progress);
> {noformat}
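The failure mode above is a classic parallel-sequence bug: profiles arrive in the same order as the status updates, so skipping an update without also consuming its profile slot shifts every later profile onto the wrong instance. A simplified, self-contained model of the loop and the fix (hypothetical types; not Impala's actual Coordinator code):

```cpp
#include <string>
#include <vector>

// One status update per fragment instance; a profile may accompany it.
// Profiles are delivered in the same order as the updates.
struct Update {
  int seq_no;
  bool has_profile;
};

// Pair each non-stale update with its profile. The original bug: `continue`
// on a stale update without advancing idx, so later updates consume the
// wrong (earlier) profiles. The fix is to consume the stale profile slot
// before skipping.
std::vector<std::string> PairProfiles(const std::vector<Update>& updates,
                                      const std::vector<std::string>& profiles,
                                      int last_seen_seq_no) {
  std::vector<std::string> applied;
  size_t idx = 0;
  for (const Update& u : updates) {
    if (u.seq_no <= last_seen_seq_no) {
      if (u.has_profile) ++idx;  // consume the stale update's profile too
      continue;
    }
    applied.push_back(u.has_profile ? profiles[idx++] : "");
  }
  return applied;
}
```

With the fix, skipping the stale update for seq_no 1 still consumes profile "a", so seq_no 2 correctly receives "b" rather than "a"; without the `++idx` in the skip branch, every subsequent instance would be updated with its predecessor's profile, matching the random crashes described in the ticket.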
[jira] [Updated] (IMPALA-8248) Re-organize authorization tests
[ https://issues.apache.org/jira/browse/IMPALA-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] radford nguyen updated IMPALA-8248: --- Description: We have authorization tests that are specific to Sentry and authorization tests that can be applicable to any authorization provider. We need to re-organize the authorization tests to easily differentiate between Sentry-specific tests vs generic authorization tests. h3. Approach # Move `AuthorizationTest.java` and `AuthorizationStmtTest.java` to `org.apache.impala.authorization` # Rename `CustomClusterGroupMapper` and `CustomClusterResourceAuthorizationProvider` to `TestSentryGroupMapper` and `TestSentryAuthorizationProvider` since those two class aren't specific to custom cluster anymore. # Move those two files into `org.apache.impala.testutil` instead since they're not actually test classes. Note: all classes to remain in `test` sourceset was:We have authorization tests that are specific to Sentry and authorization tests that can be applicable to any authorization provider. We need to re-organize the authorization tests to easily differentiate between Sentry-specific tests vs generic authorization tests. > Re-organize authorization tests > --- > > Key: IMPALA-8248 > URL: https://issues.apache.org/jira/browse/IMPALA-8248 > Project: IMPALA > Issue Type: Sub-task > Components: Infrastructure >Reporter: Fredy Wijaya >Assignee: radford nguyen >Priority: Major > > We have authorization tests that are specific to Sentry and authorization > tests that can be applicable to any authorization provider. We need to > re-organize the authorization tests to easily differentiate between > Sentry-specific tests vs generic authorization tests. > > h3. 
Approach > # Move `AuthorizationTest.java` and `AuthorizationStmtTest.java` to > `org.apache.impala.authorization` > # Rename `CustomClusterGroupMapper` and > `CustomClusterResourceAuthorizationProvider` to `TestSentryGroupMapper` and > `TestSentryAuthorizationProvider` since those two classes aren't specific to > custom cluster anymore. > # Move those two files into `org.apache.impala.testutil` instead since > they're not actually test classes. > Note: all classes to remain in `test` sourceset -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8248) Re-organize authorization tests
[ https://issues.apache.org/jira/browse/IMPALA-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] radford nguyen updated IMPALA-8248: --- Description: We have authorization tests that are specific to Sentry and authorization tests that can be applicable to any authorization provider. We need to re-organize the authorization tests to easily differentiate between Sentry-specific tests vs generic authorization tests. h3. Approach # Move {{AuthorizationTest.java}} and {{AuthorizationStmtTest.java}} to {{org.apache.impala.authorization}} # Rename {{CustomClusterGroupMapper}} and {{CustomClusterResourceAuthorizationProvider}} to {{TestSentryGroupMapper}} and {{TestSentryAuthorizationProvider}} since those two classes aren't specific to custom cluster anymore. # Move those two files into {{org.apache.impala.testutil}} instead since they're not actually test classes. Note: all classes to remain in {{test}} sourceset was: We have authorization tests that are specific to Sentry and authorization tests that can be applicable to any authorization provider. We need to re-organize the authorization tests to easily differentiate between Sentry-specific tests vs generic authorization tests. h3. Approach # Move `AuthorizationTest.java` and `AuthorizationStmtTest.java` to `org.apache.impala.authorization` # Rename `CustomClusterGroupMapper` and `CustomClusterResourceAuthorizationProvider` to `TestSentryGroupMapper` and `TestSentryAuthorizationProvider` since those two classes aren't specific to custom cluster anymore. # Move those two files into `org.apache.impala.testutil` instead since they're not actually test classes.
Note: all classes to remain in `test` sourceset > Re-organize authorization tests > --- > > Key: IMPALA-8248 > URL: https://issues.apache.org/jira/browse/IMPALA-8248 > Project: IMPALA > Issue Type: Sub-task > Components: Infrastructure >Reporter: Fredy Wijaya >Assignee: radford nguyen >Priority: Major > > We have authorization tests that are specific to Sentry and authorization > tests that can be applicable to any authorization provider. We need to > re-organize the authorization tests to easily differentiate between > Sentry-specific tests vs generic authorization tests. > > h3. Approach > # Move {{AuthorizationTest.java}} and {{AuthorizationStmtTest.java}} to > {{org.apache.impala.authorization}} > # Rename {{CustomClusterGroupMapper}} and > {{CustomClusterResourceAuthorizationProvider}} to {{TestSentryGroupMapper}} > and {{TestSentryAuthorizationProvider}} since those two classes aren't specific > to custom cluster anymore. > # Move those two files into {{org.apache.impala.testutil}} instead since > they're not actually test classes. > Note: all classes to remain in {{test}} sourceset -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Work started] (IMPALA-8248) Re-organize authorization tests
[ https://issues.apache.org/jira/browse/IMPALA-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-8248 started by radford nguyen. -- > Re-organize authorization tests > --- > > Key: IMPALA-8248 > URL: https://issues.apache.org/jira/browse/IMPALA-8248 > Project: IMPALA > Issue Type: Sub-task > Components: Infrastructure >Reporter: Fredy Wijaya >Assignee: radford nguyen >Priority: Major > > We have authorization tests that are specific to Sentry and authorization > tests that can be applicable to any authorization provider. We need to > re-organize the authorization tests to easily differentiate between > Sentry-specific tests vs generic authorization tests. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8272) test_catalog_tablesfilesusage failing
[ https://issues.apache.org/jira/browse/IMPALA-8272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783928#comment-16783928 ] ASF subversion and git services commented on IMPALA-8272: - Commit 43adfac5078780cd939d8ba23d481529dbebf0aa in impala's branch refs/heads/master from Yongzhi Chen [ https://gitbox.apache.org/repos/asf?p=impala.git;h=43adfac ] IMPALA-8272: Fix test_catalog_tablesfilesusage failing The test can run in any context, do not make any assumption. Change-Id: I41cfa59882edafcd5e61d2e119cd8e8bff08e544 Reviewed-on: http://gerrit.cloudera.org:8080/12649 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > test_catalog_tablesfilesusage failing > - > > Key: IMPALA-8272 > URL: https://issues.apache.org/jira/browse/IMPALA-8272 > Project: IMPALA > Issue Type: Improvement >Affects Versions: Impala 3.2.0 >Reporter: Bikramjeet Vig >Assignee: Yongzhi Chen >Priority: Critical > Labels: broken-build > > test_catalog_tablesfilesusage fails in exhaustive builds because the way the > test is set up, it expects a certain table to always show up in the top 3 > list but if the catalog at that time has already loaded data for tables that > have more files than the expected table, then it would not show up and the > test would fail. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8274) Missing update to index into profiles vector in Coordinator::BackendState::ApplyExecStatusReport()
[ https://issues.apache.org/jira/browse/IMPALA-8274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783929#comment-16783929 ] ASF subversion and git services commented on IMPALA-8274: - Commit 110b362a52eda053caa0d177016e22a23a1a9612 in impala's branch refs/heads/master from Michael Ho [ https://gitbox.apache.org/repos/asf?p=impala.git;h=110b362 ] IMPALA-8274: Fix iteration of profiles in ApplyExecStatusReport() The coordinator skips over any stale or duplicated status reports of fragment instances. In the previous implementation, the index pointing into the vector of Thrift profiles wasn't updated when skipping over a status report. This breaks the assumption that the status reports and thrift profiles vectors have one-to-one correspondence. Consequently, we may end up passing the wrong profile to InstanceStats::Update(), leading to random crashes. This change fixes the problem above by using iterators to iterate through the status reports and thrift profiles vectors and ensures that both iterators are updated on every iteration of the loop. Change-Id: I8bce426c7d08ffbf0f8cd26889262243a52cc752 Reviewed-on: http://gerrit.cloudera.org:8080/12651 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Missing update to index into profiles vector in > Coordinator::BackendState::ApplyExecStatusReport() > -- > > Key: IMPALA-8274 > URL: https://issues.apache.org/jira/browse/IMPALA-8274 > Project: IMPALA > Issue Type: Bug > Components: Distributed Exec >Reporter: Michael Ho >Assignee: Michael Ho >Priority: Blocker > Labels: crash > > {{idx}} isn't updated in case we skip a duplicated or stale update > of a fragment instance. As a result, we may end up passing the wrong profile > to {{instance_stats->Update()}}. This may lead to random crashes in > {{Coordinator::BackendState::InstanceStats::Update}}.
> {noformat} > int idx = 0; > const bool has_profile = thrift_profiles.profile_trees.size() > 0; > TRuntimeProfileTree empty_profile; > for (const FragmentInstanceExecStatusPB& instance_exec_status : >backend_exec_status.instance_exec_status()) { > int64_t report_seq_no = instance_exec_status.report_seq_no(); > int instance_idx = > GetInstanceIdx(instance_exec_status.fragment_instance_id()); > DCHECK_EQ(instance_stats_map_.count(instance_idx), 1); > InstanceStats* instance_stats = instance_stats_map_[instance_idx]; > int64_t last_report_seq_no = instance_stats->last_report_seq_no_; > DCHECK(instance_stats->exec_params_.instance_id == > ProtoToQueryId(instance_exec_status.fragment_instance_id())); > // Ignore duplicate or out-of-order messages. > if (report_seq_no <= last_report_seq_no) { > VLOG_QUERY << Substitute("Ignoring stale update for query instance $0 > with " > "seq no $1", PrintId(instance_stats->exec_params_.instance_id), > report_seq_no); > continue; <<--- // XXX bad > } > DCHECK(!instance_stats->done_); > DCHECK(!has_profile || idx < thrift_profiles.profile_trees.size()); > const TRuntimeProfileTree& profile = > has_profile ? thrift_profiles.profile_trees[idx++] : empty_profile; > instance_stats->Update(instance_exec_status, profile, exec_summary, > scan_range_progress); > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
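The pairing bug and the iterator-style fix from the commit above can be illustrated with a small sketch (Python rather than Impala's actual C++, with simplified data shapes; the names here are illustrative, not Impala's): advancing the report and profile sequences together guarantees that skipping a stale report also skips its profile.

```python
# Illustrative sketch of the IMPALA-8274 fix: status reports and profiles
# are assumed to line up one-to-one, so they must be advanced together.
def apply_status_reports(reports, profiles):
    """reports: list of (instance_id, seq_no); profiles: parallel list."""
    last_seen = {}
    applied = []
    # zip() advances both sequences in lockstep, mirroring the two-iterator
    # fix. The buggy version indexed profiles with a counter that was not
    # incremented on the `continue` path, mispairing every later report.
    for report, profile in zip(reports, profiles):
        inst, seq_no = report
        if seq_no <= last_seen.get(inst, -1):
            continue  # stale/duplicate: its profile is skipped along with it
        last_seen[inst] = seq_no
        applied.append((inst, profile))
    return applied
```

With the counter-based pairing, the `continue` left the profile index unchanged, so the next accepted report would read an earlier instance's profile.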
[jira] [Commented] (IMPALA-8249) End-to-end test framework doesn't read aggregated counters properly
[ https://issues.apache.org/jira/browse/IMPALA-8249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783927#comment-16783927 ] ASF subversion and git services commented on IMPALA-8249: - Commit dc1bc3ca03337d6d63d88261226047bb7a55493b in impala's branch refs/heads/master from Zoltan Borok-Nagy [ https://gitbox.apache.org/repos/asf?p=impala.git;h=dc1bc3c ] IMPALA-8249: End-to-end test framework doesn't read aggregated counters properly Updated compute_aggregation() function to not read the pretty-printed value from the runtime profile, but the accurate value which is at the end of the line in parentheses, e.g.: RowsReturned: 2.14M (2142543) The old regex tried to parse '2.14M' with '\d+', which resulted in '2' instead of 2142543. I tested the change manually and added a test case to 'tests/unittests/test_result_verifier.py'. Change-Id: I2a6fc0d3f7cbaa87aa848cdafffad21fb1514930 Reviewed-on: http://gerrit.cloudera.org:8080/12589 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > End-to-end test framework doesn't read aggregated counters properly > --- > > Key: IMPALA-8249 > URL: https://issues.apache.org/jira/browse/IMPALA-8249 > Project: IMPALA > Issue Type: Bug >Reporter: Zoltán Borók-Nagy >Assignee: Zoltán Borók-Nagy >Priority: Major > > The test framework doesn't always read the correct value of counters from the > runtime profile. In the .test files we can have a RUNTIME_PROFILE section > where we can test our expectations against runtime profile data.
We can even > calculate aggregates of runtime data, currently only SUM is supported over > integer data, e.g.: > {code:java} > RUNTIME_PROFILE > aggregation(SUM, RowsReturned): 2142543 > {code} > However, the counters are pretty-printed in the runtime profile, which means > that if they are greater than 1000, a shortened version is printed first, > then the accurate number comes in parentheses, e.g.: > {code:java} > RowsReturned: 2.14M (2142543){code} > When the test framework parses the value of an aggregated counter, it > wrongly tries to parse the short version as a number, which returns a wrong > value (2 instead of 2142543 in the example). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
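A minimal sketch of the parsing idea behind the fix (a hypothetical helper, not the actual code in the test framework): prefer the exact value in parentheses, and fall back to a plain integer for counters small enough to be printed without the shortened form.

```python
import re

def parse_counter(line):
    """Extract the exact counter value from a pretty-printed profile line,
    e.g. "RowsReturned: 2.14M (2142543)" -> 2142543."""
    # The accurate value, when present, is the parenthesized integer at the
    # end of the line; matching '\d+' against "2.14M" would yield just 2.
    m = re.search(r'\((\d+)\)', line)
    if m:
        return int(m.group(1))
    # Counters below 1000 are printed as a plain integer with no parentheses.
    m = re.search(r':\s*(\d+)\b', line)
    return int(m.group(1)) if m else None
```

Usage: `parse_counter("RowsReturned: 2.14M (2142543)")` returns `2142543`, while `parse_counter("RowsReturned: 214")` returns `214`.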
[jira] [Commented] (IMPALA-7190) Remove unsupported format write support
[ https://issues.apache.org/jira/browse/IMPALA-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783926#comment-16783926 ] ASF subversion and git services commented on IMPALA-7190: - Commit 597e378dce448b9488f1dd13c0668fc3c6f828a8 in impala's branch refs/heads/2.x from Bikramjeet Vig [ https://gitbox.apache.org/repos/asf?p=impala.git;h=597e378 ] IMPALA-7190: Remove unsupported format writer support This patch removes write support for unsupported formats like Sequence, Avro and compressed text. Also, the related query options ALLOW_UNSUPPORTED_FORMATS and SEQ_COMPRESSION_MODE have been migrated to the REMOVED query options type. Testing: Ran exhaustive build. Change-Id: I821dc7495a901f1658daa500daf3791b386c7185 Reviewed-on: http://gerrit.cloudera.org:8080/10823 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins Reviewed-on: http://gerrit.cloudera.org:8080/12642 Reviewed-by: Tim Armstrong > Remove unsupported format write support > --- > > Key: IMPALA-7190 > URL: https://issues.apache.org/jira/browse/IMPALA-7190 > Project: IMPALA > Issue Type: Task > Components: Backend >Reporter: Tim Armstrong >Assignee: Bikramjeet Vig >Priority: Major > Fix For: Impala 3.1.0 > > > Let's remove the formats gated by ALLOW_UNSUPPORTED_FORMATS since progress > stalled a long time ago. It sounds like there's a consensus on the mailing > list to remove the code: > [https://lists.apache.org/thread.html/749bef4914350ae0756bc88961db2dd39901a649a9cef6949eda5870@%3Cdev.impala.apache.org%3E] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-8279) Revert IMPALA-6658 to avoid ETL performance regression
Andrew Sherman created IMPALA-8279: -- Summary: Revert IMPALA-6658 to avoid ETL performance regression Key: IMPALA-8279 URL: https://issues.apache.org/jira/browse/IMPALA-8279 Project: IMPALA Issue Type: Bug Reporter: Andrew Sherman The fix for IMPALA-6658 seems to cause a measurable regression on {quote} use tpcds; create TABLE store_sales_unpart stored as parquet as SELECT * FROM tpcds.store_sales; INSERT OVERWRITE TABLE store_sales_unpart SELECT * FROM store_sales; {quote} Revert the change to avoid the regression. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (IMPALA-7826) Potential NPE in CatalogOpExecutor
[ https://issues.apache.org/jira/browse/IMPALA-7826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers reassigned IMPALA-7826: --- Assignee: Paul Rogers > Potential NPE in CatalogOpExecutor > -- > > Key: IMPALA-7826 > URL: https://issues.apache.org/jira/browse/IMPALA-7826 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.0 >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Minor > > {{CatalogOpExecutor}} has two copies of the following: > {code:java} >Db db = catalog_.getDb(dbName); >if (db == null) { > throw new CatalogException("Database: " + db.getName() + " does not > exist."); >} > {code} > If {{db}} is null, we can’t call {{.getName()}} on that object. (The IDE > showed a warning for this which is why my attention was directed to it.) > We’ll get a null pointer exception (NPE) when creating the error message. > IMPALA-7823 includes the obvious fix, change {{db.getName()}} to {{dbName}}. > But, there may be deeper problems: > # Perhaps someone thoughtfully wrapped this call stack in a try/catch block > and used the NPE to infer that the DB was not found. > # Perhaps if-statement is wrong: perhaps the catalog_.getDb() method returns > a Db object even if not found, and the if-statement should be checking for “! > db.isValid()” or some such. > # Perhaps the code is dead: it is simply never called. > # Most likely: perhaps this code is used, but the semantics are such that we > already checked the DB earlier in the flow. The check here is superfluous: it > can never fail. The check, if we had one, should be an assertion. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-7826) Potential NPE in CatalogOpExecutor
[ https://issues.apache.org/jira/browse/IMPALA-7826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers resolved IMPALA-7826. - Resolution: Fixed Fixed as part of another patch. > Potential NPE in CatalogOpExecutor > -- > > Key: IMPALA-7826 > URL: https://issues.apache.org/jira/browse/IMPALA-7826 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.0 >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Minor > > {{CatalogOpExecutor}} has two copies of the following: > {code:java} >Db db = catalog_.getDb(dbName); >if (db == null) { > throw new CatalogException("Database: " + db.getName() + " does not > exist."); >} > {code} > If {{db}} is null, we can’t call {{.getName()}} on that object. (The IDE > showed a warning for this which is why my attention was directed to it.) > We’ll get a null pointer exception (NPE) when creating the error message. > IMPALA-7823 includes the obvious fix, change {{db.getName()}} to {{dbName}}. > But, there may be deeper problems: > # Perhaps someone thoughtfully wrapped this call stack in a try/catch block > and used the NPE to infer that the DB was not found. > # Perhaps if-statement is wrong: perhaps the catalog_.getDb() method returns > a Db object even if not found, and the if-statement should be checking for “! > db.isValid()” or some such. > # Perhaps the code is dead: it is simply never called. > # Most likely: perhaps this code is used, but the semantics are such that we > already checked the DB earlier in the flow. The check here is superfluous: it > can never fail. The check, if we had one, should be an assertion. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Work started] (IMPALA-8273) Change metastore configuration template so that table parameters do not exclude impala specific properties
[ https://issues.apache.org/jira/browse/IMPALA-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-8273 started by Vihang Karajgaonkar. --- > Change metastore configuration template so that table parameters do not > exclude impala specific properties > -- > > Key: IMPALA-8273 > URL: https://issues.apache.org/jira/browse/IMPALA-8273 > Project: IMPALA > Issue Type: Sub-task >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > > CDH Hive has a configuration > {{hive.metastore.notification.parameters.exclude.patterns}} which gives the > ability to exclude certain parameter keys from notification events. This is > mainly used as a safety valve in case there are huge values stored in these > parameter maps. The template file should make sure that the parameter > exclusion is disabled (or at least configured such that it does not exclude > {{impala.disableHmsSync}}, which is needed by this feature). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8002) Unstable join ordering for equivalent tables
[ https://issues.apache.org/jira/browse/IMPALA-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783891#comment-16783891 ] Paul Rogers commented on IMPALA-8002: - See IMPALA-8219 for one way to resolve this issue. > Unstable join ordering for equivalent tables > > > Key: IMPALA-8002 > URL: https://issues.apache.org/jira/browse/IMPALA-8002 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.1.0 >Reporter: Paul Rogers >Priority: Minor > > Consider the following test: {{PlannerTest.testJoins()}}: > {noformat} > select t1.d, t2.d > from functional.nulltable t1, functional.nulltable t2, functional.nulltable t3 > where t1.d IS DISTINCT FROM t2.d > and t3.a != t2.g > PLAN > PLAN-ROOT SINK > | > 04:NESTED LOOP JOIN [INNER JOIN] > | predicates: t3.a != t2.g > | > |--02:SCAN HDFS [functional.nulltable t3] > | partitions=1/1 files=1 size=18B > | > 03:NESTED LOOP JOIN [INNER JOIN] > | predicates: t1.d IS DISTINCT FROM t2.d > | > |--00:SCAN HDFS [functional.nulltable t1] > | partitions=1/1 files=1 size=18B > | > 01:SCAN HDFS [functional.nulltable t2] >partitions=1/1 files=1 size=18B > {noformat} > Despite no changes in the planner code, on one run the order flipped to the > above. Previously, 01 was t1 and 00 was t2. > Likely, the behavior when two tables are equivalent has some kind of > non-determinism, perhaps storing candidates in a Java Set or Map with > undefined order. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8050) IS [NOT] NULL gives wrong selectivity when null count is missing
[ https://issues.apache.org/jira/browse/IMPALA-8050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783889#comment-16783889 ] Paul Rogers commented on IMPALA-8050: - A patch for this is available, but it is some work to update various tests. Will offer the patch again once several other planner patches are merged to avoid excessive test case churn. > IS [NOT] NULL gives wrong selectivity when null count is missing > > > Key: IMPALA-8050 > URL: https://issues.apache.org/jira/browse/IMPALA-8050 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.1.0 >Reporter: Paul Rogers >Priority: Minor > > Suppose we have the following query: > {noformat} > select * > from tpch.customer c > where c.c_mktsegment is null > {noformat} > If we have a null count, we can estimate selectivity based on that number. In > the case of the TPC-H test data, after a recent fix to add null count back, > null count is zero so the cardinality of the predicate {{c.c_mktsegment is > null}} is 0 and no rows should be returned. Yet, the query plan shows: > {noformat} > PLAN-ROOT SINK > | > 00:SCAN HDFS [tpch.customer c] >partitions=1/1 files=1 size=23.08MB row-size=218B cardinality=15.00K >predicates: c.c_comment IS NULL > {noformat} > So, the first bug is that the existing code which is supposed to consider > null count (found in {{IsNullPredicate.analyzeImpl()}} does not work. Reason: > the code in {{ColumnStats}} to check if we have nulls is wrong: > {code:java} > public boolean hasNulls() { return numNulls_ > 0; } > {code} > Zero is a perfectly valid null count: it means a NOT NULL column. The marker > for a missing null count is -1 as shown in another method: > {code:java} > public boolean hasStats() { return numNulls_ != -1 || numDistinctValues_ != > -1; } > {code} > This is probably an ambiguity in the name: does "has nulls" mean: > * Do we have valid null count stats? > * Do we have null count stats and we have at least some nulls? 
> Fortunately, the only other use of this method is in (disabled) tests. > h4. Handle Missing Null Counts > Second, if the null count is not available (for older stats), the next-best > approximation is 1/NDV. The code currently guesses 0.1. The 0.1 estimate is > fine if NDV is not available either. > Note that to properly test some of these cases requires new tables in the > test suite with no or partial stats. > h4. Special Consideration for Outer Joins > When this predicate is applied to the result of an outer join, the estimation > methods above *will not* work. Using the table null count to estimate an > outer join null count is clearly wrong, as is using the table NDV value. The > fall-back of .1 will tend to under-estimate an outer join. > Instead, what is needed is a more complex estimate. Assume a left outer join > (all rows from left, plus matching rows from right.) > {noformat} > |join| = |left σ key is not null| * |right|/|key| + |left σ key is null| > {noformat} > So, estimating {{IS NULL}} or {{IS NOT NULL}} after an outer join must use a > different algorithm than when estimating it in a scan. > This suggests that expression selectivity is not an independent exercise as > the code currently assumes it is. Instead, it must be aware of its context. > In this case, the underlying null count for the column in the predicate must > be adjusted when used in an outer join.
> The following TPC-H query gives a very clear example (see {{card-join.test}}): > {code:sql} > select c.c_custkey, o.o_orderkey > from tpch.customer c > left outer join tpch.orders o on c.c_custkey = o.o_custkey > where o.o_clerk is null > {code} > The plan, with the {{IS NULL}} filter applied twice (correct structure, wrong > cardinality estimate): > {noformat} > PLAN-ROOT SINK > | > 02:HASH JOIN [RIGHT OUTER JOIN] > | hash predicates: o.o_custkey = c.c_custkey > | other predicates: o.o_clerk IS NULL > | runtime filters: RF000 <- c.c_custkey > | row-size=51B cardinality=0 > | > |--00:SCAN HDFS [tpch.customer c] > | partitions=1/1 files=1 size=23.08MB row-size=8B cardinality=150.00K > | > 01:SCAN HDFS [tpch.orders o] >partitions=1/1 files=1 size=162.56MB row-size=43B cardinality=1.50M >runtime filters: RF000 -> o.o_custkey > {noformat} > The math: > * The query obtains all customer rows, {{|customer| = 150K}}. > * The query obtains all order rows where the clerk is null, which is none. > * The query then left outer joins the customer table with orders. Since only > 100K customers have orders, 50K do not. The result would be a join with 50K > null clerks. > * But, because the {{IS NULL}} calculations after the join consider only the > {{orders}} null count, all the other rows are assumed discarded. > So, it may be that
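The arithmetic walked through in the TPC-H example above can be checked with a quick back-of-the-envelope computation (a sketch of the math, not planner code; it uses the figures quoted in the example: 150K customers, of which 100K have orders, and no null clerks in the base table).

```python
# Rows surviving "o.o_clerk IS NULL" after customer LEFT OUTER JOIN orders:
# null clerks come either from order rows whose clerk is null (none in the
# TPC-H data) or from customers with no matching order, which the outer
# join null-extends.
customers = 150_000
customers_with_orders = 100_000   # figure quoted in the example above
orders_with_null_clerk = 0        # o_clerk is never null in the base table
expected_rows = orders_with_null_clerk + (customers - customers_with_orders)
print(expected_rows)  # 50000, versus the plan's cardinality estimate of 0
```

This is exactly the gap the ticket describes: the post-join {{IS NULL}} estimate considers only the {{orders}} null count and misses the null-extended rows.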
[jira] [Updated] (IMPALA-8086) Check query option value when set
[ https://issues.apache.org/jira/browse/IMPALA-8086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers updated IMPALA-8086: Summary: Check query option value when set (was: Check the value during set ) > Check query option value when set > -- > > Key: IMPALA-8086 > URL: https://issues.apache.org/jira/browse/IMPALA-8086 > Project: IMPALA > Issue Type: Improvement >Reporter: Janaki Lahorani >Priority: Major > > When a parameter is set to some value, it is only validated when the query is run. > It should ideally be validated when SET is called. > [localhost:21000] functional_kudu> set runtime_filter_mode=On; > RUNTIME_FILTER_MODE set to On > [localhost:21000] functional_kudu> select STRAIGHT_JOIN count(*) from > decimal_rtf_tbl a join [BROADCAST] decimal_rtf_tbl_tiny_d5_kudu b where > a.d5_0 = b.d5_0; > Query: select STRAIGHT_JOIN count(*) from decimal_rtf_tbl a join [BROADCAST] > decimal_rtf_tbl_tiny_d5_kudu b where a.d5_0 = b.d5_0 > Query submitted at: 2018-12-07 20:00:55 (Coordinator: > http://janaki-OptiPlex-7050:25000) > ERROR: Errors parsing query options > Invalid runtime filter mode 'On'. Valid modes are OFF(0), LOCAL(1) or > GLOBAL(2). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
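The eager-validation behavior this ticket asks for can be sketched as follows (illustrative Python, not Impala's C++ option parser; the function and table names are made up): check the enum value when SET executes so the error from the example above surfaces immediately rather than at query time.

```python
# Valid RUNTIME_FILTER_MODE values, as listed in the error message above.
VALID_RUNTIME_FILTER_MODES = {"OFF": 0, "LOCAL": 1, "GLOBAL": 2}

def set_query_option(options, name, value):
    """Validate eagerly: reject a bad enum value at SET time instead of
    deferring the failure until a query is run."""
    if name.upper() == "RUNTIME_FILTER_MODE":
        key = value.upper()
        if key not in VALID_RUNTIME_FILTER_MODES and not value.isdigit():
            raise ValueError(
                f"Invalid runtime filter mode '{value}'. "
                "Valid modes are OFF(0), LOCAL(1) or GLOBAL(2).")
        # Normalize symbolic names; numeric values are kept as given.
        value = key if key in VALID_RUNTIME_FILTER_MODES else value
    options[name.upper()] = value
    return options
```

With this approach, `set runtime_filter_mode=On` would fail at the SET statement itself, matching the behavior the ticket requests.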
[jira] [Resolved] (IMPALA-8157) Log exceptions from the front end
[ https://issues.apache.org/jira/browse/IMPALA-8157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers resolved IMPALA-8157. - Resolution: Won't Fix Turns out the back end does log exceptions. Perhaps when I filed this I could not find them. Closing this bug for now unless I verify that the exceptions are not, in fact, getting logged. > Log exceptions from the front end > - > > Key: IMPALA-8157 > URL: https://issues.apache.org/jira/browse/IMPALA-8157 > Project: IMPALA > Issue Type: Improvement > Components: Frontend >Affects Versions: Impala 3.1.0 >Reporter: Paul Rogers >Priority: Minor > > The BE calls into the FE for a variety of operations. Each of these may fail > in expected ways (invalid query, say) or unexpected ways (a code change > introduces a null pointer exception.) > At present, the BE logs only the exception, and only at the INFO level. This > ticket asks to log all unexpected exceptions at the ERROR level. The basic > idea is to extend all FE entry points to do: > {code:java} > try { > // Do the operation > } catch (ExpectedException e) { > // Don't log expected exceptions > throw e; > } catch (Throwable e) { > LOG.error("Something went wrong", e); > throw e; > } > {code} > The above code logs all exceptions except for those that are considered > expected. The job of this ticket is to: > * Find all the entry points > * Identify which, if any, exceptions are expected > * Add logging code with an error message that identifies the operation > This pattern was tested ad-hoc to find a bug during development and seems to > work fine. As. a result, the change is mostly a matter of the above three > steps. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-8278) Fix testEventProcessorFetchAfterHMSRestart
Vihang Karajgaonkar created IMPALA-8278: --- Summary: Fix testEventProcessorFetchAfterHMSRestart Key: IMPALA-8278 URL: https://issues.apache.org/jira/browse/IMPALA-8278 Project: IMPALA Issue Type: Sub-task Reporter: Vihang Karajgaonkar Assignee: Vihang Karajgaonkar {{testEventProcessorFetchAfterHMSRestart}} test case in {{MetastoreEventsProcessorTest}} causes flakiness because it creates a new event processor pointing to the same catalog instance. This means that all the events generated are now being processed by two event processor instances and they both try to modify the state of catalogd, causing race conditions. The failures vary and depend a lot on the timing. I see the following exception which is related to this issue. Easiest way to figure out if this is a problem is to look into FeSupport logs of the test to confirm whether an event ID is being processed twice (i.e. you see two identical log lines for a given event ID). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Work started] (IMPALA-8258) Enhance functional tables with realistic star-schema simulation
[ https://issues.apache.org/jira/browse/IMPALA-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-8258 started by Paul Rogers. --- > Enhance functional tables with realistic star-schema simulation > --- > > Key: IMPALA-8258 > URL: https://issues.apache.org/jira/browse/IMPALA-8258 > Project: IMPALA > Issue Type: Improvement > Components: Frontend >Affects Versions: Impala 3.1.0 >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Minor > > The tables in the `functional` db provide many interesting cases. The tables > in TPC-H and TPC-DS simulate a well-behaved application. > We also need some tables that show messy, real-world cases: > * Correlated filters (same filter on multiple tables) > * Correlated keys (same join key across multiple tables) > * Extreme data skew > A simple four-table star-schema structure can give us what we want. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8265) Reject INSERT/UPSERT queries with ORDER BY and no OFFSET/LIMIT
[ https://issues.apache.org/jira/browse/IMPALA-8265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783867#comment-16783867 ] Tim Armstrong commented on IMPALA-8265: --- Yeah I think I understand the problem now. It's definitely an unexpected interaction between multiple decisions that made sense in isolation. Definitely we don't want any surprising behaviour, but we also don't want to break any existing workflows, so ideally we would have a solution that avoids both kinds of issues. I think we'd have to consider making this a hard failure to be a breaking change, so it's not valid in a minor release. Here are some ideas: # We could do nothing, which ensures no existing workflows are broken, and try to improve documentation. Potentially we could add a flag and/or switch the behaviour later in a major version. This leaves the potential for confusion among users. # We could change it to a hard error immediately, maybe overridable by an option. I think this is unacceptable because of the potential for breakage. # We could change the behaviour so that the ORDER BY is actually honoured. This solves the confusion and doesn't break existing workflows (aside from weirdly-written queries getting slower because of the sort). ## Ordering is enforced only between rows with the same primary key, i.e. we can still partition rows by the primary key and insert in parallel. This would mean that the side-effects of inserts are not strictly ordered. ## Ordering is enforced among all rows. This would force us to send all rows through the same node. To me, options 1. and 3.1 seem viable. 3.1 requires some real work but avoids the biggest downsides and makes some new workloads possible. We already insert sorts before Kudu inserts/upserts, but this changes the semantics a bit. 
> Reject INSERT/UPSERT queries with ORDER BY and no OFFSET/LIMIT > --- > > Key: IMPALA-8265 > URL: https://issues.apache.org/jira/browse/IMPALA-8265 > Project: IMPALA > Issue Type: Improvement >Reporter: Andy Stadtler >Priority: Critical > > Currently Impala doesn't honor an ORDER BY without a LIMIT or OFFSET in an > INSERT ... SELECT operation. While Impala currently throws a warning, it seems > like this query should be rejected with the same message. Especially now with > the UPSERT ability and Kudu, it's obvious logic to take a table of duplicate > rows and use the following query. > {code:java} > UPSERT INTO kudu_table SELECT col1, col2, col3 FROM duplicate_row_table ORDER > BY timestamp_column ASC;{code} > Impala will happily take this query and write incorrect data. The same query > works fine as a SELECT-only query, and it's easy to see where users would make > the mistake of reusing it in an INSERT/UPSERT. > > Rejecting the query with the warning message would make sure the user knew > the ORDER BY would not be honored, and make sure they added a limit, changed > their query logic, or removed the ORDER BY. > > {quote}*Sorting considerations:* Although you can specify an {{ORDER BY}} > clause in an {{INSERT ... SELECT}} statement, any {{ORDER BY}} clause is > ignored and the results are not necessarily sorted. An {{INSERT ... SELECT}} > operation potentially creates many different data files, prepared on > different data nodes, and therefore the notion of the data being stored in > sorted order is impractical. > {quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
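A quick sketch of why the silently ignored ORDER BY matters for the dedup-by-UPSERT pattern described in the issue. Python dictionaries stand in for Kudu's last-write-wins upsert semantics here; the keys, timestamps, and values are made up for illustration.

```python
# Hypothetical model of Kudu UPSERT: for a given primary key, the last
# row written wins.
def upsert_all(rows):
    table = {}
    for key, ts, value in rows:
        table[key] = (ts, value)
    return table

# Duplicate rows for key "k1"; the user wants the newest timestamp to win.
rows = [("k1", "2019-01-02", "new"), ("k1", "2019-01-01", "old")]

# What the user expects: ORDER BY timestamp ASC, so "new" is written last.
expected = upsert_all(sorted(rows, key=lambda r: r[1]))
print(expected["k1"][1])  # new

# What happens if the ORDER BY is silently dropped and rows arrive as-is:
actual = upsert_all(rows)
print(actual["k1"][1])  # old -- the stale value silently wins
```

This is why the issue calls the result "incorrect data": the query succeeds, but which duplicate survives depends on arrival order rather than the ORDER BY the user wrote.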
[jira] [Commented] (IMPALA-8273) Change metastore configuration template so that table parameters do not exclude impala specific properties
[ https://issues.apache.org/jira/browse/IMPALA-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783865#comment-16783865 ] Vihang Karajgaonkar commented on IMPALA-8273: - Adding the gerrit link > Change metastore configuration template so that table parameters do not > exclude impala specific properties > -- > > Key: IMPALA-8273 > URL: https://issues.apache.org/jira/browse/IMPALA-8273 > Project: IMPALA > Issue Type: Sub-task >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > > CDH Hive has a configuration > {{hive.metastore.notification.parameters.exclude.patterns}} which gives the > ability to exclude certain parameter keys from notification events. This is > mainly used as a safety valve in case there are huge values stored in these > parameter maps. The template file should make sure that the parameter > exclusion is disabled (or at least configured such that it does not exclude > {{impala.disableHmsSync}}, which is needed by this feature). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8276) Self equal to self predicate "x = x" generated by Impala caused incorrect query result
[ https://issues.apache.org/jira/browse/IMPALA-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783854#comment-16783854 ] Tim Armstrong commented on IMPALA-8276: --- I heard [~Paul.Rogers] is taking over, so I guess that question should be directed to him (if he's not already doing it). > Self equal to self predicate "x = x" generated by Impala caused incorrect > query result > -- > > Key: IMPALA-8276 > URL: https://issues.apache.org/jira/browse/IMPALA-8276 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.0 >Reporter: Yongjun Zhang >Assignee: Paul Rogers >Priority: Blocker > Labels: correctness > > Reported with cdh5.12.1: a "self equal to self" kind of bogus predicate "x > = x" is generated by Impala and causes incorrect query results, because this > kind of predicate returns false for "null" entries. > It was observed that a {{count(*)}} query returned fewer rows than a CTAS > query, though the query body is the same for both, because the former > generated the bogus predicate and the latter didn't. > For example, > {code:java} > select count(*) from > (select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = b.q) > a{code} > returned fewer rows than > {code:java} > create table abc as > select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = > b.q{code} > because predicate {{a.z = a.z_dt}} was created (for reasons yet to be > understood; notice b.z_dt is an alias of b.z), exhibited as "table1.z = table1.z" in the > query plan in the Impala query profile, because a and b are aliases of view1 and > view2, both of which are views created in a very nested way that involves > table table1. > Though in cdh5.12.1 the select and the count query return different results > in the initial case, an attempted reproduction shows that both queries get > bogus predicates. And cdh5.15.2 has the same problem. 
Was not able to try > it out with the most recent master branch of Impala due to metadata incompatibility. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
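The effect of the bogus predicate is easy to demonstrate with standard SQL NULL semantics (sqlite3 here as a stand-in engine; the table and column names are made up): `x = x` evaluates to NULL rather than TRUE when `x` is NULL, so such a predicate silently drops NULL rows, which is exactly why the count query returns fewer rows.

```python
import sqlite3

# In-memory table with one NULL value among three rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (z TEXT)")
conn.executemany("INSERT INTO t VALUES (?)", [("a",), (None,), ("b",)])

total = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
# "z = z" is NULL (not TRUE) for the NULL row, so the WHERE clause drops it.
filtered = conn.execute("SELECT COUNT(*) FROM t WHERE z = z").fetchone()[0]
print(total, filtered)  # 3 2
```

Any engine following SQL's three-valued logic behaves the same way, so a planner-generated `table1.z = table1.z` predicate is not a harmless no-op whenever the column is nullable.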
[jira] [Assigned] (IMPALA-8276) Self equal to self predicate "x = x" generated by Impala caused incorrect query result
[ https://issues.apache.org/jira/browse/IMPALA-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong reassigned IMPALA-8276: - Assignee: Paul Rogers > Self equal to self predicate "x = x" generated by Impala caused incorrect > query result > -- > > Key: IMPALA-8276 > URL: https://issues.apache.org/jira/browse/IMPALA-8276 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.0 >Reporter: Yongjun Zhang >Assignee: Paul Rogers >Priority: Blocker > Labels: correctness > > Reported with cdh5.12.1: a "self equal to self" kind of bogus predicate "x > = x" is generated by Impala and causes incorrect query results, because this > kind of predicate returns false for "null" entries. > It was observed that a {{count(*)}} query returned fewer rows than a CTAS > query, though the query body is the same for both, because the former > generated the bogus predicate and the latter didn't. > For example, > {code:java} > select count(*) from > (select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = b.q) > a{code} > returned fewer rows than > {code:java} > create table abc as > select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = > b.q{code} > because predicate {{a.z = a.z_dt}} was created (for reasons yet to be > understood; notice b.z_dt is an alias of b.z), exhibited as "table1.z = table1.z" in the > query plan in the Impala query profile, because a and b are aliases of view1 and > view2, both of which are views created in a very nested way that involves > table table1. > Though in cdh5.12.1 the select and the count query return different results > in the initial case, an attempted reproduction shows that both queries get > bogus predicates. And cdh5.15.2 has the same problem. Was not able to try > it out with the most recent master branch of Impala due to metadata incompatibility. 
> -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-8277) CHECK can be hit when there are gaps in present CPU numbers (KUDU-2721)
Tim Armstrong created IMPALA-8277: - Summary: CHECK can be hit when there are gaps in present CPU numbers (KUDU-2721) Key: IMPALA-8277 URL: https://issues.apache.org/jira/browse/IMPALA-8277 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 3.1.0, Impala 3.2.0 Reporter: Tim Armstrong Assignee: Tim Armstrong This is a placeholder to port KUDU-2721 to our gutil once it's fixed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IMPALA-7804) Various scanner tests intermittently failing on S3 on different runs
[ https://issues.apache.org/jira/browse/IMPALA-7804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe McDonnell resolved IMPALA-7804. --- Resolution: Fixed Fix Version/s: Impala 3.2.0 Closing this, as we made some test changes that alleviated this issue in 3.2 > Various scanner tests intermittently failing on S3 on different runs > > > Key: IMPALA-7804 > URL: https://issues.apache.org/jira/browse/IMPALA-7804 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.2.0 >Reporter: David Knupp >Assignee: Joe McDonnell >Priority: Blocker > Labels: S3, broken-build, flaky > Fix For: Impala 3.2.0 > > > The failures have to do with getting AWS client credentials. > *query_test/test_scanners.py:696: in test_decimal_encodings* > _Stacktrace_ > {noformat} > query_test/test_scanners.py:696: in test_decimal_encodings > self.run_test_case('QueryTest/parquet-decimal-formats', vector, > unique_database) > common/impala_test_suite.py:496: in run_test_case > self.__verify_results_and_errors(vector, test_section, result, use_db) > common/impala_test_suite.py:358: in __verify_results_and_errors > replace_filenames_with_placeholder) > common/test_result_verifier.py:438: in verify_raw_results > VERIFIER_MAP[verifier](expected, actual) > common/test_result_verifier.py:260: in verify_query_result_is_equal > assert expected_results == actual_results > E assert Comparing QueryTestResults (expected vs actual): > E -255.00,-255.00,-255.00 == -255.00,-255.00,-255.00 > E -255.00,-255.00,-255.00 != -65535.00,-65535.00,-65535.00 > E -65535.00,-65535.00,-65535.00 != -999.99,-999.99,-999.99 > E -65535.00,-65535.00,-65535.00 != > 0.00,-.99,-.99 > E -999.99,-999.99,-999.99 != 0.00,0.00,0.00 > E -999.99,-999.99,-999.99 != > 0.00,.99,.99 > E 0.00,-.99,-.99 != > 255.00,255.00,255.00 > E 0.00,-.99,-.99 != > 65535.00,65535.00,65535.00 > E 0.00,0.00,0.00 != 999.99,999.99,999.99 > E 0.00,0.00,0.00 != None > E 0.00,.99,.99 != None > E 0.00,.99,.99 != None > 
E 255.00,255.00,255.00 != None > E 255.00,255.00,255.00 != None > E 65535.00,65535.00,65535.00 != None > E 65535.00,65535.00,65535.00 != None > E 999.99,999.99,999.99 != None > E 999.99,999.99,999.99 != None > E Number of rows returned (expected vs actual): 18 != 9 > {noformat} > _Standard Error_ > {noformat} > SET sync_ddl=False; > -- executing against localhost:21000 > DROP DATABASE IF EXISTS `test_huge_num_rows_76a09ef1` CASCADE; > -- 2018-11-01 09:42:41,140 INFO MainThread: Started query > 4c4bc0e7b69d7641:130ffe73 > SET sync_ddl=False; > -- executing against localhost:21000 > CREATE DATABASE `test_huge_num_rows_76a09ef1`; > -- 2018-11-01 09:42:42,402 INFO MainThread: Started query > e34d714d6a62cba1:2a8544d0 > -- 2018-11-01 09:42:42,405 INFO MainThread: Created database > "test_huge_num_rows_76a09ef1" for test ID > "query_test/test_scanners.py::TestParquet::()::test_huge_num_rows[protocol: > beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, > 'abort_on_error': 1, 'debug_action': > '-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@1.0', > 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]" > 18/11/01 09:42:43 DEBUG s3a.S3AFileSystem: Initializing S3AFileSystem for > impala-test-uswest2-1 > 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Propagating entries under > fs.s3a.bucket.impala-test-uswest2-1. > 18/11/01 09:42:43 WARN impl.MetricsConfig: Cannot locate configuration: tried > hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties > 18/11/01 09:42:43 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot > period at 10 second(s). 
> 18/11/01 09:42:43 INFO impl.MetricsSystemImpl: s3a-file-system metrics system > started > 18/11/01 09:42:43 DEBUG s3a.S3AUtils: For URI s3a://impala-test-uswest2-1/, > using credentials AWSCredentialProviderList: BasicAWSCredentialsProvider > EnvironmentVariableCredentialsProvider > com.amazonaws.auth.InstanceProfileCredentialsProvider@15bbf42f > 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of fs.s3a.connection.maximum is > 1500 > 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of fs.s3a.attempts.maximum is 20 > 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of >
[jira] [Updated] (IMPALA-8189) TestParquet.test_resolution_by_name fails on S3 because 'hadoop fs -cp' fails
[ https://issues.apache.org/jira/browse/IMPALA-8189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-8189: -- Fix Version/s: Impala 3.2.0 > TestParquet.test_resolution_by_name fails on S3 because 'hadoop fs -cp' fails > -- > > Key: IMPALA-8189 > URL: https://issues.apache.org/jira/browse/IMPALA-8189 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Andrew Sherman >Assignee: Pooja Nilangekar >Priority: Critical > Labels: broken-build, flaky-test > Fix For: Impala 3.2.0 > > > In parquet-resolution-by-name.test a parquet file is copied. > {quote} > SHELL > hadoop fs -cp > $FILESYSTEM_PREFIX/test-warehouse/complextypestbl_parquet/nullable.parq \ > $FILESYSTEM_PREFIX/test-warehouse/$DATABASE.db/nested_resolution_by_name_test/ > hadoop fs -cp > $FILESYSTEM_PREFIX/test-warehouse/complextypestbl_parquet/nonnullable.parq \ > $FILESYSTEM_PREFIX/test-warehouse/$DATABASE.db/nested_resolution_by_name_test/ > {quote} > The first copy succeeds, but the second fails. In the DEBUG output (below) > you can see the copy writing data to an intermediate file > test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_ > and then after the stream is closed, the copy cannot find the file. 
> {quote} > 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Getting path status for > s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_ > > (test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_) > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += > 1 -> 7 > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += > 1 -> 8 > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_list_requests += 1 > -> 3 > 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Not Found: > s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_ > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: op_create += 1 -> 1 > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: op_get_file_status += 1 -> > 6 > 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Getting path status for > s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_ > > (test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_) > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += > 1 -> 9 > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += > 1 -> 10 > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_list_requests += 1 > -> 4 > 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Not Found: > s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_ > 19/02/12 05:33:13 DEBUG s3a.S3ABlockOutputStream: Initialized > S3ABlockOutputStream for > test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_ > output to FileBlock{index=1, > 
destFile=/tmp/hadoop-jenkins/s3a/s3ablock-0001-1315190405959387081.tmp, > state=Writing, dataSize=0, limit=104857600} > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: op_get_file_status += 1 -> > 7 > 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Getting path status for > s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_ > > (test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_) > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += > 1 -> 11 > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += > 1 -> 12 > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_list_requests += 1 > -> 5 > 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Not Found: > s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_ > 19/02/12 05:33:13 DEBUG s3a.S3AInputStream: > reopen(s3a://impala-test-uswest2-1/test-warehouse/complextypestbl_parquet/nonnullable.parq) > for read from new offset range[0-3186], length=4096, streamPosition=0, > nextReadPosition=0, policy=normal > 19/02/12 05:33:13 DEBUG s3a.S3ABlockOutputStream: > S3ABlockOutputStream{WriteOperationHelper {bucket=impala-test-uswest2-1}, > blockSize=104857600, activeBlock=FileBlock{index=1,
[jira] [Resolved] (IMPALA-8189) TestParquet.test_resolution_by_name fails on S3 because 'hadoop fs -cp' fails
[ https://issues.apache.org/jira/browse/IMPALA-8189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pooja Nilangekar resolved IMPALA-8189. -- Resolution: Fixed > TestParquet.test_resolution_by_name fails on S3 because 'hadoop fs -cp' fails > -- > > Key: IMPALA-8189 > URL: https://issues.apache.org/jira/browse/IMPALA-8189 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Andrew Sherman >Assignee: Pooja Nilangekar >Priority: Critical > Labels: broken-build, flaky-test > > In parquet-resolution-by-name.test a parquet file is copied. > {quote} > SHELL > hadoop fs -cp > $FILESYSTEM_PREFIX/test-warehouse/complextypestbl_parquet/nullable.parq \ > $FILESYSTEM_PREFIX/test-warehouse/$DATABASE.db/nested_resolution_by_name_test/ > hadoop fs -cp > $FILESYSTEM_PREFIX/test-warehouse/complextypestbl_parquet/nonnullable.parq \ > $FILESYSTEM_PREFIX/test-warehouse/$DATABASE.db/nested_resolution_by_name_test/ > {quote} > The first copy succeeds, but the second fails. In the DEBUG output (below) > you can see the copy writing data to an intermediate file > test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_ > and then after the stream is closed, the copy cannot find the file. 
> {quote} > 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Getting path status for > s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_ > > (test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_) > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += > 1 -> 7 > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += > 1 -> 8 > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_list_requests += 1 > -> 3 > 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Not Found: > s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_ > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: op_create += 1 -> 1 > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: op_get_file_status += 1 -> > 6 > 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Getting path status for > s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_ > > (test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_) > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += > 1 -> 9 > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += > 1 -> 10 > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_list_requests += 1 > -> 4 > 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Not Found: > s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_ > 19/02/12 05:33:13 DEBUG s3a.S3ABlockOutputStream: Initialized > S3ABlockOutputStream for > test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_ > output to FileBlock{index=1, > 
destFile=/tmp/hadoop-jenkins/s3a/s3ablock-0001-1315190405959387081.tmp, > state=Writing, dataSize=0, limit=104857600} > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: op_get_file_status += 1 -> > 7 > 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Getting path status for > s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_ > > (test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_) > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += > 1 -> 11 > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += > 1 -> 12 > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_list_requests += 1 > -> 5 > 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Not Found: > s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_ > 19/02/12 05:33:13 DEBUG s3a.S3AInputStream: > reopen(s3a://impala-test-uswest2-1/test-warehouse/complextypestbl_parquet/nonnullable.parq) > for read from new offset range[0-3186], length=4096, streamPosition=0, > nextReadPosition=0, policy=normal > 19/02/12 05:33:13 DEBUG s3a.S3ABlockOutputStream: > S3ABlockOutputStream{WriteOperationHelper {bucket=impala-test-uswest2-1}, > blockSize=104857600, activeBlock=FileBlock{index=1, >
[jira] [Resolved] (IMPALA-8189) TestParquet.test_resolution_by_name fails on S3 because 'hadoop fs -cp' fails
[ https://issues.apache.org/jira/browse/IMPALA-8189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pooja Nilangekar resolved IMPALA-8189. -- Resolution: Fixed > TestParquet.test_resolution_by_name fails on S3 because 'hadoop fs -cp' fails > -- > > Key: IMPALA-8189 > URL: https://issues.apache.org/jira/browse/IMPALA-8189 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Andrew Sherman >Assignee: Pooja Nilangekar >Priority: Critical > Labels: broken-build, flaky-test > > In parquet-resolution-by-name.test a parquet file is copied. > {quote} > SHELL > hadoop fs -cp > $FILESYSTEM_PREFIX/test-warehouse/complextypestbl_parquet/nullable.parq \ > $FILESYSTEM_PREFIX/test-warehouse/$DATABASE.db/nested_resolution_by_name_test/ > hadoop fs -cp > $FILESYSTEM_PREFIX/test-warehouse/complextypestbl_parquet/nonnullable.parq \ > $FILESYSTEM_PREFIX/test-warehouse/$DATABASE.db/nested_resolution_by_name_test/ > {quote} > The first copy succeeds, but the second fails. In the DEBUG output (below) > you can see the copy writing data to an intermediate file > test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_ > and then after the stream is closed, the copy cannot find the file. 
> {quote} > 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Getting path status for > s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_ > > (test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_) > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += > 1 -> 7 > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += > 1 -> 8 > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_list_requests += 1 > -> 3 > 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Not Found: > s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_ > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: op_create += 1 -> 1 > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: op_get_file_status += 1 -> > 6 > 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Getting path status for > s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_ > > (test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_) > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += > 1 -> 9 > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += > 1 -> 10 > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_list_requests += 1 > -> 4 > 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Not Found: > s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_ > 19/02/12 05:33:13 DEBUG s3a.S3ABlockOutputStream: Initialized > S3ABlockOutputStream for > test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_ > output to FileBlock{index=1, > 
destFile=/tmp/hadoop-jenkins/s3a/s3ablock-0001-1315190405959387081.tmp, > state=Writing, dataSize=0, limit=104857600} > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: op_get_file_status += 1 -> > 7 > 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Getting path status for > s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_ > > (test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_) > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += > 1 -> 11 > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += > 1 -> 12 > 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_list_requests += 1 > -> 5 > 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Not Found: > s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_ > 19/02/12 05:33:13 DEBUG s3a.S3AInputStream: > reopen(s3a://impala-test-uswest2-1/test-warehouse/complextypestbl_parquet/nonnullable.parq) > for read from new offset range[0-3186], length=4096, streamPosition=0, > nextReadPosition=0, policy=normal > 19/02/12 05:33:13 DEBUG s3a.S3ABlockOutputStream: > S3ABlockOutputStream{WriteOperationHelper {bucket=impala-test-uswest2-1}, > blockSize=104857600, activeBlock=FileBlock{index=1, >
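The trace above records `hadoop fs -cp` writing the object under a `nonnullable.parq._COPYING_` name and then failing to find it again. The copy is two-phase: stream into a temporary `._COPYING_` object, then rename it into place. The sketch below is a minimal Python illustration of that pattern on a local filesystem; the function name and setting are illustrative, not Hadoop's actual implementation. On S3 at the time, the "Not Found" existence probes issued before the upload (visible in the log) could be cached by S3's eventually consistent metadata layer, so a later existence check on the `._COPYING_` object could still report Not Found even after a successful write.

```python
import os
import shutil
import tempfile

def two_phase_copy(src: str, dst: str) -> None:
    """Sketch of the copy strategy 'hadoop fs -cp' uses: stream the source
    into a '<dst>._COPYING_' temporary, then rename it into place, so a
    half-written destination is never visible under its final name."""
    tmp = dst + "._COPYING_"
    with open(src, "rb") as fin, open(tmp, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    # On a POSIX filesystem this rename is atomic. On S3 "rename" is a
    # server-side copy plus delete, and it first has to getFileStatus() the
    # temporary object -- the step that intermittently failed in the log,
    # plausibly because the earlier pre-write HEAD/LIST probes left a stale
    # negative entry in S3's eventually consistent metadata.
    os.replace(tmp, dst)
```

On a local filesystem the sequence is trivially reliable; the flakiness reported here is specific to emulating rename on an eventually consistent object store.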
[jira] [Commented] (IMPALA-8276) Self equal to self predicate "x = x" generated by Impala caused incorrect query result
[ https://issues.apache.org/jira/browse/IMPALA-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783765#comment-16783765 ] Tim Armstrong commented on IMPALA-8276: --- [~yzhangal] do you have some view definitions that are sufficient to reproduce the issue? > Self equal to self predicate "x = x" generated by Impala caused incorrect > query result > -- > > Key: IMPALA-8276 > URL: https://issues.apache.org/jira/browse/IMPALA-8276 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.0 >Reporter: Yongjun Zhang >Priority: Blocker > Labels: correctness > > Reported with cdh5.12.1: a bogus "self equal to self" predicate "x > = x" is generated by Impala and causes incorrect query results, because this > kind of predicate returns false for "null" entries. > It was observed that a {{count(*)}} query returned fewer rows than a CTAS > query, though the query body is the same for both, because the former > generated the bogus predicate and the latter doesn't. > For example, > {code:java} > select count(*) from > (select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = b.q) > a{code} > returned fewer rows than > {code:java} > create table abc as > select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = > b.q{code} > because predicate {{a.z = a.z_dt}} was created (for reasons yet to be understood; > notice b.z_dt is an alias of b.z), exhibited as "table1.z = table1.z" in the > query plan in the Impala query profile because a and b are aliases of view1 and > view2, both of which are views created in a very nested way that involves > table table1. > Though in cdh5.12.1 the select and the count query return different results > in the initial case, an attempted reproduction shows that both queries get > bogus predicates. And cdh5.15.2 has the same problem. Was not able to try > this out with the most recent master branch of Impala due to metadata incompatibility. 
> -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
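The NULL-dropping behaviour described above follows from SQL three-valued logic: `x = x` evaluates to NULL (not TRUE) when `x` is NULL, and a WHERE or join predicate keeps only rows for which the predicate is TRUE. A minimal Python sketch of that semantics follows; the helper names and sample column are hypothetical, not Impala code.

```python
from typing import Callable, List, Optional

def sql_eq(a: Optional[int], b: Optional[int]) -> Optional[bool]:
    """SQL three-valued equality: any comparison involving NULL yields NULL."""
    if a is None or b is None:
        return None
    return a == b

def where(rows: List[Optional[int]],
          pred: Callable[[Optional[int]], Optional[bool]]) -> List[Optional[int]]:
    """A WHERE clause keeps a row only when the predicate is TRUE;
    both FALSE and NULL (unknown) filter the row out."""
    return [r for r in rows if pred(r) is True]

# z column of the joined result; NULLs come from non-matching LEFT JOIN rows.
rows = [1, None, 2, None]
# The bogus implicitly generated "z = z" predicate:
kept = where(rows, lambda z: sql_eq(z, z))
```

Here `kept` drops every NULL row, so a `count(*)` over the filtered result undercounts relative to the CTAS output, matching the symptom in the report.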
[jira] [Work started] (IMPALA-8269) Clean up authorization test package structure
[ https://issues.apache.org/jira/browse/IMPALA-8269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-8269 started by Mahendra Korepu. --- > Clean up authorization test package structure > - > > Key: IMPALA-8269 > URL: https://issues.apache.org/jira/browse/IMPALA-8269 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Reporter: Fredy Wijaya >Assignee: Mahendra Korepu >Priority: Minor > Labels: ramp-up > > The task is to do some cleanup on the authorization test package structure. > 1. Move AuthorizationTest.java and AuthorizationStmtTest.java to the > authorization test package. > 2. Rename CustomClusterGroupMapper and > CustomClusterResourceAuthorizationProvider to TestSentryGroupMapper and > TestSentryResourceAuthorizationProvider since those two classes aren't specific > to custom clusters anymore. > 3. Move those two files into `testutil` instead since they're not actually > test classes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-8248) Re-organize authorization tests
[ https://issues.apache.org/jira/browse/IMPALA-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] radford nguyen reassigned IMPALA-8248: -- Assignee: radford nguyen > Re-organize authorization tests > --- > > Key: IMPALA-8248 > URL: https://issues.apache.org/jira/browse/IMPALA-8248 > Project: IMPALA > Issue Type: Sub-task > Components: Infrastructure >Reporter: Fredy Wijaya >Assignee: radford nguyen >Priority: Major > > We have authorization tests that are specific to Sentry and authorization > tests that can be applicable to any authorization provider. We need to > re-organize the authorization tests to easily differentiate between > Sentry-specific tests vs generic authorization tests. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Reopened] (IMPALA-8269) Clean up authorization test package structure
[ https://issues.apache.org/jira/browse/IMPALA-8269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fredy Wijaya reopened IMPALA-8269: -- This isn't fixed yet. > Clean up authorization test package structure > - > > Key: IMPALA-8269 > URL: https://issues.apache.org/jira/browse/IMPALA-8269 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Reporter: Fredy Wijaya >Assignee: Mahendra Korepu >Priority: Minor > Labels: ramp-up > > The task is to do some cleanup on the authorization test package structure. > 1. Move AuthorizationTest.java and AuthorizationStmtTest.java to the > authorization test package. > 2. Rename CustomClusterGroupMapper and > CustomClusterResourceAuthorizationProvider to TestSentryGroupMapper and > TestSentryResourceAuthorizationProvider since those two classes aren't specific > to custom clusters anymore. > 3. Move those two files into `testutil` instead since they're not actually > test classes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-8269) Clean up authorization test package structure
[ https://issues.apache.org/jira/browse/IMPALA-8269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahendra Korepu resolved IMPALA-8269. - Resolution: Fixed Changes pushed up for review: https://gerrit.cloudera.org/#/c/12654/ > Clean up authorization test package structure > - > > Key: IMPALA-8269 > URL: https://issues.apache.org/jira/browse/IMPALA-8269 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Reporter: Fredy Wijaya >Assignee: Mahendra Korepu >Priority: Minor > Labels: ramp-up > > The task is to do some cleanup on the authorization test package structure. > 1. Move AuthorizationTest.java and AuthorizationStmtTest.java to the > authorization test package. > 2. Rename CustomClusterGroupMapper and > CustomClusterResourceAuthorizationProvider to TestSentryGroupMapper and > TestSentryResourceAuthorizationProvider since those two classes aren't specific > to custom clusters anymore. > 3. Move those two files into `testutil` instead since they're not actually > test classes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IMPALA-8274) Missing update to index into profiles vector in Coordinator::BackendState::ApplyExecStatusReport()
[ https://issues.apache.org/jira/browse/IMPALA-8274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Apple updated IMPALA-8274: -- Labels: crash (was: ) > Missing update to index into profiles vector in > Coordinator::BackendState::ApplyExecStatusReport() > -- > > Key: IMPALA-8274 > URL: https://issues.apache.org/jira/browse/IMPALA-8274 > Project: IMPALA > Issue Type: Bug > Components: Distributed Exec >Reporter: Michael Ho >Assignee: Michael Ho >Priority: Blocker > Labels: crash > > {{idx}} isn't updated in case we skip a duplicated or stale duplicated update > of a fragment instance. As a result, we may end up passing the wrong profile > to {{instance_stats->Update()}}. This may lead to random crashes in > {{Coordinator::BackendState::InstanceStats::Update}}. > {noformat} > int idx = 0; > const bool has_profile = thrift_profiles.profile_trees.size() > 0; > TRuntimeProfileTree empty_profile; > for (const FragmentInstanceExecStatusPB& instance_exec_status : >backend_exec_status.instance_exec_status()) { > int64_t report_seq_no = instance_exec_status.report_seq_no(); > int instance_idx = > GetInstanceIdx(instance_exec_status.fragment_instance_id()); > DCHECK_EQ(instance_stats_map_.count(instance_idx), 1); > InstanceStats* instance_stats = instance_stats_map_[instance_idx]; > int64_t last_report_seq_no = instance_stats->last_report_seq_no_; > DCHECK(instance_stats->exec_params_.instance_id == > ProtoToQueryId(instance_exec_status.fragment_instance_id())); > // Ignore duplicate or out-of-order messages. > if (report_seq_no <= last_report_seq_no) { > VLOG_QUERY << Substitute("Ignoring stale update for query instance $0 > with " > "seq no $1", PrintId(instance_stats->exec_params_.instance_id), > report_seq_no); > continue; <<--- // XXX bad > } > DCHECK(!instance_stats->done_); > DCHECK(!has_profile || idx < thrift_profiles.profile_trees.size()); > const TRuntimeProfileTree& profile = > has_profile ? 
thrift_profiles.profile_trees[idx++] : empty_profile; > instance_stats->Update(instance_exec_status, profile, exec_summary, > scan_range_progress); > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
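The misalignment is easiest to see in miniature: when the report carries profiles, each instance status must consume exactly one entry of the profile list, whether or not that status is stale. Because the `continue` above runs before `idx++`, a skipped status leaves its profile unconsumed and every later status pairs with the wrong profile. The Python sketch below is illustrative only (hypothetical helper and inputs, not Impala's C++); it shows the loop shape with the index advanced before the staleness check so the two sequences stay aligned.

```python
from typing import Dict, List, Tuple

def pair_profiles(statuses: List[Tuple[int, str]],
                  profiles: List[str],
                  last_seen: Dict[str, int]) -> List[Tuple[str, str]]:
    """Pair each instance status with its profile. statuses is a list of
    (report_seq_no, instance_id); profiles[i] belongs to statuses[i]."""
    idx = 0
    paired = []
    for seq_no, instance in statuses:
        profile = profiles[idx]
        idx += 1  # consume the profile unconditionally, even for skipped reports
        if seq_no <= last_seen.get(instance, -1):
            continue  # stale or duplicate report: ignore it, but idx already moved
        last_seen[instance] = seq_no
        paired.append((instance, profile))
    return paired
```

With the increment hoisted above the `continue`, a duplicate report burns its own profile entry instead of shifting a later instance onto it, which is the alignment property the buggy loop violates.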
[jira] [Updated] (IMPALA-8276) Self equal to self predicate "x = x" generated by Impala caused incorrect query result
[ https://issues.apache.org/jira/browse/IMPALA-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-8276: -- Priority: Blocker (was: Major) > Self equal to self predicate "x = x" generated by Impala caused incorrect > query result > -- > > Key: IMPALA-8276 > URL: https://issues.apache.org/jira/browse/IMPALA-8276 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.0 >Reporter: Yongjun Zhang >Priority: Blocker > > Reported with cdh5.12.1, that "self equal to self" kind of bogus predicate "x > = x" is generated by Impala and caused incorrect query result, because this > kind of predicate return false for "null" entries. > It was observed that a {{count(*)}} query returned fewer rows than a CTAS > query, though the query body is the same for both, because the former > generated the bogus predicate and the latter doesn't. > For example, > {code:java} > select count(*) from > (select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = b.q) > a{code} > returned fewer rows than > {code:java} > create table abc as > select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = > b.q{code} > because predicate {{a.z = a.z_dt}} was created (for reasons to understand, > notice b.z_dt is an alias of b.z), exhibited as "table1.z = table1.z" in the > query plan in Impala query profile because a and b are aliases of view1 and > view2, both of which are views created in a very nested way that involves > table table1. > Though in cdh5.12.1 the select and the count query returns different result > in the initial case, an attempted reproduction shows that both queries get > bogus predicates. And cdh5.15.2 has the same problem. Was not able to try > out with most recent master branch of impala due to meta data incompatibility. 
> -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8276) Self equal to self predicate "x = x" generated by Impala caused incorrect query result
[ https://issues.apache.org/jira/browse/IMPALA-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-8276: -- Labels: correctness (was: ) > Self equal to self predicate "x = x" generated by Impala caused incorrect > query result > -- > > Key: IMPALA-8276 > URL: https://issues.apache.org/jira/browse/IMPALA-8276 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.0 >Reporter: Yongjun Zhang >Priority: Blocker > Labels: correctness > > Reported with cdh5.12.1, that "self equal to self" kind of bogus predicate "x > = x" is generated by Impala and caused incorrect query result, because this > kind of predicate return false for "null" entries. > It was observed that a {{count(*)}} query returned fewer rows than a CTAS > query, though the query body is the same for both, because the former > generated the bogus predicate and the latter doesn't. > For example, > {code:java} > select count(*) from > (select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = b.q) > a{code} > returned fewer rows than > {code:java} > create table abc as > select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = > b.q{code} > because predicate {{a.z = a.z_dt}} was created (for reasons to understand, > notice b.z_dt is an alias of b.z), exhibited as "table1.z = table1.z" in the > query plan in Impala query profile because a and b are aliases of view1 and > view2, both of which are views created in a very nested way that involves > table table1. > Though in cdh5.12.1 the select and the count query returns different result > in the initial case, an attempted reproduction shows that both queries get > bogus predicates. And cdh5.15.2 has the same problem. Was not able to try > out with most recent master branch of impala due to meta data incompatibility. 
> -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8276) Self equal to self predicate "x = x" generated by Impala caused incorrect query result
[ https://issues.apache.org/jira/browse/IMPALA-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated IMPALA-8276: -- Description: Reported with cdh5.12.1, that "self equal to self" kind of bogus predicate "x = x" is generated by Impala and caused incorrect query result, because this kind of predicate return false for "null" entries. It was observed that a {{count(*)}} query returned fewer rows than a CTAS query, though the query body is the same for both, because the former generated the bogus predicate and the latter doesn't. For example, {code:java} select count(*) from (select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = b.q) a{code} returned fewer rows than {code:java} create table abc as select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = b.q{code} because predicate {{a.z = a.z_dt}} was created (for reasons to understand, notice b.z_dt is an alias of b.z), exhibited as "table1.z = table1.z" in the query plan in Impala query profile because a and b are aliases of view1 and view2, both of which are views created in a very nested way that involves table table1. Though in cdh5.12.1 the select and the count query returns different result in the initial case, an attempted reproduction shows that both queries get bogus predicates. And cdh5.15.2 has the same problem. Was not able to try out with most recent master branch of impala due to meta data incompatibility. was: Reported with cdh5.12.1, that "self equal to self" kind of bogus predicate "x = x" is generated by Impala and caused incorrect query result, because this kind of predicate return false for "null" entries. It was observed that a {{count(*)}} query returned fewer rows than a CTAS query, though the query body is the same for both, because the former generated the bogus predicate and the latter doesn't. 
For example, {code:java} select count(*) from (select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = b.q) a{code} returned fewer rows than {code:java} create table abc as select count(*) from (select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = b.q{code} because predicate {{a.z = a.z_dt}} was created (for reasons to understand, notice b.z_dt is an alias of b.z), exhibited as "table1.z = table1.z" in the query plan in Impala query profile because a and b are aliases of view1 and view2, both of which are views created in a very nested way that involves table table1. Though in cdh5.12.1 the select and the count query returns different result in the initial case, an attempted reproduction shows that both queries get bogus predicates. And cdh5.15.2 has the same problem. Was not able to try out with most recent master branch of impala due to meta data incompatibility. > Self equal to self predicate "x = x" generated by Impala caused incorrect > query result > -- > > Key: IMPALA-8276 > URL: https://issues.apache.org/jira/browse/IMPALA-8276 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.0 >Reporter: Yongjun Zhang >Priority: Major > > Reported with cdh5.12.1, that "self equal to self" kind of bogus predicate "x > = x" is generated by Impala and caused incorrect query result, because this > kind of predicate return false for "null" entries. > It was observed that a {{count(*)}} query returned fewer rows than a CTAS > query, though the query body is the same for both, because the former > generated the bogus predicate and the latter doesn't. 
> For example, > {code:java} > select count(*) from (select a.*, b.x, b.y, b.z_dt, from view1 a left join > view2 b on a.p = b.q) a{code} > returned fewer rows than > {code:java} > create table abc as > select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = > b.q{code} > because predicate {{a.z = a.z_dt}} was created (for reasons to understand, > notice b.z_dt is an alias of b.z), exhibited as "table1.z = table1.z" in the > query plan in Impala query profile because a and b are aliases of view1 and > view2, both of which are views created in a very nested way that involves > table table1. > Though in cdh5.12.1 the select and the count query returns different result > in the initial case, an attempted reproduction shows that both queries get > bogus predicates. And cdh5.15.2 has the same problem. Was not able to try > out with most recent master branch of impala due to meta data incompatibility. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-8276) Self equal to self predicate "x = x" generated by Impala caused incorrect query result
Yongjun Zhang created IMPALA-8276: - Summary: Self equal to self predicate "x = x" generated by Impala caused incorrect query result Key: IMPALA-8276 URL: https://issues.apache.org/jira/browse/IMPALA-8276 Project: IMPALA Issue Type: Bug Components: Frontend Affects Versions: Impala 3.0 Reporter: Yongjun Zhang Reported with cdh5.12.1, that "self equal to self" kind of bogus predicate "x = x" is generated by Impala and caused incorrect query result, because this kind of predicate return false for "null" entries. It was observed that a count(*) query returned fewer rows than a CTAS query, though the query is the same, because the former generated the bogus predicate and the latter doesn't. For example, {code:java} select count(*) from (select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = b.q) a{code} returned fewer rows than {code:java} create table abc as select count(*) from (select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = b.q{code} because predicate {{a.z = a.z_dt}} was created (for reasons to understand, notice b.z_dt is an alias of b.z), exhibited as "table1.z = table1.z" in the query plan in Impala query profile because a and b are aliases of view1 and view2, both of which are views created in a very nested way that involves table table1. Though in cdh5.12.1 the select and the count query returns different result in the initial case, an attempted reproduction shows that both queries get bogus predicates. And cdh5.15.2 has the same problem. Was not able to try out with most recent master branch of impala due to meta data incompatibility. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IMPALA-8276) Self equal to self predicate "x = x" generated by Impala caused incorrect query result
[ https://issues.apache.org/jira/browse/IMPALA-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated IMPALA-8276: -- Description: Reported with cdh5.12.1, that "self equal to self" kind of bogus predicate "x = x" is generated by Impala and caused incorrect query result, because this kind of predicate return false for "null" entries. It was observed that a {{count(*)}} query returned fewer rows than a CTAS query, though the query body is the same for both, because the former generated the bogus predicate and the latter doesn't. For example, {code:java} select count(*) from (select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = b.q) a{code} returned fewer rows than {code:java} create table abc as select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = b.q{code} because predicate {{a.z = a.z_dt}} was created (for reasons to understand, notice b.z_dt is an alias of b.z), exhibited as "table1.z = table1.z" in the query plan in Impala query profile because a and b are aliases of view1 and view2, both of which are views created in a very nested way that involves table table1. Though in cdh5.12.1 the select and the count query returns different result in the initial case, an attempted reproduction shows that both queries get bogus predicates. And cdh5.15.2 has the same problem. Was not able to try out with most recent master branch of impala due to meta data incompatibility. was: Reported with cdh5.12.1, that "self equal to self" kind of bogus predicate "x = x" is generated by Impala and caused incorrect query result, because this kind of predicate return false for "null" entries. It was observed that a {{count(*)}} query returned fewer rows than a CTAS query, though the query body is the same for both, because the former generated the bogus predicate and the latter doesn't. 
For example, {code:java} select count(*) from (select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = b.q) a{code} returned fewer rows than {code:java} create table abc as select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = b.q{code} because predicate {{a.z = a.z_dt}} was created (for reasons to understand, notice b.z_dt is an alias of b.z), exhibited as "table1.z = table1.z" in the query plan in Impala query profile because a and b are aliases of view1 and view2, both of which are views created in a very nested way that involves table table1. Though in cdh5.12.1 the select and the count query returns different result in the initial case, an attempted reproduction shows that both queries get bogus predicates. And cdh5.15.2 has the same problem. Was not able to try out with most recent master branch of impala due to meta data incompatibility. > Self equal to self predicate "x = x" generated by Impala caused incorrect > query result > -- > > Key: IMPALA-8276 > URL: https://issues.apache.org/jira/browse/IMPALA-8276 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.0 >Reporter: Yongjun Zhang >Priority: Major > > Reported with cdh5.12.1, that "self equal to self" kind of bogus predicate "x > = x" is generated by Impala and caused incorrect query result, because this > kind of predicate return false for "null" entries. > It was observed that a {{count(*)}} query returned fewer rows than a CTAS > query, though the query body is the same for both, because the former > generated the bogus predicate and the latter doesn't. 
> For example, > {code:java} > select count(*) from > (select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = b.q) > a{code} > returned fewer rows than > {code:java} > create table abc as > select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = > b.q{code} > because predicate {{a.z = a.z_dt}} was created (for reasons to understand, > notice b.z_dt is an alias of b.z), exhibited as "table1.z = table1.z" in the > query plan in Impala query profile because a and b are aliases of view1 and > view2, both of which are views created in a very nested way that involves > table table1. > Though in cdh5.12.1 the select and the count query returns different result > in the initial case, an attempted reproduction shows that both queries get > bogus predicates. And cdh5.15.2 has the same problem. Was not able to try > out with most recent master branch of impala due to meta data incompatibility. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8276) Self equal to self predicate "x = x" generated by Impala caused incorrect query result
[ https://issues.apache.org/jira/browse/IMPALA-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated IMPALA-8276: -- Description: Reported with cdh5.12.1, that "self equal to self" kind of bogus predicate "x = x" is generated by Impala and caused incorrect query result, because this kind of predicate return false for "null" entries. It was observed that a {{count(*)}} query returned fewer rows than a CTAS query, though the query body is the same for both, because the former generated the bogus predicate and the latter doesn't. For example, {code:java} select count(*) from (select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = b.q) a{code} returned fewer rows than {code:java} create table abc as select count(*) from (select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = b.q{code} because predicate {{a.z = a.z_dt}} was created (for reasons to understand, notice b.z_dt is an alias of b.z), exhibited as "table1.z = table1.z" in the query plan in Impala query profile because a and b are aliases of view1 and view2, both of which are views created in a very nested way that involves table table1. Though in cdh5.12.1 the select and the count query returns different result in the initial case, an attempted reproduction shows that both queries get bogus predicates. And cdh5.15.2 has the same problem. Was not able to try out with most recent master branch of impala due to meta data incompatibility. was: Reported with cdh5.12.1, that "self equal to self" kind of bogus predicate "x = x" is generated by Impala and caused incorrect query result, because this kind of predicate return false for "null" entries. It was observed that a {{count(*)}} query returned fewer rows than a CTAS query, though the query is the same, because the former generated the bogus predicate and the latter doesn't. 
For example, {code:java} select count(*) from (select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = b.q) a{code} returned fewer rows than {code:java} create table abc as select count(*) from (select a.*, b.x, b.y, b.z_dt, from view1 a left join view2 b on a.p = b.q{code} because predicate {{a.z = a.z_dt}} was created (for reasons to understand, notice b.z_dt is an alias of b.z), exhibited as "table1.z = table1.z" in the query plan in Impala query profile because a and b are aliases of view1 and view2, both of which are views created in a very nested way that involves table table1. Though in cdh5.12.1 the select and the count query returns different result in the initial case, an attempted reproduction shows that both queries get bogus predicates. And cdh5.15.2 has the same problem. Was not able to try out with most recent master branch of impala due to meta data incompatibility. > Self equal to self predicate "x = x" generated by Impala caused incorrect > query result > -- > > Key: IMPALA-8276 > URL: https://issues.apache.org/jira/browse/IMPALA-8276 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.0 >Reporter: Yongjun Zhang >Priority: Major > > Reported with cdh5.12.1, that "self equal to self" kind of bogus predicate "x > = x" is generated by Impala and caused incorrect query result, because this > kind of predicate return false for "null" entries. > It was observed that a {{count(*)}} query returned fewer rows than a CTAS > query, though the query body is the same for both, because the former > generated the bogus predicate and the latter doesn't. 
> For example, > {code:java} > select count(*) from (select a.*, b.x, b.y, b.z_dt from view1 a left join > view2 b on a.p = b.q) a{code} > returned fewer rows than > {code:java} > create table abc as > select count(*) from (select a.*, b.x, b.y, b.z_dt from view1 a left join > view2 b on a.p = b.q) a{code} > because the predicate {{a.z = a.z_dt}} was created (for reasons yet to be understood; > note that b.z_dt is an alias of b.z). It shows up as "table1.z = table1.z" in the > query plan in the Impala query profile, because a and b are aliases of view1 and > view2, both of which are views created in a deeply nested way that involves > table table1. > Though in cdh5.12.1 the select query and the count query returned different results > in the initial case, an attempted reproduction shows that both queries get the > bogus predicate, and cdh5.15.2 has the same problem. Trying the most recent master > branch of Impala was not possible due to metadata incompatibility. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8276) Self equal to self predicate "x = x" generated by Impala caused incorrect query result
[ https://issues.apache.org/jira/browse/IMPALA-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated IMPALA-8276: -- Description: Reported with cdh5.12.1: a bogus "self equal to self" predicate of the form "x = x" is generated by Impala and causes incorrect query results, because such a predicate evaluates to NULL rather than true for NULL entries, so those rows are filtered out. It was observed that a {{count(*)}} query returned fewer rows than a CTAS query, though the query is the same, because the former generated the bogus predicate and the latter didn't. For example, {code:java} select count(*) from (select a.*, b.x, b.y, b.z_dt from view1 a left join view2 b on a.p = b.q) a{code} returned fewer rows than {code:java} create table abc as select count(*) from (select a.*, b.x, b.y, b.z_dt from view1 a left join view2 b on a.p = b.q) a{code} because the predicate {{a.z = a.z_dt}} was created (for reasons yet to be understood; note that b.z_dt is an alias of b.z). It shows up as "table1.z = table1.z" in the query plan in the Impala query profile, because a and b are aliases of view1 and view2, both of which are views created in a deeply nested way that involves table table1. Though in cdh5.12.1 the select query and the count query returned different results in the initial case, an attempted reproduction shows that both queries get the bogus predicate, and cdh5.15.2 has the same problem. Trying the most recent master branch of Impala was not possible due to metadata incompatibility. was: Reported with cdh5.12.1: a bogus "self equal to self" predicate of the form "x = x" is generated by Impala and causes incorrect query results, because such a predicate evaluates to NULL rather than true for NULL entries, so those rows are filtered out. It was observed that a count(*) query returned fewer rows than a CTAS query, though the query is the same, because the former generated the bogus predicate and the latter didn't. 
For example, {code:java} select count(*) from (select a.*, b.x, b.y, b.z_dt from view1 a left join view2 b on a.p = b.q) a{code} returned fewer rows than {code:java} create table abc as select count(*) from (select a.*, b.x, b.y, b.z_dt from view1 a left join view2 b on a.p = b.q) a{code} because the predicate {{a.z = a.z_dt}} was created (for reasons yet to be understood; note that b.z_dt is an alias of b.z). It shows up as "table1.z = table1.z" in the query plan in the Impala query profile, because a and b are aliases of view1 and view2, both of which are views created in a deeply nested way that involves table table1. Though in cdh5.12.1 the select query and the count query returned different results in the initial case, an attempted reproduction shows that both queries get the bogus predicate, and cdh5.15.2 has the same problem. Trying the most recent master branch of Impala was not possible due to metadata incompatibility. > Self equal to self predicate "x = x" generated by Impala caused incorrect > query result > -- > > Key: IMPALA-8276 > URL: https://issues.apache.org/jira/browse/IMPALA-8276 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.0 >Reporter: Yongjun Zhang >Priority: Major > > Reported with cdh5.12.1: a bogus "self equal to self" predicate of the form "x = x" > is generated by Impala and causes incorrect query results, because such a predicate > evaluates to NULL rather than true for NULL entries, so those rows are filtered out. > It was observed that a {{count(*)}} query returned fewer rows than a CTAS > query, though the query is the same, because the former generated the bogus > predicate and the latter didn't. 
> For example, > {code:java} > select count(*) from (select a.*, b.x, b.y, b.z_dt from view1 a left join > view2 b on a.p = b.q) a{code} > returned fewer rows than > {code:java} > create table abc as > select count(*) from (select a.*, b.x, b.y, b.z_dt from view1 a left join > view2 b on a.p = b.q) a{code} > because the predicate {{a.z = a.z_dt}} was created (for reasons yet to be understood; > note that b.z_dt is an alias of b.z). It shows up as "table1.z = table1.z" in the > query plan in the Impala query profile, because a and b are aliases of view1 and > view2, both of which are views created in a deeply nested way that involves > table table1. > Though in cdh5.12.1 the select query and the count query returned different results > in the initial case, an attempted reproduction shows that both queries get the > bogus predicate, and cdh5.15.2 has the same problem. Trying the most recent master > branch of Impala was not possible due to metadata incompatibility.
[jira] [Created] (IMPALA-8276) Self equal to self predicate "x = x" generated by Impala caused incorrect query result
Yongjun Zhang created IMPALA-8276: - Summary: Self equal to self predicate "x = x" generated by Impala caused incorrect query result Key: IMPALA-8276 URL: https://issues.apache.org/jira/browse/IMPALA-8276 Project: IMPALA Issue Type: Bug Components: Frontend Affects Versions: Impala 3.0 Reporter: Yongjun Zhang Reported with cdh5.12.1: a bogus "self equal to self" predicate of the form "x = x" is generated by Impala and causes incorrect query results, because such a predicate evaluates to NULL rather than true for NULL entries, so those rows are filtered out. It was observed that a count(*) query returned fewer rows than a CTAS query, though the query is the same, because the former generated the bogus predicate and the latter didn't. For example, {code:java} select count(*) from (select a.*, b.x, b.y, b.z_dt from view1 a left join view2 b on a.p = b.q) a{code} returned fewer rows than {code:java} create table abc as select count(*) from (select a.*, b.x, b.y, b.z_dt from view1 a left join view2 b on a.p = b.q) a{code} because the predicate {{a.z = a.z_dt}} was created (for reasons yet to be understood; note that b.z_dt is an alias of b.z). It shows up as "table1.z = table1.z" in the query plan in the Impala query profile, because a and b are aliases of view1 and view2, both of which are views created in a deeply nested way that involves table table1. Though in cdh5.12.1 the select query and the count query returned different results in the initial case, an attempted reproduction shows that both queries get the bogus predicate, and cdh5.15.2 has the same problem. Trying the most recent master branch of Impala was not possible due to metadata incompatibility.
[jira] [Commented] (IMPALA-8274) Missing update to index into profiles vector in Coordinator::BackendState::ApplyExecStatusReport()
[ https://issues.apache.org/jira/browse/IMPALA-8274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783086#comment-16783086 ] Michael Ho commented on IMPALA-8274: FWIW, the bug above led to crashes like the following: {noformat} F0302 10:21:04.562525 22393 coordinator-backend-state.cc:571] Check failed: per_fragment_instance_idx < exec_summary.exec_stats.size() (62 vs. 1) name=HDFS_SCAN_NODE (id=3) instance_id=e54a26423c426f58:ecf1f6b400b5 fragment_idx=4 {noformat} {noformat} (gdb) bt #0 0x7f0e215c0207 in raise () from ./sysroot/lib64/libc.so.6 #1 0x7f0e215c18f8 in abort () from ./sysroot/lib64/libc.so.6 #2 0x047fe4d4 in google::DumpStackTraceAndExit() () #3 0x047f4f2d in google::LogMessage::Fail() () #4 0x047f67d2 in google::LogMessage::SendToLog() () #5 0x047f4907 in google::LogMessage::Flush() () #6 0x047f7ece in google::LogMessageFatal::~LogMessageFatal() () #7 0x0275dd6a in impala::Coordinator::BackendState::InstanceStats::Update (this=0x17d393910, exec_status=..., thrift_profile=..., exec_summary=0x1a72a940, scan_range_progress=0x1a72a8d8) at /usr/src/debug/impala-3.2.0-cdh6.2.x-SNAPSHOT/be/src/runtime/coordinator-backend-state.cc:571 #8 0x0275b0cf in impala::Coordinator::BackendState::ApplyExecStatusReport (this=0x2e71f0100, backend_exec_status=..., thrift_profiles=..., exec_summary=0x1a72a940, scan_range_progress=0x1a72a8d8, dml_exec_state=0x1a72aa80) at /usr/src/debug/impala-3.2.0-cdh6.2.x-SNAPSHOT/be/src/runtime/coordinator-backend-state.cc:337 #9 0x027474bb in impala::Coordinator::UpdateBackendExecStatus (this=0x1a72a880, request=..., thrift_profiles=...) at /usr/src/debug/impala-3.2.0-cdh6.2.x-SNAPSHOT/be/src/runtime/coordinator.cc:713 #10 0x020d5c46 in impala::ClientRequestState::UpdateBackendExecStatus (this=0xe3a1c000, request=..., thrift_profiles=...) 
at /usr/src/debug/impala-3.2.0-cdh6.2.x-SNAPSHOT/be/src/service/client-request-state.cc:1303 #11 0x02038291 in impala::ControlService::ReportExecStatus (this=0x1596cad0, request=0x7835ba70, response=0x47894bfa0, rpc_context=0x47894aea0) at /usr/src/debug/impala-3.2.0-cdh6.2.x-SNAPSHOT/be/src/service/control-service.cc:152 #12 0x020dbac4 in impala::ControlServiceIf::ControlServiceIf(scoped_refptr const&, scoped_refptr const&)::{lambda(google::protobuf::Message const*, google::protobuf::Message*, kudu::rpc::RpcContext*)#2}::operator()(google::protobuf::Message const*, google::protobuf::Message*, kudu::rpc::RpcContext*) const () at /usr/src/debug/impala-3.2.0-cdh6.2.x-SNAPSHOT/be/generated-sources/gen-cpp/control_service.service.cc:62 {noformat} > Missing update to index into profiles vector in > Coordinator::BackendState::ApplyExecStatusReport() > -- > > Key: IMPALA-8274 > URL: https://issues.apache.org/jira/browse/IMPALA-8274 > Project: IMPALA > Issue Type: Bug > Components: Distributed Exec >Reporter: Michael Ho >Assignee: Michael Ho >Priority: Blocker > > {{idx}} isn't updated when we skip a duplicate or stale update of a fragment > instance. As a result, we may end up passing the wrong profile to > {{instance_stats->Update()}}. This may lead to random crashes in > {{Coordinator::BackendState::InstanceStats::Update}}. 
> {noformat} > int idx = 0; > const bool has_profile = thrift_profiles.profile_trees.size() > 0; > TRuntimeProfileTree empty_profile; > for (const FragmentInstanceExecStatusPB& instance_exec_status : >backend_exec_status.instance_exec_status()) { > int64_t report_seq_no = instance_exec_status.report_seq_no(); > int instance_idx = > GetInstanceIdx(instance_exec_status.fragment_instance_id()); > DCHECK_EQ(instance_stats_map_.count(instance_idx), 1); > InstanceStats* instance_stats = instance_stats_map_[instance_idx]; > int64_t last_report_seq_no = instance_stats->last_report_seq_no_; > DCHECK(instance_stats->exec_params_.instance_id == > ProtoToQueryId(instance_exec_status.fragment_instance_id())); > // Ignore duplicate or out-of-order messages. > if (report_seq_no <= last_report_seq_no) { > VLOG_QUERY << Substitute("Ignoring stale update for query instance $0 > with " > "seq no $1", PrintId(instance_stats->exec_params_.instance_id), > report_seq_no); > continue; <<--- // XXX bad > } > DCHECK(!instance_stats->done_); > DCHECK(!has_profile || idx < thrift_profiles.profile_trees.size()); > const TRuntimeProfileTree& profile = > has_profile ? thrift_profiles.profile_trees[idx++] : empty_profile; > instance_stats->Update(instance_exec_status, profile, exec_summary, > scan_range_progress); > {noformat}