[jira] [Created] (IMPALA-10608) Update the virtualenv's kudu-python version to the latest
Joe McDonnell created IMPALA-10608: -- Summary: Update the virtualenv's kudu-python version to the latest Key: IMPALA-10608 URL: https://issues.apache.org/jira/browse/IMPALA-10608 Project: IMPALA Issue Type: Improvement Components: Infrastructure Reporter: Joe McDonnell Assignee: Joe McDonnell The impala-python virtualenv currently installs kudu-python==1.2.0. This is very old. We should update to the latest (1.14.0). kudu-python dropped the numpy dependency several versions ago, which would speed up virtualenv bootstrap. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-10590) Ensure admissiond stays in sync with coordinators
[ https://issues.apache.org/jira/browse/IMPALA-10590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thomas Tauber-Marshall resolved IMPALA-10590.
Fix Version/s: Impala 4.0
Resolution: Fixed
> Ensure admissiond stays in sync with coordinators
>
> Key: IMPALA-10590
> URL: https://issues.apache.org/jira/browse/IMPALA-10590
> Project: IMPALA
> Issue Type: Sub-task
> Components: Backend
> Affects Versions: Impala 4.0
> Reporter: Thomas Tauber-Marshall
> Assignee: Thomas Tauber-Marshall
> Priority: Major
> Fix For: Impala 4.0
>
> Currently, it's possible for the admission service to have an incorrect view
> of what resources are being used in the cluster if there are rpc failures.
> For example, if the ReleaseQuery rpc fails, the coordinator will retry a few
> times and then give up. In this case, a query has completed but the admission
> service doesn't know, so it will not allow other queries to be scheduled with
> those resources.
> We can solve this by adding a periodic heartbeat rpc from coordinators to the
> admission service. This heartbeat will include the query ids for all queries
> currently running at each coordinator, and then the admission service can
> clean up resources allocated to any queries that are not in the list, on the
> assumption that those queries must have completed already.
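The reconciliation step described in this issue can be sketched roughly as follows. This is a minimal illustration in Python, not Impala's actual C++ implementation: the class `AdmissionState`, its methods, and the byte-count bookkeeping are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set


@dataclass
class AdmissionState:
    """Hypothetical stand-in for the admission service's bookkeeping."""

    # coordinator id -> {query id -> memory reserved for that query, in bytes}
    admitted: Dict[str, Dict[str, int]] = field(default_factory=dict)

    def admit(self, coord_id: str, query_id: str, mem_bytes: int) -> None:
        self.admitted.setdefault(coord_id, {})[query_id] = mem_bytes

    def release(self, coord_id: str, query_id: str) -> None:
        # Normal path: the coordinator's ReleaseQuery rpc arrived.
        self.admitted.get(coord_id, {}).pop(query_id, None)

    def handle_heartbeat(self, coord_id: str, running: Iterable[str]) -> Set[str]:
        """Release every query this coordinator no longer reports as running."""
        known = self.admitted.get(coord_id, {})
        stale = set(known) - set(running)
        for query_id in stale:
            del known[query_id]
        return stale

    def reserved_bytes(self) -> int:
        return sum(sum(queries.values()) for queries in self.admitted.values())


state = AdmissionState()
state.admit("coord-1", "q1", 100)
state.admit("coord-1", "q2", 200)
# q1 has finished, but its ReleaseQuery rpc was lost, so the admission
# service still holds its resources. The next heartbeat lists only q2.
stale = state.handle_heartbeat("coord-1", ["q2"])
print(stale, state.reserved_bytes())
```

One design point the sketch glosses over: a query admitted after the coordinator built its heartbeat list would look stale here, so a real implementation presumably needs some ordering or versioning guard. The issue text does not say how that case is handled.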
[jira] [Resolved] (IMPALA-10604) Allow setting KuduClient's verbose logging level directly
[ https://issues.apache.org/jira/browse/IMPALA-10604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thomas Tauber-Marshall resolved IMPALA-10604.
Fix Version/s: Impala 4.0
Resolution: Fixed
> Allow setting KuduClient's verbose logging level directly
>
> Key: IMPALA-10604
> URL: https://issues.apache.org/jira/browse/IMPALA-10604
> Project: IMPALA
> Issue Type: Improvement
> Affects Versions: Impala 4.0
> Reporter: Thomas Tauber-Marshall
> Assignee: Thomas Tauber-Marshall
> Priority: Major
> Fix For: Impala 4.0
>
> Currently, Impala sets KuduClient's verbose logging level to the same as its
> own level (taken from the -v flag) minus 1. Since KuduClient doesn't have any
> way of setting vmodule, this means that to get verbose logging inside
> KuduClient users must turn it on to a high level for all of Impala, which can
> produce an enormous volume of logging, making it hard to collect, share, and
> analyze logs.
[jira] [Commented] (IMPALA-10604) Allow setting KuduClient's verbose logging level directly
[ https://issues.apache.org/jira/browse/IMPALA-10604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17308269#comment-17308269 ]
ASF subversion and git services commented on IMPALA-10604:
Commit 452c2f1f7f9cc4c8472ab38949e9990281dcc3a3 in impala's branch refs/heads/master from Thomas Tauber-Marshall
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=452c2f1 ]
IMPALA-10604: Allow setting KuduClient's verbose log level directly
This patch adds a flag --kudu_client_v which allows setting the verbose logging level for the KuduClient to a value other than the level for the rest of Impala (set by -v) in order to enable debugging of issues in the KuduClient without producing the enormous amount of logging that comes with setting a high -v value on all of Impala.
Testing:
- Manually set --kudu_client_v and confirmed that the expected logging is produced.
Change-Id: Ib39358709ee714b8cdffd72a0ee58f66d5fab37e
Reviewed-on: http://gerrit.cloudera.org:8080/17222
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins
> Allow setting KuduClient's verbose logging level directly
>
> Key: IMPALA-10604
> URL: https://issues.apache.org/jira/browse/IMPALA-10604
> Project: IMPALA
> Issue Type: Improvement
> Affects Versions: Impala 4.0
> Reporter: Thomas Tauber-Marshall
> Assignee: Thomas Tauber-Marshall
> Priority: Major
>
> Currently, Impala sets KuduClient's verbose logging level to the same as its
> own level (taken from the -v flag) minus 1. Since KuduClient doesn't have any
> way of setting vmodule, this means that to get verbose logging inside
> KuduClient users must turn it on to a high level for all of Impala, which can
> produce an enormous volume of logging, making it hard to collect, share, and
> analyze logs.
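How the two flags might combine can be sketched as below. This is only an illustration in Python: the message confirms the old "-v minus 1" rule and the existence of --kudu_client_v, but the sentinel default of -1 meaning "not set" and the clamp to zero are assumptions made for this example, not details taken from the patch.

```python
def kudu_client_verbosity(impala_v: int, kudu_client_v: int = -1) -> int:
    """Pick the verbose log level to hand to the Kudu client.

    impala_v      -- value of Impala's -v flag.
    kudu_client_v -- value of --kudu_client_v; a negative value is assumed
                     here to mean "not set" (hypothetical sentinel).
    """
    if kudu_client_v >= 0:
        # Explicit override: independent of Impala's own verbosity.
        return kudu_client_v
    # Previous behaviour described in the issue: Impala's level minus 1.
    # Clamping at zero is an assumption of this sketch.
    return max(impala_v - 1, 0)


# Verbose Kudu client logging without raising -v for all of Impala.
print(kudu_client_verbosity(impala_v=1, kudu_client_v=4))
```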
[jira] [Commented] (IMPALA-10590) Ensure admissiond stays in sync with coordinators
[ https://issues.apache.org/jira/browse/IMPALA-10590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17308270#comment-17308270 ]
ASF subversion and git services commented on IMPALA-10590:
Commit e3bafcbef4fd7152ecfcbc7d331e41e9778caf15 in impala's branch refs/heads/master from Thomas Tauber-Marshall
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=e3bafcb ]
IMPALA-10590: Introduce admission service heartbeat mechanism
Currently, if a ReleaseQuery rpc fails, it's possible for the admission service to think that some resources are still being used that are actually free. This patch fixes the issue by introducing a periodic heartbeat rpc from coordinators to the admission service which contains a list of queries registered at that coordinator. If there is a query that the admission service thinks is running but is not included in the heartbeat, the admission service can conclude that the query must have already completed and release its resources.
Testing:
- Added a test that uses a debug action to simulate ReleaseQuery rpcs failing and checks that query resources are released properly.
Change-Id: Ia528d92268cea487ada20b476935a81166f5ad34
Reviewed-on: http://gerrit.cloudera.org:8080/17194
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins
> Ensure admissiond stays in sync with coordinators
>
> Key: IMPALA-10590
> URL: https://issues.apache.org/jira/browse/IMPALA-10590
> Project: IMPALA
> Issue Type: Sub-task
> Components: Backend
> Affects Versions: Impala 4.0
> Reporter: Thomas Tauber-Marshall
> Assignee: Thomas Tauber-Marshall
> Priority: Major
>
> Currently, it's possible for the admission service to have an incorrect view
> of what resources are being used in the cluster if there are rpc failures.
> For example, if the ReleaseQuery rpc fails, the coordinator will retry a few
> times and then give up. In this case, a query has completed but the admission
> service doesn't know, so it will not allow other queries to be scheduled with
> those resources.
> We can solve this by adding a periodic heartbeat rpc from coordinators to the
> admission service. This heartbeat will include the query ids for all queries
> currently running at each coordinator, and then the admission service can
> clean up resources allocated to any queries that are not in the list, on the
> assumption that those queries must have completed already.
[jira] [Created] (IMPALA-10607) TestDecimalOverflowExprs::test_ctas_exprs failed in S3 build
Wenzhe Zhou created IMPALA-10607: Summary: TestDecimalOverflowExprs::test_ctas_exprs failed in S3 build Key: IMPALA-10607 URL: https://issues.apache.org/jira/browse/IMPALA-10607 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 4.0 Reporter: Wenzhe Zhou TestDecimalOverflowExprs::test_ctas_exprs failed in S3 build Stack trace: Stack trace for S3 build. [https://master-03.jenkins.cloudera.com/job/impala-cdpd-master-staging-core-s3/34/] query_test.test_decimal_queries.TestDecimalOverflowExprs.test_ctas_exprs[protocol: beeswax | exec_option: \\{'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] (from pytest) Failing for the past 1 build (Since Failed#34 ) Took 13 sec. Error Message ImpalaBeeswaxException: ImpalaBeeswaxException: Query aborted:Parquet file s3a://impala-test-uswest2-1/test-warehouse/test_ctas_exprs_7304e515.db/overflowed_decimal_tbl_1/b74f0ce129189cf1-4c3c5bd6_1609291350_data.0.parq has an invalid file length: 4 Stacktrace query_test/test_decimal_queries.py:170: in test_ctas_exprs "SELECT count(*) FROM %s" % TBL_NAME_1) /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_test_suite.py:814: in wrapper return function(*args, **kwargs) /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_test_suite.py:822: in execute_query_expect_success result = cls.__execute_query(impalad_client, query, query_options, user) /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_test_suite.py:923: in __execute_query return impalad_client.execute(query, user=user) /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_connection.py:205: in execute return self.__beeswax_client.execute(sql_stmt, user=user) 
/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:187: in execute handle = self.__execute_query(query_string.strip(), user=user) /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:365: in __execute_query self.wait_for_finished(handle) /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:386: in wait_for_finished raise ImpalaBeeswaxException("Query aborted:" + error_log, None) E ImpalaBeeswaxException: ImpalaBeeswaxException: E Query aborted:Parquet file s3a://impala-test-uswest2-1/test-warehouse/test_ctas_exprs_7304e515.db/overflowed_decimal_tbl_1/b74f0ce129189cf1-4c3c5bd6_1609291350_data.0.parq has an invalid file length: 4 Standard Error SET client_identifier=query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol:beeswax|exec_option:\{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}; SET sync_ddl=False; – executing against localhost:21000 DROP DATABASE IF EXISTS `test_ctas_exprs_7304e515` CASCADE; – 2021-03-24 03:56:00,840 INFO MainThread: Started query 574a532f47ac7c80:c1c62ae0 SET client_identifier=query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol:beeswax|exec_option:\{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}; SET sync_ddl=False; – executing against localhost:21000 CREATE DATABASE `test_ctas_exprs_7304e515`; – 2021-03-24 03:56:03,120 INFO MainThread: Started query 424b970f206e271f:ade0b524 – 2021-03-24 03:56:03,121 INFO MainThread: Created database "test_ctas_exprs_7304e515" for test ID "query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol: beeswax | exec_option: \\{'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]" – executing against localhost:21000 SET decimal_v2=true; – 2021-03-24 03:56:03,126 INFO MainThread: Started query 4545d8b9db5e9342:8b3ba570 – executing against localhost:21000 DROP TABLE IF EXISTS `test_ctas_exprs_7304e515`.`overflowed_decimal_tbl_1`; – 2021-03-24 03:56:03,131 INFO MainThread: Started query 2c4bc9fc85e2b8e8:05e35eed SET client_identifier=query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol:beeswax|exec_option:\{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}; – executing against localhost:21000 use functional_parquet; –
[jira] [Assigned] (IMPALA-10607) TestDecimalOverflowExprs::test_ctas_exprs failed in S3 build
[ https://issues.apache.org/jira/browse/IMPALA-10607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenzhe Zhou reassigned IMPALA-10607: Assignee: Wenzhe Zhou > TestDecimalOverflowExprs::test_ctas_exprs failed in S3 build > > > Key: IMPALA-10607 > URL: https://issues.apache.org/jira/browse/IMPALA-10607 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 4.0 >Reporter: Wenzhe Zhou >Assignee: Wenzhe Zhou >Priority: Major > > TestDecimalOverflowExprs::test_ctas_exprs failed in S3 build > Stack trace: > Stack trace for S3 build. > [https://master-03.jenkins.cloudera.com/job/impala-cdpd-master-staging-core-s3/34/] > query_test.test_decimal_queries.TestDecimalOverflowExprs.test_ctas_exprs[protocol: > beeswax | exec_option: \\{'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: > parquet/none] (from pytest) > Failing for the past 1 build (Since Failed#34 ) > Took 13 sec. 
> Error Message > ImpalaBeeswaxException: ImpalaBeeswaxException: Query aborted:Parquet file > s3a://impala-test-uswest2-1/test-warehouse/test_ctas_exprs_7304e515.db/overflowed_decimal_tbl_1/b74f0ce129189cf1-4c3c5bd6_1609291350_data.0.parq > has an invalid file length: 4 > Stacktrace > query_test/test_decimal_queries.py:170: in test_ctas_exprs > "SELECT count(*) FROM %s" % TBL_NAME_1) > /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_test_suite.py:814: > in wrapper > return function(*args, **kwargs) > /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_test_suite.py:822: > in execute_query_expect_success > result = cls.__execute_query(impalad_client, query, query_options, user) > /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_test_suite.py:923: > in __execute_query > return impalad_client.execute(query, user=user) > /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_connection.py:205: > in execute > return self.__beeswax_client.execute(sql_stmt, user=user) > /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:187: > in execute > handle = self.__execute_query(query_string.strip(), user=user) > /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:365: > in __execute_query > self.wait_for_finished(handle) > /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:386: > in wait_for_finished > raise ImpalaBeeswaxException("Query aborted:" + error_log, None) > E ImpalaBeeswaxException: ImpalaBeeswaxException: > E Query aborted:Parquet file > s3a://impala-test-uswest2-1/test-warehouse/test_ctas_exprs_7304e515.db/overflowed_decimal_tbl_1/b74f0ce129189cf1-4c3c5bd6_1609291350_data.0.parq > has an invalid file length: 4 > Standard Error > SET > 
client_identifier=query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol:beeswax|exec_option:\{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}; > SET sync_ddl=False; > – executing against localhost:21000 > DROP DATABASE IF EXISTS `test_ctas_exprs_7304e515` CASCADE; > – 2021-03-24 03:56:00,840 INFO MainThread: Started query > 574a532f47ac7c80:c1c62ae0 > SET > client_identifier=query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol:beeswax|exec_option:\{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}; > SET sync_ddl=False; > – executing against localhost:21000 > CREATE DATABASE `test_ctas_exprs_7304e515`; > – 2021-03-24 03:56:03,120 INFO MainThread: Started query > 424b970f206e271f:ade0b524 > – 2021-03-24 03:56:03,121 INFO MainThread: Created database > "test_ctas_exprs_7304e515" for test ID > "query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol: > beeswax | exec_option: \\{'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: > parquet/none]" > – executing against localhost:21000 > SET decimal_v2=true; > – 2021-03-24 03:56:03,126 INFO MainThread: Started query > 4545d8b9db5e9342:8b3ba570 > – executing against localhost:21000 > DROP TABLE IF EXISTS `test_ctas_exprs_7304e515`.`overflowed_decimal_tbl_1`; > – 2021-03-24 03:56:03,131 INFO
[jira] [Created] (IMPALA-10606) Simplify impala-python virtualenv requirements files
Joe McDonnell created IMPALA-10606:
Summary: Simplify impala-python virtualenv requirements files
Key: IMPALA-10606
URL: https://issues.apache.org/jira/browse/IMPALA-10606
Project: IMPALA
Issue Type: Improvement
Components: Infrastructure
Affects Versions: Impala 4.0
Reporter: Joe McDonnell
Assignee: Joe McDonnell
The impala-python virtualenv currently has complicated logic that can do multiple rounds of pip installs depending on whether the toolchain has been bootstrapped. For example, the packages in compile-requirements.txt are only installed if the toolchain GCC has been installed, and the Kudu python client is only installed if Kudu has been downloaded. This was a workaround because bootstrap_toolchain.py required the impala-python virtualenv: the different stages allowed the basics to be installed, then bootstrap_toolchain.py to run, then the rest to be installed.
The bootstrap_toolchain.py script no longer requires the impala-python virtualenv, so there is no need for such a complicated setup. Anything that bootstraps the impala-python virtualenv can assume that the toolchain compiler is present. This would allow the requirements to be consolidated into a main requirements file that includes both compiled and non-compiled packages. A consolidated file makes it easier to update dependency versions.
[jira] [Updated] (IMPALA-10605) Deflake test_refresh_native
[ https://issues.apache.org/jira/browse/IMPALA-10605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vihang Karajgaonkar updated IMPALA-10605:
Description:
The test uses a regex to parse the output of describe database and extract the db properties. The regex currently assumes that there will be only one property in the database. This assumption breaks when the events processor is running because it might add some db properties as well.
{noformat}
regex = r"{(.*?)=(.*?)}"
{noformat}
The above regex will select subsequent properties as the value of the first key. We can fix this by changing the regex to specifically look for the functional name property key prefix.
{noformat}
regex = r"{.*(impala_registered_function.*?)=(.*?)[,}]"
{noformat}
> Deflake test_refresh_native
>
> Key: IMPALA-10605
> URL: https://issues.apache.org/jira/browse/IMPALA-10605
> Project: IMPALA
> Issue Type: Improvement
> Reporter: Vihang Karajgaonkar
> Assignee: Vihang Karajgaonkar
> Priority: Minor
>
> The test uses a regex to parse the output of describe database and extract
> the db properties. The regex currently assumes that there will be only one
> property in the database. This assumption breaks when the events processor is
> running because it might add some db properties as well.
> {noformat}
> regex = r"{(.*?)=(.*?)}"
> {noformat}
> The above regex will select subsequent properties as the value of the first
> key. We can fix this by changing the regex to specifically look for the
> functional name property key prefix.
> {noformat}
> regex = r"{.*(impala_registered_function.*?)=(.*?)[,}]"
> {noformat}
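The difference between the two patterns is easy to see on a small input. The two regexes below are copied from the issue; the property strings are hypothetical stand-ins, since the exact describe database output is not shown in this message.

```python
import re

# Hypothetical property strings; the real describe database output may differ.
single = "{impala_registered_function_fn1=abc}"
multiple = "{impala_registered_function_fn1=abc, some_other_prop=xyz}"

old_regex = r"{(.*?)=(.*?)}"
new_regex = r"{.*(impala_registered_function.*?)=(.*?)[,}]"

# With a single property the old pattern behaves as intended.
old_single = re.search(old_regex, single).groups()
# With a second property, the lazy value group runs on to the closing brace
# and swallows the extra property.
old_multiple = re.search(old_regex, multiple).groups()
# The new pattern anchors on the key prefix and stops at "," or "}".
new_multiple = re.search(new_regex, multiple).groups()

print(old_single)    # ('impala_registered_function_fn1', 'abc')
print(old_multiple)  # ('impala_registered_function_fn1', 'abc, some_other_prop=xyz')
print(new_multiple)  # ('impala_registered_function_fn1', 'abc')
```

Because the new pattern begins with a greedy `.*` before the key prefix, it also matches when the function property is not the first one listed.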
[jira] [Created] (IMPALA-10605) Deflake test_refresh_native
Vihang Karajgaonkar created IMPALA-10605: Summary: Deflake test_refresh_native Key: IMPALA-10605 URL: https://issues.apache.org/jira/browse/IMPALA-10605 Project: IMPALA Issue Type: Improvement Reporter: Vihang Karajgaonkar Assignee: Vihang Karajgaonkar
[jira] [Commented] (IMPALA-6340) There is no error when inserting an invalid value into a decimal column under decimal_v2
[ https://issues.apache.org/jira/browse/IMPALA-6340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17308126#comment-17308126 ]
ASF subversion and git services commented on IMPALA-6340:
Commit 410c3e79e4eeba0a3f1ad62f6bf2f11b2de48819 in impala's branch refs/heads/master from wzhou-code
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=410c3e7 ]
IMPALA-10564: Return error when inserting an invalid decimal value
When using CTAS statements or INSERT-SELECT statements to insert rows into a table with decimal columns, Impala inserts NULL for overflowed decimal values instead of returning an error. This issue happens when the data expression for the decimal column in the SELECT sub-query contains at least one alias. This issue is similar to IMPALA-6340, but IMPALA-6340 only fixed the issue for the cases where the data expression for the decimal columns is a constant.
This patch fixes the issue by calling RuntimeState::CheckQueryState() at the end of HdfsTableWriter::AppendRows() and KuduTableSink::Send(). If there is an invalid decimal error, the query fails without inserting NULL for the decimal column.
We did not change the behaviour for decimal_v1: NULL is still inserted into the table for invalid decimal values, with a warning message.
Tests:
- Added unit-tests for INSERT-SELECT and CTAS statements with overflowed decimal values to be inserted into tables. The overflowed decimal values are expressed as a constant expression, or as an expression with aliases. Also added cases to verify that the behaviour of decimal_v1 is unchanged.
- Passed exhaustive tests.
Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98
Reviewed-on: http://gerrit.cloudera.org:8080/17168
Reviewed-by: Thomas Tauber-Marshall
Tested-by: Impala Public Jenkins
> There is no error when inserting an invalid value into a decimal column under
> decimal_v2
>
> Key: IMPALA-6340
> URL: https://issues.apache.org/jira/browse/IMPALA-6340
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 2.11.0
> Reporter: Taras Bobrovytsky
> Assignee: Taras Bobrovytsky
> Priority: Blocker
> Labels: correctness
> Fix For: Impala 3.0, Impala 2.13.0
>
> The following series of commands does not result in an error or a warning
> when decimal_v2 is enabled.
> {code}
> set decimal_v2=1;
> create table t1 (c1 decimal(38,37));
> insert into t1 select 11.11;
> {code}
> We end up inserting a NULL into the column without any warnings.
> If these commands are executed with decimal_v2 disabled, we get the following
> warning:
> {code}
> WARNINGS: UDF WARNING: Decimal expression overflowed, returning NULL
> {code}
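Why 11.11 overflows decimal(38,37), and how the two modes differ after this patch, can be sketched as follows. This is a simplified Python illustration, not Impala code: `fits_decimal`, `value_to_insert`, and `DecimalOverflowError` are hypothetical names, and the check only looks at integer digits, ignoring rounding of excess fractional digits.

```python
from decimal import Decimal
from typing import Optional


class DecimalOverflowError(Exception):
    """Hypothetical error type standing in for a failed query."""


def fits_decimal(value: Decimal, precision: int, scale: int) -> bool:
    # DECIMAL(precision, scale) leaves precision - scale digits before the
    # decimal point, so the magnitude must stay below 10 ** (precision - scale).
    # Simplification: rounding of extra fractional digits is not modelled.
    return abs(value) < Decimal(10) ** (precision - scale)


def value_to_insert(value: Decimal, precision: int, scale: int,
                    decimal_v2: bool) -> Optional[Decimal]:
    if fits_decimal(value, precision, scale):
        return value
    if decimal_v2:
        # Behaviour after the patch: the statement fails instead of writing NULL.
        raise DecimalOverflowError("Decimal expression overflowed")
    # decimal_v1 is unchanged: NULL is written and a warning is reported.
    return None


# decimal(38,37) has room for exactly one integer digit, so 11.11 cannot fit.
print(fits_decimal(Decimal("11.11"), 38, 37))
print(value_to_insert(Decimal("11.11"), 38, 37, decimal_v2=False))
```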
[jira] [Commented] (IMPALA-10580) Implement ds_theta_union_f() function.
[ https://issues.apache.org/jira/browse/IMPALA-10580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17308128#comment-17308128 ]
ASF subversion and git services commented on IMPALA-10580:
Commit 622e3c95adca5cf30a0aff6542556feab9b8a861 in impala's branch refs/heads/master from Fucun Chu
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=622e3c9 ]
IMPALA-10580: Implement ds_theta_union_f() function
This function receives two strings that are serialized Apache DataSketches Theta sketches. It unions the two sketches and returns the resulting sketch.
Example:
select ds_theta_estimate(ds_theta_union_f(sketch1, sketch2))
from sketch_tbl;
+-------------------------------------------------------+
| ds_theta_estimate(ds_theta_union_f(sketch1, sketch2)) |
+-------------------------------------------------------+
| 15                                                    |
+-------------------------------------------------------+
Change-Id: I8329979b81ceeaad739a43fab79768ca9c2916fa
Reviewed-on: http://gerrit.cloudera.org:8080/17179
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins
> Implement ds_theta_union_f() function.
>
> Key: IMPALA-10580
> URL: https://issues.apache.org/jira/browse/IMPALA-10580
> Project: IMPALA
> Issue Type: New Feature
> Components: Backend, Frontend
> Reporter: Fucun Chu
> Assignee: Fucun Chu
> Priority: Major
> Fix For: Impala 4.0
>
> This function receives two strings that are serialized Apache DataSketches
> Theta sketches. It unions the two sketches and returns the resulting sketch.
> Example:
> {code:java}
> select ds_theta_estimate(ds_theta_union_f(sketch1, sketch2))
> from sketch_tbl;
> +-------------------------------------------------------+
> | ds_theta_estimate(ds_theta_union_f(sketch1, sketch2)) |
> +-------------------------------------------------------+
> | 15                                                    |
> +-------------------------------------------------------+
> {code}
[jira] [Commented] (IMPALA-10564) No error returned when inserting an overflowed value into a decimal column
[ https://issues.apache.org/jira/browse/IMPALA-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17308125#comment-17308125 ]
ASF subversion and git services commented on IMPALA-10564:
Commit 410c3e79e4eeba0a3f1ad62f6bf2f11b2de48819 in impala's branch refs/heads/master from wzhou-code
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=410c3e7 ]
IMPALA-10564: Return error when inserting an invalid decimal value
When using CTAS statements or INSERT-SELECT statements to insert rows into a table with decimal columns, Impala inserts NULL for overflowed decimal values instead of returning an error. This issue happens when the data expression for the decimal column in the SELECT sub-query contains at least one alias. This issue is similar to IMPALA-6340, but IMPALA-6340 only fixed the issue for the cases where the data expression for the decimal columns is a constant.
This patch fixes the issue by calling RuntimeState::CheckQueryState() at the end of HdfsTableWriter::AppendRows() and KuduTableSink::Send(). If there is an invalid decimal error, the query fails without inserting NULL for the decimal column.
We did not change the behaviour for decimal_v1: NULL is still inserted into the table for invalid decimal values, with a warning message.
Tests:
- Added unit-tests for INSERT-SELECT and CTAS statements with overflowed decimal values to be inserted into tables. The overflowed decimal values are expressed as a constant expression, or as an expression with aliases. Also added cases to verify that the behaviour of decimal_v1 is unchanged.
- Passed exhaustive tests.
Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98
Reviewed-on: http://gerrit.cloudera.org:8080/17168
Reviewed-by: Thomas Tauber-Marshall
Tested-by: Impala Public Jenkins
> No error returned when inserting an overflowed value into a decimal column
>
> Key: IMPALA-10564
> URL: https://issues.apache.org/jira/browse/IMPALA-10564
> Project: IMPALA
> Issue Type: Bug
> Components: Backend, Frontend
> Affects Versions: Impala 4.0
> Reporter: Wenzhe Zhou
> Assignee: Wenzhe Zhou
> Priority: Major
> Fix For: Impala 4.0
>
> When using CTAS statements or INSERT-SELECT statements to insert rows into a
> table with decimal columns, Impala inserts NULL for overflowed decimal values
> instead of returning an error. This issue happens when the data expression for
> the decimal column in the SELECT sub-query contains at least one alias. This
> issue is similar to IMPALA-6340, but IMPALA-6340 only fixed the issue for the
> cases where the data expression for the decimal columns is a constant, so that
> the overflowed decimal values could be detected by the frontend during
> expression analysis. If there is an alias (a variable) in the data expression
> for the decimal column, the frontend cannot evaluate the data expression in
> the expression analysis phase. Only the backend can evaluate the data
> expression, when it executes fragment instances for the SELECT sub-queries.
> The log messages showed that the executor detected the decimal overflow error,
> but somehow it did not propagate the error to the coordinator, hence the error
> was not returned to the client.
[jira] [Commented] (IMPALA-10564) No error returned when inserting an overflowed value into a decimal column
[ https://issues.apache.org/jira/browse/IMPALA-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17308011#comment-17308011 ]

Wenzhe Zhou commented on IMPALA-10564:
--------------------------------------

Stack trace for S3 build.

[https://master-03.jenkins.cloudera.com/job/impala-cdpd-master-staging-core-s3/34/]

query_test.test_decimal_queries.TestDecimalOverflowExprs.test_ctas_exprs[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] (from pytest)

Failing for the past 1 build (Since Failed#34 )
Took 13 sec.

Error Message

ImpalaBeeswaxException: ImpalaBeeswaxException: Query aborted:Parquet file s3a://impala-test-uswest2-1/test-warehouse/test_ctas_exprs_7304e515.db/overflowed_decimal_tbl_1/b74f0ce129189cf1-4c3c5bd6_1609291350_data.0.parq has an invalid file length: 4

Stacktrace

query_test/test_decimal_queries.py:170: in test_ctas_exprs
    "SELECT count(*) FROM %s" % TBL_NAME_1)
/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_test_suite.py:814: in wrapper
    return function(*args, **kwargs)
/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_test_suite.py:822: in execute_query_expect_success
    result = cls.__execute_query(impalad_client, query, query_options, user)
/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_test_suite.py:923: in __execute_query
    return impalad_client.execute(query, user=user)
/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_connection.py:205: in execute
    return self.__beeswax_client.execute(sql_stmt, user=user)
/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:187: in execute
    handle = self.__execute_query(query_string.strip(), user=user)
/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:365: in __execute_query
    self.wait_for_finished(handle)
/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:386: in wait_for_finished
    raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
E   ImpalaBeeswaxException: ImpalaBeeswaxException:
E    Query aborted:Parquet file s3a://impala-test-uswest2-1/test-warehouse/test_ctas_exprs_7304e515.db/overflowed_decimal_tbl_1/b74f0ce129189cf1-4c3c5bd6_1609291350_data.0.parq has an invalid file length: 4

Standard Error

SET client_identifier=query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0};
SET sync_ddl=False;
-- executing against localhost:21000
DROP DATABASE IF EXISTS `test_ctas_exprs_7304e515` CASCADE;

-- 2021-03-24 03:56:00,840 INFO MainThread: Started query 574a532f47ac7c80:c1c62ae0
SET client_identifier=query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0};
SET sync_ddl=False;
-- executing against localhost:21000
CREATE DATABASE `test_ctas_exprs_7304e515`;

-- 2021-03-24 03:56:03,120 INFO MainThread: Started query 424b970f206e271f:ade0b524
-- 2021-03-24 03:56:03,121 INFO MainThread: Created database "test_ctas_exprs_7304e515" for test ID "query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]"
-- executing against localhost:21000
SET decimal_v2=true;

-- 2021-03-24 03:56:03,126 INFO MainThread: Started query 4545d8b9db5e9342:8b3ba570
-- executing against localhost:21000
DROP TABLE IF EXISTS `test_ctas_exprs_7304e515`.`overflowed_decimal_tbl_1`;

-- 2021-03-24 03:56:03,131 INFO MainThread: Started query 2c4bc9fc85e2b8e8:05e35eed
SET client_identifier=query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0};
-- executing against localhost:21000
use functional_parquet;

-- 2021-03-24 03:56:03,135 INFO MainThread: Started query 38403231c3885691:b0ba2cc4
SET
[jira] [Reopened] (IMPALA-10564) No error returned when inserting an overflowed value into a decimal column
[ https://issues.apache.org/jira/browse/IMPALA-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenzhe Zhou reopened IMPALA-10564:
----------------------------------

New test case TestDecimalOverflowExprs.test_ctas_exprs failed in S3 build.