[jira] [Created] (IMPALA-10608) Update the virtualenv's kudu-python version to the latest

2021-03-24 Thread Joe McDonnell (Jira)
Joe McDonnell created IMPALA-10608:
--

 Summary: Update the virtualenv's kudu-python version to the latest
 Key: IMPALA-10608
 URL: https://issues.apache.org/jira/browse/IMPALA-10608
 Project: IMPALA
  Issue Type: Improvement
  Components: Infrastructure
Reporter: Joe McDonnell
Assignee: Joe McDonnell


The impala-python virtualenv currently installs kudu-python==1.2.0. This is 
very old. We should update to the latest (1.14.0). kudu-python dropped the 
numpy dependency several versions ago, which would speed up virtualenv 
bootstrap.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-10590) Ensure admissiond stays in sync with coordinators

2021-03-24 Thread Thomas Tauber-Marshall (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Tauber-Marshall resolved IMPALA-10590.
-
Fix Version/s: Impala 4.0
   Resolution: Fixed

> Ensure admissiond stays in sync with coordinators
> -
>
> Key: IMPALA-10590
> URL: https://issues.apache.org/jira/browse/IMPALA-10590
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Affects Versions: Impala 4.0
>Reporter: Thomas Tauber-Marshall
>Assignee: Thomas Tauber-Marshall
>Priority: Major
> Fix For: Impala 4.0
>
>
> Currently, it's possible for the admission service to have an incorrect view 
> of what resources are being used in the cluster if there are rpc failures. 
> For example, if the ReleaseQuery rpc fails, the coordinator will retry a few 
> times and then give up. In this case, a query has completed but the admission 
> service doesn't know and will not allow other queries to be scheduled with 
> those resources.
> We can solve this by adding a periodic heartbeat rpc from coordinators to the 
> admission service. This heartbeat will include the query ids for all queries 
> currently running at each coordinator, and then the admission service can 
> clean up resources allocated to any queries that are not in the list, on the 
> assumption that those queries must have completed already.
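
The heartbeat reconciliation described above can be sketched roughly as follows. This is a toy Python model, not Impala's implementation (the backend is C++). The class AdmissionService, its methods, and the ids "coord-1", "q1" and "q2" are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set


@dataclass
class AdmissionService:
    """Toy model of the admission service's bookkeeping (hypothetical)."""

    # coordinator id -> ids of queries that still hold admitted resources
    admitted: Dict[str, Set[str]] = field(default_factory=dict)

    def admit(self, coordinator: str, query_id: str) -> None:
        self.admitted.setdefault(coordinator, set()).add(query_id)

    def release(self, coordinator: str, query_id: str) -> None:
        # Normal path: the ReleaseQuery rpc arrived.
        self.admitted.get(coordinator, set()).discard(query_id)

    def handle_heartbeat(self, coordinator: str, running: Iterable[str]) -> Set[str]:
        """Release every query the coordinator no longer reports; return them."""
        known = self.admitted.setdefault(coordinator, set())
        stale = known - set(running)
        known -= stale  # in-place update of the set stored in self.admitted
        return stale


service = AdmissionService()
service.admit("coord-1", "q1")
service.admit("coord-1", "q2")
# q1 finished, but its ReleaseQuery rpc was lost, so release() never ran.
# The next heartbeat lists only q2, which lets the service reclaim q1.
released = service.handle_heartbeat("coord-1", ["q2"])
print(released)  # {'q1'}
```

The sketch ignores ordering problems such as a heartbeat that was sent before a query was admitted but arrives after it; a real implementation would have to guard against releasing such a query.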






[jira] [Resolved] (IMPALA-10604) Allow setting KuduClient's verbose logging level directly

2021-03-24 Thread Thomas Tauber-Marshall (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Tauber-Marshall resolved IMPALA-10604.
-
Fix Version/s: Impala 4.0
   Resolution: Fixed

> Allow setting KuduClient's verbose logging level directly
> -
>
> Key: IMPALA-10604
> URL: https://issues.apache.org/jira/browse/IMPALA-10604
> Project: IMPALA
>  Issue Type: Improvement
>Affects Versions: Impala 4.0
>Reporter: Thomas Tauber-Marshall
>Assignee: Thomas Tauber-Marshall
>Priority: Major
> Fix For: Impala 4.0
>
>
> Currently, Impala sets KuduClient's verbose logging level to the same as its 
> own level (taken from the -v flag) minus 1. Since KuduClient doesn't have any 
> way of setting vmodule, this means that to get verbose logging inside 
> KuduClient users must turn it on to a high level for all of Impala, which can 
> produce an enormous volume of logging, making it hard to collect, share, and 
> analyze logs.






[jira] [Commented] (IMPALA-10604) Allow setting KuduClient's verbose logging level directly

2021-03-24 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17308269#comment-17308269
 ] 

ASF subversion and git services commented on IMPALA-10604:
--

Commit 452c2f1f7f9cc4c8472ab38949e9990281dcc3a3 in impala's branch 
refs/heads/master from Thomas Tauber-Marshall
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=452c2f1 ]

IMPALA-10604: Allow setting KuduClient's verbose log level directly

This patch adds a flag --kudu_client_v which allows setting the
verbose logging level for the KuduClient to a value other than the
level for the rest of Impala (set by -v) in order to enable debugging
of issues in the KuduClient without producing the enormous amount of
logging that comes with setting a high -v value on all of Impala.

Testing:
- Manually set --kudu_client_v and confirmed that the expected logging
  is produced.

Change-Id: Ib39358709ee714b8cdffd72a0ee58f66d5fab37e
Reviewed-on: http://gerrit.cloudera.org:8080/17222
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
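
The level selection described in this commit message can be modelled in a few lines. This is only an illustrative Python sketch, not the patch itself: the function name kudu_client_verbosity is hypothetical, representing an unset --kudu_client_v as None is an assumption, and so is clamping the fallback at 0.

```python
from typing import Optional


def kudu_client_verbosity(impala_v: int, kudu_client_v: Optional[int] = None) -> int:
    """Pick the verbose level handed to the KuduClient (illustrative only)."""
    if kudu_client_v is not None:
        # Behaviour added by the patch: an explicit --kudu_client_v wins over -v.
        return kudu_client_v
    # Previous behaviour per the issue description: Impala's own level minus 1.
    # Clamping at 0 is an assumption of this sketch.
    return max(impala_v - 1, 0)


print(kudu_client_verbosity(3))     # 2: derived from -v=3
print(kudu_client_verbosity(1, 4))  # 4: KuduClient is verbose, Impala stays at 1
```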


> Allow setting KuduClient's verbose logging level directly
> -
>
> Key: IMPALA-10604
> URL: https://issues.apache.org/jira/browse/IMPALA-10604
> Project: IMPALA
>  Issue Type: Improvement
>Affects Versions: Impala 4.0
>Reporter: Thomas Tauber-Marshall
>Assignee: Thomas Tauber-Marshall
>Priority: Major
>
> Currently, Impala sets KuduClient's verbose logging level to the same as its 
> own level (taken from the -v flag) minus 1. Since KuduClient doesn't have any 
> way of setting vmodule, this means that to get verbose logging inside 
> KuduClient users must turn it on to a high level for all of Impala, which can 
> produce an enormous volume of logging, making it hard to collect, share, and 
> analyze logs.






[jira] [Commented] (IMPALA-10590) Ensure admissiond stays in sync with coordinators

2021-03-24 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17308270#comment-17308270
 ] 

ASF subversion and git services commented on IMPALA-10590:
--

Commit e3bafcbef4fd7152ecfcbc7d331e41e9778caf15 in impala's branch 
refs/heads/master from Thomas Tauber-Marshall
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=e3bafcb ]

IMPALA-10590: Introduce admission service heartbeat mechanism

Currently, if a ReleaseQuery rpc fails, it's possible for the
admission service to think that some resources are still being used
that are actually free.

This patch fixes the issue by introducing a periodic heartbeat rpc
from coordinators to the admission service which contains a list of
queries registered at that coordinator.

If there is a query that the admission service thinks is running but
is not included in the heartbeat, the admission service can conclude
that the query must have already completed and release its resources.

Testing:
- Added a test that uses a debug action to simulate ReleaseQuery rpcs
  failing and checks that query resources are released properly.

Change-Id: Ia528d92268cea487ada20b476935a81166f5ad34
Reviewed-on: http://gerrit.cloudera.org:8080/17194
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Ensure admissiond stays in sync with coordinators
> -
>
> Key: IMPALA-10590
> URL: https://issues.apache.org/jira/browse/IMPALA-10590
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Affects Versions: Impala 4.0
>Reporter: Thomas Tauber-Marshall
>Assignee: Thomas Tauber-Marshall
>Priority: Major
>
> Currently, it's possible for the admission service to have an incorrect view 
> of what resources are being used in the cluster if there are rpc failures. 
> For example, if the ReleaseQuery rpc fails, the coordinator will retry a few 
> times and then give up. In this case, a query has completed but the admission 
> service doesn't know and will not allow other queries to be scheduled with 
> those resources.
> We can solve this by adding a periodic heartbeat rpc from coordinators to the 
> admission service. This heartbeat will include the query ids for all queries 
> currently running at each coordinator, and then the admission service can 
> clean up resources allocated to any queries that are not in the list, on the 
> assumption that those queries must have completed already.






[jira] [Created] (IMPALA-10607) TestDecimalOverflowExprs::test_ctas_exprs failed in S3 build

2021-03-24 Thread Wenzhe Zhou (Jira)
Wenzhe Zhou created IMPALA-10607:


 Summary: TestDecimalOverflowExprs::test_ctas_exprs failed in S3 
build
 Key: IMPALA-10607
 URL: https://issues.apache.org/jira/browse/IMPALA-10607
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 4.0
Reporter: Wenzhe Zhou


TestDecimalOverflowExprs::test_ctas_exprs failed in S3 build

Stack trace:

Stack trace for S3 build. 
[https://master-03.jenkins.cloudera.com/job/impala-cdpd-master-staging-core-s3/34/]

query_test.test_decimal_queries.TestDecimalOverflowExprs.test_ctas_exprs[protocol:
 beeswax | exec_option: \\{'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
parquet/none] (from pytest)

Failing for the past 1 build (Since Failed#34 )
Took 13 sec.
Error Message
ImpalaBeeswaxException: ImpalaBeeswaxException: Query aborted:Parquet file s3a://impala-test-uswest2-1/test-warehouse/test_ctas_exprs_7304e515.db/overflowed_decimal_tbl_1/b74f0ce129189cf1-4c3c5bd6_1609291350_data.0.parq has an invalid file length: 4
Stacktrace
query_test/test_decimal_queries.py:170: in test_ctas_exprs
    "SELECT count(*) FROM %s" % TBL_NAME_1)
/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_test_suite.py:814: in wrapper
    return function(*args, **kwargs)
/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_test_suite.py:822: in execute_query_expect_success
    result = cls.__execute_query(impalad_client, query, query_options, user)
/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_test_suite.py:923: in __execute_query
    return impalad_client.execute(query, user=user)
/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_connection.py:205: in execute
    return self.__beeswax_client.execute(sql_stmt, user=user)
/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:187: in execute
    handle = self.__execute_query(query_string.strip(), user=user)
/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:365: in __execute_query
    self.wait_for_finished(handle)
/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:386: in wait_for_finished
    raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
E   ImpalaBeeswaxException: ImpalaBeeswaxException:
E    Query aborted:Parquet file s3a://impala-test-uswest2-1/test-warehouse/test_ctas_exprs_7304e515.db/overflowed_decimal_tbl_1/b74f0ce129189cf1-4c3c5bd6_1609291350_data.0.parq has an invalid file length: 4
Standard Error
SET 
client_identifier=query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol:beeswax|exec_option:\{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0};
SET sync_ddl=False;
– executing against localhost:21000

DROP DATABASE IF EXISTS `test_ctas_exprs_7304e515` CASCADE;

– 2021-03-24 03:56:00,840 INFO MainThread: Started query 
574a532f47ac7c80:c1c62ae0
SET 
client_identifier=query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol:beeswax|exec_option:\{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0};
SET sync_ddl=False;
– executing against localhost:21000

CREATE DATABASE `test_ctas_exprs_7304e515`;

– 2021-03-24 03:56:03,120 INFO MainThread: Started query 
424b970f206e271f:ade0b524
– 2021-03-24 03:56:03,121 INFO MainThread: Created database 
"test_ctas_exprs_7304e515" for test ID 
"query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol:
 beeswax | exec_option: \\{'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
parquet/none]"
– executing against localhost:21000

SET decimal_v2=true;

– 2021-03-24 03:56:03,126 INFO MainThread: Started query 
4545d8b9db5e9342:8b3ba570
– executing against localhost:21000

DROP TABLE IF EXISTS `test_ctas_exprs_7304e515`.`overflowed_decimal_tbl_1`;

– 2021-03-24 03:56:03,131 INFO MainThread: Started query 
2c4bc9fc85e2b8e8:05e35eed
SET 
client_identifier=query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol:beeswax|exec_option:\{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0};
– executing against localhost:21000

use functional_parquet;

– 

[jira] [Assigned] (IMPALA-10607) TestDecimalOverflowExprs::test_ctas_exprs failed in S3 build

2021-03-24 Thread Wenzhe Zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenzhe Zhou reassigned IMPALA-10607:


Assignee: Wenzhe Zhou

> TestDecimalOverflowExprs::test_ctas_exprs failed in S3 build
> 
>
> Key: IMPALA-10607
> URL: https://issues.apache.org/jira/browse/IMPALA-10607
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0
>Reporter: Wenzhe Zhou
>Assignee: Wenzhe Zhou
>Priority: Major
>
> TestDecimalOverflowExprs::test_ctas_exprs failed in S3 build
> Stack trace:
> Stack trace for S3 build. 
> [https://master-03.jenkins.cloudera.com/job/impala-cdpd-master-staging-core-s3/34/]
> query_test.test_decimal_queries.TestDecimalOverflowExprs.test_ctas_exprs[protocol:
>  beeswax | exec_option: \\{'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> parquet/none] (from pytest)
> Failing for the past 1 build (Since Failed#34 )
> Took 13 sec.
> Error Message
> ImpalaBeeswaxException: ImpalaBeeswaxException: Query aborted:Parquet file s3a://impala-test-uswest2-1/test-warehouse/test_ctas_exprs_7304e515.db/overflowed_decimal_tbl_1/b74f0ce129189cf1-4c3c5bd6_1609291350_data.0.parq has an invalid file length: 4
> Stacktrace
> query_test/test_decimal_queries.py:170: in test_ctas_exprs
>     "SELECT count(*) FROM %s" % TBL_NAME_1)
> /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_test_suite.py:814: in wrapper
>     return function(*args, **kwargs)
> /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_test_suite.py:822: in execute_query_expect_success
>     result = cls.__execute_query(impalad_client, query, query_options, user)
> /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_test_suite.py:923: in __execute_query
>     return impalad_client.execute(query, user=user)
> /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_connection.py:205: in execute
>     return self.__beeswax_client.execute(sql_stmt, user=user)
> /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:187: in execute
>     handle = self.__execute_query(query_string.strip(), user=user)
> /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:365: in __execute_query
>     self.wait_for_finished(handle)
> /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:386: in wait_for_finished
>     raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> E    Query aborted:Parquet file s3a://impala-test-uswest2-1/test-warehouse/test_ctas_exprs_7304e515.db/overflowed_decimal_tbl_1/b74f0ce129189cf1-4c3c5bd6_1609291350_data.0.parq has an invalid file length: 4
> Standard Error
> SET 
> client_identifier=query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol:beeswax|exec_option:\{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0};
> SET sync_ddl=False;
> – executing against localhost:21000
> DROP DATABASE IF EXISTS `test_ctas_exprs_7304e515` CASCADE;
> – 2021-03-24 03:56:00,840 INFO MainThread: Started query 
> 574a532f47ac7c80:c1c62ae0
> SET 
> client_identifier=query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol:beeswax|exec_option:\{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0};
> SET sync_ddl=False;
> – executing against localhost:21000
> CREATE DATABASE `test_ctas_exprs_7304e515`;
> – 2021-03-24 03:56:03,120 INFO MainThread: Started query 
> 424b970f206e271f:ade0b524
> – 2021-03-24 03:56:03,121 INFO MainThread: Created database 
> "test_ctas_exprs_7304e515" for test ID 
> "query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol:
>  beeswax | exec_option: \\{'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> parquet/none]"
> – executing against localhost:21000
> SET decimal_v2=true;
> – 2021-03-24 03:56:03,126 INFO MainThread: Started query 
> 4545d8b9db5e9342:8b3ba570
> – executing against localhost:21000
> DROP TABLE IF EXISTS `test_ctas_exprs_7304e515`.`overflowed_decimal_tbl_1`;
> – 2021-03-24 03:56:03,131 INFO 

[jira] [Created] (IMPALA-10606) Simplify impala-python virtualenv requirements files

2021-03-24 Thread Joe McDonnell (Jira)
Joe McDonnell created IMPALA-10606:
--

 Summary: Simplify impala-python virtualenv requirements files
 Key: IMPALA-10606
 URL: https://issues.apache.org/jira/browse/IMPALA-10606
 Project: IMPALA
  Issue Type: Improvement
  Components: Infrastructure
Affects Versions: Impala 4.0
Reporter: Joe McDonnell
Assignee: Joe McDonnell


The impala-python virtualenv currently has complicated logic that can do 
multiple rounds of pip installs depending on whether the toolchain has been 
bootstrapped. For example, the packages in compile-requirements.txt are only 
installed if the toolchain GCC has been installed. The Kudu python client is 
only installed if Kudu has been downloaded. This was a workaround from when 
bootstrap_toolchain.py required the impala-python virtualenv: the staged 
installs let the basics be installed first, then bootstrap_toolchain.py could 
run, and then the rest could be installed.

The bootstrap_toolchain.py script no longer requires the impala-python 
virtualenv, so there is no need for such a complicated setup. Anything that 
bootstraps the impala-python virtualenv can assume that the toolchain 
compiler is present. This would allow the requirements to be consolidated into 
a main requirements file that includes both compiled and non-compiled packages. 
A consolidated file makes it easier to update dependency versions.






[jira] [Updated] (IMPALA-10605) Deflake test_refresh_native

2021-03-24 Thread Vihang Karajgaonkar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated IMPALA-10605:
-
Description: 
The test uses a regex to parse the output of describe database and extract the 
db properties. The regex currently assumes that there will be only one property 
in the database. This assumption breaks when the events processor is running, 
because it might add some db properties as well.

{noformat}
regex = r"{(.*?)=(.*?)}"
{noformat}

The above regex will select subsequent properties as the value of the first 
key. We can fix this by changing the regex to specifically look for the 
function name property key prefix.
{noformat}
regex = r"{.*(impala_registered_function.*?)=(.*?)[,}]"
{noformat}
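
To make the difference between the two patterns concrete, the snippet below runs both against a made-up property string. The string layout, the property name some.other.property, and the values 42 and abc123 are assumptions for illustration; the real describe database output may be formatted differently.

```python
import re

# Hypothetical db properties string with two entries; names and values are made up.
props = "{some.other.property=42, impala_registered_function_fn1=abc123}"

old_regex = r"{(.*?)=(.*?)}"
new_regex = r"{.*(impala_registered_function.*?)=(.*?)[,}]"

old_match = re.search(old_regex, props)
new_match = re.search(new_regex, props)

# The old pattern takes the first key and swallows the rest as its value.
print(old_match.groups())
# ('some.other.property', '42, impala_registered_function_fn1=abc123')

# The new pattern skips ahead to the function property and stops at ',' or '}'.
print(new_match.groups())
# ('impala_registered_function_fn1', 'abc123')
```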

> Deflake test_refresh_native
> ---
>
> Key: IMPALA-10605
> URL: https://issues.apache.org/jira/browse/IMPALA-10605
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
>
> The test uses a regex to parse the output of describe database and extract 
> the db properties. The regex currently assumes that there will be only one 
> property in the database. This assumption breaks when the events processor is 
> running, because it might add some db properties as well.
> {noformat}
> regex = r"{(.*?)=(.*?)}"
> {noformat}
> The above regex will select subsequent properties as the value of the first 
> key. We can fix this by changing the regex to specifically look for the 
> function name property key prefix.
> {noformat}
> regex = r"{.*(impala_registered_function.*?)=(.*?)[,}]"
> {noformat}






[jira] [Created] (IMPALA-10605) Deflake test_refresh_native

2021-03-24 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created IMPALA-10605:


 Summary: Deflake test_refresh_native
 Key: IMPALA-10605
 URL: https://issues.apache.org/jira/browse/IMPALA-10605
 Project: IMPALA
  Issue Type: Improvement
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar









[jira] [Commented] (IMPALA-6340) There is no error when inserting an invalid value into a decimal column under decimal_v2

2021-03-24 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-6340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17308126#comment-17308126
 ] 

ASF subversion and git services commented on IMPALA-6340:
-

Commit 410c3e79e4eeba0a3f1ad62f6bf2f11b2de48819 in impala's branch 
refs/heads/master from wzhou-code
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=410c3e7 ]

IMPALA-10564: Return error when inserting an invalid decimal value

When CTAS or INSERT-SELECT statements insert rows into a table with
decimal columns, Impala inserts NULL for overflowed decimal values
instead of returning an error. This happens when the data expression
for the decimal column in the SELECT sub-query contains at least one
alias.
This issue is similar to IMPALA-6340, but IMPALA-6340 only fixed the
cases where the data expression for the decimal column is a constant.

This patch fixes the issue by calling RuntimeState::CheckQueryState()
at the end of HdfsTableWriter::AppendRows() and KuduTableSink::Send().
If there is an invalid decimal error, the query fails without
inserting NULL into the decimal column.
The behaviour for decimal_v1 is unchanged: NULL is inserted into the
table for invalid decimal values, with a warning message.

Tests:
 - Added unit-tests for INSERT-SELECT and CTAS statements with
   overflowed decimal values to be inserted into tables. The
   overflowed decimal values are expressed as a constant expression,
   or as an expression with aliases.
   Also added cases to verify behaviour of decimal_v1 is unchanged.
 - Passed exhaustive tests.

Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98
Reviewed-on: http://gerrit.cloudera.org:8080/17168
Reviewed-by: Thomas Tauber-Marshall 
Tested-by: Impala Public Jenkins 


> There is no error when inserting an invalid value into a decimal column under 
> decimal_v2
> 
>
> Key: IMPALA-6340
> URL: https://issues.apache.org/jira/browse/IMPALA-6340
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.11.0
>Reporter: Taras Bobrovytsky
>Assignee: Taras Bobrovytsky
>Priority: Blocker
>  Labels: correctness
> Fix For: Impala 3.0, Impala 2.13.0
>
>
> The following series of commands does not result in an error or a warning 
> when decimal_v2 is enabled.
> {code}
> set decimal_v2=1;
> create table t1 (c1 decimal(38,37));
> insert into t1 select 11.11;
> {code}
> We end up inserting a NULL into the column without any warnings.
> If these commands are executed with decimal_v2 disabled, we get the following 
> warning:
> {code}
> WARNINGS: UDF WARNING: Decimal expression overflowed, returning NULL
> {code}
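
Why 11.11 overflows decimal(38,37) can be checked with a short calculation: a precision of 38 and a scale of 37 leave room for only one digit before the decimal point. The Python sketch below illustrates that arithmetic only. The helper fits_decimal is a hypothetical name, it is not Impala code, and it deliberately ignores how excess fractional digits are rounded.

```python
from decimal import Decimal


def fits_decimal(value: str, precision: int, scale: int) -> bool:
    """Return True if the integer part of `value` fits DECIMAL(precision, scale).

    Simplification (assumption of this sketch): only the digits before the
    decimal point are checked; rounding of extra fractional digits is ignored.
    """
    integer_digits = precision - scale
    limit = Decimal(10) ** integer_digits
    return abs(Decimal(value)) < limit


print(fits_decimal("11.11", 38, 37))  # False: two integer digits, only one allowed
print(fits_decimal("1.11", 38, 37))   # True
```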






[jira] [Commented] (IMPALA-10580) Implement ds_theta_union_f() function.

2021-03-24 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17308128#comment-17308128
 ] 

ASF subversion and git services commented on IMPALA-10580:
--

Commit 622e3c95adca5cf30a0aff6542556feab9b8a861 in impala's branch 
refs/heads/master from Fucun Chu
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=622e3c9 ]

IMPALA-10580: Implement ds_theta_union_f() function

This function receives two strings that are serialized Apache
DataSketches Theta sketches. It unions the two sketches and returns
the resulting sketch.

Example:
select ds_theta_estimate(ds_theta_union_f(sketch1, sketch2))
from sketch_tbl;
+-------------------------------------------------------+
| ds_theta_estimate(ds_theta_union_f(sketch1, sketch2)) |
+-------------------------------------------------------+
| 15                                                    |
+-------------------------------------------------------+

Change-Id: I8329979b81ceeaad739a43fab79768ca9c2916fa
Reviewed-on: http://gerrit.cloudera.org:8080/17179
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Implement ds_theta_union_f() function.
> --
>
> Key: IMPALA-10580
> URL: https://issues.apache.org/jira/browse/IMPALA-10580
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend, Frontend
>Reporter: Fucun Chu
>Assignee: Fucun Chu
>Priority: Major
> Fix For: Impala 4.0
>
>
> This function receives two strings that are serialized Apache DataSketches 
> Theta sketches. It unions the two sketches and returns the resulting sketch.
> Example:
> {code:java}
> select ds_theta_estimate(ds_theta_union_f(sketch1, sketch2))
> from sketch_tbl;
> +-------------------------------------------------------+
> | ds_theta_estimate(ds_theta_union_f(sketch1, sketch2)) |
> +-------------------------------------------------------+
> | 15                                                    |
> +-------------------------------------------------------+{code}






[jira] [Commented] (IMPALA-6340) There is no error when inserting an invalid value into a decimal column under decimal_v2

2021-03-24 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-6340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17308127#comment-17308127
 ] 

ASF subversion and git services commented on IMPALA-6340:
-

Commit 410c3e79e4eeba0a3f1ad62f6bf2f11b2de48819 in impala's branch 
refs/heads/master from wzhou-code
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=410c3e7 ]

IMPALA-10564: Return error when inserting an invalid decimal value

When CTAS or INSERT-SELECT statements insert rows into a table with
decimal columns, Impala inserts NULL for overflowed decimal values
instead of returning an error. This happens when the data expression
for the decimal column in the SELECT sub-query contains at least one
alias.
This issue is similar to IMPALA-6340, but IMPALA-6340 only fixed the
cases where the data expression for the decimal column is a constant.

This patch fixes the issue by calling RuntimeState::CheckQueryState()
at the end of HdfsTableWriter::AppendRows() and KuduTableSink::Send().
If there is an invalid decimal error, the query fails without
inserting NULL into the decimal column.
The behaviour for decimal_v1 is unchanged: NULL is inserted into the
table for invalid decimal values, with a warning message.

Tests:
 - Added unit-tests for INSERT-SELECT and CTAS statements with
   overflowed decimal values to be inserted into tables. The
   overflowed decimal values are expressed as a constant expression,
   or as an expression with aliases.
   Also added cases to verify behaviour of decimal_v1 is unchanged.
 - Passed exhaustive tests.

Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98
Reviewed-on: http://gerrit.cloudera.org:8080/17168
Reviewed-by: Thomas Tauber-Marshall 
Tested-by: Impala Public Jenkins 


> There is no error when inserting an invalid value into a decimal column under 
> decimal_v2
> 
>
> Key: IMPALA-6340
> URL: https://issues.apache.org/jira/browse/IMPALA-6340
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.11.0
>Reporter: Taras Bobrovytsky
>Assignee: Taras Bobrovytsky
>Priority: Blocker
>  Labels: correctness
> Fix For: Impala 3.0, Impala 2.13.0
>
>
> The following series of commands does not result in an error or a warning 
> when decimal_v2 is enabled.
> {code}
> set decimal_v2=1;
> create table t1 (c1 decimal(38,37));
> insert into t1 select 11.11;
> {code}
> We end up inserting a NULL into the column without any warnings.
> If these commands are executed with decimal_v2 disabled, we get the following 
> warning:
> {code}
> WARNINGS: UDF WARNING: Decimal expression overflowed, returning NULL
> {code}






[jira] [Commented] (IMPALA-10564) No error returned when inserting an overflowed value into a decimal column

2021-03-24 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17308125#comment-17308125
 ] 

ASF subversion and git services commented on IMPALA-10564:
--

Commit 410c3e79e4eeba0a3f1ad62f6bf2f11b2de48819 in impala's branch 
refs/heads/master from wzhou-code
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=410c3e7 ]

IMPALA-10564: Return error when inserting an invalid decimal value

When CTAS or INSERT-SELECT statements insert rows into a table with
decimal columns, Impala inserts NULL for overflowed decimal values
instead of returning an error. This happens when the data expression
for the decimal column in the SELECT sub-query contains at least one
alias.
This issue is similar to IMPALA-6340, but IMPALA-6340 only fixed the
cases where the data expression for the decimal column is a constant.

This patch fixes the issue by calling RuntimeState::CheckQueryState()
at the end of HdfsTableWriter::AppendRows() and KuduTableSink::Send().
If there is an invalid decimal error, the query fails without
inserting NULL into the decimal column.
The behaviour for decimal_v1 is unchanged: NULL is inserted into the
table for invalid decimal values, with a warning message.

Tests:
 - Added unit-tests for INSERT-SELECT and CTAS statements with
   overflowed decimal values to be inserted into tables. The
   overflowed decimal values are expressed as a constant expression,
   or as an expression with aliases.
   Also added cases to verify behaviour of decimal_v1 is unchanged.
 - Passed exhaustive tests.

Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98
Reviewed-on: http://gerrit.cloudera.org:8080/17168
Reviewed-by: Thomas Tauber-Marshall 
Tested-by: Impala Public Jenkins 
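
The pattern described in the commit message (record the error during expression evaluation, then check the query state at the end of the append call) can be sketched roughly as follows. This is a simplified Python model under stated assumptions, not Impala's C++ code: QueryState, TableWriter, set_error, check and append_rows are hypothetical names, and max_abs stands in for the column's decimal range.

```python
class QueryState:
    """Hypothetical stand-in for per-query runtime state."""

    def __init__(self):
        self.error = None

    def set_error(self, msg):
        # Keep only the first error that was reported.
        if self.error is None:
            self.error = msg

    def check(self):
        # Fail the query if any error was recorded earlier.
        if self.error is not None:
            raise RuntimeError(self.error)


class TableWriter:
    """Hypothetical writer; max_abs models the column's decimal range."""

    def __init__(self, state, max_abs):
        self.state = state
        self.max_abs = max_abs
        self.rows = []

    def _eval(self, value):
        # On overflow, evaluation records an error and yields None (NULL).
        if abs(value) >= self.max_abs:
            self.state.set_error("Decimal expression overflowed")
            return None
        return value

    def append_rows(self, batch):
        for value in batch:
            self.rows.append(self._eval(value))
        # Check at the end of the append, so an overflow aborts the
        # query instead of leaving a silent NULL behind.
        self.state.check()


writer = TableWriter(QueryState(), max_abs=10)
writer.append_rows([1.5, 2.5])
try:
    writer.append_rows([11.11])
except RuntimeError as e:
    print("query failed:", e)
```

Without the final check() call, the second batch would leave a None in rows and the caller would see no error, which corresponds to the behaviour the issue reports.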


> No error returned when inserting an overflowed value into a decimal column
> --
>
> Key: IMPALA-10564
> URL: https://issues.apache.org/jira/browse/IMPALA-10564
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend, Frontend
>Affects Versions: Impala 4.0
>Reporter: Wenzhe Zhou
>Assignee: Wenzhe Zhou
>Priority: Major
> Fix For: Impala 4.0
>
>
> When CTAS or INSERT-SELECT statements insert rows into a table with decimal 
> columns, Impala inserts NULL for overflowed decimal values instead of 
> returning an error. This happens when the data expression for the decimal 
> column in the SELECT sub-query contains at least one alias. This issue is 
> similar to IMPALA-6340, but IMPALA-6340 only fixed the cases where the data 
> expression for the decimal column is a constant, so that the frontend can 
> detect the overflowed value during expression analysis. If the data 
> expression contains an alias (a variable), the frontend cannot evaluate it 
> during expression analysis. Only the backend can evaluate it, when it 
> executes the fragment instances for the SELECT sub-query. The log messages 
> showed that the executor detected the decimal overflow error but did not 
> propagate it to the coordinator, so the error was not returned to the 
> client.
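
The distinction drawn above between analysis-time and execution-time detection can be illustrated with a toy expression evaluator. This is a sketch under stated assumptions, not Impala's planner or executor: the tuple-based expression encoding and the functions is_constant, evaluate, analyze and execute are hypothetical, and LIMIT models the one integer digit of DECIMAL(38,37).

```python
from decimal import Decimal

LIMIT = Decimal(10)  # models DECIMAL(38,37): one integer digit


def is_constant(expr):
    """expr is a Decimal literal, a column name (str), or ('+', left, right)."""
    if isinstance(expr, Decimal):
        return True
    if isinstance(expr, str):
        return False
    _, left, right = expr
    return is_constant(left) and is_constant(right)


def evaluate(expr, row):
    if isinstance(expr, Decimal):
        return expr
    if isinstance(expr, str):
        return row[expr]  # column values are known only at run time
    _, left, right = expr
    return evaluate(left, row) + evaluate(right, row)


def analyze(expr):
    """Planning-time check: only constant expressions can be rejected here."""
    if is_constant(expr) and abs(evaluate(expr, {})) >= LIMIT:
        raise ValueError("overflow detected during analysis")


def execute(expr, rows):
    """Run-time check: required for expressions that reference columns."""
    out = []
    for row in rows:
        value = evaluate(expr, row)
        if abs(value) >= LIMIT:
            raise ValueError("overflow detected during execution")
        out.append(value)
    return out
```

In this model, analyze(Decimal("11.11")) raises immediately, while analyze(("+", "c1", Decimal("1"))) passes because the value of c1 is unknown until execute() sees the rows.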






[jira] [Commented] (IMPALA-10564) No error returned when inserting an overflowed value into a decimal column

2021-03-24 Thread Wenzhe Zhou (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17308011#comment-17308011
 ] 

Wenzhe Zhou commented on IMPALA-10564:
--

Stack trace for S3 build. 
[https://master-03.jenkins.cloudera.com/job/impala-cdpd-master-staging-core-s3/34/]

query_test.test_decimal_queries.TestDecimalOverflowExprs.test_ctas_exprs[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] (from pytest)

Failing for the past 1 build (Since Failed#34)
Took 13 sec.
Error Message
ImpalaBeeswaxException: ImpalaBeeswaxException: Query aborted:Parquet file s3a://impala-test-uswest2-1/test-warehouse/test_ctas_exprs_7304e515.db/overflowed_decimal_tbl_1/b74f0ce129189cf1-4c3c5bd6_1609291350_data.0.parq has an invalid file length: 4
Stacktrace
query_test/test_decimal_queries.py:170: in test_ctas_exprs
 "SELECT count(*) FROM %s" % TBL_NAME_1)
/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_test_suite.py:814: in wrapper
 return function(*args, **kwargs)
/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_test_suite.py:822: in execute_query_expect_success
 result = cls.__execute_query(impalad_client, query, query_options, user)
/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_test_suite.py:923: in __execute_query
 return impalad_client.execute(query, user=user)
/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_connection.py:205: in execute
 return self.__beeswax_client.execute(sql_stmt, user=user)
/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:187: in execute
 handle = self.__execute_query(query_string.strip(), user=user)
/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:365: in __execute_query
 self.wait_for_finished(handle)
/data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:386: in wait_for_finished
 raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
E ImpalaBeeswaxException: ImpalaBeeswaxException:
E Query aborted:Parquet file s3a://impala-test-uswest2-1/test-warehouse/test_ctas_exprs_7304e515.db/overflowed_decimal_tbl_1/b74f0ce129189cf1-4c3c5bd6_1609291350_data.0.parq has an invalid file length: 4
Standard Error
SET client_identifier=query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0};
SET sync_ddl=False;
-- executing against localhost:21000

DROP DATABASE IF EXISTS `test_ctas_exprs_7304e515` CASCADE;

-- 2021-03-24 03:56:00,840 INFO MainThread: Started query 574a532f47ac7c80:c1c62ae0
SET client_identifier=query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0};
SET sync_ddl=False;
-- executing against localhost:21000

CREATE DATABASE `test_ctas_exprs_7304e515`;

-- 2021-03-24 03:56:03,120 INFO MainThread: Started query 424b970f206e271f:ade0b524
-- 2021-03-24 03:56:03,121 INFO MainThread: Created database "test_ctas_exprs_7304e515" for test ID "query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]"
-- executing against localhost:21000

SET decimal_v2=true;

-- 2021-03-24 03:56:03,126 INFO MainThread: Started query 4545d8b9db5e9342:8b3ba570
-- executing against localhost:21000

DROP TABLE IF EXISTS `test_ctas_exprs_7304e515`.`overflowed_decimal_tbl_1`;

-- 2021-03-24 03:56:03,131 INFO MainThread: Started query 2c4bc9fc85e2b8e8:05e35eed
SET client_identifier=query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0};
-- executing against localhost:21000

use functional_parquet;

-- 2021-03-24 03:56:03,135 INFO MainThread: Started query 38403231c3885691:b0ba2cc4
SET 

[jira] [Reopened] (IMPALA-10564) No error returned when inserting an overflowed value into a decimal column

2021-03-24 Thread Wenzhe Zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenzhe Zhou reopened IMPALA-10564:
--

New test case TestDecimalOverflowExprs.test_ctas_exprs failed in S3 build.



