[
https://issues.apache.org/jira/browse/IMPALA-10607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17309054#comment-17309054
]
Wenzhe Zhou commented on IMPALA-10607:
--------------------------------------
Verified that this issue does not happen when the query option
S3_SKIP_INSERT_STAGING is set to FALSE. When this query option is set to TRUE,
INSERT writes to S3 go directly to their final location rather than being
copied there by the coordinator. If the CTAS finishes with an error, the Parquet
partition file is left un-finalized. To fix it, we could call
WriteFileFooter() before HdfsParquetTableWriter::AppendRows() returns with an
error, or delete the HDFS file when AppendRows() returns an error and
ShouldSkipStaging() returns TRUE.
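
For illustration only: the "invalid file length: 4" in the error below is consistent with a file that holds just the 4-byte leading Parquet magic ("PAR1") and no footer. The sketch below shows how such an un-finalized file could be recognized and removed. It runs on a local file, not through Impala or S3, and the helper names looks_finalized() and remove_if_unfinalized() are hypothetical, not part of the Impala code base.

```python
import os
import struct

PARQUET_MAGIC = b"PAR1"
# Smallest well-formed layout: leading magic + 4-byte footer length + trailing magic.
MIN_PARQUET_SIZE = len(PARQUET_MAGIC) + 4 + len(PARQUET_MAGIC)


def looks_finalized(path):
    """Return True if the file has both Parquet magics and a plausible footer length.

    This is only a structural check; it does not parse the footer metadata.
    """
    size = os.path.getsize(path)
    if size < MIN_PARQUET_SIZE:
        # A 4-byte file (header magic only) ends up here.
        return False
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-8, os.SEEK_END)
        (footer_len,) = struct.unpack("<I", f.read(4))  # little-endian uint32
        tail = f.read(4)
    return (
        head == PARQUET_MAGIC
        and tail == PARQUET_MAGIC
        and footer_len <= size - MIN_PARQUET_SIZE
    )


def remove_if_unfinalized(path):
    """Hypothetical cleanup, loosely mirroring the 'delete the file on error' option.

    Returns True if a partial file was removed.
    """
    if os.path.exists(path) and not looks_finalized(path):
        os.remove(path)
        return True
    return False
```

In the real writer, the cleanup would have to be triggered from the error path of AppendRows() and only when staging is skipped; the check above just makes the failure condition concrete.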
> TestDecimalOverflowExprs::test_ctas_exprs failed in S3 build
> ------------------------------------------------------------
>
> Key: IMPALA-10607
> URL: https://issues.apache.org/jira/browse/IMPALA-10607
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 4.0
> Reporter: Wenzhe Zhou
> Assignee: Wenzhe Zhou
> Priority: Major
>
> TestDecimalOverflowExprs::test_ctas_exprs failed in S3 build
> Stack trace:
> Stack trace for S3 build.
> [https://master-03.jenkins.cloudera.com/job/impala-cdpd-master-staging-core-s3/34/]
> query_test.test_decimal_queries.TestDecimalOverflowExprs.test_ctas_exprs[protocol:
> beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> parquet/none] (from pytest)
> Failing for the past 1 build (Since Failed#34 )
> Took 13 sec.
> Error Message
> ImpalaBeeswaxException: ImpalaBeeswaxException: Query aborted:Parquet file
> s3a://impala-test-uswest2-1/test-warehouse/test_ctas_exprs_7304e515.db/overflowed_decimal_tbl_1/b74f0ce129189cf1-4c3c5bd600000000_1609291350_data.0.parq
> has an invalid file length: 4
> Stacktrace
> query_test/test_decimal_queries.py:170: in test_ctas_exprs
> "SELECT count(*) FROM %s" % TBL_NAME_1)
> /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_test_suite.py:814:
> in wrapper
> return function(*args, **kwargs)
> /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_test_suite.py:822:
> in execute_query_expect_success
> result = cls.__execute_query(impalad_client, query, query_options, user)
> /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_test_suite.py:923:
> in __execute_query
> return impalad_client.execute(query, user=user)
> /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/common/impala_connection.py:205:
> in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:187:
> in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:365:
> in __execute_query
> self.wait_for_finished(handle)
> /data/jenkins/workspace/impala-cdpd-master-staging-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:386:
> in wait_for_finished
> raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> E ImpalaBeeswaxException: ImpalaBeeswaxException:
> E Query aborted:Parquet file
> s3a://impala-test-uswest2-1/test-warehouse/test_ctas_exprs_7304e515.db/overflowed_decimal_tbl_1/b74f0ce129189cf1-4c3c5bd600000000_1609291350_data.0.parq
> has an invalid file length: 4
> Standard Error
> SET
> client_identifier=query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0};
> SET sync_ddl=False;
> – executing against localhost:21000
> DROP DATABASE IF EXISTS `test_ctas_exprs_7304e515` CASCADE;
> – 2021-03-24 03:56:00,840 INFO MainThread: Started query
> 574a532f47ac7c80:c1c62ae000000000
> SET
> client_identifier=query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0};
> SET sync_ddl=False;
> – executing against localhost:21000
> CREATE DATABASE `test_ctas_exprs_7304e515`;
> – 2021-03-24 03:56:03,120 INFO MainThread: Started query
> 424b970f206e271f:ade0b52400000000
> – 2021-03-24 03:56:03,121 INFO MainThread: Created database
> "test_ctas_exprs_7304e515" for test ID
> "query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol:
> beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> parquet/none]"
> – executing against localhost:21000
> SET decimal_v2=true;
> – 2021-03-24 03:56:03,126 INFO MainThread: Started query
> 4545d8b9db5e9342:8b3ba57000000000
> – executing against localhost:21000
> DROP TABLE IF EXISTS `test_ctas_exprs_7304e515`.`overflowed_decimal_tbl_1`;
> – 2021-03-24 03:56:03,131 INFO MainThread: Started query
> 2c4bc9fc85e2b8e8:05e35eed00000000
> SET
> client_identifier=query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0};
> – executing against localhost:21000
> use functional_parquet;
> – 2021-03-24 03:56:03,135 INFO MainThread: Started query
> 38403231c3885691:b0ba2cc400000000
> SET
> client_identifier=query_test/test_decimal_queries.py::TestDecimalOverflowExprs::()::test_ctas_exprs[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0};
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=1;
> SET exec_single_node_rows_threshold=0;
> – executing against localhost:21000
> CREATE TABLE `test_ctas_exprs_7304e515`.`overflowed_decimal_tbl_1` STORED AS
> PARQUET AS SELECT 1 as i, cast(a*a*a as decimal (28,10)) as d_28 FROM (SELECT
> cast(654964569154.9565 as decimal (28,7)) as a) q;
> – 2021-03-24 03:56:03,399 INFO MainThread: Started query
> b74f0ce129189cf1:4c3c5bd600000000
> – executing against localhost:21000
> SELECT count(*) FROM `test_ctas_exprs_7304e515`.`overflowed_decimal_tbl_1`;