[
https://issues.apache.org/jira/browse/IMPALA-12740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ye Zihao resolved IMPALA-12740.
-------------------------------
Fix Version/s: Impala 4.4.0
Resolution: Fixed
> TestHdfsJsonScanNodeErrors fails in exhaustive mode
> ---------------------------------------------------
>
> Key: IMPALA-12740
> URL: https://issues.apache.org/jira/browse/IMPALA-12740
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 4.4.0
> Reporter: Laszlo Gaal
> Assignee: Ye Zihao
> Priority: Blocker
> Labels: broken-build
> Fix For: Impala 4.4.0
>
>
> data_errors.test_data_errors.TestHdfsJsonScanNodeErrors fails when tests are
> run in exhaustive mode. Both debug and release builds exhibit the same
> symptoms.
> The run log:
> {code}
> FAIL
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
> beeswax | exec_option: {'test_replan': 1, 'batch_size': 1, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> json/snap/block]
> FAIL
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
> beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> json/snap/block]
> FAIL
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
> beeswax | exec_option: {'test_replan': 1, 'batch_size': 1, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> json/snap/block]
> FAIL
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
> beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> json/snap/block]
> FAIL
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
> beeswax | exec_option: {'test_replan': 1, 'batch_size': 1, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> json/def/block]
> FAIL
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
> beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> json/def/block]
> FAIL
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
> beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> json/def/block]
> FAIL
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
> beeswax | exec_option: {'test_replan': 1, 'batch_size': 1, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> json/def/block]
> FAIL
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
> beeswax | exec_option: {'test_replan': 1, 'batch_size': 1, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> json/bzip/block]
> FAIL
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
> beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> json/bzip/block]
> FAIL
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
> beeswax | exec_option: {'test_replan': 1, 'batch_size': 1, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> json/gzip/block]
> FAIL
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
> beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> json/gzip/block]
> FAIL
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
> beeswax | exec_option: {'test_replan': 1, 'batch_size': 1, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> json/bzip/block]
> FAIL
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
> beeswax | exec_option: {'test_replan': 1, 'batch_size': 1, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> json/gzip/block]
> FAIL
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
> beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> json/bzip/block]
> FAIL
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
> beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> json/gzip/block]
> {code}
> stderr complains about conversion failures:
> {code}
> SET
> client_identifier=data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:beeswax|exec_option:{'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':True;'abort_on_error':1;'exec_;
> -- executing against localhost:21000
> use functional_json_snap;
> -- 2024-01-19 23:30:10,326 INFO MainThread: Started query
> 824002d0b11e7ba1:2e75861b00000000
> SET
> client_identifier=data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:beeswax|exec_option:{'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':True;'abort_on_error':1;'exec_;
> SET test_replan=1;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=True;
> SET abort_on_error=0;
> SET exec_single_node_rows_threshold=0;
> -- 2024-01-19 23:30:10,327 INFO MainThread: Loading query test file:
> /data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-json-scan-node-errors.test
> -- executing against localhost:21000
> select * from alltypeserror order by id;
> -- 2024-01-19 23:30:14,212 INFO MainThread: Started query
> d54d7edb55f53beb:dbbb778a00000000
> -- 2024-01-19 23:30:14,295 ERROR MainThread: Comparing QueryTestResults
> (expected vs actual):
> 'Error converting column: functional_json.alltypeserror.bigint_col, type:
> BIGINT, data: 'err300'' != 'Error converting column:
> functional_json_snap.alltypeserror.bigint_col, type: BIGINT, data: 'err300''
> 'Error converting column: functional_json.alltypeserror.bigint_col, type:
> BIGINT, data: 'err50'' != 'Error converting column:
> functional_json_snap.alltypeserror.bigint_col, type: BIGINT, data: 'err50''
> 'Error converting column: functional_json.alltypeserror.bigint_col, type:
> BIGINT, data: 'err90'' != 'Error converting column:
> functional_json_snap.alltypeserror.bigint_col, type: BIGINT, data: 'err90''
> 'Error converting column: functional_json.alltypeserror.bool_col, type:
> BOOLEAN, data: 'errfalse'' != 'Error converting column:
> functional_json_snap.alltypeserror.bool_col, type: BOOLEAN, data: 'errfalse''
> 'Error converting column: functional_json.alltypeserror.bool_col, type:
> BOOLEAN, data: 'errtrue'' != 'Error converting column:
> functional_json_snap.alltypeserror.bool_col, type: BOOLEAN, data: 'errtrue''
> 'Error converting column: functional_json.alltypeserror.bool_col, type:
> BOOLEAN, data: 't\rue'' != 'Error converting column:
> functional_json_snap.alltypeserror.bool_col, type: BOOLEAN, data: 't\rue''
> 'Error converting column: functional_json.alltypeserror.double_col, type:
> DOUBLE, data: 'err300.900000'' != 'Error converting column:
> functional_json_snap.alltypeserror.double_col, type: DOUBLE, data:
> 'err300.900000''
> 'Error converting column: functional_json.alltypeserror.double_col, type:
> DOUBLE, data: 'err70.700000'' != 'Error converting column:
> functional_json_snap.alltypeserror.double_col, type: DOUBLE, data:
> 'err70.700000''
> 'Error converting column: functional_json.alltypeserror.double_col, type:
> DOUBLE, data: 'err90.900000'' != 'Error converting column:
> functional_json_snap.alltypeserror.double_col, type: DOUBLE, data:
> 'err90.900000''
> 'Error converting column: functional_json.alltypeserror.double_col, type:
> DOUBLE, data: 'xyz30.300000'' != 'Error converting column:
> functional_json_snap.alltypeserror.double_col, type: DOUBLE, data:
> 'xyz30.300000''
> 'Error converting column: functional_json.alltypeserror.double_col, type:
> DOUBLE, data: 'xyz70.700000'' != 'Error converting column:
> functional_json_snap.alltypeserror.double_col, type: DOUBLE, data:
> 'xyz70.700000''
> 'Error converting column: functional_json.alltypeserror.float_col, type:
> FLOAT, data: 'err30..000000'' != 'Error converting column:
> functional_json_snap.alltypeserror.float_col, type: FLOAT, data:
> 'err30..000000''
> 'Error converting column: functional_json.alltypeserror.float_col, type:
> FLOAT, data: 'err6.000000'' != 'Error converting column:
> functional_json_snap.alltypeserror.float_col, type: FLOAT, data:
> 'err6.000000''
> 'Error converting column: functional_json.alltypeserror.float_col, type:
> FLOAT, data: 'err9.000000'' != 'Error converting column:
> functional_json_snap.alltypeserror.float_col, type: FLOAT, data:
> 'err9.000000''
> 'Error converting column: functional_json.alltypeserror.float_col, type:
> FLOAT, data: 'xyz3.000000'' != 'Error converting column:
> functional_json_snap.alltypeserror.float_col, type: FLOAT, data:
> 'xyz3.000000''
> 'Error converting column: functional_json.alltypeserror.int_col, type: INT,
> data: 'abc5'' != 'Error converting column:
> functional_json_snap.alltypeserror.int_col, type: INT, data: 'abc5''
> 'Error converting column: functional_json.alltypeserror.int_col, type: INT,
> data: 'abc9'' != 'Error converting column:
> functional_json_snap.alltypeserror.int_col, type: INT, data: 'abc9''
> 'Error converting column: functional_json.alltypeserror.int_col, type: INT,
> data: 'err30'' != 'Error converting column:
> functional_json_snap.alltypeserror.int_col, type: INT, data: 'err30''
> 'Error converting column: functional_json.alltypeserror.int_col, type: INT,
> data: 'err4'' != 'Error converting column:
> functional_json_snap.alltypeserror.int_col, type: INT, data: 'err4''
> 'Error converting column: functional_json.alltypeserror.int_col, type: INT,
> data: 'err9'' != 'Error converting column:
> functional_json_snap.alltypeserror.int_col, type: INT, data: 'err9''
> 'Error converting column: functional_json.alltypeserror.smallint_col, type:
> SMALLINT, data: 'abc3'' != 'Error converting column:
> functional_json_snap.alltypeserror.smallint_col, type: SMALLINT, data: 'abc3''
> 'Error converting column: functional_json.alltypeserror.smallint_col, type:
> SMALLINT, data: 'err3'' != 'Error converting column:
> functional_json_snap.alltypeserror.smallint_col, type: SMALLINT, data: 'err3''
> 'Error converting column: functional_json.alltypeserror.smallint_col, type:
> SMALLINT, data: 'err30'' != 'Error converting column:
> functional_json_snap.alltypeserror.smallint_col, type: SMALLINT, data:
> 'err30''
> 'Error converting column: functional_json.alltypeserror.smallint_col, type:
> SMALLINT, data: 'err9'' != 'Error converting column:
> functional_json_snap.alltypeserror.smallint_col, type: SMALLINT, data: 'err9''
> 'Error converting column: functional_json.alltypeserror.timestamp_col, type:
> TIMESTAMP, data: '0'' != 'Error converting column:
> functional_json_snap.alltypeserror.timestamp_col, type: TIMESTAMP, data: '0''
> 'Error converting column: functional_json.alltypeserror.timestamp_col, type:
> TIMESTAMP, data: '0'' != 'Error converting column:
> functional_json_snap.alltypeserror.timestamp_col, type: TIMESTAMP, data: '0''
> 'Error converting column: functional_json.alltypeserror.timestamp_col, type:
> TIMESTAMP, data: '0000-01-01 00:00:00'' != 'Error converting column:
> functional_json_snap.alltypeserror.timestamp_col, type: TIMESTAMP, data:
> '0000-01-01 00:00:00''
> 'Error converting column: functional_json.alltypeserror.timestamp_col, type:
> TIMESTAMP, data: '0000-01-01 00:00:00'' != 'Error converting column:
> functional_json_snap.alltypeserror.timestamp_col, type: TIMESTAMP, data:
> '0000-01-01 00:00:00''
> 'Error converting column: functional_json.alltypeserror.timestamp_col, type:
> TIMESTAMP, data: '0009-01-01 00:00:00'' != 'Error converting column:
> functional_json_snap.alltypeserror.timestamp_col, type: TIMESTAMP, data:
> '0009-01-01 00:00:00''
> 'Error converting column: functional_json.alltypeserror.timestamp_col, type:
> TIMESTAMP, data: '1999-10-10 90:10:10'' != 'Error converting column:
> functional_json_snap.alltypeserror.timestamp_col, type: TIMESTAMP, data:
> '1999-10-10 90:10:10''
> 'Error converting column: functional_json.alltypeserror.timestamp_col, type:
> TIMESTAMP, data: '2002-14-10 00:00:00'' != 'Error converting column:
> functional_json_snap.alltypeserror.timestamp_col, type: TIMESTAMP, data:
> '2002-14-10 00:00:00''
> 'Error converting column: functional_json.alltypeserror.timestamp_col, type:
> TIMESTAMP, data: '2020-10-10 10:70:10.123'' != 'Error converting column:
> functional_json_snap.alltypeserror.timestamp_col, type: TIMESTAMP, data:
> '2020-10-10 10:70:10.123''
> 'Error converting column: functional_json.alltypeserror.timestamp_col, type:
> TIMESTAMP, data: '2020-10-10 60:10:10.123'' != 'Error converting column:
> functional_json_snap.alltypeserror.timestamp_col, type: TIMESTAMP, data:
> '2020-10-10 60:10:10.123''
> 'Error converting column: functional_json.alltypeserror.timestamp_col, type:
> TIMESTAMP, data: '2020-10-40 10:10:10.123'' != 'Error converting column:
> functional_json_snap.alltypeserror.timestamp_col, type: TIMESTAMP, data:
> '2020-10-40 10:10:10.123''
> 'Error converting column: functional_json.alltypeserror.timestamp_col, type:
> TIMESTAMP, data: '2020-20-10 10:10:10.123'' != 'Error converting column:
> functional_json_snap.alltypeserror.timestamp_col, type: TIMESTAMP, data:
> '2020-20-10 10:10:10.123''
> 'Error converting column: functional_json.alltypeserror.tinyint_col, type:
> TINYINT, data: 'abc7'' != 'Error converting column:
> functional_json_snap.alltypeserror.tinyint_col, type: TINYINT, data: 'abc7''
> 'Error converting column: functional_json.alltypeserror.tinyint_col, type:
> TINYINT, data: 'err2'' != 'Error converting column:
> functional_json_snap.alltypeserror.tinyint_col, type: TINYINT, data: 'err2''
> 'Error converting column: functional_json.alltypeserror.tinyint_col, type:
> TINYINT, data: 'err30'' != 'Error converting column:
> functional_json_snap.alltypeserror.tinyint_col, type: TINYINT, data: 'err30''
> 'Error converting column: functional_json.alltypeserror.tinyint_col, type:
> TINYINT, data: 'err9'' != 'Error converting column:
> functional_json_snap.alltypeserror.tinyint_col, type: TINYINT, data: 'err9''
> 'Error converting column: functional_json.alltypeserror.tinyint_col, type:
> TINYINT, data: 'xyz5'' != 'Error converting column:
> functional_json_snap.alltypeserror.tinyint_col, type: TINYINT, data: 'xyz5''
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=1/000000_0.snappy,
> before offset: 648'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=1/000000_0.snappy,
> before offset: 648'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=1/000000_0.snappy,
> before offset: 648'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=1/000000_0.snappy,
> before offset: 648'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=1/000000_0.snappy,
> before offset: 648'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=1/000000_0.snappy,
> before offset: 648'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=1/000000_0.snappy,
> before offset: 648'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=1/000000_0.snappy,
> before offset: 648'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=1/000000_0.snappy,
> before offset: 648'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=2/000000_0.snappy,
> before offset: 572'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=2/000000_0.snappy,
> before offset: 572'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=2/000000_0.snappy,
> before offset: 572'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=2/000000_0.snappy,
> before offset: 572'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=3/000000_0.snappy,
> before offset: 725'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=3/000000_0.snappy,
> before offset: 725'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=3/000000_0.snappy,
> before offset: 725'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=3/000000_0.snappy,
> before offset: 725'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=3/000000_0.snappy,
> before offset: 725'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=3/000000_0.snappy,
> before offset: 725'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=3/000000_0.snappy,
> before offset: 725'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=3/000000_0.snappy,
> before offset: 725'
> {code}
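The pattern in the log above can be reproduced in isolation. In this run the literal expected lines name the database functional_json while the actual errors name functional_json_snap, so the equality check fails; the row_regex lines carry no database name and still match. The sketch below is a minimal illustration only: `matches` is a hypothetical helper, not the code in common/test_result_verifier.py, and the tolerant pattern at the end is an assumption about one possible way to make such a line format-independent, not a statement of the fix that was applied.

```python
import re

ROW_REGEX_PREFIX = "row_regex:"


def matches(expected: str, actual: str) -> bool:
    """Hypothetical comparison helper (not Impala's real verifier).

    Expected lines that start with 'row_regex:' are treated as regular
    expressions anchored at the start of the actual line; every other
    expected line must be equal to the actual line.
    """
    if expected.startswith(ROW_REGEX_PREFIX):
        pattern = expected[len(ROW_REGEX_PREFIX):].strip()
        return re.match(pattern, actual) is not None
    return expected == actual


# Literal expectation taken from the log: it hardcodes 'functional_json'.
expected_literal = (
    "Error converting column: functional_json.alltypeserror.bigint_col, "
    "type: BIGINT, data: 'err300'"
)
# Actual message when the test runs against the json/snap/block format.
actual_snap = (
    "Error converting column: functional_json_snap.alltypeserror.bigint_col, "
    "type: BIGINT, data: 'err300'"
)

# Regex expectation taken from the log: no database name is involved.
expected_regex = (
    r"row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* "
    r"before offset: \d+"
)
actual_parse = (
    "Error parsing row: file: hdfs://localhost:20500/test-warehouse/"
    "alltypeserror_json_snap/year=2009/month=1/000000_0.snappy, "
    "before offset: 648"
)

# Assumed, illustrative pattern that accepts any 'functional_json*' database.
expected_tolerant = (
    r"row_regex: .*Error converting column: functional_json\w*\."
    r"alltypeserror\.bigint_col, type: BIGINT, data: 'err300'"
)

literal_ok = matches(expected_literal, actual_snap)    # database names differ
regex_ok = matches(expected_regex, actual_parse)       # regex line matches
tolerant_ok = matches(expected_tolerant, actual_snap)  # suffix is tolerated

print(literal_ok, regex_ok, tolerant_ok)
```

Only the literal comparison is sensitive to the database name, which is consistent with every "Error converting column" line being reported as a mismatch while every "Error parsing row" line is reported with ==.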
> Similar failures are reported for all failing test cases. Here is an example
> for json/gzip/block:
> {code}
> data_errors/test_data_errors.py:167: in test_hdfs_json_scan_node_errors
> self.run_test_case('DataErrorsTest/hdfs-json-scan-node-errors', vector)
> common/impala_test_suite.py:756: in run_test_case
> self.__verify_results_and_errors(vector, test_section, result, use_db)
> common/impala_test_suite.py:589: in __verify_results_and_errors
> replace_filenames_with_placeholder)
> common/test_result_verifier.py:396: in verify_raw_results
> verify_errors(expected_errors, actual_errors)
> common/test_result_verifier.py:339: in verify_errors
> VERIFIER_MAP['VERIFY_IS_EQUAL'](expected, actual)
> common/test_result_verifier.py:296: in verify_query_result_is_equal
> assert expected_results == actual_results
> E assert Comparing QueryTestResults (expected vs actual):
> E 'Error converting column: functional_json.alltypeserror.bigint_col,
> type: BIGINT, data: 'err300'' != 'Error converting column:
> functional_json_gzip.alltypeserror.bigint_col, type: BIGINT, data: 'err300''
> E 'Error converting column: functional_json.alltypeserror.bigint_col,
> type: BIGINT, data: 'err50'' != 'Error converting column:
> functional_json_gzip.alltypeserror.bigint_col, type: BIGINT, data: 'err50''
> E 'Error converting column: functional_json.alltypeserror.bigint_col,
> type: BIGINT, data: 'err90'' != 'Error converting column:
> functional_json_gzip.alltypeserror.bigint_col, type: BIGINT, data: 'err90''
> E 'Error converting column: functional_json.alltypeserror.bool_col, type:
> BOOLEAN, data: 'errfalse'' != 'Error converting column:
> functional_json_gzip.alltypeserror.bool_col, type: BOOLEAN, data: 'errfalse''
> E 'Error converting column: functional_json.alltypeserror.bool_col, type:
> BOOLEAN, data: 'errtrue'' != 'Error converting column:
> functional_json_gzip.alltypeserror.bool_col, type: BOOLEAN, data: 'errtrue''
> E 'Error converting column: functional_json.alltypeserror.bool_col, type:
> BOOLEAN, data: 't\rue'' != 'Error converting column:
> functional_json_gzip.alltypeserror.bool_col, type: BOOLEAN, data: 't\rue''
> E 'Error converting column: functional_json.alltypeserror.double_col,
> type: DOUBLE, data: 'err300.900000'' != 'Error converting column:
> functional_json_gzip.alltypeserror.double_col, type: DOUBLE, data:
> 'err300.900000''
> E 'Error converting column: functional_json.alltypeserror.double_col,
> type: DOUBLE, data: 'err70.700000'' != 'Error converting column:
> functional_json_gzip.alltypeserror.double_col, type: DOUBLE, data:
> 'err70.700000''
> E 'Error converting column: functional_json.alltypeserror.double_col,
> type: DOUBLE, data: 'err90.900000'' != 'Error converting column:
> functional_json_gzip.alltypeserror.double_col, type: DOUBLE, data:
> 'err90.900000''
> E 'Error converting column: functional_json.alltypeserror.double_col,
> type: DOUBLE, data: 'xyz30.300000'' != 'Error converting column:
> functional_json_gzip.alltypeserror.double_col, type: DOUBLE, data:
> 'xyz30.300000''
> E 'Error converting column: functional_json.alltypeserror.double_col,
> type: DOUBLE, data: 'xyz70.700000'' != 'Error converting column:
> functional_json_gzip.alltypeserror.double_col, type: DOUBLE, data:
> 'xyz70.700000''
> E 'Error converting column: functional_json.alltypeserror.float_col,
> type: FLOAT, data: 'err30..000000'' != 'Error converting column:
> functional_json_gzip.alltypeserror.float_col, type: FLOAT, data:
> 'err30..000000''
> E 'Error converting column: functional_json.alltypeserror.float_col,
> type: FLOAT, data: 'err6.000000'' != 'Error converting column:
> functional_json_gzip.alltypeserror.float_col, type: FLOAT, data:
> 'err6.000000''
> E 'Error converting column: functional_json.alltypeserror.float_col,
> type: FLOAT, data: 'err9.000000'' != 'Error converting column:
> functional_json_gzip.alltypeserror.float_col, type: FLOAT, data:
> 'err9.000000''
> E 'Error converting column: functional_json.alltypeserror.float_col,
> type: FLOAT, data: 'xyz3.000000'' != 'Error converting column:
> functional_json_gzip.alltypeserror.float_col, type: FLOAT, data:
> 'xyz3.000000''
> E 'Error converting column: functional_json.alltypeserror.int_col, type:
> INT, data: 'abc5'' != 'Error converting column:
> functional_json_gzip.alltypeserror.int_col, type: INT, data: 'abc5''
> E 'Error converting column: functional_json.alltypeserror.int_col, type:
> INT, data: 'abc9'' != 'Error converting column:
> functional_json_gzip.alltypeserror.int_col, type: INT, data: 'abc9''
> E 'Error converting column: functional_json.alltypeserror.int_col, type:
> INT, data: 'err30'' != 'Error converting column:
> functional_json_gzip.alltypeserror.int_col, type: INT, data: 'err30''
> E 'Error converting column: functional_json.alltypeserror.int_col, type:
> INT, data: 'err4'' != 'Error converting column:
> functional_json_gzip.alltypeserror.int_col, type: INT, data: 'err4''
> E 'Error converting column: functional_json.alltypeserror.int_col, type:
> INT, data: 'err9'' != 'Error converting column:
> functional_json_gzip.alltypeserror.int_col, type: INT, data: 'err9''
> E 'Error converting column: functional_json.alltypeserror.smallint_col,
> type: SMALLINT, data: 'abc3'' != 'Error converting column:
> functional_json_gzip.alltypeserror.smallint_col, type: SMALLINT, data: 'abc3''
> E 'Error converting column: functional_json.alltypeserror.smallint_col,
> type: SMALLINT, data: 'err3'' != 'Error converting column:
> functional_json_gzip.alltypeserror.smallint_col, type: SMALLINT, data: 'err3''
> E 'Error converting column: functional_json.alltypeserror.smallint_col,
> type: SMALLINT, data: 'err30'' != 'Error converting column:
> functional_json_gzip.alltypeserror.smallint_col, type: SMALLINT, data:
> 'err30''
> E 'Error converting column: functional_json.alltypeserror.smallint_col,
> type: SMALLINT, data: 'err9'' != 'Error converting column:
> functional_json_gzip.alltypeserror.smallint_col, type: SMALLINT, data: 'err9''
> E 'Error converting column: functional_json.alltypeserror.timestamp_col,
> type: TIMESTAMP, data: '0'' != 'Error converting column:
> functional_json_gzip.alltypeserror.timestamp_col, type: TIMESTAMP, data: '0''
> E 'Error converting column: functional_json.alltypeserror.timestamp_col,
> type: TIMESTAMP, data: '0'' != 'Error converting column:
> functional_json_gzip.alltypeserror.timestamp_col, type: TIMESTAMP, data: '0''
> E 'Error converting column: functional_json.alltypeserror.timestamp_col,
> type: TIMESTAMP, data: '0000-01-01 00:00:00'' != 'Error converting column:
> functional_json_gzip.alltypeserror.timestamp_col, type: TIMESTAMP, data:
> '0000-01-01 00:00:00''
> E 'Error converting column: functional_json.alltypeserror.timestamp_col,
> type: TIMESTAMP, data: '0000-01-01 00:00:00'' != 'Error converting column:
> functional_json_gzip.alltypeserror.timestamp_col, type: TIMESTAMP, data:
> '0000-01-01 00:00:00''
> E 'Error converting column: functional_json.alltypeserror.timestamp_col,
> type: TIMESTAMP, data: '0009-01-01 00:00:00'' != 'Error converting column:
> functional_json_gzip.alltypeserror.timestamp_col, type: TIMESTAMP, data:
> '0009-01-01 00:00:00''
> E 'Error converting column: functional_json.alltypeserror.timestamp_col,
> type: TIMESTAMP, data: '1999-10-10 90:10:10'' != 'Error converting column:
> functional_json_gzip.alltypeserror.timestamp_col, type: TIMESTAMP, data:
> '1999-10-10 90:10:10''
> E 'Error converting column: functional_json.alltypeserror.timestamp_col,
> type: TIMESTAMP, data: '2002-14-10 00:00:00'' != 'Error converting column:
> functional_json_gzip.alltypeserror.timestamp_col, type: TIMESTAMP, data:
> '2002-14-10 00:00:00''
> E 'Error converting column: functional_json.alltypeserror.timestamp_col,
> type: TIMESTAMP, data: '2020-10-10 10:70:10.123'' != 'Error converting
> column: functional_json_gzip.alltypeserror.timestamp_col, type: TIMESTAMP,
> data: '2020-10-10 10:70:10.123''
> E 'Error converting column: functional_json.alltypeserror.timestamp_col,
> type: TIMESTAMP, data: '2020-10-10 60:10:10.123'' != 'Error converting
> column: functional_json_gzip.alltypeserror.timestamp_col, type: TIMESTAMP,
> data: '2020-10-10 60:10:10.123''
> E 'Error converting column: functional_json.alltypeserror.timestamp_col,
> type: TIMESTAMP, data: '2020-10-40 10:10:10.123'' != 'Error converting
> column: functional_json_gzip.alltypeserror.timestamp_col, type: TIMESTAMP,
> data: '2020-10-40 10:10:10.123''
> E 'Error converting column: functional_json.alltypeserror.timestamp_col,
> type: TIMESTAMP, data: '2020-20-10 10:10:10.123'' != 'Error converting
> column: functional_json_gzip.alltypeserror.timestamp_col, type: TIMESTAMP,
> data: '2020-20-10 10:10:10.123''
> E 'Error converting column: functional_json.alltypeserror.tinyint_col,
> type: TINYINT, data: 'abc7'' != 'Error converting column:
> functional_json_gzip.alltypeserror.tinyint_col, type: TINYINT, data: 'abc7''
> E 'Error converting column: functional_json.alltypeserror.tinyint_col,
> type: TINYINT, data: 'err2'' != 'Error converting column:
> functional_json_gzip.alltypeserror.tinyint_col, type: TINYINT, data: 'err2''
> E 'Error converting column: functional_json.alltypeserror.tinyint_col,
> type: TINYINT, data: 'err30'' != 'Error converting column:
> functional_json_gzip.alltypeserror.tinyint_col, type: TINYINT, data: 'err30''
> E 'Error converting column: functional_json.alltypeserror.tinyint_col,
> type: TINYINT, data: 'err9'' != 'Error converting column:
> functional_json_gzip.alltypeserror.tinyint_col, type: TINYINT, data: 'err9''
> E 'Error converting column: functional_json.alltypeserror.tinyint_col,
> type: TINYINT, data: 'xyz5'' != 'Error converting column:
> functional_json_gzip.alltypeserror.tinyint_col, type: TINYINT, data: 'xyz5''
> E row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=1/000000_0.gz,
> before offset: 393'
> E row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=1/000000_0.gz,
> before offset: 393'
> E row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=1/000000_0.gz,
> before offset: 393'
> E row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=1/000000_0.gz,
> before offset: 393'
> E row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=1/000000_0.gz,
> before offset: 393'
> E row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=1/000000_0.gz,
> before offset: 393'
> E row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=1/000000_0.gz,
> before offset: 393'
> E row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=1/000000_0.gz,
> before offset: 393'
> E row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=1/000000_0.gz,
> before offset: 393'
> E row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=2/000000_0.gz,
> before offset: 345'
> E row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=2/000000_0.gz,
> before offset: 345'
> E row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=2/000000_0.gz,
> before offset: 345'
> E row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=2/000000_0.gz,
> before offset: 345'
> E row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=3/000000_0.gz,
> before offset: 434'
> E row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=3/000000_0.gz,
> before offset: 434'
> E row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=3/000000_0.gz,
> before offset: 434'
> E row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=3/000000_0.gz,
> before offset: 434'
> E row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=3/000000_0.gz,
> before offset: 434'
> E row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=3/000000_0.gz,
> before offset: 434'
> E row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=3/000000_0.gz,
> before offset: 434'
> E row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before
> offset: \d+ == 'Error parsing row: file:
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=3/000000_0.gz,
> before offset: 434'
> {code}
> [~Eyizoha], you have been active in this area recently; could you perhaps
> take a look at the failure? Thanks a lot!
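
A reading aid for the log above. The row_regex lines are joined with == and appear to match, while every != line pairs a message naming functional_json with an otherwise identical one naming functional_json_gzip. The minimal Python sketch below checks both observations against strings copied from the log. The helper names are hypothetical and this is not the Impala test framework's own comparison code; using re.fullmatch for the row_regex check is an assumption.

```python
import re

# Pattern and message copied from the log above (mail line wraps rejoined).
ROW_REGEX = r".*Error parsing row: file: hdfs://localhost:20500/.* before offset: \d+"
ACTUAL_PARSE_ERROR = (
    "Error parsing row: file: "
    "hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/"
    "year=2009/month=1/000000_0.gz, before offset: 393"
)

# One of the '!=' pairs from the log: left-hand side, then right-hand side.
LEFT = ("Error converting column: functional_json.alltypeserror.tinyint_col, "
        "type: TINYINT, data: 'abc7'")
RIGHT = ("Error converting column: functional_json_gzip.alltypeserror.tinyint_col, "
         "type: TINYINT, data: 'abc7'")


def regex_line_matches(pattern: str, message: str) -> bool:
    """Hypothetical helper: True if the whole message matches the pattern."""
    return re.fullmatch(pattern, message) is not None


def differs_only_by_db(left: str, right: str, left_db: str, right_db: str) -> bool:
    """Hypothetical helper: True if the two messages are unequal but become
    equal once the database prefix in `left` is swapped for `right_db`."""
    swapped = left.replace(left_db + ".", right_db + ".")
    return left != right and swapped == right


if __name__ == "__main__":
    print(regex_line_matches(ROW_REGEX, ACTUAL_PARSE_ERROR))
    print(differs_only_by_db(LEFT, RIGHT, "functional_json", "functional_json_gzip"))
```

If both checks hold for the remaining pairs as well, the mismatches reduce to the database name embedded in the messages rather than to the parse errors themselves.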