[ https://issues.apache.org/jira/browse/IMPALA-12740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17809264#comment-17809264 ]

Ye Zihao commented on IMPALA-12740:
-----------------------------------

Reviewed on: https://gerrit.cloudera.org/#/c/20931/

> TestHdfsJsonScanNodeErrors fails in exhaustive mode
> ---------------------------------------------------
>
>                 Key: IMPALA-12740
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12740
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 4.4.0
>            Reporter: Laszlo Gaal
>            Assignee: Ye Zihao
>            Priority: Blocker
>              Labels: broken-build
>
> data_errors.test_data_errors.TestHdfsJsonScanNodeErrors fails when tests are
> run in exhaustive mode. Both debug and release builds exhibit the same
> symptoms.
> The run log:
> {code}
> FAIL 
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
>  beeswax | exec_option: {'test_replan': 1, 'batch_size': 1, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> json/snap/block]
> FAIL 
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
>  beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> json/snap/block]
> FAIL 
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
>  beeswax | exec_option: {'test_replan': 1, 'batch_size': 1, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> json/snap/block]
> FAIL 
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
>  beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> json/snap/block]
> FAIL 
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
>  beeswax | exec_option: {'test_replan': 1, 'batch_size': 1, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> json/def/block]
> FAIL 
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
>  beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> json/def/block]
> FAIL 
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
>  beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> json/def/block]
> FAIL 
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
>  beeswax | exec_option: {'test_replan': 1, 'batch_size': 1, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> json/def/block]
> FAIL 
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
>  beeswax | exec_option: {'test_replan': 1, 'batch_size': 1, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> json/bzip/block]
> FAIL 
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
>  beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> json/bzip/block]
> FAIL 
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
>  beeswax | exec_option: {'test_replan': 1, 'batch_size': 1, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> json/gzip/block]
> FAIL 
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
>  beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> json/gzip/block]
> FAIL 
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
>  beeswax | exec_option: {'test_replan': 1, 'batch_size': 1, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> json/bzip/block]
> FAIL 
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
>  beeswax | exec_option: {'test_replan': 1, 'batch_size': 1, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> json/gzip/block]
> FAIL 
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
>  beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> json/bzip/block]
> FAIL 
> data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:
>  beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> json/gzip/block]
> {code}
> stderr shows that the expected and actual conversion-error messages differ:
> {code}
> SET 
> client_identifier=data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:beeswax|exec_option:{'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':True;'abort_on_error':1;'exec_;
> -- executing against localhost:21000
> use functional_json_snap;
> -- 2024-01-19 23:30:10,326 INFO     MainThread: Started query 
> 824002d0b11e7ba1:2e75861b00000000
> SET 
> client_identifier=data_errors/test_data_errors.py::TestHdfsJsonScanNodeErrors::()::test_hdfs_json_scan_node_errors[protocol:beeswax|exec_option:{'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':True;'abort_on_error':1;'exec_;
> SET test_replan=1;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=True;
> SET abort_on_error=0;
> SET exec_single_node_rows_threshold=0;
> -- 2024-01-19 23:30:10,327 INFO     MainThread: Loading query test file: 
> /data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-json-scan-node-errors.test
> -- executing against localhost:21000
> select * from alltypeserror order by id;
> -- 2024-01-19 23:30:14,212 INFO     MainThread: Started query 
> d54d7edb55f53beb:dbbb778a00000000
> -- 2024-01-19 23:30:14,295 ERROR    MainThread: Comparing QueryTestResults 
> (expected vs actual):
> 'Error converting column: functional_json.alltypeserror.bigint_col, type: 
> BIGINT, data: 'err300'' != 'Error converting column: 
> functional_json_snap.alltypeserror.bigint_col, type: BIGINT, data: 'err300''
> 'Error converting column: functional_json.alltypeserror.bigint_col, type: 
> BIGINT, data: 'err50'' != 'Error converting column: 
> functional_json_snap.alltypeserror.bigint_col, type: BIGINT, data: 'err50''
> 'Error converting column: functional_json.alltypeserror.bigint_col, type: 
> BIGINT, data: 'err90'' != 'Error converting column: 
> functional_json_snap.alltypeserror.bigint_col, type: BIGINT, data: 'err90''
> 'Error converting column: functional_json.alltypeserror.bool_col, type: 
> BOOLEAN, data: 'errfalse'' != 'Error converting column: 
> functional_json_snap.alltypeserror.bool_col, type: BOOLEAN, data: 'errfalse''
> 'Error converting column: functional_json.alltypeserror.bool_col, type: 
> BOOLEAN, data: 'errtrue'' != 'Error converting column: 
> functional_json_snap.alltypeserror.bool_col, type: BOOLEAN, data: 'errtrue''
> 'Error converting column: functional_json.alltypeserror.bool_col, type: 
> BOOLEAN, data: 't\rue'' != 'Error converting column: 
> functional_json_snap.alltypeserror.bool_col, type: BOOLEAN, data: 't\rue''
> 'Error converting column: functional_json.alltypeserror.double_col, type: 
> DOUBLE, data: 'err300.900000'' != 'Error converting column: 
> functional_json_snap.alltypeserror.double_col, type: DOUBLE, data: 
> 'err300.900000''
> 'Error converting column: functional_json.alltypeserror.double_col, type: 
> DOUBLE, data: 'err70.700000'' != 'Error converting column: 
> functional_json_snap.alltypeserror.double_col, type: DOUBLE, data: 
> 'err70.700000''
> 'Error converting column: functional_json.alltypeserror.double_col, type: 
> DOUBLE, data: 'err90.900000'' != 'Error converting column: 
> functional_json_snap.alltypeserror.double_col, type: DOUBLE, data: 
> 'err90.900000''
> 'Error converting column: functional_json.alltypeserror.double_col, type: 
> DOUBLE, data: 'xyz30.300000'' != 'Error converting column: 
> functional_json_snap.alltypeserror.double_col, type: DOUBLE, data: 
> 'xyz30.300000''
> 'Error converting column: functional_json.alltypeserror.double_col, type: 
> DOUBLE, data: 'xyz70.700000'' != 'Error converting column: 
> functional_json_snap.alltypeserror.double_col, type: DOUBLE, data: 
> 'xyz70.700000''
> 'Error converting column: functional_json.alltypeserror.float_col, type: 
> FLOAT, data: 'err30..000000'' != 'Error converting column: 
> functional_json_snap.alltypeserror.float_col, type: FLOAT, data: 
> 'err30..000000''
> 'Error converting column: functional_json.alltypeserror.float_col, type: 
> FLOAT, data: 'err6.000000'' != 'Error converting column: 
> functional_json_snap.alltypeserror.float_col, type: FLOAT, data: 
> 'err6.000000''
> 'Error converting column: functional_json.alltypeserror.float_col, type: 
> FLOAT, data: 'err9.000000'' != 'Error converting column: 
> functional_json_snap.alltypeserror.float_col, type: FLOAT, data: 
> 'err9.000000''
> 'Error converting column: functional_json.alltypeserror.float_col, type: 
> FLOAT, data: 'xyz3.000000'' != 'Error converting column: 
> functional_json_snap.alltypeserror.float_col, type: FLOAT, data: 
> 'xyz3.000000''
> 'Error converting column: functional_json.alltypeserror.int_col, type: INT, 
> data: 'abc5'' != 'Error converting column: 
> functional_json_snap.alltypeserror.int_col, type: INT, data: 'abc5''
> 'Error converting column: functional_json.alltypeserror.int_col, type: INT, 
> data: 'abc9'' != 'Error converting column: 
> functional_json_snap.alltypeserror.int_col, type: INT, data: 'abc9''
> 'Error converting column: functional_json.alltypeserror.int_col, type: INT, 
> data: 'err30'' != 'Error converting column: 
> functional_json_snap.alltypeserror.int_col, type: INT, data: 'err30''
> 'Error converting column: functional_json.alltypeserror.int_col, type: INT, 
> data: 'err4'' != 'Error converting column: 
> functional_json_snap.alltypeserror.int_col, type: INT, data: 'err4''
> 'Error converting column: functional_json.alltypeserror.int_col, type: INT, 
> data: 'err9'' != 'Error converting column: 
> functional_json_snap.alltypeserror.int_col, type: INT, data: 'err9''
> 'Error converting column: functional_json.alltypeserror.smallint_col, type: 
> SMALLINT, data: 'abc3'' != 'Error converting column: 
> functional_json_snap.alltypeserror.smallint_col, type: SMALLINT, data: 'abc3''
> 'Error converting column: functional_json.alltypeserror.smallint_col, type: 
> SMALLINT, data: 'err3'' != 'Error converting column: 
> functional_json_snap.alltypeserror.smallint_col, type: SMALLINT, data: 'err3''
> 'Error converting column: functional_json.alltypeserror.smallint_col, type: 
> SMALLINT, data: 'err30'' != 'Error converting column: 
> functional_json_snap.alltypeserror.smallint_col, type: SMALLINT, data: 
> 'err30''
> 'Error converting column: functional_json.alltypeserror.smallint_col, type: 
> SMALLINT, data: 'err9'' != 'Error converting column: 
> functional_json_snap.alltypeserror.smallint_col, type: SMALLINT, data: 'err9''
> 'Error converting column: functional_json.alltypeserror.timestamp_col, type: 
> TIMESTAMP, data: '0'' != 'Error converting column: 
> functional_json_snap.alltypeserror.timestamp_col, type: TIMESTAMP, data: '0''
> 'Error converting column: functional_json.alltypeserror.timestamp_col, type: 
> TIMESTAMP, data: '0'' != 'Error converting column: 
> functional_json_snap.alltypeserror.timestamp_col, type: TIMESTAMP, data: '0''
> 'Error converting column: functional_json.alltypeserror.timestamp_col, type: 
> TIMESTAMP, data: '0000-01-01 00:00:00'' != 'Error converting column: 
> functional_json_snap.alltypeserror.timestamp_col, type: TIMESTAMP, data: 
> '0000-01-01 00:00:00''
> 'Error converting column: functional_json.alltypeserror.timestamp_col, type: 
> TIMESTAMP, data: '0000-01-01 00:00:00'' != 'Error converting column: 
> functional_json_snap.alltypeserror.timestamp_col, type: TIMESTAMP, data: 
> '0000-01-01 00:00:00''
> 'Error converting column: functional_json.alltypeserror.timestamp_col, type: 
> TIMESTAMP, data: '0009-01-01 00:00:00'' != 'Error converting column: 
> functional_json_snap.alltypeserror.timestamp_col, type: TIMESTAMP, data: 
> '0009-01-01 00:00:00''
> 'Error converting column: functional_json.alltypeserror.timestamp_col, type: 
> TIMESTAMP, data: '1999-10-10 90:10:10'' != 'Error converting column: 
> functional_json_snap.alltypeserror.timestamp_col, type: TIMESTAMP, data: 
> '1999-10-10 90:10:10''
> 'Error converting column: functional_json.alltypeserror.timestamp_col, type: 
> TIMESTAMP, data: '2002-14-10 00:00:00'' != 'Error converting column: 
> functional_json_snap.alltypeserror.timestamp_col, type: TIMESTAMP, data: 
> '2002-14-10 00:00:00''
> 'Error converting column: functional_json.alltypeserror.timestamp_col, type: 
> TIMESTAMP, data: '2020-10-10 10:70:10.123'' != 'Error converting column: 
> functional_json_snap.alltypeserror.timestamp_col, type: TIMESTAMP, data: 
> '2020-10-10 10:70:10.123''
> 'Error converting column: functional_json.alltypeserror.timestamp_col, type: 
> TIMESTAMP, data: '2020-10-10 60:10:10.123'' != 'Error converting column: 
> functional_json_snap.alltypeserror.timestamp_col, type: TIMESTAMP, data: 
> '2020-10-10 60:10:10.123''
> 'Error converting column: functional_json.alltypeserror.timestamp_col, type: 
> TIMESTAMP, data: '2020-10-40 10:10:10.123'' != 'Error converting column: 
> functional_json_snap.alltypeserror.timestamp_col, type: TIMESTAMP, data: 
> '2020-10-40 10:10:10.123''
> 'Error converting column: functional_json.alltypeserror.timestamp_col, type: 
> TIMESTAMP, data: '2020-20-10 10:10:10.123'' != 'Error converting column: 
> functional_json_snap.alltypeserror.timestamp_col, type: TIMESTAMP, data: 
> '2020-20-10 10:10:10.123''
> 'Error converting column: functional_json.alltypeserror.tinyint_col, type: 
> TINYINT, data: 'abc7'' != 'Error converting column: 
> functional_json_snap.alltypeserror.tinyint_col, type: TINYINT, data: 'abc7''
> 'Error converting column: functional_json.alltypeserror.tinyint_col, type: 
> TINYINT, data: 'err2'' != 'Error converting column: 
> functional_json_snap.alltypeserror.tinyint_col, type: TINYINT, data: 'err2''
> 'Error converting column: functional_json.alltypeserror.tinyint_col, type: 
> TINYINT, data: 'err30'' != 'Error converting column: 
> functional_json_snap.alltypeserror.tinyint_col, type: TINYINT, data: 'err30''
> 'Error converting column: functional_json.alltypeserror.tinyint_col, type: 
> TINYINT, data: 'err9'' != 'Error converting column: 
> functional_json_snap.alltypeserror.tinyint_col, type: TINYINT, data: 'err9''
> 'Error converting column: functional_json.alltypeserror.tinyint_col, type: 
> TINYINT, data: 'xyz5'' != 'Error converting column: 
> functional_json_snap.alltypeserror.tinyint_col, type: TINYINT, data: 'xyz5''
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=1/000000_0.snappy,
>  before offset: 648'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=1/000000_0.snappy,
>  before offset: 648'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=1/000000_0.snappy,
>  before offset: 648'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=1/000000_0.snappy,
>  before offset: 648'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=1/000000_0.snappy,
>  before offset: 648'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=1/000000_0.snappy,
>  before offset: 648'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=1/000000_0.snappy,
>  before offset: 648'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=1/000000_0.snappy,
>  before offset: 648'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=1/000000_0.snappy,
>  before offset: 648'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=2/000000_0.snappy,
>  before offset: 572'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=2/000000_0.snappy,
>  before offset: 572'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=2/000000_0.snappy,
>  before offset: 572'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=2/000000_0.snappy,
>  before offset: 572'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=3/000000_0.snappy,
>  before offset: 725'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=3/000000_0.snappy,
>  before offset: 725'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=3/000000_0.snappy,
>  before offset: 725'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=3/000000_0.snappy,
>  before offset: 725'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=3/000000_0.snappy,
>  before offset: 725'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=3/000000_0.snappy,
>  before offset: 725'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=3/000000_0.snappy,
>  before offset: 725'
> row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_snap/year=2009/month=3/000000_0.snappy,
>  before offset: 725'
> {code}
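> The diffs above all follow the same pattern: the expected messages reference
> the functional_json database, while the exhaustive run executes the same
> .test file against the per-format databases (functional_json_snap,
> functional_json_gzip, and so on). One way to make the expected ERRORS section
> format-agnostic would be a row_regex pattern on the database name, which the
> verifier already supports for the parsing-row errors above. A sketch only,
> not the actual fix (the hypothetical patterns below assume row_regex lines in
> ERRORS sections match whole error lines, as they do for the parsing errors):
> {code}
> ---- ERRORS
> row_regex: Error converting column: functional_json\w*\.alltypeserror\.bigint_col, type: BIGINT, data: 'err300'
> row_regex: Error converting column: functional_json\w*\.alltypeserror\.bool_col, type: BOOLEAN, data: 'errtrue'
> {code}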
> Similar failures are reported for all failing test cases. Here is an example
> for json/gzip/block:
> {code}
> data_errors/test_data_errors.py:167: in test_hdfs_json_scan_node_errors
>     self.run_test_case('DataErrorsTest/hdfs-json-scan-node-errors', vector)
> common/impala_test_suite.py:756: in run_test_case
>     self.__verify_results_and_errors(vector, test_section, result, use_db)
> common/impala_test_suite.py:589: in __verify_results_and_errors
>     replace_filenames_with_placeholder)
> common/test_result_verifier.py:396: in verify_raw_results
>     verify_errors(expected_errors, actual_errors)
> common/test_result_verifier.py:339: in verify_errors
>     VERIFIER_MAP['VERIFY_IS_EQUAL'](expected, actual)
> common/test_result_verifier.py:296: in verify_query_result_is_equal
>     assert expected_results == actual_results
> E   assert Comparing QueryTestResults (expected vs actual):
> E     'Error converting column: functional_json.alltypeserror.bigint_col, 
> type: BIGINT, data: 'err300'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.bigint_col, type: BIGINT, data: 'err300''
> E     'Error converting column: functional_json.alltypeserror.bigint_col, 
> type: BIGINT, data: 'err50'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.bigint_col, type: BIGINT, data: 'err50''
> E     'Error converting column: functional_json.alltypeserror.bigint_col, 
> type: BIGINT, data: 'err90'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.bigint_col, type: BIGINT, data: 'err90''
> E     'Error converting column: functional_json.alltypeserror.bool_col, type: 
> BOOLEAN, data: 'errfalse'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.bool_col, type: BOOLEAN, data: 'errfalse''
> E     'Error converting column: functional_json.alltypeserror.bool_col, type: 
> BOOLEAN, data: 'errtrue'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.bool_col, type: BOOLEAN, data: 'errtrue''
> E     'Error converting column: functional_json.alltypeserror.bool_col, type: 
> BOOLEAN, data: 't\rue'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.bool_col, type: BOOLEAN, data: 't\rue''
> E     'Error converting column: functional_json.alltypeserror.double_col, 
> type: DOUBLE, data: 'err300.900000'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.double_col, type: DOUBLE, data: 
> 'err300.900000''
> E     'Error converting column: functional_json.alltypeserror.double_col, 
> type: DOUBLE, data: 'err70.700000'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.double_col, type: DOUBLE, data: 
> 'err70.700000''
> E     'Error converting column: functional_json.alltypeserror.double_col, 
> type: DOUBLE, data: 'err90.900000'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.double_col, type: DOUBLE, data: 
> 'err90.900000''
> E     'Error converting column: functional_json.alltypeserror.double_col, 
> type: DOUBLE, data: 'xyz30.300000'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.double_col, type: DOUBLE, data: 
> 'xyz30.300000''
> E     'Error converting column: functional_json.alltypeserror.double_col, 
> type: DOUBLE, data: 'xyz70.700000'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.double_col, type: DOUBLE, data: 
> 'xyz70.700000''
> E     'Error converting column: functional_json.alltypeserror.float_col, 
> type: FLOAT, data: 'err30..000000'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.float_col, type: FLOAT, data: 
> 'err30..000000''
> E     'Error converting column: functional_json.alltypeserror.float_col, 
> type: FLOAT, data: 'err6.000000'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.float_col, type: FLOAT, data: 
> 'err6.000000''
> E     'Error converting column: functional_json.alltypeserror.float_col, 
> type: FLOAT, data: 'err9.000000'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.float_col, type: FLOAT, data: 
> 'err9.000000''
> E     'Error converting column: functional_json.alltypeserror.float_col, 
> type: FLOAT, data: 'xyz3.000000'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.float_col, type: FLOAT, data: 
> 'xyz3.000000''
> E     'Error converting column: functional_json.alltypeserror.int_col, type: 
> INT, data: 'abc5'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.int_col, type: INT, data: 'abc5''
> E     'Error converting column: functional_json.alltypeserror.int_col, type: 
> INT, data: 'abc9'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.int_col, type: INT, data: 'abc9''
> E     'Error converting column: functional_json.alltypeserror.int_col, type: 
> INT, data: 'err30'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.int_col, type: INT, data: 'err30''
> E     'Error converting column: functional_json.alltypeserror.int_col, type: 
> INT, data: 'err4'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.int_col, type: INT, data: 'err4''
> E     'Error converting column: functional_json.alltypeserror.int_col, type: 
> INT, data: 'err9'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.int_col, type: INT, data: 'err9''
> E     'Error converting column: functional_json.alltypeserror.smallint_col, 
> type: SMALLINT, data: 'abc3'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.smallint_col, type: SMALLINT, data: 'abc3''
> E     'Error converting column: functional_json.alltypeserror.smallint_col, 
> type: SMALLINT, data: 'err3'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.smallint_col, type: SMALLINT, data: 'err3''
> E     'Error converting column: functional_json.alltypeserror.smallint_col, 
> type: SMALLINT, data: 'err30'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.smallint_col, type: SMALLINT, data: 
> 'err30''
> E     'Error converting column: functional_json.alltypeserror.smallint_col, 
> type: SMALLINT, data: 'err9'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.smallint_col, type: SMALLINT, data: 'err9''
> E     'Error converting column: functional_json.alltypeserror.timestamp_col, 
> type: TIMESTAMP, data: '0'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.timestamp_col, type: TIMESTAMP, data: '0''
> E     'Error converting column: functional_json.alltypeserror.timestamp_col, 
> type: TIMESTAMP, data: '0'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.timestamp_col, type: TIMESTAMP, data: '0''
> E     'Error converting column: functional_json.alltypeserror.timestamp_col, 
> type: TIMESTAMP, data: '0000-01-01 00:00:00'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.timestamp_col, type: TIMESTAMP, data: 
> '0000-01-01 00:00:00''
> E     'Error converting column: functional_json.alltypeserror.timestamp_col, 
> type: TIMESTAMP, data: '0000-01-01 00:00:00'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.timestamp_col, type: TIMESTAMP, data: 
> '0000-01-01 00:00:00''
> E     'Error converting column: functional_json.alltypeserror.timestamp_col, 
> type: TIMESTAMP, data: '0009-01-01 00:00:00'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.timestamp_col, type: TIMESTAMP, data: 
> '0009-01-01 00:00:00''
> E     'Error converting column: functional_json.alltypeserror.timestamp_col, 
> type: TIMESTAMP, data: '1999-10-10 90:10:10'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.timestamp_col, type: TIMESTAMP, data: 
> '1999-10-10 90:10:10''
> E     'Error converting column: functional_json.alltypeserror.timestamp_col, 
> type: TIMESTAMP, data: '2002-14-10 00:00:00'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.timestamp_col, type: TIMESTAMP, data: 
> '2002-14-10 00:00:00''
> E     'Error converting column: functional_json.alltypeserror.timestamp_col, 
> type: TIMESTAMP, data: '2020-10-10 10:70:10.123'' != 'Error converting 
> column: functional_json_gzip.alltypeserror.timestamp_col, type: TIMESTAMP, 
> data: '2020-10-10 10:70:10.123''
> E     'Error converting column: functional_json.alltypeserror.timestamp_col, 
> type: TIMESTAMP, data: '2020-10-10 60:10:10.123'' != 'Error converting 
> column: functional_json_gzip.alltypeserror.timestamp_col, type: TIMESTAMP, 
> data: '2020-10-10 60:10:10.123''
> E     'Error converting column: functional_json.alltypeserror.timestamp_col, 
> type: TIMESTAMP, data: '2020-10-40 10:10:10.123'' != 'Error converting 
> column: functional_json_gzip.alltypeserror.timestamp_col, type: TIMESTAMP, 
> data: '2020-10-40 10:10:10.123''
> E     'Error converting column: functional_json.alltypeserror.timestamp_col, 
> type: TIMESTAMP, data: '2020-20-10 10:10:10.123'' != 'Error converting 
> column: functional_json_gzip.alltypeserror.timestamp_col, type: TIMESTAMP, 
> data: '2020-20-10 10:10:10.123''
> E     'Error converting column: functional_json.alltypeserror.tinyint_col, 
> type: TINYINT, data: 'abc7'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.tinyint_col, type: TINYINT, data: 'abc7''
> E     'Error converting column: functional_json.alltypeserror.tinyint_col, 
> type: TINYINT, data: 'err2'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.tinyint_col, type: TINYINT, data: 'err2''
> E     'Error converting column: functional_json.alltypeserror.tinyint_col, 
> type: TINYINT, data: 'err30'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.tinyint_col, type: TINYINT, data: 'err30''
> E     'Error converting column: functional_json.alltypeserror.tinyint_col, 
> type: TINYINT, data: 'err9'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.tinyint_col, type: TINYINT, data: 'err9''
> E     'Error converting column: functional_json.alltypeserror.tinyint_col, 
> type: TINYINT, data: 'xyz5'' != 'Error converting column: 
> functional_json_gzip.alltypeserror.tinyint_col, type: TINYINT, data: 'xyz5''
> E     row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=1/000000_0.gz,
>  before offset: 393'
> E     row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=1/000000_0.gz,
>  before offset: 393'
> E     row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=1/000000_0.gz,
>  before offset: 393'
> E     row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=1/000000_0.gz,
>  before offset: 393'
> E     row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=1/000000_0.gz,
>  before offset: 393'
> E     row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=1/000000_0.gz,
>  before offset: 393'
> E     row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=1/000000_0.gz,
>  before offset: 393'
> E     row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=1/000000_0.gz,
>  before offset: 393'
> E     row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=1/000000_0.gz,
>  before offset: 393'
> E     row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=2/000000_0.gz,
>  before offset: 345'
> E     row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=2/000000_0.gz,
>  before offset: 345'
> E     row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=2/000000_0.gz,
>  before offset: 345'
> E     row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=2/000000_0.gz,
>  before offset: 345'
> E     row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=3/000000_0.gz,
>  before offset: 434'
> E     row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=3/000000_0.gz,
>  before offset: 434'
> E     row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=3/000000_0.gz,
>  before offset: 434'
> E     row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=3/000000_0.gz,
>  before offset: 434'
> E     row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=3/000000_0.gz,
>  before offset: 434'
> E     row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=3/000000_0.gz,
>  before offset: 434'
> E     row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=3/000000_0.gz,
>  before offset: 434'
> E     row_regex: .*Error parsing row: file: hdfs://localhost:20500/.* before 
> offset: \d+ == 'Error parsing row: file: 
> hdfs://localhost:20500/test-warehouse/alltypeserror_json_gzip/year=2009/month=3/000000_0.gz,
>  before offset: 434'
> {code}
> [~Eyizoha], you have been active in this area recently; could you perhaps 
> take a look at the failure? Thanks a lot!


