[jira] [Comment Edited] (IMPALA-10501) Hit DCHECK in parquet-column-readers.cc: def_levels_.CacheRemaining() <= num_buffered_values_

Jira Tue, 16 Feb 2021 10:01:05 -0800


    [ 
https://issues.apache.org/jira/browse/IMPALA-10501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17285385#comment-17285385
 ]


Zoltán Borók-Nagy edited comment on IMPALA-10501 at 2/16/21, 6:00 PM:
----------------------------------------------------------------------

Looping load_nested and running the query didn't help. However, I was able to 
reproduce the error on a hacked version of Impala.

I added the following code to HdfsParquetScanner::EvaluatePageIndex():
{noformat}
    skip_ranges.clear();
    const int RANGES = 5;
    int64_t num_vals = row_group.num_rows;
    int64_t range_length = num_vals / RANGES;
    for (int i = 0; i < RANGES; ++i) {
      int64_t range_start = i * range_length + range_length / 2;
      int64_t range_end = range_start + range_length / 2;
      skip_ranges.push_back({range_start, range_end});
    }{noformat}
The above code adds a couple of "skip ranges" that are unaligned with the 
Parquet pages. This reproduced the error and I was able to create a fix: 
[https://gerrit.cloudera.org/#/c/17071/]


was (Author: boroknagyz):
Looping load_nested and running the query didn't help. However, I was able to 
reproduce the error on a hacked version of Impala.

I added the following code to HdfsParquetScanner::EvaluatePageIndex():

 
{noformat}
    skip_ranges.clear();
    const int RANGES = 5;
    int64_t num_vals = row_group.num_rows;
    int64_t range_length = num_vals / RANGES;
    for (int i = 0; i < RANGES; ++i) {
      int64_t range_start = i * range_length + range_length / 2;
      int64_t range_end = range_start + range_length / 2;
      skip_ranges.push_back({range_start, range_end});
    }{noformat}
The above code adds a couple of "skip ranges" that are unaligned with the 
Parquet pages. This reproduced the error and I was able to create a fix: 
[https://gerrit.cloudera.org/#/c/17071/]

 

> Hit DCHECK in parquet-column-readers.cc:  def_levels_.CacheRemaining() <= 
> num_buffered_values_
> ----------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-10501
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10501
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 4.0
>            Reporter: Tim Armstrong
>            Assignee: Zoltán Borók-Nagy
>            Priority: Blocker
>              Labels: broken-build, crash, flaky, parquet
>         Attachments: consoleText.3.gz, impalad_coord_exec-0.tar.gz
>
>
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/3814/
> {noformat}
> F0211 03:55:26.383247 14487 parquet-column-readers.cc:517] 
> be46bb72819942fd:85934edd00000001] Check failed: def_levels_.CacheRemaining() 
> <= num_buffered_values_ (921 vs. 916) 
> *** Check failure stack trace: ***
>     @          0x53646ec  google::LogMessage::Fail()
>     @          0x5365fdc  google::LogMessage::SendToLog()
>     @          0x536404a  google::LogMessage::Flush()
>     @          0x5367c48  google::LogMessageFatal::~LogMessageFatal()
>     @          0x2ff886f  
> impala::ScalarColumnReader<>::MaterializeValueBatch<>()
>     @          0x2f8ae44  
> impala::ScalarColumnReader<>::MaterializeValueBatch<>()
>     @          0x2f761bf  impala::ScalarColumnReader<>::ReadValueBatch<>()
>     @          0x2f2889a  impala::ScalarColumnReader<>::ReadValueBatch()
>     @          0x2ebd8c0  impala::HdfsParquetScanner::AssembleRows()
>     @          0x2eb882e  impala::HdfsParquetScanner::GetNextInternal()
>     @          0x2eb67bd  impala::HdfsParquetScanner::ProcessSplit()
>     @          0x2aaf3f2  impala::HdfsScanNode::ProcessSplit()
>     @          0x2aae773  impala::HdfsScanNode::ScannerThread()
>     @          0x2aadadb  
> _ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_18ThreadResourcePoolEENKUlvE_clEv
>     @          0x2aafe94  
> _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_18ThreadResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE
>     @          0x220e331  boost::function0<>::operator()()
>     @          0x2842e7f  impala::Thread::SuperviseThread()
>     @          0x284ae1c  boost::_bi::list5<>::operator()<>()
>     @          0x284ad40  boost::_bi::bind_t<>::operator()()
>     @          0x284ad01  boost::detail::thread_data<>::run()
>     @          0x406b291  thread_proxy
>     @     0x7f2465cba6b9  start_thread
>     @     0x7f24627e64dc  clone
> rImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431)
>       at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
>       at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
>       at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
>       at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
>       at com.sun.proxy.$Proxy10.getBlockLocations(Unknown Source)
>       at 
> org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:866)
> {noformat}
> It was likely a fuzz test:
> {noformat}
> 19:55:23 
> query_test/test_mem_usage_scaling.py::TestTpchMemLimitError::test_low_mem_limit_q22[mem_limit:
>  50 | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> parquet/none] 
> 19:55:23 [gw5] PASSED 
> query_test/test_mem_usage_scaling.py::TestTpchMemLimitError::test_low_mem_limit_q22[mem_limit:
>  50 | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> parquet/none] 
> 19:55:23 
> query_test/test_mem_usage_scaling.py::TestTpchMemLimitError::test_low_mem_limit_q22[mem_limit:
>  80 | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> parquet/none] 
> 19:55:25 [gw2] PASSED 
> query_test/test_queries.py::TestPartitionKeyScans::test_partition_key_scans[protocol:
>  beeswax | exec_option: {'mt_dop': 0, 'exec_single_node_rows_threshold': 0} | 
> table_format: parquet/none] 
> 19:55:25 
> query_test/test_queries.py::TestPartitionKeyScans::test_partition_key_scans[protocol:
>  beeswax | exec_option: {'mt_dop': 1, 'exec_single_node_rows_threshold': 0} | 
> table_format: avro/snap/block] 
> 19:55:26 [gw5] PASSED 
> query_test/test_mem_usage_scaling.py::TestTpchMemLimitError::test_low_mem_limit_q22[mem_limit:
>  80 | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> parquet/none] 
> 19:55:26 
> query_test/test_mem_usage_scaling.py::TestTpchMemLimitError::test_low_mem_limit_q22[mem_limit:
>  130 | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> parquet/none] 
> 19:55:26 [gw6] PASSED 
> query_test/test_scanners.py::TestIceberg::test_iceberg_profile[protocol: 
> beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': 
> 0} | table_format: parquet/none] 
> 19:55:26 
> query_test/test_scanners.py::TestIceberg::test_iceberg_profile[protocol: 
> beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 
> 'abort_on_error': 1, 'debug_action': 
> '-1:OPEN:[email protected]', 
> 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] 
> 19:55:27 [gw3] FAILED 
> query_test/test_decimal_fuzz.py::TestDecimalFuzz::test_decimal_ops[exec_option:
>  {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 
> 'disable_codegen': False, 'abort_on_error': 1, 
> 'exec_single_node_rows_threshold': 0}] 
> 19:55:28 
> query_test/test_decimal_fuzz.py::TestDecimalFuzz::test_width_bucket[exec_option:
>  {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 
> 'disable_codegen': False, 'abort_on_error': 1, 
> 'exec_single_node_rows_threshold': 0}] 
> 19:55:28 [gw3] FAILED 
> query_test/test_decimal_fuzz.py::TestDecimalFuzz::test_width_bucket[exec_option:
>  {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 
> 'disable_codegen': False, 'abort_on_error': 1, 
> 'exec_single_node_rows_threshold': 0}] 
> 19:55:28 
> query_test/test_decimal_queries.py::TestDecimalQueries::test_queries[protocol:
>  beeswax | exec_option: {'disable_codegen_rows_threshold': 0, 
> 'disable_codegen': 'false', 'decimal_v2': 'false', 'batch_size': 0} | 
> table_format: text/none] 
> 19:55:28 [gw6] ERROR 
> query_test/test_scanners.py::TestIceberg::test_iceberg_profile[protocol: 
> beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 
> 'abort_on_error': 1, 'debug_action': 
> '-1:OPEN:[email protected]', 
> 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] 
> 19:55:28 [gw8] FAILED 
> query_test/test_join_queries.py::TestJoinQueries::test_empty_build_joins[protocol:
>  beeswax | table_format: parquet/none | exec_option: {'batch_size': 0, 
> 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': 
> False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | 
> enable_outer_join_to_inner_transformation: false | batch_size: 0 | mt_dop: 0] 
> 19:55:28 [gw13] FAILED 
> query_test/test_parquet_stats.py::TestParquetStats::test_page_index[mt_dop: 0 
> | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> parquet/none] 
> 19:55:28 [gw12] FAILED 
> query_test/test_runtime_filters.py::TestRuntimeRowFilters::test_row_filters[mt_dop:
>  0 | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> parquet/none] 
> 19:55:28 
> query_test/test_runtime_filters.py::TestRuntimeRowFilters::test_row_filters[mt_dop:
>  4 | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> parquet/none] 
> 19:55:28 
> query_test/test_join_queries.py::TestJoinQueries::test_empty_build_joins[protocol:
>  beeswax | table_format: parquet/none | exec_option: {'batch_size': 0, 
> 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': 
> False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | 
> enable_outer_join_to_inner_transformation: true | batch_size: 0 | mt_dop: 4] 
> 19:55:28 [gw1] FAILED 
> query_test/test_scanners.py::TestScannersAllTableFormatsWithLimit::test_limit[mt_dop:
>  0 | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> parquet/none] 
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Comment Edited] (IMPALA-10501) Hit DCHECK in parquet-column-readers.cc: def_levels_.CacheRemaining() <= num_buffered_values_

Reply via email to