[
https://issues.apache.org/jira/browse/IMPALA-10501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17288936#comment-17288936
]
ASF subversion and git services commented on IMPALA-10501:
----------------------------------------------------------
Commit 89a5fc789cb7cdbab08e331c4c95d9cd4eb30b8d in impala's branch
refs/heads/master from Zoltan Borok-Nagy
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=89a5fc7 ]
IMPALA-10501: Hit DCHECK in parquet-column-readers.cc:
def_levels_.CacheRemaining() <= num_buffered_values_
We had a DCHECK in ScalarColumnReader::MaterializeValueBatch() that
checked that 'num_buffered_values_' is greater than or equal to the
number of cached values in the Parquet definition level decoder.
In SkipTopLevelRows() we used decoder.ReadLevel(), which could load
the decoder's cache with more values than the actual value count.
This is because literal runs are bit-packed in groups of 8, so there
might be padding zeros at the end of a run.
Instead, we now fill the decoder's cache with CacheNextBatch(num_vals),
which never loads more values than the actual value count.
Testing
* Before this patch, TestParquetStats::test_page_index was flaky
  because of this issue.
* I tested the fix on a hacked Impala that randomly generated
  skip ranges.
Change-Id: Ic071473e7b315300fd5e163225d3e39735f09c4f
Reviewed-on: http://gerrit.cloudera.org:8080/17071
Reviewed-by: Zoltan Borok-Nagy <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Hit DCHECK in parquet-column-readers.cc: def_levels_.CacheRemaining() <=
> num_buffered_values_
> ----------------------------------------------------------------------------------------------
>
> Key: IMPALA-10501
> URL: https://issues.apache.org/jira/browse/IMPALA-10501
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Affects Versions: Impala 4.0
> Reporter: Tim Armstrong
> Assignee: Zoltán Borók-Nagy
> Priority: Blocker
> Labels: broken-build, crash, flaky, parquet
> Attachments: consoleText.3.gz, impalad_coord_exec-0.tar.gz
>
>
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/3814/
> {noformat}
> F0211 03:55:26.383247 14487 parquet-column-readers.cc:517] be46bb72819942fd:85934edd00000001] Check failed: def_levels_.CacheRemaining() <= num_buffered_values_ (921 vs. 916)
> *** Check failure stack trace: ***
> @ 0x53646ec google::LogMessage::Fail()
> @ 0x5365fdc google::LogMessage::SendToLog()
> @ 0x536404a google::LogMessage::Flush()
> @ 0x5367c48 google::LogMessageFatal::~LogMessageFatal()
> @ 0x2ff886f impala::ScalarColumnReader<>::MaterializeValueBatch<>()
> @ 0x2f8ae44 impala::ScalarColumnReader<>::MaterializeValueBatch<>()
> @ 0x2f761bf impala::ScalarColumnReader<>::ReadValueBatch<>()
> @ 0x2f2889a impala::ScalarColumnReader<>::ReadValueBatch()
> @ 0x2ebd8c0 impala::HdfsParquetScanner::AssembleRows()
> @ 0x2eb882e impala::HdfsParquetScanner::GetNextInternal()
> @ 0x2eb67bd impala::HdfsParquetScanner::ProcessSplit()
> @ 0x2aaf3f2 impala::HdfsScanNode::ProcessSplit()
> @ 0x2aae773 impala::HdfsScanNode::ScannerThread()
> @ 0x2aadadb _ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_18ThreadResourcePoolEENKUlvE_clEv
> @ 0x2aafe94 _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_18ThreadResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE
> @ 0x220e331 boost::function0<>::operator()()
> @ 0x2842e7f impala::Thread::SuperviseThread()
> @ 0x284ae1c boost::_bi::list5<>::operator()<>()
> @ 0x284ad40 boost::_bi::bind_t<>::operator()()
> @ 0x284ad01 boost::detail::thread_data<>::run()
> @ 0x406b291 thread_proxy
> @ 0x7f2465cba6b9 start_thread
> @ 0x7f24627e64dc clone
> rImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431)
> at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
> at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
> at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
> at com.sun.proxy.$Proxy10.getBlockLocations(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:866)
> {noformat}
> It was likely a fuzz test:
> {noformat}
> 19:55:23
> query_test/test_mem_usage_scaling.py::TestTpchMemLimitError::test_low_mem_limit_q22[mem_limit:
> 50 | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> parquet/none]
> 19:55:23 [gw5] PASSED
> query_test/test_mem_usage_scaling.py::TestTpchMemLimitError::test_low_mem_limit_q22[mem_limit:
> 50 | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> parquet/none]
> 19:55:23
> query_test/test_mem_usage_scaling.py::TestTpchMemLimitError::test_low_mem_limit_q22[mem_limit:
> 80 | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> parquet/none]
> 19:55:25 [gw2] PASSED
> query_test/test_queries.py::TestPartitionKeyScans::test_partition_key_scans[protocol:
> beeswax | exec_option: {'mt_dop': 0, 'exec_single_node_rows_threshold': 0} |
> table_format: parquet/none]
> 19:55:25
> query_test/test_queries.py::TestPartitionKeyScans::test_partition_key_scans[protocol:
> beeswax | exec_option: {'mt_dop': 1, 'exec_single_node_rows_threshold': 0} |
> table_format: avro/snap/block]
> 19:55:26 [gw5] PASSED
> query_test/test_mem_usage_scaling.py::TestTpchMemLimitError::test_low_mem_limit_q22[mem_limit:
> 80 | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> parquet/none]
> 19:55:26
> query_test/test_mem_usage_scaling.py::TestTpchMemLimitError::test_low_mem_limit_q22[mem_limit:
> 130 | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> parquet/none]
> 19:55:26 [gw6] PASSED
> query_test/test_scanners.py::TestIceberg::test_iceberg_profile[protocol:
> beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold':
> 0} | table_format: parquet/none]
> 19:55:26
> query_test/test_scanners.py::TestIceberg::test_iceberg_profile[protocol:
> beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True,
> 'abort_on_error': 1, 'debug_action':
> '-1:OPEN:[email protected]',
> 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]
> 19:55:27 [gw3] FAILED
> query_test/test_decimal_fuzz.py::TestDecimalFuzz::test_decimal_ops[exec_option:
> {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000,
> 'disable_codegen': False, 'abort_on_error': 1,
> 'exec_single_node_rows_threshold': 0}]
> 19:55:28
> query_test/test_decimal_fuzz.py::TestDecimalFuzz::test_width_bucket[exec_option:
> {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000,
> 'disable_codegen': False, 'abort_on_error': 1,
> 'exec_single_node_rows_threshold': 0}]
> 19:55:28 [gw3] FAILED
> query_test/test_decimal_fuzz.py::TestDecimalFuzz::test_width_bucket[exec_option:
> {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000,
> 'disable_codegen': False, 'abort_on_error': 1,
> 'exec_single_node_rows_threshold': 0}]
> 19:55:28
> query_test/test_decimal_queries.py::TestDecimalQueries::test_queries[protocol:
> beeswax | exec_option: {'disable_codegen_rows_threshold': 0,
> 'disable_codegen': 'false', 'decimal_v2': 'false', 'batch_size': 0} |
> table_format: text/none]
> 19:55:28 [gw6] ERROR
> query_test/test_scanners.py::TestIceberg::test_iceberg_profile[protocol:
> beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True,
> 'abort_on_error': 1, 'debug_action':
> '-1:OPEN:[email protected]',
> 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]
> 19:55:28 [gw8] FAILED
> query_test/test_join_queries.py::TestJoinQueries::test_empty_build_joins[protocol:
> beeswax | table_format: parquet/none | exec_option: {'batch_size': 0,
> 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen':
> False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} |
> enable_outer_join_to_inner_transformation: false | batch_size: 0 | mt_dop: 0]
> 19:55:28 [gw13] FAILED
> query_test/test_parquet_stats.py::TestParquetStats::test_page_index[mt_dop: 0
> | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> parquet/none]
> 19:55:28 [gw12] FAILED
> query_test/test_runtime_filters.py::TestRuntimeRowFilters::test_row_filters[mt_dop:
> 0 | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> parquet/none]
> 19:55:28
> query_test/test_runtime_filters.py::TestRuntimeRowFilters::test_row_filters[mt_dop:
> 4 | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> parquet/none]
> 19:55:28
> query_test/test_join_queries.py::TestJoinQueries::test_empty_build_joins[protocol:
> beeswax | table_format: parquet/none | exec_option: {'batch_size': 0,
> 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen':
> False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} |
> enable_outer_join_to_inner_transformation: true | batch_size: 0 | mt_dop: 4]
> 19:55:28 [gw1] FAILED
> query_test/test_scanners.py::TestScannersAllTableFormatsWithLimit::test_limit[mt_dop:
> 0 | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> parquet/none]
> {noformat}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)