[ https://issues.apache.org/jira/browse/IMPALA-10584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17332524#comment-17332524 ]

ASF subversion and git services commented on IMPALA-10584:
----------------------------------------------------------

Commit 91d2ab2116293b8f45afd02d029d47233877c538 in impala's branch 
refs/heads/master from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=91d2ab2 ]

IMPALA-10584: Defer advancing read page if stream only has 2 pages.

TestScratchLimit::test_with_unlimited_scratch_limit has been
intermittently crashing in the ubuntu-16.04-dockerised-tests environment
since result spooling was enabled by default in IMPALA-9856. A DCHECK
violation occurs in ReservationTracker::CheckConsistency() because
BufferedTupleStream wrongly tries to reclaim memory reservation while
unpinning the stream.

For this bug to surface, all of the following need to happen:
- Stream is in pinned mode.
- There are only 2 pages in the stream: 1 read and 1 write.
- Stream cannot increase its reservation any further, either due to
  memory pressure or a low buffer/memory limit.
- The stream's read page has been fully read and is attached to the
  output RowBatch, but the output RowBatch has not been cleaned up yet.
- BufferedTupleStream::UnpinStream is invoked.

The memory accounting bug happens because UnpinStream proceeds to
NextReadPage, where the read page buffer is mistakenly assumed to have
been released. default_page_len_ bytes are added to
write_page_reservation_, which subsequently violates the total memory
reservation.
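
To make the overshoot concrete, here is a minimal, self-contained sketch
(not Impala's actual code; the variable names and the 2 MB / 4 MB values
are taken from this description and the crash log below):

{code:cpp}
#include <cstdint>
#include <iostream>

int main() {
  const int64_t default_page_len = 2097152;   // one spillable buffer
  const int64_t total_reservation = 4194304;  // stream's reservation limit

  // Pinned stream with exactly 2 pages: 1 read page (fully read, but
  // still attached to the output RowBatch) and 1 write page.
  int64_t used_reservation = 2 * default_page_len;
  int64_t write_page_reservation = 0;

  // Buggy path: UnpinStream -> NextReadPage assumes the read page's
  // buffer was already released and saves its reservation for the next
  // write page.
  write_page_reservation += default_page_len;

  // The read page's buffer is in fact still held by the RowBatch, so
  // the consistency check fails: 4194304 + 2097152 > 4194304, exactly
  // the numbers in the DCHECK message below.
  std::cout << (used_reservation + write_page_reservation <= total_reservation
                    ? "consistent"
                    : "DCHECK would fire")
            << std::endl;
  return 0;
}
{code}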

This patch fixes the bug by deferring advancement of the read iterator
in UnpinStream if the read page is attached to the output RowBatch and
there are only 2 pages in the stream. This is OK because after
UnpinStream finishes, the stream is in unpinned mode and
has_read_write_page is false. The next AddRow operation is then allowed
to unpin the previous write page before reusing the reservation to
allocate a new write page. The next GetNext call is responsible for
advancing the read page.
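
As a rough illustration, the deferral decision has the shape sketched
below. This is a hedged sketch, not the actual BufferedTupleStream
implementation; StreamState and CanAdvanceReadPageOnUnpin are
hypothetical names.

{code:cpp}
#include <iostream>

// Hypothetical model of the deferral rule; only the decision logic
// mirrors the patch description above.
struct StreamState {
  int num_pages;            // pages currently in the stream
  bool read_page_attached;  // read page handed off to an output RowBatch
};

// Returns true if UnpinStream may advance the read iterator
// immediately, false if advancement is deferred to the next GetNext.
bool CanAdvanceReadPageOnUnpin(const StreamState& s) {
  // With only 2 pages and the read page still owned by the output
  // RowBatch, advancing now would save the read page's reservation for
  // the write page while its buffer is still counted as used,
  // overshooting the total reservation.
  return !(s.read_page_attached && s.num_pages == 2);
}

int main() {
  StreamState crash_scenario{/*num_pages=*/2, /*read_page_attached=*/true};
  std::cout << std::boolalpha
            << CanAdvanceReadPageOnUnpin(crash_scenario)  // prints false
            << std::endl;
  return 0;
}
{code}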

Testing:
- Add backend test DeferAdvancingReadPage.
- Loop TestScratchLimit::test_with_unlimited_scratch_limit on my local
  dev machine and verify that each iteration passes without triggering
  the DCHECK violation.
- Reenable result spooling in TestScratchLimit that was disabled in
  IMPALA-10559.
- Pass core tests.

Change-Id: I16137b6e423f190f60c3115a06ccd0f77e9f585a
Reviewed-on: http://gerrit.cloudera.org:8080/17195
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Investigate intermittent crash in 
> TestScratchLimit::test_with_unlimited_scratch_limit
> -------------------------------------------------------------------------------------
>
>                 Key: IMPALA-10584
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10584
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 4.0
>            Reporter: Riza Suminto
>            Assignee: Riza Suminto
>            Priority: Major
>             Fix For: Impala 4.0
>
>         Attachments: e521ac50-121b-49aa-716f458f-51920619.dmp.resolved.tgz
>
>
> TestScratchLimit::test_with_unlimited_scratch_limit has been intermittently 
> crashing in the ubuntu-16.04-dockerised-tests environment since result 
> spooling was enabled by default in IMPALA-9856. We decided to disable result 
> spooling for TestScratchLimit in IMPALA-10559. However, we still need to 
> investigate the root cause of the intermittent crash.
> I looped the test overnight with a minicluster of 2 impalads and the 
> following modifications to the test_with_unlimited_scratch_limit query 
> options:
>  * spool_query_results=1
>  * buffer_pool_limit=24m
>  * max_result_spooling_mem=4m
>  
> At iteration 589, test_with_unlimited_scratch_limit crashed with the 
> following logs.
> impalad.INFO
> {code:java}
> I0314 00:15:16.495184 10628 impala-beeswax-server.cc:541] 
> TClientRequest.queryOptions: TQueryOptions {
>   01: abort_on_error (bool) = true,
>   02: max_errors (i32) = 100,
>   03: disable_codegen (bool) = false,
>   04: batch_size (i32) = 0,
>   05: num_nodes (i32) = 0,
>   06: max_scan_range_length (i64) = 0,
>   07: num_scanner_threads (i32) = 0,
>   11: debug_action (string) = "",
>   12: mem_limit (i64) = 0,
>   15: hbase_caching (i32) = 0,
>   16: hbase_cache_blocks (bool) = false,
>   17: parquet_file_size (i64) = 0,
>   18: explain_level (i32) = 1,
>   19: sync_ddl (bool) = false,
>   24: disable_outermost_topn (bool) = false,
>   26: query_timeout_s (i32) = 0,
>   27: buffer_pool_limit (i64) = 25165824,
>   28: appx_count_distinct (bool) = false,
>   29: disable_unsafe_spills (bool) = false,
>   31: exec_single_node_rows_threshold (i32) = 0,
>   32: optimize_partition_key_scans (bool) = false,
>   33: replica_preference (i32) = 0,
>   34: schedule_random_replica (bool) = false,
>   36: disable_streaming_preaggregations (bool) = false,
>   37: runtime_filter_mode (i32) = 2,
>   38: runtime_bloom_filter_size (i32) = 1048576,
>   39: runtime_filter_wait_time_ms (i32) = 0,
>   40: disable_row_runtime_filtering (bool) = false,
>   41: max_num_runtime_filters (i32) = 10,
>   42: parquet_annotate_strings_utf8 (bool) = false,
>   43: parquet_fallback_schema_resolution (i32) = 0,
>   45: s3_skip_insert_staging (bool) = true,
>   46: runtime_filter_min_size (i32) = 1048576,
>   47: runtime_filter_max_size (i32) = 16777216,
>   48: prefetch_mode (i32) = 1,
>   49: strict_mode (bool) = false,
>   50: scratch_limit (i64) = -1,
>   51: enable_expr_rewrites (bool) = true,
>   52: decimal_v2 (bool) = true,
>   53: parquet_dictionary_filtering (bool) = true,
>   54: parquet_array_resolution (i32) = 0,
>   55: parquet_read_statistics (bool) = true,
>   56: default_join_distribution_mode (i32) = 0,
>   57: disable_codegen_rows_threshold (i32) = 5000,
>   58: default_spillable_buffer_size (i64) = 2097152,
>   59: min_spillable_buffer_size (i64) = 65536,
>   60: max_row_size (i64) = 524288,
>   61: idle_session_timeout (i32) = 0,
>   62: compute_stats_min_sample_size (i64) = 1073741824,
>   63: exec_time_limit_s (i32) = 0,
>   64: shuffle_distinct_exprs (bool) = true,
>   65: max_mem_estimate_for_admission (i64) = 0,
>   66: thread_reservation_limit (i32) = 3000,
>   67: thread_reservation_aggregate_limit (i32) = 0,
>   68: kudu_read_mode (i32) = 0,
>   69: allow_erasure_coded_files (bool) = false,
>   70: timezone (string) = "",
>   71: scan_bytes_limit (i64) = 0,
>   72: cpu_limit_s (i64) = 0,
>   73: topn_bytes_limit (i64) = 536870912,
>   74: client_identifier (string) = 
> "query_test/test_scratch_limit.py::TestScratchLimit::()::test_with_unlimited_scratch_limit[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':5000;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_th",
>   75: resource_trace_ratio (double) = 0,
>   76: num_remote_executor_candidates (i32) = 3,
>   77: num_rows_produced_limit (i64) = 0,
>   78: planner_testcase_mode (bool) = false,
>   79: default_file_format (i32) = 0,
>   80: parquet_timestamp_type (i32) = 0,
>   81: parquet_read_page_index (bool) = true,
>   82: parquet_write_page_index (bool) = true,
>   84: disable_hdfs_num_rows_estimate (bool) = false,
>   86: spool_query_results (bool) = true,
>   87: default_transactional_type (i32) = 0,
>   88: statement_expression_limit (i32) = 250000,
>   89: max_statement_length_bytes (i32) = 16777216,
>   90: disable_data_cache (bool) = false,
>   91: max_result_spooling_mem (i64) = 4194304,
>   92: max_spilled_result_spooling_mem (i64) = 1073741824,
>   93: disable_hbase_num_rows_estimate (bool) = false,
>   94: fetch_rows_timeout_ms (i64) = 10000,
>   95: now_string (string) = "",
>   96: parquet_object_store_split_size (i64) = 268435456,
>   97: mem_limit_executors (i64) = 0,
>   98: broadcast_bytes_limit (i64) = 34359738368,
>   99: preagg_bytes_limit (i64) = -1,
>   100: enable_cnf_rewrites (bool) = true,
>   101: max_cnf_exprs (i32) = 200,
>   102: kudu_snapshot_read_timestamp_micros (i64) = 0,
>   103: retry_failed_queries (bool) = false,
>   104: enabled_runtime_filter_types (i32) = 3,
>   105: async_codegen (bool) = false,
>   106: enable_distinct_semi_join_optimization (bool) = true,
>   107: sort_run_bytes_limit (i64) = -1,
>   108: max_fs_writers (i32) = 0,
>   109: refresh_updated_hms_partitions (bool) = false,
>   110: spool_all_results_for_retries (bool) = true,
>   112: use_local_tz_for_unix_timestamp_conversions (bool) = false,
>   113: convert_legacy_hive_parquet_utc_timestamps (bool) = false,
>   114: enable_outer_join_to_inner_transformation (bool) = false,
>   115: targeted_kudu_scan_range_length (i64) = -1,
>   116: report_skew_limit (double) = 1,
>   117: optimize_simple_limit (bool) = false,
>   118: use_dop_for_costing (bool) = true,
>   119: broadcast_to_partition_factor (double) = 1,
>   120: join_rows_produced_limit (i64) = 0,
>   121: utf8_mode (bool) = false,
>   122: analytic_rank_pushdown_threshold (i64) = 1000,
>   123: minmax_filter_threshold (double) = 0,
>   124: minmax_filtering_level (i32) = 1,
> }
> I0314 00:15:16.495311 10628 impala-server.cc:1247] Found local timezone 
> "America/Los_Angeles".
> I0314 00:15:16.498564 10628 impala-server.cc:1304] 
> f54227735548245d:fc498f1400000000] Registered query 
> query_id=f54227735548245d:fc498f1400000000 
> session_id=ec437b1251da7828:462192f6533a37a8
> I0314 00:15:16.503772 10628 Frontend.java:1598] 
> f54227735548245d:fc498f1400000000] Analyzing query: select o_orderdate, 
> o_custkey, o_comment
>       from tpch.orders
>       order by o_orderdate db: default
> I0314 00:15:16.504245 10628 BaseAuthorizationChecker.java:110] 
> f54227735548245d:fc498f1400000000] Authorization check took 0 ms
> I0314 00:15:16.504266 10628 Frontend.java:1640] 
> f54227735548245d:fc498f1400000000] Analysis and authorization finished.
> I0314 00:15:16.506330 10629 admission-controller.cc:1696] 
> f54227735548245d:fc498f1400000000] Trying to admit 
> id=f54227735548245d:fc498f1400000000 in pool_name=default-pool 
> executor_group_name=default per_host_mem_estimate=264.93 MB 
> dedicated_coord_mem_estimate=124.18 MB max_requests=-1 max_queued=200 
> max_mem=-1.00 B
> I0314 00:15:16.506346 10629 admission-controller.cc:1704] 
> f54227735548245d:fc498f1400000000] Stats: agg_num_running=0, 
> agg_num_queued=0, agg_mem_reserved=0,  local_host(local_mem_admitted=0, 
> num_admitted_running=0, num_queued=0, backend_mem_reserved=0, 
> topN_query_stats: queries=[], total_mem_consumed=0; pool_level_stats: 
> num_running=0, min=0, max=0, pool_total_mem=0)
> I0314 00:15:16.506363 10629 admission-controller.cc:1212] 
> f54227735548245d:fc498f1400000000] Admitting query 
> id=f54227735548245d:fc498f1400000000
> I0314 00:15:16.506400 10629 impala-server.cc:2130] 
> f54227735548245d:fc498f1400000000] Registering query locations
> I0314 00:15:16.506425 10629 coordinator.cc:150] 
> f54227735548245d:fc498f1400000000] Exec() 
> query_id=f54227735548245d:fc498f1400000000 stmt=select o_orderdate, 
> o_custkey, o_comment
>       from tpch.orders
>       order by o_orderdate
> I0314 00:15:16.506711 10629 coordinator.cc:474] 
> f54227735548245d:fc498f1400000000] starting execution on 2 backends for 
> query_id=f54227735548245d:fc498f1400000000
> I0314 00:15:16.507216 24247 control-service.cc:148] 
> f54227735548245d:fc498f1400000000] ExecQueryFInstances(): 
> query_id=f54227735548245d:fc498f1400000000 coord=lv-desktop:27000 #instances=2
> I0314 00:15:16.507802 10633 query-state.cc:948] 
> f54227735548245d:fc498f1400000002] Executing instance. 
> instance_id=f54227735548245d:fc498f1400000002 fragment_idx=1 
> per_fragment_instance_idx=1 coord_state_idx=0 #in-flight=1
> I0314 00:15:16.507810 10629 coordinator.cc:533] 
> f54227735548245d:fc498f1400000000] started execution on 2 backends for 
> query_id=f54227735548245d:fc498f1400000000
> I0314 00:15:16.507913 10634 query-state.cc:948] 
> f54227735548245d:fc498f1400000000] Executing instance. 
> instance_id=f54227735548245d:fc498f1400000000 fragment_idx=0 
> per_fragment_instance_idx=0 coord_state_idx=0 #in-flight=2
> I0314 00:15:18.587622 24267 krpc-data-stream-mgr.cc:427] Reduced stream ID 
> cache from 33 items, to 32, eviction took: 0
> I0314 00:15:20.481271 10628 impala-beeswax-server.cc:215] 
> get_results_metadata(): query_id=f54227735548245d:fc498f1400000000
> F0314 00:15:21.669415 10634 reservation-tracker.cc:436] 
> f54227735548245d:fc498f1400000000] Check failed: used_reservation_.Load() + 
> child_reservations_.Load() <= reservation_.Load() (6291456 vs. 4194304) 
> 4194304 + 2097152 > 4194304
> Wrote minidump to 
> /home/riza/workspace/impala/logs/cluster/minidumps/impalad/e521ac50-121b-49aa-716f458f-51920619.dmp
> {code}
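> Unpacking the failed check: used_reservation_ (4194304 bytes, i.e. two 
> pinned 2097152-byte default spillable buffers) plus child_reservations_ 
> (2097152 bytes, one saved page reservation) comes to 6291456 bytes, which 
> exceeds the tracker's reservation_ of 4194304 bytes (presumably the 4 MB 
> max_result_spooling_mem set above).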
> impalad.ERROR
> {code:java}
> Log file created at: 2021/03/13 22:41:15
> Running on machine: lv-desktop
> Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
> E0313 22:41:15.237749 24033 logging.cc:148] stderr will be logged to this 
> file.
> 21/03/13 22:41:16 INFO util.JvmPauseMonitor: Starting JVM pause monitor
> F0314 00:15:21.669415 10634 reservation-tracker.cc:436] 
> f54227735548245d:fc498f1400000000] Check failed: used_reservation_.Load() + 
> child_reservations_.Load() <= reservation_.Load() (6291456 vs. 4194304) 
> 4194304 + 2097152 > 4194304
> *** Check failure stack trace: ***
>     @          0x549bf3c  google::LogMessage::Fail()
>     @          0x549d82c  google::LogMessage::SendToLog()
>     @          0x549b89a  google::LogMessage::Flush()
>     @          0x549f498  google::LogMessageFatal::~LogMessageFatal()
>     @          0x2965d86  impala::ReservationTracker::CheckConsistency()
>     @          0x2964ea2  impala::ReservationTracker::TransferReservationTo()
>     @          0x29584d3  impala::BufferPool::ClientHandle::SaveReservation()
>     @          0x2f31cf2  impala::BufferedTupleStream::NextReadPage()
>     @          0x2f33051  impala::BufferedTupleStream::UnpinStream()
>     @          0x2fa0575  impala::SpillableRowBatchQueue::AddBatch()
>     @          0x2b6f2bc  impala::BufferedPlanRootSink::Send()
>     @          0x24cc429  impala::FragmentInstanceState::ExecInternal()
>     @          0x24c842f  impala::FragmentInstanceState::Exec()
>     @          0x240c567  impala::QueryState::ExecFInstance()
>     @          0x240a965  
> _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
>     @          0x240f536  
> _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
>     @          0x22b6621  boost::function0<>::operator()()
>     @          0x2926cb3  impala::Thread::SuperviseThread()
>     @          0x292f614  boost::_bi::list5<>::operator()<>()
>     @          0x292f538  boost::_bi::bind_t<>::operator()()
>     @          0x292f4f9  boost::detail::thread_data<>::run()
>     @          0x41a2f41  thread_proxy
>     @     0x7f6d1981f6da  start_thread
>     @     0x7f6d160ea71e  clone
> Wrote minidump to 
> /home/riza/workspace/impala/logs/cluster/minidumps/impalad/e521ac50-121b-49aa-716f458f-51920619.dmp
> {code}
> I'm also attaching the resolved dump of 
> e521ac50-121b-49aa-716f458f-51920619.dmp.
> It looks like result spooling + a sort query + the low buffer_pool_limit 
> used in test_with_unlimited_scratch_limit reveals a bug in 
> BufferedPlanRootSink/BufferedTupleStream.
> A new row batch is added to BufferedPlanRootSink, but there is not enough 
> memory. At the same time, all rows in the read_page_ of the read iterator 
> have been fully read. The underlying BufferedTupleStream tries to spill, but 
> I suspect nothing gets spilled. Nothing gets freed, including the read_page_ 
> that has been fully read.


