[
https://issues.apache.org/jira/browse/IMPALA-9349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Armstrong updated IMPALA-9349:
----------------------------------
Priority: Critical (was: Major)
> output_unmatched_batch_ holds onto buffers for too long
> --------------------------------------------------------
>
> Key: IMPALA-9349
> URL: https://issues.apache.org/jira/browse/IMPALA-9349
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Priority: Critical
> Labels: crash, hang
>
> IMPALA-4224 made some of the reservation management in PHJ (the partitioned
> hash join) more explicit, which revealed a minor bug. This query from
> TestSpilling triggers the bug, but it currently has no symptoms because a
> surplus of at least 256k of reservation is set aside for max_row_size.
> {noformat}
> # spilled partition with 0 probe rows, RIGHT OUTER JOIN
> set debug_action="-1:OPEN:[email protected]";
> select straight_join count(*)
> from
> supplier right outer join lineitem on s_suppkey = l_suppkey
> where s_acctbal > 0 and s_acctbal < 10;
> {noformat}
> However, a slight tweak (setting default_spillable_buffer_size and
> max_row_size to 256k) triggers a DCHECK:
> {noformat}
> [localhost:21000] tpch_parquet> use tpch_parquet;
> set default_spillable_buffer_size=256k;
> set max_row_size=256k;
> set debug_action=-1:OPEN:[email protected];
> select straight_join count(*)
> from
> supplier right outer join lineitem on s_suppkey = l_suppkey
> where s_acctbal > 0 and s_acctbal < 10;
> Query: use tpch_parquet
> DEFAULT_SPILLABLE_BUFFER_SIZE set to 256k
> MAX_ROW_SIZE set to 256k
> DEBUG_ACTION set to -1:OPEN:[email protected]
> Query: select straight_join count(*)
> from
> supplier right outer join lineitem on s_suppkey = l_suppkey
> where s_acctbal > 0 and s_acctbal < 10
> Query submitted at: 2020-01-30 23:12:11 (Coordinator:
> http://tarmstrong-box:25000)
> Query progress can be monitored at:
> http://tarmstrong-box:25000/query_plan?query_id=8e445d7018e08002:6e35218800000000
> ERROR: Failed due to unreachable impalad(s): tarmstrong-box:22002
> {noformat}
> F0130 23:12:14.652458 2727 partitioned-hash-join-builder.cc:364]
> 8e445d7018e08002:6e35218800000005] Check failed: got_buffer Accounted in min
> reservation<BufferPool::Client> 0xb96d870 internal state:
> {<BufferPool::Client> 0xf8843a0 name: HASH_JOIN_NODE id=2 ptr=0xb96d700
> write_status: buffers allocated 262144 num_pages: 166 pinned_bytes: 262144
> dirty_unpinned_bytes: 786432 in_flight_write_bytes: 524288 reservation:
> {<ReservationTracker>: reservation_limit 9223372036854775807 reservation
> 4456448 used_reservation 524288 child_reservations 3932160 parent:
> <ReservationTracker>: reservation_limit 9223372036854775807 reservation
> 4456448 used_reservation 0 child_reservations 4456448 parent:
> <ReservationTracker>: reservation_limit 6279187114 reservation 8650752
> used_reservation 0 child_reservations 8650752 parent:
> <ReservationTracker>: reservation_limit 6671630336 reservation 8667136
> used_reservation 0 child_reservations 8667136 parent:
> NULL}
> 1 pinned pages: <BufferPool::Page> 0x12ed9ea0 len: 262144 pin_count: 1 buf:
> <BufferPool::BufferHandle> 0x12ed9f18 client: 0xb96d870/0xf8843a0 data:
> 0x17380000 len: 262144
> 3 dirty unpinned pages: <BufferPool::Page> 0x1319c500 len: 262144
> pin_count: 0 buf: <BufferPool::BufferHandle> 0x1319c578 client:
> 0xb96d870/0xf8843a0 data: 0x1554a000 len: 262144
> <BufferPool::Page> 0x1319dae0 len: 262144 pin_count: 0 buf:
> <BufferPool::BufferHandle> 0x1319db58 client: 0xb96d870/0xf8843a0 data:
> 0x127fc000 len: 262144
> <BufferPool::Page> 0x1319e4e0 len: 262144 pin_count: 0 buf:
> <BufferPool::BufferHandle> 0x1319e558 client: 0xb96d870/0xf8843a0 data:
> 0x16f4a000 len: 262144
> 2 in flight write pages: <BufferPool::Page> 0x12ebe1c0 len: 262144
> pin_count: 0 buf: <BufferPool::BufferHandle> 0x12ebe238 client:
> 0xb96d870/0xf8843a0 data: 0x16740000 len: 262144
> <BufferPool::Page> 0x1319d4a0 len: 262144 pin_count: 0 buf:
> <BufferPool::BufferHandle> 0x1319d518 client: 0xb96d870/0xf8843a0 data:
> 0x16340000 len: 262144
> }
> *** Check failure stack trace: ***
> @ 0x4dfceac google::LogMessage::Fail()
> @ 0x4dfe751 google::LogMessage::SendToLog()
> @ 0x4dfc886 google::LogMessage::Flush()
> @ 0x4dffe4d google::LogMessageFatal::~LogMessageFatal()
> @ 0x2753df6 impala::PhjBuilder::CreateAndPreparePartition()
> @ 0x2754036 impala::PhjBuilder::CreateHashPartitions()
> @ 0x2758209 impala::PhjBuilder::RepartitionBuildInput()
> @ 0x2757953 impala::PhjBuilder::BeginSpilledProbe()
> @ 0x2688643 impala::PartitionedHashJoinNode::BeginSpilledProbe()
> @ 0x268aff5 impala::PartitionedHashJoinNode::GetNext()
> @ 0x276b436 impala::AggregationNode::Open()
> @ 0x2159d7f impala::FragmentInstanceState::Open()
> @ 0x2156937 impala::FragmentInstanceState::Exec()
> @ 0x216aa7e impala::QueryState::ExecFInstance()
> @ 0x2168d9d
> _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
> @ 0x216c666
> _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
> @ 0x1f60b7b boost::function0<>::operator()()
> @ 0x250f288 impala::Thread::SuperviseThread()
> @ 0x251760c boost::_bi::list5<>::operator()<>()
> @ 0x2517530 boost::_bi::bind_t<>::operator()()
> @ 0x25174f3 boost::detail::thread_data<>::run()
> @ 0x3d26099 thread_proxy
> @ 0x7fdab63cf6b9 start_thread
> @ 0x7fdab2b8b41c clone
> {noformat}
> {noformat}
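
The reservation numbers in the DCHECK message can be checked by hand. The sketch below is only an illustration, not Impala code: it copies the values from the first <ReservationTracker> in the log and assumes that the unused reservation is the tracker's reservation minus its used_reservation and child_reservations. The function name unused_reservation is hypothetical.

```python
# Values copied from the first <ReservationTracker> in the DCHECK message.
RESERVATION = 4456448
USED_RESERVATION = 524288
CHILD_RESERVATIONS = 3932160

# Page length reported for every page in the log (256k, matching
# default_spillable_buffer_size=256k in the repro).
BUFFER_LEN = 262144


def unused_reservation(reservation, used, children):
    """Assumed formula: whatever is neither used directly nor handed to children."""
    return reservation - used - children


unused = unused_reservation(RESERVATION, USED_RESERVATION, CHILD_RESERVATIONS)
can_get_buffer = unused >= BUFFER_LEN

print("unused reservation:", unused)
print("room for one more 256k buffer:", can_get_buffer)
```

Under that assumption the client has no unused reservation left, so one more 256k buffer cannot be accounted for, which is consistent with the failed got_buffer check in CreateAndPreparePartition().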