[ https://issues.apache.org/jira/browse/IMPALA-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gabor Kaszab updated IMPALA-6294:
---------------------------------
Target Version: Impala 3.3.0 (was: Impala 3.2.0)
> Concurrent hung with lots of spilling make slow progress due to blocking in
> DataStreamRecvr and DataStreamSender
> ----------------------------------------------------------------------------------------------------------------
>
> Key: IMPALA-6294
> URL: https://issues.apache.org/jira/browse/IMPALA-6294
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 2.11.0
> Reporter: Mostafa Mokhtar
> Assignee: Michael Ho
> Priority: Critical
> Attachments: IMPALA-6285 TPCDS Q3 slow broadcast,
> slow_broadcast_q3_reciever.txt, slow_broadcast_q3_sender.txt
>
>
> While running a highly concurrent spilling workload on a large cluster,
> queries start running slower. Even lightweight queries that are not running
> are affected by this slowdown.
> {code}
> EXCHANGE_NODE (id=9):(Total: 3m1s, non-child: 3m1s, % non-child: 100.00%)
> - ConvertRowBatchTime: 999.990us
> - PeakMemoryUsage: 0
> - RowsReturned: 108.00K (108001)
> - RowsReturnedRate: 593.00 /sec
> DataStreamReceiver:
> BytesReceived(4s000ms): 254.47 KB, 338.82 KB, 338.82 KB, 852.43
> KB, 1.32 MB, 1.33 MB, 1.50 MB, 2.53 MB, 2.99 MB, 3.00 MB, 3.00 MB, 3.00 MB,
> 3.00 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.16 MB, 3.49 MB, 3.80
> MB, 4.15 MB, 4.55 MB, 4.84 MB, 4.99 MB, 5.07 MB, 5.41 MB, 5.75 MB, 5.92 MB,
> 6.00 MB, 6.00 MB, 6.00 MB, 6.07 MB, 6.28 MB, 6.33 MB, 6.43 MB, 6.67 MB, 6.91
> MB, 7.29 MB, 8.03 MB, 9.12 MB, 9.68 MB, 9.90 MB, 9.97 MB, 10.44 MB, 11.25 MB
> - BytesReceived: 11.73 MB (12301692)
> - DeserializeRowBatchTimer: 957.990ms
> - FirstBatchArrivalWaitTime: 0.000ns
> - PeakMemoryUsage: 644.44 KB (659904)
> - SendersBlockedTimer: 0.000ns
> - SendersBlockedTotalTimer(*): 0.000ns
> {code}
> {code}
> DataStreamSender (dst_id=9):(Total: 1s819ms, non-child: 1s819ms, % non-child: 100.00%)
> - BytesSent: 234.64 MB (246033840)
> - NetworkThroughput(*): 139.58 MB/sec
> - OverallThroughput: 128.92 MB/sec
> - PeakMemoryUsage: 33.12 KB (33920)
> - RowsReturned: 108.00K (108001)
> - SerializeBatchTime: 133.998ms
> - TransmitDataRPCTime: 1s680ms
> - UncompressedRowBatchSize: 446.42 MB (468102200)
> {code}
> Timeouts seen in IMPALA-6285 are caused by this issue.
> {code}
> I1206 12:44:14.925405 25274 status.cc:58] RPC recv timed out: Client foo-17.domain.com:22000 timed-out during recv call.
> @ 0x957a6a impala::Status::Status()
> @ 0x11dd5fe impala::DataStreamSender::Channel::DoTransmitDataRpc()
> @ 0x11ddcd4 impala::DataStreamSender::Channel::TransmitDataHelper()
> @ 0x11de080 impala::DataStreamSender::Channel::TransmitData()
> @ 0x11e1004 impala::ThreadPool<>::WorkerThread()
> @ 0xd10063 impala::Thread::SuperviseThread()
> @ 0xd107a4 boost::detail::thread_data<>::run()
> @ 0x128997a (unknown)
> @ 0x7f68c5bc7e25 start_thread
> @ 0x7f68c58f534d __clone
> {code}
> Similar behavior was also observed with KRPC enabled (IMPALA-6048).
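
The counters in the two profiles above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the constant and function names are made up for this example, the values are copied by hand from the profiles, and the durations are the rounded ones printed there, so the results only roughly match the reported RowsReturnedRate, OverallThroughput and NetworkThroughput. It also assumes that each rate is simply the counter divided by the relevant wall-clock time, which the report does not state.

```python
# Values copied by hand from the EXCHANGE_NODE / DataStreamReceiver profile.
RECEIVER_TOTAL_S = 3 * 60 + 1   # Total: 3m1s
RECEIVER_BYTES = 12301692       # BytesReceived
ROWS_RETURNED = 108001          # RowsReturned

# Values copied by hand from the DataStreamSender profile.
SENDER_TOTAL_S = 1.819          # Total: 1s819ms
SENDER_RPC_S = 1.680            # TransmitDataRPCTime
SENDER_BYTES = 246033840        # BytesSent

KB = 1024
MB = 1024 * 1024


def summarize():
    """Return effective rates derived from the raw counters (hypothetical helper)."""
    return {
        # Rows per second as seen by the exchange node.
        "receiver_rows_per_s": ROWS_RETURNED / RECEIVER_TOTAL_S,
        # Bytes arriving at the receiver per second, in KB.
        "receiver_kb_per_s": RECEIVER_BYTES / RECEIVER_TOTAL_S / KB,
        # Sender throughput over its total time, in MB.
        "sender_overall_mb_per_s": SENDER_BYTES / SENDER_TOTAL_S / MB,
        # Sender throughput over time spent in TransmitData RPCs, in MB.
        "sender_network_mb_per_s": SENDER_BYTES / SENDER_RPC_S / MB,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the exchange node takes in data at well under 100 KB/sec over its three minutes, while the sender reports more than 100 MB/sec over less than two seconds, which is the mismatch the two profiles are meant to show.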