[ https://issues.apache.org/jira/browse/SPARK-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Guoqiang Li updated SPARK-2677:
-------------------------------
Component/s: Spark Core
> BasicBlockFetchIterator#next can wait forever
> ---------------------------------------------
>
> Key: SPARK-2677
> URL: https://issues.apache.org/jira/browse/SPARK-2677
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.0.0
> Reporter: Kousuke Saruta
> Priority: Critical
> Fix For: 1.1.0
>
>
> In BasicBlockFetchIterator#next, the iterator waits for a fetch result via results.take().
> {code}
> override def next(): (BlockId, Option[Iterator[Any]]) = {
>   resultsGotten += 1
>   val startFetchWait = System.currentTimeMillis()
>   val result = results.take()
>   val stopFetchWait = System.currentTimeMillis()
>   _fetchWaitTime += (stopFetchWait - startFetchWait)
>   if (!result.failed) bytesInFlight -= result.size
>   while (!fetchRequests.isEmpty &&
>     (bytesInFlight == 0 || bytesInFlight + fetchRequests.front.size <= maxBytesInFlight)) {
>     sendRequest(fetchRequests.dequeue())
>   }
>   (result.blockId, if (result.failed) None else Some(result.deserialize()))
> }
> {code}
> But results is implemented as a LinkedBlockingQueue, so if the remote executor hangs, the fetching executor waits forever.
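> A bounded wait would avoid the hang. As a rough sketch only (not the actual fix for this issue), the blocking take() could be replaced with poll(timeout) so the fetch fails instead of waiting forever; the FetchResult stand-in class and the 60-second timeout below are assumptions for illustration:
> {code}
> import java.util.concurrent.{LinkedBlockingQueue, TimeUnit}
>
> // Stand-in for the real fetch result type; for illustration only.
> case class FetchResult(blockId: String, size: Long, failed: Boolean)
>
> val results = new LinkedBlockingQueue[FetchResult]()
> val fetchTimeoutMs = 60 * 1000L  // assumed timeout, not taken from the issue
>
> // poll() returns null on timeout instead of blocking forever like take().
> val result = results.poll(fetchTimeoutMs, TimeUnit.MILLISECONDS)
> if (result == null) {
>   throw new java.io.IOException(
>     s"Timed out waiting for fetch result after $fetchTimeoutMs ms")
> }
> {code}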