Hi, all i'am tuning my app in local mode, and found there was lots of time spent in local block fetch.
in stage1: i read in input data, and do a repartition, in stage2: i do some operation on the repartitioned RDD, so it involves a local block fetch, i find that the fetch time is much longer than my operation time. here's my debug logs 14/05/11 08:21:12 DEBUG BlockStoreShuffleFetcher: Fetching outputs for shuffle 2, reduce 3 14/05/11 08:21:12 DEBUG BlockStoreShuffleFetcher: BlockStoreShuffleFetcher startTime 1399767672970 14/05/11 08:21:12 DEBUG BlockStoreShuffleFetcher: Fetching map output location for shuffle 2, reduce 3 took 0 ms 14/05/11 08:21:12 INFO BlockFetcherIterator$BasicBlockFetcherIterator: maxBytesInFlight: 50331648, targetRequestSize: 10066329 14/05/11 08:21:12 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 13 non-empty blocks out of 24 blocks 14/05/11 08:21:12 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 2 ms 14/05/11 08:21:12 DEBUG BlockFetcherIterator$BasicBlockFetcherIterator: Got local block shuffle_2_0_3 14/05/11 08:21:12 DEBUG BlockFetcherIterator$BasicBlockFetcherIterator: Got local blocks in 6 ms ms 14/05/11 08:21:12 DEBUG BlockStoreShuffleFetcher: BlockStoreShuffleFetcher itr 1399767672982 14/05/11 08:21:12 DEBUG BlockFetcherIterator$BasicBlockFetcherIterator: resultGotten 1, next time 0 14/05/11 08:21:13 DEBUG BlockFetcherIterator$BasicBlockFetcherIterator: resultGotten 2, next time 0 14/05/11 08:21:13 DEBUG BlockFetcherIterator$BasicBlockFetcherIterator: resultGotten 3, next time 0 14/05/11 08:21:13 DEBUG BlockFetcherIterator$BasicBlockFetcherIterator: resultGotten 4, next time 0 14/05/11 08:21:13 DEBUG BlockFetcherIterator$BasicBlockFetcherIterator: resultGotten 5, next time 0 14/05/11 08:21:13 DEBUG BlockFetcherIterator$BasicBlockFetcherIterator: resultGotten 6, next time 0 14/05/11 08:21:13 DEBUG BlockFetcherIterator$BasicBlockFetcherIterator: resultGotten 7, next time 0 14/05/11 08:21:13 DEBUG BlockFetcherIterator$BasicBlockFetcherIterator: resultGotten 8, next time 0 14/05/11 08:21:13 DEBUG BlockFetcherIterator$BasicBlockFetcherIterator: resultGotten 9, next time 0 14/05/11 08:21:13 DEBUG BlockStoreShuffleFetcher: completionIter shuffleFinishTime 1399767673293 in the end, blockFetcher calls CompletionIterator to read fetched blocks off the queue as they arrive. you can see before iterate the time is 1399767672982, iterate through the elements consume nearly no time, after the iterate completes, the time is 1399767673293, where does this 300+ms spend?! thank you very much! -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/time-exhausted-in-BlockFetcher-tp5540.html Sent from the Apache Spark User List mailing list archive at Nabble.com.