SuYan created SPARK-4721:
----------------------------
Summary: Improve first thread to put block failed
Key: SPARK-4721
URL: https://issues.apache.org/jira/browse/SPARK-4721
Project: Spark
Issue Type: Improvement
Reporter: SuYan
The current code assumes that when multiple threads try to put a block with the
same blockId into the blockManager, the thread that first puts its info into
blockinfos performs the put, and the others wait until that put fails or
succeeds.
This is fine when the put succeeds, but if it fails there are some problems:
1. The failed thread removes the info from blockinfos.
2. The other threads wake up and use the old info.synchronized to try the put.
3. If one of them succeeds, "mark success" finds that the info is no longer in
pending status, so "mark success" fails. All the remaining threads then do the
same thing: acquire info.synchronized and mark success or failure, even if one
of them has already succeeded.
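
The three steps above can be replayed in a small model. This is only a sketch
in Python, not Spark's Scala code: the BlockInfo class, the block_infos dict,
the block ID "rdd_0_0" and the function names are all hypothetical stand-ins,
and the interleaving is replayed in a single thread so the order of events is
fixed.

```python
class BlockInfo:
    """Hypothetical stand-in for the per-block state; not Spark's class."""

    def __init__(self):
        self.state = "pending"  # becomes "ready" or "failed"

    def mark_ready(self):
        # "mark success" is only legal while the put is still pending.
        if self.state != "pending":
            raise RuntimeError(f"cannot mark success: state is {self.state}")
        self.state = "ready"

    def mark_failure(self):
        self.state = "failed"


block_infos = {}  # hypothetical registry: block ID -> BlockInfo


def register(block_id):
    """Return (info, is_first); only the first caller installs a new info."""
    if block_id in block_infos:
        return block_infos[block_id], False
    info = BlockInfo()
    block_infos[block_id] = info
    return info, True


def replay_failure():
    events = []
    first_info, _ = register("rdd_0_0")   # first thread installs the info
    waiter_info, _ = register("rdd_0_0")  # second thread gets the same object

    # Step 1: the first put fails, so the entry is removed and marked failed.
    del block_infos["rdd_0_0"]
    first_info.mark_failure()
    events.append("first put failed, entry removed")

    # Step 2: the waiter wakes up still holding the old info object.
    # Step 3: its own put succeeds, but marking success is rejected.
    try:
        waiter_info.mark_ready()
        events.append("waiter marked success")
    except RuntimeError as err:
        events.append(f"waiter: {err}")
    return events
```

In this model the waiter's successful put is lost: the info it holds is already
marked failed, and the registry no longer has an entry for the block.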
First, I can't understand why the info is removed from blockinfos while other
threads are still waiting. The comment says this is so that other threads can
create a new block info, but a block info is just an ID and a level, so whether
the old one or a new one is used doesn't matter if there are waiting threads.
Second, if the first thread fails, how about letting the other waiting threads
retry the same process one by one, without all of them needing to do so?
Or, if the first thread fails, all the other threads could just log a warning
and return after waking up.
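
The last option (waiters log a warning and return) could look roughly like the
sketch below. It is written in Python with threading.Condition standing in for
the JVM monitor. All names (BlockInfo, put, demo, the result strings, the block
ID) are hypothetical, and the waiters counter exists only so that the demo can
start the failure after both waiters are blocked.

```python
import threading
import time


class BlockInfo:
    """Hypothetical per-block state guarded by a condition variable."""

    def __init__(self):
        self.cond = threading.Condition()
        self.state = "pending"  # becomes "ready" or "failed"
        self.waiters = 0        # demo-only: number of threads that began waiting

    def finish(self, ok):
        with self.cond:
            self.state = "ready" if ok else "failed"
            self.cond.notify_all()

    def wait_for_ready(self):
        with self.cond:
            self.waiters += 1
            self.cond.wait_for(lambda: self.state != "pending")
            return self.state == "ready"


block_infos = {}         # hypothetical registry: block ID -> BlockInfo
lock = threading.Lock()  # guards block_infos, warnings and demo results
warnings = []


def put(block_id, do_put):
    with lock:
        info = block_infos.get(block_id)
        is_first = info is None
        if is_first:
            info = block_infos[block_id] = BlockInfo()
    if not is_first:
        if info.wait_for_ready():
            return "already-present"
        # Proposed behaviour: do not retry, just warn and return.
        with lock:
            warnings.append(f"put of {block_id} failed in another thread")
        return "skipped"
    ok = False
    try:
        ok = do_put()
    finally:
        if not ok:
            with lock:
                block_infos.pop(block_id, None)  # remove before marking failure
        info.finish(ok)
    return "stored" if ok else "failed"


def demo():
    release = threading.Event()
    results = []

    def failing_put():
        release.wait()  # hold the first put until the waiters are blocked
        return False

    def worker(do_put):
        outcome = put("rdd_0_0", do_put)
        with lock:
            results.append(outcome)

    first = threading.Thread(target=worker, args=(failing_put,))
    first.start()
    info = None
    while info is None:  # wait until the first thread has registered
        with lock:
            info = block_infos.get("rdd_0_0")
        time.sleep(0.001)
    others = [threading.Thread(target=worker, args=(lambda: True,))
              for _ in range(2)]
    for t in others:
        t.start()
    while True:  # wait until both waiters are blocked on the old info
        with info.cond:
            if info.waiters == 2:
                break
        time.sleep(0.001)
    release.set()
    for t in [first] + others:
        t.join()
    return sorted(results)
```

With this behaviour no waiter ever touches the failed info again, so the
"mark success" conflict cannot occur. The cost is that the block is not stored
even though a retry by one of the waiters might have succeeded.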