LuciferYang commented on PR #37206: URL: https://github.com/apache/spark/pull/37206#issuecomment-1191014717
@JoshRosen Yes, your analysis is very accurate. From the current stack trace, I can only infer that the following two methods may race with each other (though I haven't found any conclusive evidence), so I added read-write locks to these two methods in the initial PR.

https://github.com/apache/spark/blob/66b1f79b72855af35351ff995492f2c13872dac5/core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala#L257-L261

I didn't expect there could be a correctness bug here. If that is really possible, I don't think the current clues are enough to troubleshoot the problem.

@smcmullan-ncirl Could you provide more details so we can investigate further? In particular, I'd like to know which thread is doing the writing. It seems that only you can reliably reproduce this problem right now :)
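
For readers following the thread, here is a minimal sketch of the read-write-lock idea mentioned above. It is written in Java for brevity and is not the code from the PR or from `TaskMetrics.scala`. It assumes the two methods are a writer that appends to a shared buffer and a reader that copies it, which is only an inference from this comment. The class name `GuardedRegistry` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a class that owns a mutable buffer shared
// between a writing thread and a reading thread. Not Spark's TaskMetrics.
final class GuardedRegistry<T> {
    private final List<T> items = new ArrayList<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Writer path: take the exclusive lock while appending.
    void register(T item) {
        lock.writeLock().lock();
        try {
            items.add(item);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Reader path: take the shared lock and return a copy, so callers
    // never iterate over the live buffer while it is being modified.
    List<T> snapshot() {
        lock.readLock().lock();
        try {
            return new ArrayList<>(items);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

A lock like this only prevents concurrent modification while iterating. It would not by itself explain or fix a correctness problem if a thread is writing when no writes are expected, which is why identifying the writing thread matters.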
