srowen commented on a change in pull request #25552: [SPARK-28849][CORE] Add a 
number to control transferTo calls to avoid infinite loop in some occasional 
cases
URL: https://github.com/apache/spark/pull/25552#discussion_r317130417
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/util/Utils.scala
 ##########
 @@ -417,16 +418,19 @@ private[spark] object Utils extends Logging {
       input: FileChannel,
       output: WritableByteChannel,
       startPosition: Long,
-      bytesToCopy: Long): Unit = {
+      bytesToCopy: Long,
+      numTransferToCalls: Int): Unit = {
     val outputInitialState = output match {
       case outputFileChannel: FileChannel =>
         Some((outputFileChannel.position(), outputFileChannel))
       case _ => None
     }
     var count = 0L
+    var num = 0
     // In case transferTo method transferred less data than we have required.
-    while (count < bytesToCopy) {
+    while (count < bytesToCopy && num < numTransferToCalls) {
       count += input.transferTo(count + startPosition, bytesToCopy - count, output)
 
 Review comment:
   Can you log the return value of `transferTo` locally in some way to test 
this further? I'm not sure how easy it is to reproduce. In the absence of 
evidence, I think checking for a return value <= 0 is the most that is safe. 
The issue needs to be understood better before any change is made.
   
   In any event, I don't think you can ever just stop copying data and continue 
successfully. You would have to fail the job. 
   
   If the job is merely slow, at best you'd want to introduce some kind of 
timeout at a higher level, as there are all kinds of ways it can be slow. 
That would also take care of your current situation.
   
   But you can always kill slow jobs directly, without any change to Spark. 
This doesn't fix the cause anyway (assuming it's outside Spark). 
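
To make the `<= 0` suggestion concrete, here is a minimal sketch of what such a check might look like. It is not the code in `Utils.scala`: the object name `TransferToSketch`, the helper `copyOrFail`, and the use of `System.err` for logging are all assumptions made for illustration. It logs each `transferTo` return value and throws instead of silently stopping the copy.

```scala
import java.io.IOException
import java.nio.channels.{FileChannel, WritableByteChannel}

// Hypothetical helper for illustration only; not part of Spark.
object TransferToSketch {
  def copyOrFail(
      input: FileChannel,
      output: WritableByteChannel,
      startPosition: Long,
      bytesToCopy: Long): Long = {
    var count = 0L
    while (count < bytesToCopy) {
      val transferred =
        input.transferTo(count + startPosition, bytesToCopy - count, output)
      // Log every return value while investigating the reported loop.
      System.err.println(
        s"transferTo returned $transferred (copied $count of $bytesToCopy)")
      if (transferred <= 0) {
        // No progress: fail loudly rather than continue with a partial copy.
        throw new IOException(
          s"transferTo made no progress after $count of $bytesToCopy bytes")
      }
      count += transferred
    }
    count
  }
}
```

With a check like this the caller sees an exception and the task fails, so a partial copy is never treated as a success.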

