HeartSaVioR commented on a change in pull request #25552: [SPARK-28849][CORE]
Add a number to control transferTo calls to avoid infinite loop in some
occasional cases
URL: https://github.com/apache/spark/pull/25552#discussion_r317032369
##########
File path: core/src/main/scala/org/apache/spark/util/Utils.scala
##########
@@ -417,16 +418,19 @@ private[spark] object Utils extends Logging {
       input: FileChannel,
       output: WritableByteChannel,
       startPosition: Long,
-      bytesToCopy: Long): Unit = {
+      bytesToCopy: Long,
+      numTransferToCalls: Int): Unit = {
     val outputInitialState = output match {
       case outputFileChannel: FileChannel =>
         Some((outputFileChannel.position(), outputFileChannel))
       case _ => None
     }
     var count = 0L
+    var num = 0
     // In case transferTo method transferred less data than we have required.
-    while (count < bytesToCopy) {
+    while (count < bytesToCopy && num < numTransferToCalls) {
       count += input.transferTo(count + startPosition, bytesToCopy - count, output)
Review comment:
Thanks @srowen for providing a good example. That's exactly what I meant in my first sentence.
> Also it is not so good to add wait here, it will delay the shuffle write critical path, and in a such scenario adding delay will not recover anything, only just lower the system usage.
If we assume I/O will never recover, then yes, maybe it doesn't help, but we don't have any evidence to back that up. We're just reasoning that if the loop really runs forever, the returned count should be 0 continuously. If that is the behavior we suspect, checking for 0 addresses the suspicion more directly, right? The advantage of checking `count` is that we're not restricted to comparing it with `0`, as I'll also describe below.
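To make this concrete, here is a rough, untested sketch of what checking the returned count against 0 could look like. The method name and the `maxConsecutiveZeroReturns` parameter are purely illustrative, not something proposed in this PR:

```scala
import java.io.IOException
import java.nio.channels.{FileChannel, WritableByteChannel}

// Hypothetical sketch, not this PR's change: give up only when transferTo keeps
// returning 0 bytes in a row, rather than capping the total number of calls.
def copyWithZeroReturnGuard(
    input: FileChannel,
    output: WritableByteChannel,
    startPosition: Long,
    bytesToCopy: Long,
    maxConsecutiveZeroReturns: Int): Unit = {
  var count = 0L
  var zeroReturns = 0
  while (count < bytesToCopy) {
    val transferred = input.transferTo(count + startPosition, bytesToCopy - count, output)
    if (transferred == 0L) {
      zeroReturns += 1
      if (zeroReturns >= maxConsecutiveZeroReturns) {
        throw new IOException(s"transferTo returned 0 bytes $zeroReturns times in a row " +
          s"at position ${count + startPosition}; giving up")
      }
    } else {
      zeroReturns = 0 // progress was made, reset the counter
    }
    count += transferred
  }
}
```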
If you're concerned about the shuffle write being on the critical path and worry about latency, you probably have some kind of SLA in mind (like N MiB/s), and measuring the actual throughput would give you more information. You may want to just kill the task when it's writing a huge shuffle but Spark is writing at 1 KiB/s or less for 10 minutes: at that rate it writes less than 1 MiB in 10 minutes, and writing 1 GiB of shuffle with such throughput is effectively stuck. At the very least you may want to log a debug or warning message when the throughput over the last N minutes doesn't make sense (I wouldn't say which number is good; maybe some heuristic comes in, I don't know).
We can still get some hints by figuring out the relation between the returned count and the latency of `transferTo`. If they are tightly related, we may have a chance to leverage that fact. If not, the elapsed time to finish the loop after N iterations is unpredictable, and you may still waste minutes or even hours waiting for the loop count to reach the configured number.
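Even throwaway instrumentation like the following would tell us whether small returned counts correlate with long `transferTo` calls (again just a sketch, assuming the same local variables as above; `logDebug` comes from `Logging`):

```scala
// Hypothetical instrumentation: record the returned byte count and the latency of
// each transferTo call so we can see whether the two are actually related.
var count = 0L
while (count < bytesToCopy) {
  val startNs = System.nanoTime()
  val transferred = input.transferTo(count + startPosition, bytesToCopy - count, output)
  val elapsedMicros = (System.nanoTime() - startNs) / 1000
  logDebug(s"transferTo returned $transferred bytes in $elapsedMicros us")
  count += transferred
}
```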
We will never reach a good answer unless some meaningful numbers are measured during the error situation. Everything for now is just assumption.