Hello Spark users,

I have a question about a sample Spark task I am analyzing. The task performs
remote (shuffle) fetches of a few blocks, but the reported remote fetch time
does not make sense to me. Can someone please help me interpret it?

The logs below come from the Spark REST API. Task ID 33 needs four blocks and
has to fetch three of them from remote machines. In its "shuffleReadMetrics"
section, however, "fetchWaitTime" is reported as 0, even though the task
fetches about 2.4 GB from remote machines.
Task ID 34 below, by contrast, fetches four blocks totaling around 3 GB and
shows a "fetchWaitTime" of about 2.4 seconds, which is the only value that
makes sense to me.

Is this an intended behavior?
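My current guess is that "fetchWaitTime" only counts the time the task actually blocks waiting for in-flight remote blocks, not the total transfer time, so a fully overlapped fetch could report 0. This is an assumption on my part, not something I have confirmed in the Spark source; the sketch below (a toy simulation, not Spark code) shows how such a metric can stay near zero while gigabytes move over the network:

```python
import time
import threading
import queue

# Toy model (NOT Spark internals): blocks are fetched by a background
# thread, and the "fetch wait time" clock only runs while the reader
# blocks on an empty queue.

def fetch_blocks(blocks, results, transfer_time=0.01):
    # Simulate the network: each block takes transfer_time to arrive.
    for b in blocks:
        time.sleep(transfer_time)
        results.put(b)

def read_with_wait_time(blocks, work_per_block=0.03):
    results = queue.Queue()
    threading.Thread(
        target=fetch_blocks, args=(blocks, results), daemon=True
    ).start()
    wait = 0.0
    for _ in blocks:
        t0 = time.monotonic()
        results.get()               # blocks only if no block is ready yet
        wait += time.monotonic() - t0
        time.sleep(work_per_block)  # processing overlaps with fetching
    return wait
```

If processing each block takes longer than fetching the next one, the reader rarely blocks and the measured wait time stays near zero even though every byte came over the simulated network. That would explain a zero "fetchWaitTime" alongside a large "remoteBytesRead" for task 33, but I would like someone to confirm.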

    "33" : {
      "taskId" : 33,
       ....
      "taskMetrics" : {
        ....
        "shuffleReadMetrics" : {
          *"remoteBlocksFetched" : 3,*
          "localBlocksFetched" : 1,
          *"fetchWaitTime" : 0,*
          *"remoteBytesRead" : 2401539138,*
          "localBytesRead" : 800513041,
          "recordsRead" : 4
        },
      }
    },
    "34" : {
      "taskId" : 34,
      ....
      "taskMetrics" : {
        ....
        "shuffleReadMetrics" : {
          *"remoteBlocksFetched" : 4,*
          "localBlocksFetched" : 0,
          *"fetchWaitTime" : 2416,*
          *"remoteBytesRead" : 3202052194,*
          "localBytesRead" : 0,
          "recordsRead" : 4
        },
      }
    },
