Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/21606#discussion_r197542704
--- Diff: core/src/main/scala/org/apache/spark/internal/io/SparkHadoopWriter.scala ---
@@ -104,12 +104,12 @@ object SparkHadoopWriter extends Logging {
       jobTrackerId: String,
       commitJobId: Int,
       sparkPartitionId: Int,
-      sparkAttemptNumber: Int,
+      sparkTaskId: Long,
       committer: FileCommitProtocol,
       iterator: Iterator[(K, V)]): TaskCommitMessage = {
     // Set up a task.
     val taskContext = config.createTaskAttemptContext(
-      jobTrackerId, commitJobId, sparkPartitionId, sparkAttemptNumber)
+      jobTrackerId, commitJobId, sparkPartitionId, sparkTaskId.toInt)
--- End diff ---
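As a side note on the `.toInt` narrowing in the new code path: a self-contained sketch of what happens when a Long TID exceeds Int range. The value below is illustrative, not taken from the PR.

```scala
// Sketch: .toInt keeps only the low 32 bits of a Long, so a TID past
// Int.MaxValue wraps around (here to 2). This is the narrowing the diff
// applies before building the Hadoop task attempt id.
object TidTruncation extends App {
  val tid: Long = 4294967298L         // illustrative TID = 2^32 + 2
  val narrowed: Int = tid.toInt       // low 32 bits only
  println(s"tid=$tid narrowed=$narrowed")  // prints: tid=4294967298 narrowed=2
}
```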
I commented before I saw this thread, but I think it is better to use the
TID because it is already exposed in the UI, which makes it easier to correlate
tasks in the UI with log entries. The combined attempt number isn't used anywhere
else, so it would introduce yet another number to identify a task. Besides,
shifting by 16 means those values grow huge regardless.
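For readers following along, a minimal sketch of the two identifiers being compared. The `(stageAttempt << 16) | taskAttempt` encoding is my reading of the "shifting by 16" proposal in this thread, not code from the PR, and the input values are hypothetical.

```scala
// Sketch of the two candidate task identifiers discussed above.
object AttemptIdSchemes extends App {
  // Hypothetical retry counts for illustration.
  val stageAttemptNumber: Int = 3     // stage retry count
  val taskAttemptNumber: Int = 1      // per-task retry count

  // Option A (assumed encoding): pack both counters into one Int.
  // Even small retry counts produce large values: 3 << 16 = 196608.
  val combined: Int = (stageAttemptNumber << 16) | taskAttemptNumber
  println(s"combined attempt number = $combined")  // prints: 196609

  // Option B: use the TID, i.e. TaskContext.get().taskAttemptId() inside a
  // running task. It is already unique per attempt and shown in the Spark UI,
  // so logs and the UI line up without introducing a new number.
}
```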