Github user rdblue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21606#discussion_r197542704

    --- Diff: core/src/main/scala/org/apache/spark/internal/io/SparkHadoopWriter.scala ---
    @@ -104,12 +104,12 @@ object SparkHadoopWriter extends Logging {
           jobTrackerId: String,
           commitJobId: Int,
           sparkPartitionId: Int,
    -      sparkAttemptNumber: Int,
    +      sparkTaskId: Long,
           committer: FileCommitProtocol,
           iterator: Iterator[(K, V)]): TaskCommitMessage = {
         // Set up a task.
         val taskContext = config.createTaskAttemptContext(
    -      jobTrackerId, commitJobId, sparkPartitionId, sparkAttemptNumber)
    +      jobTrackerId, commitJobId, sparkPartitionId, sparkTaskId.toInt)
    --- End diff --

    I commented before I saw this thread, but I think it is better to use the TID because it is already exposeded in the UI, which makes it easier to correlate tasks in the UI with log entries. The combined attempt number isn't used anywhere else, so it would introduce yet another number that identifies a task. Besides, shifting by 16 means those combined numbers grow huge anyway.
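    To illustrate the "shifting by 16" point: a minimal sketch, assuming the combined attempt number packs a stage attempt into the high bits and a task attempt into the low bits (the variable names below are hypothetical, chosen only for this illustration):

    ```scala
    // Hypothetical illustration of a combined attempt number that
    // shifts one counter left by 16 bits and ORs in another.
    object CombinedAttemptDemo {
      def combine(stageAttempt: Int, taskAttempt: Int): Int =
        (stageAttempt << 16) | taskAttempt

      def main(args: Array[String]): Unit = {
        // Even a single retry pushes the combined value past 65535,
        // so these identifiers become large very quickly.
        println(combine(0, 0)) // 0
        println(combine(1, 0)) // 65536
        println(combine(1, 3)) // 65539
      }
    }
    ```

    By contrast, the TID is a plain monotonically increasing counter that the UI already shows, so it stays small and is directly searchable in the logs. Note that it is a `Long`, so the `.toInt` in the diff truncates it; that is a separate concern from which number to use.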