GitHub user rxin opened a pull request:

    https://github.com/apache/spark/pull/1259

    [SPARK-2317] Improve task logging.

    We use TID to indicate task logging. However, TID itself does not capture 
stage or retries, making it harder to correlate with the application itself. 
This pull request changes all logging messages for tasks to include both the 
TID and the stage id, stage attempt, task id, and task attempt.  I've consulted 
various people but unfortunately this is a really hard task. 
    
    Driver log looks like:
    
    ```
    14/06/28 18:53:29 INFO DAGScheduler: Submitting 10 missing tasks from Stage 
0 (MappedRDD[1] at map at <console>:13)
    14/06/28 18:53:29 INFO TaskSchedulerImpl: Adding task set 0.0 with 10 tasks
    14/06/28 18:53:29 INFO TaskSetManager: Re-computing pending task lists.
    14/06/28 18:53:29 INFO TaskSetManager: Starting task 0.0:0.0 (TID 0, 
localhost, PROCESS_LOCAL, 1203 bytes, 5 ms)
    14/06/28 18:53:29 INFO TaskSetManager: Starting task 0.0:1.0 (TID 1, 
localhost, PROCESS_LOCAL, 1203 bytes, 3 ms)
    14/06/28 18:53:29 INFO TaskSetManager: Starting task 0.0:2.0 (TID 2, 
localhost, PROCESS_LOCAL, 1203 bytes, 3 ms)
    14/06/28 18:53:29 INFO TaskSetManager: Starting task 0.0:3.0 (TID 3, 
localhost, PROCESS_LOCAL, 1203 bytes, 3 ms)
    14/06/28 18:53:29 INFO TaskSetManager: Starting task 0.0:4.0 (TID 4, 
localhost, PROCESS_LOCAL, 1203 bytes, 3 ms)
    14/06/28 18:53:29 INFO TaskSetManager: Starting task 0.0:5.0 (TID 5, 
localhost, PROCESS_LOCAL, 1203 bytes, 3 ms)
    14/06/28 18:53:29 INFO TaskSetManager: Starting task 0.0:6.0 (TID 6, 
localhost, PROCESS_LOCAL, 1203 bytes, 3 ms)
    14/06/28 18:53:29 INFO TaskSetManager: Starting task 0.0:7.0 (TID 7, 
localhost, PROCESS_LOCAL, 1203 bytes, 3 ms)
    14/06/28 18:53:29 INFO TaskSetManager: Starting task 0.0:8.0 (TID 8, 
localhost, PROCESS_LOCAL, 1203 bytes, 5 ms)
    14/06/28 18:53:29 INFO TaskSetManager: Finished 0.0:7.0 (TID 7) in 50 ms on 
localhost (1/10)
    14/06/28 18:53:29 INFO TaskSetManager: Starting task 0.0:9.0 (TID 9, 
localhost, PROCESS_LOCAL, 1203 bytes, 4 ms)
    14/06/28 18:53:29 INFO Executor: Running task 0.0:9.0 (TID 9)
    14/06/28 18:53:29 INFO TaskSetManager: Finished 0.0:5.0 (TID 5) in 68 ms on 
localhost (2/10)
    14/06/28 18:53:29 INFO TaskSetManager: Finished 0.0:6.0 (TID 6) in 70 ms on 
localhost (3/10)
    14/06/28 18:53:29 INFO Executor: Finished task 0.0:9.0 (TID 9). 517 bytes 
result sent to driver
    14/06/28 18:53:29 INFO TaskSetManager: Finished 0.0:4.0 (TID 4) in 80 ms on 
localhost (4/10)
    14/06/28 18:53:29 INFO TaskSetManager: Finished 0.0:3.0 (TID 3) in 85 ms on 
localhost (5/10)
    14/06/28 18:53:29 INFO TaskSetManager: Finished 0.0:2.0 (TID 2) in 91 ms on 
localhost (6/10)
    14/06/28 18:53:29 INFO TaskSetManager: Finished 0.0:1.0 (TID 1) in 98 ms on 
localhost (7/10)
    14/06/28 18:53:29 INFO TaskSetManager: Finished 0.0:0.0 (TID 0) in 109 ms 
on localhost (8/10)
    14/06/28 18:53:29 INFO TaskSetManager: Finished 0.0:8.0 (TID 8) in 39 ms on 
localhost (9/10)
    14/06/28 18:53:29 INFO TaskSetManager: Finished 0.0:9.0 (TID 9) in 30 ms on 
localhost (10/10)
    ```
    
    Executor log looks like
    ```
    14/06/28 18:53:29 INFO Executor: Running task 0.0:0.0 (TID 0)
    14/06/28 18:53:29 INFO Executor: Running task 0.0:1.0 (TID 1)
    14/06/28 18:53:29 INFO Executor: Running task 0.0:5.0 (TID 5)
    14/06/28 18:53:29 INFO Executor: Running task 0.0:3.0 (TID 3)
    14/06/28 18:53:29 INFO Executor: Running task 0.0:2.0 (TID 2)
    14/06/28 18:53:29 INFO Executor: Running task 0.0:6.0 (TID 6)
    14/06/28 18:53:29 INFO Executor: Running task 0.0:4.0 (TID 4)
    14/06/28 18:53:29 INFO Executor: Running task 0.0:7.0 (TID 7)
    14/06/28 18:53:29 INFO Executor: Finished task 0.0:0.0 (TID 0). 517 bytes 
result sent to driver
    14/06/28 18:53:29 INFO Executor: Finished task 0.0:3.0 (TID 3). 517 bytes 
result sent to driver
    14/06/28 18:53:29 INFO Executor: Finished task 0.0:5.0 (TID 5). 517 bytes 
result sent to driver
    14/06/28 18:53:29 INFO Executor: Finished task 0.0:4.0 (TID 4). 517 bytes 
result sent to driver
    14/06/28 18:53:29 INFO Executor: Finished task 0.0:1.0 (TID 1). 517 bytes 
result sent to driver
    14/06/28 18:53:29 INFO Executor: Finished task 0.0:7.0 (TID 7). 517 bytes 
result sent to driver
    14/06/28 18:53:29 INFO Executor: Finished task 0.0:2.0 (TID 2). 517 bytes 
result sent to driver
    14/06/28 18:53:29 INFO Executor: Finished task 0.0:6.0 (TID 6). 517 bytes 
result sent to driver
    14/06/28 18:53:29 INFO Executor: Running task 0.0:8.0 (TID 8)
    14/06/28 18:53:29 INFO Executor: Finished task 0.0:8.0 (TID 8). 517 bytes 
result sent to driver
    ```

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rxin/spark betterTaskLogging

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1259.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1259
    
----
commit 53888e3e97747c8a43a86fb3b219e672992619f6
Author: Reynold Xin <[email protected]>
Date:   2014-06-29T01:54:33Z

    [SPARK-2317] Improve task logging.

commit 4659b2008ab2e0e710a19d8cbd93004168a44188
Author: Reynold Xin <[email protected]>
Date:   2014-06-29T01:57:09Z

    Remove unused variable.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

Reply via email to