Github user ericl commented on the pull request:

    https://github.com/apache/spark/pull/12248#issuecomment-207581308
  
    @srowen, suppose you have an existing service running Spark jobs that read 
from a custom datasource. You want to add log4j trace annotations in order to 
attribute datasource logs back to the original caller of the service. However, 
you want to avoid invasive changes to the existing code. With the proposed API 
this is a two-line change:
    
    ```
    // in RPC server running as driver
    def receive(request: RPC) {
        sc.setLocalProperty("traceId", request.traceId)  // add this line
        ...
    }
    
    // in datasource library running on executors
    def handleRead(...) {
        log4j.MDC.put("traceId", TaskContext.getLocalProperty("traceId"))  // add this line
        ...
    }
    ```
    
    The alternative is to explicitly reference `traceId` in each of the tasks, 
but this would clutter application code with many references to diagnostic 
info, discouraging the use of diagnostic tools.
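    
    For contrast, here is a rough sketch of that alternative (the RDD pipeline 
and `process` are hypothetical stand-ins for real application code, not part of 
the proposal): every closure that might log has to capture the trace ID and 
re-install it itself.
    
    ```
    // in RPC server running as driver -- the alternative, without this API
    def receive(request: RPC) {
        val traceId = request.traceId  // captured explicitly in the driver
        sc.textFile("hdfs://...")
          .map { line =>
            org.apache.log4j.MDC.put("traceId", traceId)  // repeated in every task closure
            process(line)
          }
          .count()
    }
    ```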

