GitHub user andrewor14 opened a pull request:

    https://github.com/apache/spark/pull/10815

    [SPARK-12887] Do not expose var's in TaskMetrics

    This is a step in implementing SPARK-10620, which migrates TaskMetrics
    to accumulators.
    
    TaskMetrics has a number of var's; some are fully public and others are
    `private[spark]`. This is bad coding style because it makes it easy to
    accidentally overwrite previously set metrics. That has happened a few
    times in the past and caused bugs that were difficult to track down.
    
    Instead, we should have get-or-create semantics, which are easier to
    understand. This makes sense for TaskMetrics because these are just
    aggregated metrics that we collect throughout the task, so it doesn't
    matter who increments them.
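
As a rough illustration of the get-or-create idea described above, here is a minimal Scala sketch. It is not the code in this pull request: `TaskMetricsSketch`, `OutputMetricsSketch`, `incBytesWritten` and `GetOrCreateDemo` are hypothetical names chosen for the example, and only the `register*` naming is taken from the commit log. The point is that callers can no longer assign the field; they can only obtain the existing instance or have one created.

```scala
// Hypothetical, simplified stand-ins for illustration only;
// these are NOT the actual Spark classes or their signatures.
class OutputMetricsSketch {
  private var _bytesWritten: Long = 0L
  def bytesWritten: Long = _bytesWritten
  def incBytesWritten(n: Long): Unit = _bytesWritten += n
}

class TaskMetricsSketch {
  // The field is private, so no caller can overwrite a previously set value.
  private var _outputMetrics: Option[OutputMetricsSketch] = None

  // Read-only view of the current state.
  def outputMetrics: Option[OutputMetricsSketch] = _outputMetrics

  // Get-or-create: return the existing metrics object if there is one,
  // otherwise create it, remember it, and return it.
  def registerOutputMetrics(): OutputMetricsSketch = synchronized {
    _outputMetrics.getOrElse {
      val created = new OutputMetricsSketch
      _outputMetrics = Some(created)
      created
    }
  }
}

object GetOrCreateDemo {
  // Two independent call sites increment the same underlying metrics object.
  def run(): Long = {
    val tm = new TaskMetricsSketch
    tm.registerOutputMetrics().incBytesWritten(100L)
    tm.registerOutputMetrics().incBytesWritten(50L)
    tm.outputMetrics.map(_.bytesWritten).getOrElse(0L)
  }
}
```

With a public var, the second call site could have replaced the first one's object and silently dropped its 100 bytes; in this sketch both increments land on the same instance.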

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/andrewor14/spark get-or-create-metrics

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10815.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10815
    
----
commit 62c96e1cdc472356dfbfb24cf9650a8f36017224
Author: Andrew Or <[email protected]>
Date:   2016-01-18T19:50:04Z

    Add register* methods (get or create)

commit b9d7fbf37cc410d44e462d9d08650a20decc8fc9
Author: Andrew Or <[email protected]>
Date:   2016-01-18T20:10:17Z

    Clean up places where we set OutputMetrics
    
    Note: there's one remaining place, which is JsonProtocol.

commit 078598409225224f0532a45f34dae533695b25df
Author: Andrew Or <[email protected]>
Date:   2016-01-18T20:20:28Z

    Replace set with register
    
    JsonProtocol remains the only place where we still call set
    on each of the *Metrics classes.

commit ad094f071472b9cf7b9f9bdb7cd00d88c402995d
Author: Andrew Or <[email protected]>
Date:   2016-01-18T20:30:59Z

    Clean up JsonProtocol
    
    This commit collapsed 10 methods into 2. The 8 that were inlined
    were only used in 1 place each, and the body of each was quite
    small. The additional level of abstraction did not add much value
    and made the code verbose.

commit 34c7ce5bf724c781a37c352277f7c5cd86d33c9a
Author: Andrew Or <[email protected]>
Date:   2016-01-18T20:46:42Z

    Hide updatedBlocks

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---