Github user mridulm commented on the issue:

    https://github.com/apache/spark/pull/17596
  
    My comments were based on a fix in 1.6; actually lot of values were 
actually observed to be 0 for a lot of cases - just a few were not (even here 
it is relevant - resultSize, gctime, various bytes spilled, etc). The bitmask 
actually ends up being a single long for the cardinality of metrics we have - 
which typically gets encoded in a byte or two in reality.
    
    In addition, things like input/output/shuffle metrics, accumulator updates, 
block updates, etc (present in 1.6 in TaskMetrics) - can all be avoided in the 
serialized stream when not present.
    When present, I used read/write External on those classes to directly 
encode the values.
    
    IIRC this is relevant not just for the final result, but for heartbeats 
also - so serde saving helped a lot more than initially expected.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to