JoshRosen opened a new pull request, #37027:
URL: https://github.com/apache/spark/pull/37027

   ### What changes were proposed in this pull request?
   
   This PR fixes three longstanding bugs in Spark's `JsonProtocol`:
   
   - `TaskResourceRequest` loses precision for `amount` < 0.5. The `amount` is 
a floating point number which is either between 0 and 0.5 or is a positive 
integer, but the JSON read path assumes it is an integer.
   - `ExecutorResourceRequest` amounts overflow for values larger than 
`Int.MaxValue` because the write path writes longs but the read path assumes 
integers.
   - Off-heap StorageLevels are not handled properly: the `useOffHeap` field 
isn't included in the JSON, so an off-heap StorageLevel cannot be round-tripped 
through JSON. This could cause the History Server to display inaccurate "off 
heap memory used" stats on the executors page.
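
The three failure modes above can be sketched in plain Scala without Spark or a JSON library. This is only an illustration, not the actual `JsonProtocol` code: the object name, the `Level` case class, the `write`/`read` helpers, and the map keys are hypothetical stand-ins, and the numeric conversions merely mimic a read path that treats every amount as an `Int`.

```scala
// Illustrative sketch only: names below are hypothetical, not Spark's API.
object JsonRoundTripSketch {
  // Bug 1: a fractional task resource amount read back as an integer.
  val taskAmount: Double = 0.25
  val taskAmountReadAsInt: Int = taskAmount.toInt // fractional part is dropped

  // Bug 2: an executor resource amount written as a Long but read as an Int.
  val executorAmount: Long = Int.MaxValue.toLong + 1L
  val executorAmountReadAsInt: Int = executorAmount.toInt // wraps around

  // Bug 3: a field that is never written cannot be recovered on read.
  final case class Level(useDisk: Boolean, useMemory: Boolean, useOffHeap: Boolean)

  // Writer that omits useOffHeap, standing in for the pre-fix JSON output.
  def write(level: Level): Map[String, Boolean] =
    Map("useDisk" -> level.useDisk, "useMemory" -> level.useMemory)

  // Reader that falls back to false when the field is absent.
  def read(fields: Map[String, Boolean]): Level =
    Level(
      useDisk = fields("useDisk"),
      useMemory = fields("useMemory"),
      useOffHeap = fields.getOrElse("useOffHeap", false)
    )

  val offHeap: Level = Level(useDisk = true, useMemory = true, useOffHeap = true)
  val roundTripped: Level = read(write(offHeap))
}
```

In this sketch `taskAmountReadAsInt` is `0`, `executorAmountReadAsInt` is `Int.MinValue`, and `roundTripped` differs from `offHeap` only in `useOffHeap`, which mirrors the three kinds of information loss listed above.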
   
   I discovered these bugs while working on #36885.
   
   ### Why are the changes needed?
   
   `JsonProtocol` should be able to round-trip events through JSON without loss of 
information.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes: it fixes bugs that impact information shown in the History Server Web 
UI. The new StorageLevel JSON field will be visible to tools which process raw 
event log JSON.
   
   ### How was this patch tested?
   
   Updated existing unit tests to cover the changed logic.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

