GitHub user superbobry commented on the issue:

    https://github.com/apache/spark/pull/19992
  
    Minor update: I've simulated #18162 on one of our 80G event logs and 
(unless there is a bug in the filtering code) the log shrank to 157M. The 
effect of this patch was almost negligible; it brought the size down to 155M. 
It is unclear for now whether this generalizes to other workloads.


---
