Michael Taranov created SPARK-38703:
---------------------------------------

             Summary: High GC and memory footprint after switch to ZSTD
                 Key: SPARK-38703
                 URL: https://issues.apache.org/jira/browse/SPARK-38703
             Project: Spark
          Issue Type: Question
          Components: Input/Output
    Affects Versions: 3.1.2
            Reporter: Michael Taranov


Hi All,

We have started switching our Spark pipelines to read Parquet files compressed 
with ZSTD. Since the switch, the memory footprint is much larger than it was 
with SNAPPY.

In addition, the jobs' GC statistics are much higher than with SNAPPY for the 
same workload.

Are there any configurations relevant to the read path that may help in such 
cases?
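
As a starting point for experiments, the sketch below renders a few existing 
Spark SQL read-path settings as spark-submit --conf flags. The setting names 
are real Spark SQL options; the values and the helper names (READ_PATH_CONF, 
to_submit_args, to_command_line) are illustrative assumptions, and whether any 
of these settings reduces the memory footprint or GC activity with ZSTD has 
not been verified here.

```python
import shlex

# Existing Spark SQL settings that affect the Parquet read path.
# The values are illustrative assumptions, not tuned recommendations.
READ_PATH_CONF = {
    # Toggle the vectorized (columnar batch) Parquet reader.
    "spark.sql.parquet.enableVectorizedReader": "true",
    # Rows per columnar batch in the vectorized reader.
    "spark.sql.parquet.columnarReaderBatchSize": "2048",
    # Maximum bytes packed into a single file-scan partition (64 MiB here).
    "spark.sql.files.maxPartitionBytes": str(64 * 1024 * 1024),
}


def to_submit_args(conf):
    """Return the settings as a flat list of spark-submit arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


def to_command_line(conf):
    """Return the arguments as a single shell-safe string."""
    return " ".join(shlex.quote(arg) for arg in to_submit_args(conf))


if __name__ == "__main__":
    print(to_command_line(READ_PATH_CONF))
```

Running the same job with SNAPPY and ZSTD inputs under identical flags, and 
changing one setting at a time, keeps the comparison of memory and GC numbers 
fair.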



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
