Hi Team,
I am trying to read a Parquet file, cache it, apply a transformation, and
then overwrite the same Parquet file, all within a single session.
However, the first count() action does not seem to cache the DataFrame; it
only appears to get cached when I cache the transformed DataFrame.
Even with spark.sql.parquet.cacheMetadata = true, the write operation still
destroys the cache.
Is this expected? What is the relevance of this config setting?
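
For reference, here is a minimal sketch of the flow I am describing. The
path and the withColumn transformation are just placeholders; any Parquet
dataset should reproduce the same sequence:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("cache-overwrite-question").getOrCreate()

path = "/tmp/data.parquet"  # hypothetical path

df = spark.read.parquet(path)
df.cache()
df.count()  # I expected this to materialize the cache, but it does not seem to

transformed = df.withColumn("flag", F.lit(1))  # placeholder transformation
transformed.cache()
transformed.count()  # at this point df also appears to get cached

# Overwriting the source path is where the cached data gets destroyed,
# even with spark.sql.parquet.cacheMetadata = true.
transformed.write.mode("overwrite").parquet(path)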

We are using PySpark with Spark in cluster mode.
Regards
Parag Mohanty
