[jira] [Created] (HUDI-5080) UnpersistRdds unpersist all rdds in the spark context

sivabalan narayanan (Jira) Sat, 22 Oct 2022 17:16:05 -0700

sivabalan narayanan created HUDI-5080:
-----------------------------------------


             Summary: UnpersistRdds unpersist all rdds in the spark context
                 Key: HUDI-5080
                 URL: https://issues.apache.org/jira/browse/HUDI-5080
             Project: Apache Hudi
          Issue Type: Bug
          Components: writer-core
            Reporter: sivabalan narayanan


In SparkRDDWriteClient, we have a method that cleans up persisted RDDs to free 
the space they occupy. 

[https://github.com/apache/hudi/blob/b78c3441c4e28200abec340eaff852375764cbdb/hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDWriteClient.java#L584]

The issue is that it unpersists every persisted RDD in the given Spark context, 
not only the ones persisted by this write client. This affects async compaction 
and any other async table services running in the same context. 

Likewise, if multiple streams are writing to different tables through the same 
Spark context, the impact can be significant. 
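
To make the concern concrete, here is a minimal sketch in plain Java that models 
the behavior without depending on Spark or Hudi. The map `contextPersisted` is a 
stand-in for the context-wide set of persisted RDDs, and the `Writer` class with 
its `unpersistAll` and `unpersistOwned` methods is hypothetical, introduced only 
for illustration. It is not the actual SparkRDDWriteClient code, and tracking 
owned ids is just one possible way to scope the cleanup.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ScopedUnpersistSketch {

  // Stand-in for the context-wide registry of persisted RDDs (id -> name).
  static final Map<Integer, String> contextPersisted = new ConcurrentHashMap<>();

  // Hypothetical writer that remembers which ids it persisted itself.
  static class Writer {
    private final Set<Integer> ownedIds = new HashSet<>();

    void persist(int id, String name) {
      contextPersisted.put(id, name);
      ownedIds.add(id);
    }

    // Behavior described in this issue: drops everything in the shared context.
    void unpersistAll() {
      contextPersisted.clear();
      ownedIds.clear();
    }

    // Scoped alternative (an assumption): drops only what this writer persisted.
    void unpersistOwned() {
      for (Integer id : ownedIds) {
        contextPersisted.remove(id);
      }
      ownedIds.clear();
    }
  }

  public static void main(String[] args) {
    Writer ingest = new Writer();
    Writer compaction = new Writer();
    ingest.persist(1, "ingest-records");
    compaction.persist(2, "compaction-file-slices");

    // Only the ingest writer's entry is removed; the compaction entry survives.
    ingest.unpersistOwned();
    System.out.println(contextPersisted.keySet());
  }
}
```

Calling `unpersistAll()` from the ingest writer instead would also evict the 
entry that the concurrently running compaction still relies on, which is the 
situation this issue describes.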


--
This message was sent by Atlassian Jira
(v8.20.10#820010)
