Peng Zhong created SPARK-43474:
----------------------------------

             Summary: Add support to create DataFrame Reference in Spark connect
                 Key: SPARK-43474
                 URL: https://issues.apache.org/jira/browse/SPARK-43474
             Project: Spark
          Issue Type: Task
          Components: Connect, Structured Streaming
    Affects Versions: 3.5.0
            Reporter: Peng Zhong


Add support in Spark Connect for caching a DataFrame on the server side. The
client can then create a reference to that DataFrame from its cache key.

 

This function will be used in streaming foreachBatch(). The server must call
the user function for every batch, and that function takes a DataFrame as an
argument. With the new function, the server can cache the batch DataFrame and
pass its id back to the client, which then creates a DataFrame reference from
it. The server replaces the reference with the cached DataFrame when
transforming the plan.
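
The flow described above could be sketched roughly as follows. This is a toy
model in plain Python, not the Spark Connect implementation: the names Server,
CachedDataFrameRef, cache_dataframe, transform and run_batch are all
hypothetical, a list of tuples stands in for a DataFrame, and evicting the
entry after each batch is an assumption that the issue does not state.

```python
import uuid
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class CachedDataFrameRef:
    """Hypothetical plan node: refers to a DataFrame held on the server."""
    cache_key: str


class Server:
    """Toy stand-in for the server side; all names here are illustrative."""

    def __init__(self) -> None:
        self._cache: Dict[str, Any] = {}

    def cache_dataframe(self, df: Any) -> str:
        # Store the DataFrame and return the key the client will use.
        key = str(uuid.uuid4())
        self._cache[key] = df
        return key

    def transform(self, plan: Any) -> Any:
        # Replace a reference node with the cached DataFrame it points to.
        if isinstance(plan, CachedDataFrameRef):
            return self._cache[plan.cache_key]
        return plan

    def run_batch(self, batch_df: Any, batch_id: int,
                  client_fn: Callable[[str, int], None]) -> None:
        # Cache the batch, hand only the key to the client-side function,
        # then drop the entry (eviction policy is an assumption).
        key = self.cache_dataframe(batch_df)
        try:
            client_fn(key, batch_id)
        finally:
            self._cache.pop(key, None)


server = Server()
seen = []


def client_foreach_batch(cache_key: str, batch_id: int) -> None:
    # Client side: build a reference from the key and send it back in a plan.
    ref = CachedDataFrameRef(cache_key)
    seen.append((batch_id, server.transform(ref)))


server.run_batch([("a", 1), ("b", 2)], 0, client_foreach_batch)
```

Only the cache key crosses the client/server boundary in this sketch; the
batch data itself stays on the server and is looked up when the reference is
transformed.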



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

