Hi all,

There are ways to ship dependencies through the `addArtifacts` API in an existing session, but that requires the dependencies to be packaged properly (e.g. as a gzipped archive), and if the client and server run on a different kernel/OS it won't work either, I believe. What I am interested in is doing some sort of `pip install <package>` on the cluster from my client.
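For reference, this is roughly the `addArtifacts` route I was referring to, following the conda-pack pattern from the Spark Connect documentation as I understand it. It is only a sketch: the environment name, the `environment` alias, and the server address are placeholders, and it assumes the packed environment's platform matches the server, which is exactly where it falls apart when client and server differ.

```
import os
import conda_pack
from pyspark.sql import SparkSession

# An existing Spark Connect session (address is a placeholder).
spark = SparkSession.builder.remote("sc://<connect-server>:15002").getOrCreate()

# Pack the currently active conda environment into <env-name>.tar.gz on the client.
conda_pack.pack()

# Ship the archive; the "#environment" fragment unpacks it under that alias on the server.
spark.addArtifacts(f"{os.environ.get('CONDA_DEFAULT_ENV')}.tar.gz#environment", archive=True)

# Point the Python UDF workers at the interpreter inside the shipped environment.
spark.conf.set("spark.sql.execution.pyspark.python", "environment/bin/python")
```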
I came across this Databricks video, Dependency management in Spark Connect <https://youtu.be/PbvIak6Z8eI?feature=shared&t=679>, where the following functionality was mentioned, but I don't see it in the master branch <https://github.com/apache/spark/blob/master/python/pyspark/sql/connect/udf.py>. Is it only supported in Databricks, with no plans to open source it in the near future?

```
@udf(packages=["pandas==1.5.3", "pyarrow"])
def myudf():
    import pandas
```

-----

I had another question, about extending the Spark Connect client (and server) itself. If I want to add a new Spark Connect gRPC API, is there a way to add an additional proto in my own package (one that extends SparkSession from pyspark)? I looked into Spark Connect plugins, but they only allow transforming the plan etc., not adding a new API.

Regards,
Deependra