[ https://issues.apache.org/jira/browse/SPARK-40874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Peter Toth updated SPARK-40874: ------------------------------- The following Pyspark script: {noformat} bin/pyspark --conf spark.io.encryption.enabled=true ... bar = {"a": "aa", "b": "bb"} foo = spark.sparkContext.broadcast(bar) spark.udf.register("MYUDF", lambda x: foo.value[x] if x else "") spark.sql("SELECT MYUDF('a') AS a, MYUDF('b') AS b").collect() {noformat} fails with: {noformat} 22/10/21 17:14:32 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)/ 1] org.apache.spark.api.python.PythonException: Traceback (most recent call last): File "/Users/petertoth/git/apache/spark/python/lib/pyspark.zip/pyspark/worker.py", line 811, in main func, profiler, deserializer, serializer = read_command(pickleSer, infile) File "/Users/petertoth/git/apache/spark/python/lib/pyspark.zip/pyspark/worker.py", line 87, in read_command command = serializer._read_with_length(file) File "/Users/petertoth/git/apache/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 173, in _read_with_length return self.loads(obj) File "/Users/petertoth/git/apache/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 471, in loads return cloudpickle.loads(obj, encoding=encoding) EOFError: Ran out of input {noformat} > Fix broadcasts in Python UDFs when encryption is enabled > -------------------------------------------------------- > > Key: SPARK-40874 > URL: https://issues.apache.org/jira/browse/SPARK-40874 > Project: Spark > Issue Type: Bug > Components: PySpark > Affects Versions: 3.4.0 > Reporter: Peter Toth > Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org