GitHub user redsanket opened a pull request:
https://github.com/apache/spark/pull/23166
[SPARK-26201] Fix python broadcast with encryption
## What changes were proposed in this pull request?
With RPC and disk encryption enabled, creating a Python broadcast variable
and reading its value back on the driver side caused the job to fail with:
Traceback (most recent call last):
  File "broadcast.py", line 37, in <module>
    words_new.value
  File "/pyspark.zip/pyspark/broadcast.py", line 137, in value
  File "pyspark.zip/pyspark/broadcast.py", line 122, in load_from_path
  File "pyspark.zip/pyspark/broadcast.py", line 128, in load
EOFError: Ran out of input
To reproduce, use these configs:
--conf spark.network.crypto.enabled=true --conf spark.io.encryption.enabled=true
Code:
words_new = sc.broadcast(["scala", "java", "hadoop", "spark", "akka"])
words_new.value
print(words_new.value)
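
For context, `EOFError: Ran out of input` is the error CPython's `pickle.load` raises when the stream it reads from has no data left. The sketch below shows that behavior on its own, outside Spark. `load_broadcast_bytes` is a hypothetical helper, and the idea that the driver found no readable pickle data in the broadcast file is an illustrative assumption, not something stated in this pull request.

```python
import io
import pickle


def load_broadcast_bytes(data):
    """Unpickle raw bytes through a file-like stream (hypothetical helper)."""
    return pickle.load(io.BytesIO(data))


# A well-formed pickle payload round-trips as expected.
payload = pickle.dumps(["scala", "java", "hadoop", "spark", "akka"])
print(load_broadcast_bytes(payload))

# An empty stream makes pickle.load raise EOFError, the same error type
# that appears at the bottom of the traceback.
try:
    load_broadcast_bytes(b"")
except EOFError as exc:
    print(f"EOFError: {exc}")
```

The same exception type would appear if the reader were handed bytes it could not interpret as the start of a pickle stream because the real payload was never written where the reader looked.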
## How was this patch tested?
words_new = sc.broadcast(["scala", "java", "hadoop", "spark", "akka"])
textFile = sc.textFile("README.md")
wordCounts = textFile.flatMap(lambda line: line.split()).map(lambda word:
    (word + words_new.value[1], 1)).reduceByKey(lambda a, b: a+b)
count = wordCounts.count()
print(count)
words_new.value
print(words_new.value)
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/redsanket/spark SPARK-26201
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/23166.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #23166
----
commit 67a2ac87fb6e2d3fd4a5f260047a37bd2858228d
Author: schintap <schintap@...>
Date: 2018-11-28T16:20:55Z
[SPARK-26201] Fix python broadcast with encryption
----