GitHub user redsanket opened a pull request:

    https://github.com/apache/spark/pull/23166

    [SPARK-26201] Fix python broadcast with encryption

    ## What changes were proposed in this pull request?
    With RPC and disk encryption enabled, creating a Python broadcast variable and reading its value back on the driver side caused the job to fail with:
    
     
    
    Traceback (most recent call last):
      File "broadcast.py", line 37, in <module>
        words_new.value
      File "/pyspark.zip/pyspark/broadcast.py", line 137, in value
      File "pyspark.zip/pyspark/broadcast.py", line 122, in load_from_path
      File "pyspark.zip/pyspark/broadcast.py", line 128, in load
    EOFError: Ran out of input
    
    To reproduce, use the configs: --conf spark.network.crypto.enabled=true --conf spark.io.encryption.enabled=true
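As a rough illustration of how the two settings above end up on a command line, the sketch below assembles a `spark-submit` invocation from a dictionary. The helper name `build_submit_command` is hypothetical, and the script name `broadcast.py` is taken from the traceback; only the two `--conf` keys come from the report itself.

```python
import shlex

# The two settings quoted in the report; both are enabled in the failing run.
ENCRYPTION_CONFS = {
    "spark.network.crypto.enabled": "true",
    "spark.io.encryption.enabled": "true",
}


def build_submit_command(script, confs):
    """Return a spark-submit argument list with one --conf flag per setting.

    Hypothetical helper for illustration; it only builds the argument list
    and does not launch anything.
    """
    cmd = ["spark-submit"]
    for key, value in confs.items():
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(script)
    return cmd


command = build_submit_command("broadcast.py", ENCRYPTION_CONFS)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run` unchanged, which avoids shell quoting problems if more settings are added later.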
    
     
    
    Code:
    
    words_new = sc.broadcast(["scala", "java", "hadoop", "spark", "akka"])
    words_new.value
    print(words_new.value)
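For context on the error text itself: `EOFError: Ran out of input` is the message Python's `pickle` module raises when it is asked to load from a stream that has no data left. The sketch below reproduces that message with plain `pickle` and no Spark involved. The function name `load_broadcast_bytes` is hypothetical, and the sketch only shows what the message means; it does not by itself identify the root cause inside Spark.

```python
import io
import pickle


def load_broadcast_bytes(data):
    """Unpickle a value from raw bytes, roughly as a file-based loader would.

    Hypothetical stand-in for illustration; not PySpark's implementation.
    """
    return pickle.load(io.BytesIO(data))


# A normally written payload round-trips without trouble.
payload = pickle.dumps(["scala", "java", "hadoop", "spark", "akka"])
restored = load_broadcast_bytes(payload)
print(restored)

# An empty stream makes the unpickler fail immediately with EOFError.
message = None
try:
    load_broadcast_bytes(b"")
except EOFError as exc:
    message = str(exc)
    print("EOFError:", message)
```

So the traceback is consistent with the driver reading a broadcast file that contained no readable pickle data at the point where `load` was called.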
    
    ## How was this patch tested?
    words_new = sc.broadcast(["scala", "java", "hadoop", "spark", "akka"])
    textFile = sc.textFile("README.md")
    wordCounts = (textFile.flatMap(lambda line: line.split())
                  .map(lambda word: (word + words_new.value[1], 1))
                  .reduceByKey(lambda a, b: a + b))
    count = wordCounts.count()
    print(count)
    words_new.value
    print(words_new.value)
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/redsanket/spark SPARK-26201

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23166.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #23166
    
----
commit 67a2ac87fb6e2d3fd4a5f260047a37bd2858228d
Author: schintap <schintap@...>
Date:   2018-11-28T16:20:55Z

    [SPARK-26201] Fix python broadcast with encryption

----

