[
https://issues.apache.org/jira/browse/SPARK-28626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062230#comment-17062230
]
Wing Yew Poon commented on SPARK-28626:
---------------------------------------
For the record, to assist folks who need to backport this:
From branch-2.3, we also need
[https://github.com/apache/spark/commit/323dc3ad02e63a7c99b5bd6da618d6020657ecba]
[PYSPARK] Update py4j to version 0.10.7.
The SPARKR change also depends on a preceding commit:
[https://github.com/apache/spark/commit/dad5c48b2a229bf6f9e6b8548f9335f04a15c818]
[MINOR][PYTHON] Use a helper in `PythonUtils` instead of direct accessing Scala
package
> Spark leaves unencrypted data on local disk, even with encryption turned on
> (CVE-2019-10099)
> --------------------------------------------------------------------------------------------
>
> Key: SPARK-28626
> URL: https://issues.apache.org/jira/browse/SPARK-28626
> Project: Spark
> Issue Type: Bug
> Components: Security
> Affects Versions: 2.3.2
> Reporter: Imran Rashid
> Priority: Major
> Fix For: 2.3.3, 2.4.0
>
>
> Severity: Important
>
> Vendor: The Apache Software Foundation
>
> Versions affected:
> All Spark 1.x, Spark 2.0.x, Spark 2.1.x, and 2.2.x versions
> Spark 2.3.0 to 2.3.2
>
> Description:
> Prior to Spark 2.3.3, in certain situations Spark would write user data to
> local disk unencrypted, even if spark.io.encryption.enabled=true. This
> includes cached blocks that are fetched to disk (controlled by
> spark.maxRemoteBlockSizeFetchToMem); in SparkR, using parallelize; in
> PySpark, using broadcast and parallelize; and use of Python UDFs.
>
>
> Mitigation:
> 1.x, 2.0.x, 2.1.x, 2.2.x, 2.3.x users should upgrade to 2.3.3 or newer,
> including 2.4.x
>
> Credit:
> This issue was reported by Thomas Graves of NVIDIA.
>
> References:
> [https://spark.apache.org/security.html]
>
> The following commits were used to fix this issue in branch-2.3 (there may
> be equivalent commits in master / branch-2.4).
> {noformat}
> commit 575fea120e25249716e3f680396580c5f9e26b5b
> Author: Imran Rashid <[email protected]>
> Date: Wed Aug 22 16:38:28 2018 -0500
> [CORE] Updates to remote cache reads
> Covered by tests in DistributedSuite
>
> commit 6d742d1bd71aa3803dce91a830b37284cb18cf70
> Author: Imran Rashid <[email protected]>
> Date: Thu Sep 6 12:11:47 2018 -0500
> [PYSPARK][SQL] Updates to RowQueue
> Tested with updates to RowQueueSuite
>
> commit 09dd34cb1706f2477a89174d6a1a0f17ed5b0a65
> Author: Imran Rashid <[email protected]>
> Date: Mon Aug 13 21:35:34 2018 -0500
> [PYSPARK] Updates to pyspark broadcast
>
> commit 12717ba0edfa5459c9ac2085f46b1ecc0ee759aa
> Author: hyukjinkwon <[email protected]>
> Date: Mon Sep 24 19:25:02 2018 +0800
> [SPARKR] Match pyspark features in SparkR communication protocol
> {noformat}
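As a quick aid when auditing deployments, the "Versions affected" list in the advisory above can be turned into a small check. This is only a sketch: the function names are hypothetical, it relies solely on the upstream version string, and it cannot tell whether a vendor build reporting an older version already contains backported fixes.

```python
import re

# First fixed release, per the advisory's "Mitigation" section.
FIXED_IN = (2, 3, 3)
# The advisory lists "All Spark 1.x" as the oldest affected line.
OLDEST_LISTED = (1, 0, 0)


def parse_version(version: str) -> tuple:
    """Extract (major, minor, patch) from strings like '2.3.2' or '2.4.0-SNAPSHOT'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized Spark version: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def is_affected(version: str) -> bool:
    """True if the version falls in the range the advisory lists as affected.

    Assumption: an unmodified upstream release. A build with the fixes
    backported may still report an affected version number.
    """
    return OLDEST_LISTED <= parse_version(version) < FIXED_IN


if __name__ == "__main__":
    for v in ("1.6.3", "2.2.0", "2.3.2", "2.3.3", "2.4.0"):
        print(v, "affected" if is_affected(v) else "not listed as affected")
```

In a running PySpark session the string to test would typically come from `spark.version`.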