gaborgsomogyi opened a new pull request #30592:
URL: https://github.com/apache/spark/pull/30592
### What changes were proposed in this pull request?
`spark.buffer.size` not applied in driver from pyspark. In this PR I've
fixed this issue.
### Why are the changes needed?
Apply the mentioned config on driver side.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Existing unit tests + manually.
Added the following code temporarily:
```
def local_connect_and_auth(port, auth_secret):
...
sock.connect(sa)
print("SPARK_BUFFER_SIZE: %d" %
int(os.environ.get("SPARK_BUFFER_SIZE", 65536))) <- This is the addition
sockfile = sock.makefile("rwb",
int(os.environ.get("SPARK_BUFFER_SIZE", 65536)))
...
```
Test:
```
#Compile Spark
echo "spark.buffer.size 10000" >> conf/spark-defaults.conf
$ ./bin/pyspark
Python 3.8.5 (default, Jul 21 2020, 10:48:26)
[Clang 11.0.3 (clang-1103.0.32.62)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
20/12/03 13:38:13 WARN NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use
setLogLevel(newLevel).
20/12/03 13:38:14 WARN SparkEnv: I/O encryption enabled without RPC
encryption: keys will be visible on the wire.
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/__ / .__/\_,_/_/ /_/\_\ version 3.1.0-SNAPSHOT
/_/
Using Python version 3.8.5 (default, Jul 21 2020 10:48:26)
Spark context Web UI available at http://192.168.0.189:4040
Spark context available as 'sc' (master = local[*], app id =
local-1606999094506).
SparkSession available as 'spark'.
>>> sc.setLogLevel("TRACE")
>>> sc.parallelize([0, 2, 3, 4, 6], 5).glom().collect()
...
SPARK_BUFFER_SIZE: 10000
...
[[0], [2], [3], [4], [6]]
>>>
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]