attilapiros opened a new pull request #29090:
URL: https://github.com/apache/spark/pull/29090
### What changes were proposed in this pull request?
Fix the inconsistency between Spark's memory configs and the JVM options by appending the default unit "m" (MiB) to the value used for the -Xmx/-Xms JVM options when no suffix is provided.
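The core idea, shown here only as a hedged sketch (the helper name and shape are illustrative, not the exact patch), is to append "m" to a memory value that is a bare number before it ends up in -Xmx/-Xms:
```java
// Illustrative-only helper (hypothetical name, not the actual launcher code):
// if the configured memory value has no unit suffix, treat it as MiB, matching
// Spark's own default, before it is passed to the JVM as -Xmx/-Xms.
static String withDefaultMiBSuffix(String memory) {
  // "500"  -> "500m" (suffix added)
  // "500m" -> "500m" (existing suffix kept)
  // "2g"   -> "2g"
  if (memory != null && memory.matches("\\d+")) {
    return memory + "m";
  }
  return memory;
}
```
With something along these lines, `--driver-memory 500` results in `-Xmx500m` instead of `-Xmx500`.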
### Why are the changes needed?
Spark's maximum memory can be configured in several ways:
- via Spark config
- command line argument
- environment variables
The memory can be configured separately for the executors and for the driver. All of these settings follow the format of the JVM memory options in that they accept the same size-unit suffixes ("k", "m", "g" or "t"), but there is an inconsistency regarding the default unit. When no suffix is given, the amount is passed as-is to the JVM (to the -Xmx and -Xms options), where these options use bytes as the default unit; see the example
[here](https://docs.oracle.com/javase/8/docs/technotes/tools/windows/java.html):
> The following examples show how to set the maximum allowed size of allocated memory to 80 MB using various units:
>
> - `-Xmx83886080`
> - `-Xmx81920k`
> - `-Xmx80m`
Spark's memory configs, on the other hand, use MiB as the default unit.
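To make the mismatch concrete, here is a purely illustrative calculation (plain Java arithmetic, not Spark code) showing that the same bare value `500` means 500 bytes to the JVM but 500 MiB to Spark:
```java
// Illustration only: how far apart the two default units are.
public class DefaultUnitMismatch {
  public static void main(String[] args) {
    long jvmBytes = 500L;                  // JVM side: -Xmx500 means 500 bytes
    long sparkBytes = 500L * 1024 * 1024;  // Spark side: 500 means 500 MiB
    System.out.println(jvmBytes);              // 500
    System.out.println(sparkBytes);            // 524288000
    System.out.println(sparkBytes / jvmBytes); // 1048576 (= 1024 * 1024)
  }
}
```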
### Does this PR introduce _any_ user-facing change?
Yes. Previously, when no suffix was given, -Xmx and -Xms ended up being set to the bare number interpreted as bytes instead of MiB, i.e. a factor of 1024 * 1024 smaller than the desired amount.
### How was this patch tested?
#### With unit test
See `SparkSubmitCommandBuilderSuite`.
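For illustration, a hedged sketch of the kind of assertion such a test can make, written against the hypothetical helper from the sketch above (this is not the actual `SparkSubmitCommandBuilderSuite` code):
```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class DefaultMemoryUnitSuite {

  // Hypothetical helper repeated from the sketch above.
  private static String withDefaultMiBSuffix(String memory) {
    return (memory != null && memory.matches("\\d+")) ? memory + "m" : memory;
  }

  @Test
  public void bareNumbersAreTreatedAsMiB() {
    assertEquals("500m", withDefaultMiBSuffix("500"));  // suffix added
    assertEquals("500m", withDefaultMiBSuffix("500m")); // existing suffix kept
    assertEquals("2g", withDefaultMiBSuffix("2g"));
  }
}
```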
#### With manual testing
By executing a Spark example.
Without my change and without an explicit unit suffix:
```
$ ./bin/spark-submit --driver-memory 500 --class org.apache.spark.examples.SparkPi \
    examples/target/original-spark-examples_2.12-3.1.0-SNAPSHOT.jar 1000
Error occurred during initialization of VM
Too small initial heap
```
The error above shows that it is not Spark's own memory check that is triggered, unlike in the counter-example below (still without my change, but with a suffix given and a deliberately too low memory value):
```
$ ./bin/spark-submit --driver-memory 300m --class org.apache.spark.examples.SparkPi \
    examples/target/original-spark-examples_2.12-3.1.0-SNAPSHOT.jar 1000
...
20/07/13 20:51:15 ERROR SparkContext: Error initializing SparkContext.
java.lang.IllegalArgumentException: System memory 301465600 must be at least 471859200. Please increase heap size using the --driver-memory option or spark.driver.memory in Spark configuration.
	at org.apache.spark.memory.UnifiedMemoryManager$.getMaxMemory(UnifiedMemoryManager.scala:221)
```
And after this PR, running the first case (without an explicit unit suffix) succeeds:
```
$ ./bin/spark-submit --driver-memory 500 --class org.apache.spark.examples.SparkPi \
    examples/target/original-spark-examples_2.12-3.1.0-SNAPSHOT.jar 1000
...
20/07/13 20:57:12 INFO TaskSetManager: Finished task 995.0 in stage 0.0 (TID 995) in 419 ms on 192.168.1.210 (executor driver) (999/1000)
20/07/13 20:57:12 INFO Executor: Finished task 999.0 in stage 0.0 (TID 999). 914 bytes result sent to driver
20/07/13 20:57:12 INFO TaskSetManager: Finished task 999.0 in stage 0.0 (TID 999) in 242 ms on 192.168.1.210 (executor driver) (1000/1000)
20/07/13 20:57:12 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
20/07/13 20:57:12 INFO DAGScheduler: ResultStage 0 (reduce at SparkPi.scala:38) finished in 45.892 s
20/07/13 20:57:12 INFO DAGScheduler: Job 0 is finished. Cancelling potential speculative or zombie tasks for this job
20/07/13 20:57:12 INFO TaskSchedulerImpl: Killing all running tasks in stage 0: Stage finished
20/07/13 20:57:12 INFO DAGScheduler: Job 0 finished: reduce at SparkPi.scala:38, took 45.967945 s
Pi is roughly 3.1417498714174985
20/07/13 20:57:12 INFO SparkUI: Stopped Spark web UI at http://192.168.1.210:4040
20/07/13 20:57:12 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
20/07/13 20:57:12 INFO MemoryStore: MemoryStore cleared
20/07/13 20:57:12 INFO BlockManager: BlockManager stopped
20/07/13 20:57:12 INFO BlockManagerMaster: BlockManagerMaster stopped
20/07/13 20:57:12 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
20/07/13 20:57:12 INFO SparkContext: Successfully stopped SparkContext
20/07/13 20:57:12 INFO ShutdownHookManager: Shutdown hook called
20/07/13 20:57:12 INFO ShutdownHookManager: Deleting directory /private/var/folders/t_/fr_vqcyx23vftk81ftz1k5hw0000gn/T/spark-02c73ca4-ff94-422f-a8dc-39d88365d391
20/07/13 20:57:12 INFO ShutdownHookManager: Deleting directory /private/var/folders/t_/fr_vqcyx23vftk81ftz1k5hw0000gn/T/spark-fa723796-b3a8-4d5b-94c7-65752cd653bd
```
And even `jps` shows the correct value:
```
$ jps -lvV
25269 org.apache.spark.deploy.SparkSubmit -Xmx500m
...
```