There have been multiple reports of memory problems, performance problems (that
may be memory related), and OOM errors. I have been telling everyone to use:
    pio train -- --driver-memory 14g --executor-memory 14g --master spark://master-address:7077
This increases the memory available to Spark, and it is correct, *but* if you
have also set executor memory in the sparkConf section of engine.json, that
setting takes precedence over the command-line parameters. The UR template's
default engine.json sets executor memory to 3-4g, depending on which one you
have. Remove that line if you want to set the value on the CLI.
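For anyone who would rather script the change than edit the file by hand, here is a minimal sketch of dropping the executor-memory entry from sparkConf. It assumes the setting is stored under the flat key "spark.executor.memory" (the standard Spark property name) and that sparkConf is a plain JSON object; the sample config and the helper name drop_executor_memory are made up for illustration, so check them against your own engine.json.

```python
import json


def drop_executor_memory(engine: dict) -> dict:
    """Return a copy of an engine config with spark.executor.memory removed.

    Assumes sparkConf uses flat dotted keys; the input dict is not modified.
    """
    if "sparkConf" not in engine:
        return dict(engine)
    spark_conf = dict(engine["sparkConf"])
    # pop with a default so a config without the key is left unchanged
    spark_conf.pop("spark.executor.memory", None)
    return {**engine, "sparkConf": spark_conf}


# Hypothetical, trimmed-down engine.json content for illustration only.
example = json.loads("""
{
  "id": "default",
  "sparkConf": {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.executor.memory": "4g"
  }
}
""")

cleaned = drop_executor_memory(example)
print(json.dumps(cleaned, indent=2))
```

With that entry gone, the --executor-memory value passed after "pio train --" should be the one that applies.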
We are inclined to remove sparkConf entirely, since the CLI supports all of
the options and sparkConf in engine.json doesn’t work with Elasticsearch 2.x.
If anyone has an opinion, please let us know.