sanha opened a new pull request #18: [NEMO-45] Distributed Nemo-Spark
URL: https://github.com/apache/incubator-nemo/pull/18

JIRA: [NEMO-45: Distributed Nemo-Spark](https://issues.apache.org/jira/projects/NEMO/issues/NEMO-45)

**Major changes:**
- Enabled distributed Spark source reads, including in YARN mode.
- Store the commands that are used to create a `SparkSession`, and recreate the session from those commands in `Executor`.
- Store the `SparkConf` of the `SparkContext`, and recreate the context from that configuration in `Executor`.

**Minor changes to note:**
- Added a `JavaMapReduce` example for Spark, which is equivalent to the `MapReduce` example for Beam.

**Tests for the changes:**
- `SparkDatasetBoundedSourceVertex` using `SparkSession`
  - The existing `testSparkWordCount` in `SparkITCase` covers this.
- `SparkTextFileBoundedSourceVertex` using `SparkContext`
  - `testSparkMapReduce` is added to `SparkITCase` to cover this.

**Other comments:**
- None.

resolves [NEMO-##](https://issues.apache.org/jira/projects/NEMO/issues/NEMO-##)
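
The "store the commands, then recreate the session in `Executor`" idea from the major changes can be illustrated with a minimal sketch. This is not the code in this pull request: `SessionRecipe`, `config`, `replay`, and `roundTrip` are hypothetical names, the example deliberately uses no Spark classes, and Java serialization is assumed here only as a stand-in for however the recorded commands actually reach the executor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in (not a Nemo or Spark class): records the options given
// to a session builder so an equivalent session can be rebuilt in another JVM.
final class SessionRecipe implements Serializable {
  private static final long serialVersionUID = 1L;

  // Insertion-ordered, so options are replayed in the order they were recorded.
  private final LinkedHashMap<String, String> options = new LinkedHashMap<>();

  // Record one builder option instead of applying it immediately.
  SessionRecipe config(final String key, final String value) {
    options.put(key, value);
    return this;
  }

  // Return a copy of the recorded options; an executor would feed these,
  // in order, to a fresh builder to recreate the session.
  Map<String, String> replay() {
    return new LinkedHashMap<>(options);
  }

  // Simulate shipping the recipe from the driver to an executor
  // by serializing it to bytes and reading it back.
  static SessionRecipe roundTrip(final SessionRecipe recipe) {
    try {
      final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(recipe);
      }
      try (ObjectInputStream in =
               new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (SessionRecipe) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("Could not ship the session recipe", e);
    }
  }

  public static void main(final String[] args) {
    // Driver side: record the options rather than holding a live session.
    final SessionRecipe driverSide = new SessionRecipe()
        .config("spark.app.name", "JavaWordCount")
        .config("spark.master", "yarn");

    // Executor side: receive the recipe and replay the recorded options.
    final SessionRecipe executorSide = roundTrip(driverSide);
    System.out.println(executorSide.replay());
  }
}
```

The point of the sketch is that a live session or context object is generally not something to ship between JVMs, whereas a plain record of how it was built is small and serializable, so each executor can rebuild an equivalent object locally.
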
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
