sanha opened a new pull request #18: [NEMO-45] Distributed Nemo-Spark
URL: https://github.com/apache/incubator-nemo/pull/18
 
 
   JIRA: [NEMO-45: Distributed Nemo-Spark](https://issues.apache.org/jira/projects/NEMO/issues/NEMO-45)
   
   **Major changes:**
   - Enabled distributed reads from Spark sources, including in YARN mode
     - Store the commands that are used to create a `SparkSession` and recreate the session from those commands in `Executor`.
     - Store the `SparkConf` of the `SparkContext` and recreate the context from that configuration in `Executor`.
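
The record-and-replay idea behind the two sub-items above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `FakeSession`, `RecordingBuilder`, `Command`, and `replay` are hypothetical stand-ins so that the example runs without Spark on the classpath. It assumes that the session object itself cannot be shipped to an `Executor`, while the key/value calls used to build it can.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: every type here is a hypothetical stand-in, not Nemo or Spark code. */
final class SessionReplaySketch {

  /** Stand-in for a session object that is not serializable itself. */
  static final class FakeSession {
    final Map<String, String> options;

    FakeSession(final Map<String, String> options) {
      this.options = options;
    }
  }

  /** One recorded builder call, e.g. config("spark.master", "yarn"). */
  static final class Command implements Serializable {
    private static final long serialVersionUID = 1L;
    final String key;
    final String value;

    Command(final String key, final String value) {
      this.key = key;
      this.value = value;
    }
  }

  /** Builder that remembers each call so the calls can be replayed elsewhere. */
  static final class RecordingBuilder {
    private final List<Command> commands = new ArrayList<>();

    RecordingBuilder config(final String key, final String value) {
      commands.add(new Command(key, value));
      return this;
    }

    /** Returns a copy of the recorded calls, in the order they were made. */
    List<Command> commands() {
      return new ArrayList<>(commands);
    }

    /** Driver side: build the session from the recorded calls. */
    FakeSession build() {
      return replay(commands);
    }
  }

  /** Executor side: rebuild an equivalent session from the recorded calls. */
  static FakeSession replay(final List<Command> commands) {
    final Map<String, String> options = new LinkedHashMap<>();
    for (final Command command : commands) {
      options.put(command.key, command.value);
    }
    return new FakeSession(options);
  }

  /** Serializes the recorded calls, as if sending them to an executor. */
  static byte[] serialize(final List<Command> commands) throws IOException {
    final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(new ArrayList<>(commands));
    }
    return bytes.toByteArray();
  }

  /** Deserializes the recorded calls on the receiving side. */
  @SuppressWarnings("unchecked")
  static List<Command> deserialize(final byte[] data)
      throws IOException, ClassNotFoundException {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
      return (List<Command>) in.readObject();
    }
  }

  public static void main(final String[] args) throws Exception {
    final RecordingBuilder builder = new RecordingBuilder()
        .config("spark.master", "yarn")
        .config("spark.app.name", "JavaMapReduce");
    final FakeSession driverSession = builder.build();

    // Only the recorded commands cross the wire, never the session itself.
    final byte[] shipped = serialize(builder.commands());
    final FakeSession executorSession = replay(deserialize(shipped));

    System.out.println(executorSession.options.equals(driverSession.options));
  }
}
```

With real Spark classes, the analogous step for the second sub-item would presumably copy the entries returned by `SparkConf.getAll()` into a fresh `SparkConf` via `set(key, value)` before creating the context; how the PR actually does this is not shown here.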
   
   **Minor changes to note:**
   - Added a `JavaMapReduce` example for Spark, equivalent to the `MapReduce` example for Beam.
   
   **Tests for the changes:**
   - `SparkDatasetBoundedSourceVertex` using `SparkSession`
     - The existing `testSparkWordCount` in `SparkITCase` covers this.
   - `SparkTextFileBoundedSourceVertex` using `SparkContext`
     - `testSparkMapReduce` is added in `SparkITCase` to cover this.
   
   **Other comments:**
   - None.
   
   resolves 
[NEMO-##](https://issues.apache.org/jira/projects/NEMO/issues/NEMO-##)
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
