-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37452/#review95987
-----------------------------------------------------------
core/src/main/java/org/apache/oozie/service/SparkConfigurationService.java (line 85)
<https://reviews.apache.org/r/37452/#comment151180>

    Should we be doing this? http://spark.apache.org/docs/latest/running-on-yarn.html refers to an HDFS location or a local installation on the task nodes. Since that applies to other clients, should we retain it in Oozie as well, or are we saying that Oozie is only going to use the Spark libraries via the sharelib? At the very least, this should be configurable so that users who have local installations on the task nodes are still supported.


sharelib/spark/pom.xml (line 121)
<https://reviews.apache.org/r/37452/#comment151181>

    Nitpick: can you put spark-core at the beginning, followed by the other Spark feature dependencies, since it is the main one?


sharelib/spark/src/main/java/org.apache.oozie.action.hadoop/SparkMain.java (lines 60 - 61)
<https://reviews.apache.org/r/37452/#comment152402>

    Robert, in our chat you mentioned the ability to specify this alternatively as "-master yarn -mode client" and "-master yarn -mode cluster". We will have to handle that as well.


sharelib/spark/src/main/java/org.apache.oozie.action.hadoop/SparkMain.java (line 78)
<https://reviews.apache.org/r/37452/#comment152398>

    If local mode will ever be used in Oozie, then all the new code can go into an "if (yarnClusterMode || yarnClientMode)" block so that it only runs in the non-local modes.


sharelib/spark/src/main/java/org.apache.oozie.action.hadoop/SparkMain.java (line 103)
<https://reviews.apache.org/r/37452/#comment152407>

    DELIM is a single space, which means we expect the user to put exactly one space between arguments. We should split on \s+ instead, or we can try something like http://stackoverflow.com/questions/6049470/can-apache-commons-cli-options-parser-ignore-unknown-command-line-options/8613949#8613949 for cleaner argument parsing.
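    To illustrate the DELIM point above, here is a minimal sketch of splitting the spark-opts string on runs of whitespace instead of a single space. The splitOpts helper is hypothetical, for illustration only, and is not the actual SparkMain code:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SparkOptsSplit {

    // Split a spark-opts string on any run of whitespace (spaces, tabs),
    // so users are not required to use exactly one space between arguments.
    static List<String> splitOpts(String sparkOpts) {
        if (sparkOpts == null || sparkOpts.trim().isEmpty()) {
            return Collections.emptyList();
        }
        // "\\s+" treats one or more whitespace characters as a single delimiter
        return Arrays.asList(sparkOpts.trim().split("\\s+"));
    }

    public static void main(String[] args) {
        // Extra spaces and a tab between options are tolerated
        System.out.println(splitOpts("--executor-memory  2G \t--num-executors 3"));
        // prints [--executor-memory, 2G, --num-executors, 3]
    }
}
```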
sharelib/spark/src/main/java/org.apache.oozie.action.hadoop/SparkMain.java (line 113)
<https://reviews.apache.org/r/37452/#comment152417>

    Can you place local files in spark.yarn.dist.files as well, so that Spark takes care of shipping them like it does for --jars? Asking because you are adding files from the Java classpath to sparkJars. At least in Hadoop, mapreduce.cache.files entries have to be HDFS paths.


sharelib/spark/src/main/java/org.apache.oozie.action.hadoop/SparkMain.java (line 123)
<https://reviews.apache.org/r/37452/#comment152408>

    Can you just add a comment here saying this is redundant in yarn-client mode, since the driver is the launcher JVM and is already launched? Robert did enlighten me that "in local mode, everything runs in the launcher job. In yarn-client mode, the driver runs in the launcher and the executor in YARN. In yarn-cluster mode, the driver and executor run in YARN." Can we add that to the code comments and also to the documentation, as it will be confusing for users as well?


- Rohini Palaniswamy


On Aug. 13, 2015, 11:42 p.m., Robert Kanter wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/37452/
> -----------------------------------------------------------
> 
> (Updated Aug. 13, 2015, 11:42 p.m.)
> 
> 
> Review request for oozie.
> 
> 
> Bugs: OOZIE-2277
>     https://issues.apache.org/jira/browse/OOZIE-2277
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> https://issues.apache.org/jira/browse/OOZIE-2277
> 
> 
> Diffs
> -----
> 
>   core/src/main/java/org/apache/oozie/service/SparkConfigurationService.java 1b7cf4a 
>   core/src/test/java/org/apache/oozie/service/TestSparkConfigurationService.java b2c499d 
>   sharelib/spark/pom.xml 6f7e74a 
>   sharelib/spark/src/main/java/org.apache.oozie.action.hadoop/SparkMain.java b18a0b9 
>   sharelib/spark/src/test/java/org/apache/oozie/action/hadoop/TestSparkActionExecutor.java f271abc 
> 
> Diff: https://reviews.apache.org/r/37452/diff/
> 
> 
> Testing
> -------
> 
> - Ran unit tests with Hadoop 1 and Hadoop 2
> - Ran in a Hadoop 2 cluster with local, yarn-client, and yarn-cluster modes
> 
> 
> Thanks,
> 
> Robert Kanter
> 
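For the earlier comment about also accepting the "-master yarn -mode client" / "-master yarn -mode cluster" spellings, a minimal sketch of the kind of normalization check that would be needed. Method and parameter names here are hypothetical, for illustration only, and not the actual SparkMain code:

```java
// Sketch: treat "--master yarn-client" and "--master yarn --deploy-mode client"
// (and likewise for cluster) as equivalent ways of selecting a YARN mode.
public class SparkMasterMode {

    static boolean isYarnClusterMode(String master, String deployMode) {
        return "yarn-cluster".equals(master)
                || ("yarn".equals(master) && "cluster".equals(deployMode));
    }

    static boolean isYarnClientMode(String master, String deployMode) {
        return "yarn-client".equals(master)
                || ("yarn".equals(master) && "client".equals(deployMode));
    }

    public static void main(String[] args) {
        System.out.println(isYarnClusterMode("yarn", "cluster"));   // prints true
        System.out.println(isYarnClusterMode("yarn-cluster", null)); // prints true
        System.out.println(isYarnClientMode("yarn", "client"));     // prints true
        System.out.println(isYarnClientMode("local", null));        // prints false
    }
}
```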
