[ https://issues.apache.org/jira/browse/CRUNCH-507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14483357#comment-14483357 ]
Josh Wills commented on CRUNCH-507:
-----------------------------------

+1, looks straightforward.

> Potential NPE in SparkPipeline constructor and additional constructor
> ---------------------------------------------------------------------
>
>                 Key: CRUNCH-507
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-507
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.11.0
>            Reporter: Micah Whitacre
>            Assignee: Micah Whitacre
>             Fix For: 0.12.0
>
>         Attachments: CRUNCH-507.patch
>
>
> I was looking at the SparkPipeline constructor API, trying to maximize the
> number of settings inherited when a Spark job is submitted with
> "spark-submit". That should populate the SparkContext (and JavaSparkContext)
> with values like the Spark master. If you want to:
> * specify a driver class
> * supply a Hadoop Configuration (vs. picking up the defaults)
> * inherit a pre-populated SparkContext
> then you'd have to use a constructor like:
> {code}
> JavaSparkContext sc = new JavaSparkContext(new SparkConf());
> new SparkPipeline(sc.master(), sc.appName(), Driver.class, conf);
> {code}
> Just for convenience we could add a constructor like the following:
> {code}
> public SparkPipeline(JavaSparkContext sc, String appName, Class driver,
>     Configuration conf)
> {code}
> We could remove the appName parameter, but since the Spark context is not
> guaranteed to be non-null we might get an NPE. This also means that on this
> line [1] we could throw an NPE when trying to pull hadoopConfiguration()
> off that object.
> [1] -
> https://github.com/apache/crunch/blob/3ab0b078c47f23b3ba893fdfb05fd723f663d02b/crunch-spark/src/main/java/org/apache/crunch/impl/spark/SparkPipeline.java#L73

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
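The NPE risk described above comes from deferring the null check until `hadoopConfiguration()` is first dereferenced. A minimal, stdlib-only sketch of the fail-fast alternative is below; `FakeSparkContext` and `FakePipeline` are hypothetical stand-ins for `JavaSparkContext` and `SparkPipeline`, not the actual Crunch patch:

```java
import java.util.Objects;

// Hypothetical stand-in for JavaSparkContext, just for illustration.
class FakeSparkContext {
    String master() { return "local[*]"; }
    String hadoopConfiguration() { return "defaults"; }
}

// Hypothetical stand-in for the proposed SparkPipeline constructor:
// validate the context up front instead of throwing a bare NPE later
// when hadoopConfiguration() is first pulled off the (null) context.
class FakePipeline {
    private final FakeSparkContext sc;

    FakePipeline(FakeSparkContext sc) {
        this.sc = Objects.requireNonNull(sc,
                "JavaSparkContext must not be null");
    }

    String conf() {
        return sc.hadoopConfiguration();
    }
}
```

With this shape, `new FakePipeline(null)` fails immediately in the constructor with a descriptive message, rather than surfacing as an unexplained NPE deep inside pipeline setup.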