[ https://issues.apache.org/jira/browse/SPARK-6816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14488237#comment-14488237 ]
Shivaram Venkataraman commented on SPARK-6816:
----------------------------------------------

Comments from SparkR JIRA

Shivaram Venkataraman added a comment - 14/Feb/15 10:32 AM

I looked at this recently and I think the existing arguments to `sparkR.init` pretty much cover all the options that are exposed in SparkConf. We could split things out of the function arguments into a separate SparkConf object (something like PySpark's https://github.com/apache/spark/blob/master/python/pyspark/conf.py), but the setter methods don't translate very well to the style we use in SparkR. For example, it would be something like setAppName(setMaster(conf, "local"), "SparkR") instead of conf.setMaster().setAppName() (see the first sketch after this thread).

The other thing brought up by this JIRA is that we should parse arguments passed to spark-submit or set in spark-defaults.conf. I think this should happen automatically with SPARKR-178. Sun Rui, Zongheng Yang - any thoughts on this?

Zongheng Yang added a comment - 15/Feb/15 12:07 PM

I'm +1 on not using the builder pattern in R. What about using a named list or an environment to simulate a SparkConf? For example, users could write something like:

{code}
> conf <- list(spark.master = "local[2]", spark.executor.memory = "12g")
> conf
$spark.master
[1] "local[2]"

$spark.executor.memory
[1] "12g"
{code}

and pass the named list to `sparkR.init()`.

Shivaram Venkataraman added a comment - 15/Feb/15 5:50 PM

I think the named list might be okay (one wrinkle is that we will have nested named lists for things like executorEnv). However, I am not sure named lists are better than just passing named arguments to `sparkR.init`. The better way to ask my question is: what functionality do we want to provide to users? Right now users can set pretty much anything they want in the SparkConf using sparkR.init. One piece of functionality that is missing is printing the conf and inspecting which config variables are set. We could, say, add a getConf(sc) that returns a named list to provide this feature (see the second sketch after this thread). Is there any other functionality we need?

Zongheng Yang added a comment - 21/Feb/15 3:22 PM

IMO a named list provides more flexibility: it's ordinary data that users can operate on and transform. Using only parameter passing in the constructor locks users into operating on code instead of data. It would also be easier to implement getConf() by just returning the saved named list. Some relevant discussion: https://aphyr.com/posts/321-builders-vs-option-maps

Shivaram Venkataraman added a comment - 22/Feb/15 4:33 PM

Hmm, okay - named lists are not quite the same as option maps, though. To move forward, it will be good to see what the new API functions on the R side should look like. Let's keep this discussion open, but I'm going to change the priority / description (we are already able to read spark-defaults.conf now that SPARKR-178 has been merged).
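A minimal R sketch of the builder-versus-named-list contrast raised above. The setMaster/setAppName functions here are illustrative stand-ins, not part of the actual SparkR API:

{code}
# Illustrative only: these setters are NOT SparkR functions; they just
# show why builder-style chaining reads inside-out in R.
setMaster  <- function(conf, value) { conf[["spark.master"]] <- value; conf }
setAppName <- function(conf, value) { conf[["spark.app.name"]] <- value; conf }

# Builder style: the calls nest in the reverse of reading order.
conf <- setAppName(setMaster(list(), "local"), "SparkR")

# Named-list style: the same configuration as plain data, which users
# can inspect and transform with ordinary list operations.
conf <- list(spark.master = "local", spark.app.name = "SparkR")
conf <- modifyList(conf, list(spark.executor.memory = "12g"))
str(conf)
{code}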
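And a sketch of how a named-list conf, including the nested executorEnv case mentioned above, might be passed to sparkR.init and read back. The conf argument to sparkR.init() and getConf() are assumptions here - neither existed in SparkR at the time of this thread:

{code}
# Hypothetical API sketch: the `conf` argument and getConf() are
# assumptions, not existing SparkR functions.
conf <- list(
  spark.master          = "local[2]",
  spark.executor.memory = "12g",
  # Nested settings such as executor environment variables become a
  # nested named list.
  spark.executorEnv     = list(LD_LIBRARY_PATH = "/usr/local/lib")
)

# sc <- sparkR.init(appName = "SparkR", conf = conf)

# If sparkR.init() saved the list on the context, getConf() could simply
# return it, covering the "inspect what is set" use case.
getConf <- function(sc) sc$conf
{code}

Because the conf is plain data, merging user settings over values read from spark-defaults.conf would be a single modifyList() call over two named lists.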
> Add SparkConf API to configure SparkR
> -------------------------------------
>
>                 Key: SPARK-6816
>                 URL: https://issues.apache.org/jira/browse/SPARK-6816
>             Project: Spark
>          Issue Type: New Feature
>          Components: SparkR
>            Reporter: Shivaram Venkataraman
>            Priority: Minor
>
> Right now the only way to configure SparkR is to pass in arguments to sparkR.init. The goal is to add an API similar to SparkConf on Scala/Python to make configuration easier.