[ https://issues.apache.org/jira/browse/SPARK-6816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14488237#comment-14488237 ]

Shivaram Venkataraman commented on SPARK-6816:
----------------------------------------------

Comments from SparkR JIRA

Shivaram Venkataraman added a comment - 14/Feb/15 10:32 AM
I looked at this recently and I think the existing arguments to `sparkR.init` 
pretty much cover all the options that are exposed in SparkConf.
We could split things out of the function arguments into a separate SparkConf 
object (something like PySpark's 
https://github.com/apache/spark/blob/master/python/pyspark/conf.py), but the 
setter methods don't translate very well to the style we use in SparkR. For 
example, it would be something like setAppName(setMaster(conf, "local"), 
"SparkR") instead of conf.setMaster("local").setAppName("SparkR").
The other thing brought up by this JIRA is that we should parse arguments 
passed to spark-submit or set in spark-defaults.conf. I think this should 
happen automatically with SPARKR-178.
Sun Rui, Zongheng Yang, any thoughts on this?
  
concretevitamin Zongheng Yang added a comment - 15/Feb/15 12:07 PM
I'm +1 on not using the builder pattern in R. What about using a named list or 
an environment to simulate a SparkConf? For example, users can write something 
like:
{code}
> conf <- list(spark.master = "local[2]", spark.executor.memory = "12g")
> conf
$spark.master
[1] "local[2]"

$spark.executor.memory
[1] "12g"
{code}
and pass the named list to `sparkR.init()`.
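As a rough illustration of the call site (note: the conf argument below is 
hypothetical; sparkR.init does not currently accept one):
{code}
# Hypothetical usage: hand the named list straight to sparkR.init.
conf <- list(spark.master = "local[2]", spark.executor.memory = "12g")
sc <- sparkR.init(appName = "SparkR", conf = conf)  # conf arg is an assumption
{code}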
  
shivaram Shivaram Venkataraman added a comment - 15/Feb/15 5:50 PM
I think the named list might be okay (one thing is that we will have nested 
named lists for things like executorEnv). However, I am not sure that named 
lists are better than just passing named arguments to `sparkR.init`. I guess 
the better way to ask my question is: what functionality do we want to provide 
to users?
Right now users can pretty much set anything they want in the SparkConf using 
sparkR.init.
One piece of functionality that is missing is printing the conf and, say, 
inspecting which config variables are set. We could add a getConf(sc) that 
returns a named list to provide this.
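A minimal sketch of what that accessor could look like, assuming the context 
object keeps the named list it was created with (getConf does not exist yet; 
everything here is illustrative):
{code}
# Hypothetical accessor: return the saved configuration as a named list.
# Assumes sc carries the conf it was initialized with.
getConf <- function(sc) {
  sc$conf
}

# Usage sketch: inspect one property.
# getConf(sc)$spark.executor.memory
{code}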
Is there any other functionality we need?
  
concretevitamin Zongheng Yang added a comment - 21/Feb/15 3:22 PM
IMO using a named list provides more flexibility: it's ordinary data that users 
can operate on and transform. Using only parameter passing in the constructor 
locks users into operating on code instead of data. It would also be easier to 
just return the saved named list if we're going to implement getConf().
Some relevant discussions: https://aphyr.com/posts/321-builders-vs-option-maps
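For instance, a conf kept as a named list can be merged or overridden with 
ordinary list operations before it is ever passed to the context (a sketch 
using base R's modifyList; no new API is assumed):
{code}
# Plain data: merge defaults with user overrides before initialization.
defaults  <- list(spark.master = "local[2]", spark.executor.memory = "1g")
overrides <- list(spark.executor.memory = "12g")
conf <- modifyList(defaults, overrides)  # overrides win on key collisions
str(conf)
{code}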
  
shivaram Shivaram Venkataraman added a comment - 22/Feb/15 4:33 PM
Hmm, okay. Named lists are not quite the same as option maps, though. To move 
forward, it'll be good to see what the new API functions we want on the R side 
should look like.
Let's keep this discussion open, but I'm going to change the priority / 
description (we are already able to read in spark-defaults.conf now that 
SPARKR-178 has been merged).

> Add SparkConf API to configure SparkR
> -------------------------------------
>
>                 Key: SPARK-6816
>                 URL: https://issues.apache.org/jira/browse/SPARK-6816
>             Project: Spark
>          Issue Type: New Feature
>          Components: SparkR
>            Reporter: Shivaram Venkataraman
>            Priority: Minor
>
> Right now the only way to configure SparkR is to pass in arguments to 
> sparkR.init. The goal is to add an API similar to SparkConf in Scala/Python 
> to make configuration easier.



