[ https://issues.apache.org/jira/browse/SPARK-10903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14964326#comment-14964326 ]
Felix Cheung edited comment on SPARK-10903 at 10/20/15 12:46 AM:
-----------------------------------------------------------------

Looked into this and tested a few approaches. Tried to minimize changes by sticking to S3 method dispatch; however, it is not working:

{code}
createDataFrame <- function(sqlContext, data, schema = NULL, samplingRatio = 1.0)
  UseMethod("createDataFrame")

createDataFrame.jobj <- function(sqlContext, data, schema = NULL, samplingRatio = 1.0) {
  createDataFrame(data, schema, samplingRatio)
}

createDataFrame <- function(data, schema = NULL, samplingRatio = 1.0) {
  sqlContext <- getSqlContext()
  ...

# works
> a <- createDataFrame(iris)

# does not work
> b <- createDataFrame(sqlContext, iris)
Error in createDataFrame(sqlContext, iris) : unexpected type: jobj
{code}

Any ideas? IMO we would otherwise have two options:
1. Promote methods to S4, though some functions support RDD, which we might not want to expose.
2. Make breaking changes to the method arguments, i.e. have only one signature `createDataFrame <- function(data, schema = NULL, samplingRatio = 1.0)`

was (Author: felixcheung):

Looked into this and tested a few approaches. Tried to minimize changes by sticking to S3 method dispatch; however, it is not working:

{code}
createDataFrame <- function(sqlContext, data, schema = NULL, samplingRatio = 1.0)
  UseMethod("createDataFrame")

createDataFrame.jobj <- function(sqlContext, data, schema = NULL, samplingRatio = 1.0) {
  createDataFrame(data, schema, samplingRatio)
}

# TODO(davies): support sampling and infer type from NA
createDataFrame <- function(data, schema = NULL, samplingRatio = 1.0) {
  sqlContext <- getSqlContext()
  ...

# works
> a <- createDataFrame(iris)

# does not work
> b <- createDataFrame(sqlContext, iris)
Error in createDataFrame(sqlContext, iris) : unexpected type: jobj
{code}

Any ideas? IMO we would otherwise have two options:
1. Promote methods to S4, though some functions support RDD, which we might not want to expose.
2. Make breaking changes to the method arguments, i.e. have only one signature `createDataFrame <- function(data, schema = NULL, samplingRatio = 1.0)`

> Make sqlContext global
> -----------------------
>
>                 Key: SPARK-10903
>                 URL: https://issues.apache.org/jira/browse/SPARK-10903
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SparkR
>            Reporter: Narine Kokhlikyan
>            Priority: Minor
>
> Make sqlContext global so that we don't have to always specify it.
> e.g. createDataFrame(iris) instead of createDataFrame(sqlContext, iris)
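
A note on the S3 attempt in the comment: as written, the second `createDataFrame <- function(data, ...)` assignment replaces the generic defined with `UseMethod`, so `createDataFrame.jobj` is never reached and the plain function receives the context as `data`. That may be what produces the `unexpected type: jobj` error. A minimal sketch of how S3 dispatch on the first argument could cover both call styles, assuming the data-taking implementation is registered as the default method instead of reassigning the generic. The `getSqlContext` stub and the returned list are hypothetical stand-ins, not SparkR internals:

```r
# Hypothetical stub: stands in for however the global context is retrieved.
getSqlContext <- function() structure(list(), class = "jobj")

# The generic dispatches on the class of its first argument.
createDataFrame <- function(x, ...) UseMethod("createDataFrame")

# Old call style, createDataFrame(sqlContext, data, ...):
# drop the context and dispatch again on the data argument.
createDataFrame.jobj <- function(x, data, schema = NULL, samplingRatio = 1.0, ...) {
  createDataFrame(data, schema, samplingRatio)
}

# New call style, createDataFrame(data, ...): registered as the default
# method so that it does not overwrite the generic.
createDataFrame.default <- function(x, schema = NULL, samplingRatio = 1.0, ...) {
  sqlContext <- getSqlContext()
  # Placeholder result for illustration only.
  list(context = class(sqlContext), rows = nrow(x))
}

a <- createDataFrame(iris)                   # new style
b <- createDataFrame(getSqlContext(), iris)  # old style
```

In this sketch both calls end up in the default method. Whether this suits SparkR depends on how many functions would need the same treatment, and on the RDD-taking variants mentioned under option 1.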