[jira] [Commented] (CARBONDATA-285) Use path parameter in Spark datasource API
    [ https://issues.apache.org/jira/browse/CARBONDATA-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15573986#comment-15573986 ]

ASF GitHub Bot commented on CARBONDATA-285:
-------------------------------------------

Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/212#discussion_r83351395

    --- Diff: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchema.scala ---
    @@ -861,9 +861,11 @@ private[sql] case class CreateTable(cm: tableModel) extends RunnableCommand {
           val tablePath = catalog.createTableFromThrift(tableInfo, dbName, tbName, null)(sqlContext)
           try {
             sqlContext.sql(
    -          s"""CREATE TABLE $dbName.$tbName USING carbondata""" +
    -          s""" OPTIONS (tableName "$dbName.$tbName", tablePath "$tablePath") """)
    -          .collect
    +          s"""
    +             | CREATE TABLE $dbName.$tbName
    +             | USING carbondata
    +             | OPTIONS (path "$tablePath")
    --- End diff --

    OK, will fix it.

> Use path parameter in Spark datasource API
> ------------------------------------------
>
>                 Key: CARBONDATA-285
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-285
>             Project: CarbonData
>          Issue Type: Improvement
>          Components: spark-integration
>    Affects Versions: 0.1.0-incubating
>            Reporter: Jacky Li
>             Fix For: 0.2.0-incubating
>
> Currently, when using Carbon with the Spark datasource API, the database
> name and table name must be given as parameters, which is not the normal
> way of using the datasource API. In this PR, the database name and table
> name are no longer required; the user only needs to specify the `path`
> parameter (indicating the path to the table folder) when using the
> datasource API.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
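The new statement in the diff quoted in this message is written as a margin-delimited multi-line string. The following is a minimal sketch of how such a statement could be assembled in plain Scala. The helper name `buildCreateTableSql` and the sample values are hypothetical and do not come from the pull request, and the `.stripMargin` call is an assumption, since the quoted hunk ends before the string is closed.

```scala
object CreateTableSqlSketch {
  // Hypothetical helper (not from the PR): builds a datasource DDL statement
  // that passes only the table folder through the `path` option.
  def buildCreateTableSql(dbName: String, tbName: String, tablePath: String): String =
    s"""
       | CREATE TABLE $dbName.$tbName
       | USING carbondata
       | OPTIONS (path "$tablePath")
     """.stripMargin.trim // stripMargin removes everything up to and including each leading '|'

  def main(args: Array[String]): Unit = {
    // Sample values are illustrative only.
    println(buildCreateTableSql("default", "t1", "/tmp/store/default/t1"))
  }
}
```

Compared with concatenating two single-line strings, this form keeps each clause on its own line, which makes the `OPTIONS` list easier to review.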
[jira] [Commented] (CARBONDATA-285) Use path parameter in Spark datasource API
    [ https://issues.apache.org/jira/browse/CARBONDATA-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15573948#comment-15573948 ]

ASF GitHub Bot commented on CARBONDATA-285:
-------------------------------------------

Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/212#discussion_r83350489

    --- Diff: integration/spark/src/main/scala/org/apache/spark/sql/CarbonDatasourceRelation.scala ---
    @@ -55,18 +55,11 @@ class CarbonSource extends RelationProvider
       override def createRelation(
           sqlContext: SQLContext,
           parameters: Map[String, String]): BaseRelation = {
    -    // if path is provided we can directly create Hadoop relation. \
    -    // Otherwise create datasource relation
    -    parameters.get("path") match {
    -      case Some(path) => CarbonDatasourceHadoopRelation(sqlContext, Array(path), parameters, None)
    -      case _ =>
    -        val options = new CarbonOption(parameters)
    -        val tableIdentifier = options.tableIdentifier.split("""\.""").toSeq
    -        val identifier = tableIdentifier match {
    -          case Seq(name) => TableIdentifier(name, None)
    -          case Seq(db, name) => TableIdentifier(name, Some(db))
    -        }
    -        CarbonDatasourceRelation(identifier, None)(sqlContext)
    +    val options = new CarbonOption(parameters)
    +    if (sqlContext.isInstanceOf[CarbonContext]) {
    --- End diff --

    OK, got it. You have added `path` to `options` when creating the datasource table to make it work.
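The branch removed in the diff quoted in this message split a dotted table name into a table identifier. Its pattern match covers only one-part and two-part names, so a name with more parts would raise a `MatchError`. The sketch below illustrates the same parsing in plain Scala with an explicit failure case. The local `TableIdentifier` case class is a stand-in for Spark's class so that the example needs no Spark dependency, and `parse` is a hypothetical helper, not code from the pull request.

```scala
object TableIdentifierSketch {
  // Local stand-in for Spark's TableIdentifier; defined here only to keep the sketch self-contained.
  final case class TableIdentifier(table: String, database: Option[String])

  // Hypothetical helper: returns None instead of throwing when the name
  // is not of the form "table" or "db.table".
  def parse(qualifiedName: String): Option[TableIdentifier] =
    qualifiedName.split("""\.""").toSeq match {
      case Seq(name) if name.nonEmpty =>
        Some(TableIdentifier(name, None))
      case Seq(db, name) if db.nonEmpty && name.nonEmpty =>
        Some(TableIdentifier(name, Some(db)))
      case _ =>
        None
    }

  def main(args: Array[String]): Unit = {
    println(parse("t1"))
    println(parse("default.t1"))
    println(parse("a.b.c"))
  }
}
```

With the `path` option there is no name to parse at all, which is one reason the path-based form is simpler for datasource users.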
[jira] [Commented] (CARBONDATA-285) Use path parameter in Spark datasource API
    [ https://issues.apache.org/jira/browse/CARBONDATA-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15573945#comment-15573945 ]

ASF GitHub Bot commented on CARBONDATA-285:
-------------------------------------------

Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/212#discussion_r83350430

    --- Diff: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchema.scala ---
    @@ -861,9 +861,11 @@ private[sql] case class CreateTable(cm: tableModel) extends RunnableCommand {
           val tablePath = catalog.createTableFromThrift(tableInfo, dbName, tbName, null)(sqlContext)
           try {
             sqlContext.sql(
    -          s"""CREATE TABLE $dbName.$tbName USING carbondata""" +
    -          s""" OPTIONS (tableName "$dbName.$tbName", tablePath "$tablePath") """)
    -          .collect
    +          s"""
    +             | CREATE TABLE $dbName.$tbName
    +             | USING carbondata
    +             | OPTIONS (path "$tablePath")
    --- End diff --

    There would be backward compatibility issues here. Old tables cannot work because `path` was not present in their options.
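One way to picture the compatibility concern raised in this message is a lookup that prefers the new `path` option and falls back to the legacy `tablePath` option that the old statement wrote. This is a hypothetical sketch only: `resolveTablePath` is not part of the pull request, and nothing in the thread says the project adopted such a fallback. Keys are lower-cased before lookup purely as a precaution, on the assumption that option keys may not arrive in their original case.

```scala
object TablePathResolverSketch {
  // Hypothetical fallback (not from the PR): prefer the new `path` option,
  // otherwise accept the legacy `tablePath` option used by the old DDL.
  def resolveTablePath(parameters: Map[String, String]): Option[String] = {
    // Normalise keys so that "tablePath" and "tablepath" are treated alike.
    val normalized = parameters.map { case (key, value) => key.toLowerCase -> value }
    normalized.get("path").orElse(normalized.get("tablepath"))
  }

  def main(args: Array[String]): Unit = {
    val newStyle = Map("path" -> "/store/db/t")
    val oldStyle = Map("tableName" -> "db.t", "tablePath" -> "/store/db/t")
    println(resolveTablePath(newStyle))
    println(resolveTablePath(oldStyle))
    println(resolveTablePath(Map.empty[String, String]))
  }
}
```

Returning an `Option` leaves the caller to decide whether a missing path is an error or should trigger some other lookup.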
[jira] [Commented] (CARBONDATA-285) Use path parameter in Spark datasource API
    [ https://issues.apache.org/jira/browse/CARBONDATA-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15573569#comment-15573569 ]

ASF GitHub Bot commented on CARBONDATA-285:
-------------------------------------------

Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/212#discussion_r83336930

    --- Diff: integration/spark/src/main/scala/org/apache/spark/sql/CarbonDatasourceRelation.scala ---
    @@ -55,18 +55,11 @@ class CarbonSource extends RelationProvider
       override def createRelation(
           sqlContext: SQLContext,
           parameters: Map[String, String]): BaseRelation = {
    -    // if path is provided we can directly create Hadoop relation. \
    -    // Otherwise create datasource relation
    -    parameters.get("path") match {
    -      case Some(path) => CarbonDatasourceHadoopRelation(sqlContext, Array(path), parameters, None)
    -      case _ =>
    -        val options = new CarbonOption(parameters)
    -        val tableIdentifier = options.tableIdentifier.split("""\.""").toSeq
    -        val identifier = tableIdentifier match {
    -          case Seq(name) => TableIdentifier(name, None)
    -          case Seq(db, name) => TableIdentifier(name, Some(db))
    -        }
    -        CarbonDatasourceRelation(identifier, None)(sqlContext)
    +    val options = new CarbonOption(parameters)
    +    if (sqlContext.isInstanceOf[CarbonContext]) {
    --- End diff --

    It works; please check DataFrameAPIExample.scala.
[jira] [Commented] (CARBONDATA-285) Use path parameter in Spark datasource API
    [ https://issues.apache.org/jira/browse/CARBONDATA-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570916#comment-15570916 ]

ASF GitHub Bot commented on CARBONDATA-285:
-------------------------------------------

Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/212#discussion_r83146927

    --- Diff: integration/spark/src/main/scala/org/apache/spark/sql/CarbonDatasourceRelation.scala ---
    @@ -55,18 +55,11 @@ class CarbonSource extends RelationProvider
       override def createRelation(
           sqlContext: SQLContext,
           parameters: Map[String, String]): BaseRelation = {
    -    // if path is provided we can directly create Hadoop relation. \
    -    // Otherwise create datasource relation
    -    parameters.get("path") match {
    -      case Some(path) => CarbonDatasourceHadoopRelation(sqlContext, Array(path), parameters, None)
    -      case _ =>
    -        val options = new CarbonOption(parameters)
    -        val tableIdentifier = options.tableIdentifier.split("""\.""").toSeq
    -        val identifier = tableIdentifier match {
    -          case Seq(name) => TableIdentifier(name, None)
    -          case Seq(db, name) => TableIdentifier(name, Some(db))
    -        }
    -        CarbonDatasourceRelation(identifier, None)(sqlContext)
    +    val options = new CarbonOption(parameters)
    +    if (sqlContext.isInstanceOf[CarbonContext]) {
    --- End diff --

    Sorry, yes. `carboncontext.load(path)` cannot work now, right?
[jira] [Commented] (CARBONDATA-285) Use path parameter in Spark datasource API
    [ https://issues.apache.org/jira/browse/CARBONDATA-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570315#comment-15570315 ]

ASF GitHub Bot commented on CARBONDATA-285:
-------------------------------------------

Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/212#discussion_r83123943

    --- Diff: integration/spark/src/main/scala/org/apache/spark/sql/CarbonDatasourceRelation.scala ---
    @@ -55,18 +55,11 @@ class CarbonSource extends RelationProvider
       override def createRelation(
           sqlContext: SQLContext,
           parameters: Map[String, String]): BaseRelation = {
    -    // if path is provided we can directly create Hadoop relation. \
    -    // Otherwise create datasource relation
    -    parameters.get("path") match {
    -      case Some(path) => CarbonDatasourceHadoopRelation(sqlContext, Array(path), parameters, None)
    -      case _ =>
    -        val options = new CarbonOption(parameters)
    -        val tableIdentifier = options.tableIdentifier.split("""\.""").toSeq
    -        val identifier = tableIdentifier match {
    -          case Seq(name) => TableIdentifier(name, None)
    -          case Seq(db, name) => TableIdentifier(name, Some(db))
    -        }
    -        CarbonDatasourceRelation(identifier, None)(sqlContext)
    +    val options = new CarbonOption(parameters)
    +    if (sqlContext.isInstanceOf[CarbonContext]) {
    --- End diff --

    There is no `load` method in the dataframe, only in the context class.