[jira] [Commented] (CARBONDATA-285) Use path parameter in Spark datasource API

2016-10-13 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/CARBONDATA-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15573986#comment-15573986 ]

ASF GitHub Bot commented on CARBONDATA-285:
---

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/212#discussion_r83351395
  
--- Diff: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchema.scala ---
@@ -861,9 +861,11 @@ private[sql] case class CreateTable(cm: tableModel) extends RunnableCommand {
       val tablePath = catalog.createTableFromThrift(tableInfo, dbName, tbName, null)(sqlContext)
       try {
         sqlContext.sql(
-          s"""CREATE TABLE $dbName.$tbName USING carbondata""" +
-          s""" OPTIONS (tableName "$dbName.$tbName", tablePath "$tablePath") """)
-          .collect
+          s"""
+             | CREATE TABLE $dbName.$tbName
+             | USING carbondata
+             | OPTIONS (path "$tablePath")
--- End diff --

ok, will fix it


> Use path parameter in Spark datasource API
> --
>
> Key: CARBONDATA-285
> URL: https://issues.apache.org/jira/browse/CARBONDATA-285
> Project: CarbonData
>  Issue Type: Improvement
>  Components: spark-integration
>Affects Versions: 0.1.0-incubating
>Reporter: Jacky Li
> Fix For: 0.2.0-incubating
>
>
> Currently, when using carbon with the Spark datasource API, the database name
> and table name must be given as parameters, which is not the normal way of
> using the datasource API. With this PR, the database name and table name are
> no longer required; the user only needs to specify the `path` parameter
> (pointing to the table folder) when using the datasource API.
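
For illustration only (not part of the issue text): a minimal sketch of the intended usage after this change, assuming a Spark 1.x SQLContext with the CarbonData datasource on the classpath; the table folder path is a placeholder.

    import org.apache.spark.sql.SQLContext

    // Read a Carbon table through the generic datasource API by pointing the
    // `path` option at the table folder; no database or table name is needed.
    def readCarbonByPath(sqlContext: SQLContext, tableFolder: String) =
      sqlContext.read
        .format("carbondata")
        .option("path", tableFolder)  // e.g. "/tmp/carbonstore/default/t1" (placeholder)
        .load()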



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CARBONDATA-285) Use path parameter in Spark datasource API

2016-10-13 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/CARBONDATA-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15573948#comment-15573948 ]

ASF GitHub Bot commented on CARBONDATA-285:
---

Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/212#discussion_r83350489
  
--- Diff: integration/spark/src/main/scala/org/apache/spark/sql/CarbonDatasourceRelation.scala ---
@@ -55,18 +55,11 @@ class CarbonSource extends RelationProvider
   override def createRelation(
       sqlContext: SQLContext,
       parameters: Map[String, String]): BaseRelation = {
-    // if path is provided we can directly create Hadoop relation. \
-    // Otherwise create datasource relation
-    parameters.get("path") match {
-      case Some(path) => CarbonDatasourceHadoopRelation(sqlContext, Array(path), parameters, None)
-      case _ =>
-        val options = new CarbonOption(parameters)
-        val tableIdentifier = options.tableIdentifier.split("""\.""").toSeq
-        val identifier = tableIdentifier match {
-          case Seq(name) => TableIdentifier(name, None)
-          case Seq(db, name) => TableIdentifier(name, Some(db))
-        }
-        CarbonDatasourceRelation(identifier, None)(sqlContext)
+    val options = new CarbonOption(parameters)
+    if (sqlContext.isInstanceOf[CarbonContext]) {
--- End diff --

Ok, got it. You have added `path` to the `options` while creating the datasource table to make it work.




[jira] [Commented] (CARBONDATA-285) Use path parameter in Spark datasource API

2016-10-13 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/CARBONDATA-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15573945#comment-15573945 ]

ASF GitHub Bot commented on CARBONDATA-285:
---

Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/212#discussion_r83350430
  
--- Diff: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchema.scala ---
@@ -861,9 +861,11 @@ private[sql] case class CreateTable(cm: tableModel) extends RunnableCommand {
       val tablePath = catalog.createTableFromThrift(tableInfo, dbName, tbName, null)(sqlContext)
       try {
         sqlContext.sql(
-          s"""CREATE TABLE $dbName.$tbName USING carbondata""" +
-          s""" OPTIONS (tableName "$dbName.$tbName", tablePath "$tablePath") """)
-          .collect
+          s"""
+             | CREATE TABLE $dbName.$tbName
+             | USING carbondata
+             | OPTIONS (path "$tablePath")
--- End diff --

There would be backward compatibility issues here. Old tables cannot work 
because `path` was not present in their options.
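
Not from the PR, just a sketch of one possible backward-compatible lookup, assuming the legacy `tablePath` option shown in the old DDL above is still stored for tables created before this change:

    // Hypothetical fallback: prefer the new `path` option; otherwise fall back
    // to the legacy `tablePath` written by the old CREATE TABLE statement.
    def resolveTablePath(parameters: Map[String, String]): Option[String] =
      parameters.get("path").orElse(parameters.get("tablePath"))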




[jira] [Commented] (CARBONDATA-285) Use path parameter in Spark datasource API

2016-10-13 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/CARBONDATA-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15573569#comment-15573569 ]

ASF GitHub Bot commented on CARBONDATA-285:
---

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/212#discussion_r83336930
  
--- Diff: integration/spark/src/main/scala/org/apache/spark/sql/CarbonDatasourceRelation.scala ---
@@ -55,18 +55,11 @@ class CarbonSource extends RelationProvider
   override def createRelation(
       sqlContext: SQLContext,
       parameters: Map[String, String]): BaseRelation = {
-    // if path is provided we can directly create Hadoop relation. \
-    // Otherwise create datasource relation
-    parameters.get("path") match {
-      case Some(path) => CarbonDatasourceHadoopRelation(sqlContext, Array(path), parameters, None)
-      case _ =>
-        val options = new CarbonOption(parameters)
-        val tableIdentifier = options.tableIdentifier.split("""\.""").toSeq
-        val identifier = tableIdentifier match {
-          case Seq(name) => TableIdentifier(name, None)
-          case Seq(db, name) => TableIdentifier(name, Some(db))
-        }
-        CarbonDatasourceRelation(identifier, None)(sqlContext)
+    val options = new CarbonOption(parameters)
+    if (sqlContext.isInstanceOf[CarbonContext]) {
--- End diff --

It works; please check DataFrameAPIExample.scala.
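
For readers of this archive without the repository at hand, a rough sketch of the kind of round trip DataFrameAPIExample.scala demonstrates (not a copy of that file; the writer option name and table name below are assumptions):

    import org.apache.spark.sql.{CarbonContext, DataFrame, SaveMode}

    // Write a DataFrame as a Carbon table and read it back through the
    // datasource API; `cc` is a CarbonContext, `df` any DataFrame.
    def roundTrip(cc: CarbonContext, df: DataFrame): DataFrame = {
      df.write
        .format("carbondata")
        .option("tableName", "df_table")  // assumed option and table name
        .mode(SaveMode.Overwrite)
        .save()
      cc.read.format("carbondata").option("tableName", "df_table").load()
    }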




[jira] [Commented] (CARBONDATA-285) Use path parameter in Spark datasource API

2016-10-12 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/CARBONDATA-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570916#comment-15570916 ]

ASF GitHub Bot commented on CARBONDATA-285:
---

Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/212#discussion_r83146927
  
--- Diff: 
integration/spark/src/main/scala/org/apache/spark/sql/CarbonDatasourceRelation.scala
 ---
@@ -55,18 +55,11 @@ class CarbonSource extends RelationProvider
   override def createRelation(
   sqlContext: SQLContext,
   parameters: Map[String, String]): BaseRelation = {
-// if path is provided we can directly create Hadoop relation. \
-// Otherwise create datasource relation
-parameters.get("path") match {
-  case Some(path) => CarbonDatasourceHadoopRelation(sqlContext, 
Array(path), parameters, None)
-  case _ =>
-val options = new CarbonOption(parameters)
-val tableIdentifier = options.tableIdentifier.split("""\.""").toSeq
-val identifier = tableIdentifier match {
-  case Seq(name) => TableIdentifier(name, None)
-  case Seq(db, name) => TableIdentifier(name, Some(db))
-}
-CarbonDatasourceRelation(identifier, None)(sqlContext)
+val options = new CarbonOption(parameters)
+if (sqlContext.isInstanceOf[CarbonContext]) {
--- End diff --

Sorry, yes, `carboncontext.load(path)` cannot work now, right?




[jira] [Commented] (CARBONDATA-285) Use path parameter in Spark datasource API

2016-10-12 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/CARBONDATA-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570315#comment-15570315 ]

ASF GitHub Bot commented on CARBONDATA-285:
---

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/212#discussion_r83123943
  
--- Diff: integration/spark/src/main/scala/org/apache/spark/sql/CarbonDatasourceRelation.scala ---
@@ -55,18 +55,11 @@ class CarbonSource extends RelationProvider
   override def createRelation(
       sqlContext: SQLContext,
       parameters: Map[String, String]): BaseRelation = {
-    // if path is provided we can directly create Hadoop relation. \
-    // Otherwise create datasource relation
-    parameters.get("path") match {
-      case Some(path) => CarbonDatasourceHadoopRelation(sqlContext, Array(path), parameters, None)
-      case _ =>
-        val options = new CarbonOption(parameters)
-        val tableIdentifier = options.tableIdentifier.split("""\.""").toSeq
-        val identifier = tableIdentifier match {
-          case Seq(name) => TableIdentifier(name, None)
-          case Seq(db, name) => TableIdentifier(name, Some(db))
-        }
-        CarbonDatasourceRelation(identifier, None)(sqlContext)
+    val options = new CarbonOption(parameters)
+    if (sqlContext.isInstanceOf[CarbonContext]) {
--- End diff --

There is no `load` method on a DataFrame, only on the context class.
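
A small illustration of the point, assuming the Spark 1.x API and a placeholder path: `load` is reached through the context (via its DataFrameReader), not as a method on an existing DataFrame.

    // `load` is obtained from the context:
    val df = sqlContext.read.format("carbondata").load("/tmp/carbonstore/default/t1") // placeholder path
    // df.load("...")  // would not compile: DataFrame has no `load` method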

