[jira] [Created] (CARBONDATA-927) Show segment in data management doc
Sanoj MG created CARBONDATA-927:
-----------------------------------

             Summary: Show segment in data management doc
                 Key: CARBONDATA-927
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-927
             Project: CarbonData
          Issue Type: Improvement
          Components: docs
    Affects Versions: 1.2.0-incubating, 1.1.1-incubating
            Reporter: Sanoj MG
            Assignee: Sanoj MG
            Priority: Minor
             Fix For: 1.2.0-incubating, 1.1.1-incubating


Minor corrections in docs:
- Fix show segment link in data management docs
- Show segment command output to be reformatted


--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Assigned] (CARBONDATA-837) Unable to delete records from carbondata table
[ https://issues.apache.org/jira/browse/CARBONDATA-837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sanoj MG reassigned CARBONDATA-837:
-----------------------------------

    Assignee: Sanoj MG

> Unable to delete records from carbondata table
> ----------------------------------------------
>
>                 Key: CARBONDATA-837
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-837
>             Project: CarbonData
>          Issue Type: Bug
>          Components: spark-integration
>    Affects Versions: 1.1.0-incubating
>         Environment: HDP 2.5, Spark 1.6.2
>            Reporter: Sanoj MG
>            Assignee: Sanoj MG
>            Priority: Minor
>             Fix For: NONE
>
> As per the below document I am trying to delete entries from the table:
> https://github.com/apache/incubator-carbondata/blob/master/docs/dml-operation-on-carbondata.md
>
> scala> cc.sql("select * from accountentity").count
> res10: Long = 391351
>
> scala> cc.sql("delete from accountentity")
> INFO 30-03 09:03:03,099 - main Query [DELETE FROM ACCOUNTENTITY]
> INFO 30-03 09:03:03,104 - Parsing command: select tupleId from accountentity
> INFO 30-03 09:03:03,104 - Parse Completed
> INFO 30-03 09:03:03,105 - Parsing command: select tupleId from accountentity
> INFO 30-03 09:03:03,105 - Parse Completed
> res11: org.apache.spark.sql.DataFrame = []
>
> scala> cc.sql("select * from accountentity").count
> res12: Long = 391351
>
> The records get deleted only when an action such as show() is applied:
> scala> cc.sql("delete from accountentity").show
[jira] [Assigned] (CARBONDATA-884) [Documentation] information on assembly jar to be provided in Quick Start
[ https://issues.apache.org/jira/browse/CARBONDATA-884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sanoj MG reassigned CARBONDATA-884:
-----------------------------------

    Assignee: Sanoj MG

> [Documentation] information on assembly jar to be provided in Quick Start
> -------------------------------------------------------------------------
>
>                 Key: CARBONDATA-884
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-884
>             Project: CarbonData
>          Issue Type: Improvement
>            Reporter: Gururaj Shetty
>            Assignee: Sanoj MG
>            Priority: Minor
>
> In Quick Start we have mentioned the below command:
> Start Spark shell by running the following command in the Spark directory:
> ./bin/spark-shell --jars
>
> It would be better to mention where the user can find the assembly jar.
> For example: the assembly jar will be present in the target folder when you
> build the project.
[jira] [Assigned] (CARBONDATA-854) Carbondata with Datastax / Cassandra
[ https://issues.apache.org/jira/browse/CARBONDATA-854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sanoj MG reassigned CARBONDATA-854:
-----------------------------------

    Assignee: Sanoj MG

> Carbondata with Datastax / Cassandra
> ------------------------------------
>
>                 Key: CARBONDATA-854
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-854
>             Project: CarbonData
>          Issue Type: Improvement
>          Components: spark-integration
>    Affects Versions: 1.1.0-incubating
>         Environment: Datastax DSE 5.0 ( DSE analytics )
>            Reporter: Sanoj MG
>            Assignee: Sanoj MG
>            Priority: Minor
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> I am trying to get Carbondata working in a Datastax DSE 5.0 cluster.
> An exception is thrown while trying to create a Carbondata table from the
> spark shell. Below are the steps:
>
> scala> import com.datastax.spark.connector._
> scala> import org.apache.spark.sql.SaveMode
> scala> import org.apache.spark.sql.CarbonContext
> scala> import org.apache.spark.sql.types._
> scala> val cc = new CarbonContext(sc, "cfs://127.0.0.1/opt/CarbonStore")
> scala> val df = cc.read.parquet("file:///home/cassandra/testdata-30day/cassandra/zone.parquet")
> scala> df.write.format("carbondata").option("tableName", "zone").option("compress", "true").option("TempCSV","false").mode(SaveMode.Overwrite).save()
>
> The below exception is thrown and it fails to create the carbondata table:
> java.io.FileNotFoundException: /opt/CarbonStore/default/zone/Metadata/schema (No such file or directory)
>   at java.io.FileOutputStream.open0(Native Method)
>   at java.io.FileOutputStream.open(FileOutputStream.java:270)
>   at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
>   at java.io.FileOutputStream.<init>(FileOutputStream.java:133)
>   at org.apache.carbondata.core.datastore.impl.FileFactory.getDataOutputStream(FileFactory.java:207)
>   at org.apache.carbondata.core.writer.ThriftWriter.open(ThriftWriter.java:84)
>   at org.apache.spark.sql.hive.CarbonMetastore.createTableFromThrift(CarbonMetastore.scala:293)
>   at org.apache.spark.sql.execution.command.CreateTable.run(carbonTableSchema.scala:163)
>   at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
>   at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
>   at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
>   at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
>   at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
>   at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
>   at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
>   at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
>   at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
>   at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:145)
>   at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
>   at org.apache.spark.sql.CarbonContext.sql(CarbonContext.scala:139)
>   at org.apache.carbondata.spark.CarbonDataFrameWriter.saveAsCarbonFile(CarbonDataFrameWriter.scala:39)
>   at org.apache.spark.sql.CarbonSource.createRelation(CarbonDatasourceRelation.scala:109)
>   at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:222)
>   at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:148)
[jira] [Assigned] (CARBONDATA-836) Error in load using dataframe - columns containing comma
[ https://issues.apache.org/jira/browse/CARBONDATA-836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sanoj MG reassigned CARBONDATA-836:
-----------------------------------

    Assignee: Sanoj MG

> Error in load using dataframe - columns containing comma
> ---------------------------------------------------------
>
>                 Key: CARBONDATA-836
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-836
>             Project: CarbonData
>          Issue Type: Bug
>          Components: spark-integration
>    Affects Versions: 1.1.0-incubating
>         Environment: HDP sandbox 2.5, Spark 1.6.2
>            Reporter: Sanoj MG
>            Assignee: Sanoj MG
>            Priority: Minor
>             Fix For: NONE
>
> While trying to load data into a Carbondata table using a dataframe, columns
> containing commas are not properly loaded.
> Eg:
> scala> df.show(false)
> +-------+------+-----------+------------+---------+------+
> |Country|Branch|Name       |Address     |ShortName|Status|
> +-------+------+-----------+------------+---------+------+
> |2      |1     |Main Branch|, Dubai, UAE|UHO      |256   |
> +-------+------+-----------+------------+---------+------+
> scala> df.write.format("carbondata").option("tableName", "Branch1").option("compress", "true").mode(SaveMode.Overwrite).save()
> scala> cc.sql("select * from branch1").show(false)
> +-------+------+-----------+-------+---------+------+
> |country|branch|name       |address|shortname|status|
> +-------+------+-----------+-------+---------+------+
> |2      |1     |Main Branch|       | Dubai   |null  |
> +-------+------+-----------+-------+---------+------+
[jira] [Assigned] (CARBONDATA-888) Dictionary include / exclude option in dataframe writer
[ https://issues.apache.org/jira/browse/CARBONDATA-888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sanoj MG reassigned CARBONDATA-888:
-----------------------------------

    Assignee: Sanoj MG

> Dictionary include / exclude option in dataframe writer
> --------------------------------------------------------
>
>                 Key: CARBONDATA-888
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-888
>             Project: CarbonData
>          Issue Type: Improvement
>          Components: spark-integration
>    Affects Versions: 1.2.0-incubating
>         Environment: HDP 2.5, Spark 1.6
>            Reporter: Sanoj MG
>            Assignee: Sanoj MG
>            Priority: Minor
>             Fix For: 1.2.0-incubating
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> While creating a Carbondata table from a dataframe, it is currently not
> possible to specify the columns that need to be included in or excluded from
> the dictionary. An option is required to specify this, as below:
>
> df.write.format("carbondata")
>   .option("tableName", "test")
>   .option("compress", "true")
>   .option("dictionary_include", "incol1,intcol2")
>   .option("dictionary_exclude", "stringcol1,stringcol2")
>   .mode(SaveMode.Overwrite)
>   .save()
>
> We have a lot of integer columns that are dimensions; dataframe.save is used
> to quickly create tables instead of writing DDLs, and it would be nice to
> have this feature for executing POCs.
[jira] [Commented] (CARBONDATA-888) Dictionary include / exclude option in dataframe writer
[ https://issues.apache.org/jira/browse/CARBONDATA-888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15961952#comment-15961952 ]

Sanoj MG commented on CARBONDATA-888:
-------------------------------------

Can this be assigned to me? I have already made the code changes and would
like to create a PR.

> Dictionary include / exclude option in dataframe writer
> --------------------------------------------------------
>
>                 Key: CARBONDATA-888
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-888
>             Project: CarbonData
>          Issue Type: Improvement
>          Components: spark-integration
>    Affects Versions: 1.2.0-incubating
>         Environment: HDP 2.5, Spark 1.6
>            Reporter: Sanoj MG
>            Priority: Minor
>             Fix For: 1.2.0-incubating
>
> While creating a Carbondata table from a dataframe, it is currently not
> possible to specify the columns that need to be included in or excluded from
> the dictionary. An option is required to specify this, as below:
>
> df.write.format("carbondata")
>   .option("tableName", "test")
>   .option("compress", "true")
>   .option("dictionary_include", "incol1,intcol2")
>   .option("dictionary_exclude", "stringcol1,stringcol2")
>   .mode(SaveMode.Overwrite)
>   .save()
>
> We have a lot of integer columns that are dimensions; dataframe.save is used
> to quickly create tables instead of writing DDLs, and it would be nice to
> have this feature for executing POCs.
[jira] [Created] (CARBONDATA-888) Dictionary include / exclude option in dataframe writer
Sanoj MG created CARBONDATA-888:
-----------------------------------

             Summary: Dictionary include / exclude option in dataframe writer
                 Key: CARBONDATA-888
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-888
             Project: CarbonData
          Issue Type: Improvement
          Components: spark-integration
    Affects Versions: 1.2.0-incubating
         Environment: HDP 2.5, Spark 1.6
            Reporter: Sanoj MG
            Priority: Minor
             Fix For: 1.2.0-incubating


While creating a Carbondata table from a dataframe, it is currently not
possible to specify the columns that need to be included in or excluded from
the dictionary. An option is required to specify this, as below:

df.write.format("carbondata")
  .option("tableName", "test")
  .option("compress", "true")
  .option("dictionary_include", "incol1,intcol2")
  .option("dictionary_exclude", "stringcol1,stringcol2")
  .mode(SaveMode.Overwrite)
  .save()

We have a lot of integer columns that are dimensions; dataframe.save is used
to quickly create tables instead of writing DDLs, and it would be nice to
have this feature for executing POCs.
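The proposed options take comma-separated column lists. As an illustration only (this sketch is not CarbonData code; the helper name and validation rules are hypothetical, and CarbonData itself would implement this in Scala), parsing and sanity-checking such option strings could look like:

```python
def parse_dictionary_options(options):
    """Hypothetical helper: split the proposed dictionary_include /
    dictionary_exclude writer options into column-name sets and reject
    columns listed in both."""
    def to_set(key):
        raw = options.get(key, "")
        return {c.strip() for c in raw.split(",") if c.strip()}

    include = to_set("dictionary_include")
    exclude = to_set("dictionary_exclude")
    overlap = include & exclude
    if overlap:
        raise ValueError(
            "columns present in both include and exclude: %s" % sorted(overlap))
    return include, exclude

# Mirrors the option strings from the proposal above.
inc, exc = parse_dictionary_options({
    "dictionary_include": "incol1,intcol2",
    "dictionary_exclude": "stringcol1,stringcol2",
})
```

A writer implementation would then consult these sets per column when deciding whether to build a dictionary for it.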
[jira] [Commented] (CARBONDATA-854) Carbondata with Datastax / Cassandra
[ https://issues.apache.org/jira/browse/CARBONDATA-854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15955500#comment-15955500 ]

Sanoj MG commented on CARBONDATA-854:
-------------------------------------

I am working on this. Can this be assigned to me?

> Carbondata with Datastax / Cassandra
> ------------------------------------
>
>                 Key: CARBONDATA-854
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-854
>             Project: CarbonData
>          Issue Type: Improvement
>          Components: spark-integration
>    Affects Versions: 1.1.0-incubating
>         Environment: Datastax DSE 5.0 ( DSE analytics )
>            Reporter: Sanoj MG
>            Priority: Minor
>             Fix For: 1.1.0-incubating
>
> I am trying to get Carbondata working in a Datastax DSE 5.0 cluster.
> An exception is thrown while trying to create a Carbondata table from the
> spark shell. Below are the steps:
>
> scala> import com.datastax.spark.connector._
> scala> import org.apache.spark.sql.SaveMode
> scala> import org.apache.spark.sql.CarbonContext
> scala> import org.apache.spark.sql.types._
> scala> val cc = new CarbonContext(sc, "cfs://127.0.0.1/opt/CarbonStore")
> scala> val df = cc.read.parquet("file:///home/cassandra/testdata-30day/cassandra/zone.parquet")
> scala> df.write.format("carbondata").option("tableName", "zone").option("compress", "true").option("TempCSV","false").mode(SaveMode.Overwrite).save()
>
> The below exception is thrown and it fails to create the carbondata table:
> java.io.FileNotFoundException: /opt/CarbonStore/default/zone/Metadata/schema (No such file or directory)
>   at java.io.FileOutputStream.open0(Native Method)
>   at java.io.FileOutputStream.open(FileOutputStream.java:270)
>   at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
>   at java.io.FileOutputStream.<init>(FileOutputStream.java:133)
>   at org.apache.carbondata.core.datastore.impl.FileFactory.getDataOutputStream(FileFactory.java:207)
>   at org.apache.carbondata.core.writer.ThriftWriter.open(ThriftWriter.java:84)
>   at org.apache.spark.sql.hive.CarbonMetastore.createTableFromThrift(CarbonMetastore.scala:293)
>   at org.apache.spark.sql.execution.command.CreateTable.run(carbonTableSchema.scala:163)
>   at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
>   at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
>   at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
>   at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
>   at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
>   at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
>   at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
>   at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
>   at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
>   at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:145)
>   at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
>   at org.apache.spark.sql.CarbonContext.sql(CarbonContext.scala:139)
>   at org.apache.carbondata.spark.CarbonDataFrameWriter.saveAsCarbonFile(CarbonDataFrameWriter.scala:39)
>   at org.apache.spark.sql.CarbonSource.createRelation(CarbonDatasourceRelation.scala:109)
>   at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:222)
>   at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:148)
[jira] [Created] (CARBONDATA-854) Carbondata with Datastax / Cassandra
Sanoj MG created CARBONDATA-854:
-----------------------------------

             Summary: Carbondata with Datastax / Cassandra
                 Key: CARBONDATA-854
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-854
             Project: CarbonData
          Issue Type: Improvement
          Components: spark-integration
    Affects Versions: 1.1.0-incubating
         Environment: Datastax DSE 5.0 ( DSE analytics )
            Reporter: Sanoj MG
            Priority: Minor
             Fix For: 1.1.0-incubating


I am trying to get Carbondata working in a Datastax DSE 5.0 cluster.
An exception is thrown while trying to create a Carbondata table from the
spark shell. Below are the steps:

scala> import com.datastax.spark.connector._
scala> import org.apache.spark.sql.SaveMode
scala> import org.apache.spark.sql.CarbonContext
scala> import org.apache.spark.sql.types._
scala> val cc = new CarbonContext(sc, "cfs://127.0.0.1/opt/CarbonStore")
scala> val df = cc.read.parquet("file:///home/cassandra/testdata-30day/cassandra/zone.parquet")
scala> df.write.format("carbondata").option("tableName", "zone").option("compress", "true").option("TempCSV","false").mode(SaveMode.Overwrite).save()

The below exception is thrown and it fails to create the carbondata table:

java.io.FileNotFoundException: /opt/CarbonStore/default/zone/Metadata/schema (No such file or directory)
  at java.io.FileOutputStream.open0(Native Method)
  at java.io.FileOutputStream.open(FileOutputStream.java:270)
  at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
  at java.io.FileOutputStream.<init>(FileOutputStream.java:133)
  at org.apache.carbondata.core.datastore.impl.FileFactory.getDataOutputStream(FileFactory.java:207)
  at org.apache.carbondata.core.writer.ThriftWriter.open(ThriftWriter.java:84)
  at org.apache.spark.sql.hive.CarbonMetastore.createTableFromThrift(CarbonMetastore.scala:293)
  at org.apache.spark.sql.execution.command.CreateTable.run(carbonTableSchema.scala:163)
  at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
  at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
  at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
  at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
  at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
  at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:145)
  at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
  at org.apache.spark.sql.CarbonContext.sql(CarbonContext.scala:139)
  at org.apache.carbondata.spark.CarbonDataFrameWriter.saveAsCarbonFile(CarbonDataFrameWriter.scala:39)
  at org.apache.spark.sql.CarbonSource.createRelation(CarbonDatasourceRelation.scala:109)
  at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:222)
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:148)
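A notable detail in the report above: the store path was given as cfs://127.0.0.1/opt/CarbonStore, yet the FileNotFoundException refers to the local path /opt/CarbonStore/... . This is consistent with the file layer not recognizing the cfs:// scheme and falling back to the local filesystem, though whether FileFactory actually behaves this way is an assumption from the symptom, not confirmed from the code. A language-neutral sketch of that failure mode (function name and scheme list are hypothetical):

```python
def resolve_store_scheme(path, known_schemes=("hdfs://", "viewfs://")):
    """Hypothetical sketch of scheme dispatch: a path with an unrecognized
    scheme (e.g. cfs://) is treated as a local file path, reproducing the
    symptom of a local-FS error for a distributed store path."""
    for scheme in known_schemes:
        if path.startswith(scheme):
            return "distributed", path  # hand off to the DFS client
    if "://" in path:
        # Unknown scheme: strip "scheme://host" and keep the path part,
        # so cfs://127.0.0.1/opt/CarbonStore becomes /opt/CarbonStore.
        rest = path.split("://", 1)[1]
        if "/" in rest:
            return "local", "/" + rest.split("/", 1)[1]
        return "local", rest
    return "local", path
```

Under this reading, the fix would be either mounting the store on a scheme the file layer supports or teaching it about the DSE filesystem scheme.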
[jira] [Commented] (CARBONDATA-837) Unable to delete records from carbondata table
[ https://issues.apache.org/jira/browse/CARBONDATA-837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948907#comment-15948907 ]

Sanoj MG commented on CARBONDATA-837:
-------------------------------------

Can this issue be assigned to me?

> Unable to delete records from carbondata table
> ----------------------------------------------
>
>                 Key: CARBONDATA-837
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-837
>             Project: CarbonData
>          Issue Type: Bug
>          Components: spark-integration
>    Affects Versions: 1.1.0-incubating
>         Environment: HDP 2.5, Spark 1.6.2
>            Reporter: Sanoj MG
>            Priority: Minor
>             Fix For: NONE
>
> As per the below document I am trying to delete entries from the table:
> https://github.com/apache/incubator-carbondata/blob/master/docs/dml-operation-on-carbondata.md
>
> scala> cc.sql("select * from accountentity").count
> res10: Long = 391351
>
> scala> cc.sql("delete from accountentity")
> INFO 30-03 09:03:03,099 - main Query [DELETE FROM ACCOUNTENTITY]
> INFO 30-03 09:03:03,104 - Parsing command: select tupleId from accountentity
> INFO 30-03 09:03:03,104 - Parse Completed
> INFO 30-03 09:03:03,105 - Parsing command: select tupleId from accountentity
> INFO 30-03 09:03:03,105 - Parse Completed
> res11: org.apache.spark.sql.DataFrame = []
>
> scala> cc.sql("select * from accountentity").count
> res12: Long = 391351
>
> The records get deleted only when an action such as show() is applied:
> scala> cc.sql("delete from accountentity").show
[jira] [Created] (CARBONDATA-837) Unable to delete records from carbondata table
Sanoj MG created CARBONDATA-837:
-----------------------------------

             Summary: Unable to delete records from carbondata table
                 Key: CARBONDATA-837
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-837
             Project: CarbonData
          Issue Type: Bug
          Components: spark-integration
    Affects Versions: 1.1.0-incubating
         Environment: HDP 2.5, Spark 1.6.2
            Reporter: Sanoj MG
            Priority: Minor
             Fix For: NONE


As per the below document I am trying to delete entries from the table:
https://github.com/apache/incubator-carbondata/blob/master/docs/dml-operation-on-carbondata.md

scala> cc.sql("select * from accountentity").count
res10: Long = 391351

scala> cc.sql("delete from accountentity")
INFO 30-03 09:03:03,099 - main Query [DELETE FROM ACCOUNTENTITY]
INFO 30-03 09:03:03,104 - Parsing command: select tupleId from accountentity
INFO 30-03 09:03:03,104 - Parse Completed
INFO 30-03 09:03:03,105 - Parsing command: select tupleId from accountentity
INFO 30-03 09:03:03,105 - Parse Completed
res11: org.apache.spark.sql.DataFrame = []

scala> cc.sql("select * from accountentity").count
res12: Long = 391351

The records get deleted only when an action such as show() is applied:

scala> cc.sql("delete from accountentity").show
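The behavior reported above matches Spark's lazy evaluation model: cc.sql(...) only builds a plan, and nothing runs until an action such as show() or collect() is invoked, so a side-effecting DELETE appears to do nothing. A toy model of that execution model (plain Python, not Spark or CarbonData code; class and callback names are invented for illustration):

```python
class LazyQuery:
    """Toy model of a lazily evaluated query: constructing it records the
    statement but does not run it; only an action triggers execution."""

    def __init__(self, sql, executor):
        self.sql = sql          # the recorded statement (the "plan")
        self.executor = executor  # callable that actually runs statements

    def show(self):
        # An action: only now does the statement actually execute.
        return self.executor(self.sql)


executed = []  # stand-in for the table's mutation log

def run(sql):
    executed.append(sql)
    return sql

q = LazyQuery("delete from accountentity", run)
# Building the query has no effect, matching res11 above (count unchanged).
before = list(executed)
q.show()  # the action fires the delete, matching the .show workaround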
[jira] [Commented] (CARBONDATA-836) Error in load using dataframe - columns containing comma
[ https://issues.apache.org/jira/browse/CARBONDATA-836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15948629#comment-15948629 ] Sanoj MG commented on CARBONDATA-836: - I am working on this issue, can this be assigned to me. > Error in load using dataframe - columns containing comma > - > > Key: CARBONDATA-836 > URL: https://issues.apache.org/jira/browse/CARBONDATA-836 > Project: CarbonData > Issue Type: Bug > Components: spark-integration >Affects Versions: 1.1.0-incubating > Environment: HDP sandbox 2.5, Spark 1.6.2 >Reporter: Sanoj MG >Priority: Minor > Fix For: NONE > > > While trying to load data into Carabondata table using dataframe, the columns > containing commas are not properly loaded. > Eg: > scala> df.show(false) > +---+--+---++-+--+ > |Country|Branch|Name |Address |ShortName|Status| > +---+--+---++-+--+ > |2 |1 |Main Branch|, Dubai, UAE|UHO |256 | > +---+--+---++-+--+ > scala> df.write.format("carbondata").option("tableName", > "Branch1").option("compress", "true").mode(SaveMode.Overwrite).save() > scala> cc.sql("select * from branch1").show(false) > +---+--+---+---+-+--+ > |country|branch|name |address|shortname|status| > +---+--+---+---+-+--+ > |2 |1 |Main Branch| | Dubai |null | > +---+--+---+---+-+--+ -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (CARBONDATA-836) Error in load using dataframe - columns containing comma
Sanoj MG created CARBONDATA-836: --- Summary: Error in load using dataframe - columns containing comma Key: CARBONDATA-836 URL: https://issues.apache.org/jira/browse/CARBONDATA-836 Project: CarbonData Issue Type: Bug Components: spark-integration Affects Versions: 1.1.0-incubating Environment: HDP sandbox 2.5, Spark 1.6.2 Reporter: Sanoj MG Priority: Minor Fix For: NONE While trying to load data into Carabondata table using dataframe, the columns containing commas are not properly loaded. Eg: scala> df.show(false) +---+--+---++-+--+ |Country|Branch|Name |Address |ShortName|Status| +---+--+---++-+--+ |2 |1 |Main Branch|, Dubai, UAE|UHO |256 | +---+--+---++-+--+ scala> df.write.format("carbondata").option("tableName", "Branch1").option("compress", "true").mode(SaveMode.Overwrite).save() scala> cc.sql("select * from branch1").show(false) +---+--+---+---+-+--+ |country|branch|name |address|shortname|status| +---+--+---+---+-+--+ |2 |1 |Main Branch| | Dubai |null | +---+--+---+---+-+--+ -- This message was sent by Atlassian JIRA (v6.3.15#6346)