WilliamZhu created CARBONDATA-465:
---
Summary: Spark streaming dataframe support
Key: CARBONDATA-465
URL: https://issues.apache.org/jira/browse/CARBONDATA-465
Project: CarbonData
Issue Type:
suo tong created CARBONDATA-464:
---
Summary: Too many times GC occurs in query if we increase the
blocklet size
Key: CARBONDATA-464
URL: https://issues.apache.org/jira/browse/CARBONDATA-464
Project:
Hi Jihong,
Thanks for your attention and reply.
1. Actually I have done benchmarks with English/Chinese dictionary sizes of
{100K, 200K, 300K, 400K, 500K, 600K} separately, and the test results are
basically the same as mentioned in this mail thread before. I will submit and
open the benchmark code and dictionary
Integration with Spark 2.x is a great feature for CarbonData, as Spark 2.x is
gradually gaining momentum. This is a big effort ahead, so let's take into
consideration all the complexity involved due to the dramatic API-level changes;
realizing it in phases is a good idea.
Regards.
Jihong
Thank you Xiaoqiao for looking into this issue and sharing your result!
Have you tried varied dictionary sizes for comparison among all the
alternatives?
Also, please pay closer attention to the license of the DAT implementations: as they
are under LGPL, generally speaking, it is not legally allowed
+1
A rich set of features is planned for inclusion in the next release, and more
importantly, there will be external API changes introduced as we integrate with
Spark 2.x. CarbonData deserves a major version jump as it gets
mature/production-ready and powerful in terms of rich functionality and
+1
I think I can finish some tasks. Please assign some tasks to me.
--
View this message in context:
http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/Feature-Proposal-Spark-2-integration-with-CarbonData-tp3236p3320.html
Sent from the Apache CarbonData Mailing List archive
Hi Lionel
You don't need to create the table first; please see the example code in
ExampleUtils.scala:
df.write
.format("carbondata")
.option("tableName", tableName)
.option("compress", "true")
.option("useKettle", "false")
.mode(mode)
.save()
Preparation of the API docs is in progress.
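To put the snippet above in context, here is a minimal sketch of the full flow. It assumes a CarbonContext `cc` has already been created as in ExampleUtils.scala, and the DataFrame contents and table name below are illustrative only:

```scala
import org.apache.spark.sql.SaveMode
import cc.implicits._ // enables .toDF on an RDD of tuples; `cc` is assumed to exist

// Build a small sample DataFrame (illustrative data).
val df = cc.sparkContext
  .parallelize(1 to 1000)
  .map(i => (s"name_$i", i))
  .toDF("name", "id")

// Writing with format "carbondata" creates the table on first save,
// so no prior CREATE TABLE statement is needed.
df.write
  .format("carbondata")
  .option("tableName", "sample_table") // hypothetical table name
  .option("compress", "true")
  .option("useKettle", "false")
  .mode(SaveMode.Overwrite)
  .save()

// The data can then be queried back through SQL:
cc.sql("SELECT count(*) FROM sample_table").show()
```

This is a sketch under the stated assumptions, not a definitive implementation; the option names are taken verbatim from the example above.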
Hi team,
I'm trying to save a Spark DataFrame to a CarbonData file. I saw the example in
your wiki with
option("tableName", "carbontable"). Does that mean I have to create a
CarbonData table first and then save data into the table? Can I save it
directly without creating the CarbonData table?
The code is
Jacky Li created CARBONDATA-463:
---
Summary: Extract spark-common module
Key: CARBONDATA-463
URL: https://issues.apache.org/jira/browse/CARBONDATA-463
Project: CarbonData
Issue Type: Sub-task
Jacky Li created CARBONDATA-462:
---
Summary: Clean up code before moving to spark-common package
Key: CARBONDATA-462
URL: https://issues.apache.org/jira/browse/CARBONDATA-462
Project: CarbonData
Jacky Li created CARBONDATA-461:
---
Summary: Clean partitioner in RDD package
Key: CARBONDATA-461
URL: https://issues.apache.org/jira/browse/CARBONDATA-461
Project: CarbonData
Issue Type:
SWATI RAO created CARBONDATA-460:
---
Summary: Add Unit Tests For core.writer.sortindex package
Key: CARBONDATA-460
URL: https://issues.apache.org/jira/browse/CARBONDATA-460
Project: CarbonData
Manish Gupta created CARBONDATA-459:
---
Summary: Block distribution is wrong in case of dynamic
allocation=true
Key: CARBONDATA-459
URL: https://issues.apache.org/jira/browse/CARBONDATA-459
Project:
Hi Ramana,
Sure, I can work out a subtasks list and put it under CARBONDATA-322
Regards,
Jacky
kumar vishal created CARBONDATA-458:
---
Summary: Improving carbon first time query performance
Key: CARBONDATA-458
URL: https://issues.apache.org/jira/browse/CARBONDATA-458
Project: CarbonData
SWATI RAO created CARBONDATA-457:
---
Summary: Add Unit Tests For core.writer package
Key: CARBONDATA-457
URL: https://issues.apache.org/jira/browse/CARBONDATA-457
Project: CarbonData
Issue