Apache CarbonData is a new big data file format for faster interactive queries. It uses advanced columnar storage, indexing, compression, and encoding techniques to improve computing efficiency, which in turn helps speed up queries by an order of magnitude over petabytes of data.
The Apache CarbonData PMC team is happy to announce the release of 1.2.0. The community put significant effort into this release: more than 50 contributors completed 200+ pull requests for improvements and bug fixes.

1. Release Notes:
https://cwiki.apache.org/confluence/display/CARBONDATA/Apache+CarbonData+1.2.0+Release

2. Some key improvements in this release:
1) Sort columns feature: users can specify that only the required columns (those used in query filters) are sorted while loading the data, which improves loading speed.
2) Support for four sort scopes, chosen while creating the table: Local sort, Batch sort, Global sort, and No sort.
3) Partition support.
4) Optimized data update and delete for Spark 2.1.
5) Further performance improvements from optimizing the measure filter feature.
6) DataMap framework for adding custom indexes.
7) Ecosystem feature 1: Presto integration.
8) Ecosystem feature 2: Hive integration.

You can follow this document to use these artifacts:
https://github.com/apache/carbondata/blob/master/docs/quick-start-guide.md

We welcome your help and feedback. You can find more CarbonData documentation and learn more at:
http://carbondata.apache.org/

Thanks,
The Apache CarbonData team