I happened upon Apache CarbonData while searching for information on other 
columnar data stores on HDFS. As I am looking for ways to accelerate 
consumption from Hadoop that could cover large queries, interactive queries, 
and OLAP, this technology sounds quite promising. On an initial read, 
CarbonData appears to be another columnar data store on HDFS, analogous to 
Parquet and ORC, but further reading suggests that data must pass through 
Spark to be loaded into this format. Is this truly the case?
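
For context, the load path I keep seeing in the documentation looks roughly 
like the following. This is only a minimal sketch from my reading: the table 
name, columns, and staging path are made up, and the spark.sql.extensions 
setting is my understanding of how CarbonData hooks into Spark, not something 
I have verified.

    -- Run in a Spark SQL session started with (assumption)
    -- --conf spark.sql.extensions=org.apache.spark.sql.CarbonExtensions
    CREATE TABLE IF NOT EXISTS sales (
      id     BIGINT,
      region STRING,
      amount DOUBLE
    ) STORED AS carbondata;

    -- Load a staged CSV file into the CarbonData table.
    LOAD DATA INPATH 'hdfs:///staging/sales.csv' INTO TABLE sales;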

I was hoping it would work like Parquet with Hive, where one simply defines 
an external Hive table with CarbonData as the designated file format. Is this 
possible, or is Spark required as an intermediary? And is CarbonData actually 
more like Druid than simply another columnar data store on HDFS?
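
To make the comparison concrete, this is the kind of DDL I was hoping would 
work. The Parquet version is standard Hive; the CarbonData version is only my 
guess at an equivalent, and I do not know whether Hive accepts it.

    -- Standard Hive external table over Parquet files.
    CREATE EXTERNAL TABLE sales_parquet (
      id     BIGINT,
      region STRING,
      amount DOUBLE
    )
    STORED AS PARQUET
    LOCATION 'hdfs:///warehouse/sales_parquet';

    -- Hoped-for CarbonData equivalent (syntax unconfirmed).
    CREATE EXTERNAL TABLE sales_carbon (
      id     BIGINT,
      region STRING,
      amount DOUBLE
    )
    STORED AS CARBONDATA
    LOCATION 'hdfs:///warehouse/sales_carbon';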


