[jira] [Created] (CARBONDATA-2824) Distributed CarbonStore

Ajith S (JIRA) Fri, 03 Aug 2018 19:57:33 -0700

Ajith S created CARBONDATA-2824:
-----------------------------------

             Summary: Distributed CarbonStore
                 Key: CARBONDATA-2824
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2824
             Project: CarbonData
          Issue Type: New Feature
            Reporter: Ajith S
            Assignee: Ajith S



Currently, CarbonStore is tightly coupled to the FileSystem interface and runs
inside the application's JVM process, for example within Spark. We can instead
run CarbonStore as a separate service that is accessed over the network via
RPC. As a follow-up to CARBONDATA-2688 (CarbonStore Java API and REST API), we
can make CarbonStore distributed.

This has several advantages:
1. Distributed CarbonStore can support parallel scanning, i.e. multiple tasks
can scan data in parallel, possibly with a higher degree of parallelism than
the compute layer.
2. Distributed CarbonStore can provide an index service to multiple
applications (Spark, Flink, Presto), so that the index is shared and resources
are saved.
3. Distributed CarbonStore's resource consumption is isolated from the
application, and it can be scaled easily to support higher workloads.
4. As a future improvement, Distributed CarbonStore can implement a query
cache, since it has independent resources.
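
To illustrate point 1, here is a minimal sketch of store-side parallel scanning.
It is not the CarbonData implementation: the class ParallelScanSketch, the
ScanTask type, and the storeParallelism parameter are hypothetical names, blocks
are plain in-memory lists, and a local thread pool stands in for the store's own
workers. The only point it makes is that the store can choose its own degree of
parallelism, independent of how many tasks the compute layer runs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelScanSketch {

    /** Hypothetical unit of scan work: filters one block of rows. */
    public static final class ScanTask implements Callable<List<String>> {
        private final List<String> block;
        private final String filter;

        public ScanTask(List<String> block, String filter) {
            this.block = block;
            this.filter = filter;
        }

        @Override
        public List<String> call() {
            List<String> matched = new ArrayList<>();
            for (String row : block) {
                if (row.contains(filter)) {
                    matched.add(row);
                }
            }
            return matched;
        }
    }

    /**
     * Scans all blocks with a pool sized by the store (assumption: this
     * number is configured on the store side, not by the calling application).
     * Results are merged in block order.
     */
    public static List<String> scan(List<List<String>> blocks, String filter,
                                    int storeParallelism) {
        ExecutorService pool = Executors.newFixedThreadPool(storeParallelism);
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (List<String> block : blocks) {
                futures.add(pool.submit(new ScanTask(block, filter)));
            }
            List<String> result = new ArrayList<>();
            for (Future<List<String>> future : futures) {
                result.addAll(future.get());
            }
            return result;
        } catch (Exception e) {
            throw new RuntimeException("scan failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<List<String>> blocks = new ArrayList<>();
        blocks.add(List.of("id=1,city=Pune", "id=2,city=Delhi"));
        blocks.add(List.of("id=3,city=Pune"));
        System.out.println(scan(blocks, "city=Pune", 4));
    }
}
```

In a real deployment the blocks would live on different store nodes and the
tasks would be dispatched over the network; the thread pool here only models
the fan-out and merge.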

Distributed CarbonStore will have two main deployment parts:
1. A cluster of remote CarbonStore services
2. An SDK that acts as a client for communicating with the store
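
The split between the two deployment parts could look roughly like the sketch
below. Everything in it is an assumption made for illustration: the
StoreService interface, the InMemoryStoreNode class, and the StoreClient class
are hypothetical names and are not part of the CarbonStore API. The network
transport is omitted entirely; an in-memory object stands in for each remote
node so that the example stays runnable.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StoreClientSketch {

    /** Hypothetical contract between the SDK and a store node.
     *  A real version would be an RPC or REST stub. */
    public interface StoreService {
        List<String> scan(String table);
    }

    /** In-memory stand-in for one node of the remote store cluster. */
    public static final class InMemoryStoreNode implements StoreService {
        private final Map<String, List<String>> tables = new HashMap<>();

        public void load(String table, List<String> rows) {
            tables.put(table, new ArrayList<>(rows));
        }

        @Override
        public List<String> scan(String table) {
            return tables.getOrDefault(table, Collections.emptyList());
        }
    }

    /** Hypothetical SDK client: asks every node for its part of the table
     *  and merges the answers in node order. */
    public static final class StoreClient {
        private final List<StoreService> cluster;

        public StoreClient(List<? extends StoreService> cluster) {
            this.cluster = new ArrayList<>(cluster);
        }

        public List<String> scan(String table) {
            List<String> rows = new ArrayList<>();
            for (StoreService node : cluster) {
                rows.addAll(node.scan(table));
            }
            return rows;
        }
    }

    public static void main(String[] args) {
        InMemoryStoreNode node1 = new InMemoryStoreNode();
        InMemoryStoreNode node2 = new InMemoryStoreNode();
        node1.load("sales", List.of("row-1", "row-2"));
        node2.load("sales", List.of("row-3"));

        StoreClient client = new StoreClient(List.of(node1, node2));
        System.out.println(client.scan("sales"));
    }
}
```

Because the application depends only on the client, the same store cluster
could in principle serve Spark, Flink, and Presto at once, which is what makes
the shared index in point 2 possible.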

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
