Re: [DISCUSS] Distributed CarbonStore

Jacky Li Sun, 05 Aug 2018 05:37:55 -0700

+1

I think this is a good new feature to have, but the development effort is quite 
high, and I am worried about the release cycle getting longer. Can you define a 
roadmap for this feature so that it can be delivered in phases across future 
versions?


Do you have anything in mind for the roadmap?

Regards,
Jacky
 

> On Aug 2, 2018, at 11:43 AM, Ajith shetty <[email protected]> wrote:
> 
> Hi all
> 
> Currently, CarbonStore is tightly coupled with the FileSystem interface and 
> runs in-process, inside the application's JVM (for example, in Spark). We 
> could instead run CarbonStore as a separate service that is accessed over the 
> network via RPC. So, as a follow-up to CARBONDATA-2688 (CarbonStore Java API 
> and REST API), we can make CarbonStore distributed.
> 
> This has some advantages.
> 
> · Distributed CarbonStore can support parallel scanning, i.e. multiple tasks 
> can scan data in parallel, possibly with a higher degree of parallelism than 
> the compute layer.
> 
> · Distributed CarbonStore can offer an index service to multiple applications 
> (Spark, Flink, Presto), so that the index is shared and resources are saved.
> 
> · Distributed CarbonStore's resource consumption is isolated from the 
> application, and it can be scaled easily to support higher workloads.
> 
> · As a future improvement, Distributed CarbonStore can implement a query 
> cache, since it has its own independent resources.
> 
> 
> 
> Distributed CarbonStore will have two main deployment parts:
> 
> 1. A cluster of remote CarbonStore services
> 
> 2. An SDK that acts as a client for communicating with the store
> 
> Please provide your inputs/suggestions. If the idea sounds promising, I will 
> go ahead and create a JIRA and sub-JIRAs for it.
> 
> Regards
> Ajith
>
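
For readers following the thread, the two deployment parts proposed above (a cluster of store services plus a client SDK) could look roughly like the sketch below. This is only an illustration under stated assumptions: StoreNode, InMemoryNode and StoreClient are hypothetical names, not CarbonData classes, and the "remote" nodes are in-memory stand-ins so the example runs without any network or RPC layer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: none of these types exist in CarbonData.
final class DistributedStoreSketch {

    /** Part 1: the operation one node of the store service cluster would expose. */
    interface StoreNode {
        List<String> scan(String table);
    }

    /** In-memory stand-in for a remote node; a real node would sit behind RPC or REST. */
    static final class InMemoryNode implements StoreNode {
        private final List<String> rows;

        InMemoryNode(List<String> rows) {
            this.rows = new ArrayList<>(rows);
        }

        @Override
        public List<String> scan(String table) {
            // The table name is ignored here; each node simply returns its own rows.
            return new ArrayList<>(rows);
        }
    }

    /** Part 2: the SDK client, which fans a scan out to all nodes in parallel. */
    static final class StoreClient {
        private final List<StoreNode> nodes;

        StoreClient(List<? extends StoreNode> nodes) {
            this.nodes = new ArrayList<>(nodes);
        }

        List<String> scan(String table) {
            ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, nodes.size()));
            try {
                // Submit one scan task per node so that the nodes work concurrently.
                List<Future<List<String>>> futures = new ArrayList<>();
                for (StoreNode node : nodes) {
                    futures.add(pool.submit(() -> node.scan(table)));
                }
                // Collect results in node order so the output is deterministic.
                List<String> result = new ArrayList<>();
                for (Future<List<String>> future : futures) {
                    result.addAll(future.get());
                }
                return result;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("scan interrupted", e);
            } catch (ExecutionException e) {
                throw new IllegalStateException("scan failed on a node", e.getCause());
            } finally {
                pool.shutdown();
            }
        }
    }

    public static void main(String[] args) {
        List<StoreNode> nodes = new ArrayList<>();
        nodes.add(new InMemoryNode(List.of("row-1", "row-2")));
        nodes.add(new InMemoryNode(List.of("row-3")));
        System.out.println(new StoreClient(nodes).scan("sales").size() + " rows scanned");
    }
}
```

The client here decides the degree of parallelism (one task per node), which mirrors the point in the proposal that the store's scan parallelism need not match that of the compute layer.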
