+1. I think this is a good new feature to have, but the development effort is quite high, and I am worried about the release cycle getting longer. Can you define a roadmap for this feature so that it can be delivered in phases across future versions?
Do you have anything in mind for the roadmap?

Regards,
Jacky

> On August 2, 2018, at 11:43 AM, Ajith shetty <[email protected]> wrote:
>
> Hi all
>
> Currently the CarbonStore is very tightly coupled with the FileSystem
> interface and runs in-process in the JVM, as in Spark. We can instead make
> CarbonStore run as a separate service that can be accessed via network/RPC.
> So, as a follow-up to CARBONDATA-2688 (CarbonStore Java API and REST API),
> we can make the carbon store distributed.
>
> This has some advantages:
>
> · Distributed CarbonStore can support parallel scanning, i.e. multiple
> tasks can start scanning data in parallel, which may give a higher
> parallelism factor than the compute layer
>
> · Distributed CarbonStore can provide an index service to multiple apps
> (Spark / Flink / Presto), so that the index is shared to save resources
>
> · Distributed CarbonStore resource consumption is isolated from the
> application and is easily scalable to support higher workloads
>
> · As a future improvement, Distributed CarbonStore can implement a
> query cache, since it has independent resources
>
> Distributed CarbonStore will have 2 main deployment parts:
>
> 1. A cluster of remote carbon store services
>
> 2. An SDK that acts as a client for communication with the store
>
> Please provide your inputs/suggestions. If the idea sounds promising, I will
> go ahead and create the JIRA/sub-JIRAs for it.
>
> Regards
> Ajith
