Re: Data source management(Hive and Kafka)

Li Yang Sun, 10 Jul 2016 12:02:18 -0700

Setting a classpath for each individual data source is a good idea.

However, convenience for the 80% case must not be compromised. E.g., a
meaningful default Hive data source is still very important in my opinion.
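
To make the idea concrete, here is a minimal Python sketch of a per-project
registry where every data source carries its own classpath, while a lookup
with no name falls back to a default Hive source. All names (DataSource,
ProjectDataSources, resolve) and the example paths are hypothetical
illustrations, not existing Kylin classes or configuration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DataSource:
    """Hypothetical description of one data source."""
    name: str
    kind: str  # e.g. "hive" or "kafka"
    classpath: List[str] = field(default_factory=list)  # per-source classpath


class ProjectDataSources:
    """Hypothetical per-project registry; not an existing Kylin class."""

    def __init__(self, default_hive: DataSource) -> None:
        # The default Hive source is always registered, so the common
        # single-cluster setup needs no extra configuration.
        self._default_hive = default_hive
        self._sources: Dict[str, DataSource] = {default_hive.name: default_hive}

    def register(self, source: DataSource) -> None:
        """Add (or replace) a named data source in this project."""
        self._sources[source.name] = source

    def resolve(self, name: Optional[str] = None) -> DataSource:
        """Return the named source, or the default Hive source if no name is given."""
        if name is None:
            return self._default_hive
        return self._sources[name]  # raises KeyError for an unknown name


# Example usage with made-up names and paths.
default = DataSource("default_hive", "hive", ["/etc/hive/conf"])
registry = ProjectDataSources(default)
registry.register(DataSource("events", "kafka", ["/opt/kafka/libs"]))
```

With this shape, existing callers that never name a source keep working
unchanged, and only projects that need a second cluster have to register one.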



Yang

On Sun, Jul 10, 2016 at 7:35 PM, Yiming Liu <[email protected]> wrote:

> Hi Kylin developers,
>
> Currently, Kylin loads the Hive configuration from the HIVE_DEPENDENCY
> classpath, which means Kylin supports only one Hive source. KYLIN-1826 aims
> to support multiple Hive data sources, but its design is a little
> complicated because it introduces the EXTERNAL HIVE concept.
>
> Data source management becomes trickier when multiple Hive clusters and
> multiple Kafka clusters are needed. I am just raising the question today
> without a specific solution yet; all suggestions are welcome. I think it
> would be very useful if Kylin supported data source management.
>
> It should have the following features:
> 1. Define a Hive cluster or Kafka cluster as a data source factory under a
> Project.
> 2. One Project can have more than one Hive/Kafka/SparkSQL cluster
> definition.
> 3. On "Load Table/Streaming", Kylin loads the table (Hive) and topic
> (Kafka) definitions directly from Hive/Kafka.
> 4. The subsequent model design and cube build remain the same as before.
>
> I know it's not a critical requirement, but maybe someone wants it too.
>
> Thank you a lot.
>
> --
> With Warm regards
>
> Yiming Liu (刘一鸣)
>
