[
https://issues.apache.org/jira/browse/KYLIN-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520896#comment-14520896
]
Luke Han commented on KYLIN-745:
--------------------------------
In the current version, users must sync Hive table metadata to Kylin before
designing a cube.
The generic reader could follow the same workflow:
1. Create the project if it does not exist
2. Sync source table metadata to Kylin
3. Design the cube in Kylin
4. Trigger the cube build job, which will use this feature to read data from
any other source
Parallelism is one challenge, which is why we always recommend that users
import data into Hive with another ETL tool.
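
To illustrate the parallelism challenge, here is a minimal sketch of one common approach: split a numeric key range into slices and read each slice on its own connection. This is not Kylin code. The `sales` table, its `id` and `amount` columns, and every function name are hypothetical, and SQLite stands in for the real source database.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


def split_range(lo, hi, parts):
    """Split the inclusive key range [lo, hi] into contiguous inclusive slices."""
    total = hi - lo + 1
    if total <= 0:
        return []
    parts = max(1, min(parts, total))
    size, extra = divmod(total, parts)
    ranges = []
    start = lo
    for i in range(parts):
        # The first `extra` slices take one additional key each.
        end = start + size - 1 + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end + 1
    return ranges


def read_slice(db_path, bounds):
    """Read one key slice using a connection owned by the calling thread."""
    lo, hi = bounds
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(
            "SELECT id, amount FROM sales WHERE id BETWEEN ? AND ? ORDER BY id",
            (lo, hi),
        ).fetchall()
    finally:
        conn.close()


def parallel_read(db_path, parts=4):
    """Read the whole (hypothetical) sales table as parallel key slices."""
    conn = sqlite3.connect(db_path)
    try:
        lo, hi = conn.execute("SELECT MIN(id), MAX(id) FROM sales").fetchone()
    finally:
        conn.close()
    if lo is None:  # empty table
        return []
    with ThreadPoolExecutor(max_workers=parts) as pool:
        # pool.map yields results in slice order, so rows stay sorted by id.
        chunks = pool.map(lambda b: read_slice(db_path, b),
                          split_range(lo, hi, parts))
        return [row for chunk in chunks for row in chunk]


def make_demo_source(n_rows=10):
    """Create a throwaway SQLite file standing in for the source database."""
    path = os.path.join(tempfile.mkdtemp(), "source.db")
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)",
                     [(i, i * 1.5) for i in range(1, n_rows + 1)])
    conn.commit()
    conn.close()
    return path
```

Range slicing like this only balances well when keys are spread fairly evenly, and every slice adds load on the source database. Those costs are part of why importing into Hive with a dedicated ETL tool is the usual recommendation.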
> Generic Data Reader
> -------------------
>
> Key: KYLIN-745
> URL: https://issues.apache.org/jira/browse/KYLIN-745
> Project: Kylin
> Issue Type: New Feature
> Components: Job Engine, Spark Engine
> Reporter: Luke Han
> Assignee: ZhouQianhao
>
> When data is stored in an existing DW such as Oracle, Kylin cannot read it
> directly to build a cube.
> There are many requests for this from different teams, such as Candor.
> There are two options:
> #1, copy your data to Hive and then build the cube through Kylin. Some
> deployments already use this model to bring data into Hive from the DW, and
> it works well with Kylin.
> #2, rewrite the data read module to pull data from Oracle directly. The
> first step of a cube build already generates a Hive query that reads the
> data and creates a temp table in Hive, so this should not be too complicated
> (but it depends on the network and other factors; otherwise, #1 will be the
> more efficient option). The cube build then proceeds as normal. A generic
> reader that reads data from any SQL RDBMS through JDBC or another protocol
> would be an ideal solution, since cubes could be built without an ETL
> process.
> Scope:
> Only read data directly from an existing RDBMS and store the joined result
> in Hive (temp table) for further processing, with no other transformation.
> By design, Kylin is an OLAP system, not an ETL one.
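
The scope described in the issue (run the join on the source RDBMS, store the flat result for later processing, transform nothing) can be sketched roughly as follows. This is an illustration under stated assumptions, not Kylin's implementation. The table names, the `dump_query` helper, and the tab delimiter are hypothetical. SQLite stands in for a JDBC source such as Oracle, and loading the output into a Hive temp table is not shown.

```python
import csv
import io
import sqlite3


def dump_query(conn, sql, out, batch_size=1000):
    """Stream the result of `sql` to `out` as tab-delimited text.

    Rows are written unchanged (no transformation). Returns the row count.
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    cur = conn.cursor()
    cur.execute(sql)
    count = 0
    while True:
        rows = cur.fetchmany(batch_size)  # bounded memory per batch
        if not rows:
            break
        writer.writerows(rows)
        count += len(rows)
    return count


def demo():
    """Join a hypothetical fact table to a dimension table and dump the result."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE fact_sales (id INTEGER, product_id INTEGER, amount REAL);
        CREATE TABLE dim_product (product_id INTEGER, name TEXT);
        INSERT INTO fact_sales VALUES (1, 10, 5.0), (2, 20, 7.5);
        INSERT INTO dim_product VALUES (10, 'apple'), (20, 'pear');
    """)
    sql = (
        "SELECT f.id, p.name, f.amount "
        "FROM fact_sales f JOIN dim_product p ON f.product_id = p.product_id "
        "ORDER BY f.id"
    )
    out = io.StringIO()
    n = dump_query(conn, sql, out)
    conn.close()
    return n, out.getvalue()
```

Because `dump_query` accepts any DB-API connection and any writable text stream, the same function could in principle serve different source databases, which is the sense in which the reader is "generic".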