[
https://issues.apache.org/jira/browse/KYLIN-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14519401#comment-14519401
]
Seshu Adunuthula commented on KYLIN-745:
----------------------------------------
If the Read Module were extended to read from Oracle, would the read
automatically execute in parallel? Also, what about the metadata: do you have
to create the Fact/Dimension table schema in the Hive MetaStore?
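
To make the parallelism question concrete: a single query over one connection
is consumed by the client as one stream, so a parallel read would presumably
need the source rows split up front. Tools such as Sqoop do this by taking the
min and max of a numeric key and issuing one bounded query per split. The
sketch below illustrates only that splitting step. It is not Kylin code, and
the table name `fact_sales`, the key `id`, and both function names are
hypothetical.

```python
def split_ranges(min_id, max_id, num_splits):
    """Divide the inclusive key range [min_id, max_id] into contiguous,
    non-overlapping inclusive sub-ranges of near-equal size."""
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")
    if max_id < min_id:
        return []  # empty source table: nothing to read
    total = max_id - min_id + 1
    num_splits = min(num_splits, total)  # never create empty splits
    base, extra = divmod(total, num_splits)
    ranges = []
    lo = min_id
    for i in range(num_splits):
        # The first `extra` splits take one additional key each.
        size = base + (1 if i < extra else 0)
        hi = lo + size - 1
        ranges.append((lo, hi))
        lo = hi + 1
    return ranges


def split_queries(table, key, min_id, max_id, num_splits):
    """Build one bounded SELECT per split. The table and key names are
    interpolated directly, so they must come from trusted configuration."""
    return [
        f"SELECT * FROM {table} WHERE {key} BETWEEN {lo} AND {hi}"
        for lo, hi in split_ranges(min_id, max_id, num_splits)
    ]


if __name__ == "__main__":
    for query in split_queries("fact_sales", "id", 1, 10, 3):
        print(query)
```

Each generated query could then be handed to a separate worker with its own
connection. Whether that actually helps depends on the source database and the
network, and a skewed key distribution would produce uneven splits.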
> Generic Data Reader
> -------------------
>
> Key: KYLIN-745
> URL: https://issues.apache.org/jira/browse/KYLIN-745
> Project: Kylin
> Issue Type: New Feature
> Components: Job Engine, Spark Engine
> Reporter: Luke Han
> Assignee: ZhouQianhao
>
> When data is stored in an existing DW such as Oracle, Kylin cannot read it
> directly to build a cube.
> Many teams, such as Candor, have raised this requirement.
> There are two options:
> #1, copy your data to Hive and then build the cube through Kylin. Some
> cases already run this model, bringing data into Hive from the DW, and
> leverage Kylin very well.
> #2, rewrite the data read module to pull data from Oracle directly. The
> first step of a cube build already generates a Hive query that reads the data
> and creates a temp table in Hive, so this should not be too complicated
> (though it depends on the network and other factors; otherwise, #1 will be
> the more efficient option). The cube build then proceeds as normal. A generic
> reader that reads data from any SQL RDBMS through JDBC or another protocol
> would be the ideal solution, since cubes could then be built without an ETL
> process.
> Scope:
> Only read data directly from an existing RDBMS and store the joined result in
> Hive (temp table) for further processing, with no other transformation.
> By design, Kylin is an OLAP system, not an ETL one.
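
As a rough illustration of the scope described above (read the joined result
from an RDBMS, write flat rows, do no other transformation), here is a minimal
sketch. It uses Python's built-in `sqlite3` purely as a stand-in for Oracle or
any other source database. The tables `fact_sales` and `dim_site`, the function
names, and the tab delimiter are all assumptions for the example, not part of
Kylin.

```python
import csv
import io
import sqlite3


def export_joined(conn, sql, out, delimiter="\t", batch_size=1000):
    """Run `sql` on a DB-API connection and stream the result to `out` as
    delimited text, one line per row. Returns the number of rows written."""
    cur = conn.cursor()
    cur.execute(sql)
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    written = 0
    while True:
        # Fetch in batches so the whole result never has to fit in memory.
        batch = cur.fetchmany(batch_size)
        if not batch:
            break
        writer.writerows(batch)
        written += len(batch)
    return written


def demo():
    """Build a tiny in-memory source database and export the joined rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_site (site_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE fact_sales (id INTEGER PRIMARY KEY, site_id INTEGER, price REAL);
        INSERT INTO dim_site VALUES (1, 'US');
        INSERT INTO dim_site VALUES (2, 'UK');
        INSERT INTO fact_sales VALUES (1, 1, 9.5);
        INSERT INTO fact_sales VALUES (2, 2, 3.0);
        INSERT INTO fact_sales VALUES (3, 1, 1.25);
        """
    )
    sql = (
        "SELECT f.id, d.name, f.price "
        "FROM fact_sales f JOIN dim_site d ON f.site_id = d.site_id "
        "ORDER BY f.id"
    )
    out = io.StringIO()
    count = export_joined(conn, sql, out, batch_size=2)
    conn.close()
    return count, out.getvalue()


if __name__ == "__main__":
    count, text = demo()
    print(count, "rows")
    print(text, end="")
```

In a real setup the output would go to a file that backs the Hive temp table
rather than to an in-memory buffer, and the delimiter would have to match that
table's row format.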
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)