[
https://issues.apache.org/jira/browse/KYLIN-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15650353#comment-15650353
]
Shaofeng SHI edited comment on KYLIN-1826 at 11/9/16 8:46 AM:
--------------------------------------------------------------
Hi Yu, I got many conflicts when merging the patches into the latest master
branch. Besides, some of the design leaves me unconvinced (for example, the
'external hive' isn't really another source type like Kafka). Today I discussed
this with Yang, and we want to implement your requirement in a more extensible way:
1. Add a config property "kylin.hive.home" to KylinConfig, which points to a
local Hive installation folder (if empty, the default is used);
2. As you know, Kylin already allows overriding KylinConfig at the cube level,
so users can specify a different Hive home for different cubes. We can do the
same for projects, so that each project can also be bound to a Hive
installation. The properties overridden in a project can be inherited by its
cubes in the cube wizard;
3. Create a HiveClient by passing in the KylinConfig instance; it will check the
config and then know how to load the metadata and execute the commands.
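To make the three points above concrete, here is a minimal sketch in plain Java.
It is not the code in branch KYLIN-1826-2 and does not use Kylin's real
KylinConfig or HiveClient classes: the class name, the method names, the use of
plain maps for config, and the "<hive home>/bin/hive" path layout are all
assumptions made for illustration. Only the property name "kylin.hive.home" and
the "empty means default" rule come from the proposal.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; not Kylin's actual KylinConfig or HiveClient API. */
class HiveHomeSketch {
    static final String HIVE_HOME_KEY = "kylin.hive.home";

    /**
     * Layers the overrides: global config first, then project-level
     * overrides, then cube-level overrides. Later layers win.
     */
    static Map<String, String> effectiveConfig(Map<String, String> global,
                                               Map<String, String> projectOverrides,
                                               Map<String, String> cubeOverrides) {
        Map<String, String> merged = new HashMap<>(global);
        merged.putAll(projectOverrides);
        merged.putAll(cubeOverrides);
        return merged;
    }

    /**
     * Picks the Hive executable for a given config. An unset or empty
     * "kylin.hive.home" falls back to the plain "hive" command on the PATH
     * (assumed to be what "the default" means here).
     */
    static String hiveCommand(Map<String, String> config) {
        String home = config.get(HIVE_HOME_KEY);
        if (home == null || home.trim().isEmpty()) {
            return "hive";
        }
        // Strip trailing slashes so the joined path has a single separator.
        return home.trim().replaceAll("/+$", "") + "/bin/hive";
    }
}
```

With this layering, a cube that sets nothing inherits its project's Hive home,
and a project that sets nothing keeps using the default 'hive' command.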
I made an initial commit in branch KYLIN-1826-2, which extends CLIHiveClient
and adds the copy step in HiveMRInput; you can check it out and have a look.
This way, the impact will be small and the change will be easier to understand
and maintain.
I don't have time to merge all the changes there; if you think it is a good
idea, please continue there. For the REST API (TableController), please keep
the project name parameter optional, so that existing users don't need to
change their client-side code (many users have integrated Kylin with their
apps, so we need to keep the API as stable as possible).
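As a rough illustration of the optional project parameter, the sketch below
treats a missing project name the same as today's behavior. It is plain Java,
not the real TableController: the class name, method name, and map-based config
are hypothetical, and only the "project is optional" rule comes from the
request above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; not Kylin's actual TableController. */
class TableLoadSketch {
    /**
     * Returns the config to use when loading Hive tables.
     * A null or empty project (an existing client that never sends the
     * parameter) gets the global config unchanged; otherwise the project's
     * overrides, if any, are layered on top of the global config.
     */
    static Map<String, String> configFor(String project,
                                         Map<String, String> global,
                                         Map<String, Map<String, String>> projectOverrides) {
        Map<String, String> result = new HashMap<>(global);
        if (project == null || project.isEmpty()) {
            return result;
        }
        Map<String, String> overrides = projectOverrides.get(project);
        if (overrides != null) {
            result.putAll(overrides);
        }
        return result;
    }
}
```

Old clients that omit the project keep getting the default Hive, while new
clients can opt in by naming a project.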
Thanks.
> kylin support more than one hive based on different hadoop cluster
> ------------------------------------------------------------------
>
> Key: KYLIN-1826
> URL: https://issues.apache.org/jira/browse/KYLIN-1826
> Project: Kylin
> Issue Type: Improvement
> Components: Environment
> Affects Versions: v1.5.2
> Reporter: fengYu
> Assignee: fengYu
> Attachments:
> 0001-KYLIN-1826-add-external-hive-interface-project-table.patch,
> 0002-KYLIN-1826-add-and-modify-cube-source-job-for-extern.patch,
> 0003-KYLIN-1826-unify-hive-concept-forbid-modify-hive-nam.patch
>
>
> Currently, Kylin supports only one Hive, which must be runnable via the 'hive'
> command. However, when source data is located in more than one Hive, we have
> to deploy more Kylin instances and more than one metastore, which is difficult
> to manage and may cause conflicts.
> I have been working on this recently. In our cluster, there are several Hive
> clients (with different metastores) based on different Hadoop clusters. I
> added a new Hive source type called 'external hive' in Kylin 1.5.x.
> Thanks to Kylin's plug-in architecture in 2.x, this work was easier.
> The main modifications are:
> 1. Add a hive root directory to the hive config file; external hive clients
> live in this directory, and each hive is named by its directory name.
> 2. Add the hive-site.xml file when loading hive tables.
> 3. Store the hive name in the project; one project can use only one hive as
> its source.
> 4. Change and add some jobs to support job building.
> I will upload my patch once I finish all my tests.