[ 
https://issues.apache.org/jira/browse/KYLIN-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15650353#comment-15650353
 ] 

Shaofeng SHI edited comment on KYLIN-1826 at 11/9/16 8:46 AM:
--------------------------------------------------------------

Hi Yu, I hit many conflicts when merging the patches into the latest master 
branch. Besides, some parts of the design make me less confident (for example, 
the 'external hive' isn't really another source type like Kafka). I discussed 
this with Yang today, and we want to implement your requirement in a more 
extensible way:

1. Add a config "kylin.hive.home" to KylinConfig, which points to a local Hive 
installation folder (if empty, the default is used);
2. As you know, Kylin already allows overriding the KylinConfig at cube level, 
so users can specify a different Hive home for different cubes. We can do the 
same for projects, so that each project can also bind to a Hive installation. 
The properties overridden in a project can be inherited by its cubes in the 
cube wizard;
3. Create a HiveClient by passing in the KylinConfig instance; it will check 
the config and then know how to load the metadata and execute the commands;
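
To make the three points above concrete, here is a minimal sketch of how the 
layered lookup could behave. It is only an illustration under assumptions: the 
class HiveHomeResolver, its methods, and the use of plain maps are hypothetical 
and are not Kylin's actual KylinConfig or HiveClient API. It also assumes that 
"the default" means the plain 'hive' command on the PATH, and that a Hive 
installation keeps its launcher under bin/hive.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/**
 * Hypothetical sketch, not Kylin code: resolves "kylin.hive.home" with
 * cube-level overrides taking precedence over project-level overrides,
 * which in turn take precedence over the global config.
 */
final class HiveHomeResolver {

    static final String HIVE_HOME_KEY = "kylin.hive.home";

    private HiveHomeResolver() {
    }

    /** Merges the three layers; later layers override earlier ones. */
    static Map<String, String> effectiveConfig(Map<String, String> global,
                                               Map<String, String> projectOverrides,
                                               Map<String, String> cubeOverrides) {
        Map<String, String> merged = new LinkedHashMap<>(global);
        merged.putAll(projectOverrides); // project inherits and overrides global
        merged.putAll(cubeOverrides);    // cube inherits and overrides project
        return merged;
    }

    /** Returns the configured Hive home, or empty if unset or blank. */
    static Optional<String> hiveHome(Map<String, String> config) {
        String value = config.get(HIVE_HOME_KEY);
        if (value == null || value.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(value.trim());
    }

    /**
     * Picks the executable a client would invoke. Assumption: an empty
     * setting falls back to the plain "hive" command on the PATH.
     */
    static String hiveExecutable(Map<String, String> config) {
        return hiveHome(config)
                .map(home -> home + "/bin/hive")
                .orElse("hive");
    }
}
```

With this shape, a client built from the merged config only needs to call 
hiveExecutable(...) and never has to know which level supplied the value. Note 
that in this sketch a blank override at cube level hides the project value and 
falls back to the default; whether that is desirable is a design choice.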

I made an initial commit in branch KYLIN-1826-2, which extends the 
CLIHiveClient and adds the copy step in HiveMRInput; you can check it out and 
have a look. This way, the impact will be small and the change easier to 
understand and maintain.

I don't have time to merge all the changes there; if you think it is a good 
idea, please continue there. For the REST API (TableController), please keep 
the project name parameter optional, so that existing users don't need to 
change their client-side code (many users have integrated Kylin with their 
apps, so we need to keep the API as stable as possible).

Thanks.


> kylin support more than one hive based on different hadoop cluster
> ------------------------------------------------------------------
>
>                 Key: KYLIN-1826
>                 URL: https://issues.apache.org/jira/browse/KYLIN-1826
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Environment 
>    Affects Versions: v1.5.2
>            Reporter: fengYu
>            Assignee: fengYu
>         Attachments: 
> 0001-KYLIN-1826-add-external-hive-interface-project-table.patch, 
> 0002-KYLIN-1826-add-and-modify-cube-source-job-for-extern.patch, 
> 0003-KYLIN-1826-unify-hive-concept-forbid-modify-hive-nam.patch
>
>
> Currently, Kylin supports only one Hive, which must be runnable via the 
> 'hive' command. However, when the source data is located in more than one 
> Hive, we have to deploy several Kylin instances and more than one metastore, 
> which is difficult to manage and may cause conflicts.
> I have been working on this recently. In our cluster, there are several Hive 
> clients (with different metastores) based on different Hadoop clusters. I 
> added a new Hive source type called 'external hive' in Kylin 1.5.x.
> Thanks to Kylin's plug-in architecture in 2.x, this work is easier. 
> The main modifications are:
> 1. Add a Hive root directory to the Hive config file; the external Hive 
> clients live in this directory, and each Hive is named by its directory name.
> 2. Add the hive-site.xml file while loading Hive tables.
> 3. Store the Hive name in the project; one project can take only one Hive as 
> its source.
> 4. Change and add some jobs to support job building.
> I will upload my patch once I finish all my tests.



