[ 
https://issues.apache.org/jira/browse/KYLIN-4328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen updated KYLIN-4328:
----------------------------
    Fix Version/s: v3.1.0

> Build tasks become very slow during scheduling when HBase and Kylin are not 
> in the same IDC.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: KYLIN-4328
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4328
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Job Engine
>    Affects Versions: v2.2.0, v3.0.0, v2.6.3
>         Environment: Centos 7.4
> hbase 1.2.4
> hive 1.1.1
> hadoop 2.7.2
>            Reporter: GuKe
>            Assignee: GuKe
>            Priority: Major
>             Fix For: v3.1.0
>
>
> When HBase and Kylin are not in the same IDC, we found that build tasks 
> became very slow during scheduling. 
> We traced the slowdown to the following part of the code.
> The method getExecutableManager().GetAllJobIdsInCache() reads all job IDs. 
> There are currently more than 35,000 jobs on our server, and each job ID 
> requires at least two HBase accesses to read the job state. 
> Most of these jobs have already succeeded, and that status will not change.
> When the Kylin and HBase services are in the same IDC, the network latency of 
> each HBase access is less than 1 ms.
> Across IDCs, however, each HBase access takes more than 5 ms, so the 
> accumulated delay is considerable 
> and the scheduling task takes a long time to run.
> We can therefore add a cache that holds the IDs of successful jobs, populated 
> when the service first starts.
> After we modified the code, the run time dropped from 10 minutes to 20 seconds.
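
The caching idea described above can be illustrated with a minimal sketch. This is not the actual Kylin patch: the class names (SuccessfulJobCacheSketch, CountingStore, Scheduler), the three-value State enum, and the method fetchPending() are all hypothetical, and the in-memory map only stands in for the HBase-backed job store so that remote reads can be counted. It assumes, as the description states, that a succeeded job never changes status again.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SuccessfulJobCacheSketch {

    /** Hypothetical job states; the real job engine has more. */
    public enum State { READY, RUNNING, SUCCEED }

    /** Hypothetical stand-in for the HBase-backed store; counts remote reads. */
    public static class CountingStore {
        private final Map<String, State> states = new HashMap<>();
        public int reads = 0;

        public void put(String jobId, State state) { states.put(jobId, state); }

        public State readState(String jobId) {
            reads++; // each call models one network round trip to HBase
            return states.get(jobId);
        }

        public List<String> allJobIds() { return new ArrayList<>(states.keySet()); }
    }

    /** A scheduling pass that skips jobs already known to have succeeded. */
    public static class Scheduler {
        private final CountingStore store;
        private final Set<String> succeeded = new HashSet<>();

        public Scheduler(CountingStore store) { this.store = store; }

        /** Returns the IDs of jobs that still need attention in this pass. */
        public List<String> fetchPending() {
            List<String> pending = new ArrayList<>();
            for (String id : store.allJobIds()) {
                if (succeeded.contains(id)) {
                    continue; // terminal state, so no remote read is needed
                }
                State state = store.readState(id);
                if (state == State.SUCCEED) {
                    succeeded.add(id); // remember it so later passes skip it
                } else {
                    pending.add(id);
                }
            }
            return pending;
        }
    }

    public static void main(String[] args) {
        CountingStore store = new CountingStore();
        for (int i = 0; i < 1000; i++) {
            store.put("job-" + i, i < 990 ? State.SUCCEED : State.READY);
        }
        Scheduler scheduler = new Scheduler(store);

        scheduler.fetchPending();
        int firstPass = store.reads;
        scheduler.fetchPending();
        int secondPass = store.reads - firstPass;

        System.out.println("first pass reads:  " + firstPass);  // 1000
        System.out.println("second pass reads: " + secondPass); // 10
    }
}
```

In this sketch the first pass still reads every job once, and only later passes benefit. As a rough check on the figures in the description, 35,000 jobs with at least two reads each at more than 5 ms per read come to at least 35,000 x 2 x 5 ms = 350 s per pass across IDCs, which is the same order of magnitude as the 10 minutes reported.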



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
