[ 
https://issues.apache.org/jira/browse/KYLIN-3617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16713950#comment-16713950
 ] 

XiaoXiang Yu commented on KYLIN-3617:
-------------------------------------

 

[~Shaofengshi], this is what I found after verifying [~nichunen]'s patch.

 

*Background of ExecutableDao#isTaskExecutableOutput:*
{quote}private boolean isTaskExecutableOutput(String id) {
    return id.length() > 36;
}
{quote}
 # Each Executable has an id, which should be unique; we use 
java.util.UUID to create it.
 # 36 is a magic number: it is the length of the string returned by 
java.util.UUID.toString(). You can verify this with 
`System.out.println(UUID.randomUUID().toString().length());`
 # Every subtask of a ChainedExecutable is also an Executable; its id is a 
string of length 39 (36 + 3). See DefaultChainedExecutable#addTask.
 # Any other Executable's id is a String created by UUID.toString(), so its 
length is 36.
 # This method is a bit fragile because it depends on the specific 
implementation of the Executable subclasses.
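
The length arithmetic in the list above can be checked with a small standalone sketch. The class name IdLengthSketch and the "-00" suffix are assumptions made for illustration; the comment only states that a subtask id is 3 characters longer than a plain UUID string, and the real suffix format is whatever DefaultChainedExecutable#addTask produces.

```java
import java.util.UUID;

public class IdLengthSketch {
    // Same length check as the ExecutableDao#isTaskExecutableOutput quoted above.
    public static boolean isTaskExecutableOutput(String id) {
        return id.length() > 36;
    }

    public static void main(String[] args) {
        // A plain job id: the 36-character string form of a UUID.
        String jobId = UUID.randomUUID().toString();
        // Hypothetical 3-character suffix standing in for the subtask suffix.
        String taskId = jobId + "-00";

        System.out.println(jobId.length());
        System.out.println(taskId.length());
        System.out.println(isTaskExecutableOutput(jobId));
        System.out.println(isTaskExecutableOutput(taskId));
    }
}
```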

 

*In BaseTestDistributedScheduler*, the job ids are the following strings. Each 
one is detected as a subtask of a ChainedExecutable because its length is 40 
(36 + 4), but it is really a normal ChainedExecutable, not a subtask.
{quote}static final String jobId1 = "job1" + RandomUtil.randomUUID();
static final String jobId2 = "job2" + RandomUtil.randomUUID();
{quote}
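
A minimal sketch of the misclassification, assuming that RandomUtil.randomUUID() returns a java.util.UUID (here UUID.randomUUID() stands in for it). The class and method names are hypothetical and are not part of Kylin.

```java
import java.util.UUID;

public class TestJobIdSketch {
    // Same length check as the ExecutableDao#isTaskExecutableOutput quoted earlier.
    public static boolean isTaskExecutableOutput(String id) {
        return id.length() > 36;
    }

    // Builds an id the way the test does: a 4-character prefix plus a UUID.
    // UUID.randomUUID() is a stand-in for Kylin's RandomUtil.randomUUID() (assumption).
    public static String prefixedJobId(String prefix) {
        return prefix + UUID.randomUUID();
    }

    public static void main(String[] args) {
        String jobId1 = prefixedJobId("job1");
        // The id is 4 + 36 characters long, so the length check treats it as a subtask.
        System.out.println(jobId1.length());
        System.out.println(isTaskExecutableOutput(jobId1));
    }
}
```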
 

*In addJobOutput of ExecutableDao*, job1 and job2 *will not be cached* in 
executableOutputDigestMap because their ids are strings of length 40, so 
isTaskExecutableOutput returns true for them.
{quote}public void addJobOutput(ExecutableOutputPO output) throws PersistentException {
    try {
        output.setLastModified(0);
        writeJobOutputResource(pathOfJobOutput(output.getUuid()), output);
        +*if (!isTaskExecutableOutput(output.getUuid()))*+
            +*executableOutputDigestMap.put(output.getUuid(), output);*+
    } catch (IOException e) {
        logger.error("error update job output id:" + output.getUuid(), e);
        throw new PersistentException(e);
    }
}
{quote}
But when a FetcherRunner thread tries to *fetch the job's details from the 
cache*, it gets an *IllegalArgumentException*.
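
The failure path can be sketched as follows. This is a simplified model, not the Kylin code: the class DigestCacheSketch, the String payload, and the null check in getOutputDigest are all assumptions made for illustration. The comment above only states that the entry is never cached and that the later lookup ends in an IllegalArgumentException.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DigestCacheSketch {
    // Stand-in for executableOutputDigestMap; the value type is simplified to String.
    private final Map<String, String> digestMap = new ConcurrentHashMap<>();

    public static boolean isTaskExecutableOutput(String id) {
        return id.length() > 36;
    }

    // Mirrors the highlighted lines of addJobOutput: ids longer than 36 are not cached.
    public void addJobOutput(String id, String output) {
        if (!isTaskExecutableOutput(id)) {
            digestMap.put(id, output);
        }
    }

    // Hypothetical lookup: rejects ids that have no cached entry (assumption).
    public String getOutputDigest(String id) {
        String output = digestMap.get(id);
        if (output == null) {
            throw new IllegalArgumentException("no cached output for job id: " + id);
        }
        return output;
    }
}
```

With this model, adding an output under a 40-character id and then reading it back throws, while a 36-character id round-trips normally.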

So the second commit changes *job1* and *job2* to *normal UUIDs*, 
+so the integration tests pass.+

> Reduce number of visiting metastore for job scheduler
> -----------------------------------------------------
>
>                 Key: KYLIN-3617
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3617
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Job Engine
>    Affects Versions: v2.4.1
>            Reporter: nichunen
>            Assignee: nichunen
>            Priority: Major
>             Fix For: v2.6.0
>
>
> KYLIN-3470 introduced a cache for jobs' metadata; it can also be used in 
> the job scheduler to reduce the pressure on the metastore.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
