[ https://issues.apache.org/jira/browse/KYLIN-4847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiaoxiang Yu closed KYLIN-4847.
-------------------------------
Released in Kylin 3.1.2

> Cuboid to HFile step failed in a multi-job-server environment because it
> tried to read the metrics jar file from the inactive job server's location.
> ---------------------------------------------------------------------------
>
>                 Key: KYLIN-4847
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4847
>             Project: Kylin
>          Issue Type: Bug
>          Components: Job Engine
>    Affects Versions: v3.1.0
>            Reporter: yoonsung.lee
>            Assignee: yoonsung.lee
>            Priority: Major
>             Fix For: v3.1.2
>
>
> h1. My Cluster Setting
> 1. Version: 3.1.0
> 2. Two job servers (job & query mode) and two query-only servers, each
> running on a different host machine.
> 3. The Spark engine is used for build jobs.
> h1. Problem Circumstance
> h2. Root cause
> The active job server submits a Spark job to execute `Convert Cuboid Data to
> HFile`, but it gets an error because a resource needed for submitting the
> Spark job has a wrong path that the active job server cannot read.
> * wrong resource:
> ${KYLIN_HOME}/tomcat/webapps/kylin/WEB-INF/lib/metrics-core-2.2.0.jar
> * For this jar file alone, ${KYLIN_HOME} points to the inactive job
> server's location.
> This situation occurs in the following two circumstances.
> h2. On build cube
> 1. Send the build request to the inactive job server (exactly:
> /kylin/api/cubes/${cube_name}/rebuild).
> 2. The inactive job server stores the build task in the metadata store.
> 3. The active job server picks up the build task and executes it.
> 4. The active job server fails on the `Convert Cuboid Data to HFile` step.
> **This does not occur when I send the build request to the active job
> server.**
> h2. On merge
> 1. A merge cube job is triggered periodically.
> 2. The active job server picks up the merge task and executes it.
> 3. The active job server fails on the `Convert Cuboid Data to HFile` step.
> **This does not occur when there is only one job server in the cluster.**
> h1. Progress to solve this
> I am trying to find which code sets the metrics-core-2.2.0.jar path
> incorrectly. So far, my guess is that this code sets metrics-core-2.2.0.jar
> for the `Cuboid to HFile` Spark job:
> *
> [https://github.com/apache/kylin/blob/kylin-3.1.0/storage-hbase/src/main/java/org/apache/kylin/storage/hbase/steps/HBaseSparkSteps.java#L69]
> h1. Questions
> 1. I am trying to remote-debug with an IDE to check whether my guess is
> right, but a breakpoint on that line is never hit at runtime. It seems to be
> executed during the boot phase. Is that right?
> 2. Apart from my progress above, is there any hint or guess that could help
> solve this issue?

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
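The root cause described in the issue can be sketched in Java. This is a minimal illustration, not Kylin's actual code: the class and method names are invented, and it only demonstrates the principle that each job server should resolve the jar path against its own `KYLIN_HOME` at submit time, rather than reusing an absolute path recorded by whichever server accepted the build request.

```java
import java.io.File;

// Hypothetical sketch: resolve the metrics jar from the *local* KYLIN_HOME
// of the job server that actually submits the Spark job. The class name,
// method name, and constant are invented for illustration; only the
// relative jar path comes from the issue description.
public class LocalJarResolver {

    // Location of the jar relative to KYLIN_HOME, as reported in the issue.
    static final String METRICS_JAR_REL_PATH =
            "tomcat/webapps/kylin/WEB-INF/lib/metrics-core-2.2.0.jar";

    // Build the jar path against the KYLIN_HOME passed in, so each job
    // server resolves it against its own installation directory instead of
    // a path persisted in job metadata by another server.
    public static String metricsJarPath(String kylinHome) {
        return new File(kylinHome, METRICS_JAR_REL_PATH).getPath();
    }

    public static void main(String[] args) {
        // On the active job server, KYLIN_HOME would come from its own
        // environment, not from metadata written by the inactive server.
        String kylinHome =
                System.getenv().getOrDefault("KYLIN_HOME", "/opt/kylin");
        System.out.println(metricsJarPath(kylinHome));
    }
}
```

Under this reading, the bug would be that the path is computed once (on the server handling the REST request) and persisted, so the submitting server later reads a path belonging to a different host's installation.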