[
https://issues.apache.org/jira/browse/KYLIN-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17408950#comment-17408950
]
Piotr Naszarkowski commented on KYLIN-5044:
-------------------------------------------
Probably caused by an old version of Hive. Closing.
> Kylin 3.1.2 - Cube processing fails on Step 4 when hive client is switched to
> beeline.
> --------------------------------------------------------------------------------------
>
> Key: KYLIN-5044
> URL: https://issues.apache.org/jira/browse/KYLIN-5044
> Project: Kylin
> Issue Type: Bug
> Components: Job Engine
> Affects Versions: v3.1.2
> Environment: HDP-2.6.5.0
> (2.6.5.0-292)
> Kylin 3.1.2
> Centos 7
> Reporter: Piotr Naszarkowski
> Priority: Major
> Attachments: 2021-07-26 21_56_39-Window.png
>
>
> I switched the Hive client to beeline and also enabled SparkSQL for the Hive
> source, using the settings below in kylin.properties:
> {code:java}
> kylin.source.hive.client=beeline
> kylin.source.hive.beeline-shell=beeline
> kylin.source.hive.beeline-params=-n hive -u jdbc:hive2://hdp...:10016
> kylin.source.hive.enable-sparksql-for-table-ops=true
> kylin.source.hive.sparksql-beeline-shell=beeline{code}
> This caused step 4 (#4 Step Name: Build Dimension Dictionary) of cube
> processing to fail, as well as the lookup refresh. The error:
> {code:java}
> org.apache.kylin.engine.mr.exception.HadoopShellException: java.io.IOException: java.lang.NullPointerException
>     at org.apache.kylin.source.hive.HiveTable.getSignature(HiveTable.java:78)
>     at org.apache.kylin.dict.lookup.SnapshotTable.<init>(SnapshotTable.java:73)
>     at org.apache.kylin.dict.lookup.SnapshotManager.buildSnapshot(SnapshotManager.java:140)
>     at org.apache.kylin.cube.CubeManager$DictionaryAssist.buildSnapshotTable(CubeManager.java:1260)
>     at org.apache.kylin.cube.CubeManager.buildSnapshotTable(CubeManager.java:1164)
>     at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:123)
>     at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:69)
>     at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:73)
>     at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:93)
>     at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:64)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:180)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:72)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:180)
>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:118)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>     at org.apache.kylin.common.util.HadoopUtil.fixWindowsPath(HadoopUtil.java:122)
>     at org.apache.kylin.common.util.HadoopUtil.makeURI(HadoopUtil.java:114)
>     at org.apache.kylin.common.util.HadoopUtil.getFileSystem(HadoopUtil.java:92)
>     at org.apache.kylin.engine.mr.DFSFileTable.getSizeAndLastModified(DFSFileTable.java:90)
>     at org.apache.kylin.source.hive.HiveTable.getSignature(HiveTable.java:63)
>     ... 16 more
> result code:2
>     at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:74)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:180)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:72)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:180)
>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:118)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}
> Investigation showed that my beeline returns table metadata in a different
> format from the one the code in BeelineHiveClient.java expects. As a result,
> sdLocation was not retrieved, which led to the NPE. This happens on the
> LEARN_KYLIN sample project. I wonder whether this has something to do with my
> environment or whether it is simply a bug (it seems severe).
> Proposed code changes: [https://github.com/apache/kylin/pull/1698/files].
> They resolved the issue for me locally. The PR still needs to be rebased with
> the proper KYLIN Jira ID in the commit message and targeted at another branch,
> ideally the one for 3.1.3.
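
The failure mode described above (a location field that is silently left null when the metadata layout differs, and that only blows up later inside HadoopUtil) can be illustrated with a minimal sketch. This is not the code from BeelineHiveClient.java or from the linked pull request. The class TableLocationParser, its methods, and the "Location:" label are hypothetical names chosen for illustration, and the two row layouts handled here are assumptions about how the output might differ between versions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, NOT part of Kylin: extracts a table location from
// "describe formatted"-style rows without assuming one fixed column layout.
final class TableLocationParser {

    // Assumed label; the exact label and layout may vary by Hive/beeline version.
    private static final String LOCATION_LABEL = "Location:";

    // Returns the location if any row carries it, otherwise Optional.empty().
    static Optional<String> findLocation(List<String[]> rows) {
        for (String[] row : rows) {
            for (int i = 0; i < row.length; i++) {
                String cell = row[i] == null ? "" : row[i].trim();
                if (!cell.startsWith(LOCATION_LABEL)) {
                    continue;
                }
                // Assumed layout A: label and value share one cell.
                String inline = cell.substring(LOCATION_LABEL.length()).trim();
                if (!inline.isEmpty()) {
                    return Optional.of(inline);
                }
                // Assumed layout B: value sits in the next non-empty cell.
                for (int j = i + 1; j < row.length; j++) {
                    String next = row[j] == null ? "" : row[j].trim();
                    if (!next.isEmpty()) {
                        return Optional.of(next);
                    }
                }
            }
        }
        return Optional.empty();
    }

    // Fails fast with a descriptive message instead of passing null downstream.
    static String requireLocation(String tableName, List<String[]> rows) {
        return findLocation(rows).orElseThrow(() -> new IllegalStateException(
                "No location found in metadata for table " + tableName));
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"# Detailed Table Information", null, null},
                new String[] {"Location:", "hdfs://example/warehouse/t", null});
        System.out.println(requireLocation("DEFAULT.T", rows));
    }
}
```

With a guard of this kind, an unexpected metadata format would surface as an explicit error naming the table, rather than as a NullPointerException several frames away from the parsing code.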
--
This message was sent by Atlassian Jira
(v8.3.4#803005)