[
https://issues.apache.org/jira/browse/KYLIN-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17403603#comment-17403603
]
ASF GitHub Bot commented on KYLIN-5044:
---------------------------------------
hit-lacus closed pull request #1698:
URL: https://github.com/apache/kylin/pull/1698
> Kylin 3.1.2 - Cube processing fails on Step 4 when hive client is switched to
> beeline.
> --------------------------------------------------------------------------------------
>
> Key: KYLIN-5044
> URL: https://issues.apache.org/jira/browse/KYLIN-5044
> Project: Kylin
> Issue Type: Bug
> Components: Job Engine
> Affects Versions: v3.1.2
> Environment: HDP-2.6.5.0
> (2.6.5.0-292)
> Kylin 3.1.2
> Centos 7
> Reporter: Piotr Naszarkowski
> Priority: Major
> Attachments: 2021-07-26 21_56_39-Window.png
>
>
> I switched the Hive client to beeline and also enabled SparkSQL for the Hive
> source, using the settings below in kylin.properties:
> {code:java}
> kylin.source.hive.client=beeline
> kylin.source.hive.beeline-shell=beeline
> kylin.source.hive.beeline-params=-n hive -u jdbc:hive2://hdp...:10016
> kylin.source.hive.enable-sparksql-for-table-ops=true
> kylin.source.hive.sparksql-beeline-shell=beeline{code}
> This caused Stage 4 (#4 Step Name: Build Dimension Dictionary) of cube
> processing to fail, as well as lookup refresh. The error:
> {code:java}
> org.apache.kylin.engine.mr.exception.HadoopShellException: java.io.IOException: java.lang.NullPointerException
> org.apache.kylin.engine.mr.exception.HadoopShellException: java.io.IOException: java.lang.NullPointerException
>     at org.apache.kylin.source.hive.HiveTable.getSignature(HiveTable.java:78)
>     at org.apache.kylin.dict.lookup.SnapshotTable.<init>(SnapshotTable.java:73)
>     at org.apache.kylin.dict.lookup.SnapshotManager.buildSnapshot(SnapshotManager.java:140)
>     at org.apache.kylin.cube.CubeManager$DictionaryAssist.buildSnapshotTable(CubeManager.java:1260)
>     at org.apache.kylin.cube.CubeManager.buildSnapshotTable(CubeManager.java:1164)
>     at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:123)
>     at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:69)
>     at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:73)
>     at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:93)
>     at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:64)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:180)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:72)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:180)
>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:118)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>     at org.apache.kylin.common.util.HadoopUtil.fixWindowsPath(HadoopUtil.java:122)
>     at org.apache.kylin.common.util.HadoopUtil.makeURI(HadoopUtil.java:114)
>     at org.apache.kylin.common.util.HadoopUtil.getFileSystem(HadoopUtil.java:92)
>     at org.apache.kylin.engine.mr.DFSFileTable.getSizeAndLastModified(DFSFileTable.java:90)
>     at org.apache.kylin.source.hive.HiveTable.getSignature(HiveTable.java:63)
>     ... 16 more
> result code:2
>     at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:74)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:180)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:72)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:180)
>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:118)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}
> Investigation showed that my beeline returns table metadata in a different
> format from the one the code in BeelineHiveClient.java expects. As a result,
> sdLocation is never retrieved, which leads to the NPE. This happens on the
> LEARN_KYLIN sample project. I wonder whether this has something to do with
> my environment or is simply a bug (it seems severe).
> Proposed code changes: [https://github.com/apache/kylin/pull/1698/files].
> They resolved the issue for me locally, but the PR still needs to be rebased
> with the proper KYLIN Jira ID in the commit and targeted at another branch,
> ideally 3.1.3.
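
For illustration only, here is a minimal sketch of the kind of defensive lookup that would avoid the NPE described above. It is not the code from the pull request and not Kylin's actual API. The class and method names are hypothetical. The two-column key/value row shape and the "Location" key are assumptions about the metadata rows beeline returns, as is the guess that the format difference is a trailing colon or padding on the key. The idea is to match the key tolerantly and to fail with a descriptive message when no location is found, instead of passing null down to the file system helpers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Hypothetical sketch, not Kylin code: tolerant lookup of a table's storage
// location in key/value metadata rows (row shape and key name are assumptions).
public class LocationLookupSketch {

    /** Trims padding, drops one trailing colon, and lowercases the key. */
    static String normalizeKey(String rawKey) {
        if (rawKey == null) {
            return "";
        }
        String key = rawKey.trim();
        if (key.endsWith(":")) {
            key = key.substring(0, key.length() - 1).trim();
        }
        return key.toLowerCase(Locale.ROOT);
    }

    /** Returns the first non-blank value whose normalized key is "location". */
    static Optional<String> findLocation(List<String[]> rows) {
        for (String[] row : rows) {
            if (row == null || row.length < 2 || row[1] == null) {
                continue; // skip malformed or value-less rows
            }
            String value = row[1].trim();
            if ("location".equals(normalizeKey(row[0])) && !value.isEmpty()) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }

    /** Fails fast with a descriptive error instead of returning null. */
    static String requireLocation(String tableName, List<String[]> rows) {
        return findLocation(rows).orElseThrow(() -> new IllegalStateException(
                "No storage location found in metadata for table " + tableName));
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"Owner", "hive"});
        rows.add(new String[] {"Location", "hdfs://example/warehouse/sample_table"});
        System.out.println(requireLocation("sample_table", rows));
    }
}
```

With a guard like this, a metadata format the parser does not understand would surface as a clear error naming the table, rather than as a NullPointerException several frames later in HadoopUtil.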