[ https://issues.apache.org/jira/browse/KYLIN-4016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tom_yj updated KYLIN-4016:
--------------------------
Description:
Garbage Collection on HBase
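The HBase data is stored on S3, but the cleanup step looks for the files on HDFS and throws a file-not-found exception.
The files that need to be cleaned up really do exist on S3; they do not exist on HDFS.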
kylin.properties
kylin.env.hdfs-working-dir=s3://XXX-hive/kylin
kylin.storage.hbase.cluster-fs=s3://XXX-hive/hbase
log
java.io.FileNotFoundException: File /kylin/kylin_metadata/kylin-d3926099-21bb-6893-1055-6d52f2fe17b7/XXX does not exist.
    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:904)
    at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:114)
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:964)
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:961)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:971)
    at org.apache.kylin.storage.hbase.steps.HDFSPathGarbageCollectionStep.dropHdfsPathOnCluster(HDFSPathGarbageCollectionStep.java:95)
    at org.apache.kylin.storage.hbase.steps.HDFSPathGarbageCollectionStep.doWork(HDFSPathGarbageCollectionStep.java:65)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:163)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:69)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:163)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:113)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
was:
Garbage Collection on HBase
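The HBase data is stored on S3, but the cleanup step looks for the files on HDFS and throws a file-not-found exception.
The files that need to be cleaned up really do exist on S3; they do not exist on HDFS.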
kylin.properties
kylin.env.hdfs-working-dir=s3://XXX-hive/kylin
kylin.storage.hbase.cluster-fs=s3://XXX-hive/hbase
log
> Garbage Collection on HBase: HBase data is stored on S3, but the step looks on HDFS
> ---------------------------------------------------
>
> Key: KYLIN-4016
> URL: https://issues.apache.org/jira/browse/KYLIN-4016
> Project: Kylin
> Issue Type: Bug
> Components: Storage - HBase
> Affects Versions: v2.5.0
> Reporter: Tom_yj
> Priority: Major
>
>
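The path in the exception has no s3:// scheme and no bucket name, while the configured working directory does. That is consistent with (though it does not prove) the cleanup step reducing the location to its bare path component and then asking the cluster's default file system for it. The following is a minimal sketch of that effect using only java.net.URI; the class and method names are hypothetical, the "hdfs" default is an assumption about the cluster, and none of this is Kylin or Hadoop source code.

```java
import java.net.URI;

public class SchemeAwarePathDemo {

    // Hypothetical helper: keeps only the path component,
    // dropping the scheme ("s3") and the authority (the bucket name).
    static String pathWithoutSchemeAndAuthority(String location) {
        return URI.create(location).getPath();
    }

    // Hypothetical helper: names the file system a location belongs to,
    // falling back to the given default when the location has no scheme.
    static String owningScheme(String location, String defaultScheme) {
        String scheme = URI.create(location).getScheme();
        return scheme != null ? scheme : defaultScheme;
    }

    public static void main(String[] args) {
        // Value of kylin.env.hdfs-working-dir from the report.
        String workingDir = "s3://XXX-hive/kylin";
        String jobDir = workingDir
                + "/kylin_metadata/kylin-d3926099-21bb-6893-1055-6d52f2fe17b7/XXX";

        String stripped = pathWithoutSchemeAndAuthority(jobDir);

        // Same path as in the FileNotFoundException message.
        System.out.println(stripped);
        // The full location still identifies S3 as the owner.
        System.out.println(owningScheme(jobDir, "hdfs"));
        // The stripped path falls back to the assumed default, HDFS.
        System.out.println(owningScheme(stripped, "hdfs"));
    }
}
```

If this reading is right, a fix would resolve the file system from the full location rather than from the stripped path; in Hadoop that is typically done with Path.getFileSystem(Configuration), though whether that fits this step would need checking against the Kylin source.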