Iñigo Martinez created KYLIN-3555:
-------------------------------------

             Summary: Garbage collection on HBase step fails with S3 selected as storage
                 Key: KYLIN-3555
                 URL: https://issues.apache.org/jira/browse/KYLIN-3555
             Project: Kylin
          Issue Type: Bug
          Components: Job Engine
    Affects Versions: v2.4.1
            Reporter: Iñigo Martinez
         Attachments: Screenshot from 2018-09-11 12-31-25.png

When building a cube with S3 selected as storage, the build process fails at 
the last step.

Although S3 has been defined as storage, the cleanup task also tries to delete 
the job's working paths from HDFS, where those paths do not exist.

 
{code:java}
2018-09-11 12:27:56,311 DEBUG [Scheduler 1407846257 Job f8416975-eea6-4500-9cb7-4374f28451dc-237] steps.HDFSPathGarbageCollectionStep:78 : Drop HDFS path on FileSystem: s3://XXXXXXX-emr-kylin
2018-09-11 12:27:57,364 DEBUG [Scheduler 1407846257 Job f8416975-eea6-4500-9cb7-4374f28451dc-237] steps.HDFSPathGarbageCollectionStep:87 : HDFS path /kylin/kylin_metadata/kylin-f8416975-eea6-4500-9cb7-4374f28451dc/plataforma_transacciones_cubo_v1/fact_distinct_columns is dropped.
2018-09-11 12:27:58,104 DEBUG [Scheduler 1407846257 Job f8416975-eea6-4500-9cb7-4374f28451dc-237] steps.HDFSPathGarbageCollectionStep:87 : HDFS path /kylin/kylin_metadata/kylin-f8416975-eea6-4500-9cb7-4374f28451dc/plataforma_transacciones_cubo_v1/hfile is dropped.
2018-09-11 12:27:58,140 DEBUG [Scheduler 1407846257 Job f8416975-eea6-4500-9cb7-4374f28451dc-237] steps.HDFSPathGarbageCollectionStep:78 : Drop HDFS path on FileSystem: hdfs://ip-10-0-1-63.eu-west-1.compute.internal:8020
2018-09-11 12:27:58,142 DEBUG [Scheduler 1407846257 Job f8416975-eea6-4500-9cb7-4374f28451dc-237] steps.HDFSPathGarbageCollectionStep:90 : HDFS path /kylin/kylin_metadata/kylin-f8416975-eea6-4500-9cb7-4374f28451dc/plataforma_transacciones_cubo_v1/fact_distinct_columns not exists.
2018-09-11 12:27:58,147 ERROR [Scheduler 1407846257 Job f8416975-eea6-4500-9cb7-4374f28451dc-237] steps.HDFSPathGarbageCollectionStep:68 : job:f8416975-eea6-4500-9cb7-4374f28451dc-15 execute finished with exception
java.io.FileNotFoundException: File /kylin/kylin_metadata/kylin-f8416975-eea6-4500-9cb7-4374f28451dc/plataforma_transacciones_cubo_v1 does not exist.
    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:904)
    at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:114)
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:964)
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:961)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:971)
    at org.apache.kylin.storage.hbase.steps.HDFSPathGarbageCollectionStep.dropHdfsPathOnCluster(HDFSPathGarbageCollectionStep.java:95)
    at org.apache.kylin.storage.hbase.steps.HDFSPathGarbageCollectionStep.doWork(HDFSPathGarbageCollectionStep.java:65)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:162)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:69)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:162)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:113)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
{code}
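
The trace suggests that, after reporting the path as missing on HDFS, the step still calls listStatus on the parent directory, and that call throws because the parent does not exist there either. A possible guard is sketched below. This is only an illustration and not Kylin code: the class TolerantCleanup and its method names are hypothetical, and java.nio.file stands in for the Hadoop FileSystem API so the sketch runs without a cluster.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch, not the Kylin implementation. The local file system
// (java.nio.file) is used as a stand-in for HDFS.
final class TolerantCleanup {

    // Deletes target recursively if it exists, then removes the parent
    // directory only when that parent exists and is empty.
    // Returns true if target was actually deleted.
    static boolean dropPath(Path target) throws IOException {
        boolean dropped = false;
        if (Files.exists(target)) {
            deleteRecursively(target);
            dropped = true;
        }
        Path parent = target.getParent();
        // Guard: never list a parent directory that is absent on this file system.
        if (parent != null && Files.isDirectory(parent) && isEmpty(parent)) {
            Files.delete(parent);
        }
        return dropped;
    }

    // True when the directory has no entries.
    static boolean isEmpty(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return !entries.findAny().isPresent();
        }
    }

    // Deletes a file tree, children before parents.
    static void deleteRecursively(Path root) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }
}
```

With a guard like this, calling TolerantCleanup.dropPath on a path whose parent is absent would return false instead of failing the job.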
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
