Luke Han created KYLIN-889:
------------------------------

             Summary: Support more than one HDFS files of lookup table
                 Key: KYLIN-889
                 URL: https://issues.apache.org/jira/browse/KYLIN-889
             Project: Kylin
          Issue Type: Bug
          Components: Job Engine
    Affects Versions: v0.7.1
            Reporter: Luke Han
            Assignee: liyang
             Fix For: v0.8.1, v0.7.4


Previously, the lookup table was assumed to be small enough to fit into memory, 
and a validation rule checks that there is only one HDFS file for that 
lookup table.

However, too many users are hitting this check, and there is also a requirement to 
support big lookup tables.

Exception:
========================================
java.lang.IllegalStateException: Expect 1 and only 1 non-zero file under hdfs://masters/apps/hive/warehouse/d_nw_ne_ecell2, but find 4
        at org.apache.kylin.dict.lookup.HiveTable.findOnlyFile(HiveTable.java:123)
        at org.apache.kylin.dict.lookup.HiveTable.computeHDFSLocation(HiveTable.java:107)
        at org.apache.kylin.dict.lookup.HiveTable.getHDFSLocation(HiveTable.java:83)
        at org.apache.kylin.dict.lookup.HiveTable.getFileTable(HiveTable.java:76)
        at org.apache.kylin.dict.lookup.HiveTable.getSignature(HiveTable.java:71)
        at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:164)
        at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:154)
        at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:53)
        at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:42)
        at org.apache.kylin.job.hadoop.dict.CreateDictionaryJob.run(CreateDictionaryJob.java:53)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.kylin.job.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
        at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
        at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:132)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
result code:2
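
For illustration only, the difference between the current check and the requested behaviour can be sketched as follows. This is not Kylin's code and not the actual fix: the class LookupTableFiles and the method findNonEmptyFiles are hypothetical names, findOnlyFile only mimics the method named in the trace, and java.nio.file on a local directory stands in for the HDFS API so that the sketch runs without a cluster.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: a local directory stands in for the table's HDFS location.
class LookupTableFiles {

    // Mimics the strict check from the trace: fail unless exactly one non-empty file exists.
    static Path findOnlyFile(Path dir) throws IOException {
        List<Path> files = findNonEmptyFiles(dir);
        if (files.size() != 1) {
            throw new IllegalStateException("Expect 1 and only 1 non-zero file under "
                    + dir + ", but find " + files.size());
        }
        return files.get(0);
    }

    // Relaxed variant (assumption): return every non-empty regular file, sorted by path,
    // so that a caller could read the lookup table from all of them.
    static List<Path> findNonEmptyFiles(Path dir) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p) && Files.size(p) > 0) {
                    result.add(p);
                }
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Build a small directory that resembles a multi-file Hive table location.
        Path dir = Files.createTempDirectory("lookup");
        Files.write(dir.resolve("000000_0"), "a,1\n".getBytes(StandardCharsets.UTF_8));
        Files.write(dir.resolve("000001_0"), "b,2\n".getBytes(StandardCharsets.UTF_8));
        Files.write(dir.resolve("_SUCCESS"), new byte[0]); // empty marker file, skipped

        System.out.println("non-empty files: " + findNonEmptyFiles(dir).size());
        try {
            findOnlyFile(dir);
        } catch (IllegalStateException e) {
            System.out.println("strict check fails: " + e.getMessage());
        }
    }
}
```

With two non-empty files in the directory, the strict check throws the same kind of IllegalStateException as above, while the relaxed variant returns both files. Whether a large lookup table can then be held in memory is a separate question that this sketch does not address.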



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
