Usually the join is on the lookup table's primary key (PK), which needs to be unique. If it is not, Kylin reports this error; you need to clean the data to remove the ambiguity.
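As a rough illustration of what cleaning the data could look like, here is a minimal Python sketch that finds join keys appearing more than once in a lookup table and then keeps one row per key. It does not use any Kylin API. The helper names `find_duplicate_keys` and `keep_first_per_key`, the shortened three-column rows, and the third sample row are hypothetical. Keeping the first row per key is only an assumed policy; which row should survive depends on your data.

```python
from collections import defaultdict


def find_duplicate_keys(rows, key_index):
    """Group rows by join key and return only the keys that map to more than one row."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row)
    return {key: grouped for key, grouped in groups.items() if len(grouped) > 1}


def keep_first_per_key(rows, key_index):
    """One possible cleanup (an assumed policy): keep the first row seen for each key."""
    seen = set()
    unique_rows = []
    for row in rows:
        key = row[key_index]
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows


# Hypothetical, shortened M_SCAN rows: (scan id, ORDER_NAME, status).
# The first two rows mirror the duplicated key from the error message.
m_scan = [
    ("95859459", "NA94092521", "A100"),
    ("95859462", "NA94092521", "A201"),
    ("95859470", "NA94092600", "A100"),  # made-up row with a unique key
]

duplicates = find_duplicate_keys(m_scan, key_index=1)
cleaned = keep_first_per_key(m_scan, key_index=1)

for key, grouped in duplicates.items():
    print(f"key {key} appears in {len(grouped)} rows")
print(f"{len(cleaned)} rows remain after cleanup")
```

In practice the same check and cleanup would more likely be done in the source table or a view before the cube build, so that the lookup table Kylin reads already has one row per key.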
Regards,
Shaofeng Shi
[email protected]

From Outlook Mobile

On Fri, Aug 19, 2016 at 7:22 PM +0800, "北极晨光" <[email protected]> wrote:

Hi,

We are using the latest version of Kylin, but the following error occurs when building a cube:

java.lang.IllegalStateException: Dup key found, key=[NA94092521], value1=[95859459,NA22686932NA94092521,NA22686932,NA94092521,A100,5,A201,20,NAC03,NAC03,PartScanE,2014/09/26,10:52:34,2014/09/26,1], value2=[95859462,NA22686904NA94092521,NA22686904,NA94092521,A201,20,A401,40,NAC03,NAC03,a61007,2014/09/26,10:52:39,2014/09/26,1]
    at org.apache.kylin.dict.lookup.LookupTable.initRow(LookupTable.java:84)
    at org.apache.kylin.dict.lookup.LookupTable.init(LookupTable.java:67)
    at org.apache.kylin.dict.lookup.LookupStringTable.init(LookupStringTable.java:79)
    at org.apache.kylin.dict.lookup.LookupTable.<init>(LookupTable.java:55)
    at org.apache.kylin.dict.lookup.LookupStringTable.<init>(LookupStringTable.java:65)
    at org.apache.kylin.cube.CubeManager.getLookupTable(CubeManager.java:619)
    at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:61)
    at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:42)
    at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:56)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:112)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:112)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:127)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

There are two tables: the fact table is M_ORDER and the lookup table is M_SCAN. M_ORDER is inner joined to M_SCAN on the column ORDER_NAME. ORDER_NAME is unique in M_ORDER but duplicated in M_SCAN. What can we do to solve the problem?

Thanks.
Zhang
