Usually the join is on the lookup table's primary key, which must be unique. 
If it is not, Kylin reports this error; you need to clean the data to remove 
the ambiguity.
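
As a rough illustration of what "clean the data" can mean here, the sketch 
below finds the join keys that occur more than once and then builds a result 
with one row per key. It runs against an in-memory SQLite table so that it is 
self-contained; the equivalent Hive query should look similar, but that is an 
assumption to verify on your side. Only the table name M_SCAN, the column 
ORDER_NAME and the first two rows come from the report; SCAN_ID, SCAN_TIME and 
the third row are made up, and keeping the highest SCAN_ID per key is just one 
possible rule.

```python
import sqlite3

# Trimmed-down stand-in for M_SCAN. SCAN_ID and SCAN_TIME are hypothetical
# column names; only ORDER_NAME is named in the original report.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE M_SCAN (SCAN_ID INTEGER, ORDER_NAME TEXT, SCAN_TIME TEXT)"
)
conn.executemany(
    "INSERT INTO M_SCAN VALUES (?, ?, ?)",
    [
        (95859459, "NA94092521", "10:52:34"),  # from value1 in the error
        (95859462, "NA94092521", "10:52:39"),  # from value2 in the error
        (95859470, "NA94092600", "10:53:01"),  # made-up row with a unique key
    ],
)

# Step 1: list every join key that appears more than once.
duplicates = conn.execute(
    "SELECT ORDER_NAME, COUNT(*) FROM M_SCAN "
    "GROUP BY ORDER_NAME HAVING COUNT(*) > 1"
).fetchall()
print(duplicates)

# Step 2: keep one row per key. Choosing the highest SCAN_ID is an assumed
# rule; pick whatever rule matches the meaning of your data.
deduped = conn.execute(
    "SELECT MAX(SCAN_ID), ORDER_NAME FROM M_SCAN "
    "GROUP BY ORDER_NAME ORDER BY ORDER_NAME"
).fetchall()
print(deduped)
```

A table or view built from the second query has a unique ORDER_NAME, so it 
could serve as the lookup table in place of the raw M_SCAN.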

Regards,

Shaofeng Shi

[email protected]

On Fri, Aug 19, 2016 at 7:22 PM +0800, "北极晨光" <[email protected]> wrote:

Hi,


We are using the latest version of Kylin, but the following error occurs when 
building a cube:
java.lang.IllegalStateException: Dup key found, key=[NA94092521],
value1=[95859459,NA22686932NA94092521,NA22686932,NA94092521,A100,5,A201,20,NAC03,NAC03,PartScanE,2014/09/26,10:52:34,2014/09/26,1],
value2=[95859462,NA22686904NA94092521,NA22686904,NA94092521,A201,20,A401,40,NAC03,NAC03,a61007,2014/09/26,10:52:39,2014/09/26,1]
    at org.apache.kylin.dict.lookup.LookupTable.initRow(LookupTable.java:84)
    at org.apache.kylin.dict.lookup.LookupTable.init(LookupTable.java:67)
    at org.apache.kylin.dict.lookup.LookupStringTable.init(LookupStringTable.java:79)
    at org.apache.kylin.dict.lookup.LookupTable.<init>(LookupTable.java:55)
    at org.apache.kylin.dict.lookup.LookupStringTable.<init>(LookupStringTable.java:65)
    at org.apache.kylin.cube.CubeManager.getLookupTable(CubeManager.java:619)
    at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:61)
    at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:42)
    at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:56)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:112)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:112)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:127)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)


There are two tables: the fact table is M_ORDER and the lookup table is 
M_SCAN. M_ORDER is inner-joined to M_SCAN on the ORDER_NAME column. ORDER_NAME 
is unique in M_ORDER but has duplicates in M_SCAN.


What could we do to solve the problem? Thanks.


Zhang



