Currently Kylin assumes the join columns (C, D, E, F in your case) form the primary key of the lookup table, which is not true for your model. This is something to improve. Could you open a JIRA?
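Until that is improved, it may help to check the lookup data before building. Below is a minimal sketch, not Kylin code: `LookupKeyCheck` and `findDuplicateKeys` are hypothetical names, and the rows are assumed to be already loaded as string arrays. It groups lookup rows by the join columns and keeps only the keys shared by more than one row, which is the condition behind the "Dup key found" error.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Kylin: reports join keys that match more than one lookup row.
final class LookupKeyCheck {

    // rows: lookup table rows; keyIdx: positions of the join columns within each row.
    static Map<List<String>, List<String[]>> findDuplicateKeys(List<String[]> rows, int[] keyIdx) {
        Map<List<String>, List<String[]>> byKey = new LinkedHashMap<>();
        for (String[] row : rows) {
            // Build the composite key from the join columns only.
            List<String> key = new ArrayList<>(keyIdx.length);
            for (int i : keyIdx) {
                key.add(row[i]);
            }
            byKey.computeIfAbsent(key, k -> new ArrayList<>()).add(row);
        }
        // Drop keys that identify exactly one row; what remains is ambiguous.
        byKey.values().removeIf(group -> group.size() < 2);
        return byKey;
    }
}
```

An empty result means the join columns are unique in the lookup table; any entry it returns is a key that would have to be cleaned up or disambiguated first.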
On Tue, Aug 23, 2016 at 5:41 PM, 胡志华(万里通科技及数据中心商务智能团队数据分析组) <[email protected]> wrote:

> Hi Shaofeng,
>
> I have run into the same problem. Let me describe my data model.
>
> My lookup table has fields A, B, C, D, E, F, G, ...; A, B, C, D are the
> primary keys.
>
> I use C, D, E, F to inner join the fact table, and I hit the problem
> below. Does that mean I can't create a model like this?
>
> Another question: can I put field G into the filter condition when I
> create the model?
>
> -----Original Message-----
> From: ShaoFeng Shi [mailto:[email protected]]
> Sent: August 20, 2016 16:27
> To: [email protected]; dev
> Subject: Re: Dup key found when building cube
>
> Usually the join is on the PK, which needs to be unique; if it is not,
> Kylin reports this error, and you need to clean the data to remove the
> ambiguity.
>
> Regards,
>
> Shaofeng Shi
>
> [email protected]
>
> From Outlook Mobile
>
> On Fri, Aug 19, 2016 at 7:22 PM +0800, "北极晨光" <[email protected]> wrote:
>
> Hi,
>
> We are using the latest version of Kylin, but we get the following error
> when building a cube.
>
> java.lang.IllegalStateException: Dup key found, key=[NA94092521],
> value1=[95859459,NA22686932NA94092521,NA22686932,NA94092521,A100,5,A201,20,NAC03,NAC03,PartScanE,2014/09/26,10:52:34,2014/09/26,1],
> value2=[95859462,NA22686904NA94092521,NA22686904,NA94092521,A201,20,A401,40,NAC03,NAC03,a61007,2014/09/26,10:52:39,2014/09/26,1]
>     at org.apache.kylin.dict.lookup.LookupTable.initRow(LookupTable.java:84)
>     at org.apache.kylin.dict.lookup.LookupTable.init(LookupTable.java:67)
>     at org.apache.kylin.dict.lookup.LookupStringTable.init(LookupStringTable.java:79)
>     at org.apache.kylin.dict.lookup.LookupTable.<init>(LookupTable.java:55)
>     at org.apache.kylin.dict.lookup.LookupStringTable.<init>(LookupStringTable.java:65)
>     at org.apache.kylin.cube.CubeManager.getLookupTable(CubeManager.java:619)
>     at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:61)
>     at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:42)
>     at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:56)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>     at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:112)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:112)
>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:127)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>
> There are two tables: the fact table is M_ORDER and the lookup table is
> M_SCAN. M_ORDER inner joins M_SCAN on column ORDER_NAME. ORDER_NAME is
> unique in M_ORDER but duplicated in M_SCAN.
>
> What can we do to solve the problem? Thanks.
>
> Zhang
