Re: fail Build Dimension Dictionary
Can you pastebin the cube desc and the lookup table's Hive schema?

On Wed, Aug 24, 2016 at 3:17 PM, Yiming Liu wrote:

> Please double check the cube definition, which is a JSON file.
>
> 2016-08-24 15:13 GMT+08:00 Jose Raul Perez Rodriguez <
> joseraul.w...@gmail.com>:
>
>> Yes, seems logical, but I haven't defined any date/time/timestamp column,
>> that is what makes this error pretty baffling for me. Thanks Yiming.
>>
>> On 24/08/16 03:24, Yiming Liu wrote:
>>
>> The exception means the cube has defined one column with type
>> date/time/datetime/timestamp, but the value of that column is NULL.
>>
>> 2016-08-23 18:06 GMT+08:00 Jose Raul Perez Rodriguez <
>> joseraul.w...@gmail.com>:
>>
>>> Hi,
>>>
>>> When I build cubes with *derived columns* I always get the same error at
>>> the "Build Dimension Dictionary" step.
>>>
>>> kylin version: 1.5.3
>>>
>>> error:
>>>
>>> java.lang.NullPointerException
>>> at org.apache.kylin.common.util.DateFormat.isAllDigits(DateFormat.java:121)
>>> at org.apache.kylin.common.util.DateFormat.stringToMillis(DateFormat.java:104)
>>> at org.apache.kylin.common.util.DateFormat.stringToMillis(DateFormat.java:91)
>>> at org.apache.kylin.dict.lookup.LookupStringTable.convertRow(LookupStringTable.java:86)
>>> at org.apache.kylin.dict.lookup.LookupStringTable.convertRow(LookupStringTable.java:34)
>>> at org.apache.kylin.dict.lookup.LookupTable.initRow(LookupTable.java:76)
>>> at org.apache.kylin.dict.lookup.LookupTable.init(LookupTable.java:67)
>>> at org.apache.kylin.dict.lookup.LookupStringTable.init(LookupStringTable.java:79)
>>> at org.apache.kylin.dict.lookup.LookupTable.<init>(LookupTable.java:55)
>>> at org.apache.kylin.dict.lookup.LookupStringTable.<init>(LookupStringTable.java:65)
>>> at org.apache.kylin.cube.CubeManager.getLookupTable(CubeManager.java:619)
>>> at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:61)
>>> at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:42)
>>> at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:56)
>>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>>> at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
>>> at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:112)
>>> at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
>>> at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:112)
>>> at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:127)
>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>> at java.lang.Thread.run(Thread.java:745)
>>>
>>> result code:2
>>>
>>> The curious thing is that I don't use any Date column, so it is pretty
>>> baffling. And it only happens with cubes using derived columns.
>>>
>>> Any idea will be helpful,
>>>
>>> Thanks
>>
>> --
>> With Warm regards
>>
>> Yiming Liu (刘一鸣)
>
> --
> With Warm regards
>
> Yiming Liu (刘一鸣)

--
Regards,
*Bin Mahone | 马洪宾*
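Yiming's explanation above (a column typed date/time/datetime/timestamp whose value is NULL) suggests a simple check on the lookup table. The following is a minimal sketch in Python, not Kylin code: the schema, the sample rows, and the helper name `find_null_date_columns` are all hypothetical, chosen only to illustrate the condition.

```python
# Hypothetical helper: not part of Kylin. It mimics the condition described
# in the thread, where a date-typed lookup column holds a NULL value.
DATE_LIKE_TYPES = {"date", "time", "datetime", "timestamp"}


def find_null_date_columns(schema, rows):
    """Return the sorted names of date-like columns containing at least one None.

    schema: list of (column_name, type_name) pairs, in column order.
    rows:   iterable of tuples whose values follow the schema order.
    """
    # Positions and names of the columns whose declared type is date-like.
    date_positions = [
        (index, name)
        for index, (name, type_name) in enumerate(schema)
        if type_name.lower() in DATE_LIKE_TYPES
    ]
    offenders = set()
    for row in rows:
        for index, name in date_positions:
            if row[index] is None:
                offenders.add(name)
    return sorted(offenders)


# Assumed example data: a lookup table with one date column and one NULL in it.
schema = [("id", "int"), ("name", "string"), ("created", "date")]
rows = [(1, "a", "2016-08-01"), (2, "b", None)]
print(find_null_date_columns(schema, rows))  # prints ['created']
```

A non-empty result would mean the lookup table carries a date-typed column with NULLs even if the cube does not use that column as a dimension, which may be why the error shows up only when derived columns cause the lookup table to be loaded.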
Re: Use Apache HUE with Kylin
Thanks, Alberto!

--
Regards,
*Bin Mahone | 马洪宾*
Use Apache HUE with Kylin
Hello,

I wrote a short guide on how to query Kylin from HUE:
https://github.com/albertoRamon/Kylin/tree/master/KylinWithHue

For any suggestions or bugs, *feel free to contact me*.

Thanks,
Alberto
Re: Kylin documentation
Not at the moment, but we are working on it. Thanks!

On Fri, Aug 26, 2016 at 8:36 PM, Roberto Tardío Olmos <
roberto.tar...@stratebi.com> wrote:

> Hi,
> I was wondering if there is any scientific paper or book about Apache
> Kylin, in addition to the presentations and the links on the website. I
> searched several times but cannot find anything about it.
>
> Thanks in advance,
>
> --
> *Roberto Tardío Olmos*
> *Big Data & Business Intelligence Consultant*
>
> Avenida de Brasil, 17, Planta 16. 28020 Madrid Fijo: 91.788.34.10
Re: how to set the running mr job library directory to local?
The question itself is confusing. Java libraries have to be on the local file system; there is no way for a classpath to reference dependencies on HDFS.

On Wed, Aug 24, 2016 at 10:09 AM, Wang Yajun wrote:

> Hi all
>
> I have set HIVE_HOME and HBASE_HOME to local paths in the
> environment, but Kylin still finds the dependencies on HDFS.
>
> So I want to know how to make Kylin find the dependency libraries in the
> local path, not on HDFS?
>
> Thanks
>
> K
Re: Is there a way to deal with a multi-value dimension column?
I think getting the data model right is the first thing. Expanding the multi-value column into multiple rows is the right approach. The concern that the result will be too big is then a secondary issue. There are plenty of ways to handle a big table. E.g. it can be a view that exists only temporarily during the cube build and is deleted right after the build completes.

On Thu, Aug 18, 2016 at 7:50 PM, 张天生 wrote:

> You perhaps don't understand my question. My question is: the original
> column value is '1_3_12_15_27_35', but it can't be used directly as a
> dimension value, so it must be split into 6 values [1, 3, 12, 15, 27, 35],
> and these values will be used to construct the rowkey, and the original
> record row will be expanded 6 times, which is too big. Is there a way to
> read '1_3_12_15_27_35' and automatically split it into 6 values in the
> distinct column and other steps, use these values to create the dimension
> dictionary and rowkey, and not need to preprocess the original data?
>
> Li Yang wrote on Thu, Aug 18, 2016 at 6:47 PM:
>
>> Depending on how you query/process the multi-value field, the answer will
>> be different.
>>
>> Could you share some query samples?
>>
>> On Wed, Aug 17, 2016 at 2:35 PM, 张天生 wrote:
>>
>>> Can someone help me answer this question? I am still waiting for an
>>> answer.
>>>
>>> 张天生 wrote on Mon, Aug 15, 2016 at 11:28 AM:
>>>
>>>> I have a dimension user_tags. It is a multi-value column, for example
>>>> the value is "1_3_12_15_27_35_...", separated by "_". As far as I know,
>>>> Kylin doesn't process this multi-value column directly; it must be
>>>> preprocessed into a single-value column, but that will increase the
>>>> record count 50~100 times, and the data is too big. So is there a way
>>>> to deal with a multi-value dimension without splitting the value into
>>>> many records, so that when calculating dimension cardinality it can
>>>> read the original data, automatically split the value into multiple
>>>> values and process them? It would save disk I/O and CPU.
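The row expansion recommended at the top of this message can be sketched as follows. This is only an illustration in Python under stated assumptions: the record layout, the `user_id` field, and the function name `expand_multi_value` are hypothetical, and in a real deployment the same expansion would more likely live in the temporary view mentioned above than in client code.

```python
def expand_multi_value(rows, column, separator="_"):
    """Yield one output row per value found in the multi-value column.

    rows:   iterable of dicts (hypothetical record layout).
    column: name of the column holding separator-joined values.
    """
    for row in rows:
        for value in row[column].split(separator):
            expanded = dict(row)      # copy so the input row is not mutated
            expanded[column] = value  # replace the joined string with one value
            yield expanded


# Assumed sample record using the value quoted in the thread.
records = [{"user_id": 7, "user_tags": "1_3_12_15_27_35"}]
expanded = list(expand_multi_value(records, "user_tags"))
print(len(expanded))                       # prints 6
print([r["user_tags"] for r in expanded])  # prints ['1', '3', '12', '15', '27', '35']
```

Each input row becomes as many rows as it has tags, which is exactly the growth the original poster is worried about; the sketch shows the data model, not a way around that cost.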