Re: fail Build Dimension Dictionary

2016-08-27 Thread hongbin ma
Can you pastebin the cube desc and the lookup table's Hive schema?
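
As a rough illustration of the diagnosis quoted below (a NULL value in a date-typed column of the lookup table), here is a minimal Python sketch of how lookup-table rows could be checked against the schema. It is not Kylin code: `find_null_date_cells`, the schema dict, and the sample rows are all hypothetical, and it assumes the rows have already been exported from Hive into plain dicts.

```python
# Hypothetical helper, not part of Kylin: scan lookup-table rows for NULLs
# in columns whose declared type is date-like.
DATE_LIKE_TYPES = {"date", "time", "datetime", "timestamp"}


def find_null_date_cells(column_types, rows):
    """Return (row_index, column_name) pairs where a date-like column is NULL.

    column_types: dict mapping column name -> declared type string.
    rows: list of dicts mapping column name -> value (None stands for NULL).
    """
    date_columns = [
        name for name, type_name in column_types.items()
        if type_name.lower() in DATE_LIKE_TYPES
    ]
    return [
        (index, column)
        for index, row in enumerate(rows)
        for column in date_columns
        if row.get(column) is None
    ]


# Made-up schema and rows, for illustration only.
schema = {"id": "int", "name": "string", "created": "timestamp"}
sample_rows = [
    {"id": 1, "name": "a", "created": "2016-08-01 00:00:00"},
    {"id": 2, "name": "b", "created": None},
]
print(find_null_date_cells(schema, sample_rows))  # prints [(1, 'created')]
```

If a check like this reports nothing, the Hive schema of the lookup table is worth comparing against the cube desc, since a column may be typed as a date there without being used as a dimension.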

On Wed, Aug 24, 2016 at 3:17 PM, Yiming Liu wrote:

> Please double-check the cube definition, which is a JSON file.
>
> 2016-08-24 15:13 GMT+08:00 Jose Raul Perez Rodriguez <
> joseraul.w...@gmail.com>:
>
>> Yes, that seems logical, but I haven't defined any date/time/timestamp
>> column, which is what makes this error so baffling to me. Thanks, Yiming.
>>
>>
>> On 24/08/16 03:24, Yiming Liu wrote:
>>
>> The exception means the cube has defined one column with type:
>> date/time/datetime/timestamp, but the value of that column is NULL.
>>
>> 2016-08-23 18:06 GMT+08:00 Jose Raul Perez Rodriguez <
>> joseraul.w...@gmail.com>:
>>
>>> Hi,
>>>
>>> When I build cubes with *derived columns*, I always get the same error at
>>> the "Build Dimension Dictionary" step.
>>>
>>> Kylin version: 1.5.3
>>>
>>> error:
>>>
>>> java.lang.NullPointerException
>>> at org.apache.kylin.common.util.DateFormat.isAllDigits(DateFormat.java:121)
>>> at org.apache.kylin.common.util.DateFormat.stringToMillis(DateFormat.java:104)
>>> at org.apache.kylin.common.util.DateFormat.stringToMillis(DateFormat.java:91)
>>> at org.apache.kylin.dict.lookup.LookupStringTable.convertRow(LookupStringTable.java:86)
>>> at org.apache.kylin.dict.lookup.LookupStringTable.convertRow(LookupStringTable.java:34)
>>> at org.apache.kylin.dict.lookup.LookupTable.initRow(LookupTable.java:76)
>>> at org.apache.kylin.dict.lookup.LookupTable.init(LookupTable.java:67)
>>> at org.apache.kylin.dict.lookup.LookupStringTable.init(LookupStringTable.java:79)
>>> at org.apache.kylin.dict.lookup.LookupTable.<init>(LookupTable.java:55)
>>> at org.apache.kylin.dict.lookup.LookupStringTable.<init>(LookupStringTable.java:65)
>>> at org.apache.kylin.cube.CubeManager.getLookupTable(CubeManager.java:619)
>>> at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:61)
>>> at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:42)
>>> at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:56)
>>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>>> at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
>>> at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:112)
>>> at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
>>> at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:112)
>>> at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:127)
>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>> at java.lang.Thread.run(Thread.java:745)
>>>
>>> result code:2
>>>
>>>
>>> The curious thing is that I don't use any date column, so it is pretty
>>> baffling. It only happens with cubes that use derived columns.
>>>
>>>
>>> Any ideas would be helpful.
>>>
>>>
>>> Thanks
>>>
>>>
>>>
>>
>>
>> --
>> With Warm regards
>>
>> Yiming Liu (刘一鸣)
>>
>>
>>



-- 
Regards,

*Bin Mahone | 马洪宾*


Re: Use Apache HUE with Kylin

2016-08-27 Thread hongbin ma
Thanks, Alberto!




-- 
Regards,

*Bin Mahone | 马洪宾*


Use Apache HUE with Kylin

2016-08-27 Thread Alberto Ramón
Hello,

I wrote a short manual on how to query Kylin from HUE:

https://github.com/albertoRamon/Kylin/tree/master/KylinWithHue

For any suggestions or bugs, *feel free to contact me*.

Thanks, Alberto


Re: Kylin documentation

2016-08-27 Thread Li Yang
Not at the moment. But we are working on it.

Thanks!

On Fri, Aug 26, 2016 at 8:36 PM, Roberto Tardío Olmos <
roberto.tar...@stratebi.com> wrote:

> Hi,
> I was wondering if there is any scientific paper or book about Apache
> Kylin, in addition to the presentations and the links on the website. I
> have searched several times but cannot find anything about it.
>
> Thanks in advance,
>
> --
> *Roberto Tardío Olmos*
> *Big Data & Business Intelligence Consultant*
>
> Avenida de Brasil, 17, Planta 16. 28020 Madrid Fijo: 91.788.34.10
>


Re: how to set the running mr job library directory to local?

2016-08-27 Thread Li Yang
The question itself is very confusing. Java libraries have to be on the
local file system; a classpath cannot reference dependencies on HDFS.

On Wed, Aug 24, 2016 at 10:09 AM, Wang Yajun wrote:

> Hi all
>
> I have set HIVE_HOME and HBASE_HOME to local paths in the environment,
> but Kylin still finds the dependencies on HDFS.
>
>
> So I want to know: how can I make Kylin find the dependency libraries on
> the local path rather than on HDFS?
>
> Thanks
>
> K
>


Re: Is there a way to deal with a multi-value dimension column?

2016-08-27 Thread Li Yang
I think getting the data model right comes first. Expanding the multi-value
column into multiple rows is the right approach. The concern that the result
will be too big is then a secondary issue. There are plenty of ways to handle
a big table. E.g., it can be a view that exists only temporarily during the
cube build and is deleted right after the build completes.
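
To make the expansion concrete, here is a minimal Python sketch of turning one multi-value record into one row per value. It is only an illustration: `expand_multi_value` and the sample record are hypothetical, and in an actual Hive setup the same reshaping is usually expressed in a view with `LATERAL VIEW explode(split(...))` rather than in application code.

```python
def expand_multi_value(rows, column, separator="_"):
    """Yield one output row per value found in a multi-value column.

    Rows are produced lazily, so the expanded table never has to be
    materialized in full (loosely mirroring a temporary view).
    """
    for row in rows:
        raw = row.get(column)
        if not raw:
            # Missing or empty value: pass the row through unchanged.
            yield dict(row)
            continue
        for value in raw.split(separator):
            # Copy the other fields and replace the multi-value field.
            yield {**row, column: value}


# Made-up record, for illustration only.
source = [{"user_id": 7, "user_tags": "1_3_12_15_27_35"}]
expanded = list(expand_multi_value(source, "user_tags"))
print(len(expanded))                        # prints 6
print([r["user_tags"] for r in expanded])   # prints ['1', '3', '12', '15', '27', '35']
```

Because the generator emits rows one at a time, the 6x (or 50~100x) blow-up is paid only while the rows are being consumed, which is the same idea as a view that exists just for the duration of the build.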

On Thu, Aug 18, 2016 at 7:50 PM, 张天生 wrote:

> Perhaps you don't understand my question. The original column value is
> '1_3_12_15_27_35', which can't be used directly as a dimension value, so
> it must be split into 6 values [1, 3, 12, 15, 27, 35]. These values are
> used to construct the rowkey, and the original record is expanded into 6
> rows, which is too big. Is there a way to read '1_3_12_15_27_35' and
> automatically split it into 6 values during the distinct column step and
> the other steps, and use these values to create the dimension dictionary
> and rowkey, without preprocessing the original data?
>
> On Thu, Aug 18, 2016 at 6:47 PM, Li Yang wrote:
>
>> The answer depends on how you query/process the multi-value field.
>>
>> Could you share some query sample?
>>
>> On Wed, Aug 17, 2016 at 2:35 PM, 张天生 wrote:
>>
>>> Can someone help me answer this question? I am still waiting for an
>>> answer.
>>>
>>> On Mon, Aug 15, 2016 at 11:28 AM, 张天生 wrote:
>>>
>>>> I have a dimension user_tags, which is a multi-value column; for
>>>> example, the value is "1_3_12_15_27_35_...", separated by "_". As far
>>>> as I know, Kylin can't process this multi-value column directly; it
>>>> must be preprocessed into a single-value column, but that increases the
>>>> record count by 50~100 times, and the data is too big. So is there a
>>>> way to deal with a multi-value dimension without splitting the value
>>>> into many records? When calculating dimension cardinality, Kylin could
>>>> read the original data, automatically split the value into multiple
>>>> values, and process them, which would save disk I/O and CPU.

>>>
>>