Re: Number of cube dimensions is limited to 62?

2016-06-17 Thread ShaoFeng Shi
Almost true. You can think of Kylin as 64-bit; in theory it supports up to 63
dimensions in one cube.

I believe there is no plan to extend this to 128 or more in the near term. In
most cases the number of dimensions does not exceed 20, so 64 is already
"redundant" and costs extra space.

With so many dimensions, there must be room for optimization. You can try
approaches such as:
1) move some columns into lookup tables and define them as "derived"
dimensions in the cube;
2) or create multiple cubes, each serving a subset of these columns.

If you find another way, please share it with the community. Thanks.
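
To illustrate the 64-bit point, here is a minimal sketch. It assumes each
dimension takes one bit of a signed Java long used as a cuboid ID, with the
sign bit left unused, which leaves 63 bits. The class and method names are
hypothetical and this is not Kylin's actual code:

```java
class CuboidBits {
    // Assumption: one bit per dimension in a signed 64-bit long, sign bit unused.
    static final int MAX_DIMENSIONS = Long.SIZE - 1; // 63

    // Build a cuboid ID from the indexes of the dimensions it contains.
    static long cuboidId(int... dimensionIndexes) {
        long id = 0L;
        for (int i : dimensionIndexes) {
            if (i < 0 || i >= MAX_DIMENSIONS) {
                throw new IllegalArgumentException("dimension index out of range: " + i);
            }
            id |= 1L << i; // set the bit that represents dimension i
        }
        return id;
    }

    // The base cuboid contains every dimension of the cube.
    static long baseCuboidId(int dimensionCount) {
        if (dimensionCount < 1 || dimensionCount > MAX_DIMENSIONS) {
            throw new IllegalArgumentException("unsupported dimension count: " + dimensionCount);
        }
        // Unsigned shift keeps only the lowest 'dimensionCount' bits set.
        return -1L >>> (Long.SIZE - dimensionCount);
    }

    public static void main(String[] args) {
        System.out.println(Long.toBinaryString(cuboidId(0, 2)));           // prints 101
        System.out.println(baseCuboidId(MAX_DIMENSIONS) == Long.MAX_VALUE); // prints true
    }
}
```

Under that assumption, a 64th dimension would need the sign bit, and anything
beyond that would need a wider ID type.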





-- 
Best regards,

Shaofeng Shi


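About measure pre-computation

2016-06-17 Thread 仇同心
Hi all,
When a cube is built, each measure can use a different aggregation function,
as defined in the cube design. I am trying to find the source code that
performs the computation for each aggregation function, but in

// Phase 3: Build Cube
addLayerCubingSteps(result, jobId, cuboidRootPath); // layer cubing, only selected algorithm will execute
result.addTask(createInMemCubingStep(jobId, cuboidRootPath)); // inmem cubing, only selected algorithm will execute
outputSide.addStepPhase3_BuildCube(result, cuboidRootPath);


I could not find the MapReduce functions that compute the measures'
aggregation functions, that is, the pre-computation code that runs before the
results are stored in HBase. Can anyone tell me exactly where this is
implemented?

Thanks!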
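
For illustration only, the kind of logic being asked about can be sketched as
below. This is a generic reduce-side aggregation in which rows sharing one
dimension key are merged measure by measure. All names are hypothetical and
this is not Kylin's actual implementation:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.LongBinaryOperator;

class MeasureAggregationSketch {
    // Hypothetical measure functions; each merges two partial values into one.
    enum Func {
        SUM(Long::sum), MIN(Math::min), MAX(Math::max);

        final LongBinaryOperator op;

        Func(LongBinaryOperator op) {
            this.op = op;
        }
    }

    // Merge all rows of one group; column m is combined with funcs[m].
    static long[] aggregate(Func[] funcs, List<long[]> rows) {
        long[] result = rows.get(0).clone();
        for (long[] row : rows.subList(1, rows.size())) {
            for (int m = 0; m < funcs.length; m++) {
                result[m] = funcs[m].op.applyAsLong(result[m], row[m]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Func[] funcs = {Func.SUM, Func.MIN, Func.MAX};
        List<long[]> rows = Arrays.asList(new long[]{3, 3, 3}, new long[]{5, 1, 9});
        System.out.println(Arrays.toString(aggregate(funcs, rows))); // prints [8, 1, 9]
    }
}
```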



Number of cube dimensions is limited to 62?

2016-06-17 Thread Victoria Tskhay

Hello,

It looks like the max number of dimensions in one cube is 62, is that 
correct?


We would like to add more than that. That may sound crazy, I know, but 
we have a special case where all the dimensions have low cardinality (3) 
and the data is very sparse. We already tried with 62 dimensions and it 
works great.


Is there any way to work around that limit? What would you suggest? 
Thank you!




Best regards
--
Victoria Tskhay

*Java Backend Developer* | glispa GmbH

Sonnenburger Straße 73, 10437 Berlin, Germany
E victoria.tsk...@glispamedia.com
Skype: vikatskhay | www.glispa.com

Sitz Berlin, AG Charlottenburg HRB 114678B


Not able to connect Kylin from Mondrian

2016-06-17 Thread Uma Maheshwar Kamuni
Hi,

I am trying to connect Kylin with Mondrian.

The jars I am using are:

kylin-jdbc-1.0-incubating.jar

olap4j.1.2

mondrian-4.4-lagunitas-SNAPSHOT-with-kylin-dialect


I am getting the error below:

Exception in thread "main" java.lang.NoClassDefFoundError: mondrian/xmla/XmlaHandler$XmlaExtra
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:195)
    at TestM.main(TestM.java:14)
Caused by: java.lang.ClassNotFoundException: mondrian.xmla.XmlaHandler$XmlaExtra
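
One way to narrow this down is a minimal sketch that checks whether the class
named in the stack trace can be loaded from the current classpath. The class
name `ClasspathCheck` is hypothetical; the only API used is the standard
`Class.forName`:

```java
class ClasspathCheck {
    // Returns true if the named class can be located and loaded.
    static boolean isLoadable(String className) {
        try {
            // initialize=false: load the class without running its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "mondrian.xmla.XmlaHandler$XmlaExtra"; // class from the stack trace
        System.out.println(name + " loadable: " + isLoadable(name));
    }
}
```

Running it with the same classpath as TestM should show whether any of the
jars listed above actually contains that class.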



Re: kylin intermediate tables in Hive

2016-06-17 Thread ShaoFeng Shi
By default the web UI only shows jobs from the LAST ONE WEEK; please
check.



-- 
Best regards,

Shaofeng Shi


Re: kylin intermediate tables in Hive

2016-06-17 Thread Jie Tao
Actually, I discarded all jobs and I do not see any ERROR job in the
Monitor view of the Kylin UI.


Where can I see these error jobs?

Jie


Re: kylin intermediate tables in Hive

2016-06-17 Thread ShaoFeng Shi
Hi Jie,

If a job is in "ERROR" state, its intermediate Hive table will not be dropped,
because "ERROR" is not a final state. A user can resume an "ERROR" job at any
time, so Kylin skips the cleanup for it.

If you discard these error jobs and re-run the cleanup, the intermediate
Hive tables will be dropped.

The message here is not clear; I will change the wording...
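
The rule described above can be sketched as follows. The state names and the
class are hypothetical, chosen only to illustrate the "final state" check, and
are not taken from Kylin's source:

```java
import java.util.EnumSet;
import java.util.Set;

class IntermediateTableCleanup {
    // Hypothetical job states for illustration.
    enum JobState { RUNNING, ERROR, DISCARDED, SUCCEED }

    // Assumption: only these states are final, meaning the job can never run again.
    static final Set<JobState> FINAL_STATES = EnumSet.of(JobState.DISCARDED, JobState.SUCCEED);

    // An ERROR job can still be resumed, so its intermediate table is kept.
    static boolean canDropIntermediateTable(JobState state) {
        return FINAL_STATES.contains(state);
    }

    public static void main(String[] args) {
        System.out.println(canDropIntermediateTable(JobState.ERROR));     // prints false
        System.out.println(canDropIntermediateTable(JobState.DISCARDED)); // prints true
    }
}
```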



-- 
Best regards,

Shaofeng Shi


[jira] [Created] (KYLIN-1805) It's easily got stuck when deleting HTables during running the StorageCleanupJob

2016-06-17 Thread Zhong Yanghong (JIRA)
Zhong Yanghong created KYLIN-1805:
-------------------------------------

             Summary: It's easily got stuck when deleting HTables during running the StorageCleanupJob
                 Key: KYLIN-1805
                 URL: https://issues.apache.org/jira/browse/KYLIN-1805
             Project: Kylin
          Issue Type: Improvement
          Components: Tools, Build and Test
            Reporter: Zhong Yanghong
            Assignee: Zhong Yanghong


In the unlucky case where some unused HTables cannot be deleted successfully,
Kylin currently hangs at that point. It would be better to skip the
problematic HTables and continue deleting the rest.
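
A minimal sketch of the proposed behaviour, assuming each delete can be run
as a task with a timeout. The `deleter` callback stands in for the real HTable
delete call, and all names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Consumer;

class SkipStuckDeletes {
    // Tries to delete every table; a table whose delete fails or exceeds the
    // timeout is recorded and skipped, and the loop moves on to the next one.
    static List<String> deleteAll(List<String> tables, Consumer<String> deleter, long timeoutMillis) {
        List<String> skipped = new ArrayList<>();
        // Cached pool: a delete that ignores interruption cannot block later ones.
        ExecutorService executor = Executors.newCachedThreadPool();
        try {
            for (String table : tables) {
                Future<?> task = executor.submit(() -> deleter.accept(table));
                try {
                    task.get(timeoutMillis, TimeUnit.MILLISECONDS);
                } catch (TimeoutException | ExecutionException e) {
                    task.cancel(true); // interrupt the stuck or failed delete
                    skipped.add(table);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    task.cancel(true);
                    skipped.add(table);
                    break; // the caller asked us to stop
                }
            }
        } finally {
            executor.shutdownNow();
        }
        return skipped;
    }
}
```

The returned list lets the caller log or retry the skipped tables later.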



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: kylin intermediate tables in Hive

2016-06-17 Thread Jie Tao
You are correct, the intermediate tables are left over from failed builds. I
did clean up storage following the linked guide. Intermediate data in HDFS
and HBase is deleted, but the intermediate tables in Hive are not. The
command lists the tables but does not drop them. I do not have a lookup
table, but my fact table is a view.


When I run the cleanup command, I get:
kylin_intermediate_logout_full_cube_1970010100_2015100100
kylin_intermediate_logout_full_cube_1970010100_20160529010500
kylin_intermediate_logout_full_cube_1970010100_2016060800
kylin_intermediate_logout_full_cube_1970010100_20160608010500
kylin_intermediate_logout_full_cube_1970010100_20160609010500
kylin_intermediate_logout_full_cube_1970010100_2016061500
kylin_intermediate_logout_full_cube_1970010100_2016062600
kylin_intermediate_logout_full_cube_1970010100_20160626042000
kylin_intermediate_test_cube_1970010100_20151201010500
kylin_intermediate_test_cube_1970010100_20151231234000
kylin_intermediate_test_cube_1970010100_20160302063000
kylin_intermediate_test_cube_1970010100_2016062600
kylin_intermediate_test_cube_1970010100_20160626042000
kylin_intermediate_test_cube_1970010100_20160704082000
Time taken: 0.189 seconds, Fetched: 14 row(s)
2016-06-17 09:37:12,645 INFO  [main StorageCleanupJob:262]: Remove 
intermediate hive table with job id 493fd20b-3074-403e-9963-fe4fb7ff7c65 
with job status ERROR
2016-06-17 09:37:12,648 INFO  [main StorageCleanupJob:262]: Remove 
intermediate hive table with job id 8a377e30-e3ba-4fe2-be12-e7d412afec5e 
with job status ERROR


Best regards,

Jie


Re: kylin intermediate tables in Hive

2016-06-17 Thread ShaoFeng Shi
BTW, are you using a view as a lookup table?



-- 
Best regards,

Shaofeng Shi


Re: kylin intermediate tables in Hive

2016-06-17 Thread ShaoFeng Shi
This is common. If a job fails partway through and you discard that
job, the "Garbage collection" step is not executed, so the garbage
is left behind.

This is why we still recommend that users run the offline cleanup
periodically. It is not perfect, but it is good enough for most scenarios:
https://kylin.apache.org/docs15/howto/howto_cleanup_storage.html




-- 
Best regards,

Shaofeng Shi


Re: kylin intermediate tables in Hive

2016-06-17 Thread Li Yang
Woo... something new to me. Does anybody know?

On Tue, Jun 14, 2016 at 6:57 PM, Jie Tao  wrote:

> Kylin actually drops useless intermediate tables after cube building, but
> I still see one "kylin_intermediate_cubename_searchdata" table for each
> cube build in Hive. Are these tables still useful to Kylin? I use
> Kylin 1.5.2.1.
>
> Cheers,
>
> Jie
>


[jira] [Created] (KYLIN-1804) Better view for hierarchy dimensions adding when creating cube

2016-06-17 Thread Roger Shi (JIRA)
Roger Shi created KYLIN-1804:


             Summary: Better view for hierarchy dimensions adding when creating cube
                 Key: KYLIN-1804
                 URL: https://issues.apache.org/jira/browse/KYLIN-1804
             Project: Kylin
          Issue Type: Improvement
          Components: Web
            Reporter: Roger Shi
            Assignee: Zhong,Jason
            Priority: Minor








[jira] [Created] (KYLIN-1803) ExtendedColumn Measure Encoding with Non-ascii Characters

2016-06-17 Thread Yerui Sun (JIRA)
Yerui Sun created KYLIN-1803:


             Summary: ExtendedColumn Measure Encoding with Non-ascii Characters
                 Key: KYLIN-1803
                 URL: https://issues.apache.org/jira/browse/KYLIN-1803
             Project: Kylin
          Issue Type: Bug
          Components: Job Engine
    Affects Versions: v1.5.2, v1.5.3
            Reporter: Yerui Sun
            Assignee: Yerui Sun
             Fix For: v1.5.3


The ExtendedColumn measure ingests data by converting a String to a byte array.
The current conversion cannot handle non-ASCII characters properly. For example,
the Chinese characters '北京' are converted to '??' rather than to a UTF-8 byte
array.
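
A minimal sketch of the symptom, assuming the lossy step is an encode with a
charset that cannot represent Chinese characters. US-ASCII stands in here for
whatever the actual code uses, and the class name is hypothetical:

```java
import java.nio.charset.StandardCharsets;

class Utf8EncodingSketch {
    // Lossy: US-ASCII cannot represent these characters, so each becomes '?'.
    static String roundTripAscii(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    // Lossless: UTF-8 can encode every Unicode character.
    static String roundTripUtf8(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String beijing = "\u5317\u4eac"; // the two characters from the example above
        System.out.println(roundTripAscii(beijing));                 // prints ??
        System.out.println(roundTripUtf8(beijing).equals(beijing));  // prints true
    }
}
```

Passing an explicit charset to both `getBytes` and the `String` constructor
avoids depending on the platform default encoding.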


