[jira] [Created] (KYLIN-1696) Have caught exception when connection issue occurs for some Broker

2016-05-15 Thread Zhong Yanghong (JIRA)
Zhong Yanghong created KYLIN-1696:
-

 Summary: Have caught exception when connection issue occurs for 
some Broker
 Key: KYLIN-1696
 URL: https://issues.apache.org/jira/browse/KYLIN-1696
 Project: Kylin
  Issue Type: Bug
  Components: streaming
Reporter: Zhong Yanghong
Assignee: Zhong Yanghong


2016-05-16 01:50:24,711 ERROR [main StreamingCLI:109]: error start streaming
java.lang.RuntimeException: error when get StreamingMessages
    at org.apache.kylin.source.kafka.KafkaStreamingInput.getBatchWithTimeWindow(KafkaStreamingInput.java:93)
    at org.apache.kylin.engine.streaming.OneOffStreamingBuilder$1.run(OneOffStreamingBuilder.java:72)
    at org.apache.kylin.engine.streaming.cli.StreamingCLI.startOneOffCubeStreaming(StreamingCLI.java:129)
    at org.apache.kylin.engine.streaming.cli.StreamingCLI.main(StreamingCLI.java:103)
Caused by: java.nio.channels.UnresolvedAddressException
    at sun.nio.ch.Net.checkAddress(Net.java:127)
    at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:644)
    at kafka.network.BlockingChannel.connect(BlockingChannel.scala:57)
    at kafka.consumer.SimpleConsumer.connect(SimpleConsumer.scala:44)
    at kafka.consumer.SimpleConsumer.getOrMakeConnection(SimpleConsumer.scala:142)
    at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:69)
    at kafka.consumer.SimpleConsumer.send(SimpleConsumer.scala:93)
    at kafka.javaapi.consumer.SimpleConsumer.send(SimpleConsumer.scala:68)
    at org.apache.kylin.source.kafka.util.KafkaRequester.getPartitionMetadata(KafkaRequester.java:132)
    at org.apache.kylin.source.kafka.util.KafkaUtils.getLeadBroker(KafkaUtils.java:53)
    at org.apache.kylin.source.kafka.util.KafkaUtils.getFirstAndLastOffset(KafkaUtils.java:113)
    at org.apache.kylin.source.kafka.util.KafkaUtils.findClosestOffsetWithDataTimestamp(KafkaUtils.java:102)
    at org.apache.kylin.source.kafka.KafkaStreamingInput$StreamingMessageProducer.call(KafkaStreamingInput.java:141)
    at org.apache.kylin.source.kafka.KafkaStreamingInput$StreamingMessageProducer.call(KafkaStreamingInput.java:104)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KYLIN-1695) disable cardinality calculation job when loading hive table

2016-05-15 Thread kangkaisen (JIRA)
kangkaisen created KYLIN-1695:
-

 Summary: disable cardinality calculation job when loading hive 
table
 Key: KYLIN-1695
 URL: https://issues.apache.org/jira/browse/KYLIN-1695
 Project: Kylin
  Issue Type: Bug
  Components: Job Engine
Affects Versions: v1.5.1
Reporter: kangkaisen
Assignee: Dong Li


When a user loads or reloads Hive tables from the web console, Kylin submits an 
MR job asynchronously to calculate column cardinalities. This has four major 
problems:

# the calculated cardinality is stored in table metadata, but never used in 
cubing/querying
# table may change after loading, so the cardinality doesn't necessarily 
reflect the actual value
# the current `HiveColumnCardinalityJob` has many limitations, e.g., it doesn't 
support views
# the `HiveColumnCardinalityJob` may use lots of resources when computing 
cardinality of partitioned table

Due to these problems, we should disable it by default and (maybe) remove it in 
future releases.





[jira] [Created] (KYLIN-1694) make multiply coefficient configurable when estimating cuboid size

2016-05-15 Thread kangkaisen (JIRA)
kangkaisen created KYLIN-1694:
-

 Summary: make multiply coefficient configurable when estimating 
cuboid size
 Key: KYLIN-1694
 URL: https://issues.apache.org/jira/browse/KYLIN-1694
 Project: Kylin
  Issue Type: Bug
  Components: Job Engine
Affects Versions: v1.5.1, v1.5.0
Reporter: kangkaisen
Assignee: Dong Li


In the current version of the MRv2 build engine, CubeStatsReader estimates 
cuboid size as follows: if the cube is memory hungry, the storage size estimate 
is multiplied by 0.05; if the cube is not memory hungry, it is multiplied by 
0.25.

This has one major problem: the default coefficient is too small, so the 
estimated cuboid size is much less than the actual cuboid size. As a result, 
both the number of HBase regions and the number of reducers in CubeHFileJob are 
too small, which makes CubeHFileJob much slower.

After we removed the default coefficient, CubeHFileJob became much faster.

We should make the coefficient configurable, which would be friendlier for 
users.
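
To illustrate why a small coefficient can slow down CubeHFileJob, here is a minimal sketch. The 0.05 and 0.25 values come from the description above; the function names, the 3000 MB region cut, and the raw size are hypothetical and are not Kylin's actual code or defaults.

```python
import math

# Coefficients quoted in the issue description.
MEMORY_HUNGRY_RATIO = 0.05
DEFAULT_RATIO = 0.25


def estimate_cuboid_size_mb(raw_size_mb, memory_hungry, ratio=None):
    """Scale a raw size estimate by a coefficient.

    Passing `ratio` stands in for the configurable coefficient proposed here.
    """
    if ratio is None:
        ratio = MEMORY_HUNGRY_RATIO if memory_hungry else DEFAULT_RATIO
    return raw_size_mb * ratio


def region_count(estimated_size_mb, region_cut_mb=3000):
    """Hypothetical split rule: one region per `region_cut_mb`, at least one."""
    return max(1, math.ceil(estimated_size_mb / region_cut_mb))


raw_mb = 100_000  # hypothetical raw estimate in MB
for ratio in (0.05, 0.25, 1.0):
    size = estimate_cuboid_size_mb(raw_mb, memory_hungry=False, ratio=ratio)
    print(ratio, region_count(size))
```

Under these assumptions, a smaller coefficient yields fewer regions, and therefore fewer reducers writing HFiles in parallel.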





RE: Kylin configuration requirements

2016-05-15 Thread Yapu Jia
Please refer to : http://kylin.apache.org/docs15/install/kylin_cluster.html 

-Original Message-
From: yaoxiao...@huan.tv [mailto:yaoxiao...@huan.tv] 
Sent: Monday, May 16, 2016 11:11 AM
To: dev 
Subject: Kylin configuration requirements

 Hello, for 600 million records in Kylin on four machines (3 slave nodes), what 
configuration is needed?




yaoxiao...@huan.tv


Kylin configuration requirements

2016-05-15 Thread yaoxiao...@huan.tv
 Hello, for 600 million records in Kylin on four machines (3 slave nodes), what 
configuration is needed?




yaoxiao...@huan.tv


[jira] [Created] (KYLIN-1693) Support multiple group-by columns for TOP_N measure

2016-05-15 Thread JunAn Chen (JIRA)
JunAn Chen created KYLIN-1693:
-

 Summary: Support multiple group-by columns for TOP_N measure
 Key: KYLIN-1693
 URL: https://issues.apache.org/jira/browse/KYLIN-1693
 Project: Kylin
  Issue Type: New Feature
  Components: Query Engine
Affects Versions: v1.5.1
Reporter: JunAn Chen
Assignee: liyang


For this case:
table name : "tbl"
columns:  (dim_city, dim_industry, keyword, pv)

The "keyword" column has a large cardinality, about ten million.

Currently I can build "top100 pv" in (dim_city), (dim_industry). 
But I also want to build "top100 pv" in (dim_city, dim_industry) and "top100 pv 
of keyword" in (dim_city), (dim_industry) and (dim_city, dim_industry).
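
The request can be made concrete with a small sketch of the desired result: the top-N keywords by total pv for each (dim_city, dim_industry) pair. This is plain Python over hypothetical sample rows, meant only to pin down the semantics; it is not how Kylin's TOP_N measure is implemented.

```python
from collections import defaultdict
import heapq


def top_n_by_group(rows, group_cols, n):
    """Return {group_key: [(keyword, total_pv), ...]} with the n largest pv totals."""
    totals = defaultdict(lambda: defaultdict(int))
    for row in rows:
        key = tuple(row[c] for c in group_cols)
        totals[key][row["keyword"]] += row["pv"]
    return {
        key: heapq.nlargest(n, kw.items(), key=lambda item: item[1])
        for key, kw in totals.items()
    }


# Hypothetical sample rows with the columns from the issue.
rows = [
    {"dim_city": "A", "dim_industry": "x", "keyword": "k1", "pv": 5},
    {"dim_city": "A", "dim_industry": "x", "keyword": "k2", "pv": 9},
    {"dim_city": "A", "dim_industry": "y", "keyword": "k1", "pv": 7},
    {"dim_city": "A", "dim_industry": "x", "keyword": "k1", "pv": 6},
]

single = top_n_by_group(rows, ["dim_city"], 1)                  # supported today
multi = top_n_by_group(rows, ["dim_city", "dim_industry"], 1)   # requested here
```

The only difference between the two calls is the length of the group key, which is the extension this issue asks for.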





Re: expansion rate so high

2016-05-15 Thread Peng
The Kylin version is the latest, 1.5.1.
In the Advanced Setting, all 5 dimensions are in one aggregation group.

Thanks


-Original Message-
From: Yapu Jia [mailto:yapu...@microsoft.com] 
Sent: May 16, 2016 10:38
To: dev@kylin.apache.org
Subject: RE: expansion rate so high 

Which Kylin version do you use? The latest version has a good improvement
in data expansion.

-Original Message-
From: Peng [mailto:pengli0...@outlook.com] 
Sent: Monday, May 16, 2016 10:21 AM
To: dev@kylin.apache.org
Subject: expansion rate so high 

Hi,

Is it normal for the expansion rate to be about 1600%?

My cube has about one hundred million records, and the fact table has one
lookup table. There are 5 dimensions, 4 of which use fixed-length encoding
with lengths 9, 8, 8, and 33, and 2 measures. The final expansion rate is
about 1600%.

Thanks

Peng



[jira] [Created] (KYLIN-1691) can not load project info from hbase when startup.

2016-05-15 Thread Hanhui LI (JIRA)
Hanhui LI created KYLIN-1691:


 Summary: can not load project info from hbase when startup.
 Key: KYLIN-1691
 URL: https://issues.apache.org/jira/browse/KYLIN-1691
 Project: Kylin
  Issue Type: Bug
  Components: Environment 
Affects Versions: v1.5.1
 Environment: Ubuntu 14
JDK 1.7 
kylin 1.5.1
Reporter: Hanhui LI
Assignee: hongbin ma


Kylin cannot load project info from HBase at startup if a directory named 
kylin_metadata@hbase has been created in $KYLIN_HOME.





RE: expansion rate so high

2016-05-15 Thread Yapu Jia
Which Kylin version do you use? The latest version has a good improvement in 
data expansion.

-Original Message-
From: Peng [mailto:pengli0...@outlook.com] 
Sent: Monday, May 16, 2016 10:21 AM
To: dev@kylin.apache.org
Subject: expansion rate so high 

Hi,

Is it normal for the expansion rate to be about 1600%?

My cube has about one hundred million records, and the fact table has one
lookup table. There are 5 dimensions, 4 of which use fixed-length encoding
with lengths 9, 8, 8, and 33, and 2 measures. The final expansion rate is
about 1600%.

Thanks

Peng



expansion rate so high

2016-05-15 Thread Peng
Hi,

Is it normal for the expansion rate to be about 1600%?

My cube has about one hundred million records, and the fact table has one
lookup table. There are 5 dimensions, 4 of which use fixed-length encoding
with lengths 9, 8, 8, and 33, and 2 measures. The final expansion rate is
about 1600%.

Thanks

Peng



Re: Re: automatically build cube

2016-05-15 Thread nichunen
Hi,


Please refer to http://kylin.apache.org/docs/tutorial/kylin_client_tool.html
 if you are using Kylin 1.2. 
I’ll submit a patch for 1.5.x version later. 


  George/倪春恩

Mobile:+86-13501723787| WeChat:nceecn

北京明略软件系统有限公司(MiningLamp.COM
)





上海市浦东新区晨晖路258号G座iDream张江科创中心C125




Room C125#,Intelligent Industrial Park Building G,258#Chenhui Road, Pudong 
District,Shanghai,201203




> On May 16, 2016, at 9:52 AM, 耳东 <775620...@qq.com> wrote:
> 
> Does this mean that Kylin doesn't have an automatic build function? If I 
> want Kylin to build the cube automatically, should I call the REST API from 
> a scheduling tool such as Quartz?
> 
> -- Original Message --
> From: "bitbean"
> Sent: Monday, May 16, 2016, 9:47 AM
> To: "dev"
> Subject: Re: automatically build cube
> 
> Please use the REST API:
> http://kylin.apache.org/docs15/howto/howto_use_restapi.html
> 
> -- Original Message --
> From: "耳东" <775620...@qq.com>
> Sent: Monday, May 16, 2016, 9:45 AM
> To: "dev"
> Subject: automatically build cube
> 
> Hi all: My fact table has data between 3/24 and 3/30. When I created 
> the model, I chose the starttime column (MMdd) as the partition date 
> column. I built the cube with the data for 3/24 at 23:18 on 5/14. Two days 
> later, it has not run any job to build the cube again.
> How can I enable the automatic build function?
> 


Re: automatically build cube

2016-05-15 Thread bitbean
Please use the REST API:
 http://kylin.apache.org/docs15/howto/howto_use_restapi.html
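
A minimal sketch of what a scheduled job could send, assuming the cube build endpoint on the linked page is `PUT /kylin/api/cubes/{cubeName}/rebuild` with `startTime`, `endTime`, and `buildType` in the JSON body; please verify this against the documentation for your Kylin version. The host, cube name, and credentials below are hypothetical, and the request is only prepared, not sent.

```python
import base64
import json
import urllib.request
from datetime import datetime, timezone


def to_epoch_millis(year, month, day):
    """Midnight UTC of the given date, in milliseconds since the epoch."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp() * 1000)


def build_rebuild_request(base_url, cube_name, start_ms, end_ms, user, password):
    """Prepare (but do not send) a PUT request asking Kylin to build one segment."""
    body = json.dumps(
        {"startTime": start_ms, "endTime": end_ms, "buildType": "BUILD"}
    ).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url}/kylin/api/cubes/{cube_name}/rebuild",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical host, cube name, and credentials.
req = build_rebuild_request(
    "http://localhost:7070", "my_cube",
    to_epoch_millis(2016, 3, 24), to_epoch_millis(2016, 3, 25),
    "ADMIN", "KYLIN",
)
# A scheduler (cron, Quartz, ...) would then call urllib.request.urlopen(req).
```

Running this once per day with the next date range would build one new segment per day.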




-- Original Message --
From: "耳东" <775620...@qq.com>
Sent: May 16, 2016 9:45
To: "dev"

Subject: automatically build cube



Hi all: My fact table has data between 3/24 and 3/30. When I created 
the model, I chose the starttime column (MMdd) as the partition date 
column. I built the cube with the data for 3/24 at 23:18 on 5/14. Two days 
later, it has not run any job to build the cube again.
How can I enable the automatic build function?

automatically build cube

2016-05-15 Thread 耳东
Hi all: My fact table has data between 3/24 and 3/30. When I created 
the model, I chose the starttime column (MMdd) as the partition date 
column. I built the cube with the data for 3/24 at 23:18 on 5/14. Two days 
later, it has not run any job to build the cube again.
How can I enable the automatic build function?

Re: 答复: When build cube, the jobs can't continue go, kylin got this log:

2016-05-15 Thread zhangrongkun
OK,thank you.

--
View this message in context: 
http://apache-kylin.74782.x6.nabble.com/When-build-cube-the-jobs-can-t-continue-go-kylin-got-this-log-tp4503p4545.html
Sent from the Apache Kylin mailing list archive at Nabble.com.


Re: killed by admin

2016-05-15 Thread 耳东
yes, it is because REDUCE capability required is more than the supported max 
container capability in the cluster. Killing the Job. reduceResourceRequest: 
 maxContainerCapability:. 
The problem is solved, when I change the memory to 4096 or a bigger number.




-- Original Message --
From: "Li Yang"
Sent: May 15, 2016 7:35
To: "dev"

Subject: Re: killed by admin



At least the hadoop job history server should have some traces.

On Tue, May 10, 2016 at 11:13 AM, 耳东 <775620...@qq.com> wrote:

> Hi all:
>
>  When I build the cube, in the second step 'Extract Fact Table
> Distinct Columns', the log shows 'killed by admin'.
> And I could not find any error log.

[jira] [Created] (KYLIN-1690) always returning 0 or 1 for sum(a)/sum(b) for integer type a and b

2016-05-15 Thread hongbin ma (JIRA)
hongbin ma created KYLIN-1690:
-

 Summary: always returning 0 or 1 for sum(a)/sum(b) for integer 
type a and b
 Key: KYLIN-1690
 URL: https://issues.apache.org/jira/browse/KYLIN-1690
 Project: Kylin
  Issue Type: Bug
Reporter: hongbin ma
Assignee: hongbin ma



  I want to get a value that is defined as sum(a)/sum(b). How can I do 
this kind of analysis?

  I built a cube that has sum(a) and sum(b). When I execute “select 
sum(a)/sum(b) from table1 group by c”, the result is wrong: sum(a)/sum(b) is 
always 0 and sum(b)/sum(a) is always 1.


   MMENE_NAME   SUCC   ATT   SUCC/ATT
 CSMME15BZX   336981   368366   1
 CSMME32BZX   338754   366842   1
 CSMME07BZX   687965   747694   1
 CSMME03BHW   703269   747623   1
 CSMME12BZX   705856   764656   1
 CSMME16BHW   1962293142173   1


   MMENE_NAME   SUCC   ATT   ATT/SUCC
 CSMME15BZX   336981   368366   0
 CSMME32BZX   338754   366842   0
 CSMME07BZX   687965   747694   0
 CSMME03BHW   703269   747623   0
 CSMME12BZX   705856   764656   0
 CSMME16BHW   1962293142173   0





Re: how to get the rate value

2016-05-15 Thread hongbin ma
​I think it's still worth a JIRA:
https://issues.apache.org/jira/browse/KYLIN-1690

On Sun, May 15, 2016 at 11:43 PM, hongbin ma  wrote:

> for integer metrics, a quick workaround is to modify sum(b)/sum(a)  to
> 1.0*sum(b)/sum(a)
>
> On Sun, May 8, 2016 at 8:26 PM, Li Yang  wrote:
>
>> Thanks for the update.
>>
>> On Thu, May 5, 2016 at 2:57 PM, 耳东 <775620...@qq.com> wrote:
>>
>> > The datatype is bigint. This problem is solved, when I change to double.
>> >
>> >
>> > -- Original Message --
>> > From: "耳东" <775620...@qq.com>
>> > Sent: Wednesday, April 27, 2016, 5:46 PM
>> > To: "dev"
>> >
>> > Subject: Re: how to get the rate value
>> >
>> >
>> >
>> > the Kylin version is added in the description.
>> >
>> >
>> >
>> >
>> > -- Original Message --
>> > From: "Li Yang"
>> > Sent: Wednesday, April 27, 2016, 5:07 PM
>> > To: "dev"
>> >
>> > Subject: Re: how to get the rate value
>> >
>> >
>> >
>> > Please provide the Kylin version in the JIRA.
>> >
>> > On Wed, Apr 27, 2016 at 1:04 PM, ShaoFeng Shi 
>> > wrote:
>> >
>> > > hi dong, could you please open a JIRA to Kylin for tracking this
>> issue?
>> > > https://issues.apache.org/jira/secure/Dashboard.jspa
>> > >
>> > > Thanks!
>> > >
>> > > 2016-04-26 20:56 GMT+08:00 耳东 <775620...@qq.com>:
>> > >
>> > > > Hi all:
>> > > >
>> > > >
>> > > >   I want to get a value which is defined as sum(a)/sum(b), how
>> can
>> > I
>> > > > do this kind of anlysis.
>> > > >
>> > > >   Now I build a cube which have sum(a) and sum(b), when I
>> execute
>> > > > “select sum(a)/sum(b) from table1 group by c” ,the result is wrong.
>> > > > sum(a)/sum(b) the result is all 0 and sum(b)/sum(a) result is all 1.
>> > > >
>> > > >
>> > > >  MMENE_NAME   SUCC   ATT   SUCC/ATT
>> > > >  CSMME15BZX   336981   368366   1
>> > > >  CSMME32BZX   338754   366842   1
>> > > >  CSMME07BZX   687965   747694   1
>> > > >  CSMME03BHW   703269   747623   1
>> > > >  CSMME12BZX   705856   764656   1
>> > > >  CSMME16BHW   1962293142173   1
>> > > >
>> > > >
>> > > >MMENE_NAME   SUCC   ATT   ATT/SUCC
>> > > >  CSMME15BZX   336981   368366   0
>> > > >  CSMME32BZX   338754   366842   0
>> > > >  CSMME07BZX   687965   747694   0
>> > > >  CSMME03BHW   703269   747623   0
>> > > >  CSMME12BZX   705856   764656   0
>> > > >  CSMME16BHW   1962293142173   0
>> > >
>> > >
>> > >
>> > >
>> > > --
>> > > Best regards,
>> > >
>> > > Shaofeng Shi
>> > >
>>
>
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone
>



-- 
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone


Re: how to get the rate value

2016-05-15 Thread hongbin ma
For integer metrics, a quick workaround is to change sum(b)/sum(a) to
1.0*sum(b)/sum(a), which forces floating-point division.
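
A small sketch of why the workaround helps, using the two values from the first row of the table in the original report. Python's `//` operator is used here only to mimic SQL integer division; this is not Kylin code.

```python
succ, att = 336981, 368366  # first row of the reported table

# Integer division truncates, so a ratio below 1 collapses to 0
# and a ratio between 1 and 2 collapses to 1.
int_ratio = succ // att
inverse_int_ratio = att // succ

# Multiplying by 1.0 first promotes the expression to floating point.
float_ratio = 1.0 * succ / att

print(int_ratio, inverse_int_ratio, round(float_ratio, 4))
```

This matches the symptom in the report: one direction of the ratio is always 0 and the other is always 1.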

On Sun, May 8, 2016 at 8:26 PM, Li Yang  wrote:

> Thanks for the update.
>
> On Thu, May 5, 2016 at 2:57 PM, 耳东 <775620...@qq.com> wrote:
>
> > The datatype is bigint. This problem is solved, when I change to double.
> >
> >
> > -- Original Message --
> > From: "耳东" <775620...@qq.com>
> > Sent: Wednesday, April 27, 2016, 5:46 PM
> > To: "dev"
> >
> > Subject: Re: how to get the rate value
> >
> >
> >
> > the Kylin version is added in the description.
> >
> >
> >
> >
> > -- Original Message --
> > From: "Li Yang"
> > Sent: Wednesday, April 27, 2016, 5:07 PM
> > To: "dev"
> >
> > Subject: Re: how to get the rate value
> >
> >
> >
> > Please provide the Kylin version in the JIRA.
> >
> > On Wed, Apr 27, 2016 at 1:04 PM, ShaoFeng Shi 
> > wrote:
> >
> > > hi dong, could you please open a JIRA to Kylin for tracking this issue?
> > > https://issues.apache.org/jira/secure/Dashboard.jspa
> > >
> > > Thanks!
> > >
> > > 2016-04-26 20:56 GMT+08:00 耳东 <775620...@qq.com>:
> > >
> > > > Hi all:
> > > >
> > > >
> > > >   I want to get a value which is defined as sum(a)/sum(b), how
> can
> > I
> > > > do this kind of anlysis.
> > > >
> > > >   Now I build a cube which have sum(a) and sum(b), when I execute
> > > > “select sum(a)/sum(b) from table1 group by c” ,the result is wrong.
> > > > sum(a)/sum(b) the result is all 0 and sum(b)/sum(a) result is all 1.
> > > >
> > > >
> > > >  MMENE_NAME   SUCC   ATT   SUCC/ATT
> > > >  CSMME15BZX   336981   368366   1
> > > >  CSMME32BZX   338754   366842   1
> > > >  CSMME07BZX   687965   747694   1
> > > >  CSMME03BHW   703269   747623   1
> > > >  CSMME12BZX   705856   764656   1
> > > >  CSMME16BHW   1962293142173   1
> > > >
> > > >
> > > >MMENE_NAME   SUCC   ATT   ATT/SUCC
> > > >  CSMME15BZX   336981   368366   0
> > > >  CSMME32BZX   338754   366842   0
> > > >  CSMME07BZX   687965   747694   0
> > > >  CSMME03BHW   703269   747623   0
> > > >  CSMME12BZX   705856   764656   0
> > > >  CSMME16BHW   1962293142173   0
> > >
> > >
> > >
> > >
> > > --
> > > Best regards,
> > >
> > > Shaofeng Shi
> > >
>



-- 
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone


Re: some doubt about query result from in insight

2016-05-15 Thread hongbin ma
This looks like a bug in Kylin. I opened a JIRA at
https://issues.apache.org/jira/browse/KYLIN-1689
Since it's not typical for a numerical column to be both a dimension and a
sum metric, the bug fix may come late.

On Mon, May 9, 2016 at 10:44 AM, 胡志华(万里通科技及数据中心商务智能团队数据分析组) <
huzhihua...@pingan.com.cn> wrote:

> Hi all,
>
>
>
> I recently built a cube named c1 with 2 columns as dimensions,
> "rule_name" and "PARTNER_GAIN_PAY_PT_DOC_CNT", and with
> "sum(PARTNER_GAIN_PAY_PT_DOC_CNT)" as a measure. c1 was built successfully.
>
> I then ran this test query: "select
> rule_name,PARTNER_GAIN_PAY_PT_DOC_CNT
> ,count(*),sum(PARTNER_GAIN_PAY_PT_DOC_CNT) from
> CUB_PARTNER_GAIN_PAY_PT_PRE0_AT0_S where rule_name='1号店3C产品' group by
> rule_name,PARTNER_GAIN_PAY_PT_DOC_CNT;", but the result does not look
> correct.
>
> RULE_NAME    PARTNER...  EXPR$2  EXPR$3
> 1号店3C产品  1860        30      1860
> 1号店3C产品  700         2       700
> 1号店3C产品  7410        38      7410
> 1号店3C产品  2940        60      2940
>
> As I understand it, "count(*)" is the number of records with the same
> rule_name and PARTNER_GAIN_PAY_PT_DOC_CNT, so sum(PARTNER…) should equal
> count(*) * PARTNER_GAIN_PAY_PT_DOC_CNT, but it does not. Is there
> something wrong with my understanding?
>
>
>
>Insight snapshot as below:
>
>
>
>
>
>
>
>
>
>
>
>



-- 
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone


[jira] [Created] (KYLIN-1689) bug when a column being dimension as well as in a sum metric

2016-05-15 Thread hongbin ma (JIRA)
hongbin ma created KYLIN-1689:
-

 Summary: bug when a column being dimension as well as in a sum 
metric
 Key: KYLIN-1689
 URL: https://issues.apache.org/jira/browse/KYLIN-1689
 Project: Kylin
  Issue Type: Bug
Reporter: hongbin ma
Assignee: hongbin ma


Hi all,
 
I recently built a cube named c1 with 2 columns as dimensions, "rule_name" and 
"PARTNER_GAIN_PAY_PT_DOC_CNT", and with "sum(PARTNER_GAIN_PAY_PT_DOC_CNT)" as a 
measure. c1 was built successfully.
 
I then ran this test query: "select 
rule_name,PARTNER_GAIN_PAY_PT_DOC_CNT 
,count(*),sum(PARTNER_GAIN_PAY_PT_DOC_CNT) from 
CUB_PARTNER_GAIN_PAY_PT_PRE0_AT0_S where rule_name='1号店3C产品' group by 
rule_name,PARTNER_GAIN_PAY_PT_DOC_CNT;", but the result does not look correct.
RULE_NAME    PARTNER...  EXPR$2  EXPR$3
1号店3C产品  1860        30      1860
1号店3C产品  700         2       700
1号店3C产品  7410        38      7410
1号店3C产品  2940        60      2940
 
As I understand it, "count(*)" is the number of records with the same rule_name 
and PARTNER_GAIN_PAY_PT_DOC_CNT, so sum(PARTNER…) should equal 
count(*) * PARTNER_GAIN_PAY_PT_DOC_CNT, but it does not. Is there something 
wrong with my understanding?
 
   Insight snapshot as below:
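
The expectation described in this report can be checked with a small sketch: when a query groups by the same column it sums, every row in a group carries the same value, so the sum must equal count(*) times that value. The sample rows are hypothetical, and the helper is plain Python, not Kylin code.

```python
from collections import defaultdict


def group_count_and_sum(rows):
    """Group (rule_name, value) rows by both columns; return {key: (count, sum)}."""
    acc = defaultdict(lambda: [0, 0])
    for rule_name, value in rows:
        acc[(rule_name, value)][0] += 1
        acc[(rule_name, value)][1] += value
    return {key: tuple(cs) for key, cs in acc.items()}


# Hypothetical raw rows: (rule_name, PARTNER_GAIN_PAY_PT_DOC_CNT).
rows = [("r1", 700), ("r1", 700), ("r1", 1860)]

result = group_count_and_sum(rows)
for (rule_name, value), (count, total) in result.items():
    # Holds for any correct engine, because `value` is a grouping column.
    assert total == count * value
```

A result row where the sum equals the grouping value even though the count is greater than 1 violates this invariant.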





Re: The monitor web page cannot load jobs

2016-05-15 Thread hongbin ma
I suggest upgrading to 1.5.1.
1.5.0 metadata is compatible with 1.5.1.

On Sun, May 15, 2016 at 8:55 PM, Li Yang  wrote:

> Try refresh kylin metadata from "Admin" tab, or restart Kylin. If the
> problem persists and you can reproduce stably, open a JIRA.
>
> On Sat, May 14, 2016 at 11:37 AM, jyzheng  wrote:
>
>>
>>
>> The version of Kylin is 1.5.0, and Hbase’s version is 0.98.
>>
>>
>>
>> After submit the refresh of cube, I open the monitor page, which looks
>> like this:
>>
>>
>>
>>
>>
>>
>>
>> If I want to drop the cube, Kylin reports this:
>>
>>
>>
>>
>>
>> And the log reports this:
>>
>>
>>
>> 2016-05-14 11:07:58,155 ERROR [http-bio-7070-exec-4]
>> controller.JobController:127 :
>>
>> java.lang.NullPointerException
>>
>>at
>> org.apache.kylin.rest.service.BasicService$1.apply(BasicService.java:138)
>>
>>at
>> org.apache.kylin.rest.service.BasicService$1.apply(BasicService.java:135)
>>
>>at
>> com.google.common.collect.Iterators$8.computeNext(Iterators.java:688)
>>
>> ……
>>
>>
>>
>> Is it a bug of kylin’s web service?
>>
>>
>>
>> 
>>
>> 郑江雨  云平台
>>
>> Phone: 15155195496
>>
>>
>>
>
>


-- 
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone


Re: The monitor web page cannot load jobs

2016-05-15 Thread Li Yang
Try refreshing the Kylin metadata from the "Admin" tab, or restart Kylin. If the
problem persists and you can reproduce it reliably, open a JIRA.

On Sat, May 14, 2016 at 11:37 AM, jyzheng  wrote:

>
>
> The version of Kylin is 1.5.0, and Hbase’s version is 0.98.
>
>
>
> After submit the refresh of cube, I open the monitor page, which looks
> like this:
>
>
>
>
>
>
>
> If I want to drop the cube, Kylin reports this:
>
>
>
>
>
> And the log reports this:
>
>
>
> 2016-05-14 11:07:58,155 ERROR [http-bio-7070-exec-4]
> controller.JobController:127 :
>
> java.lang.NullPointerException
>
>at
> org.apache.kylin.rest.service.BasicService$1.apply(BasicService.java:138)
>
>at
> org.apache.kylin.rest.service.BasicService$1.apply(BasicService.java:135)
>
>at
> com.google.common.collect.Iterators$8.computeNext(Iterators.java:688)
>
> ……
>
>
>
> Is it a bug of kylin’s web service?
>
>
>
> 
>
> 郑江雨  云平台
>
> Phone: 15155195496
>
>
>


Re: Compiling Kylin

2016-05-15 Thread Dong Li
Hello Xin,
About how to work with kylin's source code, please check kylin's dev
documents first.
http://kylin.apache.org/development/

Thanks,
Dong Li

2016-05-12 23:59 GMT+08:00 Luke Han :

> Forward to @dev list, please subscribe and check there.
> Thanks.
>
> Regards!
> Luke Han
>
>
>
>
> On Wed, May 11, 2016 at 2:21 AM -0700, "吴鑫"  wrote:
>
> Hello,
>
> I would like to compile the Kylin source code myself, and would like to know
> how to do it and what to watch out for.
>


Re: Many2many relationships

2016-05-15 Thread Li Yang
Maybe give an example?

On Fri, May 13, 2016 at 11:10 PM, Sarnath K  wrote:

> Hi,
>
> Can some1 point me how Kylin handles many2many relationships that
> frequently occurs in warehouses? Is there Anything special need to be done?
> Especially the queries using bridge tablez and their modeling in Kylin, and
> query optimizations
>
> Best,
> Sarnath
>
> Thanks!
>


Re: A error at cube build. @ #3 Step Name: Build Dimension Dictionary Duration: 0.03 mins

2016-05-15 Thread Li Yang
What's the kylin version?

On Fri, May 13, 2016 at 4:30 PM, 陈佛林  wrote:

> java.lang.NegativeArraySizeException
>     at org.apache.kylin.dict.TrieDictionary.getValueFromIdImpl(TrieDictionary.java:266)
>     at org.apache.kylin.common.util.Dictionary.getValueFromId(Dictionary.java:111)
>     at org.apache.kylin.dict.lookup.SnapshotTable$1.getRow(SnapshotTable.java:126)
>     at org.apache.kylin.dict.lookup.LookupTable.init(LookupTable.java:65)
>     at org.apache.kylin.dict.lookup.LookupTable.<init>(LookupTable.java:53)
>     at org.apache.kylin.dict.lookup.LookupStringTable.<init>(LookupStringTable.java:32)
>     at org.apache.kylin.cube.CubeManager.getLookupTable(CubeManager.java:487)
>     at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:62)
>     at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:41)
>     at org.apache.kylin.job.hadoop.dict.CreateDictionaryJob.run(CreateDictionaryJob.java:52)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>     at org.apache.kylin.job.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:62)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:51)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:130)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>


Re: My cube build successfully,but when query return empty data set

2016-05-15 Thread Li Yang
Kylin is meant for OLAP; it does not keep raw records. Try "select
count(*) from ... "

On Tue, May 10, 2016 at 5:26 PM, zhangrongkun <563364...@qq.com> wrote:

> My HBase table is't empty:
> <
> http://apache-kylin.74782.x6.nabble.com/file/n4484/QQ%E6%88%AA%E5%9B%BE20160510173749.png
> >
>
> but when query :select * from changcy,got empty:
>
> <
> http://apache-kylin.74782.x6.nabble.com/file/n4484/QQ%E6%88%AA%E5%9B%BE20160510173448.png
> >
>
> --
> View this message in context:
> http://apache-kylin.74782.x6.nabble.com/My-cube-build-successfully-but-when-query-return-empty-data-set-tp4484.html
> Sent from the Apache Kylin mailing list archive at Nabble.com.
>


Re: killed by admin

2016-05-15 Thread Li Yang
At least the hadoop job history server should have some traces.

On Tue, May 10, 2016 at 11:13 AM, 耳东 <775620...@qq.com> wrote:

> Hi all:
>
>  When I build the cube, in the second step 'Extract Fact Table
> Distinct Columns', the log shows 'killed by admin'.
> And I could not find any error log.


Re: kylin query failed

2016-05-15 Thread Li Yang
How high is the cardinality? By saying "failed" and then have to restart,
do you mean crash? Any clue in kylin.log?

Sorry for many questions, but it's hard to help without the details. The
new 1.5.2 release will come with a diagnosis tool that can extract
necessary info into a zip, which you can share with community to diagnose.

On Mon, May 9, 2016 at 5:12 PM, jyzheng  wrote:

>
>
> When I send sql query to Kylin like this:
>
>
>
> select app_name, count(distinct uid) as uv
> from sdk_log
> left join week_calendar on sdk_log.day_time = week_calendar.week_cal
> where sdk_log.day_time = date '2016-05-01'
> group by app_name
> order by uv desc
> limit 200
>
>
>
>
>
> and failed. So I have to restart Kylin.
>
>
>
> The dimension `app_name` is a great high cardinality. If I switch to
> `app_tag` dimension , it will return the right result. So I can’t get the
> top measure dimension in Kylin? If can, and what can I do to solve this?
>
> 
>
> 郑江雨  云平台
>
> Phone: 15155195496
>
>
>


Re: Is TOP_N measure support multiple dimensions(columns) in one group?

2016-05-15 Thread ShaoFeng Shi
Multiple group-by columns in one Top-N aren't supported. You can open a Kylin
JIRA, and we will evaluate it.

2016-05-14 15:12 GMT+08:00 lancelot chen :

> I'm using Kylin 1.5.1. I found that the current TOP_N UI supports only one
> column in the group by.
> For this case:
> table name : "tbl"
> columns:  (dim_city, dim_industry, keyword, pv)
>
> Currently I can build "top100 pv" in (dim_city), (dim_industry). But I also
> want to build "top100 pv" in (dim_city, dim_industry) and "top100 pv of
> query" in (dim_city), (dim_industry) and (dim_city, dim_industry). Does Kylin
> 1.5.1 support those two use cases?
>
> Best Regards!
>



-- 
Best regards,

Shaofeng Shi