Re: [Draft][REPORT] Apache Kylin - May 2017

2017-05-06 Thread Luke Han
## Description:
Apache Kylin is an open source Distributed Analytics Engine designed
to provide SQL interface and multi-dimensional analysis (OLAP) on
Hadoop supporting extremely large datasets.


## Issues:
- there are no issues requiring board attention at this time

## Activity:
- Nominated 3 committers
- Apache Kylin 2.0 was released on 2017-04-30 after a long
development cycle; upcoming development will focus on
migration issues and bug fixes reported by the community
and found in testing
- Yang Li presented Apache Kylin 2.0 features
at Strata Hadoop World San Jose on 2017-03-16
- Luke Han presented Apache Kylin open source
at OSCAR Beijing on 2017-04-19
- Apache Kylin Meetup @Toutiao in Beijing
hosted on 2017-04-29 with 200 attendees onsite
and 200 online
- Chaozhong Yang presented the Apache Kylin use case
at Toutiao at the above Meetup
- Dong Li presented Apache Kylin
at OSC China Fujian on 2017-02-25
- Dong Li presented Apache Kylin
at OSC China Xiamen on 2017-02-26
- Dong Li presented Apache Kylin
at OSC China Wuhan on 2017-04-15
- Dong Li presented Apache Kylin
at OSC China Changsha on 2017-04-16
- Hongbin Ma presented Apache Kylin
at NJSDGlobal Nanjing on 2017-04-21
- Chen Wang presented Apache Kylin
at DBAPlus Meetup China Shanghai on 2017-04-08
- Roger Shi presented Apache Kylin
at SQL on Hadoop Meetup Shanghai on 2017-04-29

## PMC changes:

- Currently 18 PMC members.
- No new PMC members added in the last 3 months
- Last PMC addition was Dong Li on Mon Apr 11 2016

## Committer base changes:

- Currently 28 committers.
- New committers:
- Alberto Ramón was added as a committer on Thu Apr 27 2017
- Zhixiong Chen was added as a committer on Thu Apr 27 2017
- Roger Shi was added as a committer on Thu Apr 27 2017

## Releases:

- 2.0.0 was released on Sat Apr 29 2017

## Mailing list activity:

- dev@kylin.apache.org:
- 334 subscribers (up 23 in the last 3 months):
- 809 emails sent to list (900 in previous quarter)

- iss...@kylin.apache.org:
- 61 subscribers (up 9 in the last 3 months):
- 923 emails sent to list (1525 in previous quarter)

- u...@kylin.apache.org:
- 270 subscribers (up 52 in the last 3 months):
- 239 emails sent to list (320 in previous quarter)

## JIRA activity:

- 164 JIRA tickets created in the last 3 months
- 151 JIRA tickets closed/resolved in the last 3 months


Best Regards!
-

Luke Han

On Thu, May 4, 2017 at 3:54 AM, 康凯森  wrote:

> +1 Looks good to me.
>
>
> Kaisen Kang
> -- Original Message --
> From: "Luke Han";
> Sent: Thursday, May 4, 2017, 1:25 PM
> To: "dev"; "Apache Kylin PMC"<
> priv...@kylin.apache.org>;
>
> Subject: [Draft][REPORT] Apache Kylin - May 2017
>
>
>
> Dear community,
>  I have drafted the board report below for review; please check it and
> let me know if there are any issues.
>  Feel free to reply here if there are more activities, conferences,
> meetups, or other events, as well as community development items, that
> should be included in this report.
>
>  Will submit this report to board later.
>
>  Thanks.
>
> Luke
>
>
> ## Description:
> Apache Kylin is an open source Distributed Analytics Engine designed
> to provide SQL interface and multi-dimensional analysis (OLAP) on
> Hadoop supporting extremely large datasets.
>
>
> ## Issues:
> - there are no issues requiring board attention at this time
>
> ## Activity:
> - Yang Li presented Apache Kylin 2.0 features
> at Strata Hadoop World San Jose on 2017-03-16
> - Luke Han presented Apache Kylin open source
> at OSCAR Beijing on 2017-04-19
> - Apache Kylin Meetup @Toutiao in Beijing
> hosted on 2017-04-29 with 200 attendees onsite
> and 200 online
> - Chaozhong Yang presented Apache Kylin use case
> in Toutiao at above Meetup
> at OSCAR Beijing on 2017-04-19
> - Dong Li presented Apache Kylin
> at OSC China Fujian on 2017-02-25
> - Dong Li presented Apache Kylin
> at OSC China Xiamen on 2017-02-26
> - Dong Li presented Apache Kylin
> at OSC China Wuhan on 2017-04-15
> - Dong Li presented Apache Kylin
> at OSC China Changsha on 2017-04-16
> - Hongbin Ma presented Apache Kylin
> at NJSDGlobal Nanjing on 2017-04-21
> - Chen Wang presented Apache Kylin
> at DBAPlus Meetup China Shanghai on 2017-04-08
> - Roger Shi presented Apache Kylin
> at SQL on Hadoop Meetup Shanghai on 2017-04-29
>
> ## PMC changes:
>
> - Currently 18 PMC members.
> - No new PMC members added in the last 3 months
> - Last PMC addition was Dong Li on Mon Apr 11 2016
>
> ## Committer base changes:
>
> - Currently 28 committers.
> - New committers:
> - Alberto Ramón was added as a committer on Thu Apr 27 2017
> - Zhixiong Chen was added as a committer on Thu Apr 27 2017
> - Roger Shi was added as a committer on Thu Apr 27 2017
>
> ## Releases:
>
> - 2.0.0 was released on Sat Apr 29 2017
>
> ## Mailing list activity:
>
> - dev@kylin.apache.org:
> - 334 subscribers (up 23 in the last 3 months):

Re: Fail to extract disctinct columns from fact tables

2017-05-06 Thread ShaoFeng Shi
Hi Hong Wei,

That's cool, thanks for the update and the detailed summary!

For issue 2, the workaround modifies source code, which requires a recompile.
Can it be achieved by modifying some of FI's configuration instead?

2017-05-04 22:34 GMT+08:00 Hong Wei :

> Thanks for your help. I have solved these problems as follows:
> 1. Because of FusionInsight's strict security control, adding the
> '--hiveconf' parameter to modify any Hive configuration at runtime is not
> allowed. After the administrator added the value 'mapred.job.*|dfs.*' to
> 'hive.security.authorization.sqlstd.confwhitelist.append', which is set on
> the server side, and I removed '--hiveconf' from the beeline parameters in
> conf/kylin.properties, the error 'java.lang.IllegalArgumentException:
> Cannot modify hive.security.authorization.sqlstd.confwhitelist.append at
> runtime. It is not in list of params that are allowed to be modified at
> runtime (state=,code=0)' is no longer reported.
>
>
> 2. I modified the method
> 'org.apache.kylin.source.hive.HiveMRInput.configureJob' to add the
> statement 'HiveConf.setLoadMetastoreConfig(true);' before
> 'HCatInputFormat.setInput(job, dbName, tableName);', which resolves the
> following errors.
> > > WARN hive.metastore:574 : set_ugi() not successful, Likely cause: new client talking to old server. Continuing without it.
> > > org.apache.thrift.transport.TTransportException
> > >    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
> > >    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
> > >
> > >    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_set_ugi(ThriftHiveMetastore.java:3794)
> > >    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.set_ugi(ThriftHiveMetastore.java:3780)
> > >    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient:566)
> > >    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.reconnect(HiveMetaStoreClient:342)
> > >    ...
> > >    at org.apache.hive.hcatalog.common.HCatUtil.getTable(HCatUtil.java:180)
> > >    ...
> > >    at org.apache.kylin.source.hive.HiveMRInputFormat.configureJob(HiveMRInput.java:105)
> > >    at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:119)
> > >    at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:103)
> > >    ..
> > > INFO hive.metastore:602 : Connect to metastore.
> > > INFO common.AbstractHadoopJob:506 : tempMetaFileString is : null
> > > ERROR common.MapReduceExecutable:127 : error execute MapReduceExecutable{id=..,name=Extract Fact Table Distinct Columns, state=RUNNING}
>
>
>
>
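The whitelist value from item 1 can be checked against parameter names with a few lines of Java. This is only a sketch: it assumes the appended value is treated as a Java regular expression that must match the whole parameter name, and it looks only at the appended part, not at any default whitelist the server may already have. The class name `WhitelistCheck` and the sample parameter names are made up for illustration.

```java
import java.util.regex.Pattern;

public class WhitelistCheck {
    // Value the administrator appended, copied from the message above.
    static final Pattern APPENDED = Pattern.compile("mapred.job.*|dfs.*");

    // Assumption: a parameter is accepted only if its whole name matches.
    static boolean isAllowed(String paramName) {
        return APPENDED.matcher(paramName).matches();
    }

    public static void main(String[] args) {
        // Sample names chosen for illustration only.
        String[] names = {
            "mapred.job.name",
            "dfs.replication",
            "mapreduce.job.reduces",
            "hive.security.authorization.sqlstd.confwhitelist.append"
        };
        for (String name : names) {
            System.out.println(name + " -> " + (isAllowed(name) ? "allowed" : "rejected"));
        }
    }
}
```

Under that assumption, names starting with `mapreduce.` are not covered by `mapred.job.*`, so a broader pattern may be needed if other parameters are rejected later.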
> -- Original Message --
> From: "ShaoFeng Shi";
> Sent: Friday, April 21, 2017, 4:41 PM
> To: "dev";
>
> Subject: Re: Fail to extract disctinct columns from fact tables
>
>
>
> Check this: java.lang.IllegalArgumentException: Cannot modify
> hive.security.authorization.sqlstd.confwhitelist.append at runtime. It is
> not in list of params that are allowed to be modified at runtime (state=,
> code=0)
>
> FusionInsight has very strict security control, which doesn't allow a client
> to overwrite Hive/MapReduce/HBase configuration values, while Kylin
> customizes many configurations for its performance and functionality.
>
> To overcome this error, you need to contact your FI administrator and ask
> them to add the parameter to FI's whitelist. You can do this in batch for
> all parameters in KYLIN_HOME/conf/*.xml, but there might be other
> parameters set by the application.
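One way to prepare such a batch request is to collect the property names from the XML files and hand the list to the administrator. The sketch below assumes the files under KYLIN_HOME/conf use the Hadoop-style `<configuration><property><name>` layout; the class name `ConfNames` and the choice to join the names with `|` and escape the dots are illustrative, not something the thread prescribes.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class ConfNames {
    // Collects the text of every <name> element in one configuration file.
    static Set<String> propertyNames(InputStream xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(xml);
            NodeList names = doc.getElementsByTagName("name");
            Set<String> result = new TreeSet<>();
            for (int i = 0; i < names.getLength(); i++) {
                result.add(names.item(i).getTextContent().trim());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration XML", e);
        }
    }

    // Joins names with '|' and escapes dots so each name matches literally.
    static String toWhitelistPattern(Set<String> names) {
        StringBuilder sb = new StringBuilder();
        for (String name : names) {
            if (sb.length() > 0) {
                sb.append('|');
            }
            sb.append(name.replace(".", "\\."));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // Pass the XML file paths as arguments, e.g. the files under KYLIN_HOME/conf.
        Set<String> all = new TreeSet<>();
        for (String path : args) {
            try (InputStream in = Files.newInputStream(Paths.get(path))) {
                all.addAll(propertyNames(in));
            }
        }
        System.out.println(toWhitelistPattern(all));
    }
}
```

The printed string is only a starting point; as noted above, the application may set further parameters that do not appear in these files.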
>
> 2017-04-21 14:19 GMT+08:00 Hong Wei :
>
> > I do confirm kylin has got the right hive-site.xml on classpath.
> > Additionally, when I add
> > --hiveconf hive.security.authorization.sqlstd.confwhitelist.append='
> > mapred.job.*|dfs.*'
> > into the beeline params configured in conf/kylin.properties, an error
> > is reported while starting Kylin:
> >
> >
> > Error: Fail to open new session: org.apache.hive.service.cli.HiveSQLException:
> > java.lang.IllegalArgumentException: Cannot modify
> > hive.security.authorization.sqlstd.confwhitelist.append at runtime. It is
> > not in list of params that are allowed to be modified at runtime (state=,
> > code=0)
> >
> >
> >
> >
> > -- Original Message --
> > From: "Li Yang";
> > Sent: Friday, April 21, 2017, 1:02 PM
> > To: "dev";
> >
> > Subject: Re: Fail to extract disctinct columns from fact tables
> >
> >
> >
> > I recall that checking whether Kylin has the right hive-site.xml on its
> > classpath may fix this problem.
> >
> > On Thu, Apr 20, 2017 at 8:13 PM, Hong Wei  wrote:
> >
> > > Dear Kylin:
> > > I try to run Kylin 

Re: kylin sso

2017-05-06 Thread ShaoFeng Shi
Hi tang,

Kylin's SSO is implemented with the Spring Security SAML extension (
http://projects.spring.io/spring-security-saml/). If CAS supports SAML, it
should be easy; otherwise it needs additional development. You can evaluate
this and develop it according to your needs.

2017-05-03 9:31 GMT+08:00 :

> Hello
> Boy!
> I want to know how to configure Kylin SSO based on a CAS implementation.
> Could you help me?
>



-- 
Best regards,

Shaofeng Shi 史少锋


[jira] [Created] (KYLIN-2590) Cube build error on Step 7 Build Base Cuboid Data

2017-05-06 Thread lufeng (JIRA)
lufeng created KYLIN-2590:
-

 Summary: Cube build error on Step 7 Build Base Cuboid Data
 Key: KYLIN-2590
 URL: https://issues.apache.org/jira/browse/KYLIN-2590
 Project: Kylin
  Issue Type: Bug
Affects Versions: v1.6.0
Reporter: lufeng


The cube build failed on step 7 with the following error message. Any help, please?

Hadoop 2.7 HBase 1.1.2 Kylin 1.6.0 Tez 0.7.0

<<<
2017-05-01 01:10:38,430 INFO  [pool-9-thread-10] common.AbstractHadoopJob:499 : HDFS meta dir is: file:///opt/kylin/bin/../tomcat/temp/kylin_job_meta8686959176405233518/meta
2017-05-01 01:10:38,430 INFO  [pool-9-thread-10] common.AbstractHadoopJob:372 : Job 'tmpfiles' updated -- file:///opt/kylin/bin/../tomcat/temp/kylin_job_meta8686959176405233518/meta
2017-05-01 01:10:38,433 INFO  [pool-9-thread-10] mapred.FileInputFormat:249 : Total input paths to process : 0
2017-05-01 01:10:38,433 INFO  [pool-9-thread-10] common.AbstractHadoopJob:506 : tempMetaFileString is : file:///opt/kylin/bin/../tomcat/temp/kylin_job_meta8686959176405233518/meta
2017-05-01 01:10:38,440 ERROR [pool-9-thread-10] common.MapReduceExecutable:127 : error execute MapReduceExecutable{id=37f887a1-1f2a-40d0-a4dc-9eca7be28ab7-06, name=Build Base Cuboid Data, state=RUNNING}
java.lang.IllegalArgumentException: Map input splits are 0 bytes, something is wrong!
    at org.apache.kylin.engine.mr.common.AbstractHadoopJob.getTotalMapInputMB(AbstractHadoopJob.java:555)
    at org.apache.kylin.engine.mr.steps.CuboidJob.setReduceTaskNum(CuboidJob.java:175)
    at org.apache.kylin.engine.mr.steps.CuboidJob.run(CuboidJob.java:138)
    at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:92)
    at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:120)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)


---
I solved this issue by changing the execution engine from Tez to MR, because the Tez
engine has a zero-file issue: https://issues.apache.org/jira/browse/HIVE-13988

I think Kylin should not be blocked by an empty segment.
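To make the suggestion concrete, here is a rough sketch of the difference between failing on zero input bytes and treating that case as an empty segment. It is not Kylin's code: the class `EmptyInputGuard` and its methods are hypothetical, and only the exception message is taken from the log above.

```java
public class EmptyInputGuard {
    // Sums the lengths of all input splits, in bytes.
    static long totalInputBytes(long[] splitLengths) {
        long total = 0;
        for (long length : splitLengths) {
            total += length;
        }
        return total;
    }

    // Hypothetical check: no input bytes means the segment is empty.
    static boolean isEmptySegment(long[] splitLengths) {
        return totalInputBytes(splitLengths) == 0;
    }

    // Strict behaviour, modelled on the error in the log: zero bytes is fatal.
    static double totalMapInputMB(long[] splitLengths) {
        long bytes = totalInputBytes(splitLengths);
        if (bytes == 0) {
            throw new IllegalArgumentException(
                    "Map input splits are 0 bytes, something is wrong!");
        }
        return bytes / 1024.0 / 1024.0;
    }

    public static void main(String[] args) {
        long[] noInput = {};
        if (isEmptySegment(noInput)) {
            // A tolerant build could stop here instead of raising an error.
            System.out.println("Empty segment: nothing to build");
        } else {
            System.out.println("Input size in MB: " + totalMapInputMB(noInput));
        }
    }
}
```

Whether skipping is the right behaviour depends on how later build steps expect to find the segment's output, which this sketch does not address.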



