[jira] [Created] (KYLIN-3896) Implement IFlinkOutput based on HBase

2019-03-19 Thread vinoyang (JIRA)
vinoyang created KYLIN-3896:
---

 Summary: Implement IFlinkOutput based on HBase
 Key: KYLIN-3896
 URL: https://issues.apache.org/jira/browse/KYLIN-3896
 Project: Kylin
  Issue Type: Sub-task
  Components: Flink Engine
Reporter: vinoyang
Assignee: vinoyang






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: question related to the aggregation groups configuration

2019-03-19 Thread yuzhang
Hi Kang-sen,
I ran some tests for Q1. With {D1 to D10} included in an aggregation group
and {D1 to D9} set as mandatory dimensions, Kylin generated only the cuboid
{D1 to D10} (the base cuboid), whereas I expected both {D1 to D10} and
{D1 to D9}. With {D1 to D8} as the mandatory dimensions, Kylin generated the
cuboids {D1 to D10}, {D1 to D8, D9} and {D1 to D8, D10}, whereas I expected
those three plus {D1 to D8}. So for your Q1, I think the answer is that ONLY
ONE cuboid, {D1 to D10}, is generated. But according to the blog ("if a
dimension is specified as “mandatory”, then all of the combinations without
such dimension can be pruned"), the cuboid {D1 to D9} shouldn't have been
pruned. Maybe someone else can give more detail.
Q2 is similar to this email
https://lists.apache.org/thread.html/3ccc8d7f98748d7c590c01c7da6ce666a16c4fe2b34be070940cae8f@%3Cuser.kylin.apache.org%3E
and the JIRA https://issues.apache.org/jira/browse/KYLIN-2149 . Kylin currently
rejects a configuration in which hierarchy, mandatory and joint dimensions
overlap. Although the ideas behind the three aggregation rules are different
and even contradictory, automatically merging those rules into cuboids is
feasible. For now, this restriction on aggregation groups means that your
requirement, which I think is a common one, can't be met. Maybe KYLIN-2149
will be resolved in the future.
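
The Q1 observations above can be modelled with a small sketch. This is not Kylin's implementation: `enumerate_cuboids` and its `mandatory_only_valid` flag are hypothetical names, and the rule that a cuboid made up only of mandatory dimensions is dropped is an assumption inferred from the test results reported in this mail.

```python
from itertools import combinations


def enumerate_cuboids(includes, mandatory, mandatory_only_valid=False):
    """Enumerate the cuboids of one aggregation group with a mandatory rule.

    Hypothetical model, not Kylin code. Every cuboid contains all mandatory
    dimensions plus some subset of the remaining (optional) dimensions.
    When mandatory_only_valid is False, the cuboid made up of the mandatory
    dimensions alone is dropped (an assumption matching the tests above),
    unless it is the full dimension set.
    """
    mandatory = frozenset(mandatory)
    optional = [d for d in includes if d not in mandatory]
    cuboids = []
    for size in range(len(optional) + 1):
        for extra in combinations(optional, size):
            if size == 0 and optional and not mandatory_only_valid:
                continue  # the mandatory-only cuboid is pruned in this model
            cuboids.append(mandatory | frozenset(extra))
    return cuboids


dims = [f"D{i}" for i in range(1, 11)]

# Mandatory = D1..D9: only the base cuboid {D1..D10} remains.
case_a = enumerate_cuboids(dims, dims[:9])
print(len(case_a))  # 1

# Mandatory = D1..D8: {D1..D8, D9}, {D1..D8, D10} and {D1..D10} remain.
case_b = enumerate_cuboids(dims, dims[:8])
print(len(case_b))  # 3
```

Passing `mandatory_only_valid=True` gives two and four cuboids respectively, which are the counts I expected in the tests.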


   Best regards
yuzhang




On 3/19/2019 23:09,Lu, Kang-Sen wrote:

Hi, Yuzhang:

I would appreciate it if you could answer my two questions.

Thanks.

Kang-sen

 

From: Lu, Kang-Sen 
Sent: Friday, March 15, 2019 8:15 AM
To:u...@kylin.apache.org
Subject: RE: question related to the aggregation groups configuration

 


Hi, Yuzhang:

Thanks for taking the time to reply.

I had actually read that article several times before.

However, maybe I missed some details: I am not clear about how those rules
actually work and how they interact with each other.

In the article you pointed out, the hierarchy rule does have an example, so
it is less likely to cause confusion.

I did not find any discussion of the “mandatory rule”. It is supposed to be
very simple, but I am stuck on the details. Let’s say “includes” is the set of
dimensions {D1, D2, …, D10}, and “mandatory” is the set of dimensions
{D1, …, D9}.

So it is obvious that every cuboid generated from this aggregation group must
include the dimensions {D1, …, D9}.

Now, D10 could be either selected or not. So the natural guess is that this
aggregation group will generate two cuboids, i.e. {D1, …, D9} and {D1, …, D10}.
Is this what Kylin will do?

Another detail I am not clear about is the interaction between the “joint rule”
and the “mandatory rule”. It seems that these two rules interact. I am not
clear why, and it is not discussed in the article you mentioned.

Those were my two original questions.

Thanks again.

Kang-sen

 

From: yuzhang 
Sent: Friday, March 15, 2019 7:46 AM
To:u...@kylin.apache.org
Subject: Re: question related to the aggregation groups configuration

 


Hi Kang-sen,

Here is a blog post about the ideas behind aggregation groups. I hope it helps.

https://kylin.apache.org/blog/2016/02/18/new-aggregation-group/

Best regards

yuzhang

 


On 3/14/2019 21:21,Lu, Kang-Sen wrote:

I am running Kylin 2.5.1.

I have two questions related to the aggregation group configuration. In the
Kylin GUI, if you select “Model”, edit a cube design and, under “Grid”, select
“Advanced Setting”, you can enter multiple “Aggregation Groups”. Each
“Aggregation Group” can specify zero, one, or many cuboids, as combinations of
dimensions.

Q1: If I want one and only one cuboid to be created, with the dimension set
{D1, D2, …, D10}, is it correct to enter D1-to-D10 in the “includes” list and
D1-to-D9 in the “Mandatory Dimensions” list? The key question is: will Kylin
generate two cuboids, i.e. {D1, …, D9} and {D1, …, D10}, or just one cuboid?

Q2: If I enter D1-to-D10 in the “includes” list and {D1, D2} in the “Joint
Dimensions” list, does that mean I can’t enter either D1 or D2 in the
“Mandatory Dimensions” list? I was thinking that if I entered {D1, D3, …, D9}
in “Mandatory Dimensions”, with {D1, D2} in “Joint Dimensions”, then only one
cuboid should be generated, for {D1, D2, …, D10}. Why is that not allowed?

Maybe the docs describe this, but it is not clear to me exactly how Kylin
processes the information entered in “includes”, “Mandatory Dimensions” and
“Joint Dimensions”. Can someone either point me to some documentation or
answer the questions above?

Thanks.

 

[jira] [Created] (KYLIN-3895) Failed to register new MBean when "kylin.server.query-metrics-enabled" set true

2019-03-19 Thread Guangxu Cheng (JIRA)
Guangxu Cheng created KYLIN-3895:


 Summary: Failed to register new MBean when 
"kylin.server.query-metrics-enabled" set true 
 Key: KYLIN-3895
 URL: https://issues.apache.org/jira/browse/KYLIN-3895
 Project: Kylin
  Issue Type: Bug
Reporter: Guangxu Cheng
Assignee: Guangxu Cheng


{code}
2019-03-20 10:17:25,753 WARN  [Query 46cd99cc-8eb2-8370-d24c-6c10f18da9e0-54] util.MBeans:94 : Error creating MBean object name: Hadoop:service=Kylin,name=KYLIN_SYSTEM,sub=CUBE[name=KYLIN_HIVE_METRICS_JOB_QA]
org.apache.hadoop.metrics2.MetricsException: javax.management.MalformedObjectNameException: Invalid character '=' in value part of property
 at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newObjectName(DefaultMetricsSystem.java:122)
 at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newMBeanName(DefaultMetricsSystem.java:102)
 at org.apache.hadoop.metrics2.util.MBeans.getMBeanName(MBeans.java:92)
 at org.apache.hadoop.metrics2.util.MBeans.register(MBeans.java:55)
{code}

The subname can't contain '='
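
One possible way to avoid the error is to sanitize the sub name before it is used as an ObjectName value. The sketch below is written in Python for brevity and is not the Kylin or Hadoop fix: `sanitize_jmx_value` is a hypothetical helper, and replacing the offending characters with an underscore is an assumption about what would be acceptable here.

```python
import re

# Characters that may not appear in an unquoted JMX ObjectName value
# (comma, equals, colon, quote), plus the pattern wildcards '*' and '?'.
_JMX_SPECIAL = re.compile(r'[,=:"*?]')


def sanitize_jmx_value(value, replacement="_"):
    """Replace characters that are special in an unquoted ObjectName value.

    Hypothetical helper for illustration; not part of Kylin or Hadoop.
    """
    return _JMX_SPECIAL.sub(replacement, value)


sub_name = "CUBE[name=KYLIN_HIVE_METRICS_JOB_QA]"
print(sanitize_jmx_value(sub_name))  # CUBE[name_KYLIN_HIVE_METRICS_JOB_QA]
```

On the Java side, `javax.management.ObjectName.quote` is an alternative that keeps the original characters by quoting the value instead of replacing them.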





[jira] [Created] (KYLIN-3894) Build buildSupportsSnappy Error When Doing Integration Testing

2019-03-19 Thread Yanwen Lin (JIRA)
Yanwen Lin created KYLIN-3894:
-

 Summary: Build buildSupportsSnappy Error When Doing Integration 
Testing
 Key: KYLIN-3894
 URL: https://issues.apache.org/jira/browse/KYLIN-3894
 Project: Kylin
  Issue Type: Test
  Components: Tools, Build and Test
Affects Versions: v2.6.0
 Environment: Hortonworks HDP 3.0.1.0-187 Docker container.
Reporter: Yanwen Lin


Hi all,
I am currently running the integration tests and hit the following error. 
Could you please share some suggestions on this?
Maven install (skipping tests) and maven test have both passed.
 
*1. Command*:
mvn verify -fae -Dhdp.version=3.0.1.0-187 -P sandbox

 
*2. Error message from Yarn Container Attempt:*
{noformat}
2019-03-18 16:43:25,583 INFO [main] org.apache.kylin.engine.mr.KylinMapper: Accepting Mapper Key with ordinal: 1
2019-03-18 16:43:25,583 INFO [main] org.apache.kylin.engine.mr.KylinMapper: Do map, available memory: 322m
2019-03-18 16:43:25,596 INFO [main] org.apache.kylin.common.KylinConfig: Creating new manager instance of class org.apache.kylin.cube.cuboid.CuboidManager
2019-03-18 16:43:25,599 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output Committer Algorithm version is 1
2019-03-18 16:43:25,599 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2019-03-18 16:43:25,795 ERROR [main] org.apache.kylin.engine.mr.KylinMapper:
java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
 at org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy(Native Method)
 at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:63)
 at org.apache.hadoop.io.compress.SnappyCodec.getCompressorType(SnappyCodec.java:136)
 at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:150)
 at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:168)
 at org.apache.hadoop.io.SequenceFile$Writer.init(SequenceFile.java:1304)
 at org.apache.hadoop.io.SequenceFile$Writer.<init>(SequenceFile.java:1192)
 at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.<init>(SequenceFile.java:1552)
 at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:289)
 at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:542)
 at org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat.getSequenceWriter(SequenceFileOutputFormat.java:64)
 at org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat.getRecordWriter(SequenceFileOutputFormat.java:75)
 at org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat$LazyRecordWriter.write(LazyOutputFormat.java:113)
 at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.write(MultipleOutputs.java:468)
 at org.apache.kylin.engine.mr.steps.FilterRecommendCuboidDataMapper.doMap(FilterRecommendCuboidDataMapper.java:85)
 at org.apache.kylin.engine.mr.steps.FilterRecommendCuboidDataMapper.doMap(FilterRecommendCuboidDataMapper.java:44)
 at org.apache.kylin.engine.mr.KylinMapper.map(KylinMapper.java:77)
 at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
2019-03-18 16:43:25,797 INFO [main] org.apache.kylin.engine.mr.KylinMapper: Do cleanup, available memory: 318m
2019-03-18 16:43:25,813 INFO [main] org.apache.kylin.engine.mr.KylinMapper: Total rows: 1
2019-03-18 16:43:25,813 ERROR [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
 at org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy(Native Method)
 at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:63)
 at org.apache.hadoop.io.compress.SnappyCodec.getCompressorType(SnappyCodec.java:136)
 at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:150)
 at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:168)
 at org.apache.hadoop.io.SequenceFile$Writer.init(SequenceFile.java:1304)
 at org.apache.hadoop.io.SequenceFile$Writer.<init>(SequenceFile.java:1192)
 at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.<init>(SequenceFile.java:1552)
 at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:289)
 at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:542)
 at

[jira] [Created] (KYLIN-3893) Cube build failed for wrong row key column description

2019-03-19 Thread Liu Shaohui (JIRA)
Liu Shaohui created KYLIN-3893:
--

 Summary: Cube build failed for wrong row key column description
 Key: KYLIN-3893
 URL: https://issues.apache.org/jira/browse/KYLIN-3893
 Project: Kylin
  Issue Type: Bug
Reporter: Liu Shaohui


A user created an invalid RowKeyColDesc, e.g.
RowKeyColDesc{column=MYSQL_FEEDBACK_USER_AUDIT.DATE, 
encoding=integer:undefined}
which causes the cube build to run forever.

 
{code:java}
org.apache.kylin.engine.mr.exception.HadoopShellException: java.lang.NumberFormatException: For input string: "undefined"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:580)
    at java.lang.Integer.parseInt(Integer.java:615)
    at org.apache.kylin.dimension.IntegerDimEnc$Factory.createDimensionEncoding(IntegerDimEnc.java:65)
    at org.apache.kylin.dimension.DimensionEncodingFactory.create(DimensionEncodingFactory.java:65)
    at org.apache.kylin.cube.kv.CubeDimEncMap.get(CubeDimEncMap.java:74)
    at org.apache.kylin.engine.mr.common.CubeStatsReader.getCuboidSizeMapFromRowCount(CubeStatsReader.java:206)
    at org.apache.kylin.engine.mr.common.CubeStatsReader.getCuboidSizeMap(CubeStatsReader.java:170)
    at org.apache.kylin.storage.hbase.steps.CreateHTableJob.run(CreateHTableJob.java:102)
    at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:92)
    at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:178)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:178)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
result code:2
    at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:73)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:178)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:178)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
{code}
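
A check when the cube description is saved could reject such an encoding before any build job starts. The Python sketch below only illustrates the idea: `validate_integer_encoding` is a hypothetical helper, not a Kylin API, and the accepted length range of 1 to 8 bytes is an assumption.

```python
def validate_integer_encoding(encoding):
    """Return the byte length of an 'integer:<len>' encoding or raise ValueError.

    Hypothetical helper, not part of Kylin. The 1..8 range is an assumption.
    """
    name, sep, arg = encoding.partition(":")
    if name != "integer" or not sep:
        raise ValueError(f"not an integer encoding: {encoding!r}")
    if not arg.isdigit():
        raise ValueError(f"encoding length must be a number, got {arg!r}")
    length = int(arg)
    if not 1 <= length <= 8:
        raise ValueError(f"encoding length out of range: {length}")
    return length


print(validate_integer_encoding("integer:4"))  # 4

try:
    validate_integer_encoding("integer:undefined")
except ValueError as err:
    print(err)  # encoding length must be a number, got 'undefined'
```

Failing at this point would surface the bad value to the user immediately instead of as a NumberFormatException deep inside the build job.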





[Discussion]Does 'UNION ALL' support query on two fact table ?

2019-03-19 Thread yuzhang
Hi all,
A simple question, as the mail subject says: does 'UNION ALL' support a query
on two fact tables?


Best regards
yuzhang



Re:Re: kylin top-n query

2019-03-19 Thread 黄云尧
Thanks a lot.
From: JiaTao Tao
Sent: 2019-03-19 10:11:37
To: dev
Subject: Re: kylin top-n query
>And this may also help:
>http://kylin.apache.org/docs/tutorial/create_cube.html (go to the "TOP_N"
>Section)
>
>
>-- 
>
>
>Regards!
>
>Aron Tao
>
>黄云尧 wrote on Mon, Mar 18, 2019 at 12:06 PM:
>
>> Does anyone have documentation for top-n queries in Kylin?
>>
>>
>>
>>




Re: Hbase table is always empty when build with spark

2019-03-19 Thread ShaoFeng Shi
Hi Alex,

Could you please file a JIRA for Kylin, or send a pull request if you
already have a hot-fix? Thank you!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




mailpig wrote on Mon, Feb 25, 2019 at 5:18 PM:

> Sure, the Hive table is not empty, and the HFile output directory also has
> data.
>
> 
>
>
> After setting mapreduce.job.outputformat.class in the job config, loading
> the HFiles into HBase succeeds.
> Besides that, I found that the source code had the above config in the
> first commit:
> ..
> HTable table = new HTable(hbaseConf,
>         cubeSegment.getStorageLocationIdentifier());
> try {
>     HFileOutputFormat2.configureIncrementalLoadMap(job, table);
> } catch (IOException ioe) {
>     // this can be ignored.
>     logger.debug(ioe.getMessage(), ioe);
> }
> ...
> But after commit 76c9c960be542c919301c72b34c7ae5ce6f1ec1c, the above
> config was deleted, and I don't know why. Please check.
>
> --
> Sent from: http://apache-kylin.74782.x6.nabble.com/
>


Kylin Support Spark 1.6 version when start supporting spark 2.6

2019-03-19 Thread rsanad...@gmail.com
Hi all,

I wanted to check: Kylin currently supports Spark 1.6, which is an older
version, and the latest Spark 2.6 is available. When will Kylin start
supporting Spark 2.6?

Thanks,
Rahul  



Re: 答复: How kylin store data in Hbase ?

2019-03-19 Thread rsanad...@gmail.com
Hi Shaofeng Shi,

Thank you so much!

Smiles:)
Rahul S



[jira] [Created] (KYLIN-3892) Set cubing job priority

2019-03-19 Thread Temple Zhou (JIRA)
Temple Zhou created KYLIN-3892:
--

 Summary: Set cubing job priority
 Key: KYLIN-3892
 URL: https://issues.apache.org/jira/browse/KYLIN-3892
 Project: Kylin
  Issue Type: New Feature
  Components: Job Engine
Affects Versions: v2.6.0, v2.5.0, v2.4.0
Reporter: Temple Zhou
Assignee: Temple Zhou


A high-priority cubing job is delayed when too many tasks are running.

So I want to be able to set the job priority for important cubing jobs.


