Repository: kylin
Updated Branches:
  refs/heads/document 2edbb70a5 -> 0697ab49e

update howto contribute doc

Project: http://git-wip-us.apache.org/repos/asf/kylin/repo
Commit: http://git-wip-us.apache.org/repos/asf/kylin/commit/0697ab49
Tree: http://git-wip-us.apache.org/repos/asf/kylin/tree/0697ab49
Diff: http://git-wip-us.apache.org/repos/asf/kylin/diff/0697ab49

Branch: refs/heads/document
Commit: 0697ab49e450a68f66424707b9214d4f18e32652
Parents: 2edbb70
Author: shaofengshi <shaofeng...@apache.org>
Authored: Mon Feb 12 16:50:33 2018 +0800
Committer: shaofengshi <shaofeng...@apache.org>
Committed: Mon Feb 12 16:50:42 2018 +0800

----------------------------------------------------------------------
 website/_dev/dev_env.md                    | 16 +++-----
 website/_dev/howto_contribute.md           | 52 +++++++++++++++++++++++--
 website/_dev/index.md                      |  9 +++--
 website/_docs21/tutorial/cube_streaming.md | 14 +++----
 website/_docs23/tutorial/cube_streaming.md | 18 ++++-----
 5 files changed, 75 insertions(+), 34 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/kylin/blob/0697ab49/website/_dev/dev_env.md
----------------------------------------------------------------------
diff --git a/website/_dev/dev_env.md b/website/_dev/dev_env.md
index c16330a..957e870 100644
--- a/website/_dev/dev_env.md
+++ b/website/_dev/dev_env.md
@@ -11,7 +11,7 @@ By following this tutorial, you will be able to build kylin test cubes by runnin
 ## Environment on the Hadoop CLI
-Off-Hadoop-CLI installation requires you having a hadoop CLI machine (or a hadoop sandbox) as well as your local develop machine. To make things easier we strongly recommend you starting with running Kylin on a hadoop sandbox, like <http://hortonworks.com/products/hortonworks-sandbox/>. In the following tutorial we'll go with **Hortonworks Sandbox 2.2.4**. It is recommended that you provide enough memory to your sandbox, 8G or more is preferred.
+Off-Hadoop-CLI installation requires you having a hadoop CLI machine (or a hadoop sandbox) as well as your local develop machine. To make things easier we strongly recommend you starting with running Kylin on a hadoop sandbox, like <http://hortonworks.com/products/hortonworks-sandbox/>. In the following tutorial we'll go with **Hortonworks Sandbox 2.4.0.0-169**. It is recommended that you provide enough memory to your sandbox, 8G or more is preferred.
 ### Start Hadoop
@@ -44,21 +44,15 @@ ln -s /root/apache-maven-3.2.5/bin/mvn /usr/bin/mvn
 ### Install Spark
-Manually install spark-1.6.3-bin-hadoop2.6 in a local folder like /usr/local/spark
+Manually install spark-2.1.2-bin-hadoop2.7 in a local folder like /usr/local/spark
 {% highlight Groff markup %}
-wget -O /tmp/spark-1.6.3-bin-hadoop2.6.tgz http://d3kbcqa49mib13.cloudfront.net/spark-1.6.3-bin-hadoop2.6.tgz
+wget -O /tmp/spark-2.1.2-bin-hadoop2.7.tgz http://d3kbcqa49mib13.cloudfront.net/spark-2.1.2-bin-hadoop2.7.tgz
 cd /usr/local
-tar -zxvf /tmp/spark-1.6.3-bin-hadoop2.6.tgz
-ln -s spark-1.6.3-bin-hadoop2.6 spark
+tar -zxvf /tmp/spark-2.1.2-bin-hadoop2.7.tgz
+ln -s spark-2.1.2-bin-hadoop2.7 spark
 {% endhighlight %}
-Upload the spark-assembly jar to HDFS as /kylin/spark/spark-assembly-1.6.3-hadoop2.6.0.jar (avoid repeatedly uploading the jar to HDFS):
-
-{% highlight Groff markup %}
-hadoop fs -mkdir /kylin/spark/
-hadoop fs -put /usr/local/spark/lib/spark-assembly-1.6.3-hadoop2.6.0.jar /kylin/spark/
-{% endhighlight %}
 Create local temp folder for hbase client (if it doesn't exist):

http://git-wip-us.apache.org/repos/asf/kylin/blob/0697ab49/website/_dev/howto_contribute.md
----------------------------------------------------------------------
diff --git a/website/_dev/howto_contribute.md b/website/_dev/howto_contribute.md
index 4fb3505..737f8b9 100644
--- a/website/_dev/howto_contribute.md
+++ b/website/_dev/howto_contribute.md
@@ -12,11 +12,27 @@ Apache Kylin is always looking for contributions of not only code,
but also usag
 Both code and document are under Git source control. Note the purpose of different branches.
 * `master`: Main development branch for new features
-* `2.0.x`: Maintenance branch for a certain release
+* `2.[n].x`: Maintenance branch for a certain major release
 * `document`: Document branch
+## Components and owners
+Apache Kylin has several sub-components. And for each component we will arrange one or multiple component owners.
-## Pick an Open Task
+Component owners is listed in the description field on this Apache Kylin [JIRA components page](https://issues.apache.org/jira/projects/KYLIN?selectedItem=com.atlassian.jira.jira-projects-plugin:components-page). The owners are listed in the 'Description' field rather than in the 'Component Lead' field because the latter only allows us to list one individual whereas it is encouraged that components have multiple owners.
+- Component owners are volunteers who are expert in their component domain and may have an agenda on how they think their Apache Kylin component should evolve. The owner needs to be an Apache Kylin committer at this moment.
+- Owners will try and review patches that land within their component’s scope.
+- Owners can rotate, based on his aspiration.
+- When nominate or vote a new committer, the nominator needs to state which component the candidate can be the owner.
+- If you're already an Apache Kylin committer and would like to be a volunteer as a component owner, just write to the dev list and we’ll sign you up.
+- If you think the component list need be updated (add, remove, rename, etc), write to the dev list and we’ll review that.
+## Pick a task
 There are open tasks waiting to be done, tracked by JIRA. To make it easier to search, there are a few JIRA filters.
 * [A list of tasks](https://issues.apache.org/jira/issues/?filter=12339895) managed by Yang Li.
@@ -25,6 +41,16 @@ There are open tasks waiting to be done, tracked by JIRA.
To make it easier to s
 Do not forget to discuss in [mailing list](/community/index.html) before working on a big task.
+If create a new JIRA for bug or feature, remember to provide enough information for the community:
+
+* A well summary for the problem or feature
+* A detail description, which may include:
+ - the environment of this problem occurred
+ - the steps to reproduce the problem
+ - the error trace or log files (as attachment)
+ - the metadata of the model or cube
+* Related components: we will arrange reviewer based on this selection.
+* Affected version: which Kylin you're using.
 ## Making Code Changes
 * [Setup dev env](/development/dev_env.html)
@@ -32,7 +58,7 @@ Do not forget to discuss in [mailing list](/community/index.html) before working
 * Discuss with others in mailing list or issue comments, make sure the proposed changes fit in with what others are doing and have planned for the project
 * Make changes in your fork
 * No strict code style at the moment, but the general rule is keep consistent with existing files. E.g. use 4-space indent for java files.
- * Add unit test for your code change as much as possible.
+ * Add test case for your code change as much as possible.
 * Make sure "mvn clean package" and "mvn test" can get success.
 * Sufficient unit test and integration test is a mandatory part of code change.
 * [Run tests](/development/howto_test.html) to ensure your change is in good quality and does not break anything. If your patch was generated incorrectly or your code does not adhere to the code guidelines, you may be asked to redo some work.
@@ -53,6 +79,26 @@ $ ./dev-support/submit-patch.py -jid KYLIN-xxxxx -b master -srb
 * Alternatively, you can also manually generate a patch. Please use `git rebase -i` first, to combine (squash) smaller commits into a single larger one.
Then use `git format-patch` command to generate the patch, for a detail guide you can refer to [How to create and apply a patch with Git](https://ariejan.net/2009/10/26/how-to-create-and-apply-a-patch-with-git/)
+## Code Review
+The reviewer need to review the patch from the following perspectives:
+
+* _Functionality_: the patch MUST address the issue and has been verified by the contributor before submitting for review.
+* _Test coverage_: the change MUST be covered by a UT or the Integration test, otherwise it is not maintainable. Execptional case includes GUI, shell script, etc.
+* _Performance_: the change SHOULD NOT downgrade Kylin's performance.
+* _Metadata compatibility_: the change should support old metadata definition. Otherwise, a metadata migration tool and documentation is required.
+* _API compatibility_: the change SHOULD NOT break public API's functionality and behavior; If an old API need be replaced by the new one, print warning message there.
+* _Documentation_: if the Kylin document need be updated together, create another JIRA with "Document" as the component to track. In the document JIRA, attach the doc change patch which is againt the "document" branch.
+
+A patch which doesn't comply with the above rules may not get merged.
+
+## Patch +1 Policy
+
+Patches that fit within the scope of a single component require, at least, a +1 by one of the component’s owners before commit. If owners are absent - busy or otherwise - two +1s by non-owners but committers will suffice.
+
+Patches that span components need at least two +1s before they can be committed, preferably +1s by owners of components touched by the x-component patch.
+
+Any -1 on a patch by anyone vetoes a patch; it cannot be committed until the justification for the -1 is addressed.
+
 ## Apply Patch
 * Committer will review Pull Requests and Patches in JIRA regarding correctness, performance, design, coding style, test coverage

http://git-wip-us.apache.org/repos/asf/kylin/blob/0697ab49/website/_dev/index.md
----------------------------------------------------------------------
diff --git a/website/_dev/index.md b/website/_dev/index.md
index 060ed33..1fc73f6 100644
--- a/website/_dev/index.md
+++ b/website/_dev/index.md
@@ -12,12 +12,13 @@ Check out the [How to Contribute](/development/howto_contribute.html) document.
 ### Source Repository
 Apache Kylin™ source code is version controlled using Git version control:
 Commits [Summary](https://git-wip-us.apache.org/repos/asf?p=kylin.git;a=summary)
-Source Repo: [git://git.apache.org/kylin.git](git://git.apache.org/kylin.git)
+Source Repo: [https://git-wip-us.apache.org/repos/asf/kylin.git](https://git-wip-us.apache.org/repos/asf/kylin.git)
 Mirrored to Github: [https://github.com/apache/kylin](https://github.com/apache/kylin)
 ### CI and Code Analysis
 UT on master branch with JDK 1.7(deprecated): [Kylin-Master-JDK-1.7](https://builds.apache.org/job/Kylin-Master-JDK-1.7/)
 UT on master branch with JDK 1.8: [Kylin-Master-JDK-1.8](https://builds.apache.org/job/Kylin-Master-JDK-1.8/)
+Integretion test within Hadoop Sandbox (HDP 2.4) : [http://34.226.50.254:8081/](http://34.226.50.254:8081/)
 Static Code Analysis: [SonarCube dashboard](https://builds.apache.org/analysis/overview?id=org.apache.kylin%3Akylin)
 [![Build Status](https://travis-ci.org/apache/kylin.svg?branch=master)](https://travis-ci.org/apache/kylin)[![Codacy Badge](https://api.codacy.com/project/badge/Grade/74f0139786cd4e8a8ce69bb0c17c2e71)](https://www.codacy.com/app/kyligence-git/kylin?utm_source=github.com&utm_medium=referral&utm_content=apache/kylin&utm_campaign=Badge_Grade)
@@ -28,8 +29,8 @@ Track issues on the "Kylin" Project on the Apache JIRA ([browse](http://issues.a
 ### Roadmap
 - Hadoop 3.0 support (Erasure Coding)
-- Spark cubing
enhancement
+- Spark Cubing enhancement
 - Connect more sources (JDBC, SparkSQL)
-- Ad-hoc queries without cubing
-- Better storage (Kudu?)
+- Ad-hoc queries without Cubing
+- Better storage (Druid, Kudu, etc)
 - Real-time analytics with Lambda Architecture

http://git-wip-us.apache.org/repos/asf/kylin/blob/0697ab49/website/_docs21/tutorial/cube_streaming.md
----------------------------------------------------------------------
diff --git a/website/_docs21/tutorial/cube_streaming.md b/website/_docs21/tutorial/cube_streaming.md
index c4124eb..fa96db5 100644
--- a/website/_docs21/tutorial/cube_streaming.md
+++ b/website/_docs21/tutorial/cube_streaming.md
@@ -26,29 +26,29 @@ Download the Kylin v1.6 from download page, expand the tar ball in /usr/local/ f
 ## Create sample Kafka topic and populate data
-Create a sample topic "kylindemo", with 3 partitions:
+Create a sample topic "kylin_streaming_topic", with 3 partitions:
 {% highlight Groff markup %}
-bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 3 --topic kylindemo
-Created topic "kylindemo".
+bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 3 --topic kylin_streaming_topic
+Created topic "kylin_streaming_topic".
 {% endhighlight %}
 Put sample data to this topic; Kylin has an utility class which can do this;
 {% highlight Groff markup %}
 export KAFKA_HOME=/usr/local/kafka_2.10-0.10.0.0
-export KYLIN_HOME=/usr/local/apache-kylin-1.6.0-bin
+export KYLIN_HOME=/usr/local/apache-kylin-2.1.0-bin
 cd $KYLIN_HOME
-./bin/kylin.sh org.apache.kylin.source.kafka.util.KafkaSampleProducer --topic kylindemo --broker localhost:9092
+./bin/kylin.sh org.apache.kylin.source.kafka.util.KafkaSampleProducer --topic kylin_streaming_topic --broker localhost:9092
 {% endhighlight %}
 This tool will send 100 records to Kafka every second. Please keep it running during this tutorial.
You can check the sample message with kafka-console-consumer.sh now:
 {% highlight Groff markup %}
 cd $KAFKA_HOME
-bin/kafka-console-consumer.sh --zookeeper localhost:2181 --bootstrap-server localhost:9092 --topic kylindemo --from-beginning
+bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic kylin_streaming_topic --from-beginning
 {"amount":63.50375137330458,"category":"TOY","order_time":1477415932581,"device":"Other","qty":4,"user":{"id":"bf249f36-f593-4307-b156-240b3094a1c3","age":21,"gender":"Male"},"currency":"USD","country":"CHINA"}
 {"amount":22.806058795736583,"category":"ELECTRONIC","order_time":1477415932591,"device":"Andriod","qty":1,"user":{"id":"00283efe-027e-4ec1-bbed-c2bbda873f1d","age":27,"gender":"Female"},"currency":"USD","country":"INDIA"}
@@ -70,7 +70,7 @@ Notice that Kylin supports structured (or say "embedded") message from v1.6, it
 ![](/images/tutorial/1.6/Kylin-Cube-Streaming-Tutorial/2_Define_streaming_table.png)
-Click "Next". On this page, provide the Kafka cluster information; Enter "kylindemo" as "Topic" name; The cluster has 1 broker, whose host name is "sandbox", port is "9092", click "Save".
+Click "Next". On this page, provide the Kafka cluster information; Enter "kylin_streaming_topic" as "Topic" name; The cluster has 1 broker, whose host name is "sandbox", port is "9092", click "Save".
 ![](/images/tutorial/1.6/Kylin-Cube-Streaming-Tutorial/3_Kafka_setting.png)

http://git-wip-us.apache.org/repos/asf/kylin/blob/0697ab49/website/_docs23/tutorial/cube_streaming.md
----------------------------------------------------------------------
diff --git a/website/_docs23/tutorial/cube_streaming.md b/website/_docs23/tutorial/cube_streaming.md
index ef6578e..fa96db5 100644
--- a/website/_docs23/tutorial/cube_streaming.md
+++ b/website/_docs23/tutorial/cube_streaming.md
@@ -1,8 +1,8 @@
 ---
-layout: docs23
+layout: docs21
 title: Scalable Cubing from Kafka
 categories: tutorial
-permalink: /docs23/tutorial/cube_streaming.html
+permalink: /docs21/tutorial/cube_streaming.html
 ---
 Kylin v1.6 releases the scalable streaming cubing function, it leverages Hadoop to consume the data from Kafka to build the cube, you can check [this blog](/blog/2016/10/18/new-nrt-streaming/) for the high level design. This doc is a step by step tutorial, illustrating how to create and build a sample cube;
@@ -26,29 +26,29 @@ Download the Kylin v1.6 from download page, expand the tar ball in /usr/local/ f
 ## Create sample Kafka topic and populate data
-Create a sample topic "kylindemo", with 3 partitions:
+Create a sample topic "kylin_streaming_topic", with 3 partitions:
 {% highlight Groff markup %}
-bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 3 --topic kylindemo
-Created topic "kylindemo".
+bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 3 --topic kylin_streaming_topic
+Created topic "kylin_streaming_topic".
 {% endhighlight %}
 Put sample data to this topic; Kylin has an utility class which can do this;
 {% highlight Groff markup %}
 export KAFKA_HOME=/usr/local/kafka_2.10-0.10.0.0
-export KYLIN_HOME=/usr/local/apache-kylin-1.6.0-bin
+export KYLIN_HOME=/usr/local/apache-kylin-2.1.0-bin
 cd $KYLIN_HOME
-./bin/kylin.sh org.apache.kylin.source.kafka.util.KafkaSampleProducer --topic kylindemo --broker localhost:9092
+./bin/kylin.sh org.apache.kylin.source.kafka.util.KafkaSampleProducer --topic kylin_streaming_topic --broker localhost:9092
 {% endhighlight %}
 This tool will send 100 records to Kafka every second. Please keep it running during this tutorial. You can check the sample message with kafka-console-consumer.sh now:
 {% highlight Groff markup %}
 cd $KAFKA_HOME
-bin/kafka-console-consumer.sh --zookeeper localhost:2181 --bootstrap-server localhost:9092 --topic kylindemo --from-beginning
+bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic kylin_streaming_topic --from-beginning
 {"amount":63.50375137330458,"category":"TOY","order_time":1477415932581,"device":"Other","qty":4,"user":{"id":"bf249f36-f593-4307-b156-240b3094a1c3","age":21,"gender":"Male"},"currency":"USD","country":"CHINA"}
 {"amount":22.806058795736583,"category":"ELECTRONIC","order_time":1477415932591,"device":"Andriod","qty":1,"user":{"id":"00283efe-027e-4ec1-bbed-c2bbda873f1d","age":27,"gender":"Female"},"currency":"USD","country":"INDIA"}
@@ -70,7 +70,7 @@ Notice that Kylin supports structured (or say "embedded") message from v1.6, it
 ![](/images/tutorial/1.6/Kylin-Cube-Streaming-Tutorial/2_Define_streaming_table.png)
-Click "Next". On this page, provide the Kafka cluster information; Enter "kylindemo" as "Topic" name; The cluster has 1 broker, whose host name is "sandbox", port is "9092", click "Save".
+Click "Next".
On this page, provide the Kafka cluster information; Enter "kylin_streaming_topic" as "Topic" name; The cluster has 1 broker, whose host name is "sandbox", port is "9092", click "Save".
 ![](/images/tutorial/1.6/Kylin-Cube-Streaming-Tutorial/3_Kafka_setting.png)