This is an automated email from the ASF dual-hosted git repository.
nic pushed a commit to branch document
in repository https://gitbox.apache.org/repos/asf/kylin.git
The following commit(s) were added to refs/heads/document by this push:
new d6b16fe spark build distinct value, dimension dic, uhc dic
d6b16fe is described below
commit d6b16fe828fb66ff6656c36ac9195576b8eb20d8
Author: rupengwang <[email protected]>
AuthorDate: Mon Dec 30 17:31:09 2019 +0800
spark build distinct value, dimension dic, uhc dic
---
website/_docs/tutorial/cube_spark.md | 23 +++++++++++++++++++++++
website/_docs30/tutorial/cube_spark.md | 22 ++++++++++++++++++++++
website/_docs31/tutorial/cube_spark.md | 24 ++++++++++++++++++++++++
3 files changed, 69 insertions(+)
diff --git a/website/_docs/tutorial/cube_spark.md
b/website/_docs/tutorial/cube_spark.md
index 2ac27d7..fb7a1fa 100644
--- a/website/_docs/tutorial/cube_spark.md
+++ b/website/_docs/tutorial/cube_spark.md
@@ -127,6 +127,29 @@ When Kylin executes this step, you can monitor the status
in Yarn resource manag
After all steps be successfully executed, the Cube becomes "Ready" and you can
query it as normal.
+
+## Using Spark with Apache Livy
+
+You can use Livy by adding the following configuration:
+
+{% highlight Groff markup %}
+kylin.engine.livy-conf.livy-enabled=true
+kylin.engine.livy-conf.livy-url=http://ip:8998
+kylin.engine.livy-conf.livy-key.file=hdfs:///path/kylin-job-3.0.0-SNAPSHOT.jar
+kylin.engine.livy-conf.livy-arr.jars=hdfs:///path/hbase-client-1.2.0-{$env.version}.jar,hdfs:///path/hbase-common-1.2.0-{$env.version}.jar,hdfs:///path/hbase-hadoop-compat-1.2.0-{$env.version}.jar,hdfs:///path/hbase-hadoop2-compat-1.2.0-{$env.version}.jar,hdfs:///path/hbase-server-1.2.0-{$env.version}.jar,hdfs:///path/htrace-core-3.2.0-incubating.jar,hdfs:///path/metrics-core-2.2.0.jar
+{% endhighlight %}
+
+
+## Optional
+
+A cubing job consists of several steps. The steps 'extract fact table distinct value', 'build dimension dictionary' and 'build UHC dimension dictionary' can optionally be run on Spark as well. To enable this, add the following configuration:
+
+{% highlight Groff markup %}
+kylin.engine.spark-fact-distinct=true
+kylin.engine.spark-dimension-dictionary=true
+kylin.engine.spark-udc-dictionary=true
+{% endhighlight %}
+
## Troubleshooting
When getting error, you should check "logs/kylin.log" firstly. There has the
full Spark command that Kylin executes, e.g:
diff --git a/website/_docs30/tutorial/cube_spark.md
b/website/_docs30/tutorial/cube_spark.md
index da785b1..ed356df 100644
--- a/website/_docs30/tutorial/cube_spark.md
+++ b/website/_docs30/tutorial/cube_spark.md
@@ -127,6 +127,28 @@ When Kylin executes this step, you can monitor the status
in Yarn resource manag
After all steps be successfully executed, the Cube becomes "Ready" and you can
query it as normal.
+
+## Using Spark with Apache Livy
+
+You can use Livy by adding the following configuration:
+
+{% highlight Groff markup %}
+kylin.engine.livy-conf.livy-enabled=true
+kylin.engine.livy-conf.livy-url=http://ip:8998
+kylin.engine.livy-conf.livy-key.file=hdfs:///path/kylin-job-3.0.0-SNAPSHOT.jar
+kylin.engine.livy-conf.livy-arr.jars=hdfs:///path/hbase-client-1.2.0-{$env.version}.jar,hdfs:///path/hbase-common-1.2.0-{$env.version}.jar,hdfs:///path/hbase-hadoop-compat-1.2.0-{$env.version}.jar,hdfs:///path/hbase-hadoop2-compat-1.2.0-{$env.version}.jar,hdfs:///path/hbase-server-1.2.0-{$env.version}.jar,hdfs:///path/htrace-core-3.2.0-incubating.jar,hdfs:///path/metrics-core-2.2.0.jar
+{% endhighlight %}
+
+
+## Optional
+
+A cubing job consists of several steps. The steps 'extract fact table distinct value' and 'build dimension dictionary' can optionally be run on Spark as well. To enable this, add the following configuration:
+
+{% highlight Groff markup %}
+kylin.engine.spark-fact-distinct=true
+kylin.engine.spark-dimension-dictionary=true
+{% endhighlight %}
+
## Troubleshooting
When getting error, you should check "logs/kylin.log" firstly. There has the
full Spark command that Kylin executes, e.g:
diff --git a/website/_docs31/tutorial/cube_spark.md
b/website/_docs31/tutorial/cube_spark.md
index dceefe5..112b46d 100644
--- a/website/_docs31/tutorial/cube_spark.md
+++ b/website/_docs31/tutorial/cube_spark.md
@@ -127,6 +127,30 @@ When Kylin executes this step, you can monitor the status
in Yarn resource manag
After all steps be successfully executed, the Cube becomes "Ready" and you can
query it as normal.
+
+## Using Spark with Apache Livy
+
+You can use Livy by adding the following configuration:
+
+{% highlight Groff markup %}
+kylin.engine.livy-conf.livy-enabled=true
+kylin.engine.livy-conf.livy-url=http://ip:8998
+kylin.engine.livy-conf.livy-key.file=hdfs:///path/kylin-job-3.0.0-SNAPSHOT.jar
+kylin.engine.livy-conf.livy-arr.jars=hdfs:///path/hbase-client-1.2.0-{$env.version}.jar,hdfs:///path/hbase-common-1.2.0-{$env.version}.jar,hdfs:///path/hbase-hadoop-compat-1.2.0-{$env.version}.jar,hdfs:///path/hbase-hadoop2-compat-1.2.0-{$env.version}.jar,hdfs:///path/hbase-server-1.2.0-{$env.version}.jar,hdfs:///path/htrace-core-3.2.0-incubating.jar,hdfs:///path/metrics-core-2.2.0.jar
+{% endhighlight %}
+
+
+## Optional
+
+A cubing job consists of several steps. The steps 'extract fact table distinct value', 'build dimension dictionary' and 'build UHC dimension dictionary' can optionally be run on Spark as well. To enable this, add the following configuration:
+
+{% highlight Groff markup %}
+kylin.engine.spark-fact-distinct=true
+kylin.engine.spark-dimension-dictionary=true
+kylin.engine.spark-udc-dictionary=true
+{% endhighlight %}
+
+
## Troubleshooting
When getting error, you should check "logs/kylin.log" firstly. There has the
full Spark command that Kylin executes, e.g: