Author: lidong
Date: Tue Jun 14 14:08:04 2022
New Revision: 1901905
URL: http://svn.apache.org/viewvc?rev=1901905&view=rev
Log:
# minor, fix error redirect url of mysql metastore
Added:
kylin/site/blog/2022/04/
kylin/site/blog/2022/04/20/
kylin/site/blog/2022/04/20/kylin4-on-cloud-part1/
kylin/site/blog/2022/04/20/kylin4-on-cloud-part1/index.html
kylin/site/blog/2022/04/20/kylin4-on-cloud-part2/
kylin/site/blog/2022/04/20/kylin4-on-cloud-part2/index.html
Modified:
kylin/site/blog/index.html
kylin/site/cn/docs/gettingstarted/kylin-quickstart.html
kylin/site/docs/gettingstarted/kylin-quickstart.html
kylin/site/feed.xml
Added: kylin/site/blog/2022/04/20/kylin4-on-cloud-part1/index.html
URL:
http://svn.apache.org/viewvc/kylin/site/blog/2022/04/20/kylin4-on-cloud-part1/index.html?rev=1901905&view=auto
==============================================================================
--- kylin/site/blog/2022/04/20/kylin4-on-cloud-part1/index.html (added)
+++ kylin/site/blog/2022/04/20/kylin4-on-cloud-part1/index.html Tue Jun 14
14:08:04 2022
@@ -0,0 +1,671 @@
+<!--
+* Licensed to the Apache Software Foundation (ASF) under one
+* or more contributor license agreements. See the NOTICE file
+* distributed with this work for additional information
+* regarding copyright ownership. The ASF licenses this file
+* to you under the Apache License, Version 2.0 (the
+* "License"); you may not use this file except in compliance
+* with the License. You may obtain a copy of the License at
+*
+* http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+-->
+<!doctype html>
+<html>
+
+<head>
+ <meta charset="utf-8">
+ <meta http-equiv="X-UA-Compatible" content="IE=edge">
+ <meta name="viewport" content="width=device-width, initial-scale=1">
+
+ <title>Apache Kylin | Kylin on Cloud - Build A Data Analysis Platform on the Cloud in Two Hours Part 1</title>
+ <meta name="description" content="Video Tutorials">
+ <meta name="author" content="Apache Kylin">
+ <link rel="shortcut icon" href="fav.png" type="image/png">
+
+
+
+<link rel="stylesheet" href="/assets/css/animate.css">
+<!-- Bootstrap -->
+<link rel="stylesheet" href="/assets/css/bootstrap.min.css">
+
+<!-- Fonts -->
+<!-- <link rel="stylesheet"
href="http://fonts.googleapis.com/css?family=Alice|Open+Sans:400,300,700"> -->
+
+<!-- Icons -->
+<link rel="stylesheet" href="/assets/css/font-awesome.min.css">
+
+ <!-- Custom styles -->
+ <link rel="stylesheet" href="/assets/css/styles.css">
+ <link rel="stylesheet" href="/assets/css/docs.css">
+ <link rel="stylesheet" href="/assets/css/pygments.css">
+
+ <link rel="canonical"
href="http://kylin.apache.org/blog/2022/04/20/kylin4-on-cloud-part1/">
+ <link rel="alternate" type="application/rss+xml" title="Apache Kylin"
href="http://kylin.apache.org/feed.xml" />
+
+<!--[if lt IE 9]> <script src="assets/js/html5shiv.js"></script> <![endif]-->
+<!-- Global site tag (gtag.js) - Google Analytics -->
+<script async
src="https://www.googletagmanager.com/gtag/js?id=UA-120788561-1"></script>
+<script>
+ window.dataLayer = window.dataLayer || [];
+ function gtag(){dataLayer.push(arguments);}
+ gtag('js', new Date());
+
+ gtag('config', 'UA-120788561-1');
+</script>
+<script type="text/javascript" src="/assets/js/jquery-1.9.1.min.js"></script>
+<script type="text/javascript" src="/assets/js/nside.js"></script>
+<script type="text/javascript" src="/assets/js/nnav.js"></script>
+<script>
+var _hmt = _hmt || [];
+(function() {
+ var hm = document.createElement("script");
+ hm.src = "https://hm.baidu.com/hm.js?bdc5e03add430c0b72cc0eb91eabfa99";
+ var s = document.getElementsByTagName("script")[0];
+ s.parentNode.insertBefore(hm, s);
+})();
+</script>
+
+</head>
+
+ <body>
+
+<header id="header" >
+
+ <!-- Main Menu -->
+ <nav class="navbar navbar-default" role="navigation" id="nav-wrapper">
+ <div class="container-fluid" id="nav">
+ <!--
+ <img class="img-circle" width="40px" height="40px" id="circlelogo"
src="/assets/images/kylin_logo.jpg">
+ -->
+ <!-- Brand and toggle get grouped for better mobile display -->
+ <div class="navbar-header">
+ <img class="navbar-logo" width="46" src="/assets/images/kylin_logo.png">
+ <button type="button" class="navbar-toggle collapsed"
data-toggle="collapse" data-target="#bs-example-navbar-collapse-1">
+ <span class="sr-only">Toggle navigation</span>
+ <span class="icon-bar"></span>
+ <span class="icon-bar"></span>
+ <span class="icon-bar"></span>
+ </button>
+ <ul class="nav icon-navbar">
+ <li><a href="https://twitter.com/apachekylin" target="_blank"
class="fa fa-twitter fa-lg" title="Twitter: @ApacheKylin" ></a></li>
+ <li><a href="https://github.com/apache/kylin" target="_blank"
class="fa fa-github-alt fa-lg" title="Github: apache/kylin" ></a></li>
+ <li><a href="https://www.facebook.com/kylinio" target="_blank"
class="fa fa-facebook fa-lg" title="Facebook: kylin.io" ></a></li>
+ </ul>
+ </div>
+
+ <!-- Collect the nav links, forms, and other content for toggling -->
+ <div class="navbar-collapse collapse" id="bs-example-navbar-collapse-1">
+
+ <ul class="nav navbar-nav">
+
+ <li><a href="/">Home</a></li>
+ <li>
+ <a href="/docs" class="dropdown-toggle" data-toggle="dropdown"
role="button" aria-haspopup="true" aria-expanded="false">Docs<span
class="caret"></span></a>
+ <ul class="dropdown-menu">
+ <li><a href="/docs/">Latest Release(Kylin 4.0.1)</a></li>
+ <li><a href="/docs31/">Kylin 3.1.3</a></li>
+ <li><a href="/docs24/">Kylin 2.4.0</a></li>
+ <li><a href="/archive/">Archive</a></li>
+ </ul>
+ </li>
+ <li><a href="/download">Download</a></li>
+ <li><a href="/community" >Community</a></li>
+ <li>
+ <a href="/development" class="dropdown-toggle"
data-toggle="dropdown" role="button" aria-haspopup="true"
aria-expanded="false">Development<span class="caret"></span></a>
+ <ul class="dropdown-menu">
+ <li><a href="/development40/">Kylin 4.x</a></li>
+ <li><a href="/development/">Kylin 3.x And Older Versions</a></li>
+ </ul>
+ </li>
+ <li><a href="/blog">Blog</a></li>
+ <li><a href="/cn" >中文版</a></li>
+ </ul>
+ </div><!-- /.navbar-collapse -->
+ </div><!-- /.container-fluid -->
+ </nav>
+
+ <div id="head" class="parallax normal-header" >
+ <div class="text-center header-apache">
+ <a href="http://apache.org/foundation/contributing.html" title="Support
Apache" style="margin-left: 150px;">
+ <div>
+ <img src="https://www.apache.org/images/SupportApache-small.png" >
+ </div>
+ </a>
+ </div>
+ </div>
+
+ </header>
+
+ <div class="page-content main">
+ <header style=" padding:2em 0 0 ">
+ <div class="container" >
+ <div style=" padding:0 4em">
+ <div class="blog-icon">
+ <img width="30" src="/assets/images/icon_blog_w.png">
+ </div>
+ <h4 class="index-title" style="float:left;"><span>Apache Kylin™ Technical Blog</span></h4>
+ </div>
+ </div>
+ </div>
+
+ <div class="container blog">
+ <div>
+ <article class="post-content" >
+
+<div class="post" style=" padding:2em 4em 4em 4em">
+
+ <header class="post-header">
+ <h1 class="post-title">Kylin on Cloud - Build A Data Analysis Platform on the Cloud in Two Hours Part 1</h1>
+ <p class="post-meta" >Apr 20, 2022 • Yaqian Zhang</p>
+ </header>
+
+ <article class="post-content" >
+ <h2 id="video-tutorials">Video Tutorials</h2>
+
+<p><a href="https://youtu.be/5kKXEMjO1Sc">Kylin on Cloud - Build A Data Analysis Platform on the Cloud in Two Hours Part 1</a></p>
+
+<h2 id="background">Background</h2>
+
+<p>Apache Kylin is a multidimensional database based on pre-computation and multidimensional models, and it supports a standard SQL query interface. In Kylin, users define table relationships by creating Models, define dimensions and measures by creating Cubes, and run data aggregation by building Cubes. The pre-computed data is saved to answer user queries, and queries can aggregate the pre-computed data further, which significantly improves query performance.</p>
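+
+<p>The pre-computation idea can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not how Kylin stores or builds Cubes: the rows, column names, and amounts are hypothetical. Rows are aggregated once per dimension combination, and a coarser question is then answered from the stored aggregates instead of the raw rows.</p>
+

```python
from collections import defaultdict

# Hypothetical fact rows: (pickup_date, borough, total_amount).
ROWS = [
    ("2020-03-01", "Manhattan", 12.5),
    ("2020-03-01", "Manhattan", 7.5),
    ("2020-03-01", "Queens", 30.0),
    ("2020-03-02", "Manhattan", 10.0),
]


def build_cuboid(rows):
    """Aggregate once at build time, keyed by (pickup_date, borough)."""
    cuboid = defaultdict(lambda: {"trips": 0, "amount": 0.0})
    for pickup_date, borough, amount in rows:
        cell = cuboid[(pickup_date, borough)]
        cell["trips"] += 1
        cell["amount"] += amount
    return dict(cuboid)


def roll_up_by_borough(cuboid):
    """Answer a coarser query from the stored aggregates, not the raw rows."""
    result = defaultdict(lambda: {"trips": 0, "amount": 0.0})
    for (_pickup_date, borough), cell in cuboid.items():
        result[borough]["trips"] += cell["trips"]
        result[borough]["amount"] += cell["amount"]
    return dict(result)


CUBOID = build_cuboid(ROWS)
BY_BOROUGH = roll_up_by_borough(CUBOID)
```

+
+<p>The roll-up touches one entry per stored cell rather than one per trip, which is where the saving comes from when the raw table holds hundreds of millions of rows.</p>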
+
+<p>With the release of Kylin 4.0, Kylin can be deployed without a Hadoop environment. To make it easier to deploy Kylin on the cloud, the Kylin community recently developed a cloud deployment tool that lets users obtain a complete Kylin cluster by executing a single command, delivering a fast and efficient analysis experience. Moreover, in January 2022, the Kylin community released MDX for Kylin to enhance the semantic capability of Kylin as a multidimensional database. MDX for Kylin provides an MDX query interface. Users can define business metrics based on the multidimensional model and translate Kylin data models into business-friendly language that gives the data business meaning, which makes it easier to integrate with Excel, Tableau, and other BI tools.</p>
+
+<p>With these innovations, users can quickly deploy Kylin clusters on the cloud, create multidimensional models, and enjoy the short query latency brought by pre-computation. What's more, users can use MDX for Kylin to define and manage business metrics, combining the advantages of a data warehouse with business semantics.</p>
+
+<p>With Kylin + MDX for Kylin, users can work directly with BI tools for multidimensional data analysis, or use the combination as the basis for complex applications such as metrics platforms. Compared with building a metrics platform directly on computing engines such as Spark and Hive, which perform joins and aggregations at query time, Kylin, with its multidimensional modeling, pre-computation technology, and the semantic layer provided by MDX for Kylin, offers key functions such as massive data computation, extremely fast query response, a unified multidimensional model, interfaces to a variety of BI tools, and basic business metrics management.</p>
+
+<p>This tutorial takes a data engineer's perspective and shows how to build a Kylin on Cloud data analysis platform. The platform delivers high-performance queries over hundreds of millions of rows at a lower TCO, manages business metrics through MDX for Kylin, and connects directly to BI tools for quick report generation.</p>
+
+<p>Each step of this tutorial is explained in detail with illustrations and checkpoints to help newcomers. All you need to start is an AWS account and 2 hours. Note: The cloud cost to finish this tutorial is around $15.</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/0_deploy_kylin.png" alt="" /></p>
+
+<h2 id="business-scenario">Business scenario</h2>
+
+<p>Since the beginning of 2020, COVID-19 has spread rapidly all over the world and has greatly changed people's daily lives, especially their travel habits. This tutorial studies the impact of the pandemic on the New York taxi industry, based on pandemic data and New York taxi trip data since 2018. It analyzes indicators such as positive cases, fatality rate, taxi orders, and average travel mileage. We hope this analysis can provide some insights for future decision-making.</p>
+
+<h3 id="business-issues">Business issues</h3>
+
+<ul>
+ <li>The severity of the pandemic in different countries and regions</li>
+ <li>Travel metrics of different blocks in New York City, such as order
number, travel mileage, etc.</li>
+ <li>Does the pandemic have a significant impact on taxi orders?</li>
+ <li>Changes in travel habits after the pandemic (long-distance vs. short-distance trips)</li>
+ <li>Is the severity of the pandemic strongly related to taxi travel?</li>
+</ul>
+
+<h3 id="dataset">Dataset</h3>
+
+<h4 id="covid-19-dataset">COVID-19 Dataset</h4>
+
+<p>The COVID-19 dataset includes a fact table <code
class="highlighter-rouge">covid_19_activity</code> and a dimension table <code
class="highlighter-rouge">lookup_calendar</code>.</p>
+
+<p><code class="highlighter-rouge">covid_19_activity</code> contains the number of confirmed cases and deaths reported each day in different regions around the world. <code class="highlighter-rouge">lookup_calendar</code> is a date dimension table that holds extended time information for each date, such as the beginning of its year and the beginning of its month. <code class="highlighter-rouge">covid_19_activity</code> and <code class="highlighter-rouge">lookup_calendar</code> are associated by date.<br />
+The COVID-19 dataset information is as follows:</p>
+
+<table>
+ <tbody>
+ <tr>
+ <td>Data size</td>
+ <td>235 MB</td>
+ </tr>
+ <tr>
+ <td>Fact table row count</td>
+ <td>2,753,688</td>
+ </tr>
+ <tr>
+ <td>Data range</td>
+ <td>2020-01-21~2022-03-07</td>
+ </tr>
+ <tr>
+ <td>Download address provided by the dataset provider</td>
+
<td>https://data.world/covid-19-data-resource-hub/covid-19-case-counts/workspace/file?filename=COVID-19+Activity.csv</td>
+ </tr>
+ <tr>
+ <td>S3 directory of the dataset</td>
+ <td>s3://public.kyligence.io/kylin/kylin_demo/data/covid19_data/</td>
+ </tr>
+ </tbody>
+</table>
+
+<h4 id="nyc-taxi-order-dataset">NYC taxi order dataset</h4>
+
+<p>The NYC taxi order dataset consists of a fact table <code
class="highlighter-rouge">taxi_trip_records_view</code>, and two dimension
tables, <code class="highlighter-rouge">newyork_zone</code> and <code
class="highlighter-rouge">lookup_calendar</code>.</p>
+
+<p>Each record in <code class="highlighter-rouge">taxi_trip_records_view</code> corresponds to one taxi trip and contains information such as the pick-up location ID, drop-off location ID, trip duration, order amount, and travel mileage. <code class="highlighter-rouge">newyork_zone</code> records the administrative district corresponding to each location ID. <code class="highlighter-rouge">taxi_trip_records_view</code> is connected with <code class="highlighter-rouge">newyork_zone</code> through the columns PULocationID and DOLocationID to get information about the pick-up and drop-off blocks. <code class="highlighter-rouge">lookup_calendar</code> is the same dimension table as in the COVID-19 dataset. <code class="highlighter-rouge">taxi_trip_records_view</code> and <code class="highlighter-rouge">lookup_calendar</code> are connected by date.</p>
+
+<p>NYC taxi order dataset information:</p>
+
+<table>
+ <tbody>
+ <tr>
+ <td>Data size</td>
+ <td>19 GB</td>
+ </tr>
+ <tr>
+ <td>Fact table row count</td>
+ <td>226,849,274</td>
+ </tr>
+ <tr>
+ <td>Data range</td>
+ <td>2018-01-01~2021-07-31</td>
+ </tr>
+ <tr>
+ <td>Download address provided by the dataset provider</td>
+ <td>https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page</td>
+ </tr>
+ <tr>
+ <td>S3 directory of the dataset</td>
+
<td>s3://public.kyligence.io/kylin/kylin_demo/data/trip_data_2018-2021/</td>
+ </tr>
+ </tbody>
+</table>
+
+<h4 id="er-diagram">ER Diagram</h4>
+
+<p>The ER diagram of the COVID-19 dataset and NYC taxi order dataset is as
follows:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/1_table_ER.png" alt="" /></p>
+
+<h3 id="metrics-design">Metrics design</h3>
+
+<p>Based on what we try to solve with this model, we designed the following
atomic metrics and business metrics:</p>
+
+<h6 id="atomic-metrics">1. Atomic metrics</h6>
+
+<p>Atomic metrics are the measures created in a Kylin Cube. They are relatively simple because each one runs an aggregate calculation on a single column.</p>
+
+<ul>
+ <li>Covid19 case count: <code class="highlighter-rouge">sum(covid_19_activity.people_positive_cases_count)</code></li>
+ <li>Covid19 fatality: <code class="highlighter-rouge">sum(covid_19_activity.people_death_count)</code></li>
+ <li>Covid19 new positive case count: <code class="highlighter-rouge">sum(covid_19_activity.people_positive_new_cases_count)</code></li>
+ <li>Covid19 new death count: <code class="highlighter-rouge">sum(covid_19_activity.people_death_new_count)</code></li>
+ <li>Taxi trip mileage: <code class="highlighter-rouge">sum(taxi_trip_records_view.trip_distance)</code></li>
+ <li>Taxi order amount: <code class="highlighter-rouge">sum(taxi_trip_records_view.total_amount)</code></li>
+ <li>Taxi trip count: <code class="highlighter-rouge">count()</code></li>
+ <li>Taxi trip duration: <code class="highlighter-rouge">sum(taxi_trip_records_view.trip_time_hour)</code></li>
+</ul>
+
+<h6 id="business-metrics">2. Business metrics</h6>
+
+<p>Business metrics are various compound operations based on atomic metrics
that have specific business meanings.</p>
+
+<ul>
+ <li>MTD, YTD of each atomic metric</li>
+ <li>MOM, YOY of each atomic metric</li>
+ <li>Covid19 fatality rate: death count/positive case count</li>
+ <li>Average taxi trip speed: taxi trip distance/taxi trip duration</li>
+ <li>Average taxi trip mileage: taxi trip distance/taxi trip count</li>
+</ul>
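+
+<p>As a rough sketch of how the business metrics above derive from atomic metrics, the Python below computes the three ratios and a month-over-month change. It is not MDX and not how MDX for Kylin evaluates these measures: the input values are hypothetical, the mileage unit is assumed, and the MOM formula (relative change against the previous period) is an assumption about the intended definition.</p>
+

```python
def ratio(numerator, denominator):
    """Divide two atomic metrics; return None when the denominator is zero."""
    if denominator == 0:
        return None
    return numerator / denominator


def period_over_period(current, previous):
    """Relative change versus the previous period (assumed MOM/YOY formula)."""
    if previous == 0:
        return None
    return (current - previous) / previous


# Hypothetical atomic metric values for one month.
atomic = {
    "positive_cases": 2000,
    "deaths": 50,
    "trip_distance": 1200.0,  # assumed to be in miles
    "trip_time_hour": 100.0,  # hours
    "trip_count": 400,
}

# Business metrics composed from the atomic ones.
fatality_rate = ratio(atomic["deaths"], atomic["positive_cases"])
avg_speed = ratio(atomic["trip_distance"], atomic["trip_time_hour"])
avg_mileage = ratio(atomic["trip_distance"], atomic["trip_count"])

# Hypothetical previous-month trip count of 500.
trip_count_mom = period_over_period(atomic["trip_count"], 500)
```

+
+<p>Because each ratio is computed from sums rather than from per-trip averages, it can be evaluated on top of pre-aggregated data at any granularity.</p>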
+
+<h2 id="operation-overview">Operation Overview</h2>
+
+<p>The diagram below shows the main steps for building a cloud data analysis platform with Apache Kylin and performing data analysis:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/2_step_overview.jpg" alt="" /></p>
+
+<h2 id="cluster-architecture">Cluster architecture</h2>
+
+<p>Here is the architecture of the Kylin cluster deployed by the cloud
deployment tool:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/3_kylin_cluster.jpg" alt="" /></p>
+
+<h2 id="kylin-on-cloud-deployment">Kylin on Cloud deployment</h2>
+
+<h3 id="prerequisites">Prerequisites</h3>
+
+<ul>
+ <li>GitHub Desktop: for downloading the deployment tool</li>
+ <li>Python 3.6.6 or above: for running the deployment tool</li>
+</ul>
+
+<h3 id="aws-permission-check-and-initialization">AWS permission check and
initialization</h3>
+
+<p>Log in to AWS with your account to check the permission status and then
create the Access Key, IAM Role, Key Pair, and S3 working directory according
to the document <a
href="https://github.com/apache/kylin/blob/kylin4_on_cloud/readme/prerequisites.md">Prerequisites</a>.
Subsequent AWS operations will be performed with this account.</p>
+
+<h3 id="configure-the-deployment-tool">Configure the deployment tool</h3>
+
+<ol>
+ <li>
+ <p>Execute the following command to clone the code for the Kylin on AWS
deployment tool.</p>
+
+ <div class="highlighter-rouge"><pre class="highlight"><code>git clone -b kylin4_on_cloud --single-branch https://github.com/apache/kylin.git &amp;&amp; cd kylin
+</code></pre>
+ </div>
+ </li>
+ <li>
+ <p>Initialize a Python virtual environment on your local machine.</p>
+
+ <p>Run the command below to check the Python version. Note: Python 3.6.6 or above is needed:</p>
+
+ <div class="highlighter-rouge"><pre class="highlight"><code>python --version
+</code></pre>
+ </div>
+
+ <p>Initialize the virtual environment for Python and install dependencies:</p>
+
+ <div class="highlighter-rouge"><pre class="highlight"><code>bin/init.sh
+source venv/bin/activate
+</code></pre>
+ </div>
+ </li>
+ <li>
+ <p>Modify the configuration file <code
class="highlighter-rouge">kylin_configs.yaml</code></p>
+ </li>
+</ol>
+
+<p>Open the <code class="highlighter-rouge">kylin_configs.yaml</code> file and replace the configuration items with the actual values:</p>
+
+<ul>
+ <li><code class="highlighter-rouge">AWS_REGION</code>: Region for the EC2 instances, the default value is <code class="highlighter-rouge">cn-northwest-1</code></li>
+ <li><code class="highlighter-rouge">${IAM_ROLE_NAME}</code>: The IAM Role just created, e.g. <code class="highlighter-rouge">kylin_deploy_role</code></li>
+ <li><code class="highlighter-rouge">${S3_URI}</code>: S3 working directory for deploying Kylin, e.g. <code class="highlighter-rouge">s3://kylindemo/kylin_demo_dir/</code></li>
+ <li><code class="highlighter-rouge">${KEY_PAIR}</code>: The key pair just created, e.g. <code class="highlighter-rouge">kylin_deploy_key</code></li>
+ <li><code class="highlighter-rouge">${Cidr Ip}</code>: IP address range that is allowed to access the EC2 instances, e.g. <code class="highlighter-rouge">10.1.0.0/32</code>. It is usually set to your external IP address so that only you can access these EC2 instances</li>
+</ul>
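+
+<p>Before filling in the <code class="highlighter-rouge">${Cidr Ip}</code> value, it can help to check that the string is a well-formed CIDR block. The following is a minimal sketch using Python's standard <code class="highlighter-rouge">ipaddress</code> module. The helper names are hypothetical and are not part of the deployment tool, and the address <code class="highlighter-rouge">203.0.113.7</code> is a documentation placeholder for your own external IP.</p>
+

```python
import ipaddress


def single_host_cidr(ip):
    """Return the /32 CIDR block that matches exactly one IPv4 address."""
    address = ipaddress.IPv4Address(ip)
    return f"{address}/32"


def is_valid_cidr(cidr):
    """Check that a string parses as an IPv4 network with no host bits set."""
    try:
        ipaddress.IPv4Network(cidr, strict=True)
    except ValueError:
        return False
    return True


# Placeholder address; substitute your own external IP.
my_cidr = single_host_cidr("203.0.113.7")
```

+
+<p>A <code class="highlighter-rouge">/32</code> suffix restricts access to a single address, which matches the tutorial's advice to allow only your own IP.</p>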
+
+<p>Kylin adopts a read-write separation architecture to separate build and query resources. In the following steps, we will first start a build cluster that connects to Glue to create tables, load data sources, and submit build jobs for pre-computation, then delete the build cluster but keep the metadata. We will then start a query cluster with MDX for Kylin to create business metrics, connect to BI tools for queries, and perform data analysis. The Kylin on AWS cluster uses RDS to store metadata and S3 to store the built data. It also supports loading data sources from AWS Glue. Except for the EC2 nodes, the resources used are persistent and are not removed when nodes are deleted. Therefore, when there is no query or build job, users can delete the build or query clusters and keep only the metadata and the S3 working directory.</p>
+
+<h3 id="kylin-build-cluster">Kylin build cluster</h3>
+
+<h4 id="start-kylin-build-cluster">Start Kylin build cluster</h4>
+
+<ol>
+ <li>
+ <p>Start the build cluster with the following command. The whole process
may take 15-30 minutes depending on your network conditions.</p>
+
+ <div class="highlighter-rouge"><pre class="highlight"><code>python deploy.py --type deploy --mode job
+</code></pre>
+ </div>
+ </li>
+ <li>
+ <p>You may check the terminal to see if the build cluster is successfully
deployed:</p>
+ </li>
+</ol>
+
+<p><img src="/images/blog/kylin4_on_cloud/4_deploy_cluster_successfully.png"
alt="" /></p>
+
+<h4 id="check-aws-service">Check AWS Service</h4>
+
+<ol>
+ <li>
+ <p>Go to CloudFormation in the AWS console, where you can see that 7 stacks were created by the Kylin deployment tool:</p>
+
+ <p><img src="/images/blog/kylin4_on_cloud/5_check_aws_stacks.png" alt=""
/></p>
+ </li>
+ <li>
+ <p>Users can view the details of EC2 nodes through the AWS console or use
the command below to check the names, private IPs, and public IPs of all EC2
nodes.</p>
+ </li>
+</ol>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>python deploy.py --type list
+</code></pre>
+</div>
+
+<p><img src="/images/blog/kylin4_on_cloud/6_list_cluster_node.png" alt=""
/></p>
+
+<h4 id="spark-sql-query-response-time">Spark-SQL query response time</h4>
+
+<p>Let's first check the query response time in the Spark-SQL environment as a baseline for comparison.</p>
+
+<ol>
+ <li>
+ <p>First, log in to the EC2 instance where Kylin is deployed using the public IP of the Kylin node, switch to the root user, and source <code class="highlighter-rouge">~/.bash_profile</code> to load the environment variables set beforehand.</p>
+
+ <div class="highlighter-rouge"><pre class="highlight"><code>ssh -i "${KEY_PAIR}" ec2-user@${kylin_node_public_ip}
+sudo su
+source ~/.bash_profile
+</code></pre>
+ </div>
+ </li>
+ <li>
+ <p>Go to <code class="highlighter-rouge">$SPARK_HOME</code> and edit the configuration file <code class="highlighter-rouge">conf/spark-defaults.conf</code>, changing <code class="highlighter-rouge">spark_master_node_private_ip</code> to the private IP of the Spark master node:</p>
+
+ <div class="highlighter-rouge"><pre class="highlight"><code>cd $SPARK_HOME
+vim conf/spark-defaults.conf
+
+## Replace spark_master_node_private_ip with the private IP of the real Spark master node
+spark.master spark://spark_master_node_private_ip:7077
+</code></pre>
+ </div>
+
+ <p>In <code class="highlighter-rouge">spark-defaults.conf</code>, the resource allocation for the driver and executors is the same as for the Kylin query cluster.</p>
+ </li>
+ <li>
+ <p>Create table in Spark-SQL</p>
+
+ <p>All data in the test dataset is stored in S3 buckets in <code class="highlighter-rouge">cn-north-1</code> and <code class="highlighter-rouge">us-east-1</code>. If your S3 bucket is in <code class="highlighter-rouge">cn-north-1</code> or <code class="highlighter-rouge">us-east-1</code>, you can run the SQL directly to create the tables. Otherwise, execute the following script to copy the data to the S3 working directory set up in <code class="highlighter-rouge">kylin_configs.yaml</code>, and modify the SQL for creating the tables:</p>
+
+ <div class="highlighter-rouge"><pre class="highlight"><code>## AWS CN user
+aws s3 sync s3://public.kyligence.io/kylin/kylin_demo/data/ ${S3_DATA_DIR} --region cn-north-1
+
+## AWS Global user
+aws s3 sync s3://public.kyligence.io/kylin/kylin_demo/data/ ${S3_DATA_DIR} --region us-east-1
+
+## Modify create table SQL
+sed -i "s#s3://public.kyligence.io/kylin/kylin_demo/data/#${S3_DATA_DIR}#g" /home/ec2-user/kylin_demo/create_kylin_demo_table.sql
+</code></pre>
+ </div>
+
+ <p>Execute the SQL for creating the tables:</p>
+
+ <div class="highlighter-rouge"><pre class="highlight"><code>bin/spark-sql -f /home/ec2-user/kylin_demo/create_kylin_demo_table.sql
+</code></pre>
+ </div>
+ </li>
+ <li>
+ <p>Execute query in Spark-SQL</p>
+
+ <p>Start Spark-SQL:</p>
+
+ <div class="highlighter-rouge"><pre class="highlight"><code>bin/spark-sql
+</code></pre>
+ </div>
+
+ <p>Run the query in Spark-SQL:</p>
+
+ <div class="highlighter-rouge"><pre class="highlight"><code>use kylin_demo;
+select TAXI_TRIP_RECORDS_VIEW.PICKUP_DATE, NEWYORK_ZONE.BOROUGH, count(*), sum(TAXI_TRIP_RECORDS_VIEW.TRIP_TIME_HOUR), sum(TAXI_TRIP_RECORDS_VIEW.TOTAL_AMOUNT)
+from TAXI_TRIP_RECORDS_VIEW
+left join NEWYORK_ZONE
+on TAXI_TRIP_RECORDS_VIEW.PULOCATIONID = NEWYORK_ZONE.LOCATIONID
+group by TAXI_TRIP_RECORDS_VIEW.PICKUP_DATE, NEWYORK_ZONE.BOROUGH;
+</code></pre>
+ </div>
+
+ <p>With the same configuration as the Kylin query cluster, a direct query using Spark-SQL takes over 100 seconds:</p>
+
+ <p><img src="/images/blog/kylin4_on_cloud/7_query_in_spark_sql.png" alt=""
/></p>
+ </li>
+ <li>
+ <p>After the query finishes, exit Spark-SQL before proceeding to the following steps to save resources.</p>
+ </li>
+</ol>
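+
+<p>To see what the join and aggregation in the query above return without a Spark cluster, the same shape of query can be run against a tiny in-memory SQLite database. This is only a sketch: SQLite stands in for Spark-SQL, the tables carry just the columns the query uses, and every row is hypothetical.</p>
+

```python
import sqlite3

# In-memory database with toy versions of the two tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE taxi_trip_records_view (
        pickup_date TEXT, pulocationid INTEGER,
        trip_time_hour REAL, total_amount REAL
    );
    CREATE TABLE newyork_zone (locationid INTEGER, borough TEXT);
    INSERT INTO newyork_zone VALUES (1, 'Manhattan'), (2, 'Queens');
    INSERT INTO taxi_trip_records_view VALUES
        ('2020-03-01', 1, 0.5, 12.0),
        ('2020-03-01', 1, 0.25, 8.0),
        ('2020-03-01', 2, 1.0, 30.0),
        ('2020-03-02', 99, 0.5, 10.0);
    """
)

# Same join and grouping as the tutorial query; ORDER BY is added here
# only to make the toy output deterministic.
QUERY = """
    SELECT t.pickup_date, z.borough, count(*),
           sum(t.trip_time_hour), sum(t.total_amount)
    FROM taxi_trip_records_view AS t
    LEFT JOIN newyork_zone AS z ON t.pulocationid = z.locationid
    GROUP BY t.pickup_date, z.borough
    ORDER BY t.pickup_date, z.borough
"""
rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

+
+<p>The trip with the unknown location ID 99 is kept by the left join and grouped under a NULL borough, so trip counts are not silently lost when a zone is missing from the dimension table.</p>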
+
+<h4 id="import-kylin-metadata">Import Kylin metadata</h4>
+
+<ol>
+ <li>
+ <p>Go to <code class="highlighter-rouge">$KYLIN_HOME</code></p>
+
+ <div class="highlighter-rouge"><pre class="highlight"><code>cd $KYLIN_HOME
+</code></pre>
+ </div>
+ </li>
+ <li>
+ <p>Import metadata</p>
+
+ <div class="highlighter-rouge"><pre class="highlight"><code>bin/metastore.sh restore /home/ec2-user/meta_backups/
+</code></pre>
+ </div>
+ </li>
+ <li>
+ <p>Reload metadata</p>
+ </li>
+</ol>
+
+<p>Open <code class="highlighter-rouge">http://${kylin_node_public_ip}:7070/kylin</code> in your browser (replace the IP with the public IP of the EC2 node) to reach the Kylin web UI, and log in with the default username and password ADMIN/KYLIN:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/8_kylin_web_ui.png" alt="" /></p>
+
+<p>Reload Kylin metadata by clicking System -&gt; Configuration -&gt; Reload Metadata:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/9_reload_kylin_metadata.png" alt=""
/></p>
+
+<p>If you'd like to learn how to manually create the Model and Cube included in the Kylin metadata, please refer to <a href="https://cwiki.apache.org/confluence/display/KYLIN/Create+Model+and+Cube+in+Kylin">Create model and cube in Kylin</a>.</p>
+
+<h4 id="run-build">Run build</h4>
+
+<p>Submit the Cube build job. Since no partition column is set in the model,
we will directly perform a full build for the two cubes:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/10_full_build_cube.png.png" alt=""
/></p>
+
+<p><img src="/images/blog/kylin4_on_cloud/11_kylin_job_complete.png" alt=""
/></p>
+
+<h4 id="destroy-build-cluster">Destroy build cluster</h4>
+
+<p>After the build job is completed, execute the cluster destroy command to shut down the build cluster. By default, the RDS stack, monitor stack, and VPC stack will be kept.</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>python deploy.py --type destroy
+</code></pre>
+</div>
+
+<p>The cluster is successfully shut down:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/12_destroy_job_cluster.png" alt=""
/></p>
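+
+<p>If you plan to repeat the build cycle, the three <code class="highlighter-rouge">deploy.py</code> invocations used in this part can be collected in a small wrapper. The sketch below is hypothetical and not part of the deployment tool: the step names and helper functions are invented for illustration, and only flag combinations that appear in this tutorial are included. It assumes the script is run with the tool's virtual environment active.</p>
+

```python
import subprocess
import sys

# Only the flag combinations that appear in this part of the tutorial.
KNOWN_COMMANDS = {
    "start_build_cluster": ["--type", "deploy", "--mode", "job"],
    "list_nodes": ["--type", "list"],
    "destroy_cluster": ["--type", "destroy"],
}


def deploy_argv(step):
    """Build the argument vector for one tutorial step (hypothetical names)."""
    if step not in KNOWN_COMMANDS:
        raise KeyError(f"unknown step: {step}")
    return [sys.executable, "deploy.py", *KNOWN_COMMANDS[step]]


def run_step(step, repo_dir="."):
    """Run deploy.py for a step from the cloned repository directory."""
    return subprocess.run(deploy_argv(step), cwd=repo_dir, check=True)
```

+
+<p>Calling <code class="highlighter-rouge">run_step("list_nodes")</code> from the cloned <code class="highlighter-rouge">kylin</code> directory would run the same command as <code class="highlighter-rouge">python deploy.py --type list</code>, and <code class="highlighter-rouge">check=True</code> makes a failed step raise instead of passing unnoticed.</p>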
+
+<h4 id="check-aws-resource">Check AWS resource</h4>
+
+<p>After the cluster is successfully deleted, you can go to the <code class="highlighter-rouge">CloudFormation</code> page in the AWS console to check for remaining resources. Since the metadata RDS, monitor, and VPC stacks are kept by default, you will see only the following three stacks on the page.</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/13_check_aws_stacks.png" alt=""
/></p>
+
+<p>The resources in the three stacks will still be used when we start the
query cluster, to ensure that the query cluster and the build cluster use the
same set of metadata.</p>
+
+<h4 id="intro-to-next-part">Intro to next part</h4>
+
+<p>That's all for the first part of Kylin on Cloud - Build A Data Analysis Platform on the Cloud in Two Hours. Please see part 2 here: <a href="../kylin4-on-cloud-part2/">Kylin on Cloud - Quickly Build Cloud Data Analysis Service Platform within Two Hours</a> (Part 2)</p>
+
+ </article>
+
+</div>
+
+
+
+
+
+ </article>
+ </div>
+ </div>
+
+<footer id="underfooter">
+ <div>
+ <div class="row">
+ <div class="col-md-12 widget">
+ <div class="widget-body">
+ <div class="footer-img">
+ <a href="http://www.apache.org">
+ <img id="asf-logo" height="78px" alt="Apache
Software Foundation" src="/assets/images/apache_footer.png">
+ </a>
+ </div>
+ <p style="padding-top: 11px;">
+ The contents of this website are © 2015 Apache
Software Foundation under the terms of the
+ <a href="http://www.apache.org/licenses/LICENSE-2.0">
Apache License v2 </a>.
+ </p>
+ <p style="margin-bottom: 11px;">
+ Apache Kylin and its logo are trademarks of the Apache
Software Foundation.
+ </div>
+
+ </div>
+ </div>
+ </div>
+ <!-- /row of widgets -->
+
+ </div>
+ <div></div>
+
+</footer>
+
+ <script src="/assets/js/jquery-1.9.1.min.js"></script>
+ <script src="/assets/js/bootstrap.min.js"></script>
+ <script src="/assets/js/main.js"></script>
+ </body>
+</html>
+
+
+
+
Added: kylin/site/blog/2022/04/20/kylin4-on-cloud-part2/index.html
URL:
http://svn.apache.org/viewvc/kylin/site/blog/2022/04/20/kylin4-on-cloud-part2/index.html?rev=1901905&view=auto
==============================================================================
--- kylin/site/blog/2022/04/20/kylin4-on-cloud-part2/index.html (added)
+++ kylin/site/blog/2022/04/20/kylin4-on-cloud-part2/index.html Tue Jun 14
14:08:04 2022
@@ -0,0 +1,585 @@
+<!--
+* Licensed to the Apache Software Foundation (ASF) under one
+* or more contributor license agreements. See the NOTICE file
+* distributed with this work for additional information
+* regarding copyright ownership. The ASF licenses this file
+* to you under the Apache License, Version 2.0 (the
+* "License"); you may not use this file except in compliance
+* with the License. You may obtain a copy of the License at
+*
+* http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+-->
+<!doctype html>
+<html>
+
+<head>
+ <meta charset="utf-8">
+ <meta http-equiv="X-UA-Compatible" content="IE=edge">
+ <meta name="viewport" content="width=device-width, initial-scale=1">
+
+ <title>Apache Kylin | Kylin on Cloud &mdash; Build A Data Analysis Platform on
the Cloud in Two Hours Part 2</title>
+ <meta name="description" content="This is the second part of the blog
series. For part 1, see: Kylin on Cloud &mdash; Build A Data Analysis Platform on
the Cloud in Two Hours Part 1">
+ <meta name="author" content="Apache Kylin">
+ <link rel="shortcut icon" href="fav.png" type="image/png">
+
+
+
+<link rel="stylesheet" href="/assets/css/animate.css">
+<!-- Bootstrap -->
+<link rel="stylesheet" href="/assets/css/bootstrap.min.css">
+
+<!-- Fonts -->
+<!-- <link rel="stylesheet"
href="http://fonts.googleapis.com/css?family=Alice|Open+Sans:400,300,700"> -->
+
+<!-- Icons -->
+<link rel="stylesheet" href="/assets/css/font-awesome.min.css">
+
+ <!-- Custom styles -->
+ <link rel="stylesheet" href="/assets/css/styles.css">
+ <link rel="stylesheet" href="/assets/css/docs.css">
+ <link rel="stylesheet" href="/assets/css/pygments.css">
+
+ <link rel="canonical"
href="http://kylin.apache.org/blog/2022/04/20/kylin4-on-cloud-part2/">
+ <link rel="alternate" type="application/rss+xml" title="Apache Kylin"
href="http://kylin.apache.org/feed.xml" />
+
+<!--[if lt IE 9]> <script src="assets/js/html5shiv.js"></script> <![endif]-->
+<!-- Global site tag (gtag.js) - Google Analytics -->
+<script async
src="https://www.googletagmanager.com/gtag/js?id=UA-120788561-1"></script>
+<script>
+ window.dataLayer = window.dataLayer || [];
+ function gtag(){dataLayer.push(arguments);}
+ gtag('js', new Date());
+
+ gtag('config', 'UA-120788561-1');
+</script>
+<script type="text/javascript" src="/assets/js/jquery-1.9.1.min.js"></script>
+<script type="text/javascript" src="/assets/js/nside.js"></script>
+<script type="text/javascript" src="/assets/js/nnav.js"></script>
+<script>
+var _hmt = _hmt || [];
+(function() {
+ var hm = document.createElement("script");
+ hm.src = "https://hm.baidu.com/hm.js?bdc5e03add430c0b72cc0eb91eabfa99";
+ var s = document.getElementsByTagName("script")[0];
+ s.parentNode.insertBefore(hm, s);
+})();
+</script>
+
+</head>
+
+ <body>
+
+<header id="header" >
+
+ <!-- Main Menu -->
+ <nav class="navbar navbar-default" role="navigation" id="nav-wrapper">
+ <div class="container-fluid" id="nav">
+ <!--
+ <img class="img-circle" width="40px" height="40px" id="circlelogo"
src="/assets/images/kylin_logo.jpg">
+ -->
+ <!-- Brand and toggle get grouped for better mobile display -->
+ <div class="navbar-header">
+ <img class="navbar-logo" width="46"
src="/assets/images/kylin_logo.png" ></img>
+ <button type="button" class="navbar-toggle collapsed"
data-toggle="collapse" data-target="#bs-example-navbar-collapse-1">
+ <span class="sr-only">Toggle navigation</span>
+ <span class="icon-bar"></span>
+ <span class="icon-bar"></span>
+ <span class="icon-bar"></span>
+ </button>
+ <ul class="nav icon-navbar">
+ <li><a href="https://twitter.com/apachekylin" target="_blank"
class="fa fa-twitter fa-lg" title="Twitter: @ApacheKylin" ></a></li>
+ <li><a href="https://github.com/apache/kylin" target="_blank"
class="fa fa-github-alt fa-lg" title="Github: apache/kylin" ></a></li>
+ <li><a href="https://www.facebook.com/kylinio" target="_blank"
class="fa fa-facebook fa-lg" title="Facebook: kylin.io" ></a></li>
+ </ul>
+ </div>
+
+ <!-- Collect the nav links, forms, and other content for toggling -->
+ <div class="navbar-collapse collapse" id="bs-example-navbar-collapse-1">
+
+ <ul class="nav navbar-nav">
+
+ <li><a href="/">Home</a></li>
+ <li>
+ <a href="/docs" class="dropdown-toggle" data-toggle="dropdown"
role="button" aria-haspopup="true" aria-expanded="false">Docs<span
class="caret"></span></a>
+ <ul class="dropdown-menu">
+ <li><a href="/docs/">Latest Release(Kylin 4.0.1)</a></li>
+ <li><a href="/docs31/">Kylin 3.1.3</a></li>
+ <li><a href="/docs24/">Kylin 2.4.0</a></li>
+ <li><a href="/archive/">Archive</a></li>
+ </ul>
+ </li>
+ <li><a href="/download">Download</a></li>
+ <li><a href="/community" >Community</a></li>
+ <li>
+ <a href="/development" class="dropdown-toggle"
data-toggle="dropdown" role="button" aria-haspopup="true"
aria-expanded="false">Development<span class="caret"></span></a>
+ <ul class="dropdown-menu">
+ <li><a href="/development40/">Kylin 4.x</a></li>
+ <li><a href="/development/">Kylin 3.x And Older Versions</a></li>
+ </ul>
+ </li>
+ <li><a href="/blog">Blog</a></li>
+ <li><a href="/cn" >中文版</a></li>
+ </ul>
+ </div><!-- /.navbar-collapse -->
+ </div><!-- /.container-fluid -->
+ </nav>
+
+ <div id="head" class="parallax normal-header" >
+ <div class="text-center header-apache">
+ <a href="http://apache.org/foundation/contributing.html" title="Support
Apache" style="margin-left: 150px;">
+ <div>
+ <img src="https://www.apache.org/images/SupportApache-small.png" >
+ </div>
+ </a>
+ </div>
+ </div>
+
+ </header>
+
+ <div class="page-content main">
+ <header style=" padding:2em 0 0 ">
+ <div class="container" >
+ <div style=" padding:0 4em">
+ <div class="blog-icon">
+ <img width="30" src="/assets/images/icon_blog_w.png">
+ </div>
+ <h4 class="index-title" style="
float:left;"><span>Apache Kylin&trade; Technical Blog</span></h4>
+ </div>
+ </div>
+ </div>
+
+ <div class="container blog">
+ <div>
+ <article class="post-content" >
+
+<div class="post" style=" padding:2em 4em 4em 4em">
+
+ <header class="post-header">
+ <h1 class="post-title">Kylin on Cloud &mdash; Build A Data Analysis Platform
on the Cloud in Two Hours Part 2</h1>
+ <p class="post-meta" >Apr 20, 2022 &bull; Yaqian Zhang</p>
+ </header>
+
+ <article class="post-content" >
+ <p>This is the second part of the blog series. For part 1, see: <a
href="../kylin4-on-cloud-part1/">Kylin on Cloud &mdash; Build A Data Analysis
Platform on the Cloud in Two Hours Part 1</a></p>
+
+<h3 id="video-tutorials">Video Tutorials</h3>
+
+<p><a href="https://youtu.be/LPHxqZ-au4w">Kylin on Cloud &mdash; Build A Data
Analysis Platform on the Cloud in Two Hours Part 2</a></p>
+
+<h3 id="kylin-query-cluster">Kylin query cluster</h3>
+
+<h4 id="start-kylin-query-cluster">Start Kylin query cluster</h4>
+
+<ol>
+ <li>
+ <p>Keep the <code class="highlighter-rouge">kylin_configs.yaml</code>
settings used to start the build cluster, and additionally enable MDX with the
setting below:</p>
+
+ <div class="highlighter-rouge"><pre class="highlight"><code>ENABLE_MDX: &amp;ENABLE_MDX 'true'
+</code></pre>
+ </div>
+ </li>
+ <li>
+ <p>Then execute the deploy command to start the cluster:</p>
+
+ <div class="highlighter-rouge"><pre class="highlight"><code>python deploy.py --type deploy --mode query
+</code></pre>
+ </div>
+ </li>
+</ol>
+
+<h4 id="query-with-kylin">Query with Kylin</h4>
+
+<ol>
+ <li>
+ <p>After the query cluster has started successfully, first execute <code
class="highlighter-rouge">python deploy.py --type list</code> to get the
information of all nodes, and then open <code
class="highlighter-rouge">http://${kylin_node_public_ip}:7070/kylin</code> in
your browser to log in to the Kylin web UI:</p>
+
+ <p><img src="/images/blog/kylin4_on_cloud/14_kylin_web_ui.png" alt=""
/></p>
+ </li>
+ <li>
+ <p>On the Insight page, execute the same SQL that we ran earlier with
Spark SQL:</p>
+
+ <div class="highlighter-rouge"><pre class="highlight"><code>select TAXI_TRIP_RECORDS_VIEW.PICKUP_DATE, NEWYORK_ZONE.BOROUGH, count(*), sum(TAXI_TRIP_RECORDS_VIEW.TRIP_TIME_HOUR), sum(TAXI_TRIP_RECORDS_VIEW.TOTAL_AMOUNT)
+from TAXI_TRIP_RECORDS_VIEW
+left join NEWYORK_ZONE
+on TAXI_TRIP_RECORDS_VIEW.PULOCATIONID = NEWYORK_ZONE.LOCATIONID
+group by TAXI_TRIP_RECORDS_VIEW.PICKUP_DATE, NEWYORK_ZONE.BOROUGH;
+</code></pre>
+ </div>
+
+ <p><img src="/images/blog/kylin4_on_cloud/15_query_in_kylin.png" alt=""
/></p>
+ </li>
+</ol>
+
+<p>As we can see, when the query hits the cube, that is, when it is answered
directly from the pre-computed data, the result is returned in about 4s, down
from the more than 100s that the same query took earlier, or roughly 25 times
faster.</p>
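+
+<p>The same query can also be submitted without the web UI. The following is a
minimal Python sketch, not part of the deployment tool. It assumes that Kylin's
REST query endpoint (<code class="highlighter-rouge">POST
/kylin/api/query</code>) is reachable on port 7070, that the default <code
class="highlighter-rouge">ADMIN/KYLIN</code> account is used, and that the cube
lives in a Kylin project named <code
class="highlighter-rouge">covid_trip_project</code>. The function names are
illustrative only.</p>

```python
import base64
import json
import urllib.request


def build_kylin_query_request(host, sql, project,
                              user="ADMIN", password="KYLIN", limit=50000):
    """Build (but do not send) a POST request for Kylin's query endpoint."""
    url = f"http://{host}:7070/kylin/api/query"
    payload = {
        "sql": sql,
        "project": project,  # assumption: Kylin project that holds the cube
        "offset": 0,
        "limit": limit,
        "acceptPartial": False,
    }
    # Kylin's REST API uses HTTP Basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        url, data=json.dumps(payload).encode("utf-8"),
        headers=headers, method="POST",
    )


def run_query(request, timeout=60):
    """Send the request and return the decoded JSON body (needs a live cluster)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

+<p>To use it, build a request with the public IP of the Kylin node and the SQL
statement, then pass it to <code class="highlighter-rouge">run_query</code>.
The decoded response is returned as-is, since its exact fields may depend on
the Kylin version.</p>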
+
+<h3 id="pre-computation-reduces-query-cost">Pre-computation reduces query
cost</h3>
+
+<p>In this test, we used the New York taxi order data, whose fact table
contains more than 200 million rows. As the result shows, Kylin significantly
improves query efficiency in this kind of big data analysis scenario, with
hundreds of millions of rows. Moreover, the built data can be reused to answer
thousands of subsequent queries, which reduces the cost per query.</p>
+
+<h3 id="configure-semantic-layer">Configure semantic layer</h3>
+
+<h4 id="import-dataset-into-mdx-for-kylin">Import Dataset into MDX for
Kylin</h4>
+
+<p>With <code class="highlighter-rouge">MDX for Kylin</code>, you can create a
<code class="highlighter-rouge">Dataset</code> based on the Kylin Cube, define
Cube relations, and create business metrics. To make it easy for beginners, you
can directly download a prepared Dataset file from S3 and import it into <code
class="highlighter-rouge">MDX for Kylin</code>:</p>
+
+<ol>
+ <li>
+ <p>Download the dataset to your local machine from S3.</p>
+
+ <div class="highlighter-rouge"><pre class="highlight"><code>wget https://s3.cn-north-1.amazonaws.com.cn/public.kyligence.io/kylin/kylin_demo/covid_trip_project_covid_trip_dataset.json
+</code></pre>
+ </div>
+ </li>
+ <li>
+ <p>Access <code class="highlighter-rouge">MDX for Kylin</code> web UI</p>
+
+ <p>Enter <code
class="highlighter-rouge">http://${kylin_node_public_ip}:7080</code> in your
browser to access <code class="highlighter-rouge">MDX for Kylin</code> web UI
and log in with the default username and password <code
class="highlighter-rouge">ADMIN/KYLIN</code>:</p>
+
+ <p><img src="/images/blog/kylin4_on_cloud/16_mdx_web_ui.png" alt="" /></p>
+ </li>
+ <li>
+ <p>Confirm Kylin connection</p>
+
+ <p><code class="highlighter-rouge">MDX for Kylin</code> is already
configured with the information of the Kylin node to be connected. You only
need to type in the username and password (<code
class="highlighter-rouge">ADMIN/KYLIN</code>) for the Kylin node when logging
in for the first time.</p>
+
+ <p><img src="/images/blog/kylin4_on_cloud/17_connect_to_kylin.png" alt=""
/></p>
+
+ <p><img src="/images/blog/kylin4_on_cloud/18_exit_management.png" alt=""
/></p>
+ </li>
+ <li>
+ <p>Import Dataset</p>
+
+ <p>After Kylin is successfully connected, click the icon in the upper
right corner to exit the management page:</p>
+
+ <p><img src="/images/blog/kylin4_on_cloud/19_kylin_running.png" alt=""
/></p>
+
+ <p>Switch to the <code class="highlighter-rouge">covid_trip_project</code>
project and click <code class="highlighter-rouge">Import Dataset</code> on
<code class="highlighter-rouge">Dataset</code> page:</p>
+
+ <p><img src="/images/blog/kylin4_on_cloud/20_import_dataset.png" alt=""
/></p>
+
+ <p>Select and import the <code
class="highlighter-rouge">covid_trip_project_covid_trip_dataset.json</code>
file we just downloaded from S3.</p>
+
+ <p><code class="highlighter-rouge">covid_trip_dataset</code> contains
specific dimensions and measures for each atomic metric, such as YTD, MTD,
annual growth, monthly growth, time hierarchy, and regional hierarchy; as well
as various business metrics including COVID-19 death rate, the average speed of
taxi trips, etc. For more information on how to manually create a dataset, see
Create dataset in <code class="highlighter-rouge">MDX for Kylin</code> or <a
href="https://kyligence.github.io/mdx-kylin/">MDX for Kylin User Manual</a>.</p>
+ </li>
+</ol>
+
+<h2 id="data-analysis-with-bi-and-excel">Data analysis with BI and Excel</h2>
+
+<h3 id="data-analysis-using-tableau">Data analysis using Tableau</h3>
+
+<p>Let's take Tableau, installed on a local Windows machine, as an example of
connecting to MDX for Kylin for data analysis.</p>
+
+<ol>
+ <li>
+ <p>Select Tableau's built-in <code class="highlighter-rouge">Microsoft
Analysis Services</code> connector to connect to <code
class="highlighter-rouge">MDX for Kylin</code>. (Note: Please install the <a
href="https://www.tableau.com/support/drivers?_ga=2.104833284.564621013.1647953885-1839825424.1608198275"><code
class="highlighter-rouge">Microsoft Analysis Services</code> driver</a> in
advance, which can be downloaded from Tableau.)</p>
+
+ <p><img src="/images/blog/kylin4_on_cloud/21_tableau_connect.png" alt=""
/></p>
+ </li>
+ <li>
+ <p>In the pop-up settings page, enter the <code
class="highlighter-rouge">MDX for Kylin</code> server address, the username and
password. The server address is <code
class="highlighter-rouge">http://${kylin_node_public_ip}:7080/mdx/xmla/covid_trip_project</code>:</p>
+
+ <p><img src="/images/blog/kylin4_on_cloud/22_tableau_server.png" alt=""
/></p>
+ </li>
+ <li>
+ <p>Select covid_trip_dataset as the dataset:</p>
+
+ <p><img src="/images/blog/kylin4_on_cloud/23_tableau_dataset.png" alt=""
/></p>
+ </li>
+ <li>
+ <p>Then we can run data analysis with the worksheet. Since we have defined
the business metrics with <code class="highlighter-rouge">MDX for Kylin</code>,
when we want to generate a business report with Tableau, we can directly drag
the pre-defined business metrics into the worksheet to create a report.</p>
+ </li>
+ <li>
+ <p>First, we will analyze the pandemic data and draw a country-level
pandemic map with the number of confirmed cases and the mortality rate. We only
need to drag and drop <code class="highlighter-rouge">COUNTRY_SHORT_NAME</code>
under <code class="highlighter-rouge">REGION_HIERARCHY</code> to the Columns
field, drag and drop <code
class="highlighter-rouge">SUM_NEW_POSITIVE_CASES</code> and <code
class="highlighter-rouge">CFR_COVID19</code> (fatality rate) under Measures to
the Rows field, and then choose to display the results as a map:</p>
+
+ <p><img src="/images/blog/kylin4_on_cloud/24_tableau_covid19_map.png"
alt="" /></p>
+
+ <p>The size of each symbol represents the number of confirmed COVID-19
cases, and the shade of the color represents the mortality rate.
According to the pandemic map, the United States and India have more confirmed
cases, but the mortality rates in the two countries are not significantly
different from those of other countries. However, countries with far fewer
confirmed cases, such as Peru, Vanuatu, and Mexico, have persistently high
death rates. You can continue to explore the reasons behind this if you are
interested.</p>
+
+ <p>Since we have set up a regional hierarchy, we can break down the
country-level situation to the provincial/state level to see the pandemic
situation in different regions of each country:</p>
+
+ <p><img src="/images/blog/kylin4_on_cloud/25_tableau_province.png" alt=""
/></p>
+
+ <p>Zoom in on the COVID map to see the status in each state of the United
States:</p>
+
+ <p><img src="/images/blog/kylin4_on_cloud/26_tableau_us_covid19.png"
alt="" /></p>
+
+ <p>We can conclude that the mortality rate does not differ significantly
between the states of the United States and stays at around 0.01. The number
of confirmed cases, however, is significantly higher in
California, Texas, Florida, and New York City. These regions are economically
developed and have large populations, which might explain the
higher number of confirmed COVID-19 cases. In the following part, we will
combine the pandemic data with the New York taxi dataset to analyze the impact
of the pandemic on the New York taxi industry.</p>
+ </li>
+ <li>
+ <p>For the New York taxi order dataset, we want to compare the order
numbers and travel speed in different boroughs.</p>
+ </li>
+</ol>
+
+<p>Drag and drop <code class="highlighter-rouge">BOROUGH</code> under <code
class="highlighter-rouge">PICKUP_NEWYORK_ZONE</code> to Columns, drag and
drop <code class="highlighter-rouge">ORDER_COUNT</code> and <code
class="highlighter-rouge">trip_mean_speed</code> under Measures to Rows, and
display the results as a map. The color shade represents the average speed and
the size of the symbol represents the number of orders. We can see that
Manhattan has more departing taxi orders than all the other boroughs combined,
but also the lowest average speed. Queens ranks second in number of orders,
while Staten Island has the least taxi activity. The average speed
of taxi trips departing from the Bronx is 82 mph, several times higher than
that of the other boroughs. This also reflects the population density and the
level of economic development in the different New York boroughs.</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/27_tableau_taxi_1.png" alt="" /></p>
+
+<p>Then we will replace the field <code
class="highlighter-rouge">BOROUGH</code> from <code
class="highlighter-rouge">PICKUP_NEWYORK_ZONE</code> with <code
class="highlighter-rouge">BOROUGH</code> from <code
class="highlighter-rouge">DROPOFF_NEWYORK_ZONE</code>, to analyze the number of
taxi orders and average speed by drop-off ID:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/27_tableau_taxi_2.png" alt="" /></p>
+
+<p>The pick-up and drop-off data for Brooklyn, Queens, and the Bronx differ
greatly. For example, there are many more taxi orders to Brooklyn or the Bronx
than orders departing from them, while there are far fewer taxi trips to Queens
than trips starting from it.</p>
+
+<ul>
+ <li>Travel habits change after the pandemic (long-distance vs.
short-distance travels)</li>
+</ul>
+
+<p>To see how residents' travel habits changed, we analyze the average trip
mileage: drag and drop the dimension <code
class="highlighter-rouge">MONTH_START</code> to Rows, and drag and drop the
metric <code class="highlighter-rouge">trip_mean_distance</code> to Columns:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/28_tableau_taxi_3.png" alt="" /></p>
+
+<p>The histogram shows significant changes
in people's travel behavior before and after the outbreak of COVID-19: the
average trip mileage has increased significantly since March 2020, in some
months to several times its earlier level, and it fluctuates greatly from month
to month. To relate these data to the pandemic data at the
month level, we drag and drop <code
class="highlighter-rouge">SUM_NEW_POSITIVE_CASES</code> and <code
class="highlighter-rouge">MTD_ORDER_COUNT</code> to Rows and add <code
class="highlighter-rouge">PROVINCE_STATE_NAME=New York</code> as the filter
condition:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/29_tableau_taxi_4.png" alt="" /></p>
+
+<p>It is interesting to see that the number of taxi orders decreased sharply
at the beginning of the outbreak while the average trip mileage increased,
indicating that people cut unnecessary short-distance trips or switched to
safer means of transportation. Comparing the data curves, we can see that
the severity of the pandemic and people's travel patterns are closely related:
taxi orders drop and average trip mileage increases when the pandemic worsens,
and when the situation improves, taxi orders increase and average trip
mileage drops.</p>
+
+<h3 id="data-analysis-via-excel">Data analysis via Excel</h3>
+
+<p>With <code class="highlighter-rouge">MDX for Kylin</code>, we can also use
Kylin for big data analysis with Excel. In this test, we will use Excel
installed on a local Windows machine to connect to MDX for Kylin.</p>
+
+<ol>
+ <li>
+ <p>Open Excel, select <code class="highlighter-rouge">Data</code> -&gt;
<code class="highlighter-rouge">Get Data</code> -&gt; <code
class="highlighter-rouge">From Database</code> -&gt; <code
class="highlighter-rouge">From Analysis Services</code>:</p>
+
+ <p><img src="/images/blog/kylin4_on_cloud/30_excel_connect.png" alt=""
/></p>
+ </li>
+ <li>
+ <p>In <code class="highlighter-rouge">Data Connection Wizard</code>, enter
the following address as the server name: <code
class="highlighter-rouge">http://${kylin_node_public_ip}:7080/mdx/xmla/covid_trip_project</code></p>
+
+ <p><img src="/images/blog/kylin4_on_cloud/31_excel_server.png" alt=""
/></p>
+
+ <p><img src="/images/blog/kylin4_on_cloud/32_tableau_dataset.png" alt=""
/></p>
+ </li>
+ <li>
+ <p>Then create a PivotTable for this data connection. We can see the data
listed here is the same as that when we are using Tableau. So no matter whether
analysts are using Tableau or Excel, they are working on identical sets of data
models, dimensions, and business metrics, thereby realizing unified
semantics.</p>
+ </li>
+ <li>
+ <p>We have just created a pandemic map and run a trend analysis using
<code class="highlighter-rouge">covid19</code> and <code
class="highlighter-rouge">newyork_trip_data</code> with Tableau. In Excel, we
can check more details for the same datasets and data scenarios.</p>
+ </li>
+</ol>
+
+<ul>
+ <li>For COVID-19 related data, we add <code
class="highlighter-rouge">REGION_HIERARCHY</code> and pre-defined <code
class="highlighter-rouge">SUM_NEW_POSITIVE_CASES</code> and mortality rate
<code class="highlighter-rouge">CFR_COVID19</code> to the PivotTable:</li>
+</ul>
+
+<p><img src="/images/blog/kylin4_on_cloud/33_tableau_covid19_1.png" alt=""
/></p>
+
+<p>The highest level of the regional hierarchy is <code
class="highlighter-rouge">CONTINENT_NAME</code>, which includes the number of
confirmed cases and mortality rate in each continent. We can see that Europe
has the highest number of confirmed cases while Africa has the highest
mortality rate. In this PivotTable, we can easily drill down to lower regional
levels to check more fine-grained data, such as data from different Asian
countries, and sort them in descending order according to the number of
confirmed cases:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/34_excel_covid20_2.png" alt="" /></p>
+
+<p>The data shows that India, Turkey, and Iran are the countries with the
highest number of confirmed cases.</p>
+
+<ul>
+ <li>To see whether the pandemic had a significant impact on
taxi orders, we first look at the YTD order count and the growth rate of taxi
orders at the year level by creating a PivotTable with <code
class="highlighter-rouge">TIME_HIERARCHY</code> as the time dimension,
together with <code
class="highlighter-rouge">YOY_ORDER_COUNT</code> and <code
class="highlighter-rouge">YTD_ORDER_COUNT</code>:</li>
+</ul>
+
+<p><img src="/images/blog/kylin4_on_cloud/35_excel_taxi_1.png" alt="" /></p>
+
+<p>Taxi orders have decreased sharply since the outbreak of the pandemic in
2020. The growth rate in 2020 is -0.7079, that is, a
reduction of about 71% in taxi orders. The growth rate in 2021 is still
negative, but the decrease is much less pronounced than in 2020, when the
pandemic had just started.</p>
+
+<p>Click to expand the time hierarchy to view the data at quarter, month, and
even day levels. By selecting <code
class="highlighter-rouge">MOM_ORDER_COUNT</code> and <code
class="highlighter-rouge">ORDER_COUNT</code>, we can check the monthly order
growth rate and order numbers in different time hierarchies:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/36_excel_taxi_2.png" alt="" /></p>
+
+<p>The order growth rate in March 2020 was -0.52, which is already a
significant fall. The rate dropped even further to -0.92 in April, that is, a
92% reduction in orders. After that, the decline became less pronounced, but
taxi orders remained much lower than before the outbreak.</p>
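+
+<p>To make these rates easier to interpret, here is a small worked example in
Python. It assumes that the growth-rate measures are defined as <code
class="highlighter-rouge">(current - previous) / previous</code>, which is how
the text above reads them; the actual definitions live in the dataset, and the
function names are illustrative only.</p>

```python
def growth_rate(current, previous):
    """Relative change between two periods: (current - previous) / previous."""
    if previous == 0:
        raise ValueError("previous period must be non-zero")
    return (current - previous) / previous


def remaining_share(rates):
    """Share of the starting volume left after consecutive growth rates."""
    share = 1.0
    for rate in rates:
        share *= 1.0 + rate
    return share


if __name__ == "__main__":
    # Rates quoted in the text; the underlying order counts are not shown here.
    print(f"2020 vs. 2019: {remaining_share([-0.7079]):.2%} of the orders remain")
    print(f"April vs. February 2020: {remaining_share([-0.52, -0.92]):.2%} remain")
```

+<p>Under this assumption, the March and April rates compound: (1 - 0.52) x
(1 - 0.92) is about 0.038, so the April 2020 order count would be roughly 4% of
the February level.</p>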
+
+<h3 id="use-api-to-integrate-kylin-with-data-analysis-platform">Use API to
integrate Kylin with data analysis platform</h3>
+
+<p>In addition to mainstream BI tools such as Excel and Tableau, many
companies also like to develop their in-house data analysis platforms. Such
self-developed platforms can still use Kylin + MDX for
Kylin as their base by calling the API, which ensures a unified
data definition. In the following part, we will show you how to send a query to
MDX for Kylin through Olap4j, a Java library similar to a JDBC driver that can
access any OLAP service.</p>
+
+<p>We also provide a simple demo for our users. Click <a
href="https://github.com/apache/kylin/tree/mdx-query-demo">mdx query demo</a>
to download the source code.</p>
+
+<ol>
+ <li>
+ <p>Download jar package for the demo:</p>
+
+ <div class="highlighter-rouge"><pre class="highlight"><code>wget https://s3.cn-north-1.amazonaws.com.cn/public.kyligence.io/kylin/kylin_demo/mdx_query_demo.tgz
+tar -xvf mdx_query_demo.tgz
+cd mdx_query_demo
+</code></pre>
+ </div>
+ </li>
+ <li>
+ <p>Run demo</p>
+ </li>
+</ol>
+
+<p>Make sure Java 8 is installed before running the demo:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/37_jdk_8.png" alt="" /></p>
+
+<p>Two parameters are needed to run the demo: the IP of the MDX node and the
MDX query to be run. The default port is 7080. The MDX node IP here is the
public IP of the Kylin node.</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>java -cp olap4j-xmla-1.2.0.jar:olap4j-1.2.0.jar:xercesImpl-2.9.1.jar:mdx-query-demo-0.0.1.jar io.kyligence.mdxquerydemo.MdxQueryDemoApplication "${kylin_node_public_ip}" "${mdx_query}"
+</code></pre>
+</div>
+
+<p>Alternatively, you can pass only the IP of the MDX node. The demo will then
automatically run the following MDX statement, which counts the number of
orders and the average trip mileage for each pickup borough:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>SELECT
+{[Measures].[ORDER_COUNT],
+[Measures].[trip_mean_distance]}
+DIMENSION PROPERTIES [MEMBER_UNIQUE_NAME],[MEMBER_ORDINAL],[MEMBER_CAPTION] ON COLUMNS,
+NON EMPTY [PICKUP_NEWYORK_ZONE].[BOROUGH].[BOROUGH].AllMembers
+DIMENSION PROPERTIES [MEMBER_UNIQUE_NAME],[MEMBER_ORDINAL],[MEMBER_CAPTION] ON ROWS
+FROM [covid_trip_dataset]
+</code></pre>
+</div>
+
+<p>We will also use the default query in this tutorial. After the execution is
completed, we can get the query result in the command line:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/38_demo_result.png" alt="" /></p>
+
+<p>As you can see, we have successfully obtained the data needed. The result
shows that the largest number of taxi orders are from Manhattan, with an
average order distance of only about 2.4 miles, which is reasonable
considering the area and dense population of Manhattan. The average distance
of orders departing from the Bronx, in contrast, is 33 miles, much higher than
in any other borough, probably due to the Bronx's remote location.</p>
+
+<p>As with Tableau and Excel, the MDX statement here can directly use the
metrics defined in Kylin and MDX for Kylin. Users can do further analysis of
the data with their own data analysis platform.</p>
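+
+<p>For platforms that are not written in Java, the same MDX statement can in
principle be sent over plain HTTP, because XMLA is a SOAP-based protocol. The
Python sketch below is an illustration under stated assumptions, not the demo's
implementation: it assumes that the <code
class="highlighter-rouge">/mdx/xmla/${project}</code> address used for Tableau
and Excel accepts a standard XMLA <code class="highlighter-rouge">Execute</code>
request with HTTP Basic authentication (<code
class="highlighter-rouge">ADMIN/KYLIN</code>), and that the optional <code
class="highlighter-rouge">Catalog</code> property, if needed at all, is
server-specific. All function names are hypothetical.</p>

```python
import base64
import urllib.request
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
XMLA_NS = "urn:schemas-microsoft-com:xml-analysis"


def build_xmla_execute(mdx, catalog=None):
    """Wrap one MDX statement in a SOAP envelope for the XMLA Execute method."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    execute = ET.SubElement(body, f"{{{XMLA_NS}}}Execute")
    command = ET.SubElement(execute, f"{{{XMLA_NS}}}Command")
    ET.SubElement(command, f"{{{XMLA_NS}}}Statement").text = mdx
    properties = ET.SubElement(execute, f"{{{XMLA_NS}}}Properties")
    property_list = ET.SubElement(properties, f"{{{XMLA_NS}}}PropertyList")
    ET.SubElement(property_list, f"{{{XMLA_NS}}}Format").text = "Multidimensional"
    if catalog is not None:
        # Assumption: whether a Catalog is required depends on the server.
        ET.SubElement(property_list, f"{{{XMLA_NS}}}Catalog").text = catalog
    return ET.tostring(envelope, encoding="utf-8")


def build_xmla_request(host, project, mdx, user="ADMIN", password="KYLIN",
                       port=7080, catalog=None):
    """Build (but do not send) the HTTP POST that carries the SOAP envelope."""
    url = f"http://{host}:{port}/mdx/xmla/{project}"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Content-Type": "text/xml; charset=utf-8",
        "SOAPAction": '"urn:schemas-microsoft-com:xml-analysis:Execute"',
        "Authorization": f"Basic {token}",  # assumption: Basic auth is accepted
    }
    return urllib.request.Request(
        url, data=build_xmla_execute(mdx, catalog), headers=headers, method="POST"
    )


def send(request, timeout=60):
    """Send the request and return the raw XML response (needs a live MDX node)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

+<p>Building the request separately from sending it makes the envelope easy to
inspect before a cluster is available. The response is returned as raw XML
because parsing the multidimensional result set is beyond the scope of this
sketch.</p>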
+
+<h3 id="unified-data-definition">Unified data definition</h3>
+
+<p>We have demonstrated 3 ways to work with Kylin + MDX for Kylin, from which
we can see that with the help of Kylin multi-dimensional database and MDX for
Kylin semantic layer, no matter which data analytic system you are using, you
can always use the same data model and business metrics and enjoy the
advantages brought by unified semantics.</p>
+
+<h2 id="delete-clusters">Delete clusters</h2>
+
+<h3 id="delete-query-cluster">Delete query cluster</h3>
+
+<p>After the analysis, we can execute the cluster destroy command to
delete the query cluster. If you also want to delete the metadata database RDS,
the monitor node, and the VPC of Kylin and MDX for Kylin, execute the following
command, which destroys all cluster resources:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>python deploy.py --type destroy-all
+</code></pre>
+</div>
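+
+<p>The <code class="highlighter-rouge">deploy.py</code> invocations used in
this part can be collected in a small wrapper so that they are harder to
mistype. The Python sketch below is a hypothetical helper, not part of the
deployment tool: it only knows the three invocations shown in this post,
assumes it is run from the directory that contains <code
class="highlighter-rouge">deploy.py</code>, and defaults to a dry run that
prints the command instead of executing it.</p>

```python
import shlex
import subprocess
import sys

# Only the deploy.py invocations that appear in this part of the tutorial.
DEPLOY_COMMANDS = {
    "start-query-cluster": ["--type", "deploy", "--mode", "query"],
    "list-nodes": ["--type", "list"],
    "destroy-all": ["--type", "destroy-all"],
}


def deploy_argv(action, script="deploy.py"):
    """Return the argument vector for one of the known actions."""
    return [sys.executable, script, *DEPLOY_COMMANDS[action]]


def run(action, dry_run=True):
    """Print the command (dry run) or execute it and return its exit code."""
    argv = deploy_argv(action)
    if dry_run:
        print(shlex.join(argv))
        return 0
    return subprocess.run(argv, check=True).returncode
```

+<p>Keeping the dry run as the default is a deliberate choice here, since
<code class="highlighter-rouge">destroy-all</code> also removes the metadata
database.</p>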
+
+<h3 id="check-aws-resources">Check AWS resources</h3>
+
+<p>After all cluster resources are deleted, no Stack created by the Kylin
deployment tool should remain on <code
class="highlighter-rouge">CloudFormation</code>. If you also want to delete the
deployment-related files and data from S3, you can manually delete the
following folders under the S3 working directory:</p>
+
+<p><img src="/images/blog/kylin4_on_cloud/39_check_s3_demo.png" alt="" /></p>
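+
+<p>The CloudFormation check above can also be scripted. The following is only
a minimal sketch, not part of the deployment tool: the helper names, the
<code class="highlighter-rouge">kylin-</code> stack name prefix, and the region
are all assumptions made for illustration, so replace them with the values
used in your own deployment. It relies on the boto3 <code
class="highlighter-rouge">list_stacks</code> paginator, which keeps reporting
deleted stacks with the status <code
class="highlighter-rouge">DELETE_COMPLETE</code> for some time after
deletion.</p>

```python
from typing import Iterable, List, Mapping


def remaining_stacks(summaries: Iterable[Mapping[str, str]],
                     name_prefix: str) -> List[str]:
    """Return names of stacks that match name_prefix and are not fully deleted.

    `summaries` uses the shape of the `StackSummaries` entries returned by
    CloudFormation `list_stacks` (keys `StackName` and `StackStatus`).
    """
    return [
        s["StackName"]
        for s in summaries
        if s["StackName"].startswith(name_prefix)
        and s["StackStatus"] != "DELETE_COMPLETE"
    ]


def fetch_stack_summaries(region: str) -> List[Mapping[str, str]]:
    """Collect all stack summaries in a region (needs AWS credentials)."""
    import boto3  # imported lazily so remaining_stacks works without boto3

    client = boto3.client("cloudformation", region_name=region)
    summaries: List[Mapping[str, str]] = []
    for page in client.get_paginator("list_stacks").paginate():
        summaries.extend(page["StackSummaries"])
    return summaries


# Example usage (assumed region and prefix, adjust to your deployment):
#   leftovers = remaining_stacks(fetch_stack_summaries("us-west-2"), "kylin-")
#   print(leftovers or "No matching stacks remain")
```

+<p>An empty list means that no stack with the assumed prefix is left. A stack
reported as <code class="highlighter-rouge">DELETE_FAILED</code> would still be
listed and needs to be cleaned up manually.</p>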
+
+<h2 id="summary">Summary</h2>
+
+<p>You only need an AWS account to follow the steps in this tutorial and
explore our Kylin deployment tool on the cloud. With pre-computation
technology, multi-dimensional models, and basic metrics management
capabilities, Kylin + MDX for Kylin enables users to build a big data analysis
platform on the cloud conveniently. In addition, it connects seamlessly to
mainstream BI tools, helping users leverage their data more efficiently and at
a lower TCO.</p>
+
+ </article>
+
+</div>
+
+
+
+
+
+ </article>
+ </div>
+ </div>
+
+<footer id="underfooter">
+ <div>
+ <div class="row">
+ <div class="col-md-12 widget">
+ <div class="widget-body">
+ <div class="footer-img">
+ <a href="http://www.apache.org">
+ <img id="asf-logo" height="78px" alt="Apache
Software Foundation" src="/assets/images/apache_footer.png">
+ </a>
+ </div>
+ <p style="padding-top: 11px;">
+ The contents of this website are © 2015 Apache
Software Foundation under the terms of the
+ <a href="http://www.apache.org/licenses/LICENSE-2.0">
Apache License v2 </a>.
+ </p>
+ <p style="margin-bottom: 11px;">
+ Apache Kylin and its logo are trademarks of the Apache
Software Foundation.
+ </div>
+
+ </div>
+ </div>
+ </div>
+ <!-- /row of widgets -->
+
+ </div>
+ <div></div>
+
+</footer>
+
+ <script src="/assets/js/jquery-1.9.1.min.js"></script>
+ <script src="/assets/js/bootstrap.min.js"></script>
+ <script src="/assets/js/main.js"></script>
+ </body>
+</html>
+
+
+
+
Modified: kylin/site/blog/index.html
URL:
http://svn.apache.org/viewvc/kylin/site/blog/index.html?rev=1901905&r1=1901904&r2=1901905&view=diff
==============================================================================
--- kylin/site/blog/index.html (original)
+++ kylin/site/blog/index.html Tue Jun 14 14:08:04 2022
@@ -197,6 +197,26 @@ var _hmt = _hmt || [];
<div class="col-md-6 col-lg-6 col-xs-12">
+ <a class="blog-card"
href="/blog/2022/04/20/kylin4-on-cloud-part2/">
+ <div class="blog-pic">
+ <img width="20" src="../assets/images/icon_blog_w.png" />
+ </div>
+ <p class="blog-title">Kylin on Cloud – Build A Data Analysis
Platform on the Cloud in Two Hours Part 2</p>
+ <p align="left" class="post-meta">posted: Apr 20, 2022</p>
+ </a>
+ </div>
+
+ <div class="col-md-6 col-lg-6 col-xs-12">
+ <a class="blog-card"
href="/blog/2022/04/20/kylin4-on-cloud-part1/">
+ <div class="blog-pic">
+ <img width="20" src="../assets/images/icon_blog_w.png" />
+ </div>
+ <p class="blog-title">Kylin on Cloud – Build A Data Analysis
Platform on the Cloud in Two Hours Part 1</p>
+ <p align="left" class="post-meta">posted: Apr 20, 2022</p>
+ </a>
+ </div>
+
+ <div class="col-md-6 col-lg-6 col-xs-12">
<a class="blog-card"
href="/blog/2022/03/31/how-to-use-excel-to-query-kylin/">
<div class="blog-pic">
<img width="20" src="../assets/images/icon_blog_w.png" />
Modified: kylin/site/cn/docs/gettingstarted/kylin-quickstart.html
URL:
http://svn.apache.org/viewvc/kylin/site/cn/docs/gettingstarted/kylin-quickstart.html?rev=1901905&r1=1901904&r2=1901905&view=diff
==============================================================================
--- kylin/site/cn/docs/gettingstarted/kylin-quickstart.html (original)
+++ kylin/site/cn/docs/gettingstarted/kylin-quickstart.html Tue Jun 14 14:08:04
2022
@@ -259,7 +259,7 @@ kylin.metadata.url=kylin_metadata@jdbc,d
kylin.env.zookeeper-connect-string=ip:2181
</code></p>
-<p>You need to change the MySQL user name and password, as well as the
database and table where the metadata is stored. Please refer to <a
href="/_docs40/tutorial/mysql_metastore.html">Configure MySQL as Metastore</a>
for the detailed configuration of MySQL as a Metastore.</p>
+<p>You need to change the MySQL user name and password, as well as the
database and table where the metadata is stored. Please refer to <a
href="/cn/docs/tutorial/mysql_metastore.html">Configure MySQL as Metastore</a>
for the detailed configuration of MySQL as a Metastore.</p>
 <h4 id="step5">Step5. Environment check</h4>
Modified: kylin/site/docs/gettingstarted/kylin-quickstart.html
URL:
http://svn.apache.org/viewvc/kylin/site/docs/gettingstarted/kylin-quickstart.html?rev=1901905&r1=1901904&r2=1901905&view=diff
==============================================================================
--- kylin/site/docs/gettingstarted/kylin-quickstart.html (original)
+++ kylin/site/docs/gettingstarted/kylin-quickstart.html Tue Jun 14 14:08:04
2022
@@ -256,7 +256,7 @@ kylin.env.zookeeper-connect-string=ip:21
</code></p>
<p>You need to change the Mysql user name and password, as well as the
database and table where the metadata is stored.<br />
-Please refer to <a href="/_docs/tutorial/mysql_metastore.html">Configure Mysql
as Metastore</a> to learn about the detailed configuration of MySQL as a
Metastore.</p>
+Please refer to <a href="/docs/tutorial/mysql_metastore.html">Configure Mysql
as Metastore</a> to learn about the detailed configuration of MySQL as a
Metastore.</p>
<h4 id="step5-environmental-inspection">Step5. Environmental Inspection</h4>
<p>Kylin runs on a Hadoop cluster and has certain requirements for the
version, access permissions and CLASSPATH of each component. <br />