This is an automated email from the ASF dual-hosted git repository.
ulyssesyou pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-kyuubi.git
The following commit(s) were added to refs/heads/master by this push:
new 34c58b9 [KYUUBI #1335] Spell issue branch
34c58b9 is described below
commit 34c58b9133dc87cd93756443bf494c44c969f18f
Author: AnybodyHome <[email protected]>
AuthorDate: Fri Nov 5 09:33:32 2021 +0800
[KYUUBI #1335] Spell issue branch
### _Why are the changes needed?_
Spell check and punctuation check.
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [ ] [Run test](https://kyuubi.readthedocs.io/en/latest/develop_tools/testing.html#running-tests) locally before making a pull request
Closes #1335 from zhenjiaguo/spell-issue-branch.
Closes #1335
b4d48192 [AnybodyHome] recover beeline change
85603b6f [AnybodyHome] spell check and punctuation check
Authored-by: AnybodyHome <[email protected]>
Signed-off-by: ulysses-you <[email protected]>
---
docs/overview/architecture.md | 2 +-
docs/quick_start/quick_start.md | 16 +++++-----
docs/quick_start/quick_start_with_datagrip.md | 2 +-
docs/quick_start/quick_start_with_helm.md | 6 ++--
docs/quick_start/quick_start_with_jdbc.md | 2 +-
docs/sql/rules.md | 2 +-
docs/sql/z-order-benchmark.md | 46 +++++++++++++--------------
docs/tools/spark_block_cleaner.md | 10 +++---
8 files changed, 43 insertions(+), 43 deletions(-)
diff --git a/docs/overview/architecture.md b/docs/overview/architecture.md
index 0f1b4c0..7c8d770 100644
--- a/docs/overview/architecture.md
+++ b/docs/overview/architecture.md
@@ -82,7 +82,7 @@ Next, let us share some of the key design concepts of Kyuubi.
Kyuubi implements the [Hive Service RPC](https://mvnrepository.com/artifact/org.apache.hive/hive-service-rpc/2.3.9) module,
which provides the same way of accessing data as HiveServer2 and Spark Thrift Server.
-On the client side,you can build fantastic business reports, BI applications, or even ETL jobs only via the [Hive JDBC](https://mvnrepository.com/artifact/org.apache.hive/hive-jdbc/2.3.9) module.
+On the client side, you can build fantastic business reports, BI applications, or even ETL jobs only via the [Hive JDBC](https://mvnrepository.com/artifact/org.apache.hive/hive-jdbc/2.3.9) module.
You only need to be familiar with Structured Query Language (SQL) and Java Database Connectivity (JDBC) to handle massive data.
It helps you focus on the design and implementation of your business system.
diff --git a/docs/quick_start/quick_start.md b/docs/quick_start/quick_start.md
index fa89c62..f3d72dd 100644
--- a/docs/quick_start/quick_start.md
+++ b/docs/quick_start/quick_start.md
@@ -45,7 +45,7 @@ Java | Java<br>Runtime<br>Environment | Required | Java 8/11 | Kyuubi is pre-bui
Spark | Distributed<br>SQL<br>Engine | Required | 3.0.0 and above | By default Kyuubi binary release is delivered without<br> a Spark tarball.
HDFS | Distributed<br>File<br>System | Optional | referenced<br>by<br>Spark | Hadoop Distributed File System is a <br>part of Hadoop framework, used to<br> store and process the datasets.<br> You can interact with any<br> Spark-compatible versions of HDFS.
Hive | Metastore | Optional | referenced<br>by<br>Spark | Hive Metastore for Spark SQL to connect
-Zookeeper | Service<br>Discovery | Optional | Any<br>zookeeper<br>ensemble<br>compatible<br>with<br>curator(2.12.0) | By default, Kyuubi provides a<br> embeded Zookeeper server inside for<br> non-production use.
+Zookeeper | Service<br>Discovery | Optional | Any<br>zookeeper<br>ensemble<br>compatible<br>with<br>curator(2.12.0) | By default, Kyuubi provides a<br> embedded Zookeeper server inside for<br> non-production use.
Additionally, if you want to work with other Spark compatible systems or plugins, you only need to take care of them as using them with regular Spark applications.
For example, you can run Spark SQL engines created by the Kyuubi on any cluster manager, including YARN, Kubernetes, Mesos, e.t.c...
@@ -93,14 +93,14 @@ From top to bottom are:
- DISCLAIMER: the disclaimer made by Apache Kyuubi Community as a project still in ASF Incubator.
- LICENSE: the [APACHE LICENSE, VERSION 2.0](https://www.apache.org/licenses/LICENSE-2.0) we claim to obey.
- RELEASE: the build information of this package.
-- NOTICE: the natice made by Apache Kyuubi Community about its project and dependencies.
+- NOTICE: the notice made by Apache Kyuubi Community about its project and dependencies.
- bin: the entry of the Kyuubi server with `kyuubi` as the startup script.
- conf: all the defaults used by Kyuubi Server itself or creating a session with Spark applications.
- externals
- engines: contains all kinds of SQL engines that we support, e.g. Apache Spark, Apache Flink(coming soon).
-- licenses: a bunch of licenses included
+- licenses: a bunch of licenses included.
- jars: packages needed by the Kyuubi server.
-- logs: Where the logs of the Kyuubi server locates.
+- logs: where the logs of the Kyuubi server locates.
- pid: stores the PID file of the Kyuubi server instance.
- work: the root of the working directories of all the forked sub-processes, a.k.a. SQL engines.
@@ -110,7 +110,7 @@ As mentioned above, for a quick start deployment, then only you need to be sure
### Setup JAVA
-You can either set it system-widely, e.g. in the `.bashrc` file.
+You can either set it system-widely, e.g. in the `.bashrc` file.
```bash
java -version
@@ -123,7 +123,7 @@ Or, `export JAVA_HOME=/path/to/java` in the local os session.
```bash
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk-11.0.5.jdk/Contents/Home
- java -version
+java -version
java version "11.0.5" 2019-10-15 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.5+10-LTS)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.5+10-LTS, mixed mode)
@@ -214,7 +214,7 @@ In this case, the session will create for the user named 'anonymous'.
Kyuubi will create a Spark SQL engine application using `kyuubi-spark-sql-engine_2.12-<version>.jar`.
It will cost awhile for the application to be ready before fully establishing the session.
-Otherwise, an existing application will be resued, and the time cost here is negligible.
+Otherwise, an existing application will be reused, and the time cost here is negligible.
Similarly, you can create a session for another user(or principal, subject, and maybe something else you defined), e.g. named `kentyao`,
@@ -317,7 +317,7 @@ Closing: 0: jdbc:hive2://localhost:10009/
Stop Kyuubi by running the following in the `$KYUUBI_HOME` directory:
```bash
-bin/kyuubi.sh stop
+bin/kyuubi stop
```
And then, you will see the KyuubiServer waving goodbye to you.
diff --git a/docs/quick_start/quick_start_with_datagrip.md b/docs/quick_start/quick_start_with_datagrip.md
index a37e18f..dc54d04 100644
--- a/docs/quick_start/quick_start_with_datagrip.md
+++ b/docs/quick_start/quick_start_with_datagrip.md
@@ -40,7 +40,7 @@ You should first download the missing driver files. Just click on the link below
### Generic JDBC Connection Settings
After install drivers, you should configure the right host and port which you can find in kyuubi server log. By default, we use `localhost` and `10009` to configure.
-Of curse, you can fill other configs.
+Of course, you can fill other configs.
After generic configs, you can use test connection to test.
diff --git a/docs/quick_start/quick_start_with_helm.md b/docs/quick_start/quick_start_with_helm.md
index cf4c6f2..5545245 100644
--- a/docs/quick_start/quick_start_with_helm.md
+++ b/docs/quick_start/quick_start_with_helm.md
@@ -42,7 +42,7 @@ cretate ns kyuubi
```bash
helm install kyuubi-helm ${KYUUBI_HOME}/docker/helm -n ${namespace_name}
```
-It will print variables and the way to get kyuubi expose ip and port
+It will print variables and the way to get kyuubi expose ip and port.
```bash
NAME: kyuubi-helm
LAST DEPLOYED: Wed Oct 20 15:22:47 2021
@@ -67,7 +67,7 @@ helm uninstall kyuubi-helm -n ${namespace_name}
#### Edit server config
-Modify `values.yaml` under `${KYUUBI_HOME}/docker/helm`
+Modify `values.yaml` under `${KYUUBI_HOME}/docker/helm`:
```yaml
# Kyuubi server numbers
replicaCount: 2
@@ -105,7 +105,7 @@ NAME READY STATUS RESTARTS AGE
kyuubi-server-585d8944c5-m7j5s 1/1 Running 0 30m
kyuubi-server-32sdsa1245-2d2sj 1/1 Running 0 30m
```
-then, use pod name to get logs
+then, use pod name to get logs:
```bash
kubectl -n ${namespace_name} logs kyuubi-server-585d8944c5-m7j5s
```
diff --git a/docs/quick_start/quick_start_with_jdbc.md b/docs/quick_start/quick_start_with_jdbc.md
index bc84098..a06f8f0 100644
--- a/docs/quick_start/quick_start_with_jdbc.md
+++ b/docs/quick_start/quick_start_with_jdbc.md
@@ -35,7 +35,7 @@ Add repository to your maven configuration file which may reside in `$MAVEN_HOME
<name>central maven repo https</name>
<url>https://repo.maven.apache.org/maven2</url>
</repository>
-<repositories>
+</repositories>
```
You can add below dependency to your `pom.xml` file in your application.
diff --git a/docs/sql/rules.md b/docs/sql/rules.md
index ee7b330..052612c 100644
--- a/docs/sql/rules.md
+++ b/docs/sql/rules.md
@@ -24,7 +24,7 @@
# Auxiliary SQL extension for Spark SQL
Kyuubi provides SQL extension out of box. Due to the version compatibility with Apache Spark, currently we only support Apache Spark branch-3.1 (i.e 3.1.1 and 3.1.2).
-And don't worry, Kyuubi will support the new Apache Spark version in future. Thanks to the adaptive query execution framework (AQE), Kyuubi can do these optimization.
+And don't worry, Kyuubi will support the new Apache Spark version in the future. Thanks to the adaptive query execution framework (AQE), Kyuubi can do these optimizations.
## What feature does Kyuubi SQL extension provide
- merging small files automatically
diff --git a/docs/sql/z-order-benchmark.md b/docs/sql/z-order-benchmark.md
index 5beb180..d04f630 100644
--- a/docs/sql/z-order-benchmark.md
+++ b/docs/sql/z-order-benchmark.md
@@ -23,25 +23,25 @@
# Z-order Benchmark
-Z-order is a technique that allows you to map multidimensional data to a single dimension. We did a performance test
+Z-order is a technique that allows you to map multidimensional data to a single dimension. We did a performance test.
-for this test ,we used aliyun Databricks Delta test case
-https://help.aliyun.com/document_detail/168137.html?spm=a2c4g.11186623.6.563.10d758ccclYtVb
+For this test ,we used aliyun Databricks Delta test case
+https://help.aliyun.com/document_detail/168137.html?spm=a2c4g.11186623.6.563.10d758ccclYtVb.
Prepare data for the three scenarios:
-1. 10 billion data and 2 hundred files(parquet files): for big file(1G)
-2. 10 billion data and 1 thousand files(parquet files): for medium file(200m)
-3. one billion data and 10 hundred files(parquet files): for smaller file(200k)
+1. 10 billion data and 2 hundred files (parquet files): for big file(1G)
+2. 10 billion data and 1 thousand files (parquet files): for medium file(200m)
+3. 1 billion data and 10 thousand files (parquet files): for smaller file(200k)
-test env:
+Test env:
spark-3.1.2
hadoop-2.7.2
-kyubbi-1.4.0
+kyuubi-1.4.0
-test step:
+Test step:
-Step1: create hive tables
+Step1: create hive tables.
```scala
spark.sql(s"drop database if exists $dbName cascade")
@@ -55,8 +55,8 @@ spark.sql(s"create table $connZorder (src_ip string, src_port int, dst_ip string
spark.sql(s"show tables").show(false)
```
-Step2: prepare data for parquet table with three scenarios
-we use the following code
+Step2: prepare data for parquet table with three scenarios,
+we use the following code.
```scala
def randomIPv4(r: Random) = Seq.fill(4)(r.nextInt(256)).mkString(".")
@@ -67,14 +67,14 @@ def randomConnRecord(r: Random) = ConnRecord(
dst_ip = randomIPv4(r), dst_port = randomPort(r))
```
-Step3: do optimize with z-order only ip and do optimize with order by only ip, sort column: src_ip, dst_ip and shuffle partition just as file numbers .
+Step3: do optimize with z-order only ip and do optimize with order by only ip, sort column: src_ip, dst_ip and shuffle partition just as file numbers.
```
INSERT overwrite table conn_order_only_ip select src_ip, src_port, dst_ip, dst_port from conn_random_parquet order by src_ip, dst_ip;
OPTIMIZE conn_zorder_only_ip ZORDER BY src_ip, dst_ip;
```
-Step4: do optimize with z-order and do optimize with order by , sort column: src_ip, src_port, dst_ip, dst_port and shuffle partition just as file numbers .
+Step4: do optimize with z-order and do optimize with order by, sort column: src_ip, src_port, dst_ip, dst_port and shuffle partition just as file numbers.
```
INSERT overwrite table conn_order select src_ip, src_port, dst_ip, dst_port from conn_random_parquet order by src_ip, src_port, dst_ip, dst_port;
@@ -82,7 +82,7 @@ OPTIMIZE conn_zorder ZORDER BY src_ip, src_port, dst_ip, dst_port;
```
-The complete code is as follows:
+The complete code is as follows:
```shell
./spark-shell
@@ -191,20 +191,20 @@ select count(*) from conn_zorder where src_ip like '157%' and dst_ip like '216.%
## Benchmark result
We have done two performance tests: one is to compare the efficiency of Z-order Optimize and Order by Sort,
-and the other is to query based on the optimized Z-order by data and Random data
+and the other is to query based on the optimized Z-order by data and Random data.
### Efficiency of Z-order Optimize and Order-by Sort
-**10 billion data and 1000 files and Query resource:200 core 600G memory**
+**10 billion data and 1000 files and Query resource: 200 core 600G memory**
-z-order by or order by only ip
+Z-order by or order by only ip:
| Table | row count | optimize time |
| ------------------- | -------------- | ------------------ |
| conn_order_only_ip | 10,000,000,000 | 1591.99 s |
| conn_zorder_only_ip | 10,000,000,000 | 8371.405 s |
-z-order by or order by all columns
+Z-order by or order by all columns:
| Table | row count | optimize time |
| ------------------- | -------------- | ------------------ |
@@ -213,9 +213,9 @@ z-order by or order by all columns
### Z-order by benchmark result
-by querying the tables before and after optimization, we find that
+By querying the tables before and after optimization, we find that:
-**10 billion data and 200 files and Query resource:200 core 600G memory**
+**10 billion data and 200 files and Query resource: 200 core 600G memory**
| Table | Average File Size | Scan row count | Average query time | row count Skipping ratio |
| ------------------- | ----------------- | -------------- | ------------------ | ------------------------ |
@@ -225,7 +225,7 @@ by querying the tables before and after optimization, we find that
-**10 billion data and 1000 files and Query resource:200 core 600G memory**
+**10 billion data and 1000 files and Query resource: 200 core 600G memory**
| Table | Average File Size | Scan row count | Average query time | row count Skipping ratio |
| ------------------- | ----------------- | -------------- | ------------------ | ------------------------ |
@@ -235,7 +235,7 @@ by querying the tables before and after optimization, we find that
-**1 billion data and 10000 files and Query resource:10 core 40G memory**
+**1 billion data and 10000 files and Query resource: 10 core 40G memory**
| Table | Average File Size | Scan row count | Average query time | row count Skipping ratio |
| ------------------- | ----------------- | -------------- | ------------------ | ------------------------ |
diff --git a/docs/tools/spark_block_cleaner.md b/docs/tools/spark_block_cleaner.md
index 8514fa9..c391005 100644
--- a/docs/tools/spark_block_cleaner.md
+++ b/docs/tools/spark_block_cleaner.md
@@ -56,16 +56,16 @@ Before you start using Spark Block Cleaner, you should build its docker images.
In the `KYUUBI_HOME` directory, you can use the following cmd to build docker image.
```shell
- docker build ./tools/spark-block-cleaner/kubernetes/docker
+docker build ./tools/spark-block-cleaner/kubernetes/docker
```
### Modify spark-block-cleaner.yml
You need to modify the `${KYUUBI_HOME}/tools/spark-block-cleaner/kubernetes/spark-block-cleaner.yml` to fit your current environment.
-In Kyuubi tools, we recommend using `DaemonSet` to start , and we offer default yaml file in daemonSet way.
+In Kyuubi tools, we recommend using `DaemonSet` to start, and we offer default yaml file in daemonSet way.
-Base file structure :
+Base file structure:
```yaml
apiVersion
kind
@@ -128,7 +128,7 @@ After you finishing modifying the above, you can use the following command `kube
Name | Default | unit | Meaning
--- | --- | --- | ---
CACHE_DIRS | /data/data1,/data/data2| | The target dirs in container path which will clean block files.
-FILE_EXPIRED_TIME | 604800 | seconds | Cleaner will clean the block files which current time - last modified time more than the fileExpiredTime.
-DEEP_CLEAN_FILE_EXPIRED_TIME | 432000 | seconds | Deep clean will clean the block files which current time - last modified time more than the deepCleanFileExpiredTime.
+FILE_EXPIRED_TIME | 604800 | seconds | Cleaner will clean the block files which current time - last modified time more than the fileExpiredTime.
+DEEP_CLEAN_FILE_EXPIRED_TIME | 432000 | seconds | Deep clean will clean the block files which current time - last modified time more than the deepCleanFileExpiredTime.
FREE_SPACE_THRESHOLD | 60 | % | After first clean, if free Space low than threshold trigger deep clean.
SCHEDULE_INTERVAL | 3600 | seconds | Cleaner sleep between cleaning.