This is an automated email from the ASF dual-hosted git repository.
liuyu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/pulsar.git
The following commit(s) were added to refs/heads/master by this push:
new f3ca870 feat: docs migration about Pulsar SQL (#11981)
f3ca870 is described below
commit f3ca870bfde8d2de2f22b21191eaa10094ec10c0
Author: Li Li <[email protected]>
AuthorDate: Tue Sep 14 11:53:36 2021 +0800
feat: docs migration about Pulsar SQL (#11981)
Signed-off-by: LiLi <[email protected]>
---
.../docs/sql-deployment-configurations.md | 170 ++++++++++++++++++
site2/website-next/docs/sql-getting-started.md | 147 ++++++++++++++++
site2/website-next/docs/sql-overview.md | 21 +++
site2/website-next/docs/sql-rest-api.md | 192 ++++++++++++++++++++
site2/website-next/sidebars.json | 10 ++
.../version-2.7.3/sql-deployment-configurations.md | 171 ++++++++++++++++++
.../version-2.7.3/sql-getting-started.md | 148 ++++++++++++++++
.../versioned_docs/version-2.7.3/sql-overview.md | 22 +++
.../versioned_docs/version-2.7.3/sql-rest-api.md | 193 +++++++++++++++++++++
.../version-2.8.0/sql-deployment-configurations.md | 171 ++++++++++++++++++
.../version-2.8.0/sql-getting-started.md | 148 ++++++++++++++++
.../versioned_docs/version-2.8.0/sql-overview.md | 22 +++
.../versioned_docs/version-2.8.0/sql-rest-api.md | 193 +++++++++++++++++++++
.../versioned_sidebars/version-2.7.3-sidebars.json | 24 +++
.../versioned_sidebars/version-2.8.0-sidebars.json | 24 +++
15 files changed, 1656 insertions(+)
diff --git a/site2/website-next/docs/sql-deployment-configurations.md
b/site2/website-next/docs/sql-deployment-configurations.md
new file mode 100644
index 0000000..ed181d8
--- /dev/null
+++ b/site2/website-next/docs/sql-deployment-configurations.md
@@ -0,0 +1,170 @@
+---
+id: sql-deployment-configurations
+title: Pulsar SQL configuration and deployment
+sidebar_label: Configuration and deployment
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+You can configure the Presto Pulsar connector and deploy a cluster with the following instructions.
+
+## Configure Presto Pulsar Connector
+You can configure the Presto Pulsar connector in the `${project.root}/conf/presto/catalog/pulsar.properties` properties file. The configuration options for the connector and their default values are as follows.
+
+```properties
+# name of the connector to be displayed in the catalog
+connector.name=pulsar
+
+# the url of Pulsar broker service
+pulsar.web-service-url=http://localhost:8080
+
+# URI of Zookeeper cluster
+pulsar.zookeeper-uri=localhost:2181
+
+# minimum number of entries to read at a single time
+pulsar.entry-read-batch-size=100
+
+# default number of splits to use per query
+pulsar.target-num-splits=4
+```
+
+You can connect Presto to a Pulsar cluster with multiple hosts. To configure
multiple hosts for brokers, add multiple URLs to `pulsar.web-service-url`. To
configure multiple hosts for ZooKeeper, add multiple URIs to
`pulsar.zookeeper-uri`. The following is an example.
+
+```properties
+pulsar.web-service-url=http://localhost:8080,localhost:8081,localhost:8082
+pulsar.zookeeper-uri=localhost1,localhost2:2181
+```
+
+## Query data from existing Presto clusters
+
+If you already have a Presto cluster, you can copy the Presto Pulsar connector
plugin to your existing cluster. Download the archived plugin package with the
following command.
+
+```bash
+$ wget pulsar:binary_release_url
+```
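+
+The exact steps depend on your Presto installation layout, but a minimal sketch of installing the extracted connector into an existing cluster looks roughly like the following; the archive name and the `/usr/lib/presto` path are assumptions, so adjust them to your release and installation.
+
+```bash
+# Assumption: the downloaded archive is named pulsar-presto-connector.tar.gz and
+# the existing Presto installation lives under /usr/lib/presto.
+$ tar -xzf pulsar-presto-connector.tar.gz
+$ mkdir -p /usr/lib/presto/plugin/pulsar
+$ cp -r pulsar-presto-connector/* /usr/lib/presto/plugin/pulsar/
+# Restart the Presto workers so that the new plugin is loaded.
+```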
+
+## Deploy a new cluster
+
+Since Pulsar SQL is powered by [Trino (formerly Presto SQL)](https://trino.io), the deployment configuration for the Pulsar SQL worker is the same as for a Presto deployment.
+
+:::note
+
+For how to set up a standalone single node environment, refer to [Query
data](sql-getting-started.md).
+
+:::
+
+
+You can use the same CLI args as the Presto launcher.
+
+```bash
+$ ./bin/pulsar sql-worker --help
+Usage: launcher [options] command
+
+Commands: run, start, stop, restart, kill, status
+
+Options:
+ -h, --help show this help message and exit
+ -v, --verbose Run verbosely
+ --etc-dir=DIR Defaults to INSTALL_PATH/etc
+ --launcher-config=FILE
+ Defaults to INSTALL_PATH/bin/launcher.properties
+ --node-config=FILE Defaults to ETC_DIR/node.properties
+ --jvm-config=FILE Defaults to ETC_DIR/jvm.config
+ --config=FILE Defaults to ETC_DIR/config.properties
+ --log-levels-file=FILE
+ Defaults to ETC_DIR/log.properties
+ --data-dir=DIR Defaults to INSTALL_PATH
+ --pid-file=FILE Defaults to DATA_DIR/var/run/launcher.pid
+ --launcher-log-file=FILE
+ Defaults to DATA_DIR/var/log/launcher.log (only in
+ daemon mode)
+ --server-log-file=FILE
+ Defaults to DATA_DIR/var/log/server.log (only in
+ daemon mode)
+ -D NAME=VALUE Set a Java system property
+
+```
+
+The default configuration for the cluster is located in
`${project.root}/conf/presto`. You can customize your deployment by modifying
the default configuration.
+
+You can set the worker to read from a different configuration directory, or to write data to a different directory.
+
+```bash
+$ ./bin/pulsar sql-worker run --etc-dir /tmp/incubator-pulsar/conf/presto --data-dir /tmp/presto-1
+```
+
+You can start the worker as a daemon process.
+
+```bash
+$ ./bin/pulsar sql-worker start
+```
+
+### Deploy a cluster on multiple nodes
+
+You can deploy a Pulsar SQL cluster or Presto cluster on multiple nodes. The following example shows how to deploy a cluster on three nodes.
+
+1. Copy the Pulsar binary distribution to three nodes.
+
+The first node runs as the Presto coordinator. The minimal configuration required in the `${project.root}/conf/presto/config.properties` file is as follows.
+
+```properties
+coordinator=true
+node-scheduler.include-coordinator=true
+http-server.http.port=8080
+query.max-memory=50GB
+query.max-memory-per-node=1GB
+discovery-server.enabled=true
+discovery.uri=<coordinator-url>
+```
+
+The other two nodes serve as worker nodes. You can use the following configuration for them.
+
+```properties
+coordinator=false
+http-server.http.port=8080
+query.max-memory=50GB
+query.max-memory-per-node=1GB
+discovery.uri=<coordinator-url>
+```
+
+2. Modify the `pulsar.web-service-url` and `pulsar.zookeeper-uri` configuration in the `${project.root}/conf/presto/catalog/pulsar.properties` file accordingly on each of the three nodes.
+
+3. Start the coordinator node.
+
+```bash
+$ ./bin/pulsar sql-worker run
+```
+
+4. Start worker nodes.
+
+```bash
+$ ./bin/pulsar sql-worker run
+```
+
+5. Start the SQL CLI and check the status of your cluster.
+
+```bash
+$ ./bin/pulsar sql --server <coordinate_url>
+```
+
+6. Check the status of your nodes.
+
+```bash
+presto> SELECT * FROM system.runtime.nodes;
+ node_id | http_uri | node_version | coordinator | state
+---------+-------------------------+--------------+-------------+--------
+ 1 | http://192.168.2.1:8081 | testversion | true | active
+ 3 | http://192.168.2.2:8081 | testversion | false | active
+ 2 | http://192.168.2.3:8081 | testversion | false | active
+```
+
+For more information about deployment in Presto, refer to [Presto
deployment](https://trino.io/docs/current/installation/deployment.html).
+
+:::note
+
+The broker does not advance the LAC (last add confirmed) by default, so when Pulsar SQL bypasses the broker to query data, it can only read entries up to the LAC that all of the bookies have learned. You can make the broker periodically write an explicit LAC by setting `bookkeeperExplicitLacIntervalInMills` to a value greater than `0` in `broker.conf`.
+
+:::
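+
+For example, the following `broker.conf` setting makes the broker write an explicit LAC roughly once per second; the interval value here is only an illustration.
+
+```properties
+# broker.conf
+# Interval (in milliseconds) at which the broker writes an explicit LAC to BookKeeper.
+# The default value 0 disables explicit LAC writes; 1000 below is just an example.
+bookkeeperExplicitLacIntervalInMills=1000
+```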
+
diff --git a/site2/website-next/docs/sql-getting-started.md
b/site2/website-next/docs/sql-getting-started.md
new file mode 100644
index 0000000..2d25b1e
--- /dev/null
+++ b/site2/website-next/docs/sql-getting-started.md
@@ -0,0 +1,147 @@
+---
+id: sql-getting-started
+title: Query data with Pulsar SQL
+sidebar_label: Query data
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+Before querying data in Pulsar, you need to install Pulsar and the built-in connectors.
+
+## Requirements
+1. Install [Pulsar](getting-started-standalone.md#install-pulsar-standalone).
+2. Install Pulsar [built-in
connectors](getting-started-standalone.md#install-builtin-connectors-optional).
+
+## Query data in Pulsar
+To query data in Pulsar with Pulsar SQL, complete the following steps.
+
+1. Start a Pulsar standalone cluster.
+
+```bash
+./bin/pulsar standalone
+```
+
+2. Start a Pulsar SQL worker.
+
+```bash
+./bin/pulsar sql-worker run
+```
+
+3. After the Pulsar standalone cluster and the SQL worker are initialized, run the SQL CLI.
+
+```bash
+./bin/pulsar sql
+```
+
+4. Test with SQL commands.
+
+```bash
+presto> show catalogs;
+ Catalog
+---------
+ pulsar
+ system
+(2 rows)
+
+Query 20180829_211752_00004_7qpwh, FINISHED, 1 node
+Splits: 19 total, 19 done (100.00%)
+0:00 [0 rows, 0B] [0 rows/s, 0B/s]
+
+
+presto> show schemas in pulsar;
+ Schema
+-----------------------
+ information_schema
+ public/default
+ public/functions
+ sample/standalone/ns1
+(4 rows)
+
+Query 20180829_211818_00005_7qpwh, FINISHED, 1 node
+Splits: 19 total, 19 done (100.00%)
+0:00 [4 rows, 89B] [21 rows/s, 471B/s]
+
+
+presto> show tables in pulsar."public/default";
+ Table
+-------
+(0 rows)
+
+Query 20180829_211839_00006_7qpwh, FINISHED, 1 node
+Splits: 19 total, 19 done (100.00%)
+0:00 [0 rows, 0B] [0 rows/s, 0B/s]
+
+```
+
+Since there is no data in Pulsar, no records are returned.
+
+5. Start the built-in connector _DataGeneratorSource_ and ingest some mock
data.
+
+```bash
+./bin/pulsar-admin sources create --name generator --destinationTopicName generator_test --source-type data-generator
+```
+
+Then you can query the topic in the `public/default` namespace.
+
+```bash
+presto> show tables in pulsar."public/default";
+ Table
+----------------
+ generator_test
+(1 row)
+
+Query 20180829_213202_00000_csyeu, FINISHED, 1 node
+Splits: 19 total, 19 done (100.00%)
+0:02 [1 rows, 38B] [0 rows/s, 17B/s]
+```
+
+You can now query the data within the topic "generator_test".
+
+```bash
+presto> select * from pulsar."public/default".generator_test;
+
+ firstname | middlename | lastname | email |
username | password | telephonenumber | age | companyemail
| nationalidentitycardnumber |
+-------------+-------------+-------------+----------------------------------+--------------+----------+-----------------+-----+-----------------------------------------------+----------------------------+
+ Genesis | Katherine | Wiley | [email protected] |
genesisw | y9D2dtU3 | 959-197-1860 | 71 |
[email protected] | 880-58-9247 |
+ Brayden | | Stanton | [email protected] |
braydens | ZnjmhXik | 220-027-867 | 81 | [email protected]
| 604-60-7069 |
+ Benjamin | Julian | Velasquez | [email protected] |
benjaminv | 8Bc7m3eb | 298-377-0062 | 21 |
[email protected] | 213-32-5882 |
+ Michael | Thomas | Donovan | [email protected] |
michaeld | OqBm9MLs | 078-134-4685 | 55 | [email protected]
| 443-30-3442 |
+ Brooklyn | Avery | Roach | [email protected] |
broach | IxtBLafO | 387-786-2998 | 68 | [email protected]
| 085-88-3973 |
+ Skylar | | Bradshaw | [email protected] |
skylarb | p6eC6cKy | 210-872-608 | 96 | [email protected]
| 453-46-0334 |
+.
+.
+.
+```
+
+You can query the mock data.
+
+## Query your own data
+If you want to query your own data, you need to ingest it into Pulsar first. You can write a simple producer that publishes custom-defined data to Pulsar, as in the following example.
+
+```java
+import org.apache.pulsar.client.api.Producer;
+import org.apache.pulsar.client.api.PulsarClient;
+import org.apache.pulsar.client.impl.schema.AvroSchema;
+
+public class Test {
+
+    public static class Foo {
+        private int field1 = 1;
+        private String field2;
+        private long field3;
+
+        // Setters are required so that the loop below can populate the fields.
+        public void setField1(int field1) { this.field1 = field1; }
+        public void setField2(String field2) { this.field2 = field2; }
+        public void setField3(long field3) { this.field3 = field3; }
+    }
+
+    public static void main(String[] args) throws Exception {
+        PulsarClient pulsarClient = PulsarClient.builder()
+                .serviceUrl("pulsar://localhost:6650")
+                .build();
+        Producer<Foo> producer = pulsarClient
+                .newProducer(AvroSchema.of(Foo.class))
+                .topic("test_topic")
+                .create();
+
+        // Publish 1000 records; Pulsar SQL can then query them from the test_topic topic.
+        for (int i = 0; i < 1000; i++) {
+            Foo foo = new Foo();
+            foo.setField1(i);
+            foo.setField2("foo" + i);
+            foo.setField3(System.currentTimeMillis());
+            producer.newMessage().value(foo).send();
+        }
+        producer.close();
+        pulsarClient.close();
+    }
+}
+```
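+
+After the producer completes, you should be able to query the ingested records from the SQL CLI in the same way as the mock data above, for example:
+
+```bash
+presto> select * from pulsar."public/default".test_topic;
+```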
diff --git a/site2/website-next/docs/sql-overview.md
b/site2/website-next/docs/sql-overview.md
new file mode 100644
index 0000000..d99f672
--- /dev/null
+++ b/site2/website-next/docs/sql-overview.md
@@ -0,0 +1,21 @@
+---
+id: sql-overview
+title: Pulsar SQL Overview
+sidebar_label: Overview
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+Apache Pulsar is used to store streams of event data, and the event data is
structured with predefined fields. With the implementation of the [Schema
Registry](schema-get-started.md), you can store structured data in Pulsar and
query the data by using [Trino (formerly Presto SQL)](https://trino.io/).
+
+As the core of Pulsar SQL, the Presto Pulsar connector enables Presto workers within a Presto cluster to query data from Pulsar.
+
+
+
+Query performance is efficient and highly scalable, because Pulsar adopts a [two-level, segment-based architecture](concepts-architecture-overview.md#apache-bookkeeper).
+
+Topics in Pulsar are stored as segments in [Apache BookKeeper](https://bookkeeper.apache.org/). Each topic segment is replicated to a configurable number of BookKeeper nodes (the default is `3`), which enables concurrent reads and high read throughput. The Presto Pulsar connector reads data directly from BookKeeper, so Presto workers can read concurrently from a horizontally scalable number of BookKeeper nodes.
+
+
diff --git a/site2/website-next/docs/sql-rest-api.md
b/site2/website-next/docs/sql-rest-api.md
new file mode 100644
index 0000000..5ee292f
--- /dev/null
+++ b/site2/website-next/docs/sql-rest-api.md
@@ -0,0 +1,192 @@
+---
+id: sql-rest-api
+title: Pulsar SQL REST APIs
+sidebar_label: REST APIs
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+This section lists resources that make up the Presto REST API v1.
+
+## Request for Presto services
+
+All requests for Presto services should use version 1 of the Presto REST API.
+
+To request services, use the explicit URL `http://presto.service:8081/v1`. Replace `presto.service:8081` with your actual Presto address before sending requests.
+
+`POST` requests require the `X-Presto-User` header. If you use authentication,
you must use the same `username` that is specified in the authentication
configuration. If you do not use authentication, you can specify anything for
`username`.
+
+```properties
+X-Presto-User: username
+```
+
+For more information about headers, refer to
[PrestoHeaders](https://github.com/trinodb/trino).
+
+## Schema
+
+You can pass the SQL statement in the HTTP body. All data is received as a JSON document that might contain a `nextUri` link. If the received JSON document contains a `nextUri` link, keep requesting that link until the received data no longer contains one. If no error is returned, the query completes successfully. If an `error` field appears in `stats`, the query fails.
+
+The following is an example of `show catalogs`. The query continues until the received JSON document no longer contains a `nextUri` link. Since no `error` appears in `stats`, the query completes successfully.
+
+```bash
+➜ ~ curl --header "X-Presto-User: test-user" --request POST --data 'show catalogs' http://localhost:8081/v1/statement
+{
+ "infoUri" :
"http://localhost:8081/ui/query.html?20191113_033653_00006_dg6hb",
+ "stats" : {
+ "queued" : true,
+ "nodes" : 0,
+ "userTimeMillis" : 0,
+ "cpuTimeMillis" : 0,
+ "wallTimeMillis" : 0,
+ "processedBytes" : 0,
+ "processedRows" : 0,
+ "runningSplits" : 0,
+ "queuedTimeMillis" : 0,
+ "queuedSplits" : 0,
+ "completedSplits" : 0,
+ "totalSplits" : 0,
+ "scheduled" : false,
+ "peakMemoryBytes" : 0,
+ "state" : "QUEUED",
+ "elapsedTimeMillis" : 0
+ },
+ "id" : "20191113_033653_00006_dg6hb",
+ "nextUri" :
"http://localhost:8081/v1/statement/20191113_033653_00006_dg6hb/1"
+}
+
+➜ ~ curl http://localhost:8081/v1/statement/20191113_033653_00006_dg6hb/1
+{
+ "infoUri" :
"http://localhost:8081/ui/query.html?20191113_033653_00006_dg6hb",
+ "nextUri" :
"http://localhost:8081/v1/statement/20191113_033653_00006_dg6hb/2",
+ "id" : "20191113_033653_00006_dg6hb",
+ "stats" : {
+ "state" : "PLANNING",
+ "totalSplits" : 0,
+ "queued" : false,
+ "userTimeMillis" : 0,
+ "completedSplits" : 0,
+ "scheduled" : false,
+ "wallTimeMillis" : 0,
+ "runningSplits" : 0,
+ "queuedSplits" : 0,
+ "cpuTimeMillis" : 0,
+ "processedRows" : 0,
+ "processedBytes" : 0,
+ "nodes" : 0,
+ "queuedTimeMillis" : 1,
+ "elapsedTimeMillis" : 2,
+ "peakMemoryBytes" : 0
+ }
+}
+
+➜ ~ curl http://localhost:8081/v1/statement/20191113_033653_00006_dg6hb/2
+{
+ "id" : "20191113_033653_00006_dg6hb",
+ "data" : [
+ [
+ "pulsar"
+ ],
+ [
+ "system"
+ ]
+ ],
+ "infoUri" :
"http://localhost:8081/ui/query.html?20191113_033653_00006_dg6hb",
+ "columns" : [
+ {
+ "typeSignature" : {
+ "rawType" : "varchar",
+ "arguments" : [
+ {
+ "kind" : "LONG_LITERAL",
+ "value" : 6
+ }
+ ],
+ "literalArguments" : [],
+ "typeArguments" : []
+ },
+ "name" : "Catalog",
+ "type" : "varchar(6)"
+ }
+ ],
+ "stats" : {
+ "wallTimeMillis" : 104,
+ "scheduled" : true,
+ "userTimeMillis" : 14,
+ "progressPercentage" : 100,
+ "totalSplits" : 19,
+ "nodes" : 1,
+ "cpuTimeMillis" : 16,
+ "queued" : false,
+ "queuedTimeMillis" : 1,
+ "state" : "FINISHED",
+ "peakMemoryBytes" : 0,
+ "elapsedTimeMillis" : 111,
+ "processedBytes" : 0,
+ "processedRows" : 0,
+ "queuedSplits" : 0,
+ "rootStage" : {
+ "cpuTimeMillis" : 1,
+ "runningSplits" : 0,
+ "state" : "FINISHED",
+ "completedSplits" : 1,
+ "subStages" : [
+ {
+ "cpuTimeMillis" : 14,
+ "runningSplits" : 0,
+ "state" : "FINISHED",
+ "completedSplits" : 17,
+ "subStages" : [
+ {
+ "wallTimeMillis" : 7,
+ "subStages" : [],
+ "stageId" : "2",
+ "done" : true,
+ "nodes" : 1,
+ "totalSplits" : 1,
+ "processedBytes" : 22,
+ "processedRows" : 2,
+ "queuedSplits" : 0,
+ "userTimeMillis" : 1,
+ "cpuTimeMillis" : 1,
+ "runningSplits" : 0,
+ "state" : "FINISHED",
+ "completedSplits" : 1
+ }
+ ],
+ "wallTimeMillis" : 92,
+ "nodes" : 1,
+ "done" : true,
+ "stageId" : "1",
+ "userTimeMillis" : 12,
+ "processedRows" : 2,
+ "processedBytes" : 51,
+ "queuedSplits" : 0,
+ "totalSplits" : 17
+ }
+ ],
+ "wallTimeMillis" : 5,
+ "done" : true,
+ "nodes" : 1,
+ "stageId" : "0",
+ "userTimeMillis" : 1,
+ "processedRows" : 2,
+ "processedBytes" : 22,
+ "totalSplits" : 1,
+ "queuedSplits" : 0
+ },
+ "runningSplits" : 0,
+ "completedSplits" : 19
+ }
+}
+```
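+
+The loop of following `nextUri` links can also be scripted. The following is a minimal sketch; it assumes `jq` is installed and that Presto listens on `localhost:8081`.
+
+```bash
+#!/usr/bin/env bash
+# Submit the statement, then keep following nextUri until it disappears.
+NEXT=$(curl -s --header "X-Presto-User: test-user" --request POST \
+  --data 'show catalogs' http://localhost:8081/v1/statement | jq -r '.nextUri')
+while [ "$NEXT" != "null" ]; do
+  RESPONSE=$(curl -s "$NEXT")
+  echo "$RESPONSE" | jq -c '.data // empty'    # print any rows returned in this chunk
+  NEXT=$(echo "$RESPONSE" | jq -r '.nextUri')  # becomes "null" when nextUri is absent
+done
+```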
+
+:::note
+
+Since the response data is not synchronized with the query state from the client's perspective, you cannot rely on the response data alone to determine whether the query has completed.
+
+:::
+
+
+For more information about Presto REST API, refer to [Presto HTTP
Protocol](https://github.com/prestosql/presto/wiki/HTTP-Protocol).
diff --git a/site2/website-next/sidebars.json b/site2/website-next/sidebars.json
index b9570b8..f8d6dc8 100644
--- a/site2/website-next/sidebars.json
+++ b/site2/website-next/sidebars.json
@@ -63,6 +63,16 @@
"io-develop",
"io-cli"
]
+ },
+ {
+ "type": "category",
+ "label": "Pulsar SQL",
+ "items": [
+ "sql-overview",
+ "sql-getting-started",
+ "sql-deployment-configurations",
+ "sql-rest-api"
+ ]
}
]
}
\ No newline at end of file
diff --git
a/site2/website-next/versioned_docs/version-2.7.3/sql-deployment-configurations.md
b/site2/website-next/versioned_docs/version-2.7.3/sql-deployment-configurations.md
new file mode 100644
index 0000000..218c1e2
--- /dev/null
+++
b/site2/website-next/versioned_docs/version-2.7.3/sql-deployment-configurations.md
@@ -0,0 +1,171 @@
+---
+id: sql-deployment-configurations
+title: Pulsar SQL configuration and deployment
+sidebar_label: Configuration and deployment
+original_id: sql-deployment-configurations
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+You can configure the Presto Pulsar connector and deploy a cluster with the following instructions.
+
+## Configure Presto Pulsar Connector
+You can configure the Presto Pulsar connector in the `${project.root}/conf/presto/catalog/pulsar.properties` properties file. The configuration options for the connector and their default values are as follows.
+
+```properties
+# name of the connector to be displayed in the catalog
+connector.name=pulsar
+
+# the url of Pulsar broker service
+pulsar.broker-service-url=http://localhost:8080
+
+# URI of Zookeeper cluster
+pulsar.zookeeper-uri=localhost:2181
+
+# minimum number of entries to read at a single time
+pulsar.entry-read-batch-size=100
+
+# default number of splits to use per query
+pulsar.target-num-splits=4
+```
+
+You can connect Presto to a Pulsar cluster with multiple hosts. To configure
multiple hosts for brokers, add multiple URLs to `pulsar.broker-service-url`.
To configure multiple hosts for ZooKeeper, add multiple URIs to
`pulsar.zookeeper-uri`. The following is an example.
+
+```properties
+pulsar.broker-service-url=http://localhost:8080,localhost:8081,localhost:8082
+pulsar.zookeeper-uri=localhost1,localhost2:2181
+```
+
+## Query data from existing Presto clusters
+
+If you already have a Presto cluster, you can copy the Presto Pulsar connector
plugin to your existing cluster. Download the archived plugin package with the
following command.
+
+```bash
+$ wget pulsar:binary_release_url
+```
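+
+The exact steps depend on your Presto installation layout, but a minimal sketch of installing the extracted connector into an existing cluster looks roughly like the following; the archive name and the `/usr/lib/presto` path are assumptions, so adjust them to your release and installation.
+
+```bash
+# Assumption: the downloaded archive is named pulsar-presto-connector.tar.gz and
+# the existing Presto installation lives under /usr/lib/presto.
+$ tar -xzf pulsar-presto-connector.tar.gz
+$ mkdir -p /usr/lib/presto/plugin/pulsar
+$ cp -r pulsar-presto-connector/* /usr/lib/presto/plugin/pulsar/
+# Restart the Presto workers so that the new plugin is loaded.
+```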
+
+## Deploy a new cluster
+
+Since Pulsar SQL is powered by [Presto](https://prestosql.io), the deployment configuration for the Pulsar SQL worker is the same as for a Presto deployment.
+
+:::note
+
+For how to set up a standalone single node environment, refer to [Query
data](sql-getting-started.md).
+
+:::
+
+
+You can use the same CLI args as the Presto launcher.
+
+```bash
+$ ./bin/pulsar sql-worker --help
+Usage: launcher [options] command
+
+Commands: run, start, stop, restart, kill, status
+
+Options:
+ -h, --help show this help message and exit
+ -v, --verbose Run verbosely
+ --etc-dir=DIR Defaults to INSTALL_PATH/etc
+ --launcher-config=FILE
+ Defaults to INSTALL_PATH/bin/launcher.properties
+ --node-config=FILE Defaults to ETC_DIR/node.properties
+ --jvm-config=FILE Defaults to ETC_DIR/jvm.config
+ --config=FILE Defaults to ETC_DIR/config.properties
+ --log-levels-file=FILE
+ Defaults to ETC_DIR/log.properties
+ --data-dir=DIR Defaults to INSTALL_PATH
+ --pid-file=FILE Defaults to DATA_DIR/var/run/launcher.pid
+ --launcher-log-file=FILE
+ Defaults to DATA_DIR/var/log/launcher.log (only in
+ daemon mode)
+ --server-log-file=FILE
+ Defaults to DATA_DIR/var/log/server.log (only in
+ daemon mode)
+ -D NAME=VALUE Set a Java system property
+
+```
+
+The default configuration for the cluster is located in
`${project.root}/conf/presto`. You can customize your deployment by modifying
the default configuration.
+
+You can set the worker to read from a different configuration directory, or to write data to a different directory.
+
+```bash
+$ ./bin/pulsar sql-worker run --etc-dir /tmp/incubator-pulsar/conf/presto --data-dir /tmp/presto-1
+```
+
+You can start the worker as a daemon process.
+
+```bash
+$ ./bin/pulsar sql-worker start
+```
+
+### Deploy a cluster on multiple nodes
+
+You can deploy a Pulsar SQL cluster or Presto cluster on multiple nodes. The following example shows how to deploy a cluster on three nodes.
+
+1. Copy the Pulsar binary distribution to three nodes.
+
+The first node runs as the Presto coordinator. The minimal configuration required in the `${project.root}/conf/presto/config.properties` file is as follows.
+
+```properties
+coordinator=true
+node-scheduler.include-coordinator=true
+http-server.http.port=8080
+query.max-memory=50GB
+query.max-memory-per-node=1GB
+discovery-server.enabled=true
+discovery.uri=<coordinator-url>
+```
+
+The other two nodes serve as worker nodes. You can use the following configuration for them.
+
+```properties
+coordinator=false
+http-server.http.port=8080
+query.max-memory=50GB
+query.max-memory-per-node=1GB
+discovery.uri=<coordinator-url>
+```
+
+2. Modify the `pulsar.broker-service-url` and `pulsar.zookeeper-uri` configuration in the `${project.root}/conf/presto/catalog/pulsar.properties` file accordingly on each of the three nodes.
+
+3. Start the coordinator node.
+
+```bash
+$ ./bin/pulsar sql-worker run
+```
+
+4. Start worker nodes.
+
+```bash
+$ ./bin/pulsar sql-worker run
+```
+
+5. Start the SQL CLI and check the status of your cluster.
+
+```bash
+$ ./bin/pulsar sql --server <coordinate_url>
+```
+
+6. Check the status of your nodes.
+
+```bash
+presto> SELECT * FROM system.runtime.nodes;
+ node_id | http_uri | node_version | coordinator | state
+---------+-------------------------+--------------+-------------+--------
+ 1 | http://192.168.2.1:8081 | testversion | true | active
+ 3 | http://192.168.2.2:8081 | testversion | false | active
+ 2 | http://192.168.2.3:8081 | testversion | false | active
+```
+
+For more information about deployment in Presto, refer to [Presto
deployment](https://prestosql.io/docs/current/installation/deployment.html).
+
+:::note
+
+The broker does not advance the LAC (last add confirmed) by default, so when Pulsar SQL bypasses the broker to query data, it can only read entries up to the LAC that all of the bookies have learned. You can make the broker periodically write an explicit LAC by setting `bookkeeperExplicitLacIntervalInMills` to a value greater than `0` in `broker.conf`.
+
+:::
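+
+For example, the following `broker.conf` setting makes the broker write an explicit LAC roughly once per second; the interval value here is only an illustration.
+
+```properties
+# broker.conf
+# Interval (in milliseconds) at which the broker writes an explicit LAC to BookKeeper.
+# The default value 0 disables explicit LAC writes; 1000 below is just an example.
+bookkeeperExplicitLacIntervalInMills=1000
+```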
+
diff --git
a/site2/website-next/versioned_docs/version-2.7.3/sql-getting-started.md
b/site2/website-next/versioned_docs/version-2.7.3/sql-getting-started.md
new file mode 100644
index 0000000..e166435
--- /dev/null
+++ b/site2/website-next/versioned_docs/version-2.7.3/sql-getting-started.md
@@ -0,0 +1,148 @@
+---
+id: sql-getting-started
+title: Query data with Pulsar SQL
+sidebar_label: Query data
+original_id: sql-getting-started
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+Before querying data in Pulsar, you need to install Pulsar and the built-in connectors.
+
+## Requirements
+1. Install [Pulsar](getting-started-standalone.md#install-pulsar-standalone).
+2. Install Pulsar [built-in
connectors](getting-started-standalone.md#install-builtin-connectors-optional).
+
+## Query data in Pulsar
+To query data in Pulsar with Pulsar SQL, complete the following steps.
+
+1. Start a Pulsar standalone cluster.
+
+```bash
+./bin/pulsar standalone
+```
+
+2. Start a Pulsar SQL worker.
+
+```bash
+./bin/pulsar sql-worker run
+```
+
+3. After the Pulsar standalone cluster and the SQL worker are initialized, run the SQL CLI.
+
+```bash
+./bin/pulsar sql
+```
+
+4. Test with SQL commands.
+
+```bash
+presto> show catalogs;
+ Catalog
+---------
+ pulsar
+ system
+(2 rows)
+
+Query 20180829_211752_00004_7qpwh, FINISHED, 1 node
+Splits: 19 total, 19 done (100.00%)
+0:00 [0 rows, 0B] [0 rows/s, 0B/s]
+
+
+presto> show schemas in pulsar;
+ Schema
+-----------------------
+ information_schema
+ public/default
+ public/functions
+ sample/standalone/ns1
+(4 rows)
+
+Query 20180829_211818_00005_7qpwh, FINISHED, 1 node
+Splits: 19 total, 19 done (100.00%)
+0:00 [4 rows, 89B] [21 rows/s, 471B/s]
+
+
+presto> show tables in pulsar."public/default";
+ Table
+-------
+(0 rows)
+
+Query 20180829_211839_00006_7qpwh, FINISHED, 1 node
+Splits: 19 total, 19 done (100.00%)
+0:00 [0 rows, 0B] [0 rows/s, 0B/s]
+
+```
+
+Since there is no data in Pulsar, no records are returned.
+
+5. Start the built-in connector _DataGeneratorSource_ and ingest some mock
data.
+
+```bash
+./bin/pulsar-admin sources create --name generator --destinationTopicName generator_test --source-type data-generator
+```
+
+Then you can query the topic in the `public/default` namespace.
+
+```bash
+presto> show tables in pulsar."public/default";
+ Table
+----------------
+ generator_test
+(1 row)
+
+Query 20180829_213202_00000_csyeu, FINISHED, 1 node
+Splits: 19 total, 19 done (100.00%)
+0:02 [1 rows, 38B] [0 rows/s, 17B/s]
+```
+
+You can now query the data within the topic "generator_test".
+
+```bash
+presto> select * from pulsar."public/default".generator_test;
+
+ firstname | middlename | lastname | email |
username | password | telephonenumber | age | companyemail
| nationalidentitycardnumber |
+-------------+-------------+-------------+----------------------------------+--------------+----------+-----------------+-----+-----------------------------------------------+----------------------------+
+ Genesis | Katherine | Wiley | [email protected] |
genesisw | y9D2dtU3 | 959-197-1860 | 71 |
[email protected] | 880-58-9247 |
+ Brayden | | Stanton | [email protected] |
braydens | ZnjmhXik | 220-027-867 | 81 | [email protected]
| 604-60-7069 |
+ Benjamin | Julian | Velasquez | [email protected] |
benjaminv | 8Bc7m3eb | 298-377-0062 | 21 |
[email protected] | 213-32-5882 |
+ Michael | Thomas | Donovan | [email protected] |
michaeld | OqBm9MLs | 078-134-4685 | 55 | [email protected]
| 443-30-3442 |
+ Brooklyn | Avery | Roach | [email protected] |
broach | IxtBLafO | 387-786-2998 | 68 | [email protected]
| 085-88-3973 |
+ Skylar | | Bradshaw | [email protected] |
skylarb | p6eC6cKy | 210-872-608 | 96 | [email protected]
| 453-46-0334 |
+.
+.
+.
+```
+
+You can query the mock data.
+
+## Query your own data
+If you want to query your own data, you need to ingest it into Pulsar first. You can write a simple producer that publishes custom-defined data to Pulsar, as in the following example.
+
+```java
+import org.apache.pulsar.client.api.Producer;
+import org.apache.pulsar.client.api.PulsarClient;
+import org.apache.pulsar.client.impl.schema.AvroSchema;
+
+public class Test {
+
+    public static class Foo {
+        private int field1 = 1;
+        private String field2;
+        private long field3;
+
+        // Setters are required so that the loop below can populate the fields.
+        public void setField1(int field1) { this.field1 = field1; }
+        public void setField2(String field2) { this.field2 = field2; }
+        public void setField3(long field3) { this.field3 = field3; }
+    }
+
+    public static void main(String[] args) throws Exception {
+        PulsarClient pulsarClient = PulsarClient.builder()
+                .serviceUrl("pulsar://localhost:6650")
+                .build();
+        Producer<Foo> producer = pulsarClient
+                .newProducer(AvroSchema.of(Foo.class))
+                .topic("test_topic")
+                .create();
+
+        // Publish 1000 records; Pulsar SQL can then query them from the test_topic topic.
+        for (int i = 0; i < 1000; i++) {
+            Foo foo = new Foo();
+            foo.setField1(i);
+            foo.setField2("foo" + i);
+            foo.setField3(System.currentTimeMillis());
+            producer.newMessage().value(foo).send();
+        }
+        producer.close();
+        pulsarClient.close();
+    }
+}
+```
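+
+After the producer completes, you should be able to query the ingested records from the SQL CLI in the same way as the mock data above, for example:
+
+```bash
+presto> select * from pulsar."public/default".test_topic;
+```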
diff --git a/site2/website-next/versioned_docs/version-2.7.3/sql-overview.md
b/site2/website-next/versioned_docs/version-2.7.3/sql-overview.md
new file mode 100644
index 0000000..369ac2c
--- /dev/null
+++ b/site2/website-next/versioned_docs/version-2.7.3/sql-overview.md
@@ -0,0 +1,22 @@
+---
+id: sql-overview
+title: Pulsar SQL Overview
+sidebar_label: Overview
+original_id: sql-overview
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+Apache Pulsar is used to store streams of event data, and the event data is
structured with predefined fields. With the implementation of the [Schema
Registry](schema-get-started.md), you can store structured data in Pulsar and
query the data by using [Presto](https://prestosql.io/).
+
+As the core of Pulsar SQL, the Presto Pulsar connector enables Presto workers within a Presto cluster to query data from Pulsar.
+
+
+
+Query performance is efficient and highly scalable, because Pulsar adopts a [two-level, segment-based architecture](concepts-architecture-overview.md#apache-bookkeeper).
+
+Topics in Pulsar are stored as segments in [Apache BookKeeper](https://bookkeeper.apache.org/). Each topic segment is replicated to a configurable number of BookKeeper nodes (the default is `3`), which enables concurrent reads and high read throughput. The Presto Pulsar connector reads data directly from BookKeeper, so Presto workers can read concurrently from a horizontally scalable number of BookKeeper nodes.
+
+
diff --git a/site2/website-next/versioned_docs/version-2.7.3/sql-rest-api.md
b/site2/website-next/versioned_docs/version-2.7.3/sql-rest-api.md
new file mode 100644
index 0000000..61b76d6
--- /dev/null
+++ b/site2/website-next/versioned_docs/version-2.7.3/sql-rest-api.md
@@ -0,0 +1,193 @@
+---
+id: sql-rest-api
+title: Pulsar SQL REST APIs
+sidebar_label: REST APIs
+original_id: sql-rest-api
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+This section lists resources that make up the Presto REST API v1.
+
+## Request for Presto services
+
+All requests for Presto services should use version 1 of the Presto REST API.
+
+To request services, use the explicit URL `http://presto.service:8081/v1`. Replace `presto.service:8081` with your actual Presto address before sending requests.
+
+`POST` requests require the `X-Presto-User` header. If you use authentication,
you must use the same `username` that is specified in the authentication
configuration. If you do not use authentication, you can specify anything for
`username`.
+
+```properties
+X-Presto-User: username
+```
+
+For more information about headers, refer to
[PrestoHeaders](https://github.com/trinodb/trino).
+
+## Schema
+
+You can pass the SQL statement in the HTTP body. All data is received as a JSON document that might contain a `nextUri` link. If the received JSON document contains a `nextUri` link, keep requesting that link until the received data no longer contains one. If no error is returned, the query completes successfully. If an `error` field appears in `stats`, the query fails.
+
+The following is an example of `show catalogs`. The query continues until the received JSON document no longer contains a `nextUri` link. Since no `error` appears in `stats`, the query completes successfully.
+
+```bash
+➜ ~ curl --header "X-Presto-User: test-user" --request POST --data 'show catalogs' http://localhost:8081/v1/statement
+{
+ "infoUri" :
"http://localhost:8081/ui/query.html?20191113_033653_00006_dg6hb",
+ "stats" : {
+ "queued" : true,
+ "nodes" : 0,
+ "userTimeMillis" : 0,
+ "cpuTimeMillis" : 0,
+ "wallTimeMillis" : 0,
+ "processedBytes" : 0,
+ "processedRows" : 0,
+ "runningSplits" : 0,
+ "queuedTimeMillis" : 0,
+ "queuedSplits" : 0,
+ "completedSplits" : 0,
+ "totalSplits" : 0,
+ "scheduled" : false,
+ "peakMemoryBytes" : 0,
+ "state" : "QUEUED",
+ "elapsedTimeMillis" : 0
+ },
+ "id" : "20191113_033653_00006_dg6hb",
+ "nextUri" :
"http://localhost:8081/v1/statement/20191113_033653_00006_dg6hb/1"
+}
+
+➜ ~ curl http://localhost:8081/v1/statement/20191113_033653_00006_dg6hb/1
+{
+ "infoUri" :
"http://localhost:8081/ui/query.html?20191113_033653_00006_dg6hb",
+ "nextUri" :
"http://localhost:8081/v1/statement/20191113_033653_00006_dg6hb/2",
+ "id" : "20191113_033653_00006_dg6hb",
+ "stats" : {
+ "state" : "PLANNING",
+ "totalSplits" : 0,
+ "queued" : false,
+ "userTimeMillis" : 0,
+ "completedSplits" : 0,
+ "scheduled" : false,
+ "wallTimeMillis" : 0,
+ "runningSplits" : 0,
+ "queuedSplits" : 0,
+ "cpuTimeMillis" : 0,
+ "processedRows" : 0,
+ "processedBytes" : 0,
+ "nodes" : 0,
+ "queuedTimeMillis" : 1,
+ "elapsedTimeMillis" : 2,
+ "peakMemoryBytes" : 0
+ }
+}
+
+➜ ~ curl http://localhost:8081/v1/statement/20191113_033653_00006_dg6hb/2
+{
+ "id" : "20191113_033653_00006_dg6hb",
+ "data" : [
+ [
+ "pulsar"
+ ],
+ [
+ "system"
+ ]
+ ],
+ "infoUri" :
"http://localhost:8081/ui/query.html?20191113_033653_00006_dg6hb",
+ "columns" : [
+ {
+ "typeSignature" : {
+ "rawType" : "varchar",
+ "arguments" : [
+ {
+ "kind" : "LONG_LITERAL",
+ "value" : 6
+ }
+ ],
+ "literalArguments" : [],
+ "typeArguments" : []
+ },
+ "name" : "Catalog",
+ "type" : "varchar(6)"
+ }
+ ],
+ "stats" : {
+ "wallTimeMillis" : 104,
+ "scheduled" : true,
+ "userTimeMillis" : 14,
+ "progressPercentage" : 100,
+ "totalSplits" : 19,
+ "nodes" : 1,
+ "cpuTimeMillis" : 16,
+ "queued" : false,
+ "queuedTimeMillis" : 1,
+ "state" : "FINISHED",
+ "peakMemoryBytes" : 0,
+ "elapsedTimeMillis" : 111,
+ "processedBytes" : 0,
+ "processedRows" : 0,
+ "queuedSplits" : 0,
+ "rootStage" : {
+ "cpuTimeMillis" : 1,
+ "runningSplits" : 0,
+ "state" : "FINISHED",
+ "completedSplits" : 1,
+ "subStages" : [
+ {
+ "cpuTimeMillis" : 14,
+ "runningSplits" : 0,
+ "state" : "FINISHED",
+ "completedSplits" : 17,
+ "subStages" : [
+ {
+ "wallTimeMillis" : 7,
+ "subStages" : [],
+ "stageId" : "2",
+ "done" : true,
+ "nodes" : 1,
+ "totalSplits" : 1,
+ "processedBytes" : 22,
+ "processedRows" : 2,
+ "queuedSplits" : 0,
+ "userTimeMillis" : 1,
+ "cpuTimeMillis" : 1,
+ "runningSplits" : 0,
+ "state" : "FINISHED",
+ "completedSplits" : 1
+ }
+ ],
+ "wallTimeMillis" : 92,
+ "nodes" : 1,
+ "done" : true,
+ "stageId" : "1",
+ "userTimeMillis" : 12,
+ "processedRows" : 2,
+ "processedBytes" : 51,
+ "queuedSplits" : 0,
+ "totalSplits" : 17
+ }
+ ],
+ "wallTimeMillis" : 5,
+ "done" : true,
+ "nodes" : 1,
+ "stageId" : "0",
+ "userTimeMillis" : 1,
+ "processedRows" : 2,
+ "processedBytes" : 22,
+ "totalSplits" : 1,
+ "queuedSplits" : 0
+ },
+ "runningSplits" : 0,
+ "completedSplits" : 19
+ }
+}
+```
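+
+The loop of following `nextUri` links can also be scripted. The following is a minimal sketch; it assumes `jq` is installed and that Presto listens on `localhost:8081`.
+
+```bash
+#!/usr/bin/env bash
+# Submit the statement, then keep following nextUri until it disappears.
+NEXT=$(curl -s --header "X-Presto-User: test-user" --request POST \
+  --data 'show catalogs' http://localhost:8081/v1/statement | jq -r '.nextUri')
+while [ "$NEXT" != "null" ]; do
+  RESPONSE=$(curl -s "$NEXT")
+  echo "$RESPONSE" | jq -c '.data // empty'    # print any rows returned in this chunk
+  NEXT=$(echo "$RESPONSE" | jq -r '.nextUri')  # becomes "null" when nextUri is absent
+done
+```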
+
+:::note
+
+Since the response data is not synchronized with the query state from the client's perspective, you cannot rely on the response data alone to determine whether the query has completed.
+
+:::
+
+
+For more information about Presto REST API, refer to [Presto HTTP
Protocol](https://github.com/prestosql/presto/wiki/HTTP-Protocol).
diff --git
a/site2/website-next/versioned_docs/version-2.8.0/sql-deployment-configurations.md
b/site2/website-next/versioned_docs/version-2.8.0/sql-deployment-configurations.md
new file mode 100644
index 0000000..fbf8615
--- /dev/null
+++
b/site2/website-next/versioned_docs/version-2.8.0/sql-deployment-configurations.md
@@ -0,0 +1,171 @@
+---
+id: sql-deployment-configurations
+title: Pulsar SQL configuration and deployment
+sidebar_label: Configuration and deployment
+original_id: sql-deployment-configurations
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+You can configure the Presto Pulsar connector and deploy a cluster with the following instructions.
+
+## Configure Presto Pulsar Connector
+You can configure the Presto Pulsar connector in the `${project.root}/conf/presto/catalog/pulsar.properties` properties file. The configuration options for the connector and their default values are as follows.
+
+```properties
+# name of the connector to be displayed in the catalog
+connector.name=pulsar
+
+# the url of Pulsar broker service
+pulsar.web-service-url=http://localhost:8080
+
+# URI of Zookeeper cluster
+pulsar.zookeeper-uri=localhost:2181
+
+# minimum number of entries to read at a single time
+pulsar.entry-read-batch-size=100
+
+# default number of splits to use per query
+pulsar.target-num-splits=4
+```
+
+You can connect Presto to a Pulsar cluster with multiple hosts. To configure
multiple hosts for brokers, add multiple URLs to `pulsar.web-service-url`. To
configure multiple hosts for ZooKeeper, add multiple URIs to
`pulsar.zookeeper-uri`. The following is an example.
+
+```properties
+pulsar.web-service-url=http://localhost:8080,localhost:8081,localhost:8082
+pulsar.zookeeper-uri=localhost1,localhost2:2181
+```
+
+## Query data from existing Presto clusters
+
+If you already have a Presto cluster, you can copy the Presto Pulsar connector
plugin to your existing cluster. Download the archived plugin package with the
following command.
+
+```bash
+$ wget pulsar:binary_release_url
+```
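+
+The exact steps depend on your Presto installation layout, but a minimal sketch of installing the extracted connector into an existing cluster looks roughly like the following; the archive name and the `/usr/lib/presto` path are assumptions, so adjust them to your release and installation.
+
+```bash
+# Assumption: the downloaded archive is named pulsar-presto-connector.tar.gz and
+# the existing Presto installation lives under /usr/lib/presto.
+$ tar -xzf pulsar-presto-connector.tar.gz
+$ mkdir -p /usr/lib/presto/plugin/pulsar
+$ cp -r pulsar-presto-connector/* /usr/lib/presto/plugin/pulsar/
+# Restart the Presto workers so that the new plugin is loaded.
+```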
+
+## Deploy a new cluster
+
+Since Pulsar SQL is powered by [Trino (formerly Presto SQL)](https://trino.io), the deployment configuration for the Pulsar SQL worker is the same as for a Presto deployment.
+
+:::note
+
+For how to set up a standalone single node environment, refer to [Query
data](sql-getting-started.md).
+
+:::
+
+
+You can use the same CLI args as the Presto launcher.
+
+```bash
+$ ./bin/pulsar sql-worker --help
+Usage: launcher [options] command
+
+Commands: run, start, stop, restart, kill, status
+
+Options:
+ -h, --help show this help message and exit
+ -v, --verbose Run verbosely
+ --etc-dir=DIR Defaults to INSTALL_PATH/etc
+ --launcher-config=FILE
+ Defaults to INSTALL_PATH/bin/launcher.properties
+ --node-config=FILE Defaults to ETC_DIR/node.properties
+ --jvm-config=FILE Defaults to ETC_DIR/jvm.config
+ --config=FILE Defaults to ETC_DIR/config.properties
+ --log-levels-file=FILE
+ Defaults to ETC_DIR/log.properties
+ --data-dir=DIR Defaults to INSTALL_PATH
+ --pid-file=FILE Defaults to DATA_DIR/var/run/launcher.pid
+ --launcher-log-file=FILE
+ Defaults to DATA_DIR/var/log/launcher.log (only in
+ daemon mode)
+ --server-log-file=FILE
+ Defaults to DATA_DIR/var/log/server.log (only in
+ daemon mode)
+ -D NAME=VALUE Set a Java system property
+
+```
+
+The default configuration for the cluster is located in
`${project.root}/conf/presto`. You can customize your deployment by modifying
the default configuration.
+
+You can set the worker to read from a different configuration directory, or to write data to a different directory.
+
+```bash
+$ ./bin/pulsar sql-worker run --etc-dir /tmp/incubator-pulsar/conf/presto --data-dir /tmp/presto-1
+```
+
+You can start the worker as a daemon process.
+
+```bash
+$ ./bin/pulsar sql-worker start
+```
+
+### Deploy a cluster on multiple nodes
+
+You can deploy a Pulsar SQL cluster or Presto cluster on multiple nodes. The following example shows how to deploy a cluster on three nodes.
+
+1. Copy the Pulsar binary distribution to three nodes.
+
+The first node runs as the Presto coordinator. The minimal configuration required in the `${project.root}/conf/presto/config.properties` file is as follows.
+
+```properties
+coordinator=true
+node-scheduler.include-coordinator=true
+http-server.http.port=8080
+query.max-memory=50GB
+query.max-memory-per-node=1GB
+discovery-server.enabled=true
+discovery.uri=<coordinator-url>
+```
+
+The other two nodes serve as worker nodes. You can use the following configuration for them.
+
+```properties
+coordinator=false
+http-server.http.port=8080
+query.max-memory=50GB
+query.max-memory-per-node=1GB
+discovery.uri=<coordinator-url>
+```
+
+2. Modify the `pulsar.web-service-url` and `pulsar.zookeeper-uri` configuration in the `${project.root}/conf/presto/catalog/pulsar.properties` file accordingly on each of the three nodes.
+
+3. Start the coordinator node.
+
+```bash
+$ ./bin/pulsar sql-worker run
+```
+
+4. Start worker nodes.
+
+```bash
+$ ./bin/pulsar sql-worker run
+```
+
+5. Start the SQL CLI and check the status of your cluster.
+
+```bash
+$ ./bin/pulsar sql --server <coordinate_url>
+```
+
+6. Check the status of your nodes.
+
+```bash
+presto> SELECT * FROM system.runtime.nodes;
+ node_id | http_uri | node_version | coordinator | state
+---------+-------------------------+--------------+-------------+--------
+ 1 | http://192.168.2.1:8081 | testversion | true | active
+ 3 | http://192.168.2.2:8081 | testversion | false | active
+ 2 | http://192.168.2.3:8081 | testversion | false | active
+```
+
+For more information about deployment in Presto, refer to [Presto
deployment](https://trino.io/docs/current/installation/deployment.html).
+
+:::note
+
+The broker does not advance the LAC (last add confirmed) by default, so when Pulsar SQL bypasses the broker to query data, it can only read entries up to the LAC that all of the bookies have learned. You can make the broker periodically write an explicit LAC by setting `bookkeeperExplicitLacIntervalInMills` to a value greater than `0` in `broker.conf`.
+
+:::
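+
+For example, the following `broker.conf` setting makes the broker write an explicit LAC roughly once per second; the interval value here is only an illustration.
+
+```properties
+# broker.conf
+# Interval (in milliseconds) at which the broker writes an explicit LAC to BookKeeper.
+# The default value 0 disables explicit LAC writes; 1000 below is just an example.
+bookkeeperExplicitLacIntervalInMills=1000
+```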
+
diff --git
a/site2/website-next/versioned_docs/version-2.8.0/sql-getting-started.md
b/site2/website-next/versioned_docs/version-2.8.0/sql-getting-started.md
new file mode 100644
index 0000000..e166435
--- /dev/null
+++ b/site2/website-next/versioned_docs/version-2.8.0/sql-getting-started.md
@@ -0,0 +1,148 @@
+---
+id: sql-getting-started
+title: Query data with Pulsar SQL
+sidebar_label: Query data
+original_id: sql-getting-started
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+Before querying data in Pulsar, you need to install Pulsar and the built-in connectors.
+
+## Requirements
+1. Install [Pulsar](getting-started-standalone.md#install-pulsar-standalone).
+2. Install Pulsar [built-in
connectors](getting-started-standalone.md#install-builtin-connectors-optional).
+
+## Query data in Pulsar
+To query data in Pulsar with Pulsar SQL, complete the following steps.
+
+1. Start a Pulsar standalone cluster.
+
+```bash
+./bin/pulsar standalone
+```
+
+2. Start a Pulsar SQL worker.
+
+```bash
+./bin/pulsar sql-worker run
+```
+
+3. After the Pulsar standalone cluster and the SQL worker are initialized, run the SQL CLI.
+
+```bash
+./bin/pulsar sql
+```
+
+4. Test with SQL commands.
+
+```bash
+presto> show catalogs;
+ Catalog
+---------
+ pulsar
+ system
+(2 rows)
+
+Query 20180829_211752_00004_7qpwh, FINISHED, 1 node
+Splits: 19 total, 19 done (100.00%)
+0:00 [0 rows, 0B] [0 rows/s, 0B/s]
+
+
+presto> show schemas in pulsar;
+ Schema
+-----------------------
+ information_schema
+ public/default
+ public/functions
+ sample/standalone/ns1
+(4 rows)
+
+Query 20180829_211818_00005_7qpwh, FINISHED, 1 node
+Splits: 19 total, 19 done (100.00%)
+0:00 [4 rows, 89B] [21 rows/s, 471B/s]
+
+
+presto> show tables in pulsar."public/default";
+ Table
+-------
+(0 rows)
+
+Query 20180829_211839_00006_7qpwh, FINISHED, 1 node
+Splits: 19 total, 19 done (100.00%)
+0:00 [0 rows, 0B] [0 rows/s, 0B/s]
+
+```
+
+Since there is no data in Pulsar, no records are returned.
+
+5. Start the built-in connector _DataGeneratorSource_ and ingest some mock
data.
+
+```bash
+./bin/pulsar-admin sources create --name generator --destinationTopicName generator_test --source-type data-generator
+```
+
+Then you can query the topic in the `public/default` namespace.
+
+```bash
+presto> show tables in pulsar."public/default";
+ Table
+----------------
+ generator_test
+(1 row)
+
+Query 20180829_213202_00000_csyeu, FINISHED, 1 node
+Splits: 19 total, 19 done (100.00%)
+0:02 [1 rows, 38B] [0 rows/s, 17B/s]
+```
+
+You can now query the data within the topic "generator_test".
+
+```bash
+presto> select * from pulsar."public/default".generator_test;
+
+ firstname | middlename | lastname | email |
username | password | telephonenumber | age | companyemail
| nationalidentitycardnumber |
+-------------+-------------+-------------+----------------------------------+--------------+----------+-----------------+-----+-----------------------------------------------+----------------------------+
+ Genesis | Katherine | Wiley | [email protected] |
genesisw | y9D2dtU3 | 959-197-1860 | 71 |
[email protected] | 880-58-9247 |
+ Brayden | | Stanton | [email protected] |
braydens | ZnjmhXik | 220-027-867 | 81 | [email protected]
| 604-60-7069 |
+ Benjamin | Julian | Velasquez | [email protected] |
benjaminv | 8Bc7m3eb | 298-377-0062 | 21 |
[email protected] | 213-32-5882 |
+ Michael | Thomas | Donovan | [email protected] |
michaeld | OqBm9MLs | 078-134-4685 | 55 | [email protected]
| 443-30-3442 |
+ Brooklyn | Avery | Roach | [email protected] |
broach | IxtBLafO | 387-786-2998 | 68 | [email protected]
| 085-88-3973 |
+ Skylar | | Bradshaw | [email protected] |
skylarb | p6eC6cKy | 210-872-608 | 96 | [email protected]
| 453-46-0334 |
+.
+.
+.
+```
+
+You can query the mock data.
+
+## Query your own data
+If you want to query your own data, you need to ingest it into Pulsar first. You can write a simple producer that publishes custom-defined data to Pulsar, as in the following example.
+
+```java
+import org.apache.pulsar.client.api.Producer;
+import org.apache.pulsar.client.api.PulsarClient;
+import org.apache.pulsar.client.impl.schema.AvroSchema;
+
+public class Test {
+
+    public static class Foo {
+        private int field1 = 1;
+        private String field2;
+        private long field3;
+
+        // Setters are required so that the loop below can populate the fields.
+        public void setField1(int field1) { this.field1 = field1; }
+        public void setField2(String field2) { this.field2 = field2; }
+        public void setField3(long field3) { this.field3 = field3; }
+    }
+
+    public static void main(String[] args) throws Exception {
+        PulsarClient pulsarClient = PulsarClient.builder()
+                .serviceUrl("pulsar://localhost:6650")
+                .build();
+        Producer<Foo> producer = pulsarClient
+                .newProducer(AvroSchema.of(Foo.class))
+                .topic("test_topic")
+                .create();
+
+        // Publish 1000 records; Pulsar SQL can then query them from the test_topic topic.
+        for (int i = 0; i < 1000; i++) {
+            Foo foo = new Foo();
+            foo.setField1(i);
+            foo.setField2("foo" + i);
+            foo.setField3(System.currentTimeMillis());
+            producer.newMessage().value(foo).send();
+        }
+        producer.close();
+        pulsarClient.close();
+    }
+}
+```
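+
+After the producer completes, you should be able to query the ingested records from the SQL CLI in the same way as the mock data above, for example:
+
+```bash
+presto> select * from pulsar."public/default".test_topic;
+```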
diff --git a/site2/website-next/versioned_docs/version-2.8.0/sql-overview.md
b/site2/website-next/versioned_docs/version-2.8.0/sql-overview.md
new file mode 100644
index 0000000..c13dfc1
--- /dev/null
+++ b/site2/website-next/versioned_docs/version-2.8.0/sql-overview.md
@@ -0,0 +1,22 @@
+---
+id: sql-overview
+title: Pulsar SQL Overview
+sidebar_label: Overview
+original_id: sql-overview
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+Apache Pulsar is used to store streams of event data, and the event data is
structured with predefined fields. With the implementation of the [Schema
Registry](schema-get-started.md), you can store structured data in Pulsar and
query the data by using [Trino (formerly Presto SQL)](https://trino.io/).
+
+As the core of Pulsar SQL, the Presto Pulsar connector enables Presto workers within a Presto cluster to query data from Pulsar.
+
+
+
+Query performance is efficient and highly scalable, because Pulsar adopts a [two-level, segment-based architecture](concepts-architecture-overview.md#apache-bookkeeper).
+
+Topics in Pulsar are stored as segments in [Apache BookKeeper](https://bookkeeper.apache.org/). Each topic segment is replicated to a configurable number of BookKeeper nodes (the default is `3`), which enables concurrent reads and high read throughput. The Presto Pulsar connector reads data directly from BookKeeper, so Presto workers can read concurrently from a horizontally scalable number of BookKeeper nodes.
+
+
diff --git a/site2/website-next/versioned_docs/version-2.8.0/sql-rest-api.md
b/site2/website-next/versioned_docs/version-2.8.0/sql-rest-api.md
new file mode 100644
index 0000000..61b76d6
--- /dev/null
+++ b/site2/website-next/versioned_docs/version-2.8.0/sql-rest-api.md
@@ -0,0 +1,193 @@
+---
+id: sql-rest-api
+title: Pulsar SQL REST APIs
+sidebar_label: REST APIs
+original_id: sql-rest-api
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+This section lists resources that make up the Presto REST API v1.
+
+## Request for Presto services
+
+All requests for Presto services should use version 1 of the Presto REST API.
+
+To request services, use the explicit URL `http://presto.service:8081/v1`. Replace `presto.service:8081` with your actual Presto address before sending requests.
+
+`POST` requests require the `X-Presto-User` header. If you use authentication,
you must use the same `username` that is specified in the authentication
configuration. If you do not use authentication, you can specify anything for
`username`.
+
+```properties
+X-Presto-User: username
+```
+
+For more information about headers, refer to
[PrestoHeaders](https://github.com/trinodb/trino).
+
+## Schema
+
+You can pass the SQL statement in the HTTP body. All data is received as a JSON document that might contain a `nextUri` link. If the received JSON document contains a `nextUri` link, keep requesting that link until the received data no longer contains one. If no error is returned, the query completes successfully. If an `error` field appears in `stats`, the query fails.
+
+The following is an example of `show catalogs`. The query continues until the received JSON document no longer contains a `nextUri` link. Since no `error` appears in `stats`, the query completes successfully.
+
+```bash
+➜ ~ curl --header "X-Presto-User: test-user" --request POST --data 'show catalogs' http://localhost:8081/v1/statement
+{
+ "infoUri" :
"http://localhost:8081/ui/query.html?20191113_033653_00006_dg6hb",
+ "stats" : {
+ "queued" : true,
+ "nodes" : 0,
+ "userTimeMillis" : 0,
+ "cpuTimeMillis" : 0,
+ "wallTimeMillis" : 0,
+ "processedBytes" : 0,
+ "processedRows" : 0,
+ "runningSplits" : 0,
+ "queuedTimeMillis" : 0,
+ "queuedSplits" : 0,
+ "completedSplits" : 0,
+ "totalSplits" : 0,
+ "scheduled" : false,
+ "peakMemoryBytes" : 0,
+ "state" : "QUEUED",
+ "elapsedTimeMillis" : 0
+ },
+ "id" : "20191113_033653_00006_dg6hb",
+ "nextUri" :
"http://localhost:8081/v1/statement/20191113_033653_00006_dg6hb/1"
+}
+
+➜ ~ curl http://localhost:8081/v1/statement/20191113_033653_00006_dg6hb/1
+{
+ "infoUri" :
"http://localhost:8081/ui/query.html?20191113_033653_00006_dg6hb",
+ "nextUri" :
"http://localhost:8081/v1/statement/20191113_033653_00006_dg6hb/2",
+ "id" : "20191113_033653_00006_dg6hb",
+ "stats" : {
+ "state" : "PLANNING",
+ "totalSplits" : 0,
+ "queued" : false,
+ "userTimeMillis" : 0,
+ "completedSplits" : 0,
+ "scheduled" : false,
+ "wallTimeMillis" : 0,
+ "runningSplits" : 0,
+ "queuedSplits" : 0,
+ "cpuTimeMillis" : 0,
+ "processedRows" : 0,
+ "processedBytes" : 0,
+ "nodes" : 0,
+ "queuedTimeMillis" : 1,
+ "elapsedTimeMillis" : 2,
+ "peakMemoryBytes" : 0
+ }
+}
+
+➜ ~ curl http://localhost:8081/v1/statement/20191113_033653_00006_dg6hb/2
+{
+ "id" : "20191113_033653_00006_dg6hb",
+ "data" : [
+ [
+ "pulsar"
+ ],
+ [
+ "system"
+ ]
+ ],
+ "infoUri" :
"http://localhost:8081/ui/query.html?20191113_033653_00006_dg6hb",
+ "columns" : [
+ {
+ "typeSignature" : {
+ "rawType" : "varchar",
+ "arguments" : [
+ {
+ "kind" : "LONG_LITERAL",
+ "value" : 6
+ }
+ ],
+ "literalArguments" : [],
+ "typeArguments" : []
+ },
+ "name" : "Catalog",
+ "type" : "varchar(6)"
+ }
+ ],
+ "stats" : {
+ "wallTimeMillis" : 104,
+ "scheduled" : true,
+ "userTimeMillis" : 14,
+ "progressPercentage" : 100,
+ "totalSplits" : 19,
+ "nodes" : 1,
+ "cpuTimeMillis" : 16,
+ "queued" : false,
+ "queuedTimeMillis" : 1,
+ "state" : "FINISHED",
+ "peakMemoryBytes" : 0,
+ "elapsedTimeMillis" : 111,
+ "processedBytes" : 0,
+ "processedRows" : 0,
+ "queuedSplits" : 0,
+ "rootStage" : {
+ "cpuTimeMillis" : 1,
+ "runningSplits" : 0,
+ "state" : "FINISHED",
+ "completedSplits" : 1,
+ "subStages" : [
+ {
+ "cpuTimeMillis" : 14,
+ "runningSplits" : 0,
+ "state" : "FINISHED",
+ "completedSplits" : 17,
+ "subStages" : [
+ {
+ "wallTimeMillis" : 7,
+ "subStages" : [],
+ "stageId" : "2",
+ "done" : true,
+ "nodes" : 1,
+ "totalSplits" : 1,
+ "processedBytes" : 22,
+ "processedRows" : 2,
+ "queuedSplits" : 0,
+ "userTimeMillis" : 1,
+ "cpuTimeMillis" : 1,
+ "runningSplits" : 0,
+ "state" : "FINISHED",
+ "completedSplits" : 1
+ }
+ ],
+ "wallTimeMillis" : 92,
+ "nodes" : 1,
+ "done" : true,
+ "stageId" : "1",
+ "userTimeMillis" : 12,
+ "processedRows" : 2,
+ "processedBytes" : 51,
+ "queuedSplits" : 0,
+ "totalSplits" : 17
+ }
+ ],
+ "wallTimeMillis" : 5,
+ "done" : true,
+ "nodes" : 1,
+ "stageId" : "0",
+ "userTimeMillis" : 1,
+ "processedRows" : 2,
+ "processedBytes" : 22,
+ "totalSplits" : 1,
+ "queuedSplits" : 0
+ },
+ "runningSplits" : 0,
+ "completedSplits" : 19
+ }
+}
+```
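+
+The loop of following `nextUri` links can also be scripted. The following is a minimal sketch; it assumes `jq` is installed and that Presto listens on `localhost:8081`.
+
+```bash
+#!/usr/bin/env bash
+# Submit the statement, then keep following nextUri until it disappears.
+NEXT=$(curl -s --header "X-Presto-User: test-user" --request POST \
+  --data 'show catalogs' http://localhost:8081/v1/statement | jq -r '.nextUri')
+while [ "$NEXT" != "null" ]; do
+  RESPONSE=$(curl -s "$NEXT")
+  echo "$RESPONSE" | jq -c '.data // empty'    # print any rows returned in this chunk
+  NEXT=$(echo "$RESPONSE" | jq -r '.nextUri')  # becomes "null" when nextUri is absent
+done
+```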
+
+:::note
+
+Since the response data is not synchronized with the query state from the client's perspective, you cannot rely on the response data alone to determine whether the query has completed.
+
+:::
+
+
+For more information about Presto REST API, refer to [Presto HTTP
Protocol](https://github.com/prestosql/presto/wiki/HTTP-Protocol).
diff --git a/site2/website-next/versioned_sidebars/version-2.7.3-sidebars.json
b/site2/website-next/versioned_sidebars/version-2.7.3-sidebars.json
index 3812e2b..dd1f5d0 100644
--- a/site2/website-next/versioned_sidebars/version-2.7.3-sidebars.json
+++ b/site2/website-next/versioned_sidebars/version-2.7.3-sidebars.json
@@ -175,6 +175,30 @@
],
"collapsible": true,
"collapsed": true
+ },
+ {
+ "type": "category",
+ "label": "Pulsar SQL",
+ "items": [
+ {
+ "type": "doc",
+ "id": "version-2.7.3/sql-overview"
+ },
+ {
+ "type": "doc",
+ "id": "version-2.7.3/sql-getting-started"
+ },
+ {
+ "type": "doc",
+ "id": "version-2.7.3/sql-deployment-configurations"
+ },
+ {
+ "type": "doc",
+ "id": "version-2.7.3/sql-rest-api"
+ }
+ ],
+ "collapsible": true,
+ "collapsed": true
}
]
}
\ No newline at end of file
diff --git a/site2/website-next/versioned_sidebars/version-2.8.0-sidebars.json
b/site2/website-next/versioned_sidebars/version-2.8.0-sidebars.json
index edbc08e..9cfcd3e 100644
--- a/site2/website-next/versioned_sidebars/version-2.8.0-sidebars.json
+++ b/site2/website-next/versioned_sidebars/version-2.8.0-sidebars.json
@@ -175,6 +175,30 @@
],
"collapsible": true,
"collapsed": true
+ },
+ {
+ "type": "category",
+ "label": "Pulsar SQL",
+ "items": [
+ {
+ "type": "doc",
+ "id": "version-2.8.0/sql-overview"
+ },
+ {
+ "type": "doc",
+ "id": "version-2.8.0/sql-getting-started"
+ },
+ {
+ "type": "doc",
+ "id": "version-2.8.0/sql-deployment-configurations"
+ },
+ {
+ "type": "doc",
+ "id": "version-2.8.0/sql-rest-api"
+ }
+ ],
+ "collapsible": true,
+ "collapsed": true
}
]
}
\ No newline at end of file