This is an automated email from the ASF dual-hosted git repository.
lhotari pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/pulsar-site.git
The following commit(s) were added to refs/heads/main by this push:
new b6e99158cbf3 Remove Pulsar SQL from latest and 3.2.x docs (#899)
b6e99158cbf3 is described below
commit b6e99158cbf3b2dc8aa952724050c60d7eaea7ee
Author: Lari Hotari <[email protected]>
AuthorDate: Wed May 15 19:47:11 2024 +0300
Remove Pulsar SQL from latest and 3.2.x docs (#899)
* Remove Pulsar SQL from latest and 3.2.x docs
* Remove Pulsar SQL from sidebars
---
docs/concepts-multi-tenancy.md | 1 -
docs/cookbooks-tiered-storage.md | 2 -
docs/reference-metrics.md | 28 ---
docs/sql-deployment-configurations.md | 256 ---------------------
docs/sql-getting-started.md | 119 ----------
docs/sql-overview.md | 22 --
docs/sql-rest-api.md | 207 -----------------
sidebars.json | 10 -
.../version-3.2.x/concepts-multi-tenancy.md | 1 -
.../version-3.2.x/cookbooks-tiered-storage.md | 2 -
versioned_docs/version-3.2.x/reference-metrics.md | 28 ---
.../version-3.2.x/sql-deployment-configurations.md | 256 ---------------------
.../version-3.2.x/sql-getting-started.md | 119 ----------
versioned_docs/version-3.2.x/sql-overview.md | 22 --
versioned_docs/version-3.2.x/sql-rest-api.md | 207 -----------------
versioned_sidebars/version-3.2.x-sidebars.json | 10 -
16 files changed, 1290 deletions(-)
diff --git a/docs/concepts-multi-tenancy.md b/docs/concepts-multi-tenancy.md
index adda813bf8b0..7be84e311bb6 100644
--- a/docs/concepts-multi-tenancy.md
+++ b/docs/concepts-multi-tenancy.md
@@ -46,7 +46,6 @@ persistent://tenant/app1/topic-3
Pulsar is a multi-tenant event streaming system. Administrators can manage the tenants and namespaces by setting policies at different levels. However, the policies, such as retention policy and storage quota policy, are only available at a namespace level. In many use cases, users need to set a policy at the topic level. The namespace change events approach is proposed for supporting topic-level policies in an efficient way. In this approach, Pulsar is used as an event log to store name [...]
- Avoid using ZooKeeper and introducing more load on ZooKeeper.
- Use Pulsar as an event log for propagating the policy cache. It can scale efficiently.
-- Use Pulsar SQL to query the namespace changes and audit the system.
Each namespace has a [system topic](concepts-messaging.md#system-topic) named `__change_events`. This system topic stores change events for a given namespace. The following figure illustrates how to leverage the system topic to update topic-level policies.
diff --git a/docs/cookbooks-tiered-storage.md b/docs/cookbooks-tiered-storage.md
index 7a9d99790ba0..9c569dca652a 100644
--- a/docs/cookbooks-tiered-storage.md
+++ b/docs/cookbooks-tiered-storage.md
@@ -31,8 +31,6 @@ Pulsar uses multi-part objects to upload the segment data. It is possible that a
We recommend you add a life cycle rule to your bucket to expire incomplete multi-part uploads after a day or two to avoid getting charged for incomplete uploads.
-When ledgers are offloaded to long term storage, you can still query data in the offloaded ledgers with Pulsar SQL.
-
## Configuring the offload driver
Offloading is configured in `broker.conf`.
diff --git a/docs/reference-metrics.md b/docs/reference-metrics.md
index d30a60f4f428..736dae5c8144 100644
--- a/docs/reference-metrics.md
+++ b/docs/reference-metrics.md
@@ -12,7 +12,6 @@ Pulsar exposes the following metrics in Prometheus format. You can monitor your
- [Pulsar Functions](#pulsar-functions)
- [Connectors](#connectors)
- [Proxy](#proxy)
- - [Pulsar SQL Worker](#pulsar-sql-worker)
- [Pulsar transaction](#pulsar-transaction)
The following types of metrics are available:
@@ -834,33 +833,6 @@ All the proxy metrics are labeled with the following labels:
| pulsar_proxy_binary_ops | Counter | Counter of proxy operations. |
| pulsar_proxy_binary_bytes | Counter | Counter of proxy bytes. |
-## Pulsar SQL Worker
-
-| Name | Type | Description |
-|---|---|---|
-| split_bytes_read | Counter | Number of bytes read from BookKeeper. |
-| split_num_messages_deserialized | Counter | Number of messages deserialized. |
-| split_num_record_deserialized | Counter | Number of records deserialized. |
-| split_bytes_read_per_query | Summary | Total number of bytes read per query. |
-| split_entry_deserialize_time | Summary | Time spent on deserializing entries. |
-| split_entry_deserialize_time_per_query | Summary | Time spent on deserializing entries per query. |
-| split_entry_queue_dequeue_wait_time | Summary | Time spent waiting to get an entry from the entry queue because it is empty. |
-| split_entry_queue_dequeue_wait_time_per_query | Summary | Total time spent waiting to get entries from the entry queue per query. |
-| split_message_queue_dequeue_wait_time_per_query | Summary | Time spent waiting to dequeue from the message queue because it is empty, per query. |
-| split_message_queue_enqueue_wait_time | Summary | Time spent waiting for message queue enqueue because the message queue is full. |
-| split_message_queue_enqueue_wait_time_per_query | Summary | Time spent waiting for message queue enqueue because the message queue is full, per query. |
-| split_num_entries_per_batch | Summary | Number of entries per batch. |
-| split_num_entries_per_query | Summary | Number of entries per query. |
-| split_num_messages_deserialized_per_entry | Summary | Number of messages deserialized per entry. |
-| split_num_messages_deserialized_per_query | Summary | Number of messages deserialized per query. |
-| split_read_attempts | Summary | Number of read attempts (failing if the queues are full). |
-| split_read_attempts_per_query | Summary | Number of read attempts per query. |
-| split_read_latency_per_batch | Summary | Latency of reads per batch. |
-| split_read_latency_per_query | Summary | Total read latency per query. |
-| split_record_deserialize_time | Summary | Time spent deserializing messages to records, for example, Avro, JSON, and so on. |
-| split_record_deserialize_time_per_query | Summary | Time spent deserializing messages to records per query. |
-| split_total_execution_time | Summary | The total execution time. |
-
## Pulsar transaction
All the transaction metrics are labeled with the following labels:
diff --git a/docs/sql-deployment-configurations.md b/docs/sql-deployment-configurations.md
deleted file mode 100644
index 1c1351f58240..000000000000
--- a/docs/sql-deployment-configurations.md
+++ /dev/null
@@ -1,256 +0,0 @@
----
-id: sql-deployment-configurations
-title: Pulsar SQL configuration and deployment
-sidebar_label: "Configuration and deployment"
-description: Configure the Pulsar Trino plugin and deploy a Pulsar SQL cluster.
----
-
-You can configure the Pulsar Trino plugin and deploy a cluster with the following instructions.
-
-## Configure Pulsar Trino plugin
-
-To configure the Pulsar Trino plugin, you can modify the `${project.root}/trino/conf/catalog/pulsar.properties` properties file. The configuration for the connector and the default values are as follows.
-
-```properties
-# name of the connector to be displayed in the catalog
-connector.name=pulsar
-
-# the URL of Pulsar broker service
-pulsar.web-service-url=http://localhost:8080
-
-# the URL of Pulsar broker binary service
-pulsar.broker-binary-service-url=pulsar://localhost:6650
-
-# the URL of Zookeeper cluster
-pulsar.zookeeper-uri=localhost:2181
-
-# minimum number of entries to read at a single time
-pulsar.entry-read-batch-size=100
-
-# default number of splits to use per query
-pulsar.target-num-splits=4
-
-# max size of one batch message (default value is 5MB)
-pulsar.max-message-size=5242880
-
-# number of splits used when querying data from Pulsar
-pulsar.target-num-splits=2
-
-# size of the queue that buffers entries read from Pulsar
-pulsar.max-split-entry-queue-size=1000
-
-# size of the queue that buffers messages extracted from entries
-pulsar.max-split-message-queue-size=10000
-
-# status provider to record connector metrics
-pulsar.stats-provider=org.apache.bookkeeper.stats.NullStatsProvider
-
-# config in map format for stats provider e.g. {"key1":"val1","key2":"val2"}
-pulsar.stats-provider-configs={}
-
-# whether to rewrite Pulsar's default topic delimiter '/'
-pulsar.namespace-delimiter-rewrite-enable=false
-
-# delimiter used to rewrite Pulsar's default delimiter '/', use if the default is causing incompatibility with other systems like Superset
-pulsar.rewrite-namespace-delimiter="/"
-
-# maximum number of thread pool size for ledger offloader.
-pulsar.managed-ledger-offload-max-threads=2
-
-# driver used to offload or read cold data to or from long-term storage
-pulsar.managed-ledger-offload-driver=null
-
-# directory to load offloaders nar file.
-pulsar.offloaders-directory="./offloaders"
-
-# properties and configurations related to specific offloader implementation as map e.g. {"key1":"val1","key2":"val2"}
-pulsar.offloader-properties={}
-
-# authentication plugin used to authenticate to Pulsar cluster
-pulsar.auth-plugin=null
-
-# authentication parameter used to authenticate to the Pulsar cluster as a string e.g. "key1:val1,key2:val2".
-pulsar.auth-params=null
-
-# whether the Pulsar client accepts an untrusted TLS certificate from the broker
-pulsar.tls-allow-insecure-connection=null
-
-# whether to allow hostname verification when a client connects to broker over TLS.
-pulsar.tls-hostname-verification-enable=null
-
-# path for the trusted TLS certificate file of Pulsar broker
-pulsar.tls-trust-cert-file-path=null
-
-# whether to enable Pulsar authorization
-pulsar.authorization-enabled=false
-
-# set the threshold for BookKeeper request throttle, default is disabled
-pulsar.bookkeeper-throttle-value=0
-
-# set the number of IO thread
-pulsar.bookkeeper-num-io-threads=2 * Runtime.getRuntime().availableProcessors()
-
-# set the number of worker thread
-pulsar.bookkeeper-num-worker-threads=Runtime.getRuntime().availableProcessors()
-
-# whether to use BookKeeper V2 wire protocol
-pulsar.bookkeeper-use-v2-protocol=true
-
-# interval to check the need for sending an explicit LAC, default is disabled
-pulsar.bookkeeper-explicit-interval=0
-
-# size for managed ledger entry cache (in MB).
-pulsar.managed-ledger-cache-size-MB=0
-
-# number of threads to be used for managed ledger tasks dispatching
-pulsar.managed-ledger-num-worker-threads=Runtime.getRuntime().availableProcessors()
-
-# number of threads to be used for managed ledger scheduled tasks
-pulsar.managed-ledger-num-scheduler-threads=Runtime.getRuntime().availableProcessors()
-
-# directory used to store extraction NAR file
-pulsar.nar-extraction-directory=System.getProperty("java.io.tmpdir")
-```
-
-### Enable authentication and authorization between Pulsar and Pulsar SQL
-
-To enable authentication and authorization between Pulsar and Pulsar SQL, you need to set the following configurations in the `${project.root}/trino/conf/catalog/pulsar.properties` properties file:
-
-```properties
-pulsar.authorization-enabled=true
-pulsar.broker-binary-service-url=pulsar://localhost:6650
-```
-
-:::note
-By default, the authentication and authorization between Pulsar and Pulsar SQL are **disabled**.
-:::
-
-### Connect Trino to Pulsar with multiple hosts
-
-To connect Trino with multiple hosts for brokers, add multiple URLs to `pulsar.web-service-url`.
-To connect Trino with multiple hosts for ZooKeeper, add multiple URLs to `pulsar.zookeeper-uri`.
-
-The following is an example.
-
-```properties
-pulsar.web-service-url=http://localhost:8080,localhost:8081,localhost:8082
-pulsar.zookeeper-uri=localhost1,localhost2:2181
-```
-
-### Get the last message in a topic
-
-:::note
-
-By default, Pulsar SQL **does not get the last message in a topic**. It is by design and controlled by settings. By default, the BookKeeper LAC only advances when subsequent entries are added. If there is no subsequent entry added, the last written entry is not visible to readers until the ledger is closed. This is not a problem for Pulsar, which uses managed ledgers, but Pulsar SQL directly reads from BookKeeper ledgers.
-
-:::
-
-To get the last message in a topic, you need to set the following configurations:
-
-1. For the broker configuration, set `bookkeeperExplicitLacIntervalInMills` > 0 in `broker.conf` or `standalone.conf`.
-
-2. For the Trino configuration, set `pulsar.bookkeeper-explicit-interval` > 0 and `pulsar.bookkeeper-use-v2-protocol=false`.
-
-However, using the BookKeeper V3 protocol introduces additional GC overhead to BookKeeper as it uses Protobuf.
-
-## Query data from existing Trino clusters
-
-If you already have a Trino cluster compatible with version 363, you can copy the Pulsar Trino plugin to your existing cluster. Download the archived plugin package with the following command.
-
-```bash
-wget pulsar:binary_release_url
-```
-
-## Deploy a new cluster
-
-Since Pulsar SQL is powered by Trino, the configuration for deployment is the same for the Pulsar SQL worker.
-
-:::note
-
-For how to set up a standalone single-node environment, refer to [Query data](sql-getting-started.md).
-
-:::
-
-You can use the same CLI args as the Trino launcher.
-
-The default configuration for the cluster is located in `${project.root}/trino/conf`. You can customize your deployment by modifying the default configuration.
-
-You can set the worker to read from a different configuration directory, or set a different directory to write data.
-
-```bash
-./bin/pulsar sql-worker run --etc-dir /tmp/pulsar/trino/conf --data-dir /tmp/trino-1
-```
-
-You can start the worker as a daemon process.
-
-```bash
-./bin/pulsar sql-worker start
-```
-
-### Deploy a cluster on multiple nodes
-
-You can deploy a Pulsar SQL cluster or Trino cluster on multiple nodes. The following steps show how to deploy a cluster on three nodes.
-
-Step 1: Copy the Pulsar binary distribution to three nodes.
-
-The first node runs as the Trino coordinator. The minimal configuration required in the `${project.root}/trino/conf/config.properties` file is as follows.
-
-```properties
-coordinator=true
-node-scheduler.include-coordinator=true
-http-server.http.port=8080
-query.max-memory=50GB
-query.max-memory-per-node=1GB
-discovery-server.enabled=true
-discovery.uri=<coordinator-url>
-```
-
-The other two nodes serve as worker nodes; you can use the following configuration for worker nodes.
-
-```properties
-coordinator=false
-http-server.http.port=8080
-query.max-memory=50GB
-query.max-memory-per-node=1GB
-discovery.uri=<coordinator-url>
-```
-
-Step 2: Modify the `pulsar.web-service-url` and `pulsar.zookeeper-uri` configuration in the `${project.root}/trino/conf/catalog/pulsar.properties` file accordingly for the three nodes.
-
-Step 3: Start the coordinator node.
-
-```bash
-./bin/pulsar sql-worker run
-```
-
-Step 4: Start worker nodes.
-
-```bash
-./bin/pulsar sql-worker run
-```
-
-Step 5: Start the SQL CLI and check the status of your cluster.
-
-```bash
-./bin/pulsar sql --server <coordinator-url>
-```
-
-Step 6: Check the status of your nodes.
-
-```bash
-trino> SELECT * FROM system.runtime.nodes;
- node_id | http_uri | node_version | coordinator | state
----------+-------------------------+--------------+-------------+--------
- 1 | http://192.168.2.1:8081 | testversion | true | active
- 3 | http://192.168.2.2:8081 | testversion | false | active
- 2 | http://192.168.2.3:8081 | testversion | false | active
-```
-
-For more information about the deployment in Trino, refer to [Trino deployment](https://trino.io/docs/363/installation/deployment.html).
-
-:::note
-
-The broker does not advance the LAC, so when Pulsar SQL bypasses brokers to query data, it can only read entries up to the LAC that all the bookies have learned. You can enable periodic LAC writes on the broker by setting `bookkeeperExplicitLacIntervalInMills` in the `broker.conf` file.
-
-:::
-
diff --git a/docs/sql-getting-started.md b/docs/sql-getting-started.md
deleted file mode 100644
index b7d6c18f7b9c..000000000000
--- a/docs/sql-getting-started.md
+++ /dev/null
@@ -1,119 +0,0 @@
----
-id: sql-getting-started
-title: Query data with Pulsar SQL
-sidebar_label: "Query data"
-description: Query data with Pulsar SQL.
----
-
-Before querying data in Pulsar, you need to install Pulsar and the built-in connectors.
-
-## Requirements
-
-1. Install [Pulsar](getting-started-standalone.md).
-2. Install Pulsar [built-in connectors](io-quickstart.md#install-pulsar-and-built-in-connector).
-
-## Query data in Pulsar
-
-To query data in Pulsar with Pulsar SQL, you need to complete the following steps:
-
-### Step 1: Start a Pulsar cluster
-
-```bash
-PULSAR_STANDALONE_USE_ZOOKEEPER=1 ./bin/pulsar standalone
-```
-
-:::note
-
-Starting the Pulsar standalone cluster from scratch doesn't enable ZooKeeper by default. However, Pulsar SQL depends on ZooKeeper. Therefore, you need to set `PULSAR_STANDALONE_USE_ZOOKEEPER=1` to enable ZooKeeper.
-
-:::
-
-### Step 2: Start a Pulsar SQL worker
-
-```bash
-./bin/pulsar sql-worker run
-```
-
-### Step 3: Run SQL CLI
-
-```bash
-./bin/pulsar sql
-```
-
-### Step 4: Test with SQL commands
-
-```bash
-trino> show catalogs;
- Catalog
----------
- pulsar
- system
-(2 rows)
-
-Query 20180829_211752_00004_7qpwh, FINISHED, 1 node
-Splits: 19 total, 19 done (100.00%)
-0:00 [0 rows, 0B] [0 rows/s, 0B/s]
-
-
-trino> show schemas in pulsar;
- Schema
------------------------
- information_schema
- public/default
- public/functions
-(3 rows)
-
-Query 20180829_211818_00005_7qpwh, FINISHED, 1 node
-Splits: 19 total, 19 done (100.00%)
-0:00 [4 rows, 89B] [21 rows/s, 471B/s]
-
-
-trino> show tables in pulsar."public/default";
- Table
--------
-(0 rows)
-
-Query 20180829_211839_00006_7qpwh, FINISHED, 1 node
-Splits: 19 total, 19 done (100.00%)
-0:00 [0 rows, 0B] [0 rows/s, 0B/s]
-```
-
-Since there is no data in Pulsar, no records are returned.
-
-### Step 5: Ingest some mock data
-
-```bash
-./bin/pulsar-admin sources create --name generator --destinationTopicName generator_test --source-type data-generator
-```
-
-And then you can query a topic in the namespace "public/default":
-
-```bash
-trino> show tables in pulsar."public/default";
- Table
-----------------
- generator_test
-(1 row)
-
-Query 20180829_213202_00000_csyeu, FINISHED, 1 node
-Splits: 19 total, 19 done (100.00%)
-0:02 [1 rows, 38B] [0 rows/s, 17B/s]
-```
-
-You can now query the data within the topic "generator_test":
-
-```bash
-trino> select * from pulsar."public/default".generator_test;
-
- firstname | middlename | lastname | email | username | password | telephonenumber | age | companyemail | nationalidentitycardnumber |
--------------+-------------+-------------+----------------------------------+--------------+----------+-----------------+-----+-----------------------------------------------+----------------------------+
- Genesis | Katherine | Wiley | [email protected] | genesisw | y9D2dtU3 | 959-197-1860 | 71 | [email protected] | 880-58-9247 |
- Brayden | | Stanton | [email protected] | braydens | ZnjmhXik | 220-027-867 | 81 | [email protected] | 604-60-7069 |
- Benjamin | Julian | Velasquez | [email protected] | benjaminv | 8Bc7m3eb | 298-377-0062 | 21 | [email protected] | 213-32-5882 |
- Michael | Thomas | Donovan | [email protected] | michaeld | OqBm9MLs | 078-134-4685 | 55 | [email protected] | 443-30-3442 |
- Brooklyn | Avery | Roach | [email protected] | broach | IxtBLafO | 387-786-2998 | 68 | [email protected] | 085-88-3973 |
- Skylar | | Bradshaw | [email protected] | skylarb | p6eC6cKy | 210-872-608 | 96 | [email protected] | 453-46-0334 |
-...
-```
-
-You can query the mock data.
diff --git a/docs/sql-overview.md b/docs/sql-overview.md
deleted file mode 100644
index 7032df6f9cb6..000000000000
--- a/docs/sql-overview.md
+++ /dev/null
@@ -1,22 +0,0 @@
----
-id: sql-overview
-title: Pulsar SQL Overview
-sidebar_label: "Overview"
-description: Get a comprehensive understanding of Pulsar SQL.
----
-
-Apache Pulsar is used to store streams of event data, and the event data is structured with predefined fields. With the implementation of the [Schema Registry](schema-get-started.md), you can store structured data in Pulsar and query the data by using [Trino (formerly Presto SQL)](https://trino.io/).
-
-As the core of Pulsar SQL, the Pulsar Trino plugin enables Trino workers within a Trino cluster to query data from Pulsar.
-
-
-
-The query performance is efficient and highly scalable, because Pulsar adopts a [two-level-segment-based architecture](concepts-architecture-overview.md#apache-bookkeeper).
-
-Topics in Pulsar are stored as segments in [Apache BookKeeper](https://bookkeeper.apache.org/). Each topic segment is replicated to some BookKeeper nodes, which enables concurrent reads and high read throughput. In the Pulsar Trino connector, data is read directly from BookKeeper, so Trino workers can read concurrently from a horizontally scalable number of BookKeeper nodes.
-
-
-
-## Caveat
-
-If you're upgrading Pulsar SQL from 2.11 or earlier, you should copy the config files from `conf/presto` to `trino/conf`. If you're downgrading Pulsar SQL from a newer version to 2.11 or earlier, do the reverse.
diff --git a/docs/sql-rest-api.md b/docs/sql-rest-api.md
deleted file mode 100644
index d23f140e6f14..000000000000
--- a/docs/sql-rest-api.md
+++ /dev/null
@@ -1,207 +0,0 @@
----
-id: sql-rest-api
-title: Pulsar SQL REST APIs
-sidebar_label: "REST APIs"
-description: Get a comprehensive understanding of Trino REST API.
----
-
-This section lists resources that make up the Trino REST API v1.
-
-## Request for Trino services
-
-All requests for Trino services should use the Trino REST API v1.
-
-To request services, use the explicit URL `http://trino.service:8081/v1`. You need to update `trino.service:8081` with your real Trino address before sending requests.
-
-`POST` requests require the `X-Trino-User` header. If you use authentication, you must use the same `username` that is specified in the authentication configuration. If you do not use authentication, you can specify anything for `username`.
-
-```http
-X-Trino-User: username
-```
-
-For more information about headers, refer to [client request headers](https://trino.io/docs/363/develop/client-protocol.html#client-request-headers).
-
-## Schema
-
-You can submit the statement in the HTTP body. All data is received as a JSON document that might contain a `nextUri` link. If the received JSON document contains a `nextUri` link, the request continues with the `nextUri` link until the received data does not contain a `nextUri` link. If no error is returned, the query completes successfully. If an `error` field is displayed in `stats`, the query fails.
-
-The following is an example of `show catalogs`. The query continues until the received JSON document does not contain a `nextUri` link. Since no `error` is displayed in `stats`, the query completes successfully.
-
-```bash
-curl --header "X-Trino-User: test-user" --request POST --data 'show catalogs' http://localhost:8081/v1/statement
-```
-
-Output:
-
-```json
-{
- "infoUri" : "http://localhost:8081/ui/query.html?20191113_033653_00006_dg6hb",
- "stats" : {
- "queued" : true,
- "nodes" : 0,
- "userTimeMillis" : 0,
- "cpuTimeMillis" : 0,
- "wallTimeMillis" : 0,
- "processedBytes" : 0,
- "processedRows" : 0,
- "runningSplits" : 0,
- "queuedTimeMillis" : 0,
- "queuedSplits" : 0,
- "completedSplits" : 0,
- "totalSplits" : 0,
- "scheduled" : false,
- "peakMemoryBytes" : 0,
- "state" : "QUEUED",
- "elapsedTimeMillis" : 0
- },
- "id" : "20191113_033653_00006_dg6hb",
- "nextUri" : "http://localhost:8081/v1/statement/20191113_033653_00006_dg6hb/1"
-}
-```
-
-```bash
-curl http://localhost:8081/v1/statement/20191113_033653_00006_dg6hb/1
-```
-
-Output:
-
-```json
-{
- "infoUri" : "http://localhost:8081/ui/query.html?20191113_033653_00006_dg6hb",
- "nextUri" : "http://localhost:8081/v1/statement/20191113_033653_00006_dg6hb/2",
- "id" : "20191113_033653_00006_dg6hb",
- "stats" : {
- "state" : "PLANNING",
- "totalSplits" : 0,
- "queued" : false,
- "userTimeMillis" : 0,
- "completedSplits" : 0,
- "scheduled" : false,
- "wallTimeMillis" : 0,
- "runningSplits" : 0,
- "queuedSplits" : 0,
- "cpuTimeMillis" : 0,
- "processedRows" : 0,
- "processedBytes" : 0,
- "nodes" : 0,
- "queuedTimeMillis" : 1,
- "elapsedTimeMillis" : 2,
- "peakMemoryBytes" : 0
- }
-}
-```
-
-```bash
-curl http://localhost:8081/v1/statement/20191113_033653_00006_dg6hb/2
-```
-
-Output:
-
-```json
-{
- "id" : "20191113_033653_00006_dg6hb",
- "data" : [
- [
- "pulsar"
- ],
- [
- "system"
- ]
- ],
- "infoUri" : "http://localhost:8081/ui/query.html?20191113_033653_00006_dg6hb",
- "columns" : [
- {
- "typeSignature" : {
- "rawType" : "varchar",
- "arguments" : [
- {
- "kind" : "LONG_LITERAL",
- "value" : 6
- }
- ],
- "literalArguments" : [],
- "typeArguments" : []
- },
- "name" : "Catalog",
- "type" : "varchar(6)"
- }
- ],
- "stats" : {
- "wallTimeMillis" : 104,
- "scheduled" : true,
- "userTimeMillis" : 14,
- "progressPercentage" : 100,
- "totalSplits" : 19,
- "nodes" : 1,
- "cpuTimeMillis" : 16,
- "queued" : false,
- "queuedTimeMillis" : 1,
- "state" : "FINISHED",
- "peakMemoryBytes" : 0,
- "elapsedTimeMillis" : 111,
- "processedBytes" : 0,
- "processedRows" : 0,
- "queuedSplits" : 0,
- "rootStage" : {
- "cpuTimeMillis" : 1,
- "runningSplits" : 0,
- "state" : "FINISHED",
- "completedSplits" : 1,
- "subStages" : [
- {
- "cpuTimeMillis" : 14,
- "runningSplits" : 0,
- "state" : "FINISHED",
- "completedSplits" : 17,
- "subStages" : [
- {
- "wallTimeMillis" : 7,
- "subStages" : [],
- "stageId" : "2",
- "done" : true,
- "nodes" : 1,
- "totalSplits" : 1,
- "processedBytes" : 22,
- "processedRows" : 2,
- "queuedSplits" : 0,
- "userTimeMillis" : 1,
- "cpuTimeMillis" : 1,
- "runningSplits" : 0,
- "state" : "FINISHED",
- "completedSplits" : 1
- }
- ],
- "wallTimeMillis" : 92,
- "nodes" : 1,
- "done" : true,
- "stageId" : "1",
- "userTimeMillis" : 12,
- "processedRows" : 2,
- "processedBytes" : 51,
- "queuedSplits" : 0,
- "totalSplits" : 17
- }
- ],
- "wallTimeMillis" : 5,
- "done" : true,
- "nodes" : 1,
- "stageId" : "0",
- "userTimeMillis" : 1,
- "processedRows" : 2,
- "processedBytes" : 22,
- "totalSplits" : 1,
- "queuedSplits" : 0
- },
- "runningSplits" : 0,
- "completedSplits" : 19
- }
-}
-```
-
-:::note
-
-Since the response data is not in sync with the query state from the perspective of clients, you cannot rely on the response data to determine whether the query has completed.
-
-:::
-
-For more information about the Trino REST API, refer to [Trino client REST API](https://trino.io/docs/363/develop/client-protocol.html).
diff --git a/sidebars.json b/sidebars.json
index 85cd06327bb4..b36a35ee4633 100644
--- a/sidebars.json
+++ b/sidebars.json
@@ -189,16 +189,6 @@
"io-develop"
]
},
- {
- "type": "category",
- "label": "Pulsar SQL",
- "items": [
- "sql-overview",
- "sql-getting-started",
- "sql-deployment-configurations",
- "sql-rest-api"
- ]
- },
{
"type": "category",
"label": "Tiered Storage",
diff --git a/versioned_docs/version-3.2.x/concepts-multi-tenancy.md b/versioned_docs/version-3.2.x/concepts-multi-tenancy.md
index adda813bf8b0..7be84e311bb6 100644
--- a/versioned_docs/version-3.2.x/concepts-multi-tenancy.md
+++ b/versioned_docs/version-3.2.x/concepts-multi-tenancy.md
@@ -46,7 +46,6 @@ persistent://tenant/app1/topic-3
Pulsar is a multi-tenant event streaming system. Administrators can manage the tenants and namespaces by setting policies at different levels. However, the policies, such as retention policy and storage quota policy, are only available at a namespace level. In many use cases, users need to set a policy at the topic level. The namespace change events approach is proposed for supporting topic-level policies in an efficient way. In this approach, Pulsar is used as an event log to store name [...]
- Avoid using ZooKeeper and introducing more load on ZooKeeper.
- Use Pulsar as an event log for propagating the policy cache. It can scale efficiently.
-- Use Pulsar SQL to query the namespace changes and audit the system.
Each namespace has a [system topic](concepts-messaging.md#system-topic) named `__change_events`. This system topic stores change events for a given namespace. The following figure illustrates how to leverage the system topic to update topic-level policies.
diff --git a/versioned_docs/version-3.2.x/cookbooks-tiered-storage.md b/versioned_docs/version-3.2.x/cookbooks-tiered-storage.md
index 7a9d99790ba0..9c569dca652a 100644
--- a/versioned_docs/version-3.2.x/cookbooks-tiered-storage.md
+++ b/versioned_docs/version-3.2.x/cookbooks-tiered-storage.md
@@ -31,8 +31,6 @@ Pulsar uses multi-part objects to upload the segment data. It is possible that a
We recommend you add a life cycle rule to your bucket to expire incomplete multi-part uploads after a day or two to avoid getting charged for incomplete uploads.
-When ledgers are offloaded to long term storage, you can still query data in the offloaded ledgers with Pulsar SQL.
-
## Configuring the offload driver
Offloading is configured in `broker.conf`.
diff --git a/versioned_docs/version-3.2.x/reference-metrics.md b/versioned_docs/version-3.2.x/reference-metrics.md
index d30a60f4f428..736dae5c8144 100644
--- a/versioned_docs/version-3.2.x/reference-metrics.md
+++ b/versioned_docs/version-3.2.x/reference-metrics.md
@@ -12,7 +12,6 @@ Pulsar exposes the following metrics in Prometheus format. You can monitor your
- [Pulsar Functions](#pulsar-functions)
- [Connectors](#connectors)
- [Proxy](#proxy)
- - [Pulsar SQL Worker](#pulsar-sql-worker)
- [Pulsar transaction](#pulsar-transaction)
The following types of metrics are available:
@@ -834,33 +833,6 @@ All the proxy metrics are labeled with the following labels:
| pulsar_proxy_binary_ops | Counter | Counter of proxy operations. |
| pulsar_proxy_binary_bytes | Counter | Counter of proxy bytes. |
-## Pulsar SQL Worker
-
-| Name | Type | Description |
-|---|---|---|
-| split_bytes_read | Counter | Number of bytes read from BookKeeper. |
-| split_num_messages_deserialized | Counter | Number of messages deserialized. |
-| split_num_record_deserialized | Counter | Number of records deserialized. |
-| split_bytes_read_per_query | Summary | Total number of bytes read per query. |
-| split_entry_deserialize_time | Summary | Time spent on deserializing entries. |
-| split_entry_deserialize_time_per_query | Summary | Time spent on deserializing entries per query. |
-| split_entry_queue_dequeue_wait_time | Summary | Time spent waiting to get an entry from the entry queue because it is empty. |
-| split_entry_queue_dequeue_wait_time_per_query | Summary | Total time spent waiting to get entries from the entry queue per query. |
-| split_message_queue_dequeue_wait_time_per_query | Summary | Time spent waiting to dequeue from the message queue because it is empty, per query. |
-| split_message_queue_enqueue_wait_time | Summary | Time spent waiting for message queue enqueue because the message queue is full. |
-| split_message_queue_enqueue_wait_time_per_query | Summary | Time spent waiting for message queue enqueue because the message queue is full, per query. |
-| split_num_entries_per_batch | Summary | Number of entries per batch. |
-| split_num_entries_per_query | Summary | Number of entries per query. |
-| split_num_messages_deserialized_per_entry | Summary | Number of messages deserialized per entry. |
-| split_num_messages_deserialized_per_query | Summary | Number of messages deserialized per query. |
-| split_read_attempts | Summary | Number of read attempts (failing if the queues are full). |
-| split_read_attempts_per_query | Summary | Number of read attempts per query. |
-| split_read_latency_per_batch | Summary | Latency of reads per batch. |
-| split_read_latency_per_query | Summary | Total read latency per query. |
-| split_record_deserialize_time | Summary | Time spent deserializing messages to records, for example, Avro, JSON, and so on. |
-| split_record_deserialize_time_per_query | Summary | Time spent deserializing messages to records per query. |
-| split_total_execution_time | Summary | The total execution time. |
-
## Pulsar transaction
All the transaction metrics are labeled with the following labels:
diff --git a/versioned_docs/version-3.2.x/sql-deployment-configurations.md
b/versioned_docs/version-3.2.x/sql-deployment-configurations.md
deleted file mode 100644
index 1c1351f58240..000000000000
--- a/versioned_docs/version-3.2.x/sql-deployment-configurations.md
+++ /dev/null
@@ -1,256 +0,0 @@
----
-id: sql-deployment-configurations
-title: Pulsar SQL configuration and deployment
-sidebar_label: "Configuration and deployment"
-description: Configure the Pulsar Trino plugin and deploy a Pulsar SQL cluster.
----
-
-You can configure the Pulsar Trino plugin and deploy a cluster with the
following instruction.
-
-## Configure Pulsar Trino plugin
-
-To configure the Pulsar Trino plugin, you can modify the
`${project.root}/trino/conf/catalog/pulsar.properties` properties file. The
configuration for the connector and the default values are as follows.
-
-```properties
-# name of the connector to be displayed in the catalog
-connector.name=pulsar
-
-# the URL of Pulsar broker service
-pulsar.web-service-url=http://localhost:8080
-
-# the URL of Pulsar broker binary service
-pulsar.broker-binary-service-url=pulsar://localhost:6650
-
-# the URL of Zookeeper cluster
-pulsar.zookeeper-uri=localhost:2181
-
-# minimum number of entries to read at a single time
-pulsar.entry-read-batch-size=100
-
-# default number of splits to use per query
-pulsar.target-num-splits=4
-
-# max size of one batch message (default value is 5MB)
-pulsar.max-message-size=5242880
-
-# size of queue to buffer entry read from Pulsar
-pulsar.max-split-entry-queue-size=1000
-
-# size of queue to buffer message extract from entries
-pulsar.max-split-message-queue-size=10000
-
-# status provider to record connector metrics
-pulsar.stats-provider=org.apache.bookkeeper.stats.NullStatsProvider
-
-# config in map format for stats provider e.g. {"key1":"val1","key2":"val2"}
-pulsar.stats-provider-configs={}
-
-# whether to rewrite Pulsar's default topic delimiter '/'
-pulsar.namespace-delimiter-rewrite-enable=false
-
-# delimiter used to rewrite Pulsar's default delimiter '/'; use it if the default causes incompatibility with other systems, such as Superset
-pulsar.rewrite-namespace-delimiter="/"
-
-# maximum number of thread pool size for ledger offloader.
-pulsar.managed-ledger-offload-max-threads=2
-
-# driver used to offload or read cold data to or from long-term storage
-pulsar.managed-ledger-offload-driver=null
-
-# directory to load offloaders nar file.
-pulsar.offloaders-directory="./offloaders"
-
-# properties and configurations related to a specific offloader implementation, as a map, e.g. {"key1":"val1","key2":"val2"}
-pulsar.offloader-properties={}
-
-# authentication plugin used to authenticate to Pulsar cluster
-pulsar.auth-plugin=null
-
-# authentication parameters used to authenticate to the Pulsar cluster, as a string, e.g. "key1:val1,key2:val2"
-pulsar.auth-params=null
-
-# whether the Pulsar client accepts an untrusted TLS certificate from the broker
-pulsar.tls-allow-insecure-connection=null
-
-# whether to allow hostname verification when a client connects to a broker over TLS.
-pulsar.tls-hostname-verification-enable=null
-
-# path for the trusted TLS certificate file of Pulsar broker
-pulsar.tls-trust-cert-file-path=null
-
-# whether to enable Pulsar authorization
-pulsar.authorization-enabled=false
-
-# set the threshold for BookKeeper request throttle, default is disabled
-pulsar.bookkeeper-throttle-value=0
-
-# set the number of IO thread
-pulsar.bookkeeper-num-io-threads=2 * Runtime.getRuntime().availableProcessors()
-
-# set the number of worker thread
-pulsar.bookkeeper-num-worker-threads=Runtime.getRuntime().availableProcessors()
-
-# whether to use BookKeeper V2 wire protocol
-pulsar.bookkeeper-use-v2-protocol=true
-
-# interval to check the need for sending an explicit LAC, default is disabled
-pulsar.bookkeeper-explicit-interval=0
-
-# size for managed ledger entry cache (in MB).
-pulsar.managed-ledger-cache-size-MB=0
-
-# number of threads to be used for managed ledger tasks dispatching
-pulsar.managed-ledger-num-worker-threads=Runtime.getRuntime().availableProcessors()
-
-# number of threads to be used for managed ledger scheduled tasks
-pulsar.managed-ledger-num-scheduler-threads=Runtime.getRuntime().availableProcessors()
-
-# directory used to store extraction NAR file
-pulsar.nar-extraction-directory=System.getProperty("java.io.tmpdir")
-```
-
-### Enable authentication and authorization between Pulsar and Pulsar SQL
-
-To enable authentication and authorization between Pulsar and Pulsar SQL, you
need to set the following configurations in the
`${project.root}/trino/conf/catalog/pulsar.properties` properties file:
-
-```properties
-pulsar.authorization-enabled=true
-pulsar.broker-binary-service-url=pulsar://localhost:6650
-```
-
-:::note
-By default, the authentication and authorization between Pulsar and Pulsar SQL
are **disabled**.
-:::
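When authentication is on, the `pulsar.auth-plugin` and `pulsar.auth-params` settings from the reference listing above also need real values. The following is a minimal sketch, not part of the removed docs, using Pulsar's token authentication plugin; the token file path is an illustrative placeholder:

```properties
pulsar.authorization-enabled=true
pulsar.broker-binary-service-url=pulsar://localhost:6650

# token authentication; the token file path below is an example value
pulsar.auth-plugin=org.apache.pulsar.client.impl.auth.AuthenticationToken
pulsar.auth-params=file:///path/to/client-token.jwt
```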
-
-### Connect Trino to Pulsar with multiple hosts
-
-To connect Trino to multiple broker hosts, add multiple URLs to `pulsar.web-service-url`.
-To connect Trino to multiple ZooKeeper hosts, add multiple URLs to `pulsar.zookeeper-uri`.
-
-The following is an example.
-
-```properties
-pulsar.web-service-url=http://localhost:8080,localhost:8081,localhost:8082
-pulsar.zookeeper-uri=localhost1,localhost2:2181
-```
-
-### Get the last message in a topic
-
-:::note
-
-By default, Pulsar SQL **does not get the last message in a topic**. This is by design and controlled by settings. By default, the BookKeeper LAC only advances when subsequent entries are added. If no subsequent entry is added, the last written entry is not visible to readers until the ledger is closed. This is not a problem for Pulsar itself, which reads through the managed ledger, but Pulsar SQL reads directly from the BookKeeper ledger.
-
-:::
-
-To get the last message in a topic, you need to set the following
configurations:
-
-1. For the broker configuration, set `bookkeeperExplicitLacIntervalInMills` >
0 in `broker.conf` or `standalone.conf`.
-
-2. For the Trino configuration, set `pulsar.bookkeeper-explicit-interval` > 0
and `pulsar.bookkeeper-use-v2-protocol=false`.
-
-However, using the BookKeeper V3 protocol introduces additional GC overhead to BookKeeper because it uses Protobuf.
-
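Taken together, the broker-side and Trino-side settings from the two steps above could look like the following sketch; the interval values are illustrative, not defaults:

```properties
# broker.conf (or standalone.conf): write an explicit LAC every second
bookkeeperExplicitLacIntervalInMills=1000

# ${project.root}/trino/conf/catalog/pulsar.properties: poll the explicit LAC
# and use the V3 wire protocol, which is needed to read it
pulsar.bookkeeper-explicit-interval=1000
pulsar.bookkeeper-use-v2-protocol=false
```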
-## Query data from existing Trino clusters
-
-If you already have a Trino cluster compatible with version 363, you can copy the Pulsar Trino plugin to your existing cluster. Download the archived plugin package with the following command.
-
-```bash
-wget pulsar:binary_release_url
-```
-
-## Deploy a new cluster
-
-Since Pulsar SQL is powered by Trino, deploying a Pulsar SQL worker uses the same configuration as deploying Trino.
-
-:::note
-
-For how to set up a standalone single node environment, refer to [Query
data](sql-getting-started.md).
-
-:::
-
-You can use the same CLI args as the Trino launcher.
-
-The default configuration for the cluster is located in
`${project.root}/trino/conf`. You can customize your deployment by modifying
the default configuration.
-
-You can set the worker to read from a different configuration directory, or
set a different directory to write data.
-
-```bash
-./bin/pulsar sql-worker run --etc-dir /tmp/pulsar/trino/conf --data-dir
/tmp/trino-1
-```
-
-You can start the worker as a daemon process.
-
-```bash
-./bin/pulsar sql-worker start
-```
-
-### Deploy a cluster on multiple nodes
-
-You can deploy a Pulsar SQL cluster or Trino cluster on multiple nodes. The following steps show how to deploy a three-node cluster.
-
-Step 1: Copy the Pulsar binary distribution to three nodes.
-
-The first node runs as the Trino coordinator. The minimal configuration required in the `${project.root}/trino/conf/config.properties` file is as follows.
-
-```properties
-coordinator=true
-node-scheduler.include-coordinator=true
-http-server.http.port=8080
-query.max-memory=50GB
-query.max-memory-per-node=1GB
-discovery-server.enabled=true
-discovery.uri=<coordinator-url>
-```
-
-The other two nodes serve as worker nodes. You can use the following configuration for the worker nodes.
-
-```properties
-coordinator=false
-http-server.http.port=8080
-query.max-memory=50GB
-query.max-memory-per-node=1GB
-discovery.uri=<coordinator-url>
-```
-
-Step 2: Modify the `pulsar.web-service-url` and `pulsar.zookeeper-uri` configurations in the `${project.root}/trino/conf/catalog/pulsar.properties` file accordingly on each of the three nodes.
-
-Step 3: Start the coordinator node.
-
-```bash
-./bin/pulsar sql-worker run
-```
-
-Step 4: Start worker nodes.
-
-```bash
-./bin/pulsar sql-worker run
-```
-
-Step 5: Start the SQL CLI.
-
-```bash
-./bin/pulsar sql --server <coordinator-url>
-```
-
-Step 6: Check the status of your nodes.
-
-```bash
-trino> SELECT * FROM system.runtime.nodes;
- node_id | http_uri | node_version | coordinator | state
----------+-------------------------+--------------+-------------+--------
- 1 | http://192.168.2.1:8081 | testversion | true | active
- 3 | http://192.168.2.2:8081 | testversion | false | active
- 2 | http://192.168.2.3:8081 | testversion | false | active
-```
-
-For more information about the deployment in Trino, refer to [Trino
deployment](https://trino.io/docs/363/installation/deployment.html).
-
-:::note
-
-The broker does not advance the LAC, so when Pulsar SQL bypasses brokers to query data, it can only read entries up to the LAC that all the bookies have learned. You can make the broker write the LAC periodically by setting `bookkeeperExplicitLacIntervalInMills` in the `broker.conf` file.
-
-:::
-
diff --git a/versioned_docs/version-3.2.x/sql-getting-started.md
b/versioned_docs/version-3.2.x/sql-getting-started.md
deleted file mode 100644
index b7d6c18f7b9c..000000000000
--- a/versioned_docs/version-3.2.x/sql-getting-started.md
+++ /dev/null
@@ -1,119 +0,0 @@
----
-id: sql-getting-started
-title: Query data with Pulsar SQL
-sidebar_label: "Query data"
-description: Query data with Pulsar SQL.
----
-
-Before querying data in Pulsar, you need to install Pulsar and the built-in connectors.
-
-## Requirements
-
-1. Install [Pulsar](getting-started-standalone.md).
-2. Install Pulsar [built-in
connectors](io-quickstart.md#install-pulsar-and-built-in-connector).
-
-## Query data in Pulsar
-
-To query data in Pulsar with Pulsar SQL, you need to complete the following
steps:
-
-### Step 1: Start a Pulsar cluster
-
-```bash
-PULSAR_STANDALONE_USE_ZOOKEEPER=1 ./bin/pulsar standalone
-```
-
-:::note
-
-Starting the Pulsar standalone cluster from scratch doesn't enable ZooKeeper by default. However, Pulsar SQL depends on ZooKeeper, so you need to set `PULSAR_STANDALONE_USE_ZOOKEEPER=1` to enable it.
-
-:::
-
-### Step 2: Start a Pulsar SQL worker
-
-```bash
-./bin/pulsar sql-worker run
-```
-
-### Step 3: Run SQL CLI
-
-```bash
-./bin/pulsar sql
-```
-
-### Step 4: Test with SQL commands
-
-```bash
-trino> show catalogs;
- Catalog
----------
- pulsar
- system
-(2 rows)
-
-Query 20180829_211752_00004_7qpwh, FINISHED, 1 node
-Splits: 19 total, 19 done (100.00%)
-0:00 [0 rows, 0B] [0 rows/s, 0B/s]
-
-
-trino> show schemas in pulsar;
- Schema
------------------------
- information_schema
- public/default
- public/functions
-(3 rows)
-
-Query 20180829_211818_00005_7qpwh, FINISHED, 1 node
-Splits: 19 total, 19 done (100.00%)
-0:00 [4 rows, 89B] [21 rows/s, 471B/s]
-
-
-trino> show tables in pulsar."public/default";
- Table
--------
-(0 rows)
-
-Query 20180829_211839_00006_7qpwh, FINISHED, 1 node
-Splits: 19 total, 19 done (100.00%)
-0:00 [0 rows, 0B] [0 rows/s, 0B/s]
-```
-
-Since there is no data in Pulsar, no records are returned.
-
-### Step 5: Ingest some mock data
-
-```bash
-./bin/pulsar-admin sources create --name generator --destinationTopicName
generator_test --source-type data-generator
-```
-
-Then you can query a topic in the namespace "public/default":
-
-```bash
-trino> show tables in pulsar."public/default";
- Table
-----------------
- generator_test
-(1 row)
-
-Query 20180829_213202_00000_csyeu, FINISHED, 1 node
-Splits: 19 total, 19 done (100.00%)
-0:02 [1 rows, 38B] [0 rows/s, 17B/s]
-```
-
-You can now query the data within the topic "generator_test":
-
-```bash
-trino> select * from pulsar."public/default".generator_test;
-
- firstname | middlename | lastname  | email             | username  | password | telephonenumber | age | companyemail      | nationalidentitycardnumber |
------------+------------+-----------+-------------------+-----------+----------+-----------------+-----+-------------------+----------------------------+
- Genesis   | Katherine  | Wiley     | [email protected] | genesisw  | y9D2dtU3 | 959-197-1860    | 71  | [email protected] | 880-58-9247                |
- Brayden   |            | Stanton   | [email protected] | braydens  | ZnjmhXik | 220-027-867     | 81  | [email protected] | 604-60-7069                |
- Benjamin  | Julian     | Velasquez | [email protected] | benjaminv | 8Bc7m3eb | 298-377-0062    | 21  | [email protected] | 213-32-5882                |
- Michael   | Thomas     | Donovan   | [email protected] | michaeld  | OqBm9MLs | 078-134-4685    | 55  | [email protected] | 443-30-3442                |
- Brooklyn  | Avery      | Roach     | [email protected] | broach    | IxtBLafO | 387-786-2998    | 68  | [email protected] | 085-88-3973                |
- Skylar    |            | Bradshaw  | [email protected] | skylarb   | p6eC6cKy | 210-872-608     | 96  | [email protected] | 453-46-0334                |
-...
-```
-
-You can query the mock data.
diff --git a/versioned_docs/version-3.2.x/sql-overview.md
b/versioned_docs/version-3.2.x/sql-overview.md
deleted file mode 100644
index 7032df6f9cb6..000000000000
--- a/versioned_docs/version-3.2.x/sql-overview.md
+++ /dev/null
@@ -1,22 +0,0 @@
----
-id: sql-overview
-title: Pulsar SQL Overview
-sidebar_label: "Overview"
-description: Get a comprehensive understanding of Pulsar SQL.
----
-
-Apache Pulsar is used to store streams of event data, and the event data is
structured with predefined fields. With the implementation of the [Schema
Registry](schema-get-started.md), you can store structured data in Pulsar and
query the data by using [Trino (formerly Presto SQL)](https://trino.io/).
-
-As the core of Pulsar SQL, the Pulsar Trino plugin enables Trino workers
within a Trino cluster to query data from Pulsar.
-
-
-
-The query performance is efficient and highly scalable, because Pulsar adopts a [two-level-segment-based architecture](concepts-architecture-overview.md#apache-bookkeeper).
-
-Topics in Pulsar are stored as segments in [Apache
BookKeeper](https://bookkeeper.apache.org/). Each topic segment is replicated
to some BookKeeper nodes, which enables concurrent reads and high read
throughput. In the Pulsar Trino connector, data is read directly from
BookKeeper, so Trino workers can read concurrently from a horizontally scalable
number of BookKeeper nodes.
-
-
-
-## Caveat
-
-If you're upgrading Pulsar SQL from 2.11 or earlier, copy the config files from `conf/presto` to `trino/conf`. If you're downgrading Pulsar SQL from a newer version to 2.11 or earlier, do the reverse.
diff --git a/versioned_docs/version-3.2.x/sql-rest-api.md
b/versioned_docs/version-3.2.x/sql-rest-api.md
deleted file mode 100644
index d23f140e6f14..000000000000
--- a/versioned_docs/version-3.2.x/sql-rest-api.md
+++ /dev/null
@@ -1,207 +0,0 @@
----
-id: sql-rest-api
-title: Pulsar SQL REST APIs
-sidebar_label: "REST APIs"
-description: Get a comprehensive understanding of Trino REST API.
----
-
-This section lists resources that make up the Trino REST API v1.
-
-## Request for Trino services
-
-All requests for Trino services should use version 1 (v1) of the Trino REST API.
-
-To request services, use the explicit URL `http://trino.service:8081/v1`. You
need to update `trino.service:8081` with your real Trino address before sending
requests.
-
-`POST` requests require the `X-Trino-User` header. If you use authentication,
you must use the same `username` that is specified in the authentication
configuration. If you do not use authentication, you can specify anything for
`username`.
-
-```http
-X-Trino-User: username
-```
-
-For more information about headers, refer to [client request
headers](https://trino.io/docs/363/develop/client-protocol.html#client-request-headers).
-
-## Schema
-
-You can put a statement in the HTTP body. All data is received as a JSON document that might contain a `nextUri` link. If the received JSON document contains a `nextUri` link, continue requesting with that link until the received data no longer contains one. If no error is returned, the query has completed successfully. If an `error` field appears in `stats`, the query has failed.
-
-The following is an example of `show catalogs`. The query continues until the received JSON document no longer contains a `nextUri` link. Since no `error` appears in `stats`, the query has completed successfully.
-
-```bash
-curl --header "X-Trino-User: test-user" --request POST --data 'show catalogs'
http://localhost:8081/v1/statement
-```
-
-Output:
-
-```json
-{
- "infoUri" :
"http://localhost:8081/ui/query.html?20191113_033653_00006_dg6hb",
- "stats" : {
- "queued" : true,
- "nodes" : 0,
- "userTimeMillis" : 0,
- "cpuTimeMillis" : 0,
- "wallTimeMillis" : 0,
- "processedBytes" : 0,
- "processedRows" : 0,
- "runningSplits" : 0,
- "queuedTimeMillis" : 0,
- "queuedSplits" : 0,
- "completedSplits" : 0,
- "totalSplits" : 0,
- "scheduled" : false,
- "peakMemoryBytes" : 0,
- "state" : "QUEUED",
- "elapsedTimeMillis" : 0
- },
- "id" : "20191113_033653_00006_dg6hb",
- "nextUri" :
"http://localhost:8081/v1/statement/20191113_033653_00006_dg6hb/1"
-}
-```
-
-```bash
-curl http://localhost:8081/v1/statement/20191113_033653_00006_dg6hb/1
-```
-
-Output:
-
-```json
-{
- "infoUri" :
"http://localhost:8081/ui/query.html?20191113_033653_00006_dg6hb",
- "nextUri" :
"http://localhost:8081/v1/statement/20191113_033653_00006_dg6hb/2",
- "id" : "20191113_033653_00006_dg6hb",
- "stats" : {
- "state" : "PLANNING",
- "totalSplits" : 0,
- "queued" : false,
- "userTimeMillis" : 0,
- "completedSplits" : 0,
- "scheduled" : false,
- "wallTimeMillis" : 0,
- "runningSplits" : 0,
- "queuedSplits" : 0,
- "cpuTimeMillis" : 0,
- "processedRows" : 0,
- "processedBytes" : 0,
- "nodes" : 0,
- "queuedTimeMillis" : 1,
- "elapsedTimeMillis" : 2,
- "peakMemoryBytes" : 0
- }
-}
-```
-
-```bash
-curl http://localhost:8081/v1/statement/20191113_033653_00006_dg6hb/2
-```
-
-Output:
-
-```json
-{
- "id" : "20191113_033653_00006_dg6hb",
- "data" : [
- [
- "pulsar"
- ],
- [
- "system"
- ]
- ],
- "infoUri" :
"http://localhost:8081/ui/query.html?20191113_033653_00006_dg6hb",
- "columns" : [
- {
- "typeSignature" : {
- "rawType" : "varchar",
- "arguments" : [
- {
- "kind" : "LONG_LITERAL",
- "value" : 6
- }
- ],
- "literalArguments" : [],
- "typeArguments" : []
- },
- "name" : "Catalog",
- "type" : "varchar(6)"
- }
- ],
- "stats" : {
- "wallTimeMillis" : 104,
- "scheduled" : true,
- "userTimeMillis" : 14,
- "progressPercentage" : 100,
- "totalSplits" : 19,
- "nodes" : 1,
- "cpuTimeMillis" : 16,
- "queued" : false,
- "queuedTimeMillis" : 1,
- "state" : "FINISHED",
- "peakMemoryBytes" : 0,
- "elapsedTimeMillis" : 111,
- "processedBytes" : 0,
- "processedRows" : 0,
- "queuedSplits" : 0,
- "rootStage" : {
- "cpuTimeMillis" : 1,
- "runningSplits" : 0,
- "state" : "FINISHED",
- "completedSplits" : 1,
- "subStages" : [
- {
- "cpuTimeMillis" : 14,
- "runningSplits" : 0,
- "state" : "FINISHED",
- "completedSplits" : 17,
- "subStages" : [
- {
- "wallTimeMillis" : 7,
- "subStages" : [],
- "stageId" : "2",
- "done" : true,
- "nodes" : 1,
- "totalSplits" : 1,
- "processedBytes" : 22,
- "processedRows" : 2,
- "queuedSplits" : 0,
- "userTimeMillis" : 1,
- "cpuTimeMillis" : 1,
- "runningSplits" : 0,
- "state" : "FINISHED",
- "completedSplits" : 1
- }
- ],
- "wallTimeMillis" : 92,
- "nodes" : 1,
- "done" : true,
- "stageId" : "1",
- "userTimeMillis" : 12,
- "processedRows" : 2,
- "processedBytes" : 51,
- "queuedSplits" : 0,
- "totalSplits" : 17
- }
- ],
- "wallTimeMillis" : 5,
- "done" : true,
- "nodes" : 1,
- "stageId" : "0",
- "userTimeMillis" : 1,
- "processedRows" : 2,
- "processedBytes" : 22,
- "totalSplits" : 1,
- "queuedSplits" : 0
- },
- "runningSplits" : 0,
- "completedSplits" : 19
- }
-}
-```
-
-:::note
-
-Since the response data is not in sync with the query state from the perspective of clients, you cannot rely on the response data to determine whether the query has completed.
-
-:::
-
-For more information about Trino REST API, refer to [Trino client REST
API](https://trino.io/docs/363/develop/client-protocol.html).
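The polling protocol described in the removed page (follow `nextUri` until it disappears, then check for an `error` field) can be sketched as a small client. This is an illustrative sketch, not part of Pulsar or Trino: the server address, user name, and helper names are assumptions, and `run_statement` needs a reachable Trino coordinator to do anything useful.

```python
import json
import urllib.request

def next_uri(doc):
    """Return the link to poll next, or None when no further polling is needed."""
    return doc.get("nextUri")

def query_failed(doc):
    """A query failed iff an error field is reported (top level or under stats)."""
    return "error" in doc or "error" in doc.get("stats", {})

def run_statement(sql, server="http://localhost:8081", user="test-user"):
    """Submit a statement and follow nextUri links, collecting any row data."""
    request = urllib.request.Request(
        f"{server}/v1/statement",
        data=sql.encode("utf-8"),
        headers={"X-Trino-User": user},  # required on POST requests
        method="POST",
    )
    doc = json.load(urllib.request.urlopen(request))
    rows = []
    while True:
        rows.extend(doc.get("data", []))
        uri = next_uri(doc)
        if uri is None:
            break
        doc = json.load(urllib.request.urlopen(uri))
    if query_failed(doc):
        raise RuntimeError(f"query {doc.get('id')} failed: {doc.get('error')}")
    return rows

# The helpers can be exercised offline against responses shaped like the
# sample JSON documents shown earlier.
queued = {"id": "q", "stats": {"state": "QUEUED"},
          "nextUri": "http://localhost:8081/v1/statement/q/1"}
finished = {"id": "q", "stats": {"state": "FINISHED"},
            "data": [["pulsar"], ["system"]]}
print(next_uri(queued))        # the link to follow next
print(next_uri(finished))      # None: the query is done
print(query_failed(finished))  # False
```

The two helpers only encode the two rules the page states; a production client would also handle retries and the `X-Trino-User` requirements of its own deployment.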
diff --git a/versioned_sidebars/version-3.2.x-sidebars.json
b/versioned_sidebars/version-3.2.x-sidebars.json
index dba0a504946f..8d3f8a1fb70d 100644
--- a/versioned_sidebars/version-3.2.x-sidebars.json
+++ b/versioned_sidebars/version-3.2.x-sidebars.json
@@ -189,16 +189,6 @@
"io-develop"
]
},
- {
- "type": "category",
- "label": "Pulsar SQL",
- "items": [
- "sql-overview",
- "sql-getting-started",
- "sql-deployment-configurations",
- "sql-rest-api"
- ]
- },
{
"type": "category",
"label": "Tiered Storage",