This is an automated email from the ASF dual-hosted git repository.

jerrypeng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pulsar.wiki.git


The following commit(s) were added to refs/heads/master by this push:
     new a190d49  Updated PIP 18: Pulsar SQL (markdown)
a190d49 is described below

commit a190d49d6128a93533e5acbecd4fb4a1da83f752
Author: Boyang Jerry Peng <[email protected]>
AuthorDate: Thu Jul 19 12:39:36 2018 -0700

    Updated PIP 18: Pulsar SQL (markdown)
---
 PIP-18:-Pulsar-SQL.md | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/PIP-18:-Pulsar-SQL.md b/PIP-18:-Pulsar-SQL.md
index 50333c7..716f64b 100644
--- a/PIP-18:-Pulsar-SQL.md
+++ b/PIP-18:-Pulsar-SQL.md
@@ -9,10 +9,13 @@ N/A
 We are trying to create a method in which users can explore, in a natural manner,  the data already stored within Pulsar topics.  We believe the best way to accomplish this is to expose SQL interface that allows users to query existing data within a Pulsar cluster.
 
 Just to be absolutely clear,  the SQL we are proposing is for querying data already in Pulsar and we are currently not proposing the implementation of any sort of SQL on data streams
-Why are we doing this?
 
-Many users are interested for such a feature.  For example, many users store large amounts of historical data in Pulsar for various purposes.  Giving them to capability to query that that data gives them huge value.  Users will typically need to stream the data out of Pulsar and into another platform to do any sort of analysis, but with Pulsar SQL, users can just use one platform.
-How are we going to do it?
+
+## Why are we doing this?
+
+Many users are interested in such a feature.  For example, many users store large amounts of historical data in Pulsar for various purposes.  Giving them to capability to query that that data gives them huge value.  Users will typically need to stream the data out of Pulsar and into another platform to do any sort of analysis, but with Pulsar SQL, users can just use one platform.
+
+## How are we going to do it?
 
 With the implementation of a schema registry in Pulsar, data can be structured so that it can be easily mapped to tables that can be queried by SQL. We plan on using Presto (https://prestodb.io/) as the backbone of Pulsar SQL.  A connector can be implemented using the Presto connector SPI that allows presto to ingest data from Pulsar and to be queried using Presto’s existing SQL framework.
 
@@ -21,7 +24,6 @@ The schema registry will be used to generate the structure of tables that will b
 Thus, Pulsar will be queried for metadata concerning topics and schemas and from that metadata, we will go directly to the bookies to load and deserialize the data.
 
 
-
 ## Goals
 
 * Allow users to submit SQL queries using a Pulsar CLI
@@ -92,4 +94,4 @@ Let’s break the implementation into multiple phases:
 6. Performance testing and optimizing
 
 
-More to come...
\ No newline at end of file
+More to come...
