sijie commented on a change in pull request #4786: Add *Understand Schema* Section
URL: https://github.com/apache/pulsar/pull/4786#discussion_r306603212
 
 

 ##########
 File path: site2/docs/schema-understand.md
 ##########
 @@ -0,0 +1,319 @@
+---
+id: schema-understand
+title: Understand schema
+sidebar_label: Understand schema
+---
+
+## `SchemaInfo`
+
+A Pulsar schema is defined in a data structure called `SchemaInfo`.
+
+`SchemaInfo` is stored and enforced on a per-topic basis; it cannot be stored at the namespace or tenant level.
+
+A `SchemaInfo` consists of the following fields:
+
+| Field | Description |
+|---|---|
+| `name` | Schema name (a string). |
+| `type` | Schema type, which determines how to interpret the schema data. |
+| `schema` | Schema data, a sequence of 8-bit unsigned bytes whose format is specific to the schema type. |
+| `properties` | An application-specific map of string key/value pairs. |
+
+**Example**
+
+This is the `SchemaInfo` of a string.
+
+```text
+{
+    "name": "test-string-schema",
+    "type": "STRING",
+    "schema": "",
+    "properties": {}
+}
+```
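
The four fields above can be modeled as a small record. The following is a minimal sketch in Python, not the Pulsar client API: the `SchemaInfo` class and its `to_dict` helper here are hypothetical stand-ins that only mirror the field table and reproduce the string example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class SchemaInfo:
    """Hypothetical stand-in that mirrors the four documented fields."""

    name: str                      # schema name
    type: str                      # schema type, e.g. "STRING"
    schema: bytes = b""            # schema data; empty for primitive types
    properties: Dict[str, str] = field(default_factory=dict)

    def to_dict(self) -> dict:
        # Render the record in the same shape as the example above.
        return {
            "name": self.name,
            "type": self.type,
            "schema": self.schema.decode("utf-8"),
            "properties": dict(self.properties),
        }


string_schema_info = SchemaInfo(name="test-string-schema", type="STRING")
print(string_schema_info.to_dict())
```

The record is frozen so that a schema, once attached to a topic in this sketch, cannot be mutated in place.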
+
+## Schema type
+
+Pulsar supports various schema types, which are mainly divided into two categories:
+
+* Primitive type 
+
+* Complex type
+
+> #### Note
+> 
+> If you create a schema without specifying a type, producers and consumers can only handle raw bytes.
+
+### Primitive type
+
+Currently, Pulsar supports the following primitive types:
+
+| Primitive Type | Description |
+|---|---|
+| `BOOLEAN` | A binary value |
+| `INT8` | An 8-bit signed integer |
+| `INT16` | A 16-bit signed integer |
+| `INT32` | A 32-bit signed integer |
+| `INT64` | A 64-bit signed integer |
+| `FLOAT` | A single-precision (32-bit) IEEE 754 floating-point number |
+| `DOUBLE` | A double-precision (64-bit) IEEE 754 floating-point number |
+| `BYTES` | A sequence of 8-bit unsigned bytes |
+| `STRING` | A Unicode character sequence |
+| `TIMESTAMP` (`DATE`, `TIME`) | A logical type that represents a specific instant in time with millisecond precision. It stores the number of milliseconds since `January 1, 1970, 00:00:00 GMT` as an `INT64` value. |
+
+For primitive types, Pulsar does not store any schema data in `SchemaInfo`. The `type` in `SchemaInfo` determines how to serialize and deserialize the data.
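
As a rough illustration of type-driven serialization, the sketch below dispatches on the schema type name alone, with no schema data involved. It is written in Python and is not Pulsar's implementation: the `encode`, `decode`, and `to_epoch_millis` helpers are hypothetical, and the big-endian fixed-width integer layout and UTF-8 string encoding are assumptions made for the example. The `TIMESTAMP` handling follows the table above: milliseconds since January 1, 1970, 00:00:00 GMT, stored as an `INT64`.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(instant: datetime) -> int:
    # Whole milliseconds elapsed since the epoch (exact integer arithmetic).
    return (instant - EPOCH) // timedelta(milliseconds=1)


def from_epoch_millis(millis: int) -> datetime:
    return EPOCH + timedelta(milliseconds=millis)


# Assumed wire formats, keyed by schema type name only.
ENCODERS = {
    "INT32": lambda v: struct.pack(">i", v),
    "INT64": lambda v: struct.pack(">q", v),
    "STRING": lambda v: v.encode("utf-8"),
    "TIMESTAMP": lambda v: struct.pack(">q", to_epoch_millis(v)),
}
DECODERS = {
    "INT32": lambda b: struct.unpack(">i", b)[0],
    "INT64": lambda b: struct.unpack(">q", b)[0],
    "STRING": lambda b: b.decode("utf-8"),
    "TIMESTAMP": lambda b: from_epoch_millis(struct.unpack(">q", b)[0]),
}


def encode(schema_type: str, value) -> bytes:
    return ENCODERS[schema_type](value)


def decode(schema_type: str, data: bytes):
    return DECODERS[schema_type](data)


one_second = datetime(1970, 1, 1, 0, 0, 1, tzinfo=timezone.utc)
payload = encode("TIMESTAMP", one_second)
print(payload.hex(), decode("TIMESTAMP", payload))
```

A `TIMESTAMP` payload in this sketch is byte-for-byte identical to the `INT64` payload for the same millisecond count, which is what "logical type" means here.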
+
+Some primitive schema implementations can use `properties` to store implementation-specific tunable settings. For example, a `STRING` schema can use `properties` to store the character encoding used to serialize and deserialize strings.
+
+The conversions between **Pulsar schema types** and **language-specific primitive types** are as follows.
+
+| Schema Type | Java Type| Python Type |
 
 Review comment:
   Pulsar has supported Go since 2.4.0.
   
   @wolfstudy can you help add the corresponding types for Go?

