momo-jun commented on code in PR #18242:
URL: https://github.com/apache/pulsar/pull/18242#discussion_r1012619499
##########
site2/docs/schema-get-started.md:
##########
@@ -4,92 +4,480 @@ title: Get started
sidebar_label: "Get started"
---
-This chapter introduces Pulsar schemas and explains why they are important.
-## Schema Registry
+````mdx-code-block
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+````
-Type safety is extremely important in any application built around a message
bus like Pulsar.
-Producers and consumers need some kind of mechanism for coordinating types at
the topic level to avoid various potential problems arising. For example,
serialization and deserialization issues.
+This hands-on tutorial provides instructions and examples on how to construct
and customize schemas.
-Applications typically adopt one of the following approaches to guarantee type
safety in messaging. Both approaches are available in Pulsar, and you're free
to adopt one or the other or to mix and match on a per-topic basis.
+## Construct a string schema
-#### Note
->
-> Currently, the Pulsar schema registry is only available for the [Java
client](client-libraries-java.md), [Go client](client-libraries-go.md), [Python
client](client-libraries-python.md), and [C++ client](client-libraries-cpp.md).
+This example demonstrates how to construct a [string
schema](schema-understand.md#primitive-type) and use it to produce and consume
messages in Java.
-### Client-side approach
+1. Create a producer with a string schema and send messages.
-Producers and consumers are responsible for not only serializing and
deserializing messages (which consist of raw bytes) but also "knowing" which
types are being transmitted via which topics.
+ ```java
+ Producer<String> producer = client.newProducer(Schema.STRING).create();
+ producer.newMessage().value("Hello Pulsar!").send();
+ ```
-If a producer is sending temperature sensor data on the topic `topic-1`,
consumers of that topic will run into trouble if they attempt to parse that
data as moisture sensor readings.
+2. Create a consumer with a string schema and receive messages.
-Producers and consumers can send and receive messages consisting of raw byte
arrays and leave all type safety enforcement to the application on an
"out-of-band" basis.
+ ```java
+ Consumer<String> consumer = client.newConsumer(Schema.STRING).subscribe();
+ consumer.receive();
+ ```
-### Server-side approach
+## Construct a key/value schema
-Producers and consumers inform the system which data types can be transmitted
via the topic.
+This example shows how to construct a [key/value
schema](schema-understand.md#keyvalue-schema) and use it to produce and consume
messages in Java.
-With this approach, the messaging system enforces type safety and ensures that
producers and consumers remain synced.
+1. Construct a key/value schema with `INLINE` encoding type.
-Pulsar has a built-in **schema registry** that enables clients to upload data
schemas on a per-topic basis. Those schemas dictate which data types are
recognized as valid for that topic.
+ ```java
+ Schema<KeyValue<Integer, String>> kvSchema = Schema.KeyValue(
+ Schema.INT32,
+ Schema.STRING,
+ KeyValueEncodingType.INLINE
+ );
+ ```
-## Why use schema
+2. Optionally, construct a key/value schema with `SEPARATED` encoding type.
-When a schema is enabled, Pulsar does parse data, it takes bytes as inputs and
sends bytes as outputs. While data has meaning beyond bytes, you need to parse
data and might encounter parse exceptions which mainly occur in the following
situations:
+ ```java
+ Schema<KeyValue<Integer, String>> kvSchema = Schema.KeyValue(
+ Schema.INT32,
+ Schema.STRING,
+ KeyValueEncodingType.SEPARATED
+ );
+ ```
-* The field does not exist
+3. Produce messages using a key/value schema.
-* The field type has changed (for example, `string` is changed to `int`)
+ ```java
+ Schema<KeyValue<Integer, String>> kvSchema = Schema.KeyValue(
+ Schema.INT32,
+ Schema.STRING,
+ KeyValueEncodingType.SEPARATED
+ );
-There are a few ways to prevent and overcome these exceptions. For example, you can catch exceptions when parsing fails, which makes code hard to maintain; or you can adopt a schema management system that performs schema evolution without breaking downstream applications and enforces type safety to the maximum extent in the language you are using. That solution is Pulsar Schema.
+ Producer<KeyValue<Integer, String>> producer = client.newProducer(kvSchema)
+ .topic(TOPIC)
+ .create();
-Pulsar schema enables you to use language-specific types of data when
constructing and handling messages from simple types like `string` to more
complex application-specific types.
+ final int key = 100;
+ final String value = "value-100";
+
+ // send the key/value message
+ producer.newMessage()
+ .value(new KeyValue<>(key, value))
+ .send();
+ ```
+
+4. Consume messages using a key/value schema.
+
+ ```java
+ Schema<KeyValue<Integer, String>> kvSchema = Schema.KeyValue(
+ Schema.INT32,
+ Schema.STRING,
+ KeyValueEncodingType.SEPARATED
+ );
+
+ Consumer<KeyValue<Integer, String>> consumer = client.newConsumer(kvSchema)
+ ...
+ .topic(TOPIC)
+ .subscriptionName(SubscriptionName).subscribe();
+
+ // receive key/value pair
+ Message<KeyValue<Integer, String>> msg = consumer.receive();
+ KeyValue<Integer, String> kv = msg.getValue();
+ ```
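
The two encoding types used in the steps above differ in where the key is stored: with `INLINE`, the key and value are stored together in the message payload; with `SEPARATED`, the key is stored in the message key and only the value goes into the payload. The following is a minimal conceptual sketch of that difference using only the Java standard library. The `Envelope` class and the `encodeInline`/`encodeSeparated` helpers are hypothetical names made up for illustration; they are not Pulsar classes and should not be read as Pulsar's actual wire format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Conceptual sketch only: Envelope and the encode methods are hypothetical
// illustrations, not Pulsar classes or Pulsar's actual wire format.
class KeyValueEncodingSketch {

    // A simplified message with a separate key field and a payload.
    static final class Envelope {
        final byte[] key;      // null when the key is carried inside the payload
        final byte[] payload;

        Envelope(byte[] key, byte[] payload) {
            this.key = key;
            this.payload = payload;
        }
    }

    static byte[] intBytes(int v) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(v).array();
    }

    // INLINE: key and value are stored together in the payload,
    // each preceded by a 4-byte length so they can be split apart again.
    static Envelope encodeInline(int key, String value) {
        byte[] k = intBytes(key);
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + k.length + Integer.BYTES + v.length);
        buf.putInt(k.length).put(k).putInt(v.length).put(v);
        return new Envelope(null, buf.array());
    }

    // SEPARATED: the key is stored in the key field; the payload holds only the value.
    static Envelope encodeSeparated(int key, String value) {
        return new Envelope(intBytes(key), value.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        Envelope inline = encodeInline(100, "value-100");
        Envelope separated = encodeSeparated(100, "value-100");
        System.out.println("INLINE payload bytes: " + inline.payload.length);
        System.out.println("SEPARATED key bytes: " + separated.key.length
                + ", payload bytes: " + separated.payload.length);
    }
}
```

In this sketch, a consumer that only needs the key can read it from the `SEPARATED` envelope without decoding the payload, which is one reason to pick that encoding type.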
+
+## Construct a struct schema
+
+This example shows how to construct a [struct
schema](schema-understand.md#struct-schema) and use it to produce and consume
messages using different methods.
+
+````mdx-code-block
+<Tabs
+ defaultValue="static"
+ values={[{"label":"static","value":"static"},{"label":"generic","value":"generic"},{"label":"SchemaDefinition","value":"SchemaDefinition"}]}>
+
+<TabItem value="static">
+
+You can predefine the `struct` schema, which can be a POJO in Java, a `struct`
in Go, or classes generated by Avro or Protobuf tools.
+
+**Example**
+
+Pulsar gets the schema definition from the predefined `struct` using an Avro
library. The schema definition is the schema data stored as a part of the
`SchemaInfo`.
+
+1. Create the _User_ class to define the messages sent to Pulsar topics.
+
+ ```java
+ @Builder
+ @AllArgsConstructor
+ @NoArgsConstructor
+ public static class User {
+ String name;
+ int age;
+ }
+ ```
+
+2. Create a producer with a `struct` schema and send messages.
+
+ ```java
+ Producer<User> producer = client.newProducer(Schema.AVRO(User.class)).create();
+ producer.newMessage().value(User.builder().name("pulsar-user").age(1).build()).send();
+ ```
+
+3. Create a consumer with a `struct` schema and receive messages.
+
+ ```java
+ Consumer<User> consumer = client.newConsumer(Schema.AVRO(User.class)).subscribe();
+ User user = consumer.receive().getValue();
+ ```
+
+</TabItem>
+<TabItem value="generic">
+
+When an application has no predefined struct, you can use this method to define the schema and access data.
+
+You can define the `struct` schema using `RecordSchemaBuilder`, generate a generic struct using `GenericRecordBuilder`, and consume messages into `GenericRecord`.
**Example**
-You can use the _User_ class to define the messages sent to Pulsar topics.
+1. Use `RecordSchemaBuilder` to build a schema.
+
+ ```java
+ RecordSchemaBuilder recordSchemaBuilder = SchemaBuilder.record("schemaName");
+ recordSchemaBuilder.field("intField").type(SchemaType.INT32);
+ SchemaInfo schemaInfo = recordSchemaBuilder.build(SchemaType.AVRO);
+
+ // Keep a reference to the schema so that it can build records in the next step
+ GenericSchema<GenericRecord> schema = Schema.generic(schemaInfo);
+ Producer<GenericRecord> producer = client.newProducer(schema).create();
+ ```
+
+2. Use `GenericRecordBuilder` to build the struct records.
+
+ ```java
+ producer.newMessage().value(schema.newRecordBuilder()
+ .set("intField", 32)
+ .build()).send();
+ ```
+
+</TabItem>
+<TabItem value="SchemaDefinition">
+
+You can define the `schemaDefinition` to generate a `struct` schema.
+
+**Example**
+
+1. Create the _User_ class to define the messages sent to Pulsar topics.
+
+ ```java
+ @Builder
+ @AllArgsConstructor
+ @NoArgsConstructor
+ public static class User {
+ String name;
+ int age;
+ }
+ ```
+
+2. Create a producer with a `SchemaDefinition` and send messages.
+
+ ```java
+ SchemaDefinition<User> schemaDefinition = SchemaDefinition.<User>builder().withPojo(User.class).build();
+ Producer<User> producer = client.newProducer(Schema.AVRO(schemaDefinition)).create();
+ producer.newMessage().value(User.builder().name("pulsar-user").age(1).build()).send();
+ ```
+
+3. Create a consumer with a `SchemaDefinition` schema and receive messages.
+
+ ```java
+ SchemaDefinition<User> schemaDefinition = SchemaDefinition.<User>builder().withPojo(User.class).build();
+ Consumer<User> consumer = client.newConsumer(Schema.AVRO(schemaDefinition)).subscribe();
+ User user = consumer.receive().getValue();
+ ```
+
+</TabItem>
+
+</Tabs>
+````
+
+### Avro schema using Java
+
+Suppose you have a `SensorReading` class as follows, and you'd like to
transmit it over a Pulsar topic.
```java
-public class User {
- String name;
- int age;
+public class SensorReading {
+ public float temperature;
+
+ public SensorReading(float temperature) {
+ this.temperature = temperature;
+ }
+
+ // A no-arg constructor is required
+ public SensorReading() {
+ }
+
+ public float getTemperature() {
+ return temperature;
+ }
+
+ public void setTemperature(float temperature) {
+ this.temperature = temperature;
+ }
}
```
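
The comment "A no-arg constructor is required" in the class above can be illustrated with a short sketch. Reflection-based deserializers typically create an empty instance through the no-arg constructor and then populate its fields from the decoded data. The `instantiate` helper below is a hypothetical stand-in for that step, written with only the Java standard library; it is an assumption about the general technique, not the code Pulsar or Avro actually runs.

```java
import java.lang.reflect.Field;
import java.util.Map;

// Conceptual sketch only: instantiate() is a hypothetical helper, not a Pulsar or Avro API.
class NoArgConstructorSketch {

    // Same shape as the SensorReading class above, trimmed to the relevant parts.
    public static class SensorReading {
        public float temperature;

        public SensorReading() {
        }

        public SensorReading(float temperature) {
            this.temperature = temperature;
        }
    }

    // Creates an empty instance via the no-arg constructor, then sets each field by name.
    static <T> T instantiate(Class<T> type, Map<String, Object> fieldValues) {
        try {
            // Fails with NoSuchMethodException if the class has no no-arg constructor.
            T instance = type.getDeclaredConstructor().newInstance();
            for (Map.Entry<String, Object> entry : fieldValues.entrySet()) {
                Field field = type.getDeclaredField(entry.getKey());
                field.setAccessible(true);
                field.set(instance, entry.getValue());
            }
            return instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        SensorReading reading = instantiate(SensorReading.class, Map.of("temperature", 21.5f));
        System.out.println("temperature = " + reading.temperature);
    }
}
```

If the no-arg constructor were removed from `SensorReading`, the `getDeclaredConstructor()` call in this sketch would fail before any field could be set.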
-When constructing a producer with the _User_ class, you can specify a schema
or not as below.
+Create a `Producer<SensorReading>` (or `Consumer<SensorReading>`) like this:
+
+```java
+Producer<SensorReading> producer = client.newProducer(Schema.AVRO(SensorReading.class))
+ .topic("sensor-readings")
+ .create();
+```
+
+The following schema formats are currently available for Java:
Review Comment:
Agree. I added a new heading to introduce these examples as a quick twist.
Most content changes in this PR are a copy&paste to implement a quick
information architecture change. A thorough content review will be done in the
following week through Google docs:)