Re: [DISCUSSIONS] Should we use AUTO_PRODUCE schema?

2022-12-15 Thread 丛搏
> When the input messages are raw bytes, we cannot guarantee the validation always succeeds because the schema might change. The exception is actually thrown in `TypedMessageBuilder#value`.
>
> But since these APIs are stable, we could only fix it by adding the documents to describe in which

Re: [DISCUSSIONS] Should we use AUTO_PRODUCE schema?

2022-12-15 Thread Yunze Xu
> It is irresponsible behavior of the producer to leave everything to the consumer.

I agree now.

> I think what we need to do is describe the document clearly

IMO, it's a code problem because there is no exception signature for `TypedMessageBuilder#value` and `Message#getValue`. The

Re: [DISCUSSIONS] Should we use AUTO_PRODUCE schema?

2022-12-14 Thread 丛搏
We can also use a BYTES producer, but with the BYTES schema, if we do not use `.newMessage(schema0)`, the message will not carry the schema version, so the consumer will not decode it correctly. Also, the BYTES schema can't validate the data against a schema. If the data is an empty byte array, it does not make sense to send it to the

Re: [DISCUSSIONS] Should we use AUTO_PRODUCE schema?

2022-12-14 Thread Yunze Xu
Why not use the following code with a BYTES producer in your case?

```java
// schema0 is the Avro schema that matches the version-0 Student bytes
var schema0 = Schema.AVRO(SchemaDefinition.builder()
        .withJsonDef("student with version0 json def").build());
// Attach schema0 to this single message on the BYTES producer `p`
p.newMessage(schema0).value(schema0.decode(student1)).send();
...
```

Thanks,
Yunze

On Wed, Dec 14, 2022 at

Re: [DISCUSSIONS] Should we use AUTO_PRODUCE schema?

2022-12-14 Thread 丛搏
On Wed, Dec 14, 2022 at 20:37, Yunze Xu wrote:
>
> > how do you can create two Student.class in one java process? and use the same namespace?
>
> Could you give an example to show how `AUTO_PRODUCE` schema makes a difference?

// this is a Student using version0; it may be data from Kafka
byte[] student1 =

Re: [DISCUSSIONS] Should we use AUTO_PRODUCE schema?

2022-12-14 Thread Yunze Xu
> It looks like that the AUTO_PRODUCE schema is similar to the BYTES schema in the semantic.

The only differences are:
1. AUTO_PRODUCE schema can produce messages to a topic that already has a schema (because it downloads the schema and adds it to the CommandProducer request)
2. AUTO_PRODUCE schema

Re: [DISCUSSIONS] Should we use AUTO_PRODUCE schema?

2022-12-14 Thread Yunze Xu
> how do you can create two Student.class in one java process? and use the same namespace?

Could you give an example to show how `AUTO_PRODUCE` schema makes a difference? But with AUTO_PRODUCE schema, the precondition is that we have a topic that has messages of these two schemas. For example,

Re: [DISCUSSIONS] Should we use AUTO_PRODUCE schema?

2022-12-14 Thread Xiangying Meng
Good viewpoint. It looks like that the AUTO_PRODUCE schema is similar to the BYTES schema in the semantic. So can we make the BYTES schema have the features of AUTO_PRODUCE? There are some reasons to do this. Firstly, it does not cause compatibility issues. Now, the topics that have messages

Re: [DISCUSSIONS] Should we use AUTO_PRODUCE schema?

2022-12-13 Thread 丛搏
>
> > the user only creates one producer to send all Kafka topic data, if using Pulsar schema, the user needs to create all schema producers in a map
>
> It doesn't make sense to me. If the source topic has messages of multiple schemas, why did you try to sink them into the same topic with

Re: [DISCUSSIONS] Should we use AUTO_PRODUCE schema?

2022-12-13 Thread 丛搏
On Wed, Dec 14, 2022 at 12:40, Yunze Xu wrote:
>
> > the user only creates one producer to send all Kafka topic data, if using Pulsar schema, the user needs to create all schema producers in a map
>
> It doesn't make sense to me. If the source topic has messages of multiple schemas, why did you try to

Re: [DISCUSSIONS] Should we use AUTO_PRODUCE schema?

2022-12-13 Thread Yunze Xu
> the user only creates one producer to send all Kafka topic data, if using Pulsar schema, the user needs to create all schema producers in a map

It doesn't make sense to me. If the source topic has messages of multiple schemas, why did you try to sink them into the same topic with a schema? The

Re: [DISCUSSIONS] Should we use AUTO_PRODUCE schema?

2022-12-13 Thread 丛搏
Hi, Yunze:

On Wed, Dec 14, 2022 at 02:26, Yunze Xu wrote:
> First, how do you guarantee the schema can be used to encode the raw bytes whose format is unknown?

I think this is what the user needs to ensure: the user knows all the schemas from the Kafka topic and the data (byte[]) that the user can send

[DISCUSSIONS] Should we use AUTO_PRODUCE schema?

2022-12-13 Thread Yunze Xu
Hi all,

Pulsar supports AUTO_PRODUCE schema, but this feature was introduced at an early time [1] when there was no PIP. I have read the documents [2] and found the example scenario.

> Suppose that:
> - You have a producer processing messages from a Kafka topic K.
> - You have a Pulsar topic P,