Hi siNan:

>From my point of view, it is just a plug-in. I don't think it is
necessary to add configuration for the plugin.
This is meaningless, and it will increase the difficulty of use for users.


SiNan Liu <liusinan1...@gmail.com> 于2023年3月8日周三 15:54写道:
>
> Hi, bo.
>
> 1. I understand what you say, to develop a new
> `ProtobufNativeAdvancedSchemaCompatibilityCheck`, rather than changing
> existing `ProtobufNativeSchemaCompatibilityCheck`. But I found a few small
> problems:
>
> (1)ProtobufNativeAdvancedSchemaCompatibilityCheck and
> ProtobufNativeSchemaCompatibilityCheck schemaType is PROTOBUF_NATIVE. It
> looks like both checkers are PROTOBUF not using AVRO-PROTOBUF's "native"
> implementation, which leads to some problems or "unreasonable" and gives me
> some extended thinking and questions.
>
`CompatibilityCheck ` its only a plugin.
`ProtobufNativeSchemaCompatibilityCheck` may sooner or later leave the
stage, when `ProtobufNativeAdvancedSchemaCompatibilityCheck` is
stable, we can make it the default Checker.

It is just a plug-in, users can change it at will and ensure that it
is used correctly
> (2)In broker.conf
>
> `schemaRegistryCompatibilityCheckers`. If
> ProtobufNativeSchemaCompatibilityCheck and
> ProtobufNativeAdvancedSchemaCompatibilityCheck all set. This is going to
> overwrite each other. Because this is a map:
>
> https://github.com/apache/pulsar/blob/af1360fb167c1f9484fda5771df3ea9b21d1440b/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/schema/SchemaRegistryService.java#L36-L44
>
> ```java
>
> Map<SchemaType, SchemaCompatibilityCheck> checkers = new HashMap<>();
>
> for (String className : checkerClasses) {
>
> SchemaCompatibilityCheck schemaCompatibilityCheck =
> Reflections.createInstance(className,
>
> SchemaCompatibilityCheck.class,
> Thread.currentThread().getContextClassLoader());
>
> checkers.put(schemaCompatibilityCheck.getSchemaType(),
> schemaCompatibilityCheck);
>
> ```
>
> Is this a big problem or a small one? Is it possible or unnecessary? Maybe
> we can write in the documentation that protobufNative checkers can only
> choose one of the two? Why are there two Checkers for different
> implementations of the same schemaType? Why not the checker to create
> different validator, so we don not have to change
> schemaRegistryCompatibilityCheckers.

users can only use one, not two, which will bring complexity to users

>
> (3)And after the update to ProtobufNativeAdvancedSchemaCompatibilityCheck.
> Existing topics previously only checked the name of the root message, not
> the content of protobuf.
>
> What if the user wants both Checkers?
>
> Set to ProtobufNativeAdvancedSchemaCompatibilityCheck, affect the topic of
> the existing schema?
>
> Older topics still use the old checker, and newer topics or certain older
> topics use the new advancedchecker.
>
when `ProtobufNativeAdvancedSchemaCompatibilityCheck` stable,
users will not choose `ProtobufNativeSchemaCompatibilityCheck`.
because it not a complete checker.
> (4)So should we have one schemaType for a checker? protobufNativeChecker
> can have as many different implementation classes as possible. This
> classname configuration in PIP, let's see if it can be set at the topic
> level. In the current PIP design I just load this parameter into the
> checker when the broker is started and the checkers map is set up. Can I do
> this in the new normal pr if I want to support topic level? Or perfect it
> here?
>
> Add a call PROTOBUF_NATIVE_ADVANCE schemaType corresponding
> ProtobufNativeAdvancedSchemaCompatibilityCheck? (Seems to be more trouble).
>
> Sorry I can not use the computer and network in the company, I use my
> mobile phone to reply to the email, the format may be a bit messy. Please
> understand.
>
> Thanks,
>
> sinan
>
>
> 丛搏 <bog...@apache.org> 于 2023年3月7日周二 下午11:39写道:
>
> > SiNan Liu <liusinan1...@gmail.com> 于2023年3月7日周二 13:22写道:
> > >
> > > Great to see your comment, bo!
> > >
> > > 1. The first way. The protobuf website has a description of the rules,
> > but
> > > no plans to implement them.
> > > https://protobuf.dev/programming-guides/proto/#updating
> >
> > https://groups.google.com/g/protobuf
> > maybe ask here
> >
> > >
> > > 2. I think this PIP can be divided into two parts.
> > > (1) Add a flag(`ValidatorClassName`), load it into
> > > `ProtobufNativeSchemaCompatibilityCheck` when the broker starts.
> > > ValidatorClassName is empty by default, and the implementation continues
> > as
> > > before, with no change for the user.
> >
> > `ProtobufNativeSchemaCompatibilityCheck` is a plugin in `broker.conf`
> > ```
> >
> > schemaRegistryCompatibilityCheckers=org.apache.pulsar.broker.service.schema.JsonSchemaCompatibilityCheck,org.apache.pulsar.broker.service.schema.AvroSchemaCompatibilityCheck,org.apache.pulsar.broker.service.schema.ProtobufNativeSchemaCompatibilityCheck
> > ```
> > I do not recommend that we directly modify this plugin and continue to
> > add configuration items, which will cause trouble for users.
> > We have a lot of configs and it's getting very unwieldy.
> > in my opinion, we don't change
> >
> > `org.apache.pulsar.broker.service.schema.ProtobufNativeSchemaCompatibilityCheck`,
> > it is a simple implementation, it doesn't go wrong very often, most
> > users will use it. we can add another ProtobufNativeCheck named
> > `ProtobufNativeAdvancedSchemaCompatibilityCheck ` or other. in this
> > way, we don't need to add this flag. There is no need to consider
> > compatibility, it is just a plug-in and will not affect current logic.
> > If the user needs it, just change the plugin to the new implementation
> >
> > > ```java
> > >     ProtobufNativeSchemaValidator DEFAULT = (fromDescriptors,
> > toDescriptor)
> > > -> {
> > >         for (Descriptors.Descriptor fromDescriptor : fromDescriptors) {
> > >             // The default implementation only checks if the root message
> > > has changed.
> > >             if
> > > (!fromDescriptor.getFullName().equals(toDescriptor.getFullName())) {
> > >                 throw new ProtoBufCanReadCheckException("Protobuf root
> > > message isn't allow change!");
> > >             }
> > >         }
> > >     };
> > > ```
> > > `ValidatorClassName` value also can be set to the current implementation
> > of
> > > PIP add
> > >
> > `org.apache.pulsar.broker.service.schema.validator.ProtobufNativeSchemaBreakValidatorImpl`.
> > >
> > > (2) Recoding the `ProtobufNativeSchemaCompatibilityCheck`. Through the
> > flag
> > > (`ValidatorClassName`) to build different
> > `ProtobufNativeSchemaValidator`.
> > > Isn't it just a plug-in? The user can develop and choose a different
> > > `ProtobufNativeSchemaValidator`. I think it didn't change the logic, it
> > > just allowed him to expand it.
> > >
> > >
> > > I think this PIP should be an enhancement and supplement to the function,
> > > and there is no such thing as unnecessary and meaningless.
> > >
> > >
> > > Thanks,
> > > sinan
> > >
> > >
> > >
> > >
> > >
> > > 丛搏 <bog...@apache.org> 于2023年3月7日周二 11:53写道:
> > >
> > > > I think we have two ways to do that.
> > > >
> > > > First way: We need to advance the improvement of java in protobuf. Ask
> > > > if they have plans to improve.
> > > >
> > > > Second way: the new PROTOBUF_NATIVE `SchemaCompatibilityCheck` should
> > > > be implemented as a plugin, don't change any existing plugin logic
> > > > (it's simple and already used). I don't recommend adding flags for
> > > > rollback, it adds configuration and makes little sense.
> > > >
> > > > Thanks,
> > > > Bo
> > > >
> > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年3月6日周一 23:00写道:
> > > >
> > > > >
> > > > > Can you convert the code block which is actually a quote in the
> > > > > beginning of the PIP to something which doesn't require to scroll
> > > > > horizontally so much?
> > > > > Use
> > > > >
> > > >
> > https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#quoting-text
> > > > >
> > > > > Let's improve the clarity of what you wrote:
> > > > >
> > > > > "the PROTOBUF uses avro struct to store."
> > > > > -->
> > > > > When Schema type PROTOBUF is used, Pulsar Client assumes the object
> > given
> > > > > to it as message data is an auto-generated POJO containing the
> > > > annotations
> > > > > encoding the schema. The client is using a converter, which converts
> > a
> > > > > Protobuf schema descriptor into an Avro schema and sends that as the
> > > > Schema
> > > > > of the producer/consumer.
> > > > >
> > > > > "On the broker side, protobuf and avro both use SchemaData converted
> > to
> > > > > org.apache.avro.Schema."
> > > > > -->
> > > > > Since the schema is an Avro schema, the implementation of
> > compatibility
> > > > > check on the broker side is to simply re-use the compatibility check
> > of
> > > > the
> > > > > AVRO schema type.
> > > > >
> > > > > "ProtobufSchema is different from ProtobufNativeSchema in schema
> > > > > compatibility check it uses avro-protobuf.
> > > > >
> > > >
> > https://central.sonatype.com/artifact/org.apache.avro/avro-protobuf/1.11.1/overview
> > > > > But the current implementation of ProtobufNative schema compatibility
> > > > > check only
> > > > > checked if the root message name is changed."
> > > > >
> > > > > -->
> > > > > PROTOBUF_NATIVE schema type is different.
> > > > > The client is actually using Protobuf Descriptor as the schema, as
> > > > opposed
> > > > > to Avro schema of PROTOBUF schema type. In the broker, the
> > > > PROTOBUF_NATIVE
> > > > > compatibility check actually hasn't implemented any rule, besides
> > one:
> > > > > checking if the root message name has changed.
> > > > >
> > > > >
> > > > >
> > > > > >    1. For now, there is no official or third-party solution for
> > > > ProtoBuf
> > > > > >    compatibility. If in the future have better solutions of a third
> > > > party or
> > > > > >    the official, we develop new ProtobufNativeSchemaValidator and
> > use,
> > > > so
> > > > > >    add a flag.
> > > > > >
> > > > > > Who do you need to make that configurable? Once you found a third
> > > > party,
> > > > > just switch to it? Who knows, maybe you never will. Introduce it
> > when you
> > > > > find it, not now.
> > > > >
> > > > >
> > > > > We improve in ProtobufNativeSchemaCompatibilityCheck BACKWARD,
> > FORWARD
> > > > > > these strategies. As with the AVRO implementation, protobuf
> > > > compatibility
> > > > > > checking need implementing the canRead method. *This will check
> > that
> > > > > > the writtenschema can be read by readSchema.*
> > > > >
> > > > >
> > > > > I completely disagree.
> > > > > Avro implementation is confusing for our use case. Don't copy that.
> > > > >
> > > > > You have
> > > > >
> > > > > public void checkCompatible(SchemaData from, SchemaData to,
> > > > > SchemaCompatibilityStrategy strategy)
> > > > >         throws IncompatibleSchemaException {
> > > > >     Descriptor fromDescriptor =
> > > > > ProtobufNativeSchemaUtils.deserialize(from.getData());
> > > > >     Descriptor toDescriptor =
> > > > > ProtobufNativeSchemaUtils.deserialize(to.getData());
> > > > >     switch (strategy) {
> > > > >         case BACKWARD_TRANSITIVE:
> > > > >         case BACKWARD:
> > > > >         case FORWARD_TRANSITIVE:
> > > > >         case FORWARD:
> > > > >         case FULL_TRANSITIVE:
> > > > >         case FULL:
> > > > >             checkRootMessageChange(fromDescriptor, toDescriptor,
> > > > strategy);
> > > > >             return;
> > > > >         case ALWAYS_COMPATIBLE:
> > > > >             return;
> > > > >         default:
> > > > >             throw new IncompatibleSchemaException("Unknown
> > > > > SchemaCompatibilityStrategy.");
> > > > >     }
> > > > > }
> > > > >
> > > > > I would rename :
> > > > > from --> currentSchema
> > > > > to --> newSchema
> > > > >
> > > > > Use that switch case and have a method for each like:
> > > > > validateBackwardsCompatibility(currentSchema, newSchema)
> > > > >
> > > > > I dislike canRead and usage of writtenSchema, since you have two
> > > > completely
> > > > > different use cases: from the producing side and the consumer side.
> > > > >
> > > > > schemaValidatorBuilder
> > > > > >
> > > > > > I dislike this proposal. IMO Avro implementation is way too
> > > > complicated.
> > > > > Why not have a simple function for validation for each switch case
> > above?
> > > > > Why do we need strategy and builder, and all this complexity?
> > > > >
> > > > >
> > > > > *Here are the basic compatibility rules we've defined:*
> > > > >
> > > > >
> > > > > IMO it's impossible to read the validation rules as you described
> > them.
> > > > > I wrote how they should be structured numerous times above.
> > > > > I can't validate them.
> > > > >
> > > > >
> > > > > IMO, the current design is very hard to read.
> > > > > Please try to avoid jumping into code sections.
> > > > > Write a high level design section, in which you describe in words
> > what
> > > > you
> > > > > plan to do.
> > > > > Write the validation rules in the structure that is easy to
> > understand:
> > > > > rules per each compatibility check, and use proper words (current
> > schema,
> > > > > new schema), since new schema can be once used for read and once
> > used for
> > > > > write.
> > > > >
> > > > > In its current form it takes too much time to understand the design,
> > and
> > > > it
> > > > > shouldn't be the case.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Asaf
> > > > >
> > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Sun, Mar 5, 2023 at 3:58 PM SiNan Liu <liusinan1...@gmail.com>
> > wrote:
> > > > >
> > > > > > Hi! I updated the explanation of some things in the PIP issue. And
> > also
> > > > > > added a new “flag” in the conf is used as the different
> > > > > > ProtobufNativeSchemaValidator implementation, also set
> > > > > > ProtobufNativeSchemaValidator default only check whether the name
> > of
> > > > the
> > > > > > root message is the same.
> > > > > >
> > > > > >
> > > > > > Thanks,
> > > > > > sinan
> > > > > >
> > > > > >
> > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年3月5日周日 20:21写道:
> > > > > >
> > > > > > > On Wed, Mar 1, 2023 at 4:33 PM SiNan Liu <liusinan1...@gmail.com
> > >
> > > > wrote:
> > > > > > >
> > > > > > > > >
> > > > > > > > > Can you please explain how a Protobuf Schema descriptor can
> > be
> > > > > > > validated
> > > > > > > > > for backward compatibility check using Avro based
> > compatibility
> > > > > > rules?
> > > > > > > > > Doesn't it expect the schema to be Avro, but it is actually a
> > > > > > Protobuf
> > > > > > > > > descriptor?
> > > > > > > > > Is there some translation happening?
> > > > > > > >
> > > > > > > >
> > > > > > > > 1. *You can take a quick look at the previous design, the
> > PROTOBUF
> > > > uses
> > > > > > > > avro struct to store.*
> > > > > > > > https://github.com/apache/pulsar/pull/1954
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-client/src/main/java/org/apache/pulsar/client/impl/schema/ProtobufSchema.java#L59-L61
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-client/src/main/java/org/apache/pulsar/client/impl/schema/ProtobufSchema.java#L110-L115
> > > > > > >
> > > > > > >
> > > > > > > Ok. So to summarize your code (easier to write it than send
> > links):
> > > > > > > * Pulsar Client, when used with Protobuf Schema, actually
> > converts
> > > > the
> > > > > > > Protobuf descriptor into an Avro Schema (using code found inside
> > Avro
> > > > > > > library) and saves that Avro schema as the schema. It's not
> > saving
> > > > the
> > > > > > > protobuf descriptor at all. Very confusing I have to add - never
> > > > expected
> > > > > > > that.
> > > > > > > This explains why In the ProtobufSchemaCompatibilityCheck they
> > just
> > > > > > extend
> > > > > > > the Avro without doing any translation.
> > > > > > >
> > > > > > > Thanks for that.
> > > > > > >
> > > > > > > Now thatI finally understand this, I can say that: you *must*
> > explain
> > > > > > that
> > > > > > > in the motivation part in your PIP.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > 2. *On the broker side, protobuf and avro both use `SchemaData`
> > > > > > converted
> > > > > > > > to `org.apache.avro.Schema`.*
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/ServerCnx.java#L1280-L1293
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/schema/ProtobufSchemaCompatibilityCheck.java#L26-L31
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/schema/AvroSchemaBasedCompatibilityCheck.java#L47-L70
> > > > > > >
> > > > > > >
> > > > > > > Actually those links don't really help.
> > > > > > > The main link that helps is:
> > > > > > >
> > > > > > >
> > > > > >
> > > >
> > https://github.com/apache/pulsar/blob/ec102fb024a6ea2b195826778300f20e330dff06/pulsar-client/src/main/java/org/apache/pulsar/client/impl/schema/ProtobufSchema.java#L102-L122
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > I'm sorry - I don't understand.
> > > > > > > > > I understand the different compatibility check strategies.
> > > > > > > > > If you just spell them out here, then as you say, just
> > translate
> > > > the
> > > > > > > > > Protobuf Descriptor into an Avro schema and run the Avro
> > > > > > > > > compatibility validation, no?
> > > > > > > > > I believe the answer is no, since you may want to verify
> > > > different
> > > > > > > things
> > > > > > > > > when it comes to Protobuf, which are different then Avro.
> > > > > > > >
> > > > > > > >
> > > > > > > > 1.
> > > > > > > > *ProtobufSchema is different from ProtobufNativeSchema in that
> > it
> > > > uses
> > > > > > > > avro-protobuf.*
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > https://central.sonatype.com/artifact/org.apache.avro/avro-protobuf/1.11.1/overview
> > > > > > > > *ProtobufNativeSchema needs a native compatibility check, but
> > > > there is
> > > > > > no
> > > > > > > > official or third party implementation. So this PIP does not
> > use
> > > > > > > > avro-protobuf for protobuf compatibility checking.*
> > > > > > > >
> > > > > > > > 2. *By the way, this is implemented in much the same way that
> > > > Apache
> > > > > > avro
> > > > > > > > does compatibility checking.*
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/SchemaValidatorBuilder.java
> > > > > > > > `canReadStrategy`,`canBeReadStrategy`,`mutualReadStrategy`
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/ValidateCanRead.java
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/ValidateCanBeRead.java
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/ValidateMutualRead.java
> > > > > > > > *In `ValidateMutualRead.java`, the arguments of `canRead()` are
> > > > > > > > writtenSchema and readSchema. We only need to change the order
> > of
> > > > > > > arguments
> > > > > > > > we pass to `canRead()`.*
> > > > > > > > ```java
> > > > > > > > private void validateWithStrategy(Descriptors.Descriptor
> > > > toValidate,
> > > > > > > > Descriptors.Descriptor fromDescriptor) throws
> > > > > > > ProtoBufCanReadCheckException
> > > > > > > > {
> > > > > > > > switch (strategy) {
> > > > > > > > case CanReadExistingStrategy -> canRead(fromDescriptor,
> > > > toValidate);
> > > > > > > > case CanBeReadByExistingStrategy -> canRead(toValidate,
> > > > > > fromDescriptor);
> > > > > > > > case CanBeReadMutualStrategy -> {
> > > > > > > > canRead(toValidate, fromDescriptor);
> > > > > > > > canRead(fromDescriptor, toValidate);
> > > > > > > > }
> > > > > > > > }
> > > > > > > > }
> > > > > > > >
> > > > > > > > private void canRead(Descriptors.Descriptor writtenSchema,
> > > > > > > > Descriptors.Descriptor readSchema) throws
> > > > > > ProtoBufCanReadCheckException {
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > ProtobufNativeSchemaBreakCheckUtils.checkSchemaCompatibility(writtenSchema,
> > > > > > > > readSchema);
> > > > > > > > }
> > > > > > > > ```
> > > > > > > >
> > > > > > > >
> > > > > > > I get that you want to take inspiration from the existing Avro
> > Schema
> > > > > > > compatibility check, to do your code design.
> > > > > > > I also understand you *won't* use any existing avro code for
> > that.
> > > > > > > I also understand, you have to write the validation check on your
> > > > own,
> > > > > > > since there is no 3rd party to explain that.
> > > > > > >
> > > > > > > The only thing I can't understand are the actual rules as I wrote
> > > > before,
> > > > > > > since they are written confusingly.
> > > > > > > So, I repeat what I asked before:
> > > > > > >
> > > > > > > I think you should structure the validation rules differently:
> > > > > > >
> > > > > > > * Backward checks
> > > > > > > ** List down rules, where use newSchema (the schema used by
> > producer
> > > > or
> > > > > > > consumer) and existingSchema (last schema used)
> > > > > > > * Forward
> > > > > > > ** List down rules, where use newSchema (the schema used by
> > producer
> > > > or
> > > > > > > consumer) and existingSchema (last schema used)
> > > > > > >
> > > > > > > Once that's accomplished I will be able to understand the
> > different
> > > > > > > validation rules for each compatibility check.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > sinan
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年3月1日周三 21:19写道:
> > > > > > > > >
> > > > > > > > > On Mon, Feb 27, 2023 at 3:47 PM SiNan Liu <
> > > > liusinan1...@gmail.com>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I read it and they look identical. What's the difference
> > > > between
> > > > > > > > them?
> > > > > > > > > >
> > > > > > > > > > Current avro,json, and protobuf schemas are all implemented
> > > > based
> > > > > > on
> > > > > > > > AVRO.
> > > > > > > > > > > What do you mean, they are all implemented based on
> > Avro? You
> > > > > > mean
> > > > > > > > the
> > > > > > > > > > > protobuf schema is converted into an Avro Schema, and
> > then
> > > > you
> > > > > > use
> > > > > > > > Avro
> > > > > > > > > > > compatibility validation?
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > `org.apache.pulsar.broker.service.schema.ProtobufSchemaCompatibilityCheck`
> > > > > > > > > >
> > > > > > >
> > > > `org.apache.pulsar.broker.service.schema.AvroSchemaCompatibilityCheck`
> > > > > > > > > >
> > > > > > >
> > > > `org.apache.pulsar.broker.service.schema.JsonSchemaCompatibilityCheck`
> > > > > > > > > > They all extends `AvroSchemaBasedCompatibilityCheck`, the
> > > > > > > > > > `checkCompatible()` is the same implementation with `AVRO`.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Can you please explain how a Protobuf Schema descriptor can
> > be
> > > > > > > validated
> > > > > > > > > for backward compatibility check using Avro based
> > compatibility
> > > > > > rules?
> > > > > > > > > Doesn't it expect the schema to be Avro, but it is actually a
> > > > > > Protobuf
> > > > > > > > > descriptor?
> > > > > > > > > Is there some translation happening?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I think you should structure the validation rules
> > differently:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > The Compatibility check strategy is described on the
> > website
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > https://pulsar.apache.org/docs/next/schema-understand/#schema-compatibility-check-strategy
> > > > > > > > > > 1. BACKWARD(CanReadExistingStrategy): Consumers using
> > schema
> > > > V3 can
> > > > > > > > process
> > > > > > > > > > data written by producers using the last schema version
> > V2. So
> > > > V2
> > > > > > is
> > > > > > > > > > "writtenSchema" and V3 is "readSchema".
> > > > > > > > > > 2. FORWARD(CanBeReadByExistingStrategy): Consumers using
> > the
> > > > last
> > > > > > > > schema
> > > > > > > > > > version V2 can process data written by producers using a
> > new
> > > > schema
> > > > > > > V3,
> > > > > > > > > > even though they may not be able to use the full
> > capabilities
> > > > of
> > > > > > the
> > > > > > > > new
> > > > > > > > > > schema. So V3 is "writtenSchema" and V2 is "readSchema".
> > > > > > > > > > 3. FULL(CanBeReadMutualStrategy): Schemas are both
> > backward and
> > > > > > > forward
> > > > > > > > > > compatible.
> > > > > > > > > > Schema can evolve. The old version schema and the new
> > version
> > > > > > schema
> > > > > > > > should
> > > > > > > > > > be well understood.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > I'm sorry - I don't understand.
> > > > > > > > > I understand the different compatibility check strategies.
> > > > > > > > > If you just spell them out here, then as you say, just
> > translate
> > > > the
> > > > > > > > > Protobuf Descriptor into an Avro schema and run the Avro
> > > > > > > > > compatibility validation, no?
> > > > > > > > > I believe the answer is no, since you may want to verify
> > > > different
> > > > > > > things
> > > > > > > > > when it comes to Protobuf, which are different then Avro.
> > > > > > > > >
> > > > > > > > > At the current state, I can't understand your design at all.
> > > > Please
> > > > > > > help
> > > > > > > > > clarify that.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > So each strategy should have its own section.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > The arguments of `canRead()` are writtenSchema and
> > readSchema.
> > > > As
> > > > > > > we've
> > > > > > > > > > just described, we only need to change the order of
> > arguments
> > > > we
> > > > > > pass
> > > > > > > > to
> > > > > > > > > > `canRead()`.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > sinan
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年2月27日周一 20:49写道:
> > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > And you can see the difference between ProtoBuf and
> > > > > > > ProtoBufNative:
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > >
> > > > https://pulsar.apache.org/docs/next/schema-get-started/#protobufnative
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > https://pulsar.apache.org/docs/next/schema-get-started/#protobuf
> > > > > > > > > > > >
> > > > > > > > > > >  I read it and they look identical. What's the difference
> > > > between
> > > > > > > > them?
> > > > > > > > > > >
> > > > > > > > > > > Current avro,json, and protobuf schemas are all
> > implemented
> > > > based
> > > > > > > on
> > > > > > > > > > AVRO.
> > > > > > > > > > >
> > > > > > > > > > > What do you mean, they are all implemented based on
> > Avro? You
> > > > > > mean
> > > > > > > > the
> > > > > > > > > > > protobuf schema is converted into an Avro Schema, and
> > then
> > > > you
> > > > > > use
> > > > > > > > Avro
> > > > > > > > > > > compatibility validation?
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > *Here are the basic compatibility rules we've defined:*
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I think you should structure the validation rules
> > > > differently:
> > > > > > > > > > >
> > > > > > > > > > > * Backward checks
> > > > > > > > > > > ** List down rules, where use newSchema (the schema used
> > by
> > > > > > > producer
> > > > > > > > or
> > > > > > > > > > > consumer) and existingSchema (last schema used)
> > > > > > > > > > > * Forward
> > > > > > > > > > > ** List down rules, where use newSchema (the schema used
> > by
> > > > > > > producer
> > > > > > > > or
> > > > > > > > > > > consumer) and existingSchema (last schema used)
> > > > > > > > > > >
> > > > > > > > > > > So each strategy should have its own section.
> > > > > > > > > > >
> > > > > > > > > > > I'm saying this since you used "writttenSchema" word but
> > it
> > > > > > > > represents
> > > > > > > > > > > something completely different if it's backward or
> > forward
> > > > check.
> > > > > > > > > > >
> > > > > > > > > > > Once you'll have that structure like that, I personally
> > will
> > > > be
> > > > > > > able
> > > > > > > > to
> > > > > > > > > > > read and understand it.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > The motivation and problem statement are now good -
> > thanks
> > > > for
> > > > > > > > improving
> > > > > > > > > > > it.
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Feb 27, 2023 at 8:20 AM SiNan Liu <
> > > > > > liusinan1...@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi! I updated the PIP issue again. This time I've added
> > > > some
> > > > > > > > background
> > > > > > > > > > > and
> > > > > > > > > > > > some explanations.
> > > > > > > > > > > >
> > > > > > > > > > > > The compatibility check rules are already written in
> > the
> > > > > > > > > > Implementation.
> > > > > > > > > > > > ProtoBufNative implements the same canRead method as
> > Apache
> > > > > > Avro.
> > > > > > > > > > > > It does this by checking whether the schema for
> > writing and
> > > > > > > reading
> > > > > > > > is
> > > > > > > > > > > > compatible. I also indicate whether the writtenSchema
> > and
> > > > > > > > readSchema of
> > > > > > > > > > > the
> > > > > > > > > > > > Backward, Forward, and Full strategies are the old or
> > the
> > > > new
> > > > > > > > version
> > > > > > > > > > of
> > > > > > > > > > > > the schema.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > > sinan
> > > > > > > > > > > >
> > > > > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年2月26日周日
> > 23:24写道:
> > > > > > > > > > > >
> > > > > > > > > > > > > I'm sorry, but this PIP lacks a lot of background
> > > > knowledge,
> > > > > > so
> > > > > > > > you
> > > > > > > > > > > need
> > > > > > > > > > > > to
> > > > > > > > > > > > > add IMO for people to understand it. You don't need
> > to
> > > > > > explain
> > > > > > > > the
> > > > > > > > > > > entire
> > > > > > > > > > > > > pulsar in this PIP, but at the very least a few
> > > > paragraphs
> > > > > > > > detailing
> > > > > > > > > > > all
> > > > > > > > > > > > > you need to know, to put you in context:
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >    - Start by saying Pulsar as a built-in schema
> > registry
> > > > > > > inside
> > > > > > > > > > Pulsar
> > > > > > > > > > > > >    broker.
> > > > > > > > > > > > >       - Every time the client updates the schema, it
> > > > uploads
> > > > > > it
> > > > > > > > to
> > > > > > > > > > the
> > > > > > > > > > > > >       broker. When that happens, it has a feature
> > which
> > > > > > > validates
> > > > > > > > if
> > > > > > > > > > > the
> > > > > > > > > > > > > new
> > > > > > > > > > > > >       schema version is compatible with the previous
> > > > > > versions.
> > > > > > > > There
> > > > > > > > > > > > > are 4 types
> > > > > > > > > > > > >       of compatibility: Full, ... (complete and
> > explain
> > > > each
> > > > > > > one
> > > > > > > > > > > briefly)
> > > > > > > > > > > > >    - Also explain Pulsar Schema registry supports
> > various
> > > > > > > schema
> > > > > > > > > > > > >    protocols:  Avro, protobuf native, ... (complete
> > the
> > > > > > rest),
> > > > > > > > each
> > > > > > > > > > > > > protocol
> > > > > > > > > > > > >    has a schema which dictates how to serialize and
> > > > > > deserialize
> > > > > > > > the
> > > > > > > > > > > > message
> > > > > > > > > > > > >    content into typed object.
> > > > > > > > > > > > >    - Explain in short what is protobuf native
> > (compare
> > > > > > protobuf
> > > > > > > > > > > > non-native)
> > > > > > > > > > > > >    - Please don't paste code instead of explaining.
> > > > > > > > > > > > >       - Explain that protobuf native current
> > validation
> > > > check
> > > > > > > is
> > > > > > > > only
> > > > > > > > > > > > >       composed of checking the root message name is
> > the
> > > > same
> > > > > > > > between
> > > > > > > > > > > > > the current
> > > > > > > > > > > > >       schema version and the new version.
> > > > > > > > > > > > >          - Explain briefly what is a root message
> > and its
> > > > > > name.
> > > > > > > > > > > > >       - Explain the problem (list scenarios) that we
> > have
> > > > > > > because
> > > > > > > > > > > > protobuf
> > > > > > > > > > > > >       native schema only supports FULL compatibility
> > > > > > > validation.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Regarding high level design - as in what you plan to
> > do.
> > > > > > > > > > > > > I suggest you add "High Level Design" and in it
> > detail
> > > > how
> > > > > > you
> > > > > > > > plan
> > > > > > > > > > to
> > > > > > > > > > > > > validate, per protobuf version, per compatibility
> > check
> > > > > > > > (backward,
> > > > > > > > > > > > forward,
> > > > > > > > > > > > > full,...).
> > > > > > > > > > > > > I tried reading the implementation - for me , it's
> > all
> > > > over
> > > > > > the
> > > > > > > > > > place.
> > > > > > > > > > > > Can
> > > > > > > > > > > > > you please list in order what I wrote above, and
> > list the
> > > > > > > > validation
> > > > > > > > > > > > rules
> > > > > > > > > > > > > with a good explanation why you validate it like
> > that?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Lastly, one you have all the validation rules clearly
> > > > stated,
> > > > > > > you
> > > > > > > > can
> > > > > > > > > > > use
> > > > > > > > > > > > > it to document it properly so users can know what
> > > > validation
> > > > > > to
> > > > > > > > > > expect.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Asaf
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Feb 22, 2023 at 5:10 PM SiNan Liu <
> > > > > > > > liusinan1...@gmail.com>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Sorry, my mistake. I removed the code and
> > described the
> > > > > > > design
> > > > > > > > to
> > > > > > > > > > > > improve
> > > > > > > > > > > > > > the PROTOBUF_NATIVE schema compatibility checks.
> > You
> > > > can
> > > > > > > have a
> > > > > > > > > > look.
> > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年2月22日周三
> > > > 21:16写道:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I read it but you're almost directly diving into
> > the
> > > > > > code -
> > > > > > > > it
> > > > > > > > > > will
> > > > > > > > > > > > > take
> > > > > > > > > > > > > > me
> > > > > > > > > > > > > > > hours just to reverse engineer your design.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Can you please include a "High Level Design"
> > section
> > > > in
> > > > > > > which
> > > > > > > > you
> > > > > > > > > > > > > explain
> > > > > > > > > > > > > > > how you plan to tackle any issue?
> > > > > > > > > > > > > > > If I can read that section and explain to someone
> > > > else
> > > > > > how
> > > > > > > > this
> > > > > > > > > > > will
> > > > > > > > > > > > > > work,
> > > > > > > > > > > > > > > it means the section is complete.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Let's leave the code to the PRs.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Sun, Feb 19, 2023 at 2:59 PM SiNan Liu <
> > > > > > > > > > liusinan1...@gmail.com>
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I made a PIP to discuss:
> > > > > > > > > > > > > https://github.com/apache/pulsar/issues/19565
> > > > > > > > > > > > > > .
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > We can talk about the current design here.
> > > > Especially
> > > > > > for
> > > > > > > > the
> > > > > > > > > > > field
> > > > > > > > > > > > > > type
> > > > > > > > > > > > > > > > change check rules, please give your valuable
> > > > advice.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > Sinan
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> >

Reply via email to