Hi, Yunze:

> Regarding the 1st question, yes, that's why I opened this thread to
> discuss. If we change these default values, the behavior of new Python
> clients will be like the Java client. In addition, it actually reverts
> the breaking change brought in #12232.

I have also forgotten why #12232 changed the default behavior.
Maybe the field ordering rules differ between Python 2 and Python 3.

If we change the default field order, will every topic that uses the
Python client register a new schema? If so, maybe we should add special
logic in the broker that checks the Python client version and avoids
registering a new schema. Otherwise, the impact may still be quite large.
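
To illustrate the concern, here is a minimal sketch in plain Python. It
does not use the Pulsar client; `avro_definition` is a hypothetical
helper I made up for this mail, and the field types are only my
assumption of what the `User` example quoted below would produce. It
only shows that the same fields yield a different definition string
once they are sorted. Whether the broker then registers a new schema
version is exactly the open question.

```python
import json


def avro_definition(name, fields, sorted_fields):
    """Hypothetical helper (not the Pulsar client API): build an
    Avro-style record definition from (field name, type) pairs."""
    if sorted_fields:
        # Alphabetical order by field name.
        fields = sorted(fields, key=lambda f: f[0])
    return {
        "type": "record",
        "name": name,
        "fields": [{"name": n, "type": t} for n, t in fields],
    }


# Fields in declaration order; the types are assumed for illustration.
fields = [("name", ["null", "string"]), ("age", "int"), ("score", "double")]

unsorted_def = json.dumps(avro_definition("User", fields, sorted_fields=False))
sorted_def = json.dumps(avro_definition("User", fields, sorted_fields=True))

# The two definitions contain the same fields but are not the same string.
print(unsorted_def == sorted_def)
```

If the broker compares definitions in a way that is sensitive to field
order, every existing topic written by an old Python client would see
the second definition as a new one after the default changes.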

Thanks,
Bo
>
> Regarding the 2nd question, yes, they are both sorted in alphabetical
> order. I don't know the behavior of the .NET clients, for C++, Golang,
> Node.js clients, they all do not support generating schema definition
> from a DTO.
>
> Thanks,
> Yunze
>
> On Thu, Mar 30, 2023 at 10:14 AM 丛搏 <congbobo...@gmail.com> wrote:
> >
> > Hi, Yunze :
> >
> > 1. The changes may cause some compatibility issues.
> > How do we solve them? It may be a
> > breaking change.
> >
> > 2. Another question: if sorting is enabled by default,
> > is the sorting rule the same as in the Java client or other clients?
> >
> > Putting aside the above two problems, I think it is
> > good to be consistent with other clients.
> >
> > Thanks,
> > Bo
> >
> > On Wed, Mar 29, 2023 at 10:42 PM Eric Hare <eric.h...@datastax.com> wrote:
> > >
> > > +1 - I think keeping the `_sorted_fields` and `_required` defaults
> > > consistent between the clients is the way to go.
> > >
> > > > On Mar 29, 2023, at 7:09 AM, Yunze Xu <y...@streamnative.io.INVALID> 
> > > > wrote:
> > > >
> > > > I found the Python client has two options to control the behavior:
> > > > 1. Set `_sorted_fields`. It's false by default in the Python client,
> > > > but it's true in the Java client. i.e. the Java client sorts all
> > > > fields by default.
> > > > 2. Set `_required`. It's false by default for all types in the Python
> > > > client, but it's only false for the string type in the Java client.
> > > >
> > > > i.e. given the following Java class:
> > > >
> > > > ```java
> > > > class User {
> > > >    String name;
> > > >    int age;
> > > >    double score;
> > > > }
> > > > ```
> > > >
> > > > We have to give the following definition in Python:
> > > >
> > > > ```python
> > > > class User(Record):
> > > >    _sorted_fields = True
> > > >    name = String()
> > > >    age = Integer(required=True)
> > > >    score = Double(required=True)
> > > > ```
> > > >
> > > > I see https://github.com/apache/pulsar/pull/12232 adds the
> > > > `_sorted_fields` field and disables the field sort by default. It
> > > > breaks compatibility with the Java client.
> > > >
> > > > IMO, we should make `_sorted_fields` true by default and `_required`
> > > > true for all types other than `String` by default.
> > > >
> > > > Thanks,
> > > > Yunze
> > > >
> > > > On Wed, Mar 29, 2023 at 4:00 PM Yunze Xu <y...@streamnative.io> wrote:
> > > >>
> > > >> Hi all,
> > > >>
> > > >> Recently I found the default generated schema definition in the Python
> > > >> client is different from the Java client, which leads to some
> > > >> unexpected behavior.
> > > >>
> > > >> For example, given the following class definition in Python:
> > > >>
> > > >> ```python
> > > >> class Data(Record):
> > > >>    i = Integer()
> > > >> ```
> > > >>
> > > >> The type of `i` field is a union: "type": ["null", "int"]
> > > >>
> > > >> While given the following class definition in Java:
> > > >>
> > > >> ```java
> > > >> class Data {
> > > >>    private final int i;
> > > >>    /* ... */
> > > >> }
> > > >> ```
> > > >>
> > > >> The type of `i` field is an integer: "type": "int"
> > > >>
> > > >> It brings an issue that if a Python consumer subscribes to a topic
> > > >> with schema defined above, then a Java producer will fail to create
> > > >> because of the schema incompatibility.
> > > >>
> > > >> Currently, the workaround is to change the schema compatibility
> > > >> strategy to FORWARD.
> > > >>
> > > >> Should we change the way to generate schema definition in the Python
> > > >> client to be compatible with the Java client? It could bring breaking
> > > >> changes to old Python clients, but it could guarantee compatibility
> > > >> with the Java client.
> > > >>
> > > >> If not, we still have to introduce an extra configuration to make
> > > >> Python schema compatible with Java schema. But it requires code
> > > >> changes. e.g. here is a possible solution:
> > > >>
> > > >> ```python
> > > >> class Data(Record):
> > > > >>    # NOTE: Users might have to add this extra field to control how
> > > > >>    # the schema is generated
> > > >>    __java_compatible = True
> > > >>    i = Integer()
> > > >> ```
> > > >>
> > > >> Thanks,
> > > >> Yunze
> > >
