ta1meng commented on issue #9571:
URL: https://github.com/apache/pulsar/issues/9571#issuecomment-828086647


   @codelipenghui @congbobo184 while the original issue was fixed, I 
encountered three other bugs that also result in an `IncompatibleSchema` 
exception.
   
   Consider the workarounds that I implemented locally:
   
   ```
   @classmethod
   def schema(cls):
       schema = {
           'name': str(cls.__name__),
           'type': 'record',
           'fields': []
       }

       # Do NOT sort the keys: keep fields in declaration order.
       for name, field in cls._fields.items():
           if field._required:
               field_type = field.schema()
           elif field._default is not None:
               # HACK for default values: non-null branch goes first
               field_type = [field.schema(), 'null']
           else:
               field_type = ['null', field.schema()]

           entry = {'name': name, 'type': field_type}
           if field._default is not None:
               # HACK for default values
               entry['default'] = field._default
           schema['fields'].append(entry)
       return schema
   ```
   
   which overrides `Record::schema()` in `definition.py`.
   
   `First issue`: Pulsar's schema comparison logic seems textual in nature, so 
if two fields are specified in reverse order, the schema comparison would 
return "incompatible", even though the Avro schemas are compatible. The 
workaround I put in removes the sorting of field names, so they appear in the 
same order in the schema as they are declared in code. This issue can either be 
fixed server-side or in the Python client library.
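
To illustrate the ordering problem, here is a minimal sketch that does not use the Pulsar client at all. `declared_fields` and `build_schema` are hypothetical names invented for this example; it only shows that sorting field names changes the serialized schema text even though the set of fields is unchanged.

```python
import json

# Hypothetical stand-in for a Record's fields, in declaration order.
declared_fields = {"user_id": "long", "email": "string", "age": "int"}


def build_schema(name, fields, sort_keys):
    """Build an Avro-style record schema dict from a name -> type mapping."""
    names = sorted(fields) if sort_keys else list(fields)
    return {
        "name": name,
        "type": "record",
        "fields": [{"name": n, "type": fields[n]} for n in names],
    }


sorted_schema = build_schema("User", declared_fields, sort_keys=True)
declared_schema = build_schema("User", declared_fields, sort_keys=False)

# Both schemas contain the same fields...
same_fields = (
    {f["name"] for f in sorted_schema["fields"]}
    == {f["name"] for f in declared_schema["fields"]}
)
# ...but a textual comparison of the serialized form sees them as different.
same_text = json.dumps(sorted_schema) == json.dumps(declared_schema)

print(same_fields, same_text)
```

If the server really does compare schema text, as it appears to, a producer and a consumer that list the same fields in different orders would trip this kind of mismatch.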
   
   `Second issue`: there is no default value support in the schema generation 
code. I hacked it in. I'm new to Python and I'm sure this code can be written 
better.
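
The default-value handling can also be pulled out into a small standalone helper. This is only a sketch: `field_entry` is a hypothetical function, not part of the Pulsar client, and it mirrors the union ordering used in my workaround (the non-null type first when a default is present, `'null'` first otherwise).

```python
def field_entry(name, avro_type, required=False, default=None):
    """Return one entry for the 'fields' list of an Avro-style record schema."""
    if required:
        field_type = avro_type
    elif default is not None:
        # A non-null default: list the concrete type first in the union.
        field_type = [avro_type, "null"]
    else:
        field_type = ["null", avro_type]

    entry = {"name": name, "type": field_type}
    if default is not None:
        # Only emit 'default' when one was actually declared.
        entry["default"] = default
    return entry


print(field_entry("age", "int", default=18))
print(field_entry("email", "string"))
print(field_entry("user_id", "long", required=True))
```

Note that this sketch, like the workaround, cannot express an explicit `"default": null`, since `None` doubles as "no default declared".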
   
   `Third issue`: `"default": null` is partially supported, and rarely logged. 
This one took me the longest to figure out because I saw identical schemas 
printed that resulted in an `IncompatibleSchema` exception. While this issue 
would be tricky to fix in the Pulsar client library, we can improve things 
greatly by logging `"default": null` when it is a part of the schema. This one 
is different from the first two issues, as it seems to require a sweep across 
multiple Pulsar projects, so I will file this issue separately.
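
As a rough illustration of why the logs were misleading, the sketch below compares a formatter that drops null-valued keys with one that keeps them. Both `lossy_format` and `faithful_format` are hypothetical; I am not claiming this is how any Pulsar component actually formats schemas, only that this kind of omission would make two different schemas print identically.

```python
import json

# Two field definitions that differ only by an explicit null default.
with_default = {"name": "email", "type": ["null", "string"], "default": None}
without_default = {"name": "email", "type": ["null", "string"]}


def lossy_format(field):
    """Hypothetical log formatter that silently drops keys whose value is None."""
    kept = {k: v for k, v in field.items() if v is not None}
    return json.dumps(kept, sort_keys=True)


def faithful_format(field):
    """Formatter that keeps every key, so a null default shows up in the output."""
    return json.dumps(field, sort_keys=True)


# The lossy output hides the difference; the faithful output exposes it.
print(lossy_format(with_default) == lossy_format(without_default))
print(faithful_format(with_default) == faithful_format(without_default))
print(faithful_format(with_default))
```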
   
   So this ticket can be used to track `First issue` and `Second issue`, as 
they are both localized to a single method in the Pulsar Python client library.
   
   



