Good morning,
It seems that this email is the simplest way to report an issue in the
Python implementation of Avro:
https://github.com/apache/avro/tree/master/lang/py
There is an issue with the class *SchemaCompatibilityResult*, defined in
*compatibility.py*:
class SchemaCompatibilityResult:
    def __init__(
        self,
        compatibility: SchemaCompatibilityType = SchemaCompatibilityType.recursion_in_progress,
        incompatibilities: Optional[List[SchemaIncompatibilityType]] = None,
        messages: Optional[Set[str]] = None,
        locations: Optional[Set[str]] = None,
    ):
        self.locations = locations or {"/"}
        self.messages = messages or set()
        self.compatibility = compatibility
        self.incompatibilities = incompatibilities or []
As you can see, the two attributes *locations* and *messages* are defined
as Python sets and are therefore unordered. When a compatibility check is
made between a reader and a writer schema, the check runs recursively,
and results of the above type are merged together for each incompatibility
found. The problem is that each message belongs to a specific location, yet
the two are stored as separate attributes and merged independently, see
*compatibility.py:98*:
def merge(
    this: SchemaCompatibilityResult, that: SchemaCompatibilityResult
) -> SchemaCompatibilityResult:
    ...
    messages = this.messages.union(that.messages)
    locations = this.locations.union(that.locations)
    ...
Since Python sets are not ordered, it is possible to end up with *messages*
that are out of sync with their *locations*.
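To illustrate, here is a minimal sketch that uses plain sets in place of two partial results. The message and location strings are made up for the example and are not taken from real Avro output. It shows one way the pairing can break: two different messages reported at the same location collapse into a single location entry.

```python
# Hypothetical values standing in for two partial results of a recursive check.
this_messages = {"message A"}
this_locations = {"/fields/0/type"}
that_messages = {"message B"}
that_locations = {"/fields/0/type"}  # same location, different problem

# Same union logic as in merge().
messages = this_messages.union(that_messages)
locations = this_locations.union(that_locations)

# Two messages remain, but only one location, so they cannot be paired up.
print(len(messages), len(locations))  # 2 1
```

Even when the two sets happen to have the same size, iterating over them gives no guarantee that the n-th message corresponds to the n-th location.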
Using Python lists instead of sets would solve this problem, but IMHO a
better solution is to encapsulate each location and its message in a simple
class, so they are always bound together.
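A rough sketch of what such a class could look like is below. The names *Incompatibility*, *Result* and *details* are hypothetical and do not exist in the current code; *Result* is only a simplified stand-in for *SchemaCompatibilityResult*, and the strings are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Incompatibility:
    """Hypothetical pair that keeps a message bound to its location."""

    location: str
    message: str


@dataclass
class Result:
    """Simplified, hypothetical stand-in for SchemaCompatibilityResult."""

    details: List[Incompatibility] = field(default_factory=list)


def merge(this: Result, that: Result) -> Result:
    # Keep the order of discovery and drop exact duplicates.
    merged = list(this.details)
    for item in that.details:
        if item not in merged:
            merged.append(item)
    return Result(details=merged)


a = Result([Incompatibility("/fields/0/type", "message A")])
b = Result([Incompatibility("/fields/0/type", "message B")])
merged = merge(a, b)
for item in merged.details:
    print(item.location, item.message)
```

With this shape, two different messages at the same location stay distinct, and every message can always be traced back to its location.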
Best wishes,
Oleksii