jjaakola-aiven commented on code in PR #1759:
URL: https://github.com/apache/avro/pull/1759#discussion_r1003412161


##########
lang/py/avro/test/test_schema.py:
##########
@@ -713,6 +713,22 @@ def test_parse_invalid_symbol(self):
         except avro.errors.InvalidName:  # pragma: no coverage
             self.fail("When enum symbol validation is disabled, an invalid 
symbol should not raise InvalidName.")
 
+    def test_parse_duplicate_symbol(self):
+        duplicate_symbol_schema = json.dumps({"type": "enum", "name": 
"AVRO3573", "symbols": ["A", "A", "B", "C", "D"]})
+        with self.assertRaisesRegex(
+            avro.errors.AvroException, r"Duplicate symbol: \[\'A\'\]", 
msg="When enum symbol has a duplicate, AvroException should raise."
+        ):
+            avro.schema.parse(duplicate_symbol_schema)
+
+    def test_parse_duplicate_symbols(self):
+        duplicate_symbols_schema = json.dumps({"type": "enum", "name": 
"AVRO3573", "symbols": ["A", "A", "B", "C", "C", "D"]})
+        with self.assertRaisesRegex(
+            avro.errors.AvroException,
+            r"Duplicate symbols: (\[\'A\', \'C\'\])|(\[\'C\', \'A\'\])",

Review Comment:
   Is there a case where the resulting order in error message would be 
backwards?
   If not I'd prefer for readability that error message and assertion would be 
exact. E.g.
   ```python
       with self.assertRaises(avro.errors.AvroException, msg="Should raise 
AvroException") as exc_info:
               avro.schema.parse(duplicate_symbols_schema)
       assert str(exc_info.exception) == "Duplicate symbols: (['A', 'C'])"
   ```
   



##########
lang/py/avro/schema.py:
##########
@@ -570,7 +570,12 @@ def __init__(
                     raise avro.errors.InvalidName("An enum symbol must be a 
valid schema name.")
 
         if len(set(symbols)) < len(symbols):
-            raise avro.errors.AvroException(f"Duplicate symbol: {symbols}")
+            duplicate_symbols = {symbol for symbol in symbols if 
symbols.count(symbol) > 1}
+
+            if len(duplicate_symbols) == 1:
+                raise avro.errors.AvroException(f"Duplicate symbol: 
{list(duplicate_symbols)}")
+            else:
+                raise avro.errors.AvroException(f"Duplicate symbols: 
{list(duplicate_symbols)}")

Review Comment:
   Java gives an error message of `Duplicate enum symbol: A` and reports only 
single duplicate. I am not sure if the messages should be aligned, just noting 
the difference.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to