[
https://issues.apache.org/jira/browse/AVRO-3573?focusedWorklogId=790339&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-790339
]
ASF GitHub Bot logged work on AVRO-3573:
----------------------------------------
Author: ASF GitHub Bot
Created on: 13/Jul/22 09:22
Start Date: 13/Jul/22 09:22
Worklog Time Spent: 10m
Work Description: izveigor commented on code in PR #1759:
URL: https://github.com/apache/avro/pull/1759#discussion_r919851566
##########
lang/py/avro/schema.py:
##########
@@ -570,7 +570,12 @@ def __init__(
             raise avro.errors.InvalidName("An enum symbol must be a valid schema name.")
         if len(set(symbols)) < len(symbols):
-            raise avro.errors.AvroException(f"Duplicate symbol: {symbols}")
+            duplicate_symbols = {symbol for symbol in symbols if symbols.count(symbol) > 1}
+
+            if len(duplicate_symbols) == 1:
Review Comment:
Hello, @clesaec!
I do not really understand what you mean. If there are no duplicate symbols,
the length of the set is not less than the length of the list, so no error is
raised. If there is exactly one duplicate symbol, the application shows a list
containing only that symbol. If there is more than one duplicate symbol, the
application shows the list of duplicate symbols.
The message for one duplicate symbol differs from the message for several
duplicate symbols only in its grammar.
If you are worried about the output format (a list), many libraries use this
format both for a single symbol and for many symbols. For example, marshmallow:
https://github.com/marshmallow-code/marshmallow/blob/dev/src/marshmallow/schema.py#L1009-L1018
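The three cases above can be sketched roughly as follows. This is a standalone illustration, not the code in the PR: `find_duplicates` and `check_symbols` are hypothetical helpers, `ValueError` stands in for `avro.errors.AvroException` so the snippet runs without avro installed, the message wording is an assumption, and `collections.Counter` replaces the `symbols.count(...)` call from the diff to avoid rescanning the list for every symbol.

```python
from collections import Counter


def find_duplicates(symbols):
    """Return the symbols that occur more than once, in first-seen order."""
    counts = Counter(symbols)
    return [symbol for symbol, count in counts.items() if count > 1]


def check_symbols(symbols):
    """Raise an error that names only the duplicated symbols.

    Hypothetical helper; ValueError stands in for avro.errors.AvroException.
    """
    if len(set(symbols)) == len(symbols):
        return  # no duplicates: the set is as long as the list

    duplicates = find_duplicates(symbols)
    if len(duplicates) == 1:
        # Exactly one duplicated symbol: singular wording.
        raise ValueError(f"Duplicate symbol: {duplicates[0]}")
    # Several duplicated symbols: plural wording with the list.
    raise ValueError(f"Duplicate symbols: {duplicates}")
```

With this sketch, `check_symbols(["A", "B"])` returns without raising, while a list that repeats one or more symbols raises an error that mentions only the repeated symbols rather than the whole list.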
Issue Time Tracking
-------------------
Worklog Id: (was: 790339)
Time Spent: 50m (was: 40m)
> Duplicate symbols (EnumSchema)
> ------------------------------
>
> Key: AVRO-3573
> URL: https://issues.apache.org/jira/browse/AVRO-3573
> Project: Apache Avro
> Issue Type: Improvement
> Affects Versions: 1.11.0
> Reporter: Igor Izvekov
> Priority: Minor
> Labels: pull-request-available
> Fix For: 1.11.1
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> If EnumSchema has duplicate symbols, an error is raised. Instead of showing
> only the duplicate symbols (or the single duplicate symbol), the error shows
> the entire list of symbols. This improvement fixes that defect and shows the
> message "Duplicate symbol" with the symbol if there is only one, or
> "Duplicates symbols" with the list of duplicate symbols if there is more
> than one.
> P.S. The tests do not check the error message. Writing a test that checks
> the message of an error can take a long time.