[
https://issues.apache.org/jira/browse/AVRO-3184?focusedWorklogId=643413&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-643413
]
ASF GitHub Bot logged work on AVRO-3184:
----------------------------------------
Author: ASF GitHub Bot
Created on: 30/Aug/21 11:09
Start Date: 30/Aug/21 11:09
Worklog Time Spent: 10m
Work Description: martin-g commented on a change in pull request #1301:
URL: https://github.com/apache/avro/pull/1301#discussion_r698395051
##########
File path: lang/java/avro/src/main/java/org/apache/avro/generic/GenericData.java
##########
@@ -69,6 +69,16 @@
private static final GenericData INSTANCE = new GenericData();
+ private static final Map<Class<?>, String> PRIMATIVE_DATUM_TYPES = new IdentityHashMap<>();
Review comment:
Is there any reason to use IdentityHashMap rather than HashMap?
##########
File path: lang/java/avro/src/main/java/org/apache/avro/generic/GenericData.java
##########
@@ -892,6 +902,9 @@ public int resolveUnion(Schema union, Object datum) {
protected String getSchemaName(Object datum) {
if (datum == null || datum == JsonProperties.NULL_VALUE)
return Type.NULL.getName();
+ String primativeType = PRIMATIVE_DATUM_TYPES.get(datum.getClass());
+ if (primativeType != null)
+ return primativeType;
Review comment:
Shouldn't we remove the later checks for `isInteger(datum)`,
`isLong(datum)`, etc.?
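For context on the IdentityHashMap question: `Class` does not override `equals` or `hashCode`, so for `Class<?>` keys both map types resolve the same lookups. The difference is that IdentityHashMap compares keys with `==` and hashes with `System.identityHashCode`. A minimal sketch, in which the class and helper names are hypothetical and not taken from the pull request:

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical demo class; not part of Avro or of pull request #1301.
final class ClassKeyMapSketch {

  // Populate a map with two boxed classes and their type names.
  static Map<Class<?>, String> fill(Map<Class<?>, String> map) {
    map.put(Long.class, "long");
    map.put(Double.class, "double");
    return map;
  }

  // Look up by the exact runtime class of the datum.
  static String lookup(Map<Class<?>, String> map, Object datum) {
    return map.get(datum.getClass());
  }

  public static void main(String[] args) {
    Map<Class<?>, String> identity = fill(new IdentityHashMap<>());
    Map<Class<?>, String> hashed = fill(new HashMap<>());
    // Both maps find the entry for a Long and miss for a String.
    System.out.println(lookup(identity, 1L) + " " + lookup(hashed, 1L));
    System.out.println(lookup(identity, "text") + " " + lookup(hashed, "text"));
  }
}
```

Either map only matches the exact runtime class, so a datum whose class is not a key falls through to whatever checks follow the lookup.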
Issue Time Tracking
-------------------
Worklog Id: (was: 643413)
Remaining Estimate: 0h
Time Spent: 10m
> Cache Datum Type Strings in Resolve Union
> -----------------------------------------
>
> Key: AVRO-3184
> URL: https://issues.apache.org/jira/browse/AVRO-3184
> Project: Apache Avro
> Issue Type: Improvement
> Reporter: David Mollitor
> Assignee: David Mollitor
> Priority: Major
> Attachments: AVRO-3184.JPG, AVRO-master.JPG
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> {code:java|title=GenericData.java}
> protected String getSchemaName(Object datum) {
> if (datum == null || datum == JsonProperties.NULL_VALUE)
> return Type.NULL.getName();
> if (isRecord(datum))
> return getRecordSchema(datum).getFullName();
> if (isEnum(datum))
> return getEnumSchema(datum).getFullName();
> if (isArray(datum))
> return Type.ARRAY.getName();
> if (isMap(datum))
> return Type.MAP.getName();
> if (isFixed(datum))
> return getFixedSchema(datum).getFullName();
> if (isString(datum))
> return Type.STRING.getName();
> if (isBytes(datum))
> return Type.BYTES.getName();
> if (isInteger(datum))
> return Type.INT.getName();
> if (isLong(datum))
> return Type.LONG.getName();
> if (isFloat(datum))
> return Type.FLOAT.getName();
> if (isDouble(datum))
> return Type.DOUBLE.getName();
> if (isBoolean(datum))
> return Type.BOOLEAN.getName();
> throw new AvroRuntimeException(String.format("Unknown datum type %s: %s",
> datum.getClass().getName(), datum));
> }
> {code}
> This is a lot of effort for each of the simple native types (Long, Float,
> Double, etc.), because they are the last things checked. Add a cache for
> these simple use cases.
> I came across this while examining performance of Apache ORC which includes
> an Avro benchmark for comparison. You can see the charts with the change
> implemented.
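The caching idea described above can be sketched as follows. This is an illustration under stated assumptions, not the code from the pull request: the class name, `slowPath`, and the set of cached classes are hypothetical, and the string literals stand in for `Type.INT.getName()` and its siblings so that the example runs without Avro on the classpath.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical stand-in for GenericData; illustrates the lookup order only.
final class DatumTypeCacheSketch {

  // Assumed cache contents: exact boxed class -> primitive type name.
  private static final Map<Class<?>, String> PRIMITIVE_DATUM_TYPES = new IdentityHashMap<>();
  static {
    PRIMITIVE_DATUM_TYPES.put(Integer.class, "int");
    PRIMITIVE_DATUM_TYPES.put(Long.class, "long");
    PRIMITIVE_DATUM_TYPES.put(Float.class, "float");
    PRIMITIVE_DATUM_TYPES.put(Double.class, "double");
    PRIMITIVE_DATUM_TYPES.put(Boolean.class, "boolean");
  }

  static String getSchemaName(Object datum) {
    if (datum == null)
      return "null";
    // Fast path: one map lookup instead of walking every is*() check.
    String primitiveType = PRIMITIVE_DATUM_TYPES.get(datum.getClass());
    if (primitiveType != null)
      return primitiveType;
    return slowPath(datum);
  }

  // Hypothetical placeholder for the original chain of checks
  // (records, enums, arrays, maps, fixed, strings, bytes, ...).
  private static String slowPath(Object datum) {
    if (datum instanceof CharSequence)
      return "string";
    throw new IllegalArgumentException(
        String.format("Unknown datum type %s: %s", datum.getClass().getName(), datum));
  }

  public static void main(String[] args) {
    System.out.println(getSchemaName(42L));
    System.out.println(getSchemaName("text"));
  }
}
```

Placing the lookup right after the null check means the boxed primitives no longer pay for the record, enum, array, map, fixed, string, and bytes checks that precede them in the original method.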