[
https://issues.apache.org/jira/browse/AVRO-3184?focusedWorklogId=648667&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-648667
]
ASF GitHub Bot logged work on AVRO-3184:
----------------------------------------
Author: ASF GitHub Bot
Created on: 09/Sep/21 15:20
Start Date: 09/Sep/21 15:20
Worklog Time Spent: 10m
Work Description: belugabehr commented on pull request #1301:
URL: https://github.com/apache/avro/pull/1301#issuecomment-916198144
Yup. Should be good to go now. Thanks for all your helpful feedback.
Issue Time Tracking
-------------------
Worklog Id: (was: 648667)
Time Spent: 2h (was: 1h 50m)
> Cache Datum Type Strings in Resolve Union
> -----------------------------------------
>
> Key: AVRO-3184
> URL: https://issues.apache.org/jira/browse/AVRO-3184
> Project: Apache Avro
> Issue Type: Improvement
> Reporter: David Mollitor
> Assignee: David Mollitor
> Priority: Major
> Labels: pull-request-available
> Attachments: AVRO-3184.JPG, AVRO-master.JPG
>
> Time Spent: 2h
> Remaining Estimate: 0h
>
> {code:java|title=GenericData.java}
> protected String getSchemaName(Object datum) {
>   if (datum == null || datum == JsonProperties.NULL_VALUE)
>     return Type.NULL.getName();
>   if (isRecord(datum))
>     return getRecordSchema(datum).getFullName();
>   if (isEnum(datum))
>     return getEnumSchema(datum).getFullName();
>   if (isArray(datum))
>     return Type.ARRAY.getName();
>   if (isMap(datum))
>     return Type.MAP.getName();
>   if (isFixed(datum))
>     return getFixedSchema(datum).getFullName();
>   if (isString(datum))
>     return Type.STRING.getName();
>   if (isBytes(datum))
>     return Type.BYTES.getName();
>   if (isInteger(datum))
>     return Type.INT.getName();
>   if (isLong(datum))
>     return Type.LONG.getName();
>   if (isFloat(datum))
>     return Type.FLOAT.getName();
>   if (isDouble(datum))
>     return Type.DOUBLE.getName();
>   if (isBoolean(datum))
>     return Type.BOOLEAN.getName();
>   throw new AvroRuntimeException(String.format("Unknown datum type %s: %s",
>       datum.getClass().getName(), datum));
> }
> {code}
> This is a lot of work for the simple native types (Long, Float, Double,
> etc.), which are the last ones checked. Add a cache for these simple use
> cases.
> I came across this while examining the performance of Apache ORC, which
> includes an Avro benchmark for comparison. The attached charts show the
> results with the change implemented.
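
For illustration only, the cache proposed in the description could look
roughly like the sketch below. It is not the code from the pull request:
the class PrimitiveSchemaNames and its lookup method are hypothetical names,
and the sketch assumes that a map keyed by the datum's class is consulted
before the chain of is*() checks. The string values are the lowercase names
that Schema.Type.getName() returns for these types.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Avro: a class-keyed cache of schema names
// for the boxed primitive types that getSchemaName() otherwise checks last.
final class PrimitiveSchemaNames {
  private static final Map<Class<?>, String> CACHE = new HashMap<>();

  static {
    // Populated once and only read afterwards, so a plain HashMap is enough.
    CACHE.put(Integer.class, "int");
    CACHE.put(Long.class, "long");
    CACHE.put(Float.class, "float");
    CACHE.put(Double.class, "double");
    CACHE.put(Boolean.class, "boolean");
  }

  private PrimitiveSchemaNames() {
  }

  /**
   * Returns the cached schema name for a boxed primitive, or null when the
   * datum is null or of another type and the caller has to fall back to the
   * full chain of checks.
   */
  static String lookup(Object datum) {
    if (datum == null) {
      return null;
    }
    return CACHE.get(datum.getClass());
  }
}
```

A caller would try PrimitiveSchemaNames.lookup(datum) first and run the
existing checks only when it returns null. One caveat of this sketch: a
subclass that overrides isLong() or a similar check would be bypassed for
the cached classes, so a real change would need to account for that.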
--
This message was sent by Atlassian Jira
(v8.3.4#803005)