[ 
https://issues.apache.org/jira/browse/AVRO-3184?focusedWorklogId=647020&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-647020
 ]

ASF GitHub Bot logged work on AVRO-3184:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 06/Sep/21 14:56
            Start Date: 06/Sep/21 14:56
    Worklog Time Spent: 10m 
      Work Description: RyanSkraba commented on a change in pull request #1301:
URL: https://github.com/apache/avro/pull/1301#discussion_r702957775



##########
File path: lang/java/avro/src/main/java/org/apache/avro/generic/GenericData.java
##########
@@ -69,6 +69,16 @@
 
   private static final GenericData INSTANCE = new GenericData();
 
+  private static final Map<Class<?>, String> PRIMATIVE_DATUM_TYPES = new IdentityHashMap<>();

Review comment:
       They're equivalent -- let's go with @belugabehr's choice!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 647020)
    Time Spent: 1h 40m  (was: 1.5h)

> Cache Datum Type Strings in Resolve Union
> -----------------------------------------
>
>                 Key: AVRO-3184
>                 URL: https://issues.apache.org/jira/browse/AVRO-3184
>             Project: Apache Avro
>          Issue Type: Improvement
>            Reporter: David Mollitor
>            Assignee: David Mollitor
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: AVRO-3184.JPG, AVRO-master.JPG
>
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> {code:java|title=GenericData.java}
>   protected String getSchemaName(Object datum) {
>     if (datum == null || datum == JsonProperties.NULL_VALUE)
>       return Type.NULL.getName();
>     if (isRecord(datum))
>       return getRecordSchema(datum).getFullName();
>     if (isEnum(datum))
>       return getEnumSchema(datum).getFullName();
>     if (isArray(datum))
>       return Type.ARRAY.getName();
>     if (isMap(datum))
>       return Type.MAP.getName();
>     if (isFixed(datum))
>       return getFixedSchema(datum).getFullName();
>     if (isString(datum))
>       return Type.STRING.getName();
>     if (isBytes(datum))
>       return Type.BYTES.getName();
>     if (isInteger(datum))
>       return Type.INT.getName();
>     if (isLong(datum))
>       return Type.LONG.getName();
>     if (isFloat(datum))
>       return Type.FLOAT.getName();
>     if (isDouble(datum))
>       return Type.DOUBLE.getName();
>     if (isBoolean(datum))
>       return Type.BOOLEAN.getName();
>     throw new AvroRuntimeException(String.format("Unknown datum type %s: %s", datum.getClass().getName(), datum));
>   }
> {code}
> This is a lot of work for the simple native types (Long, Float, Double, 
> etc.), which are the last to be checked.  Add a cache for these simple use 
> cases.
> I came across this while examining performance of Apache ORC which includes 
> an Avro benchmark for comparison.  You can see the charts with the change 
> implemented.
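For illustration only, the caching idea described above could be sketched roughly as follows. This is not the code from the pull request: the class name DatumTypeCache and the method primitiveTypeName are hypothetical, and the type names are hard-coded here as plain strings (assumed to match what Type.X.getName() returns) so the example compiles without Avro on the classpath. A null return means "not cached", in which case a caller would fall back to the existing chain of isXxx() checks.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical helper, not part of Avro: maps the exact runtime class of a
// simple datum to its schema type name with a single map lookup.
final class DatumTypeCache {

  // Class objects are unique per class loader, so identity comparison is
  // sufficient and avoids calls to equals()/hashCode().
  private static final Map<Class<?>, String> PRIMITIVE_DATUM_TYPES = new IdentityHashMap<>();

  static {
    // Assumption: these literals mirror the lower-case Avro type names.
    PRIMITIVE_DATUM_TYPES.put(Integer.class, "int");
    PRIMITIVE_DATUM_TYPES.put(Long.class, "long");
    PRIMITIVE_DATUM_TYPES.put(Float.class, "float");
    PRIMITIVE_DATUM_TYPES.put(Double.class, "double");
    PRIMITIVE_DATUM_TYPES.put(Boolean.class, "boolean");
    PRIMITIVE_DATUM_TYPES.put(String.class, "string");
  }

  private DatumTypeCache() {
  }

  // Returns the cached type name, or null when the datum's class is not in
  // the cache and the slower instanceof-style checks are still needed.
  static String primitiveTypeName(Object datum) {
    if (datum == null) {
      return "null";
    }
    return PRIMITIVE_DATUM_TYPES.get(datum.getClass());
  }
}
```

Note that a lookup keyed on the exact class only covers the listed boxed types; records, enums, arrays, maps, and other CharSequence or ByteBuffer implementations would still go through the original checks.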



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
