[ 
https://issues.apache.org/jira/browse/AVRO-3184?focusedWorklogId=646719&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-646719
 ]

ASF GitHub Bot logged work on AVRO-3184:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 05/Sep/21 20:56
            Start Date: 05/Sep/21 20:56
    Worklog Time Spent: 10m 
      Work Description: belugabehr commented on a change in pull request #1301:
URL: https://github.com/apache/avro/pull/1301#discussion_r702475769



##########
File path: lang/java/avro/src/main/java/org/apache/avro/generic/GenericData.java
##########
@@ -892,6 +902,9 @@ public int resolveUnion(Schema union, Object datum) {
   protected String getSchemaName(Object datum) {
     if (datum == null || datum == JsonProperties.NULL_VALUE)
       return Type.NULL.getName();
+    String primativeType = PRIMATIVE_DATUM_TYPES.get(datum.getClass());
+    if (primativeType != null)
+      return primativeType;

Review comment:
       Hello,
   
   I wanted to remove these methods, but I feared that doing so would break 
backwards compatibility (even more than this change already does).  Developers 
may still want to override these methods. For example:
   
   ```java
       if (isInteger(datum))
         return Type.INT.getName();
   ```
   
   A developer could let the cache handle the primitive case while overriding 
`isInteger` so that other logical types also return, in this case, the INT 
type.  I'm happy to take away that ability if it offers no real-world use 
cases, but I didn't want to be so heavy-handed with it.
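
To make the point concrete, here is a minimal, self-contained sketch of that pattern. It does not use the real Avro classes: `SketchData` and `ShortAwareData` are hypothetical stand-ins for `GenericData` and a user subclass, the map contents are assumed, and `getSchemaName` is made public only so the sketch is easy to call.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for GenericData; not the actual Avro class.
class SketchData {
  // Assumed cache contents: exact boxed class -> Avro primitive type name.
  private static final Map<Class<?>, String> PRIMITIVE_DATUM_TYPES = new HashMap<>();
  static {
    PRIMITIVE_DATUM_TYPES.put(Integer.class, "int");
    PRIMITIVE_DATUM_TYPES.put(Long.class, "long");
  }

  // Overridable hook, kept so subclasses can widen what counts as an INT.
  protected boolean isInteger(Object datum) {
    return datum instanceof Integer;
  }

  public String getSchemaName(Object datum) {
    if (datum == null)
      return "null";
    // Fast path: the common primitive wrappers are answered by the cache.
    String primitiveType = PRIMITIVE_DATUM_TYPES.get(datum.getClass());
    if (primitiveType != null)
      return primitiveType;
    // Slow path: the overridable check still runs on a cache miss.
    if (isInteger(datum))
      return "int";
    throw new IllegalArgumentException("Unknown datum type " + datum.getClass().getName());
  }
}

// Hypothetical subclass that also treats Short values as INT.
class ShortAwareData extends SketchData {
  @Override
  protected boolean isInteger(Object datum) {
    return datum instanceof Short || super.isInteger(datum);
  }
}
```

With this shape, an `Integer` never reaches `isInteger`, while a `Short` misses the cache and is still resolved through the override.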






Issue Time Tracking
-------------------

    Worklog Id:     (was: 646719)
    Time Spent: 40m  (was: 0.5h)

> Cache Datum Type Strings in Resolve Union
> -----------------------------------------
>
>                 Key: AVRO-3184
>                 URL: https://issues.apache.org/jira/browse/AVRO-3184
>             Project: Apache Avro
>          Issue Type: Improvement
>            Reporter: David Mollitor
>            Assignee: David Mollitor
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: AVRO-3184.JPG, AVRO-master.JPG
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code:java|title=GenericData.java}
>   protected String getSchemaName(Object datum) {
>     if (datum == null || datum == JsonProperties.NULL_VALUE)
>       return Type.NULL.getName();
>     if (isRecord(datum))
>       return getRecordSchema(datum).getFullName();
>     if (isEnum(datum))
>       return getEnumSchema(datum).getFullName();
>     if (isArray(datum))
>       return Type.ARRAY.getName();
>     if (isMap(datum))
>       return Type.MAP.getName();
>     if (isFixed(datum))
>       return getFixedSchema(datum).getFullName();
>     if (isString(datum))
>       return Type.STRING.getName();
>     if (isBytes(datum))
>       return Type.BYTES.getName();
>     if (isInteger(datum))
>       return Type.INT.getName();
>     if (isLong(datum))
>       return Type.LONG.getName();
>     if (isFloat(datum))
>       return Type.FLOAT.getName();
>     if (isDouble(datum))
>       return Type.DOUBLE.getName();
>     if (isBoolean(datum))
>       return Type.BOOLEAN.getName();
>     throw new AvroRuntimeException(String.format("Unknown datum type %s: %s",
>         datum.getClass().getName(), datum));
>   }
> {code}
> This is a lot of effort for each of the simple native types (Long, Float, 
> Double, etc.), which are the last things checked.  Add a cache for 
> these simple use cases.
> I came across this while examining the performance of Apache ORC, which 
> includes an Avro benchmark for comparison.  The attached charts show the 
> results with the change implemented.
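
As a rough illustration of the proposed cache, the sketch below maps a datum's exact runtime class to its Avro primitive type name. `DatumTypeCache` and `lookup` are hypothetical names, and the set of cached classes is an assumption rather than the contents of the actual patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; the names and cached classes are assumptions.
final class DatumTypeCache {
  private static final Map<Class<?>, String> TYPES = new HashMap<>();
  static {
    TYPES.put(Integer.class, "int");
    TYPES.put(Long.class, "long");
    TYPES.put(Float.class, "float");
    TYPES.put(Double.class, "double");
    TYPES.put(Boolean.class, "boolean");
    TYPES.put(String.class, "string");
  }

  private DatumTypeCache() {
  }

  // Returns the cached type name, or null when the datum's exact class is
  // not cached and the caller must fall back to the full chain of checks.
  static String lookup(Object datum) {
    return TYPES.get(datum.getClass());
  }
}
```

Because the lookup is keyed on the exact class, a value such as a `StringBuilder` misses the cache even though it is a `CharSequence`, so the existing chain of checks is still needed as a fallback.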



