[ 
https://issues.apache.org/jira/browse/HADOOP-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12630449#action_12630449
 ] 

Joydeep Sen Sarma commented on HADOOP-4138:
-------------------------------------------

1. let's move the clear() to an else clause for the preceding if().
2. sounds ok.

a few more comments:

- setStructField in Reflection and ThriftStructObjectInspector - is this used 
anywhere? what's the motivation behind it?
- please remove references to TypeInfo in ObjectInspector.java comments and 
explain differently.

MetadataListStructObjectInspector.getStructFieldData looks pretty high 
overhead to me. we have gone to so much trouble to avoid creating 
objectinspectors - but those are created just once per map/reduce instance, 
whereas the getField() type of calls happen per row. creating a list from an 
array type on every evaluation seems unnecessary - we should be able to read 
directly from the backing array. there are also quite a few function calls 
(nested function calls, class equality checks and so on).
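
to make the per-row concern concrete, here is a minimal sketch of the two access paths. the class and method names below are hypothetical stand-ins, not the code in the patch, and it assumes a row is either an Object[] or a List:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a struct inspector; not the class from the patch.
final class ArrayBackedStructInspector {

    // Costly path: wraps an array row in a new List on every call,
    // then goes through List.get() to reach the field.
    static Object getFieldViaList(Object row, int fieldIndex) {
        List<Object> fields;
        if (row instanceof Object[]) {
            fields = Arrays.asList((Object[]) row); // one wrapper object per call
        } else {
            @SuppressWarnings("unchecked")
            List<Object> asList = (List<Object>) row;
            fields = asList;
        }
        return fields.get(fieldIndex);
    }

    // Direct path: index straight into the backing array, no allocation.
    static Object getFieldDirect(Object row, int fieldIndex) {
        if (row instanceof Object[]) {
            return ((Object[]) row)[fieldIndex];
        }
        return ((List<?>) row).get(fieldIndex);
    }
}
```

both methods return the same field; the direct one just skips the per-call wrapper and one level of indirection, which is where the savings would come from when this runs once per row.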

minor: getCategory() calls in Standard* can be marked final. 

One thing that i found somewhat complicated is the way the 
ObjectInspectorFactory() is written. It sounds like this would be the factory 
for most objectinspectors - but it's constructed to serve only reflection and 
reflection-derived ones. in particular - the metadatatyped... class has its 
own caching and factory-like methods. You might want to think about structuring 
this more cleanly (instead of 'Type' - there could be a more generic concept of 
a signature and an inspector family, and a cache per type X family).
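
a rough sketch of what i mean by a cache per signature X family. all names here (InspectorCache, Family, Key) are made up for illustration and inspectors are plain Objects; it also uses current Java library features for brevity, so treat it as the shape of the idea rather than a drop-in:

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical single cache shared by all inspector families.
final class InspectorCache {

    // Assumed set of families; the real list would come from the SerDe code.
    enum Family { REFLECTION, THRIFT, METADATA }

    // Cache key: which family builds the inspector, plus a signature
    // (a Java Type, a metadata type description, etc.).
    static final class Key {
        final Family family;
        final Object signature;

        Key(Family family, Object signature) {
            this.family = family;
            this.signature = signature;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return family == other.family && Objects.equals(signature, other.signature);
        }

        @Override
        public int hashCode() {
            return Objects.hash(family, signature);
        }
    }

    private static final ConcurrentHashMap<Key, Object> CACHE = new ConcurrentHashMap<>();

    // Returns the cached inspector for (family, signature), building it once if absent.
    static Object get(Family family, Object signature, Function<Object, Object> builder) {
        return CACHE.computeIfAbsent(new Key(family, signature),
                                     k -> builder.apply(k.signature));
    }
}
```

with something like this, the reflection and metadata paths would share one lookup and differ only in the builder they pass in, instead of each keeping its own cache.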


> [Hive] refactor the SerDe library
> ---------------------------------
>
>                 Key: HADOOP-4138
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4138
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/hive
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4138-1.txt, HADOOP-4138-2.txt
>
>
> Hive uses the library from src/contrib/hive/serde to do 
> serialization/deserialization.
> We want to do a refactoring of the library to:
> 1. Split Serializer and Deserializer interface
> 2. Split Serializer/Deserializer and ObjectInspector interface
> 3. Change hive/metaserver and hive/ql to use the new SerDe framework

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
