[ 
https://issues.apache.org/jira/browse/HIVE-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744035#action_12744035
 ] 

He Yongqiang commented on HIVE-756:
-----------------------------------

{noformat}
       if (!currentValue.inited) {
         currentValueBuffer();
+        ret.resetValid(columnNumber); // do this only when not intialized 
       }
 
       // we do not use BytesWritable here to avoid the byte-copy from
       // DataOutputStream to BytesWritable
 
-      ret.resetValid(columnNumber);

-        if (skippedColIDs[i]) {
-          if (ref != BytesRefWritable.ZeroBytesRefWritable)
-            ret.set(i, BytesRefWritable.ZeroBytesRefWritable);
-          continue;
-        }
{noformat}

This code can be used by non-Hive callers, and since getCurrentRow is a public 
method, we cannot guarantee that the ret argument passed in is the same object 
as in previous calls. So we need to call resetValid and 
set(.., BytesRefWritable.ZeroBytesRefWritable) on every call. What do you 
think?
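
To make the concern concrete, here is a minimal sketch of the failure mode. It does not use the real RCFile classes: RowBuffer, ColumnReader, fillRowResetOnce and fillRowAlwaysReset are hypothetical stand-ins for BytesRefArrayWritable, the reader, and the two variants of getCurrentRow, and strings stand in for byte references. It only assumes that the reader keeps an "inited" flag while the caller is free to pass a different container on each call.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a reusable row container (not the real BytesRefArrayWritable). */
class RowBuffer {
    private String[] cols = new String[0];
    private int valid = 0;

    /** Declares the first n slots valid, growing the backing array if needed. */
    void resetValid(int n) {
        if (n > cols.length) cols = Arrays.copyOf(cols, n);
        valid = n;
    }

    void set(int i, String v) {
        if (i >= cols.length) cols = Arrays.copyOf(cols, i + 1);
        cols[i] = v;
        if (valid <= i) valid = i + 1;
    }

    String get(int i) { return i < valid ? cols[i] : null; }
}

/** Hypothetical reader whose public fill methods write into a caller-supplied buffer. */
class ColumnReader {
    static final String EMPTY = "";
    private final String[] data;
    private final boolean[] skipped;
    private boolean inited = false; // state lives in the reader, not in the buffer

    ColumnReader(String[] data, boolean[] skipped) {
        this.data = data;
        this.skipped = skipped;
    }

    /** Patched behaviour: prepare ret only on the first call. Assumes ret never changes. */
    void fillRowResetOnce(RowBuffer ret) {
        if (!inited) {
            inited = true;
            prepare(ret);
        }
        copyColumns(ret);
    }

    /** Original behaviour: prepare ret on every call, whichever buffer is passed. */
    void fillRowAlwaysReset(RowBuffer ret) {
        inited = true;
        prepare(ret);
        copyColumns(ret);
    }

    private void prepare(RowBuffer ret) {
        ret.resetValid(data.length);
        for (int i = 0; i < data.length; i++) {
            if (skipped[i]) ret.set(i, EMPTY);
        }
    }

    private void copyColumns(RowBuffer ret) {
        for (int i = 0; i < data.length; i++) {
            if (!skipped[i]) ret.set(i, data[i]);
        }
    }
}

class ResetDemo {
    public static void main(String[] args) {
        ColumnReader reader = new ColumnReader(
                new String[] {"a", "b", "c"}, new boolean[] {false, true, false});

        reader.fillRowResetOnce(new RowBuffer()); // first call prepares this buffer only

        RowBuffer second = new RowBuffer();       // a different buffer, already used elsewhere
        second.set(1, "stale");
        reader.fillRowResetOnce(second);
        System.out.println("[" + second.get(1) + "]"); // prints "[stale]"

        reader.fillRowAlwaysReset(second);
        System.out.println("[" + second.get(1) + "]"); // prints "[]"
    }
}
```

In this sketch the skipped column of the second buffer keeps its old value under the reset-once variant, which is the situation a non-Hive caller could hit if it does not reuse a single ret object.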


> performance improvement for RCFile and ColumnarSerDe in Hive
> ------------------------------------------------------------
>
>                 Key: HIVE-756
>                 URL: https://issues.apache.org/jira/browse/HIVE-756
>             Project: Hadoop Hive
>          Issue Type: Improvement
>            Reporter: Ning Zhang
>            Assignee: Ning Zhang
>         Attachments: hive-756.patch
>
>
> There are some easy performance improvements in the columnar storage in Hive 
> I found during Hackathon. 

