[jira] Commented: (HIVE-256) map side aggregation : number of output rows is same as number of input rows

Joydeep Sen Sarma (JIRA) Wed, 28 Jan 2009 20:36:24 -0800

    [ 
https://issues.apache.org/jira/browse/HIVE-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668334#action_12668334
 ]


Joydeep Sen Sarma commented on HIVE-256:
----------------------------------------

Man - the commits are fast!

I had some questions:

Curious - 

-    if ((numEntries % NUMROWSESTIMATESIZE) == 0) {
+    if ((numEntriesHashTable == 0) || ((numEntries % NUMROWSESTIMATESIZE) == 0)) {

I'm guessing this was the critical change, but I couldn't follow how 
numEntriesHashTable could ever be 0 without numEntries being 0 as well.
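
To make the question concrete, here is a minimal sketch of the two conditions from the diff, evaluated side by side. It is a hypothetical stand-in, not the actual operator code: the class name, the method names, and the value 1000 for NUMROWSESTIMATESIZE are assumptions for illustration only.

```java
// Hypothetical stand-in for the condition in the diff; not the real Hive code.
class EstimateCheckSketch {
    // Assumed value for illustration; the real constant may differ.
    static final int NUMROWSESTIMATESIZE = 1000;

    // Condition before the patch.
    static boolean oldCheck(long numEntries) {
        return (numEntries % NUMROWSESTIMATESIZE) == 0;
    }

    // Condition after the patch.
    static boolean newCheck(long numEntriesHashTable, long numEntries) {
        return (numEntriesHashTable == 0)
            || ((numEntries % NUMROWSESTIMATESIZE) == 0);
    }

    public static void main(String[] args) {
        // Each pair is {numEntriesHashTable, numEntries}.
        long[][] cases = { {0, 0}, {0, 1}, {0, 1000}, {5, 1}, {5, 1000} };
        for (long[] c : cases) {
            System.out.printf(
                "numEntriesHashTable=%d numEntries=%d old=%b new=%b%n",
                c[0], c[1], oldCheck(c[1]), newCheck(c[0], c[1]));
        }
    }
}
```

The two checks only disagree when numEntriesHashTable is 0 while numEntries is not a multiple of NUMROWSESTIMATESIZE, so the patch only matters if that state is reachable.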

Also - I couldn't understand this code fragment (quoting the old code since it 
doesn't matter):

      Field[] fArr = agg.getFields();
      for (Field f : fArr) 
        fixedRowSize += getSize(i, agg, f);

The getSize() call doesn't even look at the Field - it seems to base its 
decision on the class type of agg - and I think it will default to unknowntype 
(which is a whopping 256 bytes)?
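
A small sketch of what I mean, assuming my reading is right. Everything here is a hypothetical stand-in (the SumAgg class, the getSize() body, the fieldAwareSize() alternative); only the loop shape and the 256-byte default come from the code quoted above.

```java
import java.lang.reflect.Field;

// Hypothetical stand-in for the size estimate; not the real Hive code.
class FixedRowSizeSketch {
    // The 256-byte default for an unknown type, as mentioned above.
    static final int UNKNOWN_TYPE_SIZE = 256;

    // Made-up aggregation class with two public fields.
    static class SumAgg {
        public long sum;
        public boolean empty;
    }

    // Mirrors the behavior described above: the Field argument is ignored
    // and the size is chosen from the class of the aggregation only.
    static int getSize(int pos, Class<?> agg, Field f) {
        if (agg == Long.class) return 8;
        if (agg == Integer.class) return 4;
        return UNKNOWN_TYPE_SIZE;
    }

    // Same loop shape as the quoted fragment.
    static int fixedRowSize(Class<?> agg) {
        int size = 0;
        for (Field f : agg.getFields()) {
            size += getSize(0, agg, f);
        }
        return size;
    }

    // Assumed alternative that looks at each field's declared type instead.
    static int fieldAwareSize(Class<?> agg) {
        int size = 0;
        for (Field f : agg.getFields()) {
            Class<?> t = f.getType();
            if (t == long.class) size += 8;
            else if (t == int.class) size += 4;
            else if (t == boolean.class) size += 1;
            else size += UNKNOWN_TYPE_SIZE;
        }
        return size;
    }

    public static void main(String[] args) {
        System.out.println("class-based estimate: " + fixedRowSize(SumAgg.class));
        System.out.println("field-based estimate: " + fieldAwareSize(SumAgg.class));
    }
}
```

With the two-field stand-in class the class-based loop adds 256 per field (512 total), while the field-based variant gives 9, which is the kind of gap I am asking about.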

(Sorry - unrelated to this bug - just generally curious since I'm looking at 
this code in detail for the first time.)



> map side aggregation : number of output rows is same as number of input rows
> ----------------------------------------------------------------------------
>
>                 Key: HIVE-256
>                 URL: https://issues.apache.org/jira/browse/HIVE-256
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Namit Jain
>             Fix For: 0.2.0
>
>         Attachments: patch-256.1.txt
>
>
> map side aggregation : number of output rows is same as number of input rows

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
