[ 
https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066196#comment-13066196
 ] 

jirapos...@reviews.apache.org commented on HIVE-2282:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1132/
-----------------------------------------------------------

(Updated 2011-07-15 20:48:38.625544)


Review request for hive and Siying Dong.


Changes
-------

I added comments to the estimateSampledInputSize function.  The function sets 
the input size even when there is no sampling, so callers do not need separate 
cases for an estimated input size versus an actual input size.  Instead, they 
can simply call the function (a boolean flag ensures it only does significant 
work on the first call) and the input size will be set to the appropriate 
value.  The size is only estimated when sampling is used; otherwise the actual 
input size is set.
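
The "run once, guarded by a boolean flag" behaviour described above could be 
sketched roughly as follows.  This is only an illustration, not the code in 
the patch: the class name, the fields, and the percentage-based estimate are 
all assumptions made for the example.

```java
// Hypothetical sketch; names and the estimation formula are assumptions,
// not the actual SemanticAnalyzer / MapRedTask code from the patch.
final class InputSizeEstimator {
    private final long actualInputSize;   // total input size in bytes
    private final Double samplePercent;   // null when no block sampling is used
    private boolean inputSizeEstimated = false;
    private long inputSize;

    InputSizeEstimator(long actualInputSize, Double samplePercent) {
        this.actualInputSize = actualInputSize;
        this.samplePercent = samplePercent;
    }

    // Safe to call from anywhere: only the first call does real work.
    void estimateSampledInputSize() {
        if (inputSizeEstimated) {
            return; // already set, nothing to do
        }
        if (samplePercent == null) {
            // No sampling: the "estimate" is just the actual size.
            inputSize = actualInputSize;
        } else {
            // Sampling: scale the actual size by the sampled percentage.
            inputSize = (long) (actualInputSize * samplePercent / 100.0);
        }
        inputSizeEstimated = true;
    }

    // Callers do not need to distinguish the sampled and unsampled cases.
    long getInputSize() {
        estimateSampledInputSize();
        return inputSize;
    }
}
```

With this shape, any caller can ask for the input size without checking 
whether sampling is in effect.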

I also added the header to VerifyIsLocalModeHook.java.


Summary
-------

A query should run in local mode when block sampling is used and the sample is 
small enough.  The size of the sample is estimated in the same way it is 
already estimated when determining the number of reducers.
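
As a rough illustration of the decision described above, the check could 
compare the estimated sample size against a size limit.  The class, method, 
and parameter names below are hypothetical, and the limit is a plain argument 
rather than a value read from any Hive configuration.

```java
// Hypothetical sketch of the local-mode decision; not the patch's code.
final class LocalModeDecision {
    private LocalModeDecision() {}

    /**
     * @param estimatedInputSize  estimated (sampled) input size in bytes
     * @param maxLocalInputBytes  assumed upper bound for running locally
     * @return true when the sampled input is small enough for local mode
     */
    static boolean shouldRunLocally(long estimatedInputSize, long maxLocalInputBytes) {
        return estimatedInputSize <= maxLocalInputBytes;
    }
}
```

The point is that the comparison uses the estimated size of the sample rather 
than the size of the full table, so a large table sampled down to a small set 
can still qualify.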


This addresses bug HIVE-2282.
    https://issues.apache.org/jira/browse/HIVE-2282


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java cd3de76 
  ql/src/test/org/apache/hadoop/hive/ql/hooks/VerifyIsLocalModeHook.java PRE-CREATION 
  ql/src/test/queries/clientpositive/sample_islocalmode_hook.q PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java 53769a0 

Diff: https://reviews.apache.org/r/1132/diff


Testing
-------

TestCliDriver, TestNegativeCliDriver, manually tested


Thanks,

Kevin



> Local mode needs to work well with block sampling
> -------------------------------------------------
>
>                 Key: HIVE-2282
>                 URL: https://issues.apache.org/jira/browse/HIVE-2282
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Siying Dong
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-2282.1.patch.txt
>
>
> Currently, if block sampling is enabled and a large set of data is sampled 
> down to a small set, local mode needs to kick in. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
