[ 
https://issues.apache.org/jira/browse/HIVE-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13558440#comment-13558440
 ] 

Phabricator commented on HIVE-2780:
-----------------------------------

navis has commented on the revision "HIVE-2780 [jira] Implement more 
restrictive table sampler".

INLINE COMMENTS
  ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java:489 ok.
  ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java:583 ok.
  ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java:657 As I 
recall, this code was copied from CombineHiveInputFormat. I'll check that.
  ql/src/java/org/apache/hadoop/hive/ql/io/SplitSampler.java:34 ok.
  ql/src/test/results/clientpositive/split_sample_sampler.q.out:27 The original 
implementation provided split-level granularity; the purpose of this patch 
is to make it finer (per row). This means the underlying files should be 
splittable, as you pointed out previously.
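
To illustrate the difference in granularity, here is a minimal sketch. It is 
not the SplitSampler code from the patch: the class name, method names, and the 
idea of expressing the limit as a row count are assumptions made only for this 
example. A split is modeled as a plain list of rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names come from the patch.
public class SamplingGranularitySketch {

    // Split-level granularity: once a split is selected, every row in it is read,
    // so a small table held in a single split is returned in full.
    public static List<String> sampleWholeSplit(List<String> splitRows) {
        return new ArrayList<>(splitRows);
    }

    // Row-level granularity: stop reading once rowLimit rows have been taken,
    // even if the split contains more rows.
    public static List<String> sampleRows(List<String> splitRows, int rowLimit) {
        List<String> sampled = new ArrayList<>();
        for (String row : splitRows) {
            if (sampled.size() >= rowLimit) {
                break;
            }
            sampled.add(row);
        }
        return sampled;
    }

    public static void main(String[] args) {
        List<String> split = Arrays.asList("r1", "r2", "r3", "r4", "r5");
        System.out.println("split-level rows: " + sampleWholeSplit(split).size());
        System.out.println("row-level rows:   " + sampleRows(split, 2).size());
    }
}
```

With split-level sampling the smallest unit that can be returned is a whole 
split, which is why small tables yield more rows than requested. The row-level 
variant can stop partway through a split.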

REVISION DETAIL
  https://reviews.facebook.net/D1623

BRANCH
  DPAL-722

To: JIRA, ashutoshc, navis

                
> Implement more restrictive table sampler
> ----------------------------------------
>
>                 Key: HIVE-2780
>                 URL: https://issues.apache.org/jira/browse/HIVE-2780
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Trivial
>         Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2780.D1623.1.patch, 
> ASF.LICENSE.NOT.GRANTED--HIVE-2780.D1623.2.patch, HIVE-2780.D1623.3.patch
>
>
> Current table sampling scans a whole block, so more rows are included than 
> expected, especially for small tables.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
