[ https://issues.apache.org/jira/browse/TEZ-1081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977740#comment-13977740 ]

Sergey Shelukhin commented on TEZ-1081:
---------------------------------------

[~sseth] [~t3rmin4t0r] fyi

> expose some basic statistics from org.apache.tez.runtime.api.Input (or 
> similar)
> -------------------------------------------------------------------------------
>
>                 Key: TEZ-1081
>                 URL: https://issues.apache.org/jira/browse/TEZ-1081
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Sergey Shelukhin
>
> Hive loads data from org.apache.tez.runtime.api.Input into mapjoin 
> hashtables. It would be useful to know in advance:
> 1) How many rows there are in the input (should be easy to add).
> 2) How many unique keys there are (even an approximation would do).
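
For illustration only, the two statistics requested above could be gathered in a single pass over the keys. The sketch below is an assumption about what such a helper might look like, not part of the Tez API: the class InputStatsSketch and its methods are hypothetical names, and the distinct-key estimate uses linear counting (a bitmap-based approximation), which is one possible technique among several.

```java
import java.util.BitSet;

/**
 * Hypothetical helper (not a Tez class): counts rows exactly and
 * estimates the number of distinct keys with linear counting.
 */
class InputStatsSketch {
    private final int numBits;   // bitmap size m; larger m gives a better estimate
    private final BitSet bits;
    private long rowCount = 0;

    InputStatsSketch(int numBits) {
        this.numBits = numBits;
        this.bits = new BitSet(numBits);
    }

    /** Record one row, identified by its join key. */
    void add(Object key) {
        rowCount++;
        bits.set(Math.floorMod(mix(key.hashCode()), numBits));
    }

    /** Exact number of rows seen so far. */
    long rowCount() {
        return rowCount;
    }

    /** Linear counting estimate: -m * ln(V), where V is the fraction of bits still zero. */
    double estimateDistinctKeys() {
        int zeros = numBits - bits.cardinality();
        if (zeros == 0) {
            // Bitmap is saturated; the estimate is unbounded for this bitmap size.
            return Double.POSITIVE_INFINITY;
        }
        return -numBits * Math.log((double) zeros / numBits);
    }

    /** Bit-mixing step (the 32-bit MurmurHash3 finalizer) to spread hashCode values. */
    private static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }
}
```

A consumer such as a hashtable loader could call add(key) for each row and then use rowCount() and estimateDistinctKeys() to size the table. The bitmap should be sized well above the expected number of distinct keys, since the estimate degrades as the bitmap fills up.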



--
This message was sent by Atlassian JIRA
(v6.2#6252)
