Hi Rob,
I am not a developer, but I can tell you that to generate such statistics, we 
had an intern work in Spark all last summer. So I don’t think it is built 
into Hive.
Hope this helps you,
~Nikki

From: Robert Grandl <rgra...@yahoo.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>, Robert Grandl 
<rgra...@yahoo.com>
Date: Wednesday, December 21, 2016 at 9:04 AM
To: User <user@hive.apache.org>, Dev <d...@hive.apache.org>
Subject: Re: Hive statistics

Hi guys,

I am wondering: is there any other mailing list for Hive-related questions?

I feel there is not much activity on the Hive user/dev mailing lists, or at 
least not much support in answering my questions.

Thanks,
Robert

On Tuesday, December 20, 2016 11:01 PM, Robert Grandl <rgra...@yahoo.com> wrote:

Hi guys,

I am wondering if it's possible to estimate the number of distinct keys and 
their distribution in one way or another.

More concretely, for every stage, is it possible to determine the number of 
distinct keys and, for each key, the number of values before the data is 
actually processed?
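
To make the question concrete, here is a minimal sketch in plain Python of the 
statistic I mean, computed on toy data after the fact. The function name 
key_distribution and the sample pairs are hypothetical and are not part of 
Hive; what I am after is an estimate of the same numbers before the stage runs.

```python
from collections import Counter

def key_distribution(pairs):
    """Return (number of distinct keys, Counter mapping key -> number of values)."""
    # Count how many values arrive for each key.
    counts = Counter(key for key, _ in pairs)
    return len(counts), counts

# Hypothetical (key, value) pairs as they might enter a single stage.
sample = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("a", 5)]
distinct, per_key = key_distribution(sample)
print(distinct)        # 3
print(dict(per_key))   # {'a': 3, 'b': 1, 'c': 1}
```

This exact count needs a full pass over the data, which is why an estimate 
available ahead of time would be useful.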

Thanks,
Robert


