Hi Rob,

I am not a developer, but I can tell you that to generate such statistics, we had an intern work in Spark all last summer. So I don’t think it is built into Hive.

Hope this helps you,
~Nikki

From: Robert Grandl <rgra...@yahoo.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>, Robert Grandl <rgra...@yahoo.com>
Date: Wednesday, December 21, 2016 at 9:04 AM
To: User <user@hive.apache.org>, Dev <d...@hive.apache.org>
Subject: Re: Hive statistics

Hi guys,

I am wondering: is there any other mailing list for Hive-related questions? I feel there is not much activity on the user/dev Hive mailing lists, or at least not much support in answering my questions.

Thanks,
Robert

On Tuesday, December 20, 2016 11:01 PM, Robert Grandl <rgra...@yahoo.com> wrote:

Hi guys,

I am wondering whether it is possible to estimate the number of distinct keys and their distribution in one way or another. More concretely, for every stage, is it possible to determine the number of distinct keys, and for each key the number of values, before the data is actually processed?

Thanks,
Robert

________________________________

This e-mail message is authorized for use by the intended recipient only and may contain information that is privileged and confidential. If you received this message in error, please call us immediately at (425) 590-5000 and ask to speak to the message sender. Please do not copy, disseminate, or retain this message unless you are the intended recipient. In addition, to ensure the security of your data, please do not send any unencrypted credit card or personally identifiable information to this email address. Thank you.