Re: Distinct clause in hive

2017-02-27 Thread Pushkar.Gujar
Hi Rakesh,

How big are your files? And is the data ordered/sorted by the column on
which you are running distinct? If the column contains empty strings, nulls,
and spaces, Hive treats all of these as different values. Converting them to
Hive's native NULL type can help improve performance.
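
As a rough sketch of that conversion (my_table and col are placeholder
names, not from your setup), something along these lines maps empty and
whitespace-only strings to NULL before the distinct is applied. Whether it
actually speeds things up will depend on your data:

```sql
-- Hypothetical names: my_table and col are placeholders for illustration.
-- Empty and whitespace-only strings are mapped to NULL so that DISTINCT
-- sees a single "missing" value instead of several variants.
-- A NULL input stays NULL: TRIM(NULL) = '' is not true, so ELSE returns col.
SELECT DISTINCT
  CASE WHEN TRIM(col) = '' THEN NULL ELSE col END AS col_normalized
FROM my_table;
```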


Thank you,
*Pushkar Gujar*


On Sun, Feb 26, 2017 at 11:56 PM, rakesh sharma 
wrote:

> When using distinct in a Hive query it runs for hours; otherwise it runs
> for less than a minute. How can I optimise this query?
>
> Thanks


Distinct clause in hive

2017-02-26 Thread rakesh sharma
When using distinct in a Hive query it runs for hours; otherwise it runs for
less than a minute. How can I optimise this query?

Thanks