Hi Rakesh,

How big are your files? And is the data ordered/sorted by the column on which
you are running distinct? If the column contains empty strings, nulls and
spaces, Hive treats them all as different values. Converting them to Hive's
native null type can help improve performance.
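
As a rough sketch of that normalization (not tested against your data): the
query below maps empty and whitespace-only strings to NULL before applying
distinct. The table name `events` and column name `city` are made up for
illustration, and the Python wrapper uses SQLite purely as a stand-in so the
example can be run locally; only the SELECT statement is meant to carry over
to Hive, which as far as I know also supports TRIM and CASE WHEN.

```python
import sqlite3

# Hypothetical table `events` with a string column `city` (both names are
# assumptions for illustration). SQLite stands in for Hive here only so the
# sketch is runnable; the SELECT itself is the part intended for Hive.
NORMALIZED_DISTINCT = """
SELECT DISTINCT
       CASE WHEN TRIM(city) = '' THEN NULL ELSE city END AS city
FROM events
"""


def distinct_cities(rows):
    """Return (raw distinct count, list of normalized distinct values)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (city TEXT)")
    conn.executemany("INSERT INTO events VALUES (?)", [(r,) for r in rows])

    # Distinct on the raw column: '', '   ' and NULL each count separately.
    raw = conn.execute(
        "SELECT COUNT(*) FROM (SELECT DISTINCT city FROM events)"
    ).fetchone()[0]

    # Distinct after folding empty and whitespace-only strings into NULL.
    normalized = [row[0] for row in conn.execute(NORMALIZED_DISTINCT)]
    conn.close()
    return raw, normalized


raw_count, normalized = distinct_cities(["Pune", "", "   ", None, "Pune"])
print(raw_count, normalized)
```

In this toy data the four raw distinct values ('Pune', empty string, spaces,
NULL) collapse to two once the blanks are folded into NULL. Whether that
helps your query depends on how many such variants the real column holds.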


Thank you,
*Pushkar Gujar*


On Sun, Feb 26, 2017 at 11:56 PM, rakesh sharma <rakeshsharm...@hotmail.com>
wrote:

> When using distinct in a Hive query it runs for hours; otherwise it runs
> for less than a minute. How can I optimise this query?
>
> Thanks
>
>
>
