There is no sampling for order by in Hive. With the MR execution engine,
Hive uses a single reducer for order by, which is why only one MR job runs.

Hive on Spark handles this differently, though.
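
To make the contrast concrete, here is a rough Python sketch of the two
strategies being discussed: sorting everything in a single reducer, versus
the two-step approach from the question (a sampling pass to pick split
points, then a range repartition). This is not Hive code. The function names
single_reducer_sort and sampled_range_sort and the parameters sample_size
and seed are made up for illustration only.

```python
import bisect
import random


def single_reducer_sort(rows):
    # Every row is sent to one reducer, which sorts the full input.
    # No sampling is needed because there is only one partition.
    return sorted(rows)


def sampled_range_sort(rows, num_reducers, sample_size=100, seed=0):
    # Step 1 (sampling pass): draw a random sample and pick
    # num_reducers - 1 split points from it.
    rng = random.Random(seed)
    sample = sorted(rng.sample(rows, min(sample_size, len(rows))))
    splits = []
    if sample:
        step = len(sample) / num_reducers
        splits = [sample[int(step * i)] for i in range(1, num_reducers)]

    # Step 2 (repartition pass): route each row to the reducer that owns
    # its key range, then sort each partition independently.
    partitions = [[] for _ in range(len(splits) + 1)]
    for row in rows:
        partitions[bisect.bisect_right(splits, row)].append(row)
    return [sorted(p) for p in partitions]


if __name__ == "__main__":
    data = [random.randint(0, 1000) for _ in range(50)]
    parts = sampled_range_sort(data, num_reducers=4)
    # Reading the partitions in order gives the same result as one big sort.
    flat = [x for p in parts for x in p]
    print(flat == single_reducer_sort(data))
```

The single-reducer version is simpler but does not scale with the data
size; the sampled version spreads the sort across reducers at the cost of an
extra pass over the data.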

Thanks,
Xuefu

On Mon, Mar 2, 2015 at 2:17 AM, Jeff Zhang <[email protected]> wrote:

> Order by usually involves 2 steps (a sampling job and a repartition job),
> but Hive only runs one MR job for order by, so I'm wondering when and where
> Hive does the sampling. On the client side?
>
>
> --
> Best Regards
>
> Jeff Zhang
>
