Re: Hive Hash in Spark

Reynold Xin Wed, 06 Mar 2019 13:58:07 -0800

I think they might be used in bucketing? Not 100% sure.

On Wed, Mar 06, 2019 at 1:40 PM, < [email protected] > wrote:


> Hi,
>
> I noticed the existence of a Hive Hash partitioning implementation in
> Spark, but also noticed that it’s not being used, and that the Spark hash
> partitioning function is presently hardcoded to Murmur3. My question is
> whether Hive Hash is dead code, or are there future plans to support
> reading and understanding data that has been partitioned using Hive Hash?
> By understanding, I mean being able to avoid a full shuffle join on
> Table A (partitioned by Hive Hash) when joining with a Table B that I can
> shuffle via Hive Hash to match Table A.
>
> Thank you,
>
> Tyson
