On Thu, Mar 7, 2019 at 12:58 PM wrote:
Thanks Ryan and Reynold for the information!
Cheers,
Tyson
From: Ryan Blue
Sent: Wednesday, March 6, 2019 3:47 PM
To: Reynold Xin
Cc: tcon...@gmail.com; Spark Dev List
Subject: Re: Hive Hash in Spark
I think this was needed to add support for bucketed Hive tables. Like Tyson
noted, if the other side of a join can be bucketed the same way, then Spark
can use a bucketed join. I have long-term plans to support this in the
DataSourceV2 API, but I don't think we are very close to implementing it
yet.
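
To make the bucketing point concrete, here is a minimal sketch of why both sides of a join must use the same hash function before a bucketed join is possible. Everything in it is an assumption made for illustration: `identity_hash` and `mixing_hash` are hypothetical stand-ins, not Hive Hash or Spark's Murmur3, and `bucket_id`, `bucketize`, and `co_located` are made-up helper names rather than Spark APIs.

```python
# Illustrative only: neither hash below is Hive's or Spark's real implementation.

def identity_hash(key: int) -> int:
    """Hypothetical stand-in for one engine's hash: returns the key unchanged."""
    return key


def mixing_hash(key: int) -> int:
    """Hypothetical stand-in for a different hash (NOT Murmur3).

    Multiplies by a 32-bit constant and keeps the high 16 bits.
    """
    return ((key * 2654435761) & 0xFFFFFFFF) >> 16


def bucket_id(key: int, num_buckets: int, hash_fn) -> int:
    # Clear the sign bit, then take the hash modulo the bucket count.
    return (hash_fn(key) & 0x7FFFFFFF) % num_buckets


def bucketize(keys, num_buckets: int, hash_fn):
    """Distribute keys into num_buckets lists using hash_fn."""
    buckets = [[] for _ in range(num_buckets)]
    for k in keys:
        buckets[bucket_id(k, num_buckets, hash_fn)].append(k)
    return buckets


def co_located(left, right) -> bool:
    """True if every key present on both sides has the same bucket index."""
    def index(buckets):
        return {k: i for i, b in enumerate(buckets) for k in b}

    l, r = index(left), index(right)
    return all(l[k] == r[k] for k in l.keys() & r.keys())


if __name__ == "__main__":
    keys = list(range(100))
    same = co_located(bucketize(keys, 8, identity_hash),
                      bucketize(keys, 8, identity_hash))
    mixed = co_located(bucketize(keys, 8, identity_hash),
                       bucketize(keys, 8, mixing_hash))
    print("same hash on both sides:", same)
    print("different hashes:", mixed)
```

When both sides share a hash function, bucket i on one side only needs to be joined with bucket i on the other, so no shuffle is required. With mismatched functions, matching keys can land in different buckets and that shortcut no longer holds, which is presumably why reading Hive-bucketed tables would need a Hive-compatible hash.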
I think they might be used in bucketing? Not 100% sure.
On Wed, Mar 06, 2019 at 1:40 PM, < tcon...@gmail.com > wrote:
Hi,
I noticed the existence of a Hive Hash partitioning implementation in Spark,
but also noticed that it's not being used, and that the Spark hash
partitioning function is presently hardcoded to Murmur3. My question is
whether Hive Hash is dead code or whether there are future plans to support reading