True. The spec does not mandate that the bucket files be there if they
are empty (missing directories are 0-row tables).
Thanks,
Edward
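
To make Edward's point concrete, here is a minimal sketch of a reader that tolerates absent bucket files. It is illustrative only, not Hive's or Presto's actual code. The helper name files_by_bucket is hypothetical, and it assumes bucket files are named with a zero-padded bucket id prefix such as 000003_0.

```python
import os
import re
import tempfile

# Assumption: bucket files start with a zero-padded bucket id, e.g. "000003_0".
BUCKET_FILE = re.compile(r"^(\d+)_\d+")


def files_by_bucket(table_dir, num_buckets):
    """Map each bucket id to its files (hypothetical helper).

    A bucket with no file maps to an empty list, and a missing
    directory is treated as a 0-row table rather than an error.
    """
    buckets = {b: [] for b in range(num_buckets)}
    if not os.path.isdir(table_dir):
        return buckets  # missing directory: every bucket is empty
    for name in sorted(os.listdir(table_dir)):
        match = BUCKET_FILE.match(name)
        if match and int(match.group(1)) in buckets:
            buckets[int(match.group(1))].append(name)
    return buckets


# Usage: a 4-bucket table where only buckets 0 and 2 received rows.
with tempfile.TemporaryDirectory() as table_dir:
    for name in ("000000_0", "000002_0"):
        open(os.path.join(table_dir, name), "w").close()
    layout = files_by_bucket(table_dir, num_buckets=4)

print(layout)
```

Buckets 1 and 3 come back as empty lists instead of raising, which is the behaviour a reader needs if it trusts the declared bucket count rather than the number of files present.
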
On Tue, Apr 3, 2018 at 4:42 PM, Richard A. Bross wrote:
> Gopal,
>
> The Presto devs say they are willing to make the changes to
Gopal,
The Presto devs say they are willing to make the changes to adhere to the Hive
bucket spec. I quoted
"Presto could fix their fail-safe for bucketing implementation to actually
trust the Hive bucketing spec & get you out of this mess - the bucketing
contract for Hive is actual file
Gopal,
Thanks for this. Great information and something to look at more closely to
better understand the internals.
Rick
- Original Message -
From: "Gopal Vijayaraghavan"
To: user@hive.apache.org
Sent: Tuesday, April 3, 2018 3:15:46 AM
Subject: Re: Hive, Tez,
>* I'm interested in your statement that CLUSTERED BY does not CLUSTER BY.
> My understanding was that this was related to the number of buckets, but you
> are relating it to ORC stripes. It is odd that no examples that I've seen
> include the SORTED BY statement other than in relation to