Hi Cody,

There are currently no concrete plans for adding buckets to Spark SQL, but
that's mostly due to a lack of resources and demand for this feature.  Adding
full support is probably a fair amount of work since you'd have to make
changes throughout parsing/optimization/execution.  That said, there are
probably some smaller tasks that could be easier (for example, you might be
able to avoid a shuffle when doing joins on tables that are already
bucketed by exposing more metastore information to the planner).
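
To illustrate the shuffle-avoidance idea, here is a minimal sketch in plain Python (not Spark or Hive code). It assumes both tables were bucketed on the join key with the same bucket count and the same bucketing function, so bucket i of one table only ever needs to meet bucket i of the other. The names `bucket_id`, `write_bucketed`, and `bucketed_join` are hypothetical, and `hash(key) % n` is only a stand-in for Hive's real bucketing hash.

```python
from collections import defaultdict

NUM_BUCKETS = 4  # assumption: both tables use the same bucket count


def bucket_id(key, num_buckets=NUM_BUCKETS):
    # Stand-in bucketing function; Hive's actual hash differs.
    return hash(key) % num_buckets


def write_bucketed(rows, num_buckets=NUM_BUCKETS):
    """Group (key, value) rows into buckets by key, as a bucketed table stores them."""
    buckets = defaultdict(list)
    for key, value in rows:
        buckets[bucket_id(key, num_buckets)].append((key, value))
    return buckets


def bucketed_join(left, right, num_buckets=NUM_BUCKETS):
    """Inner join that pairs bucket i of left only with bucket i of right.

    Matching keys always land in the same bucket number on both sides,
    so no repartitioning (shuffle) of either table is required.
    """
    out = []
    for b in range(num_buckets):
        # Build a per-bucket lookup from the right side.
        lookup = defaultdict(list)
        for key, value in right.get(b, []):
            lookup[key].append(value)
        # Probe it with the matching bucket from the left side.
        for key, value in left.get(b, []):
            for other in lookup.get(key, []):
                out.append((key, value, other))
    return out


users = write_bucketed([(1, "a"), (2, "b"), (3, "c")])
orders = write_bucketed([(1, "x"), (3, "y"), (3, "z"), (5, "w")])
joined = sorted(bucketed_join(users, orders))
```

In a real planner the per-bucket work would run as independent tasks. The point of exposing the metastore's bucketing information is that the planner could recognize this layout and skip the exchange step.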

Michael

On Sun, Sep 14, 2014 at 3:10 PM, Cody Koeninger <c...@koeninger.org> wrote:

> I noticed that the release notes for 1.1.0 said that Spark doesn't support
> Hive buckets "yet".  I didn't notice any JIRA issues related to adding
> support.
>
> Broadly speaking, what would be involved in supporting buckets, especially
> the bucketmapjoin and sortedmerge optimizations?
>
