Hi Cody,

There are currently no concrete plans for adding buckets to Spark SQL, but that's mostly due to a lack of resources / demand for this feature. Adding full support is probably a fair amount of work, since you'd have to make changes throughout parsing/optimization/execution.

That said, there are probably some smaller tasks that could be easier (for example, you might be able to avoid a shuffle when doing joins on tables that are already bucketed by exposing more metastore information to the planner).
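To sketch why bucketing lets the planner skip the shuffle (this is just an illustration in plain Python, not Spark code; all names here are invented for the example): if both tables were written with the same hash function and bucket count, matching join keys are guaranteed to land in buckets with the same index, so corresponding buckets can be joined directly.

```python
# Hypothetical sketch of a bucket map join. Because both sides are
# bucketized with the same hash function and bucket count, rows with
# equal keys always share a bucket index -- no repartitioning needed.

def bucketize(rows, key, num_buckets):
    """Partition rows into num_buckets lists by hashing the join key,
    mimicking Hive's CLUSTERED BY ... INTO n BUCKETS layout."""
    buckets = [[] for _ in range(num_buckets)]
    for row in rows:
        buckets[hash(row[key]) % num_buckets].append(row)
    return buckets

def bucket_map_join(left_buckets, right_buckets, key):
    """Join only corresponding bucket pairs, hashing the smaller side
    of each pair in memory (the 'map join' part)."""
    out = []
    for lb, rb in zip(left_buckets, right_buckets):
        index = {}
        for r in rb:
            index.setdefault(r[key], []).append(r)
        for l in lb:
            for r in index.get(l[key], []):
                out.append({**l, **r})
    return out

users = [{"id": i, "name": f"u{i}"} for i in range(6)]
orders = [{"id": i % 6, "amount": 10 * i} for i in range(12)]
joined = bucket_map_join(bucketize(users, "id", 4),
                         bucketize(orders, "id", 4), "id")
```

In Spark terms, the planner would need the metastore to tell it that both relations are bucketed on the join key with compatible bucket counts before it could safely elide the exchange.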
Michael

On Sun, Sep 14, 2014 at 3:10 PM, Cody Koeninger <c...@koeninger.org> wrote:
> I noticed that the release notes for 1.1.0 said that spark doesn't support
> Hive buckets "yet". I didn't notice any jira issues related to adding
> support.
>
> Broadly speaking, what would be involved in supporting buckets, especially
> the bucketmapjoin and sortedmerge optimizations?
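For the sortedmerge part of the question, the idea is that if each pair of co-located buckets is also sorted on the join key (Hive's SORTED BY), the per-bucket join can stream both sides with two cursors instead of building a hash table. A minimal sketch, again in plain Python with invented names:

```python
# Hypothetical sketch of a sort-merge join over two inputs that are
# already sorted on the join key, as sorted buckets would be.

def sort_merge_join(left, right, key):
    """Merge-join two lists of dicts, both pre-sorted on `key`."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Collect the run of equal keys on each side, then emit
            # their cross product.
            i2 = i
            while i2 < len(left) and left[i2][key] == lk:
                i2 += 1
            j2 = j
            while j2 < len(right) and right[j2][key] == lk:
                j2 += 1
            for l in left[i:i2]:
                for r in right[j:j2]:
                    out.append({**l, **r})
            i, j = i2, j2
    return out

a = [{"k": 1, "a": "x"}, {"k": 2, "a": "y"}, {"k": 2, "a": "z"}]
b = [{"k": 2, "b": 1}, {"k": 3, "b": 2}]
result = sort_merge_join(a, b, "k")
```

Neither side is ever fully materialized in memory, which is what makes the sorted-bucket variant attractive for large tables.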