Yes Jacky, we will refactor it and use the partition flow.
On 9 February 2018 at 13:44, Jacky Li <13561...@qq.com> wrote:
> Hi Ravindra,
> You mean we can do one round of refactoring for the bucketed table feature in
> CarbonData 1.4?
> I am fine with it.
> > On 9 February 2018 at 3:49 PM, Ravindra Pesala <ravi.pes...@gmail.com> wrote:
> > Hi Likun,
> > I feel it is better to change the implementation to use Spark's bucketing
> > generation, just like how standard Hive partitions are generated. It will be
> > easy to change after the partition feature is implemented. It is a
> > useful feature for joining big tables, and hash-based buckets make
> > queries faster. So it is better to change the
> > implementation instead of removing it.
> > Regards,
> > Ravindra.
> > On 9 February 2018 at 13:14, Jacky Li <jacky.li...@qq.com> wrote:
> >> Hi,
> >> One year ago, CarbonData 1.0.0 introduced the bucket table feature. It was
> >> expected to improve join performance by avoiding shuffling if both tables
> >> are bucketed on the same column with the same number of buckets.
> >> However, after this feature was introduced, personally speaking, it was
> >> not widely used in the community, and it creates maintenance overhead for the
> >> developers in the community (for every new pull request, all bucket
> >> test cases need to be fixed).
> >> And now that Carbon has integrated with Spark standard partitions, developers
> >> can add bucket support using Spark's bucketed table feature in future if it
> >> is required.
> >> So, I propose removing the bucket feature after the CarbonData 1.3.0 release.
> >> What do you think?
> >> Regards,
> >> Jacky
> > --
> > Thanks & Regards,
> > Ravi
Thanks & Regards,