Hi Ravindra,

You mean we can do one round of refactoring for the bucketed table feature in CarbonData 1.4? I am fine with it.

Regards,
Jacky

> On 9 February 2018, at 3:49 PM, Ravindra Pesala <ravi.pes...@gmail.com> wrote:
>
> Hi Likun,
>
> I feel it is better to change the implementation to use Spark's bucketing
> generation, just like how standard Hive partitions are generated. It will be
> easy to change once the partition feature is implemented. It is also a
> useful feature for joining big tables: hash-based buckets and clustered by
> make those queries faster. So it is better to change the
> implementation instead of removing it.
>
> Regards,
> Ravindra.
>
> On 9 February 2018 at 13:14, Jacky Li <jacky.li...@qq.com> wrote:
>
>> Hi,
>>
>> One year ago, CarbonData 1.0.0 introduced the bucketed table feature. It was
>> expected to improve join performance by avoiding shuffling when both tables
>> are bucketed on the same column with the same number of buckets.
>>
>> However, since this feature was introduced, in my personal view it has not
>> been widely used in the community, and it creates maintenance overhead for
>> the developers in the community (for every new pull request, all
>> bucket-related test cases need to be fixed).
>>
>> Now that Carbon has integrated with Spark's standard partitioning,
>> developers can add bucket support using Spark's bucketed table feature in
>> the future if it is required.
>>
>> So, I propose to remove the bucket feature after the CarbonData 1.3.0
>> release. What do you think?
>>
>> Regards,
>> Jacky
>>
>>
>
>
> --
> Thanks & Regards,
> Ravi