Hi,

One year ago, CarbonData 1.0.0 introduced the bucket table feature. It was 
expected to improve join performance by avoiding shuffling when both tables 
are bucketed on the same column with the same number of buckets. 

However, since this feature was introduced, in my personal view it has not 
been widely used in the community, and it creates maintenance overhead for 
developers (for every new pull request, all bucket-related test cases need 
to be fixed).

Now that CarbonData has integrated with Spark's standard partition feature, 
developers can add bucket support using Spark's bucketed table feature in the 
future if it is required.

So, I propose to remove the bucket feature after the CarbonData 1.3.0 release.
What do you think?

Regards,
Jacky
