Hi All,

While implementing a new encoding feature for CarbonData, I found it is hard to
maintain both read and write backward compatibility with all CarbonData formats,
including V1, V2, and V3.

In this post, I want to discuss the roadmap for backward compatibility support. 

I am proposing the following feature plan:
1. For write support: starting from CarbonData 1.2, support writing the V3
format only.
The V3 format was introduced in CarbonData 1.1 (Feb 2017) and has been stable
for more than half a year now. Since we are going to add new features to the
V3 format only, it is better to clean up the writing path so that it handles V3
only. If there are bugs in the V1 and V2 formats, we will still fix them in
maintenance versions before CarbonData 1.1.

2. For read support, there are two options.
Option 1: Keep supporting reading of the V1 and V2 formats and, in CarbonData
1.3, build a data migration tool to help users migrate old carbon stores. Stop
supporting reading V1 and V2 after CarbonData 1.3.
The pro is that users who still have V1 or V2 carbon data in their applications
can continue to use CarbonData 1.2 to read the old data.
The con is that any new feature introduced for V3 must take care not to break
read compatibility with V1 and V2. For example, some new encodings will be
very hard to introduce.

Option 2: Support reading the V3 format only, starting from CarbonData 1.2.
The pro is that the code will be cleaner, with no restrictions on adding new
encodings.
The con is that any old carbon store based on the V1 or V2 format can be
read using CarbonData 1.1 only.
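
To make the difference between the two options concrete, the plan above can be
written down as a small support matrix. This is only a sketch in Python: the
function names and the (major, minor) release tuples are hypothetical and are
not CarbonData APIs, and it assumes releases from 1.1 onwards, with (1, 4)
standing in for any release after 1.3.

```python
# Illustrative only: names and structure are hypothetical, not CarbonData APIs.
# Releases are (major, minor) tuples; the sketch assumes release >= (1, 1).

ALL_FORMATS = {"V1", "V2", "V3"}


def readable_formats(release, option):
    """Return the formats a release could read under the given option."""
    if option == 1:
        # Option 1: V1 and V2 stay readable up to and including CarbonData 1.3.
        return ALL_FORMATS if release <= (1, 3) else {"V3"}
    if option == 2:
        # Option 2: only V3 is readable from CarbonData 1.2 onwards.
        return ALL_FORMATS if release < (1, 2) else {"V3"}
    raise ValueError(f"unknown option: {option}")


def writable_formats(release):
    """Write side of the proposal: V3 only from CarbonData 1.2 onwards."""
    if release >= (1, 2):
        return {"V3"}
    raise ValueError("releases before 1.2 are not covered by this sketch")


if __name__ == "__main__":
    for release in [(1, 1), (1, 2), (1, 3), (1, 4)]:
        print(release,
              sorted(readable_formats(release, 1)),
              sorted(readable_formats(release, 2)))
```

The two options differ only for CarbonData 1.2 and 1.3: under Option 1 those
releases still read V1 and V2, under Option 2 they do not.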

I want to collect opinions from the community. If there are users still using
the V1 or V2 format, I think it is safer to go with Option 1. Otherwise, if all
users are using the V3 format (CarbonData 1.1 and 1.1.1), I think Option 2 is
the better choice.


Thanks,
Jacky Li
