GitHub user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/2252
  
    To eliminate if-else checks on longString as much as possible:
    + For DataMapSchemaType, I split VARIABLE into VARIABLE_INT and 
VARIABLE_SHORT, which are used for BlockletDataMap;
    + For DimensionStoreType, I split VARIABLELENGTH into VARIABLE_INT_LENGTH 
and VARIABLE_SHORT_LENGTH, which are used for encoding/decoding dimensions;
    + For ColumnPageStatCollector, I split LVStringStatCollector into 
LVShortStringStatCollector and LVLongStringStatCollector, which are used for 
column statistics.
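
    To illustrate the idea behind these splits, here is a minimal Java sketch 
of how carrying the length-prefix width in the type itself removes the need 
for an if-else on longString at each call site. The class, enum, and method 
names are hypothetical and are not the actual CarbonData classes; the sketch 
also assumes a short string fits in a signed 2-byte length.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class LengthPrefixSketch {
  // Hypothetical stand-in for the split store types; NOT the CarbonData enum.
  enum StoreType {
    // Assumption: short strings are at most Short.MAX_VALUE bytes long.
    VARIABLE_SHORT_LENGTH(Short.BYTES) {
      @Override void putLength(ByteBuffer buf, int len) { buf.putShort((short) len); }
      @Override int getLength(ByteBuffer buf) { return buf.getShort(); }
    },
    VARIABLE_INT_LENGTH(Integer.BYTES) {
      @Override void putLength(ByteBuffer buf, int len) { buf.putInt(len); }
      @Override int getLength(ByteBuffer buf) { return buf.getInt(); }
    };

    final int prefixBytes;

    StoreType(int prefixBytes) { this.prefixBytes = prefixBytes; }

    abstract void putLength(ByteBuffer buf, int len);
    abstract int getLength(ByteBuffer buf);
  }

  // Writes a length prefix followed by the UTF-8 bytes; no longString check needed.
  static byte[] encode(StoreType type, String value) {
    byte[] data = value.getBytes(StandardCharsets.UTF_8);
    ByteBuffer buf = ByteBuffer.allocate(type.prefixBytes + data.length);
    type.putLength(buf, data.length);
    buf.put(data);
    return buf.array();
  }

  // Reads the length prefix according to the store type, then the payload.
  static String decode(StoreType type, byte[] encoded) {
    ByteBuffer buf = ByteBuffer.wrap(encoded);
    byte[] data = new byte[type.getLength(buf)];
    buf.get(data);
    return new String(data, StandardCharsets.UTF_8);
  }
}
```

    The caller picks the store type once, and the encode/decode paths stay 
identical for short and long strings.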
    
    While digging into the code, I found that there is no need to add an 
internal datatype such as TEXT.
    + All dimensions are treated as String, so we only need a boolean array 
to indicate whether each one is a long string. We don't need an array holding 
the datatype of every dimension.
    + The procedures for String and Text are nearly the same. A boolean (array) 
and proper abstraction are enough.
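
    As a rough sketch of the boolean-array approach, the Java snippet below 
uses one flag per dimension to choose the length-prefix width, with everything 
else shared between the two cases. The class and method names are 
hypothetical, and the 2-byte/4-byte prefix widths are an assumption drawn 
from the SHORT/INT naming above, not from the CarbonData source.

```java
import java.nio.charset.StandardCharsets;

public class LongStringFlagSketch {
  // Assumed widths: 2-byte prefix for regular strings, 4-byte for long strings.
  static int lengthPrefixBytes(boolean isLongString) {
    return isLongString ? Integer.BYTES : Short.BYTES;
  }

  // Computes the encoded size of one row of string dimensions.
  // isLongString[i] says whether dimension i is a long string column.
  static int encodedRowSize(String[] dims, boolean[] isLongString) {
    int total = 0;
    for (int i = 0; i < dims.length; i++) {
      int payload = dims[i].getBytes(StandardCharsets.UTF_8).length;
      total += lengthPrefixBytes(isLongString[i]) + payload;
    }
    return total;
  }
}
```

    The only place the flag is consulted is the prefix-width lookup, which is 
the kind of abstraction that makes a separate TEXT datatype unnecessary.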

