Github user xuchuanyin commented on the issue:
https://github.com/apache/carbondata/pull/2252
To eliminate if-else branches on longString as much as possible, I made the
following changes:
+ For DataMapSchemaType, I replaced VARIABLE with VARIABLE_INT and
VARIABLE_SHORT, which are used for BlockletDataMap;
+ For DimensionStoreType, I replaced VARIABLELENGTH with VARIABLE_INT_LENGTH
and VARIABLE_SHORT_LENGTH, which are used for encoding/decoding dimensions;
+ For ColumnPageStatCollector, I replaced LVStringStatCollector with
LVShortStringStatCollector and LVLongStringStatCollector, which are used for
column statistics.
While digging deeper into the code, I found that there is no need to add an
internal datatype such as TEXT.
+ All dimensions are treated as String, so we only need a boolean array to
indicate whether each one is a long string. We don't need an array recording
the datatype of every dimension.
+ The procedures for String and Text are nearly the same. A boolean (array)
and proper abstraction are enough.
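
To illustrate the boolean-array idea, here is a minimal sketch. It is not the code in this PR: `LongStringSketch`, `LengthPrefix`, `encodeRow` and the other names are hypothetical, and it assumes the only difference between a short and a long string is the width of the length prefix (2 bytes vs 4 bytes).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: these names are illustrative and are not CarbonData APIs.
final class LongStringSketch {

  // Assumed analogue of VARIABLE_SHORT / VARIABLE_INT: only the prefix width differs.
  enum LengthPrefix {
    SHORT(2), INT(4);

    final int bytes;

    LengthPrefix(int bytes) {
      this.bytes = bytes;
    }
  }

  // The single place where the long-string flag is turned into a storage choice.
  static LengthPrefix prefixFor(boolean isLongString) {
    return isLongString ? LengthPrefix.INT : LengthPrefix.SHORT;
  }

  // Encodes one string dimension as [length prefix][UTF-8 bytes].
  static byte[] encode(String value, boolean isLongString) {
    byte[] data = value.getBytes(StandardCharsets.UTF_8);
    LengthPrefix prefix = prefixFor(isLongString);
    if (prefix == LengthPrefix.SHORT && data.length > Short.MAX_VALUE) {
      throw new IllegalArgumentException("value too long for a short length prefix");
    }
    ByteBuffer buffer = ByteBuffer.allocate(prefix.bytes + data.length);
    if (prefix == LengthPrefix.SHORT) {
      buffer.putShort((short) data.length);
    } else {
      buffer.putInt(data.length);
    }
    buffer.put(data);
    return buffer.array();
  }

  // Decodes a value produced by encode() with the same flag.
  static String decode(byte[] encoded, boolean isLongString) {
    ByteBuffer buffer = ByteBuffer.wrap(encoded);
    int length = isLongString ? buffer.getInt() : buffer.getShort();
    byte[] data = new byte[length];
    buffer.get(data);
    return new String(data, StandardCharsets.UTF_8);
  }

  // Every dimension is a String; the boolean array alone marks the long ones.
  static byte[][] encodeRow(String[] dimensions, boolean[] isLongString) {
    byte[][] row = new byte[dimensions.length][];
    for (int i = 0; i < dimensions.length; i++) {
      row[i] = encode(dimensions[i], isLongString[i]);
    }
    return row;
  }
}
```

In this sketch the flag is consulted only when choosing the length-prefix width, so the rest of the String path can stay shared between short and long strings.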
---