As of today, we apply dictionary encoding for all columns by default. We should probably move a hybrid approach where we decide the encoding based on the data profile. For e.g. if the cardinality of the column is very high (which is the case for metrics), dictionary encoding does not provide a lot of value.
Thoughts?
