I am trying to explore some use cases that I believe are perfect for the ColumnarSerDe: tables with 100+ columns where only one or two are selected in a particular query.
CREATE TABLE (....) ROW FORMAT SERDE "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe" STORED AS RCFile;

My issue is that the data in our source table, stored as gzip-compressed sequence files, is much smaller than the same data in the ColumnarSerDe table, and as a result any performance gains are lost.

Any ideas?

Thank you,
Edward