Hi,

We are writing ORC files directly from a Storm topology instead of using Hive streaming (due to performance issues with our data). However, we still want the data to be compacted, so we set "NO_AUTO_COMPACTION"="false" on the table we created to read the ORC data (1.6 GB scattered across many small files). Does "NO_AUTO_COMPACTION" mean that data is not compacted while Hive streaming is used? If not, why was our data not compacted into a single file?
We also tried triggering compaction manually from Java code using the compact API of org.apache.hadoop.hive.metastore.txn.TxnHandler. When we run SHOW COMPACTIONS, it shows that the compaction has started, but the data is still not compacted. I don't want to run the manual commands from the command line. Is there any way to do this?

PS: We are writing all files into a single directory.

Thanks,
Sachin