For a very long time, we've had a workflow that looks like this:

Export data from a compressed ORC Hive table to another Hive table that is
created as external and stored as textfile. No compression is specified on the
target table.

Then we point CsvBulkInsert at the folder "x" backing that new table to load
the data into HBase.
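
The load is kicked off roughly like this (a simplified sketch assuming the
Phoenix CsvBulkLoadTool; the jar path, table name, and input path are
placeholders):

  hadoop jar phoenix-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    --table MY_TABLE \
    --input /path/to/x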

Today, I noticed that the data has not been getting into HBase since late 
August.

After some clicking around, it looks like this is happening because we have
hive.exec.compress.output set to true, so the data in folder "x" is written as
compressed ".deflate" files.

However, it looks like someone changed this setting to true 4 months ago.

So either we should be missing 4 months of data, or this should work.

Thus my question: does CsvBulkInsert work with compressed (.deflate) output like this?

