> We are pushing the compressed text files into HDFS directory for Hive
>EXTERNAL table, then using an INSERT on the table using ORC storage. We
>are letting Hive handle the ORC file creation process.

Are the compressed text files small enough to process one by one?

I wrote something similar last year for an EBCDIC case.

The only thing it can't do is split a file half-way through, so each file
is processed as a single stream with a simple state machine.
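
To illustrate the single-stream idea, here is a minimal sketch. It is not the EBCDIC tool mentioned above. It assumes gzip-compressed, newline-delimited text in which a double-quoted field may contain embedded newlines. The names `iter_records` and `records_from_gzip` are hypothetical. Since a gzip stream cannot be split part-way through, the file is read front to back and a two-state machine decides where each record ends.

```python
import gzip
import io


def iter_records(stream, chunk_size=64 * 1024):
    """Yield records from a text stream, read strictly front to back.

    Two states (assumed record format, for illustration only):
      outside quotes: a newline ends the current record
      inside quotes:  a newline is part of the field data
    """
    in_quotes = False
    buf = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        for ch in chunk:
            if ch == '"':
                # Toggle state; keep the quote character in the record.
                in_quotes = not in_quotes
                buf.append(ch)
            elif ch == "\n" and not in_quotes:
                # Record boundary: emit what has been collected so far.
                yield "".join(buf)
                buf = []
            else:
                buf.append(ch)
    if buf:
        # Trailing record without a final newline.
        yield "".join(buf)


def records_from_gzip(path):
    """Decompress one gzip file as a single stream and yield its records."""
    # newline="" disables newline translation so the state machine
    # sees the characters exactly as stored.
    with gzip.open(path, "rt", encoding="utf-8", newline="") as f:
        yield from iter_records(f)


if __name__ == "__main__":
    sample = io.StringIO('a,"x\ny"\nb,c\n')
    for rec in iter_records(sample):
        print(repr(rec))
```

Because the state carries across chunk boundaries, the reader never needs to seek, which is why one file maps to one task in this approach.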

Cheers,
Gopal

