I have two processes: one writes sequence files directly to HDFS, and the
other is a Hive table that reads those files.

Everything works well, except that I only flush the files periodically, so
a newly created file can still be zero bytes when Hive reads it.
SequenceFileInputFormat fails when it encounters zero-byte sequence files.

I was considering doing a flush and sync on the first record write. I was
also thinking I should be able to hack SequenceFileInputFormat to skip
zero-byte files and not throw an exception on readFully(), which it
sometimes does.
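
To make the "skip zero-byte files" idea concrete, here is a minimal sketch
of the length check on its own, run against a local directory with
java.nio so it has no Hadoop dependency. The class and method names
(NonEmptyFileLister, listNonEmptyFiles) are made up for illustration. In a
real input format I assume the same check would live in an overridden
listStatus() and use FileStatus.getLen() instead of Files.size(); I have
not tested that against Hive.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: a local-filesystem stand-in for the length filter
// that a patched input format would apply to its file listing.
final class NonEmptyFileLister {

    private NonEmptyFileLister() {
    }

    /**
     * Returns the regular files directly under dir whose length is greater
     * than zero, sorted by path. Zero-byte files and subdirectories are
     * skipped rather than reported as errors.
     */
    static List<Path> listNonEmptyFiles(Path dir) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                // Keep only files that already contain at least one byte.
                if (Files.isRegularFile(p) && Files.size(p) > 0) {
                    result.add(p);
                }
            }
        }
        Collections.sort(result);
        return result;
    }
}
```

Calling NonEmptyFileLister.listNonEmptyFiles(dir) on a directory holding an
empty part-00000 and a non-empty part-00001 returns only part-00001. The
check only covers files that are empty at listing time; a file that has a
partial header would still need the readFully() case handled separately.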

Anyone ever tackled this?
