Hi,

I'm planning to use Flume to stream data from a local client machine into HDFS running in a cloud environment.

Is there a way to start a mapper on a file that is still being written? As far as I know, a file in HDFS has to be closed before a mapper can read it.

Is this true?

Does anyone have an idea for solving this problem?

Or do I have to write my big input file in smaller chunks, creating multiple files in HDFS, and then start a separate map task on each file once it has been closed?
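(If the chunking approach is the way to go, I guess the Flume HDFS sink's roll settings could produce the closed chunks automatically — a sketch below; the agent, sink, and path names are just placeholders:)

```properties
# Placeholder agent/sink names. The roll settings close each HDFS
# file periodically, so a MapReduce job can pick up finished chunks.
agent.sinks.hdfsSink.type = hdfs
agent.sinks.hdfsSink.hdfs.path = hdfs://namenode/flume/events/
# Close the current file every 60 seconds...
agent.sinks.hdfsSink.hdfs.rollInterval = 60
# ...or once it reaches 128 MB, whichever comes first.
agent.sinks.hdfsSink.hdfs.rollSize = 134217728
# Disable rolling by event count.
agent.sinks.hdfsSink.hdfs.rollCount = 0
```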

Best Regards,

Romeo

Romeo Kienzler
r o m e o @ o r m i u m . d e
