I'm not sure why, but we don't get any .tmp files when writing to HDFS.
Which version of Flume are you using? Either way, you could do something
like this:

hadoop fs -ls /collect/ | awk '{print $8}' | egrep -v '\.tmp$' | \
  xargs -I{} hadoop fs -mv {} /hive/

It might be a bit slower, but it should work.
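The key step is just filtering out the in-progress .tmp files before moving anything. A minimal sketch of that filter, with the hadoop fs invocations commented out so only the filtering logic runs (the sample filenames under /collect/ are made up for illustration):

```shell
# Sample listing as it might come out of `hadoop fs -ls /collect/ | awk '{print $8}'`
# (hypothetical filenames; Flume's actual names will differ)
files="/collect/a.log
/collect/b.log.tmp
/collect/c.log"

# Keep only completed files, i.e. names NOT ending in .tmp
ready=$(printf '%s\n' "$files" | grep -v '\.tmp$')

# A cron job could then move each remaining file, e.g.:
#   printf '%s\n' "$ready" | xargs -I{} hadoop fs -mv {} /hive/
printf '%s\n' "$ready"
```

Anchoring the pattern with `\.tmp$` matters: an unanchored `tmp` would also skip completed files that merely contain "tmp" somewhere in their path.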

On Wed, Oct 5, 2011 at 1:00 PM, Jonathan <[email protected]> wrote:

> Hey experts,
>
> I have Flume writing to a directory in HDFS. I then fire off a cron job
> every five minutes to move that data into Hive. The problem I am having is
> that the .tmp files are also moved, which starts causing errors on the
> collector that is writing the files to HDFS. Is there any way to get rid
> of the .tmp files, or to have them written to a different directory than
> the other files? Any other suggestions on how I can work around this issue?
>
> Jonathan
>
