Re: spark filestream problem

2015-05-04 Thread Akhil Das
With fileStream you can pass a filter parameter so that .tmp files and
directories are never picked up.
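
A minimal sketch of what I mean, in Scala. It assumes text input, a
hypothetical directory hdfs:///data/incoming, and that in-flight files are
recognisable by name (a leading "." or "_", a ".tmp" suffix, or the
"._COPYING_" suffix that "hdfs dfs -put" uses while writing). The isReady
helper is my own name, not part of Spark; adjust the rules to whatever your
writer produces.

```scala
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object FilteredFileStream {
  // Hypothetical naming rules: reject hidden files and files still being written.
  def isReady(name: String): Boolean =
    !name.startsWith(".") &&
      !name.startsWith("_") &&
      !name.endsWith(".tmp") &&
      !name.endsWith("._COPYING_")

  def main(args: Array[String]): Unit = {
    // The master URL is expected to come from spark-submit.
    val conf = new SparkConf().setAppName("FilteredFileStream")
    val ssc = new StreamingContext(conf, Seconds(10))

    // fileStream(directory, filter, newFilesOnly): the filter sees each
    // candidate Path and returns true for files that should be read.
    val lines = ssc.fileStream[LongWritable, Text, TextInputFormat](
      "hdfs:///data/incoming", // assumed input directory
      (path: Path) => isReady(path.getName),
      newFilesOnly = true
    ).map { case (_, text) => text.toString }

    lines.print()
    ssc.start()
    ssc.awaitTermination()
  }
}
```

With this filter a file such as part-0001.txt.tmp is ignored, while
part-0001.txt is read once it shows up in the directory.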

Also, when you move or rename a file, its modification time doesn't
change, and hence Spark won't detect it as a new file, I believe.

Thanks
Best Regards

On Sat, May 2, 2015 at 9:37 PM, Evo Eftimov evo.efti...@isecc.com wrote:

 It seems that in Spark Streaming 1.2 the fileStream API may have a bug:
 it doesn't detect new files when they are moved or renamed on HDFS, only
 when they are copied. Copying, however, leads to a well-known problem
 with .tmp files, which get removed and make Spark Streaming's fileStream
 throw an exception.



spark filestream problem

2015-05-02 Thread Evo Eftimov
It seems that in Spark Streaming 1.2 the fileStream API may have a bug: it
doesn't detect new files when they are moved or renamed on HDFS, only when
they are copied. Copying, however, leads to a well-known problem with .tmp
files, which get removed and make Spark Streaming's fileStream throw an
exception.