Hi,
You could retroactively union an existing DStream with one from a newly created file. Then, when another file is "detected", you would need to re-union the stream and create another DStream. It seems like the implementation of FileInputDStream only looks for files in the directory, and the filtering is applied using the FileSystem.listStatus(dir, filter) method, which does not provide recursive listing. A cleaner solution would be to extend FileInputDStream and override findNewFiles(...) with the ability to recursively list files (probably by using FileSystem.listFiles). Refer: http://stackoverflow.com/a/25645225/113411

--
Ankur

On 13/05/2015 02:03, lisendong wrote:
> But in fact the directories are not ready at the beginning of my task.
>
> For example:
>
> /user/root/2015/05/11/data.txt
> /user/root/2015/05/12/data.txt
> /user/root/2015/05/13/data.txt
>
> like this, with one new directory per day.
>
> How do I create the new DStream for tomorrow's new directory (/user/root/2015/05/13/)?
>
>> On 13 May 2015, at 16:59, Ankur Chauhan <achau...@brightcove.com> wrote:
>>
>> I would suggest creating one DStream per directory and then using
>> StreamingContext#union(...) to get a union DStream.
>>
>> --
>> Ankur
>>
>> On 13/05/2015 00:53, hotdog wrote:
>>> I want to use fileStream in Spark Streaming to monitor multiple HDFS
>>> directories, such as:
>>>
>>> val list_join_action_stream = ssc.fileStream[LongWritable, Text,
>>>   TextInputFormat]("/user/root/*/*", check_valid_file(_),
>>>   false).map(_._2.toString).print
>>>
>>> By the way, I could not understand the meaning of the three classes:
>>> LongWritable, Text, TextInputFormat.
>>>
>>> But it doesn't work...
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/how-to-monitor-multi-directories-in-spark-streaming-task-tp22863.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
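The recursive listing described above amounts to walking subdirectories and applying the same filter at every level, which FileSystem.listStatus(dir, filter) does not do on its own. A minimal sketch of that logic, using java.nio here purely for illustration (the real override would use Hadoop's FileSystem API on the cluster); listFilesRecursively is a hypothetical helper name:

```scala
import java.nio.file.{Files, Path}
import scala.collection.JavaConverters._

object RecursiveListing {
  // Hypothetical helper sketching what an overridden findNewFiles(...) would
  // need to do: unlike a flat listStatus(dir, filter), descend into every
  // subdirectory and apply the filter to each file found along the way.
  def listFilesRecursively(root: Path, filter: Path => Boolean): Seq[Path] = {
    val entries = Files.list(root).iterator().asScala.toSeq
    val (dirs, files) = entries.partition(p => Files.isDirectory(p))
    files.filter(filter) ++ dirs.flatMap(d => listFilesRecursively(d, filter))
  }
}
```

With date-partitioned paths like /user/root/2015/05/11/data.txt, this picks up files in directories created after the job started, which is exactly what the flat listing misses.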
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
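On the question about the three classes: in fileStream[K, V, F] they mirror Hadoop's InputFormat contract — the key type, the value type, and the InputFormat that produces (key, value) records from the files. A hedged sketch (not runnable without a Spark cluster; the ".txt" filter is a stand-in for the original post's check_valid_file):

```scala
import org.apache.hadoop.fs.{Path => HPath}
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.streaming.dstream.DStream

// K = LongWritable : the record key TextInputFormat emits (byte offset of the line)
// V = Text         : the record value (the line's contents)
// F = TextInputFormat : the Hadoop InputFormat that splits files into (K, V) pairs
def monitorDirectory(ssc: StreamingContext, dir: String): DStream[String] =
  ssc.fileStream[LongWritable, Text, TextInputFormat](
    dir,
    (path: HPath) => path.getName.endsWith(".txt"), // stand-in for check_valid_file
    false                                           // newFilesOnly = false
  ).map(_._2.toString)                              // drop the offset key, keep the line
```

The one-DStream-per-directory suggestion from earlier in the thread would then combine these with something like ssc.union(dirs.map(monitorDirectory(ssc, _))).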