We are also doing transformations, which is why we are using Spark Streaming. Does Spark Streaming support tumbling windows? I was thinking I could use a window operation to write into HDFS.
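
To make the idea concrete: a tumbling window is a window whose slide interval equals its length, so windows never overlap. As far as I understand, in the DStream API that corresponds to calling `window(windowDuration, slideDuration)` with both set to 30 minutes (each a multiple of the batch interval). Below is a minimal plain-Python sketch of the bucketing only, not Spark code. The names `window_start` and `output_dir` and the base path `/data/events` are hypothetical, chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumed window length, taken from the "every 30 minutes" requirement.
WINDOW = timedelta(minutes=30)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def window_start(batch_time: datetime, window: timedelta = WINDOW) -> datetime:
    """Return the start of the tumbling window that contains batch_time.

    batch_time must be timezone-aware. Windows are aligned to the Unix epoch,
    so 30-minute windows start at :00 and :30 of every hour (UTC).
    """
    elapsed = batch_time - EPOCH
    # Floor-divide to count whole windows since the epoch, then scale back.
    return EPOCH + (elapsed // window) * window


def output_dir(base: str, batch_time: datetime) -> str:
    """Map a batch time to one directory per tumbling window (hypothetical layout)."""
    start = window_start(batch_time)
    return base + "/" + start.strftime("%Y-%m-%d/%H-%M")


if __name__ == "__main__":
    # Every 1-minute batch inside the same 30-minute window maps to one directory.
    for minute in (0, 9, 29, 30, 31):
        t = datetime(2017, 6, 26, 13, minute, tzinfo=timezone.utc)
        print(t.isoformat(), "->", output_dir("/data/events", t))
```

With this kind of bucketing, all 1-minute batches between 13:00 and 13:29 would land in the same directory, and the batch at 13:30 would start a new one.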

Thanks

On Sun, Jun 25, 2017 at 10:23 PM, ayan guha <guha.a...@gmail.com> wrote:

> I would suggest to use Flume, if possible, as it has in built HDFS log
> rolling capabilities....
>
> On Mon, Jun 26, 2017 at 1:09 PM, Naveen Madhire <vmadh...@umail.iu.edu>
> wrote:
>
>> Hi,
>>
>> I am using spark streaming with 1 minute duration to read data from kafka
>> topic, apply transformations and persist into HDFS.
>>
>> The application is creating a new directory every 1 minute with many
>> partition files(= nbr of partitions). What parameter should I need to
>> change/configure to persist and create a HDFS directory say *every 30
>> minutes* instead of duration of the spark streaming application?
>>
>>
>> Any help would be appreciated.
>>
>> Thanks,
>> Naveen
>>
>
> --
> Best Regards,
> Ayan Guha