I would suggest using Flume, if possible, as it has built-in HDFS log
rolling capabilities... If you would rather stay within Spark Streaming,
see the windowing sketch after your quoted message below.
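
Roughly, a Flume agent along these lines would do it (untested sketch; the
agent, broker, topic, and path names are placeholders, and the Kafka source
properties assume Flume 1.7+). hdfs.rollInterval is in seconds, so 1800
gives one file roll every 30 minutes; rollSize/rollCount are zeroed so only
time drives the roll:

# Kafka source -> memory channel -> HDFS sink rolling every 30 minutes
a1.sources  = r1
a1.channels = c1
a1.sinks    = k1

a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.r1.kafka.bootstrap.servers = broker1:9092
a1.sources.r1.kafka.topics = mytopic
a1.sources.r1.channels = c1

a1.channels.c1.type = memory

a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs:///data/events/%Y%m%d/%H%M
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.hdfs.rollInterval = 1800
a1.sinks.k1.hdfs.rollSize = 0
a1.sinks.k1.hdfs.rollCount = 0
a1.sinks.k1.channel = c1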

On Mon, Jun 26, 2017 at 1:09 PM, Naveen Madhire <vmadh...@umail.iu.edu>
wrote:

> Hi,
>
> I am using Spark Streaming with a 1-minute batch duration to read data
> from a Kafka topic, apply transformations, and persist the results into
> HDFS.
>
> The application creates a new directory every minute, each containing
> many partition files (one per partition). Which parameter do I need to
> change/configure so that a new HDFS directory is created, say, *every 30
> minutes* instead of on every batch interval of the Spark Streaming
> application?
>
>
> Any help would be appreciated.
>
> Thanks,
> Naveen
>
>
>
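
For the 30-minute requirement within Spark Streaming itself, one option is
to keep the 1-minute batch interval and window the stream so output is only
materialized every 30 minutes. A minimal Scala sketch, assuming a DStream
app (the socket source stands in for your Kafka stream, and the paths are
illustrative):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Minutes, StreamingContext}

object ThirtyMinuteSink {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ThirtyMinuteSink")
    // Keep the 1-minute batches you have today.
    val ssc = new StreamingContext(conf, Minutes(1))
    // Checkpointing is recommended when using long windows.
    ssc.checkpoint("hdfs:///tmp/checkpoints")

    // Stand-in source; replace with your KafkaUtils stream.
    val lines = ssc.socketTextStream("localhost", 9999)

    // Collect 30 minutes of data, then emit one directory per window.
    lines
      .window(Minutes(30), Minutes(30))
      .foreachRDD { (rdd, time) =>
        if (!rdd.isEmpty())
          rdd.saveAsTextFile(s"hdfs:///data/out/batch-${time.milliseconds}")
      }

    ssc.start()
    ssc.awaitTermination()
  }
}

One caveat: a 30-minute window over 1-minute batches keeps 30 batches of
data buffered, so size executor memory/storage accordingly. That is why
Flume's time-based file rolling is often the simpler fit here.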


-- 
Best Regards,
Ayan Guha
