Hi People,

I'm using Java with Kafka and Spark Streaming, and saving the result files into HDFS.

As per my understanding, Spark Streaming writes every processed message or
event to an HDFS file. The reason for creating one file per message or event
could be to ensure fault tolerance. Is there any way Spark handles this
small-file problem, or do I need to append the small files into a bigger
file myself and then put that into HDFS?
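If Spark doesn't handle it, what I have in mind is the manual route: either reduce the number of output files per batch by calling coalesce() on each micro-batch RDD before saveAsTextFile(), or concatenate the small part files into one bigger file before putting it into HDFS. A minimal sketch of the merge step in plain Java is below (the class name and file names are just illustrative, not from any Spark API):

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Sketch of the "append small files into a bigger file" idea: concatenate
// a batch of small output files into a single file before uploading it.
public class SmallFileMerger {

    // Append the contents of each input file, in order, to the output file.
    public static void merge(List<Path> inputs, Path output) throws IOException {
        try (OutputStream out = Files.newOutputStream(output)) {
            for (Path in : inputs) {
                Files.copy(in, out); // stream each small file into the big one
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Two small "part" files, as one micro-batch might produce.
        Path dir = Files.createTempDirectory("parts");
        Path p1 = Files.writeString(dir.resolve("part-00000"), "event-1\n");
        Path p2 = Files.writeString(dir.resolve("part-00001"), "event-2\n");

        Path merged = dir.resolve("merged.txt");
        merge(List.of(p1, p2), merged);
        System.out.print(Files.readString(merged)); // prints event-1 then event-2
    }
}
```

Not sure if this is better than just doing rdd.coalesce(1).saveAsTextFile(...) inside foreachRDD, which would let Spark write one file per batch directly.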

Appreciate your time and suggestions.
