Team, we have an Apex application that reads from Kafka and writes to HDFS.
The data volume on the Kafka topic is very high, around 2,500 messages per second. The issue we are facing is that the operator that writes to HDFS (it extends AbstractFileOutputOperator) builds up latency over time and eventually fails. Can someone please share thoughts on how I can handle this? Thanks a lot. Regards, Raja.