Hi, I have two nearly identical Spark pipeline applications, but I noticed a significant difference in their performance.
Basically, the first application consumes a stream from Kafka, slices it into 1-minute batches, calculates a score for each record using an already loaded machine learning model, and writes the scored results to a database. The second application consumes the same stream and does the same scoring, but adds a couple of extra steps that aggregate records. After about 7 hours of continuous running, the second application ends up stopping, and I observed that each batch job takes longer to complete than the earlier ones.

Since this record aggregation is the only difference between the two applications, can it explain why the second application's streaming batch jobs take gradually longer to complete? I'd appreciate any help, clue, or tip to understand what is going on with this second application.

Thank you,
Saulo
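To make the question concrete, here is a simplified sketch of the aggregation pattern I mean, in plain Python rather than my actual Spark code (the names and numbers are made up for illustration). My suspicion is something like this: if per-key state kept across batches is never expired, each batch has to deal with more accumulated state than the last, which would match the slowdown.

```python
from collections import defaultdict

def process_batch(batch, state):
    """Aggregate a batch of (key, value) records into cross-batch state.

    If keys are never expired, `state` grows with every batch, so each
    batch touches more keys than the last and batch times creep up.
    """
    for key, value in batch:
        state[key] += value
    # Work proportional to the total state size, e.g. re-emitting
    # every aggregate so far:
    return dict(state)

state = defaultdict(int)
sizes = []
for batch_no in range(3):
    # Each 1-minute batch brings some new keys that are never cleaned up.
    batch = [(f"user-{batch_no}-{i}", 1) for i in range(4)]
    snapshot = process_batch(batch, state)
    sizes.append(len(snapshot))

print(sizes)  # state grows every batch: [4, 8, 12]
```

Is an unbounded-state effect like this a plausible explanation for what I am seeing, and if so, what should I look at in the Spark UI or configuration to confirm it?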