Hi,
I'm investigating the performance of my application and am trying to
gain a better understanding of what the boot, init and finish times
reported by the PythonRunner signify.
This is an example of output I have obtained:
19/10/26 12:14:24 INFO PythonRunner: Times: total = 35412, boot =
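(The log line above is cut off in this digest. For anyone else digging into these numbers: PythonRunner appears to report the times as "total = …, boot = …, init = …, finish = …", all in milliseconds. A minimal sketch that parses such a line and sanity-checks that the three components account for the total; the numeric values besides `total` below are made up for illustration:

```python
import re

# Hypothetical example line in the format PythonRunner appears to use
# (milliseconds; every value except "total" is invented for this sketch).
line = ("19/10/26 12:14:24 INFO PythonRunner: Times: total = 35412, "
        "boot = 1842, init = 170, finish = 33400")

def parse_python_runner_times(line):
    """Extract the total/boot/init/finish millisecond values from a log line."""
    pairs = re.findall(r"(total|boot|init|finish) = (\d+)", line)
    return {name: int(ms) for name, ms in pairs}

times = parse_python_runner_times(line)
print(times)
# In this made-up example the components sum exactly to the total.
print(times["boot"] + times["init"] + times["finish"])
```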
Hello
We are running into an issue.
We have customized the SparkListener class and added it to the Spark context.
But when the Spark job fails, we find that the "onApplicationEnd" callback is
triggered twice.
Is it supposed to be triggered just once when the Spark job fails?
Because the
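(Not an answer to why the event fires twice, but a possible workaround while that is investigated: make the handler idempotent. A minimal, Spark-free sketch of a once-only guard; in a real SparkListener subclass you could call `fire()` from `onApplicationEnd` so duplicate notifications are ignored. The class and method names here are mine, not Spark's.)

```python
import threading

class OnceGuard:
    """Runs a callback at most once, even if triggered concurrently.

    Sketch only: wire fire() into your onApplicationEnd override so a
    second (duplicate) notification becomes a no-op.
    """
    def __init__(self, callback):
        self._callback = callback
        self._fired = False
        self._lock = threading.Lock()

    def fire(self, *args, **kwargs):
        with self._lock:
            if self._fired:
                return False      # duplicate event: ignore it
            self._fired = True
        self._callback(*args, **kwargs)
        return True

calls = []
guard = OnceGuard(lambda event: calls.append(event))
guard.fire("applicationEnd")  # first notification runs the callback
guard.fire("applicationEnd")  # second notification is ignored
print(calls)                  # -> ['applicationEnd']
```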
Hi Roland,
Not much is shared here apart from the fact that it's not working. The latest
partition offset is used when the size of a TopicPartition is negative.
You can confirm this by looking for the following debug entry in the logs:
logDebug(s"rateLimit $tp size is $size")
If you've double checked and still think it's an
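(For context on what that rate limit does: as I understand it, the Kafka source spreads maxOffsetsPerTrigger across partitions in proportion to each partition's lag. A rough sketch of that idea; the function name and dict-based API are mine, not Spark's actual implementation:)

```python
def rate_limit_offsets(begin_offsets, latest_offsets, max_offsets_per_trigger):
    """Cap each partition's end offset proportionally to its lag.

    begin_offsets / latest_offsets: dicts mapping a TopicPartition name
    to an offset. Sketch of the proportional rate-limiting idea only.
    """
    sizes = {tp: latest_offsets[tp] - begin_offsets[tp] for tp in begin_offsets}
    total = sum(sizes.values())
    if total <= max_offsets_per_trigger:
        return dict(latest_offsets)   # under the cap: no limiting needed
    limited = {}
    for tp, size in sizes.items():
        # Each partition gets a share of the cap proportional to its lag.
        share = int(max_offsets_per_trigger * size / total)
        limited[tp] = begin_offsets[tp] + min(size, share)
    return limited

begin = {"topic-0": 0, "topic-1": 0}
latest = {"topic-0": 900, "topic-1": 100}
print(rate_limit_offsets(begin, latest, 100))
# -> {'topic-0': 90, 'topic-1': 10}
```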
Hi All,
changing maxOffsetsPerTrigger and restarting the job doesn't change the batch
size. This is rather bad for us, as we currently use a trigger duration of
5 minutes, which consumes only 100k messages per batch while the offset lag is
in the billions.
Decreasing the trigger duration also affects the micro-batch size -
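To put rough numbers on why this matters, a back-of-the-envelope calculation with the figures from this thread ("billions" taken as 1 billion purely for illustration):

```python
# Catch-up time at the current throughput (illustrative numbers).
lag = 1_000_000_000      # outstanding offsets ("billions" -> 1e9 here)
batch_size = 100_000     # messages consumed per micro-batch
trigger_minutes = 5      # trigger duration

batches_needed = lag / batch_size
catch_up_days = batches_needed * trigger_minutes / (60 * 24)
print(f"{batches_needed:.0f} batches, {catch_up_days:.1f} days")
# -> 10000 batches, 34.7 days
```

So at one 100k-message batch every 5 minutes, a 1-billion-offset lag would take over a month to drain, before counting new incoming messages.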