[
https://issues.apache.org/jira/browse/FLINK-24230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17413981#comment-17413981
]
Piotr Nowojski commented on FLINK-24230:
----------------------------------------
I would try to create a "one shot" type benchmark for triggering an unaligned
checkpoint. Currently we start the job inside the benchmark method, with a
pre-configured checkpointing interval, and wait for N checkpoints to finish.
The problem is that we need many checkpoints for stable results and, as
[~akalashnikov] described, this forces us to use small checkpointing intervals
with a very small amount of backpressure (checkpointing time ~10ms). With so
little backpressure, buffer debloating targets above 1ms do not make much
sense.
We could try to rewrite this benchmark to set up the Flink cluster, submit a
job and wait until the source is backpressured in the setup method. The main
benchmark method would just trigger manually a checkpoint. Moving as much as
possible to the setup method would make the results more stable, and would let
us reduce the number of checkpoints that we need to take for stable results
(hopefully down to 1 checkpoint per benchmark invocation), which in turn would
let us run the job with higher backpressure.
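To illustrate the split I have in mind, here is a minimal sketch in plain Java.
It is not Flink or JMH code: `JobHandle`, `isSourceBackpressured()` and
`triggerCheckpoint()` are hypothetical names that stand in for whatever the
cluster ends up exposing, and the `main` method uses an in-memory fake. In JMH
terms, `setUp` would roughly correspond to a `@Setup` method and
`measureCheckpointNanos` to a `@Benchmark` method run in single shot mode.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class OneShotCheckpointSketch {

    /** Hypothetical stand-in for a submitted job; NOT a real Flink interface. */
    public interface JobHandle {
        boolean isSourceBackpressured();

        /** Assumed manual trigger; completes with the checkpoint id. */
        CompletableFuture<Long> triggerCheckpoint();
    }

    private final JobHandle job;

    public OneShotCheckpointSketch(JobHandle job) {
        this.job = job;
    }

    /** Setup phase (not measured): poll until the source reports backpressure. */
    public void setUp(long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!job.isSourceBackpressured()) {
            if (System.nanoTime() - deadline > 0) {
                throw new IllegalStateException("source never became backpressured");
            }
            Thread.sleep(1);
        }
    }

    /** Measured phase: trigger exactly one checkpoint and wait for it to finish. */
    public long measureCheckpointNanos() throws Exception {
        long start = System.nanoTime();
        job.triggerCheckpoint().get(1, TimeUnit.MINUTES);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws Exception {
        // In-memory fake: backpressured on the third poll, checkpoints finish instantly.
        JobHandle fake = new JobHandle() {
            private int polls = 0;
            private long nextId = 1;

            @Override
            public boolean isSourceBackpressured() {
                return ++polls >= 3;
            }

            @Override
            public CompletableFuture<Long> triggerCheckpoint() {
                return CompletableFuture.completedFuture(nextId++);
            }
        };
        OneShotCheckpointSketch benchmark = new OneShotCheckpointSketch(fake);
        benchmark.setUp(5_000);
        System.out.println("checkpoint took " + benchmark.measureCheckpointNanos() + " ns");
    }
}
```

The point of the sketch is only that the expensive, noisy part (startup and
reaching backpressure) sits outside the timed region, so a single checkpoint
per invocation is what gets measured.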
One open question is how to
{quote}
The main benchmark method would just trigger manually a checkpoint
{quote}
We could either hack it and use checkpoints induced from sources (I'm not
sure, but I think FLIP-27 should support that), create a new REST/CLI API
call, or expose a way in the MiniCluster to manually trigger checkpoints,
similar to how savepoints are triggered. That last option would be useful in
many other tests, where we currently rely on static latches or hack around the
problem because we cannot manually trigger checkpoints.
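As a rough sketch of the shape such a manual trigger could take, assuming a
call that returns a future which completes once the checkpoint is done: the
class below is a self-contained toy, not an existing MiniCluster API, and
`triggerCheckpoint()` and `completeOldestPending()` are hypothetical names.
The second method only simulates the runtime acknowledging a checkpoint.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.CompletableFuture;

public class ManualCheckpointTriggerSketch {

    /** One triggered but not yet completed checkpoint. */
    private static final class Pending {
        final long id;
        final CompletableFuture<Long> future = new CompletableFuture<>();

        Pending(long id) {
            this.id = id;
        }
    }

    private final Deque<Pending> pending = new ArrayDeque<>();
    private long nextCheckpointId = 1;

    /** Hypothetical trigger: starts one checkpoint now, completes with its id. */
    public synchronized CompletableFuture<Long> triggerCheckpoint() {
        Pending p = new Pending(nextCheckpointId++);
        pending.addLast(p);
        return p.future;
    }

    /** Simulates the runtime finishing the oldest pending checkpoint. */
    public boolean completeOldestPending() {
        Pending p;
        synchronized (this) {
            p = pending.pollFirst();
        }
        // Complete outside the lock so callbacks on the future cannot block triggers.
        return p != null && p.future.complete(p.id);
    }

    public static void main(String[] args) {
        ManualCheckpointTriggerSketch coordinator = new ManualCheckpointTriggerSketch();
        CompletableFuture<Long> checkpoint = coordinator.triggerCheckpoint();
        coordinator.completeOldestPending();
        System.out.println("completed checkpoint " + checkpoint.join());
    }
}
```

With something like this, a test would trigger a checkpoint and wait on the
returned future directly, instead of coordinating through static latches.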
> Buffer debloating microbenchmark for single gate
> ------------------------------------------------
>
> Key: FLINK-24230
> URL: https://issues.apache.org/jira/browse/FLINK-24230
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Network
> Affects Versions: 1.14.0
> Reporter: Anton Kalashnikov
> Priority: Major
> Fix For: 1.15.0
>
>
> Currently, there are no microbenchmarks that check buffer debloating
> effectiveness. The idea is to create one that measures the checkpoint
> time. The benchmark should be similar to `UnalignedCheckpointTimeBenchmark`,
> but unlike `UnalignedCheckpointTimeBenchmark`, where we see the effect of
> `Buffer debloat` only for extremely small values like 1ms for
> BUFFER_DEBLOAT_TARGET, this benchmark should provide a more reliable way to
> check the different implementations of `Buffer debloat`. This can be
> achieved by increasing at least the record size and the checkpoint interval.
> The main goal is to measure how long a checkpoint takes during backpressure
> when all buffers are full.