[
https://issues.apache.org/jira/browse/FLINK-9465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17424536#comment-17424536
]
Feifan Wang commented on FLINK-9465:
------------------------------------
Hi [~trohrmann],
Since both REST APIs mentioned above use the POST method, I would prefer to add
the parameter to the body of the HTTP request, like the other parameters.
I would like to name the parameter "savepoint-timeout" or "savepointTimeout", as follows:
* "savepoint-timeout" for [REST API: /jobs/:jobid/savepoints|https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/ops/rest_api/#jobs-jobid-savepoints] and [CLI: Creating a Savepoint|https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/cli/#creating-a-savepoint]
* "savepointTimeout" for [REST API: /jobs/:jobid/stop|https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/ops/rest_api/#jobs-jobid-stop] and [CLI: Stopping a Job Gracefully Creating a Final Savepoint|https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/cli/#stopping-a-job-gracefully-creating-a-final-savepoint]
The parameter should be optional in all four places; if it is not specified, the
checkpoint timeout applies.
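
To make the proposal concrete, here is a minimal sketch of what the two request bodies could look like. The timeout fields are the proposed ones and do not exist in Flink yet. The other field names ("target-directory", "cancel-job", "targetDirectory", "drain") are what I believe the 1.14 REST docs list for these endpoints. The helper names and the millisecond unit are only assumptions for illustration.

```python
import json
from typing import Optional


def savepoint_request_body(target_directory: str,
                           cancel_job: bool = False,
                           savepoint_timeout_ms: Optional[int] = None) -> dict:
    """Body for POST /jobs/:jobid/savepoints (hyphenated field names)."""
    body = {"target-directory": target_directory, "cancel-job": cancel_job}
    if savepoint_timeout_ms is not None:
        # Proposed field, not part of the existing API; unit is an assumption.
        body["savepoint-timeout"] = savepoint_timeout_ms
    return body


def stop_request_body(target_directory: str,
                      drain: bool = False,
                      savepoint_timeout_ms: Optional[int] = None) -> dict:
    """Body for POST /jobs/:jobid/stop (camelCase field names)."""
    body = {"targetDirectory": target_directory, "drain": drain}
    if savepoint_timeout_ms is not None:
        # Proposed field, not part of the existing API; unit is an assumption.
        body["savepointTimeout"] = savepoint_timeout_ms
    return body


def effective_timeout_ms(body: dict, checkpoint_timeout_ms: int) -> int:
    """Hypothetical server-side fallback: use the savepoint timeout if the
    request carries one, otherwise fall back to the checkpoint timeout."""
    for key in ("savepoint-timeout", "savepointTimeout"):
        if key in body:
            return body[key]
    return checkpoint_timeout_ms


if __name__ == "__main__":
    # The target directory below is a made-up example path.
    print(json.dumps(savepoint_request_body("/tmp/savepoints",
                                            savepoint_timeout_ms=1_800_000)))
    print(json.dumps(stop_request_body("/tmp/savepoints")))
```

The two spellings follow the existing naming style of each endpoint's body, and omitting the field keeps today's behaviour.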
What do you think?
> Specify a separate savepoint timeout option via CLI
> ---------------------------------------------------
>
> Key: FLINK-9465
> URL: https://issues.apache.org/jira/browse/FLINK-9465
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Checkpointing
> Affects Versions: 1.5.0
> Reporter: Truong Duc Kien
> Assignee: Feifan Wang
> Priority: Minor
> Labels: auto-deprioritized-major, pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Savepoints can take much longer to complete than checkpoints, especially
> with incremental checkpoints enabled. This causes a couple of problems:
> * For our job, we currently have to set the checkpoint timeout much larger
> than necessary; otherwise we would be unable to take a savepoint.
> * During rush hour, our cluster encounters a high rate of checkpoint
> timeouts due to backpressure, but we are unable to migrate to a larger
> configuration because the savepoint also times out.
> In my opinion, the savepoint timeout should be configurable separately,
> both in the config file and as a parameter to the savepoint command.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)