Pawel Bartoszek created FLINK-10841:
---
Summary: Reduce the number of ListObjects calls when checkpointing to S3
Key: FLINK-10841
URL: https://issues.apache.org/jira/browse/FLINK-10841
Project: Flink
For posterity: Here is the Jira Issue that tracks this:
https://issues.apache.org/jira/browse/FLINK-9061
On Thu, Mar 22, 2018 at 11:46 PM, Jamie Grier wrote:
I think we need to modify the way we write checkpoints to S3 for high-scale
jobs (those with many total tasks). The issue is that we are writing all
the checkpoint data under a common key prefix. This is the worst case
scenario for S3 performance since the key is used as a partition key.
In the ...
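
To make the key-prefix point concrete, here is a minimal sketch of the
"entropy injection" idea: replace a placeholder in the configured checkpoint
path with a short random segment so checkpoint objects no longer share one
common prefix. This is an illustration only, not Flink's actual
implementation; the class name, the _entropy_ placeholder and the example
path are assumptions.

import java.security.SecureRandom;

public class EntropyInjector {
    private static final String ENTROPY_PLACEHOLDER = "_entropy_";
    private static final char[] HEX = "0123456789abcdef".toCharArray();
    private static final SecureRandom RANDOM = new SecureRandom();

    // Replaces the placeholder with a random hex segment so that checkpoint
    // keys are spread over many S3 partitions instead of one common prefix.
    public static String resolve(String checkpointPath, int entropyLength) {
        if (!checkpointPath.contains(ENTROPY_PLACEHOLDER)) {
            return checkpointPath;
        }
        StringBuilder entropy = new StringBuilder(entropyLength);
        for (int i = 0; i < entropyLength; i++) {
            entropy.append(HEX[RANDOM.nextInt(HEX.length)]);
        }
        return checkpointPath.replace(ENTROPY_PLACEHOLDER, entropy.toString());
    }

    public static void main(String[] args) {
        // Example (placeholder bucket and job id):
        System.out.println(
            resolve("s3://my-bucket/_entropy_/checkpoints/job-1/chk-42", 4));
    }
}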
Yes, this gives much more information :)
Cheers,
Gyula
Stephan Ewen wrote (on Mon, 4 Jan 2016, 16:24):
Hey!
Nice to hear that it works.
A bit of info is now visible in the web dashboard, as of that PR:
https://github.com/apache/flink/pull/1453
Is that what you had in mind?
Greetings,
Stephan
On Sat, Jan 2, 2016 at 4:53 PM, Gyula Fóra wrote:
Ok, I was able to figure out the problem, and it was my fault :). The issue
was that I was running a short testing job and the sources finished before
the checkpoint was triggered. So the folder was created for the job in S3,
but since we didn't write anything to it, it is shown as a file in S3.
Maybe it would be g...
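
For anyone hitting the same symptom, one way to check what actually ended up
under the checkpoint path is to list the keys and their sizes: a zero-byte
object under the job's checkpoint prefix is usually just a folder marker,
meaning the directory was created but no state was ever written. A rough
sketch using the AWS SDK for Java; the bucket name and prefix are
placeholders:

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.ListObjectsV2Request;
import com.amazonaws.services.s3.model.ListObjectsV2Result;
import com.amazonaws.services.s3.model.S3ObjectSummary;

public class InspectCheckpointPrefix {
    public static void main(String[] args) {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();

        // Placeholders: point these at your own bucket and checkpoint directory.
        ListObjectsV2Request request = new ListObjectsV2Request()
                .withBucketName("my-bucket")
                .withPrefix("flink-checkpoints/");

        ListObjectsV2Result result = s3.listObjectsV2(request);
        for (S3ObjectSummary summary : result.getObjectSummaries()) {
            // A zero-byte object is typically just a folder marker, i.e. the
            // job created the directory but never wrote checkpoint data to it.
            System.out.printf("%s (%d bytes)%n", summary.getKey(), summary.getSize());
        }
    }
}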
Hey,
I am trying to checkpoint my streaming job to S3, but it seems that the
checkpoints never complete, and I also don't get any errors in the logs.
The state backend apparently connects to S3 properly, as it creates the
following file in the given S3 directory:
95560b1acf5307bc3096020071c83230_$f
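
For context, here is a minimal sketch of the kind of job setup described
above, using Flink's FsStateBackend pointed at S3. The bucket, path and
checkpoint interval are placeholders, not the exact configuration from this
thread:

import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class S3CheckpointingJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();

        // Trigger a checkpoint every 5 seconds. With a short-lived test
        // source, the job can finish before the first checkpoint is ever
        // triggered, which turned out to be the cause in this thread.
        env.enableCheckpointing(5000);

        // Placeholder path: point this at your own S3 checkpoint directory.
        env.setStateBackend(new FsStateBackend("s3://my-bucket/flink-checkpoints"));

        // ... add sources, transformations and sinks here ...

        env.execute("s3-checkpointing-test");
    }
}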