Hi, I've been running some Flink Scala applications on an AWS EMR cluster
(version 5.26.0 with Flink 1.8.0 for Scala 2.11) for a while, and I have
recently started to have some issues.

I have a Flink app that reads some files from S3, processes them, and
saves some files to S3 and also some records to a database.

The application is not very complex. It has one source that reads a
directory (multiple files) and another that reads a single file, followed
by some grouping and mapping and a left outer join between these 2 sources.

The issue is that occasionally the application gets stuck with only two
tasks running, one finished, and the remaining ones never started. The 2
tasks that keep running forever are source1 (reading the directory with
multiple files) and the leftouterjoin; source2 (input from a single file)
is the one that finishes. One interesting thing is that there should be
several tasks between source1 and this leftouterjoin, but they remain in
CREATED state. When the app gets stuck, I usually just kill it and run it
again, which works. The issue used to be infrequent but is becoming more
and more frequent. It's happening almost every day now.

I also have a DEBUG log from a job that didn't work and another one from a
job that worked.

Thanks.
