Teodor,
I've concluded this is a bug, and have reported it:
https://issues.apache.org/jira/browse/FLINK-19109
Best regards,
David
On Sun, Aug 30, 2020 at 3:01 PM Teodor Spæren
wrote:
> Hey again David!
>
> I tried your proposed change of setting the parallelism higher. This
> worked, but why
Hey again David!
I tried your proposed change of setting the parallelism higher. This
worked, but why does this fix the behavior? I don't understand why this
would fix it. The only thing that happens to the query plan is that a
"remapping" node is added.
Thanks for the fix, and for any
Hey David!
I tried what you said, but it did not solve the problem. The job still
has to wait until the very end before outputting anything.
I mentioned in my original email that I had set the parallelism to 1
job-wide, but when I reran the task, I added your line. Are there any
Teodor,
This happens because of how readTextFile works when it executes in
parallel: it divides the input file into a number of splits, which are
consumed in parallel. As a result, the watermark can't move forward
until much or perhaps all of the file
Hey!
Second time posting to a mailing list, let's hope I'm doing this
correctly :)
My use case is to take data from the MediaWiki dumps and stream it into
Flink via the `readTextFile` method. The dumps are TSV files with an
event per line; each event has a timestamp and a type. I want to