Guillaume Balaine commented on BEAM-2831:

The implication here is that, from 2.1 onwards, it is impossible to run any 
reasonably sized batch on the FlinkRunner with binary formats like Avro and 
Protobuf when using the default block size of FileIO...

> Pipeline crashes due to Beam encoder breaking Flink memory management
> ---------------------------------------------------------------------
>                 Key: BEAM-2831
>                 URL: https://issues.apache.org/jira/browse/BEAM-2831
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>    Affects Versions: 2.0.0, 2.1.0
>         Environment: Flink 1.2.1 and 1.3.0, Java HotSpot and OpenJDK 8, macOS 
> 10.12.6 and unknown Linux
>            Reporter: Reinier Kip
>            Assignee: Aljoscha Krettek
>            Priority: Major
> I’ve been running a Beam pipeline on Flink. Depending on the dataset size and 
> the heap memory configuration of the jobmanager and taskmanager, I may run 
> into an EOFException, which causes the job to fail.
> As [discussed on Flink's 
> mailing list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/EOFException-related-to-memory-segments-during-run-of-Beam-pipeline-on-Flink-td15255.html]
>  (stacktrace enclosed), Flink catches these EOFExceptions and activates disk 
> spillover. Because Beam wraps these exceptions, this mechanism fails, the 
> exception travels up the stack, and the job aborts.
> Hopefully this is enough information and this is something that can be 
> adjusted for in Beam. I'd be glad to provide more information where needed.
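
The failure mode described in the issue can be illustrated with a minimal, self-contained Java sketch. This is not Flink's or Beam's actual code: the class SpillDemo, the RecordWriter interface, and all method names are hypothetical stand-ins. The only behavior taken from the issue is that the runtime reacts to an EOFException by spilling, and that wrapping the exception prevents that handler from seeing it.

```java
import java.io.EOFException;
import java.io.IOException;

public class SpillDemo {
    /** Hypothetical stand-in for a serializer writing into a fixed-size memory segment. */
    interface RecordWriter {
        void write(byte[] record, int freeBytes) throws IOException;
    }

    // Signals "segment full" with a plain EOFException.
    static void writeRaw(byte[] record, int freeBytes) throws IOException {
        if (record.length > freeBytes) {
            throw new EOFException("memory segment full");
        }
    }

    // Same write, but the EOFException is wrapped in an unchecked exception
    // (assumption: this mimics the wrapping the issue attributes to Beam).
    static void writeWrapped(byte[] record, int freeBytes) {
        try {
            writeRaw(record, freeBytes);
        } catch (IOException e) {
            throw new RuntimeException("coder failed", e);
        }
    }

    // Hypothetical stand-in for the runtime: EOFException is the cue to spill to disk.
    static String addRecord(RecordWriter writer, byte[] record, int freeBytes) {
        try {
            writer.write(record, freeBytes);
            return "in memory";
        } catch (EOFException e) {
            return "spilled";
        } catch (IOException e) {
            return "io error";
        }
    }

    public static void main(String[] args) {
        byte[] record = new byte[64];

        // The raw EOFException is caught, so the record is spilled.
        System.out.println(addRecord(SpillDemo::writeRaw, record, 32));

        // The wrapped exception bypasses the EOFException handler and propagates.
        try {
            addRecord(SpillDemo::writeWrapped, record, 32);
        } catch (RuntimeException e) {
            System.out.println("job fails: " + e.getCause());
        }
    }
}
```

In this sketch the first call returns "spilled", while the second escapes addRecord as a RuntimeException whose cause is the original EOFException. A fix along these lines would either rethrow the original exception unwrapped or have the handler inspect the cause chain; which of the two suits Beam and Flink is not something the issue text settles.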

This message was sent by Atlassian JIRA