[ 
https://issues.apache.org/jira/browse/FLINK-13376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16891632#comment-16891632
 ] 

Biao Liu commented on FLINK-13376:
----------------------------------

The current implementation of {{ContinuousFileReaderOperator}} is a bit 
tricky. In a finite-stream scenario, after receiving all splits, it blocks in 
{{close}} to wait for the split reader to finish. This behavior is not 
compatible with {{BoundedOneInput}}, because {{endInput}} would be triggered 
before all the data has been processed.

Until we rewrite {{ContinuousFileReaderOperator}} through FLIP-27, I 
propose fixing this with a workaround. By having 
{{ContinuousFileReaderOperator}} implement {{BoundedOneInput}}, we could block 
the operator in {{BoundedOneInput#endInput}} instead of in {{close}}. That way, 
it keeps the current behavior while respecting the semantics of 
{{BoundedOneInput}}.
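
To make the idea concrete, here is a minimal, self-contained sketch of the 
blocking pattern. It is not the actual Flink code: {{BoundedInputSketch}} is a 
hypothetical stand-in for {{BoundedOneInput}}, and {{ReaderOperatorSketch}}, 
its queue, and the "record-from-" output are made up for illustration. The 
only point is that the wait for the reader thread happens in {{endInput}}, so 
all records are emitted before end-of-input is considered done.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical stand-in for Flink's BoundedOneInput; not the real interface. */
interface BoundedInputSketch {
    void endInput() throws Exception;
}

/** Simplified model of an operator that hands splits to a background reader thread. */
class ReaderOperatorSketch implements BoundedInputSketch {
    // An empty Optional marks "no more splits" for the reader thread.
    private final BlockingQueue<Optional<String>> pendingSplits = new LinkedBlockingQueue<>();
    private final List<String> emitted = Collections.synchronizedList(new ArrayList<>());
    private Thread reader;

    /** Starts the asynchronous split reader. */
    public void open() {
        reader = new Thread(() -> {
            try {
                while (true) {
                    Optional<String> split = pendingSplits.take();
                    if (!split.isPresent()) {
                        return; // end marker received, all splits have been read
                    }
                    emitted.add("record-from-" + split.get());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "split-reader");
        reader.start();
    }

    /** Receives a split from upstream and queues it for the reader thread. */
    public void processElement(String split) {
        pendingSplits.add(Optional.of(split));
    }

    /** Blocks until the reader has drained every queued split. */
    @Override
    public void endInput() {
        pendingSplits.add(Optional.empty());
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the reader", e);
        }
    }

    /** Only releases resources; the waiting has already happened in endInput. */
    public void close() {
        if (reader != null && reader.isAlive()) {
            reader.interrupt();
        }
    }

    /** Returns a snapshot of the records emitted so far. */
    public List<String> emittedRecords() {
        synchronized (emitted) {
            return new ArrayList<>(emitted);
        }
    }
}
```

In this sketch, calling {{open}}, then {{processElement}} for each split, then 
{{endInput}} guarantees that {{emittedRecords}} already contains the output of 
every split when {{endInput}} returns. A downstream batch operator that does 
its work in its own {{endInput}} would therefore see complete input, which is 
what the issue below asks for.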

> FileInputFormat can not be used in batch mode of blink-planner
> --------------------------------------------------------------
>
>                 Key: FLINK-13376
>                 URL: https://issues.apache.org/jira/browse/FLINK-13376
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Connectors / FileSystem, Runtime / Task, Table SQL / 
> Runtime
>            Reporter: Jingsong Lee
>            Priority: Blocker
>             Fix For: 1.9.0, 1.10.0
>
>
> FileInputFormat uses ContinuousFileReaderOperator in 
> StreamExecutionEnvironment, and ContinuousFileReaderOperator joins the async 
> thread in close, so before close the data is incomplete.
> But batch operators use endInput to do some work; this leads to incomplete 
> records in endInput and to wrong results.
> This means the batch mode of blink-planner cannot use file-related sources...



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
