[
https://issues.apache.org/jira/browse/FLINK-27827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17544337#comment-17544337
]
Andreas Hailu commented on FLINK-27827:
---------------------------------------
Hi [~gaoyunhaii] & [~martijnvisser], very well. Thanks for your input!
> StreamExecutionEnvironment method supporting explicit Boundedness
> -----------------------------------------------------------------
>
> Key: FLINK-27827
> URL: https://issues.apache.org/jira/browse/FLINK-27827
> Project: Flink
> Issue Type: Improvement
> Components: API / DataStream
> Reporter: Andreas Hailu
> Priority: Minor
>
> When creating a {{DataStreamSource}}, an explicitly bounded input is only
> returned if the {{InputFormat}} provided extends {{FileInputFormat}}.
> This results in runtime exceptions when trying to run applications in
> Batch execution mode with non-{{FileInputFormat}} inputs, e.g. Apache
> Iceberg [1], the inputs of Flink's Hadoop MapReduce compatibility API [2], etc.
> I understand there is a {{DataSource}} API [3] that supports specifying
> the boundedness of an input, but that would require all connectors to
> revise their APIs to leverage it, which would take some time.
> My organization is in the middle of migrating from the {{DataSet}} API to the
> {{DataStream}} API, and we've found this to be a challenge, as nearly all
> of our inputs are treated as unbounded because we use {{InputFormat}}s
> that are not {{FileInputFormat}}s.
> Our work-around was to provide a local patch in
> {{StreamExecutionEnvironment}} with a method supporting explicitly bounded
> inputs.
> As this helped us implement a Batch {{DataStream}} solution, perhaps this is
> something that may be helpful for others?
>
> [1] [https://iceberg.apache.org/docs/latest/flink/#reading-with-datastream]
> [2] [https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/dataset/hadoop_map_reduce/]
> [3] [https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/datastream/sources/#the-data-source-api]
>
--
This message was sent by Atlassian Jira
(v8.20.7#820007)