Hi Martijn,

Many thanks for this clarification. Do you know of an example anywhere that showcases such an approach?

Best,
Georg

On Mon, 9 May 2022 at 14:45, Martijn Visser <martijnvis...@apache.org> wrote:

> Hi Georg,
>
> No, they wouldn't. There is no capability out of the box that lets you
> start Flink in streaming mode, run everything that's available at that
> moment, and then stop when there's no data anymore. You would need to
> trigger the stop yourself.
>
> Best regards,
>
> Martijn
>
> On Fri, 6 May 2022 at 13:37, Georg Heiler <georg.kf.hei...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I would disagree:
>> In the case of Spark, it is a streaming application that offers full
>> streaming semantics (but with lower cost and higher latency) because it
>> triggers less often. In particular, windowing and stateful semantics as
>> well as late-arriving data are handled automatically using the regular
>> streaming features.
>>
>> Would these features be available in a Flink batch job as well?
>>
>> Best,
>> Georg
>>
>> On Fri, 6 May 2022 at 13:26, Martijn Visser <martijnvis...@apache.org>
>> wrote:
>>
>>> Hi Georg,
>>>
>>> Flink batch applications run until all their input is processed. When
>>> that's the case, the application finishes. You can read more about this in
>>> the documentation for DataStream [1] or Table API [2]. I think this matches
>>> what Spark describes in its documentation.
>>>
>>> Best regards,
>>>
>>> Martijn
>>>
>>> [1]
>>> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/execution_mode/
>>> [2]
>>> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/common/
>>>
>>> On Mon, 2 May 2022 at 16:46, Georg Heiler <georg.kf.hei...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Spark
>>>> https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#triggers
>>>> offers a variety of triggers.
>>>>
>>>> In particular, it also has the "once" mode:
>>>>
>>>> *One-time micro-batch* The query will execute *only one* micro-batch
>>>> to process all the available data and then stop on its own. This is useful
>>>> in scenarios you want to periodically spin up a cluster, process everything
>>>> that is available since the last period, and then shutdown the cluster. In
>>>> some case, this may lead to significant cost savings.
>>>>
>>>> Does Flink have a similar possibility?
>>>>
>>>> Best,
>>>> Georg
>>>>
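
One possible way to "trigger the stop yourself", as Martijn puts it, might be to call Flink's REST API from an external scheduler once the job has caught up with its input. The following is only a minimal Python sketch and not something taken from this thread. It assumes the JobManager REST endpoint is reachable and uses the stop-with-savepoint endpoint (`POST /jobs/:jobid/stop`). The function names, host, job ID, and savepoint directory are hypothetical placeholders, and deciding when "everything available" has been processed is left to the caller.

```python
import json
import urllib.request


def build_stop_request(rest_url, job_id, savepoint_dir, drain=False):
    """Build a POST request for Flink's stop-with-savepoint REST endpoint.

    rest_url, job_id and savepoint_dir are placeholders supplied by the
    caller; nothing here is discovered automatically.
    """
    body = json.dumps(
        {"targetDirectory": savepoint_dir, "drain": drain}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{rest_url.rstrip('/')}/jobs/{job_id}/stop",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def stop_job(rest_url, job_id, savepoint_dir, drain=False, timeout=30):
    """Send the stop request and return the parsed JSON response.

    The call is asynchronous on the Flink side: the response only carries
    a trigger id, so the savepoint may still be in progress on return.
    """
    request = build_stop_request(rest_url, job_id, savepoint_dir, drain)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A scheduler could then call something like `stop_job("http://localhost:8081", "<job-id>", "s3://my-bucket/savepoints")` after each periodic run and restart the job from that savepoint on the next run, which roughly approximates the spin-up, process, shut-down cycle described for Spark. Whether to set `drain` depends on whether the job is meant to be resumed later, so it is left off by default here.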