Re: trigger once (batch job with streaming semantics)

Georg Heiler Mon, 09 May 2022 14:04:34 -0700

Hi Martijn,

Many thanks for the clarification. Do you know of an example anywhere
that showcases such an approach?


Best,
Georg

On Mon, 9 May 2022 at 14:45, Martijn Visser <martijnvis...@apache.org>
wrote:

> Hi Georg,
>
> No, they wouldn't. There is no out-of-the-box capability that lets you
> start Flink in streaming mode, process everything that's available at that
> moment, and then stop when there's no more data. You would need to
> trigger the stop yourself.
>
> Best regards,
>
> Martijn
>
> On Fri, 6 May 2022 at 13:37, Georg Heiler <georg.kf.hei...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I would disagree:
>> In Spark's case, it is a streaming application that offers full
>> streaming semantics (at lower cost but higher latency) because it triggers
>> less often. In particular, windowing and stateful semantics as well as
>> late-arriving data are handled automatically by the regular streaming
>> features.
>>
>> Would these features be available in a Flink batch job as well?
>>
>> Best,
>> Georg
>>
>> On Fri, 6 May 2022 at 13:26, Martijn Visser <martijnvis...@apache.org>
>> wrote:
>>
>>> Hi Georg,
>>>
>>> Flink batch applications run until all their input is processed. When
>>> that's the case, the application finishes. You can read more about this in
>>> the documentation for DataStream [1] or Table API [2]. I think this matches
>>> what Spark describes in its documentation.
>>>
>>> Best regards,
>>>
>>> Martijn
>>>
>>> [1]
>>> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/execution_mode/
>>> [2]
>>> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/common/
>>>
>>> On Mon, 2 May 2022 at 16:46, Georg Heiler <georg.kf.hei...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Spark offers a variety of triggers:
>>>> https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#triggers
>>>>
>>>> In particular, it also has the "once" mode:
>>>>
>>>> *One-time micro-batch* The query will execute *only one* micro-batch
>>>> to process all the available data and then stop on its own. This is useful
>>>> in scenarios you want to periodically spin up a cluster, process everything
>>>> that is available since the last period, and then shutdown the cluster. In
>>>> some case, this may lead to significant cost savings.
>>>>
>>>> Does Flink offer something similar?
>>>>
>>>> Best,
>>>> Georg
>>>>
>>>
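
For the "trigger the stop yourself" approach mentioned in the thread, one
possibility is Flink's REST API, which exposes a stop-with-savepoint endpoint
(POST /jobs/:jobid/stop). Below is a minimal Python sketch, assuming a
JobManager reachable over HTTP. The host, job ID, and savepoint directory are
placeholders, the helper names are made up for illustration, and deciding when
the input is exhausted (for example from consumer lag or a scheduler timeout)
is left to the caller.

```python
import json
import urllib.request


def build_stop_request(rest_url, job_id, savepoint_dir, drain=False):
    """Build (but do not send) a stop-with-savepoint request.

    rest_url, job_id and savepoint_dir are placeholders supplied by the caller.
    drain=False keeps state resumable: the next run can start from the savepoint.
    drain=True would emit a final max watermark and fire all pending windows.
    """
    body = json.dumps({"targetDirectory": savepoint_dir, "drain": drain})
    return urllib.request.Request(
        url=f"{rest_url.rstrip('/')}/jobs/{job_id}/stop",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def stop_job(rest_url, job_id, savepoint_dir, drain=False):
    """Send the stop request and return the decoded JSON response."""
    request = build_stop_request(rest_url, job_id, savepoint_dir, drain)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

A periodic run would then look like: start the streaming job from the previous
savepoint, wait until the caller's own "caught up" condition holds, call
stop_job(...), and shut the cluster down once the savepoint has completed.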
