>
>
> Well, serverless does not mean that all state is torn down between
> executions – but that it's a possibility. Oftentimes, there's going
> to be data cached on the function host and/or multiple function
> executions per instance setup.
>

True - those are not insurmountable problems, but Cloud Run apparently has
some limitations here.


>
> > [...] There are also limitations
> > when it comes to task running time - it is not uncommon in Airflow that a
> > task can run for many hours. Both limitations make it not very well
> > suited for a serverless approach IMHO.
>
> That depends on the platform. A serverless function is not necessarily
> required to finish within a certain amount of time (although this
> changes the pricing considerably). In any case, if a task takes longer
> than a couple of minutes, it could be offloaded to a beefier computing
> system (for example Beam, Spark, or perhaps just a container instance
> or VM that can be started on demand). In this scenario, Airflow is more
> of an orchestrator than a worker.
>

Indeed - there are cases like that for "service" type operators and for
"transfer" type operators where the transfer is handled by an external
service. And when we move to "Reschedule" type operators -
https://lists.apache.org/thread.html/rc6f56234342c87f154865489e3a6555609e4b98a8c62ca4997cb6a6c%40%3Cdev.airflow.apache.org%3E
- this will get even closer to what you describe (for now, long-running
external operations still require an operator to actively poll for results
and block a process/worker slot). Still, even after Poke Reschedule, there
are quite a number of Transfer operators that are intermediaries rather
than orchestration only. For example, with
PostgresToGoogleCloudStorageOperator, if you have a lot of data, the
transfer will take a long time to complete and the data passes through
the worker. This is not really serverless-friendly, I think - not only
because of the time limits, but also because such a serverless task often
involves a long-running HTTP request, which does not fit well with the way
networking is done (long-running HTTP requests are dropped).
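
To illustrate the worker-slot point, here is a minimal sketch in plain
Python (not Airflow code) comparing how long a slot is held when an operator
blocks while polling versus when it is rescheduled between checks. The names
ExternalJob, slot_seconds_poke and slot_seconds_reschedule, as well as the
one-second check_cost, are made up for illustration, and the clock is
simulated rather than real.

```python
from dataclasses import dataclass


@dataclass
class ExternalJob:
    """Hypothetical long-running external operation (simulated)."""

    finishes_at: int  # simulated time in seconds at which the job completes

    def is_done(self, now: int) -> bool:
        return now >= self.finishes_at


def slot_seconds_poke(job: ExternalJob, poke_interval: int) -> int:
    """Blocking poll: the slot is held from the start until the job is done."""
    now = 0
    while not job.is_done(now):
        # The process only sleeps here, but it still occupies the slot.
        now += poke_interval
    return now


def slot_seconds_reschedule(
    job: ExternalJob, poke_interval: int, check_cost: int = 1
) -> int:
    """Reschedule: a slot is held only for the duration of each check."""
    now = 0
    held = check_cost  # the first check at time 0
    while not job.is_done(now):
        now += poke_interval  # the slot is released while waiting
        held += check_cost    # the next check takes a slot briefly
    return held


if __name__ == "__main__":
    job = ExternalJob(finishes_at=3600)
    print(slot_seconds_poke(job, poke_interval=60))
    print(slot_seconds_reschedule(job, poke_interval=60))
```

For a job that finishes after an hour with a 60-second interval, the
blocking variant holds the slot for 3600 simulated seconds, while the
rescheduled variant holds it for 61 one-second checks. Note that this only
helps with waiting; it does nothing for transfer operators where the data
itself passes through the worker.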

Again - those are not insurmountable problems - we are likely going to
release a Knative executor soon. It's just good to be aware of its
limitations.

J.

>
>  --\--
> cheers
>


-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
