Hi,
As Magnus suggested, you need to use a scheduler.
We are using *Apache Airflow*.
If you are using *GCP*, then you can use *Composer*. That will make your
life easier, but it is a bit more costly than hosting *Airflow* on your own
*server*.
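
For example, a daily DAG could look roughly like the sketch below. This assumes
Airflow 1.10-style imports, and the script path, GCP project and bucket are
placeholders for your own pipeline:

# Rough sketch only: a daily Airflow DAG that runs a Beam pipeline script.
# The script path, GCP project and bucket below are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

with DAG(
    dag_id="daily_beam_pipeline",
    start_date=datetime(2020, 1, 1),
    schedule_interval="@daily",  # one run per day, for the previous day's data
    catchup=False,
) as dag:
    run_pipeline = BashOperator(
        task_id="run_beam_pipeline",
        # "{{ ds }}" is the run's logical date, so each run covers one day.
        bash_command=(
            "python /opt/pipelines/my_pipeline.py "
            "--runner=DataflowRunner "
            "--project=my-gcp-project "
            "--temp_location=gs://my-bucket/tmp "
            "--date={{ ds }}"
        ),
    )
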
Another workaround on *GCP* is to use *Cloud Scheduler* to call a *Cloud
Function* that creates and runs the *Dataflow* job. That is essentially like
having a *cron* job that starts your Apache Beam job.
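
If you go that route, the Cloud Function can launch the job through the
Dataflow REST API. A rough sketch, assuming the pipeline was staged as a
classic Dataflow template (project, bucket, template path and the "date"
parameter are placeholders), with google-api-python-client in requirements.txt:

# Rough sketch of an HTTP-triggered Cloud Function that Cloud Scheduler
# calls once per day; project, bucket and template names are placeholders.
from datetime import date, timedelta

from googleapiclient.discovery import build


def launch_dataflow(request):
    """Launch the Dataflow template for yesterday's data."""
    project = "my-gcp-project"
    day = (date.today() - timedelta(days=1)).isoformat()

    dataflow = build("dataflow", "v1b3")
    response = dataflow.projects().templates().launch(
        projectId=project,
        gcsPath="gs://my-bucket/templates/my_beam_template",
        body={
            "jobName": "daily-beam-{}".format(day),
            "parameters": {"date": day},
        },
    ).execute()
    return "Launched Dataflow job {}".format(response["job"]["id"])
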

All the best...

On Tue, Dec 31, 2019 at 8:09 PM Magnus Runesson <[email protected]>
wrote:

> Hi!
>
> You probably want to take a look at a scheduler such as Airflow(
> https://airflow.apache.org/) or Luigi(
> https://luigi.readthedocs.io/en/stable/index.html). If you are on Google
> they have a Cloud Composer(https://cloud.google.com/composer/) which is
> Airflow underneath. On AWS Glue is probably an option.
>
> /Magnus
> On 2019-12-31 13:04, Gershi, Noam wrote:
>
> Hi,
>
>
>
> What is the best way to schedule Apache Beam pipelines (execute the same
> pipeline once per day, for the data of the last day)?
>
> Will it be a different solution per runner?
>
>
>
>
>
> *Noam Gershi*
>
> Software Developer
>
> *T*: +972 (3) 7405718
>

-- 
Soliman ElSaber
Data Engineer
www.mindvalley.com
