Hi,

As Magnus suggested, you need to use a scheduler. We are using *Apache Airflow*. If you are on *GCP*, you can use *Composer*. That will make your life easier, but it is a bit more costly than hosting *Airflow* on your own *server*. Another workaround on *GCP* is using *Cloud Scheduler* to call a *Cloud Function* which creates and runs the *Dataflow* job. That is exactly like having a *cron* job that starts your Apache Beam job.
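For illustration, a minimal sketch of a daily Airflow DAG that launches a Beam pipeline could look like the following (the script path, project, bucket, and the --date flag are placeholder assumptions, not something from this thread, and the BashOperator import path is the Airflow 1.10-style one):

    # Minimal sketch: run a Beam pipeline once per day from Airflow.
    # All names below (script path, project, bucket, --date flag) are placeholders.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash_operator import BashOperator

    with DAG(
        dag_id="daily_beam_pipeline",
        schedule_interval="@daily",       # run once per day
        start_date=datetime(2020, 1, 1),
        catchup=False,
    ) as dag:
        # Launch the Beam pipeline for the run's logical date; Airflow
        # substitutes {{ ds }} with that date (YYYY-MM-DD).
        run_pipeline = BashOperator(
            task_id="run_beam_pipeline",
            bash_command=(
                "python /opt/pipelines/my_beam_pipeline.py "
                "--runner=DataflowRunner "
                "--project=my-gcp-project "
                "--temp_location=gs://my-bucket/tmp "
                "--date={{ ds }}"
            ),
        )

The same idea works with Cloud Scheduler + a Cloud Function: the scheduler fires once per day and the function submits the Dataflow job with the previous day's date as a pipeline option.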
All the best...

On Tue, Dec 31, 2019 at 8:09 PM Magnus Runesson <[email protected]> wrote:

> Hi!
>
> You probably want to take a look at a scheduler such as Airflow
> (https://airflow.apache.org/) or Luigi
> (https://luigi.readthedocs.io/en/stable/index.html). If you are on Google
> they have Cloud Composer (https://cloud.google.com/composer/), which is
> Airflow underneath. On AWS, Glue is probably an option.
>
> /Magnus
>
> On 2019-12-31 13:04, Gershi, Noam wrote:
>
> Hi,
>
> What is the best way to schedule Apache Beam pipelines (execute the same
> pipeline once per day, for the data of the last day)?
>
> Will it be a different solution per runner?
>
> *Noam Gershi*
> Software Developer
> *T*: +972 (3) 7405718

--
Soliman ElSaber
Data Engineer
www.mindvalley.com
