Hi All,

For Dataflow you can also take advantage of the scheduler in Dataflow itself.
There is a how-to on using the jobs tab in Dataflow [1].

To use this, you only need to place your pipeline and the needed metadata in a 
Google Cloud Storage bucket, after which you can fire off the pipeline on a 
schedule.
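
As a minimal sketch, copying a pipeline and its metadata into a bucket with the 
google-cloud-storage client could look like this (the bucket name and file 
paths are placeholders, adjust them to your project):

# upload_pipeline.py - copy a pipeline and its metadata to a bucket (placeholder names).
from google.cloud import storage  # pip install google-cloud-storage

BUCKET = "my-hop-artifacts"
FILES = {
    "pipelines/my-pipeline.hpl": "local/my-pipeline.hpl",
    "metadata/hop-metadata.json": "local/hop-metadata.json",
}

client = storage.Client()
bucket = client.bucket(BUCKET)

# Upload the pipeline and the metadata it needs next to each other, so the
# scheduled run can pick both up from the bucket.
for remote_path, local_path in FILES.items():
    bucket.blob(remote_path).upload_from_filename(local_path)
    print(f"uploaded {local_path} -> gs://{BUCKET}/{remote_path}")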

We try to be as versatile as possible to fit into your current data processing 
ecosystem. We recently updated the documentation on how to use Airflow [2], and 
in the coming months we will create more how-to guides focused on scheduling in 
different environments.
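
Purely as an illustration (and not necessarily the approach the how-to guide 
takes), a minimal Airflow DAG that kicks off hop-run on a schedule could look 
like this; the paths, project name, run configuration and hop-run flags below 
are assumptions:

# hop_daily_pipeline.py - minimal Airflow DAG sketch that runs a Hop pipeline daily.
# All paths, the project name, run configuration and hop-run flags are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="hop_daily_pipeline",
    start_date=datetime(2023, 5, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    run_pipeline = BashOperator(
        task_id="run_hop_pipeline",
        bash_command=(
            "/opt/hop/hop-run.sh "
            "--project my-project "
            "--file /opt/hop/pipelines/my-pipeline.hpl "
            "--runconfig local"
        ),
    )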

If you have more information on what your architecture looks like, we can 
provide more insight into how we would tackle the problem in your situation.

Kind regards,
Hans

[1] 
https://hop.apache.org//manual/latest/pipeline/beam/dataflowPipeline/google-dataflow-pipeline.html
[2] 
https://hop.apache.org//manual/next/how-to-guides/run-hop-in-apache-airflow.html
On 14 May 2023 at 08:06 +0200, Mikhail Khludnev <[email protected]>, wrote:
> Got it. Thank you, Thad!
>
> > On Sun, May 14, 2023 at 6:01 AM Thad Guidry <[email protected]> wrote:
> > > You can use any scheduling tool (CRON, RunDeck, etc.) !  Isn't that 
> > > great?!?!?
> > >
> > > In your tool of choice, you will just need to ensure the tool can perform 
> > > an HTTP POST request with the correct parameters.
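> > >
> > > For example, here is a minimal Python sketch of such a trigger that cron or 
> > > RunDeck could run on a schedule. The server URL, endpoint path, service name 
> > > and credentials below are assumptions/placeholders, so check the async web 
> > > service page linked below for the exact values on your Hop Server:
> > >
> > > # trigger_hop.py - fire a Hop web service via an HTTP POST (placeholder values).
> > > # Example crontab entry (assumption): 0 2 * * * python3 /opt/scripts/trigger_hop.py
> > > import requests  # pip install requests
> > >
> > > HOP_SERVER = "http://localhost:8080"   # your Hop Server base URL
> > > SERVICE = "my_pipeline_service"        # name of the registered web service
> > >
> > > resp = requests.post(
> > >     f"{HOP_SERVER}/hop/asyncRun/",     # endpoint path is an assumption, see docs below
> > >     params={"service": SERVICE},
> > >     auth=("cluster", "cluster"),       # replace with your server credentials
> > >     timeout=30,
> > > )
> > > resp.raise_for_status()
> > > print(resp.text)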
> > >
> > > I personally use the async web service for long-running batch jobs and 
> > > use Python Flask to build a simple dashboard for monitoring, but you could do 
> > > the same with RunDeck, Nagios, or other tools.
> > > https://hop.apache.org/manual/latest/hop-server/async-web-service.html
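> > >
> > > A tiny Flask sketch of such a monitoring dashboard could look like the 
> > > following; the status endpoint, its parameters and the credentials are 
> > > assumptions, so adapt them to the async web service documentation above:
> > >
> > > # dashboard.py - minimal Flask app that proxies the status of an async Hop
> > > # web service. Endpoint path, parameter names and credentials are placeholders.
> > > import requests
> > > from flask import Flask, jsonify
> > >
> > > app = Flask(__name__)
> > > HOP_SERVER = "http://localhost:8080"
> > >
> > > @app.route("/status/<service>")
> > > def status(service):
> > >     resp = requests.get(
> > >         f"{HOP_SERVER}/hop/asyncStatus/",  # path is an assumption, see docs above
> > >         params={"service": service},
> > >         auth=("cluster", "cluster"),       # replace with your server credentials
> > >         timeout=10,
> > >     )
> > >     return jsonify({"service": service, "hop_server_reply": resp.text})
> > >
> > > if __name__ == "__main__":
> > >     app.run(port=5000)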
> > >
> > > The Execution service, which is what most of us use, is detailed here:
> > > https://hop.apache.org/manual/latest/hop-rest/index.html#_execution_services
> > > That doc page needs to be improved to provide links from the other Workflows 
> > > and Pipelines pages, especially at least a tip or note admonition on this page:
> > > https://hop.apache.org/manual/latest/pipeline/pipelines.html
> > > We're not very good at linking to other parts of the manual that are 
> > > directly relevant to running Pipelines or Workflows. PRs welcome!
> > >
> > > Anyways, here's the web service metadata directly.
> > > https://hop.apache.org/manual/latest/hop-server/web-service.html
> > >
> > > Also, you can even control the Hop Server itself through scripts that 
> > > could be scheduled with your tool of choice.
> > > https://hop.apache.org/manual/latest/hop-server/index.html
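> > >
> > > As a small sketch, a wrapper like the one below could be scheduled to 
> > > (re)start a local server; the script location and its arguments are 
> > > assumptions, the hop-server page above documents the real ones:
> > >
> > > # restart_hop_server.py - placeholder sketch for starting Hop Server from a
> > > # scheduled script. The hop-server.sh arguments are assumptions, see the docs.
> > > import subprocess
> > >
> > > subprocess.run(
> > >     ["/opt/hop/hop-server.sh", "localhost", "8080"],
> > >     check=True,
> > > )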
> > >
> > > Luckily, you can use the "Search the docs" input box in the top right of 
> > > the manual to find some of the other pages that you might be 
> > > interested in.
> > >
> > > Thad
> > > https://www.linkedin.com/in/thadguidry/
> > > https://calendly.com/thadguidry/
> > >
> > >
> > > > On Sun, May 14, 2023 at 4:03 AM Mikhail Khludnev <[email protected]> 
> > > > wrote:
> > > > > Hello,
> > > > > Thank you for the nice project. This is what I need. I want to make 
> > > > > sure I understand scheduling correctly. If I need to run a 
> > > > > pipeline/workflow on a regular schedule, is there any other option 
> > > > > besides Airflow (kicking off a Hop pipeline) or scheduling it as a 
> > > > > Dataflow job?
> > > > >
> > > > > --
> > > > > Sincerely yours
> > > > > Mikhail Khludnev
>
>
> --
> Sincerely yours
> Mikhail Khludnev
> https://t.me/MUST_SEARCH
> A caveat: Cyrillic!
