Hi all,

Was wondering if anyone has experienced something similar. I kicked off 3 Dataflow templates via a Cloud Function. It created 3 VMs which are still alive after the jobs completed, and I cannot delete them... Could anyone assist with this?

Kind regards
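Dataflow normally deletes its worker VMs once a job reaches a terminal state, so workers that stay alive usually mean the jobs are still active or draining rather than completed; deleting the instances directly in Compute Engine does not help while the service still considers the job running. A minimal sketch of checking, and if necessary cancelling, the jobs through the same v1b3 API used later in this thread (the project ID and region are placeholders to adjust):

    import google.auth
    from googleapiclient.discovery import build

    GCLOUD_PROJECT = 'project-id-123'  # placeholder project ID
    REGION = 'us-central1'             # assumption: the region the jobs ran in

    credentials, _ = google.auth.default()
    service = build('dataflow', 'v1b3', credentials=credentials)

    # List jobs that are still active; lingering worker VMs usually
    # belong to one of these.
    response = service.projects().locations().jobs().list(
        projectId=GCLOUD_PROJECT, location=REGION, filter='ACTIVE').execute()

    for job in response.get('jobs', []):
        print(job['id'], job['name'], job['currentState'])

    # Cancelling a stuck job makes Dataflow tear its workers down.
    service.projects().locations().jobs().update(
        projectId=GCLOUD_PROJECT,
        location=REGION,
        jobId='JOB_ID',  # hypothetical: an ID taken from the listing above
        body={'requestedState': 'JOB_STATE_CANCELLED'}).execute()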
On Mon, Apr 6, 2020, 3:00 PM Marco Mistroni <[email protected]> wrote:

Hey,
Thanks. I create the template from the command line... I was having issues with the Cloud Function, but I think I was not using auth correctly. I will try your sample and report back if I am stuck. Thanks a lot!

On Mon, Apr 6, 2020, 2:20 PM André Rocha Silva <[email protected]> wrote:

Have you managed to create the template yet?

Have you read the article? There I write the cloud function in JavaScript. Here is an example of a cloud function in Python:

    import google.auth
    import logging
    import random

    from googleapiclient.discovery import build

    GCLOUD_PROJECT = 'project-id-123'


    def RunDataflow(event, context):
        credentials, _ = google.auth.default()
        service = build('dataflow', 'v1b3', credentials=credentials)

        uri = 'gs://bucket/input/file'
        output_file = 'gs://bucket/output/file'

        template_path = 'gs://bucket/Dataflow_templates/template'
        template_body = {
            # Random suffix so repeated invocations get distinct job names.
            'jobName': 'cf-job-' + str(random.randint(1, 101000)),
            'parameters': {
                'input_file': uri,
                'output_file': output_file,
            },
        }

        # Launch a job from the template staged at gcsPath.
        request = service.projects().templates().launch(
            projectId=GCLOUD_PROJECT,
            gcsPath=template_path,
            body=template_body)
        response = request.execute()

        logging.info(f'RunDataflow: got this response {response}')

On Mon, Apr 6, 2020 at 10:13 AM Marco Mistroni <[email protected]> wrote:

@andre, sorry to hijack this. Are you able to send a working example of kicking off a Dataflow template via a cloud function?

Kind regards

On Mon, Apr 6, 2020, 1:51 PM André Rocha Silva <[email protected]> wrote:

Hey!

Did you manage to make it work? You can take a look at this post; it is a single-file template, easy peasy to create a template from:

https://towardsdatascience.com/my-first-etl-job-google-cloud-dataflow-1fd773afa955

If you want, we can schedule a Google Hangout and I will help you step by step. It is the least I can do after having had so much help from the community :)

On Sat, Apr 4, 2020 at 4:52 PM Marco Mistroni <[email protected]> wrote:

Hey,
Sure... it's a crap script :)... just an ordinary Dataflow script:

https://github.com/mmistroni/GCP_Experiments/tree/master/dataflow/edgar_flow

What I meant to say, for your template question, is for you to write a basic script which runs on Beam... something as simple as this:

https://github.com/mmistroni/GCP_Experiments/blob/master/dataflow/beam_test.py

and then you can create a template out of it by just running this:

python -m edgar_main --runner=dataflow --project=datascience-projets --template_location=gs://mm_dataflow_bucket/templates/edgar_dataflow_template --temp_location=gs://mm_dataflow_bucket/temp --staging_location=gs://mm_dataflow_bucket/staging

That will create a template 'edgar_dataflow_template' which you can use in the GCP Dataflow console to create your job.

HTH. I'm sort of a noob to Beam, having started writing code just over a month ago. Feel free to ping me if you get stuck.

Kind regards,
Marco
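A minimal sketch of what such a basic, template-ready Beam script can look like (this is not the linked beam_test.py, just an illustration; the input_file/output_file parameter names are chosen to match the Cloud Function snippet above, and ValueProvider arguments are what let a classic template receive parameter values at launch time rather than at build time):

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions


    class TemplateOptions(PipelineOptions):
        @classmethod
        def _add_argparse_args(cls, parser):
            # ValueProvider arguments are resolved when the template is
            # launched, so the same template can be reused with new paths.
            parser.add_value_provider_argument('--input_file', type=str)
            parser.add_value_provider_argument('--output_file', type=str)


    def run():
        options = PipelineOptions()
        user_options = options.view_as(TemplateOptions)
        with beam.Pipeline(options=options) as p:
            (p
             | 'Read' >> beam.io.ReadFromText(user_options.input_file)
             | 'Upper' >> beam.Map(lambda line: line.upper())
             | 'Write' >> beam.io.WriteToText(user_options.output_file))


    if __name__ == '__main__':
        run()

Running it with --runner=dataflow and --template_location, as in the command above, stages the template to GCS instead of executing the pipeline immediately.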
On Sat, Apr 4, 2020 at 6:01 PM Xander Song <[email protected]> wrote:

Hi Marco,

Thanks for your response. Would you mind sending the edgar_main script so I can take a look?

On Sat, Apr 4, 2020 at 2:25 AM Marco Mistroni <[email protected]> wrote:

Hey,
As far as I know, you can generate a Dataflow template out of your Beam code by specifying an option on the command line. I am running this command, and once the template is generated I kick off a Dataflow job via the console by pointing at it:

python -m edgar_main --runner=dataflow --project=datascience-projets --template_location=gs://<your bucket>

HTH

On Sat, Apr 4, 2020, 9:52 AM Xander Song <[email protected]> wrote:

I am attempting to write a custom Dataflow Template using the Apache Beam Python SDK, but am finding the documentation difficult to follow. Does anyone have a minimal working example of how to write and deploy such a template?

Thanks in advance.

--
ANDRÉ ROCHA SILVA
DATA ENGINEER
(48) 3181-0611
https://www.linkedin.com/in/andre-rocha-silva/
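The console is not the only way to launch a generated template; the equivalent gcloud command might look like this (the bucket path and parameter names just mirror the examples above, and depending on the gcloud version a --region flag may also be needed):

gcloud dataflow jobs run my-template-job --gcs-location=gs://mm_dataflow_bucket/templates/edgar_dataflow_template --parameters=input_file=gs://bucket/input/file,output_file=gs://bucket/output/file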
