Reach out to Google Cloud support.

On Wed, Apr 8, 2020 at 1:51 AM Marco Mistroni <[email protected]> wrote:
> Hi all
> Was wondering if anyone has had a similar experience.
> I kicked off 3 Dataflow templates via a cloud function. It has created 3 VMs
> which are still alive after the jobs completed, and I cannot delete them.
> Could anyone assist with this?
> Kind regards
>
> On Mon, Apr 6, 2020, 3:00 PM Marco Mistroni <[email protected]> wrote:
>
>> Hey
>> Thanks. I created the template from the command line... I was having issues with CLF but
>> I think I was not using auth correctly.
>> Will try your sample and report back if I am stuck.
>> Thanks a lot!
>>
>> On Mon, Apr 6, 2020, 2:20 PM André Rocha Silva <
>> [email protected]> wrote:
>>
>>> Could you create the template already?
>>>
>>> Have you read the article? There I wrote the cloud function in JS. Here
>>> is an example of a cloud function in Python:
>>>
>>> import google.auth
>>> import random
>>> import logging
>>>
>>> from googleapiclient.discovery import build
>>>
>>> GCLOUD_PROJECT = 'project-id-123'
>>>
>>>
>>> def RunDataflow(event, context):
>>>     # Use the default credentials of the function's service account.
>>>     credentials, _ = google.auth.default()
>>>
>>>     service = build('dataflow', 'v1b3', credentials=credentials)
>>>
>>>     uri = 'gs://bucket/input/file'
>>>     output_file = 'gs://bucket/output/file'
>>>
>>>     # GCS path of the staged template and the launch request body.
>>>     template_path = 'gs://bucket/Dataflow_templates/template'
>>>     template_body = {
>>>         'jobName': ('cf-job-' + str(random.randint(1, 101000))),
>>>         'parameters': {
>>>             'input_file': uri,
>>>             'output_file': output_file,
>>>         },
>>>     }
>>>
>>>     request = service.projects().templates().launch(
>>>         projectId=GCLOUD_PROJECT,
>>>         gcsPath=template_path,
>>>         body=template_body)
>>>     response = request.execute()
>>>
>>>     logging.info(f'RunDataflow: got this response {response}')
>>>
>>>
>>> On Mon, Apr 6, 2020 at 10:13 AM Marco Mistroni <[email protected]>
>>> wrote:
>>>
>>>> @andre sorry to hijack this. Are you able to send a working example of
>>>> kicking off a Dataflow template via a cloud function?
>>>>
>>>> Kind regards
>>>>
>>>> On Mon, Apr 6, 2020, 1:51 PM André Rocha Silva <
>>>> [email protected]> wrote:
>>>>
>>>>> Hey!
>>>>>
>>>>> Could you make it work? You can take a look at this post; it is a
>>>>> single-file template, easy peasy to create a template from:
>>>>>
>>>>> https://towardsdatascience.com/my-first-etl-job-google-cloud-dataflow-1fd773afa955
>>>>>
>>>>> If you want, we can schedule a Google Hangout and I can help you, step by
>>>>> step.
>>>>> It is the least I can do after having had so much help from the
>>>>> community :)
>>>>>
>>>>> On Sat, Apr 4, 2020 at 4:52 PM Marco Mistroni <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hey
>>>>>> sure... it's a crap script :).. just an ordinary Dataflow script
>>>>>>
>>>>>> https://github.com/mmistroni/GCP_Experiments/tree/master/dataflow/edgar_flow
>>>>>>
>>>>>> What I meant to say, for your template question, is for you to write
>>>>>> a basic script which runs on Beam... something as simple as this
>>>>>>
>>>>>> https://github.com/mmistroni/GCP_Experiments/blob/master/dataflow/beam_test.py
>>>>>>
>>>>>> and then you can create a template out of it by just running this
>>>>>>
>>>>>> python -m edgar_main --runner=dataflow --project=datascience-projets
>>>>>> --template_location=gs://mm_dataflow_bucket/templates/edgar_dataflow_template
>>>>>> --temp_location=gs://mm_dataflow_bucket/temp
>>>>>> --staging_location=gs://mm_dataflow_bucket/staging
>>>>>>
>>>>>> That will create a template 'edgar_dataflow_template' which you can
>>>>>> use in the GCP Dataflow console to create your job.
>>>>>>
>>>>>> hth, I'm sort of a noob to Beam, having started writing code just
>>>>>> over a month ago. Feel free to ping me if you get stuck
>>>>>>
>>>>>> kind regards
>>>>>> Marco
>>>>>>
>>>>>> On Sat, Apr 4, 2020 at 6:01 PM Xander Song <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Marco,
>>>>>>>
>>>>>>> Thanks for your response. Would you mind sending the edgar_main
>>>>>>> script so I can take a look?
>>>>>>>
>>>>>>> On Sat, Apr 4, 2020 at 2:25 AM Marco Mistroni <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hey
>>>>>>>> As far as I know you can generate a Dataflow template out of your
>>>>>>>> Beam code by specifying an option on the command line.
>>>>>>>> I am running this command, and once the template is generated I kick off a
>>>>>>>> Dataflow job via the console by pointing at it
>>>>>>>>
>>>>>>>> python -m edgar_main --runner=dataflow
>>>>>>>> --project=datascience-projets --template_location=gs://<your bucket>
>>>>>>>> Hth
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sat, Apr 4, 2020, 9:52 AM Xander Song <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I am attempting to write a custom Dataflow Template using the
>>>>>>>>> Apache Beam Python SDK, but am finding the documentation difficult to
>>>>>>>>> follow. Does anyone have a minimal working example of how to write and
>>>>>>>>> deploy such a template?
>>>>>>>>>
>>>>>>>>> Thanks in advance.
>>>>>>>>>
>>>>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> *ANDRÉ ROCHA SILVA*
>>>>> * DATA ENGINEER*
>>>>> (48) 3181-0611
>>>>>
>>>>> <https://www.linkedin.com/in/andre-rocha-silva/> /andre-rocha-silva/
>>>>> <http://portaltelemedicina.com.br/>
>>>>> <https://www.youtube.com/channel/UC0KH36-OXHFIKjlRY2GyAtQ>
>>>>> <https://pt-br.facebook.com/PortalTelemedicina/>
>>>>> <https://www.linkedin.com/company/9426084/>
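
A small addition to the cloud function example quoted above, in case it helps: the request body passed to templates().launch() can be built by a helper that produces a unique job name and checks it before calling the API. This is only a sketch under assumptions. The helper name build_launch_body, the uuid-based suffix (swapped in for the random.randint call in the quoted code) and the job-name pattern are not from the thread; the pattern reflects my understanding that Dataflow job names use lowercase letters, digits and hyphens, so please check it against the current documentation.

```python
import re
import uuid

# Assumption: Dataflow job names contain only lowercase letters, digits and
# hyphens, start with a letter and end with a letter or digit.
_JOB_NAME_RE = re.compile(r'^[a-z]([-a-z0-9]*[a-z0-9])?$')


def build_launch_body(prefix, parameters):
    """Return a request body for projects.templates.launch (hypothetical helper)."""
    # A uuid-based suffix makes a name clash between two invocations unlikely.
    job_name = f'{prefix}-{uuid.uuid4().hex[:8]}'
    if not _JOB_NAME_RE.match(job_name):
        raise ValueError(f'invalid Dataflow job name: {job_name!r}')
    return {
        'jobName': job_name,
        # Template parameters are sent as strings.
        'parameters': {key: str(value) for key, value in parameters.items()},
    }


body = build_launch_body('cf-job', {
    'input_file': 'gs://bucket/input/file',
    'output_file': 'gs://bucket/output/file',
})
print(body['jobName'])
```

The resulting dict could then be passed as body=... to the launch() call from the quoted example, in place of template_body.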
