Hi all,

Using the Python SDK (installed with pip install google-cloud-dataflow),
I have implemented a very simple pipeline that reads from Datastore and
prints the results. The dataset has just 3 entities:

entities = p \
          | 'read from datastore' >> ReadFromDatastore(project='project-name', query=ds_query) \
          | 'printing' >> beam.Map(lambda row: println(row))
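For reference, println is not a Python built-in and its definition is not shown in the snippet. A minimal sketch of what such a helper might look like, assuming it only logs each element and passes it through (the name, the log format, and the use of the logging module are assumptions for illustration, not the actual code from the pipeline):

```python
import logging


def println(row):
    """Hypothetical helper: log one element and return it unchanged."""
    # logging is used here instead of a bare print; this is an assumption,
    # chosen so the message goes through the standard logging machinery.
    logging.info('entity: %s', row)
    # Returning the element means the Map step emits it rather than None.
    return row
```

Returning the row keeps the output of the 'printing' step usable if further transforms are added after it.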


Running this locally seems to work fine. Running it on the cloud
results in the following graph:

[image: Inline image 1]

but the execution stops in the GroupByKey step, after which the rest of the
pipeline fails. Is there anything that should be added code-wise to make this
work? Or does this only work locally for now?

Thanks :)

Matthias
