(Speaking for Java, but I think Python is similar) There's nothing in the Beam API right now for querying a job unless you have a handle on the original object returned by the runner. The nature of the result of run() is particular to a runner, though it is easy to imagine a feature whereby you can "attach" to a known running job.
So I think your best option is to use runner-specific APIs for now. For Dataflow that would be the cloud APIs [1]. For reference, you can see how the Beam wrapper DataflowPipelineJob [2] does it.

Out of curiosity - what sort of third-party app? It would be super if you could file a JIRA [3] describing your use case in a bit more detail, to help it gain visibility.

Kenn

[1] https://cloud.google.com/dataflow/docs/reference/rest/v1b3/projects.jobs/get
[2] https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineJob.java#L441
[3] https://issues.apache.org/jira/secure/CreateIssue!default.jspa

On Sun, Jul 9, 2017 at 2:54 PM, Randal Moore <[email protected]> wrote:

> Is this part of the Beam API or something I should look at the google docs
> for help? Assume a job is running in dataflow - how can an interested
> third-party app query the status if it knows the job-id?
>
> rdm
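
P.S. In case it helps, here is a rough, untested-against-a-live-job sketch of calling the jobs.get endpoint from [1] directly with the JDK's HTTP client (Java 11+), without any Beam dependency. The class and method names are made up for illustration. It assumes you already have an OAuth2 access token from somewhere (for example `gcloud auth print-access-token`), and it pulls `currentState` out of the response with a regex only to stay dependency-free; a real app would use a JSON parser or the Google API client library, as DataflowPipelineJob [2] does.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Beam or the Dataflow SDK.
final class DataflowJobStatus {
    // Matches e.g. "currentState": "JOB_STATE_RUNNING" in the Job JSON.
    // Regex extraction is a shortcut for this sketch, not robust JSON parsing.
    private static final Pattern CURRENT_STATE =
        Pattern.compile("\"currentState\"\\s*:\\s*\"([A-Z_]+)\"");

    // Builds the projects.jobs.get URL for the given project and job ID.
    public static URI jobUri(String projectId, String jobId) {
        return URI.create("https://dataflow.googleapis.com/v1b3/projects/"
            + URLEncoder.encode(projectId, StandardCharsets.UTF_8)
            + "/jobs/"
            + URLEncoder.encode(jobId, StandardCharsets.UTF_8));
    }

    // Returns the job state string if the response body contains one.
    public static Optional<String> parseCurrentState(String json) {
        Matcher m = CURRENT_STATE.matcher(json);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // Performs the GET with a bearer token and extracts the state.
    public static Optional<String> fetchState(
            String projectId, String jobId, String accessToken)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(jobUri(projectId, jobId))
            .header("Authorization", "Bearer " + accessToken)
            .GET()
            .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("jobs.get returned HTTP " + response.statusCode());
        }
        return parseCurrentState(response.body());
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println(
                "usage: DataflowJobStatus <projectId> <jobId> <accessToken>");
            System.exit(2);
        }
        System.out.println(
            fetchState(args[0], args[1], args[2])
                .orElse("(no currentState in response)"));
    }
}
```

A third-party app could call fetchState on a timer and react when the state changes; how often to poll and how to refresh the token are left open here.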
