Hi all,

I’m seeing very slow performance and a long startup delay when testing a
pipeline locally. The data source is a Google Pub/Sub subscription, and
before each pipeline execution I confirm that messages are present in the
subscription (they are readable from the GCP console).

The startup delay with DirectRunner is on the order of 5-10 minutes. Is
that expected? I did find a Jira ticket [1] that at first seemed related,
but I think it has more to do with BigQuery than with DirectRunner.

I’ve run the pipeline with a debugger attached and confirmed that several
minutes pass before the first DoFn in my pipeline receives any data. Is
there a way I can profile the direct runner to see what it’s churning on?
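
For anyone who wants to reproduce the measurement without a debugger, a
rough sketch of timing the gap until the first element arrives could look
like the following. It uses only the Python standard library;
`FirstElementTimer` is a hypothetical helper and not part of Beam, and it
assumes the first DoFn runs in the same process that created the timer.

```python
import time


class FirstElementTimer:
    """Hypothetical helper (not a Beam API): records how long it takes
    from construction until the first call to observe()."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._start = clock()              # taken when the timer is created
        self.first_element_delay = None    # seconds; set once, on first element

    def observe(self, element):
        # Only the first element sets the delay; later calls are no-ops.
        if self.first_element_delay is None:
            self.first_element_delay = self._clock() - self._start
            print(f"first element after {self.first_element_delay:.1f}s")
        return element
```

The idea would be to create the timer just before starting the pipeline and
call `observe()` at the top of the first DoFn's processing method. This only
measures the delay; it does not show what the runner is doing in the
meantime.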

Thanks,
Evan

[1] https://issues.apache.org/jira/plugins/servlet/mobile#issue/BEAM-4548
