Can you try running the direct runner with the option `--experiments=use_deprecated_read`?
Seems like an instance of https://issues.apache.org/jira/browse/BEAM-10670?focusedCommentId=17316858&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17316858, also reported in https://lists.apache.org/thread.html/re6b0941a8b4951293a0327ce9b25e607cafd6e45b69783f65290edee%40%3Cdev.beam.apache.org%3E

We should roll back using the SDF wrapper by default because of the usability and performance issues reported.

On Sat, May 8, 2021 at 12:57 AM Evan Galpin <evan.gal...@gmail.com> wrote:

> Hi all,
>
> I’m experiencing very slow performance and startup delay when testing a
> pipeline locally. I’m reading data from a Google PubSub subscription as the
> data source, and before each pipeline execution I ensure that data is
> present in the subscription (readable from GCP console).
>
> I’m seeing startup delay on the order of minutes with DirectRunner (5-10
> min). Is that expected? I did find a Jira ticket[1] that at first seemed
> related, but I think it has more to do with BQ than DirectRunner.
>
> I’ve run the pipeline with a debugger connected and confirmed that it’s
> minutes before the first DoFn in my pipeline receives any data. Is there a
> way I can profile the direct runner to see what it’s churning on?
>
> Thanks,
> Evan
>
> [1] https://issues.apache.org/jira/plugins/servlet/mobile#issue/BEAM-4548