You could try Dataflow Runner v2. Its implementation differs from the default runner, which may let you work around whatever is affecting the pipeline.
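
In case it helps, here is a minimal sketch of opting in to Runner v2 from a Java launcher. It assumes the `--experiments=use_runner_v2` pipeline option is how your SDK version enables Runner v2; the class and method names (`RunnerV2Args`, `withRunnerV2`) are hypothetical and only for illustration, not part of Beam.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Beam): makes sure the launcher arguments
// request Dataflow Runner v2 via the `use_runner_v2` experiment.
final class RunnerV2Args {
  // Assumption: this is the flag that enables Runner v2 for the Java SDK.
  static final String RUNNER_V2_FLAG = "--experiments=use_runner_v2";

  // Returns a copy of args with the Runner v2 flag appended, unless some
  // argument already mentions the experiment.
  static String[] withRunnerV2(String[] args) {
    List<String> result = new ArrayList<>(Arrays.asList(args));
    boolean alreadySet = result.stream().anyMatch(a -> a.contains("use_runner_v2"));
    if (!alreadySet) {
      result.add(RUNNER_V2_FLAG);
    }
    return result.toArray(new String[0]);
  }

  public static void main(String[] args) {
    // In a real launcher the result would be handed to
    // PipelineOptionsFactory.fromArgs(...) before the pipeline is created.
    System.out.println(String.join(" ", withRunnerV2(args)));
  }
}
```

Passing the flag directly on the command line when submitting the job should have the same effect; the helper only avoids adding it twice.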

On Fri, Aug 5, 2022 at 9:40 AM Evan Galpin <[email protected]> wrote:

> Thanks Luke, I've opened a support case as well but thought it would be
> prudent to ask here in case there was something obvious with the code. Is
> there any additional/optional validation that I can opt to use when
> building and deploying the pipeline that might give hints? Otherwise I'll
> just wait on the support case.
>
> Thanks,
> Evan
>
> On Fri, Aug 5, 2022 at 11:22 AM Luke Cwik via user <[email protected]>
> wrote:
>
>> I took a look at the code and nothing obvious stood out to me in the code
>> as this is a ParDo with OnWindowExpiration. Just to make sure, the rate
>> limit is per key and would only be a global rate limit if there was a
>> single key.
>>
>> Are the workers trying to start?
>> * If no, then you would need to open a support case and share some
>> job ids so that someone could debug internal service logs.
>> * If yes, then did the workers start successfully?
>> ** If no, logs should have some details as to why the worker couldn't
>> start.
>> ** If yes, are the workers getting work items?
>> *** If no, then you would need to open a support case and share some
>> job ids so that someone could debug internal service logs.
>> *** If yes then the logs should have some details as to why the work
>> items are failing.
>>
>>
>> On Fri, Aug 5, 2022 at 7:36 AM Evan Galpin <[email protected]> wrote:
>>
>>> Hi all,
>>>
>>> I'm trying to create a RateLimit[1] transform that's based fairly
>>> heavily on GroupIntoBatches[2]. I've been able to run unit tests using
>>> TestPipeline to verify desired behaviour and have also run successfully
>>> using DirectRunner. However, when I submit the same job to Dataflow it
>>> completely fails to start and only gives the error message "Workflow
>>> Failed." The job builds/uploads/submits without error, but never starts and
>>> gives no detail as to why.
>>>
>>> Is there anything I can do to gain more insight about what is going
>>> wrong? I've included a gist of the RateLimit[1] code in case there is
>>> anything obvious wrong there.
>>>
>>> Thanks in advance,
>>> Evan
>>>
>>> [1] https://gist.github.com/egalpin/162a04b896dc7be1d0899acf17e676b3
>>> [2] https://github.com/apache/beam/blob/c8d92b03b6b2029978dbc2bf824240232c5e61ac/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/GroupIntoBatches.java
