lostluck commented on PR #35263: URL: https://github.com/apache/beam/pull/35263#issuecomment-2971548431
Not offhand. I see the error and I see a printout, but nothing from Prism (which is notionally intentional, but could also be a setup problem on the Python side not capturing additional output from Prism). If the flake is reproducible, I recommend adding logging to narrow down where the disconnect is happening, since the flow should basically be:

1. Python-side error ->
2. Prism fails the pipeline and returns an error to the JobManagement server ->
3. The JobManagement server reports that state to the Python process ->
4. Ultimately an error or exception is raised that can be caught in tests.

But the flow could also be "Python-side error -> caught immediately by the test because it happens in the local process" (AKA the danger of direct runners).

You can add a `log_level` flag set to `debug` here: https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/prism_runner.py which will make for much more verbose Prism output. But that'll be quite verbose, so I'd recommend just adding an info/error log here: https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/runners/prism/internal/execute.go#L84

That said, the Prism side already emits an error log for a failed job, and it's not clear why that isn't showing up anywhere; it's possibly not getting captured on the Python side for some reason. https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/portable_runner.py#L223 is the handling for the "state" of portable runner events, which might also be useful to make a bit more verbose.
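To illustrate the kind of verbosity I mean in that state handling, here's a minimal sketch of logging every job-state transition and surfacing a failed state as a catchable exception. This is not Beam's actual API: `watch_job_state`, the plain state-name strings, and the iterable standing in for the gRPC state stream are all hypothetical names for illustration.

```python
import logging

logging.basicConfig(level=logging.INFO)
_LOGGER = logging.getLogger("portable_runner_sketch")

# Terminal states, loosely mirroring Beam's JobState values (names assumed).
_TERMINAL_STATES = {"DONE", "FAILED", "CANCELLED"}


def watch_job_state(state_stream):
    """Log each state message from the job service; raise on FAILED.

    `state_stream` is any iterable of state-name strings, standing in
    for the stream of state responses the real runner consumes.
    """
    last_state = None
    for state in state_stream:
        if state != last_state:
            # Logging every transition makes it obvious in test output
            # whether the failure state ever reached the Python side.
            _LOGGER.info("Job transitioned to state: %s", state)
            last_state = state
        if state == "FAILED":
            # Surface the failure as an exception the test can catch.
            raise RuntimeError("Job reached FAILED state")
        if state in _TERMINAL_STATES:
            return state
    return last_state
```

With something like this, `watch_job_state(["STARTING", "RUNNING", "DONE"])` returns `"DONE"`, while a stream ending in `"FAILED"` raises instead of silently dropping the error.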
