jrmccluskey opened a new issue, #33078: URL: https://github.com/apache/beam/issues/33078
### What happened?

The TensorRT integration test started failing when the workflow was moved from a base container with Python 3.8 to one with Python 3.10 as part of the Python 3.8 support deprecation. The problem is a version mismatch: the model engine staged at gs://apache-beam-ml/models/ssd_mobilenet_v2_320x320_coco17_tpu-8.trt (based on a [TF Model Garden config](https://github.com/tensorflow/models/blob/master/research/object_detection/configs/tf2/ssd_mobilenet_v2_320x320_coco17_tpu-8.config)) was built with TensorRT 8.x, while the Python 3.10 containers use TensorRT 10.x.

Unfortunately, the documentation for loading the model from the TF side is somewhat out of date or does not cover what we need. Additionally, we need to [convert the model from a TF format to a TensorRT format](https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html), since the test does not use the ONNX route.

The Gradle task for the test is `:sdks:python:test-suites:dataflow:py310:tensorRTtests` and is defined here: https://github.com/apache/beam/blob/2488ca131bed3f28c92fa7b38b7506a461818a3a/sdks/python/test-suites/dataflow/common.gradle#L444.

When testing the workflow, make sure you are running on Dataflow or on a machine with a GPU; the workflow fails with CUDA error 35 if no GPU is present.
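The version mismatch described above could be caught before the pipeline runs with a small pre-flight check. The following is only a sketch, not part of the Beam test suite: the helper names (`major_version`, `engine_needs_rebuild`, `installed_tensorrt_version`) are hypothetical, the `"8.0"` string stands in for the unspecified 8.x build version, and comparing only major versions is a simplifying assumption (TensorRT may also reject engines built with a different minor version).

```python
"""Pre-flight check: was the staged engine built for the installed TensorRT?"""
from typing import Optional


def major_version(version: str) -> int:
    """Return the leading major component of a dotted version such as '10.0.1'."""
    return int(version.strip().split(".")[0])


def engine_needs_rebuild(engine_built_with: str, runtime_version: str) -> bool:
    """Return True when the engine and runtime major versions differ.

    Assumption: a major-version mismatch (e.g. 8.x vs 10.x) means the
    serialized engine has to be regenerated.
    """
    return major_version(engine_built_with) != major_version(runtime_version)


def installed_tensorrt_version() -> Optional[str]:
    """Return tensorrt.__version__, or None if the package is not installed."""
    try:
        import tensorrt  # normally only available in the GPU container
    except ImportError:
        return None
    return tensorrt.__version__


if __name__ == "__main__":
    runtime = installed_tensorrt_version()
    if runtime is None:
        print("tensorrt is not installed; run this inside the test container")
    elif engine_needs_rebuild("8.0", runtime):  # "8.0" is a placeholder for 8.x
        print(f"Engine was built for TensorRT 8.x; rebuild it for {runtime}")
    else:
        print(f"Engine major version matches TensorRT {runtime}")
```

Running this inside the Python 3.10 container should report whether the staged engine has to be regenerated before the test can pass.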
### Issue Failure

Failure: Test is continually failing

### Issue Priority

Priority: 2 (backlog / disabled test but we think the product is healthy)

### Issue Components

- [X] Component: Python SDK
- [ ] Component: Java SDK
- [ ] Component: Go SDK
- [ ] Component: Typescript SDK
- [ ] Component: IO connector
- [ ] Component: Beam YAML
- [ ] Component: Beam examples
- [ ] Component: Beam playground
- [ ] Component: Beam katas
- [ ] Component: Website
- [ ] Component: Infrastructure
- [ ] Component: Spark Runner
- [ ] Component: Flink Runner
- [ ] Component: Samza Runner
- [ ] Component: Twister2 Runner
- [ ] Component: Hazelcast Jet Runner
- [ ] Component: Google Cloud Dataflow Runner
