ojasiiitd opened a new issue, #33946:
URL: https://github.com/apache/beam/issues/33946
### What happened?
Installing Apache Beam:
`! pip install apache-beam`
Installing TensorRT:
`! pip install --upgrade tensorrt==10.7`
`! pip install --upgrade tensorrt-lean==10.7`
`! pip install --upgrade tensorrt-dispatch==10.7`
`! pip install pycuda`
```
! apt-get install -y \
libnvonnxparsers8 \
libnvparsers8 \
libnvinfer-plugin8 \
libnvinfer8 \
python3-libnvinfer \
python3-libnvinfer-dev
! apt-get install -y tensorrt
```
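Since three separate wheels (`tensorrt`, `tensorrt-lean`, `tensorrt-dispatch`) are pinned to 10.7 above, a small sanity check that they all actually resolved to the same version can rule out a partial upgrade. This sketch is not part of the issue; it uses only the standard library:

```python
# Hedged sketch: report the installed version of each TensorRT wheel,
# or None if the wheel is absent. Uses only importlib.metadata.
from importlib.metadata import version, PackageNotFoundError
from typing import Optional


def wheel_version(dist: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    for dist in ("tensorrt", "tensorrt-lean", "tensorrt-dispatch"):
        print(dist, wheel_version(dist))
```

If the three printed versions disagree, the environment is already inconsistent before Beam enters the picture.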
`dpkg` output:
```
ii  libnvinfer-dev                 10.7.0.23-1+cuda12.6  amd64  TensorRT development libraries
ii  libnvinfer-dispatch-dev        10.7.0.23-1+cuda12.6  amd64  TensorRT development dispatch runtime libraries
ii  libnvinfer-dispatch10          10.7.0.23-1+cuda12.6  amd64  TensorRT dispatch runtime library
ii  libnvinfer-headers-dev         10.7.0.23-1+cuda12.6  amd64  TensorRT development headers
ii  libnvinfer-headers-plugin-dev  10.7.0.23-1+cuda12.6  amd64  TensorRT plugin headers
ii  libnvinfer-lean-dev            10.7.0.23-1+cuda12.6  amd64  TensorRT lean runtime libraries
ii  libnvinfer-lean10              10.7.0.23-1+cuda12.6  amd64  TensorRT lean runtime library
ii  libnvinfer-plugin-dev          10.7.0.23-1+cuda12.6  amd64  TensorRT plugin libraries
ii  libnvinfer-plugin10            10.7.0.23-1+cuda12.6  amd64  TensorRT plugin libraries
ii  libnvinfer-plugin8             8.6.1.6-1+cuda12.0    amd64  TensorRT plugin libraries
ii  libnvinfer-vc-plugin-dev       10.7.0.23-1+cuda12.6  amd64  TensorRT vc-plugin library
ii  libnvinfer-vc-plugin10         10.7.0.23-1+cuda12.6  amd64  TensorRT vc-plugin library
ii  libnvinfer10                   10.7.0.23-1+cuda12.6  amd64  TensorRT runtime libraries
ii  libnvinfer8                    8.6.1.6-1+cuda12.0    amd64  TensorRT runtime libraries
ii  python3-libnvinfer             10.7.0.23-1+cuda12.6  amd64  Python 3 bindings for TensorRT standard runtime
ii  python3-libnvinfer-dev         10.7.0.23-1+cuda12.6  amd64  Python 3 development package for TensorRT standard runtime
ii  python3-libnvinfer-dispatch    10.7.0.23-1+cuda12.6  amd64  Python 3 bindings for TensorRT dispatch runtime
ii  python3-libnvinfer-lean        10.7.0.23-1+cuda12.6  amd64  Python 3 bindings for TensorRT lean runtime
```
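The listing above shows TensorRT 8 packages (`libnvinfer8` and `libnvinfer-plugin8` at 8.6.1.6) installed alongside the TensorRT 10.7 stack, which is a common source of loader and version conflicts. As a hedged, self-contained illustration (this is not a TensorRT tool; the parsing is an assumption about `dpkg -l`-style output), one can flag such mixed major versions mechanically:

```python
# Hedged sketch: detect mixed libnvinfer major versions in dpkg -l output.
import re
from typing import Set


def libnvinfer_majors(dpkg_output: str) -> Set[str]:
    """Collect the major versions of installed libnvinfer* packages from
    lines such as:  ii libnvinfer8 8.6.1.6-1+cuda12.0 amd64 ..."""
    majors = set()
    for line in dpkg_output.splitlines():
        m = re.match(r"ii\s+(libnvinfer\S*)\s+(\d+)\.", line.strip())
        if m:
            majors.add(m.group(2))
    return majors


# Minimal sample mirroring the two conflicting entries in the listing above.
sample = """
ii libnvinfer10 10.7.0.23-1+cuda12.6 amd64 TensorRT runtime libraries
ii libnvinfer8 8.6.1.6-1+cuda12.0 amd64 TensorRT runtime libraries
"""

if len(libnvinfer_majors(sample)) > 1:
    print("Warning: mixed TensorRT major versions installed")
```

Removing the TensorRT 8 packages (or the explicit `libnvinfer8`/`libnvinfer-plugin8` installs above) may be worth trying before debugging the Beam side.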
I used this [Beam
example](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/inference/large_language_modeling/main.py)
as a reference. Code to reproduce:
```python
import argparse
import logging
import os
import sys

import numpy as np

import apache_beam as beam
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.tensorrt_inference import TensorRTEngineHandlerNumPy, TensorRTEngine
from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerTensor
from apache_beam.ml.inference.pytorch_inference import make_tensor_model_fn
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.options.pipeline_options import SetupOptions
import tensorrt as trt


class Preprocess(beam.DoFn):
    def __init__(self, tokenizer):
        self._tokenizer = tokenizer

    def process(self, element):
        input_ids = element
        # process must return an iterable of outputs
        yield input_ids


class Postprocess(beam.DoFn):
    def __init__(self, tokenizer):
        self._tokenizer = tokenizer

    def process(self, element):
        decoded_outputs = self._tokenizer.decode(
            element.inference, skip_special_tokens=True)
        print(f"Output Prediction: {decoded_outputs}")


def load_engine(engine_path: str) -> trt.ICudaEngine:
    """Loads a serialized TensorRT engine from file."""
    TRT_LOGGER = trt.Logger(trt.Logger.WARNING)  # Create a TensorRT logger
    runtime = trt.Runtime(TRT_LOGGER)  # Create a runtime object
    # Read the engine file
    with open(engine_path, "rb") as f:
        engine_data = f.read()
    # Deserialize the engine
    engine = runtime.deserialize_cuda_engine(engine_data)
    return engine


def parse_args(argv):
    """Parses args for the workflow."""
    parser = argparse.ArgumentParser()
    return parser.parse_known_args(args=argv)


################ MAIN ################
TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)  # Change to VERBOSE
runtime = trt.Runtime(TRT_LOGGER)
known_args, pipeline_args = parse_args(sys.argv)
pipeline_options = PipelineOptions(pipeline_args)

# `eval_dataset` and `tokenizer` are defined elsewhere in my environment.
task_inputs = []
for i, sample in enumerate(eval_dataset):
    task_inputs.append(sample['audio']['array'])
    if i == 10:
        break
task_inputs = np.array(task_inputs)

trt_tokenizer = tokenizer
trt_engine_path = "./tensorrt_final_V3.engine"
if not os.path.exists(trt_engine_path):
    print(f"Error: Engine file {trt_engine_path} not found!")
engine = load_engine(trt_engine_path)

model_handler = TensorRTEngineHandlerNumPy(
    min_batch_size=1,
    max_batch_size=1,
    engine_path=trt_engine_path)
# model_handler = TensorRTEngine(engine)

# [START Pipeline]
with beam.Pipeline(options=pipeline_options) as pipeline:
    _ = (
        pipeline
        | "CreateInputs" >> beam.Create(task_inputs)
        | "Preprocess" >> beam.ParDo(Preprocess(tokenizer=trt_tokenizer))
        | "RunInference" >> RunInference(model_handler=model_handler)
        | "PostProcess" >> beam.ParDo(Postprocess(tokenizer=trt_tokenizer)))
# [END Pipeline]
```
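One fragile spot in the repro: the script checks whether the engine file exists, prints a message if not, and then calls `load_engine` anyway, which will fail later with a less obvious error. A hedged fail-fast alternative (pure Python, no TensorRT required; `require_engine` is a hypothetical helper, not a Beam or TensorRT API):

```python
import os


def require_engine(path: str) -> str:
    """Raise immediately if the serialized engine file is missing,
    instead of printing a warning and attempting to deserialize anyway."""
    if not os.path.exists(path):
        raise FileNotFoundError(f"TensorRT engine not found: {path}")
    return path
```

Usage in the repro would be `engine = load_engine(require_engine(trt_engine_path))`, so a bad path fails at the check rather than inside the TensorRT runtime.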
I saw a [similar issue](https://github.com/NVIDIA/TensorRT/issues/4216) in
the TensorRT repository.
### Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
### Issue Components
- [x] Component: Python SDK
- [ ] Component: Java SDK
- [ ] Component: Go SDK
- [ ] Component: Typescript SDK
- [ ] Component: IO connector
- [ ] Component: Beam YAML
- [x] Component: Beam examples
- [ ] Component: Beam playground
- [ ] Component: Beam katas
- [ ] Component: Website
- [ ] Component: Infrastructure
- [ ] Component: Spark Runner
- [ ] Component: Flink Runner
- [ ] Component: Samza Runner
- [ ] Component: Twister2 Runner
- [ ] Component: Hazelcast Jet Runner
- [ ] Component: Google Cloud Dataflow Runner