damccorm commented on code in PR #36966:
URL: https://github.com/apache/beam/pull/36966#discussion_r2794025809


##########
sdks/python/apache_beam/ml/inference/vllm_inference.py:
##########
@@ -125,7 +132,7 @@ def start_server(self, retries=3):
         server_cmd = [
             sys.executable,
             '-m',
-            'vllm.entrypoints.openai.api_server',
+            self._vllm_executable,

Review Comment:
   I made this update, and the model endpoint now starts up successfully 
(`HTTP Request: GET http://localhost:52921/v1/models "HTTP/1.1 200 OK"`). 
However, I'm now running into a new problem:
   
   ```
   Traceback (most recent call last):
     File "<frozen runpy>", line 198, in _run_module_as_main
     File "<frozen runpy>", line 88, in _run_code
     File "/usr/local/lib/python3.12/dist-packages/dynamo/vllm/__main__.py", line 7, in <module>
       main()
     File "/usr/local/lib/python3.12/dist-packages/dynamo/vllm/main.py", line 820, in main
       uvloop.run(worker())
     File "/usr/local/lib/python3.12/dist-packages/uvloop/__init__.py", line 96, in run
       return __asyncio.run(
              ^^^^^^^^^^^^^^
     File "/usr/lib/python3.12/asyncio/runners.py", line 195, in run
       return runner.run(main)
              ^^^^^^^^^^^^^^^^
     File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
       return self._loop.run_until_complete(task)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
     File "/usr/local/lib/python3.12/dist-packages/uvloop/__init__.py", line 48, in wrapper
       return await main
              ^^^^^^^^^^
     File "/usr/local/lib/python3.12/dist-packages/dynamo/vllm/main.py", line 67, in worker
       runtime = DistributedRuntime(
                 ^^^^^^^^^^^^^^^^^^^
   Exception: Failed to connect to NATS: IO error: Connection refused (os error 111). Verify NATS server is running and accessible.
   ```
   
   
https://console.cloud.google.com/dataflow/jobs/us-central1/2026-02-11_07_08_48-18398043110228237613
   
   I think this is called out in 
https://github.com/ai-dynamo/dynamo?tab=readme-ov-file#run-dynamo 
and that I can avoid NATS entirely with 
`--kv-events-config '{"enable_kv_cache_events": false}'`, 
but I've had a little trouble getting that right so far.
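
One possible source of the trouble is quoting: since `server_cmd` is an argument list rather than a shell string, the JSON value would not need the surrounding single quotes from the README example. A minimal sketch, assuming the flag is appended to that list and that `dynamo.vllm` (the module in the traceback) is what gets launched; `kv_events_args` is a hypothetical helper, not something in the PR, and the other server arguments are omitted:

```python
import json
import sys


def kv_events_args(enable_kv_cache_events=False):
  """Returns the flag and its JSON value as two separate argv elements."""
  config = {"enable_kv_cache_events": enable_kv_cache_events}
  # json.dumps emits lowercase `false`, which is valid JSON. Formatting the
  # dict with str() would emit `False` and single-quoted keys instead.
  return ["--kv-events-config", json.dumps(config)]


# Hypothetical command line; the model, port, and other arguments are left out.
server_cmd = [sys.executable, "-m", "dynamo.vllm"] + kv_events_args()
```

Passed this way, the JSON reaches the process as a single argument with no shell-level quotes inside it. Whether that is the actual cause of the failure here is a guess.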



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
