kubraaksux opened a new pull request, #2447:
URL: https://github.com/apache/systemds/pull/2447
## Summary
- Add an `llmPredict` DML built-in that sends prompts to any OpenAI-compatible
inference server and returns predictions together with per-request latency and token counts
- New dedicated `LlmPredictCPInstruction` (216 lines) with connect/read
timeouts, concurrent request support, and proper error handling
- 10 Java tests: 7 mock-based negative tests (HTTP 500, malformed JSON,
connection refused, timeout, etc.) that run in CI without a server; 3 live
tests that skip gracefully when no server is available
- Lightweight `llm_server.py` for testing without vLLM
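To make the client side of the feature concrete, here is a minimal sketch of the request/response handling an OpenAI-compatible client such as `llmPredict` performs: building a completions payload, extracting the prediction along with latency and token counts, and rejecting malformed responses. All function names and field defaults here are illustrative, not taken from the PR's actual implementation.

```python
# Hypothetical sketch of OpenAI-compatible request building and response
# parsing; names and defaults are illustrative, not from the PR itself.
import json
import time

def build_request(prompt, model="test-model", max_tokens=64):
    # OpenAI-style /v1/completions request body
    return json.dumps({"model": model, "prompt": prompt,
                       "max_tokens": max_tokens})

def parse_response(body, started):
    # Extract prediction text, latency, and token counts; raise on
    # malformed payloads (one of the negative cases the mock tests cover)
    data = json.loads(body)
    if "choices" not in data or not data["choices"]:
        raise ValueError("malformed response: missing choices")
    usage = data.get("usage", {})
    return {
        "text": data["choices"][0].get("text", ""),
        "latency_ms": (time.time() - started) * 1000.0,
        "prompt_tokens": usage.get("prompt_tokens", 0),
        "completion_tokens": usage.get("completion_tokens", 0),
    }

if __name__ == "__main__":
    t0 = time.time()
    canned = ('{"choices": [{"text": "positive"}], '
              '"usage": {"prompt_tokens": 5, "completion_tokens": 1}}')
    print(parse_response(canned, t0)["text"])  # prints "positive"
```

A real instruction would additionally apply connect/read timeouts and batch prompts concurrently, as the summary above describes.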
## Test plan
- [ ] `mvn test -pl . -Dtest=JMLCLLMInferenceTest` passes (7 mock tests)
- [ ] LicenseCheck passes
- [ ] Java Codestyle passes
- [ ] Live tests work with `llm_server.py` or vLLM (manual)
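For reference, a stand-in server of the kind the test plan relies on can be very small. The sketch below serves a canned OpenAI-style completion on an ephemeral port; it shows the general shape only, and the actual `llm_server.py` shipped in this PR may differ.

```python
# Hypothetical minimal stand-in for an OpenAI-compatible completions
# endpoint, in the spirit of llm_server.py; the real script may differ.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class MockLLMHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Return a canned OpenAI-style completion for any POST
        length = int(self.headers.get("Content-Length", 0))
        _ = self.rfile.read(length)  # request payload, ignored here
        body = json.dumps({
            "choices": [{"text": "ok"}],
            "usage": {"prompt_tokens": 3, "completion_tokens": 1},
        }).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep test output quiet
        pass

def serve_in_background():
    # Bind to an ephemeral port and serve from a daemon thread
    server = ThreadingHTTPServer(("127.0.0.1", 0), MockLLMHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

if __name__ == "__main__":
    srv = serve_in_background()
    url = "http://127.0.0.1:%d/v1/completions" % srv.server_port
    req = urllib.request.Request(url, data=b'{"prompt": "hi"}',
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=5) as resp:
        print(json.loads(resp.read())["choices"][0]["text"])  # prints "ok"
    srv.shutdown()
```

Binding to port 0 lets the OS pick a free port, which keeps live tests from colliding with an already-running server.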
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]