krickert commented on PR #1085: URL: https://github.com/apache/opennlp/pull/1085#issuecomment-4702739431
## Summary

On an inference failure the previous code returned an all-zero `double[]`. That isn't a valid probability distribution (it doesn't sum to 1), so any downstream `getBestCategory` / thresholding silently picks garbage and the real failure travels far from its cause.

`categorize(...)` now fails loudly, and distinguishes the *kind* of failure instead of lumping everything into one method-wide `catch (Exception)`:

- **Malformed input** (`strings` null or empty) throws `IllegalArgumentException`, validated up front.
- **Inference failure** (an `OrtException`, or any runtime fault while executing the model) throws `IllegalStateException` with the cause preserved. The model execution is extracted into a private `infer(...)` helper so the wrap is scoped to it, not the whole method.
- **Unexpected model output shape** throws its own `IllegalStateException`, surfaced on its own rather than being re-wrapped as an "inference failed" cause.

`scoreMap` / `sortedScoreMap` inherit this, since they delegate to `categorize`.

## Tests

- **softmax**: uniform distribution for equal logits, finiteness for large logits (the previous code returned `NaN`), and a reference distribution (`softmax([1,2,3])`).
- **fail-loud**: `categorize`, `scoreMap`, and `sortedScoreMap` surface an `IllegalStateException` on inference failure; malformed input is rejected with `IllegalArgumentException`.
- **eval**: `DocumentCategorizerDLEval#categorizeFailsLoudlyOnFailure` covers the contract end-to-end without requiring `OPENNLP_DATA_DIR`.

## Verification

```
./mvnw -pl opennlp-core/opennlp-ml/opennlp-dl test
# Tests run: 35, Failures: 0, Errors: 0, Skipped: 0 - BUILD SUCCESS
```
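
For readers who want the failure contract and the softmax behavior in one place, here is a minimal, self-contained sketch. It is not the PR's code: `FailLoudSketch`, the `Model` interface, the `expectedCategories` parameter, and joining `strings` with spaces are all hypothetical stand-ins, and subtracting the maximum logit is only one common way to keep softmax finite for large logits (the comment does not say which approach the PR takes).

```java
/** Illustrative sketch only; every name here is hypothetical, not taken from the PR. */
final class FailLoudSketch {

  /** Stand-in for the model call (the real code executes an ONNX Runtime session). */
  interface Model {
    float[] run(String text) throws Exception;
  }

  /** Softmax with the max logit subtracted first, so exp() cannot overflow. */
  static double[] softmax(float[] logits) {
    double max = Double.NEGATIVE_INFINITY;
    for (float v : logits) {
      max = Math.max(max, v);
    }
    double[] out = new double[logits.length];
    double sum = 0.0;
    for (int i = 0; i < logits.length; i++) {
      out[i] = Math.exp(logits[i] - max);
      sum += out[i];
    }
    for (int i = 0; i < out.length; i++) {
      out[i] /= sum;
    }
    return out;
  }

  static double[] categorize(Model model, String[] strings, int expectedCategories) {
    // Malformed input is rejected up front, before any model work happens.
    if (strings == null || strings.length == 0) {
      throw new IllegalArgumentException("strings must not be null or empty");
    }
    // Assumption: the tokens are simply joined with spaces for this sketch.
    float[] logits = infer(model, String.join(" ", strings));
    // The shape check sits outside infer(), so it is not re-wrapped as an inference failure.
    if (logits == null || logits.length != expectedCategories) {
      throw new IllegalStateException("Unexpected model output shape");
    }
    return softmax(logits);
  }

  /** The try/catch is scoped to model execution only; the cause is preserved. */
  private static float[] infer(Model model, String text) {
    try {
      return model.run(text);
    } catch (Exception e) {
      throw new IllegalStateException("Inference failed", e);
    }
  }

  public static void main(String[] args) {
    Model constant = text -> new float[] {1f, 2f, 3f};
    double[] scores = categorize(constant, new String[] {"some", "text"}, 3);
    System.out.println(java.util.Arrays.toString(scores));
  }
}
```

Keeping the shape check outside the helper is what lets a caller tell "the model threw" (an `IllegalStateException` with a cause) apart from "the model returned something unexpected" (an `IllegalStateException` without one).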
