rzo1 opened a new pull request, #985: URL: https://github.com/apache/opennlp/pull/985
Note: Requires https://github.com/apache/opennlp/pull/984 first.

- Fix multi-period skip logic in `sentPosDetect` incorrectly skipping real sentence boundaries when the next sentence starts immediately with an abbreviation (e.g., "Gedanken.Bek."). The skip is now bypassed when the character after the delimiter is uppercase, indicating a new sentence rather than a multi-period abbreviation like "z.B."
- Generalize abbreviation-at-segment-start check in `isAcceptableBreak` from `tokenStartPos == 0` to `tokenStartPos == fromIndex`, so abbreviations are correctly recognized at segment boundaries, not just at the start of the entire text.
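The two changes described above can be illustrated with a small standalone sketch. This is not the OpenNLP code: the class `BoundarySketch` and the methods `firstWhitespace`, `skipCandidate`, and `startsAtTokenBoundary` are hypothetical names. The sketch assumes that the skip rule ignores a delimiter when another delimiter follows before the next whitespace, and that a token also counts as starting on a boundary when it is preceded by whitespace.

```java
// Illustrative sketch only: hypothetical names, not the OpenNLP implementation.
class BoundarySketch {

    /** Index of the first whitespace at or after pos, or s.length() if there is none. */
    static int firstWhitespace(CharSequence s, int pos) {
        while (pos < s.length() && !Character.isWhitespace(s.charAt(pos))) {
            pos++;
        }
        return pos;
    }

    /**
     * Assumed skip rule: ignore the delimiter at index delim when another
     * delimiter (nextDelim, or -1 if none) follows before the next whitespace,
     * unless the character right after delim is uppercase.
     */
    static boolean skipCandidate(CharSequence s, int delim, int nextDelim) {
        boolean delimiterInSameToken =
                nextDelim > delim && nextDelim < firstWhitespace(s, delim + 1);
        if (!delimiterInSameToken) {
            return false;
        }
        int after = delim + 1;
        boolean uppercaseFollows =
                after < s.length() && Character.isUpperCase(s.charAt(after));
        // An uppercase letter suggests a new sentence, so the candidate is kept.
        return !uppercaseFollows;
    }

    /**
     * Boundary test relative to the current segment start (fromIndex) instead
     * of the start of the whole text. Precondition: tokenStartPos >= fromIndex >= 0.
     * The whitespace alternative is an assumption made for this sketch.
     */
    static boolean startsAtTokenBoundary(CharSequence s, int tokenStartPos, int fromIndex) {
        return tokenStartPos == fromIndex
                || Character.isWhitespace(s.charAt(tokenStartPos - 1));
    }

    public static void main(String[] args) {
        String text = "Gedanken.Bek. Anm.";
        // Delimiters at index 8 and 12; 'B' at index 9 is uppercase.
        System.out.println(skipCandidate(text, 8, 12));
        // Lowercase 'g' follows the first period, so the skip still applies.
        System.out.println(skipCandidate("e.g. this", 1, 3));
        // "Bek." starts at index 9, which is the start of the second segment.
        System.out.println(startsAtTokenBoundary(text, 9, 9));
        // Passing 0 as the segment start emulates the old comparison against 0.
        System.out.println(startsAtTokenBoundary(text, 9, 0));
    }
}
```

In this sketch, not skipping a candidate only means it is evaluated further; whether it becomes a sentence break would still depend on later checks such as the abbreviation lookup.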
