This is an automated email from the ASF dual-hosted git repository.
krickert pushed a change to branch OPENNLP-1850-3-dl
in repository https://gitbox.apache.org/repos/asf/opennlp.git
discard 105d8a17c OPENNLP-1850 Review nits: extract testable DL guards;
merge-copy; capitalize msgs; migration note
discard fabf88583 OPENNLP-1850 Make mergeOverlappingSpans O(n log n) (dl)
discard fd1195676 OPENNLP-1850 Reject non-finite logits in softmax, not just
NaN (dl)
discard 6605361a8 OPENNLP-1850 Fully-qualify TokenNameFinder javadoc links in
NameFinderDL
discard c9f65483a OPENNLP-1850 Fail loud on corrupt document-classification
model output
discard dbc733be8 OPENNLP-1850 Fail fast on null finder input; fix the GPU
eval test options
discard 665758511 OPENNLP-1850 Harden fail-loud paths in the DL components
discard 80c4e48b4 OPENNLP-1850 Add real-model chunk-boundary eval tests; drop
dead label constants
discard 282b1f49c OPENNLP-1850 Resolve overlapping chunk spans and compose the
input alignment
discard b78550661 OPENNLP-1850 Add OffsetMappingNameFinder capability
interface and a findInOriginal end-to-end test
discard ac0b30fdf OPENNLP-1850 Offset-safe, Unicode-aware input normalization
in the DL components
discard 087c4779d OPENNLP-1850 Review nits: rename
searchAnalyzer->matchingAnalyzer; drop 'search' framing in profile docs
discard 9b10fe3ce OPENNLP-1850 Review nits: add Turkish profile; derive
coverage from the enum (profiles)
discard 7db9aced6 OPENNLP-1850 Resolve Norwegian nb/nn to the Norwegian
profile (profiles)
discard 1cedb0608 OPENNLP-1850 Per-language NormalizationProfile registry (2c)
discard f811ef1bb OPENNLP-1850 Review nits: TermAnalyzer javadoc references
matchingAnalyzer()
discard db9aaa65d OPENNLP-1850 Review nits: rename dashes()->dash(); LEMMA
doc+test; soften forward-link (Term)
discard 7ba196d79 OPENNLP-1850 Layered Term model: Term, TermAnalyzer (2b)
discard 9c8e3fc91 OPENNLP-1850 Review nits: ExtendedPictographic fail-loud
parity + doc; WordType heuristic note (tokenizer)
discard 3fa5fa6e1 OPENNLP-1850 Fail loud on a Word_Break line missing its ';'
(tokenizer)
discard 6b4b32637 OPENNLP-1850 UAX #29 word tokenizer: WordSegmenter,
WordTokenizer, WordType (2a)
add 45ad38b33 OPENNLP-1850 Review: make AlignedText.normalized a
CharSequence; add normalizedString()
add f85f3d8f3 OPENNLP-1850 UAX #29 word tokenizer: WordSegmenter,
WordTokenizer, WordType (2a)
add 21aadedba OPENNLP-1850 Fail loud on a Word_Break line missing its ';'
(tokenizer)
add 58c9b1607 OPENNLP-1850 Review nits: ExtendedPictographic fail-loud
parity + doc; WordType heuristic note (tokenizer)
add 99035083c OPENNLP-1850 Layered Term model: Term, TermAnalyzer (2b)
add 6ee2f9af5 OPENNLP-1850 Review nits: rename dashes()->dash(); LEMMA
doc+test; soften forward-link (Term)
add 45ac5e9d8 OPENNLP-1850 Review nits: TermAnalyzer javadoc references
matchingAnalyzer()
add 0e71a106c OPENNLP-1850 Per-language NormalizationProfile registry (2c)
add f3af8f077 OPENNLP-1850 Resolve Norwegian nb/nn to the Norwegian
profile (profiles)
add c4d2c65d4 OPENNLP-1850 Review nits: add Turkish profile; derive
coverage from the enum (profiles)
add b76df0544 OPENNLP-1850 Review nits: rename
searchAnalyzer->matchingAnalyzer; drop 'search' framing in profile docs
add a91c0897d OPENNLP-1850 Offset-safe, Unicode-aware input normalization
in the DL components
add cec20785e OPENNLP-1850 Add OffsetMappingNameFinder capability
interface and a findInOriginal end-to-end test
add 22542a85d OPENNLP-1850 Resolve overlapping chunk spans and compose the
input alignment
add b43dc8782 OPENNLP-1850 Add real-model chunk-boundary eval tests; drop
dead label constants
add 28105ffeb OPENNLP-1850 Harden fail-loud paths in the DL components
add 6c6ff19fc OPENNLP-1850 Fail fast on null finder input; fix the GPU
eval test options
add f6f8fc11e OPENNLP-1850 Fail loud on corrupt document-classification
model output
add c9be56b69 OPENNLP-1850 Fully-qualify TokenNameFinder javadoc links in
NameFinderDL
add 0ac38937c OPENNLP-1850 Reject non-finite logits in softmax, not just
NaN (dl)
add e57a1ee14 OPENNLP-1850 Make mergeOverlappingSpans O(n log n) (dl)
add c60cc29e5 OPENNLP-1850 Review nits: extract testable DL guards;
merge-copy; capitalize msgs; migration note
add c9d739198 OPENNLP-1850 Follow AlignedText.normalized() -> CharSequence
in NameFinderDL
This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version. This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:
 * -- * -- B -- O -- O -- O   (105d8a17c)
            \
             N -- N -- N   refs/heads/OPENNLP-1850-3-dl (c9d739198)
You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.
Any revisions marked "omit" are not gone; other references still
refer to them. Any revisions marked "discard" are gone forever.
No new revisions were added by this update.
Summary of changes:
 .../opennlp/tools/util/normalizer/AlignedText.java | 15 +++-
 .../tools/util/normalizer/AlignedTextTest.java     | 93 ++++++++++++++++++++++
 .../java/opennlp/dl/namefinder/NameFinderDL.java   |  2 +-
 .../normalizer/AlignedNormalizerPipelineTest.java  |  4 +-
 4 files changed, 110 insertions(+), 4 deletions(-)
 create mode 100644 opennlp-api/src/test/java/opennlp/tools/util/normalizer/AlignedTextTest.java