[ https://issues.apache.org/jira/browse/OPENNLP-1446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800499#comment-17800499 ]
ASF GitHub Bot commented on OPENNLP-1446:
-----------------------------------------

mawiesne commented on PR #113:
URL: https://github.com/apache/opennlp-sandbox/pull/113#issuecomment-1869570276

   Note: These changes also fix significant performance problems caused by inefficient code, e.g. loading the same resources multiple times for no benefit and handling regex patterns inefficiently. The tests now execute much faster than with the previous version.

> Investigate why LeskEvaluatorTest and MFSEvaluatorTest fail while parsing 'EnglishLS.train'
> -------------------------------------------------------------------------------------------
>
>                 Key: OPENNLP-1446
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1446
>             Project: OpenNLP
>          Issue Type: Task
>          Components: wsd
>    Affects Versions: 2.1.0
>            Reporter: Martin Wiesner
>            Assignee: Martin Wiesner
>            Priority: Minor
>             Fix For: 2.3.2
>
>
> The _LeskEvaluatorTest_ & _MFSEvaluatorTest_ in the _opennlp-wsd_ sandbox
> component both fail while parsing the 'EnglishLS.train' file. The data is
> unmodified, as downloaded from
> {{[https://web.eecs.umich.edu/~mihalcea/senseval/senseval3/data.html]}}
> h4. {{Aims:}}
> * Investigate what causes the XML parsing to fail
> * Fix it and make both existing tests pass
> * Optional: Improve the existing test code to be more strict.
> h4. Note:
> The test setup to reproduce this is on a branch that is to be merged into the
> main branch.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
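
For illustration only: the two kinds of fix mentioned in the comment above (loading a resource once, and not recompiling regex patterns on every call) might look roughly like the sketch below. This is not the code from PR #113. The class `LoadOnceSketch`, its methods, and the stop-word list are hypothetical stand-ins, and the "resource" is simulated in memory rather than read from a file.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.regex.Pattern;

// Hypothetical sketch; names are NOT taken from the opennlp-wsd sources.
final class LoadOnceSketch {

    // Compiled once when the class is initialized, instead of calling
    // text.split("\\s+") repeatedly, which typically recompiles the regex each time.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Counts how often the (simulated) expensive load actually runs.
    static final AtomicInteger LOAD_COUNT = new AtomicInteger();

    // Initialization-on-demand holder: the JVM runs this initializer at most once,
    // and only when stopWords() is first called.
    private static final class StopWordsHolder {
        static final List<String> STOP_WORDS = loadStopWords();
    }

    // Stand-in for an expensive resource load (assumption: real code would read a file).
    private static List<String> loadStopWords() {
        LOAD_COUNT.incrementAndGet();
        return List.of("the", "a", "of");
    }

    // Every caller shares the single loaded instance.
    static List<String> stopWords() {
        return StopWordsHolder.STOP_WORDS;
    }

    // Splits on runs of whitespace using the precompiled pattern.
    static String[] tokenize(String text) {
        return WHITESPACE.split(text.trim());
    }

    public static void main(String[] args) {
        for (String token : tokenize("the bank of the river")) {
            boolean stop = stopWords().contains(token);
            System.out.println(token + (stop ? " (stop word)" : ""));
        }
        System.out.println("resource loads: " + LOAD_COUNT.get());
    }
}
```

The holder idiom is just one way to get load-once behaviour without explicit locking. A field initialized in a test fixture's setup method, or a small cache keyed by resource name, would serve the same purpose.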