[
https://issues.apache.org/jira/browse/OPENNLP-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16447783#comment-16447783
]
ASF GitHub Bot commented on OPENNLP-1194:
-----------------------------------------
kottmann commented on a change in pull request #312: OPENNLP-1194: Adds type
name filter to BratDocumentParser
URL: https://github.com/apache/opennlp/pull/312#discussion_r183317571
##########
File path:
opennlp-tools/src/main/java/opennlp/tools/formats/brat/BratNameSampleStreamFactory.java
##########
@@ -148,7 +154,15 @@ else if ("whitespace".equals(tokenizerName)) {
}
}
- return new BratNameSampleStream(sentDetector, tokenizer, samples);
+ Set<String> nameTypes = null;
+ if (params.getNameTypes() != null) {
+ String[] nameTypesArr = params.getNameTypes().split(",");
+ if (nameTypesArr != null && nameTypesArr.length > 0) {
Review comment:
String.split(",") never returns null. It returns an empty array if it is
called like this: ",".split(","). It can also return strings with surrounding
spaces, so it probably makes sense to apply trim().
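
A minimal sketch of the behavior described in this comment, assuming the name types arrive as one comma-separated string. The class NameTypeParsing and the helper parseNameTypes are hypothetical names used for illustration only; they are not part of OpenNLP or of this pull request.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class NameTypeParsing {

  // Hypothetical helper (not OpenNLP API): turns "person, location" into
  // a set of trimmed, non-empty type names. No null check on the result
  // of split() is needed, because split() never returns null.
  static Set<String> parseNameTypes(String raw) {
    Set<String> nameTypes = new LinkedHashSet<>();
    if (raw == null) {
      return nameTypes;
    }
    for (String part : raw.split(",")) {
      String trimmed = part.trim();
      // Skip entries that are empty or contain only whitespace.
      if (!trimmed.isEmpty()) {
        nameTypes.add(trimmed);
      }
    }
    return nameTypes;
  }

  public static void main(String[] args) {
    // Trailing empty strings are dropped, so this array has no elements.
    System.out.println(",".split(",").length);
    // An empty input yields a single element: the empty string itself.
    System.out.println("".split(",").length);
    // Without trim(), the second element would keep its leading space.
    System.out.println(parseNameTypes("person, location ,,"));
  }
}
```

Whether a missing parameter should map to an empty set, as in this sketch, or to null, as in the diff above, is a design choice left to the caller.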
----------------------------------------------------------------
> Brat Document Parser should support name type filters
> -----------------------------------------------------
>
> Key: OPENNLP-1194
> URL: https://issues.apache.org/jira/browse/OPENNLP-1194
> Project: OpenNLP
> Issue Type: Improvement
> Components: Formats
> Reporter: William Colen
> Assignee: William Colen
> Priority: Major
> Fix For: 1.8.5
>
>
> Brat Document Parser fails if there is a span overlap. Sometimes we are
> interested in only some types. In that case we could ignore overlapping
> spans whose types are not of interest.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)