[
https://issues.apache.org/jira/browse/OPENNLP-239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13078733#comment-13078733
]
James Kosin commented on OPENNLP-239:
-------------------------------------
Maybe... the issue is also with comparisons. The Dictionary class takes a flag
for case sensitivity, and the equals() method of every class and sub-class
returned from the Dictionary is adjusted based on this flag so that the flag
persists. That is one reason why it would be difficult to remove the duplicate
isCaseSensitive flag from the StringListWrapper class.
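
To illustrate the point about equals() depending on the flag, here is a minimal
sketch. It is not the actual OpenNLP code: CaseFlagSketch and TokenListWrapper
are hypothetical names, and the wrapper only assumes that each entry carries
its own copy of the case-sensitivity flag so that comparisons keep working once
the entry has left the dictionary.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; NOT the actual OpenNLP Dictionary or StringListWrapper.
class CaseFlagSketch {

    /** Wraps a token list and carries its own copy of the case flag. */
    static final class TokenListWrapper {
        private final List<String> tokens;
        private final boolean caseSensitive;

        TokenListWrapper(boolean caseSensitive, String... tokens) {
            this.tokens = Arrays.asList(tokens.clone());
            this.caseSensitive = caseSensitive;
        }

        // Assumes both wrappers come from the same dictionary and therefore
        // share the same flag; mixing flags would make equals() asymmetric.
        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof TokenListWrapper)) return false;
            TokenListWrapper other = (TokenListWrapper) obj;
            if (tokens.size() != other.tokens.size()) return false;
            for (int i = 0; i < tokens.size(); i++) {
                String a = tokens.get(i);
                String b = other.tokens.get(i);
                boolean same = caseSensitive ? a.equals(b) : a.equalsIgnoreCase(b);
                if (!same) return false;
            }
            return true;
        }

        // Always hash the lower-cased tokens, so entries that are equal in
        // either mode land in the same hash bucket (fine for a sketch).
        @Override
        public int hashCode() {
            int h = 1;
            for (String t : tokens) {
                h = 31 * h + t.toLowerCase(Locale.ROOT).hashCode();
            }
            return h;
        }
    }

    public static void main(String[] args) {
        Set<TokenListWrapper> entries = new HashSet<>();
        entries.add(new TokenListWrapper(false, "New", "York"));
        // The lookup key uses the same flag as the stored entries.
        System.out.println(entries.contains(new TokenListWrapper(false, "new", "york")));
    }
}
```

Dropping the per-entry flag would mean every entry has to reach back to its
dictionary to find out how to compare itself, which is the difficulty
described above.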
If the dictionary were created with the flag set to true, then all entries
would use a case-sensitive comparison. The downside is that some of the tools
allow us to modify the case sensitivity after dictionary creation, in the sense
of changing the meaning (by creation I mean when the dictionary was built from
the original data, not when the user is using the dictionary).
The original code, however, would always use the flag the dictionary was
created with to determine the case comparison rules (that is, the flag set when
the user created the dictionary from the XML stream). I believe this loses the
original intent of the flag, and the dictionary may not work as intended.
The changes I've made so far allow us to change the dictionary flag (to the
user's preference), if allowed, and allow the dictionary to be used to compare
two different strings with the appropriate case-sensitive or case-insensitive
test.
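
As a rough sketch of that idea (hypothetical names throughout, not the code in
the patch): the dictionary owns a flag that can be changed when changes are
permitted, and it exposes a comparison that always honours the current value of
the flag. FlaggedDictionary, setCaseSensitive and tokensMatch are assumptions
made for illustration only.

```java
// Hypothetical sketch; class and method names are illustrative, not OpenNLP API.
class MutableFlagSketch {

    static final class FlaggedDictionary {
        private boolean caseSensitive;
        private final boolean flagChangeAllowed;

        FlaggedDictionary(boolean caseSensitive, boolean flagChangeAllowed) {
            this.caseSensitive = caseSensitive;
            this.flagChangeAllowed = flagChangeAllowed;
        }

        boolean isCaseSensitive() {
            return caseSensitive;
        }

        /** Changes the flag to the user's preference, if that is allowed. */
        void setCaseSensitive(boolean caseSensitive) {
            if (!flagChangeAllowed) {
                throw new IllegalStateException("case sensitivity is fixed for this dictionary");
            }
            this.caseSensitive = caseSensitive;
        }

        /** Compares two strings using whatever the flag says right now. */
        boolean tokensMatch(String a, String b) {
            return caseSensitive ? a.equals(b) : a.equalsIgnoreCase(b);
        }
    }

    public static void main(String[] args) {
        // Created as case sensitive, e.g. because the stored data said so.
        FlaggedDictionary dict = new FlaggedDictionary(true, true);
        System.out.println(dict.tokensMatch("NNP", "nnp"));

        // The user later asks for case-insensitive matching.
        dict.setCaseSensitive(false);
        System.out.println(dict.tokensMatch("NNP", "nnp"));
    }
}
```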
Immutable also means the DictionarySerializer would need to return the
Dictionary class created from the input stream, which I was trying to avoid
if possible.
> Case Sensitivie Flag & Custom Tag Dictionary
> --------------------------------------------
>
> Key: OPENNLP-239
> URL: https://issues.apache.org/jira/browse/OPENNLP-239
> Project: OpenNLP
> Issue Type: New Feature
> Components: Parser
> Affects Versions: tools-1.5.1-incubating
> Reporter: mark meiklejohn
> Assignee: James Kosin
> Fix For: tools-1.5.2-incubating
>
>
> Unable to set case sensitive flag as per TreebankParser 1.3.1 or use a custom
> tag dictionary