[
https://issues.apache.org/jira/browse/OPENNLP-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alan Wang updated OPENNLP-1316:
-------------------------------
Description:
Hi, I would like to know whether OpenNLP should expand contractions,
e.g. +_*n't*_+ -> _*not*_, +_*'ve*_+ -> _*have*_, +_*'m*_+ -> _*am*_. However,
+_*'s*_+ can be expanded to _*is*_ or _*has*_, and +_*'d*_+ can be expanded to
_*had*_ or _*would*_, depending on the context.
Two possible approaches:
1. Use the POS tagger to tag contractions and decide which expansion to use.
2. Like NLTK, expand only the contractions that are not ambiguous.
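To make the two approaches concrete, here is a rough sketch in plain Java. It is not an existing OpenNLP API: the class name ContractionExpander and the method expand are hypothetical. It assumes the input is already tokenized with clitics split off (e.g. "I", "'ve") and that the optional tags are Penn Treebank style (MD, VBD, VBZ, VBN, POS). The rule "'s followed by a past participle means has" is only a heuristic I am assuming here; it will be wrong for passives such as "he's called".

```java
import java.util.Map;

/** Hypothetical helper, not part of OpenNLP: expands English clitic tokens. */
public class ContractionExpander {

    // Approach 2: clitics whose expansion does not depend on context.
    private static final Map<String, String> UNAMBIGUOUS =
            Map.of("n't", "not", "'ve", "have", "'m", "am");

    /**
     * Expands contractions in an already tokenized sentence.
     *
     * @param tokens tokens with clitics split off, e.g. {"I", "'ve"}
     * @param tags   Penn Treebank style tags aligned with tokens, or null
     *               to expand only the unambiguous clitics
     */
    public static String[] expand(String[] tokens, String[] tags) {
        if (tags != null && tags.length != tokens.length) {
            throw new IllegalArgumentException("tokens and tags must align");
        }
        String[] out = new String[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            String lower = tokens[i].toLowerCase();
            String fixed = UNAMBIGUOUS.get(lower);
            if (fixed != null) {
                out[i] = fixed;
            } else if (tags != null && lower.equals("'d")) {
                out[i] = expandD(tokens[i], tags[i]);
            } else if (tags != null && lower.equals("'s")) {
                String nextTag = (i + 1 < tags.length) ? tags[i + 1] : "";
                out[i] = expandS(tokens[i], tags[i], nextTag);
            } else {
                out[i] = tokens[i]; // unknown or ambiguous without tags
            }
        }
        return out;
    }

    // Approach 1 for 'd: modal -> would, past-tense verb -> had.
    private static String expandD(String token, String tag) {
        if (tag.equals("MD")) return "would";
        if (tag.equals("VBD")) return "had";
        return token;
    }

    // Approach 1 for 's: a possessive (POS) is left alone; for a verb,
    // assume "has" before a past participle and "is" otherwise.
    private static String expandS(String token, String tag, String nextTag) {
        if (!tag.equals("VBZ")) return token;
        return nextTag.equals("VBN") ? "has" : "is";
    }

    public static void main(String[] args) {
        String[] tokens = {"He", "'s", "gone"};
        String[] tags = {"PRP", "VBZ", "VBN"};
        System.out.println(String.join(" ", expand(tokens, tags)));
    }
}
```

Passing null for tags gives approach 2 (only n't, 've and 'm are expanded); passing tags gives approach 1. Irregular stems are not handled in this sketch, so "ca" + "n't" would come out as "ca not".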
Thanks!
was:
Hi, I would like to know whether OpenNLP should expand contractions,
e.g. +_*n't*_+ -> _*not*_, +_*'ve*_+ -> _*have*_, +_*'m*_+ -> _*am*_. However,
+_*'s*_+ can be expanded to _*is*_ or _*has*_, and +_*'d*_+ can be expanded to
_*had*_ or _*would*_, depending on the context.
Two possible approaches:
1. Use the POS tagger to tag contractions and decide which expansion to use.
2. Like NLTK, expand only the acronyms that are not ambiguous.
Thanks!
> Expand common contractions in the English language
> --------------------------------------------------
>
> Key: OPENNLP-1316
> URL: https://issues.apache.org/jira/browse/OPENNLP-1316
> Project: OpenNLP
> Issue Type: Improvement
> Reporter: Alan Wang
> Priority: Minor
>
> Hi, I would like to know whether OpenNLP should expand contractions,
> e.g. +_*n't*_+ -> _*not*_, +_*'ve*_+ -> _*have*_, +_*'m*_+ -> _*am*_. However,
> +_*'s*_+ can be expanded to _*is*_ or _*has*_, and +_*'d*_+ can be expanded to
> _*had*_ or _*would*_, depending on the context.
> Two possible approaches:
> 1. Use the POS tagger to tag contractions and decide which expansion to use.
> 2. Like NLTK, expand only the contractions that are not ambiguous.
> Thanks!
--
This message was sent by Atlassian Jira
(v8.3.4#803005)