[
https://issues.apache.org/jira/browse/OPENNLP-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17798184#comment-17798184
]
ASF GitHub Bot commented on OPENNLP-1526:
-----------------------------------------
mawiesne commented on code in PR #566:
URL: https://github.com/apache/opennlp/pull/566#discussion_r1430079027
##########
opennlp-tools/lang/es/abb_ES.xml:
##########
@@ -0,0 +1,254 @@
+<?xml version="1.0" encoding="UTF-8"?>
+
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+
+<dictionary case_sensitive="false">
+ <entry>
+ <token>a.C.</token>
+ </entry>
+ <entry>
+ <token>a. de C.</token>
+ </entry>
+ <entry>
+ <token>a.J.C.</token>
+ </entry>
+ <entry>
+ <token>a. de J.C.</token>
+ </entry>
+ <entry>
+ <token>a. m.</token>
+ </entry>
+ <entry>
+ <token>apdo.</token>
+ </entry>
+ <entry>
+ <token>apdo.</token>
+ </entry>
+ <entry>
+ <token>aprox.</token>
+ </entry>
+ <entry>
+ <token>Av.</token>
+ </entry>
+ <entry>
+ <token>Avda.</token>
+ </entry>
+ <entry>
+ <token>Bs. As.</token>
+ </entry>
+ <entry>
+ <token>c.c.</token>
+ </entry>
+ <entry>
+ <token>cap.</token>
+ </entry>
+ <entry>
+ <token>D.</token>
+ </entry>
+ <entry>
+ <token>Da.</token>
+ </entry>
+ <entry>
+ <token>Dña.</token>
+ </entry>
+ <entry>
+ <token>d.C.</token>
+ </entry>
+ <entry>
+ <token>d. de C.</token>
+ </entry>
+ <entry>
+ <token>d.J.C.</token>
+ </entry>
+ <entry>
+ <token>d. de J.C</token>
Review Comment:
It does not have to end with a dot. See the other comment by Bruno about `n.°` above:
that example does not end with a dot, either.
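
A quick check along the following lines could help spot entries worth a second look before merging. It is only a sketch in Python and not part of the PR: the `audit_tokens` helper and the inline `SAMPLE` are hypothetical, and the sample merely mirrors a few entries quoted in this thread (the repeated `apdo.`, `d. de J.C`, and `n.°`). It lists tokens that occur more than once and tokens without a trailing dot; the latter are not necessarily wrong, as noted above.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical excerpt in the same layout as abb_ES.xml (not the full file).
SAMPLE = """<dictionary case_sensitive="false">
  <entry><token>apdo.</token></entry>
  <entry><token>apdo.</token></entry>
  <entry><token>d. de J.C</token></entry>
  <entry><token>n.°</token></entry>
</dictionary>"""


def audit_tokens(xml_text):
    """Return (duplicates, no_trailing_dot) for a dictionary XML string."""
    root = ET.fromstring(xml_text)
    tokens = [(t.text or "").strip() for t in root.iter("token")]

    # With case_sensitive="false", compare tokens case-insensitively.
    fold = root.get("case_sensitive", "false").lower() != "true"
    keys = [tok.casefold() if fold else tok for tok in tokens]

    counts = Counter(keys)
    duplicates = sorted(key for key, n in counts.items() if n > 1)

    # Tokens without a trailing dot are only listed for review, not rejected.
    no_trailing_dot = sorted({tok for tok in tokens if not tok.endswith(".")})
    return duplicates, no_trailing_dot


duplicates, no_trailing_dot = audit_tokens(SAMPLE)
print("duplicates:", duplicates)
print("no trailing dot:", no_trailing_dot)
```

To run it against the real file, read the file contents as UTF-8 text and pass them to `audit_tokens` in place of `SAMPLE`.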
> Add Spanish abbreviation dictionary
> -----------------------------------
>
> Key: OPENNLP-1526
> URL: https://issues.apache.org/jira/browse/OPENNLP-1526
> Project: OpenNLP
> Issue Type: Improvement
> Components: Sentence Detector, Tokenizer
> Affects Versions: 2.3.0, 2.3.1
> Reporter: Martin Wiesner
> Assignee: Martin Wiesner
> Priority: Minor
> Fix For: 2.3.2
>
> Attachments: abb_ES.xml
>
> Time Spent: 1h
> Remaining Estimate: 1h
>
> Similar to the addition in OPENNLP-570, an abbreviation dictionary for
> Spanish sentence detection and tokenisation might be beneficial.
> Aims:
> - Create and add a new file {{abb_ES.xml}} to _opennlp-tools/lang/es_
> - Add basic set of test cases