Revision: 6634
          http://languagetool.svn.sourceforge.net/languagetool/?rev=6634&view=rev
Author:   dominikoeo
Date:     2012-03-24 12:29:14 +0000 (Sat, 24 Mar 2012)
Log Message:
-----------
- No longer consider the ellipsis (…) as a sentence separator.
  It was causing false positives at least in French and Breton
  as in "Mais… c'est mon ami." (no upper case after ellipsis).
  An ellipsis does not always separate sentences. Later, I will
  change French and Breton to use the SRX tokenizer (but only
  after the 1.7 release).

Modified Paths:
--------------
    trunk/JLanguageTool/src/java/org/languagetool/tokenizers/SentenceTokenizer.java

Modified: trunk/JLanguageTool/src/java/org/languagetool/tokenizers/SentenceTokenizer.java
===================================================================
--- trunk/JLanguageTool/src/java/org/languagetool/tokenizers/SentenceTokenizer.java 2012-03-23 19:11:39 UTC (rev 6633)
+++ trunk/JLanguageTool/src/java/org/languagetool/tokenizers/SentenceTokenizer.java 2012-03-24 12:29:14 UTC (rev 6634)
@@ -38,7 +38,7 @@
   // end of sentence marker:
   protected static final String EOS = "\0";
   //private final static String EOS = "#"; // for testing only
-  protected static final String P = "[\\.!?…]"; // PUNCTUATION
+  protected static final String P = "[\\.!?]"; // PUNCTUATION
   protected static final String AP = "(?:'|«|\"||\\)|\\]|\\})?"; // AFTER PUNCTUATION
   protected static final String PAP = P + AP;
   protected static final String PARENS = "[\\(\\)\\[\\]]"; // parentheses
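
The effect of the change to P can be illustrated with a small standalone sketch. This is not the real SentenceTokenizer logic: the class EllipsisSplitSketch, its split() helper and the simplified AP group are assumptions made for illustration only. The sketch marks a boundary after any punctuation character that is followed by whitespace, then splits on the EOS marker, once with the old character class and once with the new one.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative sketch only; the real SentenceTokenizer applies many more rules.
class EllipsisSplitSketch {
  // End-of-sentence marker, as in the diff above.
  static final String EOS = "\0";
  // Punctuation class before this revision (includes the ellipsis, U+2026).
  static final String P_OLD = "[\\.!?\u2026]";
  // Punctuation class after this revision.
  static final String P_NEW = "[\\.!?]";
  // Simplified "after punctuation" group (assumption: not the exact AP from the source).
  static final String AP = "(?:'|\u00AB|\"|\\)|\\]|\\})?";

  // Hypothetical helper: insert EOS after punctuation followed by whitespace, then split on EOS.
  static List<String> split(String text, String punctuation) {
    Pattern boundary = Pattern.compile("(" + punctuation + AP + ")(\\s+)");
    String marked = boundary.matcher(text).replaceAll("$1$2" + EOS);
    return Arrays.asList(marked.split(EOS));
  }

  public static void main(String[] args) {
    String text = "Mais\u2026 c'est mon ami.";
    System.out.println(split(text, P_OLD).size() + " segment(s) with the old P");
    System.out.println(split(text, P_NEW).size() + " segment(s) with the new P");
  }
}
```

With the old class, the example from the log message is cut after the ellipsis, so "c'est mon ami." looks like a sentence that starts in lower case. With the new class it stays one sentence.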

_______________________________________________
Languagetool-cvs mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/languagetool-cvs
