Revision: 7858
          
http://languagetool.svn.sourceforge.net/languagetool/?rev=7858&view=rev
Author:   dominikoeo
Date:     2012-08-13 19:52:26 +0000 (Mon, 13 Aug 2012)
Log Message:
-----------
- The word tokenizer now considers the pipe | and backtick ` characters
  as word separators.  I intended to make the star * a word separator too,
  but that would currently break 2 rules: the German rule
  LEERZEICHEN_RECHENZEICHEN and the Italian rule GR_09.

Modified Paths:
--------------
    trunk/JLanguageTool/CHANGES.txt
    trunk/JLanguageTool/src/java/org/languagetool/tokenizers/WordTokenizer.java

Modified: trunk/JLanguageTool/CHANGES.txt
===================================================================
--- trunk/JLanguageTool/CHANGES.txt     2012-08-13 18:27:55 UTC (rev 7857)
+++ trunk/JLanguageTool/CHANGES.txt     2012-08-13 19:52:26 UTC (rev 7858)
@@ -66,6 +66,8 @@
 
  -HTTP API: the XML output has been extended to include the category of the match
 
+ -The word tokenizer now considers the following characters as word separator: | (pipe)
+  and` (backtick).
 
 1.8 (2012-06-30)
 

Modified: trunk/JLanguageTool/src/java/org/languagetool/tokenizers/WordTokenizer.java
===================================================================
--- trunk/JLanguageTool/src/java/org/languagetool/tokenizers/WordTokenizer.java 2012-08-13 18:27:55 UTC (rev 7857)
+++ trunk/JLanguageTool/src/java/org/languagetool/tokenizers/WordTokenizer.java 2012-08-13 19:52:26 UTC (rev 7858)
@@ -37,7 +37,7 @@
   public List<String> tokenize(final String text) {
     final List<String> l = new ArrayList<String>();
     final StringTokenizer st = new StringTokenizer(text, 
-        "\u0020\u00A0\u115f\u1160\u1680" 
+        "\u0020\u0060\u007c\u00A0\u115f\u1160\u1680" 
         + "\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007" 
         + "\u2008\u2009\u200A\u200B\u200c\u200d\u200e\u200f"
         + "\u2028\u2029\u202a\u202b\u202c\u202d\u202e\u202f"
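
For illustration, here is a minimal standalone sketch of what the added
delimiters do. It is not the actual WordTokenizer code: the class name
TokenizerSketch is hypothetical, only four of the delimiter characters from
the hunk above are included, and passing returnDelims=true (so that separators
come back as tokens of their own) is an assumption, since that argument is not
visible in this diff.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerSketch {

  // Subset of the delimiter set from the diff:
  // space, backtick, pipe and no-break space.
  private static final String DELIMITERS = "\u0020\u0060\u007c\u00A0";

  public static List<String> tokenize(final String text) {
    final List<String> tokens = new ArrayList<String>();
    // Assumption: separators are returned as one-character tokens.
    final StringTokenizer st = new StringTokenizer(text, DELIMITERS, true);
    while (st.hasMoreTokens()) {
      tokens.add(st.nextToken());
    }
    return tokens;
  }

  public static void main(final String[] args) {
    // Pipe and backtick now split words; the star does not.
    System.out.println(tokenize("foo|bar `baz`"));
    System.out.println(tokenize("2*3"));
  }
}
```

With this delimiter set, "foo|bar" splits into three tokens, while "2*3" stays
a single token because the star is not a separator, which matches the
behaviour described in the log message.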
