Revision: 7132
http://languagetool.svn.sourceforge.net/languagetool/?rev=7132&view=rev
Author: milek_pl
Date: 2012-05-31 22:25:42 +0000 (Thu, 31 May 2012)
Log Message:
-----------
[en] fix word tokenization in "they had no use for the glottal
stop—the first phoneme of the Phoenician pronunciation of the
letter"
Modified Paths:
--------------
trunk/JLanguageTool/src/java/org/languagetool/tokenizers/en/EnglishWordTokenizer.java
Modified:
trunk/JLanguageTool/src/java/org/languagetool/tokenizers/en/EnglishWordTokenizer.java
===================================================================
--- trunk/JLanguageTool/src/java/org/languagetool/tokenizers/en/EnglishWordTokenizer.java 2012-05-31 20:35:03 UTC (rev 7131)
+++ trunk/JLanguageTool/src/java/org/languagetool/tokenizers/en/EnglishWordTokenizer.java 2012-05-31 22:25:42 UTC (rev 7132)
@@ -44,7 +44,7 @@
             + "\u2028\u2029\u202a\u202b\u202c\u202d\u202e\u202f"
             + "\u205F\u2060\u2061\u2062\u2063\u206A\u206b\u206c\u206d"
             + "\u206E\u206F\u3000\u3164\ufeff\uffa0\ufff9\ufffa\ufffb"
-            + ",.;()[]{}!?:\"'’‘„“”…\\/\t\n", true);
+            + "—,.;()[]{}!?:\"'’‘„“”…\\/\t\n", true);
     while (st.hasMoreElements()) {
       tokens.add(st.nextToken());
     }
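
For readers of this diff, the effect of the change can be sketched in
isolation. The class below is a hypothetical stand-alone example, not code
from the LanguageTool tree: the class name, the tokenize() helper, and the
shortened delimiter strings are assumptions made for illustration (the real
delimiter list is much longer). It relies only on java.util.StringTokenizer
with returnDelims set to true, as in the patched line, and shows that adding
the em dash (U+2014) to the delimiter set splits "stop—the" into three
tokens instead of one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical sketch; not part of EnglishWordTokenizer.
class EmDashTokenizerSketch {

    // Assumed, shortened delimiter sets; the real tokenizer lists many more characters.
    static final String DELIMS_BEFORE = " ,.;()[]{}!?:\"'\t\n";
    // Same set with the em dash (U+2014) prepended, mirroring the change in rev 7132.
    static final String DELIMS_AFTER = "\u2014" + DELIMS_BEFORE;

    static List<String> tokenize(String text, String delims) {
        List<String> tokens = new ArrayList<>();
        // returnDelims = true: every delimiter character is returned as its own token.
        StringTokenizer st = new StringTokenizer(text, delims, true);
        while (st.hasMoreElements()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "stop\u2014the first";
        // Without the em dash as a delimiter, "stop—the" stays a single token.
        System.out.println(tokenize(text, DELIMS_BEFORE));
        // With it, the text splits into "stop", "—", "the", " ", "first".
        System.out.println(tokenize(text, DELIMS_AFTER));
    }
}
```

Because the delimiter is returned as a token of its own, later processing
still sees the dash; only the word boundaries around it change.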
_______________________________________________
Languagetool-cvs mailing list
https://lists.sourceforge.net/lists/listinfo/languagetool-cvs