Revision: 6078
http://languagetool.svn.sourceforge.net/languagetool/?rev=6078&view=rev
Author: dominikoeo
Date: 2011-12-20 22:37:19 +0000 (Tue, 20 Dec 2011)
Log Message:
-----------
[br] Minor update to the Breton tokenizer: treat U+02BC (MODIFIER LETTER APOSTROPHE) as a quote, like the other apostrophe characters.
Modified Paths:
--------------
trunk/JLanguageTool/src/java/org/languagetool/tokenizers/br/BretonWordTokenizer.java
Modified: trunk/JLanguageTool/src/java/org/languagetool/tokenizers/br/BretonWordTokenizer.java
===================================================================
--- trunk/JLanguageTool/src/java/org/languagetool/tokenizers/br/BretonWordTokenizer.java	2011-12-20 22:36:19 UTC (rev 6077)
+++ trunk/JLanguageTool/src/java/org/languagetool/tokenizers/br/BretonWordTokenizer.java	2011-12-20 22:37:19 UTC (rev 6078)
@@ -51,8 +51,8 @@
     // FIXME: this is a bit of a hacky way to tokenize. It should work
     // but I should work on a more elegant way.
-    String replaced = text.replaceAll("([Cc])['’‘]([Hh])", "$1##BR_APOS##$2")
-                          .replaceAll("(\\p{L})['’‘]", "$1##BR_APOS## ");
+    String replaced = text.replaceAll("([Cc])['’‘ʼ]([Hh])", "$1##BR_APOS##$2")
+                          .replaceAll("(\\p{L})['’‘ʼ]", "$1##BR_APOS## ");
     final List<String> tokenList = super.tokenize(replaced);
     List<String> tokens = new ArrayList<String>();
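
Note: the two replaceAll calls in the hunk above can be tried in isolation with a sketch like the one below. It is not part of the commit. The class name BretonApostropheSketch and the method name markApostrophes are hypothetical, chosen only for illustration; the two patterns and the ##BR_APOS## placeholder are copied from the diff. The rest of tokenize(), including how the placeholder is turned back into an apostrophe, is not shown in this hunk and is not reproduced here.

```java
import java.util.regex.Pattern;

// Hypothetical stand-alone sketch; not the actual BretonWordTokenizer class.
final class BretonApostropheSketch {
    // Characters treated as an apostrophe after rev 6078:
    // ASCII ', U+2019, U+2018, and the newly added U+02BC.
    private static final String APOS = "['\u2019\u2018\u02BC]";

    // c'h is a single Breton letter, so the apostrophe stays inside the word.
    private static final Pattern CH = Pattern.compile("([Cc])" + APOS + "([Hh])");

    // Any other letter followed by an apostrophe is treated as an elision:
    // the placeholder is followed by a space so the next word splits off.
    private static final Pattern ELISION = Pattern.compile("(\\p{L})" + APOS);

    // Applies the same two replacements, in the same order, as the diff.
    static String markApostrophes(String text) {
        String step1 = CH.matcher(text).replaceAll("$1##BR_APOS##$2");
        return ELISION.matcher(step1).replaceAll("$1##BR_APOS## ");
    }

    public static void main(String[] args) {
        System.out.println(markApostrophes("c\u02BChoant"));
        System.out.println(markApostrophes("n\u02BCeus"));
    }
}
```

The order of the two replacements matters: the c'h pattern runs first, so by the time the more general elision pattern runs, the apostrophe inside c'h has already been replaced by the placeholder and cannot be split.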
_______________________________________________
Languagetool-cvs mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/languagetool-cvs