[ https://issues.apache.org/jira/browse/CODEC-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13058180#comment-13058180 ]
Gary D. Gregory commented on CODEC-125:
---------------------------------------
Hi Matthew:
Thank you for the 2nd patch.
I just spent an hour trying to figure this out tonight and here is how far I
got using the latest patch dated 29/Jun/11 21:40.
First, a code comment: let's put all the code in the {{language.bm}} package
instead of splitting it across two packages. I prefer the name {{bm}} for now;
the {{pm}} seems redundant under {{language}}.
There seems to be a fundamental problem with the regex strings in this patch.
The first error I ran into was in {{lang.txt}}. The first non-comment line is:
{{^o’ english true}}
which gave a {{PatternSyntaxException}} I can no longer reproduce.
I commented that line out. Then I got a {{PatternSyntaxException}} on lines
that start with {{?}}. This makes sense, since {{?}} is a quantifier and has
nothing to apply to at the start of a pattern.
So I hacked the code in Lang.java to skip lines that start with {{?}}, like so:
{code:java}
Pattern pattern = null;
final String regex = parts[0];
// Temporary hack: skip rules whose regex starts with a dangling '?'.
if (regex.charAt(0) != '?') {
    try {
        pattern = Pattern.compile(regex);
    } catch (PatternSyntaxException e) {
        throw new IllegalArgumentException(
                "Error compiling regex at line " + lineNo + ": " + regex, e);
    }
    String[] langs = parts[1].split("\\+");
    boolean accept = parts[2].equals("true");
    rules.add(new LangRule(pattern,
            new HashSet<String>(Arrays.asList(langs)), accept));
}
{code}
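For reference, here is a minimal sketch of why a leading {{?}} fails and what escaping would look like if a literal question mark were ever intended in those files. That intent is an assumption; nothing in the patch confirms it. The class name {{QuantifierEscapeSketch}} and the helper {{compiles}} are made up for illustration.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class QuantifierEscapeSketch {

    /** Returns true if the regex compiles, false if its syntax is invalid. */
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A bare quantifier has nothing to apply to, so compilation fails.
        System.out.println(compiles("?"));
        // Escaping or quoting turns it into a literal question mark.
        System.out.println(compiles("\\?"));
        System.out.println(compiles(Pattern.quote("?")));
        // The escaped form matches the single character '?'.
        System.out.println(Pattern.compile("\\?").matcher("?").matches());
    }
}
```

Skipping the line, as in the hack above, silently drops the rule; escaping keeps it, but only makes sense if a literal {{?}} is really what the data file means.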
Next up is Rules.java which blows up due to the same {{?}} issue:
{noformat}
java.lang.ExceptionInInitializerError
    at org.apache.commons.codec.language.bmpm.PhoneticEngine.phoneticUtf8(PhoneticEngine.java:98)
    at org.apache.commons.codec.language.bmpm.PhoneticEngine.encode(PhoneticEngine.java:85)
    at org.apache.commons.codec.language.PhoneticTest.testPhonetic(PhoneticTest.java:72)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
    at org.junit.runners.BlockJUnit4ClassRunner.runNotIgnored(BlockJUnit4ClassRunner.java:79)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:71)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:49)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
    at org.junit.runners.Suite.runChild(Suite.java:128)
    at org.junit.runners.Suite.runChild(Suite.java:24)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
    at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
Caused by: java.lang.IllegalArgumentException: Error compiling regex: (?|?|?|?|?|?|?|?|?|?)
    at org.apache.commons.codec.language.bmpm.Rule.<init>(Rule.java:46)
    at org.apache.commons.codec.language.bmpm.Rule.parseRules(Rule.java:215)
    at org.apache.commons.codec.language.bmpm.Rule.<clinit>(Rule.java:129)
    ... 34 more
Caused by: java.util.regex.PatternSyntaxException: Unknown inline modifier near index 2
(?|?|?|?|?|?|?|?|?|?)$
  ^
    at java.util.regex.Pattern.error(Pattern.java:1713)
    at java.util.regex.Pattern.group0(Pattern.java:2519)
    at java.util.regex.Pattern.sequence(Pattern.java:1806)
    at java.util.regex.Pattern.expr(Pattern.java:1752)
    at java.util.regex.Pattern.compile(Pattern.java:1460)
    at java.util.regex.Pattern.<init>(Pattern.java:1133)
    at java.util.regex.Pattern.compile(Pattern.java:823)
    at org.apache.commons.codec.language.bmpm.Rule.<init>(Rule.java:44)
    ... 36 more
{noformat}
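The "Unknown inline modifier" message arises because {{(?}} opens a special group construct in java.util.regex, so the character after it must be a valid flag or construct marker. A minimal sketch of that behaviour follows; the class name {{InlineModifierSketch}} and the helper {{failure}} are made up for illustration, and the exact message text may vary between JDK versions.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class InlineModifierSketch {

    /** Returns the compile error description, or null if the regex compiles. */
    static String failure(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            return e.getDescription();
        }
    }

    public static void main(String[] args) {
        // "(?" must be followed by a flag or construct marker; '|' is neither.
        System.out.println(failure("(?|?)$"));
        // An ordinary alternation of literal characters compiles.
        System.out.println(failure("(a|b)$"));
        // A valid inline flag, for comparison: case-insensitive matching.
        System.out.println(failure("(?i)a$"));
    }
}
```

So the failure is triggered by the first alternative in the group being a literal {{?}}, not by the alternation itself.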
I hacked this class as well to surface the underlying RE:
{code:java}
public Rule(String pattern, String lContext, String rContext,
        String phoneme, Set<String> language, String logical)
{
    this.pattern = pattern;
    try {
        this.lContext = Pattern.compile(lContext + "$");
    } catch (PatternSyntaxException e) {
        throw new IllegalArgumentException("Error compiling regex: " + lContext, e);
    }
{code}
Then I quit.
I do not understand how this can work for you and not for me.
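One thing that might be worth ruling out, and this is purely an assumption since nothing in the patch or this thread confirms it, is a character-encoding mismatch: if the rule files contain non-ASCII letters and they passed through an encoding that cannot represent them, each such letter typically ends up as a literal {{?}}. The sketch below shows that effect with a hypothetical Cyrillic alternation that is not taken from the patch; the class and method names are also made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class EncodingSketch {

    /** Round-trips text through US-ASCII; unmappable characters become '?'. */
    static String throughAscii(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    /** Returns true if the regex compiles, false if its syntax is invalid. */
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical rule context: two Cyrillic letters as alternatives.
        String original = "(\u0430|\u0431)";
        String mangled = throughAscii(original);

        System.out.println(mangled);                 // the letters are now '?'
        System.out.println(compiles(original + "$")); // intact text compiles
        System.out.println(compiles(mangled + "$"));  // mangled text does not
    }
}
```

If that were the cause, the same patch could work on a machine where the files keep their original characters and fail on one where they do not.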
What happens when you run:
{code:java}
import java.util.regex.Pattern;

public class PatternTest {

    /**
     * @param args
     */
    public static void main(String[] args) {
        System.out.println(Pattern.compile("?"));
    }
}
{code}
I get:
{noformat}
Exception in thread "main" java.util.regex.PatternSyntaxException: Dangling meta character '?' near index 0
?
^
    at java.util.regex.Pattern.error(Pattern.java:1713)
    at java.util.regex.Pattern.sequence(Pattern.java:1878)
    at java.util.regex.Pattern.expr(Pattern.java:1752)
    at java.util.regex.Pattern.compile(Pattern.java:1460)
    at java.util.regex.Pattern.<init>(Pattern.java:1133)
    at java.util.regex.Pattern.compile(Pattern.java:823)
    at PatternTest.main(PatternTest.java:9)
{noformat}
How about you?
> Implement a Beider-Morse phonetic matching codec
> ------------------------------------------------
>
> Key: CODEC-125
> URL: https://issues.apache.org/jira/browse/CODEC-125
> Project: Commons Codec
> Issue Type: New Feature
> Reporter: Matthew Pocock
> Priority: Minor
> Attachments: bmpm.patch, bmpm.patch
>
>
> I have implemented Beider Morse Phonetic Matching as a codec against the
> commons-codec svn trunk. I would like to contribute this to commons-codec.