I think this should work (I wrote it in C# originally and don't have a Java compiler here, so could someone please check that it compiles):

    private String discardEscapeChar(String input) {
      char[] caSource = input.toCharArray();
      char[] caDest = new char[caSource.length];
      int j = 0;
      for (int i = 0; i < caSource.length; i++) {
        if (caSource[i] == '\\') {
          // Skip the escape character itself; stop if it was the last character.
          if (caSource.length == ++i)
            break;
        }
        // Copy the current character (the escaped one, if we just skipped a backslash).
        caDest[j++] = caSource[i];
      }
      return new String(caDest, 0, j);
    }

Regarding your unit test, I think it's wrong:

>   assertEquals("\\\\\\\\192.168.0.15\\\\public",
>                discardEscapeChar("\\\\192.168.0.15\\\\public"));

The escaped string is the input, and the expected result is the plain UNC path. It should be:

    assertEquals("\\\\192.168.0.15\\public",
                 discardEscapeChar("\\\\\\\\192.168.0.15\\\\public"));

I would also suggest adding the following:

    String s = "\\\\some.host.name\\dir+:+-!():^[]{}~*?";
    assertEquals(s, discardEscapeChar(escape(s)));

Eyal

> -----Original Message-----
> From: Erik Hatcher [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, July 20, 2005 22:38 PM
> To: java-user@lucene.apache.org
> Subject: Re: QueryParser handling of backslash characters
>
>
> On Jul 19, 2005, at 11:19 AM, Jeff Davis wrote:
>
> > Hi,
> >
> > I'm seeing some strange behavior in the way the QueryParser handles
> > consecutive backslash characters.  I know that backslash is the escape
> > character in Lucene, and so I would expect "\\\\" to match fields that
> > have two consecutive backslashes, but this does not seem to be the
> > case.
> >
> > The fields I'm searching are UNC paths, e.g. "\\192.168.0.15\public".
> > The only way I can get my query to find the record containing that
> > value is to type "FieldName:\\\192.168.0.15\\public" (three slashes).
> > Why is the third backslash character not treated as an escape?  Is it
> > just that any backslash that is preceded by a backslash is interpreted
> > as a literal backslash character, regardless of whether the "escape"
> > backslash was itself escaped?
> >
> > I can code around this, but it seems inconsistent with the way that
> > escape characters usually work.  Is this a bug, or is it intentional,
> > or am I missing something?
>
> I've waited until I had a chance to experiment with this before
> replying.  I say that this is a bug.  There is a private method in
> QueryParser called discardEscapeChar (shown below).  I copied it to a
> JUnit test case and gave it this assert:
>
>   assertEquals("\\\\\\\\192.168.0.15\\\\public",
>                discardEscapeChar("\\\\192.168.0.15\\\\public"));
>
> This test fails with:
>
>   Expected:\\\\192.168.0.15\\public
>   Actual  :\192.168.0.15\public
>
> Which is wrong in my opinion.  (though my head hurts thinking about
> metaescaping backslashes in Java code to make this a proper test)
>
> The bug is isolated to the discardEscapeChar() method where it eats
> too many backslashes.  Could you have a shot at tweaking that method
> to do the right thing and submit a patch?
>
>   private String discardEscapeChar(String input) {
>     char[] caSource = input.toCharArray();
>     char[] caDest = new char[caSource.length];
>     int j = 0;
>     for (int i = 0; i < caSource.length; i++) {
>       if ((caSource[i] != '\\') || (i > 0 && caSource[i-1] == '\\')) {
>         caDest[j++]=caSource[i];
>       }
>     }
>     return new String(caDest, 0, j);
>   }
>
> Erik

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
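
P.S. In case it helps anyone check the round trip without a Lucene checkout, here is a minimal standalone sketch. The class name UnescapeSketch, the SPECIAL character list and the escape() helper are my own stand-ins for illustration and are not the actual QueryParser code; only discardEscapeChar follows the version proposed at the top of this message.

```java
// Standalone sketch for experimenting with escape/unescape round trips.
// Not the Lucene source: class name, SPECIAL and escape() are assumptions.
class UnescapeSketch {

    // Assumed set of characters that need escaping (illustrative only).
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?";

    // Hypothetical stand-in for the parser's escape():
    // prefix every special character with a backslash.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Proposed unescape: drop each escape backslash and copy the
    // character that follows it verbatim.
    static String discardEscapeChar(String input) {
        char[] src = input.toCharArray();
        char[] dest = new char[src.length];
        int j = 0;
        for (int i = 0; i < src.length; i++) {
            if (src[i] == '\\') {
                // A trailing lone backslash has nothing to escape; drop it.
                if (++i == src.length) {
                    break;
                }
            }
            dest[j++] = src[i];
        }
        return new String(dest, 0, j);
    }

    public static void main(String[] args) {
        String unc = "\\\\192.168.0.15\\public";          // raw: \\192.168.0.15\public
        String escaped = escape(unc);                      // raw: \\\\192.168.0.15\\public
        System.out.println(escaped);
        System.out.println(discardEscapeChar(escaped));    // raw: \\192.168.0.15\public
        System.out.println(unc.equals(discardEscapeChar(escaped))); // true
    }
}
```

The point of the round-trip check is that it avoids counting backslashes by hand in the Java literals: whatever escape() produces, discardEscapeChar() should give back the original string.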