Hi,

Please find attached a patch to the class
org.apache.lucene.search.WildcardTermEnum.

This patch replaces the 'wildcardEquals' function. When a * wildcard
came at the end of the term, the original only matched one or more
characters in its place. The desired behaviour is to match zero or more
characters, as PrefixQuery does. I expect most people didn't notice the
existing behaviour, because the query parser creates PrefixQuery objects
for such queries by default.
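
To make the intended semantics concrete, here is a minimal standalone
sketch. It is not the patched Lucene code: the class name WildcardSketch
and its matches methods are hypothetical, and treating ? as exactly one
character is an assumption made for the illustration. The point it shows
is that a trailing * should be allowed to match nothing at all.

```java
// Illustrative sketch only. WildcardSketch and matches() are hypothetical
// names and are not part of Lucene.
final class WildcardSketch {
    static final char WILDCARD_STRING = '*'; // zero or more characters
    static final char WILDCARD_CHAR = '?';   // assumed: exactly one character

    /** Returns true if string[s..] matches pattern[p..]. */
    static boolean matches(String pattern, int p, String string, int s) {
        if (p == pattern.length()) {
            // Pattern exhausted: a match only if the string is exhausted too.
            return s == string.length();
        }
        char c = pattern.charAt(p);
        if (c == WILDCARD_STRING) {
            // Let '*' consume 0, 1, 2, ... characters, including none at all.
            for (int i = s; i <= string.length(); i++) {
                if (matches(pattern, p + 1, string, i)) {
                    return true;
                }
            }
            return false;
        }
        if (s == string.length()) {
            // '?' and literal characters both need a character to consume.
            return false;
        }
        if (c == WILDCARD_CHAR || c == string.charAt(s)) {
            return matches(pattern, p + 1, string, s + 1);
        }
        return false;
    }

    static boolean matches(String pattern, String string) {
        return matches(pattern, 0, string, 0);
    }

    public static void main(String[] args) {
        System.out.println(matches("foo*", "foo"));    // true: '*' matches zero characters
        System.out.println(matches("foo*", "foobar")); // true
        System.out.println(matches("f*r", "foobar"));  // true
        System.out.println(matches("foo?", "foo"));    // false: '?' needs one character
    }
}
```

The first call is the case at issue: with the old behaviour the term
"foo" would not have been matched by the pattern "foo*".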

This patch also adds comments to the code. Is it suitable to be included
in the main Lucene source?

Regards,

-- 
Lee Mallabone.
Granta Design Ltd.


Index: src/java/org/apache/lucene/search/WildcardTermEnum.java
===================================================================
RCS file: /var/lib/cvs/libs/lucene/src/java/org/apache/lucene/search/WildcardTermEnum.java,v
retrieving revision 1.1.1.1
diff -u -r1.1.1.1 WildcardTermEnum.java
--- src/java/org/apache/lucene/search/WildcardTermEnum.java	9 Nov 2001 14:10:25 -0000	1.1.1.1
+++ src/java/org/apache/lucene/search/WildcardTermEnum.java	14 Feb 2002 17:52:45 -0000
@@ -116,29 +116,75 @@
   
   public static final char WILDCARD_STRING = '*';
   public static final char WILDCARD_CHAR = '?';
-  
-  public static final boolean wildcardEquals(String pattern, int patternIdx, String string, int stringIdx) {
-    for ( int p = patternIdx; ; ++p ) {
-      for ( int s = stringIdx; ; ++p, ++s ) {
-        boolean sEnd = (s >= string.length());
-        boolean pEnd = (p >= pattern.length());
-        
-        if (sEnd && pEnd) return true;
-        if (sEnd || pEnd) break;
-        if (pattern.charAt(p) == WILDCARD_CHAR) continue;
-        if (pattern.charAt(p) == WILDCARD_STRING) {
-          int i;
-          ++p;
-          for (i = string.length(); i >= s; --i)
-            if (wildcardEquals(pattern, p, string, i))
-              return true;
-          break;
+
+    /**
+     * Determines if a word matches a wildcard pattern.
+     * <small>Work released by Granta Design Ltd after originally being done on company time.</small>
+     */
+    public static final boolean wildcardEquals(String pattern, int patternIdx, String string, int stringIdx)
+    {
+        for (int p = patternIdx; ; ++p)
+        {
+            for (int s = stringIdx; ; ++p, ++s)
+            {
+                // End of string yet?
+                boolean sEnd = (s >= string.length());
+                // End of pattern yet?
+                boolean pEnd = (p >= pattern.length());
+
+                // If we're looking at the end of the string...
+                if (sEnd)
+                {
+                    // Assume the only thing left on the pattern is/are wildcards
+                    boolean justWildcardsLeft = true ;
+
+                    // Current wildcard position
+                    int wildcardSearchPos = p ;
+                    // While we haven't found the end of the pattern, and haven't encountered any non-wildcard characters
+                    while (wildcardSearchPos < pattern.length() && justWildcardsLeft)
+                    {
+                        // Check the character at the current position
+                        char wildchar = pattern.charAt(wildcardSearchPos);
+                        // Only '*' can match an empty remainder; '?' still needs one
+                        // character, so anything other than '*' is unmatched pattern.
+
+                        if (wildchar != WILDCARD_STRING)
+                        {
+                            justWildcardsLeft = false ;
+                        }
+                        else
+                        {
+                            // Look at the next character
+                            wildcardSearchPos++ ;
+                        }
+                    }
+
+                    // This was a prefix wildcard search, and we've matched - return true.
+                    if (justWildcardsLeft)
+                        return true ;
+                }
+
+                // If we've gone past the end of the string, or the pattern, return false.
+                if (sEnd || pEnd) break;
+
+                // Match a single character, so continue.
+                if (pattern.charAt(p) == WILDCARD_CHAR) continue;
+
+                // Match zero or more characters in place of a '*'.
+                if (pattern.charAt(p) == WILDCARD_STRING)
+                {
+                    // Look at the character beyond the '*'.
+                    ++p;
+                    // Examine the string, starting at the last character.
+                    for (int i = string.length(); i >= s; --i)
+                    {
+                        if (wildcardEquals(pattern, p, string, i))
+                            return true;
+                    }
+                    break;
+                }
+                if (pattern.charAt(p) != string.charAt(s)) break;
+            }
+            return false;
         }
-        if (pattern.charAt(p) != string.charAt(s)) break;
-      }
-      return false;
     }
-  }
   
   public void close() throws IOException {
       super.close();
