Consider the following program:

package test;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class TokenizeTest {
    public static void main(String[] args) {
        String input = "The cat sat on the mat";
        int prevEnd = 0;
        Matcher matcher = Pattern.compile("[\\x20\\n\\r\\t]+").matcher(input);
        while (matcher.find()) {
            System.err.println("Match at " + matcher.start() + ": " +
                    input.substring(prevEnd, matcher.start()));
            prevEnd = matcher.end();
        }
        System.err.println("Remainder: " + input.substring(prevEnd));
    }
}

With the Sun JRE, the output is:

Match at 3: The
Match at 7: cat
Match at 11: sat
Match at 14: on
Match at 18: the
Remainder: mat

With GNU Classpath, matcher.find() never succeeds, so the only output is:

Remainder: The cat sat on the mat
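
For regression testing, the same loop can be written as a self-checking program. The following is only a sketch: the class name TokenizeCheck and the tokenize helper are made up for illustration, and the expected token list is simply the Sun JRE output shown above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical test class, not part of the original report.
public class TokenizeCheck {
    // Split the input on runs of whitespace, using the pattern from the report.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<String>();
        int prevEnd = 0;
        Matcher matcher = Pattern.compile("[\\x20\\n\\r\\t]+").matcher(input);
        while (matcher.find()) {
            // Text between the previous separator and this one is a token.
            tokens.add(input.substring(prevEnd, matcher.start()));
            prevEnd = matcher.end();
        }
        // Whatever follows the last separator is the final token.
        tokens.add(input.substring(prevEnd));
        return tokens;
    }

    public static void main(String[] args) {
        // Expected tokens, taken from the Sun JRE output in the report.
        List<String> expected =
                Arrays.asList("The", "cat", "sat", "on", "the", "mat");
        List<String> actual = tokenize("The cat sat on the mat");
        System.err.println(actual.equals(expected)
                ? "PASS"
                : ("FAIL: " + actual));
    }
}
```

On a runtime that shows the behaviour reported for GNU Classpath, the tokenize helper would return a single-element list holding the whole input, so the program would report FAIL followed by that list.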

Michael Kay


-- 
           Summary: Regex tokenizing
           Product: classpath
           Version: 0.20
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: classpath
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: mike at saxonica dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25976



_______________________________________________
Bug-classpath mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/bug-classpath
