Hi Hermanni,

In message "[kaffe] Bug report"
    on 03/06/27, Hermanni Hyytiälä <[EMAIL PROTECTED]> writes:

> Token: # (type: -3)

> The tokenizer is initialized in
> sandStorm.main.SandstormConfig$configSection (starts from line 610) like
> this:

> tok = new StreamTokenizer(in);
> tok.resetSyntax();
> tok.wordChars((char)0, (char)255);
> tok.whitespaceChars('\u0000', '\u0020');
> tok.commentChar('#');
> tok.eolIsSignificant(true);

Initializing the tokenizer this way makes all characters from 0 to 255
word characters.

So '#' is both a word character and a comment character.
(Sun's API document says, "Each character can have zero or more of these
attributes.")

Kaffe's java.io.StreamTokenizer checks each character in the
following order:

  isWhitespace
  isNumeric
  isAlphabetic
  chr=='/' && CPlusPlusComments && parseCPlusPlusCommentChars()
  chr=='/' && CComments && parseCCommentChars()
  isComment
  isStringQuote

So '#' is treated as a word character (isAlphabetic) before
it is checked against isComment.
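
To make this easy to reproduce, here is a minimal sketch of a test
program that uses the same initialization.  The class name
CommentCharDemo, the helper tokenize() and the sample input are only
assumptions for illustration; they are not taken from Sandstorm.
With the current kaffe I would expect '#' to come back as a word
token (type -3 is TT_WORD), whereas with Sun's implementation the
rest of the line after '#' should be skipped as a comment.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class CommentCharDemo {
    /* Hypothetical helper: tokenize text with the reported settings
       and describe each token as a string. */
    static List<String> tokenize(String text) {
        StreamTokenizer tok = new StreamTokenizer(new StringReader(text));
        tok.resetSyntax();
        tok.wordChars((char)0, (char)255);
        tok.whitespaceChars('\u0000', '\u0020');
        tok.commentChar('#');
        tok.eolIsSignificant(true);

        List<String> tokens = new ArrayList<>();
        try {
            while (tok.nextToken() != StreamTokenizer.TT_EOF) {
                switch (tok.ttype) {
                case StreamTokenizer.TT_WORD:
                    tokens.add("WORD:" + tok.sval);
                    break;
                case StreamTokenizer.TT_EOL:
                    tokens.add("EOL");
                    break;
                default:
                    /* Ordinary character returned as its own token */
                    tokens.add("CHAR:" + (char)tok.ttype);
                }
            }
        } catch (IOException e) {
            /* Not expected when reading from a StringReader */
            throw new UncheckedIOException(e);
        }
        return tokens;
    }

    public static void main(String[] args) {
        /* Sample input chosen for illustration only */
        for (String t : tokenize("a b # comment\nc")) {
            System.out.println(t);
        }
    }
}
```

If a token containing '#' shows up in the output, the tokenizer has
treated '#' as a word character rather than as a comment character.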

I do not think Sun's API document clearly defines the order in which
character types should be checked.  So it can be argued that treating
'#' as a word character is not a bug but is allowed by the specification.

But in order to make the behavior of kaffe's java.io.StreamTokenizer
similar to Sun's, I suggest that the checking order be changed
as follows (the more specific, the earlier):

  isWhitespace
  chr=='/' && CPlusPlusComments && parseCPlusPlusCommentChars()
  chr=='/' && CComments && parseCCommentChars()
  isComment
  isStringQuote
  isNumeric
  isAlphabetic
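
The effect of the two orders on a character that carries several
attributes can be sketched with a small stand-alone model.  This is
not kaffe's code: the names CheckOrderSketch, Attr and classify() are
made up for illustration, and the C and C++ comment checks for '/'
are left out to keep it short.

```java
import java.util.EnumSet;

class CheckOrderSketch {
    /* Simplified stand-ins for the per-character flags */
    enum Attr { WHITESPACE, COMMENT, STRING_QUOTE, NUMERIC, ALPHABETIC }

    /* Current order: numeric and alphabetic are checked first */
    static final Attr[] OLD_ORDER = {
        Attr.WHITESPACE, Attr.NUMERIC, Attr.ALPHABETIC,
        Attr.COMMENT, Attr.STRING_QUOTE
    };

    /* Suggested order: the more specific, the earlier */
    static final Attr[] NEW_ORDER = {
        Attr.WHITESPACE, Attr.COMMENT, Attr.STRING_QUOTE,
        Attr.NUMERIC, Attr.ALPHABETIC
    };

    /* Return the first attribute in 'order' that the character has,
       or null if it has none (an ordinary character). */
    static Attr classify(EnumSet<Attr> attrs, Attr[] order) {
        for (Attr a : order) {
            if (attrs.contains(a)) {
                return a;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        /* '#' after wordChars(0, 255) and commentChar('#') */
        EnumSet<Attr> hash = EnumSet.of(Attr.ALPHABETIC, Attr.COMMENT);
        System.out.println("old order: " + classify(hash, OLD_ORDER));
        System.out.println("new order: " + classify(hash, NEW_ORDER));
    }
}
```

With the old order the model classifies '#' as ALPHABETIC, with the
new order as COMMENT, which is the change the reordering is meant to
achieve.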

Please try this patch.

--- java/io/StreamTokenizer.java.orig   Tue Feb 19 09:47:49 2002
+++ java/io/StreamTokenizer.java        Sat Jun 28 11:48:50 2003
@@ -116,14 +116,6 @@
                /* Skip whitespace and return nextTokenType */
                parseWhitespaceChars(chr);
        }
-       else if (e.isNumeric) {
-               /* Parse the number and return */
-               parseNumericChars(chr);
-       }
-       else if (e.isAlphabetic) {
-               /* Parse the word and return */
-               parseAlphabeticChars(chr);
-       }
        /* Contrary to the description in JLS 1.ed,
           C & C++ comments seem to be checked
           before other comments. That actually
@@ -145,6 +137,14 @@
        else if (e.isStringQuote) {
                /* Parse string and return word */
                parseStringQuoteChars(chr);
+       }
+       else if (e.isNumeric) {
+               /* Parse the number and return */
+               parseNumericChars(chr);
+       }
+       else if (e.isAlphabetic) {
+               /* Parse the word and return */
+               parseAlphabeticChars(chr);
        }
        else {
                /* Just return it as a token */
