That works. Thank you!
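
For the archive, here is roughly how the suggestion quoted below fits into a lexer. This is only a minimal sketch, not the exact program from the thread: the abbreviation `comment-body` and the helper `tokens-of` are names made up for illustration, `block-comment` is simplified to accept only an unsigned integer after `BB#`, and plain `lexer` is used instead of `lexer-src-pos` to keep the example short.

```racket
#lang racket

(require parser-tools/lex
         (prefix-in re- parser-tools/lex-sre))

(define-tokens a (BLOCK COMMENT))
(define-empty-tokens b (EOF))

(define-lex-abbrevs
  (digit10 (char-range "0" "9"))
  ;; Simplified for this sketch: "; BB#<digits>:"
  (block-comment (re-: "; BB#" (re-+ digit10) ":"))
  ;; Hypothetical helper abbreviation: ";" up to, but not including, the newline.
  (comment-body (re-: ";" (re-* (char-complement #\newline))))
  ;; A line comment is a comment body that does NOT start with a block
  ;; comment, followed by the newline.
  (line-comment (re-: (re-& comment-body
                            (complement (re-: block-comment any-string)))
                      #\newline)))

(define my-lexer
  (lexer
   (block-comment (token-BLOCK lexeme))
   (line-comment (token-COMMENT lexeme))
   ;; Skip whitespace by lexing the next token from the same port.
   (whitespace (my-lexer input-port))
   ((eof) (token-EOF))))

;; Hypothetical helper: collect (name value) pairs until EOF.
(define (tokens-of str)
  (define in (open-input-string str))
  (let loop ()
    (define tok (my-lexer in))
    (if (eq? (token-name tok) 'EOF)
        '()
        (cons (list (token-name tok) (token-value tok))
              (loop)))))

;; A block header with trailing whitespace, then an ordinary comment.
(tokens-of "\n; BB#0:  \n; plain comment\n")
```

With this definition, a line that begins with a block header can no longer match `line-comment`, even with trailing whitespace, so the first line should come back as BLOCK and the second as COMMENT.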

On Wed, Jul 23, 2014 at 10:37 PM, Jon Zeppieri <[email protected]> wrote:

> On Thu, Jul 24, 2014 at 1:11 AM, Mangpo Phitchaya Phothilimthana
> <[email protected]> wrote:
> > That's not ideal because if there is whitespace after BB#0:, it will
> > match COMMENT again. Is there a better way to do this?
>
> Factor out the difference?
>
> (line-comment (re-: (re-& (re-: ";" (re-* (char-complement #\newline)))
>                           (complement (re-: block-comment any-string)))
>                     #\newline))
>
> > On Wed, Jul 23, 2014 at 10:05 PM, Jon Zeppieri <[email protected]> wrote:
> >>
> >> Sorry, I sent that early by mistake. More below:
> >>
> >> On Thu, Jul 24, 2014 at 1:02 AM, Jon Zeppieri <[email protected]> wrote:
> >> > Your example string is "\n; BB#0:\n"
> >> > So, I'd expect the lexer to match:
> >> > - whitespace
> >> > - line-comment
> >> >
> >> > Yes, `block-comment` matches, but `line-comment`
> >>
> >> ... gives the longer match, because it includes the newline at the
> >> end, whereas `block-comment` will not match that newline. Since the
> >> ending newline will be taken care of by the whitespace rule, perhaps
> >> you could simply remove the final newline from the `line-comment`
> >> definition? It will still match everything up to (but not including)
> >> the newline.
> >>
> >> -Jon
> >>
> >> > On Thu, Jul 24, 2014 at 12:46 AM, Mangpo Phitchaya Phothilimthana
> >> > <[email protected]> wrote:
> >> >> Hi,
> >> >>
> >> >> I am trying to write a lexer and parser, but I cannot figure out how
> >> >> to set priorities for the lexer's tokens. My simplified lexer (shown
> >> >> below) has only two tokens, BLOCK and COMMENT. BLOCK is in fact a
> >> >> subset of COMMENT. BLOCK appears first in the lexer, but when I parse
> >> >> something that matches BLOCK, it always matches COMMENT instead.
> >> >> Below is my program. In this particular example, I expect to get a
> >> >> BLOCK token, but I get a COMMENT token instead. If I comment out
> >> >> (line-comment (token-COMMENT lexeme)) in the lexer, I then get the
> >> >> BLOCK token.
> >> >>
> >> >> Can anyone tell me how to work around this issue? I can only find
> >> >> this in the documentation:
> >> >> "When multiple patterns match, a lexer will choose the longest match,
> >> >> breaking ties in favor of the rule appearing first."
> >> >>
> >> >> #lang racket
> >> >>
> >> >> (require parser-tools/lex
> >> >>          (prefix-in re- parser-tools/lex-sre)
> >> >>          parser-tools/yacc)
> >> >>
> >> >> (define-tokens a (BLOCK COMMENT))
> >> >> (define-empty-tokens b (EOF))
> >> >>
> >> >> (define-lex-trans number
> >> >>   (syntax-rules ()
> >> >>     ((_ digit)
> >> >>      (re-: (uinteger digit)
> >> >>            (re-? (re-: "." (re-? (uinteger digit))))))))
> >> >>
> >> >> (define-lex-trans uinteger
> >> >>   (syntax-rules ()
> >> >>     ((_ digit) (re-+ digit))))
> >> >>
> >> >> (define-lex-abbrevs
> >> >>   (block-comment (re-: "; BB#" number10 ":"))
> >> >>   (line-comment (re-: ";" (re-* (char-complement #\newline))
> >> >>                       #\newline))
> >> >>   (digit10 (char-range "0" "9"))
> >> >>   (number10 (number digit10)))
> >> >>
> >> >> (define my-lexer
> >> >>   (lexer-src-pos
> >> >>    (block-comment (token-BLOCK lexeme))
> >> >>    (line-comment (token-COMMENT lexeme))
> >> >>    (whitespace (position-token-token (my-lexer input-port)))
> >> >>    ((eof) (token-EOF))))
> >> >>
> >> >> (define my-parser
> >> >>   (parser
> >> >>    (start code)
> >> >>    (end EOF)
> >> >>    (error
> >> >>     (lambda (tok-ok? tok-name tok-value start-pos end-pos)
> >> >>       (raise-syntax-error 'parser
> >> >>                           (format "syntax error at '~a' in src l:~a c:~a"
> >> >>                                   tok-name
> >> >>                                   (position-line start-pos)
> >> >>                                   (position-col start-pos)))))
> >> >>    (tokens a b)
> >> >>    (src-pos)
> >> >>    (grammar
> >> >>     (unit ((BLOCK) $1)
> >> >>           ((COMMENT) $1))
> >> >>     (code ((unit) (list $1))
> >> >>           ((unit code) (cons $1 $2))))))
> >> >>
> >> >> (define (lex-this lexer input)
> >> >>   (lambda ()
> >> >>     (let ([token (lexer input)])
> >> >>       (pretty-display token)
> >> >>       token)))
> >> >>
> >> >> (define (ast-from-string s)
> >> >>   (let ((input (open-input-string s)))
> >> >>     (ast input)))
> >> >>
> >> >> (define (ast input)
> >> >>   (my-parser (lex-this my-lexer input)))
> >> >>
> >> >> (ast-from-string "
> >> >> ; BB#0:
> >> >> ")
> >> >>
> >> >> ____________________
> >> >>   Racket Users list:
> >> >>   http://lists.racket-lang.org/users

