[il-antlr-interest: 29857] Re: [antlr-interest] Best practice to handle Lexer backtracking demand

Gerald Rosenberg Sun, 15 Aug 2010 08:53:12 -0700

The attached grammar illustrates two different patterns that could work to identify the markers.

However, there is an open question about whether a valid marker can appear without a prefix, suffix, or any escaped characters. Since it is not clear what would be valid, I have left the grammar as an incomplete example.
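
For readers unfamiliar with the multi-token trick used in the attached grammar's @lexer::members block, here is a rough plain-Java sketch of the idea, with no ANTLR dependency. The names MultiEmitSketch, Tok, and matchChunk are hypothetical, the input is assumed to be pre-split into whitespace-free chunks, and unlike the grammar's nextToken() it only matches new input when the queue is empty:

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MultiEmitSketch {
    /** Hypothetical stand-in for an ANTLR Token: a type name plus the matched text. */
    public static final class Tok {
        public final String type;
        public final String text;
        public Tok(String type, String text) { this.type = type; this.text = text; }
        @Override public String toString() { return type + "(" + text + ")"; }
    }

    public static final Tok EOF = new Tok("EOF", "");

    // ESC+ CHAR+ ESC+, mirroring the first alternative of the WORD rule.
    private static final Pattern WORD =
        Pattern.compile("((?:\\\\.)+)([a-zA-Z]+)((?:\\\\.)+)");

    private final Deque<Tok> pending = new ArrayDeque<>();
    private final Iterator<String> chunks;

    /** Assumption: the input is already split into whitespace-free chunks. */
    public MultiEmitSketch(List<String> chunks) { this.chunks = chunks.iterator(); }

    /** Plays the role of emit(Token): append to a queue rather than keep one token. */
    private void emit(Tok t) { pending.addLast(t); }

    /** Plays the role of the WORD action: one match may emit three tokens. */
    private void matchChunk(String chunk) {
        Matcher m = WORD.matcher(chunk);
        if (m.matches()) {
            emit(new Tok("UWORD", m.group(1)));
            emit(new Tok("MWORD", m.group(2)));
            emit(new Tok("UWORD", m.group(3)));
        } else {
            emit(new Tok("WORD", chunk));
        }
    }

    /** Refill only when the queue is empty, then return the oldest queued token. */
    public Tok nextToken() {
        if (pending.isEmpty() && chunks.hasNext()) {
            matchChunk(chunks.next());
        }
        return pending.isEmpty() ? EOF : pending.removeFirst();
    }

    public static void main(String[] args) {
        MultiEmitSketch lexer =
            new MultiEmitSketch(Arrays.asList("\\a\\bsurname\\d", "abc"));
        for (Tok t = lexer.nextToken(); t != EOF; t = lexer.nextToken()) {
            System.out.println(t);
        }
    }
}
```

The point of the queue is that the parser still pulls one token per nextToken() call, while a single lexer match can contribute several tokens in order.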

------ Original Message (Sunday, August 15, 2010 10:42:28 AM) ------
From: Joachim Schrod
Subject: Re: [antlr-interest] Best practice to handle Lexer backtracking demand

Gerald Rosenberg writes:

> How is XXXX guaranteed to be unambiguous with any other fragment of
> aaaaXXXXbbbb?  That is, how can you be sure that a fragment like aaaX or
> XXXXb will never match a different start marker?

The data generating service guarantees it. (It escapes characters if
any complete marker substring happens to be in the data.)

> Is there a case distinction, as implied, or something more
> interesting? Is the distinction the same for the end marker?

No and no. The markers are strings like `prenames', `prenamee',
`surnames', etc.

        Joachim



--

Gerald B. Rosenberg, Esq.
NewTechLaw
260 Sheridan Ave., Suite 208
Palo Alto, CA 94306-2009
650.325.2100 (office) / 650.703.1724 (cell)
650.325.2107 (facsimile)

www.newtechlaw.com

grammar test;

options { 
        output = AST;
        language = Java;
}

tokens { 
        MWORD;
        UWORD;
}

@parser::header {
        
        package test.gen;
}

@lexer::header {
        
        package test.gen;
}

@lexer::members {

        List<Token> tokens = new ArrayList<Token>();

        @Override
        public void emit(Token token) {
                super.emit(token);
                tokens.add(token);
        }

        @Override
        public Token nextToken() {
                super.nextToken();
                if (tokens.size() == 0) {
                        return Token.EOF_TOKEN;
                }
                return tokens.remove(0);
        }

        private/* ILexerHelper */Object helper;

        public void setHelper(/* ILexerHelper */Object helper) {
                this.helper = helper;
        }
}

start1
        : text1+ EOF
        ;

start2
        : text2+ EOF
        ;

text1
        : MWORD ( w+=WORD | w+=UWORD )+ MWORD 
                        { 
                                // analyze $w
                        }
        | WORD
        ;
        
text2 : MARKER ;

WORD
        : beg=ESC+ mid=CHAR+ end=ESC+
                {
                        $beg.setType(UWORD);
                        // $beg.setText(helper.uEsc($beg.getText()));
                        emit($beg);

                        $mid.setType(MWORD);
                        // $mid.setType(helper.determineType($mid));
                        emit($mid);

                        $end.setType(UWORD);
                        // $end.setText(helper.uEsc($end.getText()));
                        emit($end);

                }
        | CHAR+
        ;

MARKER
        : ESC*
                ( // options { k=10; } :
                  'prenamee'
                | 'prenames'
                | 'surname'
                )
                ESC*
        ;

fragment
ESC
        : '\\'
                ( 'n'
                | 'r'
                | 't'
                | 'b'
                | 'f'
                | '"'
                | '\''
                | '\\'
                | . 
                )
        ;

fragment
CHAR
        : 'a'..'z' | 'A'..'Z'
        ;

WS
        :  ( ' ' | '\t' | '\r'? '\n' )+ { $channel = HIDDEN; }
        ;

\a\b\bsurname\d\e\e\e
\F\C\Xprenames\Q\B\Y\U
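
The two sample lines above can be checked against the MARKER pattern without generating a lexer. This is only a minimal sketch, assuming java.util.regex as a stand-in for the ANTLR rule; the class name MarkerScan and the method markerOf are made up for illustration, and the regex `.` does not match a newline, whereas the grammar's ESC fragment accepts any character after the backslash:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MarkerScan {
    // One escape: a backslash followed by any single character (mirrors ESC).
    private static final String ESC = "\\\\.";

    // ESC* ( 'prenamee' | 'prenames' | 'surname' ) ESC*  (mirrors MARKER).
    private static final Pattern MARKER = Pattern.compile(
        "((?:" + ESC + ")*)(prenamee|prenames|surname)((?:" + ESC + ")*)");

    /** Returns the marker name if the whole line is an escaped marker, else null. */
    public static String markerOf(String line) {
        Matcher m = MARKER.matcher(line);
        return m.matches() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        // The two sample lines, written as Java string literals.
        System.out.println(markerOf("\\a\\b\\bsurname\\d\\e\\e\\e"));
        System.out.println(markerOf("\\F\\C\\Xprenames\\Q\\B\\Y\\U"));
        // Text that is not a marker yields null.
        System.out.println(markerOf("plain"));
    }
}
```

Like the MARKER rule, this sketch accepts a bare marker with no surrounding escapes, which is exactly the case left open in the message above.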
