Thanks a lot Adrian! It's working beautifully!

On Mon, Jul 18, 2011 at 4:49 AM, <[email protected]> wrote:
> Hi Talek, what you should do is include the tail items in the scanner and add
> a pattern that covers any word that is not 'select'. If you specify 'select'
> ahead of the generic pattern it will be matched in favour of the generic
> pattern on only that word.
>
> Adrian
> -----Original Message-----
> From: Alec Tica <[email protected]>
> Sender: [email protected]
> Date: Fri, 15 Jul 2011 00:20:42
> To: <[email protected]>
> Reply-To: [email protected]
> Subject: [ragel-users] Detect keywords with a ragel scanner
>
> Hi,
>
> I'm new to Ragel and I'm trying to figure out how to solve,
> apparently, a very simple problem. Let's say I have the following
> text:
>
> "select 1 from dual;select 2 from dual;/*comment*/select 3 from dual;select"
>
> I want to detect all "select" keywords using a scanner but taking into
> consideration the word boundaries. "select" is a keyword only if:
>
> 1. starts at: the very beginning of the text or it has a whitespace
> before or a comment or a statement separator (;)
> 2. ends at: the very end of the text or it has a whitespace after or a
> comment or a statement separator (;)
> 3. is not within quotes
> 4. is not part of a comment
>
> Till now I have:
>
> <code>
> %%{
>   machine example;
>
>   action is_eof {
>     true if p == eof - 1
>   }
>
>   # eof
>   EOF = zlen when is_eof;
>
>   # strings
>   squoted_string = ['] ( (any - [''])** ) ['];
>   dquoted_string = '"' ( any )* :>> '"';
>
>   # comments
>   ml_comment = '/*' ( any )* :>> '*/';
>   sl_comment = '--' ( any )* :>> ('\n' | EOF);
>   comment = ml_comment | sl_comment;
>
>   tail = space | comment | ';' | EOF;
>
>   # keyword
>   select = 'select' . tail;
>
>   main := |*
>     squoted_string;
>     dquoted_string;
>     comment;
>     select => { puts "found at #{ts}-#{te}" };
>     any;
>   *|;
>
> }%%
>
> %% write data;
>
> data = 'unselect 1 from dual;select 2 from dual;/*comment*/select 3
> from dual;select'
> # convert the provided string in a stream of chars
> stream_data = data.unpack("c*") if(data.is_a?(String))
> eof = stream_data.length
>
> %% write init;
> %% write exec;
> </code>
>
> Of course, the above scanner incorrectly matches the "unselect" word
> from the data. Anyway, I feel that I'm not on the right track
> therefore I'd like to ask for your advice.
>
> Many thanks in advance!
>
> --
> talek
>
> _______________________________________________
> ragel-users mailing list
> [email protected]
> http://www.complang.org/mailman/listinfo/ragel-users
>
--
talek
_______________________________________________
ragel-users mailing list
[email protected]
http://www.complang.org/mailman/listinfo/ragel-users
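
For readers following the thread, Adrian's suggestion (list 'select' ahead of a generic word pattern, and let the scanner pick the longest match) can be sketched in plain Ruby. This is not Ragel and not the code that was actually used in the thread: it is a rough emulation built on StringScanner from Ruby's standard library, and the names TOKEN_PATTERNS and find_selects, as well as the exact regular expressions, are assumptions made for illustration only.

```ruby
require 'strscan'

# Hypothetical token table. Order matters only on ties: when two patterns
# match the same length, the one listed first wins. This mimics listing
# 'select' ahead of the generic word pattern in a Ragel scanner.
TOKEN_PATTERNS = [
  [:squoted,    /'(?:[^']|'')*'/],   # single-quoted string, '' as an escaped quote
  [:dquoted,    /"[^"]*"/],          # double-quoted string
  [:ml_comment, %r{/\*.*?\*/}m],     # /* ... */ comment
  [:sl_comment, /--[^\n]*/],         # -- comment up to end of line
  [:select,     /select/],           # the keyword itself
  [:word,       /[A-Za-z0-9_]+/],    # any other word, e.g. "unselect" or "selected"
  [:other,      /./m]                # fallback: consume a single character
].freeze

# Returns [start, end] offsets (end exclusive) of every standalone "select".
def find_selects(text)
  scanner = StringScanner.new(text)
  hits = []
  until scanner.eos?
    kind, len = nil, 0
    TOKEN_PATTERNS.each do |name, re|
      n = scanner.match?(re)              # match length at current position, or nil
      kind, len = name, n if n && n > len # strictly longer wins, so ties keep the earlier pattern
    end
    hits << [scanner.pos, scanner.pos + len] if kind == :select
    scanner.pos += len
  end
  hits
end

data = 'unselect 1 from dual;select 2 from dual;/*comment*/select 3 from dual;select'
p find_selects(data)  # => [[21, 27], [51, 57], [70, 76]]
```

Because "unselect" and "selected" are longer matches for the generic word pattern than for the keyword, they are consumed as ordinary words, while a bare "select" ties with the word pattern and goes to the keyword because it is listed first. Quoted strings and comments are consumed whole, so a "select" inside them is never reported.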
