Every sub-expression has its own set of final states. FSM operations may add or remove final-state status as they build new machines. So your main machine may end up with no final states even though they were present while the machine was being built up, and that is why the different eof embedding operators affect the result differently.
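
To make that concrete, here is a toy model in Python. It is only a sketch of the idea and not how Ragel is implemented: the class ToyMachine and its method names are made up for illustration. It assumes only what the User's Guide describes, namely that %eof embeds into the states that are final at the moment the operator is applied and @eof embeds into the states that are not final.

```python
class ToyMachine:
    """Toy model (hypothetical, not Ragel internals): named states,
    a set of final states, and a list of EOF actions per state."""

    def __init__(self, states, finals):
        self.states = set(states)
        self.finals = set(finals)
        self.eof_actions = {s: [] for s in self.states}

    def embed_final_eof(self, action):
        # Models '%eof': attach to the states that are final right now.
        for s in self.finals:
            self.eof_actions[s].append(action)

    def embed_nonfinal_eof(self, action):
        # Models '@eof': attach to the states that are not final right now.
        for s in self.states - self.finals:
            self.eof_actions[s].append(action)

    def clear_finals(self):
        # Models a later operation that removes final-state status.
        # Actions that were already embedded stay where they are.
        self.finals.clear()


# A sub-expression with one final state, s1.
m = ToyMachine(["s0", "s1"], finals=["s1"])
m.embed_final_eof("commit_string_eof")
# A later construction step strips the final-state status.
m.clear_finals()

print(m.finals)       # no final states left in the finished machine
print(m.eof_actions)  # but s1 still carries the embedded EOF action
```

The finished machine shows no final states, yet the action embedded while s1 was final is still there, which is the kind of difference you see between %eof and @eof.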

Arian

On 13-07-25 12:31 AM, Solomon Gibbs wrote:
Hello,

I'm not sure I understand what Ragel considers a "final" state. IIRC
the User's Guide says that states that are final before machine
simplification remain final thereafter.

    When exactly is a state final, and how does one recognize this?

I'm using the state machine syntax to implement a string finder --
find ASCII strings with length greater than n, and print them. This
means implementing a maximum length matcher, as below.
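
For reference, the behavior I am after can be sketched outside Ragel. This is a minimal Python sketch, not the Ragel machine itself. The function name find_strings and the min_len parameter are hypothetical, and it assumes a string is kept once it has at least 4 graphic characters (one to enter has_glyph plus graphic{3}), using the same graphic class as the machine below.

```python
from typing import List

# Same class as 'graphic' in the Ragel machine: tab plus 0x20 .. 0x7E.
GRAPHIC = frozenset([0x09]) | frozenset(range(0x20, 0x7F))


def find_strings(data: bytes, min_len: int = 4) -> List[bytes]:
    """Return every maximal run of graphic bytes of length >= min_len.

    min_len = 4 is an assumption taken from the machine's structure.
    """
    found = []
    mark = None  # start index of the current run (cf. set_mark)
    for i, byte in enumerate(data):
        if byte in GRAPHIC:
            if mark is None:
                mark = i
        else:
            # A non-graphic byte ends the run (cf. commit_string).
            if mark is not None and i - mark >= min_len:
                found.append(data[mark:i])
            mark = None
    # End of input while inside a run (cf. commit_string_eof).
    if mark is not None and len(data) - mark >= min_len:
        found.append(data[mark:])
    return found


print(find_strings(b"\x00abcd\x01ab\x02hello"))
```

The two commit paths in the sketch correspond to the two cases the machine has to tell apart: leaving a string on a non-graphic byte, and running out of input while still inside one.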

Despite the fact that the dot output shows no final states, the EOF
transitions behave differently depending on which flavor of {$%@}eof
is used. I do not understand why this should be. For example, in the
"has_string" state below, using %eof instead of @eof causes both the
"commit_nonstring_eof" and "commit_string_eof" actions to be called
from one of the generated/synthetic states terminating the matching
state.

(State graphics for this machine are available via
http://stackoverflow.com/questions/17848941/ragel-final-states-and-eof)

action commit_string {    }

action commit_string_eof { }

action commit_nonstring_eof { }

action set_mark { }

action reset {
    /* Force the machine back into state 1. This happens after
     * an incomplete match when some graphical characters are
     * consumed, but not enough for us to keep the string. */
    fgoto start;
}

# Matching classes union to 0x00 .. 0xFF
graphic = (0x09 | 0x20 .. 0x7E);
non_graphic =  (0x00 .. 0x08 | 0x0A .. 0x1F | 0x7F .. 0xFF);

collector = (

    start: (
       # Set the mark if we have a graphic character,
       # otherwise go to the no_glyph state and consume input
       graphic @set_mark ->  has_glyph |
       non_graphic ->  no_glyph
    ) $eof(commit_nonstring_eof),

    no_glyph: (
          # Consume input until a graphic character is encountered
          non_graphic ->  no_glyph |
          graphic @set_mark ->  has_glyph
    ) $eof(commit_nonstring_eof),

    has_glyph: (
           # We already matched one graphic character to get here
           # from start or no_glyph. Try to match N-1 before allowing
               # the string to be committed. If we don't get to N-1,
               # drop back to the start state
               graphic{3} $lerr(reset) ->  has_string
    ) @eof(commit_nonstring_eof),

    has_string: (
                # Already consumed our quota of N graphic characters;
                # consume input until we run out of graphic characters
                # then reset the machine. All exiting edges should commit
                # the string. We differentiate between exiting on a non-graphic
                # input that shouldn't be added to the string and exiting
                # on a (graphic) EOF that should be added.
                graphic* non_graphic ->  start
    ) %from(commit_string) @eof(commit_string_eof)
    #) %from(commit_string) %eof(commit_string_eof) // bad

); #$debug;

main := (collector)+;

_______________________________________________
ragel-users mailing list
ragel-users@complang.org
http://www.complang.org/mailman/listinfo/ragel-users