You're using from-state actions, which won't give you the results you want. Use the entering (>) and leaving (%) operators instead, then go from there. If you attach %* to digits, the action executes on every character, because the final state loops back into itself to implement the repetition.
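
As a rough sketch of what I mean, here is a grammar fragment only. It assumes the Ruby host language, and the machine name and the mark and port variables are made up for illustration, not taken from your url.rl:

```ragel
%%{
  machine port_sketch;  # hypothetical name

  # Entering action: runs on the transition that consumes the first digit,
  # so p is the index of that digit.
  action mark { mark = p }

  # Leaving action: runs once, on the transition out of the final state of
  # digit+ (or at end of input, provided the host code sets eof = pe).
  # At that point p is one past the last digit.
  action port { port = data[mark...p] }

  port = digit+ >mark %port;

  main := ':' port '/'?;
}%%
```

With %* in place of %, the port action would instead be a from-state action on the final state of digit+, run before each character read from that state and not at end of input. That would be consistent with ending up with 123 rather than 1234, though I haven't checked it against your grammar.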

Remember that Graphviz will help you immensely with debugging. You can look at the result of compiling specific rules in your grammar.

On 09/08/11 14:06, François Beausoleil wrote:
  Hey folks!

I'm writing the "definitive" URL parser class. Lofty goal, perhaps, but also a 
learning exercise. I have an issue with entering and leaving actions.

My code's on GitHub: 
https://github.com/francois/urlparser/blob/master/url.rl#L34

Given the following two URLs:

tcp://127.0.0.1:1234
tcp://a:[email protected]:1234/

For both URLs, I correctly recognize the scheme. For both URLs, either user or 
hostname is wrong, and in both cases, the port's not recognized.

My Ruby implementation is at 
https://github.com/francois/urlparser/blob/master/ruby/lib/urlparser/parser.rl#L14

My question boils down to: how do I definitively know that what I'm looking at is a user, 
vs a hostname, since both have nearly the same set of characters. Should I be using 
"State Action Embedding Operators"? Actually, scratch that: it seems that's 
what I should be doing, because I managed to recognize the host in some cases. For the 
first URL above, I can recognize most of the port: I end up with 123, not 1234, thus 
losing the last character.

A little pointer to some existing parser with similar behavior would be 
appreciated.

Thanks!
François



_______________________________________________
ragel-users mailing list
[email protected]
http://www.complang.org/mailman/listinfo/ragel-users
