Hey folks!  

I'm writing the "definitive" URL parser class. Lofty goal, perhaps, but also a 
learning exercise. I have an issue with entering and leaving actions.

My code's on GitHub: 
https://github.com/francois/urlparser/blob/master/url.rl#L34

Given the following two URLs:

tcp://127.0.0.1:1234
tcp://a:[email protected]:1234/

For both URLs, I correctly recognize the scheme. In both cases, however, either 
the user or the hostname comes out wrong, and the port is not recognized.

My Ruby implementation is at 
https://github.com/francois/urlparser/blob/master/ruby/lib/urlparser/parser.rl#L14

My question boils down to: how do I definitively know whether what I'm looking 
at is a user or a hostname, since both allow nearly the same set of characters? 
Should I be using "State Action Embedding Operators"? Actually, scratch that: 
it seems that's what I should be doing, because I managed to recognize the host 
in some cases. For the first URL above, I can recognize most of the port: I end 
up with 123, not 1234, thus losing the last character.
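
To make the expected result concrete, here is a minimal sketch in plain Ruby 
(not Ragel) of the split I'm after. The helper name split_authority and the 
second sample URL are made up for illustration and are not part of the 
repository; the sketch also ignores IPv6 literals and percent-encoding. The 
point is only that user vs. host cannot be decided until an "@" is seen (or 
not seen) before the end of the authority.

```ruby
# Hypothetical helper, not part of the urlparser repository.
# Splits "scheme://[user[:password]@]host[:port][/path]" by hand.
def split_authority(url)
  scheme, rest = url.split("://", 2)
  authority = rest.split("/", 2).first        # drop any path

  # Everything before the last "@" is userinfo; without an "@",
  # rpartition returns ["", "", authority], so there is no user at all.
  userinfo, at, hostport = authority.rpartition("@")
  user, password = at.empty? ? [nil, nil] : userinfo.split(":", 2)

  # The port, if present, follows the last ":" of the host part.
  host, colon, port = hostport.rpartition(":")
  host, port = hostport, nil if colon.empty?

  { scheme: scheme, user: user, password: password,
    host: host, port: port && Integer(port, 10) }
end

p split_authority("tcp://127.0.0.1:1234")
p split_authority("tcp://user:secret@127.0.0.1:1234/")  # made-up sample
```

In the first call the user and password should come back nil with the full 
port 1234; in the second, "user" and "secret" should be split out ahead of the 
host. That is the behavior I'd like the Ragel machine to reproduce.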

A pointer to an existing parser with similar behavior would be 
appreciated.

Thanks!
François



