Hi,
I am trying to implement a state-based recursive-descent SIP 
parser, using re2c for the lexer and a hand-coded parser.

I have problems parsing the absoluteURI, the "Accept" header, 
and the generic-param.

1. Absolute URI:
absoluteURI    =  scheme ":" ( hier-part / opaque-part )
hier-part      =  ( net-path / abs-path ) [ "?" query ]
net-path       =  "//" authority [ abs-path ]
abs-path       =  "/" path-segments
opaque-part    =  uric-no-slash *uric
uric           =  reserved / unreserved / escaped
uric-no-slash  =  unreserved / escaped / ";" / "?" / ":" / "@"
                  / "&" / "=" / "+" / "$" / ","
path-segments  =  segment *( "/" segment )
segment        =  *pchar *( ";" param )
param          =  *pchar
pchar          =  unreserved / escaped / ":" / "@" / "&" / "="
                  / "+" / "$" / ","
scheme         =  ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )

/*Problem*/
authority      =  srvr / reg-name
srvr           =  [ [ userinfo "@" ] hostport ]
reg-name       =  1*( unreserved / escaped / "$" / ","
                   / ";" / ":" / "@" / "&" / "=" / "+" )
query          =  *uric

Here the problem is with authority.
I have srvr and reg-name in the same lexer state, and the lexer 
returns reg-name in almost all cases. I have no way of 
differentiating between srvr and reg-name, so I can't even put 
them in two different states.
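
To make the overlap concrete, here is a minimal Python sketch (not my 
re2c code). The two patterns are simplified assumptions: "escaped" is 
taken as %HH, userinfo is reduced to a plain character run, and srvr 
only covers a dotted hostname with an optional port. The function name 
matching_rules is hypothetical.

```python
import re

# Assumption: "unreserved" is alphanum plus the mark characters.
UNRESERVED = r"A-Za-z0-9\-_.!~*'()"

# reg-name = 1*( unreserved / escaped / "$" / "," / ";" / ":" / "@" / "&" / "=" / "+" )
REG_NAME = re.compile(r"(?:[" + UNRESERVED + r"$,;:@&=+]|%[0-9A-Fa-f]{2})+")

# Deliberately simplified srvr: optional userinfo "@", dotted hostname,
# optional ":" port. IP literals and the empty srvr are left out.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9\-]*[A-Za-z0-9])?"
SRVR = re.compile(
    r"(?:[" + UNRESERVED + r"&=+$,]+@)?"   # userinfo "@"
    + LABEL + r"(?:\." + LABEL + r")*"     # hostname
    + r"(?::[0-9]+)?"                      # port
)


def matching_rules(text):
    """Return the names of the rules that match the whole input."""
    names = []
    if SRVR.fullmatch(text):
        names.append("srvr")
    if REG_NAME.fullmatch(text):
        names.append("reg-name")
    return names


print(matching_rules("alice@example.com:5060"))
print(matching_rules("a;b"))
```

Under these assumptions, any input accepted by the simplified srvr 
pattern is also accepted by reg-name, so a longest-match lexer alone 
cannot tell them apart. One possible direction would be to return a 
single authority token from the lexer and try the stricter srvr 
structure first in the parser.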

2. Accept Header:

Accept         =  "Accept" HCOLON
                 ( accept-range *(COMMA accept-range) )
accept-range   =  media-range *(SEMI accept-param)
media-range    =  ( "*/*"
                   / ( m-type SLASH "*" )
                   / ( m-type SLASH m-subtype )
                   ) *( SEMI m-parameter )
accept-param   =  ("q" EQUAL qvalue) / generic-param
qvalue         =  ( "0" [ "." 0*3DIGIT ] )
                   / ( "1" [ "." 0*3("0") ] )

The problem here is that I can't decide what to parse after a 
SEMI in the accept-range. It could start either an m-parameter 
belonging to the media-range or an accept-param.
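
One possible way around this, sketched in Python rather than re2c: 
parse every ";name=value" pair the same way and classify afterwards. 
The sketch assumes the HTTP/1.1 convention (RFC 2616, section 14.1) 
that the first "q" parameter separates the media-range parameters 
from the accept-params. The function name split_accept_range is 
hypothetical, and the naive split on ";" ignores quoted-strings.

```python
def split_accept_range(accept_range):
    """Split one accept-range into (media_type, m_params, accept_params).

    Assumption: everything before the first "q" parameter is an
    m-parameter; "q" and everything after it is an accept-param.
    """
    parts = [p.strip() for p in accept_range.split(";")]
    media_type, raw_params = parts[0], parts[1:]

    m_params, accept_params = [], []
    seen_q = False
    for raw in raw_params:
        name, _, value = raw.partition("=")
        name, value = name.strip(), value.strip()
        if name.lower() == "q":
            seen_q = True
        # Route the parameter according to whether "q" has been seen yet.
        (accept_params if seen_q else m_params).append((name, value))
    return media_type, m_params, accept_params


print(split_accept_range("text/html;level=1;q=0.8;ext=foo"))
```

With this approach the parser never has to choose between m-parameter 
and accept-param at the SEMI itself; the decision is postponed until 
the parameter name is known.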

3. Generic Param:

generic-param  =  token [ EQUAL gen-value ]
gen-value      =  token / host / quoted-string

Here the problem is that the character set of token is a superset 
of the character set of host. In the lexer, if I define TOKEN 
first, I get a TOKEN every time and HOST is never matched.

If I define HOST first, then in most cases I get a HOST, and a 
TOKEN only in the few cases where the input contains some 
character that is not allowed in HOST.
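
The overlap can be shown with a small Python sketch (again not re2c). 
The token and hostname patterns follow the RFC 3261 ABNF as I 
understand it; IPv4 and IPv6 host forms are left out, and the function 
name classify_gen_value is hypothetical. The idea being tried is to 
always lex a TOKEN and check afterwards whether it also has hostname 
shape.

```python
import re

# token = 1*( alphanum / "-" / "." / "!" / "%" / "*" / "_" / "+" / "`" / "'" / "~" )
TOKEN = re.compile(r"[A-Za-z0-9\-.!%*_+`'~]+")

# hostname = *( domainlabel "." ) toplabel [ "." ]
# Assumption: only the hostname form of host is covered here.
HOSTNAME = re.compile(
    r"(?:[A-Za-z0-9](?:[A-Za-z0-9\-]*[A-Za-z0-9])?\.)*"  # domainlabel "."
    r"[A-Za-z](?:[A-Za-z0-9\-]*[A-Za-z0-9])?\.?"         # toplabel [ "." ]
)


def classify_gen_value(text):
    """Classify an unquoted gen-value after it was lexed as a TOKEN."""
    if not TOKEN.fullmatch(text):
        return None
    # Every hostname is also a valid token, so the syntax alone cannot
    # separate the two; report the ambiguity instead of guessing.
    return "host-or-token" if HOSTNAME.fullmatch(text) else "token"


print(classify_gen_value("example.com"))
print(classify_gen_value("foo_bar"))
```

Since gen-value accepts token and host in the same position, it may 
be enough for the lexer to return a single TOKEN there and leave any 
host check to a later semantic step.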

Could somebody please suggest a solution to these problems?

ciao,
Akshat

_______________________________________________
Sip-implementors mailing list
[EMAIL PROTECTED]
http://lists.cs.columbia.edu/mailman/listinfo/sip-implementors