Frank W. Miller wrote:
> 
> -----Original Message-----
> From: Paul Kyzivat [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, April 01, 2008 3:35 PM
> To: Frank W. Miller
> Cc: 'Iñaki Baz Castillo'; [email protected]
> Subject: Re: [Sip-implementors] Why SIP abnf is so permissive???
> 
> 
> 
> Frank W. Miller wrote:
>> You can't really just check for quotes out of context. In some contexts 
>> they might not always come in pairs.
>>
>> FM: Where does the syntax allow for unpaired quotes?  I did a quick search
>> on 3261 for DQUOTE and didn't see any place where they are allowed to be
>> unmatched.  Forgive my ignorance?
> 
> Well, one place is in comments. I'm not sure if there are others. But 
> comments are a bitch. They are only valid in headers that specify them. 
> I can't see any simplified rule for figuring out when you are in a 
> comment. (I'm certain that comments are a feature of SIP that Iñaki will 
> question. I question the reason for them too.)
> 
> FM: Well, I see comment spec for the following headers:
> 
> Retry-After
> Server
> 
> And that's it in 3261 (quick search).  I suppose somebody could in their
> evilness decide to put a single unpaired quote inside a comment in one of these
> headers...

Yep. Heh, heh, heh.

> Also, thinking about it a little more, what difference does it make if you
> collapse comments?  I mean you might lose some human readable formatting,
> but does that really matter?

Comments are not "to the end of the header" comments, but rather are 
bounded by balanced parens, and so can appear in the middle of a header. 
And in fact that is true of both headers you found that allow 
comments.

So an unpaired quote in a comment in one of those would potentially get 
you into a wrong whitespace suppressing mode at least for the remainder 
of the header. Depending on how you do it, it might continue until the 
next quote, no matter how far away it is.
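
To make that failure concrete, here is a minimal Python sketch of a 
quote-toggling collapser. It is not the code posted earlier in the 
thread; collapse_naive is a hypothetical name, and the assumption is 
only that every double quote flips a "don't touch whitespace" flag, 
with no knowledge of comments.

```python
def collapse_naive(header: str) -> str:
    """Collapse runs of SP/HTAB to one space, except between double quotes.

    Deliberately naive (an assumption for illustration): every '"' toggles
    the mode, and comments in parentheses are not recognised at all.
    """
    out = []
    in_quotes = False
    prev_space = False
    for ch in header:
        if ch == '"':
            in_quotes = not in_quotes
        if not in_quotes and ch in " \t":
            # Outside quotes: emit at most one space per whitespace run.
            if not prev_space:
                out.append(" ")
            prev_space = True
        else:
            out.append(ch)
            prev_space = False
    return "".join(out)


if __name__ == "__main__":
    # The lone quote inside the comment flips the mode for the rest of the line.
    print(collapse_naive('Retry-After: 120 (back  at 5")   ;duration=60   ;x="a  b"'))
```

With that input the whitespace after the stray quote is left alone, and 
the whitespace inside the real quoted string at the end is collapsed, 
which is exactly backwards.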

For the two headers mentioned in 3261 that allow comments, screwing up 
the whitespace for the remainder of the header may not be a problem. But 
if it continues further than that it of course could be. And then there 
could always be extension headers that allow comments that could result 
in more trouble.

I can probably keep coming up with counterexamples till the cows come 
home. Admittedly they are obscure. It just depends on how much you want 
to play the odds on what people do.

> But of course quotes can appear in bodies, with any restrictions being 
> imposed solely by the Content-Type of the body.
> 
> FM: I plan to change the code to stop at the end of the SIP headers.
> 
> 
>> And they may be escaped in funny ways.
> 
>> FM: If they are escaped, doesn't that mean they are already inside another
>> set of quotes?  The escaped sequence will have to check for escaped
>> characters?!
> 
> Well, there is at least the \" escaping in quoted-string. Maybe that is 
> all that is relevant. You can deal with that *if* you can figure out 
> that you are in a quoted-string, which requires that you can tell that a 
> quote you find actually introduces a quoted-string, rather than being 
> part of a comment or some other thing (tbd, maybe non-existent) that 
> doesn't introduce a quoted-string.
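
The "can you tell that a quote actually introduces a quoted-string" 
question can be sketched as follows. This is a hedged illustration, not 
anyone's actual parser: quoted_string_spans and the comments_allowed 
flag are hypothetical, and the caller is assumed to know from the header 
name whether that header's grammar permits comments.

```python
from typing import List, Tuple


def quoted_string_spans(value: str, comments_allowed: bool) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs of quoted strings in a header value.

    comments_allowed is an assumption supplied by the caller: whether this
    header's grammar permits parenthesised comments. Inside a comment a '"'
    is treated as an ordinary character.
    """
    spans = []
    depth = 0  # current comment nesting depth
    i, n = 0, len(value)
    while i < n:
        ch = value[i]
        if depth > 0:
            if ch == "\\":  # quoted-pair inside a comment: skip both chars
                i += 2
                continue
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            i += 1
        elif ch == "(" and comments_allowed:
            depth = 1
            i += 1
        elif ch == '"':
            j = i + 1
            while j < n and value[j] != '"':
                j += 2 if value[j] == "\\" else 1  # step over \" escapes
            j = min(j + 1, n)  # include the closing quote if there is one
            spans.append((i, j))
            i = j
        else:
            i += 1
    return spans


if __name__ == "__main__":
    v = '120 (back at 5") ;note="a  b"'
    print([v[a:b] for a, b in quoted_string_spans(v, True)])
    print([v[a:b] for a, b in quoted_string_spans(v, False)])
```

With comments_allowed=True only the real quoted string is reported. With 
it False, the stray quote in the comment pairs up with the opening quote 
of the real string, and everything after that is misread.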
> 
> FM: Hmm.  If we assume that collapsing the contents of a comment is not a
> problem, then presumably not collapsing a quoted string within a comment is
> ok too?

Yeah, not collapsing should not be a problem. :-)

The problem is that when you mistakenly get into the not-collapsing 
mode, it means that the next quote you see will put you into the 
collapsing mode exactly when it shouldn't.

> So, the changes required then to this code snippet:
> 
> 1) Recognize and skip over quoted strings, paying attention to escaped
> quotes inside the string (I'll ignore the single quote in the comment issue
> for the moment, hoping that that’s VERY uncommon)
> 
> 2) Stop the collapsing at the end of the SIP headers
> 
> 3) Add check for horizontal tab as the start of a continuation line
> 
> Anything else that anybody can think of?  I'll post the updated version once
> I have tested it a bit...
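
For reference, the three changes listed above might look roughly like 
this in Python. This is a sketch under stated assumptions, not the 
snippet under discussion: collapse_headers is a hypothetical name, 
lines are assumed to end in CRLF, the headers are assumed to end at 
the first empty line, and the quote-inside-a-comment case is ignored 
as proposed.

```python
WSP = " \t"  # a continuation line may start with SP or HTAB (change 3)


def _is_fold(s: str, i: int) -> bool:
    """True if s[i:] starts with CRLF followed by SP or HTAB (a folded line)."""
    return s.startswith("\r\n", i) and i + 2 < len(s) and s[i + 2] in WSP


def collapse_headers(message: str) -> str:
    """Collapse each run of linear whitespace in the headers to one space."""
    # Change 2: stop at the end of the headers; the body is never touched.
    head, sep, body = message.partition("\r\n\r\n")
    out = []
    i, n = 0, len(head)
    while i < n:
        ch = head[i]
        if ch == '"':
            # Change 1: copy a quoted string verbatim, stepping over \" escapes.
            j = i + 1
            while j < n and head[j] != '"':
                j += 2 if head[j] == "\\" else 1
            j = min(j + 1, n)  # include the closing quote if present
            out.append(head[i:j])
            i = j
        elif ch in WSP or _is_fold(head, i):
            # Consume the whole whitespace run, including any line folds.
            j = i
            while j < n:
                if head[j] in WSP:
                    j += 1
                elif _is_fold(head, j):
                    j += 3
                else:
                    break
            out.append(" ")
            i = j
        else:
            out.append(ch)
            i += 1
    return "".join(out) + sep + body


if __name__ == "__main__":
    msg = 'Subject:  hello\r\n\tworld\r\nContact: "A  \\"B\\"  C"   <sip:a@b>\r\n\r\nbody  "  text'
    print(repr(collapse_headers(msg)))
```

An unterminated quoted string is copied through to the end of the 
headers unchanged, so a stray quote still disables collapsing for 
everything after it. That is the residual risk being accepted.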

In the end, what does this buy you? You still touch and move every 
character. How is it any better than doing pretty much the same thing 
everywhere in the syntax where LWS is allowed?

Or are you trying to put this into a form where you can parse with REs 
or something like that, as Iñaki seems to be doing? (While that is 
interesting from an academic perspective, IMO it is not a practical 
approach.)

        Paul
_______________________________________________
Sip-implementors mailing list
[email protected]
https://lists.cs.columbia.edu/cucslists/listinfo/sip-implementors