-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 4/12/2014 12:04 p.m., Alex Rousskov wrote: > On 12/03/2014 07:46 AM, Amos Jeffries wrote: >> On 21/11/2014 5:38 p.m., Alex Rousskov wrote: >>> On 11/17/2014 05:56 PM, Amos Jeffries wrote: > >>>> + while (ConfigParser::RecognizeQuotedPair_ && >>>> *(nextToken-1) == '\\') { >> >>> Are you sure that nextToken never points at the start of a >>> buffer, especially when parsing parameters loaded from external >>> files? If you are absolutely sure, this guarantee should be >>> documented where ConfigParser::TokenParse() is defined. >>> Otherwise, the code has to be revised (but still support regex >>> parameter files that start with a backslash). >> >> I dont believe that is possible. nextToken always points at the >> current buffer, > > Yes, but I am worried that nextToken can point to the very > beginning of the current buffer, rendering *(nextToken-1) invalid. > > >> There is an un-conditional strcspn() call prior to the >> while-loop which will move the nextToken pointer down into the >> buffer at least one character in distance. > > Why do you expect that strcspn() to always return a positive > value?
Return value is a size_t (unsigned). > >> If the buffer starts with a '\' character the nextToken should >> be pointing at its escaped value character. > > Where would nextToken be increased in that case, prior to reaching > the loop? AFAICT, strcspn() would return zero when the buffer > starts with a backslash and, hence, nextToken will remain pointing > at the beginning of the buffer. It is possible that I have > overlooked some other code that makes that impossible, but even if > I did, the change looks very fragile at best. > > >>> The second problem with this code is that it incorrectly >>> handles "\\X" AFAICT: X should not be escaped, even though it >>> is preceded by a backslash (because that backslash is itself >>> escaped). > >> This code does not un-escape anything. All it does is notice '\' >> escapes and ensures that both the '\' and character following >> are included in the token > > Unfortunately, the same code also includes characters that were > _not_ protected/escaped/whatever by '\'. Details below. > > >> The regex pattern parser is responsible for any un-escaping and >> processing of the token itself. > > Agreed. I am only talking about tokenizing, which is this code > responsibility. > > >> while (... *(nextToken-1) == '\\') { ++nextToken; // skip the >> quoted-pair (\-escaped) character > > > I am talking about the "skip" code above. It is incorrectly > skipping X in "\\X". For example, it will "skip" (i.e., include in > token) a space character in > > foo\\ bar > > The above text has two space-separated tokens: "foo\\" and "bar", > but the loop will treat that text as one token, because it sees the > space character as "escaped". That is wrong AFAICT. > Ah, I see. The !ConfigParser::RecognizeQuotedValues test was preventing the correct delimiter set being used until it had already reached the loop. Fixed now. Amos -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (MingW32) iQEcBAEBAgAGBQJUgHC0AAoJELJo5wb/XPRjmtgIALHRlXeSgKo0+m1SfzCdO8VC 7N1iRtM1cj/9xbMaWixhA06ZVEQ9H7LbKWJi3cbVd+wxB34Y/1piEAqCezUqd3jG LzA6SFvJmetmuwLnGB96tj7k23AEtrn0ae9MviyP1CEXIeq5ddfb45xEp90dsZR2 2j/qBqJluY3pP5bil3jIDuE6YrPfJ4y+4WgY+JV/wLX2fqqveTj3pQGk21zrU/yq NbHA/7aNw9sxapU6s4zVASpgvXovf/TNS2E2QgIWBb0qRiEaJIMFBxC2ZQV0FEkt 61WsTe0qerG1oUIdfCXq2oPdW2coMVdlp1QR1h1dNbGFMB3yK50OxtQ/jonKYWI= =X5Z7 -----END PGP SIGNATURE----- _______________________________________________ squid-dev mailing list squid-dev@lists.squid-cache.org http://lists.squid-cache.org/listinfo/squid-dev