[GitHub] trafficserver pull request #866: TS-2237: Fix double encoding of URLs in squ...
Github user shinrich closed the pull request at: https://github.com/apache/trafficserver/pull/866 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] trafficserver pull request #866: TS-2237: Fix double encoding of URLs in squ...
Github user shinrich commented on a diff in the pull request: https://github.com/apache/trafficserver/pull/866#discussion_r74760880 --- Diff: proxy/logging/LogUtils.cc --- @@ -359,6 +359,23 @@ LogUtils::escapify_url(Arena *arena, char *url, size_t len_in, int *len_out, cha while (from < in_url_end) { unsigned char c = *from; if (map[c / 8] & (1 << (7 - c % 8))) { + /* + * If two characters following a '%' don't need to be encoded, then it must + * mean that the three character sequence is already encoded. Just copy it over. + */ + if ((*from == '%') && ((from + 2) < in_url_end)) { +unsigned char c1 = *(from + 1); +unsigned char c2 = *(from + 2); +bool needsEncoding = ((map[c1 / 8] & (1 << (7 - c1 % 8))) || (map[c2 / 8] & (1 << (7 - c2 % 8; +if (!needsEncoding) { + out_len -= 2; + *to++ = *from; + from++; + Debug("log-utils", "character already encoded..skipping %c, %c, %c", *from, *(from + 1), *(from + 2)); --- End diff -- Yes, *to++ = *from++ should be equivalent. And moving up the Debug statement looks like the right thing to do. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] trafficserver pull request #866: TS-2237: Fix double encoding of URLs in squ...
Github user shinrich commented on a diff in the pull request: https://github.com/apache/trafficserver/pull/866#discussion_r74760536 --- Diff: configure.ac --- @@ -49,7 +49,7 @@ AM_INIT_AUTOMAKE([-Wall -Werror -Wno-portability tar-ustar foreign no-installinf AM_MAINTAINER_MODE([enable]) # Enable a recursive "tidy" rule for clang-tidy. -AM_EXTRA_RECURSIVE_TARGETS([tidy]) +#AM_EXTRA_RECURSIVE_TARGETS([tidy]) --- End diff -- Sorry. Will fix. Really need to figure out how to get the newer autotools working in my environment --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] trafficserver pull request #866: TS-2237: Fix double encoding of URLs in squ...
Github user zwoop commented on a diff in the pull request: https://github.com/apache/trafficserver/pull/866#discussion_r74707817 --- Diff: proxy/logging/LogUtils.cc --- @@ -359,6 +359,23 @@ LogUtils::escapify_url(Arena *arena, char *url, size_t len_in, int *len_out, cha while (from < in_url_end) { unsigned char c = *from; if (map[c / 8] & (1 << (7 - c % 8))) { + /* + * If two characters following a '%' don't need to be encoded, then it must + * mean that the three character sequence is already encoded. Just copy it over. + */ + if ((*from == '%') && ((from + 2) < in_url_end)) { +unsigned char c1 = *(from + 1); +unsigned char c2 = *(from + 2); +bool needsEncoding = ((map[c1 / 8] & (1 << (7 - c1 % 8))) || (map[c2 / 8] & (1 << (7 - c2 % 8; +if (!needsEncoding) { + out_len -= 2; + *to++ = *from; + from++; + Debug("log-utils", "character already encoded..skipping %c, %c, %c", *from, *(from + 1), *(from + 2)); --- End diff -- AH, I see, out_len is padded with count*2 (where count is the number of special characters). So, out_len -=2 is correct. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] trafficserver pull request #866: TS-2237: Fix double encoding of URLs in squ...
Github user zwoop commented on a diff in the pull request: https://github.com/apache/trafficserver/pull/866#discussion_r74707669 --- Diff: proxy/logging/LogUtils.cc --- @@ -359,6 +359,23 @@ LogUtils::escapify_url(Arena *arena, char *url, size_t len_in, int *len_out, cha while (from < in_url_end) { unsigned char c = *from; if (map[c / 8] & (1 << (7 - c % 8))) { + /* + * If two characters following a '%' don't need to be encoded, then it must + * mean that the three character sequence is already encoded. Just copy it over. + */ + if ((*from == '%') && ((from + 2) < in_url_end)) { +unsigned char c1 = *(from + 1); +unsigned char c2 = *(from + 2); +bool needsEncoding = ((map[c1 / 8] & (1 << (7 - c1 % 8))) || (map[c2 / 8] & (1 << (7 - c2 % 8; +if (!needsEncoding) { + out_len -= 2; + *to++ = *from; + from++; + Debug("log-utils", "character already encoded..skipping %c, %c, %c", *from, *(from + 1), *(from + 2)); --- End diff -- Hmmm, so some questions on this: 1) Why not *to++ = *from++; ? 2) Since we now moved from forward, is the Debug() line still correct? Seems that it'd be one too much ? 3) I'm not sure I understand this logic, it seems it consumes 2 bytes (out_len -= 2), but it only writes one (*to++ = *from) ? Shouldn't this consume / copy all 3 bytes ? That's sort of what the comments imply, no? 4) It might be nice to explain (comment) what all that bit shifting and logic is actually doing? Presumably it's checking if c1 or c2 is of a particular value, but what values are those? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] trafficserver pull request #866: TS-2237: Fix double encoding of URLs in squ...
Github user zwoop commented on a diff in the pull request: https://github.com/apache/trafficserver/pull/866#discussion_r74707413 --- Diff: configure.ac --- @@ -49,7 +49,7 @@ AM_INIT_AUTOMAKE([-Wall -Werror -Wno-portability tar-ustar foreign no-installinf AM_MAINTAINER_MODE([enable]) # Enable a recursive "tidy" rule for clang-tidy. -AM_EXTRA_RECURSIVE_TARGETS([tidy]) +#AM_EXTRA_RECURSIVE_TARGETS([tidy]) --- End diff -- Probably shouldn't remove this? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] trafficserver pull request #866: TS-2237: Fix double encoding of URLs in squ...
GitHub user shinrich opened a pull request: https://github.com/apache/trafficserver/pull/866 TS-2237: Fix double encoding of URLs in squid logs. Resurrecting @sudheerv's fix. We've been running with this fix for over a year. The logic to lookup the character in the bitfield array is a bit odd, but it is the same indexing done for the original character lookup. You can merge this pull request into a Git repository by running: $ git pull https://github.com/shinrich/trafficserver ts-2237 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/trafficserver/pull/866.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #866 commit 22c286947308de594564ac2cde8d72463417ea24 Author: shinrichDate: 2016-08-12T22:04:04Z TS-2237: Fix double encoding of URLs in squid logs. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---