[GitHub] trafficserver pull request #866: TS-2237: Fix double encoding of URLs in squ...

2016-09-07 Thread shinrich
Github user shinrich closed the pull request at:

https://github.com/apache/trafficserver/pull/866


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] trafficserver pull request #866: TS-2237: Fix double encoding of URLs in squ...

2016-08-15 Thread shinrich
Github user shinrich commented on a diff in the pull request:

https://github.com/apache/trafficserver/pull/866#discussion_r74760880
  
--- Diff: proxy/logging/LogUtils.cc ---
@@ -359,6 +359,23 @@ LogUtils::escapify_url(Arena *arena, char *url, size_t 
len_in, int *len_out, cha
   while (from < in_url_end) {
 unsigned char c = *from;
 if (map[c / 8] & (1 << (7 - c % 8))) {
+  /*
+   * If two characters following a '%' don't need to be encoded, then 
it must
+   * mean that the three character sequence is already encoded.  Just 
copy it over.
+   */
+  if ((*from == '%') && ((from + 2) < in_url_end)) {
+unsigned char c1   = *(from + 1);
+unsigned char c2   = *(from + 2);
+bool needsEncoding = ((map[c1 / 8] & (1 << (7 - c1 % 8))) || 
(map[c2 / 8] & (1 << (7 - c2 % 8;
+if (!needsEncoding) {
+  out_len -= 2;
+  *to++ = *from;
+  from++;
+  Debug("log-utils", "character already encoded..skipping %c, %c, 
%c", *from, *(from + 1), *(from + 2));
--- End diff --

Yes, *to++ = *from++ should be equivalent.  And moving up the Debug 
statement looks like the right thing to do.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] trafficserver pull request #866: TS-2237: Fix double encoding of URLs in squ...

2016-08-15 Thread shinrich
Github user shinrich commented on a diff in the pull request:

https://github.com/apache/trafficserver/pull/866#discussion_r74760536
  
--- Diff: configure.ac ---
@@ -49,7 +49,7 @@ AM_INIT_AUTOMAKE([-Wall -Werror -Wno-portability 
tar-ustar foreign no-installinf
 AM_MAINTAINER_MODE([enable])
 
 # Enable a recursive "tidy" rule for clang-tidy.
-AM_EXTRA_RECURSIVE_TARGETS([tidy])
+#AM_EXTRA_RECURSIVE_TARGETS([tidy])
--- End diff --

Sorry.  Will fix.  Really need to figure out how to get the newer autotools 
working in my environment


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] trafficserver pull request #866: TS-2237: Fix double encoding of URLs in squ...

2016-08-14 Thread zwoop
Github user zwoop commented on a diff in the pull request:

https://github.com/apache/trafficserver/pull/866#discussion_r74707817
  
--- Diff: proxy/logging/LogUtils.cc ---
@@ -359,6 +359,23 @@ LogUtils::escapify_url(Arena *arena, char *url, size_t 
len_in, int *len_out, cha
   while (from < in_url_end) {
 unsigned char c = *from;
 if (map[c / 8] & (1 << (7 - c % 8))) {
+  /*
+   * If two characters following a '%' don't need to be encoded, then 
it must
+   * mean that the three character sequence is already encoded.  Just 
copy it over.
+   */
+  if ((*from == '%') && ((from + 2) < in_url_end)) {
+unsigned char c1   = *(from + 1);
+unsigned char c2   = *(from + 2);
+bool needsEncoding = ((map[c1 / 8] & (1 << (7 - c1 % 8))) || 
(map[c2 / 8] & (1 << (7 - c2 % 8;
+if (!needsEncoding) {
+  out_len -= 2;
+  *to++ = *from;
+  from++;
+  Debug("log-utils", "character already encoded..skipping %c, %c, 
%c", *from, *(from + 1), *(from + 2));
--- End diff --

AH, I see, out_len is padded with count*2 (where count is the number of 
special characters). So, out_len -=2 is correct.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] trafficserver pull request #866: TS-2237: Fix double encoding of URLs in squ...

2016-08-14 Thread zwoop
Github user zwoop commented on a diff in the pull request:

https://github.com/apache/trafficserver/pull/866#discussion_r74707669
  
--- Diff: proxy/logging/LogUtils.cc ---
@@ -359,6 +359,23 @@ LogUtils::escapify_url(Arena *arena, char *url, size_t 
len_in, int *len_out, cha
   while (from < in_url_end) {
 unsigned char c = *from;
 if (map[c / 8] & (1 << (7 - c % 8))) {
+  /*
+   * If two characters following a '%' don't need to be encoded, then 
it must
+   * mean that the three character sequence is already encoded.  Just 
copy it over.
+   */
+  if ((*from == '%') && ((from + 2) < in_url_end)) {
+unsigned char c1   = *(from + 1);
+unsigned char c2   = *(from + 2);
+bool needsEncoding = ((map[c1 / 8] & (1 << (7 - c1 % 8))) || 
(map[c2 / 8] & (1 << (7 - c2 % 8;
+if (!needsEncoding) {
+  out_len -= 2;
+  *to++ = *from;
+  from++;
+  Debug("log-utils", "character already encoded..skipping %c, %c, 
%c", *from, *(from + 1), *(from + 2));
--- End diff --

Hmmm, so some questions on this:

1) Why not  *to++ = *from++;  ?
2) Since we now moved from forward, is the Debug() line still correct? 
Seems that it'd be one too much ?
3) I'm not sure I understand this logic, it seems it consumes 2 bytes 
(out_len -= 2), but it only writes one (*to++ = *from) ? Shouldn't this consume 
/ copy all 3 bytes ? That's sort of what the comments imply, no?
4) It might be nice to explain (comment) what all that bit shifting and 
logic is actually doing? Presumably it's checking if c1 or c2 is of a 
particular value, but what values are those?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] trafficserver pull request #866: TS-2237: Fix double encoding of URLs in squ...

2016-08-14 Thread zwoop
Github user zwoop commented on a diff in the pull request:

https://github.com/apache/trafficserver/pull/866#discussion_r74707413
  
--- Diff: configure.ac ---
@@ -49,7 +49,7 @@ AM_INIT_AUTOMAKE([-Wall -Werror -Wno-portability 
tar-ustar foreign no-installinf
 AM_MAINTAINER_MODE([enable])
 
 # Enable a recursive "tidy" rule for clang-tidy.
-AM_EXTRA_RECURSIVE_TARGETS([tidy])
+#AM_EXTRA_RECURSIVE_TARGETS([tidy])
--- End diff --

Probably shouldn't remove this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] trafficserver pull request #866: TS-2237: Fix double encoding of URLs in squ...

2016-08-12 Thread shinrich
GitHub user shinrich opened a pull request:

https://github.com/apache/trafficserver/pull/866

TS-2237: Fix double encoding of URLs in squid logs.

Resurrecting @sudheerv's fix.  We've been running with this fix for over a 
year.  The logic to lookup the character in the bitfield array is a bit odd, 
but it is the same indexing done for the original character lookup.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shinrich/trafficserver ts-2237

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/trafficserver/pull/866.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #866


commit 22c286947308de594564ac2cde8d72463417ea24
Author: shinrich 
Date:   2016-08-12T22:04:04Z

TS-2237: Fix double encoding of URLs in squid logs.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---