[
https://issues.apache.org/jira/browse/TS-2237?focusedWorklogId=26418&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-26418
]
ASF GitHub Bot logged work on TS-2237:
--------------------------------------
Author: ASF GitHub Bot
Created on: 14/Aug/16 20:07
Start Date: 14/Aug/16 20:07
Worklog Time Spent: 10m
Work Description: GitHub user zwoop commented on a diff in the pull request:
https://github.com/apache/trafficserver/pull/866#discussion_r74707669
--- Diff: proxy/logging/LogUtils.cc ---
@@ -359,6 +359,23 @@ LogUtils::escapify_url(Arena *arena, char *url, size_t len_in, int *len_out, cha
while (from < in_url_end) {
unsigned char c = *from;
if (map[c / 8] & (1 << (7 - c % 8))) {
+ /*
+ * If two characters following a '%' don't need to be encoded, then it must
+ * mean that the three character sequence is already encoded. Just copy it over.
+ */
+ if ((*from == '%') && ((from + 2) < in_url_end)) {
+ unsigned char c1 = *(from + 1);
+ unsigned char c2 = *(from + 2);
+ bool needsEncoding = ((map[c1 / 8] & (1 << (7 - c1 % 8))) || (map[c2 / 8] & (1 << (7 - c2 % 8))));
+ if (!needsEncoding) {
+ out_len -= 2;
+ *to++ = *from;
+ from++;
+ Debug("log-utils", "character already encoded..skipping %c, %c, %c", *from, *(from + 1), *(from + 2));
--- End diff --
Hmmm, so some questions on this:
1) Why not *to++ = *from++; ?
2) Since from has already been advanced at this point, is the Debug() line
still correct? It looks like it would be off by one.
3) I'm not sure I understand this logic: it seems to consume 2 bytes
(out_len -= 2), but it only writes one (*to++ = *from). Shouldn't this consume
and copy all 3 bytes? That's sort of what the comment implies, no?
4) It might be nice to explain (in a comment) what all that bit shifting and
logic is actually doing. Presumably it's checking whether c1 or c2 has a
particular value, but what values are those?
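For what it's worth on 3) and 4), the expression map[c / 8] & (1 << (7 - c % 8))
reads like a lookup in a 256-bit bitmap stored as 32 bytes, most significant bit
first: byte c / 8 holds the flag for character c at bit 7 - c % 8. Below is a
minimal sketch under that assumption, which also copies all three bytes of an
already-encoded sequence. The names needs_escape, set_escape and
copy_if_already_encoded are hypothetical and are not functions in LogUtils.cc.

```cpp
#include <cstddef>
#include <cstring>

// 256-bit map, one bit per byte value, most significant bit first.
// Assumed layout, inferred from the expression in the diff.
static bool needs_escape(const unsigned char map[32], unsigned char c)
{
  return (map[c / 8] & (1 << (7 - c % 8))) != 0;
}

// Hypothetical helper, used only to build a map for this sketch.
static void set_escape(unsigned char map[32], unsigned char c)
{
  map[c / 8] |= static_cast<unsigned char>(1 << (7 - c % 8));
}

// If from points at '%' followed by two characters that need no escaping,
// treat "%XY" as already encoded: copy all three bytes to `to` and return 3.
// Otherwise copy nothing and return 0.
static size_t copy_if_already_encoded(const unsigned char map[32], const char *from,
                                      const char *end, char *to)
{
  if (*from != '%' || end - from < 3) {
    return 0;
  }
  unsigned char c1 = static_cast<unsigned char>(from[1]);
  unsigned char c2 = static_cast<unsigned char>(from[2]);
  if (needs_escape(map, c1) || needs_escape(map, c2)) {
    return 0;
  }
  std::memcpy(to, from, 3);
  return 3;
}
```

With a helper shaped like this, the caller could advance both from and to by the
returned count and subtract 2 from the precomputed output length, since the '%'
no longer expands to three bytes.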
Issue Time Tracking
-------------------
Worklog Id: (was: 26418)
Time Spent: 1h 10m (was: 1h)
> URL encoding wrong in squid.blog
> --------------------------------
>
> Key: TS-2237
> URL: https://issues.apache.org/jira/browse/TS-2237
> Project: Traffic Server
> Issue Type: Bug
> Components: Logging
> Reporter: David Carlin
> Priority: Minor
> Labels: yahoo
> Fix For: sometime
>
> Attachments: TS-2237.diff
>
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> I was replaying URLs captured from squid.blog and I noticed I was getting
> 404's for some of them when squid.blog showed a 200 for that request. Turns
> out there is an issue with URL encoding. For example:
> Requesting file 'duck%20sports%20authority.gif' via curl will put this in the
> logs:
> duck%2520sports%2520authority.gif
> The % from %20 (space) in the request is being converted to %25 resulting in
> %2520
> I tested both the %<cquc> and %<cquuc> log fields - same thing happens. I
> tested on ATS 3.2.0 and 3.3.5
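The double encoding described in the report can be reproduced with a small
standalone sketch. It assumes an escaper that unconditionally encodes '%' along
with the space character; escape_naive is a hypothetical name used only for
illustration, not the Traffic Server implementation.

```cpp
#include <string>

// Hypothetical naive escaper: for brevity it encodes only ' ' and '%'.
// Because '%' is always encoded, an input that already contains "%20"
// comes out as "%2520".
static std::string escape_naive(const std::string &in)
{
  static const char hex[] = "0123456789ABCDEF";
  std::string out;
  for (unsigned char c : in) {
    if (c == ' ' || c == '%') {
      out += '%';
      out += hex[c >> 4];
      out += hex[c & 0x0F];
    } else {
      out += static_cast<char>(c);
    }
  }
  return out;
}
```

Calling escape_naive("duck%20sports%20authority.gif") yields
"duck%2520sports%2520authority.gif", which matches the logged value in the
report.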