Hi,

There are two bugs in the current uri-encode procedure in (web uri).

First, an octet less than 16 is encoded as % HEXDIGIT instead of
% HEXDIGIT HEXDIGIT.

scheme@(guile-user)> (uri-encode "foo\nbar")
$30 = "foo%abar"

Second, if a string contains no unreserved characters, nothing gets
encoded.

scheme@(guile-user)> (uri-encode "<>\\^")
$31 = "<>\\^"
scheme@(guile-user)> (uri-encode "<>\\^a")
$32 = "%3c%3e%5c%5ea"

Patches attached.

Cheers,

-- 
Ian Price -- shift-reset.com

"Programming is like pinball. The reward for doing it well is
the opportunity to do it again" - from "The Wizardry Compiled"

From 11f56bd6a4fdf1331ea30cd68b4d77e35215b4a5 Mon Sep 17 00:00:00 2001
From: Ian Price <[email protected]>
Date: Mon, 20 Aug 2012 23:03:38 +0100
Subject: [PATCH 1/2] Fix uri-encoding for octets 0-15

* module/web/uri.scm (uri-encode): All encoded octets should be of the form
  % HEXDIGIT HEXDIGIT.
* test-suite/tests/web-uri.test ("encode"): Add test.
---
 module/web/uri.scm            |    2 ++
 test-suite/tests/web-uri.test |    3 ++-
 2 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/module/web/uri.scm b/module/web/uri.scm
index 109118b..3816d02 100644
--- a/module/web/uri.scm
+++ b/module/web/uri.scm
@@ -377,6 +377,8 @@ the byte."
                      (if (< i len)
                          (let ((byte (bytevector-u8-ref bv i)))
                            (display #\% port)
+                           (when (< byte 16)
+                             (display #\0 port))
                            (display (number->string byte 16) port)
                            (lp (1+ i))))))))
           str)))
diff --git a/test-suite/tests/web-uri.test b/test-suite/tests/web-uri.test
index 4621a19..a9ded46 100644
--- a/test-suite/tests/web-uri.test
+++ b/test-suite/tests/web-uri.test
@@ -258,4 +258,5 @@
     (equal? "foo bar" (uri-decode "foo+bar"))))
 
 (with-test-prefix "encode"
-  (pass-if (equal? "foo%20bar" (uri-encode "foo bar"))))
+  (pass-if (equal? "foo%20bar" (uri-encode "foo bar")))
+  (pass-if (equal? "foo%0a%00bar" (uri-encode "foo\n\x00bar"))))
-- 
1.7.7.6

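The padding rule from the first patch can be tried on its own at the REPL. This is only a sketch: hex-octet is a made-up helper name, not something that exists in (web uri), and it builds a string where the real code writes to a port.

```scheme
;; Hypothetical helper, not part of (web uri): render one octet as
;; % HEXDIGIT HEXDIGIT, adding a leading zero for octets below 16.
(define (hex-octet byte)
  (string-append "%"
                 (if (< byte 16) "0" "")
                 (number->string byte 16)))

(display (hex-octet 10))   ; newline, prints %0a
(newline)
(display (hex-octet 32))   ; space, prints %20
(newline)
```

Without the (< byte 16) branch the first call would give "%a", which is the "foo%abar" result from the report.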
From ae4fa3f65c1d49822b5a284a065017673c81e65e Mon Sep 17 00:00:00 2001
From: Ian Price <[email protected]>
Date: Mon, 20 Aug 2012 23:12:23 +0100
Subject: [PATCH 2/2] Fix uri-encoding for strings with no unreserved chars

* module/web/uri.scm (uri-encode): Change test to check for unreserved chars
  instead of reserved chars.
* test-suite/tests/web-uri.test ("encode"): Add test.
---
 module/web/uri.scm            |    4 +++-
 test-suite/tests/web-uri.test |    3 ++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/module/web/uri.scm b/module/web/uri.scm
index 3816d02..78614a5 100644
--- a/module/web/uri.scm
+++ b/module/web/uri.scm
@@ -364,7 +364,9 @@ Percent-encoding first writes out the given character to a bytevector
 within the given @var{encoding}, then encodes each byte as
 @code{%@var{HH}}, where @var{HH} is the hexadecimal representation of
 the byte."
-  (if (string-index str unescaped-chars)
+  (define (needs-escaped? ch)
+    (not (char-set-contains? unescaped-chars ch)))
+  (if (string-index str needs-escaped?)
       (call-with-output-string*
        (lambda (port)
          (string-for-each
diff --git a/test-suite/tests/web-uri.test b/test-suite/tests/web-uri.test
index a9ded46..3f6e7e3 100644
--- a/test-suite/tests/web-uri.test
+++ b/test-suite/tests/web-uri.test
@@ -259,4 +259,5 @@
 
 (with-test-prefix "encode"
   (pass-if (equal? "foo%20bar" (uri-encode "foo bar")))
-  (pass-if (equal? "foo%0a%00bar" (uri-encode "foo\n\x00bar"))))
+  (pass-if (equal? "foo%0a%00bar" (uri-encode "foo\n\x00bar")))
+  (pass-if (equal? "%3c%3e%5c%5e" (uri-encode "<>\\^"))))
-- 
1.7.7.6
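To see why the old guard misfired: string-index with a char-set returns the index of the first character that is in the set, so passing unescaped-chars asked "is there anything here that does not need escaping?" rather than the opposite. A small sketch, where unreserved is a stand-in I am assuming is close to the module's unescaped-chars (ASCII letters, digits and -._~); the procedures used are available in Guile by default.

```scheme
;; Assumed stand-in for unescaped-chars: ASCII letters, digits and -._~
(define unreserved
  (string->char-set
   "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-._~"))

;; Same predicate the patch introduces, written against the stand-in set.
(define (needs-escaped? ch)
  (not (char-set-contains? unreserved ch)))

;; Old guard: index of the first character that IS in the set.
(string-index "<>\\^" unreserved)       ; => #f, so the string was returned as-is
(string-index "<>\\^a" unreserved)      ; => 4, encoding only ran because of the "a"

;; New guard: index of the first character that needs escaping.
(string-index "<>\\^" needs-escaped?)   ; => 0
(string-index "foo" needs-escaped?)     ; => #f, nothing to encode
```

With the predicate form, the encoding branch is taken exactly when at least one character has to be escaped, and a string made up only of unreserved characters is still returned unchanged.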
