After a closer look, I concluded that there was a bit more work to be done,
though the modifications are still very basic.
Attached is a version of urlencode() which seems to make the most sense
to me.
I wonder how I could officially propose at least some of these
modifications.
- Dan
Bill Janssen wrote:
Bill Janssen <jans...@parc.com> wrote:
Dan Mahn <dan.m...@digidescorp.com> wrote:
3) Regarding the following code fragment in urlencode():
    k = quote_plus(str(k))
    if isinstance(v, str):
        v = quote_plus(v)
        l.append(k + '=' + v)
    elif isinstance(v, str):
        # is there a reasonable way to convert to ASCII?
        # encode generates a string, but "replace" or "ignore"
        # lose information and "strict" can raise UnicodeError
        v = quote_plus(v.encode("ASCII","replace"))
        l.append(k + '=' + v)
I don't understand how the "elif" section is invoked, as it uses the
same condition as the "if" section.
This looks like a 2->3 bug; clearly only the second branch should be
used in Py3K. And that "replace" is also a bug; it should signal an
error on encoding failures. It should probably catch UnicodeError and
explain the problem, which is that only Latin-1 values can be passed in
the query string. So the encode() to "ASCII" is also a mistake; it
should be "ISO-8859-1", and the "replace" should be a "strict", I think.
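To make the difference between the error handlers concrete, here is a minimal sketch. The helper name encode_query_value is hypothetical, and the choice of ISO-8859-1 with "strict" simply follows the suggestion above; it is not a description of what the library actually does.

```python
from urllib.parse import quote_plus

def encode_query_value(value, encoding="ISO-8859-1"):
    """Hypothetical helper: percent-encode a str value, failing loudly
    when it cannot be represented in the chosen encoding."""
    try:
        raw = value.encode(encoding, "strict")
    except UnicodeError as err:
        # Explain the problem instead of silently losing information
        raise ValueError(
            "query value %r cannot be encoded as %s" % (value, encoding)
        ) from err
    return quote_plus(raw)

# "replace" silently turns the unencodable character into "?"
lossy = "caf\u00e9".encode("ASCII", "replace")   # b'caf?'

# ISO-8859-1 can represent the same character, so nothing is lost
exact = encode_query_value("caf\u00e9")          # 'caf%E9'
```

A value outside Latin-1, such as "\u20ac", raises ValueError in this sketch rather than being replaced.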
Sorry! In 3.0.1, this whole thing boils down to
    l.append(quote_plus(k) + '=' + quote_plus(v))
Bill
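As a rough illustration of what that one-liner produces, the sketch below wraps it in a hypothetical helper, encode_pair, and runs it against urllib.parse.quote_plus from a current Python 3 release. The behaviour of 3.0.1 itself is assumed to be the same for these inputs, not verified.

```python
from urllib.parse import quote_plus

def encode_pair(k, v):
    # Hypothetical helper mirroring the one-liner quoted above
    return quote_plus(k) + '=' + quote_plus(v)

# Spaces become "+", and "&" is percent-encoded so it cannot be
# mistaken for a parameter separator
pair = encode_pair('full name', 'Dan & Bill')   # 'full+name=Dan+%26+Bill'

# With no encoding argument, str values are encoded as UTF-8
utf8 = encode_pair('q', 'caf\u00e9')            # 'q=caf%C3%A9'
```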
import sys
from urllib.parse import quote_plus  # only needed outside urllib.parse

def urlencode(query, doseq=0, safe='', encoding=None, errors=None):
    """Encode a sequence of two-element tuples or dictionary into a URL query
    string.

    If any values in the query arg are sequences and doseq is true, each
    sequence element is converted to a separate parameter.

    If the query arg is a sequence of two-element tuples, the order of the
    parameters in the output will match the order of parameters in the
    input.
    """

    if hasattr(query,"items"):
        # mapping objects
        query = query.items()
    else:
        # it's a bother at times that strings and string-like objects are
        # sequences...
        try:
            # non-sequence items should not work with len()
            # non-empty strings will fail this
            if len(query) and not isinstance(query[0], tuple):
                raise TypeError
            # zero-length sequences of all types will get here and succeed,
            # but that's a minor nit - since the original implementation
            # allowed empty dicts that type of behavior probably should be
            # preserved for consistency
        except TypeError:
            ty,va,tb = sys.exc_info()
            raise TypeError("not a valid non-string sequence or mapping "
                            "object").with_traceback(tb)

    l = []
    if not doseq:
        # preserve old behavior
        for k, v in query:
            k = quote_plus(k if isinstance(k, (str,bytes)) else str(k), safe,
                           encoding, errors)
            v = quote_plus(v if isinstance(v, (str,bytes)) else str(v), safe,
                           encoding, errors)
            l.append(k + '=' + v)
    else:
        for k, v in query:
            k = quote_plus(k if isinstance(k, (str,bytes)) else str(k), safe,
                           encoding, errors)
            if isinstance(v, (str,bytes)):
                # str and bytes are sequences too, but must be quoted whole
                v = quote_plus(v, safe, encoding, errors)
                l.append(k + '=' + v)
            else:
                try:
                    # is this a sufficient test for sequence-ness?
                    x = len(v)
                except TypeError:
                    # not a sequence
                    v = quote_plus(str(v), safe, encoding, errors)
                    l.append(k + '=' + v)
                else:
                    # loop over the sequence
                    for elt in v:
                        elt = quote_plus(elt if isinstance(elt, (str,bytes))
                                         else str(elt), safe, encoding, errors)
                        l.append(k + '=' + elt)
    return '&'.join(l)
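For a quick check of the intended behaviour, the sketch below calls urllib.parse.urlencode from a current Python 3 release. It accepts the same safe, encoding and errors arguments, but it is the standard-library function, not the attached version. The sample parameters are made up for illustration.

```python
from urllib.parse import urlencode

# Illustrative input: a plain string, a sequence value, and a non-string value
params = [('name', 'Dan Mahn'), ('tag', ['a', 'b']), ('n', 3)]

# Without doseq, the list is converted with str() and quoted as a whole
flat = urlencode(params)

# With doseq, each element of the list becomes its own parameter
multi = urlencode(params, doseq=True)   # 'name=Dan+Mahn&tag=a&tag=b&n=3'

# The encoding and errors arguments are passed through to quote_plus()
latin1 = urlencode({'q': 'caf\u00e9'}, encoding='ISO-8859-1',
                   errors='strict')     # 'q=caf%E9'
```

Passing a bare string such as 'abc' raises TypeError, which is the case the len()/tuple check at the top of the function is there to catch.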
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev