Walter Dörwald added the comment:
The original specification (PEP 293) required that an error handler called for
encoding *must* return a replacement string (not bytes). This returned string
must then be encoded again. Only if this fails an exception must be raised.
Returning bytes from
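A minimal sketch of the PEP 293 contract described above: an encoding error handler returns a replacement string plus the position where encoding should resume, and the codec then encodes that replacement itself. The handler name "question_mark" is made up for illustration.

```python
import codecs

def question_mark(exc):
    # This sketch only handles encoding errors; anything else is re-raised.
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    # Return one "?" per unencodable character and resume after the bad range.
    return ("?" * (exc.end - exc.start), exc.end)

# "question_mark" is a hypothetical handler name chosen for this example.
codecs.register_error("question_mark", question_mark)

encoded = "a\u20acb".encode("ascii", "question_mark")
print(encoded)
```

The replacement here is plain ASCII, so the second encoding step succeeds. A replacement that the codec cannot encode either would raise an exception.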
Walter Dörwald added the comment:
Just a guess, but the buffer size might be so small that the text that you
expect gets passed via **two** calls to _char_data(). You should refactor your
code to simply collect all the text in _char_data() and act on it in the
_end_element() handler.
So
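The suggested refactoring could look roughly like the following sketch, assuming the handlers are attached to an xml.parsers.expat parser. The class name TextCollector and the attribute names are invented for illustration, and nested elements are not handled.

```python
import xml.parsers.expat

class TextCollector:
    """Collects character data per element, however it is chunked."""

    def __init__(self):
        self._parts = []   # text fragments seen since the last start tag
        self.texts = []    # (element name, complete text) pairs
        self.parser = xml.parsers.expat.ParserCreate()
        self.parser.StartElementHandler = self._start_element
        self.parser.CharacterDataHandler = self._char_data
        self.parser.EndElementHandler = self._end_element

    def _start_element(self, name, attrs):
        self._parts = []

    def _char_data(self, data):
        # May be called several times for one text node: only collect here.
        self._parts.append(data)

    def _end_element(self, name):
        # The text is complete now, so this is the place to act on it.
        self.texts.append((name, "".join(self._parts)))
        self._parts = []

collector = TextCollector()
# Feed the document in two pieces so the text is split across calls.
collector.parser.Parse(b"<a>hel", False)
collector.parser.Parse(b"lo</a>", True)
print(collector.texts)
```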
Walter Dörwald added the comment:
Shadowing the real modules `re` and `io` by
from typing import *
would indeed be bad, but that argument IMHO doesn't hold for the types `IO`,
`TextIO` and `BinaryIO`, yet they are not listed in `typing.__all__`. Is there
a reason
Walter Dörwald added the comment:
I guess that is good enough. "Being changeable" does not necessarily mean
"being changeable via attribute assignment".
Thanks for your research. Closing the issue as "not a bug".
--
resolution: -> not a bug
New submission from Walter Dörwald :
PEP 293 states the following:
"""
For stream readers/writers the errors attribute must be changeable to be able
to switch between different error handling methods during the lifetime of the
stream reader/writer. This is cur
Walter Dörwald added the comment:
UnicodeEncodeError and UnicodeDecodeError are used to report un(en|de)codable
ranges in the source object, so it wouldn't make sense to use them for errors
that have nothing to do with problems in the source object. Their constructor
requires 5 arguments
Change by Walter Dörwald :
--
resolution: -> fixed
stage: patch review -> resolved
status: open -> closed
___
Python tracker
<https://bugs.python.or
Walter Dörwald added the comment:
New changeset 85339f5c220a5e79c47c3a33c93f1dca5c59c52e by Srinivas Reddy
Thatiparthy (శ్రీనివాస్ రెడ్డి తాటిపర్తి) in branch 'master':
bpo-35078: Allow customization of CSS class name of a month in calendar module
(gh-10137)
https://github.com/python
Walter Dörwald added the comment:
IMHO the names don't fit Python's current naming scheme, so what about naming
them "lchop" and "rchop"?
--
nosy: +doerwalter
Walter Dörwald added the comment:
codecs.iterencode()/iterdecode() are just shallow 10-line wrappers around
incremental codecs (which are used as the basis of io streams).
Note that the doc string for iterencode() contains:
Encodes the input strings from the iterator using
Walter Dörwald added the comment:
The documentation might be unclear here. But the argument iterator of
iterdecode(iterator, encoding, errors='strict', **kwargs)
*is* supposed to be an iterable over bytes objects.
In fact iterencode() transforms an iterator over strings into an iterator
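A small sketch of both directions: iterdecode() consumes an iterable of bytes objects and yields strings, while iterencode() consumes strings and yields bytes. The sample data is made up, and the multi-byte character is deliberately split across two chunks.

```python
import codecs

# The euro sign is b"\xe2\x82\xac" in UTF-8; it is split across two chunks.
chunks = [b"\xe2\x82", b"\xac!"]
decoded = "".join(codecs.iterdecode(chunks, "utf-8"))

encoded = b"".join(codecs.iterencode(["\u20ac", "!"], "utf-8"))
print(repr(decoded), encoded)
```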
Change by Walter Dörwald :
--
pull_requests: +14603
stage: needs patch -> patch review
pull_request: https://github.com/python/cpython/pull/14809
Walter Dörwald added the comment:
Can we at least get the __qualname__ in exception messages?
Currently enum.Enum.__new__() and enum.Enum._missing_() use:
raise ValueError("%r is not a valid %s" % (value, cls.__name__))
IMHO this should be:
raise ValueError("%r is
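To illustrate the difference between the two attributes for a nested enum class, here is a short sketch. The classes Y and I mirror the naming used elsewhere in this thread. The message format is copied from the quoted line, and the __qualname__ variant is only an illustration of the proposal.

```python
import enum

class Y:
    class I(enum.Enum):
        A = 1

# Same message format as quoted above, once per attribute.
by_name = "%r is not a valid %s" % (2, Y.I.__name__)
by_qualname = "%r is not a valid %s" % (2, Y.I.__qualname__)
print(by_name)
print(by_qualname)
```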
Walter Dörwald added the comment:
OK, I've created the pull request (11157).
--
___
Python tracker
<https://bugs.python.org/issue2661>
Change by Walter Dörwald :
--
pull_requests: +10390
stage: needs patch -> patch review
___
Python tracker
<https://bugs.python.org/issue2661>
Walter Dörwald added the comment:
OK, I see, http://www.unicode.org/versions/Unicode5.2.0/ch03.pdf (Table 3-7 on
page 93) states that the only valid 3-bytes UTF-8 sequences starting with the
byte 0xED have a value for the second byte in the range 0x80 to 0x9F. 0xA0 is
just beyond that range
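The effect can be reproduced with a few lines. The byte string below is the one from the original report. Using the "surrogatepass" error handler is my addition, shown only to make visible that these bytes are UTF-8-style encodings of lone surrogates.

```python
data = b"\xed\xa0\xbd\xed\xb3\x9e"

# Strict UTF-8 decoding rejects 0xED followed by a byte outside 0x80..0x9F.
try:
    data.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# "surrogatepass" lets the surrogate code points through.
lenient = data.decode("utf-8", "surrogatepass")
print(strict_ok, ascii(lenient))
```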
New submission from Walter Dörwald :
The following code issues a misleading exception message:
>>> b'\xed\xa0\xbd\xed\xb3\x9e'.decode("utf-8")
Traceback (most recent call last):
File "", line 1, in
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xed in posi
Change by Walter Dörwald :
--
keywords: +easy
___
Python tracker
<https://bugs.python.org/issue34443>
New submission from Walter Dörwald :
The __repr__ output of an enum class should use __qualname__ instead of
__name__. The following example shows the problem:
import enum
class X:
    class I:
        pass
class Y:
    class I(enum.Enum):
        pass
print(X.I)
print(Y.I)
This prints:
I
New submission from Walter Dörwald :
When I call a function decorated with functools.singledispatch without an
argument, I get the following:
$ python
Python 3.6.5 (default, Jun 17 2018, 12:13:06)
[GCC 4.2.1 Compatible Apple LLVM 9.1.0 (clang-902.0.39.2)] on darwin
Type "help",
Walter Dörwald added the comment:
The problem here is that StreamArray lies about the length of the iterator.
This confuses json.encoder._make_iterencode._iterencode_list(), (which is
called by json.dump()), because it first does a check for "if not lst" and then
assumes i
Changes by Walter Dörwald <wal...@livinglogic.de>:
--
stage: -> resolved
status: open -> closed
Walter Dörwald added the comment:
Should be fixed now. Thanks for noticing it.
--
resolution: -> fixed
Walter Dörwald added the comment:
New changeset f5c58c781aa0bb296885baf62f4f39100f2cd93d by Walter Dörwald in
branch 'master':
bpo-30733: Fix typos in "What's New" entry (GH-2414)
https://github.com/python/cpython/commit/f5c58c781aa0bb296885baf62f4f39100f2cd93d
--
nosy: +
Changes by Walter Dörwald <wal...@livinglogic.de>:
--
pull_requests: +2463
___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30733>
___
Walter Dörwald added the comment:
Closing the issue. The patch has been merged.
--
resolution: -> fixed
stage: patch review -> resolved
status: open -> closed
Walter Dörwald added the comment:
New changeset 8b7a4cc40e9b2f34da94efb75b158da762624015 by Walter Dörwald (Oz N
Tiram) in branch 'master':
bpo-30095: Make CSS classes used by calendar.HTMLCalendar customizable (GH-1439)
https://github.com/python/cpython/commit
Walter Dörwald added the comment:
See comments on the pull request. Also it seems that currently the pull request
can't be merged because of merge conflicts.
--
Walter Dörwald added the comment:
See my comments on the pull request: https://github.com/python/cpython/pull/1439
After you address those, IMHO this is ready to be merged.
--
Walter Dörwald added the comment:
See comments on GitHub
--
___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30095>
Walter Dörwald added the comment:
The second link is a 404.
For the v1 patch:
The variable names are a bit inconsistent: The first uses "classes" all others
use "styles". This should be consistent within itself and with the existing
code, i.e. "classes" sh
Walter Dörwald added the comment:
OK, go ahead. I'm looking forward to what you come up with.
--
Walter Dörwald added the comment:
IMHO this could all be done by overwriting the relevant methods.
But this might be overkill.
I think a solution might be to move the CSS classes into class attributes of
HTMLCalendar. Customizing the CSS classes would then be done by subclassing
HTMLCalendar
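For reference, this is roughly how such customization looks with the class attributes that calendar.HTMLCalendar provides in Python 3.7 and later. The subclass name and the CSS class name "my-month" are made up for this sketch.

```python
import calendar

class MyCalendar(calendar.HTMLCalendar):
    # Override the CSS class used for the month table (Python 3.7+).
    cssclass_month = "my-month"

html = MyCalendar().formatmonth(2024, 1)
print(html.splitlines()[0])
```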
Walter Dörwald added the comment:
This looks to me like a limited reimplementation of the codec machinery. Why
not use incremental codecs as a preprocessor? Would this be too slow?
--
Walter Dörwald added the comment:
OK, with the fixed CFLAGS definition I do indeed get a working ssl module.
I wonder whether the link Ned posted should be put into the README file.
Anyway I think the issue can be closed. Thanks for the help
Walter Dörwald added the comment:
OK, I've set CFLAGS and LDFLAGS as you suggested. However the ssl module still
doesn't get built. Attached is the new build log (Python3.6-build2.log)
--
Added file: http://bugs.python.org/file46073/Python3.6-build2.log
Walter Dörwald added the comment:
No, neither CFLAGS nor LDFLAGS are set, the only "FLAGS" environment variable I
have set is ARCHFLAGS='-arch x86_64' (I can't remember why). However unsetting
this variable doesn't change the result.
--
New submission from Walter Dörwald:
I'm trying to compile Python 3.6 from source on MacOS X Sierra. However it
seems that the _ssl module doesn't get built. Attached is the complete output.
Note that I have openssl installed via homebrew:
~/ ▸ brew list openssl
/usr/local/Cellar/openssl
Walter Dörwald added the comment:
I don't think that's necessary. What's the use case for this?
And if we want to do this, wouldn't it be better to enhance datetime, so that
this use case is supported too?
--
Walter Dörwald added the comment:
> Who's talking about latin-1 in Python3? Of course str() needs to return
> decode('utf-8').
So that would mean that:
print(b"\xff")
will always fail!
--
nosy: +doerwalter
Walter Dörwald added the comment:
But this leads to uninspectable objects.
--
Walter Dörwald added the comment:
Don't worry, I switched to Python 3 in 2012, where this isn't a
problem. ;)
--
New submission from Walter Dörwald:
When an exception is raised by inspect.Signature.bind() in some cases the
exception has a StopIteration as its __context__:
import inspect
try:
    inspect.signature(lambda x:None).bind()
except Exception as exc:
    print(repr(exc))
    print(repr(exc
Walter Dörwald added the comment:
The patch does indeed fix the segmentation fault. However the exception message
looks confusing:
TypeError: don't know how to handle UnicodeEncodeError in error callback
--
Walter Dörwald added the comment:
Looks much better. However shouldn't:
exc->ob_type->tp_name
be:
Py_TYPE(exc)->tp_name
(although there are still many spots in the source that still use
ob_type->tp_name)
--
Walter Dörwald added the comment:
The linked code at https://github.com/vadmium/python-iview/commit/68b0559 seems
strange to me:
try:
    text.encode(encoding, textio.errors or "strict")
except UnicodeEncodeError:
    text = text.encode(encoding, errors).decode(encoding
Walter Dörwald added the comment:
That analysis seems correct to me.
Stateless and stream codecs were the original implementation. In 2006 I
implemented incremental codecs: http://bugs.python.org/issue1436130
The intent was to have stateful codecs that can work with iterators and
generators
Walter Dörwald added the comment:
The updated code in the documentation still doesn't set the * and **
parameters. I would have preferred the following code:
for param in sig.parameters.values():
    if param.name not in ba.arguments:
        if param.kind is inspect.Parameter.VAR_POSITIONAL
Changes by Walter Dörwald wal...@livinglogic.de:
--
status: open -> closed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19834
New submission from Walter Dörwald:
inspect.Signature.bind() doesn't add values for parameters that are unspecified
but have a default value. The documentation at
https://docs.python.org/3/library/inspect.html#inspect.BoundArguments.arguments
includes an example how to add default values
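For readers on Python 3.5 or later: BoundArguments.apply_defaults() fills in defaults for missing arguments, including an empty tuple for *args and an empty dict for **kwargs. A short sketch, with a made-up function foo:

```python
import inspect

def foo(a, b=2, *args, **kwargs):
    return (a, b, args, kwargs)

ba = inspect.signature(foo).bind(1)
before = dict(ba.arguments)   # only the explicitly bound argument

ba.apply_defaults()           # available since Python 3.5
after = dict(ba.arguments)
print(before, after)
```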
Walter Dörwald added the comment:
The following doesn't work::
import inspect
def foo(*args, **kwargs):
    return (args, kwargs)
# Code from
# https://docs.python.org/3/library/inspect.html#inspect.BoundArguments.arguments
# to fill in the defaults
sig = inspect.signature(foo
Walter Dörwald added the comment:
The problem seems to be in that line:
except imaplib.IMAP4_SSL.abort, imaplib.IMAP4.abort:
This does *not* catch both exception classes, but catches only IMAP4_SSL.abort
and stores the exception object in imaplib.IMAP4.abort.
What you want is:
except
Walter Dörwald added the comment:
I don't know anything about SMTP, but would it make sense to use an incremental
decoder for decoding UTF-8?
--
nosy: +doerwalter
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19806
Walter Dörwald added the comment:
Here is a patch that implements suggestions 2 and 3.
--
keywords: +patch
Added file: http://bugs.python.org/file35800/mapping-tests.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue2661
Walter Dörwald added the comment:
The requirement that getstate() returns a (buffer, int) tuple has to do with
the fact that for text streams seek() and tell() somehow have to take the state
of the codec into account. See
_pyio.TextIOWrapper.(seek|tell|_pack_cookie|_unpack_cookie).
However I
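What such a state tuple looks like can be seen with the UTF-8 incremental decoder. This is a small sketch with made-up input: the decoder is given only the first two bytes of a three-byte sequence.

```python
import codecs

decoder = codecs.getincrementaldecoder("utf-8")()

# Feed an incomplete sequence: nothing can be decoded yet.
text = decoder.decode(b"\xe2\x82")
state = decoder.getstate()   # (pending undecoded bytes, additional state int)

# Supplying the missing byte completes the character.
rest = decoder.decode(b"\xac")
print(repr(text), state, repr(rest))
```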
Walter Dörwald added the comment:
The cronjob that produces this information has been deactivated, because it
currently produces broken output. The code for that job is available from here:
https://pypi.python.org/pypi/pycoco
It would be great to have up to date coverage info for Python again
Walter Dörwald added the comment:
"\n \r\n \r \u2028".split() should have been "\n \r\n \r \u2028".split(" "),
i.e. a list of different line ends.
The purpose of these tests is not entirely clear, so I'm not sure that I
properly grasped the idea of the author.
I wrote the tests nearly 10
Walter Dörwald added the comment:
True, here's an updated patch.
--
Added file: http://bugs.python.org/file33933/fix_linetests2.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue20520
Walter Dörwald added the comment:
I dug up an ancient email about that subject:
However, I've discovered that BufferedIncrementalEncoder.getstate()
doesn't match the specification (i.e. it returns the buffer, not an
int). However this class is unused (and probably useless, because
Walter Dörwald added the comment:
The best solution IMHO would be to implement real incremental codecs for all of
those.
Maybe iterencode() with an empty iterator should never call encode()? (But IMHO
it would be better to document that iterencode()/iterdecode() should only be
used with real
Walter Dörwald added the comment:
The stream part of the codecs isn't used that much in Python 3 any more, so I'm
not sure if this is worth fixing.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13881
Walter Dörwald added the comment:
sys.displayhook doesn't fail, because it uses the backslashreplace error
handler, and for sys.displayhook that's OK, because it's only used for screen
output and there some output is better than no output. However print and
pprint.pprint might be used
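The behaviour of the backslashreplace error handler on encoding can be sketched as follows. The sample string is made up, and ASCII stands in for a restrictive terminal encoding.

```python
text = "price: \u20ac"

# Strict encoding fails on the euro sign ...
try:
    text.encode("ascii")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# ... while backslashreplace emits an escape sequence instead.
fallback = text.encode("ascii", "backslashreplace")
print(strict_ok, fallback)
```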
Walter Dörwald added the comment:
This is not the fault of pprint. IMHO it doesn't make sense to fix anything
here, at least not for pprint specifically. print() has the same problem:
$ LANG= ./python -c print('\u20ac
Changes by Walter Dörwald wal...@livinglogic.de:
--
resolution: -> fixed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19834
Walter Dörwald added the comment:
Here's an updated version of the patch, addressing most of Alexandre's comments.
--
Added file: http://bugs.python.org/file32918/python-2-exception-pickling-2.diff
Walter Dörwald added the comment:
OK, here is a patch. Instead of mapping the exceptions module to builtins, it
does the mapping for each exception class separately. I've excluded
StandardError, because I think there's no appropriate equivalent in Python 3.
--
keywords: +patch
Added
New submission from Walter Dörwald:
Exception objects that have been pickled with Python 2 can not be unpickled
with Python 3, even when fix_imports=True is specified:
$ python2.7
Python 2.7.2 (default, Aug 30 2011, 11:04:13)
[GCC 3.3.5 (Debian 1:3.3.5-13)] on linux2
Type help
Walter Dörwald added the comment:
Here is a new version of the patch. The annotation is done on the code object
instead of on the frame object. This avoids two problems: There is no runtime
overhead, as the decorator returns the original function and no additional
frames show up
Walter Dörwald added the comment:
Do you have an example where code objects are shared? We could attach the
annotation formatter to the function object, but unfortunately the function
object is now accessible in the traceback.
Note the co_annotation is not the annotation string, rather
New submission from Walter Dörwald:
This patch adds frame annotations, i.e. it adds an attribute f_annotation to
frame objects, a decorator to set this attribute on exceptions and extensions
to the traceback machinery that display the annotation in the traceback.
--
components
Walter Dörwald added the comment:
See http://bugs.python.org/issue18861 and the discussion started here:
https://mail.python.org/pipermail/python-dev/2013-November/130155.html.
Basically it allows adding context information to a traceback without changing
the type of the exception
Walter Dörwald added the comment:
The point of using a function is to allow the function special handling of the
encoding name, which goes beyond a simple map lookup, i.e. you could do the
following:
import codecs
def search_function(encoding):
    if not encoding.startswith(append
Walter Dörwald added the comment:
I'd like to have this feature too. However the code should use
d if d is not None else {}
instead of
d or {}
For example I might want to use a subclass of dict (lowerdict) that converts
all keys to lowercase. When I use an empty lowerdict in new_child
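The difference between the two spellings shows up as soon as an empty custom mapping is passed in. In this sketch, lowerdict is a minimal stand-in for the subclass mentioned above, and the two helper functions are hypothetical, written only to isolate the expression.

```python
class lowerdict(dict):
    """Minimal stand-in: converts keys to lowercase on assignment."""
    def __setitem__(self, key, value):
        super().__setitem__(key.lower(), value)

def pick_with_or(d=None):
    # An empty mapping is falsy, so it is silently replaced by a plain dict.
    return d or {}

def pick_with_none_check(d=None):
    # Only a missing argument is replaced; an empty mapping is kept.
    return d if d is not None else {}

empty = lowerdict()
a = pick_with_or(empty)
b = pick_with_none_check(empty)
print(type(a).__name__, type(b).__name__)
```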
Walter Dörwald added the comment:
And returning bytes is documented in PEP 383, as an extension to the PEP 293
machinery:
To convert non-decodable bytes, a new error handler ([2]) surrogateescape
is introduced, which produces these surrogates. On encoding, the error handler
converts
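A round trip through the surrogateescape handler, as a short sketch with made-up input: the undecodable byte 0xFF becomes the lone surrogate U+DCFF on decoding and is turned back into the original byte on encoding.

```python
raw = b"abc\xff"

text = raw.decode("utf-8", "surrogateescape")
back = text.encode("utf-8", "surrogateescape")
print(ascii(text), back)
```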
Walter Dörwald added the comment:
True, the second test uses the wrong error handler.
And yes, you're correct, bytes are now immutable. And even if I try to decode a
bytearray, what the callback gets to see is still an immutable bytes object::
import codecs
def mutating(exc
Walter Dörwald added the comment:
codecs.utf_8_decode('\u20ac'.encode('utf8')[:2])
('', 0)
Oh... codecs.CODEC_decode are incremental decoders? I misunderstood this
completely.
No, those functions are not decoders, they're just helper functions used to
implement the real incremental
On 25.07.12 08:09, Ulrich Eckhardt wrote:
Am 24.07.2012 17:01, schrieb cpppw...@gmail.com:
reader = codecs.getreader(encoding)
lines = []
with open(filename, 'rb') as f:
    lines = reader(f, 'strict').readlines(keepends=False)
where encoding == 'utf-16-be'
Everything
Walter Dörwald wal...@livinglogic.de added the comment:
So is this simply a documentation issue, or can we close the bug as won't fix?
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue15408
New submission from Walter Dörwald wal...@livinglogic.de:
The attached script behaves differently on Python 2.7.2 and Python 3.2.3.
With Python 2.7 the script runs for ca. 30 seconds and then I get back my
prompt.
With Python 3.2 the script runs in the background, I get back my prompt
On 07.07.12 04:56, Steven D'Aprano wrote:
On Fri, 06 Jul 2012 12:55:31 -0400, Karl Knechtel wrote:
Hello all,
While attempting to make a wrapper for opening multiple types of
UTF-encoded files (more on that later, in a separate post, I guess), I
ran into some oddities with the `codecs`
Walter Dörwald wal...@livinglogic.de added the comment:
An alternative would be to use an incremental encoder instead of a
StreamWriter. (Which is what TextIOWrapper does internally).
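A minimal sketch of that alternative, writing to an in-memory byte stream: the incremental encoder keeps its state between calls, so for UTF-16 the byte order mark is emitted only once. The chunk contents are made up.

```python
import codecs
import io

encoder = codecs.getincrementalencoder("utf-16")()
out = io.BytesIO()

for chunk in ["ab", "cd"]:
    out.write(encoder.encode(chunk))
# Flush any pending state at the end of the input.
out.write(encoder.encode("", final=True))

data = out.getvalue()
print(data)
```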
--
nosy: +doerwalter
On 11.03.12 15:37, Steven D'Aprano wrote:
At least two standard error handlers are documented as working for
encoding only:
xmlcharrefreplace
backslashreplace
See http://docs.python.org/library/codecs.html#codec-base-classes
and http://docs.python.org/py3k/library/codecs.html
Why is this? I
Walter Dörwald wal...@livinglogic.de added the comment:
See this ancient posting about this problem:
http://mail.python.org/pipermail/python-dev/2002-August/027661.html
(see point 4.). So I guess somebody did finally complain! ;)
The error attributes are documented in PEP 293
Walter Dörwald wal...@livinglogic.de added the comment:
+1 on the documentation changes.
--
nosy: +doerwalter
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12171
Walter Dörwald wal...@livinglogic.de added the comment:
OK, I reran the test with::
./python -mtest.regrtest -T -N test_urllib
and this does indeed produce coverage files (for _abcoll, _weakrefset, abc,
base64, codecs, collections, contextlib, functools, genericpath, hashlib,
locale
New submission from Walter Dörwald wal...@livinglogic.de:
Running regrtest.py with coverage option seems to be broken for the py3k branch
at the moment. Run the following commands on the shell:
wget http://svn.python.org/snapshots/python3k.tar.bz2
tar xjf python3k.tar.bz2
cd python
./configure
Walter Dörwald wal...@livinglogic.de added the comment:
STINNER Victor victor.stin...@haypocalc.com added the comment:
... it complicates handling of the output of trace.py.
For each file you have to do the encoding detection dance again ...
What? You just have to call one function
Walter Dörwald wal...@livinglogic.de added the comment:
Using the original encoding of the Python source file might be the politically
correct thing to do, but it complicates handling of the output of trace.py. For
each file you have to do the encoding detection dance again. It would be great
New submission from Walter Dörwald wal...@livinglogic.de:
It seems that on Python 3 (i.e. the py3k branch) trace.py can not handle source
that includes Unicode characters. Running the test suite with code coverage
info via
./python Lib/test/regrtest.py -T -N -uurlfetch,largefile,network
Walter Dörwald wal...@livinglogic.de added the comment:
The following patch (against the release27-maint branch) seems to fix the
problem.
--
keywords: +patch
nosy: +doerwalter
Added file: http://bugs.python.org/file19468/json.diff
Walter Dörwald wal...@livinglogic.de added the comment:
Does the following patch fix your problems?
--
keywords: +patch
nosy: +doerwalter
Added file: http://bugs.python.org/file19217/calendar.diff
Walter Dörwald wal...@livinglogic.de added the comment:
Yes, I think you should apply the patch.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8848
Walter Dörwald wal...@livinglogic.de added the comment:
The code for case 's'/'z' in py3k is indeed the same as for case 'U'. The patch
looks good to me.
IMHO removing 'U' should only be done once Py2 is dead.
--
Walter Dörwald wal...@livinglogic.de added the comment:
I’d be grateful if someone could post links to discussion
about the removal of codecs like hex and rot13
r55932 (~3 years ago):
That was my commit. ;)
Thanks for the link. Do you have a pointer to the PEP or ML thread
discussing
On 28.04.10 15:02, james_027 wrote:
hi,
Any idea how I can replace words in an HTML file? Meaning only the
content will get replaced while the HTML tags, JavaScript and CSS
remain untouched.
You could try XIST (http://www.livinglogic.de/Python/xist/):
Example code:
from ll.xist import xsc,
Walter Dörwald wal...@livinglogic.de added the comment:
Yes, that's the posting I was referring to. I wonder why the link is gone.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue7651
Walter Dörwald wal...@livinglogic.de added the comment:
This is a common thinko. ;)
If i is negative then len(s) - i would be greater than len(s). However len(s) +
i is correct. Example:
foo[-1] is foo[len(foo) + (-1)] is foo[len(foo)-1]
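The identity can be checked directly for every valid negative index. The string "spam" is just sample data.

```python
foo = "spam"

# For each negative index i, foo[i] is the same item as foo[len(foo) + i].
pairs = [(foo[i], foo[len(foo) + i]) for i in range(-len(foo), 0)]
print(pairs)
```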
--
nosy: +doerwalter
resolution: -> invalid
Walter Dörwald wal...@livinglogic.de added the comment:
After the patch the comment:
/* Implementation limitations: only support error handler that return
bytes, and only support up to four replacement bytes. */
no longer applies.
Also I would like to see a version of this patch where
Walter Dörwald wal...@livinglogic.de added the comment:
On 24.02.10 15:28, Eric Smith wrote:
Eric Smith e...@trueblade.com added the comment:
Fixed:
trunk: r78418
release26-maint: r78419
Still working on porting to py3k and release31-maint.
A much better solution would IMHO
New submission from Walter Dörwald wal...@livinglogic.de:
In the current py3k branch setting an attribute of an object with PyMemberDefs
raises an internal error:
$ ./python.exe
Python 3.2a0 (py3k:78419M, Feb 24 2010, 17:56:06)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type help