[issue3839] wsgi.simple_server resets 'Content-Length' header on empty content even if app defined it.
kxroberto added the comment: However, setting a default "0" when there is no content is still too much in general. In the case of a '304 Not Modified' for example (which is probably the most frequent HTTP status used on the web overall!), a Content-Length header is disallowed altogether according to https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.5 ("this prevents inconsistencies between cached entity-bodies and updated headers"). Apache, nginx and other servers observed indeed do not set Content-Length on 304, and there have been bugfix issues about exactly that. Some browsers evidently pick up a Content-Length of "0", update the cached resource and thus zero out the cached data, literally obeying "If a cache uses a received 304 response to update a cache entry, the cache MUST update the entry to reflect any new field values given in the response." (Though that seems rather silly, as that would mean "Modified". Content-Length should arguably always be associated with the current transmission, for the needs of keep-alive connections and buffer management at a lower level, and not with cache and status concerns.) Possibly the same problem exists for 204. -- nosy: +kxroberto ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue3839> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
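To illustrate the point, here is a minimal sketch (not the tracker's own test case; the app and header values are made up) of a WSGI app answering 304: it deliberately sends no Content-Length, and the complaint is that a conforming server should pass that through rather than injecting "Content-Length: 0".

```python
def not_modified_app(environ, start_response):
    # Deliberately no Content-Length header: RFC 2616 forbids entity
    # headers on 304 that would desynchronize caches.
    start_response('304 Not Modified', [('ETag', '"abc123"')])
    return [b'']
```

Under the behavior complained about here, wsgiref's simple_server would add a Content-Length header to this response anyway.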
[issue13693] email.Header.Header incorrect/non-smart on international charset address fields
New submission from kxroberto <kxrobe...@users.sourceforge.net>: the email.* package seems to over-encode international charset address fields - resulting even in display errors in the receiver's reader - when message header composition is done as recommended in http://docs.python.org/library/email.header.html

Python 2.7.2:

>>> e = email.Parser.Parser().parsestr(getcliptext())
>>> e['From']
'=?utf-8?q?Martin_v=2E_L=C3=B6wis?= rep...@bugs.python.org'   # note the par
>>> email.Header.decode_header(_)
[('Martin v. L\xc3\xb6wis', 'utf-8'), ('rep...@bugs.python.org', None)]
>>> # unfortunately there is no comfortable function for this:
>>> u = 'Martin v. L\xc3\xb6wis'.decode('utf8') + ' rep...@bugs.python.org'
>>> u
u'Martin v. L\xf6wis rep...@bugs.python.org'
>>> msg = email.Message.Message()
>>> msg['From'] = u
>>> msg.as_string()
'From: =?utf-8?b?TWFydGluIHYuIEzDtndpcyA8cmVwb3J0QGJ1Z3MucHl0aG9uLm9yZz4=?=\n\n'
>>> msg['From'] = str(u)
>>> msg.as_string()
'From: =?utf-8?b?TWFydGluIHYuIEzDtndpcyA8cmVwb3J0QGJ1Z3MucHl0aG9uLm9yZz4=?=\nFrom: Martin v. L\xf6wis rep...@bugs.python.org\n\n'
>>> msg['From'] = email.Header.Header(u)
>>> msg.as_string()
'From: =?utf-8?b?TWFydGluIHYuIEzDtndpcyA8cmVwb3J0QGJ1Z3MucHl0aG9uLm9yZz4=?=\nFrom: Martin v. L\xf6wis rep...@bugs.python.org\nFrom: =?utf-8?b?TWFydGluIHYuIEzDtndpcyA8cmVwb3J0QGJ1Z3MucHl0aG9uLm9yZz4=?=\n\n'

(BTW: strange is that multiple msg['From'] = ... _assignments_ end up as multiple additions!??? Also msg renders 8-bit header lines without warning/error or auto-encoding, while it does auto-encode unicode!??)

What finally arrives at the receiver is typically like:

From: =?utf-8?b?TWFydGluIHYuIEzDtndpcyA8cmVwb3J0QGJ1Z3MucHl0aG9uLm9yZz4=?= rep...@bugs.python.org

because the servers seem to want the address in the open: they extract the address and _add_ it (duplicating) as ASCII. => error. I have not found any emails in my archives where address header fields are as over-encoded as Python does it. Even in non-address fields, mostly only those words/groups which need it are encoded.
I assume the sophisticated/high-level-looking email.* package doesn't expect the user to fiddle things together low-level with parseaddr, re.search, make_header, Header.encode, '.join ...? Or is it indeed (undocumented) so? IMHO it should be auto-smart enough. Note: there is an old, deprecated function mimify.mime_encode_header which seemed to try to cautiously auto-encode correctly/sparsely (but actually fails too on all examples tried). -- components: Library (Lib) messages: 150434 nosy: kxroberto priority: normal severity: normal status: open title: email.Header.Header incorrect/non-smart on international charset address fields type: behavior versions: Python 2.6, Python 2.7, Python 3.1, Python 3.2, Python 3.3, Python 3.4 ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue13693> ___
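For comparison, a sketch of the sparing encoding the poster asks for, written against Python 3's email.header/email.utils (the name and address below are placeholder values): encode only the non-ASCII display name as an encoded word and leave the address itself in plain ASCII, so servers never need to duplicate it.

```python
from email.header import Header
from email.utils import formataddr

name = 'Martin v. L\xf6wis'    # non-ASCII display name (placeholder)
addr = 'report@example.org'    # placeholder address

# Encode just the display name; formataddr then wraps the plain
# ASCII address in angle brackets, so mail servers can still read it.
encoded_name = str(Header(name, 'utf-8'))
from_value = formataddr((encoded_name, addr))
```

The address part stays readable ASCII while only the display name becomes an RFC 2047 encoded word.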
[issue10839] email module should not allow some header field repetitions
kxroberto <kxrobe...@users.sourceforge.net> added the comment: What I think is really ill/strange is that this kind of item _assignment_ does _add_ multiples. If msg[field] = xy would just add-first/replace-first, and only msg.add_/.append(field, xy) would add multiples, that would be clear and understandable/readable. (The sophisticated check dictionary is unnecessary IMHO; I don't expect the class to ever be smart enough for a full RFC checklist.) E.g. I remember a bug like:

msg[field] = xy
if special_condition:
    msg[field] = abc   # just wanted an alternative

Never ever expected a double header here! => The adding behavior is absurd IMHO. It certainly doesn't allow readable code. -- nosy: +kxroberto ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue10839> ___
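The surprising assignment behavior is easy to reproduce (a small sketch against Python 3's email.message; the header name is arbitrary):

```python
from email.message import Message

msg = Message()
msg['X-Test'] = 'first'
msg['X-Test'] = 'second'      # adds a second header, does not replace!
assert msg.get_all('X-Test') == ['first', 'second']

# The replace idiom the stdlib actually expects:
del msg['X-Test']             # removes every occurrence of the field
msg['X-Test'] = 'third'
assert msg.get_all('X-Test') == ['third']
```

The delete-then-assign dance is exactly the non-obvious step the comment above complains about.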
[issue13693] email.Header.Header incorrect/non-smart on international charset address fields
kxroberto <kxrobe...@users.sourceforge.net> added the comment: now I tried to render this address field header u'Name abc\u03a3@xy, abc@ewf, Nameß weofij@fjeio' with h = email.Header.Header(continuation_ws='') and h.append ... / email.Header.make_header via these chunks:

[('Name ', 'us-ascii'), ('abc\xce\xa3', 'utf-8'), ('@xy, abc@ewf, ', 'us-ascii'), ('Name\xc3\x9f', 'utf-8'), (' weofij@fjeio', 'us-ascii')]

the outcome is:

'Name =?utf-8?b?YWJjzqM=?= @xy, abc@ewf, =?utf-8?b?TmFtZcOf?=\n weofij@fjeio'

(note: the local part of an email address can be UTF too) It seems to be impossible to avoid the erroneous extra spaces from outside within that email.Header framework. Thus I guess it was not possible up to now to decently format a beyond-ASCII MIME message using the official email.Header mechanism - even when pre-digesting things. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue13693> ___
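The chunk behavior can be reproduced in Python 3 as well (a sketch following the chunks above): Header.encode() joins adjacently appended chunks of different charsets with a space, which is exactly the unavoidable extra space being complained about.

```python
from email.header import Header

h = Header(continuation_ws='')
h.append('Name ', 'us-ascii')
h.append('abc\u03a3', 'utf-8')       # Greek sigma in the local part
h.append('@xy, abc@ewf', 'us-ascii')
encoded = h.encode()
# The utf-8 chunk comes out as an encoded word, separated by spaces
# from its ASCII neighbours even though no spaces were appended there.
```

Inspecting `encoded` shows the inserted separators around the `=?utf-8?...?=` word.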
[issue13580] Pre-linkage of CPython >=2.6 binary on Linux too fat (libssl, libcrypto)
kxroberto <kxrobe...@users.sourceforge.net> added the comment:

> > (I often wonder why software today isn't much faster than years ago - though the nominal speed of hardware increases tremendously. Package sizes grow without appropriate growth of functionality. This is one example of how the resources are wasted too carelessly.)
> I don't know of any evidence that software slowness has to do with code size. When code isn't called, it doesn't pollute the instruction caches and hence shouldn't affect execution speed.

With slowness under the subject here I mean long startup time (and slowness by overall memory impact on small systems like laptops, non-cutting-edge hardware, even embedded systems). That is what is mainly unpleasant and what seems not to improve appropriately overall. Developers carelessly link bigger and bigger libraries, which eats the hardware gains ... There are however some good efforts: for example when Google came out with Chrome (and many fast-responding apps after it), fast startup time was the striking issue. And the old browsers were challenged by that meanwhile, and improved as well. Please keep Python fast as well. I'm using it since 1.5.2, and that careless fatty degeneration is one of the main things I don't like. Python is a universal language. Most Python progs are small scripts. Overall I wonder why you post here on the main topic "resource usage" when you don't care about issues of magnitude 2x memory usage. Why not close this topic for Python altogether, with your arguments?

> I understand the concern about py2exe and similar distribution systems (although distribution size should be much less important nowadays than 10 years ago). But, really, it's a separate issue.

That is not really a separate issue (because module decoupling is a prerequisite for it, a sort of show stopper). And as mentioned it is by far not the only issue.
> > For example each cgi script (which has to respond fast and does only a small job), which does import cgi and a few lines; or a script which just uses e.g. urllib string format functions ... : the whole thing is drawn in.
> Well, CGI scripts are a wasteful way to do programmatic page serving. If you care about performance, you should have switched to something like FastCGI or mod_wsgi.

And how about other scripts ;-) I'm sure you can find everywhere something for how to make all app programmers busy, instead of taking care of the few cheap fixes mentioned in the system to make Python faster and easily usable for everybody. I created this issue to improve Python and make the experience significantly faster. You seem to me to be too interested in closing issues fast. "If you care about performance ..." - you have this sort of black-white arguments which are green, and somehow I really think you are perhaps misplaced in this category "resource usage".

> > Also the linkage of _ssl solely against a detailed version of libssl/libcrypto is still questionable. I don't know the reasons (if any).
> Perhaps you can open a separate issue about that?

Yet the issue of this library is here now. Why procrastinate?

> This sentence sounds like you want to dictate us what and how we should work on. That won't fly, sorry. The reason we want to avoid tackling multiple issues in a single tracker entry is simply so that the entries stay readable and searchable. (and, really, most projects' bug trackers work that way, for the same reasons)

You are free to divide the issue if you really think it is worth multiple. But why close it swift-handedly before that is sorted out / set up? I indeed wonder about that careless style here meanwhile. It is not as it was years ago. To me these issues seem to belong so closely together that the possible fix should perhaps be made in one go. (The Debians would have gone rather deep into issues if they really created that fine tuning on their own. I can almost not believe it.)
> There's nothing magical about libssl that would make us link it statically to the executable; it's far too optional a dependency for that. Perhaps Debian has its own bootstrapping requirements that mandate it, or perhaps they simply made a mistake and nobody complained before? Why don't you open an issue on their bug tracker, or at least try to contact them? You would get a definite answer about it.

So are you definitely saying/knowing that there is really no such mentioned optimized module selection (~50% of .so modules since Python 2.5 on Debian) somewhere in the Python build files? (I ask here first in order to not create an unnecessary lot of issues.) -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue13580> ___
[issue13580] Pre-linkage of CPython >=2.6 binary on Linux too fat (libssl, libcrypto)
New submission from kxroberto <kxrobe...@users.sourceforge.net>: With the transition from Python 2.5 to Python 2.6 on current Debian stable I noticed that the python2.6 executable now has 2x the size of python2.5's. Half of lib-dynload/* has obviously been embedded into the executable by default. While most of the selections may be somewhat reasonable, I want to protest against the static inclusion of _ssl.so, which now draws in libssl*.so and libcrypto*.so at each Python startup. This module is rarely needed, the draw is almost as fat as the Python binary itself, and those libs are not generally loaded in the system. Those 2 dependencies solely are against detailed versions even!! See below. Besides load time and resource wastage, there are now e.g. likely problems with frozen Python scripts due to the detailed version deps. (Binding with an unversioned libssl.so may be OK for a future separate _ssl.so module?)

$ ldd /usr/bin/python2.5
        linux-gate.so.1 => (0xb78dc000)
        libpthread.so.0 => /lib/i686/cmov/libpthread.so.0 (0xb78c1000)
        libdl.so.2 => /lib/i686/cmov/libdl.so.2 (0xb78bd000)
        libutil.so.1 => /lib/i686/cmov/libutil.so.1 (0xb78b8000)
        libm.so.6 => /lib/i686/cmov/libm.so.6 (0xb7892000)
        libc.so.6 => /lib/i686/cmov/libc.so.6 (0xb774c000)
        /lib/ld-linux.so.2 (0xb78dd000)
$ ldd /usr/bin/python2.6
        linux-gate.so.1 => (0xb76e7000)
        libpthread.so.0 => /lib/i686/cmov/libpthread.so.0 (0xb76cc000)
        libdl.so.2 => /lib/i686/cmov/libdl.so.2 (0xb76c8000)
        libutil.so.1 => /lib/i686/cmov/libutil.so.1 (0xb76c3000)
        libssl.so.0.9.8 => /usr/lib/libssl.so.0.9.8 (0xb7679000)
        libcrypto.so.0.9.8 => /usr/lib/libcrypto.so.0.9.8 (0xb751d000)
        libz.so.1 => /usr/lib/libz.so.1 (0xb7509000)
        libm.so.6 => /lib/i686/cmov/libm.so.6 (0xb74e3000)
        libc.so.6 => /lib/i686/cmov/libc.so.6 (0xb739c000)
        /lib/ld-linux.so.2 (0xb76e8000)

Note: files missing from (i.e. consumed from) lib-dynload/ since Python 2.5:

_functools.so 6780
_hashlib.so 11392
math.so 12492
array.so 32432
_socket.so 54228
strop.so 21616
spwd.so 7132
collections.so 21116
unicodedata.so 474792
itertools.so 29684
rgbimg.so 12416
select.so 12816
time.so 16412
grp.so 6868
_locale.so 15760
binascii.so 17344
_weakref.so 4816
cStringIO.so 17076
cPickle.so 68968
syslog.so 5824
_ssl.so 15452
_bisect.so 7568
operator.so 25392
fcntl.so 13536
_struct.so 24832
zlib.so 21708
_random.so 10368

(python2.7 not tested, as it is not available via apt-get so far.) -- components: Build, Installation messages: 149217 nosy: kxroberto priority: normal severity: normal status: open title: Pre-linkage of CPython >=2.6 binary on Linux too fat (libssl, libcrypto) type: resource usage versions: Python 2.6, Python 2.7, Python 3.1, Python 3.2, Python 3.3, Python 3.4 ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue13580> ___
[issue13479] pickle too picky on re-defined classes
kxroberto <kxrobe...@users.sourceforge.net> added the comment: Well, == would allow the wanted feature, by exception through metaclasses for the concerned classes:

>>> class X:
...     a=1
...
>>> Y=X
>>> class X:
...     a=1
...
>>> Y==X
False
>>> class XCompare(type):
...     def __eq__(self, other):
...         print "tolerant class __eq__"
...         return self.__name__ == other.__name__
...
>>> class X:
...     __metaclass__ = XCompare
...     a=1
...
>>> Y=X
>>> class X:
...     a=1
...
>>> Y==X
tolerant class __eq__
True

Better than nothing. It's an improvement generally, independently. But thinking about my actual use cases and all: it still doesn't satisfy. I don't want to introduce this extra magic on all those classes just for that feature - because when it is needed, the majority of classes are concerned (see below). One can have only one metaclass ... it is too tricky and off-road to guess for most programmers ...

> when in doubt, raise an error

That is IMHO too rigid here, and generally so when a feature is hindered too much. Aren't warnings the right tool for such a case? If there really, rarely, is a problem, it should surface easily already during dev test time. Compared to the everyday-life dangers of Python's dynamic attribute access, version incompatibilities, etc., this is a rather harmless issue. Now I'd vote for a warnings.warn upon == (or the old is) failing, and then an error only when the .__name__ does not match either. A warning at dev test time should be enough when just == (or is) fails. I mainly like the tolerance during development: e.g. fast reload-style edit-run cycles (reload sometimes improved with special reload fix code), because I noticed that 95% of code changes/bug fixes do not require a full, expensive app restart. This pays off particularly with bigger GUI app development/fixing and similar, where a lot of state is accumulated expensively during run time. But I'd wish for that feature in a deployed app too.
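A Python 3 version of the same metaclass trick (a sketch; Python 3 spells the metaclass differently, and __hash__ must be kept consistent once __eq__ is defined):

```python
class XCompare(type):
    # Tolerant class equality: compare by class name instead of identity.
    def __eq__(cls, other):
        return isinstance(other, type) and cls.__name__ == other.__name__
    def __hash__(cls):
        # Required: defining __eq__ alone would make the type unhashable.
        return hash(cls.__name__)

class X(metaclass=XCompare):
    a = 1

Y = X

class X(metaclass=XCompare):   # re-executed definition, new class object
    a = 1

assert Y == X        # equal by name ...
assert Y is not X    # ... although they are distinct class objects
```

This is the "better than nothing" workaround; the message argues the tolerance should instead live in pickle itself.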
-- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue13479> ___
[issue13580] Pre-linkage of CPython >=2.6 binary on Linux too fat (libssl, libcrypto)
kxroberto <kxrobe...@users.sourceforge.net> added the comment:

> Of course, as soon as we use sockets, we will bring SSL in

Indeed, as it is now. Suggestions:

* urllib.URLOpener.open_https shall always exist, but fail at runtime. Non-existence with a strange AttributeError is inelegant/bogus. Note: the concept of late "import ftplib", etc. is otherwise OK in urllib; the same style shall be used for ssl on demand. (See python-Bugs-1046077.)
* httplib.HTTPSConnection.connect shall late-import ssl
* httplib.HTTPSConnection, HTTPS, FakeSocket shall always exist, but error at runtime if ssl is not available; same reason as with open_https (* httplib.test already late-imports ssl)
* imaplib.IMAP4_SSL.ssl, open shall late-import ssl
* smtplib.starttls should late-import ssl
* smtplib.SMTP_SSL._get_socket should late-import ssl
* smtplib.SSLFakeFile shall always exist (same reason as with open_https)
* poplib.POP3_SSL.__init__ shall late-import ssl
* the deprecated socket.ssl() shall late-import _ssl/ssl (and possibly RAND_add, RAND_egd, RAND_status too, if they need to exist globally for compatibility; constants to be entered fixed into socket or _socket; sslerror perhaps a builtin in _socket, which _ssl then uses)

-- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue13580> ___
[issue13580] Pre-linkage of CPython >=2.6 binary on Linux too fat (libssl, libcrypto)
kxroberto <kxrobe...@users.sourceforge.net> added the comment:

> Can you try to compile python using the shared lib

Yes, I can try a lot of things, but I'd need to do this on many machines. Yet I didn't create this issue for some local purpose ;-) 99% of Pythons are installed by apt-get, .msi etc. And now this turns out to be a bigger issue, with the early import of ssl in those mentioned locations. (A little surprising for me is that the Python 2.6 of current Debian stable shall already be outdated. It is the new thing here ;-) I hope Python 2.7 won't be outdated so soon. Py3 already has 4 versions - who can use Py3 in the real world? Is the Python dev team too fast for reality? ;-) ) -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue13580> ___
[issue13580] Pre-linkage of CPython >=2.6 binary on Linux too fat (libssl, libcrypto)
kxroberto <kxrobe...@users.sourceforge.net> added the comment:

> It doesn't happen in Python 3

Yet the cheap/unnecessary pre-imports of ssl in those other mentioned socket-using libs (urllib (cgi!), httplib, smtplib, pop, imap ...) exist there. socket is rarely used directly, so there is not much difference to Py2 in the overall effect. And Python 2.7 lives - which is important for the majority of users perhaps. Thus I'd request to not close this issue so swiftly. This is IMHO really a point to make Python startup significantly faster, by rather simple means. Also the linkage of _ssl solely against a detailed version of libssl/libcrypto is still questionable.

> This is therefore a problem with the Debian package

I'm not into the Python build files. Just to ask/double-check: is that observed _semi_ static link selection (which is good otherwise - somebody must have done surprisingly lots of care) really from Debian, or is there maybe a sort of 2nd default option bundle somewhere in Python's configure? (If really not, I would go to the Debian BTS.) -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue13580> ___
[issue13580] Pre-linkage of CPython >=2.6 binary on Linux too fat (libssl, libcrypto)
kxroberto <kxrobe...@users.sourceforge.net> added the comment:

> > > It doesn't happen in Python 3
> > Yet the cheap/unnecessary pre-imports of ssl in those other mentioned socket-using libs (urllib (cgi!), httplib, smtplib, pop, imap...) exist there. socket is rarely used directly, so not much difference to Py2 in the overall effect.
> Well, by this measure, we probably have unnecessary imports all over the place, since the general guideline is to import modules at the top-level rather than inside functions (the reason is partly technical, to avoid potential deadlocks with the import lock). see e.g. issue #1046077.

As said, in this case here it is even about libs almost as big as the Python binary itself (on many platforms). When there are only few link points and a huge effect (times usage probabilities), late imports shall be used. Python is dynamic. The list, as posted, is short here. Grep for ssl in the Python lib. It is not reasonable to draw in the big libcrypto and libssl almost always.

> > Thus I'd request to not close this issue so swiftly. This is IMHO really a point to make Python startup significantly faster, by rather simple means.
> If you are using a network library such as urllib or others you mentioned, then startup time will surely be small compared to the time spent sending and retrieving data over the network, no?

No. This is too cheap here. I'd vote for some discipline regarding such levels of resource usage. (I often wonder why software today isn't much faster than years ago - though the nominal speed of hardware increases tremendously. Package sizes grow without appropriate growth of functionality. This is one example of how the resources are wasted too carelessly.) For example each cgi script (which has to respond fast and does only a small job), which does import cgi and a few lines; or a script which just uses e.g. urllib string format functions ... : the whole thing is drawn in.

> > Also the linkage of _ssl solely against a detailed version of libssl/libcrypto is still questionable.
> > I don't know the reasons (if any).
> Perhaps you can open a separate issue about that?

Yet the issue of this library is here now. Why procrastinate?

> > > This is therefore a problem with the Debian package
> > I'm not into the Python build files. Just to ask/double-check: is that observed _semi_ static link selection (which is good otherwise - somebody must have done surprisingly lots of care) really from Debian, or is there maybe a sort of 2nd default option bundle somewhere in Python's configure? (If really not, I would go to the Debian BTS.)
> Well, seeing as Mageia's Python 2.7 doesn't have the problem, I really think it must be Debian-specific

As emphasized in my sentence, such reasoning alone would be sloppy. That is why I asked. Does somebody actually know whether this optimized semi-static module list is really not in Python's configure somewhere? (The Debians would have gone rather deep into issues if they really created that fine tuning on their own. I can almost not believe it. If so, I'd even recommend to adopt that (except _ssl.so) generally into Python's standard configuration - at least for Linux.) -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue13580> ___
[issue13432] Encoding alias unicode
kxroberto <kxrobe...@users.sourceforge.net> added the comment: I wonder where the origin is - who is the inventor of the frequent charset=unicode?

> But: Sorry, but it's not obvious that "unicode" means UTF-8.

When I faced <meta content="text/html; charset=unicode" http-equiv="Content-Type"/> the first time on the web, I guessed it is UTF-8 without looking. It even sounds colloquially reasonable ;-) And it is right in 99.999% of cases. (UTF-16 is less frequent than this non-canonical "unicode".)

> Definitely; this will just serve to create more confusion for beginners over what a Unicode string is: unicodestring.encode('unicode') - WTF?

I guess no Python tutorial writer or encoding-menu writer poses that example. That string comes in on technical paths: web, MIME etc. In aliases.py there are many other names which are not canonical. Frequency => convenience alias.

> Joining the chorus: people who need it in their application will have to add it themselves (monkeypatching the aliases dictionary as appropriate).

Those people first would need to be aware of the option: be all-seeing, or all wait for the first bug reports ... Reverse question: what would be the minus of having this alias? -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue13432> ___
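The monkeypatching workaround referred to above looks roughly like this (a sketch; whether shipping the alias by default is a good idea is exactly what the issue debates):

```python
from encodings import aliases

# Register "unicode" as an alias for UTF-8 before it is first looked up;
# later codec lookups for 'unicode' then resolve through this table.
aliases.aliases['unicode'] = 'utf_8'

decoded = b'\xc3\xa4'.decode('unicode')   # now resolves to utf_8
```

Applications that receive `charset=unicode` content could apply this once at startup instead of special-casing the charset string everywhere.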
[issue13479] pickle to picky on re-defined classes
New submission from kxroberto <kxrobe...@users.sourceforge.net>: When a class definition was re-executed (reload, exec ...), pickling of existing instances fails for a too picky reason (class object id mismatch). Solved by the one-liner patch below. Rationale: Python is dynamic. Like with any normal attribute lookup: when it is the same module/name, this class is really meant - no matter its id. (During unpickling it is another id anyway; the code may have evolved for years ...)

diff -ur --strip _orig/pickle.py ./pickle.py
--- _orig/pickle.py 2008-09-08 10:58:32 +
+++ ./pickle.py 2011-11-24 15:47:11 +
@@ -747,7 +747,7 @@
                     "Can't pickle %r: it's not found as %s.%s" %
                     (obj, module, name))
         else:
-            if klass is not obj:
+            if klass.__name__ != obj.__name__:
                 raise PicklingError(
                     "Can't pickle %r: it's not the same object as %s.%s" %
                     (obj, module, name))

-- components: Library (Lib) messages: 148311 nosy: kxroberto priority: normal severity: normal status: open title: pickle to picky on re-defined classes type: crash versions: Python 2.6, Python 2.7, Python 3.1, Python 3.2, Python 3.3, Python 3.4 ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue13479> ___
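The failure the patch targets is easy to trigger (a Python 3 sketch of the reload scenario; class and variable names are made up):

```python
import pickle

class C:
    pass

obj = C()

class C:          # re-executed class definition, e.g. after a reload
    pass

# obj's class is no longer the object found under the name "C",
# so pickling refuses with "it's not the same object as ...".
try:
    pickle.dumps(obj)
    error = None
except pickle.PicklingError as exc:
    error = exc
```

The proposed one-liner would accept this case by comparing class names instead of identities.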
[issue13479] pickle too picky on re-defined classes
Changes by kxroberto <kxrobe...@users.sourceforge.net>: -- title: pickle to picky on re-defined classes -> pickle too picky on re-defined classes ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue13479> ___
[issue1459867] Message.as_string should use mangle_from_=unixfrom?
kxroberto <kxrobe...@users.sourceforge.net> added the comment: I'd still say this is a plain bug, which simply should be fixed. People who have working code must already have a smart workaround - either-or! - by doing the 5 low-level code lines of .as_string() on their own (there is no other way to produce clearly either a unixfrom=True or a unixfrom=False message). As of now this is simply neither-nor. People are just quiet (I wonder), because the bug effect is rare. If somebody really wants to produce the unix mbox format for antiquated purposes, he would use unixfrom=True when calling this function (because otherwise it is not complete unixfrom). And then the patched version is OK as well. But when you call with unixfrom=False (the default), a partially unixfrom=True mangled MIME body comes out. This is just buggy ... Most striking is that all lines in the message body (which the mail recipient reads) which start with the word "From" are converted to ">From". This is not acceptable.

> a little bit more likely to preserve format of the message that was fed into it

Certainly mail message bodies must not be altered in a funny way when mangling is not ordered. I cannot imagine that somebody can consciously or unconsciously rely on the bug. But in 99% of cases the patch would just fix people's buggy programs. If this really cannot be fixed, then at least an extra function next to as_string should be added (e.g. as_unmangled_string()), which allows creation of a legal unmangled message. The current function can so far only produce a mangled message, and consistently only if in addition it is called explicitly with unixfrom=True ;-) -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue1459867> ___
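The mangling complained about can be demonstrated directly (a Python 3 sketch using email.generator; the message content is made up):

```python
import io
from email.message import Message
from email.generator import Generator

msg = Message()
msg.set_payload('From here on, every such line gets quoted.\n')

buf = io.StringIO()
# mangle_from_=True is the behavior under discussion: body lines
# beginning with "From " are rewritten to ">From " for mbox safety,
# even though unixfrom=False asks for no mbox envelope at all.
Generator(buf, mangle_from_=True).flatten(msg, unixfrom=False)
mangled = buf.getvalue()
```

Inspecting `mangled` shows the `>From` rewrite in the body despite unixfrom=False.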
[issue13432] Encoding alias unicode
New submission from kxroberto <kxrobe...@users.sourceforge.net>: "unicode" seems not to be an official Unicode encoding name alias. Yet it is quite frequent on the web - and obviously means UTF-8. (Search 'text/html; charset=unicode' in Google.) Chrome and IE display it as UTF-8 (Mozilla as ASCII, thus mixed-up chars). Should it be added into aliases.py?

--- ./aliases.py
+++ ./aliases.py
@@ -511,6 +511,7 @@
     'utf8' : 'utf_8',
     'utf8_ucs2' : 'utf_8',
     'utf8_ucs4' : 'utf_8',
+    'unicode' : 'utf_8',

     # uu_codec codec
     'uu' : 'uu_codec',

-- messages: 147936 nosy: kxroberto priority: normal severity: normal status: open title: Encoding alias unicode ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue13432> ___
[issue13432] Encoding alias unicode
Changes by kxroberto <kxrobe...@users.sourceforge.net>: -- components: +Unicode nosy: +ezio.melotti type: -> feature request versions: +Python 2.6, Python 2.7, Python 3.1, Python 3.2, Python 3.3, Python 3.4 ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue13432> ___
[issue1486713] HTMLParser : A auto-tolerant parsing mode
kxroberto <kxrobe...@users.sourceforge.net> added the comment: Well, in many browsers for example there is an internal warning and error log (window), which does not (need to) claim to be an official W3C checker. It has a positive effect on web stabilization. For example, just looking now, I see the many HTML and CSS warnings and errors about the sourceforge site and this bug tracker in the browser's log - not believing that the log covers the bugs 100% ;-) The events of warnings are easily available here, and calling self.warning, as it was, costs quite nothing. I don't see a problem for non-users of this feature. And most code using HTMLParser also emits warnings on the next higher syntax level, so as not to have a black box ... As I used a tolerant version of HTMLParser for about a decade, I can say the warnings are of the same value in many apps and use cases as being able to have a look into a browser's syntax log. The style of stretching an argument to black-white is not reasonable here, in the world of human-edited HTML ;-) -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue1486713> ___
[issue1486713] HTMLParser : A auto-tolerant parsing mode
kxroberto kxrobe...@users.sourceforge.net added the comment: The old patch already warned about the majority of real cases - except for missing whitespace between attributes. The tolerant regex will match both.

locatestarttagend_tolerant: The main and frequent issue on the web here is the missing whitespace between attributes (with enclosed values). And there is the newly tolerated comma between attributes, which however I have not seen anywhere so far (the old warning mechanism and attrfind.match would have already raised it as a junk-chars ... event). Both issues can easily be warned about (also/already) at essentially no cost by the slightly extended regex below (when the 2 new non-pseudo regex groups are checked against None in check_for_whole_start_tag). Or missing whitespace could be warned about (multiple times) at attrfind time.

attrfind_tolerant: I see no point in the old/strict attrfind (and the difference covers a guessed 0.000% of real cases). attrfind_tolerant could become the only attrfind.

--

    locatestarttagend_tolerant = re.compile(r"""
      <[a-zA-Z][-.a-zA-Z0-9:_]*          # tag name
      (?:(?:\s+|(\s*))                   # optional whitespace before attribute
                                         # name (group 1 matches if missing)
         (?:[a-zA-Z_][-.:a-zA-Z0-9_]*    # attribute name
            (?:\s*=\s*                   # value indicator
               (?:'[^']*'                # LITA-enclosed value
                 |"[^"]*"                # LIT-enclosed value
                 |[^'">\s]+              # bare value
               )
               (?:\s*(,))*               # possibly followed by a comma (group 2)
            )?
         )
      )*
      \s*                                # trailing whitespace
      """, re.VERBOSE)

    attrfind_tolerant = re.compile(
        r'\s*([a-zA-Z_][-.:a-zA-Z_0-9]*)(\s*=\s*'
        r'(\'[^\']*\'|"[^"]*"|[^>\s]*))?')

    #s = '<abc a="b",+c="d"e="f">text'
    #s = '<abc a="b",+ c="d"e="f">text'
    s = '<abc a="b",+,c="d" e="f">text'
    m = locatestarttagend_tolerant.search(s)
    print m.group()
    print m.groups()
    #if m.group(1) is not None: self.warning('space missing ...
    #if m.group(2) is not None: self.warning('comma between attr...
    m = attrfind_tolerant.search(s, 5)
    print m.group()
    print m.groups()

-- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue1486713 ___
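The idea of piggybacking warnings onto extra capture groups can be sketched independently of HTMLParser's internals. The following is a simplified, hypothetical illustration (the names and the reduced attribute grammar are mine, not from the patch): group 1 participates with an empty match only when inter-attribute whitespace is missing, and group 2 participates only when a stray comma follows a value, so a caller can warn about sloppy markup without rejecting the tag.

```python
import re

# Hypothetical, simplified sketch of the patch's idea: extra capture
# groups fire only on sloppy input, so the caller can warn and continue.
tolerant_attrs = re.compile(r"""
    (?:(?:\s+|(\s*))                     # group 1: participates (empty match)
                                         #   only when whitespace is missing
       [a-zA-Z_][-.:a-zA-Z0-9_]*         # attribute name
       (?:\s*=\s*
          (?:'[^']*'|"[^"]*"|[^'">\s,]+) # quoted or bare value
       )?
       (\s*,)?                           # group 2: stray comma after a value
    )+
""", re.VERBOSE)

def attr_warnings(s):
    """Return warning strings for the attribute part of a start tag."""
    m = tolerant_attrs.match(s)
    warnings = []
    if m and m.group(1) is not None:
        warnings.append("whitespace missing between attributes")
    if m and m.group(2) is not None:
        warnings.append("comma between attributes")
    return warnings
```

For example, attr_warnings(' a="b"c="d"') reports the missing space, attr_warnings(' a="b", c="d"') reports the comma, and the clean ' a="b" c="d"' yields no warnings. This works because Python's re engine retains the last successful participation of a group across repetitions.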
[issue1486713] HTMLParser : A auto-tolerant parsing mode
kxroberto kxrobe...@users.sourceforge.net added the comment: 16ed15ff0d7c was not in the current stable py3.2, so I missed it. When the comma is now raised as an attribute name, the problem is merely moved to the higher level - and is/can be handled there easily by the usual methods. (Still, I guess locatestarttagend_tolerant matches an extra free-standing comma after an attribute.) "should be generic enough to work with all kind of invalid markup": I think we would then be rather complete (minus the missing-space issue) - at least regarding the percentage of real cases. And it could be improved with a few touches over time if something is missing. 100% is not the point unless it shall drive the official W3C checker. The call of self.warning, as in the old patch, costs nothing otherwise, and I see no real increase in complexity/CPU time. "HTMLParser won't do any check about the validity of the elements' names or attributes' names/values": yes, that's of course up to the next-level handler (BTDT) - thus the possibility of error handling is not killed. It's about what HTMLParser _hides_ irrecoverably. "there should be a valid use case for this": Almost any app which parses HTML (self-authored or remote) can have (should have?) a no-fuss/collateral warn-log option (no need for an expensive W3C checker session). I mostly have this in use, as said, since it was there anyway. Well, as for me, I use a private backport of this to Python 2 anyway. I try to avoid Python 3 as far as possible (no real plus, too many problems). So for me it's just about joining Python 4 in the future perhaps - which can do true CPython multithreading, stackless, psyco/static typing ... and the print statement again without typing so many extra braces ;-) I considered extra libs like the HTML tidy binding, but this is all too much fuss for most cases. And HTMLParser already has quite everything, with the few calls inserted ..
-- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue1486713 ___
[issue1481032] patch smtplib:when SMTPDataError, rset crashes with sslerror
kxroberto kxrobe...@users.sourceforge.net added the comment: ping! Perhaps I forgot to write that I also uploaded the cleaned patch on 2010-08-23. I really think this simple patch is necessary. I just saw the same problem again, as I forgot the patch in one of my recent Python update installations. When SMTPDataError, SMTPRecipientsRefused or SMTPSenderRefused should be raised, very likely a subsequent error (like a closed connection etc.) during self.rset() overlaps the original exception, and one will not be able to locate the problem at user level in eons .. This patch is still in my vital-patches collection, which I have had to apply after every Python update for years. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue1481032 ___
[issue1486713] HTMLParser : A auto-tolerant parsing mode
kxroberto kxrobe...@users.sourceforge.net added the comment: I looked at the new patch http://hg.python.org/lookup/r86952 for Py3 (regarding the extended tolerance and local backporting to Python 2.7). What I miss are calls to a kind of self.warning(msg, i, k) function in non-strict/tolerant mode (where self.error is called in strict mode). Such a function could be empty, or a silent simple counter (like in the old patch), and could easily be subclassed for advanced use. I often want at least the possibility of an HTML error log - so the HTML author (sometimes it's me myself) can be notified, to get stricter in the long run ;-) ... -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue1486713 ___
[issue1486713] HTMLParser : A auto-tolerant parsing mode
kxroberto kxrobe...@users.sourceforge.net added the comment: I'm not working with Py3 and don't know how much that module differs in 3. Unless it's going into a py2 version, I'll leave the FR to the py3 community for now. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue1486713 ___
[issue1486713] HTMLParser : A auto-tolerant parsing mode
kxroberto kxrobe...@users.sourceforge.net added the comment: For me, a parser which cannot be fed with HTML from outside (which I cannot edit myself) has not much use at all. Attached is my current patch (vs. py26) - many changes meanwhile - and a test case. I've put the default to strict mode, but ... -- Added file: http://bugs.python.org/file18623/HTMLParser_tolerant_py26.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue1486713 ___
[issue1486713] HTMLParser : A auto-tolerant parsing mode
Changes by kxroberto kxrobe...@users.sourceforge.net: -- versions: +Python 2.6, Python 2.7 Added file: http://bugs.python.org/file18624/test_htmlparser_tolerant.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue1486713 ___
[issue1481032] patch smtplib:when SMTPDataError, rset crashes with sslerror
kxroberto kxrobe...@users.sourceforge.net added the comment: I still think all 3 self.rset() calls in SMTP.sendmail which are followed by a raise statement have to be bracketed with an except clause or so - nowadays better like

    try:
        self.rset()
    except (EnvironmentError, SMTPException):
        pass

as all socket.error / sslerror exceptions seem meanwhile correctly inherited (via IOError). Because the original error must pop up, not a side effect of the state cleaning! -- versions: +Python 2.6 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue1481032 ___
[issue1481032] patch smtplib:when SMTPDataError, rset crashes with sslerror
Changes by kxroberto kxrobe...@users.sourceforge.net: Added file: http://bugs.python.org/file18613/smtplib_nosideeffecterror.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue1481032 ___
[issue1481032] patch smtplib:when SMTPDataError, rset crashes with sslerror
Changes by kxroberto kxrobe...@users.sourceforge.net: Removed file: http://bugs.python.org/file7227/smtplib-authplain-tryrset.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue1481032 ___
[issue5141] C API for appending to arrays
kxroberto kxrobe...@users.sourceforge.net added the comment: A first step would be to select a suitable prefix name for the array API, because the NumPy people have 'stolen' PyArray_ instead of staying home with PyNDArray_ or so ;-) In case somebody goes into this: besides PyList_-like functions and the existing members, for speedy access (like in Cython's array.pxd) direct resizing, the buffer pointer, and something handy like this should be directly exposed:

    int PyArr_ExtendFromBuffer(PyObject *arr, void *stuff, Py_ssize_t items)

-- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5141 ___
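For context, roughly the bulk-extend behaviour that such a C call would expose is already reachable from Python via array.array's buffer methods; a small sketch (using the Python 3 method names frombytes/tobytes; on Python 2 the equivalents were fromstring/tostring):

```python
from array import array

# Extend one array from another array's raw buffer - conceptually what an
# int PyArr_ExtendFromBuffer(arr, ptr, items) would do in C, minus the
# intermediate bytes copy that the Python-level route incurs.
a = array('d', [1.0, 2.0])
b = array('d', [3.0, 4.0])
a.frombytes(b.tobytes())   # appends b's items to a
```

After this, a.tolist() is [1.0, 2.0, 3.0, 4.0]. The point of a C-level call would be to skip the temporary bytes object and append straight from a foreign buffer.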
[issue5141] C API for appending to arrays
kxroberto kxrobe...@users.sourceforge.net added the comment: I had a similar problem creating a C-fast array.array interface for Cython. The array.pxd package here (latest zip file) http://trac.cython.org/cython_trac/ticket/314 includes an arrayarray.h file, which provides ways for efficient creation and growth from C (extend, extend_buffer, resize, resize_smart). It will probably be in one of the next Cython distributions anyway and will be maintained. And perhaps array2 and arrayM extension subclasses (a very lightweight numpy) with a public API are coming soon too. It respects the different Python versions, so it's a lite quasi-API. And in case there is an (unlikely) change in future Pythons, the Cython people will take care of it as long as no official API comes up. Or perhaps most people with such an interest use Cython anyway. -- nosy: +kxroberto ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5141 ___
[issue5977] distutils build_ext.get_outputs returns wrong result (patch)
kxroberto kxrobe...@users.sourceforge.net added the comment: It happens with any .pyx (Cython) module, when after pyximport.install() and injection of the --inplace option a .pyx module is imported. The relevant lines in pyxbuild.py are

    dist.run_commands()
    return dist.get_command_obj("build_ext").get_outputs()[0]

which use the buggy get_outputs and shall return the full path of the built module to pyximport.py:

    so_path = pyxbuild.pyx_to_dll(pyxfilename, extension_mod,
                                  build_in_temp=build_in_temp,
                                  pyxbuild_dir=pyxbuild_dir,
                                  setup_args=sargs)
    assert os.path.exists(so_path), "Cannot find: %s" % so_path

=> crash with "Cannot find ..." before pyximport.load_module goes to import it. - A stripped-down test case should perhaps 'build_ext' any arbitrary extension module with --inplace (an option of the base command 'build') and something like

    ...
    dist.get_command_obj("build_ext").inplace = 1
    dist.run_commands()
    so_path = dist.get_command_obj("build_ext").get_outputs()[0]
    assert os.path.isfile(so_path) and os.path.dirname(so_path) in ('', '.')
    ...

which will produce an invalid so_path: not pointing to the actual locally in-place built xy.pyd/.so, but to a non-existing or old file in the build folders. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5977 ___
[issue5977] distutils build_ext.get_outputs returns wrong result (patch)
New submission from kxroberto kxrobe...@users.sourceforge.net: With --inplace etc., build_ext.get_outputs returns wrong extension paths; it tries to compute them out of thin air *again* and .. Tools which need to further know the produced module file, e.g. pyximport (Cython), then crash ... Patch see attachment: it stores and uses the real path in use. It writes a new attribute ext_filename to the Extension object - should it be underscored (_ext_filename) or ... ? -- assignee: tarek components: Distutils files: ext_filename.patch keywords: patch messages: 87494 nosy: kxroberto, tarek severity: normal status: open title: distutils build_ext.get_outputs returns wrong result (patch) type: crash versions: Python 2.6 Added file: http://bugs.python.org/file13941/ext_filename.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5977 ___
[issue5977] distutils build_ext.get_outputs returns wrong result (patch)
kxroberto kxrobe...@users.sourceforge.net added the comment: test_build_ext: "The test must be run in a python build dir" - I don't have a build setup currently; maybe in the future. One can nevertheless see in build_ext.py: after "if self.inplace:" (line 485) there are 2 different computations for ext_filename, while the computation in get_outputs() is not aware of self.inplace (line 447). -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5977 ___
[issue1475397] compute/doc %z os-indep., time.asctime_tz / _TZ
kxroberto kxrobe...@users.sourceforge.net added the comment: (I'm somewhat away from all that currently, and not aware whether the newest Python versions have already solved this, but:) * a time.asctime_tz([tim]) or so should deliver a full OS-independent _world_ time string incl. numeric timezone info, like "Sat Mar 21 10:33:36 2009 +HHMM". It should accept a 10-tuple (like urlopen_file.info().getdate_tz('date')), or a time.time() float time, or interpret a 9-tuple as GMTIME/UTC. * strftime(%z) should be supported on all OSes - a _constant numeric_ +HHMM format. * strftime(%C, [tim]) should be like asctime_tz. It should alternatively accept a 10-tuple as 2nd parameter, or a time.time() universal float time, or interpret the 9-tuple as LOCALTIME as it were. test cases to add: * simply render a variation of constant tuples/float_times to constant result strings. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue1475397 ___
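For reference, later Pythons cover part of this wish at the library level: email.utils can render and parse date strings with a numeric offset in an OS-independent way. A sketch using the Python 3 names (not the proposed time.asctime_tz, whose exact format would differ):

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

# Render an aware datetime with a numeric offset, independent of any
# platform strftime('%z') support, and parse it back losslessly.
dt = datetime(2009, 3, 21, 10, 33, 36, tzinfo=timezone.utc)
s = format_datetime(dt)          # 'Sat, 21 Mar 2009 10:33:36 +0000'
roundtrip = parsedate_to_datetime(s)
```

This gives the "constant tuples/float_times to constant result strings" property asked for in the test cases above, since the output never depends on the OS locale or timezone database lookup of %z.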
[issue1475397] time.asctime_tz, time.strftime %z %C
Changes by kxroberto kxrobe...@users.sourceforge.net: -- title: compute/doc %z os-indep., time.asctime_tz / _TZ -> time.asctime_tz, time.strftime %z %C ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue1475397 ___
[issue1459867] Message.as_string should use mangle_from_=unixfrom?
kxroberto kxrobe...@users.sourceforge.net added the comment: g = Generator(fp, mangle_from_=unixfrom) in the code location below? It often produced exceptions when message lines (or header lines, e.g. Subject too, if I remember right) begin with the "From " marker or so.

    --- Message.py.orig  2004-12-22 16:01:38.0 +0100
    +++ Message.py       2006-03-28 10:59:42.0 +0200
    @@ -126,7 +126,7 @@
             from email.Generator import Generator
             fp = StringIO()
    -        g = Generator(fp)
    +        g = Generator(fp,mangle_from_=unixfrom)
             g.flatten(self, unixfrom=unixfrom)
             return fp.getvalue()

-- title: convenient Message.as_string to use mangle_from_=unixfrom ? -> Message.as_string should use mangle_from_=unixfrom? ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue1459867 ___
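The effect of mangle_from_ can be demonstrated directly. A sketch using the Python 3 module names (in the Python 2.x email package discussed above, the classes live under email.Message and email.Generator):

```python
from email.message import Message
from email.generator import Generator
from io import StringIO

# A body line starting with "From " would look like a new-message marker
# in mbox-style storage; mangle_from_=True makes the Generator escape
# such lines as ">From " when flattening.
msg = Message()
msg['Subject'] = 'demo'
msg.set_payload('From here on, plain text.\n')

fp = StringIO()
Generator(fp, mangle_from_=True).flatten(msg, unixfrom=False)
text = fp.getvalue()
```

The flattened text then contains the escaped body line '>From here on, plain text.', which is exactly the behaviour Message.as_string would gain by passing mangle_from_=unixfrom through to the Generator.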