Re: [Python-Dev] Status of json (simplejson) in cpython
On Saturday, April 16, 2011, Antoine Pitrou solip...@pitrou.net wrote: On Saturday, April 16, 2011 at 17:07 +0200, Xavier Morel wrote: On 2011-04-16, at 16:52, Antoine Pitrou wrote: On Saturday, April 16, 2011 at 16:42 +0200, Dirkjan Ochtman wrote: On Sat, Apr 16, 2011 at 16:19, Antoine Pitrou solip...@pitrou.net wrote: What you're proposing doesn't address the question of who is going to do the ongoing maintenance. Bob apparently isn't interested in maintaining stdlib code, and python-dev members aren't interested in maintaining simplejson (assuming it would be at all possible). Since both groups of people want to work on separate codebases, I don't see how sharing a single codebase would be possible. From reading this thread, it seems to me like the proposal is that Bob maintains a simplejson for both 2.x and 3.x and that the current stdlib json is replaced by a (trivially changed) version of simplejson. The thing is, we want to bring our own changes to the json module and its tests (and have already done so, although some have been backported to simplejson). Depending on what those changes are, would it not be possible to apply the vast majority of them to simplejson itself? Sure, but the thing is, I don't *think* we are interested in backporting stuff to simplejson much more than Bob is interested in porting stuff to the json module. I've backported every useful patch (for 2.x) I noticed from json to simplejson. Would be happy to apply any that I missed if anyone can point these out. I've contributed a couple of patches myself after they were integrated to CPython (they are part of the performance improvements Bob is talking about), but that was exceptional. Backporting a patch to another project with a different directory structure, slightly different code, etc. is tedious and not very rewarding for us Python core developers, while we could do other work on our limited free time.
That's exactly why I am not interested in stdlib maintenance myself, I only use 2.x and that's frozen... so I can't maintain the version we would actually use. Also, some types of work would be tedious to backport, for example if we refactor the tests to test both the C and Python implementations. simplejson's test suite has tested both for quite some time. Furthermore, now that python uses Mercurial, it should be possible (or even easy) to use a versioned queue (via MQ) for the trivial adaptation, and the temporary alterations (things which will likely be merged back into simplejson but are not yet, stuff like that) should it not? Perhaps, perhaps not. That would require someone motivated to put it in place, ensure that it doesn't get in the way, document it, etc. Honestly, I don't think maintaining a single stdlib module should require such an amount of logistics. It certainly shouldn't, especially because neither of them changes very fast. -bob ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Status of json (simplejson) in cpython
On Thu, Apr 14, 2011 at 2:29 PM, Raymond Hettinger raymond.hettin...@gmail.com wrote: On Apr 14, 2011, at 12:22 PM, Sandro Tosi wrote: The version we have in cpython of json is simplejson 2.0.9, highly patched (both because it was converted to py3k and because of the normal flow of issues/bugfixes), while upstream has already released 2.1.13. The two roads have diverged a lot, and since this blocks any further update of cpython's json from upstream, I'd like to close this gap. Are you proposing updates to the Python 3.3 json module to include newer features like use_decimal and changing the indent argument from an integer to a string? https://github.com/simplejson/simplejson/blob/master/CHANGES.txt - what are we going to do in the long run? If Bob shows no interest in Python 3, then the code bases will probably continue to diverge. I don't have any real interest in Python 3, but if someone contributes the code to make simplejson work in Python 3 I'm willing to apply the patches and run the tests against any future changes. The porting work to make it suitable for the standard library at that point should be something that can be automated since it will be moving some files around and changing the string 'simplejson' to 'json' in a whole bunch of places. Since the JSON spec is set in stone, the changes will mostly be about API (indentation, object conversion, etc) and optimization. I presume the core parsing logic won't be changing much. Actually the core parsing logic is very different (and MUCH faster), which is why the merge is tricky. There's the potential for it to change more in the future; there's definitely more room for optimization. Probably not in the pure python parser, but the C one. -bob
Re: [Python-Dev] Status of json (simplejson) in cpython
On Friday, April 15, 2011, Antoine Pitrou solip...@pitrou.net wrote: Since the JSON spec is set in stone, the changes will mostly be about API (indentation, object conversion, etc) and optimization. I presume the core parsing logic won't be changing much. Actually the core parsing logic is very different (and MUCH faster), Are you talking about the Python logic or the C logic? Both, actually. IIRC simplejson in pure python typically beats json with its C extension. -bob
Re: [Python-Dev] Status of json (simplejson) in cpython
On Fri, Apr 15, 2011 at 2:20 PM, Antoine Pitrou solip...@pitrou.net wrote: On Friday, April 15, 2011 at 14:18 -0700, Bob Ippolito wrote: On Friday, April 15, 2011, Antoine Pitrou solip...@pitrou.net wrote: Since the JSON spec is set in stone, the changes will mostly be about API (indentation, object conversion, etc) and optimization. I presume the core parsing logic won't be changing much. Actually the core parsing logic is very different (and MUCH faster), Are you talking about the Python logic or the C logic? Both, actually. IIRC simplejson in pure python typically beats json with its C extension. Really? It would be nice to see some concrete benchmarks against both repo tips. Maybe in a few weeks or months when I have time to finish up the benchmarks that I was working on... but it should be pretty easy for anyone to show that the version in CPython is very slow (and uses a lot more memory) in comparison to simplejson. -bob
Re: [Python-Dev] Status of json (simplejson) in cpython
On Fri, Apr 15, 2011 at 4:12 PM, Antoine Pitrou solip...@pitrou.net wrote: On Fri, 15 Apr 2011 14:27:04 -0700 Bob Ippolito b...@redivi.com wrote: On Fri, Apr 15, 2011 at 2:20 PM, Antoine Pitrou solip...@pitrou.net wrote: On Friday, April 15, 2011 at 14:18 -0700, Bob Ippolito wrote: On Friday, April 15, 2011, Antoine Pitrou solip...@pitrou.net wrote: Since the JSON spec is set in stone, the changes will mostly be about API (indentation, object conversion, etc) and optimization. I presume the core parsing logic won't be changing much. Actually the core parsing logic is very different (and MUCH faster), Are you talking about the Python logic or the C logic? Both, actually. IIRC simplejson in pure python typically beats json with its C extension. Really? It would be nice to see some concrete benchmarks against both repo tips. Maybe in a few weeks or months when I have time to finish up the benchmarks that I was working on... but it should be pretty easy for anyone to show that the version in CPython is very slow (and uses a lot more memory) in comparison to simplejson. Well, here's a crude microbenchmark. I'm comparing 2.6+simplejson 2.1.3 to 3.3+json, so I'm avoiding integers:

* json.dumps:

    $ python -m timeit -s "from simplejson import dumps, loads; d = dict((str(i), str(i)) for i in range(1000))" "dumps(d)"

    - 2.6+simplejson: 372 usec per loop
    - 3.2+json: 352 usec per loop

* json.loads:

    $ python -m timeit -s "from simplejson import dumps, loads; d = dict((str(i), str(i)) for i in range(1000)); s = dumps(d)" "loads(s)"

    - 2.6+simplejson: 224 usec per loop
    - 3.2+json: 233 usec per loop

The runtimes look quite similar. That's the problem with trivial benchmarks. With more typical data (for us, anyway) you should see very different results. -bob
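To illustrate Bob's point about trivial benchmarks, here is a sketch of a benchmark with a less flat payload. The payload shape below is my own invention, not something from the thread, and absolute numbers will of course vary by machine and library version:

```python
import json
import timeit

# Hypothetical "more typical" payload: nested objects with mixed value
# types, rather than a flat dict of 1000 short strings.
doc = [{"id": i, "name": "user%d" % i, "tags": ["a", "b", "c"],
        "scores": [i * 0.5, i * 1.5, i * 2.5]} for i in range(200)]
s = json.dumps(doc)

# Round-trip sanity check before timing anything.
assert json.loads(s) == doc

print("dumps:", timeit.timeit(lambda: json.dumps(doc), number=100))
print("loads:", timeit.timeit(lambda: json.loads(s), number=100))
```

Swapping `json` for `simplejson` in the imports gives a direct comparison of the two implementations on the same data.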
Re: [Python-Dev] Pickle alternative in stdlib (Was: On breaking modules into packages)
On Friday, November 5, 2010, exar...@twistedmatrix.com wrote: On 12:21 am, m...@gsites.de wrote: On 04.11.2010 17:15, anatoly techtonik wrote: pickle is insecure, marshal too. If the transport or storage layer is not safe, you should cryptographically sign the data anyway::

    def pickle_encode(data, key):
        msg = base64.b64encode(pickle.dumps(data, -1))
        sig = base64.b64encode(hmac.new(key, msg).digest())
        return sig + ':' + msg

    def pickle_decode(data, key):
        if data and ':' in data:
            sig, msg = data.split(':', 1)
            if sig == base64.b64encode(hmac.new(key, msg).digest()):
                return pickle.loads(base64.b64decode(msg))
        raise pickle.UnpicklingError("Wrong or missing signature.")

Bottle (a web framework) uses a similar approach to store non-string data in client-side cookies. I don't see a (security) problem here. Your pickle_decode leaks information about the key. An attacker will eventually (a few seconds to a few minutes, depending on how they have access to this system) be able to determine your key and send you arbitrary pickles (i.e., execute arbitrary code on your system). Oops. This stuff is hard. If you're going to mess around with it, make sure you're *serious* (better approach: don't mess around with it). Specifically you need to use a constant-time signature verification or else there are possible timing attacks. Sounds like something an hmac module should provide in the first place. But yeah, this stuff is hard, better to just not have a code execution hole in the first place. -bob
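As a footnote to the timing-attack point: Python's hmac module did later grow exactly this primitive, hmac.compare_digest(). A sketch of the signed-pickle scheme rewritten around it (Python 3, so keys and payloads are bytes; the function names mirror the quoted example and are otherwise my own):

```python
import base64
import hmac
import pickle

def pickle_encode(data, key):
    # Serialize, base64 the payload, and prepend a base64 HMAC tag.
    msg = base64.b64encode(pickle.dumps(data, -1))
    sig = base64.b64encode(hmac.new(key, msg, "sha256").digest())
    return sig + b":" + msg

def pickle_decode(blob, key):
    sig, _, msg = blob.partition(b":")
    expected = base64.b64encode(hmac.new(key, msg, "sha256").digest())
    # compare_digest runs in time independent of where the inputs differ,
    # closing the byte-by-byte timing side channel described above.
    if not hmac.compare_digest(sig, expected):
        raise pickle.UnpicklingError("Wrong or missing signature.")
    return pickle.loads(base64.b64decode(msg))
```

This only authenticates the pickle; the broader advice in the thread still stands: if you control both ends, a format without a code-execution hole (e.g. JSON) is the safer default.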
Re: [Python-Dev] Return from generators in Python 3.2
On Fri, Aug 27, 2010 at 8:25 AM, Guido van Rossum gu...@python.org wrote: On Thu, Aug 26, 2010 at 5:05 PM, Yury Selivanov yseliva...@gmail.com wrote: On 2010-08-26, at 8:04 PM, Greg Ewing wrote: Even with your proposal, you'd still have to use a 'creepy abstraction' every time one of your coroutines calls another. That's why PEP 380 deals with 'more than just return'. Nope. In almost any coroutine framework you have a scheduler or trampoline object that basically does all the work of calling, passing values and propagating exceptions. And many other things that 'yield from' won't help you with (cooperation, deferring to process/thread pools, pausing, etc.) Being a developer of one such framework, I can tell you that I can easily live without 'yield from', but dealing with weird return syntax is a pain. That's not my experience. I wrote a trampoline myself (not released yet), and found that I had to write a lot more code to deal with the absence of yield-from than to deal with returns. In my framework, users write 'raise Return(value)' where Return is a subclass of StopIteration. The trampoline code that must be written to deal with StopIteration can be extended trivially to deal with this. The only reason I chose to use a subclass is so that I can diagnose when the return value is not used, but I could have chosen to ignore this or just diagnose whenever the argument to StopIteration is not None. A bit off-topic, but... In my experience the lack of yield from makes certain styles of programming both very tedious and very costly for performance. One example would be Genshi, which implements something like pipes or filters. There are many filters that will do something once (e.g. insert a doctype) but have O(N) performance because of the function call overhead of 'for x in other_generator: yield x'. Nest this a few times and you'll have 10 function calls for every byte of output (not an exaggeration in the case of Trac templates).
I think if implemented properly yield from could get rid of most of that overhead. -bob
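The delegation pattern Bob describes, and the PEP 380 form that replaces it, side by side (a minimal sketch; the generator names are illustrative, not Genshi's API):

```python
def inner():
    yield from ("a", "b", "c")

# Pre-PEP 380 style: each delegation level re-yields every item, so N
# nested filters cost N generator resumptions per element produced.
def delegate_loop(gen):
    for item in gen:
        yield item

# PEP 380 style: 'yield from' delegates directly, and the interpreter
# can shortcut the intermediate frames when resuming the chain.
def delegate_yield_from(gen):
    yield from gen

assert list(delegate_loop(inner())) == ["a", "b", "c"]
assert list(delegate_yield_from(inner())) == ["a", "b", "c"]
```

Both forms produce identical output; the difference is the per-element overhead once several filters are stacked.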
Re: [Python-Dev] State of json in 2.7
On Tuesday, June 22, 2010, Brett Cannon br...@python.org wrote: [cc'ing Bob on his gmail address; didn't have any other address handy so I don't know if this will actually get to him] On Tue, Jun 22, 2010 at 09:54, Dirkjan Ochtman dirk...@ochtman.nl wrote: It looks like simplejson 2.1.0 and 2.1.1 have been released: http://bob.pythonmac.org/archives/2010/03/10/simplejson-210/ http://bob.pythonmac.org/archives/2010/03/31/simplejson-211/ It looks like any changes that didn't come from the Python tree didn't go into the Python tree, either. Has anyone asked Bob why he did this? There might be a logical reason. I've just been busy. It's not trivial to move patches from one to the other, so it's not something that has been easy for me to get around to actually doing. It seems that more often than not when I have had time to look at something, it didn't line up well with python's release schedule. (and speaking of busy I'm en route for a week long honeymoon so don't expect much else from me on this thread) -bob
Re: [Python-Dev] Decimal - float comparisons in py3k.
On Sat, Mar 20, 2010 at 4:38 PM, Mark Dickinson dicki...@gmail.com wrote: On Sat, Mar 20, 2010 at 7:56 PM, Guido van Rossum gu...@python.org wrote: I propose to reduce all hashes to the hash of a normalized fraction, which we can define as a combination of the hashes for the numerator and the denominator. Then all we have to do is figure fairly efficient ways to convert floats and decimals to normalized fractions (not necessarily Fractions). I may be naive but this seems doable: for a float, the denominator is always a power of 2 and removing factors of 2 from the denominator is easy (just right-shift until the last bit is zero). For Decimal, the unnormalized denominator is always a power of 10, and the normalization is a bit messier, but doesn't seem excessively so. The resulting numerator and denominator may be large numbers, but for typical use of Decimal and float they will rarely be excessively large, and I'm not too worried about slowing things down when they are (everything slows down when you're using really large integers anyway). I *am* worried about slowing things down for large Decimals: if you can't put Decimal('1e1234567') into a dict or set without waiting for an hour for the hash computation to complete (because it's busy computing 10**1234567), I consider that a problem. But it's solvable! I've just put a patch on the bug tracker: http://bugs.python.org/issue8188 It demonstrates how hashes can be implemented efficiently and compatibly for all numeric types, even large Decimals like the above. It needs a little tidying up, but it works. I was interested in how the implementation worked yesterday, especially given the lack of explanation in the margins of numeric_hash3.patch. 
numeric_hash4.patch has much better comments, but I didn't see this patch until after I had sufficiently deciphered the previous patch and wrote most of this: http://bob.pythonmac.org/archives/2010/03/23/py3k-unified-numeric-hash/ I'm not really qualified to review the patch; what little formal math training I had has atrophied quite a bit over the years, but as far as I can tell it seems to work. The results also seem to match the Python implementations that I created. -bob
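The idea behind the patch, as described above: reduce every value to a fraction p/q and hash it as p times the modular inverse of q, modulo a fixed prime, so that Decimal('1e1234567') never has to be expanded into an actual integer. A sketch of that scheme against the sys.hash_info parameters modern CPython exposes (my reconstruction for illustration, not the patch itself):

```python
import sys
from fractions import Fraction

M = sys.hash_info.modulus  # the prime P used for numeric hashing (2**61 - 1 on 64-bit)

def rational_hash(p, q):
    # hash(p/q) = p * inverse(q) mod P.  Fermat's little theorem gives
    # the inverse as q**(P-2) mod P, which pow() computes in O(log P)
    # multiplications -- no huge intermediate integers required.
    inv = pow(q, M - 2, M)
    h = (abs(p) * inv) % M
    if p < 0:
        h = -h
    return -2 if h == -1 else h  # CPython reserves -1 as an error code

# Equal numeric values hash equally across int, float, and Fraction:
assert rational_hash(3, 1) == hash(3)
assert rational_hash(1, 2) == hash(0.5)
assert rational_hash(-7, 4) == hash(Fraction(-7, 4))
```

The same reduction covers Decimal: 1e1234567 is 10**1234567 / 1, and its hash can be computed as pow(10, 1234567, M) without ever materializing the number.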
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Sun, Jan 31, 2010 at 11:16 AM, Guido van Rossum gu...@python.org wrote: Whoa. This thread already exploded. I'm picking this message to respond to because it reflects my own view after reading the PEP. On Sun, Jan 31, 2010 at 4:13 AM, Hanno Schlichting ha...@hannosch.eu wrote: On Sun, Jan 31, 2010 at 1:03 PM, Simon Cross hodgestar+python...@gmail.com wrote: I don't know whether I'm in favour of using a single pyr folder or not but if a single folder is used I'd definitely prefer the folder to be called __pyr__ rather than .pyr. Exactly what I would prefer. I worry that having many small directories is a fairly poor use of the filesystem. (A quick scan of /usr/local/lib/python3.2 on my Linux box reveals 1163 .py files but only 57 directories.) I like this option as well, but why not just name the directory .pyc instead of __pyr__ or .pyr? That way people probably won't even have to reconfigure their tools to ignore it :) -bob
Re: [Python-Dev] Dropping bytes support in json
On Mon, Apr 27, 2009 at 7:25 AM, Damien Diederen d...@crosstwine.com wrote: Antoine Pitrou solip...@pitrou.net writes: Hello, We're in the process of forward-porting the recent (massive) json updates to 3.1, and we are also thinking of dropping remnants of support of the bytes type in the json library (in 3.1, again). This bytes support almost didn't work at all, but there was a lot of C and Python code for it nevertheless. We're also thinking of dropping the encoding argument in the various APIs, since it is useless. I had a quick look into the module on both branches, and at Antoine's latest patch (json_py3k-3). The current situation on trunk is indeed not very pretty in terms of code duplication, and I agree it would be nice not to carry that forward. I couldn't figure out a way to get rid of it short of multi-#including templates and playing with the C preprocessor, however, and have the nagging feeling the latter would be frowned upon by the maintainers. There is a precedent with xmltok.c/xmltok_impl.c, though, so maybe I'm wrong about that. Should I give it a try, and see how clean the result can be made? Under the new situation, json would only ever allow str as input, and output str as well. By posting here, I want to know whether anybody would oppose this (knowing, once again, that bytes support is already broken in the current py3k trunk). Provided one of the alternatives is dropped, wouldn't it be better to do the opposite, i.e., have the decoder take bytes as input, and the encoder produce bytes—and layer the str functionality on top of that? I guess the answer depends on how the (most common) lower layers are structured, but it would be nice to allow a straight bytes path to/from the underlying transport. (I'm willing to have a go at the conversion in case somebody is interested.) Bob, would you have an idea of which lower layers are most commonly used with the json module, and whether people are more likely to expect strs or bytes in Python 3.x? 
Maybe that data could be inferred from some bug tracking system? I don't know what Python 3.x users expect. As far as I know, none of the lower layers of the json package are used directly. They're certainly not supposed to be or documented as such. My use case for dumps is typically bytes output because we push it straight to and from IO. Some people embed JSON in other documents (e.g. HTML) where you would want it to be text. I'm pretty sure that the IO case is more common. -bob
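For the record, Python 3's json module ultimately took the str-only route discussed here: dumps() returns str, and the bytes path is an explicit encode at the IO boundary. A minimal sketch of the wire use case Bob describes (the variable names are illustrative):

```python
import json

doc = {"name": "caf\u00e9", "id": 3}

# The encoder produces text; the transport layer encodes it once, at
# the edge, before it hits a socket or file.
text = json.dumps(doc, ensure_ascii=False)   # str
wire = text.encode("utf-8")                  # bytes for the transport

# Decoding side: decode once, then parse text.
assert json.loads(wire.decode("utf-8")) == doc
```

(Since Python 3.6, json.loads also accepts UTF-8/16/32-encoded bytes directly, detecting the encoding itself, so the explicit decode above is a convenience rather than a requirement.)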
Re: [Python-Dev] Dropping bytes support in json
On Mon, Apr 13, 2009 at 1:02 PM, Martin v. Löwis mar...@v.loewis.de wrote: Yes, there's a TCP connection. Sorry for not making that clear to begin with. If so, it doesn't matter what representation these implementations chose to use. True, I can always convert from bytes to str or vice versa. I think you are missing the point. It will not be necessary to convert. You can write the JSON into the TCP connection in Python, and it will come out as strings just fine in C# and JavaScript. This is how middleware works - it abstracts from programming languages, and allows for different representations in different languages, in a manner invisible to the participating processes. At least one of these two needs to work:

    json.dumps({}).encode('utf-16le')    # dumps() returns str
    '{\x00}\x00'
    json.dumps({}, encoding='utf-16le')  # dumps() returns bytes
    '{\x00}\x00'

In 2.6, the first one works. The second incorrectly returns '{}'. Ok, that might be a bug in the JSON implementation - but you shouldn't be using utf-16le, anyway. Use UTF-8 always, and it will work fine. The question is: which of them is more appropriate, if what you want is bytes. I argue that the second form is better, since it saves you an encode invocation. It's not a bug in dumps, it's a matter of not reading the documentation. The encoding parameter of dumps decides how byte strings should be interpreted, not what the output encoding is. The output of json/simplejson dumps for Python 2.x is either an ASCII bytestring (default) or a unicode string (when ensure_ascii=False). This is very practical in 2.x because an ASCII bytestring can be treated as either text or bytes in most situations, isn't going to get mangled over any kind of encoding mismatch (as long as it's an ASCII superset), and skips an encoding step if getting sent over the wire.
    >>> simplejson.dumps(['\x00f\x00o\x00o'], encoding='utf-16be')
    '["foo"]'
    >>> simplejson.dumps(['\x00f\x00o\x00o'], encoding='utf-16be', ensure_ascii=False)
    u'["foo"]'

-bob
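The Python 3 analogue of the default behavior described above: with ensure_ascii=True (the default), dumps() escapes every non-ASCII code point, so the result survives any ASCII-superset encoding unchanged; ensure_ascii=False keeps the raw code points, which is more compact as UTF-8. A small sketch:

```python
import json

# Default: every non-ASCII character is \uXXXX-escaped.
s = json.dumps(["caf\u00e9"])
assert s == '["caf\\u00e9"]'
# Pure ASCII output is encoding-proof: any ASCII superset yields the
# same bytes.
assert s.encode("ascii") == s.encode("utf-8")

# ensure_ascii=False keeps the raw code point (shorter on the wire
# when encoded as UTF-8, but now encoding matters).
compact = json.dumps(["caf\u00e9"], ensure_ascii=False)
assert compact == '["caf\u00e9"]'
```

This mirrors the 2.x trade-off Bob describes, minus the str/unicode ambiguity: both outputs are str, and the choice is purely about escaping.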
Re: [Python-Dev] Dropping bytes support in json
On Fri, Apr 10, 2009 at 8:38 AM, Stephen J. Turnbull step...@xemacs.org wrote: Paul Moore writes: On the other hand, further down in the document: 3. Encoding: JSON text SHALL be encoded in Unicode. The default encoding is UTF-8. Since the first two characters of a JSON text will always be ASCII characters [RFC0020], it is possible to determine whether an octet stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking at the pattern of nulls in the first four octets. This is at best confused (in my utterly non-expert opinion :-)) as Unicode isn't an encoding... The word encoding (by itself) does not have a standard definition AFAIK. However, since Unicode *is* a coded character set (plus a bunch of hairy usage rules), there's nothing wrong with saying text is encoded in Unicode. The RFC 2130 and Unicode TR#17 taxonomies are annoyingly verbose and pedantic to say the least. So what is being said there (in UTR#17 terminology) is (1) JSON is *text*, that is, a sequence of characters. (2) The abstract repertoire and coded character set are defined by the Unicode standard. (3) The default transfer encoding syntax is UTF-8. That implies that loads can/should also allow bytes as input, applying the given algorithm to guess an encoding. It's not a guess, unless the data stream is corrupt---or nonconforming. But it should not be the JSON package's responsibility to deal with corruption or non-conformance (e.g., ISO-8859-15-encoded programs). That's the whole point of specifying the coded character set in the standard in the first place. I think it's a bad idea for any of the core JSON API to accept or produce bytes in any language that provides a Unicode string type. That doesn't mean Python's module shouldn't provide convenience functions to read and write JSON serialized as UTF-8 (in fact, that *should* be done, IMO) and/or other UTFs (I'm not so happy about that).
But those who write programs using them should not report bugs until they've checked out and eliminated the possibility of an encoding screwup! The current implementation doesn't do any encoding guesswork and I have no intention to allow that as a feature. The input must be unicode, UTF-8 bytes, or an encoding must be specified. Personally most of my experience with JSON is as a wire protocol and thus bytes, so the obvious function to encode json should do that. There probably should be another function to get unicode output, but nobody has ever asked for that in the Python 2.x version. They either want the default behavior (encoding as ASCII str which can be used as unicode due to implementation details of Python 2.x) or encoding as a more compact UTF-8 str (without escaping non-ASCII code points). Perhaps Python 3 users would ask for unicode output when decoding though. -bob
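The null-byte detection Stephen quotes can be written down in a few lines. This is a sketch of the RFC 4627 heuristic only, not something the json module does (and note that RFC 8259 later made UTF-8 mandatory for interchange, dropping the trick entirely):

```python
def detect_json_encoding(data: bytes) -> str:
    """RFC 4627 heuristic: the first two characters of a JSON text are
    ASCII, so the NUL pattern in the first octets names the UTF."""
    if len(data) < 4:
        return "utf-8"
    a, b, c = data[:3]
    if a == 0 and b == 0:
        return "utf-32-be"   # 00 00 00 xx
    if a == 0:
        return "utf-16-be"   # 00 xx 00 xx
    if b == 0 and c == 0:
        return "utf-32-le"   # xx 00 00 00
    if b == 0:
        return "utf-16-le"   # xx 00 xx 00
    return "utf-8"

assert detect_json_encoding('{"a": 1}'.encode("utf-16-le")) == "utf-16-le"
assert detect_json_encoding('{"a": 1}'.encode("utf-32-be")) == "utf-32-be"
assert detect_json_encoding(b'{"a": 1}') == "utf-8"
```

As the thread notes, this only works on conforming input; it says nothing sensible about, say, Latin-1 bytes with non-ASCII in them.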
Re: [Python-Dev] Dropping bytes support in json
On Thu, Apr 9, 2009 at 1:05 PM, Martin v. Löwis mar...@v.loewis.de wrote: I can understand that you don't want to spend much time on it. How about removing it from 3.1? We could re-add it when long-term support becomes more likely. I'm speechless. It seems that my statement has surprised you, so let me explain: I think we should refrain from making design decisions (such as API decisions) without Bob's explicit consent, unless we assign a new maintainer for the simplejson module (perhaps just for the 3k branch, which perhaps would be a fork from Bob's code). Antoine suggests that Bob did not comment on the issues at hand, therefore, we should not proceed with the proposed design. Since the 3.1 release is only a few weeks ahead, we have the choice of either shipping with the broken version that is currently in the 3k branch, or drop the module from the 3k branch. I believe our users are better served by not having to waste time with a module that doesn't quite work, or may change. Most of my time to spend on json/simplejson and these mailing list discussions is on weekends, I try not to bother with it when I'm busy doing Actual Work unless there is a bug or some other issue that needs more immediate attention. I also wasn't aware that I was expected to comment on those issues. I'm CC'ed on the discussion for issue4136 but I don't see any unanswered questions directed at me. I have the issues (issue5723, issue4136) starred in my gmail and I planned to look at it more closely later, hopefully on Friday or Saturday. As far as Python 3 goes, I honestly have not yet familiarized myself with the changes to the IO infrastructure and what the new idioms are. At this time, I can't make any educated decisions with regard to how it should be done because I don't know exactly how bytes are supposed to work and what the common idioms are for other libraries in the stdlib that do similar things. 
Until I figure that out, someone else is better off making decisions about the Python 3 version. My guess is that it should work the same way as it does in Python 2.x: take bytes or unicode input in loads (which means encoding is still relevant). I also think the output of dumps should be bytes, since it is a serialization, but I am not sure how other libraries do this in Python 3 because one could argue that it is also text. If other libraries that do text/text encodings (e.g. binascii, mimelib, ...) use str for input and output instead of bytes then maybe Antoine's changes are the right solution and I just don't know better because I'm not up to speed with how people write Python 3 code. I'll do my best to find some time to look into Python 3 more closely soon, but thus far I have not been very motivated to do so because Python 3 isn't useful for us at work and twiddling syntax isn't a very interesting problem for me to solve. -bob
Re: [Python-Dev] json decoder speedups, any time left for 2.6?
simplejson 2.0.0 is now released which is about as optimized as I can be bothered to make it. It's about 4x faster than cPickle for encoding and just a little slower at decoding, which is good enough for now ;) The pure Python source is much uglier now (to avoid global lookups, etc.), but also several times faster than it was. http://pypi.python.org/pypi/simplejson One of the optimizations I made probably isn't good for Py3k, it will return ASCII strings as str objects instead of converting to unicode, but that shouldn't be too much work to port (though I haven't looked at the current _json.c port for Py3k). I also converted over to using Sphinx documentation, which was nice because I was able to just re-use the docs that were already in Python trunk after changing the module name around. All of the work should be easy to merge back into trunk so I'll try and take care of that quickly after Python 2.6 is released. On Wed, Sep 24, 2008 at 9:02 AM, Bob Ippolito [EMAIL PROTECTED] wrote: http://pypi.python.org/pypi/simplejson The _speedups module is optional. On Wed, Sep 24, 2008 at 8:42 AM, Alex Martelli [EMAIL PROTECTED] wrote: Meanwhile, can you please release (wherever you normally release things;-) the pure-Python version as well? I'd like to play around with it in Google App Engine opensource sandboxes (e.g., cfr. gae-json-rest -- I'll be delighted to add you to that project if you want of course;-) and that requires Python 2.5 and only pure-Python add-ons... thanks! Alex On Wed, Sep 24, 2008 at 8:08 AM, Bob Ippolito [EMAIL PROTECTED] wrote: On Wed, Sep 24, 2008 at 6:14 AM, Barry Warsaw [EMAIL PROTECTED] wrote: On Sep 24, 2008, at 5:47 AM, Nick Coghlan wrote: Bob Ippolito wrote: How much time do I have left to get this into Python 2.6? Zero I'm afraid - with rc1 out, it's release blocker bugs only.
Anything which can be deferred to the 2.6.1 release without causing any major harm is definitely out - and while a 2x speedup is very nice, it isn't something to be squeezing in post-rc1. Still, that should make for a nice incremental improvement when 2.6.1 rolls around. I concur. Ok, no problem. The speedup is about 3x now on the trunk ;) I think that further optimization will require some more C hacking, but 2.6.1 should give me plenty of time to get around to some of that. -bob
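One of the "avoid global lookups" tricks Bob mentions, in miniature: binding builtins to default arguments turns each builtin lookup in a hot loop into a plain local-variable load, which is cheaper in CPython. This is an illustrative sketch of the technique, not simplejson's actual code:

```python
def count_leading_digits(s, _len=len, _ord=ord):
    # _len and _ord are resolved once, at def time; inside the loop
    # they are local loads instead of global/builtin dict lookups.
    i, n = 0, _len(s)
    while i < n and 48 <= _ord(s[i]) <= 57:   # '0'..'9'
        i += 1
    return i

assert count_leading_digits("1234abc") == 4
assert count_leading_digits("abc") == 0
```

The readability cost is exactly why Bob describes the optimized pure-Python source as "much uglier now": the trick litters every hot function's signature with private default arguments.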
Re: [Python-Dev] json decoder speedups, any time left for 2.6?
On Wed, Sep 24, 2008 at 6:14 AM, Barry Warsaw [EMAIL PROTECTED] wrote: On Sep 24, 2008, at 5:47 AM, Nick Coghlan wrote: Bob Ippolito wrote: How much time do I have left to get this into Python 2.6? Zero I'm afraid - with rc1 out, it's release blocker bugs only. Anything which can be deferred to the 2.6.1 release without causing any major harm is definitely out - and while a 2x speedup is very nice, it isn't something to be squeezing in post-rc1. Still, that should make for a nice incremental improvement when 2.6.1 rolls around. I concur. Ok, no problem. The speedup is about 3x now on the trunk ;) I think that further optimization will require some more C hacking, but 2.6.1 should give me plenty of time to get around to some of that. -bob
Re: [Python-Dev] json decoder speedups, any time left for 2.6?
http://pypi.python.org/pypi/simplejson The _speedups module is optional. On Wed, Sep 24, 2008 at 8:42 AM, Alex Martelli [EMAIL PROTECTED] wrote: Meanwhile, can you please release (wherever you normally release things;-) the pure-Python version as well? I'd like to play around with it in Google App Engine opensource sandboxes (e.g., cfr. gae-json-rest -- I'll be delighted to add you to that project if you want of course;-) and that requires Python 2.5 and only pure-Python add-ons... thanks! Alex On Wed, Sep 24, 2008 at 8:08 AM, Bob Ippolito [EMAIL PROTECTED] wrote: On Wed, Sep 24, 2008 at 6:14 AM, Barry Warsaw [EMAIL PROTECTED] wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On Sep 24, 2008, at 5:47 AM, Nick Coghlan wrote: Bob Ippolito wrote: How much time do I have left to get this into Python 2.6? Zero I'm afraid - with rc1 out, it's release blocker bugs only. Anything which can be deferred to the 2.6.1 release without causing any major harm is definitely out - and while a 2x speedup is very nice, it isn't something to be squeezing in post-rc1. Still, that should make for a nice incremental improvement when 2.6.1 rolls around. I concur. Ok, no problem. The speedup is about 3x now on the trunk ;) I think that further optimization will require some more C hacking, but 2.6.1 should give me plenty of time to get around to some of that. -bob ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/aleaxit%40gmail.com ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] json decoder speedups, any time left for 2.6?
I'm out of town this week for a conference (ICFP/CUFP in Victoria) and my hotel's connection has been bad enough that I can't get any Real Work done, so I've managed to hammer on the json library's decoding quite a bit instead. I just released simplejson 1.9.3 which improves decoding performance by about 2x, and I've got some more changes along the way in trunk for 1.9.4 that will increase it even further (over 3x my original 1.9.2 benchmark perf). How much time do I have left to get this into Python 2.6? FWIW the changes are all on the Python side, no C code has been harmed (yet). The test suite still passes of course. -bob
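Decoder speedups like the 2x and 3x figures above are typically measured with timeit over a representative document; a rough sketch of such a micro-benchmark (the document shape and sizes here are made up for illustration, not Bob's actual benchmark):

```python
import json
import timeit

# Build a representative document: a list of small objects.
doc = json.dumps([{"id": i, "name": "item%d" % i, "values": [i, i * 2, i * 3]}
                  for i in range(1000)])

# Time decoding; repeat and take the best run to reduce scheduling noise.
best = min(timeit.repeat(lambda: json.loads(doc), number=20, repeat=3))
print("best of 3: %.4f s for 20 decodes" % best)
```

Comparing the same loop against two library versions (or against cPickle, as in the 2.0.0 announcement) gives the kind of ratio quoted in the thread.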
Re: [Python-Dev] Should we help pythonmac.org?
The major difference between the packages on macports and pythonmac.org is that macports is their own distro of nearly everything, akin to installing a copy of FreeBSD on top of Mac OS X. pythonmac.org contains packages that are self-contained and don't have a whole new set of libraries to install (in the cases where they do require libraries, they link them in statically for the most part). These days I don't have a lot of preference, I don't use either :) On Mon, Aug 18, 2008 at 1:08 PM, Guido van Rossum [EMAIL PROTECTED] wrote: Alternatively, I just got mail from Bob Ippolito indicating that he'd be happy to hand over the domain to the PSF. It's got quite a bit more on it than Python distros, and it's a fairly popular resource for Mac users I imagine. However macports.org seems to have more Python stuff, and has a more recent version of 2.5 (2.5.2). Perhaps we should link to macports.org instead? On Mon, Aug 18, 2008 at 9:54 AM, Barry Warsaw [EMAIL PROTECTED] wrote: On Aug 18, 2008, at 12:05 PM, Guido van Rossum wrote: Does anyone have connections with the owners of pythonmac.org? Apparently they are serving up an ancient version of Python 2.5. The Google App Engine has a minor issue in 2.5 that's solved in 2.5.1, but that is apparently not available from that site. Perhaps we can contribute more recent Mac versions, or provide them directly on python.org? (The Downloads - Macintosh page points to pythonmac.org, which means lots of App Engine users download this old version.) I'd be happy to arrange things with a Mac expert to put the Mac binaries on the download site. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] [Pydotorg] Should we help pythonmac.org?
On Mon, Aug 18, 2008 at 3:41 PM, Barry Warsaw [EMAIL PROTECTED] wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On Aug 18, 2008, at 6:13 PM, Fred Drake wrote: On Aug 18, 2008, at 5:42 PM, Steve Holden wrote: Someone told me the other day that macports made for difficult installs, but not being a Mac user I wasn't in a position to evaluate the advice. Not being a Mac user either, I've been using Mac OS X for about a year now for most of my development. I've got mixed feelings about macports: It's painful to use, compared to things like rpm and apt, but... it might be the best that's available for the Mac. I'm not going to trust it to give me a usable Python, though, in spite of not having had problems with Pythons it provides. Just 'cause I've gotten paranoid. I use macports too, mostly for stuff I'm too lazy to build from source. I'm sure there's a Python in there, but like Fred, I don't use it. I do agree that we could and probably should maintain any Mac Python content on the main python.org site, but also if Bob wants to donate the domain, we can just have it forward to www.python.org/allyourmacsarebelongtous We already do that for the wiki, we could do that for the other parts of the site just as easily (even without or before a transfer of ownership) :) I'm happy to pay for the domain and hosting, I just don't have a lot of spare cycles these days unless I need something at work. -bob ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] The docs, reloaded
On 5/21/07, Martin v. Löwis [EMAIL PROTECTED] wrote: I think the people who have responded to my comment read too much into it. Nowhere do I think I asked Georg to write an equation typesetter to include in the Python documentation toolchain. I asked that math capability be considered. I have no idea what tools he used to build his new documentation set. I only briefly glanced at a couple of the output pages. I think what he has done is marvelous. However, I don't think the door should be shut on equation display. Is there a route to it based on the tools Georg is using? I don't think anything in the world can replace TeX for math typesetting. So if math typesetting was a requirement (which it should not be, for that very reason), then we could not consider anything but TeX. You can use docutils to generate LaTeX output from reST, and you can put raw LaTeX into the output with .. raw:: latex. I would imagine this is sufficient for now. -bob ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Adding socket timeout to urllib2
On 3/6/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Guido> Since idle timeout is not a commonly understood term it would Guido> be even better if it was explained without using it. I think it's commonly understood, but it doesn't mean what the socket timeout is used for. It's how long a connection can be idle (the client doesn't make a request of the server) before the server will close the connection. What does idle timeout have to do with urllib2 or any IO layer for that matter? I've only seen it as a very high level server-only feature... -bob
Re: [Python-Dev] Twisted Isn't Specific (was Re: Trial balloon: microthreads library in stdlib)
On 2/14/07, Greg Ewing [EMAIL PROTECTED] wrote: Thomas Wouters wrote: *I* don't like the idea of something in the Python installation deciding which reactor to use. I wouldn't mind if some way were provided of changing the reactor if you want. I'd just like to see a long term goal of making it unnecessary as far as possible. In any case, your idea requires a lot of changes in external, non-Python code -- PyGTK simply exposes the GTK mainloop, which couldn't care less about Python's idea of a perfect event reactor model. On unix at least, I don't think it should be necessary to change gtk, only pygtk. If it can find out the file descriptor of the connection to the X server, it can plug that into the reactor, and then call gtk_main_iteration_do() whenever something comes in on it. A similar strategy ought to work for any X11-based toolkit that exposes a function to perform one iteration of its main loop. Mileage on other platforms may vary. The PerfectReactor can be added later, all current reactors aliased to it, and no one would have to change a single line of code. Sure. The other side to all this is the client side, i.e. the code that installs event callbacks. At the moment there's no clear direction to take, so everyone makes their own choice -- some use asyncore, some use Twisted, some use the gtk event loop, some roll their own, etc. There is no single PerfectReactor. There are several use cases where you need to wait on more than one event system, which guarantees at least two OS threads (and two event loops). In general it's nice to have a single Python event loop (the reactor) to act on said threads (e.g. something just sitting on a mutex waiting for messages) but waiting for IO to occur should *probably* happen on one or more ancillary threads -- one per event system (e.g. select, GTK, WaitForMultipleEvents, etc.)
-bob
Re: [Python-Dev] Twisted Isn't Specific (was Re: Trial balloon: microthreads library in stdlib)
On 2/15/07, Baptiste Carvello [EMAIL PROTECTED] wrote: Ah, threads :-( It turns out that you need to invoke GetMessage in the context of the thread in which the window was created. In a different thread, you won't get any messages. I'd be interested to hear about other situations where threading would cause a problem. My suspicion is that Windows is the hard one, and as I've shown that one is solvable. I've tried something similar on Linux, with gtk and wx. You can run the gtk main loop in its own thread, but because gtk is not thread safe, you have to grab a mutex every time you run gtk code outside the thread the mainloop is running in. So you have to surround your calls to the gtk api with calls to gtk.threads_enter and gtk.threads_leave. Except for callbacks of course, because they are executed in the main thread... Doable, but not fun. The same goes for wx. Then all hell breaks loose when you try to use both gtk and wx at the same time. That's because on Linux, the wx main loop calls the gtk mainloop behind the scenes. As far as I know, that problem cannot be solved from python. So yes that strategy can work, but it's no silver bullet. And it's worse on Windows and Mac OS X where some GUI API calls *must* happen on a particular thread or they simply don't work. -bob
Re: [Python-Dev] multiple interpreters and extension modules
On 12/23/06, Jeremy Kloth [EMAIL PROTECTED] wrote: On Friday 22 December 2006 5:02 pm, Josiah Carlson wrote: Jeremy Kloth [EMAIL PROTECTED] wrote: [[ This may be somewhat c.l.p.-ish but I feel that this crossed into CPython development enough to merit posting here ]] I have received a bug report for 4Suite that involves a PyObject_IsInstance check failing for what appears to be the correct class, that is, the class names match. With some investigating, I have found that the true problem is with multiple interpreters. The reason for this is that each sub-interpreter has a new copy of any pure Python module. The following issues are also true for modules that have been reloaded, but I think that is common knowledge. I mention it only for completeness. If I remember correctly, Python allows you to use multiple interpreters in the same process, but it makes no guarantees as to their correctness when running. See this post for further discussion on the issue: http://mail.python.org/pipermail/python-list/2004-January/244343.html You can also search for 'multiple python interpreters single process' in google without quotes to hear people lament over the (generally broken) multiple Python interpreter support. The problem here is that it is mod_python using the multiple interpreters. We have no control over that. What I'm proposing is fixing the extension module support for multiple interpreters with the bonus of adding extension module finalization which I've seen brought up here before. Fixing this does require support by the extension module author, but if that author doesn't feel the need to work in mod_python (if, of course, they load module level constants), that is their loss. Is 4Suite that different in its use of hybrid Python and C extensions? There is lots of back and forth between the two layers and performance is critical. I really don't feel like recoding thousands of lines of Python code into C just to get 4Suite to work in mod_python without error. 
It's a whole lot more practical to just stop using mod_python and go for one of the other ways of exposing Python code to the internet. I bet you can get the same or better performance out of another solution anyway, and you'd save deployment headaches. -bob
Re: [Python-Dev] multiple interpreters and extension modules
On 12/23/06, Jeremy Kloth [EMAIL PROTECTED] wrote: On Friday 22 December 2006 7:54 pm, Bob Ippolito wrote: It's a whole lot more practical to just stop using mod_python and go for one of the other ways of exposing Python code to the internet. I bet you can get the same or better performance out of another solution anyway, and you'd save deployment headaches. I have no control over end-users choice of Python/webserver integration, I just end up making it possible to run our software in the environment of *their* choice. If it is the opinion that it is mod_python that is broken, I'd gladly point the users to the location stating that fact/belief. It would make my life easier. Well, it clearly is broken wrt pure python modules and objects that persist across requests. I believe that it's also broken with any extension that uses the PyGILState API due to the way it interacts with multiple interpreters. I stopped using mod_python years ago due to the sorts of issues that you're bringing up here (plus problems compiling, deploying, RAM bloat, etc.). I don't have any recent experience or references that I can point you to, but I can definitely say that I have had many good experiences with the WSGI based solutions (and Twisted, but that's a different game). I would at least advise your user that there are several perfectly good ways to make Python speak HTTP, and mod_python is the only one with this issue. -bob ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Non-blocking (asynchronous) timer without thread?
On 12/23/06, Evgeniy Khramtsov [EMAIL PROTECTED] wrote: Mike Klaas writes: I'm not sure how having python execute code at an arbitrary time would _reduce_ race conditions and/or deadlocks. And if you want to make it safe by executing code that shares no variables or resources, then it is no less safe to use threads, due to the GIL. Ok. And what about a huge thread overhead? Just try to start 10-50k threading timers :) If you can write your application in an event-driven way, Twisted might be able to do what you are looking for. I don't like the idea of Twisted: you want the banana, but get the whole gorilla as well :) Well, you simply can't do what you propose without writing code in the style of Twisted or with interpreter modifications or evil stack slicing such as with stackless or greenlet. If you aren't willing to choose any of those then you'll have to live without that functionality or use another language (though I can't think of any usable ones that actually safely do what you're asking). It should be relatively efficient to do what you want with a thread pool (one thread that manages all of the timers, and worker threads to execute the timer callbacks). FWIW, Erlang doesn't have that functionality. You can wait on messages with a timeout, but there are no interrupts. You do have cheap and isolated processes instead of expensive shared-state threads, though. Writing Erlang/OTP code is actually a lot closer to writing Twisted-style code than it is to other styles of concurrency (that you'd find in Python). It's just that Erlang/OTP has better support for concurrency-oriented programming than Python does (across the board; syntax, interpreter, convention and libraries). -bob
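Bob's thread-pool suggestion — one thread owning all the timers — can be sketched with a heap of deadlines and a condition variable. This is an illustrative toy, not production code, and for brevity it runs callbacks on the manager thread instead of handing them to worker threads:

```python
import heapq
import threading
import time

class TimerService:
    """One thread owns every timer (a heap of deadlines), so 10-50k
    pending timers cost one thread plus heap entries, not 50k threads."""

    def __init__(self):
        self._heap = []                      # (deadline, seq, callback)
        self._seq = 0                        # tie-breaker for equal deadlines
        self._cond = threading.Condition()
        threading.Thread(target=self._run, daemon=True).start()

    def call_later(self, delay, callback):
        with self._cond:
            heapq.heappush(self._heap,
                           (time.monotonic() + delay, self._seq, callback))
            self._seq += 1
            self._cond.notify()              # wake the manager to re-check

    def _run(self):
        while True:
            with self._cond:
                while not self._heap:
                    self._cond.wait()        # nothing scheduled; sleep
                deadline, _, callback = self._heap[0]
                now = time.monotonic()
                if now < deadline:
                    # Sleep until the nearest deadline, or until a new
                    # (possibly earlier) timer arrives and notifies us.
                    self._cond.wait(deadline - now)
                    continue
                heapq.heappop(self._heap)
            callback()                       # run outside the lock
```

Usage is simply `svc = TimerService(); svc.call_later(0.5, func)`; the stdlib `sched` module implements the same heap-of-deadlines idea.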
Re: [Python-Dev] infinities
On 11/26/06, tomer filiba [EMAIL PROTECTED] wrote: i found several places in my code where i use positive infinity (posinf) for various things, i.e., def readline(self, limit = -1): if limit < 0: limit = 1e10000 # posinf chars = [] while limit > 0: ch = self.read(1) chars.append(ch) if not ch or ch == "\n": break limit -= 1 return "".join(chars) i like the concept, but i hate the 1e10000 stuff... why not add posinf, neginf, and nan to the float type? i find it much more readable as: if limit < 0: limit = float.posinf posinf, neginf and nan are singletons, so there's no problem with adding them as members to the type. sys.maxint makes more sense there. Or you could change it to while limit != 0 and set it to -1 (though I probably wouldn't actually do that)... There is already a PEP 754 for float constants, which is implemented in the fpconst module (see CheeseShop). It's not (yet) part of the stdlib though. -bob
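For the record, PEP 754's fpconst module never did land in the stdlib; what arrived instead was `float('inf')` (guaranteed to parse on all platforms since Python 2.6) and, later, `math.inf`/`math.nan` (3.5+). A quick sketch of the modern spelling of tomer's sentinel:

```python
import math

# Modern spellings of the constants the thread wanted on the float type.
posinf = float('inf')       # math.inf on 3.5+
neginf = float('-inf')
nan = float('nan')          # math.nan on 3.5+

assert math.isinf(posinf) and posinf > 0
assert math.isinf(neginf) and neginf < 0
assert math.isnan(nan)
assert nan != nan           # NaN compares unequal to everything, itself included

# The readline() example's sentinel becomes simply:
limit = float('inf')
assert 10 < limit           # any finite count stays below the limit
```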
Re: [Python-Dev] Problem building module against Mac Python 2.4 and Python 2.5
On 10/15/06, Barry Scott [EMAIL PROTECTED] wrote: This may be down to my lack of knowledge of Mac OS X development. I want to build my python extension for Python 2.3, 2.4 and 2.5 on the same Mac. Building against Python 2.3 and Python 2.4 has been working well for a long time. But after I installed Python 2.5 it seems that I can no longer link against Python 2.4 without changing the symlink /Library/Frameworks/Python.framework/Versions/Current to point at the one I want to build against. The problem did not arise with Python 2.3 and Python 2.4 because Python 2.3 is in /System/Library and Python 2.4 is in /Library. Telling ld which framework folder to look in allows both to be linked against. Is there a way to force ld to use a particular version of the python framework or do I have to change the symlink each time I build against a different version? This type of problem does not happen on Windows or Unix by design. Use an absolute path to the library rather than -framework. Or use distutils! -bob
Re: [Python-Dev] 2.3.6 for the unicode buffer overrun
On 10/13/06, Anthony Baxter [EMAIL PROTECTED] wrote: On Friday 13 October 2006 16:59, Fredrik Lundh wrote: yeah, but *you* are doing it. if the server did that, Martin and other trusted contributors could upload the files as soon as they're available, instead of first transferring them to you, and then waiting for you to find yet another precious time slot to spend on this release. Sure - I get that. There's a couple of reasons for me doing it. First is gpg signing the release files, which has to happen on my local machine. There's also the variation in who actually builds the releases; at least one of the Mac builds was done by Bob I. But there could be ways around this. I don't want to have to ensure every builder has scp, and I'd also prefer for it to all go live at once. A while back, the Mac installer would follow up some time after the Windows and source builds. Every release, I'd get emails saying where's the mac build?! With most consumer connections it's a lot faster to download than to upload. Perhaps it would save you a few minutes if the contributors uploaded directly to the destination (or to some other fast server) and you could download and sign it, rather than having to scp it back up somewhere from your home connection. To be fair, (thanks to Ronald) the Mac build is entirely automated by a script with the caveat that you should be a little careful about what your environment looks like (e.g. don't install fink or macports, or to move them out of the way when building). It downloads all of the third party dependencies, builds them with some special flags to make it universal, builds Python, and then wraps it up in an installer package. Given any Mac OS X 10.4 machine, the builds could happen automatically. Apple could probably provide one if someone asked. They did it for Twisted. Or maybe the Twisted folks could appropriate part of that machine's time to also build Python. 
-bob
Re: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as .join(x) idiom
On 10/6/06, Fredrik Lundh [EMAIL PROTECTED] wrote: Ron Adam wrote: I think what may be missing is a larger set of higher level string functions that will work with lists of strings directly. Then lists of strings can be thought of as a mutable string type by its use, and then working with substrings in lists and using ''.join() will not seem as out of place. as important is the observation that you don't necessarily have to join string lists; if the data ends up being sent over a wire or written to disk, you might as well skip the join step, and work directly from the list. (it's no accident that ET has grown tostringlist and fromstringlist functions, for example ;-) The "just make lists" paradigm is used by Erlang too, it's called iolist there (it's not a type, just a convention). The lists can be nested though, so concatenating chunks of data for IO is always a constant time operation even if the chunks are already iolists. -bob
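The skip-the-join idea is directly usable with file-like objects: `writelines()` accepts any iterable of strings, so a flat list needs no join step. Note, though, that it does not flatten nesting the way an Erlang iolist would — a small sketch:

```python
import io

chunks = ["GET /index.html HTTP/1.1\r\n",
          "Host: example.com\r\n",
          "\r\n"]

# writelines() consumes any iterable of strings -- no join step needed.
out = io.StringIO()
out.writelines(chunks)
assert out.getvalue() == "".join(chunks)

# Unlike an Erlang iolist, nesting is NOT flattened automatically:
try:
    io.StringIO().writelines([chunks])    # a list *of lists* of strings
except TypeError:
    pass                                   # each item must itself be a string
```

The same applies to `socket.sendall`-style code: a generator that flattens nested chunk lists has to sit between the data structure and the write call.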
Re: [Python-Dev] Caching float(0.0)
On 9/30/06, Terry Reedy [EMAIL PROTECTED] wrote: Nick Coghlan [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] I suspect the problem would typically stem from floating point values that are read in from a human-readable file rather than being the result of a 'calculation' as such: For such situations, one could create a translation dict for both common float values and for non-numeric missing value indicators. For instance, flotran = {'*': None, '1.0':1.0, '2.0':2.0, '4.0':4.0} The details, of course, depend on the specific case. But of course you have to know that common float values are never cached and that it may cause you problems. Some users may expect them to be because common strings and integers are cached. -bob ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
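Terry's translation-dict workaround can be made concrete. The sketch below also demonstrates the underlying point of the thread — CPython caches small ints but not floats — though the `is`/`is not` checks are a CPython implementation detail, not a language guarantee:

```python
# CPython interns small ints, so parsing "2" repeatedly yields one object;
# floats get no such cache, so each float("2.0") is a fresh object.
a, b = float("2.0"), float("2.0")
assert a == b and a is not b          # equal values, distinct objects (CPython)

# Terry's workaround: intern common parsed values through a translation dict.
flotran = {"*": None, "1.0": 1.0, "2.0": 2.0, "4.0": 4.0}

def parse_field(text, cache=flotran):
    try:
        return cache[text]            # shared object for the common values
    except KeyError:
        return float(text)            # fresh object for everything else

assert parse_field("2.0") is parse_field("2.0")   # cached: one shared 2.0
assert parse_field("3.5") == 3.5                  # uncached values still work
assert parse_field("*") is None                   # missing-value sentinel
```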
Re: [Python-Dev] Tix not included in 2.5 for Windows
On 9/30/06, Scott David Daniels [EMAIL PROTECTED] wrote: Christos Georgiou wrote: Does anyone know why this happens? I can't find any information pointing to this being deliberate. I just upgraded to 2.5 on Windows (after making sure I can build extensions with the freeware VC++ Toolkit 2003) and some of my programs stopped operating. I saw in a French forum that someone else had the same problem, and what they did was to copy the relevant files from a 2.4.3 installation. I did the same, and it seems it works, with only a console message appearing as soon as a root window is created: Also note: the OS X universal seems to include a Tix runtime for the non-Intel processor, but not for the Intel processor. This makes me think there is a build problem. Are you sure about that? What file are you referring to specifically? -bob
Re: [Python-Dev] Tix not included in 2.5 for Windows
On 9/30/06, Scott David Daniels [EMAIL PROTECTED] wrote: Bob Ippolito wrote: On 9/30/06, Scott David Daniels [EMAIL PROTECTED] wrote: Christos Georgiou wrote: Does anyone know why this happens? I can't find any information pointing to this being deliberate. Also note: the OS X universal seems to include a Tix runtime for the non-Intel processor, but not for the Intel processor. This makes me think there is a build problem. Are you sure about that? What file are you referring to specifically? OK, from the 2.5 universal: (hand-typed, I e-mail from another machine) === Using Idle === import Tix Tix.Tk() Traceback (most recent call last): File "<pyshell#8>", line 1, in <module> Tix.Tk() File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-tk/Tix.py", line 210, in __init__ self.tk.eval('package require Tix') TclError: no suitable image found. Did find: /Library/Tcl/Tix8.4/libTix8.4.dylib: mach-o, but wrong architecture. === From the command line === import Tix Tix.Tk() Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-tk/Tix.py", line 210, in __init__ self.tk.eval('package require Tix') _tkinter.TclError: no suitable image found. Did find: /Library/Tcl/Tix8.4/libTix8.4.dylib: mach-o, but wrong architecture. Those files are not distributed with Python. -bob
Re: [Python-Dev] Caching float(0.0)
On 9/29/06, Greg Ewing [EMAIL PROTECTED] wrote: Nick Craig-Wood wrote: Is there any reason why float() shouldn't cache the value of 0.0 since it is by far and away the most common value? 1.0 might be another candidate for cacheing. Although the fact that nobody has complained about this before suggests that it might not be a frequent enough problem to be worth the effort. My guess is that people do have this problem, they just don't know where that memory has gone. I know I don't count objects unless I have a process that's leaking memory or it grows so big that I notice (by swapping or chance). That said, I've never noticed this particular issue.. but I deal with mostly strings. I have had issues with the allocator a few times that I had to work around, but not this sort of issue. -bob ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] weakref enhancements
On 9/28/06, Raymond Hettinger [EMAIL PROTECTED] wrote: [Alex Martelli] I've had use cases for weakrefs to boundmethods (and there IS a Cookbook recipe for them), Weakmethods make some sense (though they raise the question of why bound methods are being kept when the underlying object is no longer in use -- possibly as unintended side-effect of aggressive optimization). There are *definitely* use cases for keeping bound methods around. Contrived example: one_of = set([1,2,3,4]).__contains__ filter(one_of, [2,4,6,8,10]) -bob ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
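The Cookbook-recipe problem Alex alludes to — a plain weakref to a bound method dies immediately, because the method object is created fresh on each attribute access — was eventually addressed in the stdlib by `weakref.WeakMethod` (Python 3.4). A short sketch; the immediate-death and immediate-clearing asserts rely on CPython's refcounting:

```python
import weakref

class Counter:
    def increment(self):
        pass

c = Counter()

# A plain weakref to a bound method dies immediately: the bound-method
# object created by the attribute access is discarded right after the call.
dead = weakref.ref(c.increment)
assert dead() is None

# weakref.WeakMethod re-binds on demand, so the reference stays usable
# exactly as long as the underlying object is alive.
wm = weakref.WeakMethod(c.increment)
assert wm()() is None       # still callable while c is alive

del c                        # drop the only strong reference
assert wm() is None          # cleared once the object goes away (CPython)
```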
Re: [Python-Dev] weakref enhancements
On 9/28/06, Raymond Hettinger [EMAIL PROTECTED] wrote: There are *definitely* use cases for keeping bound methods around. Contrived example: one_of = set([1,2,3,4]).__contains__ filter(one_of, [2,4,6,8,10]) ISTM, the example shows the (undisputed) utility of regular bound methods. How does it show the need for methods bound weakly to the underlying object, where the underlying can be deleted while the bound method persists, alive but unusable? It doesn't. I seem to have misinterpreted your Weakmethods have some use (...) sentence. Sorry for the noise. -bob ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Suggestion for a new built-in - flatten
On 9/22/06, Josiah Carlson [EMAIL PROTECTED] wrote: Michael Foord [EMAIL PROTECTED] wrote: Hello all, I have a suggestion for a new Python built in function: 'flatten'. This has been brought up many times. I'm -1 on its inclusion, if only because it's a fairly simple 9-line function (at least the trivial version I came up with), and not all X-line functions should be in the standard library. Also, while I have had need for such a function in the past, I have found that I haven't needed it in a few years. I think instead of adding a flatten function perhaps we should think about adding something like Erlang's iolist support. The idea is that methods like writelines should be able to take nested iterators and consume any object they find that implements the buffer protocol. -bob ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
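The "fairly simple 9-line function" Josiah mentions looks roughly like this; one hedged sketch (treating strings and bytes as scalars, which is the choice most flatten debates hinge on):

```python
def flatten(iterable, scalar_types=(str, bytes)):
    """Recursively yield the leaves of a nested iterable.  Strings and
    bytes are treated as scalars so they aren't exploded into characters."""
    for item in iterable:
        if isinstance(item, scalar_types) or not hasattr(item, '__iter__'):
            yield item
        else:
            for leaf in flatten(item, scalar_types):
                yield leaf

# list(flatten([1, [2, [3, 4]], "ab"]))  ->  [1, 2, 3, 4, "ab"]
```

Being a generator, it flattens lazily; recursion depth is bounded by the nesting depth, not the item count.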
Re: [Python-Dev] Suggestion for a new built-in - flatten
On 9/22/06, Brian Harring [EMAIL PROTECTED] wrote: On Fri, Sep 22, 2006 at 12:05:19PM -0700, Bob Ippolito wrote: On 9/22/06, Josiah Carlson [EMAIL PROTECTED] wrote: Michael Foord [EMAIL PROTECTED] wrote: Hello all, I have a suggestion for a new Python built in function: 'flatten'. This has been brought up many times. I'm -1 on its inclusion, if only because it's a fairly simple 9-line function (at least the trivial version I came up with), and not all X-line functions should be in the standard library. Also, while I have had need for such a function in the past, I have found that I haven't needed it in a few years. I think instead of adding a flatten function perhaps we should think about adding something like Erlang's iolist support. The idea is that methods like writelines should be able to take nested iterators and consume any object they find that implements the buffer protocol. Which is no different than just passing in a generator/iterator that does flattening. Don't much see the point in gumming up the file protocol with this special casing; we'll still have requests for a flattener elsewhere. If flattening was added, it should definitely be a general obj, not a special casing in one method in my opinion. I disagree; the reason for iolist is performance and convenience. The required indirection of having to explicitly call a flattener function removes some optimization potential and makes it less convenient to use. While there certainly should be a general mechanism available to perform the task (easily accessible from C), the user would be better served by not having to explicitly call itertools.iterbuffers every time they want to write recursive iterables of stuff. -bob
Re: [Python-Dev] Suggestion for a new built-in - flatten
On 9/22/06, Josiah Carlson [EMAIL PROTECTED] wrote: Bob Ippolito [EMAIL PROTECTED] wrote: On 9/22/06, Brian Harring [EMAIL PROTECTED] wrote: On Fri, Sep 22, 2006 at 12:05:19PM -0700, Bob Ippolito wrote: I think instead of adding a flatten function perhaps we should think about adding something like Erlang's iolist support. The idea is that methods like writelines should be able to take nested iterators and consume any object they find that implements the buffer protocol. Which is no different than just passing in a generator/iterator that does flattening. Don't much see the point in gumming up the file protocol with this special casing; still will have requests for a flattener elsewhere. If flattening was added, should definitely be a general obj, not a special casing in one method in my opinion. I disagree; the reason for iolist is performance and convenience: the required indirection of having to explicitly call a flattener function removes some optimization potential and makes it less convenient to use. Sorry Bob, but I disagree. In the few times where I've needed to 'write a list of buffers to a file handle', I find that iterating over the buffers is sufficient. And honestly, in all of my time dealing with socket and file IO, I've never needed to write a list of iterators of buffers. Not to say that YAGNI, but I'd like to see an example where 1) it was being used in the wild, and 2) where it would be a measurable speedup. The primary use for this is structured data, mostly file formats, where you can't write the beginning until you have a bunch of information about the entire structure such as the number of items or the count of bytes when serialized. An efficient way to do that is just to build a bunch of nested lists that you can use to calculate the size (iolist_size(...) in Erlang) instead of having to write a visitor that constructs a new flat list or writes to StringIO first.
I suppose in the most common case, for performance reasons, you would want to restrict this to sequences only (as in PySequence_Fast) because iolist_size(...) should be non-destructive (or else it has to flatten into a new list anyway). I've definitely done this before in Python, most recently here: http://svn.red-bean.com/bob/flashticle/trunk/flashticle/ The flatten function in this case is flashticle.util.iter_only, and it's used in flashticle.actions, flashticle.amf, flashticle.flv, flashticle.swf, and flashticle.remoting. -bob
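The pattern Bob describes can be sketched in a few lines: size the nested structure without flattening it, then walk it again to write. The names `iolist_size` and `iolist_write` are borrowed from the Erlang terminology for illustration, not from flashticle; note both functions require sequences (not one-shot iterators), which is exactly the non-destructiveness point above.

```python
import io

def iolist_size(iolist):
    """Total byte count of a nested structure of bytes-like leaves,
    computed without building a flattened copy."""
    if isinstance(iolist, (bytes, bytearray, memoryview)):
        return len(iolist)
    return sum(iolist_size(item) for item in iolist)

def iolist_write(write, iolist):
    """Write the same structure depth-first, leaf by leaf."""
    if isinstance(iolist, (bytes, bytearray, memoryview)):
        write(iolist)
        return
    for item in iolist:
        iolist_write(write, item)

# Hypothetical length-prefixed record: the header needs the payload size
# before any payload bytes are written.
payload = [b'abc', [b'de', [b'f']], b'']
out = io.BytesIO()
out.write(iolist_size(payload).to_bytes(4, 'big'))
iolist_write(out.write, payload)
```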
Re: [Python-Dev] python, lipo and the future?
On 9/17/06, Martin v. Löwis [EMAIL PROTECTED] wrote: Josiah Carlson schrieb: Martin v. Löwis [EMAIL PROTECTED] wrote: Out of curiosity: how do the current universal binaries deal with this issue? If I remember correctly, usually you do two completely independent compile runs (optionally on the same machine with different configure or macro definitions), then use a packager provided by Apple to merge the results for each binary/so to be distributed. Each additional platform would just be a new compile run. Sometimes this is done, but usually people just use CC="cc -arch i386 -arch ppc". Most of the time that Just Works, unless the source depends on autoconf gunk for endianness related issues. It's true that the compiler is invoked twice, however, I very much doubt that configure is run twice. Doing so would cause the Makefile being regenerated, and the build starting from scratch. It would find the object files from the previous run, and either all overwrite them, or leave them in place. The gcc driver on OSX allows to invoke cc1/as two times, and then combines the resulting object files into a single one (not sure whether or not by invoking lipo). That's exactly what it does. The gcc frontend ensures that cc1/as is invoked exactly as many times as there are -arch flags, and the result is lipo'ed together. This also means that you get to see a copy of all warnings and errors for each -arch flag. -bob
Re: [Python-Dev] More tracker demos online
On Aug 5, 2006, at 4:52 AM, Hernan M Foffani wrote: Currently, we have two running tracker demos online: Roundup: http://efod.se/python-tracker/ Jira: http://jira.python.atlassian.com/secure/Dashboard.jspa Is anyone looking at the Google Code Hosting tracker, just for fun? =) ( code.google.com/hosting, although performance seems to be an issue for now) It's proprietary code, isn't it? http://code.google.com/hosting/faq.html#itselfoss (Is that why you said just for fun?) So is Jira... -bob
Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys
On Aug 3, 2006, at 9:34 PM, Josiah Carlson wrote: Bob Ippolito [EMAIL PROTECTED] wrote: On Aug 3, 2006, at 6:51 PM, Greg Ewing wrote: M.-A. Lemburg wrote: Perhaps we ought to add an exception to the dict lookup mechanism and continue to silence UnicodeErrors ?! Seems to me that comparison of unicode and non-unicode strings for equality shouldn't raise exceptions in the first place. Seems like a slightly better idea than having dictionaries suppress exceptions. Still not ideal though because sticking non-ASCII strings that are supposed to be text and unicode in the same data structures is *probably* still an error. If/when 'python -U -c "import test.testall"' runs without unexpected error (I doubt it will happen prior to the all strings are unicode conversion), then I think that we can say that there aren't any use-cases for strings and unicode being in the same dictionary. As an alternate idea, rather than attempting to .decode('ascii') when strings and unicode compare, why not .decode('latin-1')? We lose the unicode decoding error, but the right thing happens (in my opinion) when u'\xa1' and '\xa1' compare. Well, in this case it would cause different behavior if u'\xa1' and '\xa1' compared equal. It'd just be an even more subtle error. -bob
Re: [Python-Dev] Dicts are broken Was: unicode hell/mixing str and unicode as dictionary keys
On Aug 4, 2006, at 12:51 PM, Giovanni Bajo wrote: Paul Colomiets [EMAIL PROTECTED] wrote: Well it's not recommended to mix strings and unicode in the dictionaries, but if we mix for example integer and float we have the same thing. It doesn't raise an exception but still it is not expected behavior for me: d = { 1.0: 10, 2.0: 20 } then if I somewhere later do: d[1] = 100 d[2] = 200 I expect to still have all floats in d.keys(). Maybe this is not the best example. There is a strong difference. Python is moving towards unifying number types in a way (see the true division issue): the idea is that, all in all, the user shouldn't really care what type a number is, as long as he knows it's a number. On the other hand, unicode and str are going to diverge more and more. Well, not really. True division makes int/int return float instead of an int. You really do have to care if you have an int or a float most of the time; they're very different semantically. Unicode and str are eventually going to be the same thing (str would ideally end up becoming a synonym of unicode). The difference being that there will be some other type to contain bytes. -bob
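That is essentially how things played out in Python 3: str is the Unicode text type and bytes is the separate container for binary data, and the two simply compare unequal, so mixed-key dicts are well-defined (if still usually a design smell). A quick illustration on a modern interpreter:

```python
# Python 3: text (str) and binary data (bytes) are distinct types that
# never compare equal, so both can coexist as dict keys without the
# UnicodeDecodeError discussed in this thread.
d = {'m\xe1s': 1, b'm\xe1s': 2}
```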
Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys
On Aug 3, 2006, at 9:51 AM, M.-A. Lemburg wrote: Ralf Schmitt wrote: Ralf Schmitt wrote: Still trying to port our software. Here's another thing I noticed:

d = {}
d[u'm\xe1s'] = 1
d['m\xe1s'] = 1
print d

With python 2.4 I can add those two keys to the dictionary and get:

$ python2.4 t2.py
{u'm\xe1s': 1, 'm\xe1s': 1}

With python 2.5 I get:

$ python2.5 t2.py
Traceback (most recent call last):
  File "t2.py", line 3, in <module>
    d['m\xe1s'] = 1
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 1: ordinal not in range(128)

Is this intended behaviour? I guess this might break lots of programs and the way python 2.4 works looks right to me. I think it should be possible to mix str/unicode keys in dicts and let non-ascii strings compare not-equal to any unicode string. Also this behaviour makes your programs break randomly, that is, it will break when the string you add hashes to the same value that the unicode string has (at least that's what I guess...). This is because Unicode and 8-bit string keys only work in the same way if and only if they are plain ASCII. The reason lies in the hash function used by Unicode: it is crafted to make hash(u) == hash(s) for all ASCII s, such that s == u. For non-ASCII strings, there are no guarantees as to the hash value of the strings or whether they match or not. This has been like that since Unicode was introduced, so it's not new in Python 2.5. What is new is that the exception raised on u == s after hash collision is no longer silently swallowed. -bob
Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys
On Aug 3, 2006, at 6:51 PM, Greg Ewing wrote: M.-A. Lemburg wrote: Perhaps we ought to add an exception to the dict lookup mechanism and continue to silence UnicodeErrors ?! Seems to me that comparison of unicode and non-unicode strings for equality shouldn't raise exceptions in the first place. Seems like a slightly better idea than having dictionaries suppress exceptions. Still not ideal though because sticking non-ASCII strings that are supposed to be text and unicode in the same data structures is *probably* still an error. -bob
Re: [Python-Dev] struct module and coercing floats to integers
On Jul 28, 2006, at 1:35 PM, Bob Ippolito wrote: It seems that the pre-2.5 struct module has some additional undocumented behavior[1] that didn't percolate into the new version: http://python.org/sf/1530559 Python 2.4 and previous will coerce floats to integers when necessary as such without any kind of complaint:

$ python2.4 -c "import struct; print repr(struct.pack('H', 0.))"
'\x00\x00'

Python 2.5 refuses to coerce float to int:

$ python2.5 -c "import struct; print repr(struct.pack('H', 0.))"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/bob/src/python/Lib/struct.py", line 63, in pack
    return o.pack(*args)
TypeError: unsupported operand type(s) for &: 'float' and 'long'

The available options are to:
1. Reinstate the pre-2.5 weirdness
2. Reinstate the pre-2.5 weirdness with a DeprecationWarning
3. Break existing code that relies on undocumented behavior (seems more like a bug than lack of specification)

There's a patch in the tracker for 2. It should get applied when the trunk freeze is over. -bob
[Python-Dev] struct module and coercing floats to integers
It seems that the pre-2.5 struct module has some additional undocumented behavior[1] that didn't percolate into the new version: http://python.org/sf/1530559 Python 2.4 and previous will coerce floats to integers when necessary as such without any kind of complaint:

$ python2.4 -c "import struct; print repr(struct.pack('H', 0.))"
'\x00\x00'

Python 2.5 refuses to coerce float to int:

$ python2.5 -c "import struct; print repr(struct.pack('H', 0.))"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/bob/src/python/Lib/struct.py", line 63, in pack
    return o.pack(*args)
TypeError: unsupported operand type(s) for &: 'float' and 'long'

The available options are to:
1. Reinstate the pre-2.5 weirdness
2. Reinstate the pre-2.5 weirdness with a DeprecationWarning
3. Break existing code that relies on undocumented behavior (seems more like a bug than lack of specification)

Either 2 or 3 seems reasonable to me, with a preference for 3 because none of my code depends on old bugs in the struct module :) As far as precedent goes, the array module *used* to coerce floats silently, but it's had a DeprecationWarning since at least Python 2.3 (but perhaps even earlier). Maybe it's time to promote that warning to an exception for Python 2.5? [1] The pre-2.5 behavior should really be considered a bug; the documentation says "Return a string containing the values v1, v2, ... packed according to the given format. The arguments must match the values required by the format exactly." I wouldn't consider arbitrary floating point numbers to match the value required by an integer format exactly. Floats are not in general interchangeable with integers in Python anyway (e.g. list indexes, etc.). -bob
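For what it's worth, option 3 is essentially where things ended up: modern Python 3 rejects floats in integer format codes outright, raising struct.error rather than silently coercing. A small sketch of today's behavior:

```python
import struct

packed = struct.pack('<H', 7)       # integers pack as documented

try:
    struct.pack('<H', 0.0)          # pre-2.5 silently coerced this float
    float_coerced = True
except struct.error:
    float_coerced = False           # modern Python refuses
```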
Re: [Python-Dev] Release manager pronouncement needed: PEP 302 Fix
On Jul 27, 2006, at 3:52 AM, Georg Brandl wrote: Armin Rigo wrote: Hi Phillip, On Wed, Jul 26, 2006 at 02:40:27PM -0400, Phillip J. Eby wrote: If we don't revert it, there are two ways to fix it. One is to just change PEP 302 so that the behavior is unbroken by definition. :) The other is to actually go ahead and fix it by adding PathImporter and NullImporter types to import.c, along with a factory function on sys.path_hooks to create them. (This would've been the PEP-compliant way to implement the need-for-speed patch.) So, fix by documentation, fix by fixing, or fix by reverting? Which should it be? Fix by changing the definition looks like a bad idea to me. The import logic is already extremely complicated and delicate, any change to it is bound to break *some* code somewhere. Though beta1 and beta2 shipped with this change, nobody reported any bug that could be linked to it. sys.path_importer_cache is quite an internal thing and most code, even import hooks, shouldn't have to deal with it. Anyone trying to emulate what imp.find_module does in a PEP 302 compliant way will need to introspect sys.path_importer_cache. I have some unreleased code based on the PEP 302 spec that does this, and the way it was originally written would have broken in 2.5 if I had tested it there. Just because it's obscure doesn't mean we should go change how things work in a way that's not consistent with the documentation. The documentation should change to match the code or vice versa, though I really don't have any strong feelings one way or the other. -bob
[Python-Dev] JSON implementation in Python 2.6
On Jul 26, 2006, at 3:18 PM, John J Lee wrote: On Wed, 26 Jul 2006, Phillip J. Eby wrote: [...] Actually, I would see more reason to include JSON in the standard library, since it's at least something approaching an internet protocol these days. +1 If there's a consensus on that, my simplejson [1] implementation could migrate to the stdlib for 2.6. The API is modeled after marshal and pickle, the code should be PEP 8 compliant, its test suite has pretty good coverage, it's already used by (at least) TurboGears and Django, and it's the implementation currently endorsed by json.org. The work that would be required would be:
- LaTeX docs (currently reST in docstrings)
- Move the tests around and make them run from the suite rather than via nose
- Possible module rename (jsonlib?)
[1] http://undefined.org/python/#simplejson -bob
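This is in fact what happened: simplejson landed in the standard library as the json module in Python 2.6, keeping the marshal/pickle-style API (dumps/loads for strings, dump/load for file objects). A minimal round trip on any modern Python:

```python
import json  # the stdlib descendant of simplejson (Python 2.6+)

doc = {'module': 'simplejson', 'stdlib': True, 'tests': 3}
text = json.dumps(doc)              # API modeled after marshal/pickle
round_tripped = json.loads(text)
```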
Re: [Python-Dev] User's complaints
On Jul 17, 2006, at 11:25 AM, Armin Rigo wrote: Hi Bob, On Thu, Jul 13, 2006 at 12:58:08AM -0700, Bob Ippolito wrote:

@main
def whatever():
    ...

It would probably need to be called something else, because main is often the name of the main function... Ah, but there is theoretically no name clash here :-)

@main          # <- from the built-ins
def main():    # <- and only then set the global
    ...

Just-making-a-stupid-point-and-not-endorsing-the-feature-ly yours, Of course it *works*, but it's still overriding a built-in... Who knows when assignment to main will become a SyntaxError like None ;) -bob
Re: [Python-Dev] User's complaints
On Jul 13, 2006, at 12:37 AM, Wolfgang Langner wrote: On 7/13/06, Jeroen Ruigrok van der Werven [EMAIL PROTECTED] wrote: Things that struck me as peculiar is the old:

if __name__ == "__main__":
    whatever()

This is so out of tune with the rest of python it becomes a nuisance. It is not beautiful but very useful. In Python 3000 we can replace it with:

@main
def whatever():
    ...

to mark this function as main function if module executed directly. It would probably need to be called something else, because main is often the name of the main function... but you could write such a decorator now if you really wanted to.

def mainfunc(fn):
    if fn.func_globals.get('__name__') == '__main__':
        # ensure the function is in globals
        fn.func_globals[fn.__name__] = fn
        fn()
    return fn

@mainfunc
def main():
    print 'this is in __main__'

-bob
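Bob's decorator is Python 2 (func_globals, print statement). The same trick still works on Python 3, where the attribute is spelled __globals__; this is a sketch, with mainfunc being an illustrative name rather than anything in the stdlib:

```python
calls = []

def mainfunc(fn):
    # Python 3 spelling: fn.func_globals became fn.__globals__
    if fn.__globals__.get('__name__') == '__main__':
        fn.__globals__[fn.__name__] = fn   # ensure the function is in globals
        fn()                               # run it only under direct execution
    return fn

@mainfunc
def main():
    calls.append('ran')
```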
Re: [Python-Dev] User's complaints
On Jul 13, 2006, at 2:02 AM, Greg Ewing wrote: Jeroen Ruigrok van der Werven wrote: - Open classes would be nice. What do you mean by open classes? Python classes already seem pretty open to me, by the standards of other languages! I'm guessing he's talking about being like Ruby or Objective-C where you can add methods to any other class in the runtime. Basically we'd have that if the built-in classes were mutable, but that just really encourages fragile code. The big problem you run into with open classes is that you end up depending on two libraries that have a different idea of what the foo method on string objects should do. Adding open classes would make it easier to develop DSLs, but you'd only be able to reasonably do one per interpreter (unless you mangled the class in a with block or something). -bob
Re: [Python-Dev] User's complaints
On Jul 13, 2006, at 5:02 AM, Jeroen Ruigrok van der Werven wrote: Hi Bob, On 7/13/06, Bob Ippolito [EMAIL PROTECTED] wrote: Adding open classes would make it easier to develop DSLs, but you'd only be able to reasonably do one per interpreter (unless you mangled the class in a with block or something). The person whose 'complaints' I was stating says that DSLs (Domain Specific Languages, for those who, like me, were confused about the acronym) are a big part of what he is after, and one per interpreter is fine by him. He also realises that the application(s) he needs them for might be unusual. He doesn't specifically need the builtin types to be extendable. It's just nice to be able to define a single class in multiple modules. Even C++ allows this to some extent (but not as much as he'd like). He understands the implications of allowing open classes (import vs. no import changes semantics, etc.). Personally, he doesn't care *too* much about newbie safety since he's not a newbie. To quote verbatim: "give me the big guns" :-) And while we're at it, he also stated: [...] add multiple dispatch to your list of improvements for Python. I hope this clarifies it a bit for other people. Well, if this person really weren't a newbie then of course they'd know how to define a metaclass that can be used to extend a (non-built-in) class from another module. They'd probably also know of two or three different implementations of multiple dispatch (or equivalent, such as generic functions) available, and could probably write their own if they had to ;) The only valid complaint, really, is that built-in classes are read-only. I doubt anyone wants to change that. If they want to write things in the style of Ruby, why not just use it? -bob
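A sketch of the metaclass approach Bob alludes to: a metaclass that merges repeated definitions of the same class name, so a (non-built-in) class can be "reopened" from another module. `OpenClass`, `_registry`, and `Point` are hypothetical names for illustration, not an established recipe:

```python
_registry = {}

class OpenClass(type):
    """Re-defining a class of the same name merges the new attributes
    into the original class instead of replacing it."""
    def __new__(mcls, name, bases, namespace):
        if name in _registry:
            cls = _registry[name]
            for key, value in namespace.items():
                if key not in ('__module__', '__qualname__'):
                    setattr(cls, key, value)   # graft new methods onto old class
            return cls
        cls = super().__new__(mcls, name, bases, namespace)
        _registry[name] = cls
        return cls

class Point(metaclass=OpenClass):
    def __init__(self, x, y):
        self.x, self.y = x, y

class Point(metaclass=OpenClass):   # "reopens" the class defined above
    def norm(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5

p = Point(3, 4)
```

The fragility Bob warns about is visible here too: two modules "reopening" the same name can silently clobber each other's methods.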
Re: [Python-Dev] Community buildbots
On Jul 13, 2006, at 1:53 PM, Giovanni Bajo wrote: [EMAIL PROTECTED] wrote: (Aside: IMHO, the sooner we can drop old-style classes entirely, the better. That is one bumpy Python upgrade process that I will be _very_ happy to do. I think python should have a couple more future imports. from __future__ import new_classes and from __future__ import unicode_literals would be really welcome, and would smooth the Py3k migration process. from __future__ import new_classes exists, but the syntax is different: __metaclass__ = type -bob
Re: [Python-Dev] Restricted execution: what's the threat model?
On Jul 12, 2006, at 2:23 PM, Jim Jewett wrote: Ka-Ping Yee writes: A. The interpreter will not crash no matter what Python code it is given to execute. Why? We don't want it to crash the embedding app (which might be another python interpreter), but if the sandboxed interpreter itself crashes, is that so bad? The embedding app should just act as though that interpreter exited, possibly with a status code. When he says crash, I'd have to imagine that he means of the segfault variety. Good luck saving the embedding app after that. C. Python programs running in different interpreters embedded in the same process cannot access each other's Python objects. Note that Brett's assumption of shared extension modules violates this -- but I'm not sure why he needs to assume that. (Because of the init-only-once semantics, I'm not even sure it is a good idea to share them.) Well if you don't share them, you can't have them at all other than in the main trusted interpreter. C extensions can only be safely initialized once and they often cache objects in static variables... lots of C modules aren't even safe to use when combined with multiple interpreters and threads (e.g. PyGILState API), so I guess that perhaps the C API should be refined anyway. -bob
Re: [Python-Dev] Musings on concurrency and scoping (replacing Javascript)
On Jul 7, 2006, at 1:08 PM, Guido van Rossum wrote: On 7/7/06, Ka-Ping Yee [EMAIL PROTECTED] wrote: I've been doing a bunch of Firefox extension programming in Javascript and suddenly a few of the recent topics here came together in my head in a silent kapow of thoughts. This is kind of a side note to the security discussion, but they're all interconnected: network programming, concurrency, lexical scoping, security. Hm... I wonder if this style has become so popular in JS because it's all they have? I find callback-style programming pretty inscrutable pretty soon. You really don't have any choice without continuations or some built-in concurrency primitive. Callbacks are slightly less painful in JavaScript because you can define them in-line instead of naming it first. Client-side web scripting tends to have a callback/continuation-ish concurrency style because it has to deal with network transactions (which can stall for long periods of time) in a user interface that is expected to stay always responsive. The Firefox API is full of listeners/observers, events, and continuation-like things. So one thing to consider is that, when Python is used for these purposes, it may be written in a specialized style. As i write JavaScript in this style i find i use nested functions a lot. When i want to set up a callback that uses variables in the current context, the natural thing to do is to define a new function in the local namespace. And if that function has to also provide a callback, then it has another function nested within it and so on.

function spam() {
    var local_A = do_work();
    do_network_transaction(new function(result_1) {
        var local_B = do_work(result_1);
        do_network_transaction(new function(result_2) {
            do_work(local_A, local_B, result_1, result_2);
            ...
        });
    });
}

How can you ever keep track of when a '}' must be followed by a ';' ? }\n is the same as }; as far as the JavaScript spec goes, you can do either or both.
-bob
Re: [Python-Dev] Musings on concurrency and scoping (replacing Javascript)
On Jul 6, 2006, at 5:04 PM, Ka-Ping Yee wrote: On Thu, 6 Jul 2006, Phillip J. Eby wrote: As much as I'd love to have the nested scope feature, I think it's only right to point out that the above can be rewritten as something like this in Python 2.5:

def spam():
    local_A = do_work()
    result_1 = yield do_network_transaction()
    local_B = do_work(result_1)
    result_2 = yield do_network_transaction()
    do_work(local_A, local_B, result_1, result_2)
    ...

All you need is an appropriate trampoline (possibly just a decorator) that takes the objects yielded by the function, and uses them to set up callbacks that resume the generator with the returned result. Clever! Could you help me understand what goes on in do_network_transaction() when you write it this way? In the Firefox/JavaScript world, the network transaction is fired off in another thread, and when it's done it posts an event back to the JavaScript thread, which triggers the callback. And what happens if you want to supply more than one continuation? In my JavaScript code i'm setting up two continuations per step -- one for success and one for failure, since with a network you never know what might happen. When you have a failure the yield expression raises an exception instead of returning a result. -bob
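A minimal sketch of such a trampoline, handling both continuations per step as Ping asks: success resumes the generator via send(), failure via throw(). The names `trampoline` and `fetch` are hypothetical, and a real version would schedule callbacks on an event loop instead of invoking them synchronously:

```python
results = []

def trampoline(gen_func):
    """Drive a generator that yields callback-style operations.
    Each yielded value must be a callable taking (on_success, on_failure)."""
    def run(*args):
        gen = gen_func(*args)
        def step(value=None, exc=None):
            try:
                # Resume the generator: throw() on failure, send() on success
                # (the first send(None) primes it).
                op = gen.throw(exc) if exc is not None else gen.send(value)
            except StopIteration:
                return
            op(lambda v: step(value=v), lambda e: step(exc=e))
        step()
    return run

def fetch(result):
    # Stand-in for do_network_transaction(): succeeds immediately.
    def op(on_success, on_failure):
        on_success(result)
    return op

@trampoline
def spam():
    result_1 = yield fetch('one')
    result_2 = yield fetch('two')
    results.append((result_1, result_2))

spam()
```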
Re: [Python-Dev] zlib module build failure on Mac OSX 10.4.7
On Jul 1, 2006, at 10:45 AM, Ronald Oussoren wrote: On Jul 1, 2006, at 6:57 PM, [EMAIL PROTECTED] wrote:

Ronald> Are you sure you're building on a 10.4 box? Both the
Ronald> macosx-10.3 thingy and lack of inflateCopy seem to indicate that
Ronald> you're running on 10.3.

Well, yeah, pretty sure. Let's see. The box with the disk says Mac OS X Tiger - Version 10.4 on the spine. The About This Mac popup says 10.4.7. That gets the easy solution out of the way ;-) It used to run 10.3 though. Is there some possibility the update from 10.3 to 10.4 had problems? Note that the compile log on the buildbot 10.4 box also has 10.3 in its directory names. If I remember correctly, it came from Apple with 10.4 installed. /me slaps head. Having 10.3 in the directory names is intentional; the version in the directory name is the value of MACOSX_DEPLOYMENT_TARGET, which is defaulted to 10.3 in the configure script. What I don't understand yet is why your copy of libz doesn't have inflateCopy. What does /usr/lib/libz.dylib point to on your system? On my 10.4 box it is a symlink that points to libz.1.2.3.dylib and there is an older version of libz (libz.1.1.3.dylib) in /usr/lib as well. Maybe Skip didn't upgrade to the latest version of Xcode? Perhaps he's still got an old SDK? -bob
Re: [Python-Dev] doc for new restricted execution design for Python
On Jun 28, 2006, at 10:54 AM, Brett Cannon wrote:On 6/28/06, Trent Mick [EMAIL PROTECTED] wrote: Brett Cannon wrote: Mark (and me a little bit) has been sketching out creating a "Python forMozilla/Firefox" extension for installing an embedded Python into anexisting Firefox installation on the pyxpcom list: http://aspn.activestate.com/ASPN/Mail/Message/pyxpcom/3167613 The idea is that there be a separate Python interpreter per web browser page instance.I think there may be scaling issues there. _javascript_ isn't doing that is it, do you know? As well, that doesn't seem like it would translatewell to sharing execution between separate chrome windows in anon-browser XUL/Mozilla-based app.I don't know how _javascript_ is doing it yet. The critical thing for me for this month was trying to come up with a security model. And if you don't think it is going to scale, how do you think it should be done?Why wouldn't it scale? How much interpreter state is there really anyway?-bob___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] xturtle.py a replacement for turtle.py(!?) ATTENTION PLEASE!
On Jun 28, 2006, at 1:05 PM, Gregor Lingl wrote: Martin v. Löwis schrieb: Collin Winter wrote: While I have no opinion on Gregor's app, and while I fully agree that new language features and stdlib modules should generally stay out of bug-fix point releases, xturtle doesn't seem to rise to that level (and hence, those restrictions). It's a stdlib module, even if no other stdlib modules depend on it; try "import turtle". In the specific case, the problem with adding it to 2.5 is that xturtle is a huge rewrite, so ideally, the code should be reviewed before being added. Given that this is a lot of code, nobody will have the time to perform a serious review. It will be hard enough to find somebody to review it for 2.6 - often, changes of this size take several years to review (primarily because it is so specialized that only few people even consider reviewing it). Sorry Martin, but to me this seems not to be the right way to manage things. We have turtle.py revised in Python 2.5b1. Please try this example (as I just did):

IDLE 1.2b1 No Subprocess
>>> from turtle import *
>>> begin_fill()
>>> circle(100,90)    # observe the turtle
>>> backward(200)
>>> circle(100,90)
>>> color("red")
>>> end_fill()
IDLE internal error in runcode()
Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
    end_fill()
  File "C:\Python25\lib\lib-tk\turtle.py", line 724, in end_fill
    def end_fill(): _getpen.end_fill()
AttributeError: 'function' object has no attribute 'end_fill'

An error occurs, because in line 724 it should read: def end_fill(): _getpen().end_fill() File a patch; this is a bug fix and should definitely be appropriate for inclusion before the release of Python 2.5! -bob
Re: [Python-Dev] doc for new restricted execution design for Python
On Jun 25, 2006, at 1:08 PM, Brett Cannon wrote: On 6/24/06, Bob Ippolito [EMAIL PROTECTED] wrote: On Jun 24, 2006, at 2:46 AM, Nick Coghlan wrote: Brett Cannon wrote: Yep. That API will be used directly in the changes to pymalloc and PyMem_*() macros (or at least the basic idea). It is not *only* for extension modules but for the core as well. Existing extension modules and existing C code in the Python interpreter have no idea of any PyXXX_ calls, so I don't understand how new API functions help here. The calls get added to pymalloc and PyMem_*() under the hood, so that existing extension modules use the memory check automatically without a change. The calls are just there in case someone has some random need to do their own malloc but still want to participate in the cap. Plus it helped me think everything through by giving everything I would need to change internally an API. This confused me a bit, too. It might help if you annotated each of the new APIs with who the expected callers were: - trusted interpreter - untrusted interpreter - embedding application - extension module Threading is definitely going to be an issue with multiple interpreters (restricted or otherwise)... for example, the PyGILState API probably wouldn't work anymore. PyGILState won't work because there are multiple interpreters period, or because of the introduced distinction of untrusted and trusted interpreters? In other words, is this some new possible breakage, or is this an issue with threads that has always existed with multiple interpreters?
It's an issue that's always existed with multiple interpreters, but multiple interpreters aren't really commonly used or tested at the moment so it's not very surprising. It would be kinda nice to have an interpreter-per-thread with no GIL like some of the other languages have, but the C API depends on too much global state for that... -bob
Re: [Python-Dev] doc for new restricted execution design for Python
On Jun 24, 2006, at 2:46 AM, Nick Coghlan wrote: Brett Cannon wrote: Yep. That API will be used directly in the changes to pymalloc and PyMem_*() macros (or at least the basic idea). It is not *only* for extension modules but for the core as well. Existing extension modules and existing C code in the Python interpreter have no idea of any PyXXX_ calls, so I don't understand how new API functions help here. The calls get added to pymalloc and PyMem_*() under the hood, so that existing extension modules use the memory check automatically without a change. The calls are just there in case someone has some random need to do their own malloc but still want to participate in the cap. Plus it helped me think everything through by giving everything I would need to change internally an API. This confused me a bit, too. It might help if you annotated each of the new APIs with who the expected callers were: - trusted interpreter - untrusted interpreter - embedding application - extension module Threading is definitely going to be an issue with multiple interpreters (restricted or otherwise)... for example, the PyGILState API probably wouldn't work anymore. -bob
Re: [Python-Dev] PyRange_New() alternative?
On Jun 22, 2006, at 11:55 AM, Ralf W. Grosse-Kunstleve wrote: --- Georg Brandl [EMAIL PROTECTED] wrote: Ralf W. Grosse-Kunstleve wrote: http://docs.python.org/dev/whatsnew/ports.html says: The PyRange_New() function was removed. It was never documented, never used in the core code, and had dangerously lax error checking. I use this function (don't remember how I found it; this was years ago), therefore my code doesn't compile with 2.5b1 (it did OK before with 2.5a2). Is there an alternative spelling for PyRange_New()? You can call PyRange_Type with the appropriate parameters. Thanks a lot for the hint! However, I cannot find any documentation for PyRange_*. I tried this page... http://docs.python.org/api/genindex.html and google. Did I miss something? I am sure I can get this to work with some digging, but I am posting here to highlight a communication problem. I feel if a function is removed the alternative should be made obvious in the associated documentation; in particular if there is no existing documentation for the alternative. He means something like this: PyObject_CallFunction(PyRange_Type, "llli", ...) -bob
Re: [Python-Dev] unicode imports
On Jun 16, 2006, at 9:02 AM, Phillip J. Eby wrote: At 01:29 AM 6/17/2006 +1000, Nick Coghlan wrote: Kristján V. Jónsson wrote: A cursory glance at import.c shows that the import mechanism is fairly complicated, and riddled with char *path thingies, and manual string arithmetic. Do you have any suggestions on a clean way to unicodify the import mechanism? Can you install a PEP 302 path hook and importer/loader that can handle path entries that are Unicode strings? (I think this would end up being the parallel implementation you were talking about, though) If the code that traverses sys.path and sys.path_hooks is itself unicode-unaware (I don't remember if it is or isn't), then you might be able to trick it by poking a Unicode-savvy importer directly into the path_importer_cache for affected Unicode paths. Actually, you would want to put it in sys.path_hooks, and then instances would be placed in path_importer_cache automatically. If you are adding it to the path_hooks after the fact, you should simply clear the path_importer_cache. Simply poking stuff into the path_importer_cache is not a recommended approach. One issue is that the package and file names still have to be valid Python identifiers, which means ASCII. Unicode would be, at best, permitted only in the path entries. If I understand the problem correctly, the issue is that if you install Python itself to a Unicode directory, you'll be unable to import anything from the standard library. This isn't about module names, it's about the places on the path where that stuff goes. There's a similar issue in that if sys.prefix contains a colon, Python is also busted: http://python.org/sf/1507224 Of course, that's not a Windows issue, but it is everywhere else. The offending code in that case is Modules/getpath.c, which probably also has to change in order to make unicode directories work on Win32 (though I think there may be a separate win32 implementation of getpath). 
-bob
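Phillip's advice above — register a hook on sys.path_hooks and clear sys.path_importer_cache, rather than poking importers into the cache directly — can be sketched as follows. This is a minimal illustration, not the actual fix discussed in the thread; the hook name and its always-declining behavior are hypothetical.

```python
import sys

def unicode_path_hook(entry):
    # A PEP 302 path hook is called with each sys.path entry. A real
    # Unicode-savvy implementation would return a finder object for the
    # entries it handles; raising ImportError declines the entry and
    # defers to the rest of the import machinery. This sketch always
    # declines, so it changes nothing observable.
    raise ImportError("declined: %r" % (entry,))

# Register on sys.path_hooks instead of poking path_importer_cache;
# clearing the cache forces existing path entries to re-consult the hooks.
sys.path_hooks.append(unicode_path_hook)
sys.path_importer_cache.clear()
```

Because the hook declines every entry, normal imports continue to work through the default machinery; that is exactly why clearing the cache (rather than seeding it by hand) is the recommended pattern.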
Re: [Python-Dev] Add pure python PNG writer module to stdlib?
On Jun 10, 2006, at 4:35 PM, Brett Cannon wrote: On 6/10/06, Johann C. Rocholl [EMAIL PROTECTED] wrote: I'm working on a simple module to write PNG image files in pure python. Adding it to the standard library would be useful for people who want to create images on web server installations without gd and imlib, or on platforms where the netpbm tools are not easily available. Does anybody find this idea interesting? Yes, although I wouldn't want an interface taking in strings but something more like an iterator that returns each row which itself contains int triples. In other words more array-based than string based. Well you could easily make such strings (or a buffer object that could probably be used in place of a string) with the array module... -bob
Re: [Python-Dev] Is implicit underscore assignment buggy?
On Jun 7, 2006, at 3:41 PM, Aahz wrote: On Wed, Jun 07, 2006, Raymond Hettinger wrote: Fredrik: for users, it's actually quite simple to figure out what's in the _ variable: it's the most recently *printed* result. if you cannot see it, it's not in there. Of course, there's a pattern to it. The question is whether it is the *right* behavior. Would the underscore assignment be more useful and intuitive if it always contained the immediately preceding result, even if it was None? In some cases (such as the regexp example), None is a valid and useful possible result of a computation and you may want to access that result with _. My take is that Fredrik is correct about the current behavior being most generally useful even if it is slightly less consistent, as well as being undesired in rare circumstances. Consider that your message is the only one I've seen in more than five years of monitoring python-dev and c.l.py. I agree. I've definitely made use of the current behavior, e.g. for printing a different representation of _ before doing something else with it. -bob
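The behavior Fredrik describes falls out of the interactive display hook: a None result is neither printed nor bound to `_`, so `_` always holds the most recently *printed* result. A simplified stand-in for CPython's default `sys.displayhook` (names and structure are a sketch, not the actual interpreter source):

```python
import builtins

def displayhook(value):
    # Mirrors the documented behavior of sys.displayhook: a None result
    # is not printed and does not rebind the interactive "_" variable.
    if value is None:
        return
    builtins._ = value
    print(repr(value))

displayhook(42)    # printed; "_" becomes 42
displayhook(None)  # nothing printed; "_" is unchanged
```

After the two calls, `builtins._` is still 42 — which is exactly why a None result from a regexp match never shows up in `_`.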
Re: [Python-Dev] test_struct failure on 64 bit platforms
On May 31, 2006, at 12:49 AM, Neal Norwitz wrote: Bob, There are a couple of things I don't understand about the new struct. Below is a test that fails.

$ ./python ./Lib/test/regrtest.py test_tarfile test_struct
test_tarfile
/home/pybot/test-trunk/build/Lib/struct.py:63: DeprecationWarning: 'l' format requires -2147483648 <= number <= 2147483647
  return o.pack(*args)
test_struct
test test_struct failed -- pack('l', -2147483649) did not raise error
1 test OK.
1 test failed: test_struct

I fixed the error message (the min value was off by one before). I think I fixed a few ssize_t issues too. The remaining issues I know of are: * The warning only appears on 64-bit platforms. * The warning doesn't seem correct for 64-bit platforms (l is 8 bytes, not 4). * test_struct only fails if run after test_tarfile. * The msg from test_struct doesn't seem correct for 64-bit platforms. I tracked the problem down to trying to write the gzip tar file. Can you fix this? The warning is correct, and so is the size. Only native formats have native sizes; l and i are exactly 4 bytes on all platforms when using =, <, >, or !. That's what std size and alignment means. It looks like the only thing that's broken here is the test. The behavior changed to consistently allow any integer whatsoever to be passed to struct for all formats (except q and Q, which have always done proper range checking). Previously, the range checking was inconsistent across platforms (32-bit and 64-bit anyway) and when using int vs. long. Unfortunately I don't have a 64-bit platform easily accessible and I have no idea which test it is that's raising the warning. Could you isolate it? -bob
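Bob's point about standard vs. native sizes can be checked directly with struct.calcsize: a byte-order prefix selects the standard, platform-independent sizes, while no prefix means native size and alignment. A small demonstration (behavior as documented for the struct module, shown here in modern Python):

```python
import struct

# With an explicit byte-order prefix ("=", "<", ">", or "!"), sizes are
# standard: "l" and "i" are exactly 4 bytes on every platform.
assert struct.calcsize("<l") == 4
assert struct.calcsize("!i") == 4

# Without a prefix, native size and alignment apply, so "l" is
# platform-dependent: typically 8 bytes on a 64-bit Unix, 4 elsewhere.
assert struct.calcsize("l") in (4, 8)
```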
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On May 31, 2006, at 8:31 AM, Tim Peters wrote: I'm afraid a sabbatical year isn't long enough to understand what the struct module did or intends to do by way of range checking 0.7 wink. Is this intended? This is on a 32-bit Windows box with current trunk:

>>> from struct import pack as p
>>> p("I", 2**32 + 2343)
C:\Code\python\lib\struct.py:63: DeprecationWarning: 'I' format requires 0 <= number <= 4294967295
  return o.pack(*args)
'\x00\x00\x00\x00'

The warning makes sense, but the result doesn't make sense to me. In Python 2.4.3, that example raised OverflowError, which seems better than throwing away all the bits without an exception. Throwing away all the bits is a bug, it's supposed to mask with 0xffffffffL -bob
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On May 29, 2006, at 8:00 PM, Tim Peters wrote: [Bob Ippolito] ... Actually, should this be a FutureWarning or a DeprecationWarning? Since it was never documented, UndocumentedBugGoingAwayError ;-) Short of that, yes, DeprecationWarning. FutureWarning is for changes in non-exceptional behavior (e.g., if we swapped the meanings of < and > in struct format codes, that would rate a FutureWarning subclass, like InsaneFutureWarning). OK, this behavior is implemented in revision 46537: (this is from ./python.exe -Wall)

>>> import struct
...
>>> struct.pack('>B', -1)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: struct integer wrapping is deprecated
  return o.pack(*args)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B' format requires 0 <= number <= 255
  return o.pack(*args)
'\xff'

We certainly don't want to see two deprecation warnings for a single deprecated behavior. I suggest eliminating the struct integer wrapping warning, mostly because I had no idea what it _meant_ before reading the comments in _struct.c (wrapping is used most often in a proxy or delegation context in Python these days). 'B' format requires 0 <= number <= 255 is perfectly clear all by itself. What should it be called instead of wrapping? When it says it's wrapping, it means that it's doing x &= (2 ** (8 * n)) - 1 to force a number into meeting the expected range. Reducing it to one warning instead of two is kinda difficult. Is it worth the trouble? -bob
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On May 30, 2006, at 2:41 AM, Nick Coghlan wrote: Bob Ippolito wrote: On May 29, 2006, at 8:00 PM, Tim Peters wrote: We certainly don't want to see two deprecation warnings for a single deprecated behavior. I suggest eliminating the struct integer wrapping warning, mostly because I had no idea what it _meant_ before reading the comments in _struct.c (wrapping is used most often in a proxy or delegation context in Python these days). 'B' format requires 0 <= number <= 255 is perfectly clear all by itself. What should it be called instead of wrapping? When it says it's wrapping, it means that it's doing x &= (2 ** (8 * n)) - 1 to force a number into meeting the expected range. integer overflow masking perhaps? Sounds good enough, I'll go ahead and change the wording to that. Reducing it to one warning instead of two is kinda difficult. Is it worth the trouble? If there are cases where only one warning or the other triggers, it doesn't seem worth the effort to try and suppress one of them when they both trigger. It works kinda like this:

def get_ulong(x):
    ulong_mask = (sys.maxint << 1) | 1
    if is_unsigned and ((unsigned)x) > ulong_mask:
        x &= ulong_mask
        warning('integer overflow masking is deprecated')
    return x

def pack_ubyte(x):
    x = get_ulong(x)
    if not (0 <= x <= 255):
        warning("'B' format requires 0 <= number <= 255")
        x &= 0xff
    return chr(x)

Given the implementation, it will warn twice if sizeof(format) < sizeof(long) AND one of the following: 1. Negative numbers are given for an unsigned format 2. Input value is greater than ((sys.maxint << 1) | 1) for an unsigned format 3. Input value is not ((-sys.maxint - 1) <= x <= sys.maxint) for a signed format -bob
[Python-Dev] Converting crc32 functions to use unsigned
It seems that we should convert the crc32 functions in binascii, zlib, etc. to deal with unsigned integers. Currently it seems that 32-bit and 64-bit platforms are going to have different results for these functions. Should we do the same as the struct module, and do DeprecationWarning when the input value is < 0? Do we have a PyArg_ParseTuple format code or a converter that would be suitable for this purpose? None of the unit tests seem to exercise values where 32-bit and 64-bit platforms would have differing results, but that's easy enough to fix... -bob
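The platform difference Bob describes disappears once the result is treated as unsigned: masking to 32 bits yields the same value whether the platform's crc32 came back signed or not. A quick check, using the "foobabazr" value from the test_gzip thread elsewhere in this digest (modern zlib.crc32 already returns an unsigned result, so the mask is a portability no-op):

```python
import zlib

# On old 32-bit builds this CRC came back as -271938108; masking to
# 32 bits recovers the same unsigned value the 64-bit builds reported.
crc = zlib.crc32(b"foobabazr") & 0xffffffff
assert crc == 4023029188  # == 0xefca8dc4, per the thread's examples
```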
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On May 30, 2006, at 10:47 AM, Tim Peters wrote: [Bob Ippolito] What should it be called instead of wrapping? I don't know -- I don't know what it's trying to _say_ that isn't already said by saying that the input is out of bounds for the format code. The wrapping (now overflow masking) warning happens during conversion of PyObject* to long or unsigned long. It has no idea what the destination packing format is beyond whether it's signed or unsigned. If the packing format happens to be the same size as a long, it can't possibly trigger a range warning (unless range checks are moved up the stack and all of the function signatures and code get changed to accommodate that). When it says it's wrapping, it means that it's doing x &= (2 ** (8 * n)) - 1 to force a number into meeting the expected range. How is that different from what it does in this case?:

>>> struct.pack('>B', 256L)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B' format requires 0 <= number <= 255
  return o.pack(*args)
'\x00'

That looks like wrapping to me too (256 & (2**(8*1)-1) == 0x00), but in this case there is no deprecation warning about wrapping. Because of that, I'm afraid you're drawing distinctions that can't make sense to users. When it says integer wrapping it means that it's wrapping to fit in a long or unsigned long. n in this case is always 4 or 8 depending on the platform. The format-specific range check is separate. My description wasn't very good in the last email. Reducing it to one warning instead of two is kinda difficult. Is it worth the trouble? I don't understand. Every example you gave that showed a wrapping warning also showed a format requires i <= number <= j warning. Are there cases in which a wrapping warning is given but not a format requires i <= number <= j warning? If so, I simply haven't seen one (but I haven't tried all possible inputs ;-)).
Since the implementation appears (to judge from the examples) to wrap in every case in which any warning is given (or are there cases in which it doesn't?), I don't understand the point of distinguishing between wrapping warnings and format requires i <= number <= j warnings either. The latter are crystal clear. A later email in this thread enumerates exactly which circumstances should cause two warnings with the current implementation. -bob
Re: [Python-Dev] Converting crc32 functions to use unsigned
On May 30, 2006, at 11:19 AM, Guido van Rossum wrote: On 5/30/06, Giovanni Bajo [EMAIL PROTECTED] wrote: Bob Ippolito wrote: It seems that we should convert the crc32 functions in binascii, zlib, etc. to deal with unsigned integers. +1!! Seems ok, except I don't know what the backwards incompatibilities would be... I think the only compatibility issues we're going to run into are with the struct module in the way of DeprecationWarning. If people are depending on specific negative values for these, then their code should already be broken on 64-bit platforms. The only potential breakage I can see is if they're passing these values to other functions written in C that expect PyInt_AsLong(n) to work with the values (on 32-bit platforms). I can't think of a use case for that beyond the functions themselves and the struct module. -bob
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On May 28, 2006, at 5:34 PM, Thomas Wouters wrote: On 5/29/06, Bob Ippolito [EMAIL PROTECTED] wrote: On May 28, 2006, at 4:31 AM, Thomas Wouters wrote: I'm seeing a dubious failure of test_gzip and test_tarfile on my AMD64 machine. It's triggered by the recent struct changes, but I'd say it's probably caused by a bug/misfeature in zlibmodule: zlib.crc32 is the result of a zlib 'crc32' function call, which returns an unsigned long. zlib.crc32 turns that unsigned long into a (signed) Python int, which means a number beyond 1<<31 goes negative on 32-bit systems and other systems with 32-bit longs, but stays positive on systems with 64-bit longs:

(32-bit) >>> zlib.crc32("foobabazr")
-271938108
(64-bit) >>> zlib.crc32("foobabazr")
4023029188

The old structmodule coped with that:

>>> struct.pack("l", -271938108)
'\xc4\x8d\xca\xef'
>>> struct.pack("l", 4023029188)
'\xc4\x8d\xca\xef'

The new one does not:

>>> struct.pack("l", -271938108)
'\xc4\x8d\xca\xef'
>>> struct.pack("l", 4023029188)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "Lib/struct.py", line 63, in pack
    return o.pack(*args)
struct.error: 'l' format requires -2147483647 <= number <= 2147483647

The structmodule should be fixed (and a test added ;) but I'm also wondering if zlib shouldn't be fixed. Now, I'm AMD64-centric, so my suggested fix would be to change the PyInt_FromLong() call to PyLong_FromUnsignedLong(), making zlib always return positive numbers -- it might break some code on 32-bit platforms, but that code is already broken on 64-bit platforms. But I guess I'm okay with the long being changed into an actual 32-bit signed number on 64-bit platforms, too. The struct module isn't what's broken here. All of the struct types have always had well defined bit sizes and alignment if you explicitly specify an endian, I and L are 32-bits everywhere, and Q is supported on platforms that don't have long long. The only thing that's changed is that it actually checks for errors consistently now. Yes.
And that breaks things. I'm certain the behaviour is used in real-world code (and I don't mean just the gzip module.) It has always, as far as I can remember, accepted 'unsigned' values for the signed versions of ints, longs and long-longs (but not chars or shorts.) I agree that that's wrong, but I don't think changing struct to do the right thing should be done in 2.5. I don't even think it should be done in 2.6 -- although 3.0 is fine. Well, the behavior change is in response to a bug http://python.org/sf/1229380. If nothing else, we should at least fix the standard library such that it doesn't depend on struct bugs. This is the only way to find them :) Basically the struct module previously only checked for errors if you don't specify an endian. That's really strange and leads to very confusing results. The only code that really should be broken by this additional check is code that existed before Python had a long type and only signed values were available. -bob
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On May 29, 2006, at 3:14 AM, Thomas Wouters wrote: On 5/29/06, Bob Ippolito [EMAIL PROTECTED] wrote: Well, the behavior change is in response to a bug http://python.org/sf/1229380. If nothing else, we should at least fix the standard library such that it doesn't depend on struct bugs. This is the only way to find them :) Feel free to comment how the zlib.crc32/gzip co-operation should be fixed. I don't see an obviously correct fix. The trunk is currently failing tests it shouldn't fail. Also note that the error isn't with feeding signed values to unsigned formats (which is what the bug is about) but the other way 'round, although I do believe both should be accepted for the time being, while generating a warning. Well, first I'm going to just correct the modules that are broken (zlib, gzip, tarfile, binhex and probably one or two others). Basically the struct module previously only checked for errors if you don't specify an endian. That's really strange and leads to very confusing results. The only code that really should be broken by this additional check is code that existed before Python had a long type and only signed values were available. Alas, reality is different. The fundamental difference between types in Python and in C causes this, and code using struct is usually meant specifically to bridge those two worlds. Furthermore, struct is often used to *fix* that issue, by flipping sign bits if necessary: Well, in C you get a compiler warning for stuff like this.

>>> struct.unpack("l", struct.pack("l", 3221225472))
(-1073741824,)
>>> struct.unpack("l", struct.pack("L", 3221225472))
(-1073741824,)
>>> struct.unpack("l", struct.pack("l", -1073741824))
(-1073741824,)
>>> struct.unpack("l", struct.pack("L", -1073741824))
(-1073741824,)

Before this change, you didn't have to check whether the value is negative before the struct.unpack/pack dance, regardless of which format character you used.
This misfeature is used (and many would consider it convenient, even Pythonic, for struct to DWIM), breaking it suddenly is bad. struct doesn't really DWIM anyway, since integers are up-converted to longs and will overflow past what the (old or new) struct module will accept. Before there was a long type or automatic up-converting, the sign agnosticism worked.. but it doesn't really work correctly these days. We have two choices, either fix it to behave consistently broken everywhere for numbers of every size (modulo every number that comes in so that it fits), or have it do proper range checking. A compromise is to do proper range checking as a warning, and do the modulo math anyway... but is that what we really want? -bob
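The pack/unpack "sign flipping" dance Thomas demonstrates still works today when the signedness of each side is stated explicitly: pack with the unsigned code, unpack with the signed one. A sketch in modern Python (the "<" prefix pins both codes to a standard 4-byte size, which the range-checked struct requires for these values):

```python
import struct

# Reinterpret an unsigned 32-bit value as signed without arithmetic:
# pack with the unsigned code "L", unpack with the signed code "l".
(signed,) = struct.unpack("<l", struct.pack("<L", 3221225472))
assert signed == -1073741824  # 0xC0000000 read back as a signed 32-bit int

# The round trip in the other direction recovers the unsigned view.
(unsigned,) = struct.unpack("<L", struct.pack("<l", signed))
assert unsigned == 3221225472
```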
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On May 29, 2006, at 12:44 PM, Guido van Rossum wrote: On 5/29/06, Tim Peters [EMAIL PROTECTED] wrote: I think we should do as Thomas proposes: plan to make it an error in 2.6 (or 2.7 if there's a big outcry, which I don't expect) and accept it with a warning in 2.5. That's what I arrived at, although 2.4.3's checking behavior is actually so inconsistent that it needs some defining (what exactly are we trying to still accept? e.g., that -1 doesn't trigger I complaints but that -1L does above? that one's surely a bug). No, it reflects that (up to 2.3 I believe) 0xffffffff was -1 but 0xffffffffL was 4294967295L. Python 2.3 did a FutureWarning on 0xffffffff but its value was -1. Anyway, my plan is to make it such that all non-native format codes will behave exactly like C casting, but will do a DeprecationWarning for input numbers that were initially out of bounds. This behavior will be consistent across (python) int and long, and will be easy enough to explain in the docs (but still more complicated than values not representable by this data type will raise struct.error). This means that I'm also changing it so that struct.pack will not raise OverflowError for some longs, it will always raise struct.error or do a warning (as long as the input is int or long). Pseudocode looks kinda like this:

def wrap_unsigned(x, CTYPE):
    if not (0 <= x <= CTYPE_MAX):
        DeprecationWarning()
        x &= CTYPE_MAX
    return x

Actually, should this be a FutureWarning or a DeprecationWarning? -bob
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On May 29, 2006, at 1:18 PM, Bob Ippolito wrote: On May 29, 2006, at 12:44 PM, Guido van Rossum wrote: On 5/29/06, Tim Peters [EMAIL PROTECTED] wrote: I think we should do as Thomas proposes: plan to make it an error in 2.6 (or 2.7 if there's a big outcry, which I don't expect) and accept it with a warning in 2.5. That's what I arrived at, although 2.4.3's checking behavior is actually so inconsistent that it needs some defining (what exactly are we trying to still accept? e.g., that -1 doesn't trigger I complaints but that -1L does above? that one's surely a bug). No, it reflects that (up to 2.3 I believe) 0xffffffff was -1 but 0xffffffffL was 4294967295L. Python 2.3 did a FutureWarning on 0xffffffff but its value was -1. Anyway, my plan is to make it such that all non-native format codes will behave exactly like C casting, but will do a DeprecationWarning for input numbers that were initially out of bounds. This behavior will be consistent across (python) int and long, and will be easy enough to explain in the docs (but still more complicated than values not representable by this data type will raise struct.error). This means that I'm also changing it so that struct.pack will not raise OverflowError for some longs, it will always raise struct.error or do a warning (as long as the input is int or long). Pseudocode looks kinda like this:

def wrap_unsigned(x, CTYPE):
    if not (0 <= x <= CTYPE_MAX):
        DeprecationWarning()
        x &= CTYPE_MAX
    return x

Actually, should this be a FutureWarning or a DeprecationWarning?
OK, this behavior is implemented in revision 46537: (this is from ./python.exe -Wall)

>>> import struct
>>> struct.pack('B', 256)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/bob/src/python/Lib/struct.py", line 63, in pack
    return o.pack(*args)
struct.error: ubyte format requires 0 <= number <= 255
>>> struct.pack('B', -1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/bob/src/python/Lib/struct.py", line 63, in pack
    return o.pack(*args)
struct.error: ubyte format requires 0 <= number <= 255
>>> struct.pack('>B', 256)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B' format requires 0 <= number <= 255
  return o.pack(*args)
'\x00'
>>> struct.pack('>B', -1)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: struct integer wrapping is deprecated
  return o.pack(*args)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B' format requires 0 <= number <= 255
  return o.pack(*args)
'\xff'
>>> struct.pack('>B', 256L)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B' format requires 0 <= number <= 255
  return o.pack(*args)
'\x00'
>>> struct.pack('>B', -1L)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: struct integer wrapping is deprecated
  return o.pack(*args)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B' format requires 0 <= number <= 255
  return o.pack(*args)
'\xff'

In _struct.c, getting rid of the #define PY_STRUCT_WRAPPING 1 will turn off this warning+wrapping nonsense and just raise errors for out of range values. It'll also enable some additional performance hacks (swapping out the host-endian table's pack and unpack functions with the faster native versions). -bob
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On May 28, 2006, at 4:31 AM, Thomas Wouters wrote: I'm seeing a dubious failure of test_gzip and test_tarfile on my AMD64 machine. It's triggered by the recent struct changes, but I'd say it's probably caused by a bug/misfeature in zlibmodule: zlib.crc32 is the result of a zlib 'crc32' function call, which returns an unsigned long. zlib.crc32 turns that unsigned long into a (signed) Python int, which means a number beyond 1<<31 goes negative on 32-bit systems and other systems with 32-bit longs, but stays positive on systems with 64-bit longs:

(32-bit) >>> zlib.crc32("foobabazr")
-271938108
(64-bit) >>> zlib.crc32("foobabazr")
4023029188

The old structmodule coped with that:

>>> struct.pack("l", -271938108)
'\xc4\x8d\xca\xef'
>>> struct.pack("l", 4023029188)
'\xc4\x8d\xca\xef'

The new one does not:

>>> struct.pack("l", -271938108)
'\xc4\x8d\xca\xef'
>>> struct.pack("l", 4023029188)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "Lib/struct.py", line 63, in pack
    return o.pack(*args)
struct.error: 'l' format requires -2147483647 <= number <= 2147483647

The structmodule should be fixed (and a test added ;) but I'm also wondering if zlib shouldn't be fixed. Now, I'm AMD64-centric, so my suggested fix would be to change the PyInt_FromLong() call to PyLong_FromUnsignedLong(), making zlib always return positive numbers -- it might break some code on 32-bit platforms, but that code is already broken on 64-bit platforms. But I guess I'm okay with the long being changed into an actual 32-bit signed number on 64-bit platforms, too. The struct module isn't what's broken here. All of the struct types have always had well defined bit sizes and alignment if you explicitly specify an endian, I and L are 32-bits everywhere, and Q is supported on platforms that don't have long long. The only thing that's changed is that it actually checks for errors consistently now.
-bob
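The portable idiom that grew out of this problem is to mask the CRC to 32 bits, which yields the same unsigned value regardless of the platform's long size (and matches what Python 3's zlib.crc32 returns today). A sketch using the values from Thomas's report:

```python
import struct

signed = -271938108      # what 32-bit builds returned
unsigned = 4023029188    # what 64-bit builds returned

# Masking normalizes both representations to the same unsigned 32-bit value...
assert signed & 0xffffffff == unsigned

# ...and packing the signed value as '<l' and the masked value as '<L'
# produces identical bytes, so either form round-trips on the wire.
assert struct.pack('<l', signed) == struct.pack('<L', signed & 0xffffffff)
assert struct.pack('<L', unsigned) == b'\xc4\x8d\xca\xef'
```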
Re: [Python-Dev] [Python-checkins] r46300 - in python/trunk: Lib/socket.py Lib/test/test_socket.py Lib/test/test_struct.py Modules/_struct.c Modules/arraymodule.c Modules/socketmodule.c
On May 26, 2006, at 4:56 PM, Guido van Rossum wrote: On 5/26/06, martin.blais [EMAIL PROTECTED] wrote: Log: Support for buffer protocol for socket and struct. * Added socket.recv_buf() and socket.recvfrom_buf() methods, that use the buffer protocol (send and sendto already did). * Added struct.pack_to(), that is the corresponding buffer compatible method to unpack_from(). Hm... The file object has a similar method readinto(). Perhaps the methods introduced here could follow that lead instead of using two different new naming conventions? (speaking specifically about struct and not socket) pack_to and unpack_from are named as such because they work with objects that support the buffer API (not file-like objects). I couldn't find any existing convention for objects that manipulate buffers in such a way. If there is an existing convention then I'd be happy to rename these. readinto seems to imply that some kind of position is being incremented. Grammatically it only works if it's implemented on all buffer objects, but in this case it's implemented on the Struct type. -bob
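For reference, these methods ultimately shipped in Python 2.5 as struct.pack_into() and struct.unpack_from() (pack_to was renamed during the 2.5 cycle); they write into and read from any object supporting the buffer API:

```python
import struct

buf = bytearray(8)           # any writable buffer works
s = struct.Struct('<HH')     # two little-endian unsigned shorts, 4 bytes

s.pack_into(buf, 0, 1, 2)    # write at offset 0
s.pack_into(buf, 4, 3, 4)    # write at offset 4, no intermediate bytes objects

assert s.unpack_from(buf, 0) == (1, 2)
assert s.unpack_from(buf, 4) == (3, 4)
```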
Re: [Python-Dev] SQLite header scan order
On May 26, 2006, at 8:35 AM, Ronald Oussoren wrote: The current version of setup.py looks for the sqlite header files in a number of sqlite-specific directories before looking into the default inc_dirs. I'd like to revert that order because that would make it possible to override the version of sqlite that gets picked up. Any objections to that? +1, the version that ships with Mac OS X 10.4 is pretty old. -bob
Re: [Python-Dev] Cost-Free Slice into FromString constructors--Long
On May 25, 2006, at 3:28 PM, Jean-Paul Calderone wrote: On Thu, 25 May 2006 15:01:36, Runar Petursson [EMAIL PROTECTED] wrote: We've been talking this week about ideas for speeding up the parsing of longs coming out of files or the network. The use case is having a large string with embedded longs and parsing them to real longs. One approach would be to use a simple slice: long(mystring[x:y]) -- an expensive operation in a tight loop. The proposed solution is to add further keyword arguments to long(), such as: long(mystring, base=10, start=x, end=y). The start/end would allow for negative indexes, as slices do, but otherwise simply limit the scope of the parsing. There are other solutions, using buffer-like objects and such, but this seems like a simple win for anyone parsing a lot of text. I implemented it in a branch (runar-longslice-branch), but it would need to be updated with Tim's latest improvements to long. Then you may ask, why not do it for everything else parsing from a string -- to which I say it should. Thoughts? This really seems like a poor option. Why fix the problem with a hundred special cases instead of a single general solution? Hmm, one reason could be that the general solution doesn't work:

[EMAIL PROTECTED]:~$ python
Python 2.4.3 (#2, Apr 27 2006, 14:43:58)
[GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> long(buffer('1234', 0, 3))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: null byte in argument for long()
>>> long(buffer('123a', 0, 3))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: invalid literal for long(): 123a

One problem with buffer() is that it does a memcpy of the buffer. A zero-copy version of buffer (a view on some object that implements the buffer API) would be nice.
-bob
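The zero-copy view Bob asks for later arrived as memoryview (Python 2.7/3.x), whose slices share the underlying storage instead of doing a memcpy; and in Python 3, int() parses bytes directly:

```python
data = b'header1234trailer'

# memoryview slicing is zero-copy, unlike the old buffer() which copied.
view = memoryview(data)[6:10]
assert view.tobytes() == b'1234'

# int() accepts bytes in Python 3, so only the slice itself is materialized:
assert int(data[6:10]) == 1234
assert int(view.tobytes()) == 1234
```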
Re: [Python-Dev] Google Summer of Code proposal: improvement of long int and adding new types/modules.
On Apr 21, 2006, at 5:58 PM, Alex Martelli wrote: On 4/21/06, Greg Ewing [EMAIL PROTECTED] wrote: ... GMP is covered by LGPL, so must any such derivative work. But the wrapper is just using GMP as a library, so it shouldn't be infected with LGPLness, should it? If a lawyer for the PSF can confidently assert that gmpy is not a derivative work of GMP, I'll have no problem changing gmpy's licensing. But I won't make such a call myself: for example, gmpy.c #include's gmp.h and uses (== expands) some of the C macros there defined -- doesn't that make gmpy.o a derived work of gmp.h? I'm quite confident that the concept of derived work would not apply if gmpy.so only accessed a gmp.so (or other kinds of dynamic libraries), but I fear the connection is stronger than that, so, prudently, I'm assuming the derived-work status until further notice. Well, we already wrap readline, would this really be any worse? Readline is GPL. -bob
Re: [Python-Dev] elementtree in stdlib
On Apr 5, 2006, at 9:02 PM, Alex Martelli wrote: On Apr 5, 2006, at 8:30 PM, Greg Ewing wrote: A while ago there was some discussion about including elementtree in the std lib. I can't remember what the conclusion about that was, but if it does go ahead, I'd like to suggest that it be reorganised a bit. I've just started playing with it, and having a package called elementtree containing a module called ElementTree containing a class called ElementTree is just too confusing for words! Try the 2.5 alpha 1 just released, and you'll see that the toplevel package is now xml.etree. The module and class are still called ElementTree, though. It would be nice to have new code be PEP 8 compliant. Specifically: "Modules should have short, lowercase names, without underscores." -bob
Re: [Python-Dev] Use dlopen() on Darwin/OS X to load extensions?
On Apr 3, 2006, at 9:01 PM, Neal Norwitz wrote: On 4/3/06, Zachary Pincus [EMAIL PROTECTED] wrote: Sorry if it's bad form to ask about patches one has submitted -- let me know if that sort of discussion should be kept strictly on the patch tracker. No, it's fine. Thanks for reminding us about this issue. Unfortunately, without an explicit ok from one of the Mac maintainers, I don't want to add this myself. If you can get Bob, Ronald, or Jack to say ok, I will apply the patch ASAP. I have a Mac OS X.4 box and can test it, but don't know the suitability of the patch. The patch has my OK (I gave it a while ago on pythonmac-sig). -bob
Re: [Python-Dev] towards a stricter definition of sys.executable
On Mar 17, 2006, at 12:40 AM, Martin v. Löwis wrote: Fredrik Lundh wrote: I don't think many people embed setup.py scripts, so alternative (e) would probably cause the least problems: e) sys.executable contains the full path to the program used to invoke this interpreter instance, or None if this could not be determined. It seems that you indeed are trying to solve a problem you encountered. Can you please explain what the problem is? ISTM that the current definition doesn't really cause problems, despite potentially being fuzzy. People that start sys.executable typically *do* get a Python interpreter - in an embedded interpreter, they just don't want to start a new interpreter, as that couldn't work, anyway. I've seen cases where people want to start worker processes from bundled apps (as in py2app/py2exe). The bootstrap executable (sys.executable) is not suitable for this purpose, as it runs a specific script. Forking doesn't quite do the right thing either, because with certain platform libraries (especially on Mac OS X) it's not safe to fork without exec'ing, due to state that persists across processes when it shouldn't. For py2app, we can bundle a Python interpreter that links to the same Python framework and has the same set of modules and extensions that the bundled application does, so we can support this use case. I'd definitely like to see something like sys.python_executable become standard, and I think I'll go ahead and support it in the next release of py2app.
It's possible to degrade gracefully with this approach too:

def get_python_executable():
    python_executable = getattr(sys, 'python_executable', None)
    if python_executable is not None:
        return python_executable
    if not getattr(sys, 'frozen', False) and sys.executable:
        # launched from a standard interpreter
        return sys.executable
    # frozen without python_executable support
    raise RuntimeError('cannot locate a Python interpreter')

-bob
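Outside of frozen apps, the worker-process use case from this thread is served today by spawning sys.executable directly with subprocess:

```python
import subprocess
import sys

# Run a short worker script in a fresh interpreter of the same Python.
result = subprocess.run(
    [sys.executable, '-c', 'print(6 * 7)'],
    capture_output=True, text=True, check=True,
)
assert result.stdout.strip() == '42'
```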
Re: [Python-Dev] Problem with module loading on multi-arch?
On Mar 17, 2006, at 4:38 PM, Neal Becker wrote: Martin v. Löwis wrote: Neal Becker wrote: Sorry, maybe I used confusing terminology. A reference is here: http://fedoraproject.org/wiki/Packaging/Python This is the current setup. For example, these are standard macros used by Redhat in RPM SPEC files for python:

%define python_sitearch %(%{__python} -c "from distutils.sysconfig import get_python_lib; print get_python_lib(1)")
%define python_sitelib %(%{__python} -c "from distutils.sysconfig import get_python_lib; print get_python_lib()")

Clearly this practice is widespread. It would seem that module search needs some modification to fully support it. Ah. That isn't supported at all, at the moment. Redhat should not be using it. Instead, there shouldn't be a difference between sitearch and sitelib. x86_64 is multiarch. That means we allow both i386 and x86_64 binaries to coexist. Is the proposal that python should not support this? That would be unfortunate. I suspect it would not be that difficult to correctly support multiarch platforms. As it is, this usually works - but the example I gave above shows where it seems to break. All the difficult issues supporting multi-arch are going to be with distutils, not Python itself. On OS X it isn't all that hard to support (beyond backwards compatibility issues) because you run gcc once with the right options and get a single universal binary as output. It would be a lot more invasive if GCC had to be run multiple times and the products had to be put in different places. -bob
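The sitearch/sitelib distinction Redhat queries through distutils is exposed in modern Python via the sysconfig module, which separates platform-specific and pure-Python library directories:

```python
import sysconfig

# platlib: arch-specific extension modules (e.g. /usr/lib64/... on Fedora x86_64)
platlib = sysconfig.get_path('platlib')
# purelib: pure-Python modules
purelib = sysconfig.get_path('purelib')

assert isinstance(platlib, str) and platlib
assert isinstance(purelib, str) and purelib
```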
Re: [Python-Dev] collections.idset and collections.iddict?
On Mar 6, 2006, at 4:14 PM, Guido van Rossum wrote: On 3/6/06, Raymond Hettinger [EMAIL PROTECTED] wrote: [Neil Schemenauer] I occasionally need dictionaries or sets that use object identity rather than __hash__ to store items. Would it be appropriate to add these to the collections module? Why not decorate the objects with a class adding a method: def __hash__(self): return id(self) That would seem to be more Pythonic than creating custom variants of other containers. I hate to second-guess the OP, but you'd have to override __eq__ too, and probably __ne__ and __cmp__ just to be sure. And probably that wouldn't do -- since the default __hash__ and __eq__ have the desired behavior, the OP is apparently talking about objects that override these operations to do something meaningful; overriding them back presumably breaks other functionality. I wonder if this use case and the frequently requested case-insensitive dict don't have some kind of generalization in common -- perhaps a dict that takes a key function a la list.sort()? +1. I've wanted such a thing a couple times, and there is some precedent in the stdlib (e.g. WeakKeyDictionary would be a lot shorter with such a base class). -bob
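A minimal sketch of the key-function mapping Guido suggests (the class and its names are hypothetical, not anything that landed in the stdlib); key=id gives the identity dict Neil asked for, and key=str.lower gives the case-insensitive variant:

```python
class KeyFuncDict:
    """Mapping keyed by key(obj) instead of hash()/==; key=id -> identity dict."""

    def __init__(self, key=id):
        self._key = key
        self._data = {}  # key(obj) -> (obj, value); obj retained so keys stay alive

    def __setitem__(self, obj, value):
        self._data[self._key(obj)] = (obj, value)

    def __getitem__(self, obj):
        return self._data[self._key(obj)][1]

    def __contains__(self, obj):
        return self._key(obj) in self._data


a, b = [1], [1]                      # equal but distinct objects
d = KeyFuncDict()                    # identity semantics
d[a] = 'a'
d[b] = 'b'
assert d[a] == 'a' and d[b] == 'b'   # a regular dict (if lists were hashable)
                                     # would have collapsed these into one entry

ci = KeyFuncDict(key=str.lower)      # case-insensitive dict
ci['Foo'] = 1
assert 'FOO' in ci and ci['fOo'] == 1
```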
Re: [Python-Dev] operator.is*Type
On Feb 22, 2006, at 4:18 AM, Fuzzyman wrote: Raymond Hettinger wrote:

>>> from operator import isSequenceType, isMappingType
>>> class anything(object):
...     def __getitem__(self, index):
...         pass
...
>>> something = anything()
>>> isMappingType(something)
True
>>> isSequenceType(something)
True

I suggest we either deprecate these functions as worthless, *or* we define the protocols slightly more clearly for user defined classes. They are not worthless. They do a damned good job of differentiating anything that CAN be differentiated. But as far as I can tell (and I may be wrong), they only work if the object is a subclass of a built-in type, otherwise they're broken. So you'd have to do a type check as well, unless you document that an API call *only* works with a builtin type or subclass. If you really cared, you could check hasattr(something, 'get') and hasattr(something, '__getitem__'), which is a pretty good indicator that it's a mapping and not a sequence (in a dict-like sense, anyway). -bob
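Bob's hasattr heuristic is easy to demonstrate (the function name here is hypothetical):

```python
def looks_like_mapping(obj):
    # Mappings grow a .get() alongside __getitem__; sequences typically don't.
    return hasattr(obj, 'get') and hasattr(obj, '__getitem__')

assert looks_like_mapping({})         # dict: has both
assert not looks_like_mapping([])     # list: __getitem__ but no .get
assert not looks_like_mapping('abc')  # str likewise
```

For new code, isinstance(obj, collections.abc.Mapping) against the abstract base classes (added in Python 2.6/3.0) is the sanctioned check; operator.isMappingType and isSequenceType were removed in Python 3.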
Re: [Python-Dev] PEP 358 (bytes type) comments
On Feb 22, 2006, at 1:22 PM, Brett Cannon wrote: First off, thanks to Neil for writing this all down. The whole thread of discussion on the bytes type was rather long and thus hard to follow. Nice to finally have it written down in a PEP. Anyway, a few comments on the PEP. One, should the hex() method instead be an attribute, implemented as a property? Seems like static data that is entirely based on the value of the bytes object and thus is not properly represented by a method. Next, why are the __*slice__ methods to be defined? Docs say they are deprecated. And for the open-ended questions, I don't think sort() is needed. sort would be totally useless for bytes. array.array doesn't have sort either. Lastly, maybe I am just dense, but it took me a second to realize that it will most likely return the ASCII string for __str__() for use in something like socket.send(), but it isn't explicitly stated anywhere, so the chance that someone might think __str__ will somehow return the sequence of integers as a string does exist. That would be a bad idea given that bytes are supposed to make the str type go away. It's probably better to make __str__ return __repr__ like it does for most types. If the bytes type supports the buffer API (one would hope so), functions like socket.send should do the right thing as-is. http://docs.python.org/api/bufferObjects.html -bob
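For the record, Python 3's bytes settled both questions the way this message leans: hex conversion is a method pair (bytes.fromhex()/bytes.hex()), not a property, and __str__ falls back to the repr rather than an ASCII decoding:

```python
# Classmethod/method pair, not a property:
assert bytes.fromhex('ff00') == b'\xff\x00'
assert b'\xff\x00'.hex() == 'ff00'

# __str__ is the repr, not the decoded text:
assert str(b'abc') == "b'abc'"
```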
Re: [Python-Dev] readline compilation fails on OSX
On Feb 20, 2006, at 6:48 PM, Guido van Rossum wrote: On OSX (10.4.4) the readline module in the svn HEAD fails compilation as follows. This is particularly strange since the buildbot is green for OSX... What could be up with this? building 'readline' extension -lots of build junk- In Apple's quest to make our lives harder, they installed BSD libedit and symlinked it to readline. Python doesn't like that. The buildbot might have a real readline installation, or maybe the buildbot is skipping those tests. You'll need to install a real libreadline if you want it to work. I've also put together a little tarball that'll build readline.so statically, and there's pre-built eggs for OS X so the easy_install should be quick: http://python.org/pypi/readline -bob
Re: [Python-Dev] bytes.from_hex()
On Feb 20, 2006, at 7:25 PM, Stephen J. Turnbull wrote: "Martin" == Martin v Löwis [EMAIL PROTECTED] writes:

Martin> Please do take a look. It is the only way: If you were to
Martin> embed base64 *bytes* into character data content of an XML
Martin> element, the resulting XML file might not be well-formed
Martin> anymore (if the encoding of the XML file is not an ASCII
Martin> superencoding).

Excuse me, I've been doing category theory recently. By embedding I mean a map from an intermediate object which is a stream of bytes to the corresponding stream of characters. In the case of UTF-16-coded characters, this would necessarily imply a representation change, as you say. What I advocate for Python is to require that the standard base64 codec be defined only on bytes, and always produce bytes. Any representation change should be done explicitly. This is surely conformant with RFC 2045's definition and with RFC 3548. +1 -bob
Re: [Python-Dev] bytes.from_hex()
On Feb 19, 2006, at 10:55 AM, Martin v. Löwis wrote: Stephen J. Turnbull wrote: BTW, what use cases do you have in mind for Unicode -> Unicode decoding? I think rot13 falls into that category: it is a transformation on text, not on bytes. The current implementation is a transformation on bytes, not text. Conceptually, though, it's a text-to-text transform. For other odd cases: base64 goes Unicode -> bytes in the *decode* direction, not in the encode direction. Some may argue that base64 is bytes, not text, but in many applications, you can combine base64 (or uuencode) with arbitrary other text in a single stream. Of course, it could be required that you go u.encode("ascii").decode("base64"). I would say that base64 is bytes -> bytes. Just because those bytes happen to be in a subset of ASCII, it's still a serialization meant for wire transmission. Sometimes it ends up in unicode (e.g. in XML), but that's the exception, not the rule. -bob
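Python 3 adopted exactly this view: the base64 module is bytes -> bytes in both directions, and any move into text is a separate, explicit ASCII decode:

```python
import base64

encoded = base64.b64encode(b'hello')          # bytes in, bytes out
assert encoded == b'aGVsbG8='
assert base64.b64decode(encoded) == b'hello'  # bytes in, bytes out

# Embedding in text (e.g. XML) is its own explicit step:
assert encoded.decode('ascii') == 'aGVsbG8='
```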
Re: [Python-Dev] New Module: CommandLoop
On Feb 19, 2006, at 5:03 PM, Raymond Hettinger wrote:

@cmdloop.aliases('goodbye')
@cmdloop.shorthelp('say goodbye')
@cmdloop.usage('goodbye TARGET')

to just:

@cmdloop.addspec(aliases=['goodbye'], shorthelp='say goodbye', usage='goodbye TARGET')

leaving the possibility of multiple decorators when one line gets too long:

@cmdloop.addspec(aliases=['goodbye'], shorthelp='say goodbye')
@cmdloop.addspec(usage='goodbye TARGET  # where TARGET is a filename in the current directory')

Well, why not support both, and leave it up to the user? Having only one method keeps the API simple. Also, the addspec() approach allows the user to choose between single and multiple lines. BTW, addspec() could be made completely general by supporting all possible keywords at once:

def addspec(**kwds):
    def decorator(func):
        func.__dict__.update(kwds)
        return func
    return decorator

With an open definition like that, users can specify new attributes with less effort. Doesn't this discussion belong on c.l.p / python-list? -bob
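Raymond's open addspec() is short enough to exercise directly (a sketch; the cmdloop module itself never entered the stdlib, so the decorator is defined standalone here):

```python
def addspec(**kwds):
    def decorator(func):
        func.__dict__.update(kwds)   # attach arbitrary metadata as attributes
        return func
    return decorator

@addspec(aliases=['goodbye'], shorthelp='say goodbye', usage='goodbye TARGET')
def do_goodbye(target):
    pass

assert do_goodbye.aliases == ['goodbye']
assert do_goodbye.shorthelp == 'say goodbye'
assert do_goodbye.usage == 'goodbye TARGET'
```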
Re: [Python-Dev] http://www.python.org/dev/doc/devel still available
On Feb 16, 2006, at 11:35 AM, Benji York wrote: Alexander Schremmer wrote: In fact, PHP does it like php.net/functionname which is even shorter, i.e. they fall back to the documentation if that path does not exist otherwise. Like many things PHP, that seems a bit too magical for my tastes. Not only does it fall back to documentation, it falls back to a search for documentation if there isn't a function of that name. It's a convenient feature, I'm sure people would use it if it was there... even if it was something like http://python.org/doc/name -bob
Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]
On Feb 16, 2006, at 9:20 PM, Josiah Carlson wrote: Greg Ewing [EMAIL PROTECTED] wrote: Josiah Carlson wrote: They may not be encodings of _unicode_ data, But if they're not encodings of unicode data, what business do they have being available through someunicodestring.encode(...)? I had always presumed that bytes objects are going to be able to be a source for encode AND decode, like current non-unicode strings are able to be today. In that sense, if I have a bytes object which is an encoding of rot13, hex, uu, etc., or I have a bytes object which I would like to be in one of those encodings, I should be able to do b.encode(...) or b.decode(...), given that 'b' is a bytes object. Are 'encodings' going to become a mechanism to encode and decode _unicode_ strings, rather than a mechanism to encode and decode _text and data_ strings? That would seem like a backwards step to me, as the email package would need to package its own base64 encode/decode API and implementation, and similarly for any other package which uses any one of the encodings already available. It would be VERY useful to separate the two concepts. bytes -> bytes transforms should be one function pair, and bytes -> text transforms should be another. The current situation is totally insane:

str.decode(codec) -> str or unicode or UnicodeDecodeError or ZlibError or TypeError... who knows what else
str.encode(codec) -> str or unicode or UnicodeDecodeError or TypeError... probably other exceptions

Granted, unicode.encode(codec) and unicode.decode(codec) are actually somewhat sane in that the return type is always a str and the exceptions are either UnicodeEncodeError or UnicodeDecodeError. I think that rot13 is the only conceptually text-to-text transform (though the current implementation is really bytes-to-bytes); everything else is either bytes -> text or bytes -> bytes.
-bob
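This is essentially how Python 3 resolved the insanity: str.encode always returns bytes, bytes.decode always returns str, and the transform codecs moved behind codecs.encode()/codecs.decode():

```python
import codecs

assert 'abc'.encode('utf-8') == b'abc'    # text -> bytes, always
assert b'abc'.decode('utf-8') == 'abc'    # bytes -> text, always

# rot13 is a text -> text transform, reachable only through codecs:
assert codecs.encode('abc', 'rot_13') == 'nop'

# Using a transform codec on str.encode is rejected outright:
try:
    'abc'.encode('rot_13')
except LookupError:
    pass  # "'rot_13' is not a text encoding; use codecs.encode() ..."
else:
    raise AssertionError('expected LookupError')
```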
Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]
On Feb 17, 2006, at 4:20 PM, Martin v. Löwis wrote: Ian Bicking wrote: Maybe it isn't worse, but the real alternative is: import zlib; import base64; base64.b64encode(zlib.compress(s)). Encodings cover up eclectic interfaces, where those interfaces fit a basic pattern -- data in, data out. So should I write 3.1415.encode("sin"), or would that be 3.1415.decode("sin")? What about "http://www.python.org".decode("URL")? It's data in, data out, after all. Who needs functions? Well, 3.1415.decode("sin") is of course NaN, because 3.1415.encode("sinh") is not defined for numbers outside of [-1, 1] :) -bob
Re: [Python-Dev] bytes.from_hex()
On Feb 17, 2006, at 8:33 PM, Josiah Carlson wrote: Greg Ewing [EMAIL PROTECTED] wrote: Stephen J. Turnbull wrote: "Guido" == Guido van Rossum [EMAIL PROTECTED] writes: Guido> - b = bytes(t, enc); t = text(b, enc) +1 The coding conversion operation has always felt like a constructor to me, and in this particular usage that's exactly what it is. I prefer the nomenclature to reflect that. This also has the advantage that it completely avoids using the verbs encode and decode and the attendant confusion about which direction they go in. e.g. s = text(b, "base64") makes it obvious that you're going from the binary side to the text side of the base64 conversion. But you aren't always getting *unicode* text from the decoding of bytes, and you may be encoding bytes *to* bytes: b2 = bytes(b, "base64"); b3 = bytes(b2, "base64"). Which direction are we going again? This is *exactly* why the current set of codecs are INSANE. unicode.encode and str.decode should be used *only* for unicode codecs. Byte transforms are entirely different semantically and should be some other method pair. -bob
Re: [Python-Dev] bdist_* to stdlib?
On Feb 15, 2006, at 4:49 AM, Jan Claeys wrote: On Wed, 15-02-2006 at 14:00 +1300, Greg Ewing wrote: I'm disappointed that the various Linux distributions still don't seem to have caught onto the very simple idea of *not* scattering files all over the place when installing something. MacOSX seems to be the only system so far that has got this right -- organising the system so that everything related to a given application or library can be kept under a single directory, clearly labelled with a version number. Those directories might be mounted on entirely different hardware (even over a network), often with different characteristics (access speed, writeability, etc.). Huh? What does that have to do with anything? I've never seen a system where /usr/include, /usr/lib, /usr/bin, etc. are not all on the same mount. It's not really any different with OS X either. -bob
Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]
On Feb 15, 2006, at 6:35 PM, Aahz wrote: On Tue, Feb 14, 2006, Guido van Rossum wrote: Anyway, I'm now convinced that bytes should act as an array of ints, where the ints are restricted to range(0, 256) but have type int. range(0, 255)? No, Guido was correct. range(0, 256) is [0, 1, 2, ..., 255]. -bob
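The half-open interval is the point of confusion here: range(0, 256) excludes its stop value, yielding exactly the 256 byte values 0 through 255:

```python
r = range(0, 256)
assert len(r) == 256           # one entry per possible byte value
assert r[0] == 0 and r[-1] == 255
assert 256 not in r            # the stop value is excluded
```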