Re: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
PJ Eby writes: By clear, I mean free of prior assumptions. Ah, well, I guess I've just run into a personal limitation. I can't imagine thinking that is free of prior assumptions. Not my ownwink/, and not by others, either. So, unfortunately, I was left with the conventional opposition in thinking: clear vs. muddy. That impression was only strengthened by the phrase vs. simply copying fields from distro tools. Put together, the phrase clear thinking on concrete use cases means (at least to me), dropping all preconceptions of the existing design and starting over from square one, to ask how best the problem may be solved, using specific examples as a guide rather than using generalities. Sure, but ISTM that's the opposite of what you've actually been doing, at least in terms of contributing to my understanding. One obstacle to discussion you have contributed to overcoming in my thinking is the big generality that the packager (ie, the person writing the metadata) is in a position to recommend good behavior to the installation tool, vs. being in a position to point out relevant considerations for users and tools installing the packager's product. Until that generality is formulated and expressed, it's very difficult to see why the examples and particular solutions to use cases that various proponents have described fail to address some real problems. It was difficult for me to see, at first, what distinction was actually being made. Specifically, I thought that the question about Obsoletes vs. Obsoleted-By was about which package should be considered authoritative about obsolescence. That is a reasonable distinction for that particular discussion, but there is a deeper, and general, principle behind that. Namely, metadata is descriptive, not prescriptive. Of course once one understands that principle, the names of the fields don't matter so much, but it is helpful for naive users of the metadata if the field names strongly connote description of the package rather than behavior of the tool. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
On 10.12.12 05:30, Raymond Hettinger wrote: On Dec 9, 2012, at 10:03 PM, MRAB pyt...@mrabarnett.plus.com mailto:pyt...@mrabarnett.plus.com wrote: What happens when a key is deleted? Will that leave an empty slot in the entry array? Yes. See the __delitem__() method in the pure python implemention at http://code.activestate.com/recipes/578375 You can move the last entry on freed place. This will preserve compactness of entries array and simplify and speedup iterations and some other operations. def __delitem__(self, key, hashvalue=None): if hashvalue is None: hashvalue = hash(key) found, i = self._lookup(key, hashvalue) if found: index = self.indices[i] self.indices[i] = DUMMY self.size -= 1 if index != size: lasthash = self.hashlist[index] lastkey = self.keylist[index] found, j = self._lookup(lastkey, lasthash) assert found assert i != j self.indices[j] = index self.hashlist[index] = lasthash self.keylist[index] = lastkey self.valuelist[index] = self.valuelist[size] index = size self.hashlist[index] = UNUSED self.keylist[index] = UNUSED self.valuelist[index] = UNUSED else: raise KeyError(key) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
On 10.12.12 09:48, Christian Heimes wrote: On the other hand every lookup and collision checks needs an additional multiplication, addition and pointer dereferencing: entry = entries_baseaddr + sizeof(PyDictKeyEntry) * idx I think that main slowdown will be in getting index from array with variable size of elements. It requires conditional jump which is not such cheap as additional or shifting. switch (self-index_size) { case 1: idx = ((uint8_t *)self-indices)[i]; break; case 2: idx = ((uint16_t *)self-indices)[i]; break; ... } I think that variable-size indices don't worth effort. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
On 10/12/12 01:44, Raymond Hettinger wrote: The current memory layout for dictionaries is unnecessarily inefficient. It has a sparse table of 24-byte entries containing the hash value, key pointer, and value pointer. Instead, the 24-byte entries should be stored in a dense table referenced by a sparse table of indices. What minimum size and resizing factor do you propose for the entries array? Cheers, Mark. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Do more at compile time; less at runtime
Do know my registervm project? I try to emit better bytecode. See the implementation on http://hg.python.org/sandbox/registervm See for example the documentation: hg.python.org/sandbox/registervm/file/tip/REGISTERVM.txt http://hg.python.org/sandbox/registervm I implemented other optimisations. See also WPython project which uses also more efficient bytecode. http://code.google.com/p/wpython/ Victor Le 9 déc. 2012 23:23, Mark Shannon m...@hotpy.org a écrit : Hi all, The current CPython bytecode interpreter is rather more complex than it needs to be. A number of bytecodes could be eliminated and a few more simplified by moving the work involved in handling compound statements (loops, try-blocks, etc) from the interpreter to the compiler. This simplest example of this is the while loop... while cond: body This currently compiled as start: if not cond goto end body goto start end: but it could be compiled as goto test: start: body if cond goto start which eliminates one instruction per iteration. A more complex example is a return in a try-finally block. try: part1 if cond: return X part2 finally: part3 Currently, handling the return is complex and involves pseudo exceptions, but if part3 were duplicated by the compiler, then the RETURN bytecode could just perform a simple return. The code above would be compiled thus... PUSH_BLOCK try part1 if not X goto endif push X POP_BLOCK part3duplicated RETURN_VALUE endif: part2 POP_BLOCK part3duplicated The changes I am proposing are: Allow negative line deltas in the lnotab array (bytecode deltas would remain non-negative) Remove the SETUP_LOOP, BREAK and CONTINUE bytecodes Simplify the RETURN bytecode Eliminate pseudo exceptions from the interpreter Simplify (or perhaps eliminate) SETUP_TRY, END_FINALLY, END_WITH. Reverse the sense of the FOR_ITER bytecode (ie. jump on not-exhausted) The net effect of these changes would be: Reduced code size and reduced code complexity. A small (1-5%)? increase in speed, due the simplification of the bytecodes and a very small change in the number of bytecodes executed. A small change in the static size of the bytecodes (-2% to +2%)? Although this is a quite intrusive change, I think it is worthwhile as it simplifies ceval.c considerably. The interpreter has become rather convoluted and any simplification has to be a good thing. I've already implemented negative line deltas and the transformed while loop: https://bitbucket.org/**markshannon/cpython-lnotab-**signedhttps://bitbucket.org/markshannon/cpython-lnotab-signed I'm currently working on the block unwinding. So, Good idea? Bad idea? Should I write a PEP or is the bug tracker sufficient? Cheers, Mark. __**_ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/**mailman/listinfo/python-devhttp://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/**mailman/options/python-dev/** victor.stinner%40gmail.comhttp://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
Hi Raymond, On Mon, Dec 10, 2012 at 2:44 AM, Raymond Hettinger raymond.hettin...@gmail.com wrote: Instead, the data should be organized as follows: indices = [None, 1, None, None, None, 0, None, 2] entries = [[-9092791511155847987, 'timmy', 'red'], [-8522787127447073495, 'barry', 'green'], [-6480567542315338377, 'guido', 'blue']] As a side note, your suggestion also enables order-preserving dictionaries: iter() would automatically yield items in the order they were inserted, as long as there was no deletion. People will immediately start relying on this feature... and be confused by the behavior of deletion. :-/ A bientôt, Armin. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Distutils] Is is worth disentangling distutils?
On Mon, Dec 10, 2012 at 2:22 AM, Antonio Cavallo a.cava...@cavallinux.euwrote: Hi, I wonder if is it worth/if there is any interest in trying to clean up distutils: nothing in terms to add new features, just a *major* cleanup retaining the exact same interface. I'm not planning anything like *adding features* or rewriting rpm/rpmbuild here, simply cleaning up that un-holy code mess. Yes it served well, don't get me wrong, and I think it did work much better than anything it was meant to replace it. I'm not into the py3 at all so I wonder how possibly it could fit/collide into the big plan. Or I'll be wasting my time? It has been tried before. IIUC the nature of distutils and extending distutils is that client code depends on the entire tangle. If you try to clean it up you will break backwards compatibility. distutils2 is designed to break backwards compatibility with distutils and is essentially a cleaned up distutils. Have you tried Bento? http://bento.readthedocs.org/en/latest/ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Conflicts [was Re: Keyword meanings [was: Accept just PEP-0426]]
On Sun, Dec 09, 2012 at 01:51:09PM +1100, Chris Angelico wrote: On Sun, Dec 9, 2012 at 1:11 PM, Steven D'Aprano st...@pearwood.info wrote: Why would a software package called Spam install a top-level module called Jam rather than Spam? Isn't the whole point of Python packages to solve this namespace problem? That would require/demand that the software package MUST define a module with its own name, and MUST NOT define any other top-level modules, and also that package names MUST be unique. (RFC 2119 keywords.) That would work, as long as those restrictions are acceptable. /me notes that setuptools itself is an example of a package that violates this rule )setuptools and pkg_resources). No objections to That would work, as long as those restrictions are acceptable that seems to sum up where we're at. -Toshio pgpC3zr5ZrDtv.pgp Description: PGP signature ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Conflicts [was Re: Keyword meanings [was: Accept just PEP-0426]]
On Sun, Dec 09, 2012 at 12:48:45AM -0500, PJ Eby wrote: Do any of the distro folks know of a Python project tagged as conflicting with another for their distro, where the conflict does *not* involve any files in conflict? In Fedora we do work to avoid most types of Conflicts (backporting fixes, etc) but I can give some examples of where Conflivts could have been used in the past: In docutils prior to the latest release, certain portions of docutils was broken if pyxml was installed (since pyxml replaces certain stdlib xml.* functionaltiy). So older docutils versions could have had a Conflicts: PyXML. Nick has since provided a technique for docutils to use that loads from the stdlib first and only goes to PyXML if the functionality is not available there. Various libraries in web stacks have had bugs that prevent the propser functioning of the web framework at the top level. In case of major issues (security, unable to startup), these top level frameworks could use versioned Conflicts to prevent installation. For instance: TurboGears might have a Conflicts: CherryPy 2.3.1 Note, though, that if parallel installable versions and selection of the proper versions from that work, then this type of Conflict wouldn't be necessary. Instead you'd have versioned Requires: instead. -Toshio pgpb06wUdfxFB.pgp Description: PGP signature ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
On Dec 10, 2012, at 08:48 AM, Christian Heimes wrote: It's hard to predict how the extra CPU instructions are going to affect the performance on different CPUs and scenarios. My guts tell me that your proposal is going to perform better on modern CPUs with lots of L1 and L2 cache and in an average case scenario. A worst case scenario with lots of collisions might be measurable slower for large dicts and small CPU cache. I'd be interested to see what effect this has on memory constrained platforms such as many current ARM applications (mostly likely 32bit for now). Python's memory consumption is an overheard complaint for folks developing for those platforms. Cheers, -Barry ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
Le Mon, 10 Dec 2012 10:40:30 +0100, Armin Rigo ar...@tunes.org a écrit : Hi Raymond, On Mon, Dec 10, 2012 at 2:44 AM, Raymond Hettinger raymond.hettin...@gmail.com wrote: Instead, the data should be organized as follows: indices = [None, 1, None, None, None, 0, None, 2] entries = [[-9092791511155847987, 'timmy', 'red'], [-8522787127447073495, 'barry', 'green'], [-6480567542315338377, 'guido', 'blue']] As a side note, your suggestion also enables order-preserving dictionaries: iter() would automatically yield items in the order they were inserted, as long as there was no deletion. People will immediately start relying on this feature... and be confused by the behavior of deletion. :-/ If that's really an issue, we can deliberately scramble the iteration order a bit :-) (of course it might negatively impact HW prefetching) Regards Antoine. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
On 10/12/12 20:40, Armin Rigo wrote: As a side note, your suggestion also enables order-preserving dictionaries: iter() would automatically yield items in the order they were inserted, as long as there was no deletion. People will immediately start relying on this feature... and be confused by the behavior of deletion. :-/ If we want to avoid the attractive nuisance of iteration order being almost, but not quite, order-preserving, there is a simple fix: when iterating over the dict, instead of starting at the start of the table, start at some arbitrary point in the middle and wrap around. That will increase the cost of iteration slightly, but avoid misleading behaviour. I think all we need do is change the existing __iter__ method from this: def __iter__(self): for key in self.keylist: if key is not UNUSED: yield key to this: # Untested! def __iter__(self): # Choose an arbitrary starting point. # 103 is chosen because it is prime. n = (103 % len(self.keylist)) if self.keylist else 0 for key in self.keylist[n:] + self.keylist[:n]: # I presume the C version will not duplicate the keylist. if key is not UNUSED: yield key This mixes the order of iteration up somewhat, just enough to avoid misleading people into thinking that this is order-preserving. The order will depend on the size of the dict, not the history. For example, with keys a, b, c, ... added in that order, the output is: len=1 key=a len=2 key=ba len=3 key=bca len=4 key=dabc len=5 key=deabc len=6 key=bcdefa len=7 key=fgabcde len=8 key=habcdefg len=9 key=efghiabcd which I expect is mixed up enough to discourage programmers from jumping to conclusions about dicts having guaranteed order. -- Steven ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
On Mon, Dec 10, 2012 at 10:48 AM, Antoine Pitrou solip...@pitrou.net wrote: Le Mon, 10 Dec 2012 10:40:30 +0100, Armin Rigo ar...@tunes.org a écrit : Hi Raymond, On Mon, Dec 10, 2012 at 2:44 AM, Raymond Hettinger raymond.hettin...@gmail.com wrote: Instead, the data should be organized as follows: indices = [None, 1, None, None, None, 0, None, 2] entries = [[-9092791511155847987, 'timmy', 'red'], [-8522787127447073495, 'barry', 'green'], [-6480567542315338377, 'guido', 'blue']] As a side note, your suggestion also enables order-preserving dictionaries: iter() would automatically yield items in the order they were inserted, as long as there was no deletion. People will immediately start relying on this feature... and be confused by the behavior of deletion. :-/ If that's really an issue, we can deliberately scramble the iteration order a bit :-) (of course it might negatively impact HW prefetching) On the other hand, this would also make a fast ordered dictionary subclass possible, just by not using the free list for additions, combined with periodic compaction before adds or after deletes. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
On Sun, Dec 9, 2012 at 10:38 PM, Andrew McNabb amcn...@mcnabbs.org wrote: On Fri, Dec 07, 2012 at 05:02:26PM -0500, PJ Eby wrote: If the packages have files in conflict, they won't be both installed. If they don't have files in conflict, there's nothing important to be informed of. If one is installing pexpect-u, then one does not need to discover that it is a successor of pexpect. In the specific case of pexpect and pexpect-u, the files don't actually conflict. The pexpect package includes a pexpect.py file, while pexpect-u includes a pexpect/ directory. These conflict, but not in the easily detectable sense. Excellent! A concrete non-file use case. Setuptools handles this particular scenario by including a list of top-level module or package names, but newer tools ought to look out for this scenario, too. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
Le Mon, 10 Dec 2012 11:16:49 -0500, PJ Eby p...@telecommunity.com a écrit : On Mon, Dec 10, 2012 at 10:48 AM, Antoine Pitrou solip...@pitrou.net wrote: Le Mon, 10 Dec 2012 10:40:30 +0100, Armin Rigo ar...@tunes.org a écrit : Hi Raymond, On Mon, Dec 10, 2012 at 2:44 AM, Raymond Hettinger raymond.hettin...@gmail.com wrote: Instead, the data should be organized as follows: indices = [None, 1, None, None, None, 0, None, 2] entries = [[-9092791511155847987, 'timmy', 'red'], [-8522787127447073495, 'barry', 'green'], [-6480567542315338377, 'guido', 'blue']] As a side note, your suggestion also enables order-preserving dictionaries: iter() would automatically yield items in the order they were inserted, as long as there was no deletion. People will immediately start relying on this feature... and be confused by the behavior of deletion. :-/ If that's really an issue, we can deliberately scramble the iteration order a bit :-) (of course it might negatively impact HW prefetching) On the other hand, this would also make a fast ordered dictionary subclass possible, just by not using the free list for additions, combined with periodic compaction before adds or after deletes. I suspect that's what Raymond has in mind :-) Regards Antoine. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
On Mon, Dec 10, 2012 at 10:28 AM, Barry Warsaw ba...@python.org wrote: On Dec 10, 2012, at 08:48 AM, Christian Heimes wrote: It's hard to predict how the extra CPU instructions are going to affect the performance on different CPUs and scenarios. My guts tell me that your proposal is going to perform better on modern CPUs with lots of L1 and L2 cache and in an average case scenario. A worst case scenario with lots of collisions might be measurable slower for large dicts and small CPU cache. I'd be interested to see what effect this has on memory constrained platforms such as many current ARM applications (mostly likely 32bit for now). Python's memory consumption is an overheard complaint for folks developing for those platforms. Maybe Kristjan can shed some light on how this would have helped him on the ps3 work he has been doing for Dust 514 as he has had memory issues there. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
On Mon, Dec 10, 2012 at 3:27 AM, Stephen J. Turnbull step...@xemacs.org wrote: PJ Eby writes: By clear, I mean free of prior assumptions. Ah, well, I guess I've just run into a personal limitation. I can't imagine thinking that is free of prior assumptions. Not my ownwink/, and not by others, either. I suppose I should have said, free of *known* prior assumptions, since the trick to suspending assumptions is to find the ones you *have*. The deeper assumptions, alas, can usually only be found by clashing opinions with others... then stepping back and going, wait... what does he/she believe that's *different* from what I believe, that allows them to have that differing opinion? And then that's how you find out what it is that *you're* assuming, that you didn't know you were assuming. ;-) (Not to mention what the other person is.) Put together, the phrase clear thinking on concrete use cases means (at least to me), dropping all preconceptions of the existing design and starting over from square one, to ask how best the problem may be solved, using specific examples as a guide rather than using generalities. Sure, but ISTM that's the opposite of what you've actually been doing, at least in terms of contributing to my understanding. One obstacle to discussion you have contributed to overcoming in my thinking is the big generality that the packager (ie, the person writing the metadata) is in a position to recommend good behavior to the installation tool, vs. being in a position to point out relevant considerations for users and tools installing the packager's product. Right, but I started from a concrete scenario I wanted to avoid, which led me to question the assumption that those fields were actually useful. As soon as I began questioning *that* assumption and asking for use cases (2 years ago, in the last PEP 345 revision discussion), it became apparent to me that there was something seriously wrong with the conflicts and obsoletes fields, as they had almost no real utility as they were defined and understood at that point. Until that generality is formulated and expressed, it's very difficult to see why the examples and particular solutions to use cases that various proponents have described fail to address some real problems. Unfortunately, it's a chicken-and-egg problem: until you know what assumptions are being made, you can't formulate them. It's an iterative process of exposing assumptions, until you succeed in actually communicating. ;-) Heck, even something as simple as my assumptions about what clear thinking meant and what I was trying to say has taken some back and forth to clarify. ;-) It was difficult for me to see, at first, what distinction was actually being made. Specifically, I thought that the question about Obsoletes vs. Obsoleted-By was about which package should be considered authoritative about obsolescence. That is a reasonable distinction for that particular discussion, but there is a deeper, and general, principle behind that. Namely, metadata is descriptive, not prescriptive. Actually, the principle I was clinging to for *both* fields was not giving project authors authority over other people's projects. It's fine for metadata to be prescriptive (e.g. requirements), it's just that it should be prescriptive *only* for that project in isolation. (In the broader sense, it also applies to the distro situation: the project author doesn't really have authority over the distro, either, so it can only be a suggestion there, as well.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
On Fri, Dec 7, 2012 at 10:46 PM, PJ Eby p...@telecommunity.com wrote: In any case, as I said before, I don't have an issue with the fields all being declared as being for informational purposes only. My issue is only with recommendations for automated tool behavior that permit one project's author to exercise authority over another project's installation. Skipping over a lot of other replies between you and I because I think that we disagree on a lot but that's all moot if we agree here. I have no problems with Obsoletes, Conflicts, Requires, and Provides types of fields are marked informational. In fact, there are many cases where packages are overzealous in their use of Requires right now that cause distributions to patch the dependency information in the package metadata. -Toshio ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
Am 10.12.2012 16:28, schrieb Barry Warsaw: I'd be interested to see what effect this has on memory constrained platforms such as many current ARM applications (mostly likely 32bit for now). Python's memory consumption is an overheard complaint for folks developing for those platforms. I had ARM platforms in mind, too. Unfortunately we don't have any ARM platforms for testing and performance measurement. Even Trent's Snakebite doesn't have ARM boxes. I've a first generation Raspberry Pi, that's all. Perhaps somebody (PSF ?) is willing to donate a couple of ARM boards to Snakebite. I'm thinking of Raspberry Pi (ARMv6), Pandaboard (ARMv7 Cortex-A9) and similar. Christian ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
On 10 December 2012 16:35, Toshio Kuratomi a.bad...@gmail.com wrote: I have no problems with Obsoletes, Conflicts, Requires, and Provides types of fields are marked informational. In fact, there are many cases where packages are overzealous in their use of Requires right now that cause distributions to patch the dependency information in the package metadata. Given the endless debate on these fields, and the fact that it pretty much all seems to be about what happens when tools enforce them, I'm +1 on this. Particularly as these fields were not the focus of this change to the spec in any case. Paul. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
Indeed, I had to change the dict tuning parameters to make dicts behave reasonably on the PS3. Interesting stuff indeed. -Original Message- From: Python-Dev [mailto:python-dev- bounces+kristjan=ccpgames@python.org] On Behalf Of Barry Warsaw Sent: 10. desember 2012 15:28 To: python-dev@python.org Subject: Re: [Python-Dev] More compact dictionaries with faster iteration I'd be interested to see what effect this has on memory constrained platforms such as many current ARM applications (mostly likely 32bit for now). Python's memory consumption is an overheard complaint for folks developing for those platforms. Cheers, -Barry ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
On Dec 10, 2012, at 05:39 PM, Christian Heimes wrote: I had ARM platforms in mind, too. Unfortunately we don't have any ARM platforms for testing and performance measurement. Even Trent's Snakebite doesn't have ARM boxes. I've a first generation Raspberry Pi, that's all. I have a few ARM machines that I can use for testing, though I can't provide external access to them. * http://buildbot.python.org/all/buildslaves/warsaw-ubuntu-arm (Which oops, I see is down. Why did I not get notifications about that?) * I have a Nexus 7 flashed with Ubuntu 12.10 (soon to be 13.04). * Pandaboard still sitting in its box. ;) Perhaps somebody (PSF ?) is willing to donate a couple of ARM boards to Snakebite. I'm thinking of Raspberry Pi (ARMv6), Pandaboard (ARMv7 Cortex-A9) and similar. Suitable ARM boards can be had cheap, probably overwhelmed by labor costs of getting them up and running. I am not offering for *that*. ;) Cheers, -Barry signature.asc Description: PGP signature ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
Hi Philip, On Mon, Dec 10, 2012 at 5:16 PM, PJ Eby p...@telecommunity.com wrote: On the other hand, this would also make a fast ordered dictionary subclass possible, just by not using the free list for additions, combined with periodic compaction before adds or after deletes. Technically, I could see Python switching to ordered dictionaries everywhere. Raymond's insight suddenly makes it easy for CPython and PyPy, and at least Jython could use the LinkedHashMap class (although this would need checking with Jython guys). I'd vaguely argue that dictionary orders are one of the few non-reproducible factors in a Python program, so it might be a good thing. But only vaguely --- maybe I got far too used to random orders over time... A bientôt, Armin. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
On Mon, 10 Dec 2012 16:06:23 +, krist...@ccpgames.com wrote: Indeed, I had to change the dict tuning parameters to make dicts behave reasonably on the PS3. Interesting stuff indeed. Out of both curiosity and a possible application of my own for the information, what values did you end up with? --David ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Is is worth disentangling distutils?
Hi, I wonder if is it worth/if there is any interest in trying to clean up distutils: nothing in terms to add new features, just a *major* cleanup retaining the exact same interface. I'm not planning anything like *adding features* or rewriting rpm/rpmbuild here, simply cleaning up that un-holy code mess. Yes it served well, don't get me wrong, and I think it did work much better than anything it was meant to replace it. I'm not into the py3 at all so I wonder how possibly it could fit/collide into the big plan. Or I'll be wasting my time? Thanks ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Is is worth disentangling distutils?
On Mon, Dec 10, 2012 at 1:22 AM, Antonio Cavallo a.cava...@cavallinux.eu wrote: I'm not into the py3 at all so I wonder how possibly it could fit/collide into the big plan. Or I'll be wasting my time? If you're not doing it on Python 3 then you are wasting your time. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
On Mon, Dec 10, 2012 at 1:01 PM, Armin Rigo ar...@tunes.org wrote: On Mon, Dec 10, 2012 at 5:16 PM, PJ Eby p...@telecommunity.com wrote: On the other hand, this would also make a fast ordered dictionary subclass possible, just by not using the free list for additions, combined with periodic compaction before adds or after deletes. Technically, I could see Python switching to ordered dictionaries everywhere. Raymond's insight suddenly makes it easy for CPython and PyPy, and at least Jython could use the LinkedHashMap class (although this would need checking with Jython guys). What about IronPython? Also, note that using ordered dictionaries carries a performance cost for dictionaries whose keys change a lot. This probably wouldn't affect most dictionaries in most programs, because module and object dictionaries generally don't delete and re-add a lot of keys very often. But in cases where a dictionary is used as a queue or stack or something of that sort, the packing costs could add up. Under the current scheme, as long as collisions were minimal, the contents wouldn't be repacked very often. Without numbers it's hard to say for certain, but the advantage to keeping ordered dictionaries a distinct type is that the standard dictionary type can then get that extra bit of speed in exchange for dropping the ordering requirement. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
On Mon, 10 Dec 2012 11:53:17 -0500 Barry Warsaw ba...@python.org wrote: On Dec 10, 2012, at 05:39 PM, Christian Heimes wrote: I had ARM platforms in mind, too. Unfortunately we don't have any ARM platforms for testing and performance measurement. Even Trent's Snakebite doesn't have ARM boxes. I've a first generation Raspberry Pi, that's all. I have a few ARM machines that I can use for testing, though I can't provide external access to them. * http://buildbot.python.org/all/buildslaves/warsaw-ubuntu-arm (Which oops, I see is down. Why did I not get notifications about that?) Because buildbot.python.org is waiting for someone (mail.python.org, perhaps) to accept its SMTP requests. Feel free to ping the necessary people ;-) Regards Antoine. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Emacs users: hg-tools-grep
P.S. Who wants to abuse Jono and Matthew's copyright again and provide a shudder git version? Oh, I do! I also feel weird about adding a copyright to this, but how will other people feel comfortable using it if I don't? Also I put it in github, in case people want to fix it: https://github.com/quodlibetor/git-tools.el ;; Copyright (c) 2012 Brandon W Maister ;; ;; Permission is hereby granted, free of charge, to any person obtaining ;; a copy of this software and associated documentation files (the ;; Software), to deal in the Software without restriction, including ;; without limitation the rights to use, copy, modify, merge, publish, ;; distribute, sublicense, and/or sell copies of the Software, and to ;; permit persons to whom the Software is furnished to do so, subject to ;; the following conditions: ;; ;; The above copyright notice and this permission notice shall be ;; included in all copies or substantial portions of the Software. ;; ;; THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, ;; EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF ;; MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND ;; NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE ;; LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION ;; OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION ;; WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ;; This code is based on hg-tools.el, which in turn is based on bzr-tools.el ;; Copyright (c) 2008-2012 Jonathan Lange, Matthew Lefkowitz, Barry A. Warsaw (provide 'git-tools) (defconst git-tools-grep-command git ls-files -z | xargs -0 grep -In %s The command used for grepping files using git. See `git-tools-grep'.) ;; Run 'code' at the root of the branch which dirname is in. (defmacro git-tools-at-branch-root (dirname rest code) `(let ((default-directory (locate-dominating-file (expand-file-name ,dirname) .git))) ,@code)) (defun git-tools-grep (expression dirname) Search a branch for `expression'. If there's a C-u prefix, prompt for `dirname'. (interactive (let* ((string (read-string Search for: )) (dir (if (null current-prefix-arg) default-directory (read-directory-name (format Search for %s in: string) (list string dir))) (git-tools-at-branch-root dirname (grep-find (format git-tools-grep-command (shell-quote-argument expression) On Sat, Dec 8, 2012 at 4:51 PM, Barry Warsaw ba...@python.org wrote: Hark fellow Emacsers. All you unenlightened heathens can stop reading now. A few years ago, my colleague Jono Lange wrote probably the best little chunk of Emacs lisp ever. `M-x bzr-tools-grep` lets you easily search a Bazaar repository for a case-sensitive string, providing you with a nice *grep* buffer which you can scroll through. When you find a code sample you want to look at, C-c C-c visits the file and plops you right at the matching line. You *only* grep through files under version control, so you get to ignore generated files, and compilation artifacts, etc. Of course, this doesn't help you for working on the Python code base, because Mercurial. I finally whipped up this straight up rip of Jono's code to work with hg. I'm actually embarrassed to put a copyright on this thing, and would happily just donate it to Jono, drop it in Python's Misc directory, or slip it like a lump of coal into the xmas stocking of whoever wants to maintain it for the next 20 years. But anyway, it's already proven enormously helpful to me, so here it is. Cheers, -Barry P.S. Who wants to abuse Jono and Matthew's copyright again and provide a shudder git version? ;; Copyright (c) 2012 Barry A. Warsaw ;; ;; Permission is hereby granted, free of charge, to any person obtaining ;; a copy of this software and associated documentation files (the ;; Software), to deal in the Software without restriction, including ;; without limitation the rights to use, copy, modify, merge, publish, ;; distribute, sublicense, and/or sell copies of the Software, and to ;; permit persons to whom the Software is furnished to do so, subject to ;; the following conditions: ;; ;; The above copyright notice and this permission notice shall be ;; included in all copies or substantial portions of the Software. ;; ;; THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, ;; EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF ;; MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND ;; NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE ;; LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION ;; OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION ;; WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ;; This code is based on bzr-tools.el ;; Copyright (c) 2008-2009 Jonathan Lange, Matthew Lefkowitz (provide
Re: [Python-Dev] Emacs users: hg-tools-grep
On Dec 10, 2012, at 03:02 PM, Brandon W Maister wrote: I also feel weird about adding a copyright to this, but how will other people feel comfortable using it if I don't? Nice! I've been told that Emacs 23 is probably a minimum requirement because locate-dominating-file is missing in = Emacs 22. I'm using Emacs 24.2 almost everywhere these days. -Barry signature.asc Description: PGP signature ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
On 12/10/2012 1:38 PM, PJ Eby wrote: On Mon, Dec 10, 2012 at 1:01 PM, Armin Rigo ar...@tunes.org wrote: On Mon, Dec 10, 2012 at 5:16 PM, PJ Eby p...@telecommunity.com wrote: On the other hand, this would also make a fast ordered dictionary subclass possible, just by not using the free list for additions, combined with periodic compaction before adds or after deletes. Technically, I could see Python switching to ordered dictionaries everywhere. Raymond's insight suddenly makes it easy for CPython and PyPy, and at least Jython could use the LinkedHashMap class (although this would need checking with Jython guys). What about IronPython? Also, note that using ordered dictionaries carries a performance cost for dictionaries whose keys change a lot. This probably wouldn't affect most dictionaries in most programs, because module and object dictionaries generally don't delete and re-add a lot of keys very often. But in cases where a dictionary is used as a queue or stack or something of that sort, the packing costs could add up. Under the current scheme, as long as collisions were minimal, the contents wouldn't be repacked very often. Without numbers it's hard to say for certain, but the advantage to keeping ordered dictionaries a distinct type is that the standard dictionary type can then get that extra bit of speed in exchange for dropping the ordering requirement. I think that there should be a separate OrderedDict class (or subclass) and that the dict docs should warn (if not now) that while iterating a dict *may*, in some circumstances, give items in insertion *or* sort order, it *will not* in all cases. Example of the latter: d = {8:0, 9:0, 15:0, 13:0, 14:0} for k in d: print(k) 8 9 13 14 15 If one entered the keys in sorted order, as might be natural, one might mistakenly think that insertion order was being reproduced. -- Terry Jan Reedy ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
On 11 December 2012 05:01, Armin Rigo ar...@tunes.org wrote: Hi Philip, On Mon, Dec 10, 2012 at 5:16 PM, PJ Eby p...@telecommunity.com wrote: On the other hand, this would also make a fast ordered dictionary subclass possible, just by not using the free list for additions, combined with periodic compaction before adds or after deletes. Technically, I could see Python switching to ordered dictionaries everywhere. Raymond's insight suddenly makes it easy for CPython and PyPy, and at least Jython could use the LinkedHashMap class (although this would need checking with Jython guys). I'd vaguely argue that dictionary orders are one of the few non-reproducible factors in a Python program, so it might be a good thing. But only vaguely --- maybe I got far too used to random orders over time... Whilst I think Python should not move to ordered dictionaries everywhere, I would say there is an argument (no pun intended) for making **kwargs a dictionary that maintains insertion order *if there are no deletions*. It sounds like we could get that for free with this implementation, although from another post IronPython might not have something suitable. I think there are real advantages to doing so - a trivial one being the ability to easily initialise an ordered dictionary from another ordered dictionary. I could also see an argument for having this property for all dicts. There are many dictionaries that are never deleted from (probably most dict literals) and it would be nice to have an iteration order for them that matched the source code. However if deletions occur all bets would be off. If you need to maintain insertion order in the face of deletions, use an explicit ordereddict. Tim Delaney ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Guido, Dropbox, and Python
For those who have not heard, Guido left Google Friday and starts at Dropbox in January. (I hope you enjoy the break in between ;-). https://twitter.com/gvanrossum/status/277126763295944705 https://tech.dropbox.com/2012/12/welcome-guido/ My question, Guido, is how this will affect Python development, and in particular, your work on async. If not proprietary info, does or will Dropbox use Python3? -- Terry Jan Reedy ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Guido, Dropbox, and Python
On Mon, Dec 10, 2012 at 1:51 PM, Terry Reedy tjre...@udel.edu wrote: For those who have not heard, Guido left Google Friday and starts at Dropbox in January. (I hope you enjoy the break in between ;-). Thank you, I am already enjoying it! https://twitter.com/gvanrossum/status/277126763295944705 https://tech.dropbox.com/2012/12/welcome-guido/ My question, Guido, is how this will affect Python development, and in particular, your work on async. If not proprietary info, does or will Dropbox use Python3? It should not change a thing. I don't know whether Dropbox uses Python 3 but I don't think it will make any difference. Of course I have not yet started working at Dropbox so I will be able to give better answers in a few months. -- --Guido van Rossum (python.org/~guido) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
On Mon, Dec 10, 2012 at 11:29 PM, Tim Delaney tim.dela...@aptare.com wrote: On 11 December 2012 05:01, Armin Rigo ar...@tunes.org wrote: Hi Philip, On Mon, Dec 10, 2012 at 5:16 PM, PJ Eby p...@telecommunity.com wrote: On the other hand, this would also make a fast ordered dictionary subclass possible, just by not using the free list for additions, combined with periodic compaction before adds or after deletes. Technically, I could see Python switching to ordered dictionaries everywhere. Raymond's insight suddenly makes it easy for CPython and PyPy, and at least Jython could use the LinkedHashMap class (although this would need checking with Jython guys). I'd vaguely argue that dictionary orders are one of the few non-reproducible factors in a Python program, so it might be a good thing. But only vaguely --- maybe I got far too used to random orders over time... Whilst I think Python should not move to ordered dictionaries everywhere, I would say there is an argument (no pun intended) for making **kwargs a dictionary that maintains insertion order *if there are no deletions*. It sounds like we could get that for free with this implementation, although from another post IronPython might not have something suitable. Please, no. dict and kwargs should be unordered without any assumptions. I think there are real advantages to doing so - a trivial one being the ability to easily initialise an ordered dictionary from another ordered dictionary. It can be done with adding short-circuit for OrderedDict class to accept another OrderedDict instance. I could also see an argument for having this property for all dicts. There are many dictionaries that are never deleted from (probably most dict literals) and it would be nice to have an iteration order for them that matched the source code. However if deletions occur all bets would be off. If you need to maintain insertion order in the face of deletions, use an explicit ordereddict. Tim Delaney ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/andrew.svetlov%40gmail.com -- Thanks, Andrew Svetlov ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
On Mon, Dec 10, 2012 at 4:29 PM, Tim Delaney tim.dela...@aptare.com wrote: Whilst I think Python should not move to ordered dictionaries everywhere, I would say there is an argument (no pun intended) for making **kwargs a dictionary that maintains insertion order *if there are no deletions*. Oooh. Me likey. There have been many times where I've wished kwargs were ordered when designing an API. (Oddly, I don't remember any one of the APIs specifically, so don't ask me for a good example. I just remember a bunch of different physical locations where I was when I thought, Ooh, what if I could... no, that's not going to work.) One other useful place for ordered dictionaries is class definitions processed by class decorators: no need to write a metaclass just to know what order stuff was defined in. It sounds like we could get that for free with this implementation, although from another post IronPython might not have something suitable. Actually, IronPython may already have ordered dictionaries by default; see: http://mail.python.org/pipermail/ironpython-users/2006-May/002319.html It's described as an implementation detail that may change, perhaps that could be changed to being unchanging. ;-) I think there are real advantages to doing so - a trivial one being the ability to easily initialise an ordered dictionary from another ordered dictionary. Or to merge two of them together, either at creation or .update(). I'm really starting to wonder if it might not be worth paying the compaction overhead to just make all dictionaries ordered, all the time. The problem is that if you are always adding new keys and deleting old ones (as might be done in a LRU cache, a queue, or other things like that) you'll probably see a lot of compaction overhead compared to today's dicts. OTOH... with a good algorithm for deciding *when* to compact, we can actually make the amortized cost O(1) or so, so maybe that's not a big deal. The cost to do a compaction is at worst, the current size of the table. So if you wait until a table has twice as many entries (more precisely, until the index of the last entry is twice what it was at last compaction), you will amortize the compaction cost down to one entry move per add, or O(1). That would handle the case of a cache or queue, but I'm not sure how it would work with supersized dictionaries that are then deleted down to a fraction of their original size. I suppose if you delete your way down to half the entries being populated, then you end up with two moves per delete, or O(2). (Yes, I know that's not a valid O number.) So, offhand, it seems pretty doable, and unlikely to significantly change worst-case performance even for pathological use cases. (Like using a dict when you should be using a deque.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
On Mon, Dec 10, 2012 at 5:14 PM, Andrew Svetlov andrew.svet...@gmail.com wrote: Please, no. dict and kwargs should be unordered without any assumptions. Why? ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
On Mon, Dec 10, 2012 at 10:01 AM, Armin Rigo ar...@tunes.org wrote: Technically, I could see Python switching to ordered dictionaries everywhere. Raymond's insight suddenly makes it easy for CPython and PyPy, and at least Jython could use the LinkedHashMap class (although this would need checking with Jython guys). I honestly hope this doesn't happen - we use ConcurrentHashMap for our dictionaries (which lack ordering) and I'm sure getting it to preserve insertion order would cost us. -Frank ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
On Dec 10, 2012, at 2:48 AM, Christian Heimes christ...@python.org wrote: On the other hand every lookup and collision checks needs an additional multiplication, addition and pointer dereferencing: entry = entries_baseaddr + sizeof(PyDictKeyEntry) * idx Currently, the dict implementation allows alternative lookup functions based on whether the keys are all strings. The choice of lookup function is stored in a function pointer. That lets each lookup use the currently active lookup function without having to make any computations or branches. Likewise, the lookup functions could be swapped between char, short, long, and ulong index sizes during the resize step. IOW, the selection only needs to be made once per resize, not one per lookup. Raymond___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
On Mon, Dec 10, 2012 at 3:13 PM, fwierzbi...@gmail.com fwierzbi...@gmail.com wrote: On Mon, Dec 10, 2012 at 10:01 AM, Armin Rigo ar...@tunes.org wrote: Technically, I could see Python switching to ordered dictionaries everywhere. Raymond's insight suddenly makes it easy for CPython and PyPy, and at least Jython could use the LinkedHashMap class (although this would need checking with Jython guys). I honestly hope this doesn't happen - we use ConcurrentHashMap for our dictionaries (which lack ordering) and I'm sure getting it to preserve insertion order would cost us. I just found this http://code.google.com/p/concurrentlinkedhashmap/ so maybe it wouldn't be all that bad. I still personally like the idea of leaving basic dict order undetermined (there is already an OrderedDict if you need it right?) But if ConcurrentLinkedHashMap is as good as is suggested on that page then Jython doesn't need to be the thing that blocks the argument. -Frank ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
On Dec 10, 2012, at 4:06 AM, Mark Shannon m...@hotpy.org wrote: Instead, the 24-byte entries should be stored in a dense table referenced by a sparse table of indices. What minimum size and resizing factor do you propose for the entries array? There are many ways to do this. I don't know which is best. The proof-of-concept code uses the existing list resizing logic. Another approach is to pre-allocate the two-thirds maximum (This is simple and fast but gives the smallest space savings.) A hybrid approach is to allocate in two steps (1/3 and then 2/3 if needed). Since hash tables aren't a new problem, there may already be published research on the best way to handle the entries array. Raymond ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
On 10/12/12 23:39, Raymond Hettinger wrote: On Dec 10, 2012, at 4:06 AM, Mark Shannon m...@hotpy.org mailto:m...@hotpy.org wrote: Instead, the 24-byte entries should be stored in a dense table referenced by a sparse table of indices. What minimum size and resizing factor do you propose for the entries array? There are many ways to do this. I don't know which is best. The proof-of-concept code uses the existing list resizing logic. Another approach is to pre-allocate the two-thirds maximum What do you mean by maximum? (This is simple and fast but gives the smallest space savings.) A hybrid approach is to allocate in two steps (1/3 and then 2/3 if needed). I think you need to do some more analysis on this. It is possible that there is some improvement to be had from your approach, but I don't think the improvements will be as large as you have claimed. I suspect that you may be merely trading performance for reduced memory use, which can be done much more easily by reducing the minimum size and increasing the load factor. Cheers, Mark. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
On Dec 10, 2012, at 7:04 PM, Mark Shannon m...@hotpy.org wrote: Another approach is to pre-allocate the two-thirds maximum (This is simple and fast but gives the smallest space savings.) What do you mean by maximum? A dict with an index table size of 8 gets resized when it is two-thirds full, so the maximum number of entries is 5. If you pre-allocate five entries for the initial dict, you've spent 5 * 24 bytes + 8 bytes for the indices for a total of 128 bytes. This compares to the current table of 8 * 24 bytes totaling 192 bytes. Many other strategies are possible. The proof-of-concept code uses the one employed by regular python lists. Their growth pattern is: 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, This produces nice memory savings for entry lists. If you have a suggested allocation pattern or other constructive suggestion, it would be would welcome. Further sniping and unsubstantiated FUD would not. Raymond ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
On Mon, Dec 10, 2012 at 08:53:17AM -0800, Barry Warsaw wrote: On Dec 10, 2012, at 05:39 PM, Christian Heimes wrote: I had ARM platforms in mind, too. Unfortunately we don't have any ARM platforms for testing and performance measurement. Even Trent's Snakebite doesn't have ARM boxes. I've a first generation Raspberry Pi, that's all. Perhaps somebody (PSF ?) is willing to donate a couple of ARM boards to Snakebite. I'm thinking of Raspberry Pi (ARMv6), Pandaboard (ARMv7 Cortex-A9) and similar. Suitable ARM boards can be had cheap, probably overwhelmed by labor costs of getting them up and running. I am not offering for *that*. ;) If someone donates the hardware, I'll take care of everything else. Trent. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More compact dictionaries with faster iteration
On Dec 10, 2012, at 4:02 AM, Serhiy Storchaka storch...@gmail.com wrote: On 10.12.12 05:30, Raymond Hettinger wrote: On Dec 9, 2012, at 10:03 PM, MRAB pyt...@mrabarnett.plus.com mailto:pyt...@mrabarnett.plus.com wrote: What happens when a key is deleted? Will that leave an empty slot in the entry array? Yes. See the __delitem__() method in the pure python implemention at http://code.activestate.com/recipes/578375 You can move the last entry on freed place. This will preserve compactness of entries array and simplify and speedup iterations and some other operations. def __delitem__(self, key, hashvalue=None): if hashvalue is None: hashvalue = hash(key) found, i = self._lookup(key, hashvalue) if found: index = self.indices[i] self.indices[i] = DUMMY self.size -= 1 if index != size: lasthash = self.hashlist[index] lastkey = self.keylist[index] found, j = self._lookup(lastkey, lasthash) assert found assert i != j self.indices[j] = index self.hashlist[index] = lasthash self.keylist[index] = lastkey self.valuelist[index] = self.valuelist[size] index = size self.hashlist[index] = UNUSED self.keylist[index] = UNUSED self.valuelist[index] = UNUSED else: raise KeyError(key) That is a clever improvement. Thank you. Using your idea (plus some tweaks) cleans-up the code a bit (simplifying iteration, simplifying the resizing logic, and eliminating the UNUSED constant). I'm updating the posted code to reflect your suggestion. Thanks again, Raymond ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] where is the python import implemented
Hi, everyone, I am testing modifying the pyc file when it is imported. As I know, there is three situation: 1、runing in the python.exe eg: python.exe test.pyc in this situation, I find the sourceon line 1983 in file pythonrun.c 2、import the pyc from a zip file I find the source on line 1132 in zipimport.c 3、do a normal import eg: two file : main.py and testmodule.py and in main.py: import testmodule in this situation, I can not find the source code how python implement it. I test a wrong format pyc, and got a error "ImportError: bad magic number",and I search "bad magic number" in the source code, I find it is in importlib/_bootstrap.py(line 815),but when I modify this error info(eg: test bad magic) and run again, nothing is changed. It seems that the file is not the correct position.___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] cpython: Using 'long double' to force this structure to be worst case aligned is no
On Tue, 11 Dec 2012 03:05:19 +0100 (CET) gregory.p.smith python-check...@python.org wrote: Using 'long double' to force this structure to be worst case aligned is no longer required as of Python 2.5+ when the gc_refs changed from an int (4 bytes) to a Py_ssize_t (8 bytes) as the minimum size is 16 bytes. The use of a 'long double' triggered a warning by Clang trunk's Undefined-Behavior Sanitizer as on many platforms a long double requires 16-byte alignment but the Python memory allocator only guarantees 8 byte alignment. So our code would allocate and use these structures with technically improper alignment. Though it didn't matter since the 'dummy' field is never used. This silences that warning. Spelunking into code history, the double was added in 2001 to force better alignment on some platforms and changed to a long double in 2002 to appease Tru64. That issue should no loner be present since the upgrade from int to Py_ssize_t where the minimum structure size increased to 16 (unless anyone knows of a platform where ssize_t is 4 bytes?) What?? Every 32-bit platform has a 4 bytes ssize_t (and size_t). We can probably get rid of the double and this union hack all together today. That is a slightly more invasive change that can be left for later. How do you suggest to get rid of it? Some platforms still have strict alignment rules and we must enforce that PyObjects (*) are always aligned to the largest possible alignment, since a PyObject-derived struct can hold arbitrary C types. (*) GC-enabled PyObjects, anyway. Others will be naturally aligned thanks to the memory allocator. What's more, I think you shouldn't be doing this kind of change in a bugfix release. It might break compiled C extensions since you are changing some characteristics of object layout (although you would probably only break those extensions which access the GC header, which is probably not many of them). Resource consumption improvements generally go only into the next feature release. Regards Antoine. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] cpython: Using 'long double' to force this structure to be worst case aligned is no
On Tue, 11 Dec 2012 08:16:27 +0100 Antoine Pitrou solip...@pitrou.net wrote: On Tue, 11 Dec 2012 03:05:19 +0100 (CET) gregory.p.smith python-check...@python.org wrote: Using 'long double' to force this structure to be worst case aligned is no longer required as of Python 2.5+ when the gc_refs changed from an int (4 bytes) to a Py_ssize_t (8 bytes) as the minimum size is 16 bytes. The use of a 'long double' triggered a warning by Clang trunk's Undefined-Behavior Sanitizer as on many platforms a long double requires 16-byte alignment but the Python memory allocator only guarantees 8 byte alignment. So our code would allocate and use these structures with technically improper alignment. Though it didn't matter since the 'dummy' field is never used. This silences that warning. Spelunking into code history, the double was added in 2001 to force better alignment on some platforms and changed to a long double in 2002 to appease Tru64. That issue should no loner be present since the upgrade from int to Py_ssize_t where the minimum structure size increased to 16 (unless anyone knows of a platform where ssize_t is 4 bytes?) What?? Every 32-bit platform has a 4 bytes ssize_t (and size_t). We can probably get rid of the double and this union hack all together today. That is a slightly more invasive change that can be left for later. How do you suggest to get rid of it? Some platforms still have strict alignment rules and we must enforce that PyObjects (*) are always aligned to the largest possible alignment, since a PyObject-derived struct can hold arbitrary C types. Ok, I hadn't seen your proposal. I find it reasonable: “A more correct non-hacky alternative if any alignment issues are still found would be to use a compiler specific alignment declaration on the structure and determine which value to use at configure time.” However, the commit is still problematic, and I think it should be reverted. We can't remove the alignment hack just because it seems to be useless on x86(-64). Regards Antoine. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com