Re: [Python-Dev] Draft PEP for time zone support.
On 12 December 2012 00:58, Nick Coghlan ncogh...@gmail.com wrote:
> I'd prefer a more aggressive name for this like tzdata_override. My rationale is that *nix users need to be thoroughly aware that if they install this package, they will stop benefiting from the automatic tz database updates provided by their OS (especially if they install it into the system site packages on a distro that has migrated to Python 3 for system tools).
>
> Such a name would also make it possible to provide *two* packaged databases, one checked before the OS data (tzdata_override), and one shipped with Python itself that is used only if the OS doesn't provide the timezone database (tzdata_fallback). tzdata_fallback would then be updated to the latest Olson database for each maintenance release.
>
> Cross-platform applications that wanted more reliably up to date timezone data could then conditionally depend on tzdata_override for Windows deployments (using the environment marker support in metadata 1.2+).

That sounds sensible, EIBTI and all that. It is a lot simpler than shipping the package and some sort of auto-updater, too.

Paul

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Guido, Dropbox, and Python
On Dec 10, 2012, at 1:52 PM, Terry Reedy tjre...@udel.edu wrote:
> My question, Guido, is how this will affect Python development, and in particular, your work on async. If not proprietary info, does or will Dropbox use Python3?

I talked to some Dropbox people tonight, and they said they use 2.7 for the client and 2.5 for the server. It is a project for them to switch the server to using 2.7.

--Chris
Re: [Python-Dev] Draft PEP for time zone support.
On 12.12.2012 at 01:58, Nick Coghlan wrote:
> Ick, why a new module? Why not just add this directly to datetime? (It doesn't need to be provided by the C accelerator, it can go straight in the pure Python part).

+1 for something like datetime.timezone

How well does hg handle file renames? The datetime module could be converted to a package.

> I'd prefer a more aggressive name for this like tzdata_override. My rationale is that *nix users need to be thoroughly aware that if they install this package, they will stop benefiting from the automatic tz database updates provided by their OS (especially if they install it into the system site packages on a distro that has migrated to Python 3 for system tools).

+1, too.
Re: [Python-Dev] Emacs users: hg-tools-grep
Brandon W Maister wrote:
> (defconst git-tools-grep-command "git ls-files -z | xargs -0 grep -In %s"
>   "The command used for grepping files using git. See `git-tools-grep'.")

What's wrong with git grep?
Re: [Python-Dev] Emacs users: hg-tools-grep
On Wed, Dec 12, 2012 at 01:27:21PM +0200, Petri Lehtinen wrote:
> Brandon W Maister wrote:
>> (defconst git-tools-grep-command "git ls-files -z | xargs -0 grep -In %s"
>>   "The command used for grepping files using git. See `git-tools-grep'.")
>
> What's wrong with git grep?

Or hg grep, for that matter?

--
Ross Lagerwall
Re: [Python-Dev] Emacs users: hg-tools-grep
On 2012-12-12, at 15:12, Ross Lagerwall wrote:
> On Wed, Dec 12, 2012 at 01:27:21PM +0200, Petri Lehtinen wrote:
>> Brandon W Maister wrote:
>>> (defconst git-tools-grep-command "git ls-files -z | xargs -0 grep -In %s"
>>>   "The command used for grepping files using git. See `git-tools-grep'.")
>>
>> What's wrong with git grep?
>
> Or hg grep, for that matter?

hg grep searches the history, not the working copy.

*-tools-grep searches only the working copy, but automatically restricts the search to files under version control. Which, as far as I know, is indeed what git-grep does already.
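The behaviour described above (search the working copy, but only files under version control) can be sketched in Python. This is a rough illustration rather than the Emacs package's implementation: the helper names `split_nul`, `tracked_files` and `grep_files` are made up for this example, and only `git ls-files -z` is taken from the thread.

```python
import os
import re
import subprocess


def split_nul(data):
    """Split NUL-separated output (as produced by `git ls-files -z`) into paths."""
    return [os.fsdecode(part) for part in data.split(b"\0") if part]


def tracked_files(repo_dir="."):
    """Ask git for the files under version control (requires git on PATH)."""
    result = subprocess.run(
        ["git", "ls-files", "-z"],
        cwd=repo_dir, check=True, capture_output=True,
    )
    return split_nul(result.stdout)


def grep_files(paths, pattern, root="."):
    """Search the working-copy contents of the given files, skipping binaries."""
    regex = re.compile(pattern)
    hits = []
    for rel in paths:
        try:
            with open(os.path.join(root, rel), "rb") as handle:
                data = handle.read()
        except OSError:
            continue  # tracked but missing from the working copy
        if b"\0" in data:
            continue  # crude binary check, similar in spirit to grep -I
        text = data.decode("utf-8", "replace")
        for lineno, line in enumerate(text.splitlines(), 1):
            if regex.search(line):
                hits.append((rel, lineno, line))
    return hits
```

Calling `grep_files(tracked_files(), "pattern")` inside a checkout would then ignore untracked files such as build output, which is the point of the *-tools-grep commands.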
Re: [Python-Dev] Emacs users: hg-tools-grep
Ross Lagerwall wrote:
> On Wed, Dec 12, 2012 at 01:27:21PM +0200, Petri Lehtinen wrote:
>> Brandon W Maister wrote:
>>> (defconst git-tools-grep-command "git ls-files -z | xargs -0 grep -In %s"
>>>   "The command used for grepping files using git. See `git-tools-grep'.")
>>
>> What's wrong with git grep?
>
> Or hg grep, for that matter?

hg grep searches in the repository history, so it's not good for this.
Re: [Python-Dev] Emacs users: hg-tools-grep
Yes indeed, in my eagerness to make my first post to python-dev be well-received I completely forgot about git grep.

brandon

On Wed, Dec 12, 2012 at 9:20 AM, Xavier Morel python-...@masklinn.net wrote:
> On 2012-12-12, at 15:12, Ross Lagerwall wrote:
>> On Wed, Dec 12, 2012 at 01:27:21PM +0200, Petri Lehtinen wrote:
>>> Brandon W Maister wrote:
>>>> (defconst git-tools-grep-command "git ls-files -z | xargs -0 grep -In %s"
>>>>   "The command used for grepping files using git. See `git-tools-grep'.")
>>>
>>> What's wrong with git grep?
>>
>> Or hg grep, for that matter?
>
> hg grep searches the history, not the working copy.
>
> *-tools-grep only searches the working copy but automatically filters files to only search in files under version control. Which as far as I know is indeed what git-grep does already.
Re: [Python-Dev] Draft PEP for time zone support.
General comments:

It seems like the consensus is moving towards making sure there always is a database available. Whether this means including it in the standard Python distribution as well, or only on Windows, I don't know; opinions on that are welcome.

The steps to look for a database would then change to:

1. The path specified, if not None.
2. The module for timezone overrides.
3. The OS database.
4. The database included in Python.

We need to determine whether a warning should be raised in case 4, as well as the name for the override module. I think the word "override" here is possibly unclear; I'd prefer something like timezone-update or similar.

I'm personally a bit sceptical about writing a special updater/installer just for this. I don't want to have a special unique way to install this package.

When it comes to OS packages, Christian Heimes pointed out that most Windows installations today have Java installed, and kept updated, and it has a zoneinfo database. We could consider using that on Windows as well, although it admittedly feels quite icky. I haven't been able to find any other common locations for the zoneinfo database on Windows.

Specific answers:

On Tue, Dec 11, 2012 at 4:39 PM, Dirkjan Ochtman dirk...@ochtman.nl wrote:
> I wonder if there needs to be something here about how to port from pytz to the new timezone library.

It would be nice to have, but I don't think it's necessary to have in the PEP.

> It seems like calling get_timezone() with an unknown timezone should just throw ValueError, not necessarily some custom Exception?

That could very well be. What are others' opinions on this?

> Why not keep a bit more of the pytz API to make porting easy? The renaming of the timezone() function to get_timezone() is indeed small, but changing pytz.timezone(foo) to timezone.timezone(foo) is really significantly easier than renaming it to timezone.get_timezone(foo).

If we keep all of the API intact you could do

    try:
        import pytz as timezone
    except ImportError:
        import timezone

Which would make porting quicker, that's true, but do we really want to keep unnecessary APIs around forever? Isn't it better to minimize the noise from the start?

> It also seems relatively painless to keep localize() and normalize() functions around for easy porting.

Sure, but you then have two ways of doing the same thing, which I think we should avoid.

On Tue, Dec 11, 2012 at 5:07 PM, Antoine Pitrou solip...@pitrou.net wrote:
>> The ``is_dst`` parameter can be ``True`` (default), ``False``, or ``None``.
>
> Why is it True by default? Do we have statistics showing that Python gets more use in summer?

Because for some reason both Stuart Bishop and I thought it should be, but at least in my case I don't have any actual good reason why. Checking how pytz does this shows that pytz in fact defaults to False, so I think the default should be False.

On Wed, Dec 12, 2012 at 3:50 AM, Barry Warsaw ba...@python.org wrote:
>> This is likely the hardest part of this PEP as this involves updating the
>
> Oops, something got cut off there.

Ah, yes, I was going to write that the difficult bit was updating the _datetime.c module.

> Why add a new module instead of putting all this into the existing datetime module, either directly or as a submodule? Seems like the obvious place to put it instead of claiming another top-level module name.

pytz as it is consists of several modules and a significant amount of code; it didn't feel right to move all that into the datetime.py module. It also didn't feel right to then not implement it in _datetime.c, but perhaps that's just me being silly. But a submodule could work.

> I'm bikeshedding, but can we find a better name than `db` for the second argument? Something that makes it obvious we're looking for file system path?

Absolutely. db_path?

> I'd really like to see a TimeZoneError base class from which all these new exceptions inherit.

That makes sense.

> The ``timezonedata``-package
>
> Just to be clear, this doesn't expose any new modules, right?

That's the intention, yes, although I haven't investigated ways of knowing if it's installed or not yet, and exposing a module is the obvious way of doing that. But I'm hoping there will be better ways, right?

> One other thing that the PEP should describe is what happens on a distro that has timezone data, but which you also pip install the PyPI tzdata package. Which one wins? Is there a way to control it, other than providing an explicit path? Is there a way to uninstall the PyPI package? Does the API need to provide a method which tells you where the database it is using by default lives?

The PyPI package wins, I'll clarify that bit. I think the data should end up in site-packages somewhere, and that it should be installable and uninstallable with pip/easy_install and by simply deleting it.

On Wed, Dec 12, 2012 at 4:14 AM, Nick Coghlan
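The two exception questions raised in this message (a common TimeZoneError base class, and whether an unknown zone name should simply be a ValueError) need not exclude each other. The sketch below is only an illustration of that idea: apart from TimeZoneError and get_timezone(), which are named in the thread, every name here (UnknownTimeZoneError, KNOWN_ZONES) is hypothetical, and the function body is a stand-in rather than the PEP's implementation.

```python
class TimeZoneError(Exception):
    """Base class so callers can catch every time zone failure at once."""


class UnknownTimeZoneError(TimeZoneError, ValueError):
    """Hypothetical subclass: also a ValueError, so generic handlers still work."""


# Stand-in for a real database lookup; the names are just sample data.
KNOWN_ZONES = frozenset({"UTC", "Europe/Stockholm"})


def get_timezone(name):
    """Toy lookup that only demonstrates the error behaviour."""
    if name not in KNOWN_ZONES:
        raise UnknownTimeZoneError("unknown time zone: %r" % (name,))
    return name
```

With this layout, code written as `except ValueError:` and code written as `except TimeZoneError:` would both catch a misspelled zone name.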
Re: [Python-Dev] Draft PEP for time zone support.
On Wed, Dec 12, 2012 at 9:56 AM, Lennart Regebro rege...@gmail.com wrote:
> General comments:
>
> It seems like the consensus is moving towards making sure there always is a database available. Whether this means including it in the standard Python distribution as well, or only on Windows, I don't know; opinions on that are welcome.
>
> The steps to look for a database would then change to:
>
> 1. The path specified, if not None.
> 2. The module for timezone overrides.
> 3. The OS database.
> 4. The database included in Python.
>
> We need to determine whether a warning should be raised in case 4, as well as the name for the override module. I think the word "override" here is possibly unclear; I'd prefer something like timezone-update or similar.
>
> I'm personally a bit sceptical about writing a special updater/installer just for this. I don't want to have a special unique way to install this package.
>
> When it comes to OS packages, Christian Heimes pointed out that most Windows installations today have Java installed, and kept updated, and it has a zoneinfo database. We could consider using that on Windows as well, although it admittedly feels quite icky.

Depending on Java being installed or even installing it alongside Python would be a funny April Fools prank. This can't happen.

I don't think it's all that bad to include a small script on Windows which runs every few days to check PyPI, then presents an option to update the info. This is what Java itself is doing anyway.
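The four-step lookup order quoted above can be sketched as a small resolver. This is a guess at the shape of the logic, not the PEP's code: the function name, its parameters and the returned labels are all invented for illustration, and whether step 4 should emit a warning is left open, as in the thread.

```python
import os


def find_tz_database(db_path=None, override_dir=None, os_dirs=(), bundled_dir=None):
    """Return (source, path) for the first time zone database that applies.

    All parameter names are hypothetical; each one stands for one of the
    four steps listed in the message above.
    """
    # 1. The path specified, if not None: an explicit choice always wins.
    if db_path is not None:
        return "explicit", db_path
    # 2. The override package, 3. the OS database(s), 4. the bundled copy.
    candidates = [("override", override_dir)]
    candidates.extend(("os", directory) for directory in os_dirs)
    candidates.append(("bundled", bundled_dir))
    for source, path in candidates:
        if path is not None and os.path.isdir(path):
            return source, path
    raise LookupError("no time zone database found")
```

On a typical Unix system one might pass something like `os_dirs=("/usr/share/zoneinfo",)`; that location is an assumption about the platform, not something the thread specifies.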
Re: [Python-Dev] Draft PEP for time zone support.
On Wed, Dec 12, 2012 at 4:56 PM, Lennart Regebro rege...@gmail.com wrote:
>> Why not keep a bit more of the pytz API to make porting easy? The renaming of the timezone() function to get_timezone() is indeed small, but changing pytz.timezone(foo) to timezone.timezone(foo) is really significantly easier than renaming it to timezone.get_timezone(foo).
>
> If we keep all of the API intact you could do
>
>     try:
>         import pytz as timezone
>     except ImportError:
>         import timezone
>
> Which would make porting quicker, that's true, but do we really want to keep unnecessary APIs around forever? Isn't it better to minimize the noise from the start?

That entirely depends on what you define to be the start. It seems to me the consensus on python-dev has been that packages primarily evolve outside the stdlib; it seems a bit weird to then, at the time of stdlib inclusion, start changing the API.

>> Why is it True by default? Do we have statistics showing that Python gets more use in summer?
>
> Because for some reason both Stuart Bishop and I thought it should be, but at least in my case I don't have any actual good reason why. Checking how pytz does this shows that pytz in fact defaults to False, so I think the default should be False.

Here, too, I think that sticking with pytz's default would be a good idea.

Cheers,

Dirkjan
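To make the is_dst discussion concrete: the flag only matters for wall-clock times that occur twice, such as 02:30 on 28 October 2012 in Stockholm, when clocks went back from CEST (UTC+2) to CET (UTC+1). The sketch below is a toy under stated assumptions: `resolve_ambiguous` is a made-up helper, the two offsets are passed in by hand instead of coming from a database, and raising for `None` mirrors how pytz treats `is_dst=None` rather than anything the draft PEP is confirmed to do.

```python
from datetime import datetime, timedelta, timezone


def resolve_ambiguous(naive, dst_offset, std_offset, is_dst=False):
    """Attach one of two possible UTC offsets to an ambiguous wall-clock time.

    is_dst=True picks the summer-time reading, False (the pytz default) picks
    standard time, and None refuses to guess.
    """
    if is_dst is None:
        raise ValueError("ambiguous local time: %s" % naive.isoformat())
    offset = dst_offset if is_dst else std_offset
    return naive.replace(tzinfo=timezone(offset))


# 02:30 happened twice in Stockholm on the night DST ended in 2012.
wall = datetime(2012, 10, 28, 2, 30)
summer = resolve_ambiguous(wall, timedelta(hours=2), timedelta(hours=1), is_dst=True)
winter = resolve_ambiguous(wall, timedelta(hours=2), timedelta(hours=1), is_dst=False)
```

Here `summer` and `winter` show the same wall-clock time but are one hour apart as instants, which is why the choice of default is a real API decision and not only a matter of taste.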
Re: [Python-Dev] Draft PEP for time zone support.
On 12 December 2012 16:11, Brian Curtin br...@python.org wrote:
> I don't think it's all that bad to include a small script on Windows which runs every few days to check PyPI, then present an option to update the info. This is what Java itself is doing anyway.

What would that do in an environment without internet access? Or with a firewall blocking Python's requests and returning an error page without warning (so the updater just sees incorrect data)? What about corporate environments that want to control the rollout of updates? (I can't imagine that in practice, but certainly companies do it for Java.)

Most Windows updaters use the official Windows APIs so that they work properly with odd cases like ISA proxies taking credentials from the Windows user login. Python's stdlib doesn't support that type of thing.

I'm -1 on auto-updating because it's too easy to produce a nearly right solution that doesn't work in highly-controlled (e.g., corporate) environments. And a correct solution would be hard to support with python-dev's level of Windows expertise.

Paul.
Re: [Python-Dev] Draft PEP for time zone support.
On Wed, 12 Dec 2012 10:11:15 -0600, Brian Curtin br...@python.org wrote:
> I don't think it's all that bad to include a small script on Windows which runs every few days to check PyPI, then present an option to update the info. This is what Java itself is doing anyway.

I don't get why people are so obsessed about updating the timezone database. Really, this is not worse than having a vulnerable OpenSSL linked with your Python executable. Purity does not bring any advantage here.

Regards

Antoine.
Re: [Python-Dev] Draft PEP for time zone support.
On Wed, Dec 12, 2012 at 8:44 AM, Antoine Pitrou solip...@pitrou.net wrote:
> On Wed, 12 Dec 2012 10:11:15 -0600, Brian Curtin br...@python.org wrote:
>> I don't think it's all that bad to include a small script on Windows which runs every few days to check PyPI, then present an option to update the info. This is what Java itself is doing anyway.
>
> I don't get why people are so obsessed about updating the timezone database. Really, this is not worse than having a vulnerable OpenSSL linked with your Python executable. Purity does not bring any advantage here.

Bingo. As long as the recipe to update is clear, most users can ignore this, because the countries about which they care don't change DST rules often enough for it to matter. When it does matter, they'll know (changing the DST rules is something that local news sources tend to track :-) and they can update their software when stuff they use starts getting the time wrong.

Obviously sysadmins responsible for large numbers of users can make this into a routine, and ditto people who run services. But these folks are professionals and are good at automating tasks like this.

--
--Guido van Rossum (python.org/~guido)
Re: [Python-Dev] Draft PEP for time zone support.
Paul Moore wrote:
> On 12 December 2012 16:11, Brian Curtin br...@python.org wrote:
>> I don't think it's all that bad to include a small script on Windows which runs every few days to check PyPI, then present an option to update the info. This is what Java itself is doing anyway.
>
> What would that do in an environment without internet access? Or with a firewall blocking Python's requests and returning an error page without warning (so the updater just sees incorrect data)? What about corporate environments that want to control the rollout of updates? (I can't imagine that in practice, but certainly companies do it for Java.)
>
> Most Windows updaters use the official Windows APIs so that they work properly with odd cases like ISA proxies taking credentials from the Windows user login. Python's stdlib doesn't support that type of thing.
>
> I'm -1 on auto-updating because it's too easy to produce a nearly right solution that doesn't work in highly-controlled (e.g., corporate) environments. And a correct solution would be hard to support with python-dev's level of Windows expertise.

And what about embedded installations of Python, such as in TortoiseHg? And all the people (such as myself) who disable updaters that they don't like or didn't expect?

The correct solution on Windows may be to use a static database for historical dates and the information in the registry for current and future dates. The registry is updated through Windows Update, which is at least as reliable as anything Python could do. (I'm not sure exactly what the state of updates to older versions is like, but I'd assume WinXP still gets timezone updates and Win2K doesn't.)

Details of the registry entries are at http://msdn.microsoft.com/en-us/library/ms725481.aspx. It looks like the data is focused on modern timezones rather than localities, which would mean a many-to-one mapping from zoneinfo. Unfortunately it doesn't look like there's enough overlap to allow an automated mapping.

That said, it is incredibly easy to convert between UTC and local (http://msdn.microsoft.com/en-us/library/ms724949.aspx), even for dates in the past or future when the information is available. It's just that timezones other than the user's preference are difficult.

Cheers,

Steve
Re: [Python-Dev] Draft PEP for time zone support.
Hi,

On 12/12/2012 at 04:53, Christian Heimes wrote:
> On 12.12.2012 at 01:58, Nick Coghlan wrote:
>> Ick, why a new module? Why not just add this directly to datetime? (It doesn't need to be provided by the C accelerator, it can go straight in the pure Python part).
>
> +1 for something like datetime.timezone
>
> How well does hg handle file renames? The datetime module could be converted to a package.

Quite well. It’s easy to rename datetime.py to datetime/__init__.py, and subsequent fixes in 3.3’s datetime.py will be merged into datetime/__init__.py by Mercurial’s merge subsystem.

Cheers
Re: [Python-Dev] More compact dictionaries with faster iteration
On 12/12/2012 01:15 AM, Nick Coghlan wrote:
> On Wed, Dec 12, 2012 at 5:37 AM, Dino Viehland di...@microsoft.com wrote:
>> OTOH changing certain dictionaries in IronPython (such as keyword args) to be ordered would certainly be possible. Personally I just wouldn't want to see it be the default as that seems like unnecessary overhead when the specialized class exists.
>
> Which reminds me, I was going to note that one of the main gains with ordered keyword arguments is their use in the construction of string-keyed objects where you want to be able to control the order of iteration (e.g. for serialisation or display purposes). Currently you have to go the path of something like namedtuple where you define the order of iteration in one operation, and set the values in another.

So here's a brand new argument against ordered dicts: the existence of perfect hashing schemes. They fundamentally conflict with ordered dicts.

I played with using them for vtable dispatches in Cython this summer, and they can perform really, really well for branch-predicted lookups in hot loops, because you always/nearly always eliminate linear probing and so there are no branch misses or extra comparisons. (The overhead of a perfect hash table lookup over a traditional vtable lookup was only a couple of cycles in my highly artificial fully branch-predicted micro-benchmark.)

There's some overhead in setup; IIRC, ~20 microseconds for 64 elements, 2 GHz CPU, though that was a first prototype implementation and both algorithmic improvements and tuning should be possible.

So it's not useful for everything, but perhaps for things like module dictionaries and classes an optionally perfect dict can make sense.

Note: I'm NOT suggesting the use of perfect hashing, just making sure its existence is mentioned and that one is aware that if always-ordered dicts become the language standard it precludes this option far off in the future.

(Something like a sort() method could still work and make the dict unperfect; one could also have a pack() method that made the dict perfect again.)

That concludes the on-topic parts of my post.

--
Dag Sverre Seljebotn

APPENDIX

Going off-topic for those who are interested, here are the longwinded and ugly details. My code [1] is based on the paper [2] (pseudo-code in Appendix A), but I adapted it a bit to be useful for tens/hundreds of elements rather than billions.

The ingredients:

1) You need the hash to be 32 bits (or 64) of good entropy (md5 or murmurhash or similar). (Yes, that's a tall order for CPython, I'm just describing the scheme.) (If the hash collides on all bits you *will* collide, so some fallback is still necessary, just unlikely.)

2) To lookup, the idea is (pseudo-code!)

    typedef struct {
        int m_f, m_g, r, k;
        int16_t d[k]; /* small int, like current proposal */
    } table_header_t;

And then one computes the index of an element with hash h using the function

    ((h >> tab->r) & tab->m_f) ^ tab->d[h & tab->m_g]

rather than the usual h % n. While more arithmetic, arithmetic is cheap and branch misses are not.

3) To set up/repack a table one needs to find the parameters. The general idea is:

a) Partition the hashes into k bins by using h & m_g. There will be collisions, but the number of bins with many collisions will be very small; most bins will have 2 or 1 or 0 elements.

b) Starting with the largest bin, distribute the elements according to the hash function. If a bin collides with the existing contents, try another value for d[binindex] until it doesn't.

The r parameter lets you try again 32 (or 64) times to find a solution. (In my testcases there was ~0.1% chance of not finding a solution (that is, exhausting possible choices of r) with 64-bit hashes with 4 or 8 elements and no empty table elements. For any other number of elements, or with some empty elements, the chance of failure was much lower.)

[1] It's not exactly a great demo, but it contains the algorithm.
If there's much interest I should clean it up and make a proper benchmark demo out of it: https://github.com/dagss/pyextensibletype/blob/perfecthash/include/perfecthash.h

[2] Pagh (1999) http://www.brics.dk/RS/99/13/BRICS-RS-99-13.ps.gz
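For readers who would rather see the appendix as running code, here is a small Python sketch of the hash-and-displace idea it describes: bin by h & m_g, place the largest bins first, search for a displacement d[bin] per bin, and retry with another shift r on failure. It is written from the description above, not from the linked C header, so the class name, the use of MD5, the limit of 32 shifts and the brute-force displacement search are all assumptions made for illustration.

```python
import hashlib

_EMPTY = object()  # marks an unused slot


def hash64(key):
    """Return 64 well-mixed bits for a key (MD5 is an illustrative choice)."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "little")


class PerfectTable:
    """Static table where slot = ((h >> r) & m_f) ^ d[h & m_g]."""

    def __init__(self, keys, n_slots, n_bins, hash_fn=hash64):
        keys = list(dict.fromkeys(keys))  # drop duplicates, keep order
        for size in (n_slots, n_bins):
            if size <= 0 or size & (size - 1):
                raise ValueError("sizes must be powers of two")
        if len(keys) > n_slots:
            raise ValueError("more keys than slots")
        self.hash_fn = hash_fn
        self.m_f = n_slots - 1
        self.m_g = n_bins - 1
        # a) Partition the hashes into bins using h & m_g.
        bins = [[] for _ in range(n_bins)]
        for key in keys:
            h = hash_fn(key)
            bins[h & self.m_g].append((h, key))
        # b) Largest bins first; they are the hardest to place.
        order = sorted(range(n_bins), key=lambda b: len(bins[b]), reverse=True)
        for r in range(32):  # each r is one more attempt
            built = self._try_build(bins, order, r, n_slots)
            if built is not None:
                self.r = r
                self.d, self.slots = built
                return
        raise RuntimeError("no collision-free parameters found")

    def _try_build(self, bins, order, r, n_slots):
        slots = [_EMPTY] * n_slots
        d = [0] * len(bins)
        for b in order:
            if not bins[b]:
                break  # remaining bins are empty
            for disp in range(n_slots):
                targets = [((h >> r) & self.m_f) ^ disp for h, _ in bins[b]]
                distinct = len(set(targets)) == len(targets)
                if distinct and all(slots[t] is _EMPTY for t in targets):
                    for t, (_, key) in zip(targets, bins[b]):
                        slots[t] = key
                    d[b] = disp
                    break
            else:
                return None  # this r failed; the caller tries the next one
        return d, slots

    def index(self, key):
        """Compute the slot with no probing loop at all."""
        h = self.hash_fn(key)
        return ((h >> self.r) & self.m_f) ^ self.d[h & self.m_g]

    def __contains__(self, key):
        return self.slots[self.index(key)] == key
```

A table built as `PerfectTable(names, n_slots=16, n_bins=8)` answers membership with a single slot comparison; the price is the setup search, which is the trade-off the message describes.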
Re: [Python-Dev] More compact dictionaries with faster iteration
On Wed, Dec 12, 2012 at 3:37 PM, Dag Sverre Seljebotn d.s.seljeb...@astro.uio.no wrote:
> On 12/12/2012 01:15 AM, Nick Coghlan wrote:
>> On Wed, Dec 12, 2012 at 5:37 AM, Dino Viehland di...@microsoft.com wrote:
>>> OTOH changing certain dictionaries in IronPython (such as keyword args) to be ordered would certainly be possible. Personally I just wouldn't want to see it be the default as that seems like unnecessary overhead when the specialized class exists.
>>
>> Which reminds me, I was going to note that one of the main gains with ordered keyword arguments is their use in the construction of string-keyed objects where you want to be able to control the order of iteration (e.g. for serialisation or display purposes). Currently you have to go the path of something like namedtuple where you define the order of iteration in one operation, and set the values in another.
>
> So here's a brand new argument against ordered dicts: the existence of perfect hashing schemes. They fundamentally conflict with ordered dicts.

If I understand your explanation, then they don't conflict with the type of ordering described in this thread. Raymond's optimization separates the "hash table" part from the "contents" part of a dictionary, and there is no requirement that these two parts be in the same size or the same order.

Indeed, Raymond's split design lets you re-parameterize the hashing all you want, without perturbing the iteration order at all. You would in fact be able to take a dictionary at any moment, and say, "optimize the 'hash table' part to a non-colliding state based on the current contents", without touching the 'contents' part at all.

(One could do this at class creation time on a class dictionary, and just after importing on a module dictionary, for example.)
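The split described in this message can be shown with a toy dictionary that keeps a dense, insertion-ordered entries list plus a separate table of small integers pointing into it. This is a simplified sketch in the spirit of Raymond's proposal, not his code and not CPython's implementation: the class and method names are invented, deletion is omitted, and plain linear probing stands in for the real probe sequence.

```python
class CompactDict:
    """Toy dict: 'indices' is the hash table part, 'entries' the contents part."""

    def __init__(self, n_slots=8):
        self.entries = []                 # (hash, key, value) in insertion order
        self.indices = [None] * n_slots   # slot -> position in entries

    def _find(self, key, h):
        """Return (slot, entry position or None) using linear probing."""
        mask = len(self.indices) - 1
        slot = h & mask
        while True:
            pos = self.indices[slot]
            if pos is None:
                return slot, None
            entry_hash, entry_key, _ = self.entries[pos]
            if entry_hash == h and entry_key == key:
                return slot, pos
            slot = (slot + 1) & mask

    def __setitem__(self, key, value):
        h = hash(key)
        slot, pos = self._find(key, h)
        if pos is not None:
            self.entries[pos] = (h, key, value)  # overwrite keeps its position
            return
        self.entries.append((h, key, value))
        if len(self.entries) * 3 > len(self.indices) * 2:
            self.rehash(len(self.indices) * 2)   # rebuild covers the new entry
        else:
            self.indices[slot] = len(self.entries) - 1

    def __getitem__(self, key):
        _, pos = self._find(key, hash(key))
        if pos is None:
            raise KeyError(key)
        return self.entries[pos][2]

    def rehash(self, n_slots):
        """Rebuild only the index table; entries and their order are untouched."""
        if n_slots & (n_slots - 1) or n_slots <= len(self.entries):
            raise ValueError("n_slots must be a power of two above len(entries)")
        self.indices = [None] * n_slots
        mask = n_slots - 1
        for pos, (h, _, _) in enumerate(self.entries):
            slot = h & mask
            while self.indices[slot] is not None:
                slot = (slot + 1) & mask
            self.indices[slot] = pos

    def keys(self):
        return [key for _, key, _ in self.entries]
```

Calling `rehash()` changes how keys map to slots, yet `keys()` returns the same sequence before and afterwards; that independence is the property the message relies on.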
Re: [Python-Dev] More compact dictionaries with faster iteration
On 12/12/2012 10:31 PM, PJ Eby wrote:
> On Wed, Dec 12, 2012 at 3:37 PM, Dag Sverre Seljebotn d.s.seljeb...@astro.uio.no wrote:
>> On 12/12/2012 01:15 AM, Nick Coghlan wrote:
>>> On Wed, Dec 12, 2012 at 5:37 AM, Dino Viehland di...@microsoft.com wrote:
>>>> OTOH changing certain dictionaries in IronPython (such as keyword args) to be ordered would certainly be possible. Personally I just wouldn't want to see it be the default as that seems like unnecessary overhead when the specialized class exists.
>>>
>>> Which reminds me, I was going to note that one of the main gains with ordered keyword arguments is their use in the construction of string-keyed objects where you want to be able to control the order of iteration (e.g. for serialisation or display purposes). Currently you have to go the path of something like namedtuple where you define the order of iteration in one operation, and set the values in another.
>>
>> So here's a brand new argument against ordered dicts: the existence of perfect hashing schemes. They fundamentally conflict with ordered dicts.
>
> If I understand your explanation, then they don't conflict with the type of ordering described in this thread. Raymond's optimization separates the "hash table" part from the "contents" part of a dictionary, and there is no requirement that these two parts be in the same size or the same order.

I don't fully agree. Perfect hashing already separates "hash table" from "contents" (sort of), and saves the memory in much the same way.

Yes, you can repeat the trick and have 2 levels of indirection, but that then requires an additional table of small ints which is pure overhead present for the sorting; in short, it's no longer an optimization but just overhead for the sortability.

Dag Sverre

> Indeed, Raymond's split design lets you re-parameterize the hashing all you want, without perturbing the iteration order at all.
> You would in fact be able to take a dictionary at any moment, and say, "optimize the 'hash table' part to a non-colliding state based on the current contents", without touching the 'contents' part at all.
>
> (One could do this at class creation time on a class dictionary, and just after importing on a module dictionary, for example.)
Re: [Python-Dev] More compact dictionaries with faster iteration
On 12/12/2012 11:06 PM, Dag Sverre Seljebotn wrote:
> On 12/12/2012 10:31 PM, PJ Eby wrote:
>> On Wed, Dec 12, 2012 at 3:37 PM, Dag Sverre Seljebotn d.s.seljeb...@astro.uio.no wrote:
>>> On 12/12/2012 01:15 AM, Nick Coghlan wrote:
>>>> On Wed, Dec 12, 2012 at 5:37 AM, Dino Viehland di...@microsoft.com wrote:
>>>>> OTOH changing certain dictionaries in IronPython (such as keyword args) to be ordered would certainly be possible. Personally I just wouldn't want to see it be the default as that seems like unnecessary overhead when the specialized class exists.
>>>>
>>>> Which reminds me, I was going to note that one of the main gains with ordered keyword arguments is their use in the construction of string-keyed objects where you want to be able to control the order of iteration (e.g. for serialisation or display purposes). Currently you have to go the path of something like namedtuple where you define the order of iteration in one operation, and set the values in another.
>>>
>>> So here's a brand new argument against ordered dicts: the existence of perfect hashing schemes. They fundamentally conflict with ordered dicts.
>>
>> If I understand your explanation, then they don't conflict with the type of ordering described in this thread. Raymond's optimization separates the "hash table" part from the "contents" part of a dictionary, and there is no requirement that these two parts be in the same size or the same order.
>
> I don't fully agree. Perfect hashing already separates "hash table" from "contents" (sort of), and saves the memory in much the same way.

This was a bit inaccurate, but the point is: the perfect hash function doesn't need any fill-in to avoid collisions; you can (except in exceptional circumstances) fill the table 100%, so the memory is already saved.
Dag Sverre Yes, you can repeat the trick and have 2 levels of indirection, but that then requires an additional table of small ints which is pure overhead present for the sorting; in short, it's no longer an optimization but just overhead for the sortability. Dag Sverre Indeed, Raymond's split design lets you re-parameterize the hashing all you want, without perturbing the iteration order at all. You would in fact be able to take a dictionary at any moment, and say, optimize the 'hash table' part to a non-colliding state based on the current contents, without touching the 'contents' part at all. (One could do this at class creation time on a class dictionary, and just after importing on a module dictionary, for example.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
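The split design discussed above can be illustrated with a toy sketch. This is not Raymond's patch or CPython's implementation: the class name `CompactDict`, the linear probing, and the `rebuild_indices` method are all assumptions made for illustration. It only shows that the sparse index table can be rebuilt with different parameters while the dense entry list, and therefore iteration order, stays untouched.

```python
class CompactDict:
    """Toy split-table dict: a sparse index table plus a dense, ordered entry list."""

    def __init__(self, size=8):
        self.indices = [None] * size  # 'hash table' part: slot -> position in entries
        self.entries = []             # 'contents' part: (hash, key, value) in insertion order

    @staticmethod
    def _probe(key_hash, size):
        # Plain linear probing (an assumption; CPython perturbs the probe sequence).
        slot = key_hash % size
        while True:
            yield slot
            slot = (slot + 1) % size

    def __setitem__(self, key, value):
        h = hash(key)
        for slot in self._probe(h, len(self.indices)):
            pos = self.indices[slot]
            if pos is None:                  # free slot: append a new entry
                self.indices[slot] = len(self.entries)
                self.entries.append((h, key, value))
                break
            if self.entries[pos][1] == key:  # existing key: overwrite in place
                self.entries[pos] = (h, key, value)
                return
        # Keep the sparse table at most two-thirds full.
        if len(self.entries) * 3 >= len(self.indices) * 2:
            self.rebuild_indices(len(self.indices) * 2)

    def __getitem__(self, key):
        h = hash(key)
        for slot in self._probe(h, len(self.indices)):
            pos = self.indices[slot]
            if pos is None:
                raise KeyError(key)
            if self.entries[pos][1] == key:
                return self.entries[pos][2]

    def rebuild_indices(self, size):
        """Re-parameterize only the hash table part; the entries are not touched."""
        if size <= len(self.entries):
            raise ValueError("index table must be larger than the number of entries")
        indices = [None] * size
        for pos, (h, _key, _value) in enumerate(self.entries):
            for slot in self._probe(h, size):
                if indices[slot] is None:
                    indices[slot] = pos
                    break
        self.indices = indices

    def keys(self):
        return [key for _h, key, _value in self.entries]
```

Calling `rebuild_indices()` with any sufficiently large size leaves `keys()` unchanged, which is the property PJ Eby points at: the hashing side can be optimized without perturbing iteration order.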
Re: [Python-Dev] Draft PEP for time zone support.
On Tue, Dec 11, 2012 at 8:07 AM, Antoine Pitrou solip...@pitrou.net wrote: Do we have statistics showing that Python gets more use in summer? Well, pythons are cold-blooded, so they're probably more active during the warmer seasons... -- Greg
Re: [Python-Dev] Draft PEP for time zone support.
On Wed, Dec 12, 2012 at 5:21 PM, Dirkjan Ochtman dirk...@ochtman.nl wrote: That entirely depends on when you define to be the start. It seems to me the consensus on python-dev has been that packages primarily evolve outside the stdlib; it seems a bit weird to then, at the time of stdlib inclusion, start changing the API. But this bit of the API is there only as a hack, because stdlib does not support is_dst. We are changing that. Hence those extra functions are no longer needed. //Lennart
Re: [Python-Dev] Draft PEP for time zone support.
On Wed, Dec 12, 2012 at 5:54 PM, Steve Dower steve.do...@microsoft.com wrote: Details of the registry entries are at http://msdn.microsoft.com/en-us/library/ms725481.aspx. It looks like the data is focused on modern timezones rather than localities, which would mean a many-to-one mapping from zoneinfo. Unfortunately it doesn't look like there's enough overlap to allow an automated mapping. No, but the Unicode consortium (I think) is keeping a mapping updated manually. I'm using that in tzlocal, to figure out the local timezone of the computer on Windows. However, I think that mixing and matching timezone data in this way from the two systems is likely to be full of pitfalls, edge cases and complexities I do not dare even think seriously about. There will probably be *fewer* errors by just keeping an old timezone database around. Besides, what if they don't run Windows Update? Then the data is still outdated. //Lennart
Re: [Python-Dev] Draft PEP for time zone support.
On 12.12.12 02:43, Guido van Rossum wrote: On Tue, Dec 11, 2012 at 5:11 PM, Robert Brewer fuman...@aminus.org wrote: Guido van Rossum wrote: Sent: Tuesday, December 11, 2012 4:11 PM To: Antoine Pitrou Cc: python-dev@python.org Subject: Re: [Python-Dev] Draft PEP for time zone support. On Tue, Dec 11, 2012 at 8:07 AM, Antoine Pitrou solip...@pitrou.net wrote: On Tue, 11 Dec 2012 16:23:37 +0100, Lennart Regebro rege...@gmail.com wrote:

Changes in the ``datetime``-module
----------------------------------

A new ``is_dst`` parameter is added to several of the `tzinfo` methods to handle time ambiguity during DST changeovers.

* ``tzinfo.utcoffset(self, dt, is_dst=True)``
* ``tzinfo.dst(self, dt, is_dst=True)``
* ``tzinfo.tzname(self, dt, is_dst=True)``

The ``is_dst`` parameter can be ``True`` (default), ``False``, or ``None``. ``True`` will specify that the given datetime should be interpreted as happening during daylight savings time, i.e. that the time specified is before the change from DST.

Why is it True by default? Do we have statistics showing that Python gets more use in summer?

My question exactly.

Summer in the USA, at least, is 238 days in 2012, while Winter into 2013 is only 126 days:

>>> import datetime
>>> datetime.date(2012, 11, 4) - datetime.date(2012, 3, 11)
datetime.timedelta(238)
>>> datetime.date(2013, 3, 10) - datetime.date(2012, 11, 4)
datetime.timedelta(126)

Very funny, but that can't be the real reason. *Most* datetime values aren't ambiguous, so in those cases the parameter should be ignored, right? There's only one hour per year where you need to specify it (two, if we want to artificially assign a meaning to values falling in the impossible hour). And during those times it's equally likely that you meant either of the possibilities.
I think the meaning of the parameter must be clarified, perhaps as follows:

- ignored except during the ambiguous hour and during the impossible hour
- during the ambiguous or impossible hour:
  - if True, prefer/pretend DST
  - if False, prefer/pretend non-DST
  - if None, raise an error

Here I'd prefer the default to be None if I had to do it over again, but given that the current behavior is one of the first two (which one?) we probably can't do that. Still, it's slightly confusing that passing None is not the same as omitting the parameter altogether -- there aren't many APIs that explicitly support passing None but don't use it as the default (though there probably are some precedents). Maybe requesting an error should be done through some other special value, and None should be the same as omitted and the same as the old behavior? But where would the special value come from? It should be made as easy as possible to do the right thing (i.e. raise an error). Or maybe have a separate Boolean flag to request an error?

I see an issue here that makes me a little uncomfortable: Having a default that makes code work all year but raises an error during the impossible hour could be problematic in critical code. Can we make this more explicit by forcing the users to decide? I like the idea of the extra boolean flag here, because it will be explicitly visible that this code intentionally creates an exception. Or even not a flag, but the exception to be raised, or a callable to handle this case? Sloppy coding can be dangerous. So maybe the warning module could be helpful as well: If None is passed and no explicit flag/exception/callable given, bother the user with a warning message ;-)

cheers - chris

--
Christian Tismer             :^)   mailto:tis...@stackless.com
Software Consulting          :     Have a break! Take a ride on Python's
Karl-Liebknecht-Str. 121     :    *Starship* http://starship.python.net/
14482 Potsdam                :     PGP key -> http://pgp.uni-mainz.de
phone +49 173 24 18 776  fax +49 (30) 700143-0023
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/
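The semantics Guido lists for the ambiguous and impossible hours can be sketched roughly as follows. This is only an illustration of the rule as stated in the thread, not the draft PEP's implementation: the function name `utcoffset_2012_eastern` is hypothetical, the rules are hard-coded for US Eastern time in 2012 (using the March 11 and November 4 changeover dates mentioned above), and `is_dst` is made a required keyword argument so the caller has to decide, which sidesteps the open question about the default.

```python
from datetime import datetime, timedelta

STD = timedelta(hours=-5)  # Eastern Standard Time
DST = timedelta(hours=-4)  # Eastern Daylight Time
HOUR = timedelta(hours=1)

# Hard-coded 2012 rules (an assumption made to keep the sketch self-contained).
GAP_START = datetime(2012, 3, 11, 2, 0)      # 02:00 jumps to 03:00: 02:xx is impossible
OVERLAP_START = datetime(2012, 11, 4, 1, 0)  # 01:xx occurs twice: ambiguous


def utcoffset_2012_eastern(dt, *, is_dst):
    """Return the UTC offset for a naive local time, applying is_dst only when needed."""
    impossible = GAP_START <= dt < GAP_START + HOUR
    ambiguous = OVERLAP_START <= dt < OVERLAP_START + HOUR
    if impossible or ambiguous:
        if is_dst is None:
            kind = "impossible" if impossible else "ambiguous"
            raise ValueError("%s local time: %s" % (kind, dt.isoformat()))
        return DST if is_dst else STD  # True: prefer/pretend DST; False: non-DST
    # Everywhere else the parameter is ignored.
    return DST if GAP_START + HOUR <= dt < OVERLAP_START else STD
```

With this shape, `is_dst=None` behaves as "raise an error" only inside the two problematic hours, and has no effect on any other datetime value.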
Re: [Python-Dev] Draft PEP for time zone support.
On 12/12/2012 11:53 AM, Guido van Rossum wrote: Bingo. As long as the recipe to update is clear, most users can ignore this, because the countries about which they care don't change DST rules often enough for it to matter. When it does matter, they'll know (changing the DST rules is something that local news sources tend to track :-) and they can update their software when stuff they use starts getting the time wrong. Obviously sysadmins responsible for large numbers of users can make this into a routine, and ditto people who run services. But these folks are professionals and are good at automating tasks like this. As a Windows user, I would like there to be one tz data file used by all Python versions on my machine, including ones included with other apps. I would like every installer, including for bug fix releases, to update it. This should be sufficient for 99% of Windows users. As Guido says above, the docs should tell the other 1% how to update it explicitly. -- Terry Jan Reedy ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Draft PEP for time zone support.
On 12/12/2012 10:56 AM, Lennart Regebro wrote: It seems like calling get_timezone() with an unknown timezone should just throw ValueError, not necessarily some custom Exception? That could very well be. What are others' opinions on this? ValueError. That is what it is. Nothing special here. Why not keep a bit more of the pytz API to make porting easy? The renaming of the timezone() function to get_timezone() is indeed small, And gratuitous, to me. I don't generally like 'get' prefixes anyway. but changing pytz.timezone(foo) to timezone.timezone(foo) is really significantly easier than renaming it to timezone.get_timezone(foo). If we keep all of the API intact you could do

    try:
        import pytz as timezone
    except ImportError:
        import timezone

Which would make porting quicker, that's true, but do we really want to keep unnecessary APIs around forever? Isn't it better to minimize the noise from the start? While the module that was the basis for the ipaddress module was released on PyPI and its API developed however it did, the API was worked over quite a bit before the addition of ipaddress. So I agree that the current API can be revised before being more-or-less frozen in the stdlib. It also seems relatively painless to keep localize() and normalize() functions around for easy porting. Sure, but you then have two ways of doing the same thing, which I think we should avoid. I agree that this is precisely the time to remove cruft (if indeed it is such). -- Terry Jan Reedy
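The porting idiom quoted above (try pytz first, fall back to the proposed module) can be written as a small helper. This is a sketch only: `import_first` is a hypothetical name, and `timezone` is the module name proposed in the draft PEP rather than a module known to exist, so on many systems neither import will succeed.

```python
import importlib


def import_first(*names):
    """Return the first module in names that can be imported, else raise ImportError."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("none of these modules is available: " + ", ".join(names))


# Same order as the quoted idiom: prefer pytz, then the proposed stdlib module.
# Both may be missing, so the result is allowed to be None here.
try:
    timezone = import_first("pytz", "timezone")
except ImportError:
    timezone = None
```

The idiom only helps if both modules expose the same names, which is the point under debate: renaming `timezone()` to `get_timezone()` would break it.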
Re: [Python-Dev] Draft PEP for time zone support.
On Thu, Dec 13, 2012 at 12:23 AM, Terry Reedy tjre...@udel.edu wrote: As a Windows user, I would like there to be one tz data file used by all Python versions on my machine, including ones included with other apps. That would be nice, but where would that be installed? There is no standard location for zoneinfo files. And do we really want to install python-specific files outside the Python tree? //Lennart
Re: [Python-Dev] Draft PEP for time zone support.
On 2012-12-12 23:33, Lennart Regebro wrote: On Thu, Dec 13, 2012 at 12:23 AM, Terry Reedy tjre...@udel.edu wrote: As a Windows user, I would like there to be one tz data file used by all Python versions on my machine, including ones included with other apps. That would be nice, but where would that be installed? There is no standard location for zoneinfo files. And do we really want to install python-specific files outside the Python tree? Python version x.y is installed into, say, C:\Pythonxy, so perhaps a good place would be, say, C:\Python.
Re: [Python-Dev] Draft PEP for time zone support.
On Wed, Dec 12, 2012 at 6:10 PM, MRAB pyt...@mrabarnett.plus.com wrote: On 2012-12-12 23:33, Lennart Regebro wrote: On Thu, Dec 13, 2012 at 12:23 AM, Terry Reedy tjre...@udel.edu wrote: As a Windows user, I would like there to be one tz data file used by all Python versions on my machine, including ones included with other apps. That would be nice, but where would that be installed? There is no standard location for zoneinfo files. And do we really want to install python-specific files outside the Python tree? Python version x.y is installed into, say, C:\Pythonxy, so perhaps a good place would be, say, C:\Python. C:\ProgramData\Python
Re: [Python-Dev] Draft PEP for time zone support.
On 12/12/2012 7:27 PM, Brian Curtin wrote: On Wed, Dec 12, 2012 at 6:10 PM, MRAB pyt...@mrabarnett.plus.com wrote: On 2012-12-12 23:33, Lennart Regebro wrote: On Thu, Dec 13, 2012 at 12:23 AM, Terry Reedy tjre...@udel.edu wrote: As a Windows user, I would like there to be one tz data file used by all Python versions on my machine, including ones included with other apps. That would be nice, but where would that be installed? There is no standard location for zoneinfo files. And do we really want to install python-specific files outside the Python tree? There is no 'Python tree' on windows. Rather, there is a separate tree for each version, located where the user directs. Windows used to have a %APPDATA% directory variable. Not sure about Win 7, let alone 8. Martin and others should know better. Or ask the user where to put it. I know where I would choose, and it would not be on my C drive. Un-installers would not delete (unless a reference count were kept and were decremented to 0). Python version x.y is installed into, say, C:\Pythonxy, so perhaps a good place would be, say, C:\Python. C:\ProgramData\Python Making a new top-level directory without asking is obnoxious. -- Terry Jan Reedy ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Draft PEP for time zone support.
On Dec 12, 2012 7:24 PM, Terry Reedy tjre...@udel.edu wrote: On 12/12/2012 7:27 PM, Brian Curtin wrote: On Wed, Dec 12, 2012 at 6:10 PM, MRAB pyt...@mrabarnett.plus.com wrote: On 2012-12-12 23:33, Lennart Regebro wrote: On Thu, Dec 13, 2012 at 12:23 AM, Terry Reedy tjre...@udel.edu wrote: As a Windows user, I would like there to be one tz data file used by all Python versions on my machine, including ones included with other apps. That would be nice, but where would that be installed? There is no standard location for zoneinfo files. And do we really want to install python-specific files outside the Python tree? There is no 'Python tree' on windows. Rather, there is a separate tree for each version, located where the user directs. Windows used to have a %APPDATA% directory variable. Not sure about Win 7, let alone 8. Martin and others should know better. Or ask the user where to put it. I know where I would choose, and it would not be on my C drive. Un-installers would not delete (unless a reference count were kept and were decremented to 0). Python version x.y is installed into, say, C:\Pythonxy, so perhaps a good place would be, say, C:\Python. C:\ProgramData\Python Making a new top-level directory without asking is obnoxious. See http://stackoverflow.com/questions/9518890/what-is-the-significance-programdata-in-windows ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Draft PEP for time zone support.
On 12/12/2012 5:36 PM, Brian Curtin wrote: C:\ProgramData\Python ^ That. Is not the path that the link below is talking about, though. Making a new top-level directory without asking is obnoxious. See http://stackoverflow.com/questions/9518890/what-is-the-significance-programdata-in-windows
Re: [Python-Dev] Draft PEP for time zone support.
On 12/12/2012 8:43 PM, Glenn Linderman wrote: On 12/12/2012 5:36 PM, Brian Curtin wrote: C:\ProgramData\Python ^ That. Is not the path that the link below is talking about, though. It actually does; it is rather confusing though. :/ It's referring to KNOWNFOLDERID constant FOLDERID_ProgramData. The actual on disk location for this has changed over windows versions. As noted below in the SO link given: Note that this documentation refers to the typical path as per older versions of Windows. In modern versions of Windows it is located in %SystemDrive%\ProgramData. Making a new top-level directory without asking is obnoxious. See http://stackoverflow.com/questions/9518890/what-is-the-significance-programdata-in-windows ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Draft PEP for time zone support.
On Wed, Dec 12, 2012 at 8:10 PM, Janzert janz...@janzert.com wrote: On 12/12/2012 8:43 PM, Glenn Linderman wrote: On 12/12/2012 5:36 PM, Brian Curtin wrote: C:\ProgramData\Python ^ That. Is not the path that the link below is talking about, though. It actually does; it is rather confusing though. :/ It's referring to KNOWNFOLDERID constant FOLDERID_ProgramData. The actual on disk location for this has changed over windows versions. As noted below in the SO link given: Note that this documentation refers to the typical path as per older versions of Windows. In modern versions of Windows it is located in %SystemDrive%\ProgramData. Correct. Anyway, on with the actual timezone stuff...
Re: [Python-Dev] More compact dictionaries with faster iteration
On Wed, Dec 12, 2012 at 5:06 PM, Dag Sverre Seljebotn d.s.seljeb...@astro.uio.no wrote: Perfect hashing already separates hash table from contents (sort of), and saves the memory in much the same way. Yes, you can repeat the trick and have 2 levels of indirection, but that then requires an additional table of small ints which is pure overhead present for the sorting; in short, it's no longer an optimization but just overhead for the sortability. I'm confused. I understood your algorithm to require repacking, rather than it being a suitable algorithm for incremental change to an existing dictionary. ISTM that that would mean you still pay some sort of overhead (either in time or space) while the dictionary is still being mutated. Also, I'm not sure how 2 levels of indirection come into it. The algorithm you describe has, as I understand it, only up to 12 perturbation values (bins), so it's a constant space overhead, not a variable one. What's more, you can possibly avoid the extra memory access by using a different perfect hashing algorithm, at the cost of a slower optimization step or using a little more memory. Note: I'm NOT suggesting the use of perfect hashing, just making sure its existence is mentioned and that one is aware that if always-ordered dicts become the language standard it precludes this option far off in the future. Not really. It means that some forms of perfect hashing might require adding a few more ints worth of overhead for the dictionaries that use it. If it's really a performance benefit for very-frequently-used dictionaries, that might still be worthwhile.
Re: [Python-Dev] Draft PEP for time zone support.
On Thu, Dec 13, 2012 at 2:24 AM, Terry Reedy tjre...@udel.edu wrote: Or ask the user where to put it. If we ask where it should be installed, then we need a registry setting for that or we don't know where it's located when it is to be used. And if we ask, then people will install it in non-standard locations. Installers for software that bundles Python, meanwhile, won't want to ask their users, so they'll install to the standard location, overriding the manager's preferred, updated custom location with a database that is probably not updated. So I think that asking is not an option at all. It either goes in %PROGRAMDATA%\Python\zoneinfo or it's not shared at all. I know where I would choose, and it would not be on my C drive. Un-installers would not delete (unless a reference count were kept and were decremented to 0). True, and that's annoying when those counters go wrong. All in all I would say I would prefer to install this per Python. //Lennart
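For the shared-location option, the path named above could be derived from the environment along these lines. This is a sketch under assumptions: the helper name `shared_zoneinfo_dir` is made up, nothing in the thread settles that this directory would actually be used, and `ntpath` is used so the Windows-style join also works when the snippet is run on another platform.

```python
import ntpath


def shared_zoneinfo_dir(environ):
    """Return <PROGRAMDATA>\\Python\\zoneinfo, or None if the variable is not set."""
    base = environ.get("PROGRAMDATA")
    if not base:
        return None  # e.g. not Windows, or a Windows version without %ProgramData%
    return ntpath.join(base, "Python", "zoneinfo")
```

On Windows this would be called as `shared_zoneinfo_dir(os.environ)`. A `None` result would leave only the per-Python copy, which is the alternative Lennart says he prefers.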
Re: [Python-Dev] [Distutils] Is is worth disentangling distutils?
On Mon, Dec 10, 2012 at 8:22 AM, Antonio Cavallo a.cava...@cavallinux.eu wrote: Hi, I wonder if it is worth it/if there is any interest in trying to clean up distutils: nothing in terms of adding new features, just a *major* cleanup retaining the exact same interface. I'm not planning anything like *adding features* or rewriting rpm/rpmbuild here, simply cleaning up that un-holy code mess. Yes it served well, don't get me wrong, and I think it did work much better than anything it was meant to replace. I'm not into the py3 at all so I wonder how possibly it could fit/collide into the big plan. Or would I be wasting my time? The effort of making something that replaces distutils is, as far as I can understand, currently on the level of taking the best bits out of distutils2 and putting them into Python 3.4 under the name packaging. I'm sure that effort could use more help. //Lennart
Re: [Python-Dev] Draft PEP for time zone support.
On 12/12/2012 6:10 PM, Janzert wrote: On 12/12/2012 8:43 PM, Glenn Linderman wrote: On 12/12/2012 5:36 PM, Brian Curtin wrote: C:\ProgramData\Python ^ That. Is not the path that the link below is talking about, though. It actually does; it is rather confusing though. :/ I agree with the below. But I have never seen a version of Windows on which c:\ProgramData was the actual path for FOLDERID_ProgramData. Can you reference documentation that states that it was there, for some version? This documentation speaks of: c:\Documents and Settings\AllUsers\Application Data (which I knew from XP, and I think 2000, not sure I remember NT) In Vista.0, Vista.1, and Vista.2, I guess it is moved to C:\users\AllUsers\AppData\Roaming (typically). Neither of those would result in C:\ProgramData\Python. It's referring to KNOWNFOLDERID constant FOLDERID_ProgramData. The actual on disk location for this has changed over windows versions. As noted below in the SO link given: Note that this documentation refers to the typical path as per older versions of Windows. In modern versions of Windows it is located in %SystemDrive%\ProgramData. Making a new top-level directory without asking is obnoxious. See http://stackoverflow.com/questions/9518890/what-is-the-significance-programdata-in-windows ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] cpython: expose TCP_FASTOPEN and MSG_FASTOPEN
On Thu, 13 Dec 2012 04:24:54 +0100 (CET) benjamin.peterson python-check...@python.org wrote:

http://hg.python.org/cpython/rev/5435a9278028
changeset: 80834:5435a9278028
user: Benjamin Peterson benja...@python.org
date: Wed Dec 12 22:24:47 2012 -0500
summary: expose TCP_FASTOPEN and MSG_FASTOPEN
files:
  Misc/NEWS | 3 +++
  Modules/socketmodule.c | 7 ++-
  2 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/Misc/NEWS b/Misc/NEWS
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -163,6 +163,9 @@
 Library
 ---
+- Expose the TCP_FASTOPEN and MSG_FASTOPEN flags in socket when they're
+  available.

This should be documented, no? Regards Antoine.
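Since the NEWS entry says the flags are exposed only "when they're available", code that wants them has to check for their presence at run time. A minimal sketch follows; the helper name `fastopen_flags` is hypothetical, and only the two constant names come from the changeset.

```python
import socket


def fastopen_flags():
    """Return whichever TCP Fast Open constants this interpreter's socket module exposes."""
    names = ("TCP_FASTOPEN", "MSG_FASTOPEN")
    return {name: getattr(socket, name) for name in names if hasattr(socket, name)}


flags = fastopen_flags()
if not flags:
    print("TCP Fast Open constants are not available on this platform/build")
```

An empty result means the platform headers or the Python build did not provide the constants, so callers need a fallback path either way.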
Re: [Python-Dev] More compact dictionaries with faster iteration
On 12/13/2012 06:11 AM, PJ Eby wrote: On Wed, Dec 12, 2012 at 5:06 PM, Dag Sverre Seljebotn d.s.seljeb...@astro.uio.no wrote: Perfect hashing already separates hash table from contents (sort of), and saves the memory in much the same way. Yes, you can repeat the trick and have 2 levels of indirection, but that then requires an additional table of small ints which is pure overhead present for the sorting; in short, it's no longer an optimization but just overhead for the sortability. I'm confused. I understood your algorithm to require repacking, rather than it being a suitable algorithm for incremental change to an existing dictionary. ISTM that that would mean you still pay some sort of overhead (either in time or space) while the dictionary is still being mutated. As-is the algorithm just assumes all key-value-pairs are available at creation time. So yes, if you don't reallocate when making the dict perfect then it could make sense to combine it with the scheme discussed in this thread. If one does leave some free slots open there's some probability of an insertion working without complete repacking, but the probability is smaller than with a normal dict. Hybrid schemes and trade-offs in this direction could be possible. Also, I'm not sure how 2 levels of indirection come into it. The algorithm you describe has, as I understand it, only up to 12 perturbation values (bins), so it's a constant space overhead, not a variable one. What's more, you can possibly avoid the extra memory access by using a different perfect hashing algorithm, at the cost of a slower optimization step or using a little more memory. I said there's k perturbation values; you need an additional array some_int_t d[k] where some_int_t is large enough to hold n (the number of entries). Just like what's proposed in this thread. The paper recommends k 2*n, but in my experiments I could get away with k = n in 99.9% of the cases (given perfect entropy in the hashes...). 
So the overhead is roughly the same as what's proposed here. I think the most promising thing would be to always have a single integer table and either use it for indirection (usual case) or perfect hash function parameters (say, after a pack() method has been called and before new insertions). Note: I'm NOT suggesting the use of perfect hashing, just making sure its existence is mentioned and that one is aware that if always-ordered dicts become the language standard it precludes this option far off in the future. Not really. It means that some forms of perfect hashing might require adding a few more ints worth of overhead for the dictionaries that use it. If it's really a performance benefit for very-frequently-used dictionaries, that might still be worthwhile. As mentioned above the overhead is larger. I think the main challenge is to switch to a hashing scheme with larger entropy for strings, like murmurhash3. Having lots of zero bits in the string for short strings will kill the scheme, since it needs several attempts to succeed (the r parameter). So string hashing is slowed down a bit (given the caching I don't know how important this is). Ideally one should make sure the hashes are 64-bit on 64-bit platforms too (IIUC long is 32-bit on Windows but I don't know Windows well). But the main reason I say I'm not proposing it is I don't have time to code it up for demonstration and people like to have something to look at when they get proposals :-) Dag Sverre
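To give a feel for the kind of scheme Dag describes (a per-bin integer table `d[k]` that makes a 100%-full table collision-free), here is a rough "hash and displace" style sketch. It is not the algorithm from the paper he refers to and not a proposal for CPython: the function names, the use of SHA-256 as a seeded hash, and the choice of k = n bins are all assumptions made so the example is small and deterministic.

```python
import hashlib


def _h(key, seed, modulus):
    """Deterministic seeded hash of a string key, reduced modulo `modulus`."""
    digest = hashlib.sha256(("%s:%s" % (seed, key)).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % modulus


def build_perfect_table(keys, max_tries=100000):
    """Return (d, slots): d[b] is the parameter found for bin b, slots[i] the key in slot i."""
    keys = list(dict.fromkeys(keys))  # drop duplicates, keep order
    n = len(keys)
    k = n  # number of bins (assumption; see the discussion of k versus n above)
    bins = [[] for _ in range(k)]
    for key in keys:
        bins[_h(key, "bin", k)].append(key)
    d = [0] * k
    slots = [None] * n  # exactly n slots: the table ends up 100% full
    # Place the fullest bins first, while the table is still mostly empty.
    for b in sorted(range(k), key=lambda i: len(bins[i]), reverse=True):
        if not bins[b]:
            break
        for trial in range(1, max_tries + 1):
            targets = [_h(key, trial, n) for key in bins[b]]
            if len(set(targets)) == len(targets) and all(slots[t] is None for t in targets):
                for key, t in zip(bins[b], targets):
                    slots[t] = key
                d[b] = trial
                break
        else:
            raise RuntimeError("no parameter found for bin %d" % b)
    return d, slots


def lookup(key, d, slots):
    """Return the slot index of key, or None if key is not in the table."""
    if not slots:
        return None
    slot = _h(key, d[_h(key, "bin", len(d))], len(slots))
    return slot if slots[slot] == key else None
```

Every lookup costs one read of `d` and one read of `slots`, with no probing. The price is that the table has to be rebuilt when keys are added, which matches Dag's remark that the algorithm as-is assumes all key-value pairs are available at creation time.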
Re: [Python-Dev] Draft PEP for time zone support.
On 12/13/2012 1:39 AM, Glenn Linderman wrote: On 12/12/2012 6:10 PM, Janzert wrote: On 12/12/2012 8:43 PM, Glenn Linderman wrote: On 12/12/2012 5:36 PM, Brian Curtin wrote: C:\ProgramData\Python ^ That. Is not the path that the link below is talking about, though. It actually does; it is rather confusing though. :/ I agree with the below. But I have never seen a version of Windows on which c:\ProgramData was the actual path for FOLDERID_ProgramData. Can you reference documentation that states that it was there, for some version? This documentation speaks of: c:\Documents and Settings\AllUsers\Application Data (which I knew from XP, and I think 2000, not sure I remember NT) In Vista.0, Vista.1, and Vista.2, I guess it is moved to C:\users\AllUsers\AppData\Roaming (typically). Neither of those would result in C:\ProgramData\Python. The SO answer links to the KNOWNFOLDERID docs; the relevant entry specifically is at http://msdn.microsoft.com/en-us/library/windows/desktop/dd378457.aspx#FOLDERID_ProgramData which gives the default path as %ALLUSERSPROFILE% (%ProgramData%, %SystemDrive%\ProgramData). Checking on my local Windows 7 install gives:

C:\>echo %ALLUSERSPROFILE%
C:\ProgramData
C:\>echo %ProgramData%
C:\ProgramData

It's referring to KNOWNFOLDERID constant FOLDERID_ProgramData. The actual on disk location for this has changed over windows versions. As noted below in the SO link given: Note that this documentation refers to the typical path as per older versions of Windows. In modern versions of Windows it is located in %SystemDrive%\ProgramData. Making a new top-level directory without asking is obnoxious. See http://stackoverflow.com/questions/9518890/what-is-the-significance-programdata-in-windows