[issue23103] ipaddress should be Flyweight

2014-12-26 Thread Antoine Pitrou
Antoine Pitrou added the comment: So, there are several aspects to this request: - add a __slots__: this is Serhiy's patch - add some weak interning of IP addresses: I believe it is ok to add it as an optimization, provided it doesn't otherwise change the API or the classes' behaviour Seth:

[issue23103] ipaddress should be Flyweight

2014-12-24 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Here is a patch which makes IPv4Address and IPv6Address more lightweight (for other classes the size is not so important). It makes _version the class attribute and adds __slots__. -- keywords: +patch resolution: wont fix - stage: - patch review

[issue23103] ipaddress should be Flyweight

2014-12-24 Thread Eric Snow
Eric Snow added the comment: @sbromberger, there's no need for your own package. Just use something like this: _addr_cache = {} def ipaddr(addr): try: return _addr_cache[addr] except KeyError: _addr_cache[addr] = ipaddress.ipaddress(addr) return

[issue23103] ipaddress should be Flyweight

2014-12-24 Thread Josh Rosenberg
Josh Rosenberg added the comment: Serhiy: I believe you need to add a bunch of __slots__ = () to various base classes in the module, even though they lack member variables. Every link in the inheritance chain must declare __slots__, or the child class will have __dict__ and __weakref__.

[issue23103] ipaddress should be Flyweight

2014-12-23 Thread Seth Bromberger
New submission from Seth Bromberger: ipaddress.ip_address instances should be flyweight. There's really no reason to make them mutable: a = ipaddress.ip_address(10.1.2.3) b = ipaddress.ip_address(10.1.2.3) id(a) 140066533772368 id(b) 140066504622544 Especially with IPv6 and large numbers

[issue23103] ipaddress should be Flyweight

2014-12-23 Thread Seth Bromberger
Seth Bromberger added the comment: Confirmed on 3.4.0; likely exists in previous versions as well. -- versions: +Python 3.4 -Python 3.6 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue23103

[issue23103] ipaddress should be Flyweight

2014-12-23 Thread Berker Peksag
Changes by Berker Peksag berker.pek...@gmail.com: -- nosy: +ncoghlan, pmoody ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue23103 ___ ___

[issue23103] ipaddress should be Flyweight

2014-12-23 Thread Josh Rosenberg
Josh Rosenberg added the comment: What is your proposal? WeakValueDictionary mapping raw bytes object to single instance of ipaddress that is queried from ipaddress's __new__? No built-in has quite that extensive a level of caching and aggressive deduplication to my knowledge. --

[issue23103] ipaddress should be Flyweight

2014-12-23 Thread Seth Bromberger
Seth Bromberger added the comment: What is your proposal? WeakValueDictionary mapping raw bytes object to single instance of ipaddress that is queried from ipaddress's __new__? No built-in has quite that extensive a level of caching and aggressive deduplication to my knowledge. I was

[issue23103] ipaddress should be Flyweight

2014-12-23 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: IP address instances already are immutable and flyweight. There are no mutating methods. And a number of address attributes (packed, is_loopback, etc) are calculated on fly from integer representation. But IP address objects can be lighter. 1) Use

[issue23103] ipaddress should be Flyweight

2014-12-23 Thread R. David Murray
Changes by R. David Murray rdmur...@bitdance.com: -- nosy: +r.david.murray ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue23103 ___ ___

[issue23103] ipaddress should be Flyweight

2014-12-23 Thread Seth Bromberger
Seth Bromberger added the comment: 1) However: b = ipaddress.IPv4Address('1.2.3.4') a = ipaddress.IPv4Address('1.2.3.4') id(a) 4428380928 id(b) 4428381768 a == b True b._ip += 6 id(b) 4428381768 b IPv4Address('1.2.3.10') 2) Isn’t _version already a class attribute? It’s set in

[issue23103] ipaddress should be Flyweight

2014-12-23 Thread R. David Murray
R. David Murray added the comment: If ipaddress objects are already immutable, it seems to me that having a registry that makes it so there is only one instance per ip value isn't a crazy idea. I've written such code in other contexts and it's not complicated. We have caching of identifier

[issue23103] ipaddress should be Flyweight

2014-12-23 Thread Seth Bromberger
Seth Bromberger added the comment: The opposite argument is that it may be better left up to the application that has to handle lots of ips to do the caching according to what it knows to be an optimum pattern. I'd agree with you wholeheartedly if ipaddress wasn't part of stdlib, but the

[issue23103] ipaddress should be Flyweight

2014-12-23 Thread Raymond Hettinger
Raymond Hettinger added the comment: Adding __slots__ is reasonable. Adding caching to fold together repeated addresses is dubious (it comes with its own space and time cost but won't benefit all users) and, as David said, the decision on how to handle this is up to the user. Also, it is

[issue23103] ipaddress should be Flyweight

2014-12-23 Thread Josh Rosenberg
Josh Rosenberg added the comment: So, just to be clear, checking the implementation (as of 3.4): 1. All the ipaddress classes leave __slots__ unset. So the overhead is actually higher than 56 bytes per instance; the __dict__ for an IPv4Address (ignoring the actual keys and values in the dict,

[issue23103] ipaddress should be Flyweight

2014-12-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Josh, you might want to have a look at the lightning talk on this page and the associated slides... http://www.egenix.com/library/presentations/PyCon-UK-2014-When-performance-matters/ After having done the tests, using __slots__ is what I consider the

[issue23103] ipaddress should be Flyweight

2014-12-23 Thread Seth Bromberger
Seth Bromberger added the comment: As a test, I tried the following (taken mostly from http://codesnipers.com/?q=python-flyweights): class Foo(object): _Foo = weakref.WeakValueDictionary() def __new__(cls, addr): obj = Foo._Foo.get(addr, None) if obj is None:

[issue23103] ipaddress should be Flyweight

2014-12-23 Thread Seth Bromberger
Seth Bromberger added the comment: Sorry for the additional followup, but I re-ran the test with approximately real-world parameters. In this second test, I created 10 million Foo() instances with random values from 1-2000 (using random.randint). This corresponds to 2000 hosts generating 10

[issue23103] ipaddress should be Flyweight

2014-12-23 Thread Josh Rosenberg
Josh Rosenberg added the comment: Marc-Andre: Oh, I know. I proselytize to coworkers on the virtues of using __slots__ for classes that will have many instances created (particularly since entirely too much of our stuff is still Py2.7, so we don't have the free savings from shared key

[issue23103] ipaddress should be Flyweight

2014-12-23 Thread Seth Bromberger
Seth Bromberger added the comment: I'm just pointing out that if he thinks 56 bytes per object is too large, he's going to be disappointed with what Python has to offer. General purpose user-defined Python objects don't optimize for low memory usage, and even __slots__ only gets you so far.