Re: fastest way to detect a user type
Robin Becker wrote:

> Whilst considering a port of old code to python 3 I see that in several places we are using type comparisons to control processing of user instances (as opposed to instances of built in types eg float, int, str)
>
> I find that the obvious alternatives are not as fast as the current code; func0 below. On my machine isinstance seems slower than type for some reason. My 2.6 timings are

First question is, why do you care that it's slower? The difference between the fastest and slowest functions is 1.16-0.33 = 0.83 microsecond. If you call the slowest function one million times, your code will run less than a second longer.

Does that really matter, or are you engaged in premature optimization? In your test functions, the branches all execute "pass". Your real code probably calls other functions, makes calculations, etc, which will all take time. Probably milliseconds rather than microseconds. I suspect you're concerned about a difference of 0.1 of a percent, of one small part of your entire application. Unless you have profiled your code and this really is a bottleneck, I recommend you worry more about making your code readable and maintainable than worrying about micro-optimisations.

Even more important than being readable is being *correct*, and I believe that your code has some unexpected failure modes (bugs). See below:

> so func 3 seems to be the fastest option for the case when the first test matches, but is poor when it doesn't. Can anyone suggest a better way to determine if an object is a user instance?
>
> ##
> from types import InstanceType

I believe this will go away in Python 3, as all classes will be New Style classes.

> class X:
>     __X__=True

This is an Old Style class in Python 2.x, and a New Style class in Python 3.

Using hasattr(ob, '__X__') is a curious way of detecting what you want. I suppose it could be argued that it is a variety of duck-typing: if it has a duck's bill, it must be a duck. (Unless it is a platypus, of course.)

However, attribute names with leading and trailing double-underscores are reserved for use as special methods. You should rename it to something more appropriate: _MAGIC_LABEL, say.

> class V(X):
>     pass
>
> def func0(ob):
>     t=type(ob)
>     if t is InstanceType:
>         pass

This test is too broad. It will succeed for an instance of *any* old-style class, not just X and V instances. That's probably not what you want.

It will also fail if ob is an instance of a New Style class. Remember that in Python 3, all classes become new-style.

>     elif t in (float, int):
>         pass

This test will fail if ob is an instance of a subclass of float or int. That's almost certainly the wrong behavior. A better way of writing that is:

    elif issubclass(t, (float, int)):
        pass

>     else:
>         pass
>
> def func1(ob):
>     if isinstance(ob,X):
>         pass

If you have to do type checking, that's the recommended way of doing so.

>     elif type(ob) in (float, int):
>         pass

The usual way to write that is:

    if isinstance(ob, (float, int)):
        pass

Hope this helps,

-- Steven
-- http://mail.python.org/mailman/listinfo/python-list
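
The point about subclasses of float or int can be seen in a short Python 3 sketch. The class `Celsius` and both helper functions are hypothetical names introduced only for illustration; they are not part of the original code.

```python
class Celsius(float):
    """Hypothetical float subclass, used only for illustration."""


def exact_type_check(ob):
    # Mirrors the thread's `type(ob) in (float, int)` test:
    # only exact float or int objects match.
    return type(ob) in (float, int)


def isinstance_check(ob):
    # The isinstance form also accepts instances of subclasses.
    return isinstance(ob, (float, int))


temp = Celsius(21.5)
print(exact_type_check(temp))   # False: type(temp) is Celsius, not float
print(isinstance_check(temp))   # True: Celsius is a subclass of float
print(isinstance_check(True))   # True: bool is a subclass of int
```

The last line is worth keeping in mind: because bool subclasses int, the isinstance form also lets True and False through, which may or may not be wanted.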
Re: fastest way to detect a user type
Steven D'Aprano st...@pearwood.info writes:

> First question is, why do you care that it's slower? The difference between the fastest and slowest functions is 1.16-0.33 = 0.83 microsecond.

That's a 71% speedup, pretty good if you ask me.

> If you call the slowest function one million times, your code will run less than a second longer.

What if you call it a billion times, or a trillion times, or a quadrillion times, you see where this is going? If you're testing 100-digit numbers, there are an awful lot of them before you run out.
Re: fastest way to detect a user type
Paul Rubin wrote:

> Steven D'Aprano st...@pearwood.info writes:
>> First question is, why do you care that it's slower? The difference between the fastest and slowest functions is 1.16-0.33 = 0.83 microsecond.
>
> That's a 71% speedup, pretty good if you ask me.

Don't you care that the code is demonstrably incorrect? The OP is investigating options to use in Python 3, but the fastest method will fail, because the "type is InstanceType" test will no longer work. (I believe the fastest method, as given, is incorrect even in Python 2.x, as it will accept ANY old-style class instead of just the relevant X or V classes.)

That reminds me of something that happened to my wife some years ago: she was in a van with her band's roadies, and one asked the driver "Are you sure you know where you're going?", to which the driver replied, "Who cares? We're making great time." (True story.)

If you're going to accept incorrect code in order to save time, then I can write even faster code:

    def func4(ob):
        pass

Try beating that for speed!

>> If you call the slowest function one million times, your code will run less than a second longer.
>
> What if you call it a billion times, or a trillion times, or a quadrillion times, you see where this is going?

It doesn't matter. The proportion of time saved will remain the same. If you run it a trillion times, you'll save 12 minutes in a calculation that takes 278 hours to run. Big Effing Deal. Saving such trivial amounts of time is not worth the cost of hard-to-read or incorrect code.

Of course, if you have profiled your code and discovered that *significant* amounts of time are being used in type-testing, *then* such a micro-optimization may be worth doing. But I already allowed for that:

"Does that really matter...?" (the answer could be Yes)

"Unless you have profiled your code and this really is a bottleneck ..." (it could be)

> If you're testing 100-digit numbers, there are an awful lot of them before you run out.

Yes. So what? Once you've tested them, then what? If *all* you are doing is testing them, your application is pretty boring. Even a print statement afterwards is going to take 1000 times longer than doing the type-test. In any useful application, the amount of time used in type-testing is almost surely going to be a small fraction of the total runtime. A 71% speedup on 50% of the runtime is significant; but a 71% speedup on 0.1% of the total execution time is not.

-- Steven
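
The closing comparison (a 71% speedup on 50% of the runtime versus on 0.1% of it) can be worked through numerically. This is a minimal sketch, assuming "71% speedup" means the type test itself takes 71% less time, as in the 1.16 to 0.33 microsecond figures; the function name is made up for illustration.

```python
def overall_saving(fraction_of_runtime, local_reduction):
    """Fraction of total runtime saved.

    fraction_of_runtime: share of the total runtime spent in the
        optimized part (0.0 to 1.0).
    local_reduction: proportion by which that part gets faster
        (0.71 means it takes 71% less time).
    """
    return fraction_of_runtime * local_reduction


for fraction in (0.50, 0.001):
    saving = overall_saving(fraction, 0.71)
    print(f"type test is {fraction:.1%} of runtime: total saving {saving:.3%}")
```

Under these assumptions the saving is about 35.5% of the total when type testing is half the runtime, and about 0.071% when it is one thousandth of it.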
Re: fastest way to detect a user type
Steven D'Aprano wrote:

> Paul Rubin wrote:
>> Steven D'Aprano st...@pearwood.info writes:
>>> First question is, why do you care that it's slower? The difference between the fastest and slowest functions is 1.16-0.33 = 0.83 microsecond.
>>
>> That's a 71% speedup, pretty good if you ask me.
>
> Don't you care that the code is demonstrably incorrect? The OP is investigating options to use in Python 3, but the fastest method will fail, because the "type is InstanceType" test will no longer work. (I believe the fastest method, as given, is incorrect even in Python 2.x, as it will accept ANY old-style class instead of just the relevant X or V classes.)

I'm not clear why this is true? Not all instances will have the __X__ attribute or has something else changed in Python3?

The original code was intended to be called with only a subset of all class instances being passed as argument; as currently written it was unsafe because an instance of an arbitrary old class would pass into branch 1. Of course it will still be unsafe as arbitrary instances end up in branch 3.

The intent is to firm up the set of cases being accepted in the first branch. The problem is that when all instances are new style then there's no easy check for the other acceptable arguments eg float,int, str etc, as I see it, the instances must be of a known class or have a distinguishing attribute.

As for the timing, when I tried the effect of func1 on our unit tests I noticed that it slowed the whole test suite by 0.5%. Luckily func 3 style improved things by about 0.3% so that's what I'm going for.

-- Robin Becker
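
One way to read "the instances must be of a known class" in Python 3 is a shared marker base class in place of the `__X__` attribute. The sketch below uses the standard `abc` module; `UserType`, `Legacy`, and `classify` are hypothetical names, and this is an illustration under those assumptions, not the approach used in the original code base.

```python
from abc import ABC


class UserType(ABC):
    """Hypothetical marker base class for the accepted user classes."""


class X(UserType):
    pass


class V(X):
    pass


class Legacy:
    """Hypothetical class that cannot be changed to inherit from UserType."""


# register() makes isinstance(ob, UserType) succeed for Legacy instances
# without touching the class definition.
UserType.register(Legacy)


def classify(ob):
    # Branch 1: known user classes, including registered ones.
    if isinstance(ob, UserType):
        return "user"
    # Branch 2: the built-in types named in the thread.
    if isinstance(ob, (float, int, str)):
        return "builtin"
    # Branch 3: everything else.
    return "other"


print(classify(V()), classify(Legacy()), classify(1.5), classify([]))
```

An isinstance test against an ABC goes through extra machinery and is likely slower than a test against a plain class, so it would need measuring before use in a hot path.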
Re: fastest way to detect a user type
Robin Becker wrote:

> Steven D'Aprano wrote:
>> Paul Rubin wrote:
>>> Steven D'Aprano st...@pearwood.info writes:
>>>> First question is, why do you care that it's slower? The difference between the fastest and slowest functions is 1.16-0.33 = 0.83 microsecond.
>>>
>>> That's a 71% speedup, pretty good if you ask me.
>>
>> Don't you care that the code is demonstrably incorrect? The OP is investigating options to use in Python 3, but the fastest method will fail, because the "type is InstanceType" test will no longer work. (I believe the fastest method, as given, is incorrect even in Python 2.x, as it will accept ANY old-style class instead of just the relevant X or V classes.)
>
> I'm not clear why this is true? Not all instances will have the __X__ attribute or has something else changed in Python3?

The func0() test doesn't look for __X__.

> The original code was intended to be called with only a subset of all class instances being passed as argument; as currently written it was unsafe because an instance of an arbitrary old class would pass into branch 1. Of course it will still be unsafe as arbitrary instances end up in branch 3
>
> The intent is to firm up the set of cases being accepted in the first branch. The problem is that when all instances are new style then there's no easy check for the other acceptable arguments eg float,int, str etc,

Of course there is. isinstance(ob, (float, int)) is the easy, and correct, way to check if ob is a float or int.

> as I see it, the instances must be of a known class or have a distinguishing attribute.

Are you sure you need to check for different types in the first place? Just how polymorphic is your code, really? It's hard to judge because I don't know what your code actually does.

> As for the timing, when I tried the effect of func1 on our unit tests I noticed that it slowed the whole test suite by 0.5%.

An entire half a percent slower. Wow. That's like one minute versus one minute and 0.3 second. Or one hour, versus one hour and 18 seconds.

I find it very difficult to get worked up over such small differences. I think you're guilty of premature optimization: wasting time and energy trying to speed up parts of the code that are trivial. (Of course I could be wrong, but I doubt it.)

> Luckily func 3 style improved things by about 0.3% so that's what I'm going for.

I would call that the worst solution. Not only are you storing an attribute which is completely redundant (instances already know what type they are, you don't need to manually store a badge on them to mark them as an instance of a class), but you're looking up this attribute only to immediately throw away the value you get.

The only excuse for this extra redirection would be if it were significantly faster. But it isn't: you said it yourself, 0.3% speed up. That's like 60 seconds versus 59.82 seconds.

-- Steven
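
Whether the marker-attribute lookup is actually faster than isinstance on a given interpreter is something to measure rather than assume. The following is a rough Python 3 sketch using the standard `timeit` module; the attribute name `marker`, the helper names, and the loop count are assumptions chosen for illustration, and the numbers will differ from the 2.6 timings quoted in the thread.

```python
import timeit


class X:
    marker = True  # hypothetical stand-in for the thread's __X__ attribute


def by_isinstance(ob):
    return isinstance(ob, X)


def by_hasattr(ob):
    return hasattr(ob, "marker")


def measure(func, arg, number=100_000):
    """Best per-call time in microseconds over 3 repeats."""
    best = min(timeit.repeat(lambda: func(arg), number=number, repeat=3))
    return best / number * 1e6


results = {
    "isinstance, match": measure(by_isinstance, X()),
    "hasattr, match": measure(by_hasattr, X()),
    "isinstance, no match": measure(by_isinstance, 1),
    "hasattr, no match": measure(by_hasattr, 1),
}

for label, usec in results.items():
    print(f"{label}: {usec:.3f} usec per call")
```

Each measurement includes the overhead of the wrapping lambda, so only the differences between rows are meaningful, and a profile of the real application is still needed to tell whether those differences matter.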
fastest way to detect a user type
Whilst considering a port of old code to python 3 I see that in several places we are using type comparisons to control processing of user instances (as opposed to instances of built in types eg float, int, str)

I find that the obvious alternatives are not as fast as the current code; func0 below. On my machine isinstance seems slower than type for some reason. My 2.6 timings are

C:\Tmp\Python\lib\timeit.py -s"import t;v=t.X()" "t.func0(v)"
100 loops, best of 3: 0.348 usec per loop

C:\Tmp\Python\lib\timeit.py -s"import t;v=t.X()" "t.func1(v)"
100 loops, best of 3: 0.747 usec per loop

C:\Tmp\Python\lib\timeit.py -s"import t;v=t.X()" "t.func2(v)"
100 loops, best of 3: 0.378 usec per loop

C:\Tmp\Python\lib\timeit.py -s"import t;v=t.X()" "t.func3(v)"
100 loops, best of 3: 0.33 usec per loop

C:\Tmp\Python\lib\timeit.py -s"import t;v=t.X()" "t.func0(1)"
100 loops, best of 3: 0.477 usec per loop

C:\Tmp\Python\lib\timeit.py -s"import t;v=t.X()" "t.func1(1)"
100 loops, best of 3: 1.14 usec per loop

C:\Tmp\Python\lib\timeit.py -s"import t;v=t.X()" "t.func2(1)"
100 loops, best of 3: 1.16 usec per loop

C:\Tmp\Python\lib\timeit.py -s"import t;v=t.X()" "t.func3(1)"
100 loops, best of 3: 1.14 usec per loop

so func 3 seems to be the fastest option for the case when the first test matches, but is poor when it doesn't. Can anyone suggest a better way to determine if an object is a user instance?

##
from types import InstanceType

class X:
    __X__=True

class V(X):
    pass

def func0(ob):
    t=type(ob)
    if t is InstanceType:
        pass
    elif t in (float, int):
        pass
    else:
        pass

def func1(ob):
    if isinstance(ob,X):
        pass
    elif type(ob) in (float, int):
        pass
    else:
        pass

def func2(ob):
    if getattr(ob,'__X__',False):
        pass
    elif type(ob) in (float, int):
        pass
    else:
        pass

def func3(ob):
    if hasattr(ob,'__X__'):
        pass
    elif type(ob) in (float, int):
        pass
    else:
        pass
##

-- Robin Becker
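
For the Python 3 port itself, the func0 approach has nothing to test against: the `types` module no longer provides `InstanceType`, and `type(ob)` on an instance returns its class. A small sketch, meant to be run on Python 3, with X and V redefined here without the `__X__` attribute purely for illustration:

```python
import types


class X:
    pass


class V(X):
    pass


# The name func0 imports is absent from the types module on Python 3.
has_instance_type = hasattr(types, "InstanceType")
print(has_instance_type)      # False on Python 3

# type() now returns the actual class, so there is no single shared
# "instance" type to compare against.
print(type(X()) is X)         # True
print(type(V()) is X)         # False: an exact type test misses subclasses
print(isinstance(V(), X))     # True: isinstance follows inheritance
```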