Re: fastest way to detect a user type

2009-02-01 Thread Steven D'Aprano
Robin Becker wrote:

 Whilst considering a port of old code to python 3 I see that in several
 places we are using type comparisons to control processing of user
 instances (as opposed to instances of built in types eg float, int, str)
 
 I find that the obvious alternatives are not as fast as the current
 code; func0 below. On my machine isinstance seems slower than type for
 some reason. My 2.6 timings are

First question is, why do you care that it's slower? The difference between
the fastest and slowest functions is 1.16-0.33 = 0.83 microsecond. If you
call the slowest function one million times, your code will run less than a
second longer.

Does that really matter, or are you engaged in premature optimization? In
your test functions, the branches all execute pass. Your real code
probably calls other functions, makes calculations, etc, which will all
take time. Probably milliseconds rather than microseconds. I suspect you're
concerned about a difference of 0.1 of a percent, of one small part of your
entire application. Unless you have profiled your code and this really is a
bottleneck, I recommend you worry more about making your code readable and
maintainable than worrying about micro-optimisations.
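
Before deciding whether the check matters, it can be measured directly. The following is a minimal sketch for Python 3 using the standard timeit module; the helper names (check_isinstance, check_type, best_usec) are hypothetical, and the numbers it prints will vary from machine to machine.

```python
import timeit


class X:
    pass


def check_isinstance(ob):
    return isinstance(ob, X)


def check_type(ob):
    return type(ob) is X


def best_usec(func, arg, number=100_000, repeat=3):
    """Best observed time per call, in microseconds.

    The lambda wrapper adds a constant overhead to every measurement,
    so only the differences between results are meaningful.
    """
    timer = timeit.Timer(lambda: func(arg))
    return min(timer.repeat(repeat=repeat, number=number)) / number * 1e6


for func in (check_isinstance, check_type):
    for arg in (X(), 1):
        print(f"{func.__name__}({type(arg).__name__}): "
              f"{best_usec(func, arg):.3f} usec per call")
```

Comparing these figures with the time the surrounding branch takes in the real application shows whether the check is worth optimizing at all.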

Even more important than being readable is being *correct*, and I believe
that your code has some unexpected failure modes (bugs). See below:



 so func 3 seems to be the fastest option for the case when the first
 test matches, but is poor when it doesn't. Can anyone suggest a better
 way to determine if an object is a user instance?
 
 ##
 from types import InstanceType

I believe this will go away in Python 3, as all classes will be New Style
classes.


 class X:
  __X__=True

This is an Old Style class in Python 2.x, and a New Style class in Python 3.

Using hasattr(ob, '__X__') is a curious way of detecting what you want. I
suppose it could be argued that it is a variety of duck-typing: if it has
a duck's bill, it must be a duck. (Unless it is a platypus, of course.)
However, attribute names with leading and trailing double-underscores are
reserved for use as special methods. You should rename it to something
more appropriate: _MAGIC_LABEL, say.
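
The platypus problem can be sketched as follows. This is an illustration only: the Platypus class and both helper functions are hypothetical, and the marker uses the _MAGIC_LABEL name suggested above.

```python
class X:
    _MAGIC_LABEL = True


class V(X):
    pass


class Platypus:
    # An unrelated class that happens to carry the same marker.
    _MAGIC_LABEL = True


def looks_like_x(ob):
    # Marker test: accepts anything that has the attribute.
    return hasattr(ob, "_MAGIC_LABEL")


def is_x(ob):
    # Type test: accepts only X and its subclasses.
    return isinstance(ob, X)


print(looks_like_x(V()), is_x(V()))
print(looks_like_x(Platypus()), is_x(Platypus()))
```

The marker test accepts the unrelated Platypus instance, while the isinstance test accepts only X and its subclasses.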

 
 class V(X):
  pass
 
 def func0(ob):
  t=type(ob)
  if t is InstanceType:
   pass

This test is too broad. It will succeed for *any* old-style class, not just
X and V instances. That's probably not what you want.

It will also fail if ob is an instance of a New Style class. Remember that
in Python 3, all classes become new-style.
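
A small sketch of why this test cannot survive the port. It assumes it is run under Python 3, where the types module no longer provides InstanceType; the helper name is_old_style_instance and the Unrelated class are hypothetical.

```python
try:
    from types import InstanceType  # available in Python 2 only
except ImportError:
    InstanceType = None


class X:
    pass


class Unrelated:
    pass


def is_old_style_instance(ob):
    """Mimic the first test in func0.

    Under Python 2 this is True for instances of *any* old-style class.
    Under Python 3 InstanceType does not exist, so it is always False.
    """
    return InstanceType is not None and type(ob) is InstanceType


print(is_old_style_instance(X()), is_old_style_instance(Unrelated()))
print(type(X()) is X)  # in Python 3 the type of an instance is its class
```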


  elif t in (float, int):
   pass

This test will fail if ob is an instance of a subclass of float or int.
That's almost certainly the wrong behavior. A better way of writing that is:

elif issubclass(t, (float, int)):
    pass
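
The difference between the two tests shows up as soon as a subclass is involved. A minimal sketch, assuming a hypothetical float subclass named Celsius:

```python
class Celsius(float):
    """Hypothetical subclass of float, used only for illustration."""


temp = Celsius(21.5)

# Exact type comparison misses the subclass.
exact_match = type(temp) in (float, int)

# Both of these accept subclasses.
subclass_match = issubclass(type(temp), (float, int))
instance_match = isinstance(temp, (float, int))

# Note that bool is itself a subclass of int, so True and False
# also pass the subclass-aware tests.
bool_match = isinstance(True, int)

print(exact_match, subclass_match, instance_match, bool_match)
# prints: False True True True
```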


  else:
   pass
 
 def func1(ob):
  if isinstance(ob,X):
   pass

If you have to do type checking, that's the recommended way of doing so.



  elif type(ob) in (float, int):
   pass

The usual way to write that is:

if isinstance(ob, (float, int)):
    pass



Hope this helps,


-- 
Steven

--
http://mail.python.org/mailman/listinfo/python-list


Re: fastest way to detect a user type

2009-02-01 Thread Paul Rubin
Steven D'Aprano st...@pearwood.info writes:
 First question is, why do you care that it's slower? The difference between
 the fastest and slowest functions is 1.16-0.33 = 0.83 microsecond. 

That's a 71% speedup, pretty good if you ask me.

 If you call the slowest function one million times, your code will
 run less than a second longer.

What if you call it a billion times, or a trillion times, or a
quadrillion times, you see where this is going?  If you're testing
100-digit numbers, there are an awful lot of them before you run out.


Re: fastest way to detect a user type

2009-02-01 Thread Steven D'Aprano
Paul Rubin wrote:

 Steven D'Aprano st...@pearwood.info writes:
 First question is, why do you care that it's slower? The difference
 between the fastest and slowest functions is 1.16-0.33 = 0.83
 microsecond.
 
 That's a 71% speedup, pretty good if you ask me.

Don't you care that the code is demonstrably incorrect? The OP is
investigating options to use in Python 3, but the fastest method will fail,
because the "type is InstanceType" test will no longer work. (I believe the
fastest method, as given, is incorrect even in Python 2.x, as it will
accept ANY old-style class instead of just the relevant X or V classes.)

That reminds me of something that happened to my wife some years ago: she
was in a van with her band's roadies, and one asked the driver "Are you
sure you know where you're going?", to which the driver replied, "Who
cares? We're making great time." (True story.)

If you're going to accept incorrect code in order to save time, then I can
write even faster code:

def func4(ob):
    pass

Try beating that for speed!


 If you call the slowest function one million times, your code will
 run less than a second longer.
 
 What if you call it a billion times, or a trillion times, or a
 quadrillion times, you see where this is going?

It doesn't matter. The proportion of time saved will remain the same. If you
run it a trillion times, you'll save 12 minutes in a calculation that takes
278 hours to run. Big Effing Deal. Saving such trivial amounts of time is
not worth the cost of hard-to-read or incorrect code.

Of course, if you have profiled your code and discovered that *significant*
amounts of time are being used in type-testing, *then* such a
micro-optimization may be worth doing. But I already allowed for that:

"Does that really matter...?"
(the answer could be Yes)

"Unless you have profiled your code and this really is a bottleneck ..."
(it could be)


 If you're testing 
 100-digit numbers, there are an awful lot of them before you run out.

Yes. So what? Once you've tested them, then what? If *all* you are doing
is testing them, your application is pretty boring. Even a print
statement afterwards is going to take 1000 times longer than doing the
type-test. In any useful application, the amount of time used in
type-testing is almost surely going to be a small fraction of the total
runtime. A 71% speedup on 50% of the runtime is significant; but a 71%
speedup on 0.1% of the total execution time is not.
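
The arithmetic behind that comparison can be checked with a short sketch. It assumes, as the 71% figure does, that the "speedup" is the fraction of the check's own time that is removed; the function name overall_saving is hypothetical.

```python
def overall_saving(fraction_of_runtime, local_saving):
    """Fraction of total runtime saved when a part that accounts for
    `fraction_of_runtime` of the total has `local_saving` of its own
    time removed."""
    return fraction_of_runtime * local_saving


# Slowest versus fastest timing from the thread, in microseconds.
local = (1.16 - 0.33) / 1.16  # roughly 0.716, the "71%" figure

print(f"local saving:           {local:.1%}")
print(f"check is 50% of runtime: {overall_saving(0.5, local):.2%} of total")
print(f"check is 0.1% of runtime: {overall_saving(0.001, local):.4%} of total")
```

With the check at half of the runtime, the total drops by roughly a third; with the check at 0.1% of the runtime, the total drops by well under a tenth of a percent.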



-- 
Steven



Re: fastest way to detect a user type

2009-02-01 Thread Robin Becker

Steven D'Aprano wrote:
 Paul Rubin wrote:
 
 Steven D'Aprano st...@pearwood.info writes:
 First question is, why do you care that it's slower? The difference
 between the fastest and slowest functions is 1.16-0.33 = 0.83
 microsecond.
 
 That's a 71% speedup, pretty good if you ask me.
 
 Don't you care that the code is demonstrably incorrect? The OP is
 investigating options to use in Python 3, but the fastest method will fail,
 because the "type is InstanceType" test will no longer work. (I believe the
 fastest method, as given, is incorrect even in Python 2.x, as it will
 accept ANY old-style class instead of just the relevant X or V classes.)


I'm not clear why this is true? Not all instances will have the __X__ 
attribute or has something else changed in Python3?


The original code was intended to be called with only a subset of all 
class instances being passed as argument; as currently written it was 
unsafe because an instance of an arbitrary old class would pass into 
branch 1. Of course it will still be unsafe as arbitrary instances end 
up in branch 3


The intent is to firm up the set of cases being accepted in the first 
branch. The problem is that when all instances are new style then 
there's no easy check for the other acceptable arguments eg float,int, 
str etc, as I see it, the instances must be of a known class or have a 
distinguishing attribute.


As for the timing, when I tried the effect of func1 on our unit tests I 
noticed that it slowed the whole test suite by 0.5%. Luckily func 3 
style improved things by about 0.3% so that's what I'm going for.

--
Robin Becker


Re: fastest way to detect a user type

2009-02-01 Thread Steven D'Aprano
Robin Becker wrote:

 Steven D'Aprano wrote:
 Paul Rubin wrote:
 
 Steven D'Aprano st...@pearwood.info writes:
 First question is, why do you care that it's slower? The difference
 between the fastest and slowest functions is 1.16-0.33 = 0.83
 microsecond.
 That's a 71% speedup, pretty good if you ask me.
 
 Don't you care that the code is demonstrably incorrect? The OP is
 investigating options to use in Python 3, but the fastest method will
 fail, because the "type is InstanceType" test will no longer work. (I
 believe the fastest method, as given, is incorrect even in Python 2.x, as
 it will accept ANY old-style class instead of just the relevant X or V
 classes.)
 
 I'm not clear why this is true? Not all instances will have the __X__
 attribute or has something else changed in Python3?

The func0() test doesn't look for __X__.


 The original code was intended to be called with only a subset of all
 class instances being passed as argument; as currently written it was
 unsafe because an instance of an arbitrary old class would pass into
 branch 1. Of course it will still be unsafe as arbitrary instances end
 up in branch 3
 
 The intent is to firm up the set of cases being accepted in the first
 branch. The problem is that when all instances are new style then
 there's no easy check for the other acceptable arguments eg float,int,
 str etc, 

Of course there is.

isinstance(ob, (float, int)) 

is the easy, and correct, way to check if ob is a float or int.
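
Put together, a version of the dispatch that runs under Python 3 might look like the sketch below. The classify function and its return labels are hypothetical stand-ins for the three branches; the real code presumably does its own processing in each.

```python
class X:
    pass


class V(X):
    pass


def classify(ob):
    """Return a label naming the branch that would handle `ob`."""
    if isinstance(ob, X):
        # Instances of X or of any subclass such as V.
        return "user"
    elif isinstance(ob, (float, int)):
        # Numbers, including subclasses of float and int.
        return "number"
    else:
        return "other"


for value in (X(), V(), 1, 2.5, "text"):
    print(type(value).__name__, "->", classify(value))
```

Because the first branch names the accepted class explicitly, an instance of some arbitrary class falls through to the last branch instead of being accepted by accident.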


 as I see it, the instances must be of a known class or have a 
 distinguishing attribute.

Are you sure you need to check for different types in the first place? Just
how polymorphic is your code, really? It's hard to judge because I don't
know what your code actually does.


 As for the timing, when I tried the effect of func1 on our unit tests I
 noticed that it slowed the whole test suite by 0.5%. 

An entire half a percent slower. Wow.

That's like one minute versus one minute and 0.3 second. Or one hour, versus
one hour and 18 seconds. I find it very difficult to get worked up over
such small differences. I think you're guilty of premature optimization:
wasting time and energy trying to speed up parts of the code that are
trivial. (Of course I could be wrong, but I doubt it.)



 Luckily func 3 
 style improved things by about 0.3% so that's what I'm going for.

I would call that the worst solution. Not only are you storing an attribute
which is completely redundant (instances already know what type they are,
you don't need to manually store a badge on them to mark them as an
instance of a class), but you're looking up this attribute only to
immediately throw away the value you get. The only excuse for this extra
redirection would be if it were significantly faster. But it isn't: you
said it yourself, 0.3% speed up. That's like 60 seconds versus 59.82
seconds.


-- 
Steven



fastest way to detect a user type

2009-01-31 Thread Robin Becker
Whilst considering a port of old code to python 3 I see that in several 
places we are using type comparisons to control processing of user 
instances (as opposed to instances of built in types eg float, int, str)


I find that the obvious alternatives are not as fast as the current 
code; func0 below. On my machine isinstance seems slower than type for 
some reason. My 2.6 timings are


C:\Tmp\Python\lib\timeit.py -s"import t;v=t.X()" "t.func0(v)"
100 loops, best of 3: 0.348 usec per loop

C:\Tmp\Python\lib\timeit.py -s"import t;v=t.X()" "t.func1(v)"
100 loops, best of 3: 0.747 usec per loop

C:\Tmp\Python\lib\timeit.py -s"import t;v=t.X()" "t.func2(v)"
100 loops, best of 3: 0.378 usec per loop

C:\Tmp\Python\lib\timeit.py -s"import t;v=t.X()" "t.func3(v)"
100 loops, best of 3: 0.33 usec per loop

C:\Tmp\Python\lib\timeit.py -s"import t;v=t.X()" "t.func0(1)"
100 loops, best of 3: 0.477 usec per loop

C:\Tmp\Python\lib\timeit.py -s"import t;v=t.X()" "t.func1(1)"
100 loops, best of 3: 1.14 usec per loop

C:\Tmp\Python\lib\timeit.py -s"import t;v=t.X()" "t.func2(1)"
100 loops, best of 3: 1.16 usec per loop

C:\Tmp\Python\lib\timeit.py -s"import t;v=t.X()" "t.func3(1)"
100 loops, best of 3: 1.14 usec per loop

so func 3 seems to be the fastest option for the case when the first 
test matches, but is poor when it doesn't. Can anyone suggest a better 
way to determine if an object is a user instance?


##
from types import InstanceType
class X:
    __X__=True

class V(X):
    pass

def func0(ob):
    t=type(ob)
    if t is InstanceType:
        pass
    elif t in (float, int):
        pass
    else:
        pass

def func1(ob):
    if isinstance(ob,X):
        pass
    elif type(ob) in (float, int):
        pass
    else:
        pass

def func2(ob):
    if getattr(ob,'__X__',False):
        pass
    elif type(ob) in (float, int):
        pass
    else:
        pass

def func3(ob):
    if hasattr(ob,'__X__'):
        pass
    elif type(ob) in (float, int):
        pass
    else:
        pass
##

--
Robin Becker