[Python-Dev] operator.is*Type

2006-02-22 Thread Fuzzyman
Hello all,

Feel free to shoot this down, but a suggestion.

The operator module defines two functions:

isMappingType
isSequenceType


These return a guesstimation as to whether an object passed in supports 
the mapping and sequence protocols.

These protocols are loosely defined. Any object which has a 
``__getitem__`` method defined could support either protocol.

Therefore:

>>> from operator import isSequenceType, isMappingType
>>> class anything(object):
...     def __getitem__(self, index):
...         pass
...
>>> something = anything()
>>> isMappingType(something)
True
>>> isSequenceType(something)
True

I suggest we either deprecate these functions as worthless, *or* we 
define the protocols slightly more clearly for user defined classes.

An object prima facie supports the mapping protocol if it defines a 
``__getitem__`` method, and a ``keys`` method.

An object prima facie supports the sequence protocol if it defines a 
``__getitem__`` method, and *not* a ``keys`` method.

As a result, code which needs to be able to tell the difference can use 
these functions and can sensibly refer to the definition of the mapping 
and sequence protocols when documenting what sort of objects an API call 
can accept.
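
As a rough illustration of the convention proposed above, here is a minimal
sketch. The helper names looks_like_mapping and looks_like_sequence are made
up for this example; they are not part of the operator module, and the real
functions would presumably keep their existing names.

```python
# Hypothetical helpers illustrating the proposed convention; these names
# are assumptions and do not exist in the operator module.

def looks_like_mapping(obj):
    # Prima facie a mapping: __getitem__ plus a keys method.
    return hasattr(obj, '__getitem__') and hasattr(obj, 'keys')

def looks_like_sequence(obj):
    # Prima facie a sequence: __getitem__ but no keys method.
    return hasattr(obj, '__getitem__') and not hasattr(obj, 'keys')

class anything(object):
    def __getitem__(self, index):
        pass

something = anything()
print(looks_like_mapping(something))   # False: no keys method
print(looks_like_sequence(something))  # True
print(looks_like_mapping({}))          # True
print(looks_like_sequence([]))         # True
```

Under this convention the two answers can no longer both be True for the
same object, which is the point of the suggestion.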

All the best,

Michael Foord
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] operator.is*Type

2006-02-22 Thread Raymond Hettinger
> >>> from operator import isSequenceType, isMappingType
> >>> class anything(object):
> ...     def __getitem__(self, index):
> ...         pass
> ...
> >>> something = anything()
> >>> isMappingType(something)
> True
> >>> isSequenceType(something)
> True
>
> I suggest we either deprecate these functions as worthless, *or* we
> define the protocols slightly more clearly for user defined classes.

They are not worthless.  They do a damned good job of differentiating anything 
that CAN be differentiated.

Your example simply highlights the consequences of one of Python's most basic, 
original design choices (using getitem for both sequences and mappings).  That 
choice is now so fundamental to the language that it cannot possibly change. 
Get used to it.

In your example, the results are correct.  The anything class can be viewed as 
either a sequence or a mapping.

In this and other posts, you seem to be focusing your design around notions of 
strong typing and mandatory interfaces.  I would suggest that that approach is 
futile unless you control all of the code being run.


Raymond




Re: [Python-Dev] operator.is*Type

2006-02-22 Thread Thomas Heller
Fuzzyman wrote:
> Hello all,
>
> Feel free to shoot this down, but a suggestion.
>
> The operator module defines two functions:
>
> isMappingType
> isSequenceType
>
> These return a guesstimation as to whether an object passed in supports
> the mapping and sequence protocols.
>
> These protocols are loosely defined. Any object which has a
> ``__getitem__`` method defined could support either protocol.

The docs contain clear warnings about that.

> I suggest we either deprecate these functions as worthless, *or* we
> define the protocols slightly more clearly for user defined classes.

I have no problems deprecating them since I've never used one of these
functions.  If I want to know if something is a string I use isinstance(),
for string-like objects I would use

  try: obj + ""
  except TypeError:

and so on.
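
Spelled out, that duck-typing test might look like the sketch below. The
helper name is_stringlike is hypothetical, and concatenation with an empty
string is assumed to be the operation being probed.

```python
def is_stringlike(obj):
    # Hypothetical helper: try the operation instead of checking the type.
    try:
        obj + ""
    except TypeError:
        return False
    return True

print(is_stringlike("abc"))   # True
print(is_stringlike(42))      # False
print(is_stringlike([1, 2]))  # False
```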

 
> An object prima facie supports the mapping protocol if it defines a
> ``__getitem__`` method, and a ``keys`` method.
>
> An object prima facie supports the sequence protocol if it defines a
> ``__getitem__`` method, and *not* a ``keys`` method.
>
> As a result code which needs to be able to tell the difference can use
> these functions and can sensibly refer to the definition of the mapping
> and sequence protocols when documenting what sort of objects an API call
> can accept.

Thomas



Re: [Python-Dev] operator.is*Type

2006-02-22 Thread Fuzzyman
Raymond Hettinger wrote:
> > >>> from operator import isSequenceType, isMappingType
> > >>> class anything(object):
> > ...     def __getitem__(self, index):
> > ...         pass
> > ...
> > >>> something = anything()
> > >>> isMappingType(something)
> > True
> > >>> isSequenceType(something)
> > True
> >
> > I suggest we either deprecate these functions as worthless, *or* we
> > define the protocols slightly more clearly for user defined classes.
>
> They are not worthless.  They do a damned good job of differentiating
> anything that CAN be differentiated.

But as far as I can tell (and I may be wrong), they only work if the 
object is a subclass of a built in type, otherwise they're broken. So 
you'd have to do a type check as well, unless you document that an API 
call *only* works with a builtin type or subclass.

In which case - an isinstance call does the same, with the advantage of 
not being broken if the object is a user-defined class.

At the very least the function would be better renamed 
``MightBeMappingType``  ;-)

> Your example simply highlights the consequences of one of Python's
> most basic, original design choices (using getitem for both sequences
> and mappings).  That choice is now so fundamental to the language that
> it cannot possibly change. Get used to it.

I have no problem with it - it's useful.

> In your example, the results are correct.  The anything class can be
> viewed as either a sequence or a mapping.

But in practice an object is *unlikely* to be both. (Although 
conceivably a mapping container *could* implement integer indexing and 
thus be both - but *very* rare). Therefore the current behaviour is not 
really useful in any conceivable situation - not that I can think of anyway.

> In this and other posts, you seem to be focusing your design around
> notions of strong typing and mandatory interfaces.  I would suggest
> that that approach is futile unless you control all of the code being
> run.

Not directly. I'm suggesting that the loosely defined protocol (used 
with duck typing) can be made quite a bit more useful by making the 
definition *slightly* more specific.

A preference for strong typing would require subclassing, surely?

The approach I suggest would allow a *less* 'strongly typed' approach to 
code, because it establishes a convention to decide whether a user 
defined class supports the mapping and sequence protocols.

The simple alternative (which we took in ConfigObj) is to require a 
'strongly typed' interface, because there is currently no useful way to 
determine whether an object that implements __getitem__ supports mapping 
or sequence. (Other than *assuming* that a mapping container implements 
a random choice from the other common mapping methods.)

All the best,

Michael

> Raymond






Re: [Python-Dev] operator.is*Type

2006-02-22 Thread Bob Ippolito

On Feb 22, 2006, at 4:18 AM, Fuzzyman wrote:

> Raymond Hettinger wrote:
> > > >>> from operator import isSequenceType, isMappingType
> > > >>> class anything(object):
> > > ...     def __getitem__(self, index):
> > > ...         pass
> > > ...
> > > >>> something = anything()
> > > >>> isMappingType(something)
> > > True
> > > >>> isSequenceType(something)
> > > True
> > >
> > > I suggest we either deprecate these functions as worthless, *or* we
> > > define the protocols slightly more clearly for user defined classes.
> >
> > They are not worthless.  They do a damned good job of differentiating
> > anything that CAN be differentiated.
>
> But as far as I can tell (and I may be wrong), they only work if the
> object is a subclass of a built in type, otherwise they're broken. So
> you'd have to do a type check as well, unless you document that an API
> call *only* works with a builtin type or subclass.

If you really cared, you could check hasattr(something, 'get') and  
hasattr(something, '__getitem__'), which is a pretty good indicator  
that it's a mapping and not a sequence (in a dict-like sense, anyway).
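
A sketch of that check, with the hypothetical helper name is_dictlike
chosen only for this example:

```python
def is_dictlike(obj):
    # Hypothetical helper: dict-like objects offer both get and __getitem__.
    return hasattr(obj, 'get') and hasattr(obj, '__getitem__')

print(is_dictlike({}))      # True
print(is_dictlike([]))      # False: lists have no get method
print(is_dictlike("text"))  # False
```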

-bob



Re: [Python-Dev] operator.is*Type

2006-02-22 Thread Ian Bicking
Raymond Hettinger wrote:
> > >>> from operator import isSequenceType, isMappingType
> > >>> class anything(object):
> > ...     def __getitem__(self, index):
> > ...         pass
> > ...
> > >>> something = anything()
> > >>> isMappingType(something)
> > True
> > >>> isSequenceType(something)
> > True
> >
> > I suggest we either deprecate these functions as worthless, *or* we
> > define the protocols slightly more clearly for user defined classes.
>
> They are not worthless.  They do a damned good job of differentiating
> anything that CAN be differentiated.

But they are just identical...?  They seem terribly pointless to me. 
Deprecation is one option, of course.  I think Michael's suggestion also 
makes sense.  *If* we distinguish between sequences and mapping types 
with two functions, *then* those two functions should be distinct.  It 
seems kind of obvious, doesn't it?

I think hasattr(obj, 'keys') is the simplest distinction of the two 
kinds of collections.

> Your example simply highlights the consequences of one of Python's most
> basic, original design choices (using getitem for both sequences and
> mappings).  That choice is now so fundamental to the language that it
> cannot possibly change.  Get used to it.
>
> In your example, the results are correct.  The anything class can be viewed
> as either a sequence or a mapping.
>
> In this and other posts, you seem to be focusing your design around notions
> of strong typing and mandatory interfaces.  I would suggest that that
> approach is futile unless you control all of the code being run.

I think you are reading too much into it.  If the functions exist, they 
should be useful.  That's all I see in Michael's suggestion.


-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] operator.is*Type

2006-02-22 Thread Raymond Hettinger
[Ian Bicking]
> They seem terribly pointless to me.

FWIW, here is the script that I had used while updating and improving the two 
functions (can't remember whether it was for Py2.3 or Py2.4).  It lists 
comparative results for many different types of inputs.  Since perfection was 
not possible, the goal was to have no false negatives and mostly accurate 
positives.  IMO, they do a pretty good job and are able to access information 
not otherwise visible to pure Python code.  With respect to user defined 
instances, I don't care that they can't draw a distinction where none exists in 
the first place -- at some point you have to either fall back on duck-typing or 
be in control of what kind of arguments you submit to your functions. 
Practicality beats purity -- especially when a pure solution doesn't exist 
(i.e. given a user defined class that defines just __getitem__, either mapping 
or sequence behavior is a possibility).


 Analysis Script 

from collections import deque
from UserList import UserList
from UserDict import UserDict
from operator import *
types = (set,
         int, float, complex, long, bool,
         str, unicode,
         list, UserList, tuple, deque,
)

for t in types:
    print isMappingType(t()), isSequenceType(t()), repr(t()), repr(t)

class c:
    def __repr__(self):
        return 'Instance w/o getitem'

class cn(object):
    def __repr__(self):
        return 'NewStyle Instance w/o getitem'

class cg:
    def __repr__(self):
        return 'Instance w getitem'
    def __getitem__(self):
        return 10

class cng(object):
    def __repr__(self):
        return 'NewStyle Instance w getitem'
    def __getitem__(self):
        return 10

def f():
    return 1

def g():
    yield 1

for i in (None, NotImplemented, g(), c(), cn()):
    print isMappingType(i), isSequenceType(i), repr(i), type(i)

for i in (cg(), cng(), dict(), UserDict()):
    print isMappingType(i), isSequenceType(i), repr(i), type(i)



 Output 

False False set([]) <type 'set'>
False False 0 <type 'int'>
False False 0.0 <type 'float'>
False False 0j <type 'complex'>
False False 0L <type 'long'>
False False False <type 'bool'>
False True '' <type 'str'>
False True u'' <type 'unicode'>
False True [] <type 'list'>
True True [] <class UserList.UserList at 0x00F11B70>
False True () <type 'tuple'>
False True deque([]) <type 'collections.deque'>
False False None <type 'NoneType'>
False False NotImplemented <type 'NotImplementedType'>
False False <generator object at 0x00F230A8> <type 'generator'>
False False Instance w/o getitem <type 'instance'>
False False NewStyle Instance w/o getitem <class '__main__.cn'>
True True Instance w getitem <type 'instance'>
True True NewStyle Instance w getitem <class '__main__.cng'>
True False {} <type 'dict'>
True True {} <type 'instance'>



Re: [Python-Dev] operator.is*Type

2006-02-22 Thread Michael Foord
Raymond Hettinger wrote:
> [Ian Bicking]
> > They seem terribly pointless to me.
>
> FWIW, here is the script that I had used while updating and improving
> the two functions (can't remember whether it was for Py2.3 or Py2.4).
> It lists comparative results for many different types of inputs.
> Since perfection was not possible, the goal was to have no false
> negatives and mostly accurate positives.  IMO, they do a pretty good
> job and are able to access information not otherwise visible to
> pure Python code.  With respect to user defined instances, I don't
> care that they can't draw a distinction where none exists in the first
> place -- at some point you have to either fall back on duck-typing or
> be in control of what kind of arguments you submit to your functions.
> Practicality beats purity -- especially when a pure solution doesn't
> exist (i.e. given a user defined class that defines just __getitem__,
> either mapping or sequence behavior is a possibility).

But given:

True True Instance w getitem <type 'instance'>
True True NewStyle Instance w getitem <class '__main__.cng'>
True True [] <class UserList.UserList at 0x00F11B70>
True True {} <type 'instance'>

(Last one is UserDict)

I can't conceive of circumstances where this is useful without duck 
typing *as well*.

The tests seem roughly analogous to:

import collections

def isMappingType(obj):
    return isinstance(obj, dict) or hasattr(obj, '__getitem__')

def isSequenceType(obj):
    return (isinstance(obj, (basestring, list, tuple, collections.deque))
            or hasattr(obj, '__getitem__'))

If you want to allow sequence access you could either just use the 
isinstance or you *have* to trap an exception in the case of a mapping 
object being passed in.

Redefining (effectively) as:

def isMappingType(obj):
    return isinstance(obj, dict) or (hasattr(obj, '__getitem__') and
                                     hasattr(obj, 'keys'))

def isSequenceType(obj):
    return (isinstance(obj, (basestring, list, tuple, collections.deque))
            or (hasattr(obj, '__getitem__')
                and not hasattr(obj, 'keys')))

makes the test useful where you want to know you can safely treat an 
object as a mapping (or sequence) *and* where you want to tell the 
difference.

The only code that would break is use of mapping objects that don't 
define ``keys`` and sequences that do. I imagine these must be very rare 
and *would* be interested in seeing real code that does break. 
Especially if that code cannot be trivially rewritten to use the first 
example.

All the best,

Michael Foord






Re: [Python-Dev] operator.is*Type

2006-02-22 Thread Raymond Hettinger
> But given:
>
> True True Instance w getitem <type 'instance'>
> True True NewStyle Instance w getitem <class '__main__.cng'>
> True True [] <class UserList.UserList at 0x00F11B70>
> True True {} <type 'instance'>
>
> (Last one is UserDict)
>
> I can't conceive of circumstances where this is useful without duck
> typing *as well*.

Yawn.  Give it up.  For user defined instances, these functions can only 
discriminate between the presence or absence of __getitem__.  If you're trying 
to distinguish between sequences and mappings for instances, you're on your 
own with duck-typing.  Since there is no mandatory mapping or sequence API, the 
operator module functions cannot add more checks without getting some false 
negatives (your original example is a case in point).

Use the function as-is and add your own isinstance checks for your own personal 
definition of what makes a mapping a mapping and what makes a sequence a 
sequence.  Or better yet, stop designing APIs that require you to differentiate 
things that aren't really different ;-)


Raymond 


Re: [Python-Dev] operator.is*Type

2006-02-22 Thread Delaney, Timothy (Tim)
Raymond Hettinger wrote:

> Your example simply highlights the consequences of one of Python's
> most basic, original design choices (using getitem for both sequences
> and mappings).  That choice is now so fundamental to the language
> that it cannot possibly change.

Hmm - just a thought ...

Since we're adding the __index__ magic method, why not have a
__getindexed__ method for sequences?

Then semantics of indexing operations would be something like:

    if hasattr(obj, '__getindexed__'):
        return obj.__getindexed__(val.__index__())
    else:
        return obj.__getitem__(val)

Similarly __setindexed__ and __delindexed__.
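
The lookup semantics above can be sketched as a plain function. This is only
an illustration: __getindexed__ is the proposed method, not something Python
provides, and getitem_dispatch and Seq are hypothetical names standing in
for what obj[val] would do under the proposal.

```python
import operator

def getitem_dispatch(obj, val):
    # Hypothetical stand-in for obj[val] under the proposal.
    if hasattr(obj, '__getindexed__'):
        # operator.index() calls val.__index__() and rejects non-integers.
        return obj.__getindexed__(operator.index(val))
    else:
        return obj.__getitem__(val)

class Seq(object):
    # Toy sequence that implements only the proposed __getindexed__.
    def __init__(self, items):
        self._items = list(items)

    def __getindexed__(self, index):
        return self._items[index]

print(getitem_dispatch(Seq('abc'), 1))    # b
print(getitem_dispatch({'k': 'v'}, 'k'))  # v
try:
    getitem_dispatch(Seq('abc'), 'k')
except TypeError:
    print('non-integer index rejected')
```

The last call shows the enforcement mentioned below: a type that defines
__getindexed__ only ever sees true indexes.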

This would allow distinguishing between sequences and mappings in a
fairly backwards-compatible way. It would also enforce that only indexes
can be used for sequences.

The backwards-incompatibility comes in when you have a type that
implements __getindexed__, and a subclass that implements __getitem__
e.g. if `list` implemented __getindexed__ then any `list` subclass that
overrode __getitem__ would fail. However, I think we could make it 100%
backwards-compatible for the builtin sequence types if they just had
__getindexed__ delegate to __getitem__. Effectively:

class list(object):

    def __getindexed__(self, index):
        return self.__getitem__(index)

Tim Delaney


Re: [Python-Dev] operator.is*Type

2006-02-22 Thread Greg Ewing
Delaney, Timothy (Tim) wrote:

> Since we're adding the __index__ magic method, why not have a
> __getindexed__ method for sequences.

I don't think this is a good idea, since it would be
re-introducing all the confusion that the existence of
two C-level indexing slots has led to, this time for
user-defined types.

> The backwards-incompatibility comes in when you have a type that
> implements __getindexed__, and a subclass that implements __getitem__

I don't think this is just a backwards-incompatibility
issue. Having a single syntax that can correspond to more
than one special method is inherently ambiguous. What do
you do if both are defined? Sure you can come up with
some rule to handle it, but it's better to avoid the
situation in the first place.

-- 
Greg Ewing, Computer Science Dept, +--+
University of Canterbury,  | Carpe post meridiam! |
Christchurch, New Zealand  | (I'm not a morning person.)  |
[EMAIL PROTECTED]  +--+


Re: [Python-Dev] operator.is*Type

2006-02-22 Thread Greg Ewing
Fuzzyman wrote:

> The operator module defines two functions:
>
> isMappingType
> isSequenceType
>
> These protocols are loosely defined. Any object which has a
> ``__getitem__`` method defined could support either protocol.

These functions are actually testing for the presence
of two different __getitem__ methods at the C level, one
in the mapping substructure of the type object, and the
other in the sequence substructure. This only works
for types implemented in C which make use of this distinction.
It's not much use for user-defined classes, where the
presence of a __getitem__ method causes both of these
slots to become populated.

Having two different slots for __getitem__ seems to have
been an ill-considered feature in the first place and
would probably best be removed in 3.0. I wouldn't mind if
these two functions went away.

-- 
Greg Ewing, Computer Science Dept, +--+
University of Canterbury,  | Carpe post meridiam! |
Christchurch, New Zealand  | (I'm not a morning person.)  |
[EMAIL PROTECTED]  +--+