On 02/10/2013 10:10 AM, Dave Angel wrote:
On 02/10/2013 09:32 AM, Walter Prins wrote:
Hello,

I have a program where I'm overriding the retrieval of items from a list.
As background: the data held by the lists is calculated once but then
potentially read many times thereafter. To avoid needlessly re-calculating
the same value over and over, and to keep checking/caching code out of the
calculation logic, I created a subclass of list that automatically
calculates the value in a given slot if it has not yet been calculated.
(Put differently, I've implemented a kind of list-specific
caching/memoization with the intent that it should be transparent to the
client code.)

The way I've implemented this so far is to simply override
list.__getitem__(self, key) to check whether the value needs to be
calculated and to call a calculation method if required, after which the
value is returned as normal.  On subsequent calls __getitem__ then returns
the value directly without calculating it again.

This worked mostly fine. However, yesterday I ran into a slightly
unexpected problem: when the list contents are iterated over and values
are retrieved that way rather than via [], __getitem__ is in fact *not*
called on the list to read the item values, and consequently I get back
the "not yet calculated" entries in the list without the calculation
routine being called automatically as intended.

Here's a test application that demonstrates the issue:

class NotYetCalculated:
     pass

class CalcList(list):
     def __init__(self, calcitem):
         super(CalcList, self).__init__()
         self.calcitem = calcitem

     def __getitem__(self, key):
         """Override __getitem__ to call self.calcitem() if needed"""
         print "CalcList.__getitem__(): Enter"
         value = super(CalcList, self).__getitem__(key)
         if value is NotYetCalculated:
             print "CalcList.__getitem__(): calculating"
             value = self.calcitem(key)
             self[key] = value
         print "CalcList.__getitem__(): return"
         return value

def calcitem(key):
     # Demo: return square of index
     return key*key


def main():
     # Create a list that calculates its contents via a given
     # method/fn once only
     l = CalcList(calcitem)
     # Extend with a few entries to demonstrate the issue:
     l.extend([NotYetCalculated, NotYetCalculated, NotYetCalculated,
               NotYetCalculated])

     print ("1) Directly getting values from list works as expected: "
            "__getitem__ is called:")
     print "Retrieving value [2]:\n", l[2]
     print
     print "Retrieving value [3]:\n", l[3]
     print
     print "Retrieving value [2] again (no calculation this time):\n", l[2]
     print

     print "2) Retrieving values via an iterator doesn't work as expected:"
     print "(__getitem__ is not called and the code returns "
     print " NotYetCalculated entries without calling __getitem__. How do I fix this?)"
     print "List contents:"
     for x in l: print x


if __name__ == "__main__":
     main()

To reiterate:

What should happen:  In test 2) above all entries should be automatically
calculated and output should be numbers only.

What actually happens: In test 2) above the first 2 list entries,
corresponding to list indexes 0 and 1, are output as "NotYetCalculated"
and calcitem is not called when required.

What's the best way to fix this problem?  Do I need to maybe override
another method, perhaps provide my own iterator implementation?  For that
matter, why doesn't iterating over the list contents fall back to calling
__getitem__?


Implement your own __iter__() special method.
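
A minimal sketch of what that could look like, reusing the CalcList class from the question. The __iter__ body is only one possible way to do it (it walks the indexes and goes through self[index], so the existing __getitem__ does the lazy work), and the code avoids print statements so it should run on both Python 2 and Python 3:

```python
class NotYetCalculated(object):
    """Sentinel marking a slot whose value has not been computed yet."""


class CalcList(list):
    def __init__(self, calcitem):
        super(CalcList, self).__init__()
        self.calcitem = calcitem

    def __getitem__(self, key):
        """Calculate and store the value on first access."""
        value = super(CalcList, self).__getitem__(key)
        if value is NotYetCalculated:
            value = self.calcitem(key)
            self[key] = value
        return value

    def __iter__(self):
        # Route iteration through __getitem__ so that lazy slots are
        # filled in instead of leaking the sentinel to the caller.
        for index in range(len(self)):
            yield self[index]


squares = CalcList(lambda key: key * key)
squares.extend([NotYetCalculated] * 4)
result = [x for x in squares]  # every slot is calculated on the way out
```

Note that this only covers plain iteration and integer indexing. Slices, and methods such as index() or count(), still read the underlying storage directly and can return the sentinel.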

And consider whether you might need __setitem__(), __len__(),
__setslice__(), __getslice__() and others.

Maybe you'd be better off not inheriting from list at all, and just
having an attribute that's a list.  It doesn't sound like you're
defining a very big subset of list, and overriding the methods you
*don't* want seems to be more work than just implementing the ones you do.
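
As a rough sketch of that alternative: the class below holds a plain list as an attribute and defines only __len__ and __getitem__. The names LazySequence and cube are made up for illustration. Because the class has no __iter__ of its own, a for loop falls back to calling __getitem__ with 0, 1, 2, ... until IndexError is raised:

```python
class NotYetCalculated(object):
    """Sentinel marking a slot whose value has not been computed yet."""


class LazySequence(object):
    """Hypothetical wrapper: owns a list instead of inheriting from list."""

    def __init__(self, calcitem, size):
        self._calcitem = calcitem
        self._items = [NotYetCalculated] * size

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        # Indexing past the end raises IndexError, which is what ends
        # the fallback iteration.
        value = self._items[index]
        if value is NotYetCalculated:
            value = self._calcitem(index)
            self._items[index] = value
        return value


calls = []  # records which indexes were actually calculated


def cube(index):
    calls.append(index)
    return index ** 3


lazy = LazySequence(cube, 4)
first_pass = [x for x in lazy]   # calculates every slot
second_pass = [x for x in lazy]  # served from the stored values
```

Only the operations defined here exist on the object, so anything else the client code needs (append, slicing, normalizing negative indexes before they reach the calculation function) would have to be added explicitly.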

A separate question:  is this likely to be a sparse list?  If it's very
sparse, perhaps you'd consider using a dict, rather than a list attribute.
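
For the sparse case, one possible sketch uses a dict subclass with the __missing__ hook, which dict's own item lookup calls when a key is not present. This swaps in a dict subclass rather than a dict attribute, and the name CalcDict is made up for illustration:

```python
class CalcDict(dict):
    """Hypothetical sparse cache: values are calculated on first lookup."""

    def __init__(self, calcitem):
        super(CalcDict, self).__init__()
        self.calcitem = calcitem

    def __missing__(self, key):
        # Called by dict's item lookup (d[key]) only when key is absent.
        value = self.calcitem(key)
        self[key] = value
        return value


sparse = CalcDict(lambda key: key * key)
a = sparse[1000]  # calculated and stored on first access
b = sparse[3]
stored_keys = sorted(sparse)  # only the keys touched so far
```

Here no placeholder entries are needed at all, but get() and the "in" operator do not trigger __missing__, and iterating over the dict yields only the keys that have already been calculated.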




BTW, the reason iterating over the list contents doesn't call __getitem__ is that list defines its own __iter__, presumably to do it more efficiently.
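
A small check that illustrates this, assuming nothing beyond the built-in list type (LoudList is a made-up name for the demo): the subclass records every call to __getitem__, and iterating over it records nothing.

```python
calls = []  # indexes passed to the overridden __getitem__


class LoudList(list):
    def __getitem__(self, key):
        calls.append(key)
        return super(LoudList, self).__getitem__(key)


loud = LoudList([10, 20, 30])
indexed = loud[1]              # goes through the override
iterated = [x for x in loud]   # uses list.__iter__, bypassing the override
has_own_iter = '__iter__' in list.__dict__
```

After this runs, calls holds only the single index used in loud[1].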

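And there is your clue that perhaps you don't want to inherit from list. You don't want its more-efficient version, so if your class doesn't inherit from list, all you have to do is not implement __iter__: iteration then falls back to calling your __getitem__ with 0, 1, 2, ... until it raises IndexError, and it should just work.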


--
DaveA
