First, a disclaimer. I am a second year Maths and Computer Science undergraduate, and this is my first time ever on Usenet (I guess I'm part of the http generation). On top of that, I have been using Python for a grand total of about a fortnight now. Hence, I apologise if what follows is a stupid suggestion, or if its already been made somewhere else, or if this is not the appropriate place to make it, etc. But I did honestly do some background checking through the Python documentation, www.python.org, FAQs, and so forth before making this post. Basically what I'm trying to say here is, please don't flame this post :)
Anyway, I have a suggestion to make for a rather radical new feature in Python. Rather than just struggling to explain and justify it straight out, I'll give the background of how I thought up the feature - and how I worked through the ideas I came up with. The example may seem a bit trivial to start, but bear with me, because I think theres at least some possibility the suggestion has merit. I am writing a project in Python atm, and one thing it needs to do is parse a very simple text file to get a list of records. Its a straightforward kind of thing that I (and I'm sure you) have done many times before in many languages. Each bit of interesting data in the file occurs over 3 or 4 lines of text. On top of this there are lines of random uninteresting junk as well. I decided to write a generator to filter out the junk lines, strip excess whitespace and the like (generators are one of my favourite features in Python). So the main body of code looks like this: def filterJunk(lines): for line in lines: # Do filtering & cleaning up stuff on line. # ... yield line def getData(filename): recordList = [] input = file(filename) lines = line.readlines() input.close() for line in filterJunk(lines): # ... create records and add to recordList So far, so good. Next comes the business of actually creating the records from the lines of interesting text. You can probably see the problem I ran into - I need to look at 3 or 4 lines at a time to create the record. Theres no easy way out of this requirement, because whether the record is contained in 3 or 4 lines is conditional on the value of the second line. But the for loop isn't going to allow me to do that. So I rewrote the for loop above as: while True: try: # ....do stuff repeatedly with lines.next() to add to recordList[] except StopIteration: pass This works. (And yes, there are also a multitude of other reasonably simple ways to approach this problem, some of which I've used in this past. And probably some fancy ways involving map and reduce as well.) But it seems ugly to me somehow. And I'm suspicious of anything that hints of ugliness (in Computer Science, anyway.) I suppose that Python's getting me into the habit of assuming theres always an elegant looking solution. I tried, and failed, to think of some sort of Generator-based concept (I really do love generators! So simple, but so useful!) that would allow me to retain a for loop. I guess that, essentially, I feel this task is really just iterating over the lines, and that iterating over something in Python is a job for a for loop. Generators allow you to use for loops all regular iteration. So I wanted to have something "vaugely Generator like", that would allow me to get back to using a for loop. It occured to me - what if I had some sort of "inverse generator" - a function with a resumable call mechanism that retained state, but that would resume when accepting input rather than returning output. For want of a better term, call it an Acceptor. If you think of a Generator as a method acting as (an abstraction of) a source of sequential data, then the Acceptor correspondingly is a sink. I started playing round with the Acceptor idea, thinking about how it might work. I imagined a simple mirroring of Generator syntax - using a new keyword, accept, to take in a value, just as Generator's use yield to return a value. Here's my first attempt at 'coding' my precious for loop using an imaginary Acceptor: def combineIntoRecord(): # This is an acceptor function optionalline = None # We may not get given a value for this line accept firstline accept secondline if condition(secondline): accept optionalline accept lastline r = createRecord(firstline, secondline, lastline, optionalline) return r recordlist = [] for line in lines: recordlist.append(combineIntoRecord(line)) I've since thought about it somewhat, and its clear I got this first try just plain wrong. Its got a number of hideous defects. First, combineIntoRecord() is 'accepting' its values by taking a magical argument, line, that does not appear in the argument list. Second, in this particular syntax, do combineIntoRecord(x) at line 50 and combineIntoRecord(y) at line 200 want x and y to be combined into the same record, using the old state, or is the second trying to start a "new call" to the method? For an extreme example, imagine calls made in different modules. Basically, the jump into combineIntoRecord code needs to somehow specify an 'instance' of the function, with a given existing state. The method call needs to be bound to something, somehow. Worst of all, though, since combineIntoRecord(line) doesn't always return a value, how does the interpreter know when to call append? It seems the 'magic' of the acceptor is spilling out into the surrounding code, making it impossible to understand or intelligently parse. The magicness needs to be contained within the acceptor itself. How to resolve these issues? Well, thinking about how generators achieve their magic-like effects: - A generator, despite being a function, doesn't return a value in the usual sense. - Calls to a generator don't result in a value, they result in an Iterator object. - The magic happens at the 'yield' keyword. So, analguously: - Perhaps call to combineIntoRecord() shouldn't return a value - It should return an acceptor object instance instead - Every call results in a different object - solves problem 2 above - Acceptor objects have a special method for feeding them values, takes care of problem 1 above BUT: - When and what does an acceptor object 'return'? - We don't actually want to restrict what an acceptor *function* returns. - Acceptors (unlike generators) should be able to return just as per ordinary functions - Its not the output of the function thats 'magic', its the input to the function Thus: - We must confine the magic to the input the function recieves - its argument list. - Have a 'magic argument' that 'takes in' the accepted values. - This is fine - it keeps the magic within the call of the acceptor function - If an acceptor returns like an ordinary function, it needs to return when called - Hence it needs to know its accepted values "all at once" >From our earlier thinking, an acceptor, fundamentally, accepts a sequence of data. So let the first argument to an acceptor function call be a "magic sequence" that contains all the values we want to be accepted. Other arguments may be passed as normal. A disadvantage: the power of the acceptor is restricted in this version, since you can't have elements Accepted at any arbitrary point in the code. This is really a good thing, though - with the old version it was the case that too much power was making the idea totally unworkable. What we've done is essentially bound our Acceptor to something, namely a sequence. Note, though, that we can use any abstract source of sequential data - including a Generator - so theres still quite a lot of flexibility in how it gets data for accepting. To illustrate this model of an Acceptor, heres an example of a simple Acceptor that returns the first element of a sequence that satisfies some condtion (which is presumed to be a function): def firstThatSatisfies(condition): accept element if condition(element): return element To use, we simply call: first = firstThatSatisfies(somelist, condition) somelist, not appearing in the argument list, gets fed into the accept statement by the 'acceptor magic' . Note how explicit iteration has been removed from this canonical algorithm altogether. I dont know about you, but I think it looks kind of neat. A Slight Digression: Clearly it might be true that condtion(element) == 0 for every element in some list. This brings us to the issue of an accept statment that doesnt get fed any data. There are two possible ways to deal with this. One is to insist every accept statement must get called, and raise an exception if this doesnt happen. However, I prefer the seemingly more flexible option where you simply say "Bad luck, once the magic sequece is empty, any remaining acceptor code doesnt get executed, and any acceptor state is lost." This means an acceptor might not always give a well defined return value, but this is no worse than something like def f() x = someCalculation() bool = someConditionThatWillAlwaysBeFalse() if false: return x This conventional example has a return value of None, and an Acceptor that doesnt reach an explict return statement can be treated exactly the same way. Digression Now, going back to my original problem and the first Acceptor I wrote: def combineIntoRecord(): # This is an acceptor function optionalline = None # We may not get given a value for this line accept firstline accept secondline if condition(secondline): accept optionalline accept lastline r = createRecord(firstline, secondline, lastline, optionalline) return r This doesnt work anymore - it produces and returns the first record we want, and then stops. We want a list of all records . Maybe we could use a loop.The 'accepting magic' would need to be confied to this loop. And the loop needs to know when the magic stream of data comes to an end to know when to return the final value. Maybe use another special keyword: def combineIntoRecord(): recordList = []. while acceptingonly # ....accept values, append createRecord to recordLlist Perhaps. I don't really like the idea of having any magic new keywords other than accept itself; if it needs this sort of explicit while loop, maybe it shouldnt be in an Acceptor at all (the whole original point of this idea was to get rid of while loop after all!) ** So are Acceptors pretty useless, after all that? Maybe. Then again, maybe not. What is this combineIntoRecord(), that I have written using non-existent language features, trying to do, in a general sense? It wants to take some sequence (the lines in a file) and produce another sequence, that in some way depends first. Howevers it wants to be able define basically any arbitary dependence between the sequences, taking the elements from the first and producing output to the second at its leisure. The 'taking at its leisure' comes by being virtue of an Acceptor. 'Producing at its leisure' is a feature already available in Python. Its called a Generator. Behold: # An Acceptor/Generator!!! def combineIntoRecords(): optionalline = None # We may not get given a value for this line accept firstline accept secondline if condition(secondline): accept optionalline accept lastline yield createRecord(firstline, secondline, optionalline, lastline) def getData(lines): return list(combineIntoRecords(filterJunk(lines))) So, alas, my beloved for loop was not saved after all. Fortunately, the dreaded While loop has been vanquished. In fact all loopiness has disappeared. Along with any other semblance of a main method. I think I like the finished result. I'll leave if for you to decide if you like it too. -- http://mail.python.org/mailman/listinfo/python-list