Nothing to repeat
Hello everyone, long time no see. This is probably not a Python problem, but rather a regular expressions problem. I want, for the sake of argument, to match strings comprising any number of occurrences of 'spa', each interspersed by any number of occurrences of 'm'. 'any number' includes zero, so the whole pattern should match the empty string. Here's the conversation Python and i had about it:

Python 2.6.4 (r264:75706, Jun 4 2010, 18:20:16) [GCC 4.4.4 20100503 (Red Hat 4.4.4-2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> re.compile('(spa|m*)*')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/re.py", line 190, in compile
    return _compile(pattern, flags)
  File "/usr/lib/python2.6/re.py", line 245, in _compile
    raise error, v # invalid expression
sre_constants.error: nothing to repeat

What's going on here? Why is there nothing to repeat? Is the problem having one *'d term inside another? Now, i could actually rewrite this particular pattern as '(spa|m)*'. But what i neglected to mention above is that i'm actually generating patterns from structures of objects (representations of XML DTDs, as it happens), and as it stands, patterns like this are a possibility. Any thoughts on what i should do? Do i have to bite the bullet and apply some cleverness in my pattern generation to avoid situations like this? Thanks, tom -- If it ain't broke, open it up and see what makes it so bloody special. -- http://mail.python.org/mailman/listinfo/python-list
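Since (spa|m*)* and (spa|m)* describe the same language - any mixture of 'spa's and 'm's, including none at all - one workaround on the generation side is to drop a nested quantifier whenever a starred alternative sits inside a starred group. A minimal sketch of the surviving pattern (the non-capturing group is just tidiness):

```python
import re

# (spa|m*)* matches the same strings as (spa|m)*: the inner 'm*' adds
# nothing except the empty-match repetition that upsets old sre versions.
pattern = re.compile(r'(?:spa|m)*$')

pattern.match('')           # matches: zero occurrences are allowed
pattern.match('mmspamspa')  # matches
pattern.match('spax')       # returns None: 'x' is neither spa nor m
```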
Parsing DTDs
Hello! I would like to parse XML DTDs. The goal is to be able to validate XML-like object structures against DTDs in a fairly flexible way, although i can get from a parsed DTD to a validation engine myself, so that's not an essential feature of the parser (although it would be nice!). What should i do? A bit of googling revealed that the xmlproc package contains a DTD parser that looks like it does just what i want, and that xmlproc became PyXML, and that PyXML is no longer maintained. Is there a DTD parser that is being maintained? Or does it not really matter that PyXML is no longer maintained, given that it's not like the DTD spec has changed very much? Thanks, tom -- Many of us adopted the File's slang as our own, feeling that we'd found a tangible sign of the community of minds we'd half-guessed to be out there. -- http://mail.python.org/mailman/listinfo/python-list
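In case it helps as a stopgap while choosing a parser: the stdlib expat bindings can report DTD declarations via ElementDeclHandler. The wrapper-document trick below (and the parse_dtd name) are my own sketch, not xmlproc's API, and it only handles DTD text that is legal as an internal subset - no parameter entities inside markup declarations:

```python
import xml.parsers.expat

def parse_dtd(dtd_text):
    """Return a dict mapping declared element names to expat's raw
    content-model tuples (a sketch; attribute lists and entities
    would need their own handlers)."""
    elements = {}
    def handle_element_decl(name, model):
        elements[name] = model
    parser = xml.parsers.expat.ParserCreate()
    parser.ElementDeclHandler = handle_element_decl
    # Smuggle the DTD in as the internal subset of a throwaway document.
    parser.Parse('<!DOCTYPE root [\n%s\n]>\n<root/>' % dtd_text, True)
    return elements
```

The content-model tuples expat hands back are essentially the regular expressions over child elements that a validation engine needs.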
Re: strptime and timezones
On Wed, 13 Aug 2008, Christian Heimes wrote: Tom Anderson wrote: Secondly, do you really have to do this just to parse a date with a timezone? If so, that's ridiculous. No, you don't. :) Download the pytz package from the Python package index. It's *the* tool for timezone handling in Python. The time zone definitions are not part of the Python standard library because they change every few months. Stupid politicians ... My problem has absolutely nothing to do with timezone definitions. In fact, it involves less timezone knowledge than the time package supplies! The wonderful thing about RFC 1123 timestamps is that they give the numeric value of their timezone, so you don't have to decode a symbolic one or anything like that. Knowing about timezones thus isn't necessary. The problem is simply that the standard time package doesn't think that way, and always assumes that a time is in your local timezone. That said, it does look like pytz might be able to parse RFC 1123 dates. i'll check it out. tom -- Come on thunder; come on thunder. -- http://mail.python.org/mailman/listinfo/python-list
strptime and timezones
Hello! Possibly i'm missing something really obvious here. But ... If i have a date-time string of the kind specified in RFC 1123, like this: 'Tue, 12 Aug 2008 20:48:59 -0700' Can i turn that into a seconds-since-the-epoch time using the standard time module without jumping through substantial hoops? Apart from the timezone, this can be parsed using time.strptime with the format '%a, %d %b %Y %H:%M:%S'. You can stick a %Z on the end for the timezone, but that parses timezone names ('BST', 'EDT'), not numeric specifiers. Also, it doesn't actually parse anything, it just requires that the timezone that's in the string matches your local timezone. Okay, no problem, so you use a regexp to split off the timezone specifier, parse that yourself, then parse the raw time with strptime. Now you just need to adjust the parsed time for the timezone. Now, from strptime, you get a struct_time, and that doesn't have room for a timezone (although it does have room for a daylight saving time flag), so you can't add the timezone in before you convert to seconds-since-the-epoch. Okay, so convert the struct_time to seconds-since-the-epoch as if it were UTC, then apply the timezone correction. Converting a struct_time to seconds-since-the-epoch is done with mktime, right? Wrong! That does the conversion *in your local timezone*. There's no way to tell it to use any specific timezone, not even just UTC. So how do you do this? Can we convert from struct_time to seconds-since-the-epoch by hand? Well, the hours, minutes and seconds are pretty easy, but dealing with the date means doing some hairy calculations with leap years, which are doable but way more effort than i thought i'd be expending on parsing the date format found in every single email in the world. Can we pretend the struct_time is a local time, convert it to seconds-since-the-epoch, then adjust it by whatever our current timezone is to get true seconds-since-the-epoch, *then* apply the parsed timezone? 
I think so:

    def mktime_utc(tm):
        """Return what mktime would return if we were in the UTC timezone"""
        return time.mktime(tm) - time.timezone

Then:

    def mktime_zoned(tm, tz):
        """Return what mktime would return if we were in the timezone given by tz"""
        return mktime_utc(tm) - tz

The only problem there is that mktime_utc doesn't deal with DST: if tm is a date for which DST would be in effect for the local timezone, then we need to subtract time.altzone, not time.timezone. strptime doesn't fill in the dst flag, as far as i can see, so we have to round-trip via mktime/localtime:

    def isDST(tm):
        tm2 = time.localtime(time.mktime(tm))
        assert (tm2.tm_isdst != -1)
        return bool(tm2.tm_isdst)

    def timezone(tm):
        if (isDST(tm)):
            return time.altzone
        else:
            return time.timezone

mktime_utc then becomes:

    def mktime_utc(tm):
        return time.mktime(tm) - timezone(tm)

And you can of course inline that and eliminate a redundant call to mktime:

    def mktime_utc(tm):
        t = time.mktime(tm)
        isdst = time.localtime(t).tm_isdst
        assert (isdst != -1)
        if (isdst):
            tz = time.altzone
        else:
            tz = time.timezone
        return t - tz

So, firstly, does that work? Answer: i've tested it a bit, and yes. Secondly, do you really have to do this just to parse a date with a timezone? If so, that's ridiculous. tom -- 102 FX 6 (goblins) -- http://mail.python.org/mailman/listinfo/python-list
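Worth noting: the standard library does have this wired up already, in the email package (reasonably enough, since RFC 1123 dates are RFC 822 dates). parsedate_tz keeps the numeric timezone, and mktime_tz converts to seconds-since-the-epoch without ever consulting the local timezone:

```python
from email.utils import mktime_tz, parsedate_tz

stamp = 'Tue, 12 Aug 2008 20:48:59 -0700'
fields = parsedate_tz(stamp)  # nine time fields plus the UTC offset in seconds
secs = mktime_tz(fields)      # epoch seconds, independent of local timezone
```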
Re: Question about idioms for clearing a list
On Tue, 7 Feb 2006, Ben Sizer wrote: Raymond Hettinger wrote: [Steven D'Aprano] The Zen isn't "only one way to do it". If it were, we wouldn't need iterators, list comps or for loops, because they can all be handled with a while loop (at various costs of efficiency, clarity or obviousness). del L[:] works, but unless you are Dutch, it fails the obviousness test. [Fredrik Lundh] unless you read some documentation, that is. del on sequences and mappings is a pretty fundamental part of Python. so are slicings. both are things that you're likely to need and learn long before you end up in a situation where you need to be able to clear an aliased sequence. I don't agree with that at all. I'd been programming python for a while (a year?) before i knew about del l[:]. Likewise, the del keyword is fundamental -- if you can't get, set, and del, then you need to go back to collections school. I have hardly used the del keyword in several years of coding in Python. Ditto. Why should it magically spring to mind on this occasion? Similarly I hardly ever find myself using slices, never mind in a mutable context. del L[:] is not obvious, especially given the existence of clear() in dictionaries. Agreed. tom -- GOLDIE LOOKIN' CHAIN [...] will ultimately make all other forms of music both redundant and unnecessary -- ntk -- http://mail.python.org/mailman/listinfo/python-list
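The case where del L[:] and plain rebinding differ is aliasing, which is easy to see in a couple of lines:

```python
a = [1, 2, 3]
b = a           # b names the same list object
del a[:]        # empty that one list in place
print(a, b)     # [] [] - both names see the change

c = [1, 2, 3]
d = c
c = []          # rebinding: c now names a brand-new empty list
print(c, d)     # [] [1, 2, 3] - d still holds the original
```

(Python 3.3 eventually added list.clear(), giving lists the obvious spelling that dictionaries already had.)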
Re: learning python, using string help
On Fri, 2 Feb 2006, [EMAIL PROTECTED] wrote: silly newbie mistake - your code runs fine on my openbsd box. (I didn't uncomment the return map(...) line.) My apologies - i should have made it clearer in the comment that it was hardwired to return example data! thanks for the awesome example! I'm not sure how awesome it is - it's pretty simple, and probably has lots of bugs. Is the BSD ruptime output format the same as on HP-UX? I have a Mac myself, but no local machines broadcasting rwho data, so i don't get any output to play with when i run ruptime! tom -- hip whizzo teddy bear egghead realpolitik tiddly-om-pom-pom sacred cow gene blues celeb cheerio civvy street U-boat tailspin ceasefire ad-lib demob pop wizard hem-line lumpenproletariat avant garde kitsch sudden death Big Apple sex drive-in Mickey Mouse bagel dumb down pesticide racism spliff dunk cheeseburger Blitzkrieg Molotov cocktail snafu buzz pissed off DNA mobile phone megabucks Wonderbra cool Big Brother brainwashing fast food Generation X hippy non-U boogie sexy psychedelic beatnik cruise missile cyborg awesome bossa nova peacenik byte miniskirt acid love-in It-girl microchip hypermarket green Watergate F-word punk detox Trekkie naff all trainers karaoke power dressing toy-boy hip-hop beatbox double-click OK yah mobile virtual reality gangsta latte applet hot-desking URL have it large Botox kitten heels ghetto fabulous dot-commer text message google bling bling 9/11 axis of evil sex up chav -- http://mail.python.org/mailman/listinfo/python-list
Re: would it be feasable to write python DJing software
On Fri, 3 Feb 2006, Ivan Voras wrote: Levi Campbell wrote: Hi, I'm thinking about writing a system for DJing in python, but I'm not sure if Python is fast enough to handle the realtime audio needed for DJing, could a guru shed some light on this subject and tell me if this is doable or if I'm out of my fscking mind? Perhaps surprisingly, it is: http://www.python.org/pycon/dc2004/papers/6/ At least, you can certainly mix in realtime in pure python, and can probably manage some level of effects processing. I'd be skeptical about decoding MP3 in realtime, but then you don't want to write your own MP3 decoder anyway, and the existing ones you might reuse are all native code. Any and all mixing would probably happen in some sort of multimedia library written in C (it would be both clumsy to program and slow to execute if the calculations of raw samples/bytes were done in python) Clumsy? Clumsier than C? No, python isn't as good with binary data as it is with text or objects, but on the whole program scale, it's still miles ahead of C. My advice would be to tackle the task in the same way you'd tackle any other: write it in pure python, then fall back to native code where it's unavoidable. When i say 'pure python', i don't mean 'not using any native modules at all', obviously - if someone's written an MP3 decoder, don't eschew it because it happens to be in C. Also, bear in mind that resorting to native code doesn't automatically mean writing in C - you can start doing stuff like moving from representing buffers as lists of ints to using NumPy arrays, using the functions in the standard audioop module, whatever; if that's not fast enough, rewrite chunks of the code in pyrex (a derivative of python that can be compiled to native code, via translation to C); if it's still not fast enough, go to C. Oh, and before you start going native, try running your program under psyco. tom -- Throw bricks at lawyers if you can! -- http://mail.python.org/mailman/listinfo/python-list
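As an illustration of the "write it in pure python first" advice, here's what the core mixing loop might look like for 16-bit samples, with clipping rather than wraparound - a toy sketch, not DJ-grade code:

```python
from array import array

def mix(a, b):
    """Mix two equal-length buffers of signed 16-bit samples, clipping
    the sums to the legal range instead of letting them wrap."""
    return array('h', (max(-32768, min(32767, x + y))
                       for x, y in zip(a, b)))

left = array('h', [100, 200, 30000])
right = array('h', [50, -300, 30000])
mixed = mix(left, right)   # array('h', [150, -100, 32767])
```

If this loop shows up hot in the profiler, it's exactly the kind of thing that ports naturally to NumPy (elementwise add plus clip) or to native code.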
Re: Why checksum? [was Re: Fuzzy Lookups]
On Thu, 1 Feb 2006, it was written: Tom Anderson [EMAIL PROTECTED] writes: The obvious way is make a list of hashes, and sort the list. Obvious, perhaps, prudent, no. To make the list of hashes, you have to read all of every single file first, which could take a while. If your files are reasonably random at the beginning, ... The possibility of two different mp3 files having the same id3 tags is something you might specifically be checking for. So read from the end of the file, rather than the beginning. Better yet, note that if two files are identical, they must have the same length, and that finding the length can be done very cheaply, so a quicker yet approach is to make a list of lengths, sort that, and look for duplicates; when you find one, do a byte-by-byte comparison of the files (probably terminating in the first block) to see if they really are the same. Yes, checking the file lengths first is an obvious heuristic, but if you find you have a bunch of files with the same length, what do you do? You're back to a list of hashes. Or prefixes or suffixes. By way of example, of the 2690 music files in my iTunes library, i have twelve pairs of same-sized files [1], and all of these differ within the first 18 bytes (mostly, within the first 9 bytes). That's a small enough set of matches that you don't need a general purpose algorithm. True - and this is *exactly* the situation that the OP was talking about, so this algorithm is appropriate. Moreover, i believe it is representative of most situations where you have a bunch of files to compare. Of course, cases where files are tougher to tell apart do exist, but i think they're corner cases. Could you suggest a common kind of file with degenerate lengths, prefixes and suffixes? The only one that springs to mind is a set of same-sized image files in some noncompressed format, recording similar images (frames in a movie, say), where the differences might be buried deep in the pixel data. 
As it happens, i have just such a dataset on disk: with the images in TIFF format, i get differences between subsequent frames after 9 bytes, but i suspect that's a timestamp or something; if i convert everything to a nice simple BMP (throwing away 8 bits per sample of precision in the process - probably turning most of the pixels to 0!), then i find differences about a megabyte in. If i compare from the tail in, i also have to wade through about a megabyte before finding a difference. Here, hashes would be ideal. tom -- The revolution is here. Get against the wall, sunshine. -- Mike Froggatt -- http://mail.python.org/mailman/listinfo/python-list
Re: learning python, using string help
On Thu, 2 Feb 2006, [EMAIL PROTECTED] wrote: Well, I did want to add some formatting for example I getcha. This is really an HTML problem rather than a python problem, isn't it? What you need to do is output a table. FWIW, here's how i'd do it (assuming you've got HP-UX ruptime, since that's the only one i can find example output for [1]): http://urchin.earth.li/~twic/ruptime.py You can use this as a library, a command-line tool, or a CGI script; it'll automagically detect which context it's in and do the right thing. It's built around the output from the HP-UX version of ruptime; let me know what the output from yours looks like (a few lines would do) and i'll show you how to change it. The first key bit is a pair of regular expressions:

    lineRe = re.compile(r'(\S+)\s+(\S+)\s+([\d+:]+)(?:,\s+(\d+) users,\s+load ([\d., ]+))?')
    uptimeRe = re.compile(r'(?:(\d+)\+)?(\d*):(\d*)')

These rip the output from ruptime apart to produce a set of fields, which can then be massaged into useful data. Once that's done, there's a big splodge of code which prints out an HTML document containing a table displaying the information. It's probably neither the shortest nor the cleanest bit of code in the universe, but it does the job and should, i hope, be reasonably clear. tom [1] http://docs.hp.com/en/B2355-90743/ch06s02.html -- Science Never Sleeps -- http://mail.python.org/mailman/listinfo/python-list
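The two expressions run as written; here they are pulling apart a made-up HP-UX-style line (the sample line is my own, modelled on the docs in [1]):

```python
import re

lineRe = re.compile(r'(\S+)\s+(\S+)\s+([\d+:]+)'
                    r'(?:,\s+(\d+) users,\s+load ([\d., ]+))?')
uptimeRe = re.compile(r'(?:(\d+)\+)?(\d*):(\d*)')

line = 'hpux01        up   5+12:34,     3 users,  load 0.15, 0.10, 0.05'
host, state, uptime, users, load = lineRe.match(line).groups()
days, hours, minutes = uptimeRe.match(uptime).groups()
# host='hpux01', uptime='5+12:34', days='5', users='3'
```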
Re: Why checksum? [was Re: Fuzzy Lookups]
On Tue, 31 Jan 2006, it was written: Steven D'Aprano [EMAIL PROTECTED] writes: This isn't a criticism, it is a genuine question. Why do people compare local files with MD5 instead of doing a byte-to-byte compare? I often wonder that! Is it purely a caching thing (once you have the checksum, you don't need to read the file again)? Are there any other reasons? It's not just a matter of comparing two files. The idea is you have 10,000 local files and you want to find which ones are duplicates (i.e. if files 637 and 2945 have the same contents, you want to discover that). The obvious way is make a list of hashes, and sort the list. Obvious, perhaps, prudent, no. To make the list of hashes, you have to read all of every single file first, which could take a while. If your files are reasonably random at the beginning, you'd be better off just using the first N bytes of the file, since this would be just as effective, and cheaper to read. Looking at some random MP3s i have to hand, they all differ within the first 20 bytes - probably due to the ID3 tags, so this should work for these. Better yet, note that if two files are identical, they must have the same length, and that finding the length can be done very cheaply, so a quicker yet approach is to make a list of lengths, sort that, and look for duplicates; when you find one, do a byte-by-byte comparison of the files (probably terminating in the first block) to see if they really are the same. By way of example, of the 2690 music files in my iTunes library, i have twelve pairs of same-sized files [1], and all of these differ within the first 18 bytes (mostly, within the first 9 bytes). Therefore, i could rule out duplication with just 22 data blocks read from disk (plus rather more blocks of directory information and inodes, of course). A hash-based approach would have had to wade through a touch over 13 GB of data before it could even get started. 
Of course, there are situations where this is the wrong approach - if you have a collection of serialised sparse matrices, for example, which consist of identically-sized blocks of zeroes with a scattering of ones throughout, then lengths and prefixes will be useless, whereas hashes will work perfectly. However, here, we're looking at MP3s, where lengths and prefixes will be a win. tom [1] The distribution of those is a bit weird: ten pairs consist of two tracks from The Conet Project's 'Recordings of Shortwave Numbers Stations', one is a song from that and The Doors' 'Horse Latitudes', and one is between two Calexico songs ('The Ride (Pt II)' and 'Minas De Cobre'). Why on earth are eleven of the twelve pairs pairs of songs from the same artist? Is it really that they're pairs of songs from the same compressor (those tracks aren't from CD), i wonder? -- Not all legislation can be eye-catching, and it is important that the desire to achieve the headlines does not mean that small but useful measures are crowded out of the legislative programme. -- Select Committee on Transport -- http://mail.python.org/mailman/listinfo/python-list
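For concreteness, a sketch of the length-first strategy (find_duplicates and same_stream are my names for it, not anyone's library):

```python
import os
from collections import defaultdict
from io import BytesIO
from itertools import combinations

def same_stream(fa, fb, chunk=4096):
    """Compare two binary streams block by block, stopping at the first
    difference - for genuinely different files this usually costs one
    read from each."""
    while True:
        ba, bb = fa.read(chunk), fb.read(chunk)
        if ba != bb:
            return False
        if not ba:
            return True

def find_duplicates(paths):
    """Group files by length, then byte-compare only within groups."""
    by_size = defaultdict(list)
    for p in paths:
        by_size[os.path.getsize(p)].append(p)
    dups = []
    for group in by_size.values():
        for a, b in combinations(group, 2):
            with open(a, 'rb') as fa, open(b, 'rb') as fb:
                if same_stream(fa, fb):
                    dups.append((a, b))
    return dups

# BytesIO stands in for open files here:
assert same_stream(BytesIO(b'spam' * 100), BytesIO(b'spam' * 100))
```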
Re: simple perl program in python gives errors
On Mon, 30 Jan 2006, Grant Edwards wrote: On 2006-01-30, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: i was hoping one didn't have to initialize variables because perl defaults their value to zero. Also I noticed if I initialize a variable as 0, then I can only do integer math not floating math. Python is a strictly typed language. Perl isn't -- Perl does all sorts of stuff automagically by trying to guess what you wanted. I prefer languages that do exactly what I tell them to rather than what the language's author thought I might have meant. Especially when that's Larry Wall ... :) tom -- Don't trust the laws of men. Trust the laws of mathematics. -- http://mail.python.org/mailman/listinfo/python-list
Re: StringIO proposal: add __iadd__
On Sun, 29 Jan 2006, Alex Martelli wrote: Paul Rubin http://[EMAIL PROTECTED] wrote: Maybe the standard versions of some of these things can be written in RPython under PyPy, so they'll compile to fast machine code, and then the C versions won't be needed. By all means, the C versions are welcome, I just don't want to lose the Python versions either (and making them less readable by recoding them in RPython would interfere with didactical use). Is RPython really that bad? Lack of generators seems like the only serious issue to me. But with CPython I think we need the C versions. Unless we use Shed Skin to translate the RPython into C++. Or maybe we could write the code in Pyrex, generate C from that for CPython, then have a python script which strips out the type definitions to generate pure python for PyPy. tom -- Don't trust the laws of men. Trust the laws of mathematics. -- http://mail.python.org/mailman/listinfo/python-list
Re: Numarray, numeric, NumPy, scpy_core ??!!
On Sat, 21 Jan 2006, Robert Kern wrote: Tom Anderson wrote: Pardon my failure to RTFM, but does NumPy pick up the vecLib BLAS on Macs? Yes. Excellent, thanks. tom -- forget everything from school -- you are programmer -- http://mail.python.org/mailman/listinfo/python-list
Re: Returning a tuple-struct
On Thu, 18 Jan 2006 [EMAIL PROTECTED] wrote: Is there a better way? Thoughts? I was thinking along these lines:

    class NamedTuple(tuple):
        def __init__(self, indices, values):
            "indices should be a map from name to index"
            tuple.__init__(self, values)
            self.indices = indices
        def __getattr__(self, name):
            return self[self.indices[name]]

    colourNames = {'red': 0, 'green': 1, 'blue': 2}
    plum = NamedTuple(colourNames, (219, 55, 121))

The idea is that it's a tuple, but it has some metadata alongside (shared with other similarly-shaped tuples) which allows it to resolve names to indices - thus avoiding having two references to everything. However, if i try that, i get:

    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: tuple() takes at most 1 argument (2 given)

As far as i can tell, inheriting from tuple is forcing my constructor to only take one argument. Is that the case? If so, anyone got any idea why? If i rewrite it like this:

    class NamedTuple(tuple):
        def __init__(self, values):
            tuple.__init__(self, values)
        def __getattr__(self, name):
            return self[self.indices[name]]

    class ColourTuple(NamedTuple):
        indices = {'red': 0, 'green': 1, 'blue': 2}

    plum = ColourTuple((219, 55, 121))

Then it works. This is even an arguably better style. Changing the constructor to take *values rather than values, and to validate the length of the value tuple against the length of the index tuple, would be good, but, since i'm lazy, is left as an exercise to the reader. tom -- Throwin' Lyle's liquor away is like pickin' a fight with a meat packing plant! -- Ray Smuckles -- http://mail.python.org/mailman/listinfo/python-list
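On the "anyone got any idea why?": the TypeError comes from tuple.__new__, which receives both constructor arguments before __init__ ever runs, and accepts at most one. Since tuples are immutable, their contents have to be supplied at allocation time, to __new__; overriding it makes the original two-argument design work - a sketch:

```python
class NamedTuple(tuple):
    def __new__(cls, indices, values):
        # The tuple's contents are fixed at allocation time, so they
        # must go to tuple.__new__, not tuple.__init__.
        self = tuple.__new__(cls, values)
        self.indices = indices  # shared name-to-index map
        return self
    def __getattr__(self, name):
        try:
            return self[self.indices[name]]
        except KeyError:
            raise AttributeError(name)

colourNames = {'red': 0, 'green': 1, 'blue': 2}
plum = NamedTuple(colourNames, (219, 55, 121))
```

(Python 2.6 later grew collections.namedtuple, which generates a class along much these lines.)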
Re: Arithmetic sequences in Python
On Fri, 20 Jan 2006, it was written: [EMAIL PROTECTED] (Alex Martelli) writes: How would you make a one-element list, which we'd currently write as [3]? Would you have to say list((3,))? Yep. I don't particularly like the mandatory trailing comma in the tuple's display form, mind you, but, if it's good enough for tuples, and good enough for sets (how else would you make a one-element set?), If you really want to get rid of container literals, maybe the best way is with constructor functions whose interfaces are slightly different from the existing type-coercion functions:

    listx(1, 2, 3) = [1, 2, 3]
    listx(3) = [3]
    listx(listx(3)) = [[3]]
    dictx((a, b), (c, d)) = {a: b, c: d}
    setx(a, b, c) = Set((a, b, c))

listx/dictx/setx would be the display forms as well as the constructor forms. Could these even replace the current forms? If you want the equivalent of list(sometuple), write list(*sometuple). With a bit of cleverness down in the worky bits, this could be implemented to avoid the apparent overhead of unpacking and then repacking the tuple. In fact, in general, it would be nice if code like:

    def f(*args):
        fondle(args)

    foo = (1, 2, 3)
    f(*foo)

Would avoid the unpack/repack. The problem is that you then can't easily do something like:

    mytable = ((1, 2, 3), ('a', 'b', 'c'), (Tone.do, Tone.re, Tone.mi))
    mysecondtable = map(list, mytable)

Although that's moderately easy to work around with possibly the most abstract higher-order-function i've ever written:

    def star(f):
        def starred_f(args):
            return f(*args)
        return starred_f

Which lets us write:

    mysecondtable = map(star(list), mytable)

While we're here, we should also have the natural complement of star, its evil mirror universe twin:

    def bearded_star(f):
        def bearded_starred_f(*args):
            return f(args)
        return bearded_starred_f

Better names (eg unpacking and packing) would obviously be needed. 
tom -- I might feel irresponsible if you couldn't go almost anywhere and see naked, aggressive political maneuvers in iteration, marinating in your ideology of choice. That's simply not the case. -- Tycho Brahae -- http://mail.python.org/mailman/listinfo/python-list
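The star combinator from the post above runs as written; a quick demonstration against a varargs listx of the kind proposed (map is wrapped in list() so this also works on Python 3, where map is lazy):

```python
def listx(*args):
    """The proposed varargs list constructor: listx(1, 2, 3) == [1, 2, 3]."""
    return list(args)

def star(f):
    """Adapt f to take its arguments as one packed sequence."""
    def starred_f(args):
        return f(*args)
    return starred_f

def bearded_star(f):
    """The inverse adaptation: take loose arguments, pass them packed."""
    def bearded_starred_f(*args):
        return f(args)
    return bearded_starred_f

mytable = ((1, 2, 3), ('a', 'b', 'c'))
mysecondtable = list(map(star(listx), mytable))
print(mysecondtable)  # [[1, 2, 3], ['a', 'b', 'c']]
```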
Re: Arithmetic sequences in Python
On Sat, 21 Jan 2006, it was written: Tom Anderson [EMAIL PROTECTED] writes: listx/dictx/setx would be the display forms as well as the constructor forms. Could these even replace the current forms? If you want the equivalent of list(sometuple), write list(*sometuple). The current list function is supposed to be something like a typecast: A what? ;-|

    list() = []
    xlist() = []                    # ok
    list(list()) = []               # casting a list to a list does nothing
    xlist(xlist()) = [[]]           # make a new list, not the same
    list(xrange(4)) = [0,1,2,3]
    xlist(xrange(4)) = [xrange(4)]  # not the same
    list((1,2)) = [1,2]
    xlist((1,2)) = [(1,2)]

True, but so what? Is it that it has to be that way, or is it just that it happens to be that way now? tom -- It's the 21st century, man - we rue _minutes_. -- Benjamin Rosenbaum -- http://mail.python.org/mailman/listinfo/python-list
Re: Numarray, numeric, NumPy, scpy_core ??!!
On Sat, 21 Jan 2006, Travis E. Oliphant wrote: J wrote: I will just jump in and use NumPy. I hope this one will stick and evolve into the mother of array packages. How stable is it? For now I really just need basic linear algebra, i.e. matrix multiplication, dot, cross etc. There is a new release coming out this weekend. It's closer to 1.0 and so should be more stable. It also has some speed improvements in matrix-vector operations (if you have ATLAS BLAS --- or if you download a binary version with ATLAS BLAS compiled in). I would wait for it. Pardon my failure to RTFM, but does NumPy pick up the vecLib BLAS on Macs? tom -- It's the 21st century, man - we rue _minutes_. -- Benjamin Rosenbaum -- http://mail.python.org/mailman/listinfo/python-list
Re: OT: excellent book on information theory
Slow and to the pointless, but ... On Wed, 18 Jan 2006, Terry Hancock wrote: On Mon, 16 Jan 2006 12:15:25 -0500 Tim Peters [EMAIL PROTECTED] wrote: More Britishisms are surviving in the Scholastic editions as the series goes on, but as the list for Half-Blood Prince shows the editors still make an amazing number of seemingly pointless changes: like: UK:Harry smiled vaguely back US:Harry smiled back vaguely I know you are pointing out the triviality of this, since both US and UK English allow either placement -- but is it really preferred style in the UK to put the adverb right before the verb? For the meaning which i assume is meant here, no, i wouldn't have said so. In US English, the end of the clause (or the beginning) is probably more common. Same in British English (or at least, English English). As Dave Hansen pointed out, Harry smiled vaguely back, means that the direction Harry was smiling was vaguely back - might have been a bit to the side or something. This actually gets back on topic ( ;-) ), because it might affect the localization of a Python interactive fiction module I'm working on -- it's a GUI to generate sentences that are comprehensible to the IF engine. My guess would be that you're going to need something far more powerful than a localisation engine for this. en_US: Sally, gently put flower in basket vs en_UK: Sally, put flower in basket gently That example isn't as bad as the Rowling one (although the lack of articles is a bit odd); i think i'd only use the latter form if i wanted to put particular emphasis on the 'gently', particularly if it was as a modified repetition of a previous sentence: Instructor: Sally, put a flower in the basket. [Sally roughly puts the flower in the basket, crushing it] Instructor: Sally, put a flower in the basket *gently*. 
Your second construction isn't the equivalent of the Rowling sentence, though, where the adverb goes right after the verb; that would make it Sally, put gently the flower in the basket, which would be completely awful. Or maybe it would be Sally, put the flower gently in the basket, which would be fine, although a bit dated - has an admittedly euphonious 1950s BBC English feel to it. tom -- It's the 21st century, man - we rue _minutes_. -- Benjamin Rosenbaum -- http://mail.python.org/mailman/listinfo/python-list
Re: On Numbers
On Wed, 18 Jan 2006, Steven D'Aprano wrote: On Tue, 17 Jan 2006 23:34:40 +0000, Tom Anderson wrote: So I don't really know what point you are making. What solution(s) for 1**0.5 were you expecting? He's probably getting at the fact that if you're dealing with complex numbers, square root gets a lot more complicated: http://mathworld.wolfram.com/SquareRoot.html But still, that doesn't change the fact that x**0.5 as is meant here is the principal (positive) real square root, and that can be true whether your hierarchy of numeric types includes a complex type or not. Er, actually, i meant to write -1, but evidently missed a key, and failed to check what i'd written. Since exponentiation has higher priority than negation, -1**0.5 is -1.0 in both Python and ordinary mathematics. Perhaps you meant to write (-1)**0.5, Yes. [FX: bangs head on keyboard] I'm still getting this wrong after all these years. in which case Python developers have a decision to make: should it assume real-valued maths unless explicitly told differently, and hence raise an exception, or coerce the result to complex? Precisely. In this case, Python raises an exception, as it should, unless you explicitly use complex numbers. That's the best behaviour for the majority of people: most people don't even know what complex numbers are, let alone want to deal with them in their code. Python, after all, is not Mathematica. I think i agree with you, as a matter of practical value. However, this does go against the whole numeric unification thing we were discussing. Hmm. What happens if i say (-1) ** (0.5+0j)? Ah, i get the right answer. Well, that's handy - it means i don't have to resort to cmath or sprinkle complex() calls all over the place for complex maths. tom -- Biochemistry is the study of carbon compounds that wriggle. -- http://mail.python.org/mailman/listinfo/python-list
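The precedence point in two lines, plus a footnote on how the decision eventually went: where Python 2 raised ValueError for a fractional power of a negative real base, Python 3 promotes the result to complex automatically:

```python
# ** binds tighter than unary minus, so this is -(1 ** 0.5):
assert -1**0.5 == -1.0

# With the base genuinely negative, Python 2 raised ValueError here;
# on Python 3 the result quietly becomes complex:
z = (-1)**0.5
print(abs(z - 1j) < 1e-9)  # True, up to rounding error in the real part
```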
Re: On Numbers
On Mon, 16 Jan 2006, Erik Max Francis wrote: Steven D'Aprano wrote: The square root of 1 is +1 (the negative root being explicitly rejected). Pure mathematicians, who may be expected to care whether the root is the integer 1 or the real number 1, are unlikely to write 1**0.5, preferring the squareroot symbol. For the rest of us, including applied mathematicians, 1**0.5 implies floating point, which implies the correct answer is 1.0. So I don't really know what point you are making. What solution(s) for 1**0.5 were you expecting? He's probably getting at the fact that if you're dealing with complex numbers, square root gets a lot more complicated: http://mathworld.wolfram.com/SquareRoot.html But still, that doesn't change the fact that x**0.5 as is meant here is the principal (positive) real square root, and that can be true whether your hierarchy of numeric types includes a complex type or not. Er, actually, i meant to write -1, but evidently missed a key, and failed to check what i'd written. But excellent discussion there, chaps! All shall have medals! tom -- Taking care of business -- http://mail.python.org/mailman/listinfo/python-list
Re: Web application design question (long)
On Tue, 16 Jan 2006, Fried Egg wrote: I am interested if anyone can shed any light on a web application problem, I'm not going to help you with that, but i am going to mention the Dada Engine: http://dev.null.org/dadaengine/ And its most famous incarnation, the Postmodernism Generator: http://www.elsewhere.org/pomo tom -- Taking care of business -- http://mail.python.org/mailman/listinfo/python-list
Re: Arithmetic sequences in Python
On Tue, 16 Jan 2006, it was written: Tom Anderson [EMAIL PROTECTED] writes: The natural way to implement this would be to make .. a normal operator, rather than magic, and add a __range__ special method to handle it. a .. b would translate to a.__range__(b). I note that Roman Suzi proposed this back in 2001, after PEP 204 was rejected. It's a pretty obvious implementation, after all. Interesting, but what do you do about the unary postfix (1 ..) infinite generator? 1.__range__(None) (-3,-5 ..) -- 'infinite' generator that yield -3,-5,-7 and so on -1. Personally, i find the approach of specifying the first two elements *absolutely* *revolting*, and it would consistently be more awkward to use than a start/step/stop style syntax. Come on, when do you know the first two terms but not the step size? Usually you know both, but showing the first two elements makes sequence more visible. I certainly like (1,3..9) better than (1,9;2) or whatever. I have to confess that i don't have a pretty three-argument syntax to offer as an alternative to yours. But i'm afraid i still don't like yours. :) 1) [] means list, () means generator Yuck. Yes, i know it's consistent with list comps and genexps, but yuck to those too! I'd be ok with getting rid of [] and just having generators or xrange-like class instances. If you want to coerce one of those to a list, you'd say list((1..5)) instead of [1..5]. Sounds good. More generally, i'd be more than happy to get rid of list comprehensions, letting people use list(genexp) instead. That would obviously be a Py3k thing, though. tom -- Taking care of business -- http://mail.python.org/mailman/listinfo/python-list
Re: Arithmetic sequences in Python
On Tue, 17 Jan 2006, Antoon Pardon wrote: Op 2006-01-16, Alex Martelli schreef [EMAIL PROTECTED]: Paul Rubin http://[EMAIL PROTECTED] wrote: Steven D'Aprano [EMAIL PROTECTED] writes: For finite sequences, your proposal adds nothing new to existing solutions like range and xrange. Oh come on, [5,4,..0] is much easier to read than range(5,-1,-1). But not easier than reversed(range(6)) [[the 5 in one of the two expressions in your sentence has to be an offbyone;-)]] Why don't we give slices more functionality and use them. These are a number of ideas I had. (These are python3k ideas) 1) Make slices iterables. (No more need for (x)range) 2) Use a bottom and stop variable as default for the start and stop attribute. top would be a value that is greater than any other value, bottom would be a value smaller than any other value. 3) Allow slice notation to be used anywhere a value can be used. 4) Provide a number of extra operators on slices. __neg__ (reverses the slice) __and__ gives the intersection of two slices __or__ gives the union of two slices 5) Provide sequences with a range (or slice) method. This would provide an iterator that iterates over the indexes of the sequences. A slice could be provided +5 for i, el in enumerate(sequence): would become for i in sequence.range(): el = sequence[i] That one, i'm not so happy with - i quite like enumerate; it communicates intention very clearly. I believe enumerate is implemented with iterators, meaning it's potentially more efficient than your approach, too. And since enumerate works on iterators, which yours doesn't, you have to keep it anyway. Still, both would be possible, and it's a matter of taste. But the advantage is that this would still work when someone subclasses a list so that it start index is an other number but 0. It would be possible to patch enumerate to do the right thing in those situations - it could look for a range method on the enumerand, and if it found one, use it to generate the indices. 
Like this:

def enumerate(thing):
    if (hasattr(thing, "range")):
        indices = thing.range()
    else:
        indices = itertools.count()
    return itertools.izip(indices, thing)

If you only wanted every other index one could do the following for i in sequence.range(::2): which would be equivalent to for i in sequence.range() (::2): Oh, that is nice. Still, you could also extend enumerate to take a range as an optional second parameter and do this with it. Six of one, half a dozen of the other, i suppose. tom -- Taking care of business -- http://mail.python.org/mailman/listinfo/python-list
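Here's a runnable sketch of that patched enumerate. The names enumerate2, OneBasedList and the range() hook are all invented for illustration (no sequence type actually grows a range() method), and izip is spelt zip in later Pythons:

```python
import itertools

try:
    izip = itertools.izip        # Python 2
except AttributeError:
    izip = zip                   # later Pythons: built-in zip is already lazy

def enumerate2(thing):
    # hypothetical variant of enumerate: honour a range() method, if the
    # sequence has one, to generate its indices
    if hasattr(thing, "range"):
        indices = thing.range()
    else:
        indices = itertools.count()
    return izip(indices, thing)

class OneBasedList(list):
    # toy sequence whose indices notionally start at 1, to exercise the hook
    def range(self):
        return itertools.count(1)

pairs = list(enumerate2(OneBasedList(["a", "b", "c"])))
# pairs == [(1, "a"), (2, "b"), (3, "c")]
```

Sequences without a range() method fall through to itertools.count(), giving ordinary zero-based enumerate behaviour.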
Re: On Numbers
On Sun, 15 Jan 2006, Alex Martelli wrote: Paul Rubin http://[EMAIL PROTECTED] wrote: Mike Meyer [EMAIL PROTECTED] writes: I'd like to work on that. The idea would be that all the numeric types are representations of reals with different properties that make them appropriate for different uses. 2+3j? Good point, so s/reals/complex numbers/ -- except for this detail, Mike's idea do seem well founded. 1 ** 0.5 ? I do like the mathematical cleanliness of making ints and floats do the right thing when the answer would be complex, but as a pragmatic decision, it might not be the right thing to do. It evidently wasn't thought it was when python's current number system was designed. I think Tim Peters has an opinion on this. tom -- Socialism - straight in the mainline! -- http://mail.python.org/mailman/listinfo/python-list
Re: Arithmetic sequences in Python
On Mon, 16 Jan 2006, it was written: There's something to be said for that. Should ['a'..'z'] be a list or a string? And while we're there, what should ['aa'..'zyzzogeton'] be? tom -- Socialism - straight in the mainline! -- http://mail.python.org/mailman/listinfo/python-list
Re: Arithmetic sequences in Python
On Mon, 16 Jan 2006, Gregory Petrosyan wrote: Please visit http://www.python.org/peps/pep-0204.html first. As you can see, PEP 204 was rejected, mostly because of not-so-obvious syntax. But IMO the idea behind this pep is very nice. Agreed. Although i have to say, i like the syntax there - it seems like a really natural extension of existing syntax. So, maybe there's a reason to adopt slightly modified Haskell's syntax? Well, i do like the .. - 1..3 seems like a natural way to write a range. I'd find 1...3 more natural, since an ellipsis has three dots, but it is slightly more tedious. The natural way to implement this would be to make .. a normal operator, rather than magic, and add a __range__ special method to handle it. a .. b would translate to a.__range__(b). I note that Roman Suzi proposed this back in 2001, after PEP 204 was rejected. It's a pretty obvious implementation, after all. Something like [1,3..10] -- [1,3,5,7,9] (1,3..10) -- same values as above, but return generator instead of list [1..10] -- [1,2,3,4,5,6,7,8,9,10] (1 ..)-- 'infinite' generator that yield 1,2,3 and so on (-3,-5 ..) -- 'infinite' generator that yield -3,-5,-7 and so on -1. Personally, i find the approach of specifying the first two elements *absolutely* *revolting*, and it would consistently be more awkward to use than a start/step/stop style syntax. Come on, when do you know the first two terms but not the step size? 1) [] means list, () means generator Yuck. Yes, i know it's consistent with list comps and genexps, but yuck to those too! Instead, i'd like to see lazy lists used here - these look like lists, and can be used exactly like a list, but if all you want to do is iterate over them, they don't need to instantiate themselves in memory, so they're as efficient as an iterator. The best of both worlds! I've written a sketch of a generic lazy list: http://urchin.earth.li/~twic/lazy.py Note that this is what xrange does already (as i've just discovered). 
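To make the __range__ proposal concrete, here is a sketch of how the protocol might look. Since there is no .. operator to hook into, the method is called explicitly; the class name rangeable and the step parameter are invented for illustration, and None stands in for the missing bound in the (1 ..) form:

```python
from itertools import islice

class rangeable(int):
    # hypothetical sketch: under the proposal, a .. b would desugar to
    # a.__range__(b); here we call the method by hand
    def __range__(self, other, step=1):
        if other is None:
            # the unbounded form, e.g. (1 ..)
            n = self
            while True:
                yield n
                n += step
        else:
            n = self
            while n <= other:
                yield n
                n += step

assert list(rangeable(1).__range__(5)) == [1, 2, 3, 4, 5]
assert list(rangeable(1).__range__(10, 2)) == [1, 3, 5, 7, 9]
# the (-3, -5 ..) example from the post, truncated with islice:
assert list(islice(rangeable(-3).__range__(None, -2), 3)) == [-3, -5, -7]
```

Because __range__ is a generator, the result is lazy in exactly the xrange-like way described above: nothing is materialised unless the caller asks for it.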
tom -- Socialism - straight in the mainline! -- http://mail.python.org/mailman/listinfo/python-list
Re: Arithmetic sequences in Python
On Mon, 16 Jan 2006, Alex Martelli wrote: Steven D'Aprano [EMAIL PROTECTED] wrote: On Mon, 16 Jan 2006 12:51:58 +0100, Xavier Morel wrote: For those who'd need the (0..n-1) behavior, Ruby features something that I find quite elegant (if not perfectly obvious at first), (first..last) provides a range from first to last with both boundaries included, but (first...last) (notice the 3 periods) No, no I didn't. Sheesh, that just *screams* Off By One Errors!!!. Python deliberately uses a simple, consistent system of indexing from the start to one past the end specifically to help prevent signpost errors, and now some folks want to undermine that. *shakes head in amazement* Agreed. *IF* we truly needed an occasional up to X *INCLUDED* sequence, it should be in a syntax that can't FAIL to be noticed, such as range(X, endincluded=True). How about first,,last? Harder to do by mistake, but pretty horrible in its own way. tom -- Socialism - straight in the mainline! -- http://mail.python.org/mailman/listinfo/python-list
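The range(X, endincluded=True) idea is easy enough to sketch as a plain function; the name and keyword here are invented for illustration, not anything in the standard library:

```python
def range_inclusive(start, stop, step=1, endincluded=True):
    # sketch of an "up to X *INCLUDED*" range that can't fail to be noticed;
    # with endincluded=False it degenerates to the ordinary half-open range
    if not endincluded:
        return range(start, stop, step)
    if step > 0:
        return range(start, stop + 1, step)
    else:
        return range(start, stop - 1, step)

assert list(range_inclusive(1, 5)) == [1, 2, 3, 4, 5]
assert list(range_inclusive(5, 0, -1)) == [5, 4, 3, 2, 1, 0]
```

The explicit keyword makes the closed boundary loud at the call site, which is the point of Alex's suggestion.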
Re: how do real python programmers work?
On Thu, 12 Jan 2006, bblais wrote: In Matlab, I do much the same thing, except there is no compile phase. I have the editor on one window, the Matlab interactive shell in the other. I often make a bunch of small scripts for exploration of a problem, before writing any larger apps. I go back and forth editing the current file, and then running it directly (Matlab looks at the time stamp, and automagically reloads the script when I modify it). I wouldn't describe myself as an experienced programmer, but this is definitely how i work - editor plus interactive interpreter, using import/reload to bring in and play with bits of code. Towards the end of coding a program, when i'm done with the inner functions and am working on the main function, which does stuff like command line parsing, setting up input and output, etc, i'll often leave the interpreter and work from the OS shell, since that's the proper environment for a whole program. Often, i'll actually have more than one shell open - generally three: one with an interpreter without my code loaded, for doing general exploratory programming, testing code fragments, doing sums, etc; one with an interpreter with my code loaded, for testing individual components of the code, and one at the OS shell, for doing whole-program tests, firing up editors, general shell work, etc. Another trick is to write lightweight tests as functions in the interpreter-with-code-loaded that reload my module and then do something with it. For example, for testing my (entirely fictional) video compressor, i might write:

def testcompressor():
    reload(vidzip)
    seq = vidzip.ImageSequence((640, 480))
    for i in xrange(200):
        frameName = "testmovie.%02i.png" % i
        frame = Image.open(frameName)
        seq.append(frame)
    codec = vidzip.Compressor(vidzip.DIRAC, 9)
    codec.compress(seq, file("testmovie.bbc", "w"))

Then, after editing and saving my code, i can just enter testcompressor() (or, in most cases, hit up-arrow and return) to reload and test.
You can obviously extend this a bit to make the test routine take parameters which control the nature of the test, so you can easily test a range of things, and you can have multiple different test on the go at once. tom -- Only men's minds could have mapped into abstraction such a territory -- http://mail.python.org/mailman/listinfo/python-list
Re: how do real python programmers work?
On Thu, 12 Jan 2006, Mike Meyer wrote: well, we need a term for development environment built out of Unix tools Disintegrated development environment? Differentiated development environment? How about just a development environment? tom -- NOW ALL ASS-KICKING UNTIL THE END -- http://mail.python.org/mailman/listinfo/python-list
Re: how do real python programmers work?
On Fri, 13 Jan 2006, Roy Smith wrote: Mike Meyer [EMAIL PROTECTED] wrote: we need a term for development environment built out of Unix tools We already have one. The term is emacs. Emacs isn't built out of unix tools - it's a standalone program. Ah, of course - to an true believer, emacs *is* the unix toolset. :) tom -- NOW ALL ASS-KICKING UNTIL THE END -- http://mail.python.org/mailman/listinfo/python-list
Re: Help wanted with md2 hash algorithm
On Sun, 8 Jan 2006, Tom Anderson wrote: On Fri, 6 Jan 2006 [EMAIL PROTECTED] wrote: below you find my simple python version of MD2 algorithm as described in RFC1319 (http://rfc1319.x42.com/MD2). It produces correct results for strings shorter than 16 Bytes and wrong results for longer strings. I guess the thing to do is extract the C code from the RFC and compile it, verify that it works, then stick loads of print statements in the C and the python, to see where the states of the checksum engines diverge. Okay, i've done this. I had to fiddle with the source a bit - added a #include "global.h" to md2.h (it needs it for the PROTO_LIST macro) and took the corresponding includes out of md2c.c and mddriver.c (to avoid duplicate definitions) - but after that, it built cleanly with: gcc -DMD=2 *.c *.h -o mddriver A couple of pairs of (somewhat spurious) parentheses in mddriver.c, and it even built cleanly with -Wall. Running the test suite with mddriver -x gives results matching the test vectors in the RFC - a good start!
Patching the code to dump the checksums immediately after updating with the pad, and before updating with the checksum: *** checksum after padding = 623867b6af52795e5f214e9720beea8d MD2 () = 8350e5a3e24c153df2275c9f80692773 *** checksum after padding = 19739cada3ba281693348e9d256fff31 MD2 (a) = 32ec01ec4a6dac72c0ab96fb34c0b5d1 *** checksum after padding = 19e29d1b7304368e595a276f302f57cc MD2 (abc) = da853b0d3f88d99b30283a69e6ded6bb *** checksum after padding = 56d65157dedfcd75a7b1e82d970eec4b MD2 (message digest) = ab4f496bfb2a530b219ff33031fe06b0 *** checksum after padding = 4a42d3a377b7e9988fb9289699e4d3a3 MD2 (abcdefghijklmnopqrstuvwxyz) = 4e8ddff3650292ab5a4108c3aa47940b *** checksum after padding = c3db7592ee1dd9b84505cfb4e2f9a765 MD2 (ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789) = da33def2a42df13975352846c30338cd *** checksum after padding = 59ca5673c8f931bc41214f56b5c6c01 MD2 (12345678901234567890123456789012345678901234567890123456789012345678901234567890) = d5976f79d83d3a0dc9806c3c66f3efd8 And here's my python code with the same modification, running the test suite: *** checksum after padding = 623867b6af52795e5f214e9720beea8d MD2 () = 8350e5a3e24c153df2275c9f80692773 *** checksum after padding = 19739cada3ba281693348e9d256fff31 MD2 (a) = 32ec01ec4a6dac72c0ab96fb34c0b5d1 *** checksum after padding = 19e29d1b7304368e595a276f302f57cc MD2 (abc) = da853b0d3f88d99b30283a69e6ded6bb *** checksum after padding = 56d65157dedfcd75a7b1e82d970eec4b MD2 (message digest) = ab4f496bfb2a530b219ff33031fe06b0 *** checksum after padding = 539ba695f264f365bcabc5c8b10913c7 MD2 (abcdefghijklmnopqrstuvwxyz) = 65182bb8c569485fcba44dbc66a02b56 *** checksum after padding = 365fe0617f5f56a56090af1cfd6caac3 MD2 (ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789) = a1ccc835ea9654d6a2926c21f0b20813 *** checksum after padding = 9acf39425d22c4e3b4ddbdc563d23716 MD2 (12345678901234567890123456789012345678901234567890123456789012345678901234567890) = 
8f1f49dc8de490b9aa7c99cec3fbccdf As you can see, the checksums start to go wrong when we hit 16 bytes. So, let us turn our attention to the checksum function. Here's the python i wrote:

def checksum_old(c, buf):
    # c is checksum array, buf is input block
    l = c[-1]
    for i in xrange(digest_size):
        l = S[(buf[i] ^ l)]
        c[i] = l

Here's the C from the RFC:

unsigned int i, j, t;
t = checksum[15];
for (i = 0; i < 16; i++)
    t = checksum[i] ^= PI_SUBST[block[i] ^ t];

Spot the difference. Yes, the assignment into the checksum array is a ^=, not a straight = - checksum bytes get set to current-value-of-checksum-byte xor S-box-transformation-of (input-byte xor accumulator). Translating that into python, we get:

def checksum(c, buf):
    l = c[-1]
    for i in xrange(digest_size):
        l = S[(buf[i] ^ l)] ^ c[i]
        c[i] = l

And when we put that back into the code, we get the right digests out. Victory! However, here's what the pseudocode in the RFC says:

For j = 0 to 15 do
    Set c to M[i*16+j].
    Set C[j] to S[c xor L].
    Set L to C[j].
end /* of loop on j */

I certainly don't see any sign of a xor with the current-value-of-checksum-byte in there - it looks like the C and pseudocode in the RFC don't match up. And, yes, googling for RFC 1319 errata brings up a report correcting this. They really ought to amend RFCs to mention errata! Correct code here: http://urchin.earth.li/~twic/md2.py tom -- Mathematics is the door and the key to the sciences. -- Roger Bacon -- http://mail.python.org/mailman/listinfo/python-list
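The "correct below 16 bytes, wrong above" symptom is easy to reproduce with a toy substitution table (made up for this demo - it is NOT the real MD2 S-box; only the control flow mirrors the two variants). The checksum starts out all zeros, and c[i] ^ x == x when c[i] is zero, so the buggy and fixed versions agree on the first 16-byte block and only diverge from the second block on:

```python
S = [255 - i for i in range(256)]   # stand-in S-box, not the RFC 1319 table

def checksum_assign(c, block):
    # the buggy version: plain assignment into the checksum array
    l = c[-1]
    for i in range(16):
        l = S[block[i] ^ l]
        c[i] = l

def checksum_xor(c, block):
    # the corrected version: the ^= semantics from the RFC's C code
    l = c[-1]
    for i in range(16):
        l = S[block[i] ^ l] ^ c[i]
        c[i] = l

block = [0] * 16
c1, c2 = [0] * 16, [0] * 16

checksum_assign(c1, block)
checksum_xor(c2, block)
assert c1 == c2     # first block: the variants still agree

checksum_assign(c1, block)
checksum_xor(c2, block)
assert c1 != c2     # second block: the bug shows itself
```

This is exactly why the original poster's digests matched the test vectors for short strings and failed for long ones.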
Re: try: except never:
On Tue, 10 Jan 2006, Duncan Booth wrote: Paul Rubin wrote: Hallvard B Furuseth [EMAIL PROTECTED] writes: class NeverRaised(Exception): pass for ex in ZeroDivisionError, NeverRaised: Heh. Simple enough. Unless some obstinate person raises it anyway... Hmm, ok, how's this?:

def NeverRaised():
    class blorp(Exception):
        pass
    return blorp

for ex in ZeroDivisionError, NeverRaised(): ...

Nice. Or you can create an unraisable exception:

class NeverRaised(Exception):
    def __init__(self, *args):
        raise RuntimeError('NeverRaised should never be raised')

Brilliant! Although i'd be tempted to define an UnraisableExceptionError to signal what's happened. Or ...

class ImpossibleException(Exception):
    def __init__(self, *args):
        raise ImpossibleException, args

Although crashing the interpreter is probably overkill. tom -- Like Kurosawa i make mad films; okay, i don't make films, but if i did they'd have a samurai. -- http://mail.python.org/mailman/listinfo/python-list
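The UnraisableExceptionError idea mentioned above might look like this; both class names are invented here, not part of any library:

```python
class UnraisableExceptionError(RuntimeError):
    # a distinctive type to signal that someone raised the unraisable
    pass

class NeverRaised(Exception):
    def __init__(self, *args):
        raise UnraisableExceptionError("NeverRaised should never be raised")

# "except NeverRaised" can never fire, since no instance can be constructed...
try:
    1 / 0
except NeverRaised:
    caught = True
except ZeroDivisionError:
    caught = False
assert caught is False

# ...and anyone obstinate enough to raise it gets a distinctive error instead:
try:
    raise NeverRaised("oops")
except UnraisableExceptionError:
    told_off = True
assert told_off
```

The subclass of RuntimeError keeps Duncan's behaviour while making the failure mode self-describing, which was the point of the suggestion.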
Re: Help wanted with md2 hash algorithm
On Fri, 6 Jan 2006 [EMAIL PROTECTED] wrote: below you find my simple python version of MD2 algorithm as described in RFC1319 (http://rfc1319.x42.com/MD2). It produces correct results for strings shorter than 16 Bytes and wrong results for longer strings. I can't find what's wrong. Can anybody help? Okay, i've reimplemented the code from scratch, based on the RFC, without even looking at your code, as a basis for comparison. The trouble is, i get exactly the same results as you! Here's mine: http://urchin.earth.li/~twic/md2.py I guess the thing to do is extract the C code from the RFC and compile it, verify that it works, then stick loads of print statements in the C and the python, to see where the states of the checksum engines diverge. tom -- Death to all vowels! The Ministry of Truth says vowels are plus undoublethink. Vowels are a Eurasian plot! Big Brother, leading us proles to victory! -- http://mail.python.org/mailman/listinfo/python-list
Re: Calling GPL code from a Python application
On Wed, 4 Jan 2006, Mike Meyer wrote: Terry Hancock [EMAIL PROTECTED] writes: It is interesting to note that the FSF holds the position that the language that gives you this right *doesn't* -- it just clarifies the fact that you already hold that right, because it is provided by fair use. Their position is that it is not possible to restrict the *use* of software you have legally acquired, because copyright only controls copying. I believe there is precedent that contradicts the FSF's position. There are two arguments against it: 1) Executing software involves several copy operations. Each of those potentially violate the copyright, and hence the copyright holder can restrict execution of a program. 2) Executing a program is analogous to a performance of the software. Copyright includes limits on performances, so the copyright holder can place limits on the execution of the software. Personally, I agree with the FSF - if own a copy of a program, executing it should be fair use. I'm with you - i don't accept either of those legal arguments. The copying that copyright talks about is the making of copies which can be distributed - copies which are the equivalent of the original. It doesn't mean the incidental, transient copies made during use - otherwise, it would be illegal to read a book, since a copy of the text is transiently made in your visual cortex, or to listen to a record, since a copy of the music is transiently made in the pattern of sound waves in the air. The performance that the law talks about is not like execution, but is communication, and so a form of copying - by performing a play, you're essentially giving a copy of the text to the audience. Executing a program doesn't communicate it to any third parties. Of course, in practice, it matters rather little whether i accept either of those, since i'm not a judge trying the relevant test case! While I'm here, I'll point out the the address space argument is specious. 
What if I bundle a standalone GPL'ed application with my own application, and distribute binaries for a machine that has a shared address space? By that criteria, I'd have to GPL my code for the distribution for the shared address space machine, but not for a Unix system. I'm not buying that. I also agree that the address space thing is bunk. What if i write a CORBA/RPC/COM/etc wrapper round some GPL'd library, release that under the GPL, then write my non-GPL'd program to access the wrapped library via a socket? Or if i write a wrapper application that takes a function name and some parameters on the command line, calls that function, and writes the result to stdout, then access it via popen? I get the use of the library, without sharing its address space! On the flip side, we could argue that an application which uses a dynamic library *is* a derivative work, since we need a header file from the library to compile it, and that header file is covered by the GPL. What happens when you compile with a non-GPL but compatible header (say, one you've clean-roomed) but link to a GPL library at runtime, though? tom -- I am the best at what i do. -- http://mail.python.org/mailman/listinfo/python-list
Re: Memoization and encapsulation
On Wed, 4 Jan 2006 [EMAIL PROTECTED] wrote: I think python is broken here-- why aren't lists hashable, or why isn't there a straightforward way to make memoised() work?

a = [1, 2, 3]
d = {a: "foo"}
a[0] = 0
print d[a]

I feel your pain, but i don't think lists (and mutable objects generally) being unhashable is brokenness. I do think there's room for a range of opinion, though, and i'm not sure what i think is right. tom -- Rapid oxidation is the new black. -- some Mike -- http://mail.python.org/mailman/listinfo/python-list
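The usual workaround is to freeze the mutable argument into a tuple before using it as a dictionary key, so the key can no longer change underneath the dict. A minimal sketch (the function names and the stand-in computation are made up for illustration):

```python
cache = {}

def expensive(xs):
    # lists aren't hashable, but tuples of hashable elements are
    key = tuple(xs)
    if key not in cache:
        cache[key] = sum(xs) + 1    # stand-in for a slow computation
    return cache[key]

a = [1, 2, 3]
assert expensive(a) == 7
a[0] = 0                # mutating the list no longer corrupts the cache:
assert expensive([1, 2, 3]) == 7    # old entry still findable by value
assert expensive(a) == 6            # the mutated list is a fresh key
```

This sidesteps the problem in the quoted snippet, where mutating a would leave d holding an entry that can no longer be looked up.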
Filename case-insensitivity on OS X
Afternoon all, MacOS X seems to have some heretical ideas about the value of case in paths - it seems to believe that it doesn't exist, more or less, so touch foo FOO touches just one file, you can't have both 'makefile' and 'Makefile' in the same directory, os.path.exists(some_valid_path.upper()) returns True even when os.path.split(some_valid_path.upper())[1] in os.listdir(os.path.split(some_valid_path)[0]) returns False, etc (although, of course, ls *.txt doesn't mention any of those .TXT files lying around). Just to prove it, here's what unix (specifically, linux) does:

[EMAIL PROTECTED]:~$ uname
Linux
[EMAIL PROTECTED]:~$ python
Python 2.3.5 (#2, Sep 4 2005, 22:01:42)
[GCC 3.3.5 (Debian 1:3.3.5-13)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> filenames = os.listdir(".")
>>> first = filenames[0]
>>> first in filenames
True
>>> first.upper() in filenames
False
>>> os.path.exists(os.path.join(".", first))
True
>>> os.path.exists(os.path.join(".", first.upper()))
False

And here's what OS X does:

Hooke:~ tom$ uname
Darwin
Hooke:~ tom$ python
Python 2.4.1 (#2, Mar 31 2005, 00:05:10)
[GCC 3.3 20030304 (Apple Computer, Inc. build 1666)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> filenames = os.listdir(".")
>>> first = filenames[0]
>>> first in filenames
True
>>> first.upper() in filenames
False
>>> os.path.exists(os.path.join(".", first))
True
>>> os.path.exists(os.path.join(".", first.upper()))
True

Sigh. Anyone got any bright ideas for working around this, specifically for os.path.exists? I was hoping there was some os.path.actualpath, so i could say:

def exists_dontignorecase(path):
    return os.path.exists(path) and (path == os.path.actualpath(path))

Java has a java.io.File.getCanonicalPath method that does this, but i can't find an equivalent in python - is there one?
I can emulate it like this:

def _canonicalise(s, l):
    s = s.lower()
    for t in l:
        if s == t.lower():
            return t
    raise ValueError, ("could not canonicalise string", s)

def canonicalpath(path):
    if (path in ("/", "")):
        return path
    parent, child = os.path.split(path)
    cparent = canonicalpath(parent)
    cchild = _canonicalise(child, os.listdir(cparent))
    return os.path.join(cparent, cchild)

Or, more crudely, do something like this:

def exists_dontignorecase(path):
    dir, f = os.path.split(path)
    return f in os.listdir(dir)

But better solutions are welcome. Thanks, tom -- Infantry err, infantry die. Artillery err, infantry die. -- IDF proverb -- http://mail.python.org/mailman/listinfo/python-list
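Relatedly, it's sometimes handy to detect up front whether a directory even lives on a case-insensitive filesystem. A probe sketch (the function name is invented; it drops a lowercase-named temporary file in the directory and asks whether its uppercased name also appears to exist):

```python
import os
import tempfile

def filesystem_is_case_insensitive(dirpath):
    # hypothetical helper: True if dirpath's filesystem ignores case
    fd, path = tempfile.mkstemp(prefix="case_probe_", dir=dirpath)
    try:
        os.close(fd)
        head, tail = os.path.split(path)
        # the prefix guarantees tail.upper() != tail, so on a
        # case-sensitive filesystem this name should not exist
        return os.path.exists(os.path.join(head, tail.upper()))
    finally:
        os.remove(path)
```

On an HFS volume this should report True, and on a typical Linux filesystem False, so a program can decide whether the canonicalisation dance above is needed at all.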
Re: Spiritual Programming (OT, but Python-inspired)
On Mon, 2 Jan 2006 [EMAIL PROTECTED] wrote: In this sense, we are like the ghost in the machine of a computer system running a computer program, or programs, written in a procedural language and style. Makes sense - i heard that Steve Russell invented continuations after reading the Tibetan Book of the Dead. tom -- Chance? Or sinister scientific conspiracy? -- http://mail.python.org/mailman/listinfo/python-list
Re: itertools.izip brokeness
On Tue, 3 Jan 2006, it was written: [EMAIL PROTECTED] writes: The problem is that sometimes, depending on which file is the shorter, a line ends up missing, appearing neither in the izip() output, or in the subsequent direct file iteration. I would guess that it was in izip's buffer when izip terminates due to the exception on the other file. A different possible long term fix: change StopIteration so that it takes an optional arg that the program can use to figure out what happened. Then change izip so that when one of its iterator args runs out, it wraps up the remaining ones in a new tuple and passes that to the StopIteration it raises. +1 I think you also want to send back the items you read out of the iterators which are still alive, which otherwise would be lost. Here's a somewhat minimalist (but tested!) implementation: def izip(*iters): while True: z = [] try: for i in iters: z.append(i.next()) yield tuple(z) except StopIteration: raise StopIteration, z The argument you get back with the exception is z, the list of items read before the first empty iterator was encountered; if you still have your array iters hanging about, you can find the iterator which stopped with iters[len(z)], the ones which are still going with iters[:len(z)], and the ones which are in an uncertain state, since they were never tried, with iters[(len(z) + 1):]. This code could easily be extended to return more information explicitly, of course, but simple, sparse, etc. You would want some kind of extended for-loop syntax (maybe involving the new with statement) with a clean way to capture the exception info. How about for ... except? for z in izip(a, b): lovingly_fondle(z) except StopIteration, leftovers: angrily_discard(leftovers) This has the advantage of not giving entirely new meaning to an existing keyword. 
It does, however, afford the somewhat dubious use: for z in izip(a, b): lovingly_fondle(z) except ValueError, leftovers: pass # execution should almost certainly never get here Perhaps that form should be taken as meaning: try: for z in izip(a, b): lovingly_fondle(z) except ValueError, leftovers: pass # execution could well get here if the fondling goes wrong Although i think it would be more strictly correct if, more generally, it made: for LOOP_VARIABLE in ITERATOR: SUITE except EXCEPTION: HANDLER Work like: try: while True: try: LOOP_VARIABLE = ITERATOR.next() except EXCEPTION: raise __StopIteration__, sys.exc_info() except StopIteration: break SUITE except __StopIteration__, exc_info: somehow_set_sys_exc_info(exc_info) HANDLER As it stands, throwing a StopIteration in the suite inside a for loop doesn't terminate the loop - the exception escapes; by analogy, the for-except construct shouldn't trap exceptions from the loop body, only those raised by the iterator. tom -- Chance? Or sinister scientific conspiracy? -- http://mail.python.org/mailman/listinfo/python-list
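For completeness, the same leftover-capturing idea can be had today without raising StopIteration at all, by returning the partial row alongside the complete pairs (zip_capture is an invented name for this sketch; it trades the generator's laziness for a plain function):

```python
def zip_capture(*iterables):
    # like izip, but when one iterator dries up, hand back both the
    # complete rows and the partial row read so far, instead of losing
    # the partially-consumed items
    its = [iter(it) for it in iterables]
    rows = []
    while True:
        row = []
        for it in its:
            try:
                row.append(next(it))
            except StopIteration:
                return rows, row    # row holds the items already consumed
        rows.append(tuple(row))

pairs, leftovers = zip_capture([1, 2, 3], ["a", "b"])
# pairs == [(1, "a"), (2, "b")], leftovers == [3]
```

The caller can then feed the leftovers back into whatever processes the tail of the longer input, which is exactly the item that goes missing in the scenario that started this thread.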
Re: Filename case-insensitivity on OS X
On Tue, 3 Jan 2006, Scott David Daniels wrote: Tom Anderson wrote: Java has a java.io.File.getCanonicalPath method that does this, but i can't find an equivalent in python - is there one? What's wrong with: os.path.normcase(path) ? It doesn't work.

Hooke:~ tom$ uname
Darwin
Hooke:~ tom$ python
Python 2.4.1 (#2, Mar 31 2005, 00:05:10)
[GCC 3.3 20030304 (Apple Computer, Inc. build 1666)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> path = os.path.join(".", os.listdir(".")[0])
>>> path
'./.appletviewer'
>>> os.path.normcase(path)
'./.appletviewer'
>>> os.path.normcase(path.upper())
'./.APPLETVIEWER'

I'm not entirely sure what normcase is supposed to do - the documentation says "Normalize case of pathname. Has no effect under Posix", which is less than completely illuminating. tom -- It involves police, bailiffs, vampires and a portal to hell under a tower block in Hackney. -- http://mail.python.org/mailman/listinfo/python-list
Re: Filename case-insensitivity on OS X
On Tue, 3 Jan 2006, Dan Sommers wrote: On Tue, 03 Jan 2006 15:21:19 GMT, Doug Schwarz [EMAIL PROTECTED] wrote: Strictly speaking, it's not OS X, but the HFS file system that is case insensitive. Aaah, of course. Why on earth didn't Apple move to UFS/FFS/whatever with the switch to OS X? You can use other file systems, such as UNIX File System. Use Disk Utility to create a disk image and then erase it (again, using Disk Utility) and put UFS on it. You'll find that touch foo FOO will create two files. You may also find some native Mac OS X applications failing in strange ways. Oh, that's why. :( tom -- It involves police, bailiffs, vampires and a portal to hell under a tower block in Hackney. -- http://mail.python.org/mailman/listinfo/python-list
Re: Memoization and encapsulation
On Sat, 31 Dec 2005 [EMAIL PROTECTED] wrote: just I actually prefer such a global variable to the default arg just trick. The idiom I generally use is: just _cache = {} just def func(x): just result = _cache.get(x) just if result is None: just result = x + 1 # or a time consuming calculation... just _cache[x] = result just return result None of the responses I've seen mention the use of decorators such as the one shown here: http://wiki.python.org/moin/PythonDecoratorLibrary While wrapping one function in another is obviously a bit slower, you can memoize any function without tweaking its source. I'd definitely say this is the way to go. def memoised(fn): cache = {} def memoised_fn(*args): if args in cache: return cache[args] else: rtn = fn(*args) cache[args] = rtn return rtn return memoised_fn @memoised def func(x): return x + 1 # or a time-consuming calculation tom -- Exceptions say, there was a problem. Someone must deal with it. If you won't deal with it, I'll find someone who will. -- http://mail.python.org/mailman/listinfo/python-list
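One gap in the decorator above: it only keys the cache on positional arguments, so calls using keyword arguments would either miss the cache or collide. A sketch of an extension (the sorted-items trick makes f(a=1, b=2) and f(b=2, a=1) share an entry):

```python
def memoised(fn):
    cache = {}
    def memoised_fn(*args, **kwargs):
        # keyword items are sorted so argument order doesn't split the cache
        key = (args, tuple(sorted(kwargs.items())))
        if key not in cache:
            cache[key] = fn(*args, **kwargs)
        return cache[key]
    return memoised_fn

calls = []

@memoised
def add(x, y=0):
    calls.append((x, y))    # record real invocations, to observe caching
    return x + y

assert add(1, y=2) == 3
assert add(1, y=2) == 3
assert len(calls) == 1      # the second call was served from the cache
```

Note that add(1, 2) and add(1, y=2) still get separate cache entries, since the key records how the arguments were passed rather than how they bind; fixing that needs introspection of the function signature, which is beyond a quick sketch.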
Re: how to remove duplicated elements in a list?
On Mon, 19 Dec 2005, Brian van den Broek wrote: [EMAIL PROTECTED] said unto the world upon 2005-12-19 02:27: Steve Holden wrote: Kevin Yuan wrote: How to remove duplicated elements in a list? eg. [1,2,3,1,2,3,1,2,1,2,1,3] - [1,2,3]? Thanks!! list(set([1,2,3,1,2,3,1,2,1,2,1,3])) [1, 2, 3] Would this have the chance of changing the order ? Don't know if he wants to maintain the order or don't care though. For that worry: orig_list = [3,1,2,3,1,2,3,1,2,1,2,1,3] new_list = list(set(orig_list)) new_list.sort(cmp= lambda x,y: cmp(orig_list.index(x), orig_list.index(y))) new_list [3, 1, 2] Ah, that gives me an idea: import operator orig_list = [3,1,2,3,1,2,3,1,2,1,2,1,3] new_list = map(operator.itemgetter(1), ... filter(lambda (i, x): i == orig_list.index(x), ... enumerate(orig_list))) new_list [3, 1, 2] This is a sort of decorate-fondle-undecorate, where the fondling is filtering on whether this is the first occurrance of the the value. This is, IMHO, a clearer expression of the original intent - how can i remove such-and-such elements from a list is begging for filter(), i'd say. My code is O(N**2), a bit better than your O(N**2 log N), but we can get down to O(N f(N)), where f(N) is the complexity of set.__in__ and set.add, using a lookaside set sort of gizmo: orig_list = [3,1,2,3,1,2,3,1,2,1,2,1,3] seen = set() def unseen(x): ... if (x in seen): ... return False ... else: ... seen.add(x) ... return True ... new_list = filter(unseen, orig_list) new_list [3, 1, 2] Slightly tidier like this, i'd say: orig_list = [3,1,2,3,1,2,3,1,2,1,2,1,3] class seeingset(set): ... def see(self, x): ... if (x in self): ... return False ... else: ... self.add(x) ... return True ... new_list = filter(seeingset().see, orig_list) new_list [3, 1, 2] tom -- Hit to death in the future head -- http://mail.python.org/mailman/listinfo/python-list
Re: getopt and options with multiple arguments
On Mon, 19 Dec 2005, [EMAIL PROTECTED] wrote:

I want to be able to do something like:

myscript.py * -o outputfile

and then have the shell expand the * as usual, perhaps to hundreds of filenames. But as far as I can see, getopt can only get one argument with each option. In the above case, there isn't even an option string before the *, but even if there was, I don't know how to get getopt to give me all the expanded filenames in an option.

I'm really surprised that getopt doesn't handle this properly by default (so getopt.getopt mimics unices with crappy getopts - since when was that a feature?), but as Steven pointed out, getopt.gnu_getopt will float your boat.

I have an irrational superstitious fear of getopt, so this is what i use (it returns a list of arguments, followed by a dict mapping flags to values; it only handles long options, but uses a single dash for them, as is, for some reason, the tradition in java, where i grew up):

def arguments(argv, expand=True):
    argv = list(argv)
    args = []
    flags = {}
    while (len(argv) > 0):
        arg = argv.pop(0)
        if (arg == "--"):
            args.extend(argv)
            break
        elif (expand and arg.startswith("@")):
            if (len(arg) > 1):
                arg = arg[1:]
            else:
                arg = argv.pop(0)
            argv[0:0] = list(stripped(file(arg)))
        elif (arg.startswith("-") and (len(arg) > 1)):
            arg = arg[1:]
            if (":" in arg):
                key, value = arg.split(":", 1)
            else:
                key = arg
                value = ""
            flags[key] = value
        else:
            args.append(arg)
    return args, flags

def stripped(f):
    """Return an iterator over the strings in the iterable f in which
    strings are stripped of #-delimited comments and leading and
    trailing whitespace, and blank strings are skipped.
    """
    for line in f:
        if ("#" in line):
            line = line[:line.index("#")]
        line = line.strip()
        if (line == ""):
            continue
        yield line

As a bonus, you can say "@foo" or "@ foo" to mean "insert the lines contained in file foo in the command line here", which is handy if, say, you have a file containing a list of files to be processed, and you want to invoke a script to process them, or if you want to put some standard flags in a file and pull them in on the command line. Yes, you could use xargs for this, but this is a bit easier. If you don't want this, delete the elif block mentioning the "@", and the stripped function. A slightly neater implementation not involving list.pop also then becomes possible.

tom

-- 
Hit to death in the future head
-- 
http://mail.python.org/mailman/listinfo/python-list
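[Editor's note: for reference, a minimal sketch of the getopt.gnu_getopt behaviour Steven pointed at - the filenames here are made up for illustration:]

```python
import getopt

# gnu_getopt, unlike plain getopt, allows positional arguments to be
# intermixed with options, which is what the shell-expanded "*" needs:
# the non-option arguments are collected and returned separately
argv = ["a.txt", "b.txt", "-o", "outputfile"]
opts, args = getopt.gnu_getopt(argv, "o:")
assert opts == [("-o", "outputfile")]
assert args == ["a.txt", "b.txt"]
```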
Re: putenv
On Tue, 20 Dec 2005, Steve Holden wrote:

Mike Meyer wrote:

Terry Hancock [EMAIL PROTECTED] writes:

On Tue, 20 Dec 2005 05:35:48 -0000 Grant Edwards [EMAIL PROTECTED] wrote:

On 2005-12-20, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

I have a csh script that calls a bunch of python programs and I'd like to use env variables as kind of a global variable that I can pass around to the python scripts.

You can't change the environment of the parent process.

There is an evil trick, however: Instead of setting the environment directly, have the python program return csh code to alter the environment the way you want, then call the python code by sourcing its output:

source `my_script.py`

Does this actually work? It looks to me like you need two levels: my_script.py creates a file, then outputs the name of the file, as the csh source command reads commands from the file named as an argument. To be able to output the commands directly, you'd need to use the eval command, not the source command.

I suspect the trick that Terry was thinking of was eval, not source. You are correct in saying he'd need to create a file to source.

True. The downside of eval is that it doesn't (well, in bash, anyway) handle line breaks properly (for some value of 'properly') - it seems to treat them as linear whitespace, not line ends.

I was about to suggest:

source <(my_script.py)

as a way to use source to run the script's output, but that seems not to work. I think <() might be a bashism anyway.

tom

-- 
Hit to death in the future head
-- 
http://mail.python.org/mailman/listinfo/python-list
Re: urllib.urlopen
On Sat, 17 Dec 2005, Dennis Lee Bieber wrote: (Now there is an interesting technical term: #define ERROR_ARENA_TRASHED 7) FreeBSD at one point had an EDOOFUS; Apple kvetched about this being offensive, so it was changed to EDONTPANIC. I shitteth thee not. tom -- information distribution, vox humana, deviation, handle, feed, l.g. ** -- http://mail.python.org/mailman/listinfo/python-list
Re: const objects (was Re: Death to tuples!)
On Wed, 14 Dec 2005, Steven D'Aprano wrote: On Wed, 14 Dec 2005 10:57:05 +0100, Gabriel Zachmann wrote: I was wondering why python doesn't contain a way to make things const? If it were possible to declare variables at the time they are bound to objects that they should not allow modification of the object, then we would have a concept _orthogonal_ to data types themselves and, as a by-product, a way to declare tuples as constant lists. In an earlier thread, somebody took me to task for saying that Python doesn't have variables, but names and objects instead. I'd hardly say it was a taking to task - that phrase implies authoritativeness on my part! :) This is another example of the mental confusion that occurs when you think of Python having variables. What? What does this have to do with it? The problem here - as Christopher and Magnus point out - is the conflation in the OP's mind of the idea of a variable, and of the object referenced by that variable. He could have expressed the same confusion using your names-values-and-bindings terminology - just replace 'variable' with 'name'. The expression would be nonsensical, but it's nonsensical in the variables-objects-and-pointers terminology too. Some languages have variables. Some do not. Well, there is the lambda calculus, i guess ... tom -- The sky above the port was the colour of television, tuned to a dead channel -- http://mail.python.org/mailman/listinfo/python-list
Re: IsString
On Tue, 13 Dec 2005, Fredrik Lundh wrote:

Steve Holden wrote:

In Python a name (*not* a variable, though people do talk loosely about instance variables and class variables just to be able to use terms familiar to users of other languages) is simply *bound* to a value. The only storage that is required, therefore, is enough to hold a pointer (to the value currently bound to the name).

in tom's world, the value of an object is the pointer to the object, not the object itself,

If you meant "the value of a *variable* is a pointer to an object, not the object itself", then bingo, yes, that's what it's like in my world.

so I'm not sure he can make sense of your explanation.

The explanation makes perfect sense - i think the names-values-bindings terminology is consistent, correct and clear. It's just that i think that the variables-objects-pointers terminology is equally so, so i object to statements like "python is not pass-by-value".

tom

-- 
The sky above the port was the colour of television, tuned to a dead channel
-- 
http://mail.python.org/mailman/listinfo/python-list
Re: IsString
On Tue, 13 Dec 2005, Xavier Morel wrote: Tom Anderson wrote: In what sense are the names-bound-to-references-to-objects not variables? In the sense that a variable has various meta-informations (at least a type) No. In a statically typed language (or possibly only a manifestly typed language), a variable has a type; in an untyped language, it doesn't. while a Python name has no information. A Python name would be equivalent to a C void pointer, it can mean *any*thing and has no value/meaning by itself, only the object it references has. Quite right - so it's also equivalent to a LISP, Smalltalk or Objective C (to mention but a few) variable? tom -- The sky above the port was the colour of television, tuned to a dead channel -- http://mail.python.org/mailman/listinfo/python-list
Re: IsString
On Tue, 13 Dec 2005, Mike Meyer wrote:

You can show the same difference in behavior between Python and C (for example) without using a function call.

Really? You certainly don't do that with the code below.

Here's C:

#include <assert.h>

main()
{
    int i, *ref ;
    i = 1 ;
    ref = &i ;   /* Save identity of i */

Here, ref is a reference to a variable.

    i = 2 ;
    assert(ref == &i) ;

Here, you're comparing the addresses of variables.

}

This runs just fine; i is the same object throughout the program.

On the other hand, the equivalent Python:

i = 1
ref = id(i)    # Save the identity of i

Here, ref is a reference to a value.

i = 2
assert ref == id(i)

Here, you're comparing values.

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AssertionError

Blows up - i is no longer the same object.

Right, because the two bits of code are doing quite different things.

Python does call by reference, which means that it passes pointers to objects by value.

That's not what call by reference is - call by reference is passing pointers to *variables* by value. C is call by value, faking call by reference by passing reference values.

The real difference is that in C, you can get a reference to a variable to pass, allowing you to change the variable. In python, you can't get a reference to a name (one of the reasons we call them names instead of variables), so you can't pass a value that will let the called function change it.

Kinda. Here's a python translation of Steven's incrementing function example:

def increment(n):
    """Add one to the argument, changing it in place.

    python (rightly) doesn't have references to variables, so i will
    use a 2-tuple (namespace, name) to fake them; n should be such a
    2-tuple.
    """
    n_namespace = n[0]
    n_name = n[1]
    n_namespace[n_name] += 1

x = 1
increment((locals(), "x"))
assert x == 2

This is an evil, festering, bletcherous hack, but it is a direct translation of the use of pass-by-reference in C.
As a bonus, here's a similarly literal python translation of your C program:

i = 1
ref = i
i = 2
assert ref == i

tom

-- 
The sky above the port was the colour of television, tuned to a dead channel
-- 
http://mail.python.org/mailman/listinfo/python-list
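[Editor's note: one caveat on the (namespace, name) hack above - inside a function body, writing to locals() is not guaranteed to affect the real local variables in CPython, so the trick only works reliably at module scope, where locals() is globals(). A sketch that sidesteps this by using an explicit dict as the namespace (names here are made up):]

```python
def increment(ref):
    # ref is a (namespace, name) pair standing in for a C-style
    # pointer-to-variable; mutating an ordinary dict is always
    # well-defined, unlike writing to locals() inside a function
    namespace, name = ref
    namespace[name] += 1

ns = {"x": 1}
increment((ns, "x"))
assert ns["x"] == 2
```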
Re: IsString
On Wed, 14 Dec 2005, Steven D'Aprano wrote:

On Tue, 13 Dec 2005 15:28:32 +0000, Tom Anderson wrote:

On Tue, 13 Dec 2005, Steven D'Aprano wrote:

On Mon, 12 Dec 2005 18:51:36 -0600, Larry Bates wrote:

[snippidy-doo-dah]

I had the same thought, but reread the post. He asks if a given variable is a character or a number. I figured that even if he is coming from another language he knows the difference between a given variable and the contents of a given variable. I guess we will see ;-). This list is so good, he gets BOTH questions answered.

The problem is, Python doesn't have variables (although it is oh-so-tempting to use the word, I sometimes do myself). It has names in namespaces, and objects.

In what sense are the names-bound-to-references-to-objects not variables?

Because saying Python has variables leads to nonsense like the following: [snip]

That's why, for instance, Python is neither call by reference nor call by value, it is call by object.

No, python is call by value, and it happens that all values are pointers.

All values in Python are pointers??? Right. So when I write:

name = "spam spam spam spam"

the value of the variable name is a pointer, and not a string. Riiight.

Right.

Call by value and call by reference have established meanings in computer science,

Right.

and Python doesn't behave the same as either of them.

Wrong. Python behaves exactly like call by value, just like Smalltalk, Objective C, LISP, Java, and even C.

Consider the following function:

def modify(L):
    """Modify a list and return it."""
    L.append(None)
    return L

If I call that function:

mylist = range(10**10) # it is a BIG list
anotherlist = modify(mylist)

if the language is call by value, mylist is DUPLICATED before being passed to the function.

Wrong. The value of mylist is a pointer to a list, and that's what's passed to the function. The same analysis applies to the rest of your example.
The conceptual problem you are having is that you are conflating the object model of Python the language with the mechanism of the underlying C implementation, which does simply pass pointers around. No, i'm not, i'm really not. Thinking in terms of variables, pointers and objects is a simple, consistent and useful abstract model of computation in python. If you like, we can use the word 'reference' instead of 'pointer' - i guess a lot of people who came from C (which i didn't) are hung up on the idea that a pointer is a memory address, rather than just a conceptual thing which goes from a variable to an object; the trouble is that then we remind people of 'call by reference', and it all goes to pot. I think the background thing is the kicker here. I'm guessing you come from C, where pointers are physical and explicit, you can have a variable which really does contain an object, etc, and so for you, applying those terms to python is awkward. I come from java, where all pointers are abstract (in the sense of being opaque) and implicit, and variables only ever contain pointers (unless they're primitive - but that's an implementation detail), so the terminology carries over to python quite naturally. I'm sure this has been argued over many times here, and we still all have our different ideas, so please just ignore this post! I'd love to, but unfortunately I've already hit send on my reply. Fair enough. Sorry about all this. In future, i'm going to send posts which i *know* will generate heat but no light straight to /dev/null ... tom -- The literature, especially in recent years, has come to resemble `The Blob', growing and consuming everything in its path, and Steve McQueen isn't going to come to our rescue. -- The Mole -- http://mail.python.org/mailman/listinfo/python-list
Re: IsString
On Tue, 13 Dec 2005, Steve Holden wrote:

Tom Anderson wrote:

On Tue, 13 Dec 2005, Steven D'Aprano wrote:

On Mon, 12 Dec 2005 18:51:36 -0600, Larry Bates wrote:

[snippidy-doo-dah]

I had the same thought, but reread the post. He asks if a given variable is a character or a number. I figured that even if he is coming from another language he knows the difference between a given variable and the contents of a given variable. I guess we will see ;-). This list is so good, he gets BOTH questions answered.

The problem is, Python doesn't have variables (although it is oh-so-tempting to use the word, I sometimes do myself). It has names in namespaces, and objects.

In what sense are the names-bound-to-references-to-objects not variables?

In a very important sense, one which you should understand in order to understand the nature of Python. In C

Stop. How am i going to understand the nature of python by reading about C? Python is not C. What C does in the privacy of its own compilation unit is of no concern to us.

if you declare a variable as (for example) a character string of length 24, the compiler will generate code that allocates 24 bytes to this variable on the stack frame local to the function in which it's declared. Similarly if you declare a variable as a double-length floating point number the compiler will emit code that allocates 16 bytes on the local stack-frame.

True but irrelevant.

In Python a name [...] is simply *bound* to a value. The only storage that is required, therefore, is enough to hold a pointer (to the value currently bound to the name). Thus assignment (i.e. binding to a name, as opposed to binding to an element of a data structure) NEVER copies the object, it simply stores a pointer to the bound object in the part of the local namespace allocated to that name.

Absolutely true. I'm not saying your terminology is wrong - i'm pointing out that mine is also right.
Basically, we're both saying: In python, the universe consists of things; in order to manipulate them, programs use hands, which hold things - the program is expressed as actions on hands, which direct actions on things at runtime. Although it appears at first glance that there is a direct correspondence between hands and things, it is crucial to realise that the relationship is mediated by a holding - the hand identifies a particular holding, which in turn identifies a particular thing. So, when we make a function call, and specify hands as parameters, it is not the hands themselves, *or* the things, that get passed to the function - it's the holdings. Similarly, when we make an assignment, we are not assigning a thing - no things are touched by an assignment - but a holding, so that the hand assigned to ends up gripping a different thing. There is in fact another layer of indirection - the programmer refers to hands using strings, but this is just part of the language used to express programs textually: the correspondence between these strings and the hands they refer to is called a manual. The manual which applies at any point in a program is determined lexically - it is the manual corresponding to the function enclosing that point, or the global manual, if it is at the top level. 
Where you can substitute either of:

steves_terminology = {
    "thing": "value",
    "hand": "name",
    "hold": "are bound to",
    "holding": "binding",
    "gripping": "being bound to",
    "manual": "namespace",
}

toms_terminology = {
    "thing": "object",
    "hand": "variable",
    "hold": "point to",
    "holding": "pointer",
    "gripping": "pointing to",
    "manual": "scope",
}

Using:

def substitute(text, substitutions):
    substituands = substitutions.keys()
    # to handle substituands which are prefixes of other substituands:
    substituands.sort(lambda a, b: -cmp(len(a), len(b)))
    for substituand in substituands:
        text = text.replace(substituand, substitutions[substituand])
    return text

I'd then point out that my terminology is the one used in all other programming languages, including languages whose model is the same as python's, and so we should use it for consistency's sake. I guess the argument for your terminology is that it's less confusing to C programmers who don't realise that the * in *foo is now implicit.

It be a subtle difference, but an important one.

No, it's just spin, bizarre spin for which i can see no reason. Python has variables.

You appear very confident of your ignorance ;-)

You appear to be very liberal with your condescension.

Steering rapidly away from further ad hominem attacks ...

I'm sure this has been argued over many times here, and we still all have our different ideas, so please just ignore this post!

Couldn't! I do apologise, though, for any implication your assertions are based on ignorance because you do demonstrate quite a sophisticated
Re: IsString
On Tue, 13 Dec 2005, Steven D'Aprano wrote:

On Mon, 12 Dec 2005 18:51:36 -0600, Larry Bates wrote:

[snippidy-doo-dah]

I had the same thought, but reread the post. He asks if a given variable is a character or a number. I figured that even if he is coming from another language he knows the difference between a given variable and the contents of a given variable. I guess we will see ;-). This list is so good, he gets BOTH questions answered.

The problem is, Python doesn't have variables (although it is oh-so-tempting to use the word, I sometimes do myself). It has names in namespaces, and objects.

In what sense are the names-bound-to-references-to-objects not variables?

It be a subtle difference, but an important one.

No, it's just spin, bizarre spin for which i can see no reason. Python has variables.

That's why, for instance, Python is neither call by reference nor call by value, it is call by object.

No, python is call by value, and it happens that all values are pointers. Just like java, but without the primitive types, and like LISP, and like a load of other languages. Python's parameter passing is NO DIFFERENT to that in those languages, and those languages are ALL described as call-by-value, so to claim that python does not use call-by-reference but some random new 'call-by-object' convention is incorrect, unneccessary, confusing and silly.

/rant

I'm sure this has been argued over many times here, and we still all have our different ideas, so please just ignore this post!

tom

-- 
So the moon is approximately 24 toasters from Scunthorpe.
-- 
http://mail.python.org/mailman/listinfo/python-list
Re: Python is incredible!
On Mon, 12 Dec 2005, Xavier Morel wrote: Luis M. Gonzalez wrote: You are not the first lisper who fell inlove with Python... Check this out: http://www.paulgraham.com/articles.html Paul Graham is not in love with Python though, he's still very much in love with Lisp. He merely admits being unfaithful to Lisp from time to time (and clearly states that Python is one of the non-Lisp languages he likes best). Oh come on - he loves LISP but he plays away with python every chance he gets? What he has with LISP is a hollow sham - he's only keeping up the pretense for the children. ;) tom -- So the moon is approximately 24 toasters from Scunthorpe. -- http://mail.python.org/mailman/listinfo/python-list
Re: Python is incredible!
On Tue, 13 Dec 2005, Cameron Laird wrote:

In article [EMAIL PROTECTED], Tom Anderson [EMAIL PROTECTED] wrote:

On Mon, 12 Dec 2005, Cameron Laird wrote:

While there is indeed much to love about Lisp, please be aware that meaningful AI work has already been done in Python

Wait - meaningful AI work has been done?

I richly deserved that. As penance, I follow-up with URL: http://www.robotwisdom.com/ai/ .

I think that document actually sells AI a little short: it's true that little progress has been made with language or reasoning, but vision's actually done rather well; the recent winning of the Grand Challenge drive across the Mojave is proof of that.

But then, i don't think AI was ever really the goal of the AI movement - it was basically a time when DARPA gathered together smart, curious people, and threw torrents of resources at them to use as they pleased. We didn't get AI out of it, but we did get a hell of a lot of cool stuff. It was a bit like the Apollo programme, but without the air force dudes planting flags at the end. An AI refugee, who worked at SAIL in the 70s, recently told me "AI was always just a sandpit, now it's become a tarpit" - the clever people have moved on - because it was the environment and the opportunity to do neat stuff, rather than AI per se, that drove them.

tom

-- 
So the moon is approximately 24 toasters from Scunthorpe.
-- 
http://mail.python.org/mailman/listinfo/python-list
Re: how does exception mechanism work?
On Mon, 12 Dec 2005, it was written:

[EMAIL PROTECTED] writes:

Is this model correct or wrong? Where can I read about the mechanism behind exceptions?

Usually you push exception handlers and finally clauses onto the activation stack like you push return addresses for function calls. When something raises an exception, you scan the activation stack backwards, popping stuff from it as you scan and executing finally clauses as you find them, until you find a handler for the raised exception.

That varies an awful lot, though - AIUI, in java, the catch blocks are specified sort of in the same place as the code; a method definition consists of bytecode, a pile of metadata, and an exception table, which says 'if an exception of type x happens at a bytecode in the range a to b, jump to bytecode c'. When the exception-handling machinery is walking the stack, rather than looking at some concrete stack of exception handlers, it walks the stack of stack frames (or activation records or whatever you call them), and for each one, follows the pointer to the relevant method definition and inspects its exception table.

Finally blocks are handled by putting the finally's code right after the try's code in the normal flow of execution, then concocting an exception handler for the try block which points into the finally block, so however the try block finishes, execution goes to the finally block.

The advantage of this approach over an explicit stack of handlers is that, although unwinding the stack is perhaps a bit slower, due to having to chase more pointers to get to the exception table, there's zero work to be done to set up a try block, and since executing a try is a lot more frequent than executing a throw-catch, that's a win.

Of course, that's how the conceptual virtual machine does it; real implementations don't necessarily do that.
That said, it is a traditional superstition in java that a try block is essentially free, which would suggest that this sort of implementation is common. Indeed, i see no reason why it wouldn't be - i think the push-a-handler style seen in C/C++ implementations is only necessary because of the platform ABI, which doesn't usually mandate a standard layout for per-function metadata. tom -- limited to concepts that are meta, generic, abstract and philosophical -- IEEE SUO WG -- http://mail.python.org/mailman/listinfo/python-list
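[Editor's note: a toy Python model of the exception-table scheme described above, entirely illustrative - real JVMs do this in native code, and the frame layout here is made up. Each frame carries a table mapping a bytecode range and exception type to a handler address, and the unwinder walks frames from innermost to outermost until a matching entry is found:]

```python
class Frame(object):
    def __init__(self, method, pc, table):
        self.method = method
        self.pc = pc        # current "bytecode" offset in this frame
        self.table = table  # list of (start, end, exc_type, handler_pc)

def find_handler(stack, exc_type):
    # walk the frame stack backwards, popping frames that have no
    # matching entry, as the unwinder would
    while stack:
        frame = stack[-1]
        for start, end, caught, handler_pc in frame.table:
            if start <= frame.pc < end and issubclass(exc_type, caught):
                return frame, handler_pc
        stack.pop()
    return None, None

stack = [
    Frame("main", pc=10, table=[(0, 50, BaseException, 60)]),
    Frame("parse", pc=7, table=[(5, 9, ValueError, 20)]),
]
frame, handler = find_handler(stack, ValueError)
assert frame.method == "parse" and handler == 20

# a TypeError at the same point is not caught in parse, so the
# unwinder pops that frame and lands in main's catch-all handler
stack = [
    Frame("main", pc=10, table=[(0, 50, BaseException, 60)]),
    Frame("parse", pc=7, table=[(5, 9, ValueError, 20)]),
]
frame, handler = find_handler(stack, TypeError)
assert frame.method == "main" and handler == 60
```

Note how setting up a try costs nothing at runtime in this model: the table is static per-method data, and all the work happens only when an exception is actually raised.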
Re: Pattern matching with string and list
On Mon, 12 Dec 2005 [EMAIL PROTECTED] wrote:

I'd need to perform simple pattern matching within a string using a list of possible patterns. For example, I want to know if the substring starting at position n matches any of the strings I have in a list, as below:

sentence = "the color is $red"
patterns = ["blue", "red", "yellow"]
pos = sentence.find($)

I assume that's a typo for sentence.find('$'), rather than some new syntax i've not learned yet!

# here I need to find whether what's after 'pos' matches any of the strings of my 'patterns' list
bmatch = ismatching(sentence[pos:], patterns)

Is an equivalent of this ismatching() function existing in some Python lib?

I don't think so, but it's not hard to write:

def ismatching(target, patterns):
    for pattern in patterns:
        if target.startswith(pattern):
            return True
    return False

You don't say what bmatch should be at the end of this, so i'm going with a boolean; it would be straightforward to return the pattern which matched, or the index of the pattern which matched in the pattern list, if that's what you want.

The tough guy way to do this would be with regular expressions (in the re module); you could do the find-the-$ and the match-a-pattern bit in one go:

import re
patternsRe = re.compile(r"\$(?:(blue)|(red)|(yellow))")
bmatch = patternsRe.search(sentence)

At the end, bmatch is None if it didn't match, or a match object (from which you can get details of the match) if it did. If i was doing this myself, i'd be a bit cleaner and use non-capturing groups:

patternsRe = re.compile(r"\$(?:(?:blue)|(?:red)|(?:yellow))")

And if i did want to capture the colour string, i'd do it like this:

patternsRe = re.compile(r"\$((?:blue)|(?:red)|(?:yellow))")

If this all looks like utter gibberish, DON'T PANIC! Regular expressions are quite scary to begin with (and certainly not very regular-looking!), but they're actually quite simple, and often a very powerful tool for text processing (don't get carried away, though; regular expressions are a bit like absinthe, in that a little helps your creativity, but overindulgence makes you use perl). In fact, we can tame the regular expressions quite neatly by writing a function which generates them:

def regularly_express_patterns(patterns):
    pattern_regexps = map(
        lambda pattern: "(?:%s)" % re.escape(pattern),
        patterns)
    regexp = r"\$(" + "|".join(pattern_regexps) + ")"
    return re.compile(regexp)

patternsRe = regularly_express_patterns(patterns)

tom

-- 
limited to concepts that are meta, generic, abstract and philosophical -- IEEE SUO WG
-- 
http://mail.python.org/mailman/listinfo/python-list
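[Editor's note: to see the generated pattern in action, here is a self-contained restatement of the generator above, with a made-up sentence; re.escape guards against patterns that contain regex metacharacters:]

```python
import re

def pattern_regexp(patterns):
    # build \$((?:p1)|(?:p2)|...), escaping each pattern so that
    # metacharacters in the pattern strings are matched literally
    alternatives = "|".join("(?:%s)" % re.escape(p) for p in patterns)
    return re.compile(r"\$(" + alternatives + r")")

patternsRe = pattern_regexp(["blue", "red", "yellow"])

m = patternsRe.search("the color is $red")
assert m is not None
assert m.group(1) == "red"

# a colour without the leading "$" should not match
assert patternsRe.search("the color is red") is None
```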
Re: Python is incredible!
On Mon, 12 Dec 2005, Cameron Laird wrote: While there is indeed much to love about Lisp, please be aware that meaningful AI work has already been done in Python Wait - meaningful AI work has been done? ;) tom -- limited to concepts that are meta, generic, abstract and philosophical -- IEEE SUO WG -- http://mail.python.org/mailman/listinfo/python-list
Re: Python is incredible!
On Mon, 12 Dec 2005, Tolga wrote:

I am using Common Lisp for a while and nowadays I've heard so much about Python that finally I've decided to give it a try because

You read reddit.com, and you want to know why they switched?

Python is not very far away from the Lisp family.

That's an interesting assertion. LISP certainly had an influence on python, but i don't think it's really related - they're pretty different in fundamental ways. On the other hand, i sort of see what you mean - it has this lightweight, magical feeling, a sense of effortless power, as LISP does.

using Python is not programming, it IS fun!

+1 QOTW.

I'll be here!!!

Good to hear it - welcome!

tom

-- 
limited to concepts that are meta, generic, abstract and philosophical -- IEEE SUO WG
-- 
http://mail.python.org/mailman/listinfo/python-list
Re: Developing a network protocol with Python
On Mon, 12 Dec 2005, Laszlo Zsolt Nagy wrote:

I think to be effective, I need to use TCP_NODELAY, and manually buffered transfers.

Why?

I would like to create a general messaging object that has methods like sendinteger, recvinteger, sendstring, recvstring

Okay. So you're really developing a marshalling layer, somewhere between the transport and application layers - fair enough, there are a lot of protocols that do that.

To be more secure,

Do you really mean secure? I don't think using pickle will give you security. If you want security, run your protocol over a TLS/SSL connection. If, however, you mean robustness, then this is a reasonable thing to do - it reduces the amount of code you have to write, and so reduces the number of bugs you'll write! One thing to watch out for, though, is the compatibility of the pickling at each end - i have no idea what the backwards- and forwards-compatibility of the pickle protocols is like, but you might find that if they're on different python versions, the ends won't understand each other. Defining your own protocol down to the bits-on-the-socket level would preclude that possibility.

I think I can use this loads function to transfer more elaborate python structures:

import cStringIO
import cPickle

def loads(s):
    """Loads an object from a string.

    @param s: The string to load the object from.
    @return: The object loaded from the string.

    This function will not unpickle globals and instances.
    """
    f = cStringIO.StringIO(s)
    p = cPickle.Unpickler(f)
    p.find_global = None
    return p.load()

I don't know the pickle module, so i can't comment on the code.

Am I on the right way to develop a new protocol?

Aside from the versioning issue i mention above, you should bear in mind that using pickle will make it insanely hard to implement this protocol in any language other than python (unless someone's implemented a python pickle library in it - is there such a beast for any other language?).
Personally, i'd steer clear of doing it like this, and try to use an existing, language-neutral generic marshalling layer. XML and ASN.1 would be the obvious ones, but i wouldn't advise using either of them, as they're abominations. JSON would be a good choice: http://www.json.org/ If it's expressive enough for your objects. This is a stunningly simple format, and there are libraries for working with it for a wide range of languages. Are there any common mistakes that programmers do? The key one, i'd say, is not thinking about the future. Make sure your protocol is able to grow - use a version number, so peers can figure out what language they're talking, and perhaps an option negotiation mechanism, if you're doing anything complex enough to warrant it (hey, you could always start without it and add it in a later version!). Try to allow for addition of new commands, message types or whatever, and for extension of existing ones (within reason). Is there a howto where I can read more about this? Not really - protocol design is a bit of a black art. Someone asked about this on comp.protocols.tcp-ip a while ago: http://groups.google.co.uk/group/comp.protocols.tcp-ip/browse_thread/thread/39f810b43a6008e6/72ca111d67768b83 And didn't get much in the way of answers. Someone did point to this, though: http://www.internet2.edu/~shalunov/writing/protocol-design.html Although i don't agree with much of what that says. tom -- limited to concepts that are meta, generic, abstract and philosophical -- IEEE SUO WG -- http://mail.python.org/mailman/listinfo/python-list
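[Editor's note: a minimal sketch of the length-prefixed framing such a JSON-based protocol might use. The function names and the 4-byte big-endian length prefix are illustrative choices, not from the thread:]

```python
import json
import struct

def pack_message(obj):
    # serialise to JSON, then prefix with a 4-byte big-endian length
    # so the receiver knows exactly how many bytes make up one message
    payload = json.dumps(obj).encode("utf-8")
    return struct.pack(">I", len(payload)) + payload

def unpack_messages(data):
    # decode as many complete frames as the buffer holds, returning
    # the decoded messages and any trailing incomplete bytes
    messages = []
    while len(data) >= 4:
        (length,) = struct.unpack(">I", data[:4])
        if len(data) < 4 + length:
            break  # incomplete frame; wait for more bytes
        messages.append(json.loads(data[4:4 + length].decode("utf-8")))
        data = data[4 + length:]
    return messages, data

buf = pack_message({"cmd": "sendstring", "value": "hello"}) + pack_message([1, 2, 3])
msgs, rest = unpack_messages(buf)
assert msgs == [{"cmd": "sendstring", "value": "hello"}, [1, 2, 3]]
assert rest == b""
```

An explicit length prefix like this also leaves room for the growth the post recommends: a version byte or option block can be added inside the JSON payload without disturbing the framing.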
Re: OO in Python? ^^
On Mon, 12 Dec 2005, Bengt Richter wrote: On Mon, 12 Dec 2005 01:12:26 +, Tom Anderson [EMAIL PROTECTED] wrote: -- ø¤º°`°º¤øø¤º°`°º¤øø¤º°`°º¤øø¤º°`°º¤ø [OT} (just taking liberties with your sig ;-) ,@ °º¤øø¤º°`°º¤øø¤º°P`°º¤ø,,y,,ø¤º°t`°º¤ø,,h,,ø¤º°o`°º¤ø,,n,,ø¤º° The irony is that with my current news-reading setup, i see my own sig as a row of question marks, seasoned with backticks and commas. Your modification looks like it's adding a fish; maybe the question marks are a kelp bed, which the fish is exploring for food. Hmm. Maybe if i look at it through Google Groups ... Aaah! Very good! However, given the context, i think it should be: ,OO °º¤øø¤º°`°º¤øø¤º°P`°º¤ø,,y,,ø¤º°t`°º¤ø,,h,,ø¤º°o`°º¤ø,,n,,ø¤º° ! tom -- limited to concepts that are meta, generic, abstract and philosophical -- IEEE SUO WG-- http://mail.python.org/mailman/listinfo/python-list
Re: OO in Python? ^^
On Mon, 12 Dec 2005, Donn Cave wrote: In article [EMAIL PROTECTED], [EMAIL PROTECTED] (Alex Martelli) wrote: Tom Anderson [EMAIL PROTECTED] wrote: ... For example, if i wrote code like this (using python syntax): def f(x): return 1 + x The compiler would think "well, he takes some value x, and he adds it to 1, and 1 is an integer, and the only thing you can add to an integer is another integer, so x must be an integer; he returns whatever 1 + x works out to, and 1 and x are both integers, and adding two integers makes an integer, so the return type must be integer" hmmm, not exactly -- Haskell's not QUITE as strongly/rigidly typed as this... you may have in mind CAML, which AFAIK in all of its variations (O'CAML being the best-known one) *does* constrain + so that the only thing you can add to an integer is another integer. In Haskell, + can sum any two instances of types which meet typeclass Num -- including at least floats, as well as integers (you can add more types to a typeclass by writing the required functions for them, too). Therefore (after loading in ghci a file with f x = x + 1), we can verify...:

    *Main> :type f
    f :: (Num a) => a -> a

But if you try f x = x + 1.0 it's f :: (Fractional a) => a -> a I asserted something like this some time ago here, and was set straight, I believe by a gentleman from Chalmers. You're right that addition is polymorphic, but that doesn't mean that it can be performed on any two instances of Num. That's what i understand. What it comes down to, i think, is that the Standard Prelude defines an overloaded + operator:

    def __add__(x: int, y: int) -> int:
        "primitive operation to add two ints"
    def __add__(x: float, y: float) -> float:
        "primitive operation to add two floats"
    def __add__(x: str, y: str) -> str:
        "primitive operation to add two strings"
    # etc

So that when the compiler hits the expression x + 1, it has a finite set of possible interpretations for '+', of which only one is legal - addition of two integers to yield an integer. 
Or rather, given that 1 can be an int or a float, it decides that x could be either, and so calls it alpha, where alpha is a number. Or something. While we're on the subject of Haskell - if you think python's syntactically significant whitespace is icky, have a look at Haskell's 'layout' - i almost wet myself in terror when i saw that! tom -- limited to concepts that are meta, generic, abstract and philosophical -- IEEE SUO WG -- http://mail.python.org/mailman/listinfo/python-list
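To make that resolution step concrete, here's a toy model of it in Python - the overload table and the inference function are invented purely for illustration, not how any real compiler is organised:

```python
# A hypothetical finite overload table for '+':
# (left operand type, right operand type) -> result type
OVERLOADS = {
    (int, int): int,
    (float, float): float,
    (str, str): str,
}

def infer_left_operand(right_type):
    # Given that the right operand of '+' has a known type, list the
    # legal (left, right, result) interpretations that remain.
    return [(l, r, res) for (l, r), res in OVERLOADS.items() if r is right_type]

# For "x + 1" with 1 taken as an int, only one interpretation survives,
# so x must be an int and the result is an int:
solutions = infer_left_operand(int)
```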
Re: ANN: Dao Language v.0.9.6-beta is release!
On Sun, 11 Dec 2005, Steven D'Aprano wrote: On Sat, 10 Dec 2005 16:34:13 +, Tom Anderson wrote: On Sat, 10 Dec 2005, Sybren Stuvel wrote: Zeljko Vrba enlightened us with: Find me an editor which has folds like in VIM, regexp search/replace within two keystrokes (ESC,:), marks to easily navigate text in 2 keystrokes (mx, 'x), can handle indentation-level matching as well as VIM can handle {}()[], etc. And, unlike emacs, respects all (not just some) settings that are put in its config file. Something that works satisfactorily out-of-the box without having to learn a new programming language/platform (like emacs). Found it! VIM! ED IS THE STANDARD TEXT EDITOR. Huh! *Real* men edit their text files by changing bits on the hard disk by hand with a magnetized needle. Hard disk? HARD DISK? Hard disks are for losers who can't write tight code. *Real* men keep everything in core. Unless it's something performance-critical, in which case they fit it in the cache. tom -- ø¤º°`°º¤øø¤º°`°º¤øø¤º°`°º¤øø¤º°`°º¤ø-- http://mail.python.org/mailman/listinfo/python-list
Re: OO in Python? ^^
On Mon, 12 Dec 2005, Steven D'Aprano wrote: On Sun, 11 Dec 2005 05:48:00 -0800, bonono wrote: And I don't think Haskell make the programmer do a lot of work (just because of its static type checking at compile time). I could be wrong, but I think Haskell is *strongly* typed (just like Python), not *statically* typed. Haskell is strongly and statically typed - very strongly and very statically! However, what it's not is manifestly typed - you don't have to put the types in yourself; rather, the compiler works it out. For example, if i wrote code like this (using python syntax): def f(x): return 1 + x The compiler would think "well, he takes some value x, and he adds it to 1, and 1 is an integer, and the only thing you can add to an integer is another integer, so x must be an integer; he returns whatever 1 + x works out to, and 1 and x are both integers, and adding two integers makes an integer, so the return type must be integer", and concludes that you meant (using Guido's notation):

    def f(x: int) -> int:
        return 1 + x

Note that this still buys you type safety:

    def g(a, b):
        c = "{" + a + "}"
        d = 1 + b
        return c + d

The compiler works out that c must be a string and d must be an int, then, when it gets to the last line, finds an expression that must be wrong, and refuses to accept the code. This sounds like it wouldn't work for complex code, but somehow, it does. And somehow, it works for: def f(x): return x + 1 Too. I think this is due to the lack of polymorphic operator overloading. 
A key thing is that Haskell supports, and makes enormous use of, a powerful system of generic types; with:

    def h(a):
        return a + a

There's no way to infer concrete types for h or a, so Haskell gets generic; it says "okay, so i don't know what type a is, but it's got to be something, so let's call it alpha; we're adding two alphas, and one thing i know about adding is that adding two things of some type makes a new thing of that type, so the type of some-alpha + some-alpha is alpha, so this function returns an alpha". ISTR that alpha gets written 'a, so this function is:

    def h(a: 'a) -> 'a:
        return a + a

Although that syntax might be from ML. This extends to more complex cases, like:

    def i(a, b):
        return [a, b]

In Haskell, you can only make lists of a homogeneous type, so the compiler deduces that, although it doesn't know what type a and b are, they must be the same type, and the return value is a list of that type:

    def i(a: 'a, b: 'a) -> ['a]:
        return [a, b]

And so on. I don't know Haskell, but i've had long conversations with a friend who does, which is where i've got this from. IANACS, and this could all be entirely wrong! At least the What Is Haskell? page at haskell.org describes the language as strongly typed, non-strict, and allowing polymorphic typing. When applied to functional languages, 'strict' (or 'eager') means that expressions are evaluated as soon as they are formed; 'non-strict' (or 'lazy') means that expressions can hang around as expressions for a while, or even not be evaluated all in one go. Laziness is really a property of the implementation, not the language - in an idealised pure functional language, i believe that a program can't actually tell whether the implementation is eager or lazy. However, it matters in practice, since a lazy language can do things like manipulate infinite lists. tom -- ø¤º°`°º¤øø¤º°`°º¤øø¤º°`°º¤øø¤º°`°º¤ø-- http://mail.python.org/mailman/listinfo/python-list
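For what it's worth, those generic signatures have a rough parallel in modern Python's typing notation - this sketch only records the constraints in annotations, it doesn't get you Haskell's compile-time checking:

```python
from typing import List, TypeVar

A = TypeVar("A")  # plays the role of Haskell's 'a

def h(a: A) -> A:
    # "adding two alphas gives an alpha" - the annotation records the
    # intent; Python itself won't verify it statically.
    return a + a

def i(a: A, b: A) -> List[A]:
    # Both arguments must be the same type; the result is a list of it.
    return [a, b]

doubled = h(3)
pair = i(1, 2)
```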
Re: ANN: Dao Language v.0.9.6-beta is release!
On Sat, 10 Dec 2005, Sybren Stuvel wrote: Zeljko Vrba enlightened us with: Find me an editor which has folds like in VIM, regexp search/replace within two keystrokes (ESC,:), marks to easily navigate text in 2 keystrokes (mx, 'x), can handle indentation-level matching as well as VIM can handle {}()[], etc. And, unlike emacs, respects all (not just some) settings that are put in its config file. Something that works satisfactorily out-of-the box without having to learn a new programming language/platform (like emacs). Found it! VIM! ED IS THE STANDARD TEXT EDITOR. tom -- Argumentative and pedantic, oh, yes. Although it's properly called correct -- Huge -- http://mail.python.org/mailman/listinfo/python-list
Re: How to get the extension of a filename from the path
On Thu, 8 Dec 2005, gene tani wrote: Lad wrote: what is a way to get the the extension of a filename from the path? minor footnote: windows paths can be raw strings for os.path.split(), or you can escape / tho Tom's examp indicates unescaped, non-raw string works with splitext() DOH. Yes, my path's got a tab in it, hasn't it! tom -- Women are monsters, men are clueless, everyone fights and no-one ever wins. -- cleanskies -- http://mail.python.org/mailman/listinfo/python-list
Re: Encoding of file names
On Thu, 8 Dec 2005, Martin v. Löwis wrote: utabintarbo wrote: Fredrik, you are a God! Thank You^3. I am unworthy /ass-kiss-mode For all those who followed this thread, here is some more explanation: Apparently, utabintarbo managed to get U+2592 (MEDIUM SHADE, a filled 50% grayish square) and U+2524 (BOX DRAWINGS LIGHT VERTICAL AND LEFT, a vertical line in the middle, plus a line from that going left) into a file name. How he managed to do that, I can only guess: most likely, the Samba installation assumes that the file system encoding on the Solaris box is some IBM code page (say, CP 437 or CP 850). If so, the byte on disk would be \xb4. Where this came from, I have to guess further: perhaps it is ACUTE ACCENT from ISO-8859-*. Anyway, when he used listdir() to get the contents of the directory, Windows applies the CP_ACP encoding (known as mbcs in Python). For reasons unknown to me, the US and several European versions of XP map this to \xa6, VERTICAL BAR (I can somewhat see that as meaningful for U+2524, but not for U+2592). So when he then applies isfile to that file name, \xa6 is mapped to U+00A6, which then isn't found on the Samba side. So while Unicode here is the solution, the problem is elsewhere; most likely in a misconfiguration of the Samba server (which assumes some encoding for the files on disk, yet the AIX application uses a different encoding). Isn't the key thing that Windows is applying a non-roundtrippable character encoding? If i've understood this right, Samba and Windows are talking in unicode, with these (probably quite spurious, but never mind) U+25xx characters, and Samba is presenting a quite consistent view of the world: there's a file called "double bucky backslash grey box" in the directory listing, and if you ask for a file called "double bucky backslash grey box", you get it. 
Windows, however, maps that name to the 8-bit string "double bucky backslash vertical bar", but when you pass *that* back to it, it gets encoded as the unicode string "double bucky backslash vertical bar", which Samba then doesn't recognise. I don't know what Windows *should* do here. I know it shouldn't do this - this leads to breaking of some very basic invariants about files and directories, and so the kind of confusion utabintarbo suffered. The solution is either to apply an information-preserving encoding (UTF-8, say), or to refuse to do it at all (ie, raise an error if there are unencodable characters), neither of which are particularly beautiful solutions. I think Windows is in a bit of a rock/hard place situation here, poor thing. Incidentally, for those who haven't come across CP_ACP before, it's not yet another character encoding, it's a pseudovalue which means 'the system's current default character set'. tom -- Women are monsters, men are clueless, everyone fights and no-one ever wins. -- cleanskies-- http://mail.python.org/mailman/listinfo/python-list
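The round-trip point can be demonstrated in a couple of lines - using U+2592 from Martin's diagnosis, and Latin-1 standing in for whichever lossy 8-bit encoding gets applied:

```python
name = "\u2592"  # MEDIUM SHADE - no Latin-1 equivalent

# A lossy mapping silently substitutes another character, so the name
# no longer round-trips - the invariant-breaking behaviour described above:
lossy = name.encode("latin-1", "replace").decode("latin-1")

# An information-preserving encoding like UTF-8 round-trips exactly:
lossless = name.encode("utf-8").decode("utf-8")
```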
Validating an email address
Hi all, A hoary old chestnut this - any advice on how to syntactically validate an email address? I'd like to support both the display-name-and-angle-bracket and bare-address forms, and to allow everything that RFC 2822 allows (and nothing more!). Currently, i've got some regexps which recognise a common subset of possible addresses, but it would be nice to do this properly - i don't currently support quoted pairs, quoted strings, or whitespace in various places where it's allowed. Adding support for those things using regexps is really hard. See: http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html For a level to which i am not prepared to stoop. I hear the email-sig are open to adding a validation function to the email package, if a satisfactory one can be written; i would definitely support their doing that. tom -- Women are monsters, men are clueless, everyone fights and no-one ever wins. -- cleanskies -- http://mail.python.org/mailman/listinfo/python-list
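In the meantime, short of implementing the full RFC 2822 grammar, the standard library's parseaddr will at least separate the display-name-and-angle-bracket form from the bare address - a partial structural check only, emphatically not real validation:

```python
from email.utils import parseaddr

def plausible_address(s):
    # parseaddr splits 'Display Name <local@domain>' into its two parts.
    # It does NOT enforce the RFC 2822 grammar, so this only rejects
    # grossly malformed input.
    realname, addr = parseaddr(s)
    # require an '@' strictly inside the addr-spec
    return "@" in addr[1:-1]

good = plausible_address("Tom Anderson <tom@example.org>")
bad = plausible_address("not an address")
```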
Re: heartbeats
On Fri, 9 Dec 2005, Sybren Stuvel wrote: Yves Glodt enlightened us with: In detail I need a daemon on my central server which e.g. which in a loop pings (not really ping but you know what I mean) each 20 seconds one of the clients. Do you mean pings one client every 20 sec, or each client every 20 sec? You probably mean really a ping, just not an ICMP echo request. What's a real ping, if not an ICMP echo request? That's pretty much the definitive packet for internetwork groping as far as i know. I think that the more generic sense of ping is a later meaning (BICouldVeryWellBW). My central server, and this is important, should have a short timeout. If one client does not respond because it's offline, after max. 10 seconds the central server should continue with the next client. I'd write a single function that pings a client and waits for a response/timeout. It then should return True if the client is online, and False if it is offline. You can then use a list of clients and the filter() function, to retrieve a list of online clients. That sounds like a good plan. To do the timeouts, you want the settimeout method on socket:

    import socket

    def default_validate(sock):
        return True

    def ping(host, port, timeout=10.0, validate=default_validate):
        """Ping a specified host on the specified port. The timeout (in
        seconds) and a validation function can be set; the validation
        function should accept a freshly opened socket and return True
        if it's okay, and False if not. This function returns True if
        the specified target can be connected to and yields a valid
        socket, and False otherwise."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.settimeout(timeout)
        try:
            sock.connect((host, port))
        except socket.error:
            return False
        ok = validate(sock)
        sock.close()
        return ok

A potential problem with this is that in the worst case, you'll be spending a little over ten seconds on each socket; if you have a lot of sockets, that might mean you're not getting through them fast enough. 
There are two ways round this: handle several pings in parallel using threads, or use non-blocking sockets to handle several at once with a single thread. tom -- everything from live chats and the Web, to the COOLEST DISGUSTING PORNOGRAPHY AND RADICAL MADNESS!! -- http://mail.python.org/mailman/listinfo/python-list
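The threaded way round might look something like this - a sketch that wraps a minimal version of the connect-based ping above and fans it out over one thread per target (the function names here are invented):

```python
import socket
import threading

def ping(host, port, timeout=10.0):
    # Minimal connect-based ping, as in the post: True if we can connect.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
    except socket.error:
        return False
    sock.close()
    return True

def ping_all(targets, timeout=10.0):
    # Ping every (host, port) pair in parallel, one thread each, so the
    # total wall-clock cost is one timeout rather than one per target.
    results = {}
    def worker(target):
        results[target] = ping(target[0], target[1], timeout)
    threads = [threading.Thread(target=worker, args=(t,)) for t in targets]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Each thread writes to a distinct key of the results dict, so no further locking is needed here.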
Re: Encoding of file names
On Fri, 9 Dec 2005, Martin v. Löwis wrote: Tom Anderson wrote: Isn't the key thing that Windows is applying a non-roundtrippable character encoding? This is a fact, but it is not a key thing. Of course Windows is applying a non-roundtrippable character encoding. What else could it do? Well, i'm no great thinker, but i'd say that errors should never pass silently, and that in the face of ambiguity, one should refuse the temptation to guess. So, as i said in my post, if the name couldn't be translated losslessly, an error should be raised. I don't know what Windows *should* do here. I know it shouldn't do this - this leads to breaking of some very basic invariants about files and directories, and so the kind of confusion utabintarbo suffered. It always did this, and always will. Applications should stop using the *A versions of the API. Absolutely true. If they continue to do so, they will continue to get bogus results in border cases. No. The availability of a better alternative is not an excuse for gratuitous breakage of the worse alternative. tom -- Whose house? Run's house!-- http://mail.python.org/mailman/listinfo/python-list
Re: Validating an email address
On Sat, 10 Dec 2005, Ben Finney wrote: Tom Anderson [EMAIL PROTECTED] wrote: A hoary old chestnut this - any advice on how to syntactically validate an email address? Yes: Don't. URL:http://www.apps.ietf.org/rfc/rfc3696.html#sec-3 The IETF must have updated that RFC between you posting the link and me reading it, because that's not what it says. What it says is that the syntax for local parts is complicated, and many of the variations are actually used for reasons i can't even imagine, so they should be permitted. It doesn't say anything about not validating the local part against that syntax. Please, don't attempt to validate the local-part. It's not up to you to decide what the receiving MTA will accept as a local-part, Absolutely not - it's up to the IETF, and their decision is recorded in RFC 2822. tom -- Whose house? Run's house! -- http://mail.python.org/mailman/listinfo/python-list
Re: How to get the extension of a filename from the path
On Thu, 8 Dec 2005, Lad wrote: what is a way to get the the extension of a filename from the path? E.g., on my XP windows the path can be C:\Pictures\MyDocs\test.txt and I would like to get the the extension of the filename, that is here txt You want os.path.splitext:

    >>> import os
    >>> os.path.splitext("C:\Pictures\MyDocs\test.txt")
    ('C:\\Pictures\\MyDocs\test', '.txt')
    >>> os.path.splitext("C:\Pictures\MyDocs\test.txt")[1]
    '.txt'

I would like that to work on Linux also It'll be fine. tom -- [Philosophy] is kind of like being driven behind the sofa by Dr Who - scary, but still entertaining. -- Itchyfidget -- http://mail.python.org/mailman/listinfo/python-list
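One wrinkle with that example, picked up elsewhere in the thread: in a plain string literal the \t in \test.txt is an escape sequence, so the path quietly grows a tab. Raw strings sidestep it - a quick sketch, using ntpath so the Windows splitting rules apply on any platform:

```python
import ntpath  # Windows path semantics, importable everywhere

plain = "C:\Pictures\MyDocs\test.txt"   # the \t has become a literal tab here
raw = r"C:\Pictures\MyDocs\test.txt"    # raw string: backslashes preserved

has_tab = "\t" in plain
ext = ntpath.splitext(raw)[1]
```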
Re: Tabs bad (Was: ANN: Dao Language v.0.9.6-beta is release!)
On Sun, 4 Dec 2005, [utf-8] Björn Lindström wrote: This article should explain it: http://www.jwz.org/doc/tabs-vs-spaces.html Ah, Jamie Zawinski, that well-known fount of sane and reasonable ideas. It seems to me that the tabs-vs-spaces thing is really about who controls the indentation: with spaces, it's the writer, and with tabs, it's the reader. Does that match up with people's attitudes? Is it the case that the space cadets want to control how their code looks to others, and the tabulators want to control how others' code looks to them? I wonder if there's a further correlation between preferring spaces to tabs and the GPL to the BSDL ... tom Lexicographical PS: 'tabophobia' is, apparently, fear of the neurodegenerative disorder tabes dorsalis. -- 3118110161 Pies-- http://mail.python.org/mailman/listinfo/python-list
Re: ANN: Dao Language v.0.9.6-beta is release!
On Sun, 4 Dec 2005 [EMAIL PROTECTED] wrote: you're about 10 years late The same could be said for hoping that the GIL will be eliminated. Utterly hopeless. Until... there was PyPy. Maybe now it's not so hopeless. No - structuring by indentation and the global lock are entirely different kettles of fish. The lock is an implementation detail, not part of the language, and barely even perceptible to users; indeed, Jython and IronPython, i assume, don't even have one. Structuring by indentation, on the other hand, is a part of the language, and a very fundamental one, at that. Python without structuring by indentation *is not* python. Which is not to say that it's a bad idea - if it really is scaring off potential converts, then a dumbed-down dialect of python which uses curly brackets and semicolons might be a useful evangelical tool. tom -- 3118110161 Pies -- http://mail.python.org/mailman/listinfo/python-list
Re: ANN: Dao Language v.0.9.6-beta is release!
On Fri, 2 Dec 2005, [EMAIL PROTECTED] wrote: Dave Hansen wrote: TAB characters are evil. They should be banned from Python source code. The interpreter should stop translation of code and throw an exception when one is encountered. Seriously. At least, I'm serious when I say that. I've never seen TAB characters solve more problems than they cause in any application. But I suspect I'm a lone voice crying in the wilderness. Regards, You're not alone. I still don't get why there is still people using real tabs as indentation. I use real tabs. To me, it seems perfectly simple - i want the line to be indented a level, so i use a tab. That's what tabs are for. And i've never, ever come across any problem with using tabs. Spaces, on the other hand, can be annoying: using spaces means that the author's personal preference about how wide a tab should be gets embedded in the code, so if that's different to mine, i end up having to look at weird code. Navigating and editing the code with arrow-keys under a primitive editor, which one is sometimes forced to do, is also slower and more error-prone. So, could someone explain what's so evil about tabs? tom -- Space Travel is Another Word for Love! -- http://mail.python.org/mailman/listinfo/python-list
Re: Comparison problem
Chris, as well as addressing what i think is causing your problem, i'm going to point out some bits of your code that i think could be polished a little. It's intended in a spirit of constructive criticism, so i hope you don't mind! On Sat, 26 Nov 2005, Chris wrote: if item[0:1]=="-": item[0:1] seems a rather baroque way of writing item[0]! I'd actually suggest writing this line like this: if item.startswith("-"): As i feel it's more readable. item=item[ :-7] item=item[1:] You could just write: item = item[1:-7] For those two lines. infile=open(inventory,"r") The "r" isn't necessary - reading is the default mode for files. You could argue that this documents your intentions towards the file, i suppose, but the traditional python idiom would leave it out. while infile: dummy=infile.readline() The pythonic idiom for this is: for dummy in infile: Although i'd strongly suggest you change 'dummy' to a more descriptive variable name; i use 'line' myself. Now, this is also the line that i think is at the root of your trouble: readline returns lines with the line-terminator ('\n' or whatever it is on your system) still on them. That gets you into trouble later - see below. When i'm iterating over lines in a file, the first thing i do with the line is chomp off any trailing newline; the line after the for loop is typically: line = line.rstrip("\n") if dummy=='':break You don't by any chance mean 'continue' here, do you? print item print ", "+dummy if (dummy == item): This comparison isn't working This is where it all falls down - i suspect that what's happening here is that dummy has a trailing newline, and item doesn't, so although they look very similar, they're not the same string, so the comparison comes out false. Try throwing in that rstrip at the head of the loop and see if it fixes it. HTH. tom -- Gotta treat 'em mean to make 'em scream. -- http://mail.python.org/mailman/listinfo/python-list
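Putting those suggestions together, the read loop might come out something like this (the function and variable names are mine, just for the sketch):

```python
def read_items(path):
    # Iterate over the file directly, chomping the trailing newline off
    # each line first so later comparisons against plain strings work.
    items = []
    with open(path) as infile:
        for line in infile:
            line = line.rstrip("\n")
            if line == "":
                continue
            items.append(line)
    return items
```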
Re: icmp - should this go in itertools?
On Fri, 25 Nov 2005, Roy Smith wrote: Tom Anderson [EMAIL PROTECTED] wrote: It's modelled after the way cmp treats lists - if a and b are lists, icmp(iter(a), iter(b)) should always be the same as cmp(a, b). Is this any good? Would it be any use? Should this be added to itertools? Whatever happens, please name it something other than icmp. When I read icmp, I think Internet Control Message Protocol. Heh! That's a good point. The trouble is, icmp is clearly the Right Thing to call it from the point of view of itertools, continuing the pattern of imap, ifilter, izip etc. Wouldn't it be clear from context that this was nothing to do with ICMP? tom -- Gotta treat 'em mean to make 'em scream. -- http://mail.python.org/mailman/listinfo/python-list
Re: icmp - should this go in itertools?
On Sat, 26 Nov 2005, Diez B. Roggisch wrote: Tom Anderson wrote: Is this any good? Would it be any use? Should this be added to itertools? Whilst not a total itertools-expert myself, I have one little objection with this: the comparison won't let me know how many items have been consumed. And I end up with two streams that lack some common prefix plus one field. Good point. It would probably only be useful if you didn't need to do anything with the iterators afterwards. One option - which is somewhat icky - would be to encode that in the return value; if n is the number of items read from both iterators, then if the first argument is smaller, the return value is -n, and if the second is smaller, it's n. The trouble is that you couldn't be sure exactly how many items had been read from the larger iterator - it could be n, if the values in the iterators differ, or n+1, if the values were the same but the larger one was longer. I'm just not sure if there is any usecase for that. I used it in my ordered dictionary implementation; it was a way of comparing two 'virtual' lists that are lazily generated on demand. I'll go away and think about this more. tom -- Gotta treat 'em mean to make 'em scream. -- http://mail.python.org/mailman/listinfo/python-list
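For reference, a sketch of the icmp under discussion - it compares two iterators the way cmp() compared lists, returning -1, 0 or 1 (reconstructed here with zip_longest, since modern Python has no cmp):

```python
from itertools import zip_longest

_MISSING = object()

def icmp(a, b):
    # Lexicographic three-way comparison of two iterators. Note the
    # caveat from the thread: both iterators get partially consumed.
    for x, y in zip_longest(a, b, fillvalue=_MISSING):
        if x is _MISSING:
            return -1  # a ran out first, so it compares smaller
        if y is _MISSING:
            return 1
        if x != y:
            return -1 if x < y else 1
    return 0
```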
Re: Comparison problem
On Sat, 26 Nov 2005, Peter Hansen wrote: Tom Anderson wrote: On Sat, 26 Nov 2005, Chris wrote: if item[0:1]=="-": item[0:1] seems a rather baroque way of writing item[0]! I'd actually suggest writing this line like this: Actually, it's not so much baroque as it is safe... item[0] will fail if the string is empty, while item[0:1] will return '' in that case. Ah i didn't realise that. Whether that's safe rather depends on what the subsequent code does with an empty string - an empty string might be some sort of error (in this particular case, it would mean that the loop test had gone wrong, since bool('') == False), and the slicing behaviour would constitute silent passing of an error. But, more importantly, egad! What's the thinking behind having slicing behave like that? Anyone got any ideas? What's the use case, as seems to be the fashionable way of putting it these days? :) tom -- This should be on ox.boring, shouldn't it? -- http://mail.python.org/mailman/listinfo/python-list
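The difference is easy to see at the interpreter: slicing clamps out-of-range indices to the string's bounds, while plain indexing raises.

```python
s = ""

sliced = s[0:1]  # an out-of-range slice is clamped, yielding ''

try:
    s[0]         # an out-of-range index raises instead
    index_raised = False
except IndexError:
    index_raised = True
```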
Re: Why are there no ordered dictionaries?
On Wed, 23 Nov 2005, Christoph Zwerschke wrote: Alex Martelli wrote: However, since Christoph himself just misclassified C++'s std::map as ordered (it would be sorted in this new terminology he's now introducing), it seems obvious that the terminological confusion is rife. Speaking about ordered and sorted in the context of collections is not a new terminology I am introducing, but seems to be pretty common in computer science This is quite true. I haven't seen any evidence for 'rife' misunderstanding of these terms. That said ... Perhaps Pythonists are not used to that terminology, since they use the term list for an ordered collection. An ordered dictionary is a dictionary whose keys are a (unique) list. Sometimes it is also called a sequence Maybe we should call it a 'sequenced dictionary' to fit better with pythonic terminology? tom -- YOU HAVE NO CHANCE TO ARRIVE MAKE ALTERNATIVE TRAVEL ARRANGEMENTS. -- Robin May -- http://mail.python.org/mailman/listinfo/python-list
Re: Why are there no ordered dictionaries?
On Wed, 23 Nov 2005, Carsten Haese wrote: On Wed, 2005-11-23 at 15:17, Christoph Zwerschke wrote: Bengt Richter wrote: E.g., it might be nice to have a mode that assumes d[key] is d.items()[k][1] when key is an integer, and otherwise uses dict lookup, for cases where the use case is just string dict keys. I also thought about that and I think PHP has that feature, but it's probably better to withstand the temptation to do that. It could lead to an awful confusion if the keys are integers. Thus quoth the Zen of Python: Explicit is better than implicit. In the face of ambiguity, refuse the temptation to guess. With those in mind, since an odict behaves mostly like a dictionary, [] should always refer to keys. An odict implementation that wants to allow access by numeric index should provide explicitly named methods for that purpose. +1 Overloading [] to sometimes refer to keys and sometimes to indices is a really, really, REALLY bad idea. Let's have it refer to keys, and do indices either via a sequence attribute or the return value of items(). More generally, if we're going to say odict is a subtype of dict, then we have absolutely no choice but to make the methods that it inherits behave the same way as in dict - that's what subtyping means. That means not doing funky things with [], returning a copy from items() rather than a live view, etc. So, how do we provide mutatory access to the order of items? Of the solutions discussed so far, i think having a separate attribute for it - like items, a live view, not a copy (and probably being a variable rather than a method) - is the cleanest, but i am starting to think that overloading items to be a mutable sequence as well as a method is quite neat. I like it in that the it combines two things - a live view of the order and a copy of the order - that are really two aspects of one thing, which seems elegant. 
However, it does strike me as rather unpythonic; it's trying to cram a lot of functionality in an unexpected combination into one place. Sparse is better than dense and all that. I guess the thing to do is to try both out and see which users prefer. tom -- YOU HAVE NO CHANCE TO ARRIVE MAKE ALTERNATIVE TRAVEL ARRANGEMENTS. -- Robin May -- http://mail.python.org/mailman/listinfo/python-list
Re: Why are there no ordered dictionaries?
On Wed, 23 Nov 2005, Christoph Zwerschke wrote: Tom Anderson wrote: I think it would be probably the best to hide the keys list from the public, but to provide list methods for reordering them (sorting, slicing etc.). one with unusual constraints, so there should be a list i can manipulate in code, and which should of course be bound by those constraints. Think of it similar as the case of an ordinary dictionary: There is conceptually a set here (the set of keys), but you cannot manipulate it directly, but only through the according dictionary methods. Which is a shame! For an ordered dictionary, there is conceptually a list (or more specifically a unique list). Again you should not manipulate it directly, but only through methods of the ordered dictionary. This sounds at first more complicated, but is in reality more easy. For instance, if I want to put the last two keys of an ordered dict d at the beginning, I would do it as d = d[-2:] + d[:-2]. As i mentioned elsewhere, i think using [] like this is a terrible idea - and definitely not easier. With the list attribute (called sequence in odict), you would have to write: d.sequence = d.sequence[-2:] + d.sequence[:-2]. This is not only longer to write down, but you also have to know that the name of the attribute is sequence. True, but that's not exactly rocket science. I think the rules governing when your [] acts like a dict [] and when it acts like a list [] are vastly more complex than the name of one attribute. Python's strength is that you don't have to keep many details in mind because it has a small basic vocabulary and orthogonal use. No it isn't - it's in having a wide set of basic building blocks which do one simple thing well, and thus which are easy to use, but which can be composed to do more complex things. What are other examples of this kind of 'orthogonal use'? 
I prefer the ordered dictionary does not introduce new concepts or attributes if everything can be done intuitively with the existing Python methods and operators. I strongly agree. However, i don't think your overloading of [] is at all intuitive. tom -- YOU HAVE NO CHANCE TO ARRIVE MAKE ALTERNATIVE TRAVEL ARRANGEMENTS. -- Robin May -- http://mail.python.org/mailman/listinfo/python-list
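With today's insertion-ordered dicts, the reordering being argued over can at least be spelled out longhand - a sketch of the operation itself, not of the proposed odict API:

```python
def rotate_to_front(d, n=2):
    # Rebuild the dict with its last n keys moved to the front,
    # preserving each key's value.
    keys = list(d)
    new_order = keys[-n:] + keys[:-n]
    return {k: d[k] for k in new_order}

d = {"a": 1, "b": 2, "c": 3, "d": 4}
reordered = rotate_to_front(d)
```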
Re: Which License Should I Use?
On Fri, 25 Nov 2005, Robert Kern wrote: You may also want to read this Licensing HOWTO: http://www.catb.org/~esr/faqs/Licensing-HOWTO.html It's a draft, but it contains useful information. It's worth mentioning that ESR, who wrote that, is zealously pro-BSD-style-license. That's not to say that the article isn't useful and/or balanced, but it's something to bear in mind while reading it. tom -- Science runs with us, making us Gods. -- http://mail.python.org/mailman/listinfo/python-list
Re: Which License Should I Use?
On Fri, 25 Nov 2005, mojosam wrote: How do I decide on a license? You decide on what obligations you wish to impose on licensees, then pick a license which embodies those. There are basically three levels of obligation:

1. None.
2. Derivatives of the code must be open source.
3. Derivatives of the code and any other code which uses it must be open source.

By 'derivatives', i mean modified versions. By 'open source', i really mean 'under the same license as the original code'. So, the licenses corresponding to these obligations are:

1. A BSD-style license. I say 'BSD-style' because there are about a hojillion licenses which say more or less the same thing - and it's quite amazing just how many words can be spilt spelling out the absence of obligations - but the grand-daddy of them all is the BSD license: http://www.opensource.org/licenses/bsd-license.php
2. The GNU Lesser General Public License: http://www.gnu.org/copyleft/lesser.html
3. The GNU General Public License: http://www.gnu.org/copyleft/gpl.html

The GPL licenses place quite severe restrictions on the freedom of programmers using the code, but you often hear GNU people banging on about freedom - 'free software', 'free as in speech', etc. What you have to realise is that they're not talking about the freedom of the programmers, but about the freedom of the software. The logic, i think, is that the freedom of the code is the key to the freedom of the end-users: applying the GPL to your code means that other programmers will be forced to apply it to their code, which means that users of that code will get the benefits of open source. Having said all that, you can only license software if you own the copyright on it, and as has been pointed out, in this case, you might not. Are there any web sites that summarize the pros and cons? 
The GNU project has a quite useful list of licenses, with their takes on them: http://www.gnu.org/licenses/license-list.html Bear in mind that the GNU project is strongly in favour of the GPL, so they're perhaps not as positive about non-GPL licenses as would be fair. This dude's written about this a bit: http://zooko.com/license_quick_ref.html I guess I don't care too much about how other people use it. These things won't be comprehensive enough or have broad enough appeal that somebody will slap a new coat of paint on them and try to sell them. I guess I don't care if somebody incorporates them into something bigger. If somebody were to add features to them, it would be nice to get the code and keep the derivative work as open source, but I don't think that matters all that much to me. If somebody can add value and find a way of making money at it, I don't think I'd be too upset. To me, it sounds like you want a BSD-style license. But then i'm a BSD aficionado myself, so perhaps i would say that! In fact, while we're on the subject, let me plug my own license page: http://urchin.earth.li/~twic/The_Amazing_Disappearing_BSD_License.html I apply 0-clause BSD to all the code i release these days. I will be doing the bulk of the coding on my own time, because I need to be able to take these tools with me when I change employers. However, I'm sure that in the course of using these tools, I will need to spend time on the job debugging or tweaking them. I do not want my current employer to have any claim on my code in any way. Usually if you program on company time, that makes what you do a work for hire. I can't contaminate my code like that. Does that mean the GPL is the strongest defense in this situation? The license you choose has absolutely no bearing on this. Either the copyright belongs to you, in which case you're fine, or to your employer, in which case you don't have the right to license it, so it's moot. 
Let's keep the broader issue of which license will bring about the fall of Western Civilization You mean the GPL? on the other thread. Oops! tom -- Science runs with us, making us Gods. -- http://mail.python.org/mailman/listinfo/python-list
icmp - should this go in itertools?
Hi all, This is a little function to compare two iterators:

def icmp(a, b):
    for xa in a:
        try:
            xb = b.next()
            d = cmp(xa, xb)
            if (d != 0):
                return d
        except StopIteration:
            return 1
    try:
        b.next()
        return -1
    except StopIteration:
        return 0

It's modelled after the way cmp treats lists - if a and b are lists, icmp(iter(a), iter(b)) should always be the same as cmp(a, b). Is this any good? Would it be any use? Should this be added to itertools? tom -- I content myself with the Speculative part [...], I care not for the Practick. I seldom bring any thing to use, 'tis not my way. Knowledge is my ultimate end. -- Sir Nicholas Gimcrack -- http://mail.python.org/mailman/listinfo/python-list
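In Python 3 both cmp() and the iterator .next() method are gone, so the function above no longer runs as written. A sketch of the same lexicographic comparison using the modern iterator protocol (the sentinel trick stands in for catching StopIteration):

```python
def icmp(a, b):
    """Compare two iterators the way cmp() compared lists: return -1,
    0 or 1.  A Python 3 sketch of the function above -- cmp() and
    .next() are replaced by rich comparisons and next() with a
    sentinel default."""
    sentinel = object()
    for xa in a:
        xb = next(b, sentinel)
        if xb is sentinel:
            return 1          # b ran out first, so a compares greater
        if xa != xb:
            return -1 if xa < xb else 1
    # a is exhausted; if b still has items, a compares less
    return -1 if next(b, sentinel) is not sentinel else 0
```

As with the original, icmp(iter(a), iter(b)) agrees with comparing the lists a and b directly.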
Yet another ordered dictionary implementation
What up yalls, Since i've been giving it all that all over the ordered dictionary thread lately, i thought i should put my fingers where my mouth is and write one myself: http://urchin.earth.li/~twic/odict.py It's nothing fancy, but it does what i think is right. The big thing that i'm not happy with is the order list (what Larosa and Foord call 'sequence', i call 'order', just to be a pain); this is a list of keys, which for many purposes is ideal, but does mean that there are things you might want to do with the order that you can't do with normal python idioms. For example, say we wanted to move the last item in the order to be first; if this was a normal list, we'd say: od.order.insert(0, od.order.pop()) But we can't do that here - the argument to the insert is just a key, so there isn't enough information to make an entry in the dict. To make up for this, i've added move and swap methods on the list, but this still isn't idiomatic. In order to have idiomatic order manipulation, i think we need to make the order list a list of items - that is, (key, value) pairs. Then, there's enough information in the results of a pop to support an insert. This also allows us to implement the various other mutator methods on the order lists that i've had to rub out in my code. However, this does seem somehow icky to me. I can't quite put my finger on it, but it seems to violate Once And Only Once. Also, even though the above idiom becomes possible, it leads to futile remove-reinsert cycles in the dict bit, which it would be nice to avoid. Thoughts? tom -- I content myself with the Speculative part [...], I care not for the Practick. I seldom bring any thing to use, 'tis not my way. Knowledge is my ultimate end. -- Sir Nicholas Gimcrack -- http://mail.python.org/mailman/listinfo/python-list
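To make the problem concrete, here is a toy sketch (hypothetical MiniOdict, far less complete than the odict.py linked above) of the key-only order list with the explicit move workaround: since the real thing's order-list mutators have to keep the dict in sync, a plain pop-and-insert would lose the popped key's value, so reordering goes through a method that never takes the key out of the dict.

```python
class MiniOdict(dict):
    """Toy ordered dict with a key-only 'order' list and a move()
    workaround, as discussed in the post.  Illustrative only."""
    def __init__(self):
        super().__init__()
        self.order = []
    def __setitem__(self, key, value):
        if key not in self:
            self.order.append(key)
        super().__setitem__(key, value)
    def move(self, key, index):
        # pure reordering: the key never leaves the dict, so no value
        # information is needed -- unlike order.insert(0, order.pop())
        self.order.remove(key)
        self.order.insert(index, key)
    def ordered_items(self):
        return [(k, self[k]) for k in self.order]
```

Moving the last item first is then od.move(od.order[-1], 0), at the cost of it not being a normal list idiom.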
Re: Why are there no ordered dictionaries?
On Fri, 25 Nov 2005, Christoph Zwerschke wrote: Tom Anderson wrote: True, but that's not exactly rocket science. I think the rules governing when your [] acts like a dict [] and when it acts like a list [] are vastly more complex than the name of one attribute. I think it's not really rocket science either to assume that an ordered dictionary behaves like a dictionary if you access items by subscription and like a list if you use slices (since slice indexes must evaluate to integers anyway, they can only be used as indexes, not as keys). When you put it that way, it makes a certain amount of sense - [:] is always about index, and [] is always about key. It's still icky, but it is completely unambiguous. tom -- I content myself with the Speculative part [...], I care not for the Practick. I seldom bring any thing to use, 'tis not my way. Knowledge is my ultimate end. -- Sir Nicholas Gimcrack -- http://mail.python.org/mailman/listinfo/python-list
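The convention is easy to sketch: a hypothetical dict subclass where plain subscription is always by key and slicing is always by position (this demo leans on the insertion-ordered dicts of Python 3.7+ to supply the ordering).

```python
class SliceableDict(dict):
    """Sketch of the '[] is key, [:] is index' convention.  Slices
    return (key, value) pairs in insertion order; anything else is
    ordinary dict lookup.  Demonstration only."""
    def __getitem__(self, index):
        if isinstance(index, slice):
            # slice indexes must be integers, so there is no ambiguity
            return list(self.items())[index]
        return super().__getitem__(index)
```

Icky or not, d['b'] and d[-2:] can coexist without ever colliding.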
Re: Backwards compatibility [was Re: is parameter an iterable?]
On Tue, 22 Nov 2005, Steven D'Aprano wrote: Are there practical idioms for solving the metaproblem "solve problem X using the latest features where available, otherwise fall back on older, less powerful features"? For instance, perhaps I might do this:

try:
    built_in_feature
except NameError:
    # fall back on a work-around
    from backwards_compatibility import feature as built_in_feature

Do people do this or is it a bad idea? From some code i wrote yesterday, which has to run under 2.2:

try:
    True
except NameError:
    True = 1 == 1
    False = 1 == 0

Great minds think alike! As for whether it's a bad idea, well, bad or not, it certainly seems like the least worst. Are there other techniques to use? Obviously refusing to run is a solution (for some meaning of solution), it may even be a practical solution for some cases, but is it the only one? How about detecting which environment you're in, then running one of two entirely different sets of code? Rather than trying to construct modern features in the antique environment, write code for each, using the local idioms. The trouble with this is that you end up with massive duplication; you can try to factor out the common parts, but i suspect that the differing parts will be a very large fraction of the codebase. If I have to write code that can't rely on iter() existing in the language, what should I do? Can you implement your own iter()? I have no idea what python 2.0 was like, but would something like this work:

class _iterator:
    def __init__(self, x):
        self.x = x
        self.j = 0
    def next(self):
        self.j = self.j + 1
        return self.x.next()
    def __getitem__(self, i):
        if (i != self.j):
            raise ValueError, "out of order iteration"
        try:
            return self.next()
        except StopIteration:
            raise IndexError
    def __iter__(self):
        return self
    # hopefully, we don't need this, but if we do ...
    def __len__(self):
        return sys.maxint # and rely on StopIteration to stop the loop

class _listiterator(_iterator):
    def next(self):
        try:
            item = self.x[self.j]
            self.j = self.j + 1
            return item
        except IndexError:
            raise StopIteration
    def __getitem__(self, i):
        if (i != self.j):
            raise ValueError, "out of order iteration"
        self.j = self.j + 1
        return self.x[i]

import types

def iter(x):
    # if there's no hasattr, use explicit access and try-except blocks
    # handle iterators and iterables from the future
    if hasattr(x, "__iter__"):
        return _iterator(x.__iter__())
    # if there's no __getitem__ on lists, try x[0] and catch the exception
    # but leave the __getitem__ test to catch objects from the future
    if hasattr(x, "__getitem__"):
        return _listiterator(x)
    if type(x) == types.FileType:
        return _fileiterator(x) # you can imagine the implementation of this
    # insert more tests for specific types here as you like
    raise TypeError, "iteration over non-sequence"

? NB haven't actually tried to run that code. tom -- I'm angry, but not Milk and Cheese angry. -- Mike Froggatt -- http://mail.python.org/mailman/listinfo/python-list
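The try/except NameError idiom has aged well; here is the same feature-detection pattern with a present-day example - prefer the standard library's version of a helper when it exists, otherwise fall back on a hand-rolled equivalent (itertools.pairwise appeared in Python 3.10, so older interpreters take the fallback branch):

```python
# Feature detection, 2.2-style, applied to a modern library function:
# use itertools.pairwise where available, else define an equivalent.
try:
    from itertools import pairwise
except ImportError:
    def pairwise(iterable):
        # yield overlapping pairs: s -> (s0, s1), (s1, s2), ...
        it = iter(iterable)
        try:
            prev = next(it)
        except StopIteration:
            return
        for item in it:
            yield prev, item
            prev = item
```

Either way, callers just use pairwise() and never need to know which branch was taken.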
Re: Any royal road to Bezier curves...?
On Tue, 22 Nov 2005, Warren Francis wrote: For my purposes, I think you're right about the natural cubic splines. Guaranteeing that an object passes through an exact point in space will be more immediately useful than trying to create rules governing where control points ought to be placed so that the object passes close enough to where I intended it to go. Right so. I wrote that code the first time when i was in a similar spot myself - trying to draw maps with nice smooth roads etc based on a fairly sparse set of points - so i felt your pain. Thanks for the insight, I never would have found that on my own. At least not until Google labs comes out with a search engine that gives names for what you're thinking of. ;-) You're in for a wait - i think that feature's scheduled for summer 2006. I know this is a fairly pitiful request, since it just involves parsing your code, but I'm new enough to this that I'd benefit greatly from an couple of lines of example code, implementing your classes... how do I go from a set of coordinates to a Natural Cubic Spline, using your python code? Pitiful but legit - i haven't documented that code at all well. If you go right to the foot of my code, you'll find a simple test routine, which shows you the skeleton of how to drive the code. It looks a bit like this (this is slightly simplified):

def test_spline():
    knots = [(0, 0), (0, 1), (1, 0), (0, -2), (-3, 0)] # a spiral
    trace = []
    c = NaturalCubicSpline(tuples2points(knots))
    u = 0.0
    du = 0.1
    lim = len(c) + du
    while (u < lim):
        p = c(u)
        trace.append(tuple(p))
        u = u + du
    return trace

tuples2points is a helper function which turns your coordinates from a list of tuples (really, an iterable of length-2 iterables) to a list of Points. The alternative way of doing it is something like:

curve = NaturalCubicSpline()
for x, y in knot_coords:
    curve.knots.append(Point(x, y))
do_something_with(curve)

tom -- I DO IT WRONG!!! -- http://mail.python.org/mailman/listinfo/python-list
Re: user-defined operators: a very modest proposal
On Tue, 22 Nov 2005, Steve R. Hastings wrote: User-defined operators could be defined like the following: ]+[ Eeek. That really doesn't look right. Could you remind me of the reason we can't say [+]? It seems to me that an operator can never be a legal filling for an array literal or a subscript, so there wouldn't be ambiguity. We could even just say that [?] is an array version of whatever operator ? is, and let python do the heavy lifting (excuse the pun) of looping it over the operands. [[?]] would obviously be a doubly-lifted version. Although that would mean [*] is a componentwise product, rather than an outer product, which wouldn't really help you very much! Maybe we could define {?} as the generalised outer/tensor version of the ? operator ... For improved readability, Python could even enforce a requirement that there should be white space on either side of a user-defined operator. I don't really think that's necessary. Indeed, it would be extremely wrong - normal operators don't require that, and special cases aren't special enough to break the rules. Reminds me of my idea for using spaces instead of parentheses for grouping in expressions, so a+b * c+d evaluates as (a+b)*(c+d) - one of my worst ideas ever, i'd say, up there with gin milkshakes. Also, there should be a way to declare what kind of precedence the user-defined operators use. Can't be done - different uses of the same operator symbol on different classes could have different precedence, right? So python would need to know what the class of the receiver is before it can work out the evaluation order of the expression; python does evaluation order at compile time, but only knows classes at execute time, so no dice. Also, i'm pretty sure you could cook up a situation where you could exploit differing precedences of different definitions of one symbol to generate ambiguous cases, but i'm not in a twisted enough mood to actually work out a concrete example! 
And now for something completely different. For Py4k, i think we should allow any sequence of characters that doesn't mean something else to be an operator, supported with one special method to rule them all, __oper__(self, ator, and), so:

a + b

Becomes:

a.__oper__('+', b)

And:

a --{--@ b

Becomes:

a.__oper__('--{--@', b) # Euler's 'single rose' operator

Etc. We need to be able to distinguish a + -b from a +- b, but this is where i can bring my grouping-by-whitespace idea into play, requiring whitespace separating operands and operators - after all, if it's good enough for grouping statements (as it evidently is at present), it's good enough for expressions. The character ']' would be treated as whitespace, so a[b] would be handled as a.__oper__('[', b). Naturally, the . operator would also be handled through __oper__. Jeff Epler's proposal to use unicode operators would synergise most excellently with this, allowing python to finally reach, and even surpass, the level of expressiveness found in languages such as perl, APL and INTERCAL. tom -- I DO IT WRONG!!! -- http://mail.python.org/mailman/listinfo/python-list
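Tongue in cheek or not, the dispatch half of this is easy to sketch in today's Python: a toy Operand class (my name, not from the proposal) that funnels its ordinary special methods through a single __oper__(self, ator, other):

```python
class Operand:
    """Toy sketch of one-special-method-to-rule-them-all dispatch.
    Only + and * are wired up; a real implementation would need the
    parser's cooperation for arbitrary operator spellings."""
    def __init__(self, value):
        self.value = value
    def __oper__(self, ator, other):
        # the single dispatch table for every operator spelling
        ops = {'+': lambda a, b: a + b,
               '*': lambda a, b: a * b}
        return Operand(ops[ator](self.value, other.value))
    def __add__(self, other):
        return self.__oper__('+', other)
    def __mul__(self, other):
        return self.__oper__('*', other)
```

What this can't do, of course, is invent new surface syntax like --{--@; that part really would need Py4k.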
Re: user-defined operators: a very modest proposal
On Tue, 22 Nov 2005 [EMAIL PROTECTED] wrote: Each unicode character in the class 'Sm' (Symbol, Math) whose value is greater than 127 may be used as a user-defined operator. EXCELLENT idea, Jeff! Also, to accomodate operators such as u'\N{DOUBLE INTEGRAL}', which are not simple unary or binary operators, the character u'\N{NO BREAK SPACE}' will be used to separate arguments. When necessary, parentheses will be added to remove ambiguity. This leads naturally to expressions like \N{DOUBLE INTEGRAL} (y * x**2) \N{NO BREAK SPACE} dx \N{NO BREAK SPACE} dy (corresponding to the call (y*x**2).__u222c__(dx, dy)) which are clearly easy to love, except for the small issue that many inferior editors will not clearly display the \N{NO BREAK SPACE} characters. Could we use '\u2202' instead of 'd'? Or, to be more correct, is there a d-which-is-not-a-d somewhere in the mathematical character sets? It would be very useful to be able to distinguish d'x', as it were, from 'dx'. * Do we immediately implement the combination of operators with nonspacing marks, or defer it? As long as you don't use normalisation form D, i'm happy. * Should some of the unicode mathematical symbols be reserved for literals? It would be greatly preferable to write \u2205 instead of the other proposed empty-set literal notation, {-}. Perhaps nullary operators could be defined, so that writing \u2205 alone is the same as __u2205__() i.e., calling the nullary function, whether it is defined at the local, lexical, module, or built-in scope. Sounds like a good idea. \u211D and relatives would also be a candidate for this treatment. And for those of you out there who are laughing at this, i'd point out that Perl IS ACTUALLY DOING THIS. tom -- I DO IT WRONG!!! -- http://mail.python.org/mailman/listinfo/python-list
Re: Why are there no ordered dictionaries?
On Tue, 22 Nov 2005, Carsten Haese wrote: On Tue, 2005-11-22 at 14:37, Christoph Zwerschke wrote: In Foord/Larosa's odict, the keys are exposed as a public member which also seems to be a bad idea (If you alter the sequence list so that it no longer reflects the contents of the dictionary, you have broken your OrderedDict). That could easily be fixed by making the sequence a managed property whose setter raises a ValueError if you try to set it to something that's not a permutation of what it was. I'm not a managed property expert (although there's a lovely studio in Bayswater you might be interested in), but how does this stop you doing: my_odict.sequence[0] = Shrubbery() Which would break the odict good and hard. tom -- When I see a man on a bicycle I have hope for the human race. -- H. G. Wells -- http://mail.python.org/mailman/listinfo/python-list
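A sketch makes the loophole concrete (hypothetical GuardedOdict; the property names follow the discussion). The setter does reject non-permutations, exactly as suggested - but in-place mutation of the list the getter hands back never goes through the setter at all:

```python
class GuardedOdict:
    """Sketch of a managed 'sequence' property whose setter only
    accepts permutations of the current keys.  Demonstrates that
    in-place mutation of the returned list bypasses the guard."""
    def __init__(self, pairs):
        self._dict = dict(pairs)
        self._sequence = [k for k, v in pairs]
    @property
    def sequence(self):
        return self._sequence          # the loophole: a live list
    @sequence.setter
    def sequence(self, new):
        if sorted(new) != sorted(self._sequence):
            raise ValueError("not a permutation of the existing keys")
        self._sequence = list(new)
```

Reassignment is policed; my_odict.sequence[0] = Shrubbery() is not. Closing that hole is what pushes you towards returning a copy, or a guarded list subtype.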
Re: Why are there no ordered dictionaries?
On Tue, 22 Nov 2005, Christoph Zwerschke wrote: One implementation detail that I think needs further consideration is in which way to expose the keys and to mix in list methods for ordered dictionaries. In Foord/Larosa's odict, the keys are exposed as a public member which also seems to be a bad idea (If you alter the sequence list so that it no longer reflects the contents of the dictionary, you have broken your OrderedDict). I think it would be probably the best to hide the keys list from the public, but to provide list methods for reordering them (sorting, slicing etc.). I'm not too keen on this - there is conceptually a list here, even if it's one with unusual constraints, so there should be a list i can manipulate in code, and which should of course be bound by those constraints. I think the way to do it is to have a sequence property (which could be a managed attribute to prevent outright clobberation) which walks like a list, quacks like a list, but is in fact a mission-specific list subtype whose mutator methods zealously enforce the invariants guaranteeing the odict's integrity. I haven't actually tried to write such a beast, so i don't know if this is either of possible and straightforward. tom -- When I see a man on a bicycle I have hope for the human race. -- H. G. Wells -- http://mail.python.org/mailman/listinfo/python-list
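For what it's worth, here is a sketch of what such a beast might look like (hypothetical KeyList; the odict side of the bargain is not attempted). The mutators that could change the set of keys are forbidden outright; the ones that merely permute - sort, reverse, permutation slice-assignment - are allowed:

```python
class KeyList(list):
    """Mission-specific list subtype: walks and quacks like a list of
    keys, but its mutators refuse any change to the *set* of keys, so
    an owning odict's invariants would survive.  Sketch only."""
    def _forbid(self, *args, **kwargs):
        raise TypeError("operation would change the key set")
    # anything that could add or remove a key is off the menu
    append = insert = remove = pop = extend = clear = _forbid
    __delitem__ = __iadd__ = __imul__ = _forbid
    def __setitem__(self, index, value):
        # single-item assignment could only duplicate a key; slice
        # assignment is allowed iff it permutes exactly what it replaces
        if not isinstance(index, slice):
            raise TypeError("single-item assignment would duplicate a key")
        value = list(value)
        if sorted(list.__getitem__(self, index)) != sorted(value):
            raise ValueError("not a permutation of the replaced keys")
        list.__setitem__(self, index, value)
```

sort() and reverse() come along for free, since they are inherited and are pure permutations.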
Re: Why are there no ordered dictionaries?
On Tue, 22 Nov 2005, Christoph Zwerschke wrote: Fuzzyman schrieb: Of course ours is ordered *and* orderable ! You can explicitly alter the sequence attribute to change the ordering. What I actually wanted to say is that there may be a confusion between a sorted dictionary (one where the keys are automatically sorted) and an ordered dictionary (where the keys are not automatically ordered, but have a certain order that is preserved). Those who suggested that the sorted function would be helpful probably thought of a sorted dictionary rather than an ordered dictionary. Exactly. Python could also do with a sorted dict, like binary tree or something, but that's another story. tom -- When I see a man on a bicycle I have hope for the human race. -- H. G. Wells -- http://mail.python.org/mailman/listinfo/python-list
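The distinction is easy to show in code: a sorted dict keeps its keys in sort order automatically, whatever order they arrive in. A minimal sketch, with bisect on a flat key list standing in for the binary tree (O(n) inserts, fine for illustration):

```python
import bisect

class SortedDict(dict):
    """Minimal *sorted* dict sketch: iteration always yields keys in
    sort order, regardless of insertion order.  A real one would use
    a balanced tree rather than insort on a list."""
    def __init__(self):
        super().__init__()
        self._keys = []
    def __setitem__(self, key, value):
        if key not in self:
            bisect.insort(self._keys, key)   # keep keys sorted
        super().__setitem__(key, value)
    def __delitem__(self, key):
        super().__delitem__(key)
        self._keys.remove(key)
    def __iter__(self):
        return iter(self._keys)
```

An ordered dict built the Larosa-Foord way would instead yield pear, apple, mango for the insertions below - preserved order, not sorted order.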
Re: Any royal road to Bezier curves...?
On Sun, 20 Nov 2005, Warren Francis wrote: Basically, I'd like to specify a curved path of an object through space. 3D space would be wonderful, but I could jimmy-rig something if I could just get 2D... Are bezier curves really what I want after all? No. You want a natural cubic spline: http://mathworld.wolfram.com/CubicSpline.html This is a fairly simple curve, which can be fitted through a series of points (called knots) in space of any dimensionality, without the need to specify extra control points (unlike a Bezier curve), and which has the nice property of minimising the curvature of the curve - it's the shape you'd get if you ran a springy wire through your knots. It usually looks pretty good too. Google will help you find python implementations. There are other kinds of splines - Catmull-Rom, B-spline (a generalisation of a Bezier curve), Hermite - but they mostly don't guarantee to pass through the knots, which might make them less useful to you. In the opposite direction on the mathematical rigour scale, there's what i call the blended quadratic spline, which i invented as a simpler and more malleable alternative to the cubic spline. It's a piecewise parametric spline, like the cubic, but rather than calculating a series of pieces which blend together naturally, using cubics and linear algebra, it uses simple quadratic curves fitted to overlapping triples of adjacent knots, then interpolates ('blends') between them to draw the curve. It looks very like a cubic spline, but the code is simpler, and the pieces are local - each piece depends only on nearby knots, rather than on all the knots, as in a cubic spline - which is a useful property for some jobs. Also, it's straightforward to add the ability to constrain the angle at which the curve passes through a subset of the knots (you can do it for some knots, while leaving others 'natural') by promoting the pieces to cubics at the constrained knots and constraining the appropriate derivatives. 
Let me know if you want more details on this. To be honest, i'd suggest using a proper cubic spline, unless you have specific problems with it. tom -- ... a tale for which the world is not yet prepared -- http://mail.python.org/mailman/listinfo/python-list
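The construction can be made concrete in not much code. Below is a from-scratch sketch (not the code linked elsewhere in this thread) of a one-dimensional natural cubic spline with unit knot spacing: solve the standard tridiagonal system for the second derivatives at the knots - the 'natural' condition pins the end ones to zero - then evaluate the piecewise cubics. For a 2D or 3D curve you would run one of these per coordinate.

```python
def natural_spline(y):
    """Return a function s(u) interpolating the values y at integer
    knots 0..len(y)-1 with a natural cubic spline (unit spacing).
    Sketch of the standard construction via the Thomas algorithm."""
    n = len(y)
    m = n - 2                       # number of interior knots
    M = [0.0] * n                   # second derivatives; ends stay 0
    if m > 0:
        # interior equations: M[i-1] + 4*M[i] + M[i+1] = 6*y''_approx
        rhs = [6.0 * (y[j] - 2.0 * y[j + 1] + y[j + 2]) for j in range(m)]
        cp, dp = [0.0] * m, [0.0] * m
        cp[0], dp[0] = 1.0 / 4.0, rhs[0] / 4.0
        for j in range(1, m):       # forward elimination
            denom = 4.0 - cp[j - 1]
            cp[j] = 1.0 / denom
            dp[j] = (rhs[j] - dp[j - 1]) / denom
        M[m] = dp[m - 1]            # back substitution
        for j in range(m - 2, -1, -1):
            M[j + 1] = dp[j] - cp[j] * M[j + 2]
    def s(u):
        i = min(max(int(u), 0), n - 2)   # clamp to a valid segment
        t = u - i
        return (y[i] * (1 - t) + y[i + 1] * t
                + ((1 - t) ** 3 - (1 - t)) * M[i] / 6.0
                + (t ** 3 - t) * M[i + 1] / 6.0)
    return s
```

At t = 0 and t = 1 the correction terms vanish, so the curve passes exactly through the knots - the property the whole thread is about.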
Re: Why are there no ordered dictionaries?
On Sun, 20 Nov 2005, Alex Martelli wrote: Christoph Zwerschke [EMAIL PROTECTED] wrote: The 'sorted' function does not help in the case I have indicated, where I do not want the keys to be sorted alphabetically, but according to some criteria which cannot be derived from the keys themselves. Ah, but WHAT 'some criteria'? There's the rub! First insertion, last insertion, last insertion that wasn't subsequently deleted, last insertion that didn't change the corresponding value, or...??? All the requests for an ordered dictionary that i've seen on this group, and all the cases where i've needed one myself, want one which behaves like a list - order of first insertion, with no memory after deletion. Like the Larosa-Foord ordered dict. Incidentally, can we call that the Larosa-Foord ordered mapping? Then it sounds like some kind of rocket science discrete mathematics stuff, which (a) is cool and (b) will make Perl programmers feel even more inadequate when faced with the towering intellectual might of Python. Them and their Schwartzian transform. Bah! tom -- Baby got a masterplan. A foolproof masterplan. -- http://mail.python.org/mailman/listinfo/python-list
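For what it's worth, 'order of first insertion, with no memory after deletion' is exactly the behaviour Python's built-in dict later ended up guaranteeing (an implementation detail in CPython 3.6, a language guarantee from 3.7):

```python
# Modern dicts preserve insertion order with no memory after deletion:
# a key that is deleted and reinserted goes to the end, not back to
# its old slot.
d = {}
for k in ['one', 'two', 'three']:
    d[k] = k.upper()
del d['two']
d['two'] = 'TWO'
order = list(d)   # ['one', 'three', 'two']
```

So the semantics this thread converged on turned out to be the ones the language itself adopted.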
Re: Any royal road to Bezier curves...?
On Mon, 21 Nov 2005, Tom Anderson wrote: On Sun, 20 Nov 2005, Warren Francis wrote: Basically, I'd like to specify a curved path of an object through space. 3D space would be wonderful, but I could jimmy-rig something if I could just get 2D... Are bezier curves really what I want after all? No. You want a natural cubic spline: In a fit of code fury (a short fit - this is python, so it didn't take long), i ported my old java code to python, and tidied it up a bit in the process: http://urchin.earth.li/~twic/splines.py That gives you a natural cubic spline, plus my blended quadratic spline, and a framework for implementing other kinds of splines. tom -- Gin makes a man mean; let's booze up and riot! -- http://mail.python.org/mailman/listinfo/python-list
Re: running functions
On Thu, 17 Nov 2005, Scott David Daniels wrote: Gorlon the Impossible wrote: I have to agree with you there. Threading is working out great for me so far. The multiprocess thing has just baffled me, but then again I'm learning. Any tips or suggestions offered are appreciated... The reason multiprocess is easier is that you have enforced separation. Multiple processes / threads / whatever that share reads and writes into shared memory are rife with irreproducible bugs and untestable code. Processes must be explicit about their sharing (which is where the bugs occur), so those parts of the code can be examined carefully. That's a good point. If you program threads with 'shared nothing' and communication over Queues you are, in effect, using processes. If all you share is read-only memory, similarly, you are doing easy stuff and can get away with it. In all other cases you need to know things like 'which operations are indivisible' and 'what happens if I read part of this from before an update and the other after the update completes'. Right, but you have exactly the same problem with separate processes - except that with processes, having that richness of interaction is so hard, that you'll probably never do it in the first place! tom -- science fiction, old TV shows, sports, food, New York City topography, and golden age hiphop -- http://mail.python.org/mailman/listinfo/python-list
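The 'shared nothing plus Queues' style looks like this in modern Python - all communication goes through queue.Queue objects, and a None sentinel (an assumption of this sketch, not a library convention) tells the worker to stop, so there is no shared mutable state to race on:

```python
import threading
import queue

def worker(inbox, outbox):
    """Shared-nothing worker: it only ever touches its two queues."""
    for item in iter(inbox.get, None):   # None is the shutdown sentinel
        outbox.put(item * item)

inbox, outbox = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(inbox, outbox))
t.start()
for n in range(5):
    inbox.put(n)
inbox.put(None)                          # ask the worker to stop
t.join()
results = [outbox.get() for _ in range(5)]
```

Since the queues do all the locking, none of the 'which operations are indivisible' questions arise - which is exactly the point being made above.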
Re: Iterator addition
On Sun, 13 Nov 2005, Reinhold Birkenfeld wrote: [EMAIL PROTECTED] wrote: Tom Anderson: And we're halfway to looking like perl already! Perhaps a more pythonic thing would be to define a then operator: all_lines = file1 then file2 then file3 Or a chain one: all_lines = file1 chain file2 chain file3 This may just be NIH syndrome, but i like that much less - 'then' makes for something that reads much more naturally to me. 'and' would be even better, but it's taken; 'andthen' is a bit unwieldy. Besides, chain file2 is going to confuse people coming from a BASIC background :). That's certainly not better than the chain() function. Introducing new operators for just one application is not pythonic. True, but would this be for just one application? With python moving towards embracing a lazy functional style, with generators and genexps, maybe chaining iterators is a generally useful operation that should be supported at the language level. I'm not seriously suggesting doing this, but i don't think it's completely out of the question. tom -- limited to concepts that are meta, generic, abstract and philosophical -- IEEE SUO WG -- http://mail.python.org/mailman/listinfo/python-list
Re: Iterator addition
On Thu, 9 Nov 2005, it was written: [EMAIL PROTECTED] (Alex Martelli) writes: Is there a good reason to not define iter1+iter2 to be the same as If you mean for *ALL* built-in types, such as generators, lists, files, dicts, etc, etc -- I'm not so sure. Hmm, there might also be __add__ operations on the objects, that would have to take precedence over iterator addition. Iterator addition itself would have to be a special kludge like figuring out from __cmp__, etc. Yeah, I guess the idea doesn't work out that well. Oh well. How about if we had some sort of special sort of iterator which did the right thing when things were added to it? like an iterable version of The Blob:

class blob(object):
    def __init__(self, it=None):
        self.its = []
        if (it != None):
            self.its.append(iter(it))
    def __iter__(self):
        return self
    def next(self):
        try:
            return self.its[0].next()
        except StopIteration:
            # current iterator has run out!
            self.its.pop(0)
            return self.next()
        except IndexError:
            # no more iterators
            raise StopIteration
    def __add__(self, it):
        self.its.append(iter(it))
        return self
    def __radd__(self, it):
        self.its.insert(0, iter(it))
        return self

Then we could do:

all_lines = blob(file1) + file2 + file3
candidate_primes = blob((2,)) + (1+2*i for i in itertools.count(1))

Which, although not quite as neat, isn't entirely awful. Another option would be a new operator for chaining - let's use $, since that looks like the chain on the fouled anchor symbol used by navies etc: http://www.diggerhistory.info/images/badges-asstd/female-rels-navy.jpg Saying a $ b would be equivalent to chain(a, b), where chain (which could even be a builtin if you like) is defined:

def chain(a, b):
    if (hasattr(a, "__chain__")):
        return a.__chain__(b)
    elif (hasattr(b, "__rchain__")): # optional
        return b.__rchain__(a)
    else:
        return itertools.chain(a, b) # or equivalent

Whatever it is that itertools.chain or whatever returns would be modified to have a __chain__ method which behaved like blob.__add__ above. 
This then gets you: all_lines = file1 $ file2 $ file3 candidate_primes = (2,) $ (1+2*i for i in itertools.count(1)) And we're halfway to looking like perl already! Perhaps a more pythonic thing would be to define a then operator: all_lines = file1 then file2 then file3 candidate_primes = (2,) then (1+2*i for i in itertools.count(1)) That looks quite nice. The special method would be __then__, of course. tom -- if you can't beat them, build them -- http://mail.python.org/mailman/listinfo/python-list
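In modern Python the blob idea can lean on itertools.chain for all the bookkeeping. A sketch (hypothetical Chain class, using the Python 3 iterator protocol, where next() became __next__):

```python
import itertools

class Chain:
    """Python 3 sketch of the 'blob': an iterator you can extend with
    + on either side; itertools.chain does the heavy lifting."""
    def __init__(self, *iterables):
        self._it = itertools.chain(*iterables)
    def __iter__(self):
        return self._it
    def __next__(self):
        return next(self._it)
    def __add__(self, other):
        self._it = itertools.chain(self._it, iter(other))
        return self
    def __radd__(self, other):
        # list + Chain falls through to here, since list.__add__
        # returns NotImplemented for a non-list right operand
        self._it = itertools.chain(iter(other), self._it)
        return self
```

No new operator required - though admittedly file1 + file2 + file3 only works once one operand is already a Chain.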
Re: Hash map with multiple keys per value ?
On Fri, 11 Nov 2005, Chris Stiles wrote: Is there an easier and cleaner way of doing this ? Is there example code floating around that I might have a look at ? I'm not aware of a way which can honestly be called better. However, i do feel your pain about representing the alias relationships twice - it feels wrong. Therefore, i offer you an alternative implementation - represent each set as a linked list, threaded through a dict by making the value the dict holds under each key point to the next key in the alias set. Confused? No? You will be ...

class Aliases(object):
    def __init__(self, aliases=None):
        self.nexts = {}
        if (aliases != None):
            for key, value in aliases:
                self[key] = value
    def __setitem__(self, key, value):
        if ((value != None) and (value != key)):
            self.nexts[key] = self.nexts[value]
            self.nexts[value] = key
        else:
            self.nexts[key] = key
    def __getitem__(self, key):
        return list(follow(key, self.nexts))
    def __delitem__(self, key):
        cur = key
        while (self.nexts[cur] != key):
            cur = self.nexts[cur]
        if (cur != key):
            self.nexts[cur] = self.nexts[key]
        del self.nexts[key]
    def canonical(self, key):
        canon = key
        for cur in follow(key, self.nexts):
            if (cur < canon):
                canon = cur
        return canon
    def iscanonical(self, key):
        for cur in follow(key, self.nexts):
            if (cur < key):
                return False
        return True
    def iteraliases(self, key):
        cur = self.nexts[key]
        while (cur != key):
            yield cur
            cur = self.nexts[cur]
    def __iter__(self):
        return iter(self.nexts)
    def itersets(self):
        for key in self.nexts:
            if (not isprimary(key, self.nexts)):
                continue
            yield [key] + self[key]
    def __len__(self):
        return len(self.nexts)
    def __contains__(self, key):
        return key in self.nexts
    def __str__(self):
        return "Aliases" + str(list(self.itersets()))
    def __repr__(self):
        return "Aliases([" + ", ".join(str((key, self.canonical(key))) for key in sorted(self.nexts.keys())) + "])"

As i'm sure you'll agree, code that combines a complete absence of clarity with abject lack of runtime efficiency. Oh, and i haven't tested it properly. 
tom -- if you can't beat them, build them -- http://mail.python.org/mailman/listinfo/python-list
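For comparison, here is a compact Python 3 sketch of the same circular-linked-list trick, including a version of the follow helper the code above assumes but never shows (the names and the add/aliases API are mine, not from the original post):

```python
def follow(key, nexts):
    """Yield every key in the circular chain containing key, starting
    with key itself -- the helper the Aliases code relies on."""
    cur = key
    while True:
        yield cur
        cur = nexts[cur]
        if cur == key:
            break

class Aliases:
    """Each alias set is one cycle threaded through a single dict of
    'next' pointers, so the relationships are stored exactly once."""
    def __init__(self):
        self.nexts = {}
    def add(self, key, other=None):
        # splice key into other's cycle, or start a singleton cycle
        if other is not None and other != key:
            self.nexts[key] = self.nexts[other]
            self.nexts[other] = key
        else:
            self.nexts[key] = key
    def aliases(self, key):
        # everything in key's set except key itself
        return [k for k in follow(key, self.nexts) if k != key]
```

Same trick, same space saving; whether it is any clearer than two dicts is left to the reader.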