Nothing to repeat

2011-01-09 Thread Tom Anderson

Hello everyone, long time no see,

This is probably not a Python problem, but rather a regular expressions 
problem.


I want, for the sake of arguments, to match strings comprising any number 
of occurrences of 'spa', each interspersed by any number of occurrences of 
the 'm'. 'any number' includes zero, so the whole pattern should match the 
empty string.


Here's the conversation Python and i had about it:

Python 2.6.4 (r264:75706, Jun  4 2010, 18:20:16)
[GCC 4.4.4 20100503 (Red Hat 4.4.4-2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

>>> import re
>>> re.compile('(spa|m*)*')

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/re.py", line 190, in compile
    return _compile(pattern, flags)
  File "/usr/lib/python2.6/re.py", line 245, in _compile
    raise error, v # invalid expression
sre_constants.error: nothing to repeat

What's going on here? Why is there nothing to repeat? Is the problem 
having one *'d term inside another?


Now, i could actually rewrite this particular pattern as '(spa|m)*'. But 
what i neglected to mention above is that i'm actually generating patterns 
from structures of objects (representations of XML DTDs, as it happens), 
and as it stands, patterns like this are a possibility.


Any thoughts on what i should do? Do i have to bite the bullet and apply 
some cleverness in my pattern generation to avoid situations like this?
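For illustration, here's one shape that cleverness could take - a sketch
with an invented helper name, assuming the generator sees its alternatives
as a list of sub-patterns before wrapping them in a starred group:

```python
import re

# Sketch: inside a starred group, a trailing '*' on an alternative is
# redundant ('(spa|m*)*' matches exactly the same strings as '(spa|m)*'),
# so strip it before wrapping. 'starred_alternation' is an invented name.
def starred_alternation(alternatives):
    stripped = [a[:-1] if a.endswith('*') else a for a in alternatives]
    return '(?:' + '|'.join(stripped) + ')*$'

pattern = re.compile(starred_alternation(['spa', 'm*']))
```

The simplification is safe because the outer star already supplies "any
number of occurrences", so the inner star adds nothing to the language.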


Thanks,
tom

--
If it ain't broke, open it up and see what makes it so bloody special.
--
http://mail.python.org/mailman/listinfo/python-list


Parsing DTDs

2009-05-29 Thread Tom Anderson

Hello!

I would like to parse XML DTDs. The goal is to be able to validate 
XML-like object structures against DTDs in a fairly flexible way, although 
i can get from a parsed DTD to a validation engine myself, so that's not 
an essential feature of the parser (although it would be nice!). What 
should i do?


A bit of googling revealed that the xmlproc package contains a DTD parser 
that looks like it does just what i want, and that xmlproc became PyXML, 
and that PyXML is no longer maintained.


Is there a DTD parser that is being maintained? Or does it not really 
matter that PyXML is no longer maintained, given that it's not like the 
DTD spec has changed very much?


Thanks,
tom

--
Many of us adopted the File's slang as our own, feeling that we'd found a
tangible sign of the community of minds we'd half-guessed to be out there.
--
http://mail.python.org/mailman/listinfo/python-list


Re: strptime and timezones

2008-08-14 Thread Tom Anderson

On Wed, 13 Aug 2008, Christian Heimes wrote:


Tom Anderson wrote:

Secondly, do you really have to do this just to parse a date with a 
timezone? If so, that's ridiculous.


No, you don't. :) Download the pytz package from the Python package 
index. It's *the* tool for timezone handling in Python. The time zone 
definitions are not part of the Python standard library because they 
change every few months. Stupid politicians ...


My problem has absolutely nothing to do with timezone definitions. In 
fact, it involves less timezone knowledge than the time package supplies! 
The wonderful thing about RFC 1123 timestamps is that they give the 
numeric value of their timezone, so you don't have to decode a symbolic 
one or anything like that. Knowing about timezones thus isn't necessary.


The problem is simply that the standard time package doesn't think that 
way, and always assumes that a time is in your local timezone.


That said, it does look like pytz might be able to parse RFC 1123 dates. 
I'll check it out.


tom

--
Come on thunder; come on thunder.
--
http://mail.python.org/mailman/listinfo/python-list


strptime and timezones

2008-08-13 Thread Tom Anderson

Hello!

Possibly i'm missing something really obvious here. But ...

If i have a date-time string of the kind specified in RFC 1123, like this:

Tue, 12 Aug 2008 20:48:59 -0700

Can i turn that into a seconds-since-the-epoch time using the standard 
time module without jumping through substantial hoops?


Apart from the timezone, this can be parsed using time.strptime with the 
format:


%a, %d %b %Y %H:%M:%S

You can stick a %Z on the end for the timezone, but that parses timezone 
names ('BST', 'EDT'), not numeric specifiers. Also, it doesn't actually 
parse anything, it just requires that the timezone that's in the string 
matches your local timezone.


Okay, no problem, so you use a regexp to split off the timezone specifier, 
parse that yourself, then parse the raw time with strptime.


Now you just need to adjust the parsed time for the timezone. Now, from 
strptime, you get a struct_time, and that doesn't have room for a timezone 
(although it does have room for a daylight saving time flag), so you can't 
add the timezone in before you convert to seconds-since-the-epoch.


Okay, so convert the struct_time to seconds-since-the-epoch as if it were 
UTC, then apply the timezone correction. Converting a struct_time to 
seconds-since-the-epoch is done with mktime, right? Wrong! That does the 
conversion *in your local timezone*. There's no way to tell it to use any 
specific timezone, not even just UTC.


So how do you do this?

Can we convert from struct_time to seconds-since-the-epoch by hand? Well, 
the hours, minutes and seconds are pretty easy, but dealing with the date 
means doing some hairy calculations with leap years, which are doable but 
way more effort than i thought i'd be expending on parsing the date format 
found in every single email in the world.


Can we pretend the struct_time is a local time, convert it to 
seconds-since-the-epoch, then adjust it by whatever our current timezone 
is to get true seconds-since-the-epoch, *then* apply the parsed timezone? 
I think so:


def mktime_utc(tm):
    "Return what mktime would return if we were in the UTC timezone"
    return time.mktime(tm) - time.timezone

Then:

def mktime_zoned(tm, tz):
    "Return what mktime would return if we were in the timezone given by tz"
    return mktime_utc(tm) - tz

The only problem there is that mktime_utc doesn't deal with DST: if tm is 
a date for which DST would be in effect for the local timezone, then we 
need to subtract time.altzone, not time.timezone. strptime doesn't fill in 
the dst flag, as far as i can see, so we have to round-trip via 
mktime/localtime:


def isDST(tm):
    tm2 = time.localtime(time.mktime(tm))
    assert (tm2.tm_isdst != -1)
    return bool(tm2.tm_isdst)

def timezone(tm):
    if (isDST(tm)):
        return time.altzone
    else:
        return time.timezone

mktime_utc then becomes:

def mktime_utc(tm):
    return time.mktime(tm) - timezone(tm)

And you can of course inline that and eliminate a redundant call to 
mktime:


def mktime_utc(tm):
    t = time.mktime(tm)
    isdst = time.localtime(t).tm_isdst
    assert (isdst != -1)
    if (isdst):
        tz = time.altzone
    else:
        tz = time.timezone
    return t - tz

So, firstly, does that work? Answer: i've tested it a bit, and yes.

Secondly, do you really have to do this just to parse a date with a 
timezone? If so, that's ridiculous.
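As a footnote: the stdlib's email.utils module can do this parse directly,
since RFC 1123 dates are a subset of the RFC 2822 date format - a sketch,
using the example timestamp from above:

```python
import time
import email.utils

# parsedate_tz understands the numeric timezone directly; mktime_tz then
# converts to seconds-since-the-epoch without ever consulting the local
# timezone, so no mktime/altzone gymnastics are needed.
stamp = "Tue, 12 Aug 2008 20:48:59 -0700"
parsed = email.utils.parsedate_tz(stamp)  # 10-tuple; item 9 is the UTC
                                          # offset in seconds (-25200 here)
epoch = email.utils.mktime_tz(parsed)     # seconds since the epoch
```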


tom

--
102 FX 6 (goblins)
--
http://mail.python.org/mailman/listinfo/python-list


Re: Question about idioms for clearing a list

2006-02-07 Thread Tom Anderson
On Tue, 7 Feb 2006, Ben Sizer wrote:

 Raymond Hettinger wrote:
 [Steven D'Aprano]
 The Zen isn't only one way to do it. If it were, we
 wouldn't need iterators, list comps or for loops,
 because they can all be handled with a while loop (at
 various costs of efficiency, clarity or obviousness).

 del L[:] works, but unless you are Dutch, it fails the
 obviousness test.

 [Fredrik Lundh]
 unless you read some documentation, that is.  del on sequences
 and mappings is a pretty fundamental part of Python.  so are slicings.

 both are things that you're likely to need and learn long before you 
 end up in situation where you need to be able to clear an aliased 
 sequence.

I don't agree with that at all. I'd been programming python for a while (a 
year?) before i knew about del l[:].

 Likewise, the del keyword is fundamental -- if you can't get, set, and 
 del, then you need to go back to collections school.

 I have hardly used the del keyword in several years of coding in Python.

Ditto.

 Why should it magically spring to mind in this occasion? Similarly I 
 hardly ever find myself using slices, never mind in a mutable context.

 del L[:] is not obvious, especially given the existence of clear() in 
 dictionaries.

Agreed.

tom

-- 
GOLDIE LOOKIN' CHAIN [...] will ultimately make all other forms of music
both redundant and unnecessary -- ntk
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: learning python, using string help

2006-02-03 Thread Tom Anderson
On Fri, 2 Feb 2006, [EMAIL PROTECTED] wrote:

 silly newbie mistake

 your code runs fine on my openbsd box. (I didn't uncomment the return
 map(...) line)

My apologies - i should have made it clearer in the comment that it was 
hardwired to return example data!

 thanks for the awesome example!

I'm not sure how awesome it is - it's pretty simple, and probably has lots 
of bugs. Is the BSD ruptime output format the same as on HP-UX? I have a 
Mac myself, but no local machines broadcasting rwho data, so i don't get 
any output to play with when i run ruptime!

tom

-- 
hip  whizzo  teddy bear  egghead  realpolitik  tiddly-om-pom-pom
sacred cow  gene  blues  celeb  cheerio  civvy street  U-boat  tailspin
ceasefire  ad-lib  demob  pop  wizard  hem-line  lumpenproletariat  avant
garde  kitsch  sudden death  Big Apple  sex  drive-in  Mickey Mouse  bagel
dumb down  pesticide  racism  spliff  dunk  cheeseburger  Blitzkrieg
Molotov cocktail  snafu  buzz  pissed off  DNA  mobile phone  megabucks
Wonderbra  cool  Big Brother  brainwashing  fast food  Generation X
hippy  non-U  boogie  sexy  psychedelic  beatnik  cruise missile  cyborg
awesome  bossa nova  peacenik  byte  miniskirt  acid  love-in  It-girl
microchip  hypermarket  green  Watergate  F-word  punk  detox  Trekkie
naff all  trainers  karaoke  power dressing  toy-boy  hip-hop  beatbox
double-click  OK yah  mobile  virtual reality  gangsta  latte  applet
hot-desking  URL  have it large  Botox  kitten heels  ghetto fabulous
dot-commer  text message  google  bling bling  9/11  axis of evil  sex
up  chav
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: would it be feasable to write python DJing software

2006-02-03 Thread Tom Anderson
On Fri, 3 Feb 2006, Ivan Voras wrote:

 Levi Campbell wrote:

 Hi, I'm thinking about writing a system for DJing in python, but I'm 
 not sure if Python is fast enough to handle the realtime audio needed 
 for DJing, could a guru shed some light on this subject and tell me if 
 this is doable or if I'm out of my fscking mind?

Perhaps surprisingly, it is:

http://www.python.org/pycon/dc2004/papers/6/

At least, you can certainly mix in realtime in pure python, and can 
probably manage some level of effects processing. I'd be skeptical about 
decoding MP3 in realtime, but then you don't want to write your own MP3 
decoder anyway, and the existing ones you might reuse are all native code.

 Any and all mixing would probably happen in some sort of multimedia 
 library written in C (it would be both clumsy to program and slow to 
 execute if the calculations of raw samples/bytes were done in python)

Clumsy? Clumsier than C? No, python isn't as good with binary data as it 
is with text or objects, but on the whole program scale, it's still miles 
ahead of C.

My advice would be to tackle the task in the same way you'd tackle any 
other: write it in pure python, then fall back to native code where it's 
unavoidable. When i say 'pure python', i don't mean 'not using any native 
modules at all', obviously - if someone's written an MP3 decoder, don't 
eschew it because it happens to be in C. Also, bear in mind that resorting 
to native code doesn't automatically mean writing in C - you can start 
doing stuff like moving from representing buffers as lists of ints to 
using NumPy arrays, using the functions in the standard audioop module, 
whatever; if that's not fast enough, rewrite chunks of the code in pyrex 
(a derivative of python that can be compiled to native code, via 
translation to C); if it's still not fast enough, go to C.

Oh, and before you start going native, try running your program under 
psyco.

tom

-- 
Throw bricks at lawyers if you can!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Why checksum? [was Re: Fuzzy Lookups]

2006-02-02 Thread Tom Anderson
On Thu, 1 Feb 2006, it was written:

 Tom Anderson [EMAIL PROTECTED] writes:

 The obvious way is make a list of hashes, and sort the list.

 Obvious, perhaps, prudent, no. To make the list of hashes, you have to 
 read all of every single file first, which could take a while. If your 
 files are reasonably random at the beginning, ...

 The possibility of two different mp3 files having the same id3 tags is 
 something you might specifically be checking for.

So read from the end of the file, rather than the beginning.

 Better yet, note that if two files are identical, they must have the 
 same length, and that finding the length can be done very cheaply, so a 
 quicker yet approach is to make a list of lengths, sort that, and look 
 for duplicates; when you find one, do a byte-by-byte comparison of the 
 files (probably terminating in the first block) to see if they really 
 are the same.

 Yes, checking the file lengths first is an obvious heuristic, but if you 
 find you have a bunch of files with the same length, what do you do? 
 You're back to a list of hashes.

Or prefixes or suffixes.

 By way of example, of the 2690 music files in my iTunes library, i have 
 twelve pairs of same-sized files [1], and all of these differ within 
 the first 18 bytes (mostly, within the first 9 bytes).

 That's a small enough set of matches that you don't need a general 
 purpose algorithm.

True - and this is *exactly* the situation that the OP was talking about, 
so this algorithm is appropriate. Moreover, i believe it is representative of 
most situations where you have a bunch of files to compare. Of course, 
cases where files are tougher to tell apart do exist, but i think they're 
corner cases. Could you suggest a common kind of file with degenerate 
lengths, prefixes and suffixes?

The only one that springs to mind is a set of same-sized image files in 
some noncompressed format, recording similar images (frames in a movie, 
say), where the differences might be buried deep in the pixel data. As it 
happens, i have just such a dataset on disk: with the images in TIFF 
format, i get differences between subsequent frames after 9 bytes, but i 
suspect that's a timestamp or something; if i convert everything to a nice 
simple BMP (throwing away 8 bits per sample of precision in the process - 
probably turning most of the pixels to 0!), then i find differences about 
a megabyte in. If i compare from the tail in, i also have to wade through 
about a megabyte before finding a difference. Here, hashes would be ideal.

tom

-- 
The revolution is here. Get against the wall, sunshine. -- Mike Froggatt
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: learning python, using string help

2006-02-02 Thread Tom Anderson
On Thu, 2 Feb 2006, [EMAIL PROTECTED] wrote:

 Well, I did want to add some formatting for example

I getcha. This is really an HTML problem rather than a python problem, 
isn't it? What you need to do is output a table.

FWIW, here's how i'd do it (assuming you've got HP-UX ruptime, since 
that's the only one i can find example output for [1]):

http://urchin.earth.li/~twic/ruptime.py

You can use this as a library, a command-line tool, or a CGI script; it'll 
automagically detect which context it's in and do the right thing. It's 
built around the output from the HP-UX version of ruptime; let me know 
what the output from yours looks like (a few lines would do) and i'll show 
you how to change it.

The first key bit is a pair of regular expressions:

lineRe = re.compile(r"(\S+)\s+(\S+)\s+([\d+:]+)"
                    r"(?:,\s+(\d+) users,\s+load ([\d., ]+))?")
uptimeRe = re.compile(r"(?:(\d+)\+)?(\d*):(\d*)")

These rip the output from ruptime apart to produce a set of fields, which 
can then be massaged into useful data. Once that's done, there's a big 
splodge of code which prints out an HTML document containing a table 
displaying the information. It's probably neither the shortest nor the 
cleanest bit of code in the universe, but it does the job and should, i 
hope, be reasonably clear.
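To see what those two expressions produce, here's a quick check against an
invented HP-UX-style ruptime line (the sample is illustrative, not real
output from the script above):

```python
import re

# The two expressions from ruptime.py: one rips a whole output line into
# fields, the other splits an uptime like "4+12:55" into days/hours/minutes.
lineRe = re.compile(r"(\S+)\s+(\S+)\s+([\d+:]+)"
                    r"(?:,\s+(\d+) users,\s+load ([\d., ]+))?")
uptimeRe = re.compile(r"(?:(\d+)\+)?(\d*):(\d*)")

m = lineRe.match("hpfcmgw  up  4+12:55,  8 users,  load 1.01, 1.04, 1.04")
host, state, uptime, users, load = m.groups()
days, hours, minutes = uptimeRe.match(uptime).groups()
```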

tom

[1] http://docs.hp.com/en/B2355-90743/ch06s02.html

-- 
Science Never Sleeps
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Why checksum? [was Re: Fuzzy Lookups]

2006-02-01 Thread Tom Anderson
On Tue, 31 Jan 2006, it was written:

 Steven D'Aprano [EMAIL PROTECTED] writes:

 This isn't a criticism, it is a genuine question. Why do people compare 
 local files with MD5 instead of doing a byte-to-byte compare?

I often wonder that!

 Is it purely a caching thing (once you have the checksum, you don't 
 need to read the file again)? Are there any other reasons?

 It's not just a matter of comparing two files.  The idea is you have 
 10,000 local files and you want to find which ones are duplicates (i.e. 
 if files 637 and 2945 have the same contents, you want to discover 
 that).  The obvious way is make a list of hashes, and sort the list.

Obvious, perhaps, prudent, no. To make the list of hashes, you have to 
read all of every single file first, which could take a while. If your 
files are reasonably random at the beginning, you'd be better off just 
using the first N bytes of the file, since this would be just as 
effective, and cheaper to read. Looking at some random MP3s i have to 
hand, they all differ within the first 20 bytes - probably due to the ID3 
tags, so this should work for these.

Better yet, note that if two files are identical, they must have the same 
length, and that finding the length can be done very cheaply, so a quicker 
yet approach is to make a list of lengths, sort that, and look for 
duplicates; when you find one, do a byte-by-byte comparison of the files 
(probably terminating in the first block) to see if they really are the 
same.

By way of example, of the 2690 music files in my iTunes library, i have 
twelve pairs of same-sized files [1], and all of these differ within the 
first 18 bytes (mostly, within the first 9 bytes). Therefore, i could rule 
out duplication with just 22 data blocks read from disk (plus rather more 
blocks of directory information and inodes, of course). A hash-based 
approach would have had to wade through a touch over 13 GB of data before 
it could even get started.

Of course, there are situations where this is the wrong approach - if you 
have a collection of serialised sparse matrices, for example, which 
consist of identically-sized blocks of zeroes with a scattering of ones 
throughout, then lengths and prefixes will be useless, whereas hashes will 
work perfectly. However, here, we're looking at MP3s, where lengths and 
prefixes will be a win.
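The length-first scheme above, sketched in code (helper names invented for
illustration):

```python
import os
from collections import defaultdict
from itertools import combinations

# Group files by size, then byte-compare only those whose sizes collide.
# For typical data the comparison bails out on the first block.
def find_duplicates(paths, blocksize=8192):
    by_size = defaultdict(list)
    for p in paths:
        by_size[os.path.getsize(p)].append(p)
    dups = []
    for group in by_size.values():
        for a, b in combinations(group, 2):
            if same_contents(a, b, blocksize):
                dups.append((a, b))
    return dups

def same_contents(a, b, blocksize):
    with open(a, 'rb') as fa, open(b, 'rb') as fb:
        while True:
            block_a = fa.read(blocksize)
            block_b = fb.read(blocksize)
            if block_a != block_b:
                return False   # usually happens on the very first block
            if not block_a:
                return True    # both exhausted: files are identical
```

Only same-sized files ever get opened, so the hash pass over every byte of
every file is avoided entirely.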

tom

[1] The distribution of those is a bit weird: ten pairs consist of two 
tracks from The Conet Project's 'Recordings of Shortwave Numbers 
Stations', one is a song from that and The Doors' 'Horse Latitudes', and 
one is between two Calexico songs ('The Ride (Pt II)' and 'Minas De 
Cobre'). Why on earth are eleven of the twelve pairs pairs of songs from 
the same artist? Is it really that they're pairs of songs from the same 
compressor (those tracks aren't from CD), i wonder?

-- 
Not all legislation can be eye-catching, and it is important that the
desire to achieve the headlines does not mean that small but useful
measures are crowded out of the legislative programme. -- Select Committee
on Transport
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: simple perl program in python gives errors

2006-01-30 Thread Tom Anderson
On Mon, 30 Jan 2006, Grant Edwards wrote:

 On 2006-01-30, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

 i was hoping one didnt have to initialize variables because perl 
 defaults their value to zero. Also I noticed if I initialize a variable 
 as 0 , then I can only do integer math not floating math.

 Python is a strictly typed language.  Perl isn't -- Perl does all sorts 
 of stuff automagically by trying to guess what you wanted.  I perfer 
 languages that do exactly what I tell them to rather than what the 
 language's author thought I might have meant.

Especially when that's Larry Wall ... :)

tom

-- 
Don't trust the laws of men. Trust the laws of mathematics.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: StringIO proposal: add __iadd__

2006-01-30 Thread Tom Anderson
On Sun, 29 Jan 2006, Alex Martelli wrote:

 Paul Rubin http://[EMAIL PROTECTED] wrote:
 
 Maybe the standard versions of some of these things can be written in 
 RPython under PyPy, so they'll compile to fast machine code, and then 
 the C versions won't be needed.

 By all means, the C versions are welcome, I just don't want to lose the 
 Python versions either (and making them less readable by recoding them 
 in RPython would interfere with didactical use).

Is RPython really that bad? Lack of generators seems like the only serious 
issue to me.

 But with CPython I think we need the C versions.

Unless we use Shed Skin to translate the RPython into C++. Or maybe we 
could write the code in Pyrex, generate C from that for CPython, then have 
a python script which strips out the type definitions to generate pure 
python for PyPy.

tom

-- 
Don't trust the laws of men. Trust the laws of mathematics.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Numarray, numeric, NumPy, scpy_core ??!!

2006-01-22 Thread Tom Anderson
On Sat, 21 Jan 2006, Robert Kern wrote:

 Tom Anderson wrote:

 Pardon my failure to RTFM, but does NumPy pick up the vecLib BLAS on Macs?

 Yes.

Excellent, thanks.

tom

-- 
forget everything from school -- you are programmer
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Returning a tuple-struct

2006-01-21 Thread Tom Anderson
On Thu, 18 Jan 2006 [EMAIL PROTECTED] wrote:

 Is there a better way?  Thoughts?

I was thinking along these lines:

class NamedTuple(tuple):
    def __init__(self, indices, values):
        "indices should be a map from name to index"
        tuple.__init__(self, values)
        self.indices = indices
    def __getattr__(self, name):
        return self[self.indices[name]]

colourNames = {"red": 0, "green": 1, "blue": 2}
plum = NamedTuple(colourNames, (219, 55, 121))

The idea is that it's a tuple, but it has some metadata alongside (shared 
with other similarly-shaped tuples) which allows it to resolve names to 
indices - thus avoiding having two references to everything.

However, if i try that, i get:

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: tuple() takes at most 1 argument (2 given)

As far as i can tell, inheriting from tuple is forcing my constructor to 
only take one argument. Is that the case? If so, anyone got any idea why?

If i rewrite it like this:

class NamedTuple(tuple):
    def __init__(self, values):
        tuple.__init__(self, values)
    def __getattr__(self, name):
        return self[self.indices[name]]

class ColourTuple(NamedTuple):
    indices = {"red": 0, "green": 1, "blue": 2}

plum = ColourTuple((219, 55, 121))

Then it works. This is even an arguably better style. Changing the 
constructor to take *values rather than values, and to validate the length 
of the value tuple against the length of the index tuple, would be good, 
but, since i'm lazy, is left as an exercise to the reader.
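For what it's worth: tuples are immutable, so their contents are fixed in
__new__, before __init__ ever runs - the TypeError above comes from
tuple.__new__ receiving the extra argument. A sketch of the first version
reworked accordingly (adapted, not from the original thread):

```python
class NamedTuple(tuple):
    # tuple is immutable: the values must be supplied to __new__, and the
    # base type's __new__ rejects any extra constructor arguments, which
    # is where the "takes at most 1 argument" error comes from.
    def __new__(cls, indices, values):
        self = tuple.__new__(cls, values)
        self.indices = indices
        return self
    def __getattr__(self, name):
        try:
            return self[self.indices[name]]
        except KeyError:
            raise AttributeError(name)

colourNames = {"red": 0, "green": 1, "blue": 2}
plum = NamedTuple(colourNames, (219, 55, 121))
```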

tom

-- 
Throwin' Lyle's liquor away is like pickin' a fight with a meat packing
plant! -- Ray Smuckles
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Arithmetic sequences in Python

2006-01-21 Thread Tom Anderson
On Fri, 20 Jan 2006, it was written:

 [EMAIL PROTECTED] (Alex Martelli) writes:

 How would you make a one-element list, which we'd currently write as 
 [3]? Would you have to say list((3,))?

 Yep.  I don't particularly like the mandatory trailing comma in the 
 tuple's display form, mind you, but, if it's good enough for tuples, 
 and good enough for sets (how else would you make a one-element set?),

 If you really want to get rid of container literals, maybe the best way 
 is with constructor functions whose interfaces are slightly different 
 from the existing type-coercion functions:

listx(1,2,3)  = [1, 2, 3]
listx(3)  = [3]
listx(listx(3)) = [[3]]
dictx((a,b), (c,d))  = {a:b, c:d}
setx(a,b,c)   = Set((a,b,c))

 listx/dictx/setx would be the display forms as well as the constructor forms.

Could these even replace the current forms? If you want the equivalent of 
list(sometuple), write list(*sometuple). With a bit of cleverness down in 
the worky bits, this could be implemented to avoid the apparent overhead 
of unpacking and then repacking the tuple. In fact, in general, it would 
be nice if code like:

def f(*args):
    fondle(args)

foo = (1, 2, 3)
f(*foo)

Would avoid the unpack/repack.

The problem is that you then can't easily do something like:

mytable = ((1, 2, 3), (a, b, c), (Tone.do, Tone.re, Tone.mi))
mysecondtable = map(list, mytable)

Although that's moderately easy to work around with possibly the most 
abstract higher-order-function i've ever written:

def star(f):
    def starred_f(args):
        return f(*args)
    return starred_f

Which lets us write:

mysecondtable = map(star(list), mytable)

While we're here, we should also have the natural complement of star, its 
evil mirror universe twin:

def bearded_star(f):
    def bearded_starred_f(*args):
        return f(args)
    return bearded_starred_f

Better names (eg unpacking and packing) would obviously be needed.

tom

-- 
I might feel irresponsible if you couldn't go almost anywhere and see
naked, aggressive political maneuvers in iteration, marinating in your
ideology of choice. That's simply not the case. -- Tycho Brahae
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Arithmetic sequences in Python

2006-01-21 Thread Tom Anderson
On Sat, 21 Jan 2006, it was written:

 Tom Anderson [EMAIL PROTECTED] writes:

 listx/dictx/setx would be the display forms as well as the constructor 
 forms.

 Could these even replace the current forms? If you want the equivalent 
 of list(sometuple), write list(*sometuple).

 The current list function is supposed to be something like a typecast:

A what?

;-|

 list() = []
 xlist() = []   # ok

 list(list()) = []   # casting a list to a list does nothing
 xlist(xlist()) = [[]]  # make a new list, not the same

 list(xrange(4)) = [0,1,2,3]
 xlist(xrange(4)) = [xrange(4)]   # not the same

 list((1,2)) = [1,2]
 xlist((1,2)) = [(1,2)]

True, but so what? Is it that it has to be that way, or is it just that it 
happens to be that way now?

tom

-- 
It's the 21st century, man - we rue _minutes_. -- Benjamin Rosenbaum
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Numarray, numeric, NumPy, scpy_core ??!!

2006-01-21 Thread Tom Anderson
On Sat, 21 Jan 2006, Travis E. Oliphant wrote:

 J wrote:

 I will just jump in an use NumPy. I hope this one will stick and evolve 
 into the mother of array packages. How stable is it ? For now I really 
 just need basic linear algebra. i.e. matrix multiplication, dot, cross 
 etc

 There is a new release coming out this weekend.  It's closer to 1.0 and 
 so should be more stable.  It also has some speed improvements in 
 matrix-vector operations (if you have ATLAS BLAS --- or if you download 
 a binary version with ATLAS BLAS compiled in).  I would wait for it.

Pardon my failure to RTFM, but does NumPy pick up the vecLib BLAS on Macs?

tom

-- 
It's the 21st century, man - we rue _minutes_. -- Benjamin Rosenbaum
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: OT: excellent book on information theory

2006-01-21 Thread Tom Anderson
Slow and to the pointless, but ...

On Wed, 18 Jan 2006, Terry Hancock wrote:

 On Mon, 16 Jan 2006 12:15:25 -0500
 Tim Peters [EMAIL PROTECTED] wrote:

 More Britishisms are surviving in the Scholastic editions as the 
 series goes on, but as the list for Half-Blood Prince shows the editors 
 still make an amazing number of seemingly pointless changes: like:

UK: "Harry smiled vaguely back"
US: "Harry smiled back vaguely"

 I know you are pointing out the triviality of this, since both US and UK 
 English allow either placement -- but is it really preferred style in 
 the UK to put the adverb right before the verb?

For the meaning which i assume is meant here, no, i wouldn't have said so.

 In US English, the end of the clause (or the beginning) is probably more 
 common.

Same in British English (or at least, English English).

As Dave Hansen pointed out, "Harry smiled vaguely back" means that the
direction Harry was smiling was "vaguely back" - might have been a bit to
the side or something.

 This actually gets back on topic ( ;-) ), because it might affect the 
 localization of a Python interactive fiction module I'm working on -- 
 it's a GUI to generate sentences that are comprehensible to the IF 
 engine.

My guess would be that you're going to need something far more powerful 
than a localisation engine for this.

 en_US:
 Sally, gently put flower in basket

 vs

 en_UK:
 Sally, put flower in basket gently

That example isn't as bad as the Rowling one (although the lack of 
articles is a bit odd); i think i'd only use the latter form if i wanted 
to put particular emphasis on the 'gently', particularly if it was as a 
modified repetition of a previous sentence:

Instructor: Sally, put a flower in the basket.
[Sally roughly puts the flower in the basket, crushing it]
Instructor: Sally, put a flower in the basket *gently*.

Your second construction isn't the equivalent of the Rowling sentence,
though, where the adverb goes right after the verb; that would make it
"Sally, put gently the flower in the basket", which would be completely
awful. Or maybe it would be "Sally, put the flower gently in the basket",
which would be fine, although a bit dated - has an admittedly euphonious
1950s BBC English feel to it.

tom

-- 
It's the 21st century, man - we rue _minutes_. -- Benjamin Rosenbaum
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: On Numbers

2006-01-18 Thread Tom Anderson
On Wed, 18 Jan 2006, Steven D'Aprano wrote:

 On Tue, 17 Jan 2006 23:34:40 +, Tom Anderson wrote:

 So I don't really know what point you are making. What solution(s) for
 1**0.5 were you expecting?

 He's probably getting at the fact that if you're dealing with complex
 numbers, square root get a lot more complicated:

 http://mathworld.wolfram.com/SquareRoot.html

 But still, that doesn't change the fact that x**0.5 as is meant here is
 the principal (positive) real square root, and that can be true whether
 your hierarchy of numeric types includes a complex type or not.

 Er, actually, i meant to write -1, but evidently missed a key, and failed
 to check what i'd written.

 Since exponentiation has higher priority than negation, -1**0.5 is -1.0 in
 both Python and ordinary mathematics.

 Perhaps you meant to write (-1)**0.5,

Yes.

[FX: bangs head on keyboard]

I'm still getting this wrong after all these years.

 in which case Python developers have a decision to make: should it 
 assume real-valued maths unless explicitly told differently, and hence 
 raise an exception, or coerce the result to complex?

Precisely.

 In this case, Python raises an exception, as it should, unless you 
 explicitly uses complex numbers. That's the best behaviour for the 
 majority of people: most people don't even know what complex numbers 
 are, let alone want to deal with them in their code. Python, after all, 
 is not Mathematica.

I think i agree with you, as a matter of practical value. However, this 
does go against the whole numeric unification thing we were discussing.

Hmm. What happens if i say (-1) ** (0.5+0j)? Ah, i get the right answer. 
Well, that's handy - it means i don't have to resort to cmath or sprinkle 
complex() calls all over the place for complex maths.

tom

-- 
Biochemistry is the study of carbon compounds that wriggle.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: On Numbers

2006-01-17 Thread Tom Anderson
On Mon, 16 Jan 2006, Erik Max Francis wrote:

 Steven D'Aprano wrote:

 The square root of 1 is +1 (the negative root being explicitly 
 rejected). Pure mathematicians, who may be expected to care whether the 
 root is the integer 1 or the real number 1, are unlikely to write 
 1**0.5, prefering the squareroot symbol.
 
 For the rest of us, including applied mathematicians, 1**0.5 implies 
 floating point, which implies the correct answer is 1.0.
 
 So I don't really know what point you are making. What solution(s) for 
 1**0.5 were you expecting?

 He's probably getting at the fact that if you're dealing with complex 
 numbers, square root get a lot more complicated:

   http://mathworld.wolfram.com/SquareRoot.html

 But still, that doesn't change the fact that x**0.5 as is meant here is 
 the principal (positive) real square root, and that can be true whether 
 your hierarchy of numeric types includes a complex type or not.

Er, actually, i meant to write -1, but evidently missed a key, and failed 
to check what i'd written.

But excellent discussion there, chaps! All shall have medals!

tom

-- 
Taking care of business
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Web application design question (long)

2006-01-17 Thread Tom Anderson
On Tue, 16 Jan 2006, Fried Egg wrote:

 I am interested if anyone can shed any light on a web application 
 problem,

I'm not going to help you with that, but i am going to mention the Dada 
Engine:

http://dev.null.org/dadaengine/

And its most famous incarnation, the Postmodernism Generator:

http://www.elsewhere.org/pomo

tom

-- 
Taking care of business
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Arithmetic sequences in Python

2006-01-17 Thread Tom Anderson
On Tue, 16 Jan 2006, it was written:

 Tom Anderson [EMAIL PROTECTED] writes:

 The natural way to implement this would be to make .. a normal 
 operator, rather than magic, and add a __range__ special method to 
 handle it. a .. b would translate to a.__range__(b). I note that 
 Roman Suzi proposed this back in 2001, after PEP 204 was rejected. It's 
 a pretty obvious implementation, after all.

 Interesting, but what do you do about the unary postfix (1 ..)
 infinite generator?

(1).__range__(None)

 (-3,-5 ..)   --  'infinite' generator that yield -3,-5,-7 and so on

 -1. Personally, i find the approach of specifying the first two 
 elements *absolutely* *revolting*, and it would consistently be more 
 awkward to use than a start/step/stop style syntax. Come on, when do 
 you know the first two terms but not the step size?

 Usually you know both, but showing the first two elements makes sequence 
 more visible.  I certainly like (1,3..9) better than (1,9;2) or 
 whatever.

I have to confess that i don't have a pretty three-argument syntax to 
offer as an alternative to yours. But i'm afraid i still don't like yours. 
:)

 1) [] means list, () means generator
 Yuck. Yes, i know it's consistent with list comps and genexps, but yuck 
 to those too!

 I'd be ok with getting rid of [] and just having generators or 
 xrange-like class instances.  If you want to coerce one of those to a 
 list, you'd say list((1..5)) instead of [1..5].

Sounds good. More generally, i'd be more than happy to get rid of list 
comprehensions, letting people use list(genexp) instead. That would 
obviously be a Py3k thing, though.

tom

-- 
Taking care of business
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Arithmetic sequences in Python

2006-01-17 Thread Tom Anderson
On Tue, 17 Jan 2006, Antoon Pardon wrote:

 Op 2006-01-16, Alex Martelli schreef [EMAIL PROTECTED]:
 Paul Rubin http://[EMAIL PROTECTED] wrote:

 Steven D'Aprano [EMAIL PROTECTED] writes:
 For finite sequences, your proposal adds nothing new to existing
 solutions like range and xrange.

 Oh come on, [5,4,..0] is much easier to read than range(5,-1,-1).

 But not easier than reversed(range(6)) [[the 5 in one of the two
 expressions in your sentence has to be an offbyone;-)]]

 Why don't we give slices more functionality and use them.
 These are a number of ideas I had. (These are python3k ideas)

 1) Make slices iterables. (No more need for (x)range)

 2) Use a bottom and stop variable as default for the start and
   stop attribute. top would be a value that is greater than
   any other value, bottom would be a value smaller than any
   other value.

 3) Allow slice notation to be used anywhere a value can be
   used.

 4) Provide a number of extra operators on slices.
   __neg__ (reverses the slice)
   __and__ gives the intersection of two slices
   __or__ gives the union of two slices

 5) Provide sequences with a range (or slice) method.
   This would provide an iterator that iterates over
   the indexes of the sequences. A slice could be
   provided

+5

  for i, el in enumerate(sequence):

 would become

  for i in sequence.range():
el = sequence[i]

That one, i'm not so happy with - i quite like enumerate; it communicates 
intention very clearly. I believe enumerate is implemented with iterators, 
meaning it's potentially more efficient than your approach, too. And since 
enumerate works on iterators, which yours doesn't, you have to keep it 
anyway. Still, both would be possible, and it's a matter of taste.

 But the advantage is that this would still work when someone subclasses 
 a list so that it start index is an other number but 0.

It would be possible to patch enumerate to do the right thing in those 
situations - it could look for a range method on the enumerand, and if it 
found one, use it to generate the indices. Like this:

import itertools

def enumerate(thing):
    if hasattr(thing, "range"):
        indices = thing.range()
    else:
        indices = itertools.count()
    return itertools.izip(indices, thing)

 If you only wanted every other index one could do the following

  for i in sequence.range(::2):

 which would be equivalent to

  for i in sequence.range() & (::2):

Oh, that is nice. Still, you could also extend enumerate to take a range 
as an optional second parameter and do this with it. Six of one, half a 
dozen of the other, i suppose.

tom

-- 
Taking care of business
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: On Numbers

2006-01-16 Thread Tom Anderson
On Sun, 15 Jan 2006, Alex Martelli wrote:

 Paul Rubin http://[EMAIL PROTECTED] wrote:

 Mike Meyer [EMAIL PROTECTED] writes:

 I'd like to work on that. The idea would be that all the numeric types 
 are representations of reals with different properties that make them 
 appropriate for different uses.

 2+3j?

 Good point, so s/reals/complex numbers/ -- except for this detail, 
 Mike's idea do seem well founded.

1 ** 0.5 ?

I do like the mathematical cleanliness of making ints and floats do the 
right thing when the answer would be complex, but as a pragmatic decision, 
it might not be the right thing to do. It evidently wasn't thought it was 
when python's current number system was designed. I think Tim Peters has 
an opinion on this.

tom

-- 
Socialism - straight in the mainline!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Arithmetic sequences in Python

2006-01-16 Thread Tom Anderson
On Mon, 16 Jan 2006, it was written:

 There's something to be said for that.  Should ['a'..'z'] be a list or a 
 string?

And while we're there, what should ['aa'..'zyzzogeton'] be?

tom

-- 
Socialism - straight in the mainline!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Arithmetic sequences in Python

2006-01-16 Thread Tom Anderson
On Mon, 16 Jan 2006, Gregory Petrosyan wrote:

 Please visit http://www.python.org/peps/pep-0204.html first.

 As you can see, PEP 204 was rejected, mostly because of not-so-obvious
 syntax. But IMO the idea behind this pep is very nice.

Agreed. Although i have to say, i like the syntax there - it seems like a 
really natural extension of existing syntax.

 So, maybe there's a reason to adopt slightly modified Haskell's syntax?

Well, i do like the .. - 1..3 seems like a natural way to write a range. 
I'd find 1...3 more natural, since an ellipsis has three dots, but it is 
slightly more tedious.

The natural way to implement this would be to make .. a normal operator, 
rather than magic, and add a __range__ special method to handle it. a .. 
b would translate to a.__range__(b). I note that Roman Suzi proposed 
this back in 2001, after PEP 204 was rejected. It's a pretty obvious 
implementation, after all.

 Something like

 [1,3..10]  --  [1,3,5,7,9]
 (1,3..10)  --  same values as above, but return generator instead of
 list
 [1..10] --  [1,2,3,4,5,6,7,8,9,10]
 (1 ..)--  'infinite' generator that yield 1,2,3 and so on
 (-3,-5 ..)   --  'infinite' generator that yield -3,-5,-7 and so on

-1. Personally, i find the approach of specifying the first two elements 
*absolutely* *revolting*, and it would consistently be more awkward to use 
than a start/step/stop style syntax. Come on, when do you know the first 
two terms but not the step size?

 1) [] means list, () means generator

Yuck. Yes, i know it's consistent with list comps and genexps, but yuck to 
those too!

Instead, i'd like to see lazy lists used here - these look like lists, and 
can be used exactly like a list, but if all you want to do is iterate over 
them, they don't need to instantiate themselves in memory, so they're as 
efficient as an iterator. The best of both worlds! I've written a sketch 
of a generic lazy list:

http://urchin.earth.li/~twic/lazy.py

Note that this is what xrange does already (as i've just discovered).
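To make the idea concrete, here is a minimal sketch of such a lazy sequence (a hypothetical LazyRange, not the lazy.py linked above, and written for today's Python, hence range rather than xrange): it answers len() and indexing like a list but stores only three numbers.

```python
class LazyRange:
    """A lazy arithmetic sequence: list-like reads, O(1) storage.
    Illustrative sketch only - not the lazy.py implementation."""
    def __init__(self, start, stop, step=1):
        self.start, self.stop, self.step = start, stop, step

    def __len__(self):
        # number of terms start, start+step, ... strictly below stop
        return max(0, (self.stop - self.start + self.step - 1) // self.step)

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        return self.start + i * self.step

    def __iter__(self):
        for i in range(len(self)):
            yield self.start + i * self.step

r = LazyRange(1, 10, 2)
print(list(r))  # [1, 3, 5, 7, 9]
print(r[2])     # 5
```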

tom

-- 
Socialism - straight in the mainline!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Arithmetic sequences in Python

2006-01-16 Thread Tom Anderson
On Mon, 16 Jan 2006, Alex Martelli wrote:

 Steven D'Aprano [EMAIL PROTECTED] wrote:

 On Mon, 16 Jan 2006 12:51:58 +0100, Xavier Morel wrote:

 For those who'd need the (0..n-1) behavior, Ruby features something 
 that I find quite elegant (if not perfectly obvious at first), 
 (first..last) provides a range from first to last with both boundaries 
 included, but (first...last) (notice the 3 periods)

 No, no I didn't.

 Sheesh, that just *screams* Off By One Errors!!!. Python deliberately 
 uses a simple, consistent system of indexing from the start to one past 
 the end specifically to help prevent signpost errors, and now some 
 folks want to undermine that.

 *shakes head in amazement*

 Agreed.  *IF* we truly needed an occasional up to X *INCLUDED* 
 sequence, it should be in a syntax that can't FAIL to be noticed, such 
 as range(X, endincluded=True).

How about first,,last? Harder to do by mistake, but pretty horrible in its 
own way.

tom

-- 
Socialism - straight in the mainline!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how do real python programmers work?

2006-01-13 Thread Tom Anderson
On Thu, 12 Jan 2006, bblais wrote:

 In Matlab, I do much the same thing, except there is no compile phase. I 
 have the editor on one window, the Matlab interactive shell in the 
 other.  I often make a bunch of small scripts for exploration of a 
 problem, before writing any larger apps.  I go back and forth editing 
 the current file, and then running it directly (Matlab looks at the time 
 stamp, and automagically reloads the script when I modify it).

I wouldn't describe myself as an experienced programmer, but this is 
definitely how i work - editor plus interactive interpreter, using 
import/reload to bring in and play with bits of of code.

Towards the end of coding a program, when i'm done with the inner 
functions and am working on the main function, which does stuff like 
command line parsing, setting up input and output, etc, i'll often leave 
the interpreter and work from the OS shell, since that's the proper 
environment for a whole program.

Often, i'll actually have more than one shell open - generally three: one 
with an interpreter without my code loaded, for doing general exploratory 
programming, testing code fragments, doing sums, etc; one with an 
interpreter with my code loaded, for testing individual components of the 
code, and one at the OS shell, for doing whole-program tests, firing up 
editors, general shell work, etc.

Another trick is to write lightweight tests as functions in the 
interpreter-with-code-loaded that reload my module and then do something 
with it. For example, for testing my (entirely fictional) video 
compressor, i might write:

def testcompressor():
    reload(vidzip)
    seq = vidzip.ImageSequence((640, 480))
    for i in xrange(200):
        frameName = "testmovie.%02i.png" % i
        frame = Image.open(frameName)
        seq.append(frame)
    codec = vidzip.Compressor(vidzip.DIRAC, 9)
    codec.compress(seq, file("testmovie.bbc", "w"))

Then, after editing and saving my code, i can just enter 
testcompressor() (or, in most cases, hit up-arrow and return) to reload 
and test. You can obviously extend this a bit to make the test routine 
take parameters which control the nature of the test, so you can easily 
test a range of things, and you can have multiple different test on the go 
at once.

tom

-- 
Only men's minds could have mapped into abstraction such a territory
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how do real python programmers work?

2006-01-13 Thread Tom Anderson
On Thu, 12 Jan 2006, Mike Meyer wrote:

 well, we need a term for development environment built out of Unix 
 tools

Disintegrated development environment? Differentiated development 
environment? How about just a development environment?

tom

-- 
NOW ALL ASS-KICKING UNTIL THE END
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how do real python programmers work?

2006-01-13 Thread Tom Anderson
On Fri, 13 Jan 2006, Roy Smith wrote:

 Mike Meyer [EMAIL PROTECTED] wrote:

 we need a term for development environment built out of Unix tools

 We already have one.  The term is emacs.

Emacs isn't built out of unix tools - it's a standalone program.

Ah, of course - to an true believer, emacs *is* the unix toolset.

:)

tom

-- 
NOW ALL ASS-KICKING UNTIL THE END
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Help wanted with md2 hash algorithm

2006-01-10 Thread Tom Anderson
On Sun, 8 Jan 2006, Tom Anderson wrote:

 On Fri, 6 Jan 2006 [EMAIL PROTECTED] wrote:

 below you find my simple python version of MD2 algorithm as described 
 in RFC1319 (http://rfc1319.x42.com/MD2). It produces correct results 
 for strings shorter than 16 Bytes and wrong results for longer strings.

 I guess the thing to do is extract the C code from the RFC and compile 
 it, verify that it works, then stick loads of print statements in the C 
 and the python, to see where the states of the checksum engines diverge.

Okay, i've done this. I had to fiddle with the source a bit - added a 
#include "global.h" to md2.h (it needs it for the PROTO_LIST macro) and 
took the corresponding includes out of md2c.c and mddriver.c (to avoid 
duplicate definitions) - but after that, it built cleanly with:

gcc -DMD=2 *.c *.h -o mddriver

A couple of pairs of (somewhat spurious) parentheses in mddriver.c, and it 
even built cleanly with -Wall.

Running the test suite with mddriver -x gives results matching the test 
vectors in the RFC - a good start!

Patching the code to dump the checksums immediately after updating with 
the pad, and before updating with the checksum:

*** checksum after padding = 623867b6af52795e5f214e9720beea8d
MD2 () = 8350e5a3e24c153df2275c9f80692773
*** checksum after padding = 19739cada3ba281693348e9d256fff31
MD2 (a) = 32ec01ec4a6dac72c0ab96fb34c0b5d1
*** checksum after padding = 19e29d1b7304368e595a276f302f57cc
MD2 (abc) = da853b0d3f88d99b30283a69e6ded6bb
*** checksum after padding = 56d65157dedfcd75a7b1e82d970eec4b
MD2 (message digest) = ab4f496bfb2a530b219ff33031fe06b0
*** checksum after padding = 4a42d3a377b7e9988fb9289699e4d3a3
MD2 (abcdefghijklmnopqrstuvwxyz) = 4e8ddff3650292ab5a4108c3aa47940b
*** checksum after padding = c3db7592ee1dd9b84505cfb4e2f9a765
MD2 (ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789) = 
da33def2a42df13975352846c30338cd
*** checksum after padding = 59ca5673c8f931bc41214f56b5c6c01
MD2 
(12345678901234567890123456789012345678901234567890123456789012345678901234567890)
 = d5976f79d83d3a0dc9806c3c66f3efd8

And here's my python code with the same modification, running the test 
suite:

*** checksum after padding =  623867b6af52795e5f214e9720beea8d
MD2 () = 8350e5a3e24c153df2275c9f80692773
*** checksum after padding =  19739cada3ba281693348e9d256fff31
MD2 (a) = 32ec01ec4a6dac72c0ab96fb34c0b5d1
*** checksum after padding =  19e29d1b7304368e595a276f302f57cc
MD2 (abc) = da853b0d3f88d99b30283a69e6ded6bb
*** checksum after padding =  56d65157dedfcd75a7b1e82d970eec4b
MD2 (message digest) = ab4f496bfb2a530b219ff33031fe06b0
*** checksum after padding =  539ba695f264f365bcabc5c8b10913c7
MD2 (abcdefghijklmnopqrstuvwxyz) = 65182bb8c569485fcba44dbc66a02b56
*** checksum after padding =  365fe0617f5f56a56090af1cfd6caac3
MD2 (ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789) = 
a1ccc835ea9654d6a2926c21f0b20813
*** checksum after padding =  9acf39425d22c4e3b4ddbdc563d23716
MD2 
(12345678901234567890123456789012345678901234567890123456789012345678901234567890)
 = 8f1f49dc8de490b9aa7c99cec3fbccdf

As you can see, the checksums start to go wrong when we hit 16 bytes.

So, let us turn our attention to the checksum function.

Here's the python i wrote:

def checksum_old(c, buf): # c is checksum array, buf is input block
    l = c[-1]
    for i in xrange(digest_size):
        l = S[(buf[i] ^ l)]
        c[i] = l

Here's the C from the RFC:

    unsigned int i, j, t;
    t = checksum[15];
    for (i = 0; i < 16; i++)
        t = checksum[i] ^= PI_SUBST[block[i] ^ t];

Spot the difference. Yes, the assignment into the checksum array is a ^=, 
not a straight = - checksum bytes get set to 
current-value-of-checksum-byte xor S-box-transformation-of (input-byte xor 
accumulator). Translating that into python, we get:

def checksum(c, buf):
    l = c[-1]
    for i in xrange(digest_size):
        l = S[(buf[i] ^ l)] ^ c[i]
        c[i] = l

And when we put that back into the code, we get the right digests out. 
Victory!

However, here's what the pseudocode in the RFC says:

  For j = 0 to 15 do
 Set c to M[i*16+j].
 Set C[j] to S[c xor L].
 Set L to C[j].
   end /* of loop on j */

I certainly don't see any sign of a xor with the 
current-value-of-checksum-byte in there - it looks like the C and 
pseudocode in the RFC don't match up.

And, yes, googling for RFC 1319 errata brings up a report correcting 
this. They really ought to amend RFCs to mention errata!

Correct code here:

http://urchin.earth.li/~twic/md2.py

tom

-- 
Mathematics is the door and the key to the sciences. -- Roger Bacon
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: try: except never:

2006-01-10 Thread Tom Anderson
On Tue, 10 Jan 2006, Duncan Booth wrote:

 Paul Rubin wrote:

 Hallvard B Furuseth [EMAIL PROTECTED] writes:
 class NeverRaised(Exception): pass
 for ex in ZeroDivisionError, NeverRaised:

 Heh.  Simple enough.  Unless some obstinate person raises it anyway...

 Hmm, ok, how's this?:

def NeverRaised():
  class blorp(Exception): pass
  return blorp
for ex in ZeroDivisionError, NeverRaised():
  ...

Nice.

 Or you can create an unraisable exception:

class NeverRaised(Exception):
    def __init__(self, *args):
        raise RuntimeError('NeverRaised should never be raised')

Briliant! Although i'd be tempted to define an UnraisableExceptionError to 
signal what's happened. Or ...

class ImpossibleException(Exception):
    def __init__(self, *args):
        raise ImpossibleException, args

Although crashing the interpreter is probably overkill.

tom

-- 
Like Kurosawa i make mad films; okay, i don't make films, but if i did
they'd have a samurai.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Help wanted with md2 hash algorithm

2006-01-07 Thread Tom Anderson
On Fri, 6 Jan 2006 [EMAIL PROTECTED] wrote:

 below you find my simple python version of MD2 algorithm
 as described in RFC1319  (http://rfc1319.x42.com/MD2).
 It produces correct results for strings shorter than 16 Bytes and wrong
 results for longer strings.

 I can't find what's wrong.

 Can anybody help?

Okay, i've reimplemented the code from scratch, based on the RFC, without 
even looking at your code, as a basis for comparison.

The trouble is, i get exactly the same results as you!

Here's mine:

http://urchin.earth.li/~twic/md2.py

I guess the thing to do is extract the C code from the RFC and compile it, 
verify that it works, then stick loads of print statements in the C and 
the python, to see where the states of the checksum engines diverge.

tom

-- 
Death to all vowels! The Ministry of Truth says vowels are plus
undoublethink. Vowels are a Eurasian plot! Big Brother, leading us proles
to victory!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Calling GPL code from a Python application

2006-01-04 Thread Tom Anderson
On Wed, 4 Jan 2006, Mike Meyer wrote:

 Terry Hancock [EMAIL PROTECTED] writes:

 It is interesting to note that the FSF holds the position that the 
 language that gives you this right *doesn't* -- it just clarifies the 
 fact that you already hold that right, because it is provided by fair 
 use.  Their position is that it is not possible to restrict the *use* 
 of software you have legally acquired, because copyright only controls 
 copying.

 I believe there is precedent that contradicts the FSF's
 position. There are two arguments against it:

 1) Executing software involves several copy operations. Each of those
   potentially violate the copyright, and hence the copyright holder
   can restrict execution of a program.

 2) Executing a program is analogous to a performance of the software.
   Copyright includes limits on performances, so the copyright holder
   can place limits on the execution of the software.

 Personally, I agree with the FSF - if own a copy of a program, executing 
 it should be fair use.

I'm with you - i don't accept either of those legal arguments. The copying 
that copyright talks about is the making of copies which can be 
distributed - copies which are the equivalent of the original. It doesn't 
mean the incidental, transient copies made during use - otherwise, it 
would be illegal to read a book, since a copy of the text is transiently 
made in your visual cortex, or to listen to a record, since a copy of the 
music is transiently made in the pattern of sound waves in the air. The 
performance that the law talks about is not like execution, but is 
communication, and so a form of copying - by performing a play, you're 
essentially giving a copy of the text to the audience. Executing a program 
doesn't communicate it to any third parties.

Of course, in practice, it matters rather little whether i accept either 
of those, since i'm not a judge trying the relevant test case!

 While I'm here, I'll point out the the address space argument is 
 specious. What if I bundle a standalone GPL'ed application with my own 
 application, and distribute binaries for a machine that has a shared 
 address space? By that criteria, I'd have to GPL my code for the 
 distribution for the shared address space machine, but not for a Unix 
 system. I'm not buying that.

I also agree that the address space thing is bunk. What if i write a 
CORBA/RPC/COM/etc wrapper round some GPL'd library, release that under the 
GPL, then write my non-GPL'd program to access the wrapped library via a 
socket? Or if i write a wrapper application that takes a function name and 
some parameters on the command line, calls that function, and writes the 
result to stdout, then access it via popen? I get the use of the library, 
without sharing its address space!

On the flip side, we could argue that an application which uses a dynamic 
library *is* a derivative work, since we need a header file from the 
library to compile it, and that header file is covered by the GPL. What 
happpens when you compile with a non-GPL but compatible header (say, one 
you've clean-roomed) but link to a GPL library at runtime, though?

tom

-- 
I am the best at what i do.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Memoization and encapsulation

2006-01-04 Thread Tom Anderson
On Wed, 4 Jan 2006 [EMAIL PROTECTED] wrote:

 I think python is broken here-- why aren't lists hashable, or why isn't
 there a straightforward way to make memoised() work?

a = [1, 2, 3]
d = {a: "foo"}
a[0] = 0
print d[a]

I feel your pain, but i don't think lists (and mutable objects generally) 
being unhashable is brokenness. I do think there's room for a range of 
opinion, though, and i'm not sure what i think is right.
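In the meantime, the usual workaround is to freeze the mutable argument into a hashable key - a tuple - before caching on it. A sketch (this memoised decorator is illustrative, not the one from the thread):

```python
def memoised(fn):
    """Illustrative single-argument memoiser; freezes lists into tuples
    so they can serve as dict keys."""
    cache = {}
    def wrapper(arg):
        key = tuple(arg) if isinstance(arg, list) else arg
        if key not in cache:
            cache[key] = fn(arg)
        return cache[key]
    return wrapper

calls = []

@memoised
def total(xs):
    calls.append(xs)      # record real invocations, to show caching works
    return sum(xs)

print(total([1, 2, 3]))   # 6, computed
print(total([1, 2, 3]))   # 6, from the cache
print(len(calls))         # 1
```

The cost is that the cache no longer tracks later mutations of the list - which is arguably the behaviour you wanted anyway.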

tom

-- 
Rapid oxidation is the new black. -- some Mike
-- 
http://mail.python.org/mailman/listinfo/python-list


Filename case-insensitivity on OS X

2006-01-03 Thread Tom Anderson
Afternoon all,

MacOS X seems to have some heretical ideas about the value of case in 
paths - it seems to believe that it doesn't exist, more or less, so 
"touch foo FOO" touches just one file, you can't have both 'makefile' and 
'Makefile' in the same directory, 
os.path.exists(some_valid_path.upper()) returns True even when 
os.path.split(some_valid_path.upper())[1] in 
os.listdir(os.path.split(some_valid_path)[0]) returns False, etc 
(although, of course, "ls *.txt" doesn't mention any of those .TXT files 
lying around).

Just to prove it, here's what unix (specifically, linux) does:

[EMAIL PROTECTED]:~$ uname
Linux
[EMAIL PROTECTED]:~$ python
Python 2.3.5 (#2, Sep  4 2005, 22:01:42)
[GCC 3.3.5 (Debian 1:3.3.5-13)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> filenames = os.listdir(".")
>>> first = filenames[0]
>>> first in filenames
True
>>> first.upper() in filenames
False
>>> os.path.exists(os.path.join(".", first))
True
>>> os.path.exists(os.path.join(".", first.upper()))
False


And here's what OS X does:

Hooke:~ tom$ uname
Darwin
Hooke:~ tom$ python
Python 2.4.1 (#2, Mar 31 2005, 00:05:10)
[GCC 3.3 20030304 (Apple Computer, Inc. build 1666)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> filenames = os.listdir(".")
>>> first = filenames[0]
>>> first in filenames
True
>>> first.upper() in filenames
False
>>> os.path.exists(os.path.join(".", first))
True
>>> os.path.exists(os.path.join(".", first.upper()))
True


Sigh. Anyone got any bright ideas for working around this, specifically 
for os.path.exists? I was hoping there was some os.path.actualpath, so i 
could say:

def exists_dontignorecase(path):
    return os.path.exists(path) and (path == os.path.actualpath(path))

Java has a java.io.File.getCanonicalPath method that does this, but i 
can't find an equivalent in python - is there one?

I can emulate it like this:

def _canonicalise(s, l):
    s = s.lower()
    for t in l:
        if s == t.lower():
            return t
    raise ValueError, ("could not canonicalise string", s)

def canonicalpath(path):
    if (path in ("/", "")):
        return path
    parent, child = os.path.split(path)
    cparent = canonicalpath(parent)
    cchild = _canonicalise(child, os.listdir(cparent))
    return os.path.join(cparent, cchild)

Or, more crudely, do something like this:

def exists_dontignorecase(path):
    dir, f = os.path.split(path)
    return f in os.listdir(dir)

But better solutions are welcome.
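The crude listdir approach does work portably, for what it's worth, since os.listdir always reports each entry's real on-disk case. A self-contained demonstration (using a temporary directory rather than the home directory above):

```python
import os
import tempfile

def exists_dontignorecase(path):
    """True only if the final path component matches a directory entry
    byte-for-byte, regardless of filesystem case-insensitivity."""
    directory, name = os.path.split(path)
    return name in os.listdir(directory or ".")

d = tempfile.mkdtemp()
open(os.path.join(d, "Makefile"), "w").close()

print(exists_dontignorecase(os.path.join(d, "Makefile")))  # True
print(exists_dontignorecase(os.path.join(d, "MAKEFILE")))  # False, even on HFS+
```

Note it only checks the last component; for a full-path check you still need the recursive canonicalisation above.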

Thanks,
tom

-- 
Infantry err, infantry die. Artillery err, infantry die. -- IDF proverb
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Spiritual Programming (OT, but Python-inspired)

2006-01-03 Thread Tom Anderson
On Mon, 2 Jan 2006 [EMAIL PROTECTED] wrote:

 In this sense, we are like the ghost in the machine of a computer
 system running a computer program, or programs, written in a procedural
 language and style.

Makes sense - i heard that Steve Russell invented continuations after 
reading the Tibetan Book of the Dead.

tom

-- 
Chance? Or sinister scientific conspiracy?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: itertools.izip brokeness

2006-01-03 Thread Tom Anderson
On Tue, 3 Jan 2006, it was written:

 [EMAIL PROTECTED] writes:

 The problem is that sometimes, depending on which file is the shorter, 
 a line ends up missing, appearing neither in the izip() output, or in 
 the subsequent direct file iteration.  I would guess that it was in 
 izip's buffer when izip terminates due to the exception on the other 
 file.

 A different possible long term fix: change StopIteration so that it
 takes an optional arg that the program can use to figure out what
 happened.  Then change izip so that when one of its iterator args runs
 out, it wraps up the remaining ones in a new tuple and passes that
 to the StopIteration it raises.

+1

I think you also want to send back the items you read out of the iterators 
which are still alive, which otherwise would be lost. Here's a somewhat 
minimalist (but tested!) implementation:

def izip(*iters):
    while True:
        z = []
        try:
            for i in iters:
                z.append(i.next())
            yield tuple(z)
        except StopIteration:
            raise StopIteration, z

The argument you get back with the exception is z, the list of items read 
before the first empty iterator was encountered; if you still have your 
array iters hanging about, you can find the iterator which stopped with 
iters[len(z)], the ones which are still going with iters[:len(z)], and the 
ones which are in an uncertain state, since they were never tried, with 
iters[(len(z) + 1):]. This code could easily be extended to return more 
information explicitly, of course, but simple, sparse, etc.
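As an aside for later readers: under PEP 479 (Python 3.7 and up) a generator may no longer terminate by raising StopIteration with a payload - it turns into a RuntimeError - so a present-day version of the izip sketch above has to hand the leftovers back some other way, for instance on the iterator object itself (izip_report is a hypothetical name):

```python
class izip_report:
    """Like the izip sketch above, but records the partial tuple that had
    been read when the first exhausted input was hit."""
    def __init__(self, *iterables):
        self.iters = [iter(it) for it in iterables]
        self.leftovers = None  # set once an input runs dry

    def __iter__(self):
        return self

    def __next__(self):
        z = []
        for it in self.iters:
            try:
                z.append(next(it))
            except StopIteration:
                # items already pulled from earlier inputs this round
                self.leftovers = z
                raise
        return tuple(z)

z = izip_report([1, 2, 3], ['a'])
print(list(z))      # [(1, 'a')]
print(z.leftovers)  # [2]
```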

 You would want some kind of extended for-loop syntax (maybe involving 
 the new with statement) with a clean way to capture the exception 
 info.

How about for ... except?

for z in izip(a, b):
    lovingly_fondle(z)
except StopIteration, leftovers:
    angrily_discard(leftovers)

This has the advantage of not giving entirely new meaning to an existing 
keyword. It does, however, afford the somewhat dubious use:

for z in izip(a, b):
    lovingly_fondle(z)
except ValueError, leftovers:
    pass # execution should almost certainly never get here

Perhaps that form should be taken as meaning:

try:
    for z in izip(a, b):
        lovingly_fondle(z)
except ValueError, leftovers:
    pass # execution could well get here if the fondling goes wrong

Although i think it would be more strictly correct if, more generally, it 
made:

for LOOP_VARIABLE in ITERATOR:
    SUITE
except EXCEPTION:
    HANDLER

Work like:

try:
    while True:
        try:
            LOOP_VARIABLE = ITERATOR.next()
        except EXCEPTION:
            raise __StopIteration__, sys.exc_info()
        except StopIteration:
            break
        SUITE
except __StopIteration__, exc_info:
    somehow_set_sys_exc_info(exc_info)
    HANDLER

As it stands, throwing a StopIteration in the suite inside a for loop 
doesn't terminate the loop - the exception escapes; by analogy, the 
for-except construct shouldn't trap exceptions from the loop body, only 
those raised by the iterator.

tom

-- 
Chance? Or sinister scientific conspiracy?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Filename case-insensitivity on OS X

2006-01-03 Thread Tom Anderson
On Tue, 3 Jan 2006, Scott David Daniels wrote:

 Tom Anderson wrote:

 Java has a java.io.File.getCanonicalPath method that does this, but i can't 
 find an equivalent in python - is there one?

 What's wrong with: os.path.normcase(path) ?

It doesn't work.

Hooke:~ tom$ uname
Darwin
Hooke:~ tom$ python
Python 2.4.1 (#2, Mar 31 2005, 00:05:10)
[GCC 3.3 20030304 (Apple Computer, Inc. build 1666)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> path = os.path.join(".", os.listdir(".")[0])
>>> path
'./.appletviewer'
>>> os.path.normcase(path)
'./.appletviewer'
>>> os.path.normcase(path.upper())
'./.APPLETVIEWER'


I'm not entirely sure what normcase is supposed to do - the documentation 
says "Normalize case of pathname.  Has no effect under Posix", which is 
less than completely illuminating.

tom

-- 
It involves police, bailiffs, vampires and a portal to hell under a
tower block in Hackney.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Filename case-insensitivity on OS X

2006-01-03 Thread Tom Anderson
On Tue, 3 Jan 2006, Dan Sommers wrote:

 On Tue, 03 Jan 2006 15:21:19 GMT,
 Doug Schwarz [EMAIL PROTECTED] wrote:

 Strictly speaking, it's not OS X, but the HFS file system that is case
 insensitive.

Aaah, of course. Why on earth didn't Apple move to UFS/FFS/whatever with 
the switch to OS X?

 You can use other file systems, such as UNIX File System.  Use Disk 
 Utility to create a disk image and then erase it (again, using Disk 
 Utility) and put UFS on it.  You'll find that "touch foo FOO" will 
 create two files.

 You may also find some native Mac OS X applications failing in strange 
 ways.

Oh, that's why. :(

tom

-- 
It involves police, bailiffs, vampires and a portal to hell under a
tower block in Hackney.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Memoization and encapsulation

2006-01-01 Thread Tom Anderson
On Sat, 31 Dec 2005 [EMAIL PROTECTED] wrote:

just I actually prefer such a global variable to the default arg
just trick. The idiom I generally use is:

just _cache = {}
just def func(x):
just result = _cache.get(x)
just if result is None:
just result = x + 1  # or a time consuming calculation...
just _cache[x] = result
just return result

 None of the responses I've seen mention the use of decorators such as the
 one shown here:

http://wiki.python.org/moin/PythonDecoratorLibrary

 While wrapping one function in another is obviously a bit slower, you can
 memoize any function without tweaking its source.

I'd definitely say this is the way to go.

def memoised(fn):
    cache = {}
    def memoised_fn(*args):
        if args in cache:
            return cache[args]
        else:
            rtn = fn(*args)
            cache[args] = rtn
            return rtn
    return memoised_fn

@memoised
def func(x):
    return x + 1 # or a time-consuming calculation
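
As a quick check (mine, not from the decorator library page) that the 
cache really does short-circuit repeat calls:

```python
def memoised(fn):
    # same decorator as above, repeated so this snippet stands alone
    cache = {}
    def memoised_fn(*args):
        if args in cache:
            return cache[args]
        else:
            rtn = fn(*args)
            cache[args] = rtn
            return rtn
    return memoised_fn

calls = []

@memoised
def succ(x):
    calls.append(x)  # record each real invocation of the body
    return x + 1

results = [succ(1), succ(1), succ(2), succ(1)]
# results == [2, 2, 3, 2], but the body only ran for 1 and 2
```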

tom

-- 
Exceptions say, there was a problem. Someone must deal with it. If you
won't deal with it, I'll find someone who will.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to remove duplicated elements in a list?

2005-12-20 Thread Tom Anderson
On Mon, 19 Dec 2005, Brian van den Broek wrote:

 [EMAIL PROTECTED] said unto the world upon 2005-12-19 02:27:
 Steve Holden wrote:
 
 Kevin Yuan wrote:
 
 How to remove duplicated elements in a list? eg.
 [1,2,3,1,2,3,1,2,1,2,1,3] - [1,2,3]?
 Thanks!!
 
   list(set([1,2,3,1,2,3,1,2,1,2,1,3]))
 [1, 2, 3]
 
 Would this have the chance of changing the order ? Don't know if he
 wants to maintain the order or don't care though.

 For that worry:

 orig_list = [3,1,2,3,1,2,3,1,2,1,2,1,3]
 new_list = list(set(orig_list))
 new_list.sort(cmp= lambda x,y: cmp(orig_list.index(x), 
 orig_list.index(y)))
 new_list
 [3, 1, 2]


Ah, that gives me an idea:

>>> import operator
>>> orig_list = [3,1,2,3,1,2,3,1,2,1,2,1,3]
>>> new_list = map(operator.itemgetter(1),
...                filter(lambda (i, x): i == orig_list.index(x),
...                enumerate(orig_list)))
>>> new_list
[3, 1, 2]

This is a sort of decorate-fondle-undecorate, where the fondling is 
filtering on whether this is the first occurrence of the value. This 
is, IMHO, a clearer expression of the original intent - "how can i remove 
such-and-such elements from a list" is begging for filter(), i'd say.

My code is O(N**2), a bit better than your O(N**2 log N), but we can get 
down to O(N f(N)), where f(N) is the complexity of set.__in__ and set.add, 
using a lookaside set sort of gizmo:

>>> orig_list = [3,1,2,3,1,2,3,1,2,1,2,1,3]
>>> seen = set()
>>> def unseen(x):
...     if (x in seen):
...         return False
...     else:
...         seen.add(x)
...         return True
...
>>> new_list = filter(unseen, orig_list)
>>> new_list
[3, 1, 2]

Slightly tidier like this, i'd say:

>>> orig_list = [3,1,2,3,1,2,3,1,2,1,2,1,3]
>>> class seeingset(set):
...     def see(self, x):
...         if (x in self):
...             return False
...         else:
...             self.add(x)
...             return True
...
>>> new_list = filter(seeingset().see, orig_list)
>>> new_list
[3, 1, 2]


tom

-- 
Hit to death in the future head
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: getopt and options with multiple arguments

2005-12-20 Thread Tom Anderson
On Mon, 19 Dec 2005, [EMAIL PROTECTED] wrote:

 I want to be able to do something like:

 myscript.py * -o outputfile

 and then have the shell expand the * as usual, perhaps to hundreds of 
 filenames. But as far as I can see, getopt can only get one argument 
 with each option. In the above case, there isn't even an option string 
 before the *, but even if there was, I don't know how to get getopt to 
 give me all the expanded filenames in an option.

I'm really surprised that getopt doesn't handle this properly by default 
(so getopt.getopt mimics unices with crappy getopts - since when was that 
a feature?), but as Steven pointed out, getopt.gnu_getopt will float your 
boat.

I have an irrational superstitious fear of getopt, so this is what i use 
(it returns a list of arguments, followed by a dict mapping flags to 
values; it only handles long options, but uses a single dash for them, as 
is, for some reason, the tradition in java, where i grew up):

def arguments(argv, expand=True):
    argv = list(argv)
    args = []
    flags = {}
    while (len(argv) > 0):
        arg = argv.pop(0)
        if (arg == "--"):
            args.extend(argv)
            break
        elif (expand and arg.startswith("@")):
            if (len(arg) > 1):
                arg = arg[1:]
            else:
                arg = argv.pop(0)
            argv[0:0] = list(stripped(file(arg)))
        elif (arg.startswith("-") and (len(arg) > 1)):
            arg = arg[1:]
            if (":" in arg):
                key, value = arg.split(":")
            else:
                key = arg
                value = ""
            flags[key] = value
        else:
            args.append(arg)
    return args, flags

def stripped(f):
    """Return an iterator over the strings in the iterable f in which
    strings are stripped of #-delimited comments and leading and
    trailing whitespace, and blank strings are skipped.

    """
    for line in f:
        if ("#" in line): line = line[:line.index("#")]
        line = line.strip()
        if (line == ""): continue
        yield line
    raise StopIteration

As a bonus, you can say "@foo" or "@ foo" to mean "insert the lines contained 
in file foo in the command line here", which is handy if, say, you have a 
file containing a list of files to be processed, and you want to invoke a 
script to process them, or if you want to put some standard flags in a 
file and pull them in on the command line. Yes, you could use xargs for 
this, but this is a bit easier. If you don't want this, delete the elif 
block mentioning the @, and the stripped function. A slightly neater 
implementation not involving list.pop also then becomes possible.
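
By way of illustration, here's a condensed copy of the parser (minus the 
@-expansion branch, so it stands alone) applied to a made-up command line:

```python
def arguments(argv):
    # condensed version of the parser above, without @-file expansion
    argv = list(argv)
    args = []
    flags = {}
    while len(argv) > 0:
        arg = argv.pop(0)
        if arg == "--":
            # everything after "--" is a plain argument, even "-x"
            args.extend(argv)
            break
        elif arg.startswith("-") and len(arg) > 1:
            arg = arg[1:]
            if ":" in arg:
                key, value = arg.split(":")
            else:
                key, value = arg, ""
            flags[key] = value
        else:
            args.append(arg)
    return args, flags

args, flags = arguments(["a.txt", "-o:out.txt", "-v", "--", "-literal"])
# args == ['a.txt', '-literal'], flags == {'o': 'out.txt', 'v': ''}
```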

tom

-- 
Hit to death in the future head
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: putenv

2005-12-20 Thread Tom Anderson
On Tue, 20 Dec 2005, Steve Holden wrote:

 Mike Meyer wrote:
 Terry Hancock [EMAIL PROTECTED] writes:
 
 On Tue, 20 Dec 2005 05:35:48 -
 Grant Edwards [EMAIL PROTECTED] wrote:
 
 On 2005-12-20, [EMAIL PROTECTED] [EMAIL PROTECTED]
 wrote:
 
 I have csh script that calls a bunch of python programs and I'd like 
 to use env variables as kind of a global variable that I can pass 
 around to the pythong scripts.
 
 You can't change the environment of the parent process.
 
 There is an evil trick, however:
 
 Instead of setting the environment directly, have the python program 
 return csh code to alter the environment the way you want, then call 
 the python code by sourcing its output:
 
 source `my_script.py`
 
 Does this actually work? It looks to me like you need two levels:
 my_script.py creates a file, then outputs the name of the file, as the
 csh source command reads commands from the file named as an argument.
 
 To be able to output the commands directly, you'd need to use the eval
 command, not the source command.

 I suspect the trick that Terry was thinking of was eval, not source. You are 
 correct in saying he'd need to create a file to source.

True. The downside of eval is that it doesn't (well, in bash, anyway) 
handle line breaks properly (for some value of 'properly') - it seems to 
treat them as linear whitespace, not line ends. I was about to suggest:

source <(my_script.py)

As a way to use source to run the script's output, but that seems not to 
work. I think <() might be a bashism anyway.

tom

-- 
Hit to death in the future head
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: urllib.urlopen

2005-12-18 Thread Tom Anderson
On Sat, 17 Dec 2005, Dennis Lee Bieber wrote:

 (Now there is an interesting technical term:
 #define ERROR_ARENA_TRASHED 7)

FreeBSD at one point had an EDOOFUS; Apple kvetched about this being 
offensive, so it was changed to EDONTPANIC.

I shitteth thee not.

tom

-- 
information distribution, vox humana, deviation, handle, feed, l.g. **
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: const objects (was Re: Death to tuples!)

2005-12-14 Thread Tom Anderson
On Wed, 14 Dec 2005, Steven D'Aprano wrote:

 On Wed, 14 Dec 2005 10:57:05 +0100, Gabriel Zachmann wrote:

 I was wondering why python doesn't contain a way to make things const?

 If it were possible to declare variables at the time they are bound 
 to objects that they should not allow modification of the object, then 
 we would have a concept _orthogonal_ to data types themselves and, as a 
 by-product, a way to declare tuples as constant lists.

 In an earlier thread, somebody took me to task for saying that Python 
 doesn't have variables, but names and objects instead.

I'd hardly say it was a taking to task - that phrase implies 
authoritativeness on my part! :)

 This is another example of the mental confusion that occurs when you 
 think of Python having variables.

What? What does this have to do with it? The problem here - as Christopher 
and Magnus point out - is the conflation in the OP's mind of the idea of a 
variable, and of the object referenced by that variable. He could have 
expressed the same confusion using your names-values-and-bindings 
terminology - just replace 'variable' with 'name'. The expression would be 
nonsensical, but it's nonsensical in the variables-objects-and-pointers 
terminology too.

 Some languages have variables. Some do not.

Well, there is the lambda calculus, i guess ...

tom

-- 
The sky above the port was the colour of television, tuned to a dead
channel
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: IsString

2005-12-14 Thread Tom Anderson
On Tue, 13 Dec 2005, Fredrik Lundh wrote:

 Steve Holden wrote:

 In Python a name (*not* a variable, though people do talk loosely 
 about instance variables and class variables just to be able to use 
 terms familiar to users of other to languages)  is simply *bound* to a 
 value. The only storage that is required, therefore, is enough to hold 
 a pointer (to the value currently bound to the name).

 in tom's world, the value of an object is the pointer to the object, not 
 the object itself,

If you meant "the value of a *variable* is a pointer to an object, not the 
object itself", then bingo, yes, that's what it's like in my world.

 so I'm not sure he can make sense of your explanation.

The explanation makes perfect sense - i think the names-values-bindings 
terminology is consistent, correct and clear. It's just that i think that 
the variables-objects-pointers terminology is equally so, so i object to 
statements like python is not pass-by-value.

tom

-- 
The sky above the port was the colour of television, tuned to a dead
channel
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: IsString

2005-12-14 Thread Tom Anderson
On Tue, 13 Dec 2005, Xavier Morel wrote:

 Tom Anderson wrote:

 In what sense are the names-bound-to-references-to-objects not 
 variables?

 In the sense that a variable has various meta-informations (at least a 
 type)

No. In a statically typed language (or possibly only a manifestly typed 
language), a variable has a type; in an untyped language, it doesn't.

 while a Python name has no information. A Python name would be 
 equivalent to a C void pointer, it can mean *any*thing and has no 
 value/meaning by itself, only the object it references has.

Quite right - so it's also equivalent to a LISP, Smalltalk or Objective C 
(to mention but a few) variable?

tom

-- 
The sky above the port was the colour of television, tuned to a dead
channel
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: IsString

2005-12-14 Thread Tom Anderson
On Tue, 13 Dec 2005, Mike Meyer wrote:

 You can show the same difference in behavior between Python and C (for 
 example) without using a function call.

Really? You certainly don't do that with the code below.

 Here's C:

 #include <assert.h>

 main() {
  int i, *ref ;
  i = 1 ;
  ref = &i ;   /* Save identity of i */

Here, ref is a reference to a variable.

  i = 2 ;
  assert(ref == &i) ;

Here, you're comparing the addresses of variables.

 }

 This runs just fine; i is the same object throughout the program.

 On the other hand, the equivalent Python:

 i = 1
 ref = id(i)# Save the identity of i

Here, ref is a reference to a value.

 i = 2
 assert ref == id(i)

Here, you're comparing values.

 Traceback (most recent call last):
  File stdin, line 1, in ?
 AssertionError


 Blows up - i is no longer the same object.

Right, because the two bits of code are doing quite different things.

 Python does call by reference, which means that it passes pointers to 
 objects by value.

That's not what call by reference is - call by reference is passing 
pointers to *variables* by value.

 C is call by value, faking call by reference by passing reference 
 values. The real difference is that in C, you can get a reference to a 
 variable to pass, allowing you to change the variable. In python, you 
 can't get a reference to a name (one of the reasons we call them names 
 instead of variables), so you can't pass a value that will let the 
 called function change it.

Kinda. Here's a python translation of Steven's incrementing function 
example:

def increment(n):
    """Add one to the argument changing it in place."""
    # python (rightly) doesn't have references to variables
    # so i will use a 2-tuple (namespace, name) to fake them
    # n should be such a 2-tuple
    n_namespace = n[0]
    n_name = n[1]
    n_namespace[n_name] += 1

x = 1
increment((locals(), "x"))
assert x == 2

This is an evil, festering, bletcherous hack, but it is a direct 
translation of the use of pass-by-reference in C.

As a bonus, here's a similarly literal python translation of your C 
program:

 i = 1
 ref = i
 i = 2
 assert ref == i

tom

-- 
The sky above the port was the colour of television, tuned to a dead
channel
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: IsString

2005-12-14 Thread Tom Anderson
On Wed, 14 Dec 2005, Steven D'Aprano wrote:

 On Tue, 13 Dec 2005 15:28:32 +, Tom Anderson wrote:

 On Tue, 13 Dec 2005, Steven D'Aprano wrote:

 On Mon, 12 Dec 2005 18:51:36 -0600, Larry Bates wrote:

 [snippidy-doo-dah]

 I had the same thought, but reread the post.  He asks if a given
 variable is a character or a number.  I figured that even if he is
 coming from another language he knows the difference between a given
 variable and the contents of a give variable.  I guess we will
 see ;-).  This list is so good, he gets BOTH questions answered.

 The problem is, Python doesn't have variables (although it is
 oh-so-tempting to use the word, I sometimes do myself). It has names in
 namespaces, and objects.

 In what sense are the names-bound-to-references-to-objects not variables?

 Because saying Python has variables leads to nonsense like the following:

 [snip]
 That's why, for instance, Python is neither call by reference nor call
 by value, it is call by object.

 No, python is call by value, and it happens that all values are
 pointers.

 All values in Python are pointers???

Right.

 So when I write:

 name = "spam spam spam spam"

 the value of the variable name is a pointer, and not a string. Riiight.

Right.

 Call by value and call by reference have established meanings in 
 computer science,

Right.

 and Python doesn't behave the same as either of them.

Wrong. Python behaves exactly like call by value, just like Smalltalk, 
Objective C, LISP, Java, and even C.

 Consider the following function:

 def modify(L):
     """Modify a list and return it."""
     L.append(None); return L

 If I call that function:

 mylist = range(10**10) # it is a BIG list
 anotherlist = modify(mylist)

 if the language is call by value, mylist is DUPLICATED before being
 passed to the function.

Wrong. The value of mylist is a pointer to a list, and that's what's 
passed to the function. The same analysis applies to the rest of your 
example.

 The conceptual problem you are having is that you are conflating the 
 object model of Python the language with the mechanism of the underlying 
 C implementation, which does simply pass pointers around.

No, i'm not, i'm really not. Thinking in terms of variables, pointers and 
objects is a simple, consistent and useful abstract model of computation 
in python. If you like, we can use the word 'reference' instead of 
'pointer' - i guess a lot of people who came from C (which i didn't) are 
hung up on the idea that a pointer is a memory address, rather than just a 
conceptual thing which goes from a variable to an object; the trouble is 
that then we remind people of 'call by reference', and it all goes to pot.

I think the background thing is the kicker here. I'm guessing you come 
from C, where pointers are physical and explicit, you can have a variable 
which really does contain an object, etc, and so for you, applying those 
terms to python is awkward. I come from java, where all pointers are 
abstract (in the sense of being opaque) and implicit, and variables only 
ever contain pointers (unless they're primitive - but that's an 
implementation detail), so the terminology carries over to python quite 
naturally.

 I'm sure this has been argued over many times here, and we still all 
 have our different ideas, so please just ignore this post!

 I'd love to, but unfortunately I've already hit send on my reply.

Fair enough. Sorry about all this. In future, i'm going to send posts 
which i *know* will generate heat but no light straight to /dev/null ...

tom

-- 
The literature, especially in recent years, has come to resemble `The
Blob', growing and consuming everything in its path, and Steve McQueen
isn't going to come to our rescue. -- The Mole
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: IsString

2005-12-14 Thread Tom Anderson
On Tue, 13 Dec 2005, Steve Holden wrote:

 Tom Anderson wrote:
 On Tue, 13 Dec 2005, Steven D'Aprano wrote:
 
 On Mon, 12 Dec 2005 18:51:36 -0600, Larry Bates wrote:
 
 [snippidy-doo-dah]
 
 I had the same thought, but reread the post.  He asks if a given 
 variable is a character or a number.  I figured that even if he is 
 coming from another language he knows the difference between a given 
 variable and the contents of a give variable.  I guess we will see 
 ;-).  This list is so good, he gets BOTH questions answered.
 
 The problem is, Python doesn't have variables (although it is 
 oh-so-tempting to use the word, I sometimes do myself). It has names 
 in namespaces, and objects.
 
 In what sense are the names-bound-to-references-to-objects not variables?

 In a very important sense, one which you should understand in order to 
 understand the nature of Python.

 In C

Stop. How am i going to understand the nature of python by reading about 
C? Python is not C. What C does in the privacy of its own compilation unit 
is of no concern to us.

 if you declare a variable as (for example) a character string of length 
 24, the compiler will generate code that allocates 24 bytes to this 
 variable on the stack frame local to the function in which it's 
 declared. Similarly if you declare a variable as a double-length 
 floating point number the compiler will emit code that allocates 16 
 bytes on the local stack-frame.

True but irrelevant.

 In Python a name [...] is simply *bound* to a value. The only storage 
 that is required, therefore, is enough to hold a pointer (to the value 
 currently bound to the name). Thus assignment (i.e. binding to a name, 
 as opposed to binding to an element of a data structure) NEVER copies the 
 object, it simply stores a pointer to the bound object in the part of 
 the local namespace allocated to that name.

Absolutely true. I'm not saying your terminology is wrong - i'm pointing 
out that mine is also right.

Basically, we're both saying:

In python, the universe consists of things; in order to manipulate 
them, programs use hands, which hold things - the program is expressed as 
actions on hands, which direct actions on things at runtime. Although it 
appears at first glance that there is a direct correspondence between 
hands and things, it is crucial to realise that the relationship is 
mediated by a holding - the hand identifies a particular holding, which in 
turn identifies a particular thing. So, when we make a function call, and 
specify hands as parameters, it is not the hands themselves, *or* the 
things, that get passed to the function - it's the holdings. Similarly, 
when we make an assignment, we are not assigning a thing - no things are 
touched by an assignment - but a holding, so that the hand assigned to 
ends up gripping a different thing.

There is in fact another layer of indirection - the programmer refers to 
hands using strings, but this is just part of the language used to express 
programs textually: the correspondence between these strings and the hands 
they refer to is called a manual. The manual which applies at any point in 
a program is determined lexically - it is the manual corresponding to the 
function enclosing that point, or the global manual, if it is at the top 
level.



Where you can substitute either of:

steves_terminology = {
    "thing": "value",
    "hand": "name",
    "hold": "are bound to",
    "holding": "binding",
    "gripping": "being bound to",
    "manual": "namespace"
}

toms_terminology = {
    "thing": "object",
    "hand": "variable",
    "hold": "point to",
    "holding": "pointer",
    "gripping": "pointing to",
    "manual": "scope"
}

Using:

def substitute(text, substitutions):
    substituands = substitutions.keys()
    # to handle substituands which are prefixes of other substituands:
    substituands.sort(lambda a, b: -cmp(len(a), len(b)))
    for substituand in substituands:
        text = text.replace(substituand, substitutions[substituand])
    return text
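
The length-descending sort matters because "hold" is a prefix of 
"holding": without it, the shorter substituand could clobber the longer 
one. A self-contained check (with a trimmed table, and the sort spelled 
with key= so it also runs on python 3, which dropped sort's cmp= 
argument):

```python
def substitute(text, substitutions):
    # longest substituands first, so that e.g. "hold" cannot
    # clobber "holding"
    for substituand in sorted(substitutions, key=len, reverse=True):
        text = text.replace(substituand, substitutions[substituand])
    return text

toms = {"thing": "object", "hand": "variable",
        "holding": "pointer", "hold": "point to"}
out = substitute("a hand may hold a thing via a holding", toms)
# out == 'a variable may point to a object via a pointer'
```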

I'd then point out that my terminology is the one used in all other 
programming languages, including languages whose model is the same as 
python's, and so we should use it for consistency's sake. I guess the 
argument for your terminology is that it's less confusing to C programmers 
who don't realise that the * in *foo is now implicit.

 It be a subtle difference, but an important one.
 
 No, it's just spin, bizarre spin for which i can see no reason. Python 
 has variables.

 You appear very confident of your ignorance ;-)

You appear to be very liberal with your condescension.

Steering rapidly away from further ad hominem attacks ...

 I'm sure this has been argued over many times here, and we still all have 
 our different ideas, so please just ignore this post!

 Couldn't!

 I do apologise, though, for any implication your assertions are based on 
 ignorance because you do demonstrate quite a sophisticated

Re: IsString

2005-12-13 Thread Tom Anderson
On Tue, 13 Dec 2005, Steven D'Aprano wrote:

 On Mon, 12 Dec 2005 18:51:36 -0600, Larry Bates wrote:

 [snippidy-doo-dah]

 I had the same thought, but reread the post.  He asks if a given 
 variable is a character or a number.  I figured that even if he is 
 coming from another language he knows the difference between a given 
 variable and the contents of a give variable.  I guess we will 
 see ;-).  This list is so good, he gets BOTH questions answered.

 The problem is, Python doesn't have variables (although it is 
 oh-so-tempting to use the word, I sometimes do myself). It has names in 
 namespaces, and objects.

In what sense are the names-bound-to-references-to-objects not variables?

 It be a subtle difference, but an important one.

No, it's just spin, bizarre spin for which i can see no reason. Python has 
variables.

 That's why, for instance, Python is neither call by reference nor call 
 by value, it is call by object.

No, python is call by value, and it happens that all values are pointers. 
Just like java, but without the primitive types, and like LISP, and like a 
load of other languages. Python's parameter passing is NO DIFFERENT to 
that in those languages, and those languages are ALL described as 
call-by-value, so to claim that python does not use call-by-reference but 
some random new 'call-by-object' convention is incorrect, unneccessary, 
confusing and silly.

/rant

I'm sure this has been argued over many times here, and we still 
all have our different ideas, so please just ignore this post!

tom

-- 
So the moon is approximately 24 toasters from Scunthorpe.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python is incredible!

2005-12-13 Thread Tom Anderson
On Mon, 12 Dec 2005, Xavier Morel wrote:

 Luis M. Gonzalez wrote:

 You are not the first lisper who fell inlove with Python...
 Check this out:
 http://www.paulgraham.com/articles.html

 Paul Graham is not in love with Python though, he's still very much in love 
 with Lisp.

 He merely admits being unfaithful to Lisp from time to time (and clearly 
 states that Python is one of the non-Lisp languages he likes best).

Oh come on - he loves LISP but he plays away with python every chance he 
gets? What he has with LISP is a hollow sham - he's only keeping up the 
pretense for the children.

;)

tom

-- 
So the moon is approximately 24 toasters from Scunthorpe.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python is incredible!

2005-12-13 Thread Tom Anderson
On Tue, 13 Dec 2005, Cameron Laird wrote:

 In article [EMAIL PROTECTED],
 Tom Anderson  [EMAIL PROTECTED] wrote:
 On Mon, 12 Dec 2005, Cameron Laird wrote:

 While there is indeed much to love about Lisp, please be aware
 that meaningful AI work has already been done in Python

 Wait - meaningful AI work has been done?

 I richly deserved that.  As penance, I follow-up with URL: 
 http://www.robotwisdom.com/ai/ .

I think that document actually sells AI a little short: it's true that 
little progress has been made with language or reasoning, but vision's 
actually done rather well; the recent winning of the Grand Challenge drive 
across the Mojave is proof of that.

But then, i don't think AI was ever really the goal of the AI movement - 
it was basically a time when DARPA gathered together smart, curious 
people, and threw torrents of resources at them to use as they pleased. We 
didn't get AI out of it, but we did get a hell of a lot of cool stuff. It 
was a bit like the Apollo programme, but without the air force dudes 
planting flags at the end. An AI refugee, who worked at SAIL in the 70s, 
recently told me "AI was always just a sandpit, now it's become a tarpit - 
the clever people have moved on" - because it was the environment and the 
opportunity to do neat stuff, rather than AI per se, that drove them.

tom

-- 
So the moon is approximately 24 toasters from Scunthorpe.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how does exception mechanism work?

2005-12-12 Thread Tom Anderson
On Mon, 12 Dec 2005, it was written:

 [EMAIL PROTECTED] writes:

 Is this model correct or wrong? Where can I read about the mechanism 
 behind exceptions?

 Usually you push exception handlers and finally clauses onto the 
 activation stack like you push return addresses for function calls. When 
 something raises an exception, you scan the activation stack backwards, 
 popping stuff from it as you scan and executing finally clauses as you 
 find them, until you find a handler for the raised exception.

That varies an awful lot, though - AIUI, in java, the catch blocks are 
specified sort of in the same place as the code; a method definition 
consists of bytecode, a pile of metadata, and an exception table, which 
says 'if an exception of type x happens at a bytecode in the range a to b, 
jump to bytecode c'. When the exception-handling machinery is walking the 
stack, rather than looking at some concrete stack of exception handlers, 
it walks the stack of stack frames (or activation records or whatever you 
call them), and for each one, follows the pointer to the relevant method 
definition and inspects its exception table. Finally blocks are handled by 
putting the finally's code right after the try's code in the normal flow 
of execution, then concocting an exception handler for the try block which 
points into the finally block, so however the try block finishes, 
execution goes to the finally block.

The advantage of this approach over an explicit stack of handlers is that, 
although unwinding the stack is perhaps a bit slower, due to having to 
chase more pointers to get to the exception table, there's zero work to be 
done to set up a try block, and since executing a try is a lot more 
frequent than executing a throw-catch, that's a win.

Of course, that's how the conceptual virtual machine does it; real 
implementations don't necessarily do that. That said, it is a traditional 
superstition in java that a try block is essentially free, which would 
suggest that this sort of implementation is common. Indeed, i see no 
reason why it wouldn't be - i think the push-a-handler style seen in C/C++ 
implementations is only necessary because of the platform ABI, which 
doesn't usually mandate a standard layout for per-function metadata.
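
For concreteness, the table-driven scheme can be sketched in a few lines 
of python - a toy model of the idea (the names are mine), not how any 
real VM lays it out:

```python
# toy model of a JVM-style exception table: each row says "if an
# exception of type etype occurs at a pc in [start, end), jump to
# the handler at pc target"
exception_table = [
    (0, 4, ValueError, 10),
    (0, 8, KeyError, 20),
]

def find_handler(pc, exc):
    # scan the method's table in order, taking the first row whose
    # pc range and exception type both match
    for start, end, etype, target in exception_table:
        if start <= pc < end and isinstance(exc, etype):
            return target
    return None  # no handler here: unwind to the caller's frame

h1 = find_handler(2, ValueError("boom"))  # 10
h2 = find_handler(6, ValueError("boom"))  # None - would unwind
h3 = find_handler(6, KeyError("boom"))    # 20
```

Note that setting up a try in this model really is free: the table is 
built once, and only consulted when something is actually thrown.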

tom

-- 
limited to concepts that are meta, generic, abstract and philosophical --
IEEE SUO WG
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pattern matching with string and list

2005-12-12 Thread Tom Anderson
On Mon, 12 Dec 2005 [EMAIL PROTECTED] wrote:

 I'd need to perform simple pattern matching within a string using a list 
 of possible patterns. For example, I want to know if the substring 
 starting at position n matches any of the string I have a list, as 
 below:

 sentence = "the color is $red"
 patterns = ["blue", "red", "yellow"]
 pos = sentence.find($)

I assume that's a typo for sentence.find('$'), rather than some new 
syntax i've not learned yet!

 # here I need to find whether what's after 'pos' matches any of the
 strings of my 'patterns' list
 bmatch = ismatching( sentence[pos:], patterns)

 Is an equivalent of this ismatching() function existing in some Python
 lib?

I don't think so, but it's not hard to write:

def ismatching(target, patterns):
    for pattern in patterns:
        if target.startswith(pattern):
            return True
    return False

You don't say what bmatch should be at the end of this, so i'm going with 
a boolean; it would be straightforward to return the pattern which 
matched, or the index of the pattern which matched in the pattern list, if 
that's what you want.
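
One wrinkle (my observation, not part of the question): sentence[pos:] 
still begins with the '$' itself, so unless the '$' is meant to be part 
of the patterns, you'd skip past it when calling:

```python
def ismatching(target, patterns):
    # the same function as above, repeated so this snippet runs alone
    for pattern in patterns:
        if target.startswith(pattern):
            return True
    return False

sentence = "the color is $red"
patterns = ["blue", "red", "yellow"]
pos = sentence.find("$")
bmatch = ismatching(sentence[pos + 1:], patterns)  # True
# ismatching(sentence[pos:], patterns) would be False, since
# "$red" doesn't start with "red"
```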

The tough guy way to do this would be with regular expressions (in the re 
module); you could do the find-the-$ and the match-a-pattern bit in one 
go:

import re
patternsRe = re.compile(r"\$(blue)|(red)|(yellow)")
bmatch = patternsRe.search(sentence)

At the end, bmatch is None if it didn't match, or an instance of re.Match 
(from which you can get details of the match) if it did.

If i was doing this myself, i'd be a bit cleaner and use non-capturing 
groups:

patternsRe = re.compile(r"\$(?:blue)|(?:red)|(?:yellow)")

And if i did want to capture the colour string, i'd do it like this:

patternsRe = re.compile(r"\$((?:blue)|(?:red)|(?:yellow))")

If this all looks like utter gibberish, DON'T PANIC! Regular expressions 
are quite scary to begin with (and certainly not very regular-looking!), 
but they're actually quite simple, and often a very powerful tool for text 
processing (don't get carried way, though; regular expressions are a bit 
like absinthe, in that a little helps your creativity, but overindulgence 
makes you use perl).

In fact, we can tame the regular expressions quite neatly by writing a 
function which generates them:

def regularly_express_patterns(patterns):
    pattern_regexps = map(
        lambda pattern: "(?:%s)" % re.escape(pattern),
        patterns)
    regexp = r"\$(" + "|".join(pattern_regexps) + ")"
    return re.compile(regexp)

patternsRe = regularly_express_patterns(patterns)
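
To see it doing its job - the function is repeated here so the snippet 
stands on its own, and the sentence is the one from the original post:

```python
import re

def regularly_express_patterns(patterns):
    # Escape each pattern, wrap it in a non-capturing group, and join
    # the groups into one big alternation after a literal "$".
    pattern_regexps = map(
        lambda pattern: "(?:%s)" % re.escape(pattern),
        patterns)
    regexp = r"\$(" + "|".join(pattern_regexps) + ")"
    return re.compile(regexp)

patternsRe = regularly_express_patterns(["blue", "red", "yellow"])
bmatch = patternsRe.search("the color is $red")
print(bmatch.group(1))  # prints red
```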

tom

-- 
limited to concepts that are meta, generic, abstract and philosophical --
IEEE SUO WG
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python is incredible!

2005-12-12 Thread Tom Anderson
On Mon, 12 Dec 2005, Cameron Laird wrote:

 While there is indeed much to love about Lisp, please be aware
 that meaningful AI work has already been done in Python

Wait - meaningful AI work has been done?

;)

tom

-- 
limited to concepts that are meta, generic, abstract and philosophical --
IEEE SUO WG
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python is incredible!

2005-12-12 Thread Tom Anderson
On Mon, 12 Dec 2005, Tolga wrote:

 I am using Common Lisp for a while and nowadays I've heard so much about 
 Python that finally I've decided to give it a try because

You read reddit.com, and you want to know why they switched?

 Python is not very far away from Lisp family.

That's an interesting assertion. LISP certainly had an influence on 
python, but i don't think it's really related - they're pretty different 
in fundamental ways.

On the other hand, i sort of see what you mean - it has this lightweight, 
magical feeling, a sense of effortless power, as LISP does.

 using Python is not programming, it IS a fun!

+1 QOTW.

 I'll be here!!!

Good to hear it - welcome!

tom

-- 
limited to concepts that are meta, generic, abstract and philosophical --
IEEE SUO WG
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Developing a network protocol with Python

2005-12-12 Thread Tom Anderson
On Mon, 12 Dec 2005, Laszlo Zsolt Nagy wrote:

 I think to be effective, I need to use TCP_NODELAY, and manually 
 buffered transfers.

Why?

 I would like to create a general messaging object that has methods like

 sendinteger
 recvinteger
 sendstring
 recvstring

Okay. So you're really developing a marshalling layer, somewhere between 
the transport and application layers - fair enough, there are a lot of 
protocols that do that.
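
To make that concrete, here's a minimal sketch of such a layer using the 
struct module - the Messenger name and the big-endian, length-prefixed 
wire format are my own assumptions, not anything from the thread:

```python
import struct

class Messenger(object):
    """Marshals integers and strings over a socket-like object that
    provides sendall() and recv()."""

    def __init__(self, sock):
        self.sock = sock

    def sendinteger(self, n):
        # 4-byte big-endian signed integer.
        self.sock.sendall(struct.pack(">i", n))

    def recvinteger(self):
        return struct.unpack(">i", self._recvall(4))[0]

    def sendstring(self, s):
        # Length prefix followed by UTF-8 bytes.
        data = s.encode("utf-8")
        self.sendinteger(len(data))
        self.sock.sendall(data)

    def recvstring(self):
        return self._recvall(self.recvinteger()).decode("utf-8")

    def _recvall(self, n):
        # recv() may return fewer bytes than asked for, so loop.
        chunks = []
        while n > 0:
            chunk = self.sock.recv(n)
            if not chunk:
                raise EOFError("connection closed mid-message")
            chunks.append(chunk)
            n -= len(chunk)
        return b"".join(chunks)
```

One nice property of explicit framing like this is that it's trivial to 
implement in any language - which is exactly what pickle costs you.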

 To be more secure,

Do you really mean secure? I don't think using pickle will give you 
security. If you want security, run your protocol over an TLS/SSL 
connection.

If, however, you mean robustness, then this is a reasonable thing to do - 
it reduces the amount of code you have to write, and so reduces the number 
of bugs you'll write! One thing to watch out for, though, is the 
compatibility of the pickling at each end - i have no idea what the 
backwards- and forwards-compatibility of the pickle protocols is like, but 
you might find that if they're on different python versions, the ends 
won't understand each other. Defining your own protocol down to the 
bits-on-the-socket level would preclude that possibility.

 I think I can use this loads function to transfer more elaborate python 
 structures:

 def loads(s):
     """Loads an object from a string.
     @param s: The string to load the object from.
     @return: The object loaded from the string. This function will not
     unpickle globals and instances.
     """
     f = cStringIO.StringIO(s)
     p = cPickle.Unpickler(f)
     p.find_global = None
     return p.load()

I don't know the pickle module, so i can't comment on the code.

 Am I on the right way to develop a new protocol?

Aside from the versioning issue i mention above, you should bear in mind 
that using pickle will make it insanely hard to implement this protocol in 
any language other than python (unless someone's implemented a python 
pickle library in it - is there such a beast for any other language?). 
Personally, i'd steer clear of doing it like this, and try to use an 
existing, language-neutral generic marshalling layer. XML and ASN.1 would 
be the obvious ones, but i wouldn't advise using either of them, as 
they're abominations. JSON would be a good choice:

http://www.json.org/

If it's expressive enough for your objects. This is a stunningly simple 
format, and there are libraries for working with it for a wide range of 
languages.
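
For instance, using the json module of later Pythons (in 2005 you'd have 
reached for a third-party library such as simplejson) - the message 
content here is made up purely for illustration:

```python
import json

# A structured message serialised to a plain string that any language
# with a JSON library can parse back into the same structure.
message = {"command": "ping", "seq": 1, "args": ["host-a"]}
wire = json.dumps(message)
assert json.loads(wire) == message  # round-trips losslessly
print(wire)
```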

 Are there any common mistakes that programmers do?

The key one, i'd say, is not thinking about the future. Make sure your 
protocol is able to grow - use a version number, so peers can figure out 
what language they're talking, and perhaps an option negotiation 
mechanism, if you're doing anything complex enough to warrant it (hey, you 
could always start without it and add it in a later version!). Try to 
allow for addition of new commands, message types or whatever, and for 
extension of existing ones (within reason).

 Is there a howto where I can read more about this?

Not really - protocol design is a bit of a black art. Someone asked about 
this on comp.protocols.tcp-ip a while ago:

http://groups.google.co.uk/group/comp.protocols.tcp-ip/browse_thread/thread/39f810b43a6008e6/72ca111d67768b83

And didn't get much in the way of answers. Someone did point to this, 
though:

http://www.internet2.edu/~shalunov/writing/protocol-design.html

Although i don't agree with much of what that says.

tom

-- 
limited to concepts that are meta, generic, abstract and philosophical --
IEEE SUO WG
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: OO in Python? ^^

2005-12-12 Thread Tom Anderson

On Mon, 12 Dec 2005, Bengt Richter wrote:


 On Mon, 12 Dec 2005 01:12:26 +, Tom Anderson [EMAIL PROTECTED] wrote:

  --
  ø¤º°`°º¤øø¤º°`°º¤øø¤º°`°º¤øø¤º°`°º¤ø

 [OT} (just taking liberties with your sig ;-)
     ,@
   °º¤øø¤º°`°º¤øø¤º°P`°º¤ø,,y,,ø¤º°t`°º¤ø,,h,,ø¤º°o`°º¤ø,,n,,ø¤º°

The irony is that with my current news-reading setup, i see my own sig as 
a row of question marks, seasoned with backticks and commas. Your 
modification looks like it's adding a fish; maybe the question marks are a 
kelp bed, which the fish is exploring for food.


Hmm. Maybe if i look at it through Google Groups ...

Aaah! Very good!

However, given the context, i think it should be:

  ,OO
°º¤øø¤º°`°º¤øø¤º°P`°º¤ø,,y,,ø¤º°t`°º¤ø,,h,,ø¤º°o`°º¤ø,,n,,ø¤º°

!

tom

--
limited to concepts that are meta, generic, abstract and philosophical --
IEEE SUO WG-- 
http://mail.python.org/mailman/listinfo/python-list

Re: OO in Python? ^^

2005-12-12 Thread Tom Anderson
On Mon, 12 Dec 2005, Donn Cave wrote:

 In article [EMAIL PROTECTED],
 [EMAIL PROTECTED] (Alex Martelli) wrote:

 Tom Anderson [EMAIL PROTECTED] wrote:
...


 For example, if i wrote code like this (using python syntax):

 def f(x):
   return 1 + x

 The compiler would think "well, he takes some value x, and he adds it to 1
 and 1 is an integer, and the only thing you can add to an integer is
 another integer, so x must be an integer; he returns whatever 1 + x works
 out to, and 1 and x are both integers, and adding two integers makes an
 integer, so the return type must be integer"

 hmmm, not exactly -- Haskell's not QUITE as strongly/rigidly typed as
 this... you may have in mind CAML, which AFAIK in all of its variations
 (O'CAML being the best-known one) *does* constrain + so that the only
 thing you can add to an integer is another integer.  In Haskell, + can
 sum any two instances of types which meet typeclass Num -- including at
 least floats, as well as integers (you can add more types to a typeclass
 by writing the required functions for them, too).  Therefore (after
 loading in ghci a file with
 f x = x + 1
 ), we can verify...:

 *Main> :type f
 f :: (Num a) => a -> a

 But if you try
   f x = x + 1.0

 it's
   f :: (Fractional a) => a -> a

 I asserted something like this some time ago here, and was set straight, 
 I believe by a gentleman from Chalmers.  You're right that addition is 
 polymorphic, but that doesn't mean that it can be performed on any two 
 instances of Num.

That's what i understand. What it comes down to, i think, is that the 
Standard Prelude defines an overloaded + operator:

def __add__(x: int, y: int) -> int:
    "primitive operation to add two ints"

def __add__(x: float, y: float) -> float:
    "primitive operation to add two floats"

def __add__(x: str, y: str) -> str:
    "primitive operation to add two strings"

# etc

So that when the compiler hits the expression x + 1, it has a finite set 
of possible interpretations for '+', of which only one is legal - addition 
of two integers to yield an integer. Or rather, given that 1 can be an 
int or a float, it decides that x could be either, and so calls it alpha, 
where alpha is a number. Or something.

While we're on the subject of Haskell - if you think python's 
syntactically significant whitespace is icky, have a look at Haskell's 
'layout' - i almost wet myself in terror when i saw that!

tom

-- 
limited to concepts that are meta, generic, abstract and philosophical --
IEEE SUO WG
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: ANN: Dao Language v.0.9.6-beta is release!

2005-12-11 Thread Tom Anderson

On Sun, 11 Dec 2005, Steven D'Aprano wrote:


 On Sat, 10 Dec 2005 16:34:13 +, Tom Anderson wrote:

  On Sat, 10 Dec 2005, Sybren Stuvel wrote:

   Zeljko Vrba enlightened us with:

    Find me an editor which has folds like in VIM, regexp search/replace
    within two keystrokes (ESC,:), marks to easily navigate text in 2
    keystrokes (mx, 'x), can handle indentation-level matching as well as
    VIM can handle {}()[], etc.  And, unlike emacs, respects all (not
    just some) settings that are put in its config file. Something that
    works satisfactorily out-of-the box without having to learn a new
    programming language/platform (like emacs).

   Found it! VIM!

  ED IS THE STANDARD TEXT EDITOR.

 Huh! *Real* men edit their text files by changing bits on the hard disk
 by hand with a magnetized needle.


Hard disk? HARD DISK?

Hard disks are for losers who can't write tight code. *Real* men keep 
everything in core. Unless it's something performance-critical, in which 
case they fit it in the cache.


tom

--
ø¤º°`°º¤øø¤º°`°º¤øø¤º°`°º¤øø¤º°`°º¤ø-- 
http://mail.python.org/mailman/listinfo/python-list

Re: OO in Python? ^^

2005-12-11 Thread Tom Anderson

On Mon, 12 Dec 2005, Steven D'Aprano wrote:


 On Sun, 11 Dec 2005 05:48:00 -0800, bonono wrote:

  And I don't think Haskell make the programmer do a lot of work (just
  because of its static type checking at compile time).

 I could be wrong, but I think Haskell is *strongly* typed (just like
 Python), not *statically* typed.


Haskell is strongly and statically typed - very strongly and very 
statically!


However, what it's not is manifestly typed - you don't have to put the 
types in yourself; rather, the compiler works it out. For example, if i 
wrote code like this (using python syntax):


def f(x):
    return 1 + x

The compiler would think "well, he takes some value x, and he adds it to 1 
and 1 is an integer, and the only thing you can add to an integer is 
another integer, so x must be an integer; he returns whatever 1 + x works 
out to, and 1 and x are both integers, and adding two integers makes an 
integer, so the return type must be integer", and concludes that you meant 
(using Guido's notation):


def f(x: int) -> int:
    return 1 + x

Note that this still buys you type safety:

def g(a, b):
    c = "{" + a + "}"
    d = 1 + b
    return c + d

The compiler works out that c must be a string and d must be an int, then, 
when it gets to the last line, finds an expression that must be wrong, and 
refuses to accept the code.


This sounds like it wouldn't work for complex code, but somehow, it does. 
And somehow, it works for:


def f(x):
    return x + 1

Too. I think this is due to the lack of polymorphic operator overloading.

A key thing is that Haskell supports, and makes enormous use of, a 
powerful system of generic types; with:


def h(a):
    return a + a

There's no way to infer concrete types for h or a, so Haskell gets 
generic; it says "okay, so i don't know what type a is, but it's got to be 
something, so let's call it alpha; we're adding two alphas, and one thing 
i know about adding is that adding two things of some type makes a new 
thing of that type, so the type of some-alpha + some-alpha is alpha, so 
this function returns an alpha". ISTR that alpha gets written 'a, so this 
function is:


def h(a: 'a) -> 'a:
    return a + a

Although that syntax might be from ML. This extends to more complex 
cases, like:


def i(a, b):
    return [a, b]

In Haskell, you can only make lists of a homogenous type, so the compiler 
deduces that, although it doesn't know what type a and b are, they must be 
the same type, and the return value is a list of that type:


def i(a: 'a, b: 'a) -> ['a]:
    return [a, b]

And so on. I don't know Haskell, but i've had long conversations with a 
friend who does, which is where i've got this from. IANACS, and this could 
all be entirely wrong!


 At least the "What Is Haskell?" page at haskell.org describes the
 language as "strongly typed, non-strict, and allowing polymorphic typing".


When applied to functional languages, 'strict' (or 'eager') means that 
expressions are evaluated as soon as they are formed; 'non-strict' (or 
'lazy') means that expressions can hang around as expressions for a while, 
or even not be evaluated all in one go. Laziness is really a property of 
the implementation, not of the language - in an idealised pure functional 
language, i believe that a program can't actually tell whether the 
implementation is eager or lazy. However, it matters in practice, since a 
lazy language can do things like manipulate infinite lists.


tom

--
ø¤º°`°º¤øø¤º°`°º¤øø¤º°`°º¤øø¤º°`°º¤ø-- 
http://mail.python.org/mailman/listinfo/python-list

Re: ANN: Dao Language v.0.9.6-beta is release!

2005-12-10 Thread Tom Anderson
On Sat, 10 Dec 2005, Sybren Stuvel wrote:

 Zeljko Vrba enlightened us with:

 Find me an editor which has folds like in VIM, regexp search/replace 
 within two keystrokes (ESC,:), marks to easily navigate text in 2 
 keystrokes (mx, 'x), can handle indentation-level matching as well as 
 VIM can handle {}()[], etc.  And, unlike emacs, respects all (not just 
 some) settings that are put in its config file. Something that works 
 satisfactorily out-of-the box without having to learn a new programming 
 language/platform (like emacs).

 Found it! VIM!

ED IS THE STANDARD TEXT EDITOR.

tom

-- 
Argumentative and pedantic, oh, yes. Although it's properly called
correct -- Huge
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to get the extension of a filename from the path

2005-12-09 Thread Tom Anderson
On Thu, 8 Dec 2005, gene tani wrote:

 Lad wrote:

 what is a way to get the the extension of  a filename from the path?

 minor footnote: windows paths can be raw strings for os.path.split(),
 or you can escape /
 tho Tom's examp indicates unescaped, non-raw string works with
 splitext()

DOH. Yes, my path's got a tab in it, hasn't it!

tom

-- 
Women are monsters, men are clueless, everyone fights and no-one ever
wins. -- cleanskies
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Encoding of file names

2005-12-09 Thread Tom Anderson

On Thu, 8 Dec 2005, Martin v. Löwis wrote:


 utabintarbo wrote:

  Fredrik, you are a God! Thank You^3. I am unworthy /ass-kiss-mode

 For all those who followed this thread, here is some more explanation:

 Apparently, utabintarbo managed to get U+2592 (MEDIUM SHADE, a filled
 50% grayish square) and U+2524 (BOX DRAWINGS LIGHT VERTICAL AND LEFT, a
 vertical line in the middle, plus a line from that going left) into a
 file name. How he managed to do that, I can only guess: most likely, the
 Samba installation assumes that the file system encoding on the Solaris
 box is some IBM code page (say, CP 437 or CP 850). If so, the byte on
 disk would be \xb4. Where this came from, I have to guess further:
 perhaps it is ACUTE ACCENT from ISO-8859-*.

 Anyway, when he used listdir() to get the contents of the directory,
 Windows applies the CP_ACP encoding (known as "mbcs" in Python). For
 reasons unknown to me, the US and several European versions of XP map
 this to \xa6, VERTICAL BAR (I can somewhat see that as meaningful for
 U+2524, but not for U+2592).

 So when he then applies isfile to that file name, \xa6 is mapped to
 U+00A6, which then isn't found on the Samba side.

 So while Unicode here is the solution, the problem is elsewhere; most
 likely in a misconfiguration of the Samba server (which assumes some
 encoding for the files on disk, yet the AIX application uses a different
 encoding).


Isn't the key thing that Windows is applying a non-roundtrippable 
character encoding? If i've understood this right, Samba and Windows are 
talking in unicode, with these (probably quite spurious, but never mind) 
U+25xx characters, and Samba is presenting a quite consistent view of the 
world: there's a file called double bucky backslash grey box in the 
directory listing, and if you ask for a file called double bucky backslash 
grey box, you get it. Windows, however, maps that name to the 8-bit 
string double bucky backslash vertical bar, but when you pass *that* 
back to it, it gets encoded as the unicode string double bucky backslash 
vertical bar, which Samba then doesn't recognise.


I don't know what Windows *should* do here. I know it shouldn't do this - 
this leads to breaking of some very basic invariants about files and 
directories, and so the kind of confusion utabintarbo suffered. The 
solution is either to apply an information-preserving encoding (UTF-8, 
say), or to refuse to do it at all (ie, raise an error if there are 
unencodable characters), neither of which are particularly beautiful 
solutions. I think Windows is in a bit of a rock/hard place situation 
here, poor thing.


Incidentally, for those who haven't come across CP_ACP before, it's not 
yet another character encoding, it's a pseudovalue which means 'the 
system's current default character set'.


tom

--
Women are monsters, men are clueless, everyone fights and no-one ever
wins. -- cleanskies-- 
http://mail.python.org/mailman/listinfo/python-list

Validating an email address

2005-12-09 Thread Tom Anderson
Hi all,

A hoary old chestnut this - any advice on how to syntactically validate an 
email address? I'd like to support both the display-name-and-angle-bracket 
and bare-address forms, and to allow everything that RFC 2822 allows (and 
nothing more!).

Currently, i've got some regexps which recognise a common subset of 
possible addresses, but it would be nice to do this properly - i don't 
currently support quoted pairs, quoted strings, or whitespace in various 
places where it's allowed. Adding support for those things using regexps 
is really hard. See:

http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html

For a level to which i am not prepared to stoop.

I hear the email-sig are open to adding a validation function to the email 
package, if a satisfactory one can be written; i would definitely support 
their doing that.

tom

-- 
Women are monsters, men are clueless, everyone fights and no-one ever
wins. -- cleanskies
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: heartbeats

2005-12-09 Thread Tom Anderson
On Fri, 9 Dec 2005, Sybren Stuvel wrote:

 Yves Glodt enlightened us with:

 In detail I need a daemon on my central server which e.g. which in a 
 loop pings (not really ping but you know what I mean) each 20 seconds 
 one of the clients.

Do you mean pings one client every 20 sec, or each client every 20 sec?

 You probably mean really a ping, just not an ICMP echo request.

What's a real ping, if not an ICMP echo request? That's pretty much the 
definitive packet for internetwork groping as far as i know. I think that 
the more generic sense of ping is a later meaning (BICouldVeryWellBW).

 My central server, and this is important, should have a short timeout. 
 If one client does not respond because it's offline, after max. 10 
 seconds the central server should continue with the next client.

 I'd write a single function that pings a client and waits for a 
 response/timeout. It then should return True if the client is online, 
 and False if it is offline. You can then use a list of clients and the 
 filter() function, to retrieve a list of online clients.

That sounds like a good plan.

To do the timeouts, you want the settimeout method on socket:



import socket

def default_validate(sock):
    return True

def ping(host, port, timeout=10.0, validate=default_validate):
    """
    Ping a specified host on the specified port. The timeout (in
    seconds) and a validation function can be set; the validation
    function should accept a freshly opened socket and return True if
    it's okay, and False if not. This function returns True if the
    specified target can be connected to and yields a valid socket, and
    False otherwise.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
    except socket.error:
        return False
    ok = validate(sock)
    sock.close()
    return ok

A potential problem with this is that in the worst case, you'll be 
spending a little over ten seconds on each socket; if you have a lot of 
sockets, that might mean you're not getting through them fast enough. 
There are two ways round this: handle several pings in parallel using 
threads, or use non-blocking sockets to handle several at once with a 
single thread.
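
A sketch of the threaded option, assuming a plain TCP connect counts as a 
ping (this uses a simplified, validate-less ping; ping_all is a 
hypothetical helper name, not anything standard):

```python
import socket
import threading

def ping(host, port, timeout=10.0):
    # True if we can open a TCP connection to (host, port), else False.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
    except socket.error:
        return False
    finally:
        sock.close()
    return True

def ping_all(targets, timeout=10.0):
    # Ping every (host, port) pair in parallel; returns {target: bool}.
    results = {}
    def worker(target):
        results[target] = ping(target[0], target[1], timeout)
    threads = [threading.Thread(target=worker, args=(t,)) for t in targets]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

The whole sweep then takes roughly as long as the slowest single ping, 
rather than the sum of them all.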

tom

-- 
everything from live chats and the Web, to the COOLEST DISGUSTING
PORNOGRAPHY AND RADICAL MADNESS!!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Encoding of file names

2005-12-09 Thread Tom Anderson

On Fri, 9 Dec 2005, Martin v. Löwis wrote:


 Tom Anderson wrote:

  Isn't the key thing that Windows is applying a non-roundtrippable
  character encoding?

 This is a fact, but it is not a key thing. Of course Windows is applying
 a non-roundtrippable character encoding. What else could it do?


Well, i'm no great thinker, but i'd say that errors should never pass 
silently, and that in the face of ambiguity, one should refuse the 
temptation to guess. So, as i said in my post, if the name couldn't be 
translated losslessly, an error should be raised.


  I don't know what Windows *should* do here. I know it shouldn't do this
  - this leads to breaking of some very basic invariants about files and
  directories, and so the kind of confusion utabintarbo suffered.

 It always did this, and always will. Applications should stop using the
 *A versions of the API.


Absolutely true.

 If they continue to do so, they will continue to get bogus results in
 border cases.


No. The availability of a better alternative is not an excuse for 
gratuitous breakage of the worse alternative.


tom

--
Whose house? Run's house!-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Validating an email address

2005-12-09 Thread Tom Anderson
On Sat, 10 Dec 2005, Ben Finney wrote:

 Tom Anderson [EMAIL PROTECTED] wrote:

 A hoary old chestnut this - any advice on how to syntactically
 validate an email address?

 Yes: Don't.

URL:http://www.apps.ietf.org/rfc/rfc3696.html#sec-3

The IETF must have updated that RFC between you posting the link and me 
reading it, because that's not what it says. What it says is that the 
syntax for local parts is complicated, and many of the variations are 
actually used for reasons i can't even imagine, so they should be 
permitted. It doesn't say anything about not validating the local part 
against that syntax.

 Please, don't attempt to validate the local-part. It's not up to you 
 to decide what the receiving MTA will accept as a local-part,

Absolutely not - it's up to the IETF, and their decision is recorded in 
RFC 2822.

tom

-- 
Whose house? Run's house!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to get the extension of a filename from the path

2005-12-08 Thread Tom Anderson
On Thu, 8 Dec 2005, Lad wrote:

 what is a way to get the the extension of  a filename from the path?
 E.g., on my XP windows the path can be
 C:\Pictures\MyDocs\test.txt
 and I would like to get
 the the extension of  the filename, that is here
 txt

You want os.path.splitext:

>>> import os
>>> os.path.splitext("C:\Pictures\MyDocs\test.txt")
('C:\\Pictures\\MyDocs\test', '.txt')
>>> os.path.splitext("C:\Pictures\MyDocs\test.txt")[1]
'.txt'


 I would like that to work on Linux also

It'll be fine.
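
For the Windows path in question, it's worth using a raw string so the 
backslashes survive intact - a quick sketch:

```python
import os.path

# The r prefix stops "\t" in the path being read as a tab character.
root, ext = os.path.splitext(r"C:\Pictures\MyDocs\test.txt")
print(ext)  # prints .txt
```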

tom

-- 
[Philosophy] is kind of like being driven behind the sofa by Dr Who -
scary, but still entertaining. -- Itchyfidget
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Tabs bad (Was: ANN: Dao Language v.0.9.6-beta is release!)

2005-12-04 Thread Tom Anderson

On Sun, 4 Dec 2005, Björn Lindström wrote:

 This article should explain it:

 http://www.jwz.org/doc/tabs-vs-spaces.html

Ah, Jamie Zawinski, that well-known fount of sane and reasonable ideas.

It seems to me that the tabs-vs-spaces thing is really about who controls 
the indentation: with spaces, it's the writer, and with tabs, it's the 
reader. Does that match up with people's attitudes? Is it the case that 
the space cadets want to control how their code looks to others, and the 
tabulators want to control how others' code looks to them?


I wonder if there's a further correlation between preferring spaces to 
tabs and the GPL to the BSDL ...


tom

Lexicographical PS: 'tabophobia' is, apparently, fear of the 
neurodegenerative disorder tabes dorsalis.


--
3118110161  Pies-- 
http://mail.python.org/mailman/listinfo/python-list

Re: ANN: Dao Language v.0.9.6-beta is release!

2005-12-04 Thread Tom Anderson
On Sun, 4 Dec 2005 [EMAIL PROTECTED] wrote:

  you're about 10 years late

 The same could be said for hoping that the GIL will be eliminated.
 Utterly hopeless.

 Until... there was PyPy.  Maybe now it's not so hopeless.

No - structuring by indentation and the global lock are entirely different 
kettles of fish. The lock is an implementation detail, not part of the 
language, and barely even perceptible to users; indeed, Jython and 
IronPython, i assume, don't even have one. Structuring by indentation, on 
the other hand, is a part of the language, and a very fundamental one, at 
that. Python without structuring by indentation *is not* python.

Which is not to say that it's a bad idea - if it really is scaring off 
potential converts, then a dumbed-down dialect of python which uses curly 
brackets and semicolons might be a useful evangelical tool.

tom

-- 
3118110161  Pies
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: ANN: Dao Language v.0.9.6-beta is release!

2005-12-03 Thread Tom Anderson
On Fri, 2 Dec 2005, [EMAIL PROTECTED] wrote:

 Dave Hansen wrote:

 TAB characters are evil.  They should be banned from Python source 
 code. The interpreter should stop translation of code and throw an 
 exception when one is encountered.  Seriously.  At least, I'm serious 
 when I say that.  I've never seen TAB characters solve more problems 
 than they cause in any application.

 But I suspect I'm a lone voice crying in the wilderness.  Regards,

 You're not alone.

 I still don't get why there is still people using real tabs as
 indentation.

I use real tabs. To me, it seems perfectly simple - i want the line to be 
indented a level, so i use a tab. That's what tabs are for. And i've 
never, ever come across any problem with using tabs.

Spaces, on the other hand, can be annoying: using spaces means that the 
author's personal preference about how wide a tab should be gets embedded 
in the code, so if that's different to mine, i end up having to look at 
weird code. Navigating and editing the code with arrow-keys under a 
primitive editor, which one is sometimes forced to do, is also slower and 
more error-prone.

So, could someone explain what's so evil about tabs?

tom

-- 
Space Travel is Another Word for Love!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Comparison problem

2005-11-26 Thread Tom Anderson
Chris, as well as addressing what i think is causing your problem, i'm 
going to point out some bits of your code that i think could be polished a 
little. It's intended in a spirit of constructive criticism, so i hope you 
don't mind!

On Sat, 26 Nov 2005, Chris wrote:

 if item[0:1]=="-":

item[0:1] seems a rather baroque way of writing item[0]! I'd actually 
suggest writing this line like this:

if item.startswith("-"):

As i feel it's more readable.

 item=item[ :-7]
 item=item[1:]

You could just write:

item = item[1:-7]

For those two lines.

 infile=open(inventory,"r")

The "r" isn't necessary - reading is the default mode for files. You could 
argue that this documents your intentions towards the file, i suppose, but 
the traditional python idiom would leave it out.

 while infile:
  dummy=infile.readline()

The pythonic idiom for this is:

for dummy in infile:

Although i'd strongly suggest you change 'dummy' to a more descriptive 
variable name; i use 'line' myself.

Now, this is also the line that i think is at the root of your trouble: 
readline returns lines with the line-terminator ('\n' or whatever it is on 
your system) still on them. That gets you into trouble later - see below.

When i'm iterating over lines in a file, the first thing i do with the 
line is chomp off any trailing newline; the line after the for loop is 
typically:

line = line.rstrip("\n")

  if dummy=='':break

You don't by any chance mean 'continue' here, do you?

  print item
  print ", "+dummy
  if (dummy == item): # This comparison isn't working

This is where it all falls down - i suspect that what's happening here is 
that dummy has a trailing newline, and item doesn't, so although they look 
very similar, they're not the same string, so the comparison comes out 
false. Try throwing in that rstrip at the head of the loop and see if it 
fixes it.
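
Putting those suggestions together, the whole thing might look something 
like this - a sketch only, since the find_item name and the slice bounds 
are my guesses at your intent:

```python
def find_item(inventory_path, item):
    # Strip the leading "-" and the trailing seven characters, as in the
    # original code, then look for an exact match on any line.
    if item.startswith("-"):
        item = item[1:-7]
    infile = open(inventory_path)
    try:
        for line in infile:
            line = line.rstrip("\n")   # chomp the newline before comparing
            if line == "":
                continue
            if line == item:
                return True
    finally:
        infile.close()
    return False
```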

HTH.

tom

-- 
Gotta treat 'em mean to make 'em scream.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: icmp - should this go in itertools?

2005-11-26 Thread Tom Anderson
On Fri, 25 Nov 2005, Roy Smith wrote:

 Tom Anderson [EMAIL PROTECTED] wrote:

 It's modelled after the way cmp treats lists - if a and b are lists,
 icmp(iter(a), iter(b)) should always be the same as cmp(a, b).

 Is this any good? Would it be any use? Should this be added to itertools?

 Whatever happens, please name it something other than icmp.  When I read 
 icmp, I think Internet Control Message Protocol.

Heh! That's a good point. The trouble is, icmp is clearly the Right Thing 
to call it from the point of view of itertools, continuing the pattern of 
imap, ifilter, izip etc. Wouldn't it be clear from context that this was 
nothing to do with ICMP?
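
For reference, the function being discussed looks something like this - 
my reconstruction of the idea, not the implementation actually posted:

```python
def icmp(xs, ys):
    # Lexicographic comparison of two iterators, with cmp()-style
    # -1/0/1 results, so icmp(iter(a), iter(b)) matches cmp(a, b) on lists.
    sentinel = object()
    for x in xs:
        y = next(ys, sentinel)
        if y is sentinel:
            return 1            # ys ran out first, so xs is bigger
        if x != y:
            return -1 if x < y else 1
    # xs is exhausted; ys wins if it has anything left.
    return 0 if next(ys, sentinel) is sentinel else -1
```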

tom

-- 
Gotta treat 'em mean to make 'em scream.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: icmp - should this go in itertools?

2005-11-26 Thread Tom Anderson
On Sat, 26 Nov 2005, Diez B. Roggisch wrote:

 Tom Anderson wrote:

 Is this any good? Would it be any use? Should this be added to itertools?

 Whilst not a total itertools-expert myself, I have one little objection 
 with this: the comparison won't let me know how many items have been 
 consumed. And I end up with two streams that lack some common prefix 
 plus one field.

Good point. It would probably only be useful if you didn't need to do 
anything with the iterators afterwards.

One option - which is somewhat icky - would be to encode that in the 
return value; if n is the number of items read from both iterators, then 
if the first argument is smaller, the return value is -n, and if the 
second is smaller, it's n. The trouble is that you couldn't be sure 
exactly how many items had been read from the larger iterator - it could 
be n, if the values in the iterators differ, or n+1, if the values were 
the same but the larger one was longer.
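To make that icky option concrete, here's a sketch of the signed-count variant (hypothetical code, written with the modern next() builtin rather than the .next() method for brevity):

```python
def icmp_counted(a, b):
    # hypothetical variant of icmp: the magnitude of the result is the
    # number of items consumed before the comparison was decided
    def _cmp(x, y):
        return (x > y) - (x < y)
    n = 0                           # pairs read from both iterators so far
    for xa in a:
        try:
            xb = next(b)
        except StopIteration:
            return n + 1            # b ran out first, so a is larger
        n = n + 1
        d = _cmp(xa, xb)
        if d != 0:
            return n if d > 0 else -n
    try:
        next(b)
        return -(n + 1)             # a ran out first, so a is smaller
    except StopIteration:
        return 0
```

As it happens, in this sketch |result| is always the number of items drawn from the longer-consumed side (n when the values differ, n+1 when one iterator is a prefix of the other), which goes some way towards resolving the n-versus-n+1 ambiguity.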

 I'm just not sure if there is any usecase for that.

I used it in my ordered dictionary implementation; it was a way of 
comparing two 'virtual' lists that are lazily generated on demand.

I'll go away and think about this more.

tom

-- 
Gotta treat 'em mean to make 'em scream.


Re: Comparison problem

2005-11-26 Thread Tom Anderson
On Sat, 26 Nov 2005, Peter Hansen wrote:

 Tom Anderson wrote:
 On Sat, 26 Nov 2005, Chris wrote:
 
  if item[0:1]=='-':
 
 item[0:1] seems a rather baroque way of writing item[0]! I'd actually 
 suggest writing this line like this:

 Actually, it's not so much baroque as it is safe... item[0] will fail if 
 the string is empty, while item[0:1] will return '' in that case.

Ah i didn't realise that. Whether that's safe rather depends on what the 
subsequent code does with an empty string - an empty string might be some 
sort of error (in this particular case, it would mean that the loop test 
had gone wrong, since bool('') == False), and the slicing behaviour would 
constitute silent passing of an error.

But, more importantly, egad! What's the thinking behind having slicing 
behave like that? Anyone got any ideas? What's the use case, as seems to 
be the fashionable way of putting it these days? :)
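For reference, here's the behaviour in question pinned down (plain current-Python semantics, nothing hypothetical):

```python
s = "spam"

# slice bounds are silently clamped to the sequence's length ...
assert s[2:100] == "am"
# ... so item[0:1] is safe even on an empty string
assert ""[0:1] == ""

# whereas a single out-of-range index raises
try:
    ""[0]
except IndexError:
    pass
else:
    raise AssertionError("expected an IndexError")
```

One rationale commonly given is that a slice denotes a (possibly empty) subsequence rather than a single element, so there is always a sensible answer to return.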

tom

-- 
This should be on ox.boring, shouldn't it?


Re: Why are there no ordered dictionaries?

2005-11-25 Thread Tom Anderson
On Wed, 23 Nov 2005, Christoph Zwerschke wrote:

 Alex Martelli wrote:

 However, since Christoph himself just misclassified C++'s std::map as 
 ordered (it would be sorted in this new terminology he's now 
 introducing), it seems obvious that the terminological confusion is 
 rife.

 Speaking about ordered and sorted in the context of collections is 
 not a new terminology I am introducing, but seems to be pretty common in 
 computer science

This is quite true. I haven't seen any evidence for 'rife' 
misunderstanding of these terms.

That said ...

 Perhaps Pythonists are not used to that terminology, since they use the 
 term list for an ordered collection. An ordered dictionary is a 
 dictionary whose keys are a (unique) list. Sometimes it is also called a 
 sequence

Maybe we should call it a 'sequenced dictionary' to fit better with 
pythonic terminology?

tom

-- 
YOU HAVE NO CHANCE TO ARRIVE MAKE ALTERNATIVE TRAVEL ARRANGEMENTS. --
Robin May


Re: Why are there no ordered dictionaries?

2005-11-25 Thread Tom Anderson
On Wed, 23 Nov 2005, Carsten Haese wrote:

 On Wed, 2005-11-23 at 15:17, Christoph Zwerschke wrote:
 Bengt Richter wrote:

  E.g., it might be nice to have a mode that assumes d[key] is
  d.items()[k][1] when key is an integer, and otherwise uses dict lookup,
  for cases where the use case is just string dict keys.

 I also thought about that and I think PHP has that feature, but it's 
 probably better to withstand the temptation to do that. It could lead 
 to an awful confusion if the keys are integers.

 Thus quoth the Zen of Python:
 Explicit is better than implicit.
 In the face of ambiguity, refuse the temptation to guess.

 With those in mind, since an odict behaves mostly like a dictionary, [] 
 should always refer to keys. An odict implementation that wants to allow 
 access by numeric index should provide explicitly named methods for that 
 purpose.

+1

Overloading [] to sometimes refer to keys and sometimes to indices is a 
really, really, REALLY bad idea. Let's have it refer to keys, and do 
indices either via a sequence attribute or the return value of items().

More generally, if we're going to say odict is a subtype of dict, then we 
have absolutely no choice but to make the methods that it inherits behave 
the same way as in dict - that's what subtyping means. That means not 
doing funky things with [], returning a copy from items() rather than a 
live view, etc.

So, how do we provide mutatory access to the order of items? Of the 
solutions discussed so far, i think having a separate attribute for it - 
like items, a live view, not a copy (and probably being a variable rather 
than a method) - is the cleanest, but i am starting to think that 
overloading items to be a mutable sequence as well as a method is quite 
neat. I like it in that the it combines two things - a live view of the 
order and a copy of the order - that are really two aspects of one thing, 
which seems elegant. However, it does strike me as rather unpythonic; it's 
trying to cram a lot of functionality in an unexpected combination into 
one place. Sparse is better than dense and all that. I guess the thing to 
do is to try both out and see which users prefer.
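As a concrete point of comparison, here's a minimal sketch of the explicitly-named-methods alternative (key_at and item_at are names i've made up for illustration, not odict's real API):

```python
class SeqDict(dict):
    """Sketch: a dict that remembers insertion order; [] is always
    key-based, and positional access goes through named methods."""
    def __init__(self):
        dict.__init__(self)
        self._order = []
    def __setitem__(self, key, value):
        if key not in self:
            self._order.append(key)
        dict.__setitem__(self, key, value)
    def __delitem__(self, key):
        dict.__delitem__(self, key)
        self._order.remove(key)
    def key_at(self, i):
        # explicit index access, as argued for above
        return self._order[i]
    def item_at(self, i):
        return self._order[i], self[self._order[i]]
    def items(self):
        # a copy, not a live view - as inherited dict semantics demand
        return [(k, self[k]) for k in self._order]
```

Nothing here is surprising, which is rather the point: [] keeps its dict meaning throughout.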

tom

-- 
YOU HAVE NO CHANCE TO ARRIVE MAKE ALTERNATIVE TRAVEL ARRANGEMENTS. --
Robin May


Re: Why are there no ordered dictionaries?

2005-11-25 Thread Tom Anderson
On Wed, 23 Nov 2005, Christoph Zwerschke wrote:

 Tom Anderson wrote:

 I think it would be probably the best to hide the keys list from the 
 public, but to provide list methods for reordering them (sorting, slicing 
 etc.).

 one with unusual constraints, so there should be a list i can manipulate in 
 code, and which should of course be bound by those constraints.

 Think of it similar as the case of an ordinary dictionary: There is 
 conceptually a set here (the set of keys), but you cannot manipulate it 
 directly, but only through the according dictionary methods.

Which is a shame!

 For an ordedred dictionary, there is conceptually a list (or more 
 specifically a unique list). Again you should not manipulate it 
 directly, but only through methods of the ordered dictionary.

 This sounds at first more complicated, but is in reality more easy.

 For instance, if I want to put the last two keys of an ordered dict d at 
 the beginning, I would do it as d = d[-2:] + d[:-2].

As i mentioned elsewhere, i think using [] like this is a terrible idea - 
and definitely not easier.

 With the list attribute (called sequence in odict), you would have to 
 write: d.sequence = d.sequence[-2:] + d.sequence[:-2]. This is not only 
 longer to write down, but you also have to know that the name of the 
 attribute is sequence.

True, but that's not exactly rocket science. I think the rules governing 
when your [] acts like a dict [] and when it acts like a list [] are 
vastly more complex than the name of one attribute.

 Python's strength is that you don't have to keep many details in mind 
 because it has a small basic vocabulary and orthogonal use.

No it isn't - it's in having a wide set of basic building blocks which do 
one simple thing well, and thus which are easy to use, but which can be 
composed to do more complex things. What are other examples of this kind 
of 'orthogonal use'?

 I prefer the ordered dictionary does not introduce new concepts or 
 attributes if everything can be done intuitively with the existing 
 Python methods and operators.

I strongly agree. However, i don't think your overloading of [] is at all 
intuitive.

tom

-- 
YOU HAVE NO CHANCE TO ARRIVE MAKE ALTERNATIVE TRAVEL ARRANGEMENTS. --
Robin May


Re: Which License Should I Use?

2005-11-25 Thread Tom Anderson
On Fri, 25 Nov 2005, Robert Kern wrote:

 You may also want to read this Licensing HOWTO:

  http://www.catb.org/~esr/faqs/Licensing-HOWTO.html

 It's a draft, but it contains useful information.

It's worth mentioning that ESR, who wrote that, is zealously 
pro-BSD-style-license. That's not to say that the article isn't useful 
and/or balanced, but it's something to bear in mind while reading it.

tom

-- 
Science runs with us, making us Gods.


Re: Which License Should I Use?

2005-11-25 Thread Tom Anderson
On Fri, 25 Nov 2005, mojosam wrote:

 How do I decide on a license?

You decide on what obligations you wish to impose on licensees, then pick 
a license which embodies those. There are basically three levels of 
obligation:

1. None.

2. Derivatives of the code must be open source.

3. Derivatives of the code and any other code which uses it must be open 
source.

By 'derivatives', i mean modified versions. By 'open source', i really 
mean 'under the same license as the original code'.

So, the licenses corresponding to these obligations are:

1. A BSD-style license. I say 'BSD-style' because there are about a 
hojillion licenses which say more or less the same thing - and it's quite 
amazing just how many words can be spent spelling out the absence of 
obligations - but the grand-daddy of them all is the BSD license:

http://www.opensource.org/licenses/bsd-license.php

2. The GNU Lesser General Public License:

http://www.gnu.org/copyleft/lesser.html

3. The GNU General Public License:

http://www.gnu.org/copyleft/gpl.html

The GPL licenses place quite severe restrictions on the freedom of 
programmers using the code, but you often hear GNU people banging on about 
freedom - 'free software', 'free as in speech', etc. What you have to 
realise is that they're not talking about the freedom of the programmers, 
but about the freedom of the software. The logic, i think, is that the 
freedom of the code is the key to the freedom of the end-users: applying 
the GPL to your code means that other programmers will be forced to apply 
it to their code, which means that users of that code will get the 
benefits of open source.

Having said all that, you can only license software if you own the 
copyright on it, and as has been pointed out, in this case, you might not.

 Are there any web sites that summarize the pros and cons?

The GNU project has a quite useful list of licenses, with their takes on 
them:

http://www.gnu.org/licenses/license-list.html

Bear in mind that the GNU project is strongly in favour of the GPL, so 
they're perhaps not as positive about non-GPL licenses as would be fair.

This dude's written about this a bit:

http://zooko.com/license_quick_ref.html

 I guess I don't care too much about how other people use it.  These 
 things won't be comprehensive enough or have broad enough appeal that 
 somebody will slap a new coat of paint on them and try to sell them. I 
 guess I don't care if somebody incorporates them into something bigger. 
 If somebody were to add features to them, it would be nice to get the 
 code and keep the derivative work as open source, but I don't think that 
 matters all that much to me.  If somebody can add value and find a way 
 of making money at it, I don't think I'd be too upset.

To me, it sounds like you want a BSD-style license. But then i'm a BSD 
afficionado myself, so perhaps i would say that!

In fact, while were on the subject, let me plug my own license page:

http://urchin.earth.li/~twic/The_Amazing_Disappearing_BSD_License.html

I apply 0-clause BSD to all the code i release these days.

 I will be doing the bulk of the coding on my own time, because I need to 
 be able to take these tools with me when I change employers. However, 
 I'm sure that in the course of using these tools, I will need to spend 
 time on the job debugging or tweaking them.  I do not want my current 
 employer to have any claim on my code in any way.  Usually if you 
 program on company time, that makes what you do a work for hire. I 
 can't contaminate my code like that.  Does that mean the GPL is the 
 strongest defense in this situation?

The license you choose has absolutely no bearing on this. Either the 
copyright belongs to you, in which case you're fine, or to your employer, 
in which case you don't have the right to license it, so it's moot.

 Let's keep the broader issue of which license will bring about the fall 
 of Western Civilization

You mean the GPL?

 on the other thread.

Oops!

tom

-- 
Science runs with us, making us Gods.


icmp - should this go in itertools?

2005-11-25 Thread Tom Anderson
Hi all,

This is a little function to compare two iterators:



def icmp(a, b):
    for xa in a:
        try:
            xb = b.next()
            d = cmp(xa, xb)
            if (d != 0):
                return d
        except StopIteration:
            return 1
    try:
        b.next()
        return -1
    except StopIteration:
        return 0



It's modelled after the way cmp treats lists - if a and b are lists, 
icmp(iter(a), iter(b)) should always be the same as cmp(a, b).

Is this any good? Would it be any use? Should this be added to itertools?

tom

-- 
I content myself with the Speculative part [...], I care not for the
Practick. I seldom bring any thing to use, 'tis not my way. Knowledge
is my ultimate end. -- Sir Nicholas Gimcrack


Yet another ordered dictionary implementation

2005-11-25 Thread Tom Anderson
What up yalls,

Since i've been giving it all that all over the ordered dictionary thread 
lately, i thought i should put my fingers where my mouth is and write one 
myself:

http://urchin.earth.li/~twic/odict.py

It's nothing fancy, but it does what i think is right.

The big thing that i'm not happy with is the order list (what Larosa and 
Foord call 'sequence', i call 'order', just to be a pain); this is a list 
of keys, which for many purposes is ideal, but does mean that there are 
things you might want to do with the order that you can't do with normal 
python idioms. For example, say we wanted to move the last item in the 
order to be first; if this was a normal list, we'd say:

od.order.insert(0, od.order.pop())

But we can't do that here - the argument to the insert is just a key, so 
there isn't enough information to make an entry in the dict. To make up 
for this, i've added move and swap methods on the list, but this still 
isn't idiomatic.
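For what it's worth, the move and swap methods might look roughly like this (a guess at their shape - the actual code is in the odict.py linked above):

```python
class KeyOrder(list):
    """Sketch of the 'order' list of keys, with helpers for the
    reorderings that plain list idioms can't express safely."""
    def move(self, from_index, to_index):
        # relocate one key within the order
        self.insert(to_index, self.pop(from_index))
    def swap(self, i, j):
        self[i], self[j] = self[j], self[i]
```

Since the order list holds only keys, pure list operations like these never need to touch the dict side, which is what makes them safe where insert-of-a-popped-value isn't.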

In order to have idiomatic order manipulation, i think we need to make the 
order list a list of items - that is, (key, value) pairs. Then, there's 
enough information in the results of a pop to support an insert. This also 
allows us to implement the various other mutator methods on the order 
lists that i've had to rub out in my code.

However, this does seem somehow icky to me. I can't quite put my finger on 
it, but it seems to violate Once And Only Once. Also, even though the 
above idiom becomes possible, it leads to futile remove-reinsert cycles in 
the dict bit, which it would be nice to avoid.

Thoughts?

tom

-- 
I content myself with the Speculative part [...], I care not for the
Practick. I seldom bring any thing to use, 'tis not my way. Knowledge
is my ultimate end. -- Sir Nicholas Gimcrack


Re: Why are there no ordered dictionaries?

2005-11-25 Thread Tom Anderson
On Fri, 25 Nov 2005, Christoph Zwerschke wrote:

 Tom Anderson wrote:

 True, but that's not exactly rocket science. I think the rules governing 
 when your [] acts like a dict [] and when it acts like a list [] are vastly 
 more complex than the name of one attribute.

 I think it's not really rocket science either to assume that an ordered 
 dictionary behaves like a dictionary if you access items by subscription 
 and like a list if you use slices (since slice indexes must evaluate to 
 integers anyway, they can only be used as indexes, not as keys).

When you put it that way, it makes a certain amount of sense - [:] is 
always about index, and [] is always about key. It's still icky, but it is 
completely unambiguous.
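Spelled out as a sketch, the dispatch is a one-line type check (my illustration of the rule, not the proposed implementation):

```python
class SliceableDict(dict):
    """Sketch: d[key] is always dict lookup, d[i:j] always slices
    the key order (hypothetical, for illustration only)."""
    def __init__(self):
        dict.__init__(self)
        self._order = []
    def __setitem__(self, key, value):
        if key not in self:
            self._order.append(key)
        dict.__setitem__(self, key, value)
    def __getitem__(self, key):
        if isinstance(key, slice):
            # [:] is always about index: slice the key order
            return [(k, dict.__getitem__(self, k)) for k in self._order[key]]
        # [] is always about key - even when the key is an integer
        return dict.__getitem__(self, key)
```

Note that an integer key still goes down the key-lookup branch; only an actual slice object triggers positional access, which is what makes the rule unambiguous.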

tom

-- 
I content myself with the Speculative part [...], I care not for the
Practick. I seldom bring any thing to use, 'tis not my way. Knowledge
is my ultimate end. -- Sir Nicholas Gimcrack


Re: Backwards compatibility [was Re: is parameter an iterable?]

2005-11-22 Thread Tom Anderson
On Tue, 22 Nov 2005, Steven D'Aprano wrote:

 Are there practical idioms for solving the metaproblem solve problem X 
 using the latest features where available, otherwise fall back on older, 
 less powerful features?

 For instance, perhaps I might do this:

 try:
built_in_feature
 except NameError:
# fall back on a work-around
from backwards_compatibility import \
feature as built_in_feature

 Do people do this or is it a bad idea?

From some code i wrote yesterday, which has to run under 2.2:

try:
    True
except NameError:
    True = 1 == 1
    False = 1 == 0

Great minds think alike!

As for whether it's a bad idea, well, bad or not, it certainly seems like 
the least worst.

 Are there other techniques to use? Obviously refusing to run is a 
 solution (for some meaning of solution), it may even be a practical 
 solution for some cases, but is it the only one?

How about detecting which environment you're in, then running one of two 
entirely different sets of code? Rather than trying to construct modern 
features in the antique environment, write code for each, using the local 
idioms. The trouble with this is that you end up with massive duplication; 
you can try to factor out the common parts, but i suspect that the 
differing parts will be a very large fraction of the codebase.

 If I have to write code that can't rely on iter() existing in the 
 language, what should I do?

Can you implement your own iter()? I have no idea what python 2.0 was 
like, but would something like this work:

import sys
import types

class _iterator:
    def __init__(self, x):
        self.x = x
        self.j = 0
    def next(self):
        self.j = self.j + 1
        return self.x.next()
    def __getitem__(self, i):
        if (i != self.j):
            raise ValueError, "out of order iteration"
        try:
            return self.next()
        except StopIteration:
            raise IndexError
    def __iter__(self):
        return self
    # hopefully, we don't need this, but if we do ...
    def __len__(self):
        return sys.maxint # and rely on StopIteration to stop the loop

class _listiterator(_iterator):
    def next(self):
        try:
            item = self.x[self.j]
            self.j = self.j + 1
            return item
        except IndexError:
            raise StopIteration
    def __getitem__(self, i):
        if (i != self.j):
            raise ValueError, "out of order iteration"
        self.j = self.j + 1
        return self.x[i]

def iter(x):
    # if there's no hasattr, use explicit access and try-except blocks
    # handle iterators and iterables from the future
    if hasattr(x, "__iter__"):
        return _iterator(x.__iter__())
    # if there's no __getitem__ on lists, try x[0] and catch the exception
    # but leave the __getitem__ test to catch objects from the future
    if hasattr(x, "__getitem__"):
        return _listiterator(x)
    if type(x) == types.FileType:
        return _fileiterator(x) # you can imagine the implementation of this
    # insert more tests for specific types here as you like
    raise TypeError, "iteration over non-sequence"

?

NB haven't actually tried to run that code.

tom

-- 
I'm angry, but not Milk and Cheese angry. -- Mike Froggatt


Re: Any royal road to Bezier curves...?

2005-11-22 Thread Tom Anderson
On Tue, 22 Nov 2005, Warren Francis wrote:

 For my purposes, I think you're right about the natural cubic splines. 
 Guaranteeing that an object passes through an exact point in space will 
 be more immediately useful than trying to create rules governing where 
 control points ought to be placed so that the object passes close enough 
 to where I intended it to go.

Right so. I wrote that code the first time when i was in a similar spot 
myself - trying to draw maps with nice smooth roads etc based on a fairly 
sparse set of points - so i felt your pain.

 Thanks for the insight, I never would have found that on my own.  At 
 least not until Google labs comes out with a search engine that gives 
 names for what you're thinking of. ;-)

You're in for a wait - i think that feature's scheduled for summer 2006.

 I know this is a fairly pitiful request, since it just involves parsing 
 your code, but I'm new enough to this that I'd benefit greatly from an 
 couple of lines of example code, implementing your classes... how do I 
 go from a set of coordinates to a Natural Cubic Spline, using your 
 python code?

Pitiful but legit - i haven't documented that code at all well. If you go 
right to the foot of my code, you'll find a simple test routine, which 
shows you the skeleton of how to drive the code. It looks a bit like this 
(this is slightly simplified):

def test_spline():
    knots = [(0, 0), (0, 1), (1, 0), (0, -2), (-3, 0)] # a spiral
    trace = []
    c = NaturalCubicSpline(tuples2points(knots))
    u = 0.0
    du = 0.1
    lim = len(c) + du
    while (u < lim):
        p = c(u)
        trace.append(tuple(p))
        u = u + du
    return trace

tuples2points is a helper function which turns your coordinates from a 
list of tuples (really, an iterable of length-2 iterables) to a list of 
Points. The alternative way of doing it is something like:

curve = NaturalCubicSpline()
for x, y in knot_coords:
    curve.knots.append(Point(x, y))
do_something_with(curve)

tom

-- 
I DO IT WRONG!!!


Re: user-defined operators: a very modest proposal

2005-11-22 Thread Tom Anderson
On Tue, 22 Nov 2005, Steve R. Hastings wrote:

 User-defined operators could be defined like the following: ]+[

Eeek. That really doesn't look right.

Could you remind me of the reason we can't say [+]? It seems to me that an 
operator can never be a legal filling for an array literal or a subscript, 
so there wouldn't be ambiguity.

We could even just say that [?] is an array version of whatever operator ? 
is, and let python do the heavy lifting (excuse the pun) of looping it 
over the operands. [[?]] would obviously be a doubly-lifted version. 
Although that would mean [*] is a componentwise product, rather than an 
outer product, which wouldn't really help you very much! Maybe we could 
define {?} as the generalised outer/tensor version of the ? operator ...

 For improved readability, Python could even enforce a requirement that 
 there should be white space on either side of a user-defined operator. I 
 don't really think that's necessary.

Indeed, it would be extremely wrong - normal operators don't require that, 
and special cases aren't special enough to break the rules.

Reminds me of my idea for using spaces instead of parentheses for grouping 
in expressions, so a+b * c+d evaluates as (a+b)*(c+d) - one of my worst 
ideas ever, i'd say, up there with gin milkshakes.

 Also, there should be a way to declare what kind of precedence the 
 user-defined operators use.

Can't be done - different uses of the same operator symbol on different 
classes could have different precedence, right? So python would need to 
know what the class of the receiver is before it can work out the 
evaluation order of the expression; python does evaluation order at 
compile time, but only knows classes at execute time, so no dice.

Also, i'm pretty sure you could cook up a situation where you could 
exploit differing precedences of different definitions of one symbol to 
generate ambiguous cases, but i'm not in a twisted enough mood to actually 
work out a concrete example!

And now for something completely different.

For Py4k, i think we should allow any sequence of characters that doesn't 
mean something else to be an operator, supported with one special method 
to rule them all, __oper__(self, ator, and), so:

a + b

Becomes:

a.__oper__("+", b)

And:

a --{--@ b

Becomes:

a.__oper__("--{--@", b) # Euler's 'single rose' operator

Etc. We need to be able to distinguish a + -b from a +- b, but this is 
where i can bring my grouping-by-whitespace idea into play, requiring 
whitespace separating operands and operators - after all, if it's good 
enough for grouping statements (as it evidently is at present), it's good 
enough for expressions. The character ']' would be treated as whitespace, 
so a[b] would be handled as a.__oper__("[", b). Naturally, the . operator 
would also be handled through __oper__.

Jeff Epler's proposal to use unicode operators would synergise most 
excellently with this, allowing python to finally reach, and even surpass, 
the level of expressiveness found in languages such as perl, APL and 
INTERCAL.

tom

-- 
I DO IT WRONG!!!


Re: user-defined operators: a very modest proposal

2005-11-22 Thread Tom Anderson
On Tue, 22 Nov 2005 [EMAIL PROTECTED] wrote:

 Each unicode character in the class 'Sm' (Symbol,
 Math) whose value is greater than 127 may be used as a user-defined operator.

EXCELLENT idea, Jeff!

 Also, to accommodate operators such as u'\N{DOUBLE INTEGRAL}', which are
 not simple unary or binary operators, the character u'\N{NO BREAK SPACE}'
 will be used to separate arguments.  When necessary, parentheses will be
 added to remove ambiguity.  This leads naturally to expressions like
   \N{DOUBLE INTEGRAL} (y * x**2) \N{NO BREAK SPACE} dx \N{NO BREAK SPACE} dy
 (corresponding to the call (y*x**2).__u222c__(dx, dy)) which are clearly
 easy to love, except for the small issue that many inferior editors will
 not clearly display the \N{NO BREAK SPACE} characters.

Could we use '\u2202' instead of 'd'? Or, to be more correct, is there a 
d-which-is-not-a-d somewhere in the mathematical character sets? It would 
be very useful to be able to distinguish d'x', as it were, from 'dx'.

* Do we immediately implement the combination of operators with nonspacing
  marks, or defer it?

As long as you don't use normalisation form D, i'm happy.

* Should some of the unicode mathematical symbols be reserved for literals?
  It would be greatly preferable to write \u2205 instead of the other
  proposed empty-set literal notation, {-}.  Perhaps nullary operators
  could be defined, so that writing \u2205 alone is the same as
  __u2205__(), i.e., calling the nullary function, whether it is defined
  at the local, lexical, module, or built-in scope.

Sounds like a good idea. \u211D and relatives would also be a candidate 
for this treatment.

And for those of you out there who are laughing at this, i'd point out 
that Perl IS ACTUALLY DOING THIS.

tom

-- 
I DO IT WRONG!!!


Re: Why are there no ordered dictionaries?

2005-11-22 Thread Tom Anderson
On Tue, 22 Nov 2005, Carsten Haese wrote:

 On Tue, 2005-11-22 at 14:37, Christoph Zwerschke wrote:

 In Foord/Larosa's odict, the keys are exposed as a public member which 
 also seems to be a bad idea (If you alter the sequence list so that it 
 no longer reflects the contents of the dictionary, you have broken your 
 OrderedDict).

 That could easily be fixed by making the sequence a managed property 
 whose setter raises a ValueError if you try to set it to something 
 that's not a permutation of what it was.

I'm not a managed property expert (although there's a lovely studio in 
Bayswater you might be interested in), but how does this stop you doing:

my_odict.sequence[0] = Shrubbery()

Which would break the odict good and hard.
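One sketch of a property that would survive that attack: the setter rejects anything that isn't a permutation of the existing keys, and the getter hands out a copy, so element assignment mutates the copy rather than the odict (my guess at an implementation, not Foord/Larosa's actual code):

```python
class Odict(dict):
    """Minimal sketch: insertion-ordered dict with a guarded
    'sequence' property (hypothetical, for illustration only)."""
    def __init__(self):
        dict.__init__(self)
        self._seq = []
    def __setitem__(self, key, value):
        if key not in self:
            self._seq.append(key)
        dict.__setitem__(self, key, value)
    def _get_seq(self):
        # a copy: my_odict.sequence[0] = x mutates the copy only
        return list(self._seq)
    def _set_seq(self, new):
        new = list(new)
        if len(new) != len(self._seq) or set(new) != set(self._seq):
            raise ValueError("not a permutation of the existing keys")
        self._seq = new
    sequence = property(_get_seq, _set_seq)
```

The copy in the getter is the crucial half: without it, the ValueError check in the setter is trivially bypassed by in-place mutation.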

tom

-- 
When I see a man on a bicycle I have hope for the human race. --
H. G. Wells


Re: Why are there no ordered dictionaries?

2005-11-22 Thread Tom Anderson
On Tue, 22 Nov 2005, Christoph Zwerschke wrote:

 One implementation detail that I think needs further consideration is in 
 which way to expose the keys and to mix in list methods for ordered 
 dictionaries.

 In Foord/Larosa's odict, the keys are exposed as a public member which 
 also seems to be a bad idea (If you alter the sequence list so that it 
 no longer reflects the contents of the dictionary, you have broken your 
 OrderedDict).

 I think it would be probably the best to hide the keys list from the public, 
 but to provide list methods for reordering them (sorting, slicing etc.).

I'm not too keen on this - there is conceptually a list here, even if it's 
one with unusual constraints, so there should be a list i can manipulate 
in code, and which should of course be bound by those constraints.

I think the way to do it is to have a sequence property (which could be a 
managed attribute to prevent outright clobberation) which walks like a 
list, quacks like a list, but is in fact a mission-specific list subtype 
whose mutator methods zealously enforce the invariants guaranteeing the 
odict's integrity.

I haven't actually tried to write such a beast, so i don't know whether 
this is possible, let alone straightforward.

tom

-- 
When I see a man on a bicycle I have hope for the human race. --
H. G. Wells


Re: Why are there no ordered dictionaries?

2005-11-22 Thread Tom Anderson
On Tue, 22 Nov 2005, Christoph Zwerschke wrote:

 Fuzzyman schrieb:

 Of course ours is ordered *and* orderable ! You can explicitly alter 
 the sequence attribute to change the ordering.

 What I actually wanted to say is that there may be a confusion between a 
 sorted dictionary (one where the keys are automatically sorted) and an 
 ordered dictionary (where the keys are not automatically ordered, but 
 have a certain order that is preserved). Those who suggested that the 
 sorted function would be helpful probably thought of a sorted 
 dictionary rather than an ordered dictionary.

Exactly.

Python could also do with a sorted dict, like binary tree or something, 
but that's another story.

tom

-- 
When I see a man on a bicycle I have hope for the human race. --
H. G. Wells


Re: Any royal road to Bezier curves...?

2005-11-21 Thread Tom Anderson
On Sun, 20 Nov 2005, Warren Francis wrote:

 Basically, I'd like to specify a curved path of an object through space. 
 3D space would be wonderful, but I could jimmy-rig something if I could 
 just get 2D...  Are bezier curves really what I want after all?

No. You want a natural cubic spline:

http://mathworld.wolfram.com/CubicSpline.html

This is a fairly simple curve, which can be fitted through a series of 
points (called knots) in space of any dimensionality, without the need to 
specify extra control points (unlike a Bezier curve), and which has the 
nice property of minimising the curvature of the curve - it's the shape 
you'd get if you ran a springy wire through your knots. It usually looks 
pretty good too.

Google will help you find python implementations.

There are other kinds of splines - Catmull-Rom, B-spline (a generalisation 
of a Bezier curve), Hermite - but they mostly don't guarantee to pass 
through the knots, which might make them less useful to you.

In the opposite direction on the mathematical rigour scale, there's what i 
call the blended quadratic spline, which i invented as a simpler and more 
malleable alternative to the cubic spline. It's a piecewise parametric 
spline, like the cubic, but rather than calculating a series of pieces 
which blend together naturally, using cubics and linear algebra, it uses 
simple quadratic curves fitted to overlapping triples of adjacent knots, 
then interpolates ('blends') between them to draw the curve. It looks very 
like a cubic spline, but the code is simpler, and the pieces are local - 
each piece depends only on nearby knots, rather than on all the knots, as 
in a cubic spline - which is a useful property for some jobs. Also, it's 
straightforward to add the ability to constrain the angle at which the 
curve passes through a subset of the knots (you can do it for some knots, 
while leaving others 'natural') by promoting the pieces to cubics at the 
constrained knots and constraining the appropriate derivatives. Let me 
know if you want more details on this. To be honest, i'd suggest using a 
proper cubic spline, unless you have specific problems with it.
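In case the prose is too compressed, here's my reconstruction of the core evaluation (an illustrative sketch based on the description above, not the author's actual code; it assumes at least three knots, with knot i sitting at parameter u = i):

```python
def _lagrange_quad(p0, p1, p2, t):
    # the quadratic through p0 at t=-1, p1 at t=0, p2 at t=1
    return (p0 * t * (t - 1) / 2.0
            - p1 * (t + 1) * (t - 1)
            + p2 * (t + 1) * t / 2.0)

def blended_quadratic(knots, u):
    """Evaluate a blended quadratic spline through 2D knots at u."""
    n = len(knots)
    i = min(int(u), n - 2)          # we're on the segment [i, i+1]
    t = u - i
    def piece(c, d):
        # the quadratic centred on knot c, coordinate d, evaluated at u
        return _lagrange_quad(knots[c - 1][d], knots[c][d],
                              knots[c + 1][d], u - c)
    point = []
    for d in (0, 1):
        if i == 0:                  # first segment: only one piece exists
            v = piece(1, d)
        elif i == n - 2:            # last segment: likewise
            v = piece(n - 2, d)
        else:                       # interior: blend the overlapping pieces
            v = (1 - t) * piece(i, d) + t * piece(i + 1, d)
        point.append(v)
    return tuple(point)
```

With collinear knots every fitted quadratic degenerates to the same straight line, which makes a handy sanity check; and at each knot the blend weight of the other piece is zero, so the curve interpolates the knots exactly.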

tom

-- 
... a tale for which the world is not yet prepared
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Why are there no ordered dictionaries?

2005-11-21 Thread Tom Anderson
On Sun, 20 Nov 2005, Alex Martelli wrote:

 Christoph Zwerschke [EMAIL PROTECTED] wrote:

 The 'sorted' function does not help in the case I have indicated, where 
 I do not want the keys to be sorted alphabetically, but according to 
 some criteria which cannot be derived from the keys themselves.

 Ah, but WHAT 'some criteria'?  There's the rub!  First insertion, last 
 insertion, last insertion that wasn't subsequently deleted, last 
 insertion that didn't change the corresponding value, or...???

All the requests for an ordered dictionary that i've seen on this group, 
and all the cases where i've needed one myself, want one which behaves like 
a list - order of first insertion, with no memory after deletion. Like the 
Larosa-Foord ordered dict.
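
Those semantics - order of first insertion, no memory after deletion - fit 
in a few lines (a minimal sketch of the behaviour being described, not the 
Larosa-Foord implementation itself):

```python
class FirstInsertionDict(object):
    # keys iterate in order of first insertion; deleting a key
    # forgets its position, so re-inserting it puts it at the end
    def __init__(self):
        self._values = {}
        self._order = []
    def __setitem__(self, key, value):
        if key not in self._values:
            self._order.append(key)
        self._values[key] = value
    def __getitem__(self, key):
        return self._values[key]
    def __delitem__(self, key):
        del self._values[key]
        self._order.remove(key)
    def keys(self):
        return list(self._order)
```

Re-setting an existing key keeps its original position; only deletion and 
re-insertion moves it to the end.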

Incidentally, can we call that the Larosa-Foord ordered mapping? Then it 
sounds like some kind of rocket science discrete mathematics stuff, which 
(a) is cool and (b) will make Perl programmers feel even more inadequate 
when faced with the towering intellectual might of Python. Them and their 
Schwartzian transform. Bah!

tom

-- 
Baby got a masterplan. A foolproof masterplan.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Any royal road to Bezier curves...?

2005-11-21 Thread Tom Anderson
On Mon, 21 Nov 2005, Tom Anderson wrote:

 On Sun, 20 Nov 2005, Warren Francis wrote:

 Basically, I'd like to specify a curved path of an object through space. 3D 
 space would be wonderful, but I could jimmy-rig something if I could just 
 get 2D...  Are bezier curves really what I want after all?

 No. You want a natural cubic spline:

In a fit of code fury (a short fit - this is python, so it didn't take 
long), i ported my old java code to python, and tidied it up a bit in the 
process:

http://urchin.earth.li/~twic/splines.py

That gives you a natural cubic spline, plus my blended quadratic spline, 
and a framework for implementing other kinds of splines.

tom

-- 
Gin makes a man mean; let's booze up and riot!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: running functions

2005-11-18 Thread Tom Anderson
On Thu, 17 Nov 2005, Scott David Daniels wrote:

 Gorlon the Impossible wrote:

 I have to agree with you there. Threading is working out great for me
 so far. The multiprocess thing has just baffled me, but then again I'm
 learning. Any tips or suggestions offered are appreciated...

 The reason multiprocess is easier is that you have enforced separation. 
 Multiple processes / threads / whatever that share reads and writes into 
 shared memory are rife with irreproducible bugs and untestable code. 
 Processes must be explicit about their sharing (which is where the bugs 
 occur), so those parts of the code can be examined carefully.

That's a good point.

 If you program threads with shared nothing and communication over Queues 
 you are, in effect, using processes.  If all you share is read-only 
 memory, similarly, you are doing easy stuff and can get away with it. 
 In all other cases you need to know things like which operations are 
 indivisible, and what happens if I read part of this from before an 
 update and the other after the update completes.

Right, but you have exactly the same problem with separate processes - 
except that with processes, having that richness of interaction is so 
hard, that you'll probably never do it in the first place!
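
In the modern spelling, the shared-nothing-plus-Queues style looks 
something like this (the worker function and sentinel convention are just 
illustrative):

```python
import threading
import queue

def worker(inq, outq):
    # shared nothing: all communication goes through the two queues
    for item in iter(inq.get, None):   # None is the shutdown sentinel
        outq.put(item * item)

inq, outq = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(inq, outq))
t.start()
for i in range(5):
    inq.put(i)
inq.put(None)                          # ask the worker to stop
t.join()
results = sorted(outq.get() for _ in range(5))
```

Because the worker touches no shared state, there is nothing to get an 
inconsistent view of; the queues do all the locking.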

tom

-- 
science fiction, old TV shows, sports, food, New York City topography,
and golden age hiphop
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Iterator addition

2005-11-13 Thread Tom Anderson
On Sun, 13 Nov 2005, Reinhold Birkenfeld wrote:

 [EMAIL PROTECTED] wrote:

 Tom Anderson:

 And we're halfway to looking like perl already! Perhaps a more 
 pythonic thing would be to define a then operator:

 all_lines = file1 then file2 then file3

 Or a chain one:

 all_lines = file1 chain file2 chain file3

This may just be NIH syndrome, but i like that much less - 'then' makes 
for something that reads much more naturally to me. 'and' would be even 
better, but it's taken; 'andthen' is a bit unwieldy.

Besides, "chain file2" is going to confuse people coming from a BASIC 
background :).

 That's certainly not better than the chain() function. Introducing new 
 operators for just one application is not pythonic.

True, but would this be for just one application? With python moving 
towards embracing a lazy functional style, with generators and genexps, 
maybe chaining iterators is a generally useful operation that should be 
supported at the language level. I'm not seriously suggesting doing this, 
but i don't think it's completely out of the question.
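
For comparison, the function-level spelling that already exists, using the 
candidate-primes example from the earlier post:

```python
import itertools

# today's spelling of "(2,) then (1+2*i for i in itertools.count(1))"
candidates = itertools.chain((2,), (1 + 2*i for i in itertools.count(1)))
first_five = list(itertools.islice(candidates, 5))
```

itertools.chain happily mixes finite and infinite iterables, which is why 
islice is needed to peel off a finite prefix.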

tom

-- 
limited to concepts that are meta, generic, abstract and philosophical --
IEEE SUO WG
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Iterator addition

2005-11-12 Thread Tom Anderson
On Thu, 9 Nov 2005, it was written:

 [EMAIL PROTECTED] (Alex Martelli) writes:

 Is there a good reason to not define iter1+iter2 to be the same as

 If you mean for *ALL* built-in types, such as generators, lists, files,
 dicts, etc, etc -- I'm not so sure.

 Hmm, there might also be __add__ operations on the objects, that would 
 have to take precedence over iterator addition.  Iterator addition 
 itself would have to be a special kludge like figuring out '<' from 
 __cmp__, etc.

 Yeah, I guess the idea doesn't work out that well.  Oh well.

How about if we had some sort of special sort of iterator which did the 
right thing when things were added to it? like an iterable version of The 
Blob:

class blob(object):
    def __init__(self, it=None):
        self.its = []
        if (it != None):
            self.its.append(iter(it))
    def __iter__(self):
        return self
    def next(self):
        try:
            return self.its[0].next()
        except StopIteration:
            # current iterator has run out!
            self.its.pop(0)
            return self.next()
        except IndexError:
            # no more iterators
            raise StopIteration
    def __add__(self, it):
        self.its.append(iter(it))
        return self
    def __radd__(self, it):
        self.its.insert(0, iter(it))
        return self

Then we could do:

all_lines = blob(file1) + file2 + file3
candidate_primes = blob((2,)) + (1+2*i for i in itertools.count(1))

Which, although not quite as neat, isn't entirely awful.

Another option would be a new operator for chaining - let's use $, since 
that looks like the chain on the fouled anchor symbol used by navies etc:

http://www.diggerhistory.info/images/badges-asstd/female-rels-navy.jpg

Saying "a $ b" would be equivalent to "chain(a, b)", where chain (which 
could even be a builtin if you like) is defined:

def chain(a, b):
    if (hasattr(a, "__chain__")):
        return a.__chain__(b)
    elif (hasattr(b, "__rchain__")): # optional
        return b.__rchain__(a)
    else:
        return itertools.chain(a, b) # or equivalent

Whatever it is that itertools.chain or whatever returns would be modified 
to have a __chain__ method which behaved like blob.__add__ above. This 
then gets you:

all_lines = file1 $ file2 $ file3
candidate_primes = (2,) $ (1+2*i for i in itertools.count(1))

And we're halfway to looking like perl already! Perhaps a more pythonic 
thing would be to define a then operator:

all_lines = file1 then file2 then file3
candidate_primes = (2,) then (1+2*i for i in itertools.count(1))

That looks quite nice. The special method would be __then__, of course.

tom

-- 
if you can't beat them, build them
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Hash map with multiple keys per value ?

2005-11-12 Thread Tom Anderson
On Fri, 11 Nov 2005, Chris Stiles wrote:

 Is there an easier and cleaner way of doing this ?  Is there example 
 code floating around that I might have a look at ?

I'm not aware of a way which can honestly be called better.

However, i do feel your pain about representing the alias relationships 
twice - it feels wrong. Therefore, i offer you an alternative 
implementation - represent each set as a linked list, threaded through a 
dict by making the value the dict holds under each key point to the next 
key in the alias set. Confused? No? You will be ...

def follow(key, nexts):
    # helper: yield the other members of key's cycle, in link order
    cur = nexts[key]
    while (cur != key):
        yield cur
        cur = nexts[cur]

class Aliases(object):
    def __init__(self, aliases=None):
        self.nexts = {}
        if (aliases != None):
            for key, value in aliases:
                self[key] = value
    def __setitem__(self, key, value):
        if ((value != None) and (value != key)):
            self.nexts[key] = self.nexts[value]
            self.nexts[value] = key
        else:
            self.nexts[key] = key
    def __getitem__(self, key):
        return list(follow(key, self.nexts))
    def __delitem__(self, key):
        cur = key
        while (self.nexts[cur] != key):
            cur = self.nexts[cur]
        if (cur != key):
            self.nexts[cur] = self.nexts[key]
        del self.nexts[key]
    def canonical(self, key):
        canon = key
        for cur in follow(key, self.nexts):
            if (cur < canon):
                canon = cur
        return canon
    def iscanonical(self, key):
        for cur in follow(key, self.nexts):
            if (cur < key):
                return False
        return True
    def iteraliases(self, key):
        cur = self.nexts[key]
        while (cur != key):
            yield cur
            cur = self.nexts[cur]
    def __iter__(self):
        return iter(self.nexts)
    def itersets(self):
        for key in self.nexts:
            if (not self.iscanonical(key)):
                continue
            yield [key] + self[key]
    def __len__(self):
        return len(self.nexts)
    def __contains__(self, key):
        return key in self.nexts
    def __str__(self):
        return "Aliases " + str(list(self.itersets()))
    def __repr__(self):
        return ("Aliases([" + ", ".join(str((key, self.canonical(key)))
                for key in sorted(self.nexts.keys())) + "])")

As i'm sure you'll agree, code that combines a complete absence of clarity 
with abject lack of runtime efficiency. Oh, and i haven't tested it 
properly.
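
If cleverness is negotiable, the textbook structure for "these keys are 
all aliases of each other" is a disjoint-set (union-find), which trades 
the linked-cycle trick above for near-constant-time canonical-key and 
same-set queries - a minimal sketch:

```python
parents = {}

def find(key):
    # canonical representative of key's alias set, with path halving
    parents.setdefault(key, key)
    while parents[key] != key:
        parents[key] = parents[parents[key]]
        key = parents[key]
    return key

def union(a, b):
    # merge the alias sets containing a and b
    parents[find(a)] = find(b)
```

The trade-off is that enumerating a whole alias set, or deleting a member, 
is no longer as direct as with the cycle representation.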

tom

-- 
if you can't beat them, build them
-- 
http://mail.python.org/mailman/listinfo/python-list

