Re: Floating numbers and str

2005-11-09 Thread Christian Stapfer
Tuvas [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
I would like to limit a floating variable to 4 significant digits, when
 running through a str command. I.e.,


 x=.13241414515
 y=str(x)+ something here

 But somehow limiting that to 4 sign. digits. I know that if you use the
 print statement, you can do something like %.4f, but how can I do this
 with converting the number to a string? Thanks!

from scipy import round

print round(0.13241414515,4)
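
For string output without scipy, the %-formatting operator does the
same job directly (a small sketch; note that "%.4f" keeps 4 digits
after the decimal point, whereas "%.4g" keeps 4 significant digits):

    x = .13241414515
    y = "%.4g" % x          # '0.1324'  (4 significant digits)
    z = str(round(x, 4))    # '0.1324'  (round first, then convert)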

Regards,
Christian



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Help! Python either hangs or core dumps when calling C malloc

2005-09-08 Thread Christian Stapfer
Lil [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
I already double checked my C code. It runs perfectly fine in C without
 any errors. So in my python program, I added a pdb.set_trace()
 and step through the program and it did not dump. But when i took out
 the tracing, the core dump came back. sigh

Check your program for _uninitialized_variables_.
(Just a guess: but what other side-effect than
changing the values of uninitialized variables
- and the program's timing, of course - might
the stepping through with a debugger have?)

Regards,
Christian 


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Removing duplicates from a list

2005-09-14 Thread Christian Stapfer
[EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
I do this:

 def unique(keys):
     unique = []
     for i in keys:
         if i not in unique: unique.append(i)
     return unique

 I don't know what is faster at the moment.

This is quadratic, O(n^2), in the length n of the list
if all keys are unique.
Conversion to a set just might use a better sorting
algorithm than this (i.e. n*log(n)) and throwing out
duplicates (which, after sorting, are positioned
next to each other) is O(n). If conversion
to a set should turn out to be slower than O(n*log(n))
 [depending on the implementation], then you are well
advised to sort the list first (n*log(n)) and then
throw out the duplicate keys with a single walk over
the list. In this case you know at least what to
expect for large n...
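
A rough sketch of that sort-then-walk approach (assuming the keys
are comparable, so the list can be sorted):

    def unique_sorted(keys):
        keys = sorted(keys)        # O(n*log(n))
        result = []
        for k in keys:             # single O(n) walk
            if not result or k != result[-1]:
                result.append(k)
        return result

Note that, unlike the original version, this does not preserve the
order of first occurrence.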

Regards,
Christian


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: reading files with error

2005-09-17 Thread Christian Stapfer
Maurice Ling [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 [EMAIL PROTECTED] wrote:

I have written a program to do something similar.  My strategy is:
 * use os.read() to read 512 bytes at a time
 * when os.read fails, seek to the next multiple of 512 bytes
   and write '\0' * 512 to the output
I notice this code doesn't deal properly with short reads, but in my 
experience
they never happen (because the disk error makes an entire block 
unreadable, and
a block is never less than 512 bytes)

I use this code on a unix-style system.

def dd(src, target, bs=512):
    src = os.open(src, os.O_RDONLY)
    if os.path.exists(target):
        target = os.open(target, os.O_WRONLY | os.O_APPEND, 0600)
        existing = os.lseek(target, 0, SEEK_END)
    else:
        existing = 0
        target = os.open(target, os.O_WRONLY | os.O_CREAT, 0600)

    total = os.lseek(src, 0, SEEK_END) / bs
    os.lseek(src, existing, SEEK_SET)
    os.lseek(target, existing, SEEK_SET)

    if existing: print "starting at", existing
    i = existing / bs
    f = 0
    lastrem = -1

    last = start = time.time()
    while 1:
        try:
            block = os.read(src, bs)
        except os.error, detail:
            if detail.errno == errno.EIO:
                block = "\0" * bs
                os.lseek(src, (i+1) * bs, SEEK_SET)
                f = f + 1
            else:
                raise
        if block == "": break

        i = i + 1
        os.write(target, block)

        now = time.time()
        if i % 1000 or now - last < 1: continue
        last = now

        frac = i * 1. / total
        rem = int((now-start) * (1-frac) / frac)
        if rem < 60 or abs(rem - lastrem) > .5:
            rm, rs = divmod(rem, 60)
            lastrem = rem
            spd = i * 512. / (now - start) / 1024 / 1024
            sys.stderr.write("%8d %8d %8d %3.1f%% %6d:%02d %6.1fMB/s\r"
                             % (i, f, i-f, i * 100. / total, rm, rs, spd))
    sys.stderr.write("\n")


 Sorry but what are SEEK_END and SEEK_SET?

The Python 2.3 documentation seems to specify the *numeric*
values of these constants only. But since Python's file
objects are implemented using C's stdio package, you
can read

http://www.opengroup.org/onlinepubs/009695399/functions/lseek.html
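
For quick reference, the usual numeric values (per that POSIX
specification; later Python versions also expose them as os.SEEK_SET,
os.SEEK_CUR and os.SEEK_END):

    import os

    SEEK_SET = 0   # offset relative to the start of the file
    SEEK_CUR = 1   # offset relative to the current position
    SEEK_END = 2   # offset relative to the end of the file

    # e.g. determine a file's size without reading it:
    fd = os.open("somefile", os.O_RDONLY)    # "somefile" is just a placeholder
    size = os.lseek(fd, 0, SEEK_END)
    os.lseek(fd, 0, SEEK_SET)                # rewind
    os.close(fd)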

Regards,
Christian Stapfer


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Comparing lists

2005-10-10 Thread Christian Stapfer
[EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 try to use set.
L1 = [1,1,2,3,4]
L2 = [1,3, 99]
A = set(L1)
B = set(L2)

X = A-B
print X

Y = B-A
print Y

Z = A | B
print Z

But how efficient is this? Could you be a bit
more explicit on that point? What is the order
of complexity of set([...]) or of A-B, B-A,
A | B, A ^ B and A & B? - The Python Documentation
leaves me completely in the dark in this regard.

Sorting the two lists and then extracting
A-B, B-A, A|B, A & B and A ^ B in one single
pass seems to me very likely to be much faster
for large lists.
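
A rough sketch of that single-pass idea (ignoring duplicates within
each list, and assuming the elements are comparable):

    def compare_sorted(L1, L2):
        A, B = sorted(L1), sorted(L2)          # O(n*log(n))
        only_A, only_B, both = [], [], []
        i = j = 0
        while i < len(A) and j < len(B):       # one O(n) walk
            if A[i] == B[j]:
                both.append(A[i]); i += 1; j += 1
            elif A[i] < B[j]:
                only_A.append(A[i]); i += 1
            else:
                only_B.append(B[j]); j += 1
        only_A.extend(A[i:])
        only_B.extend(B[j:])
        return only_A, only_B, both            # A-B, B-A, A & B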

Regards,
Christian


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Comparing lists

2005-10-10 Thread Christian Stapfer
George Sakkis [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 Christian Stapfer [EMAIL PROTECTED] wrote:

 [EMAIL PROTECTED] wrote:
  try to use set.

 Sorting the two lists and then extracting
 A-B, B-A, A|B, A  B and A ^ B in one single
 pass seems to me very likely to be much faster
 for large lists.

 Why don't you implement it, test it and time it
 to be more convincing about your intuition ?

The problem is in the generation of the test data.
Even merely generating a set of (suitably average,
random, and suitably worst case) datasets might
turn out to be a major undertaking.
 If the documentation stated the order-of-magnitude
behavior of those basic operations up front, then
I (and *anyone* else who ever wanted to use those
operations on large lists / large sets) could do
a quick order-of-magnitude estimation of how
a certain program design will behave, performance
wise.
  *Experimenting* is not necessarily as easy to
do as you seem to believe. How do you, for example,
hit upon the worst-case behavior with your test
data? - Without knowing *anything* about the
implementation it might be a matter of sheer luck.
If you *know* something about the implementation
then, of course, you might be able to figure it
out. (But note that if you know *that* much about
the implementation, you usually have an order-of-
magnitude estimate anyway and don't need to do
*any* experimenting in order to answer my question.)

Regards,
Christian


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Comparing lists

2005-10-10 Thread Christian Stapfer
Steve Holden [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 Christian Stapfer wrote:
 George Sakkis [EMAIL PROTECTED] wrote in message 
 news:[EMAIL PROTECTED]

Christian Stapfer [EMAIL PROTECTED] wrote:
[EMAIL PROTECTED] wrote:

try to use set.

Sorting the two lists and then extracting
A-B, B-A, A|B, A  B and A ^ B in one single
pass seems to me very likely to be much faster
for large lists.

Why don't you implement it, test it and time it
to be more convincing about your intuition ?

 The problem is in the generation of the test data.
 Even merely generating a set of (suitably average,
 random, and suitably worst case) datasets might
 turn out to be a major undertaking.
  If the documentation stated the order-of-magnitude
 behavior of those basic operations up front, then
 I (and *anyone* else who ever wanted to use those
 operations on large lists / large sets) could do
 a quick order-of-magnitude estimation of how
 a certain program design will behave, performance
 wise.
   *Experimenting* is not necessarily as easy to
 do as you seem to believe. How do you, for example,
 hit upon the worst-case behavior with your test
 data? - Without knowing *anything* about the
 implementation it might be a matter of sheer luck.
 If you *know* something about the implementation
 then, of course, you might be able to figure it
 out. (But note that if you know *that* much about
 the implementation, you usually have an order-of-
 magnitude estimate anyway and don't need to do
 *any* experimenting in order to answer my question.)

 You are, of course, either assuming that there's a
 single implementation of Python,

Of course not!

 or that all implementations have the same behaviour.

Of course not!

But it is reasonable, anyway, to ask for information
about a specific implementation (that one is *forced*
to use). Performance *will* depend on it, whether we
like it or not, whether that information is given by
the implementer or not.
  And if the implementer wants to complain that
giving such information would break his wonderful
abstraction then I can only answer: It is *reality*
that *will* break your abstraction as regards
performance! It is, therefore, absolutely *no*
use to *pretend* you *can* avoid this form of breaking
the abstraction by simply not mentioning it
in the documentation...

 Or  alternatively you are asking all implementers
 to do what you seem to consider so difficult
 (i.e. identify worst-case scenarios and then estimate order-of-magnitude 
 behaviour for them).

I consider it the job of the implementer to know
about the trade-offs that he has been making in
choosing one particular implementation, and to
know what computational complexity therefore
attaches to the various operations exposed in
its interface. Why should *every* user of his
module be forced to either read the source
code of his implementation or try to figure
it out experiment-wise on his own?

 Test results with known test data are relatively
 easy to extrapolate from, and if your test data
 are reasonably representative of live data then so will your performance 
 estimates.

How reasonable is it to ask me, or anyone else
for that matter, to extract, experiment-wise
(if it can be done at all with reasonable effort)
information that properly belongs to the implementer
and really should have been exposed in the
documentation in the first place?

 Anyway, aren't you more interested in average
 behaviour than worst-case? Most people are.

It depends. *I* am NOT the OP of this thread.
*The*OP* had asked for an efficient way
to do what he needed to do. There are many
measures of efficiency. Even programmer efficiency
may be one of them. But I assumed that, since
the OP took the time to ask his question in this
NG, that it really was about computational
efficiency. Then he was offered solutions -
but they were offered just matter-of-factly.
*Surely* those who had posted those solutions
would be able to answer my question *why*
it was, that they *assumed* conversion into
sets to be more efficient than operating
on the lists in the first place. I was *not*
at all criticizing anyone (except, perhaps,
the Python documentation for its lack of
*insightful* information): I was just asking
for a *small* clarification that I *assumed*
(mistakenly it now appears) others could
*easily* provide.

Regards,
Christian
--
Those who cannot remember the past are
 condemned to repeat it.
 - George Santayana 


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Comparing lists

2005-10-11 Thread Christian Stapfer
Scott David Daniels [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
 Christian Stapfer wrote:
 Steve Holden [EMAIL PROTECTED] wrote in message
 news:[EMAIL PROTECTED]

Christian Stapfer wrote:

George Sakkis [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]


Christian Stapfer [EMAIL PROTECTED] wrote:

[EMAIL PROTECTED] wrote:
try to use set.
 A reasonable suggestion.  A set must have the trade-offs involved.
 The abstraction itself brings to mind the issues, and the performance
 can, at least in theory, be handled there.  If that is true (that
 the set abstraction sees the problem), then you can rely on the
 Python implementation of set to either now, or eventually, have
 a good implementation -- one not too far off efficient.

It is, unfortunately, not infrequently the case
that there are different implementations for
a given data type that are efficient in different
ways - but that there is no one single implementation
that is the most efficient in *all* regards.
Thus the implementer must make a trade-off,
and that trade-off has consequences, performance
wise, that he might want to tell the users of his
module about...

  The Python gang is good at this stuff;

I'm not denying it. The question is about
telling users upfront what the consequences
are (roughly) of the trade-offs that have
been made so that they can use the modules
that have been provided more effectively.

 a better set implementation will win if
 it can show better performance without
 related down-sides.

Is there ever such a thing? I always
thought that, most of the time, there
is no such thing as a free lunch in
this field.

 As to the either now, or eventually;  if you _must_ have performance
 now, not in some abstract future, then it behooves you to _test_,
 _test_, _test_!

Well, I might want to test: but *first* I want
to design, preferably in armchair style
- at least as regards the basic approach
that I want to take. This armchair stage
involves not infrequently the use of rough
complexity measures to exclude the clearly
unusable and chose, instead, an approach
that is very likely efficient enough for
what I need.

 If the documentation stated the order-of-magnitude
behavior of those basic operations up front, then
I (and *anyone* else who ever wanted to use those
operations on large lists / large sets) could do
a quick order-of-magnitude estimation of how
a certain program design will behave, performance
wise.
 And, if the proper documentation is in place, and it
 says dictionary lookup is O(N) (and you avoid using
 it for exactly that reason), how surprised will you be
 to discover that the O(N) is only reached if the hash
 values of the keys are all equal?

It's not that difficult to distinguish
*average* case and *worst* case scenarios.
It might even be a good idea to state,
if that can easily be done, what the
*best* case happens to be...
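
To make the average/worst-case distinction concrete for dictionary
lookup, here is a small illustrative sketch (the class is purely
hypothetical, constructed only to force every key into the same
hash bucket):

    class BadHash(object):
        def __init__(self, x):
            self.x = x
        def __hash__(self):
            return 42                  # every instance collides
        def __eq__(self, other):
            return self.x == other.x

    d = dict((BadHash(i), i) for i in range(1000))
    # each lookup now degenerates into a linear scan of the
    # colliding entries: O(N) instead of the usual O(1) average
    print d[BadHash(999)]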

 Oh, maybe you expect O(N) to really mean \Theta(N).
 Then, if you are a dweeb like me, you will respond that
 This is not possible, a dictionary of size N must take at
 least 'O(lg N)' to read the key, never mind processing it.
 But, it turns out, that given a bound on the size of a
 process, processing an address is O(1), not O(lg N).
 Is that too practical for you, or not enough?

I do not expect a developer to expend *inordinate*
amounts of work to figure out the computational
complexity of what he has implemented. But, as I
wrote, he *must* have thought about the matter, and
is thus, by definition, in a rather good position
to provide what he can frequently simply pull from
memory when it comes to documenting the interface
of his module.

  *Experimenting* is not necessarily as easy to
do as you seem to believe.


How do you, for example, hit upon the worst-case behavior with your test
data?
 Are you saying the documentation should characterize the
 cases that achieve worst-case behavior?  This is a stiff
 burden indeed, not one I am used to in even the most rigorous
 classes I've seen.

If it happens to be particularly difficult, for a
given implementation (of whatever abstract data type),
then, of course, state this upfront and leave it at
that.

  If there is such a characterization,
 how burned will you feel if a case is overlooked and that
 case is the one that you sold to your customer?

I am not at all a lawyer type, as you seem to
imagine. I just want to suggest that some
(to the implementer - but not the average user)
*easily* available information about computational
complexity (especially for the most basic data types)
would be a good thing to have. Nothing more.

  Are you
 willing to provide the same guaranteed performance and
 documentation of performance to your customer that you
 you expect of the Python system?

I'm using python mainly as a very handy language to
write whatever (usually relatively small) utility
programs I happen to require for my main job. I am
not (currently) developing any serious programs
that are targeted for sale. (Thought

Re: Yes, this is a python question, and a serious one at that (moving to Win XP)

2005-10-13 Thread Christian Stapfer
John J. Lee [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 Kenneth McDonald [EMAIL PROTECTED] writes:
 [...]
 absolutely preventing me from making the switch. Number one is the
 lack of a decent command line and command-line environment, and I'm
 wondering (hoping) if perhaps someone has written a Python shell-- 
 something that will look like a regular shell, let users type in
 commands, maybe have some of the nice features of bash etc. like tab
 completion, etc, and will then execute an underlying python script
 when the command is entered. I'm not thinking of IDLE, but something
 that is really aimed more at being a system terminal, not a Python-
 specific terminal.
 [...]

 cmd.exe can be made bearable.  I just got a new machine, so I'll have
 to do this myself in the next few days...

 0. Make a shortcut to cmd.exe, stick it somewhere get-at-able,
   eg. quick launch toolbar

0.0. ... and add an item to your SendTo folder that allows
you to have Windows Explorer open a terminal window with its
current directory set to the currently displayed folder
(= Open terminal here).

Regards,
Christian


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Comparing lists

2005-10-14 Thread Christian Stapfer
jon [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]

 To take the heat out of the discussion:

 sets are blazingly fast.

I'd prefer a (however) rough characterization
of computational complexity in terms of Big-Oh
(or Big-whatever) *anytime* to marketing-type
characterizations like this one...

Regards,
Christian


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Comparing lists

2005-10-15 Thread Christian Stapfer
Steven D'Aprano [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 On Sat, 15 Oct 2005 06:31:53 +0200, Christian Stapfer wrote:

 jon [EMAIL PROTECTED] wrote in message
 news:[EMAIL PROTECTED]

 To take the heat out of the discussion:

 sets are blazingly fast.

 I'd prefer a (however) rough characterization
 of computational complexity in terms of Big-Oh
 (or Big-whatever) *anytime* to marketing-type
 characterizations like this one...

 Oh how naive.

Why is it that even computer science undergrads
are required to learn the basics of Big-Oh and
all that? Are computer scientists really morons,
as Xah Lee suggests? I can't believe it, but
maybe you have a point...

 The marketing department says: "It's O(N), so it is blindingly fast."

I might as well interpret "blindingly fast"
as meaning O(1). - Why not?
  Surely marketing might also have reasoned like
this: "It's O(1), so it's blindingly fast."
But I *want*, nay, I *must* know whether it is
O(N) or O(1). So forget about marketingspeak,
it's a deadly poison for developers. It might
be ok to induce grandiose feelings in childish
users - but developers are grown ups: they must
face reality...

 Translation: the amount of computation it does is linearly proportional
 to N. The constant of proportionality is 1e10.

 The marketing department says: "Our competitor's product is O(N**2), so it
 runs like a three-legged dog."

 Translation: the amount of computation it does is linearly proportional to
 N squared. The constant of proportionality is 1e-100.

 You do the maths.

 Big O notation is practically useless for judging how fast a single
 algorithm will be, or how one algorithm compares to another.

That's why Knuth liked it so much?
That's why Aho, Hopcroft and Ullman liked it so much?
That's why Gonnet and Baeza-Yates liked it so much?

 It is only useful for telling you how a single algorithm
 will scale as the input increases.

And that's really very useful information indeed.
Since, given such information for the basic data types
and operations, as implemented by the language and
its standard libraries, I stand a real chance of
being able to determine the computational complexity
of the *particular*combination* of data types and
algorithms of my own small utility or of a
critical piece of my wonderful and large application,
on which the future of my company depends, with some
confidence and accuracy.

 It is very common for sensible programmers to fall back on a less
 efficient O(N**2) or even O(2**N) algorithm for small amounts of data, if
 that algorithm runs faster than the more efficient O(N) or O(log N)
 algorithm. In fact, that's exactly what the sort() method does in Python:
 for small enough lists, say, under 100 elements, it is quicker to run an
 O(N**2) algorithm (shell sort I believe) than it is to perform the
 complex set up for the merge-sort variant used for larger lists.

 As for sets, they are based on dicts, which are effectively hash tables.
 Hash tables are O(1), unless there are collisions,

Depending on the load factor of the hash tables.
So we would want to ask, if we have very large
lists indeed, how much space needs to be invested
to keep the load factor so low that we can say
that the membership test is O(1). Do A-B and A&B
have to walk the entire hash table (which must be
larger than the sets, because of a load factor
< 1)? Also: the conversion of lists to sets needs
the insertion of N elements into those hash tables.
That alone already makes the overall algorithm
*at*least* O(N). So forget about O(log N).

 in which case the more
 common algorithms degenerate to O(N).

So here, indeed, we have the kind of reasoning that
one ought to be able to deliver, based on what's in
the Python documentation. Luckily, you have that
kind the knowledge of both, how sets are implemented
and what Big-Oh attaches to the hash table operation
of look up.
  In order to *enable* SUCH reasoning for *everyone*,
starting from the module interface documentation only,
one clearly needs something along the lines that
I was suggesting...

 So, a very rough and ready estimate
 of the complexity of the algorithm using sets would be somewhere between
 O(1) and O(N) depending on the details of Python's algorithms.

 So, let's do an experiment, shall we?

 from sets import Set
 import time

 def compare_and_separate_with_sets(A, B):
     AB = Set(A+B)
     A = Set(A)
     B = Set(B)
     only_A = list(A-B)
     only_B = list(B-A)
     both_AB = list(AB - Set(only_A+only_B))
     return (only_A, only_B, both_AB)

 def timeit(f, args, n):
     """Time function f when called with *args. For timing purposes,
     does n calls of f. Returns the average time used per call in seconds."""
     loopit = range(n)
     t = time.time()
     for i in loopit:
         results = f(*args)
     t = time.time() - t
     return t/n

 def single_test(N):
     print ("N = %-8d" % N),
     A = range(N)
     B = range(N/2, 3*N/2)
     return timeit(compare_and_separate_with_sets, (A, B), 20)

 def

Re: Comparing lists

2005-10-15 Thread Christian Stapfer
Steven D'Aprano [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
 On Sat, 15 Oct 2005 18:17:36 +0200, Christian Stapfer wrote:

 I'd prefer a (however) rough characterization
 of computational complexity in terms of Big-Oh
 (or Big-whatever) *anytime* to marketing-type
 characterizations like this one...

 Oh how naive.

 Why is it that even computer science undergrads
 are required to learn the basics of Big-Oh and
 all that?

 So that they know how to correctly interpret what Big O notation means,
 instead of misinterpreting it. Big O notation doesn't tell you everything
 you need to know to predict the behaviour of an algorithm.

Well, that's right. I couldn't agree more:
it doesn't tell you *everything* but it does
tell you *something*.  And that *something*
is good to have.

 It doesn't even tell you most of what you need to know about its
 behaviour.
 Only actual *measurement* will tell you what you need to know.

Well, that's where you err: Testing doesn't
tell you everything *either*. You need *both*:
a reasonable theory *and* experimental data...
If theory and experimental data disagree,
we would want to take a closer look on both,
the theory (which may be mistaken or inadequate)
*and* the experiment (which may be inadequate or
just plain botched as well).

 Perhaps you should actually sit down and look at the assumptions,
 simplifications, short-cuts and trade-offs that computer scientists make
 when they estimate an algorithm's Big O behaviour. It might shock you out
 of your faith that Big O is the be all and end all of algorithm planning.

 For all but the simplest algorithm, it is impractical to actually count
 all the operations -- and even if you did, the knowledge wouldn't help
 you, because you don't know how long each operation takes to get executed.
 That is platform specific.

 So you simplify. You pretend that paging never happens. That memory
 allocations take zero time. That set up and removal of data structures
 take insignificant time. That if there is an N**2 term, it always swamps
 an N term. You assume that code performance is independent of the CPUs.
 You assume that some operations (e.g. comparisons) take no time, and
 others (e.g. moving data) are expensive.

 Those assumptions sometimes are wildly wrong. I've been seriously bitten
 following text book algorithms written for C and Pascal: they assume that
 comparisons are cheap and swapping elements are expensive. But in Python,
 swapping elements is cheap and comparisons are expensive, because of all
 the heavy object-oriented machinery used. Your classic text book algorithm
 is not guaranteed to survive contact with the real world: you have to try
 it and see.

Still: the expensiveness of those operations (such
as swapping elements vs. comparisons) will only
affect the constant of proportionality, not the
asymptotic behavior of the algorithm. Sooner or
later the part of your program that has the
worst asymptotic behavior will determine the speed
(or memory requirements) of your program.
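
A tiny numerical sketch of that point (the constants are invented
purely for illustration; only the shape of the growth matters):

    # cost(N) = 1000*N + 0.001*N**2 : the quadratic term eventually wins
    for N in (10**3, 10**5, 10**7):
        linear    = 1000.0 * N
        quadratic = 0.001 * N**2
        print N, linear, quadratic, quadratic > linear

For small N the (expensive) linear part dominates; from some N on,
the quadratic part does, no matter how small its constant.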

 Given all the assumptions, it is a wonder that Big O
 estimates are ever useful, not that they sometimes
 are misleading.

 [snip]
 The marketing department says: "It's O(N), so it is blindingly fast."

 I might as well interpret "blindingly fast" as meaning O(1). - Why not?
   Surely marketing might also have reasoned like
 this: "It's O(1), so it's blindingly fast." But I *want*, nay, I *must*
 know whether it is O(N) or O(1).

 You might _want_, but you don't _need_ to know which it is, not in every
 case. In general, there are many circumstances where it makes no
 sense to worry about Big O behaviour. What's your expected data look like?
 If your data never gets above N=2, then who cares whether it is O(1)=1,
 O(N)=2, O(N**2)=4 or O(2**N)=2? They are all about as fast.

 Even bubble sort will sort a three element list fast enough -- and
 probably faster than more complicated sorts. Why spend all the time
 setting up the machinery for a merge sort for three elements?

Since you have relapsed into a fit of mere polemics,
I assume I have made my point as regards marketing-
type characterizations of algorithms ("blazingly
fast") vs. measures, however rough, of asymptotic
complexity, like Big-Oh. - Which really
was the root of this sub-thread that went like this:

.. To take the heat out of the discussion:
..  sets are blazingly fast.

..   I'd prefer a (however) rough characterization
..  of computational complexity in terms of Big-Oh
..  (or Big-whatever) *anytime* to marketing-type
..  characterizations like this one...

 [snip]

 Big O notation is practically useless for judging how fast a single
   ^^
 algorithm will be, or how one algorithm compares to another.

 That's why Knuth liked it so much?
 That's why Aho, Hopcroft and Ullman liked it so much? That's why Gonnet
 and Baeza-Yates liked it so much?

 Two reasons: it is useful for telling you how a single algorithm will
 scale

Re: Comparing lists

2005-10-16 Thread Christian Stapfer
Ron Adam [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
 Christian Stapfer wrote:

 This discussion begins to sound like the recurring
 arguments one hears between theoretical and
 experimental physicists. Experimentalists tend
 to overrate the importance of experimental data
 (setting up a useful experiment, how to interpret
 the experimental data one then gathers, and whether
 one stands any chance of detecting systematic errors
 of measurement, all depend on having a good *theory*
 in the first place). Theoreticians, on the other hand,
 tend to overrate the importance of the coherence of
 theories. In truth, *both* are needed: good theories
 *and* carefully collected experimental data.

 Regards,
 Christian

 An interesting parallel can be made concerning management of production vs
 management of creativity.

 In general, production needs checks and feedback to insure quality, but
 will often come to a stand still if incomplete resources are available.

 Where as creativity needs checks to insure production, but in many cases
 can still be productive even with incomplete or questionable resources.
 The quality may vary quite a bit in both directions, but in creative
 tasks, that is to be expected.

 In many ways programmers are a mixture of these two.  I think I and Steven
 use a style that is closer to the creative approach. I get the feeling
 your background may be closer to the production style.

This diagnosis reminds me of C.G. Jung, the psychologist,
who, after having introduced the concepts of extra- and
introversion, came to the conclusion that Freud was
an extravert whereas Adler was an introvert. The point is
that he got it exactly wrong...

 As to the value of complexity theory for creativity
in programming (even though you seem to believe that
a theoretical bent of mind can only serve to stifle
creativity), the story of the discovery of an efficient
string searching algorithm by D.E.Knuth provides an
interesting case in point. Knuth based himself on
seemingly quite uncreatively theoretical work (from
*your* point of view) that gave a *better* value for
the computational complexity of string searching
than any of the then known algorithms could provide.

Regards,
Christian
-- 
»It is no paradox to say that in our most theoretical
 moods we may be nearest to our most practical applications.«
 - Alfred North Whitehead

[and those practical applications will likely be most
 creative ones..]


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Comparing lists - somewhat OT, but still ...

2005-10-16 Thread Christian Stapfer
Ron Adam [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 Christian Stapfer wrote:

 This discussion begins to sound like the recurring
 arguments one hears between theoretical and
 experimental physicists. Experimentalists tend
 to overrate the importance of experimental data
 (setting up a useful experiment, how to interpret
 the experimental data one then gathers, and whether
 one stands any chance of detecting systematic errors
 of measurement, all depend on having a good *theory*
 in the first place). Theoreticians, on the other hand,
 tend to overrate the importance of the coherence of
 theories. In truth, *both* are needed: good theories
 *and* carefully collected experimental data.

 Regards,
 Christian

 An interesting parallel can be made concerning management of production vs 
 management of creativity.

 In general, production needs checks and feedback to insure quality, but 
 will often come to a stand still if incomplete resources are available.

 Where as creativity needs checks to insure production, but in many cases 
 can still be productive even with incomplete or questionable resources. 
 The quality may vary quite a bit in both directions, but in creative
 tasks, that is to be expected.

 In many ways programmers are a mixture of these two.  I think I and Steven 
 use a style that is closer to the creative approach. I get the feeling 
 your background may be closer to the production style.

 Both are good and needed for different types of tasks. And I think most 
 programmers can switch styles to some degree if they need to.

Come to think of it, I once shared an experience
with a student who was one of those highly
creative experimentalists you seem to have
in mind. He had just bought a new PC and
wanted to check how fast its floating point
unit was as compared to our VAX. After
having done his wonderfully creative
experimenting, he was utterly dejected: Our (old)
VAX is over 10'000 times faster than my new PC,
he told me, almost in despair. Whereupon I,
always the uncreative, dogmatic theoretician,
who does not believe that much in the decisiveness
of the outcome of mere experiments, told him
that this was *impossible*, that he *must* have
made a mistake...

It turned out that the VAX compiler had been
clever enough to hoist his simple-minded test
code out of the driving loop. In fact, our VAX
calculated the body of the loop only *once*
and thus *immediately* announced that it had finished
the whole test - the compiler on this student's
PC, on the other hand, had not been clever enough
for this type of optimization: hence the difference...

  I think this is really a cautionary tale for
experimentalists: don't *believe* in the decisiveness
of the outcomes your experiments, but try to *understand*
them instead (i.e. relate them to your theoretical grasp
of the situation)...

Regards,
Christian


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Comparing lists

2005-10-16 Thread Christian Stapfer
Fredrik Lundh [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
 Christian Stapfer wrote:

 As to the value of complexity theory for creativity
 in programming (even though you seem to believe that
 a theoretical bent of mind can only serve to stifle
 creativity), the story of the discovery of an efficient
 string searching algorithm by D.E.Knuth provides an
 interesting case in point. Knuth based himself on
 seemingly quite uncreatively theoretical work (from
 *your* point of view) that gave a *better* value for
 the computational complexity of string searching
 than any of the then known algorithms could provide.

 are you talking about KMP?

Yes. I cannot give you the source of the story,
unfortunately, because I only have the *memory* of
it but don't know exactly *where* I happened to read
it. There, Knuth was said to have first analyzed the
theoretical argument very, very carefully to figure
out *why* it was that the theoretical bound was so
much better than all practically known algorithms.
It was by studying the theoretical work on computational
complexity *only* that the light dawned upon him.
(But of course, Knuth is an uncreative dumbo fit
 only for production work - I am speaking ironically
 here, which should be obvious.)
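
(For readers who do not know the algorithm under discussion: a minimal
sketch of the prefix-function ("failure function") idea at the heart of
Knuth-Morris-Pratt, which is what yields its O(n+m) bound, n and m
being the lengths of text and pattern.)

    def kmp_search(text, pattern):
        if not pattern:
            return 0
        # fail[i]: length of the longest proper prefix of
        # pattern[:i+1] that is also a suffix of it
        fail = [0] * len(pattern)
        k = 0
        for i in range(1, len(pattern)):
            while k and pattern[i] != pattern[k]:
                k = fail[k-1]
            if pattern[i] == pattern[k]:
                k += 1
            fail[i] = k
        # scan the text; no text character is ever re-examined
        k = 0
        for i in range(len(text)):
            while k and text[i] != pattern[k]:
                k = fail[k-1]
            if text[i] == pattern[k]:
                k += 1
            if k == len(pattern):
                return i - k + 1       # index of the first match
        return -1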

  I'm not sure that's really a good example of
 how useful theoretical work really is in practice:

Oh sure, yes, yes, it is. But my problem is to find
a good source of the original story. Maybe one
of the readers of this thread can provide it?

 the better computational complexity of KMP has
 turned out to be mostly useless, in practice.

Well, that's how things might turn out in the long run.
Still, at the time, to all appearances, it *was* a
case of practical creativity *triggered* by apparently
purely theoretical work in complexity theory.

 More interesting than your trying to shoot down
one special case of the more general phenomenon of
theory engendering creativity would be to know
your position on the more general question...

 It happens *often* in physics, you know. Einstein
is only one example of many. Pauli's prediction of
the existence of the neutrino is another. It took
experimentalists a great deal of time and patience
(about 20 years, I am told) until they could finally
muster something amounting to experimental proof
of Pauli's conjecture.

Regards,
Christian
-- 
Experience without theory is blind,
 but theory without experience is mere
 intellectual play.
 - Immanuel Kant

»Experience remains, of course, the sole criterion
 of the *utility* of a mathematical construction.
 But *the*creative*principle* resides in mathematics.«
 - Albert Einstein: ‘The World As I See It’

»The astronomer Walter Baade told me that, when he
 was dining with Pauli one day, Pauli exclaimed,
 Today I have done the worst thing for a theoretical
 physicist. I have invented something which can never
 be detected experimentally. Baade immediately offered
 to bet a crate of champagne that the elusive neutrino
 would one day prove amenable to experimental discovery.
 Pauli accepted, unwisely failing to specify any time
 limit, which made it impossible for him ever to win
 the bet. Baade collected his crate of champagne (as
 I can testify, having helped Baade consume a bottle of it)
 when, just over twenty years later, in 1953, Cowan and
 Reines did indeed succeed in detecting Pauli’s particle.«
 - Fred Hoyle: ‘Astronomy and Cosmology’


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Comparing lists

2005-10-16 Thread Christian Stapfer
Ron Adam [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 Christian Stapfer wrote:
 Ron Adam [EMAIL PROTECTED] wrote in message
 news:[EMAIL PROTECTED]

Christian Stapfer wrote:


This discussion begins to sound like the recurring
arguments one hears between theoretical and
experimental physicists. Experimentalists tend
to overrate the importance of experimental data
(setting up a useful experiment, how to interpret
the experimental data one then gathers, and whether
one stands any chance of detecting systematic errors
of measurement, all depend on having a good *theory*
in the first place). Theoreticians, on the other hand,
tend to overrate the importance of the coherence of
theories. In truth, *both* are needed: good theories
*and* carefully collected experimental data.

Regards,
Christian

An interesting parallel can be made concerning management of production 
vs
management of creativity.

In general, production needs checks and feedback to insure quality, but
will often come to a stand still if incomplete resources are available.

Where as creativity needs checks to insure production, but in many cases
can still be productive even with incomplete or questionable resources.
 The quality may vary quite a bit in both directions, but in creative
tasks, that is to be expected.

In many ways programmers are a mixture of these two.  I think I and 
Steven
use a style that is closer to the creative approach. I get the feeling
your background may be closer to the production style.


 This diagnosis reminds me of C.G. Jung, the psychologist,
 who, after having introduced the concepts of extra- and
 introversion, came to the conclusion that Freud was
 an extravert whereas Adler an introvert. The point is
 that he got it exactly wrong...

  As to the value of complexity theory for creativity
 in programming (even though you seem to believe that
 a theoretical bent of mind can only serve to stifle
 creativity), the story of the discovery of an efficient
 string searching algorithm by D.E.Knuth provides an
 interesting case in point. Knuth based himself on
 seemingly quite uncreatively theoretical work (from
 *your* point of view) that gave a *better* value for
 the computational complexity of string searching
 than any of the then known algorithms could provide.

 Regards,
 Christian


 (even though you seem to believe that
 a theoretical bent of mind can only serve to stifle
 creativity)

 No, that is not at all what I believe.  What I believe is, "The insistence
 of strict conditions can limit creative outcomes."

That's agreed. But going off *blindly*experimenting*
without trying to relate the outcome of that experimenting
back to one's theoretical grasp of the work one is doing
is *not* a good idea. Certainly not in the long run.
In fact, muddling-through and avoiding the question
of suitable theoretical support for one's work is
perhaps more typical of production environments.

 The lack of those limits does not prevent one from using any resources 
 (including theoretical ones) if they are available.

 You seem to be rejecting experimental results in your views.

Not at all. You must have mis-read (or simply not-read)
my posts in this thread and are simply projecting wildly,
as psychoanalysts would call it, that is all.

  And the level of insistence you keep in that view,

A view that I do not really have: you are really projecting
indeed.

 leads me to believe you favor a more productive environment
 rather than a more creative one.

You are mistaken. Although I have some practical background
(originally working as a self-taught programmer - although,
ironically, for a development and research department),
I went on to study mathematics at the Federal Institute
of Technology here in Switzerland. Do you want to say that
having been trained as a mathematician makes one uncreative?
- But it is true that mathematicians are socialized in such
a way that they tend to take over rather high standards of
precision and theoretical grounding of their work.

  Both are good, and I may entirely wrong about you,

.. you are at least *somewhat* wrong about me,
that I am quite sure of...

 as many people are capable of wearing different hats depending on the 
 situation.

 I think the gist of this thread may come down to...

 In cases where it is not clear on what direction to go because the choices 
 are similar enough to make the choosing difficult.  It is almost always 
 better to just pick one and see what happens than to do nothing.

As it appears, not even my most recent post has had
*any* recognizable effect on your thoroughly
misapprehending my position.

Regards,
Christian


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Comparing lists - somewhat OT, but still ...

2005-10-16 Thread Christian Stapfer
Steven D'Aprano [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 On Sun, 16 Oct 2005 15:16:39 +0200, Christian Stapfer wrote:

 Come to think of an experience that I shared
 with a student who was one of those highly
 creative experimentalists you seem to have
 in mind. He had just bought a new PC and
 wanted to check how fast its floating point
 unit was as compared to our VAX. After
 having done his wonderfully creative
 experimenting, he was utterly dejected: Our (old)
 VAX is over 10'000 times faster than my new PC,
 he told me, almost in despair.

 Which it was. It finished executing his code in almost 1/10,000th of the
 time his PC could do.

 Whereupon I,
 always the uncreative, dogmatic theoretician,
 who does not believe that much in the decisiveness
 of the outcome of mere experiments, told him
 that this was *impossible*, that he *must* have
 made a mistake...

 It wasn't a mistake and it did happen.

Yes, yes, of course, it was a mistake, since
the conclusion that he wanted to draw from
this experiment was completely *wrong*.
Similarly, blind experimentalism *without*
supporting theory is mostly useless.

 The VAX finished the calculation
 10,000 times faster than his PC.
 You have a strange concept of "impossible".

What about trying, for a change, to suppress
your polemical temperament? It will only lead
to quite unnecessarily long exchanges in this
NG.

 It turned out that the VAX compiler had been
 clever enough to hoist his simple-minded test
 code out of the driving loop.

But, mind you, his test was meant to determine,
*not* the cleverness of the VAX compiler *but*
the speed of the floating-point unit. So his
experiment was a complete *failure* in this regard.


 Optimizations have a tendency to make a complete mess of Big O
 calculations, usually for the better. How does this support your
 theory that Big O is a reliable predictor of program speed?

My example was meant to point out how
problematic it is to assume that experimental
outcomes (without carefully relating them
back to supporting theory) are quite *worthless*.
This story was not about Big-Oh notation but
a cautionary tale about the relation between
experiment and theory more generally.
- Got it now?

 For the record, the VAX 9000 can have up to four vector processors each
 running at up to 125 MFLOPS each, or 500 in total. A Pentium III runs at
 about 850 Mflops. Comparing MIPS or FLOPS from one system to another is
 very risky, for many reasons, but as a very rough and ready measure
 of comparison, a four processor VAX 9000 is somewhere about the
 performance of a P-II or P-III, give or take some fudge factor.

Well, that was in the late 1980s and our VAX
certainly most definitely did *not* have a
vector processor: we were doing work in
industrial automation at the time, not much
number-crunching in sight there.

 So, depending on when your student did this experiment, it is entirely
 conceivable that the VAX might have been faster even without the
 optimization you describe.

Rubbish. Why do you want to go off a tangent like
this? Forget it! I just do not have the time to
start quibbling again.

 Of course, you haven't told us what model VAX,

That's right. And it was *not* important. Since the
tale has a simple moral: Experimental outcomes
*without* supporting theory (be it of the Big-Oh
variety or something else, depending on context)
are mostly worthless.

 or how many processors, or what PC your student had,
 so this comparison might not be relevant.

Your going off on another tangent like this is
certainly not relevant to the basic insight
that experiments without supporting theory
are mostly worthless, I'd say...

 In fact, our VAX
 calculated the body of the loop only *once*
 and thus *immediately* announced that it had finished
 the whole test - the compiler on this student's
 PC, on the other hand, had not been clever enough
 for this type of optimization: hence the difference...

 Precisely. And all the Big O notation is the world will not tell you that.
 Only an experiment will. Now, perhaps in the simple case of a bare loop
 doing the same calculation over and over again, you might be able to
 predict ahead of time what optimisations the compiler will do. But for
 more complex algorithms, forget it.

 This is a clear case of experimentation leading to the discovery
 of practical results which could not be predicted from Big O calculations.

The only problem being: it was *me*, basing
myself on theory, who rejected the experimental
result that the student had accepted *as*is*.
(The student was actually an engineer, I myself
had been trained as a mathematician. Maybe that
rings a bell?)

 I find it quite mind-boggling that you would use it as if it was a triumph
 of abstract theoretical calculation when it was nothing of the sort.

This example was not at all meant to be any
such thing. It was only about: experimenting
*without* relating experimental outcomes to
theory is mostly worthless. What's more

Re: Comparing lists

2005-10-16 Thread Christian Stapfer
Steven D'Aprano [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
 On Sun, 16 Oct 2005 19:42:11 +0200, Christian Stapfer wrote:

 Pauli's prediction of
 the existence of the neutrino is another. It took
 experimentalists a great deal of time and patience
 (about 20 years, I am told) until they could finally
 muster something amounting to experimental proof
 of Pauli's conjecture.

 Pauli's conjecture was the result of experimental evidence that was
 completely inexplicable according to the theory of the day:

So was it mere experiment or was it the relation
between experiment and theory that provided
the spur for creative advancement? My position
is the latter. Mere experiment does not tell you
anything at all. Only experiment against the background
of suitable theory does that.

 energy and
 spin was disappearing from certain nuclear reactions. This was an
 experimental result that needed to be explained, and Pauli's solution was
 to invent an invisible particle that carried that energy and spin away.

Pauli's creativity lay in proposing *this*
particular solution to the puzzle. And, surely,
if it had not been for Pauli's characterization
of that hypothetical particle, experimentalists
like Cowan and Reines would not have *anything*
to aim for in the first place.

But I'm not going to argue Pauli's case any further
in this NG, because this is, in the end,
not a physics NG...

 (When I put it like that, it sounds stupid, but in fact it was an elegant
 and powerful answer to the problem.)

 The neutrino wasn't something that Pauli invented from theoretical first
 principles. It came out of hard experimental results.

 Physics of the last half century is littered with the half-forgotten
 corpses of theoretical particles that never eventuated: gravitinos,
 photinos, tachyons, rishons, flavons, hypercolor pre-quarks, axions,
 squarks, shadow matter, white holes, and so on ad nauseum.

 Neutrinos and quarks are exceptional in that experimental predictions of
 their existence were correct, and I maintain that is because (unlike all
 of the above) they were postulated to explain solid experimental results,
 not just to satisfy some theoretical itch.

 So yet again, your triumph of theory is actually a victory for experiment.

Well, I might now tell the story of Maxwell,
sitting in his garden - and deducing, from
his equations (which, admittedly, were inspired
by earlier experimental work by Faraday),
something really quite shockingly *new*:
the existence of electromagnetic waves.

Regards,
Christian
--
»From a long view of the history of mankind -
 seen from, say, ten thousand years from now
 - there can be little doubt that the most
 significant event of the nineteenth century
 will be judged as Maxwell's discovery of the
 laws of electrodynamics. The American Civil
 War will pale into provincial insignificance
 in comparison with this important scientific
 event of the same decade.«
- Richard P. Feynman: The Feynman Lectures


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Comparing lists

2005-10-17 Thread Christian Stapfer
Alex Martelli [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 Christian Stapfer [EMAIL PROTECTED] wrote:

  This is why we would like to have a way of (roughly)
 estimating the reasonableness of the outlines of a
 program's design in armchair fashion - i.e. without
 having to write any code and/or test harness.

 And we would also like to consume vast amounts of chocolate, while
 similarly reclining in comfortable armchairs,

Maybe some of my inclination towards design
based on suitable *theories* (instead of
self-conditioning through testing) goes back
to the fact that I tend to think about the
design of my programs when no computer happens
to be near at hand to do some such experimenting,
or self-conditioning...

 without getting all fat and flabby.

Well, thinking can be hard work. There is no need
to suggest an image of laziness. Thought experiments
are also quite often successful. Hardware engineers
can design very often entire gadgets without doing
a great deal of testing. They usually need to resort
to testing only if they know (or feel?) not to have
a sufficiently clear *theoretical* grasp of the
behavior of some part of their design.

  Unfortunately, what we would like and what reality affords
 are often pretty uncorrelated.  No matter how much theoreticians may
 love big-O because it's (relatively) easy to compute, it still has two
 failings which are often sufficient to rule out its sufficiency for any
 estimate [of] the reasonableness of anything: [a] as we operate on
 finite machines with finite wordsize, we may never be able reach
 anywhere even remotely close to the asymptotic region where big-O has
 some relationship to reality; [b] in many important cases, the
 theoretical worst-case is almost impossible to characterize and hardly
 ever reached in real life, so big-O is of no earthly use (and much
 harder to compute measures such as big-Theta should be used for just
 about any practical purpose).

But the fact remains that programmers, somewhat
experienced with the interface a module offers,
have a *rough*idea* of what computational complexity
attaches to which operations of that interface.
And having such a *rough*idea* helps them to
design reasonably performing programs much more
quickly.
  Big-Oh and other asymptotic complexity measures
really do have *this* advantage over having
acquired, by way of conditioning experiences,
some such *rough*idea* of computational complexity:
they capture at least some of that rough idea
in a somewhat more easily communicable and much
more precise fashion.

  Maybe you and Steven prefer to be conditioned,
Pavlov style, by the wonderful experiences that
you get while testing? - This is perhaps really
one of my *worst* psychological handicaps, I must
admit: that I don't *like* to get conditioned
like that, no matter how good it feels, no matter
how effective it might be for practical work that
one has to do.
  I want to be able to really think *about* what
I am doing. And in order to be able to think about
it one usually needs some information about the
implementation, performance wise, of the language
features and the system modules that one might
want to use. If you happen to know of any *better*
way of offering the requisite information than
asymptotic complexity measures then, of course,
I am very grateful to hear more about it.

 Consider, for example, point [b].  Quicksort's big-O is N squared,
 suggesting that quicksort's no better than bubblesort or the like.  But
 such a characterization is absurd.  A very naive Quicksort, picking its
 pivot very systematically (e.g., always the first item), may hit its
 worst case just as systematically and in cases of practical importance
 (e.g., already-sorted data); but it takes just a little extra care (in
 the pivot picking and a few side issues) to make the worst-case
 occurrences into ones that will not occur in practice except when the
 input data has been deliberately designed to damage by a clever and
 determined adversary.

 Designing based on worst-case occurrences hardly ever makes
 sense in any field of engineering,

What's wrong with wanting to have a rough idea
of what might happen in the worst case? I believe
many engineers are actually expected to think
about at least some worst-case scenarios.
Think of nuclear reactors, airplanes, or
telephone exchanges (and don't think of Google
for a change). Don't you expect engineers
and scientists designing, for example, a nuclear
reactor, to think hard about what the worst-case
scenario might be? And how likely it might happen?
(And *no* testing whatsoever in that direction,
please!) Not thinking is, admittedly, a lot easier.

snip/

 Why bother doing
 (e.g.) random pivot selection in quicksort, when its big-O (i.e.,
 worst-case) behavior will remain N-squared, just like naive quicksort,
 or, for that matter, bubblesort?

Because worst-case is not the only measure of
computational complexity that one might be
interested in. For some

Re: Comparing lists

2005-10-17 Thread Christian Stapfer
Alex Martelli [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 Christian Stapfer [EMAIL PROTECTED] wrote:

 Alex Martelli [EMAIL PROTECTED] wrote in message
 news:[EMAIL PROTECTED]
  Christian Stapfer [EMAIL PROTECTED] wrote:
 
   This is why we would like to have a way of (roughly)
  estimating the reasonableness of the outlines of a
  program's design in armchair fashion - i.e. without
  having to write any code and/or test harness.
 
  And we would also like to consume vast amounts of chocolate, while
  similarly reclining in comfortable armchairs,

 Maybe some of my inclination towards design
 based on suitable *theories* (instead of
 self-conditioning through testing) goes back
 to the fact that I tend to think about the
 design of my programs when no computer happens
 to be near at hand to do some such experimenting,
 or self-conditioning...

 Oh, I am as prone as anybody I know to do SW architecture and design in
 bed when the lights are off and I'm sliding into sleep -- just about the
 only case in which no computer is handy, or, rather, in which it's
 generally unwise to turn the computer on (since it would interfere with
 the sleep thing;-).  Back before laptops were really affordable and
 usable, I used to have a long bus commute, and did a lot of design with
 pen and paper; and whiteboards are a popular group-design tool at
 Google, no matter how many laptops or desktops happen to be around --
 whiteboards are simply more suitable for socialization around a draft
 design's sketch, than any computer-based tool I've ever seen.

 But that's *design*, and most often in pretty early stages, too -- quite
 a ways from *coding*.  At that stage, one doesn't even generally commit
 to a specific programming language or other for the eventual
 implementation of the components one's considering!  Rough ideas of
 *EXPECTED* run-times (big-Theta) for various subcomponents one is
 sketching are *MUCH* more interesting and important than asymptotic
 worst-case for amounts of input tending to infinity (big-O) -- for
 example, where I sketch-in (mentally, on paper, or on whiteboard) a
 hash table subcomponent, I consider the *expected* (Theta) performance
 (constant-time lookups), definitely NOT the big-O linear time lookups
 which just MIGHT occur (if, say, all inputs just happened to hash to the
 same value)... otherwise, I'd never use hash tables, right?-)

Well, Big-Oh and Big-Theta are both asymptotic
complexity measures, aren't they? So that's OK
with me.

  without getting all fat and flabby.

 Well, thinking can be hard work. There is no need
 to suggest an image of laziness. Thought experiments
 are also quite often successful. Hardware engineers
 can quite often design entire gadgets without doing
 a great deal of testing. They usually need to resort
 to testing only if they know (or feel) that they lack
 a sufficiently clear *theoretical* grasp of the
 behavior of some part of their design.

 Having been a hardware designer (of integrated circuits, for Texas
 Instruments, and later briefly for IBM), before switching to software, I
 can resolutely deny this assertion: only an utter madman would approve a
 large production run of an IC who has not been EXTENSIVELY tested, in
 simulations and quite possibly in breadboards and later in limited
 pre-production runs.

Whoa, yet another straw-man attack!
First, remember, I do *not* advocate doing without
testing *altogether*: so here you are just attacking
a straw-man of your own making - that's *not* me.
What I wanted to say here is just that, in my
experience as a close observer of average
hardware designers, their testing *usually*
doesn't turn up major gotchas, and this
can only be because the theory they have about
how their hardware will operate is sufficiently
powerful. - Usually: which means that exceptions
are possible, of course.

  And any larger gadget USING ICs would be
 similarly crazy to skimp on prototyping, simulation, and other testing
 -- because, as every HW engineer KNOWS (SW ones often have to learn the
 hard way), the distance between theory and practice, in practice, is
 much larger than the distance between practice and theory should be in
 theory;-).

   Unfortunately, what we would like and what reality affords
  are often pretty uncorrelated.  No matter how much theoreticians may
  love big-O because it's (relatively) easy to compute, it still has two
  failings which are often sufficient to rule out its sufficiency for any
  estimate [of] the reasonableness of anything: [a] as we operate on
  finite machines with finite wordsize, we may never be able reach
  anywhere even remotely close to the asymptotic region where big-O has
  some relationship to reality; [b] in many important cases, the
  theoretical worst-case is almost impossible to characterize and hardly
  ever reached in real life, so big-O is of no earthly use (and much
  harder to compute measures such as big-Theta should be used for just

Re: add an asynchronous exception class

2006-03-04 Thread Christian Stapfer
Paul Rubin http://[EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 Fredrik Lundh [EMAIL PROTECTED] writes:
 PEP 348 addresses this by moving special exceptions out of the
 Exception hierarchy:

 http://www.python.org/peps/pep-0348.html

 I see that suggestion was rejected (it needed changing the semantics
 of except:).  Also, PEP 348 was rejected and is a huge, complex
 reorganization of the whole exception system.  This is cited:

  http://mail.python.org/pipermail/python-dev/2005-August/055423.html

 but it talks about distinguishing terminating from non-terminating
 exceptions, whatever that means.  (Won't any uncaught exception like
 ValueError terminate the program)?

I guess it means the following:

Terminating exceptions are exceptions that
terminate the *thrower* of the exception.
Non-terminating exceptions are exceptions
that might allow the thrower to resume
(provided the catcher does *not* decide
to unwind the thrower's stack frame - and
possibly some other frames as well...).
So non-terminating exceptions allow the
thrower to offer the catcher a choice
between terminating the thrower and having
it try harder (until it possibly throws
yet another exception - maybe this time a
terminating one that does not allow the catcher
to ask for resumption of the throwing code).

  So what's terminated (or not terminated)
here is not the program but merely the
code that throws an exception.
  VAX/VMS had such a non-terminating exception
handling mechanism, for example.
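
(To make the distinction concrete, here is a purely illustrative
sketch - *not* how Python's exceptions actually behave - of the
resumption idea: the thrower hands control to a caller-supplied
handler, which may either return a substitute value, so that the
thrower resumes, or raise, so that the thrower is terminated:)

class Terminate(Exception):
    pass

def parse_int(text, on_error):
    # the "thrower": on bad input it consults the handler instead of
    # simply unwinding its own stack frame
    try:
        return int(text)
    except ValueError:
        return on_error(text)       # resume with whatever the handler returns

def handler(bad):
    if bad.strip() == '':
        return 0                    # resume: treat blank input as zero
    raise Terminate('cannot recover from %r' % bad)   # terminate the thrower

print parse_int('  ', handler)      # resumes: prints 0
print parse_int('x3', handler)      # terminates: raises Terminate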

Regards,
Christian


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: add an asynchronous exception class

2006-03-04 Thread Christian Stapfer
Paul Rubin http://[EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 Christian Stapfer [EMAIL PROTECTED] writes:
 I guess it means the following:

 Terminating exceptions are exceptions that
 terminate the *thrower* of the exception.

 Are you sure?

Am I sure? - Well, no! As I wrote: this is
just my *guess*.
But if certain terms are not explained
up front then, I assume, the writer must have
had some well-known interpretation in mind.
  These two alternatives, the termination and
the resumption models of exception handling,
are certainly that: well known. Hence
my guess...

Regards,
Christian 


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: strange math?

2006-03-18 Thread Christian Stapfer
[EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 Hello everyone,  I'm experimenting with python and i'm following this
 tutorial:
 http://docs.python.org/tut/node6.html#SECTION00640  I'm
 in section 4.7.5 Lambda Forms.  In this section I was working along and
 I noticed something strange.  It happened because of a typo.  Below is
 a copy/paste from my idle session:

 >>> def make_incrementor(n):
 ...     return lambda x: x+n
 ...
 >>> f=make_incrementor(42)
 >>> f(0)
 42
 >>> f(1)
 43
 >>> f(10)
 52
 >>> f(0)
 42
 >>> f(01)
 43
 >>> f(02)
 44
 >>> f(010)
 50
 >>> 42+010
 50

 The first f(01) was a mistake.  I accidentally forgot to delete the
 zero, but to my surprise, it yielded the result I expected.  So, I tried
 it again, and voilà, the right answer.  So, I decided to really try and
 throw it for a loop, f(010), and it produced 50.  I expected 52
 (42+10).  Why doesn't python ignore the first zero and produce a result
 of 52?

That's because python interprets 010 as *octal* 10
which is *decimal* 8. Thus

42+010 = 42+8 = 50

which is quite as it should be...
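
(For the record: a leading zero on an integer literal means octal,
while int() applied to a string defaults to base 10, so)

print 010            # prints 8
print int('010')     # prints 10
print int('010', 8)  # prints 8
print 42 + 010       # prints 50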

Regards,
Christian


-- 
http://mail.python.org/mailman/listinfo/python-list


Problem loading true-type font with PIL

2006-09-01 Thread Christian Stapfer
After switching from Python 2.3 to 2.4 (Enthought),
PIL throws an exception that did not occur formerly
(under Python 2.3) when executing

ImageFont.truetype(font, size)

where font = C:/Windows/Fonts/comic.TTF.

Here is the traceback that results:

Traceback (most recent call last):
  File Gen\gen.py, line 808, in ?
mylabeler = labeler.Labeler(szNavigFont, iNavigFontSize)
  File E:\Homepage\Gen\labeler.py, line 11, in __init__
self._font = ImageFont.truetype(font, size)
  File C:\Python24\lib\site-packages\PIL\ImageFont.py, line 202, in 
truetype
return FreeTypeFont(filename, size, index, encoding)
  File C:\Python24\lib\site-packages\PIL\ImageFont.py, line 120, in 
__init__
import _imagingft
ImportError: No module named _imagingft

A module seems to be missing: do I have to install something
in addition to PIL in order to be able to load true-type fonts?

Could someone (more knowledgeable than myself as regards PIL
and this true-type font loading business) please point me
in the right direction?

Many thanks in advance,
Christian Stapfer

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Problem loading true-type font with PIL

2006-09-01 Thread Christian Stapfer
Christian Stapfer [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 After switching from Python 2.3 to 2.4 (Enthought),
  ^
I mean: Python Enthought Edition--Python 2.4.3 for Windows,
sorry for that.

I see in the documentation for PIL that an additional
module is needed, but I don't see where I can get
it from. I'd have expected that the Python Enthought Edition
--Python 2.4.3 for *Windows* would include that module.

Regards,
Christian Stapfer

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Problem loading true-type font with PIL - solved

2006-09-02 Thread Christian Stapfer
Christian Stapfer wrote:

 After switching from Python 2.3 to 2.4 (Enthought),
 PIL throws an exception that did not occur formerly
 (under Python 2.3) when executing
 
ImageFont.truetype(font, size)

snip/

 A module seems to be missing: do I have to install something
 in addition to PIL in order to be able to load true-type fonts?

snip/

Problem solved by rudely installing PIL 1.1.5 for Windows and
Python 2.4 from http://www.pythonware.com/products/pil/
right on top of my existing Python Enthought Edition--Python
2.4.3 for Windows. This might have destroyed the consistency
of the overall installation, of course. I'm well punished
for installing Enthought Python 2.4.3: Next time I will again
install all packages that I need myself, as I did for Python
2.3, instead of using a prepackaged distribution like Enthought
Python.
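
(For anyone hitting the same problem: a quick check like the
following tells you whether freetype support actually made it
into an installation - the font path is just an example, of course.)

from PIL import ImageFont
ImageFont.truetype('C:/Windows/Fonts/comic.TTF', 24)  # raises ImportError if _imagingft is missing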

Regards (and sorry for having bothered you with my silly
posts about this problem),
Christian Stapfer

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Problem loading true-type font with PIL - solved

2006-09-03 Thread Christian Stapfer
Robert Kern wrote:
 Fredrik Lundh wrote:
 Christian Stapfer wrote:

 Problem solved by rudely installing PIL 1.1.5 for Windows and
 Python 2.4 from http://www.pythonware.com/products/pil/
 right on top of my existing Python Enthought Edition--Python
 2.4.3 for Windows. This might have destroyed the consistency
 of the overall installation, of course. I'm well punished
 for installing Enthought Python 2.4.3: Next time I will again
 install all packages that I need myself, as I did for Python
 2.3, instead of using a prepackaged distribution like Enthought
 Python.

 Since the truetype extension is quite popular, and from what I'm told
 worked just fine in earlier Enthought releases, this is probably just an
 accidental omission.  I'm sure the Enthought people will fix this if you
 report it to them.

 https://svn.enthought.com/enthought/ticket/864

 The person who builds the Enthought Edition releases is out on vacation
 this week, so a new release will probably wait until he comes back. In the
 meantime, installing Fredrik's binaries on top of ours should work just
 fine.

Great! - if it does. (Your words in God's ear...) Sorry about leaving out
some details about the release I installed. Maybe the following helps?

C:\Python24\Enthought\Doc>python
Python 2.4.3 - Enthought Edition 1.0.0 (#69, Aug  2 2006, 12:09:59) [MSC
v.1310 32 bit (Intel)] on win32

Perhaps not. Since I have already deleted the Windows installer, and do not
know of any other way to get Enthought-specific version info, this is all
I can offer at the moment... (I had downloaded the installer from
http://code.enthought.com/enthon/
on August 31, around 06:00 GMT. I suppose this means that it was
enthon-python2.4-1.0.0.exe, but cannot verify anymore whether it was this
specific installer that I used - or not.)

Best regards and thanks for your reply,
Christian Stapfer

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: dynamic drawing in web page

2006-05-21 Thread Christian Stapfer
Paul Boddie [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 barbaros wrote:

 I need to put some dynamic drawings on my web page. More precisely, I
 need to draw a number of geometric figures (circles, rectangles) which
 evolve into a graphics windows according to some law (a little bit like
 the solar system). I need also to have several fields aside the window,
 where the visitor can change values for several significant parameters
 (like the mass of each body).

 Could you please suggest a solution (preferably not involving hundreds
 of lines of code) ? The computations should be performed on the client
 machine.

 In the context of recent Flash-related discussions and the usage of
 open Web standards, I searched for 'animation SVG solar system' and
 discovered this example:

 http://www.svgelves.com/svg/L_chrisa_02.svg

 Sadly, it only works in Opera amongst the browsers I have.

It also works under IE6, using Adobe's SVG Viewer 3.0.

Regards,
Christian


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: algorithm for sorting functional expressions

2006-12-04 Thread Christian Stapfer
[EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
I am trying to write some code that will take a list of functional
 expressions, and order them so that those with primitive terms appear
 at the beginning of the list and those that are defined by other terms
 appear last.

 eg:
 getSortedEquations(['b = w + z','a = z - y','w = 2*z + v','z = e +
 f','y = p + l']) =
 ['w = 2*z + v', 'z = e + f', 'y = p + l', 'b = w + z', 'a = z - y']

 It is easy enough to tokenise each of the equations and produce a list
 like:
 [['b', ['w','z']], ['a', ['z','y']], ['w', ['z','v']], ['z', ['e','f']],
 ['y', ['p','l']]]

 But I'd like to find an algorithm that can handle the sorting problem.

 So I suspect that this is a common problem for those familiar with
 partially ordered sets or directed graphs.

Right.

 I'm wondering if anyone else
 is familiar with this problem and knows an efficient algorithm that
 will solve it. It would be good if any such algorithm would be able to
 check for circular definitions in the input.

Read http://en.wikipedia.org/wiki/Topological_sorting
or just search for topological sorting on the net.
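
(For illustration, here is a rough sketch - the names and details
are just my own choices - of Kahn's algorithm applied to the
tokenised form above. Whatever is left over at the end is part of
a circular definition.)

def sorted_terms(entries):
    # entries: list of [lhs, [terms appearing on the right-hand side]]
    defined = set(lhs for lhs, deps in entries)
    # only dependencies that are themselves defined by an equation matter
    remaining = dict((lhs, set(d for d in deps if d in defined))
                     for lhs, deps in entries)
    order = []
    while remaining:
        ready = [lhs for lhs, deps in remaining.items() if not deps]
        if not ready:
            raise ValueError('circular definitions among %s' % remaining.keys())
        for lhs in ready:
            order.append(lhs)
            del remaining[lhs]
        for deps in remaining.values():
            deps.difference_update(ready)
    return order

print sorted_terms([['b', ['w', 'z']], ['a', ['z', 'y']],
                    ['w', ['z', 'v']], ['z', ['e', 'f']],
                    ['y', ['p', 'l']]])
# prints the left-hand sides in dependency order (one of several valid orders)

This orders the left-hand-side terms; carrying the full equation
strings along instead is a straightforward extension.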

Regards,
Christian


-- 
http://mail.python.org/mailman/listinfo/python-list


'LoadFile' not found when invoking Acrobat via wx.lib.pdfwin

2007-01-12 Thread Christian Stapfer
Hi,

 I get the following traceback when trying to have
wx.lib.pdfwin.PDFWindow open a PDF file:

E:\Tutoring\Teacher\Flashcards>python pdfwin1.py
Traceback (most recent call last):
  File pdfwin1.py, line 50, in OnOpenButton
self.pdf.LoadFile(dlg.GetPath())
  File 
C:\Python24\lib\site-packages\wx-2.6-msw-unicode-enthought\wx\lib\pdfwin
.py, line 34, in LoadFile
return self.CallAXMethod('LoadFile', fileName)
  File 
C:\Python24\lib\site-packages\wx-2.6-msw-unicode-enthought\wx\activex.py
, line 385, in CallAXMethod
return self._CallAXMethod(name, args)
  File 
C:\Python24\lib\site-packages\wx-2.6-msw-unicode-enthought\wx\activex.py
, line 378, in _CallAXMethod
return _activex.ActiveXWindow__CallAXMethod(*args)
KeyError: 'method LoadFile not found'

The code in pdfwin1.py is from
http://www.daniweb.com/code/snippet618.html

How could the ActiveX control PDFWindow possibly
*not* find the 'LoadFile' method, I wonder. - Or
am I misinterpreting the message?
(I am running Enthought Python 2.4 on Windows XP.)

Thank you in advance for any ideas as to what
the problem might be,

Christian

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: can't find a way to display and print pdf through python.

2007-02-11 Thread Christian Stapfer
krishnakant Mane wrote in message 
news:[EMAIL PROTECTED]
 On 11/02/07, Vishal Bhargava [EMAIL PROTECTED] wrote:
 Use Report Lab...
 I mentioned in my first email that I am already using reportlab.
 but I can only generate pdf out of that.
 I want to display it on screen and I also will be giving a print
 button which should do the printing job.
 by the way I use wxpython for gui.

Under Windows you could do it by embedding Adobe's
ActiveX control in your application. Don't know
about Linux, though. Perhaps you could just convert
your PDF to a raster image for display (e.g. by
using ImageMagick's convert) under Linux?
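
(A rough sketch of the Linux route - it assumes ImageMagick's
convert and lpr are on the PATH, and the file names are just
examples:)

import subprocess
# render each PDF page as a PNG (page-000.png, page-001.png, ...)
subprocess.call(['convert', '-density', '150', 'report.pdf', 'page-%03d.png'])
# a print button could simply hand the PDF to the print spooler:
subprocess.call(['lpr', 'report.pdf'])

The resulting PNGs can then be displayed with wx.Image / wx.StaticBitmap.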

Regards,
Christian

-- 
http://mail.python.org/mailman/listinfo/python-list


Drawing glyphs based on their index (NOT their character code)

2010-05-20 Thread Christian Stapfer
Here's an interesting little problem: I am given a master.ttf font file and 
a subset file subset.ttf of that font, and I am asked to map indices of all 
the glyphs in subset.ttf to the corresponding indices in master.ttf. The 
subset font file is the result of a pipeline of 3 tools (pdflatex, Adobe 
Reader, and the Microsoft XPS Document Writer).
 The resulting document format (XAML) holds indices into subset.ttf, and in 
order to display that document with reference to master.ttf, I need to do 
this mapping of indices: subset -> master.
(Carrying subset.ttf along with the XAML code is not an option, I think, 
because I need to be able to support copy and paste of small snippets from 
that document through a shared whiteboard.)


I have tried to parse the two font files to the point of having the x- and 
y-coordinate points of the outlines of glyphs handy. But, unfortunately, 
subset.ttf has been processed in a way that makes it very difficult to 
compare its glyph outlines with the outlines of glyphs in master.ttf.


So my next best idea is to draw the various glyphs as 16x16 B/W images, 
say, and use the 16*16 bits (that is, 16 ints or 8 longs) that result from 
this as a sufficiently precise description of the glyph to do the necessary 
comparisons.
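
(Just to fix ideas: packing such a 16x16 black-and-white rendering
into one integer per row might look like the sketch below - assuming
one already has the glyph rendered as a PIL image, which is exactly
the part that is still open, see below.)

from PIL import Image

def fingerprint(img):
    # pack a 16x16 1-bit version of the image into 16 integers, one per row
    img = img.convert('1').resize((16, 16))
    data = list(img.getdata())          # 256 pixel values, row-major
    rows = []
    for y in range(16):
        bits = 0
        for x in range(16):
            bits = (bits << 1) | (1 if data[y * 16 + x] else 0)
        rows.append(bits)
    return tuple(rows)                  # hashable, so usable as a dict key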


PIL would be great for this, except for one little problem: I need to be 
able to draw glyphs based on their index, not based on their character code.


Any ideas to get around that limitation of PIL's drawing primitives?

Thanks in advance,
Christian 


--
http://mail.python.org/mailman/listinfo/python-list


Re: Drawing glyphs based on their index (NOT their character code)

2010-05-20 Thread Christian Stapfer
Christian Stapfer nob...@nowhere.nil wrote in message 
news:b7256$4bf516e1$544ba447$20...@news.hispeed.ch...
Here's an interesting little problem: I am given a master.ttf font file 
and a subset file subset.ttf of that font, and I am asked to map indices 
of all the glyphs in subset.ttf to the corresponding indices in 
master.ttf. The subset font file is the result of a pipeline of 3 tools 
(pdflatex, Adobe Reader, and the Microsoft XPS Document Writer).


snip/

So my next best idea is to draw the various glyphs as, 16x16 B/W images, 
say, and use the 16*16 bits (that is, 16 ints or 8 longs) that result from 
this as a sufficiently precise description of the glyph to do the 
necessary comparisons.


PIL would be great for this, except for one little problem: I need to be 
able to draw glyphs based on their index, not based on their character 
code.


Any ideas to get around that limitation of PIL's drawing primitives?


To answer my own question: I might parse the cmap table of the ttf file, and 
then invert that mapping (char -> index). I think that's going to be the next 
approach that I will try.
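
(Here is a rough sketch of that approach, assuming the fontTools
package is available - the function and variable names are mine,
not an established API:)

from fontTools.ttLib import TTFont

def char_to_index(path):
    # map Unicode code points to glyph indices for one font file
    font = TTFont(path)
    order = font.getGlyphOrder()        # glyph names; list position == glyph index
    name_to_index = dict((name, i) for i, name in enumerate(order))
    mapping = {}
    for table in font['cmap'].tables:
        if table.platformID == 3 and table.platEncID in (1, 10):  # Windows Unicode cmaps
            for code, name in table.cmap.items():
                mapping.setdefault(code, name_to_index[name])
    return mapping

sub = char_to_index('subset.ttf')
mas = char_to_index('master.ttf')
# invert the subset's cmap and compose: subset index -> char -> master index
sub_inv = dict((idx, code) for code, idx in sub.items())
sub_to_master = dict((idx, mas[code])
                     for idx, code in sub_inv.items() if code in mas)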


Regards,
Christian


--
http://mail.python.org/mailman/listinfo/python-list