Re: mysteries of urllib/urllib2

2007-07-03 Thread Ben Cartwright
On Jul 3, 9:43 am, Adrian Smith [EMAIL PROTECTED] wrote:
 The following (pinched
 from Dive Into Python) seems to work perfectly in Idle, but falls at
 the final hurdle when run as a cgi script - can anyone suggest
 anything I may have overlooked?

 request = urllib2.Request(some_URL)
 request.add_header('User-Agent', 'some_plausible_string')
 opener = urllib2.build_opener()
 data = opener.open(request).read()

Most likely the account that the CGI script runs as does not have
permission to access the network. Check the traceback to be sure. Put
this at the top of your CGI script:

import cgitb; cgitb.enable()

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: mysteries of urllib/urllib2

2007-07-03 Thread Ben Cartwright
On Jul 3, 11:14 am, Adrian Smith [EMAIL PROTECTED] wrote:
   The following (pinched
   from Dive Into Python) seems to work perfectly in Idle, but
   falls at the final hurdle when run as a cgi script
  Put this at the top of your cgi script:

  import cgitb; cgitb.enable()

Did you even try this?  Asking for Python help without posting the
traceback is like phoning your mechanic and saying, "My car is making
a generic rattling noise, can you tell me what the problem is without
looking under the hood?"

 Apparently there's a way to change the user-agent string
 by subclassing urllib's URLopener class, but that's beyond my comfort
 zone at present.

Untested:

import urllib
url = 'http://groups.google.com/group/Google-AJAX-Search-API/browse_thread/thread/a0eb87ad13b11762'
opener = urllib.FancyURLopener()
opener.addheaders = [('User-Agent', 'Fauxzilla 4.0')]
data = opener.open(url).read()

Hope that helps,
--Ben



Re: Reading stdout and stderr of an external program

2007-07-02 Thread Ben Cartwright
 I need to be able to read the stdout and stderr streams of an external
 program that I launch from my python script. os.system( 'my_prog' +
 ' err.log' ) and was planning on monitoring err.log and to display
 its contents. Is this the best way to do this?

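from subprocess import Popen, PIPE
# Without stdout=PIPE and stderr=PIPE, communicate() returns (None, None)
stdout, stderr = Popen('my_prog', stdout=PIPE, stderr=PIPE).communicate()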
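
For anyone reading this on a current interpreter, here is a minimal sketch of the same idea. It assumes Python 3.7 or later, where subprocess.run and its capture_output argument exist. The child command is a stand-in (a one-liner run through sys.executable), not the original poster's my_prog.

```python
import subprocess
import sys

# Stand-in child process: writes one line to stdout and one to stderr.
child = [sys.executable, "-c",
         "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"]

# capture_output=True attaches pipes to both streams; text=True decodes to str.
result = subprocess.run(child, capture_output=True, text=True)

out, err = result.stdout, result.stderr
```

Both streams come back as separate strings, so there is no need for an intermediate err.log file.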

--Ben



Re: saving an exception

2006-10-03 Thread Ben Cartwright
Bryan wrote:
 i would like to save an exception and reraise it at a later time.

 something similar to this:

 exception = None
 def foo():
     try:
         1/0
     except Exception, e:
         exception = e

 if exception: raise exception

 with the above code, i'm able to successfully raise the exception, but the
 line number of the exception is at the place of the explicit raise instead
 of the where the exception originally occurred.  is there anyway to fix
 this?

Sure:  generate the stack trace when the real exception occurs.  Check
out sys.exc_info() and the traceback module.

import sys
import traceback

exception = None
def foo():
    global exception
    try:
        1/0
    except Exception:
        # Build a new exception of the same type with the inner stack trace
        exctype = sys.exc_info()[0]
        exception = exctype('\nInner ' +
                            traceback.format_exc().strip())

foo()
if exception:
    raise exception

# Output:
Traceback (most recent call last):
  File "foo.py", line 15, in <module>
    raise exception
ZeroDivisionError:
Inner Traceback (most recent call last):
  File "foo.py", line 8, in foo
    1/0
ZeroDivisionError: integer division or modulo by zero
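
A side note for later interpreters, offered as a sketch rather than as part of the answer above: in Python 3 the exception object carries its own traceback in __traceback__, so saving the object and raising it later keeps the original frames. The names saved, frames and innermost are only for illustration.

```python
import traceback

saved = None

def foo():
    global saved
    try:
        1 / 0
    except Exception as exc:
        # The exception object keeps a reference to its traceback.
        saved = exc

foo()

try:
    if saved is not None:
        raise saved
except ZeroDivisionError as exc:
    # The innermost frame is still the one inside foo().
    frames = traceback.extract_tb(exc.__traceback__)
    innermost = frames[-1].name
```

Here innermost is the name of the function where the division failed, not the place of the later raise.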

--Ben



Re: loop beats generator expr creating large dict!?

2006-10-02 Thread Ben Cartwright
George Young wrote:
 I am puzzled that creating large dicts with an explicit iterable of
 key,value pairs seems to be slow.  I thought to save time by doing:

palettes = dict((w,set(w)) for w in words)

 instead of:

palettes={}
for w in words:
   palettes[w]=set(w)

 where words is a list of 200,000 english words.  But, in fact,
 timeit shows the generator expression takes 3.0 seconds
 and the for loop 2.1 seconds.  Am I missing something?

Creating those 200,000 (w, set(w)) intermediate tuples isn't free.  You
aren't doing that in the for loop version.  If you were:

# Slowest of all!
palettes={}
for w,s in ((w,set(w)) for w in words):
    palettes[w]=s
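
If you want to measure this yourself, something along these lines should do. It is a sketch for Python 3 (or 2.7), where dict comprehensions exist; a comprehension stores each pair directly without building a tuple per item. The word list is made-up stand-in data, and no particular timings are claimed.

```python
import timeit

words = ["word%d" % i for i in range(1000)]  # hypothetical stand-in data

def with_genexp():
    # One (w, set(w)) tuple is built per word and handed to dict().
    return dict((w, set(w)) for w in words)

def with_loop():
    palettes = {}
    for w in words:
        palettes[w] = set(w)
    return palettes

def with_dictcomp():
    # Dict comprehension: stores each pair directly, no tuple per item.
    return {w: set(w) for w in words}

# Seconds for 20 runs of each variant; the numbers depend on the machine.
timings = {f.__name__: timeit.timeit(f, number=20)
           for f in (with_genexp, with_loop, with_dictcomp)}
```

All three functions build the same dict; only the cost of getting there differs.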

--Ben



Re: Automatic methods in new-style classes

2006-09-29 Thread Ben Cartwright
[EMAIL PROTECTED] wrote:
 Hey, I have the following code that has to send every command it
 receives to a list of backends.
snip
 I would like to write each method like:

 flush = multimethod()

Here's one way, using a metaclass:

class multimethod(object):
    def transform(self, attr):
        def dispatch(self, *args, **kw):
            results = []
            for b in self.backends:
                results.append(getattr(b, attr)(*args, **kw))
            return results
        return dispatch

def multimethodmeta(name, bases, dict):
    """Transform each multimethod object into an actual method"""
    for attr in dict:
        if isinstance(dict[attr], multimethod):
            dict[attr] = dict[attr].transform(attr)
    return type(name, bases, dict)

class MultiBackend(object):
    __metaclass__ = multimethodmeta
    def __init__(self, backends):
        self.backends = backends
    add = multimethod()

class Foo(object):
    def add(self, x, y):
        print 'in Foo.add'
        return x + y

class Bar(object):
    def add(self, x, y):
        print 'in Bar.add'
        return str(x) + str(y)

m = MultiBackend([Foo(), Bar()])
print m.add(3, 4)

# Output:
in Foo.add
in Bar.add
[7, '34']
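
The __metaclass__ attribute is Python 2 only. On Python 3.6 or later a different technique gives the same flush = multimethod() spelling: a descriptor that learns its own name through __set_name__. This is a sketch of that alternative, not a port of the metaclass above.

```python
class multimethod:
    """Descriptor that forwards a call to every backend of the owner."""

    def __set_name__(self, owner, name):
        # Called automatically when the class body is processed (3.6+).
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        def dispatch(*args, **kw):
            return [getattr(b, self.name)(*args, **kw) for b in obj.backends]
        return dispatch


class MultiBackend:
    def __init__(self, backends):
        self.backends = backends
    add = multimethod()


class Foo:
    def add(self, x, y):
        return x + y


class Bar:
    def add(self, x, y):
        return str(x) + str(y)


m = MultiBackend([Foo(), Bar()])
results = m.add(3, 4)
```

The result list has one entry per backend, in backend order.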

--Ben



Re: Variables in nested functions

2006-08-29 Thread Ben Cartwright
[EMAIL PROTECTED] wrote:
 Is it possible to change the value of a variable in the outer function
 if you are in a nested inner function?

The typical kludge is to wrap the variable in the outer function inside
a mutable object, then pass it into the inner using a default argument:

def outer():
    a = "outer"
    def inner(wrapa=[a]):
        print wrapa[0]
        wrapa[0] = "inner"
    return inner

A cleaner solution is to use a class, and make a an instance
variable.
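
For the record, Python 3 added the nonlocal statement, which makes the wrapper unnecessary. A minimal sketch, assuming Python 3; the helper current() exists only so the result can be inspected from outside.

```python
def outer():
    a = "outer"

    def inner():
        nonlocal a          # rebind outer's variable instead of creating a local
        previous = a
        a = "inner"
        return previous

    def current():
        return a

    return inner, current

inner, current = outer()
first = inner()     # value before the rebinding
second = current()  # value after the rebinding
```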

--Ben



Re: nested dictionary assignment goes too far

2006-06-26 Thread Ben Cartwright
Jake Emerson wrote:
 However, when
 the process goes to insert the unique 'char_freq' into a nested
 dictionary the value gets put into ALL of the sub-keys

The way you're currently defining your dict:
  rain_raw_dict =
dict.fromkeys(distinctID,{'N':-6999,'char_freq':-6999,...})

Is shorthand for:
  tmp = {'N':-6999,'char_freq':-6999,...}
  rain_raw_dict = {}
  for key in distinctID:
      rain_raw_dict[key] = tmp

Note that tmp is a *reference*.  Python does not magically create
copies for you; you have to be explicit.  Unless you want a shared
value, dict.fromkeys should only be used with an immutable value (e.g.,
int or str).

What you'll need to do is either:
  tmp = {'N':-6999,'char_freq':-6999,...}
  rain_raw_dict = {}
  for key in distinctID:
      # explicitly make a (shallow) copy of tmp
      rain_raw_dict[key] = dict(tmp)

Or more simply:
  rain_raw_dict = {}
  for key in distinctID:
      rain_raw_dict[key] = {'N':-6999,'char_freq':-6999,...}

Or if you're a one-liner kinda guy,
  rain_raw_dict = dict((key, {'N':-6999,'char_freq':-6999,...})
   for key in distinctID)
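
The sharing is easy to see in a small experiment. A sketch with made-up keys and a one-entry default dict (the real dict in the thread has more fields):

```python
keys = ["a", "b", "c"]

# fromkeys stores the *same* dict object under every key.
shared = dict.fromkeys(keys, {"N": -6999})
shared["a"]["N"] = 1          # mutates the single shared dict

# Building a fresh dict per key keeps the values independent.
separate = {key: {"N": -6999} for key in keys}
separate["a"]["N"] = 1        # only touches the dict stored under "a"
```

After this, every key of shared shows the new value, while separate changed under "a" only.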

--Ben



Re: random.jumpahead: How to jump ahead exactly N steps?

2006-06-21 Thread Ben Cartwright
Matthew Wilson wrote:
 The random.jumpahead documentation says this:

 Changed in version 2.3: Instead of jumping to a specific state, n steps
 ahead, jumpahead(n) jumps to another state likely to be separated by
 many steps..

This change was necessary because the random module got a new default
generator in 2.3.  The new generator uses the Mersenne Twister
algorithm.  Pre 2.3, Wichmann-Hill was used.  (For more details, search
for jumpahead in
http://www.python.org/download/releases/2.3/NEWS.txt)

Unlike WH, there isn't a way to directly compute the Nth number in the
sequence using MT.  If you're curious as to why,
textbooks/journals/Google are your friends. :-)

 I really want a way to get to the Nth value in a random series started
 with a particular seed.  Is there any way to quickly do what jumpahead
 apparently used to do?

You can always use the old WH generator.  It's still available:
  >>> import random
  >>> wh = random.WichmannHill()
  >>> N, SEED = 100, 0
  >>> wh.seed(SEED)
  >>> for i in range(N): dummy = wh.random()
  >>> wh.random()
  0.68591619673484816
  >>> wh.seed(SEED)
  >>> wh.jumpahead(N)
  >>> wh.random()
  0.68591619673484816

 I devised this function, but I suspect it runs really slowly:

Don't just suspect.  Experiment, too. :-)

 def trudgeforward(n):
     '''Advance the random generator's state by n calls.'''
     for _ in xrange(n): random.random()

 So any speed tips would be very appreciated.

Python's random generator is implemented in C and is quite fast.  In my
tests, your trudgeforward performs acceptably with n~10.

"import psyco" is usually worth a try when improving execution speed, but
it won't help you here.  All the real work is being done in C; the
overhead of the Python interpreter is negligible.

Hope that helps,
--Ben



Re: __getattr__ question

2006-06-09 Thread Ben Cartwright
Laszlo Nagy wrote:
 So how can I tell if 'root.item3' COULD BE FOUND IN THE USUAL PLACES, or
 if it is something that was calculated by __getattr__ ?
 Of course technically, this is possible and I could give a horrible
 method that tells this...
 But is there an easy, reliable and thread safe way in the Python
 language to give the answer?

Why are you trying to do this in the first place?  If you need to
distinguish between a real attribute and something your code returns,
you shouldn't mix them by defining __getattr__ to begin with.

If, as I suspect, you just want an easy way of accessing child objects
by name, why not rename __getattr__ in your code to something like
get?

Then instead of
   root.item3
Use
   root.get('item3')

Alternately, make self.items an instance of a custom class with
__getattr__ defined.  This way, root's attribute space won't be
cluttered up.
   root.items.item3

Either way is a few more characters to type, but it's far saner than
trying to distinguish between real and fake attributes.

--Ben



Re: Function Verification

2006-06-06 Thread Ben Cartwright
Ws wrote:
 I'm trying to write up a module that *safely* sets sys.stderr and
 sys.stdout, and am currently having troubles with the function
 verification. I need to assure that the function can indeed be called
 as the Python manual specifies that sys.stdout and sys.stderr should be
 defined (standard file-like objects, only requiring a function named
 write).
snip
 My problem is in verifying the class we're trying to redirect output
 to.
 This is what I have so far:
 def _VerifyOutputStream(fh):
     if 'write' not in dir(fh):
         raise AttributeError, "The Output Stream should have a write method."
     if not callable(fh.write):
         raise TypeError, "The Output Stream's write method is not callable."
snip
 In the above _VerifyOutputStream function, how would I verify that the
 fh.write method requires only one argument, as the built-in file
 objects do?

Why not just call the function with an empty string?

def _VerifyOutputStream(fh):
    fh.write('')

Note that you don't need to manually check for AttributeError or
TypeError.  Python will do that for you.  It's generally better to act
first and ask forgiveness later.
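
If the caller would rather get a yes/no answer than an exception, the same "just try it" idea can be wrapped like this. A sketch in Python 3; verify_output_stream and the two dummy classes are hypothetical names for illustration.

```python
import io

def verify_output_stream(fh):
    """Return True if fh.write('') succeeds, False otherwise."""
    try:
        fh.write("")
    except (AttributeError, TypeError):
        # AttributeError: no write attribute at all.
        # TypeError: write is not callable, or rejects a single argument.
        return False
    return True

class NoWrite:
    pass

class WriteIsNotCallable:
    write = 42

ok = verify_output_stream(io.StringIO())
missing = verify_output_stream(NoWrite())
not_callable = verify_output_stream(WriteIsNotCallable())
```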

--Ben



Re: grouping a flat list of number by range

2006-06-01 Thread Ben Cartwright
[EMAIL PROTECTED] wrote:
 i'm looking for a way to have a list of number grouped by consecutive
 interval, after a search, for example :

 [3, 6, 7, 8, 12, 13, 15]

 =>

 [[3, 4], [6,9], [12, 14], [15, 16]]

 (6, not following 3, so 3 => [3:4] ; 7, 8 following 6 so 6, 7, 8 =>
 [6:9], and so on)

 i was able to do it without generators/yield but i think it could be
 better with them, maybe you have an idea?

Sure:

def group_intervals(it):
    it = iter(it)
    val = it.next()
    run = [val, val+1]
    for val in it:
        if val == run[1]:
            run[1] += 1
        else:
            yield run
            run = [val, val+1]
    yield run
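
A Python 3 rendering of the same generator, as a sketch: it.next() becomes next(it), and an empty input has to be handled explicitly because since Python 3.7 a StopIteration escaping a generator body turns into a RuntimeError.

```python
def group_intervals(it):
    it = iter(it)
    try:
        val = next(it)
    except StopIteration:
        return  # empty input: yield nothing
    run = [val, val + 1]
    for val in it:
        if val == run[1]:
            run[1] += 1          # extend the current half-open interval
        else:
            yield run
            run = [val, val + 1]
    yield run

groups = list(group_intervals([3, 6, 7, 8, 12, 13, 15]))
# groups == [[3, 4], [6, 9], [12, 14], [15, 16]]
```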

--Ben



Re: argmax

2006-06-01 Thread Ben Cartwright
David Isaac wrote:
 2. Is this a good argmax (as long as I know the iterable is finite)?
 def argmax(iterable): return max(izip( iterable, count() ))[1]

Other than the subtle difference that Peter Otten pointed out, that's a
good method.

However if the iterable is a list, it's cleaner (and more efficient) to
use seq.index(max(seq)).  That way you won't be creating and comparing
all those tuples.

  def argmax(it):
      try:
          it.index
      except AttributeError:
          it = list(it)
          # Or if it would be too expensive to convert it to list:
          #return -max((v, -i) for i, v in enumerate(it))[1]
      return it.index(max(it))
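
Another way to get the first index of the maximum from any iterable, without converting to a list, is the key argument of max (available since Python 2.5). A sketch written for Python 3:

```python
def argmax(it):
    """Index of the first maximal element; raises ValueError if empty."""
    # max() compares on the value only and returns the first maximal pair.
    index, _ = max(enumerate(it), key=lambda pair: pair[1])
    return index

from_list = argmax([1, 5, 3, 5])        # 1, the first of the two 5s
from_iterator = argmax(iter([2, 9, 4]))  # 1
```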

--Ben



Re: Can Python format long integer 123456789 to 12,3456,789 ?

2006-06-01 Thread Ben Cartwright
A.M wrote:
 Is there any built in feature in Python that can format long integer
 123456789 to 12,3456,789 ?

The locale module can help you here:

  >>> import locale
  >>> locale.setlocale(locale.LC_ALL, '')
  'English_United States.1252'
  >>> locale.format('%d', 123456789, True)
  '123,456,789'

Be sure to read the caveats for setlocale in the module docs:
  http://docs.python.org/lib/node323.html
I'd recommend calling setlocale only once, and always at the start of
your program.
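
On later versions (Python 2.7 and 3.1 onward) there is also a locale-independent thousands separator in the format mini-language. A sketch for Python 3.6+, since it also uses an f-string and the underscore option:

```python
n = 123456789

with_commas = format(n, ",")          # '123,456,789'
in_fstring = f"{n:,}"                 # same result via an f-string
with_underscores = format(n, "_")     # '123_456_789' (Python 3.6+)
```

This always groups by three digits, so it does not cover regional groupings; for those the locale module is still the tool.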

--Ben



Re: Can Python format long integer 123456789 to 12,3456,789 ?

2006-06-01 Thread Ben Cartwright
John Machin wrote:
 A.M wrote:
  Hi,
 
  Is there any built in feature in Python that can format long integer
  123456789 to 12,3456,789 ?
 

 Sorry about my previous post. It would produce 123,456,789.
 12,3456,789 is weird -- whose idea PHB or yours??

If it's not a typo, it's probably a regional thing.  See, e.g.,
http://en.wikipedia.org/wiki/Indian_numbering_system

--Ben



Re: list comprehensions put non-names into namespaces!

2006-05-25 Thread Ben Cartwright
[EMAIL PROTECTED] wrote:
 Lonnie List comprehensions appear to store their temporary result in a
 Lonnie variable named _[1] (or presumably _[2], _[3] etc for
 Lonnie nested comprehensions)

 Known issue.  Fixed in generator comprehensions.  Dunno about plans to fix
 it in list comprehensions.  I believe at some point in the future they may
 just go away or become syntactic sugar for a gen comp wrapped in a list()
 call.

The latter, starting in Python 3.0.  It won't be fixed before Python
3.0 because it has the potential to break existing 2.x code.  From PEP
289:

List comprehensions also leak their loop variable into the
surrounding scope. This will also change in Python 3.0, so that the
semantic definition of a list comprehension in Python 3.0 will be
equivalent to list(generator expression). Python 2.4 and beyond
should issue a deprecation warning if a list comprehension's loop
variable has the same name as a variable used in the immediately
surrounding scope.

Source: http://www.python.org/dev/peps/pep-0289/

Also mentioned in PEP 3100.

Doesn't look like the deprecation warning was ever implemented for 2.4,
though.  On my 2.4.3:

  >>> def f():
  ...     [x for x in range(10)]
  ...     print x
  ...
  >>> f()
  9
  >>> # no warning yet..

2.5 is in alpha now, hopefully the warning will be added.
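
The Python 3 behaviour described in the PEP can be checked directly. A sketch, assuming a Python 3 interpreter; loop_var is an arbitrary name chosen so that it does not collide with anything global.

```python
def comprehension_leaks():
    [loop_var for loop_var in range(10)]
    try:
        loop_var              # not visible here in Python 3
    except NameError:
        return False
    return True

leaks = comprehension_leaks()
```

On Python 3 leaks is False; the same function on 2.x would return True.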

--Ben



Re: Speed up this code?

2006-05-25 Thread Ben Cartwright
[EMAIL PROTECTED] wrote:
 I'm creating a program to calculate all primes numbers in a range of 0
 to n, where n is whatever the user wants it to be. I've worked out the
 algorithm and it works perfectly and is pretty fast, but the one thing
 seriously slowing down the program is the following code:

 def rmlist(original, deletions):
     return [i for i in original if i not in deletions]

 original will be a list of odd numbers and deletions will be numbers
 that are not prime, thus this code will return all items in original
 that are not in deletions. For n > 100,000 or so, the program takes a
 very long time to run, whereas it's fine for numbers up to 10,000.

 Does anybody know a faster way to do this? (finding the difference all
 items in list a that are not in list b)?

The in operator is expensive for lists because Python has to check,
on average, half the items in the list.  Use a better data structure...
in this case, a set will do nicely.  See the docs:

http://docs.python.org/lib/types-set.html
http://docs.python.org/tut/node7.html#SECTION00740
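
Concretely, the change to rmlist can be as small as one line. A sketch (Python 3 syntax, sample numbers invented for the demonstration):

```python
def rmlist(original, deletions):
    deletions = set(deletions)   # one-time conversion; lookups are then O(1) on average
    return [i for i in original if i not in deletions]

odds = list(range(3, 30, 2))
composites = [9, 15, 21, 25, 27]
remaining = rmlist(odds, composites)
# remaining == [3, 5, 7, 11, 13, 17, 19, 23, 29]
```

The list comprehension keeps the order of original, which a plain set difference would not guarantee.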

Oh, and you didn't ask for it, but I'm sure you're going to get a dozen
pet implementations of prime generators from other c.l.py'ers.  So
here's mine. :-)

def primes():
    """Generate prime numbers using the sieve of Eratosthenes."""
    yield 2
    marks = {}
    cur = 3
    while True:
        skip = marks.pop(cur, None)
        if skip is None:
            # unmarked number must be prime
            yield cur
            # mark ahead
            marks[cur*cur] = 2*cur
        else:
            n = cur + skip
            while n in marks:
                # n already marked as multiple of another prime
                n += skip
            # first unmarked multiple of this prime
            marks[n] = skip
        cur += 2

--Ben



Re: __getattr__ and functions that don't exist

2006-05-25 Thread Ben Cartwright
Erik Johnson wrote:
 Thanks for your reply, Nick.  My first thought was "Ahhh, now I see. That's
 slick!", but after playing with this a bit...

 >>> class Foo:
 ...     def __getattr__(self, attr):
 ...         def intercepted(*args):
 ...             print "%s%s" % (attr, args)
 ...         return intercepted
 ...
 >>> f = Foo()
 >>> f
 __repr__()
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
 TypeError: __repr__ returned non-string (type NoneType)


 my thought is "Oh... that is some nasty voodoo there!"   Especially
 if one wants to also preserve the basic functionality of __getattr__ so that
 it still works to just get an attribute where no arguments were given.

 I was thinking it would be clean to maintain an interface where you
 could call things like f.set_Spam('ham') and implement that as self.Spam =
 'ham' without actually having to define all the set_XXX methods for all the
 different things I would want to set on my object (as opposed to just making
 an attribute assignment), but I am starting to think that is probably an
 idea I should just simply abandon.

Well, you could tweak __getattr__ as follows:

>>> class Foo:
...     def __getattr__(self, attr):
...         if attr.startswith('__'):
...             raise AttributeError
...         def intercepted(*args):
...             print "%s%s" % (attr, args)
...         return intercepted

But abandoning the whole idea is probably a good idea.  How is defining
a magic set_XXX method cleaner than just setting the attribute?  Python
is not C++/Java/C#.  Accessors and mutators for simple attributes are
overkill.  Keep it simple, you'll thank yourself for it later when
maintaining your code. :-)
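
For what it is worth, here is how the guarded __getattr__ behaves on Python 3, as a sketch. One difference from the 2.x classic class in this thread: Python 3 looks up special methods such as __repr__ on the type, so __getattr__ is not consulted for them at all; the guard still matters for explicit lookups like hasattr(). The intercepted function returns a string here instead of printing, purely so the result can be inspected.

```python
class Foo:
    def __getattr__(self, attr):
        # Only reached when normal attribute lookup fails.
        if attr.startswith("__"):
            raise AttributeError(attr)
        def intercepted(*args):
            return "%s%s" % (attr, args)
        return intercepted

f = Foo()
call = f.set_Spam("ham")                     # "set_Spam('ham',)"
has_dunder = hasattr(f, "__no_such_hook__")  # False thanks to the guard
text = repr(f)                               # the default object repr
```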

 I guess I don't quite follow the error above though. Can you explain
 exactly what happens with just the evaluation of f?

Sure.  (Note, this is greatly simplified, but still somewhat complex.)
The Python interpreter does the following when you type in an
expression:

(1) evaluate the expression, store the result in temporary object
(2) attempt to access the object's __repr__ method
(3) if step 2 didn't raise an AttributeError, call the method, output
the result, and we're done
(4) if __getattr__ is defined for the object, call it with "__repr__"
as the argument
(5) if step 4 didn't raise an AttributeError, call the method, output
the result, and we're done
(6) repeat steps 2 through 5 for __str__
(7) as a last resort, output the default "<class __main__.Foo at
0xDEADBEEF>" string

In your case, the interpreter hit step 4.  f.__getattr__("__repr__")
returned the "intercepted" function, which was then called.  However,
the "intercepted" function returned None.  The interpreter was
expecting a string from __repr__, so it raised a TypeError.

Clear as mud, right?  Cutting out the __getattr__ trickery, here's a
simplified scenario (gets to step 3 from above):

  >>> class Bar(object):
  ...     def __repr__(self):
  ...         return None
  ...
  >>> b = Bar()
  >>> b
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  TypeError: __repr__ returned non-string (type NoneType)

Hope that helps!  One other small thing... please avoid top posting.

--Ben



Re: genexp surprise (wart?)

2006-05-25 Thread Ben Cartwright
Paul Rubin wrote:
 I tried to code the Sieve of Erastosthenes with generators:

 def sieve_all(n = 100):
     # yield all primes up to n
     stream = iter(xrange(2, n))
     while True:
         p = stream.next()
         yield p
         # filter out all multiples of p from stream
         stream = (q for q in stream if q%p != 0)

 # print primes up to 100
 print list(sieve_all(100))

 but it didn't work.  I had to replace

 stream = (q for q in stream if q%p != 0)

 with

 def s1(p):
     return (q for q in stream if q%p != 0)
 stream = s1(p)

 or alternatively

 stream = (lambda p,stream: \
 (q for q in stream if q%p != 0)) (p, stream)


You do realize that you're creating a new level of generator nesting
with each iteration of the while loop, right?  You will quickly hit the
maximum recursion limit.  Try generating the first 1000 primes.


 I had thought that genexps worked like that automatically, i.e. the
 stuff inside the genexp was in its own scope.  If it's not real
 obvious what's happening instead, that's a sign that the current
 behavior is a wart.  (The problem is that p in my first genexp comes
 from the outer scope, and changes as the sieve iterates through the
 stream)


I don't see how it's a wart.  p is accessed (i.e., not set) by the
genexp.  Consistent with the function scoping rules in...
http://www.python.org/doc/faq/programming/#what-are-the-rules-for-local-and-global-variables-in-python
...Python treats p in the genexp as a non-local variable.
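
The effect is easy to reproduce in isolation. In the sketch below (Python 3, names invented for the demonstration) the first genexp looks p up each time its condition runs, so it sees the rebinding; passing p through a function parameter freezes the value for that generator.

```python
p = 2
late = (q for q in range(10) if q % p != 0)
p = 3                       # rebinding p before the genexp is consumed
late_result = list(late)    # filtered with p == 3, not 2

def not_multiple_of(p, stream):
    # p is now a parameter, fixed for this generator
    return (q for q in stream if q % p != 0)

p = 2
bound = not_multiple_of(p, range(10))
p = 3
bound_result = list(bound)  # still filtered with p == 2
```

So late_result drops the multiples of 3, while bound_result drops the multiples of 2.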

--Ben



Re: Bind an instance of a base to a subclass - can this be done?

2006-05-24 Thread Ben Cartwright
Lou Pecora wrote:
 I want to subclass a base class that is returned from a Standard Library
 function (particularly, subclass file which is returned from open).  I
 would add some extra functionality and keep the base functions, too.
 But I am stuck.

 E.g.

 class myfile(file):
     def myreadline():
         #code here to return something read from file

 Then do something like (I know this isn't right, I'm just trying to
 convey the idea of what I would like)

 mf=myfile()

 mf=open("Afile","r")

 s=mf.myreadline() # Use my added function

 mf.close()        # Use the original file function


 Possible in some way?  Thanks in advance for any clues.

This:

   mf=myfile()
   mf=open("Afile","r")

Is actually creating an instance of myfile, then throwing it away,
replacing it with an instance of file.  There are no variable type
declarations in Python.

To accomplish what you want, simply instantiate the subclass:

   mf=myfile("Afile","r")

You don't need to do anything tricky, like binding the instance of the
base class to a subclass.  Python does actually support that, e.g.:

  >>> class Base(object):
  ...     def f(self):
  ...         return 'base'
  ...
  >>> class Subclass(Base):
  ...     def f(self):
  ...         return 'subclass'
  ...
  >>> b = Base()
  >>> b.__class__
  <class '__main__.Base'>
  >>> b.f()
  'base'
  >>> b.__class__ = Subclass
  >>> b.__class__
  <class '__main__.Subclass'>
  >>> b.f()
  'subclass'

But the above won't work for the built-in file type:

  >>> f = file('foo')
  >>> f.__class__
  <type 'file'>
  >>> f.__class__ = Subclass
  TypeError: __class__ assignment: only for heap types

Again though, just instantiate the subclass.  Much cleaner.

Or if that's not an option due to the way your module will be used,
just define your custom file methods as global functions that take a
file instance as a parameter.  Python doesn't force you to use OOP for
everything.

--Ben



Re: File attributes

2006-05-22 Thread Ben Cartwright
[EMAIL PROTECTED] wrote:
 I know how to walk a folder/directory using Python, but I'd like to
 check the archive bit for each file.  Can anyone make suggestions on
 how I might do this?  Thanks.


Since the archive bit is Windows-specific, your first place to check is
Mark Hammond's Python for Windows Extensions (aka win32all).  It's a
quick and painless install; grab it here:
http://python.net/crew/skippy/win32/

Once you have that installed, look in the PyWin32.chm help file for the
function calls you need.  If the documentation is too sparse, check
MSDN or google it.

For what you're trying to do:

  import win32file
  import win32con

  def togglefileattribute(filename, fileattribute, value):
      """Turn a specific file attribute on or off, leaving the other
      attributes intact.
      """
      bitvector = win32file.GetFileAttributes(filename)
      if value:
          bitvector |= fileattribute
      else:
          bitvector &= ~fileattribute
      win32file.SetFileAttributes(filename, bitvector)

  # Sample usage:
  togglefileattribute('foo.txt', win32con.FILE_ATTRIBUTE_ARCHIVE, True)

--Ben



Re: File attributes

2006-05-22 Thread Ben Cartwright
Ben Cartwright wrote:
 [EMAIL PROTECTED] wrote:
  I know how to walk a folder/directory using Python, but I'd like to
  check the archive bit for each file.  Can anyone make suggestions on
  how I might do this?  Thanks.


 Since the archive bit is Windows-specific, your first place to check is
 Mark Hammond's Python for Windows Extensions (aka win32all).  It's a
 quick and painless install; grab it here:
 http://python.net/crew/skippy/win32/

 Once you have that installed, look in the PyWin32.chm help file for the
 function calls you need.  If the documentation is too sparse, check
 MSDN or google it.

 For what you're trying to do:

   import win32file
   import win32con

   def togglefileattribute(filename, fileattribute, value):
       """Turn a specific file attribute on or off, leaving the other
       attributes intact.
       """
       bitvector = win32file.GetFileAttributes(filename)
       if value:
           bitvector |= fileattribute
       else:
           bitvector &= ~fileattribute
       win32file.SetFileAttributes(filename, bitvector)

   # Sample usage:
   togglefileattribute('foo.txt', win32con.FILE_ATTRIBUTE_ARCHIVE, True)

Or to just check the value of the bit:

  def fileattributeisset(filename, fileattr):
      return bool(win32file.GetFileAttributes(filename) & fileattr)

  print fileattributeisset('foo.txt', win32con.FILE_ATTRIBUTE_ARCHIVE)
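
The bit arithmetic behind these helpers can be tried on any platform without pywin32. The sketch below works on plain integers; the two constants carry the Win32 values (0x01 for read-only, 0x20 for archive), and the function names are made up for the example.

```python
FILE_ATTRIBUTE_READONLY = 0x01
FILE_ATTRIBUTE_ARCHIVE = 0x20

def with_attribute(bitvector, flag, value):
    """Return bitvector with flag switched on or off, other bits untouched."""
    if value:
        return bitvector | flag
    return bitvector & ~flag

def attribute_is_set(bitvector, flag):
    return bool(bitvector & flag)

attrs = FILE_ATTRIBUTE_READONLY
attrs = with_attribute(attrs, FILE_ATTRIBUTE_ARCHIVE, True)   # 0x21
set_now = attribute_is_set(attrs, FILE_ATTRIBUTE_ARCHIVE)
attrs = with_attribute(attrs, FILE_ATTRIBUTE_ARCHIVE, False)  # back to 0x01
```

Note that clearing needs "and with the complement"; assigning ~flag on its own would wipe every other attribute.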

--Ben



Re: reusing parts of a string in RE matches?

2006-05-10 Thread Ben Cartwright
John Salerno wrote:
 So my question is, how can I find all occurrences of a pattern in a
 string, including overlapping matches? I figure it has something to do
 with look-ahead and look-behind, but I've only gotten this far:

 import re
 string = 'abababababababab'
 pattern = re.compile(r'ab(?=a)')
 m = pattern.findall(string)

 This matches all the 'ab' followed by an 'a', but it doesn't include the
 'a'. What I'd like to do is find all the 'aba' matches. A regular
 findall() gives four results, but really there are seven.

 Is there a way to do this with just an RE pattern, or would I have to
 manually add the 'a' to the end of the matches?

Yes, and no extra for loops are needed!  You can define groups inside
the lookahead assertion:

  >>> import re
  >>> re.findall(r'(?=(aba))', 'abababababababab')
  ['aba', 'aba', 'aba', 'aba', 'aba', 'aba', 'aba']
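
To see where the extra matches come from, it helps to compare against a plain findall and to print the start positions. A small sketch (Python 3, but the re calls are the same as in 2.x):

```python
import re

text = "abababababababab"

plain = re.findall(r"aba", text)              # non-overlapping matches only
overlapping = re.findall(r"(?=(aba))", text)  # lookahead consumes nothing
starts = [m.start() for m in re.finditer(r"(?=(aba))", text)]
```

The plain search finds four matches; the lookahead version finds seven, one at every even offset from 0 to 12, because the zero-width match lets the scan advance by a single character each time.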

--Ben



Re: reusing parts of a string in RE matches?

2006-05-10 Thread Ben Cartwright
Murali wrote:
  Yes, and no extra for loops are needed!  You can define groups inside
  the lookahead assertion:
 
  >>> import re
  >>> re.findall(r'(?=(aba))', 'abababababababab')
  ['aba', 'aba', 'aba', 'aba', 'aba', 'aba', 'aba']

 Wonderful and this works with any regexp, so

 import re

 def all_occurences(pat,str):
     return re.findall(r'(?=(%s))'%pat,str)

 all_occurences("a.a","abacadabcda") returns ["aba","aca","ada"] as
 required.


Careful.  That won't work as expected for *all* regexps.  Example:

  >>> import re
  >>> re.findall(r'(?=(a.*a))', 'abaca')
  ['abaca', 'aca']

Note that this does *not* find 'aba'.  You might think that making it
non-greedy might help, but:

  >>> re.findall(r'(?=(a.*?a))', 'abaca')
  ['aba', 'aca']

Nope, now it's not finding 'abaca'.

This is by design, though.   From
http://www.regular-expressions.info/lookaround.html (a good read, by
the way):

"As soon as the lookaround condition is satisfied, the regex engine
forgets about everything inside the lookaround. It will not backtrack
inside the lookaround to try different permutations."

Moral of the story:  keep lookahead assertions simple whenever
possible.  :-)

--Ben



Re: i don't understand this RE example from the documentation

2006-05-08 Thread Ben Cartwright
John Salerno wrote:
 John Salerno wrote:
  Ok, I've been staring at this and figuring it out for a while. I'm close
  to getting it, but I'm confused by the examples:
 
  (?(id/name)yes-pattern|no-pattern)
  Will try to match with yes-pattern if the group with given id or name
  exists, and with no-pattern if it doesn't. |no-pattern is optional and
  can be omitted.
 
  For example, (<)?(\w+@\w+(?:\.\w+)+)(?(1)>) is a poor email matching
  pattern, which will match with '<user@host.com>' as well as
  'user@host.com', but not with '<user@host.com'. New in version 2.4.
 
  group(1) is the email address pattern, right? So why does the above RE
  match '<user@host.com>'. If the email address exists, does the last part
  of the RE: (?(1)>) mean that it has to end with a '>'?

 I think I got it. The group(1) is referring to the opening '<', not the
 email address. I had seen an earlier example that used group(0), so I
 thought maybe the groups were 0-based.

The groups *are* 0-based.  The 0th group is the whole match, e.g.:

  >>> import re
  >>> m = re.match(r'a(b+)', 'abbb')
  >>> m.group(0)
  'abbb'
  >>> m.group(1)
  'bbb'

And for the pattern you were looking at:

  >>> m = re.match(r'(<)?(\w+@\w+(?:\.\w+)+)(?(1)>)', '<user@host.com>')
  >>> m.group(0)
  '<user@host.com>'
  >>> m.group(1)
  '<'
  >>> m.group(2)
  'user@host.com'
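
Since the archive has mangled the addresses, here is a self-contained sketch of the conditional pattern, using the form given in the re module documentation with user@host.com as the sample address. Group 1 is the optional opening bracket, group 2 is the address, and (?(1)>) demands a closing bracket only when group 1 took part in the match.

```python
import re

pattern = re.compile(r"(<)?(\w+@\w+(?:\.\w+)+)(?(1)>)")

bracketed = pattern.match("<user@host.com>")
bare = pattern.match("user@host.com")
unclosed = pattern.match("<user@host.com")   # opening '<' without '>'
```

The first two calls return match objects; the third returns None. For the bare address, group(1) is None because the optional bracket never matched.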

--Ben



Re: Splice two lists

2006-05-06 Thread Ben Cartwright
[EMAIL PROTECTED] wrote:
 Is there a good way to splice two lists together without resorting to a
 manual loop? Say I had 2 lists:

 l1 = ["a","b","c"]
 l2 = [1,2,3]

 And I want a list:

 ["a",1,"b",2,"c",3] as the result.

Our good friend itertools can help us out here:

  >>> from itertools import chain, izip
  >>> x = ['a', 'b', 'c']
  >>> y = [1, 2, 3]
  >>> list(chain(*izip(x, y)))
  ['a', 1, 'b', 2, 'c', 3]
  >>> # You can splice more than two iterables at once too:
  >>> z = ['x', 'y', 'z']
  >>> list(chain(*izip(x, y, z)))
  ['a', 1, 'x', 'b', 2, 'y', 'c', 3, 'z']
  >>> # Cleaner to define it as a function:
  >>> def splice(*its):
  ...     return list(chain(*izip(*its)))
  ...
  >>> splice(x, y)
  ['a', 1, 'b', 2, 'c', 3]
  >>> splice(x, y, z)
  ['a', 1, 'x', 'b', 2, 'y', 'c', 3, 'z']
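
On Python 3 izip is gone, because the built-in zip is already lazy. A sketch of the same function there, using chain.from_iterable so the zipped tuples do not have to be unpacked into arguments first:

```python
from itertools import chain

def splice(*its):
    # zip pairs up the items; chain.from_iterable flattens the pairs.
    return list(chain.from_iterable(zip(*its)))

two = splice(["a", "b", "c"], [1, 2, 3])
three = splice(["a", "b", "c"], [1, 2, 3], ["x", "y", "z"])
```

As with izip, the result stops at the shortest input.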

--Ben



Re: Splice two lists

2006-05-06 Thread Ben Cartwright
[EMAIL PROTECTED] wrote:
 Thanks, this worked great.

Welcome. :-)

 Can you explain the syntax of the '*' on the
 return value of izip? I've only ever seen this syntax with respect to
 variable number of args.

When used in a function call (as opposed to a function definition), *
is the unpacking operator.  Basically, it flattens an iterable into
arguments.  The docs mention it...

http://www.python.org/doc/2.4.2/tut/node6.html#SECTION00674
http://www.python.org/doc/faq/programming/#how-can-i-pass-optional-or-keyword-parameters-from-one-function-to-another

...but not in great detail.  You can apply * to an arbitrary
expression, e.g.:

   def f3(a, b, c): pass

   f3(1, 2, 3)
   f3(*range(3))
   f3(*[1, 2, 3])
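
A slightly fuller sketch of the same idea, with f3 changed to return its
arguments so the effect is visible. The ** form for dictionaries is not
mentioned above; it is included here as the keyword-argument counterpart
of *.

```python
def f3(a, b, c):
    return (a, b, c)

args = [1, 2, 3]
kwargs = {'a': 1, 'b': 2, 'c': 3}

positional = f3(*args)                             # same as f3(1, 2, 3)
keyword = f3(**kwargs)                             # same as f3(a=1, b=2, c=3)
from_generator = f3(*(n * n for n in range(3)))    # any iterable works

print(positional, keyword, from_generator)
```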

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Multiple hierarchie and method overloading

2006-04-24 Thread Ben Cartwright
Philippe Martin wrote:
 I have something like this:

 Class A:
 def A_Func(self, p_param):
  .
 Class B:
 def A_Func(self):
  .

 Class C (A,B):
 A.__init__(self)
 B.__init__(self)

 .

 self.A_Func() #HERE I GET AN EXCEPTION ... takes at least 2 
 arguments (1
 given).


 I renamed A_Func(self) to fix that ... but is there a cleaner way around ?

When using multiple inheritance, the order of the base classes matters!
E.g.:

  class A(object):
      def f(self):
          print 'in A.f()'
  class B(object):
      def f(self):
          print 'in B.f()'
  class X(A, B):
      pass
  class Y(B, A):
      pass

   >>> x = X()
   >>> x.f()
   in A.f()
   >>> y = Y()
   >>> y.f()
   in B.f()

If you want to call B.f() instead of A.f() for an X instance, you can
either rename B.f() like you've done, or do this:

   >>> B.f(x)
   in B.f()
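
If you want to see the lookup order rather than infer it, new-style classes
expose it as __mro__. A small sketch, with the methods changed to return
strings instead of printing so it runs on Python 3 as well:

```python
class A(object):
    def f(self):
        return 'in A.f()'

class B(object):
    def f(self):
        return 'in B.f()'

class X(A, B):
    pass

class Y(B, A):
    pass

# The method resolution order is the order in which classes are searched.
x_order = [cls.__name__ for cls in X.__mro__]
y_order = [cls.__name__ for cls in Y.__mro__]
print(x_order)
print(y_order)
```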

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Passing data attributes as method parameters

2006-04-23 Thread Ben Cartwright
Panos Laganakos wrote:
 I'd like to know how its possible to pass a data attribute as a method
 parameter.

 Something in the form of:

 class MyClass:
 def __init__(self):
 self.a = 10
 self.b = '20'

 def my_method(self, param1=self.a, param2=self.b):
 pass

 Seems to produce a NameError of 'self' not being defined.

Default arguments are evaluated once, when the def statement runs.  At
that point there is no self yet, so you'll need to do something like
this:

class MyClass:
    def __init__(self):
        self.a = 10
        self.b = '20'

    def my_method(self, param1=None, param2=None):
        if param1 is None:
            param1 = self.a
        if param2 is None:
            param2 = self.b
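
Here is a runnable sketch of the same pattern. The return statement and the
append_to helper are additions for illustration only; append_to shows what
"evaluated once" means in practice, since its default list is shared between
calls.

```python
class MyClass(object):
    def __init__(self):
        self.a = 10
        self.b = '20'

    def my_method(self, param1=None, param2=None):
        # None is a sentinel meaning "use the instance attribute".
        if param1 is None:
            param1 = self.a
        if param2 is None:
            param2 = self.b
        return param1, param2

def append_to(item, bucket=[]):
    # The default list is created once, when def runs, and then reused.
    bucket.append(item)
    return list(bucket)

first = append_to(1)
second = append_to(2)
print(MyClass().my_method(), first, second)
```

One caveat with the sentinel: a caller can no longer pass None as a real
value, so pick a different sentinel if None is meaningful.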

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Confused by Python and nested scoping (2.4.3)

2006-04-19 Thread Ben Cartwright
Sean Givan wrote:
 def outer():
     val = 10
     def inner():
         print val
         val = 20
     inner()
     print val

 outer()

 ..I expected to print '10', then '20', but instead got an error:

print val
 UnboundLocalError: local variable 'val' referenced before assignment.

 I'm thinking this is some bug where the interpreter is getting ahead of
 itself, spotting the 'val = 20' line and warning me about something that
 doesn't need warning.  Or am I doing something wrong?


Short answer:  No, it's not a Python bug.  If inner() must modify
variables defined in outer()'s scope, you'll need to use a containing
object.  E.g.:

  class Storage(object):
      pass
  def outer():
      data = Storage()
      data.val = 10
      def inner():
          print data.val
          data.val = 20
      inner()
      print data.val

Long answer:

The interpreter (actually, the bytecode compiler) is indeed looking
ahead.  This is by design, and is why the global keyword exists.  See
http://www.python.org/doc/faq/programming/#what-are-the-rules-for-local-and-global-variables-in-python

Things get more complex than that when nested function scopes are
involved.  But again, the behavior you observed is a design decision,
not a bug.  By BDFL declaration, there is no parentscope keyword
analogous to global.  See PEP 227, specifically the Rebinding names
in enclosing scopes section: http://www.python.org/dev/peps/pep-0227/
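
For later readers: Python 3, which postdates this thread, added a nonlocal
statement (PEP 3104) for exactly this case. The sketch below assumes
Python 3 and returns the values instead of printing them.

```python
def outer():
    val = 10
    def inner():
        nonlocal val      # rebind outer()'s val instead of creating a local
        seen = val
        val = 20
        return seen
    before = inner()
    return before, val

print(outer())
```

On Python 2 the containing-object approach shown above is still the way
to go.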

Hope that helps,
--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: object instance after if isalpha()

2006-04-12 Thread Ben Cartwright
Marcelo Urbano Lima wrote:
 class abc:
   def __init__(self):
 name='marcelo'
 print x.name
 Traceback (most recent call last):
   File 1.py, line 12, in ?
 print x.name
 AttributeError: abc instance has no attribute 'name'

In Python, you explicitly include a reference to an object when setting
or accessing the object's attributes... even when you're inside one of
that object's methods.  I.e.:

  class abc:
      def __init__(self):
          self.name = 'marcelo'

When you omit the self. bit, Python creates a variable local to
__init__() named name, and the attribute is never set.  This is
different from some other OO languages (e.g. C++/Java/C#'s this), and
may take some getting used to.
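
A quick way to convince yourself, using two throwaway classes (the class
names are made up for this sketch):

```python
class WithoutSelf(object):
    def __init__(self):
        name = 'marcelo'        # local variable, gone when __init__ returns

class WithSelf(object):
    def __init__(self):
        self.name = 'marcelo'   # attribute stored on the instance

missing = hasattr(WithoutSelf(), 'name')
present = hasattr(WithSelf(), 'name')
print(missing, present)
```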

Hope that helps,
--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: CGI module: get form name

2006-04-12 Thread Ben Cartwright
ej wrote:
 I'm not seeing how to get at the 'name' attribute of an HTML form element.

 form = cgi.FieldStorage()

 gives you a dictionary-like object that has keys for the various named
 elements *within* the form...

 I could easily replicate the form name in a hidden field, but there ought to
 be some way to get directly at the form name but I'm just not seeing it.

There isn't.  This is a limitation of the CGI protocol, due to the way
HTTP requests work.  I.e., the name attribute of form is *not*
included in form submissions.  Regardless of whether the method is GET
or POST, it's only the fields' key/value pairs that are encoded and
sent off to the server.

If you need it, a hidden field is a good place for the form name.  Or
you could use cookies.

Hope that helps,
--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Regular expression intricacies: why do REs skip some matches?

2006-04-11 Thread Ben Cartwright
Tim Chase wrote:
  In [1]: import re
 
  In [2]: aba_re = re.compile('aba')
 
  In [3]: aba_re.findall('abababa')
  Out[3]: ['aba', 'aba']
 
  The return is two matches, whereas, I expected three. Why does this
  regular expression work this way?

It's just the way regexes work.  You may disagree, but it's more
intuitive that iterated pattern searching be non-overlapping by
default.  See also:

   >>> 'abababa'.count('aba')
   2

 Well, if you don't need the actual results, just their
 count, you can use

 how_many = len(re.findall('(?=aba)', 'abababa'))

 which will return 3.  However, each result is empty:

 >>> print re.findall('(?=aba)', 'abababa')
 ['','','']

 You'd have to do some chicanery to get the actual pieces:
(snip)

Actually, you can just define a group inside the lookahead assertion:

   >>> re.findall('(?=(aba))', 'abababa')
   ['aba', 'aba', 'aba']
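
Put together as a script (a sketch that should behave the same on Python 2
and 3), including the start offsets, which show why the lookahead version
finds the overlapping match:

```python
import re

text = 'abababa'

plain = re.findall('aba', text)              # consumes each match
overlapping = re.findall('(?=(aba))', text)  # lookahead consumes nothing
starts = [m.start() for m in re.finditer('(?=aba)', text)]

print(plain)
print(overlapping)
print(starts)
```

Because the lookahead is zero-width, the scan only advances one character
after each hit, so a match can begin inside the previous one.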

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: finish_endtag in sgmllib.py [Python 2.4]

2006-04-11 Thread Ben Cartwright
Richard Hsu wrote:
 code:-

     # Internal -- finish processing of end tag
     def finish_endtag(self, tag):
         if not tag:  #  i am confused about this
             found = len(self.stack) - 1
             if found < 0:
                 self.unknown_endtag(tag)  #  and this
                 return

 I am a little confused as to what is intended by "if not tag:"
 does it mean
 if tag == None or tag == "":  # ?

Technically, not quite.  See http://docs.python.org/lib/truth.html

In practice, tag will indeed be a string type (shouldn't ever be None),
so 'tag == ""' would work just as well* as 'not tag'.  However, it's
cleaner and clearer to use the latter.

* = barring some contrived custom string type

 tag here is supposed to be a string.

 so the only way it will be True is when its either None or its "", then
 we are essentially passing None or "" to self.unknown_endtag(tag) ??

Yes, a string of length zero will always be passed to unknown_endtag
here.  Answering your implicit question, there's no good reason to
write it as self.unknown_endtag(None) instead.
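
To make the "technically, not quite" concrete: "not tag" is true for every
falsy value, not only None and the empty string. A small sketch (the
describe() helper is made up for illustration and is not part of sgmllib):

```python
# Values that all count as false in a boolean context.
falsy = [None, '', 0, 0.0, [], (), {}, set()]
all_false = all(not value for value in falsy)

def describe(tag):
    """Mirror the 'if not tag:' test on a plain string."""
    if not tag:
        return 'empty'
    return 'tag %s' % tag

print(all_false, describe(''), describe('p'))
```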

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: RIIA in Python 2.5 alpha: with... as

2006-04-11 Thread Ben Cartwright
Terry Reedy wrote:
 Alexander Myodov [EMAIL PROTECTED] wrote in message
 news:[EMAIL PROTECTED]
  and even list comprehensions:
   b1 = [l for l in a1]
   print l: %s % l

 This will go away in 3.0.  For now, del l if you wish.

Or use a generator expression:
   >>> b1 = list(l for l in a1)
   >>> l
   NameError: name 'l' is not defined

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: About classes and OOP in Python

2006-04-11 Thread Ben Cartwright
Michele Simionato wrote:
 Roy Smith wrote:
 snip
  That being said, you can indeed have private data in Python.  Just prefix
  your variable names with two underscores (i.e. __foo), and they effectively
  become private.  Yes, you can bypass this if you really want to, but then
  again, you can bypass private in C++ too.

 Wrong, _foo is a *private* name (in the sense don't touch me!), __foo
 on the contrary is a *protected* name (touch me, touch me, don't worry
 I am protected against inheritance!).
 This is a common misconception, I made the error myself in the past.

Sure, if you only consider private and protected as they're defined
in a dictionary.  But then you'd be ignoring the meanings of the
public/private/protected keywords in virtually every language that has
them.  http://www.google.com/search?q=public+private+protected

Python doesn't have these keywords, but most Python programmers are at
least somewhat familiar with a language that does use them.  For the
sake of clarity:
  __foo ~= private = used internally by base class only
  _foo ~= protected = used internally by base and derived classes

The Python docs use the above definitions.
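
What the double underscore actually does is name mangling, which you can
watch happen. A sketch with made-up class and attribute names:

```python
class Base(object):
    def __init__(self):
        self._shared = 'base and subclasses'   # convention only
        self.__hidden = 'base only'            # stored as _Base__hidden

    def hidden(self):
        return self.__hidden                   # reads _Base__hidden

class Derived(Base):
    def __init__(self):
        Base.__init__(self)
        self.__hidden = 'derived only'         # stored as _Derived__hidden

d = Derived()
print(d.hidden(), d._Derived__hidden, d._shared)
```

So __hidden in Derived does not clobber __hidden in Base, while _shared is
one ordinary attribute that both classes see.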

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How can I determine the property attributes on a class or instance?

2006-04-11 Thread Ben Cartwright
mrdylan wrote:
 class TestMe(object):
   def get(self):
 pass
   def set(self, v):
 pass

   p = property( get, set )

 t = TestMe()
 type(t.p) #returns NoneType, what???
 t.p.__str__  #returns method-wrapper object at XXX
 ---

 What is the best way to determine that the attribute t.p is actually a
 property object? Obviously I can test the __str__ or __repr__
 attributes using substring comparison but there must be a more elegant
 idiom.

Check the class instead of the instance:

   >>> type(TestMe.p)
   <type 'property'>
   >>> type(t.__class__.p)
   <type 'property'>
   >>> isinstance(t.__class__.p, property)
   True
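
The same check as a script, plus one way to list every property a class
defines directly. This is a sketch: vars() only looks at the class's own
__dict__, so inherited properties would need a walk over the MRO.

```python
class TestMe(object):
    def get(self):
        return None
    def set(self, v):
        pass
    p = property(get, set)

t = TestMe()

on_instance = type(t.p).__name__             # the getter's result, not the property
on_class = isinstance(type(t).p, property)   # the property object itself
property_names = sorted(name for name, value in vars(TestMe).items()
                        if isinstance(value, property))

print(on_instance, on_class, property_names)
```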

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: very strange problem in 2.4

2006-04-07 Thread Ben Cartwright
John Zenger wrote:
 Your list probably contains several references to the same object,
 instead of several different objects.  This happens often when you use a
 technique like:

 list = [ object ] * 100

This is most likely what's going on.  To the OP: please post the
relevant code, including how you create mylist and the definitions of
change_var_a and return_var_a.  I suspect you're doing something like
this:

>>> class C(object):
...     def __init__(self, x):
...         self.x = x
...     def __repr__(self):
...         return 'C(%r)' % self.x
...
>>> mylist = [C(0)]*3 + [C(1)]*3
>>> mylist
[C(0), C(0), C(0), C(1), C(1), C(1)]
>>> mylist[0].x = 2
>>> mylist
[C(2), C(2), C(2), C(1), C(1), C(1)]

When you should do something like:

>>> mylist = [C(0) for i in range(3)] + [C(1) for i in range(3)]
>>> mylist
[C(0), C(0), C(0), C(1), C(1), C(1)]
>>> mylist[0].x = 2
>>> mylist
[C(2), C(0), C(0), C(1), C(1), C(1)]
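
If you're not sure which situation you're in, counting distinct object ids
settles it. A sketch using a cut-down version of the class above:

```python
class C(object):
    def __init__(self, x):
        self.x = x

shared = [C(0)] * 3                    # three references to ONE object
distinct = [C(0) for _ in range(3)]    # three separate objects

shared[0].x = 2
distinct[0].x = 2

shared_values = [c.x for c in shared]
distinct_values = [c.x for c in distinct]
unique_shared = len(set(id(c) for c in shared))
unique_distinct = len(set(id(c) for c in distinct))

print(shared_values, unique_shared)
print(distinct_values, unique_distinct)
```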

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pre-PEP: The create statement

2006-04-06 Thread Ben Cartwright
Michael Ekstrand wrote:
 Is there a natural way
 to extend this to other things, so that function creation can be
 modified? For example:

 create tracer fib(x):
 # Return appropriate data here
 pass

 tracer could create a function that logs its entry and exit; behavior
 could be modifiable at run time so that tracer can go away into oblivion.

 Given the current semantics of create, this wouldn't work. What would be
  reasonable syntax and semantics to make something like this possible?

The standard idiom is to use a function wrapper, e.g.

def tracer(f):
    def wrapper(*args):
        print 'call', f, args
        result = f(*args)
        print f, args, '=', result
        return result
    return wrapper

def fact(x):
    if not x: return 1
    return x * fact(x-1)
fact = tracer(fact)  # wrap it

The decorator syntax was added in Python 2.4 to make the wrapper
application clearer:

@tracer
def fact(x):
    if not x: return 1
    return x * fact(x-1)

http://www.python.org/dev/peps/pep-0318
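
A variation on the same wrapper for newer Pythons. Two things here are my
additions, not part of the idiom above: functools.wraps (Python 2.5+), which
keeps the wrapped function's name and docstring, and recording the calls in
a list instead of printing them so the trace can be inspected afterwards.

```python
import functools

def tracer(func):
    calls = []

    @functools.wraps(func)          # keep func.__name__ and func.__doc__
    def wrapper(*args, **kwargs):
        calls.append(args)          # log on entry
        return func(*args, **kwargs)

    wrapper.calls = calls           # expose the log for inspection
    return wrapper

@tracer
def fact(x):
    """Return x factorial."""
    if not x:
        return 1
    return x * fact(x - 1)
```

Since the recursive call goes through the module-level name fact, every level
of the recursion passes through the wrapper; after fact(3), fact.calls holds
one entry per level.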

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: confusing behaviour of os.system

2006-04-06 Thread Ben Cartwright
Todd wrote:
 I'm trying to run the following in python.

 os.system('/usr/bin/gnuclient -batch -l htmlize -eval "(htmlize-file
 \"test.c\")"')

Python is interpreting the \"s as "s before it's being passed to
os.system.  Try doubling the backslashes.

>>> print '/usr/bin/gnuclient -batch -l htmlize -eval "(htmlize-file \"test.c\")"'
/usr/bin/gnuclient -batch -l htmlize -eval "(htmlize-file "test.c")"
>>> print '/usr/bin/gnuclient -batch -l htmlize -eval "(htmlize-file \\"test.c\\")"'
/usr/bin/gnuclient -batch -l htmlize -eval "(htmlize-file \"test.c\")"
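
The same effect on a shorter string, as a sketch. The raw-string form is an
alternative I'm adding here; it leaves backslashes alone, so there is
nothing to double.

```python
single = 'say \"hi\"'      # Python turns \" into a bare "
doubled = 'say \\"hi\\"'   # \\ survives as one literal backslash
raw = r'say \"hi\"'        # raw string: backslashes are kept as written

print(single)
print(doubled)
print(raw)
```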

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to make a generator use the last yielded value when it regains control

2006-04-06 Thread Ben Cartwright
John Salerno wrote:
 It
 is meant to take a number and generate the next number that follows
 according to the Morris sequence. It works for a single number, but what
 I'd like it to do is either:

 1. repeat indefinitely and have the number of times controlled elsewhere
 in the program (e.g., use the morris() generator in a for loop and use
 that context to tell it when to stop)

 2. just make it a function that takes a second argument, that being the
 number of times you want it to repeat itself and create numbers in the
 sequence

Definitely go for (1).  The Morris sequence is a great candidate to
implement as a generator.  As a generator, it will be more flexible and
efficient than (2).

def morris(num):
    """Generate the Morris sequence starting at num."""
    num = str(num)
    yield num
    while True:
        result, cur, run = [], None, 0
        for digit in num+'\n':
            if digit == cur:
                run += 1
            else:
                if cur is not None:
                    result.append(str(run))
                    result.append(cur)
                cur, run = digit, 1
        num = ''.join(result)
        yield num

# Example usage:
from itertools import islice
for n in islice(morris(1), 10):
    print n

# Output:

1
11
21
1211
111221
312211
13112221
1113213211
31131211131221
13211311123113112211


--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to make a generator use the last yielded value when it regains control

2006-04-06 Thread Ben Cartwright
John Salerno wrote:
 Actually I was just thinking about this and it seems like, at least for
 my purpose (to simply return a list of numbers), I don't need a
 generator.

Yes, if it's just a list of numbers you need, a generator is more
flexibility than you need.  A generator would only come in handy if,
say, you wanted to give your users the option of getting the next N
items in the sequence, *without* having to recompute everything from
scratch.


 My understanding of a generator is that you do something to
 each yielded value before returning to the generator (so that you might
 not return at all),

A generator is just an object that spits out values upon request; it
doesn't care what the caller does with those values.

There are many different ways to use generators; a few examples:

# Get a list of the first 10
from itertools import islice
m = [n for n in islice(morris(1), 10)]

# Prompt user between each iteration
for n in morris(1):
    if raw_input('keep going? ') != 'y':
        break
    print n

# Alternate way of writing the above
g = morris(1)
while raw_input('keep going? ') == 'y':
    print g.next()
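
The point about not recomputing from scratch is easier to see with a tiny
generator. running_total below is a made-up example, and next(g) is the
spelling that works on Python 2.6+ and 3 (g.next() is Python 2 only).

```python
def running_total(values):
    total = 0
    for value in values:
        total += value
        yield total        # pause here until the caller asks again

g = running_total([1, 2, 3, 4])
first = next(g)            # runs the body up to the first yield
second = next(g)           # resumes where it left off; total is remembered
rest = list(g)             # drains whatever is left

print(first, second, rest)
```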

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Simple py script to calc folder sizes

2006-03-30 Thread Ben Cartwright
Caleb Hattingh wrote:
 Your code works on some folders but not others.   For example, it works
 on my /usr/lib/python2.4 (the example you gave), but on other folders
 it terminates early with StopIteration exception on the
 os.walk().next() step.

 I haven't really looked at this closely enough yet, but it looks as
 though there may be an issue with permissions (and not having enough)
 on subfolders within a tree.

You're quite correct.  Here's a version of John's code that handles
such cases:

import os
import warnings
def foldersize(fdir):
   """Returns the size of all data in folder fdir in bytes"""
   try:
      root, dirs, files = os.walk(fdir).next()
   except StopIteration:
      warnings.warn("Could not access " + fdir)
      return 0
   files = [os.path.join(root, x) for x in files]
   dirs = [os.path.join(root, x) for x in dirs]
   return sum(map(os.path.getsize, files)) + sum(map(foldersize, dirs))

There's also another bug in the prettier() function that barfs on empty
directories, as it's taking the log of 0.  The fix:

   exponent = int(math.log(max(1, bytesize), 1024))
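
A non-recursive sketch of foldersize for Python 3, where the .next() method
on generators is gone. It leans on the fact that os.walk skips directories
it cannot list by default, and it additionally skips files that vanish or
cannot be stat'ed, which is my assumption about the desired behavior.

```python
import os

def foldersize(fdir):
    """Return the total size in bytes of all files under fdir."""
    total = 0
    # os.walk ignores unreadable directories unless onerror is given.
    for root, dirs, files in os.walk(fdir):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass    # file disappeared or is not accessible
    return total
```

Unlike the version above, this one gives no warning for skipped
directories; pass an onerror callback to os.walk if you want to know
about them.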

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Newbie: splitting dictionary definition across two .py files

2006-03-30 Thread Ben Cartwright
[EMAIL PROTECTED] wrote:
 I like to define a big dictionary in two
 files and use it my main file, build.py

 I want the definition to go into build_cfg.py and build_cfg_static.py.

 build_cfg_static.py:
 target_db = {}
 target_db['foo'] = 'bar'

 build_cfg.py
 target_db['xyz'] = 'abc'

 In build.py, I like to do
 from build_cfg_static import *
 from build_cfg import *

 ...now use target_db to access all elements. The problem looks like, I
 can't
 have the definition of target_db split across two files. I think they
 reside in different name spaces?

Yes.  As it stands, importing build_cfg.py fails (NameError: name
'target_db' is not defined), because each module runs in its own
namespace and never sees the target_db from build_cfg_static.py.

Unless you're doing something ugly like exec() on its contents, a .py
file has to run cleanly on its own before it can be imported.

 Is there any way I can have the same
 dictionary definition split across two files?

Try this:

# build_cfg_static.py:
target_db = {}
target_db['foo'] = 'bar'

# build_cfg.py:
target_db = {}
target_db['xyz'] = 'abc'

# build.py:
from build_cfg_static import target_db
from build_cfg import target_db as merge_db
target_db.update(merge_db)

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: adding a new line of text in Tk

2006-03-26 Thread Ben Cartwright
nigel wrote:
 w =Label(root, text=Congratulations you have made it this far,just a few more
 questions then i will be asking you some)

 The problem i have is where i have started to write some textCongratulations
 you have made it this far,just a few more questions then i will be asking you
 some)
 I would actually like to add some text but it puts it all on one line.I would
 like to be able to tell it to start a new line.


Just use \n in your string, e.g.:

w = Label(root, text="Line 1\nLine 2\nLine 3")

Or a triple-quoted string will do the trick:

w = Label(root, text="""Line 1
Line 2
Line 3""")

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Simple py script to calc folder sizes

2006-03-21 Thread Ben Cartwright
Caleb Hattingh wrote:
 Unless you have a nice tool handy, calculating many folder sizes for
 clearing disk space can be a click-fest nightmare.   Looking around, I
 found Baobab (gui tool); the du linux/unix command-line tool; the
 extremely impressive tkdu: http://unpythonic.net/jeff/tkdu/ ; a python
 script I didn't really understand at
 http://vsbabu.org/webdev/zopedev/foldersize.html (are these folder
 objects zope thingies?);  there are also tools that can add a
 foldersize column into Explorer on Windows
 (foldersize.sourceforge.net, for example);  the superb freeCommander
 file-manager (win32) has the functionality built in, and so on.

You also might want to take a look at KDirStat
(http://kdirstat.sourceforge.net/) and its win32 counterpart,
WinDirStat (http://windirstat.sourceforge.net/).

 du is closest to what I was looking for, but is not immediately
 cross-platform: I know I can probably get it through Cygwin, and there
 is probably a win32 binary or clone around somewhere

Try http://unxutils.sourceforge.net/ ... much quicker to set up than
Cygwin.

A pure Python port of du (and other unix utilities) would be cool,
though.

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Using Dictionaries in Sets - dict objects are unhashable?

2006-03-21 Thread Ben Cartwright
Gregory Piñero wrote:
 Hey guys,

 I don't understand why this isn't working for me.  I'd like to be able
 to do this.  Is there another short alternative to get this
 intersection?

 [Dbg] set([{'a':1},{'b':2}]).intersection([{'a':1}])
 Traceback (most recent call last):
   File interactive input, line 1, in ?
 TypeError: dict objects are unhashable

Assuming you're using Python 2.4+:

>>> d1 = {'a':1, 'b':2, 'c':3, 'd':5}
>>> d2 = {'a':1, 'c':7, 'e':6}
>>> dict((k, v) for k, v in d1.iteritems() if k in d2)
{'a': 1, 'c': 3}

Or if you're comparing key/value pairs instead of just keys:

>>> dict((k, v) for k, v in d1.iteritems() if k in d2 and d2[k]==v)
{'a': 1}

Finally, if you're on Python 2.3, use these versions (less efficient
but still functional):

>>> dict([(k, v) for k, v in d1.iteritems() if k in d2])
{'a': 1, 'c': 3}
>>> dict([(k, v) for k, v in d1.iteritems() if k in d2 and d2[k]==v])
{'a': 1}
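
For Python 3 readers, where iteritems() is gone, a sketch of the same two
intersections using dict views, plus one way to get the set-of-dicts the
original question asked for. The frozenset trick assumes all the values are
hashable.

```python
d1 = {'a': 1, 'b': 2, 'c': 3, 'd': 5}
d2 = {'a': 1, 'c': 7, 'e': 6}

# keys() and items() views support set operations in Python 3.
common_keys = {k: d1[k] for k in d1.keys() & d2.keys()}
common_items = dict(d1.items() & d2.items())

# Dicts are unhashable, but a frozenset of their items is not.
left = set(frozenset(d.items()) for d in [{'a': 1}, {'b': 2}])
right = set(frozenset(d.items()) for d in [{'a': 1}])
shared_dicts = [dict(items) for items in left & right]

print(common_keys, common_items, shared_dicts)
```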

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: TypeError coercing to Unicode with field read from XML file

2006-03-21 Thread Ben Cartwright
Randall Parker wrote:
 My problem is that once I parse the file with minidom and a field from
 it to another variable as shown with this line:
 IPAddr = self.SocketSettingsObj.IPAddress

 I get this error:
[...]
 if TargetIPAddrList[0]   and TargetIPPortList[0] 
 0:
 StillNeedSettings = False

 TestSettingsStore.SettingsDictionary['TargetIPAddr'] =
 TargetIPAddrList[0]

 TestSettingsStore.SettingsDictionary['TargetIPPort'] =
 TargetIPPortList[0]


TargetIPAddrList[0] and TargetIPPortList[0] are *not* a string and an
int, respectively.  They're both DOM elements.  If you want an int, you
have to explicitly convert the value with int().  Type matters in
Python:

   >>> '0' == 0
   False

Back to your code:  try a couple debugging print statements to see
exactly what your variables are.  The built-in type() function should
help.

To fix the problem, you need to dig a little deeper in the DOM, e.g.:

addr = TargetIPAddrList[0].firstChild.nodeValue
try:
    port = int(TargetIPPortList[0].firstChild.nodeValue)
except ValueError:  # safely handle invalid strings for int
    port = 0
if addr and port:
    StillNeedSettings = False
    TestSettingsStore.SettingsDictionary['TargetIPAddr'] = addr
    TestSettingsStore.SettingsDictionary['TargetIPPort'] = port
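
A self-contained sketch you can run to see the difference between the
element and its text. The XML snippet and its tag names are invented for
the example; I haven't seen your actual settings file.

```python
from xml.dom import minidom

# Hypothetical settings document, just for illustration.
XML = ('<settings><TargetIPAddr>10.0.0.5</TargetIPAddr>'
       '<TargetIPPort>8080</TargetIPPort></settings>')

doc = minidom.parseString(XML)
addr_elements = doc.getElementsByTagName('TargetIPAddr')
port_elements = doc.getElementsByTagName('TargetIPPort')

element_type = type(addr_elements[0]).__name__     # a DOM Element, not a str

# The text lives in a child text node of the element.
addr = addr_elements[0].firstChild.nodeValue
try:
    port = int(port_elements[0].firstChild.nodeValue)
except ValueError:
    port = 0

print(element_type, addr, port)
```

Note that firstChild is None for an empty element such as
<TargetIPAddr/>, so real code should check for that before reading
nodeValue.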

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Function params with **? what do these mean?

2006-03-20 Thread Ben Cartwright
Dave Hansen wrote:
 On 20 Mar 2006 15:45:36 -0800 in comp.lang.python,
 [EMAIL PROTECTED] (Aahz) wrote:
 Personally, I think it's a Good Idea to stick with the semi-standard
 names of *args and **kwargs to make searching easier...

 Agreed (though kwargs kinda makes my skin crawl).

Coincidentally, kwargs is the sound my cat makes when coughing up a
hairball.

Fortunately, **kw is also semi-standard.

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: what's the general way of separating classes?

2006-03-20 Thread Ben Cartwright
John Salerno wrote:
 bruno at modulix wrote:

  It seems like this can
  get out of hand, since modules are separate from one another and not
  compiled together. You'd end up with a lot of import statements.
 
  Sorry, but I don't see the correlation between compilation and import
  here ?

 I meant that in a language like C#, which compiles all the separate
 files into one program, it is not necessary to have the equivalent of an
 import/include type of statement.

Er?  Surely you've used C#'s using statement?  Apples and oranges,
but:

C#'s using Foo.Bar; is roughly analogous to Python's
from foo.bar import *.

C#'s int x = Foo.Bar.f(); is roughly analogous to Python's
import foo.bar; x = foo.bar.f().

 You can just refer to the classes from
 any other file.

Iff they're in the same namespace.  You can have multiple namespaces in
the same .NET assembly, you know.

 But in Python, without this behavior, you must
 explicitly import any external files.

That's true.  Each Python file is essentially its own namespace.  And
when, say, __init__.py does a from submodule import * it essentially
merges submodule's namespace into its own.

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: xmlrpclib and carriagereturn (\r)

2006-03-18 Thread Ben Cartwright
Jonathan Ballet wrote:
 The problem is, xmlrpclib eats those carriage return characters when
 loading the XMLRPC request, and replace it by \n. So I got bla\n\nbla.

 When I sent back those parameters to others Windows clients (they are
 doing some kind of synchronisation throught the XMLRPC server), I send
 to them only \n\n, which makes problems when rendering strings.

Did you develop the Windows client, too?  If so, the client-side fix is
trivial: replace \n with \r\n in all renderable strings.  Or update
both the client and the server to encode the strings, also trivial
using the base64 module.

If not, and you're in the unfortunate position of being forced to
support buggy third-party clients, read on.

 It seems that XMLRPC spec doesn't propose to eat carriage return
 characters : (from http://www.xmlrpc.com/spec)
(snip)
 It seems to be a rather strange comportement from xmlrpclib. Is it known ?
 So, what happens here ? How could I solve this problem ?

The XMLRPC spec doesn't say anything about CRs one way or the other.
Newline handling is necessarily left to the XML parser implementation.
In Python's case, xmlrpclib uses the xml.parsers.expat module, which
reads universal newlines and writes Unix-style newlines (\n).  There's
no option to disable this feature.

You could modify xmlrpclib to use a different parser, but it would be
much easier to just hack the XML response right before it's sent out.
I'm assuming you used the SimpleXMLRPCServer module.  Example:

from SimpleXMLRPCServer import *

class MyServer(SimpleXMLRPCServer):
    def _marshaled_dispatch(self, data, dm=None):
        response = SimpleXMLRPCDispatcher._marshaled_dispatch(self,
            data, dm)
        return response.replace('\n', '\r\n')

server = MyServer(('localhost', 8000))
server.register_introspection_functions()
server.register_function(lambda x: x, 'echo')
server.serve_forever()

Hope that helps,
--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: xmlrpclib and carriagereturn (\r)

2006-03-18 Thread Ben Cartwright
Jonathan Ballet wrote:
 The problem is, xmlrpclib eats those carriage return characters when
 loading the XMLRPC request, and replace it by \n. So I got bla\n\nbla.

 When I sent back those parameters to others Windows clients (they are
 doing some kind of synchronisation throught the XMLRPC server), I send
 to them only \n\n, which makes problems when rendering strings.


Whoops, just realized we're talking about \n\r here, not \r\n.
Most of my previous reply doesn't apply to your situation, then.

As far as Python's expat parser is concerned, \n\r is two newlines:
one Unix-style and one Mac-style.  It correctly (per XML specs)
normalizes both to Unix-style.

Is \n\r being used as a newline by your Windows clients, or is it a
control code?  If the former, I'd sure like to know why.  If the
latter, then you're submitting binary data and you shouldn't be using
string to begin with.  Try base64.
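
A sketch of the base64 round trip, written for Python 3 (hence the explicit
encode/decode steps). The encoded text contains no CR or LF, so there is
nothing left for the XML parser to normalize.

```python
import base64

original = 'bla\n\rbla'

# Client side: encode before putting the value in the XML-RPC call.
encoded = base64.b64encode(original.encode('utf-8')).decode('ascii')

# Receiving side: decode to get the exact original characters back.
decoded = base64.b64decode(encoded).decode('utf-8')

print(repr(encoded), repr(decoded))
```

Both ends have to agree on this, of course, which only works if you control
the clients.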

If worst comes to worst and you have to stick with sending \n\r
intact in a string param, you'll need to modify xmlrpclib to use a
different (and technically noncompliant) XML parser.  Here's an ugly
hack to do that out of the box:

# In your server code:
import xmlrpclib
# This forces xmlrpclib to fall back on the obsolete xmllib module:
xmlrpclib.ExpatParser = None

xmllib doesn't normalize newlines, so it's noncompliant.  But this is
actually what you want.

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: filter list fast

2006-03-18 Thread Ben Cartwright
lars_woetmann wrote:
 I have a list I filter using another list and I would like this to be
 as fast as possible
 right now I do like this:

 [x for x in list1 if x not in list2]

 i tried using the method filter:

 filter(lambda x: x not in list2, list1)

 but it didn't make much difference, because of lambda I guess
 is there any way I can speed this up

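Both of these techniques are O(n*m), where n and m are the lengths of
list1 and list2.  You can reduce that to roughly O(n + m) by using a
set (the items must be hashable):

>>> set2 = set(list2)
>>> [x for x in list1 if x not in set2]

Checking whether an item is in a set takes constant time on average,
versus a linear scan for a list.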

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Can I use a conditional in a variable declaration?

2006-03-18 Thread Ben Cartwright
[EMAIL PROTECTED] wrote:
 I've done this in Scheme, but I'm not sure I can in Python.

 I want the equivalent of this:

 if a == yes:
answer = go ahead
 else:
answer = stop

 in this more compact form:


 a = (if a == yes: go ahead: stop)


 is there such a form in Python? I tried playing around with lambda
 expressions, but I couldn't quite get it to work right.

There will be, in Python 2.5 (final release scheduled for August 2006):

>>> answer = "go ahead" if a == "yes" else "stop"

See:
http://mail.python.org/pipermail/python-dev/2005-September/056846.html
http://www.python.org/doc/peps/pep-0308/

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Importing an output from another function

2006-03-17 Thread Ben Cartwright
James Stroud wrote:
 Try this (I think its called argument expansion, but I really don't
 know what its called, so I can't point you to docs):

 def Func1():
  choice = ('A', 'B', 'C')
  output = random.choice(choice)
  output2 = random.choice(choice)
  return output, output2

 def Func2(*items):
  print items

 output = Func1()
 Func2(*output)


Single asterisk == arbitrary argument list.  Useful in certain
patterns, but not something you use every day.

Documentation is in the tutorial:
http://www.python.org/doc/current/tut/node6.html#SECTION00673

PS:  Like self for class instance methods, *args is the
conventional name of the arbitrary argument list.

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pow (power) function

2006-03-16 Thread Ben Cartwright
Mike Ressler wrote:
  timeit.Timer(pow(111,111)).timeit()
 10.968398094177246
  timeit.Timer(111**111).timeit()
 10.04007887840271
  timeit.Timer(111.**111.).timeit()
 0.36576294898986816

 The pow and ** on integers take 10 seconds, but the float ** takes only
 0.36 seconds. (The pow with floats takes ~ 0.7 seconds). Clearly
 typecasting to floats is coming in here somewhere. (Python 2.4.1 on
 Linux FC4.)


No, there is no floating point math going on when the operands to **
are both int or long.  If there were, the following two commands would
have identical output:

   >>> 111**111
  107362012888474225801214565046695501959850723994224804804775911
  17562507619578334702249122617009363462146610374309298696786
  330067310159463303558666910091026017785587295539622142057315437
  069730229375357546494103400699864397711L
   >>> int(111.0**111.0)
  107362012888474224720018046104893130890742038145054486592605938
  348914231670972887594279283213585412743799339280552157756096410
  839752020853099983680499334815422669184408961411319810030383904
  886446681757296875373689157536249282560L

The first result is accurate.  Work it out by hand if you don't believe
me. ;-)  The second suffers from inaccuracies due to floating point's
limited precision.
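
You don't have to compare the digits by eye. A sketch that checks the two
results against each other; it uses three-argument pow() to get the last
three digits of the exact answer cheaply.

```python
exact = 111 ** 111                 # arbitrary-precision integer arithmetic
approx = int(111.0 ** 111.0)       # double precision, then converted

# A double carries roughly 15-16 significant decimal digits.
same_leading_digits = str(exact)[:12] == str(approx)[:12]
last_three = pow(111, 111, 1000)   # modular exponentiation, stays small

print(exact == approx, same_leading_digits, last_three)
```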

Of course, getting exact results with huge numbers isn't cheap,
computationally.  Because there's no type in C to represent arbitrarily
huge numbers, Python implements its own, called long.  There's a fair
amount of memory allocation, bit shifting, and other monkey business
going on behind the scenes in longobject.c.

Whenever possible, Python uses C's built-in signed long int type (known
simply as int on the Python side, and implemented in intobject.c).
On my platform, C's signed long int is 32 bits, so values range from
-2147483648 to 2147483647.  I.e., -(2**31) to (2**31)-1.

As long as your exponentiation result is in this range, Python uses
int_pow().  When it overflows, long_pow() takes over.  Both functions
use the binary exponentiation algorithm, but long_pow() is naturally
slower:

  >>> from timeit import Timer
  >>> Timer('2**28').timeit()
  0.24572032043829495
  >>> Timer('2**29').timeit()
  0.25511642791934719
  >>> Timer('2**30').timeit()
  0.27746782979170348
  >>> Timer('2**31').timeit()  # overflow: 2**31 > 2147483647
  2.8205724462504804
  >>> Timer('2**32').timeit()
  2.2251812151589547
  >>> Timer('2**33').timeit()
  2.406713399635

Floating point is a whole 'nother ball game:

  >>> Timer('2.0**30.0').timeit()
  0.33266301963840306
  >>> Timer('2.0**31.0').timeit()  # no threshold here!
  0.33437446769630697

--Ben



Re: pow (power) function

2006-03-15 Thread Ben Cartwright
Russ wrote:
 Ben Cartwright wrote:
  Russ wrote:

   Does pow(x,2) simply square x, or does it first compute logarithms
   (as would be necessary if the exponent were not an integer)?
 
 
  The former, using binary exponentiation (quite fast), assuming x is an
  int or long.
 
  If x is a float, Python coerces the 2 to 2.0, and CPython's float_pow()
  function is called.  This function calls libm's pow(), which in turn
  uses logarithms.

 I just did a little time test (which I should have done *before* my
 original post!), and 2.0**2 seems to be about twice as fast as
 pow(2.0,2). That seems consistent with your claim above.


Actually, the fact that x**y is faster than pow(x, y) has nothing to do
with the int vs. float issue.  It's due to the way Python
parses operators versus builtin functions.  Paul Rubin hit the nail on
the head when he suggested you check the bytecode:

  >>> import dis
  >>> dis.dis(lambda x, y: x**y)
    1           0 LOAD_FAST                0 (x)
                3 LOAD_FAST                1 (y)
                6 BINARY_POWER
                7 RETURN_VALUE
  >>> dis.dis(lambda x, y: pow(x,y))
    1           0 LOAD_GLOBAL              0 (pow)
                3 LOAD_FAST                0 (x)
                6 LOAD_FAST                1 (y)
                9 CALL_FUNCTION            2
               12 RETURN_VALUE

LOAD_GLOBAL + CALL_FUNCTION is more expensive than LOAD_FAST,
especially when you're doing it a million times (which, coincidentally,
timeit does).
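
If reading bytecode isn't your thing, a rougher way to see the same difference is to ask each code object which names it has to look up.  This is only a sketch: operator_form and function_form are illustrative names, and it assumes an interpreter where function objects expose __code__ (older releases spell it func_code).

```python
# The operator form needs no name lookup; the builtin form must find "pow".
operator_form = lambda x, y: x ** y
function_form = lambda x, y: pow(x, y)

# co_names lists the global and attribute names a code object refers to.
print(operator_form.__code__.co_names)   # ()
print(function_form.__code__.co_names)   # ('pow',)
```

The extra global lookup and the call itself are the overhead that shows up in the timings.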

Anyway, if you want to see the int vs. float issue in action, try this:

  >>> from timeit import Timer
  >>> Timer('2**2').timeit()
  0.12681011582321844
  >>> Timer('2.0**2.0').timeit()
  0.6011743438121
  >>> Timer('2.0**2').timeit()
  0.36681835556112219
  >>> Timer('2**2.0').timeit()
  0.37949818370600497

As you can see, the int version is much faster than the float version.
The last two cases, which also use the float version, have an
additional performance hit due to type coercion.  The relative speed
differences are similar when using pow():

  >>> Timer('pow(2, 2)').timeit()
  0.33000968869157532
  >>> Timer('pow(2.0, 2.0)').timeit()
  0.50356362184709269
  >>> Timer('pow(2.0, 2)').timeit()
  0.55112938185857274
  >>> Timer('pow(2, 2.0)').timeit()
  0.55198819605811877


 I'm a bit surprised that pow() would use logarithms even if the
 exponent is an integer. I suppose that just checking for an integer
 exponent could blow away the gain that would be achieved by avoiding
 logarithms.  On the other hand, I would think that using logarithms
 could introduce a tiny error (e.g., pow(2.0,2) = 3.96 - made
 up result) that wouldn't occur with multiplication.


These are good questions to ask an expert in floating point arithmetic.
 Which I'm not. :-)


   Does x**0.5 use the same algorithm as sqrt(x), or does it use some
   other (perhaps less efficient) algorithm based on logarithms?
 
  The latter, and that algorithm is libm's pow().  Except for a few
  special cases that Python handles, all floating point exponentiation is
  left to libm.  Checking to see if the exponent is 0.5 is not one of
  those special cases.

 I just did another little time test comparing 2.0**0.5 with sqrt(2.0).
 Surprisingly, 2.0**0.5 seems to take around a third less time.


Again, this is because of the operator vs. function lookup issue.
pow(2.0, 0.5) vs. sqrt(2.0) is a better comparison:

  >>> from timeit import Timer
  >>> Timer('pow(2.0, 0.5)').timeit()
  0.51701437102815362
  >>> Timer('sqrt(2.0)', 'from math import sqrt').timeit()
  0.46649096722239847


 None of these differences are really significant unless one is doing
 super-heavy-duty number crunching, of course, but I was just curious.
 Thanks for the information.


Welcome. :-)

--Ben



Re: Large algorithm issue -- 5x5 grid, need to fit 5 queens plus some squares

2006-03-15 Thread Ben Cartwright
[EMAIL PROTECTED] wrote:
 The first named clearbrd() which takes no variables, and will reset the
 board to the 'no-queen' position.
(snip)
 The Code:
 #!/usr/bin/env python
 brd = [9,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
 def clearbrd():
   brd = [9,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]

clearbrd() isn't doing what you want it to.  It should be written as:

  def clearbrd():
  global brd
  brd = [9,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]

Explanation:
http://www.python.org/doc/faq/programming/#how-do-you-set-a-global-variable-in-a-function
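
To see the difference for yourself, here is a small sketch.  The name board is a stand-in for the poster's brd, and the three clear_* functions are made up for illustration.

```python
board = [9] + [0] * 25   # stand-in for the poster's 26-element brd list

def clear_local():
    # Without "global", this assignment creates a new local name
    # and leaves the module-level list untouched.
    board = [9] + [0] * 25

def clear_global():
    # With "global", the module-level name is rebound.
    global board
    board = [9] + [0] * 25

def clear_in_place():
    # Alternative: mutate the existing list, so no rebinding is needed.
    board[1:] = [0] * 25

board[3] = 1
clear_local()
still_set = board[3]        # 1: the local assignment had no effect
clear_global()
after_global = board[3]     # 0
board[7] = 1
clear_in_place()
after_in_place = board[7]   # 0
```

The in-place version is worth considering if other code holds a reference to the same list.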

--Ben



Re: Printable string for 'self'

2006-03-14 Thread Ben Cartwright
Don Taylor wrote:
 Is there a way to discover the original string form of the instance that
 is represented by self in a method?

 For example, if I have:

   fred = C()
   fred.meth(27)

 then I would like meth to be able to print something like:

   about to call meth(fred, 27) or
   about to call fred.meth(27)

 instead of:

   about to call meth(<__main__.C instance at 0x00A9D238>, 27)


Not a direct answer to your question, but this may be what you want:
If you give class C a __repr__ method, you can avoid the default
<class instance at address> string.

  >>> class C(object):
          def __init__(self, name):
              self.name = name
          def __repr__(self):
              return '%s(%r)' % (self.__class__.__name__, self.name)
          def meth(self, y):
              print 'about to call %r.meth(%r)' % (self, y)

  >>> fred = C('Fred')
  >>> fred.meth(27)
  about to call C('Fred').meth(27)
  >>> def meth2(x, y):
          print 'about to call meth2(%r, %r)' % (x, y)

  >>> meth2(fred, 42)
  about to call meth2(C('Fred'), 42)


Of course, this doesn't tell you the name of the variable that points
to the instance.  For that (here's your direct answer), you will have
to go to the source:

  >>> import inspect
  >>> import linecache
  >>> def f(a, b):
          print 'function call from parent scope:'
          caller = inspect.currentframe().f_back
          filename = caller.f_code.co_filename
          linecache.checkcache(filename)
          line = linecache.getline(filename, caller.f_lineno)
          print '' + line.strip()
          return a*b

  >>> fred = 4
  >>> x = f(3, fred)
  function call from parent scope:
  x = f(3, fred)


But defining __repr__ is far easier and more common.

--Ben



Re: global namescape of multilevel hierarchy

2006-03-13 Thread Ben Cartwright
Sakcee wrote:
 now in package.module.checkID function, i want to know what is the ID
 defined in the calling script

It's almost always a really bad idea to kludge scopes like this.  If
you need to access a variable from the caller's scope in a module
function, make it an argument to that function.  That's what arguments
are for in the first place!

But, if you must, you can use the inspect module:

  import inspect
  def checkID():
      ID = inspect.currentframe().f_back.f_locals['ID']
      print ID

--Ben



Re: counting number of (overlapping) occurances

2006-03-09 Thread Ben Cartwright
John wrote:
 This works but is a bit slow, I guess I'll have to live with it.
 Any chance this could be sped up in python?

Sure, to a point.  Instead of:

  def countoverlap(s1, s2):
      return len([1 for i in xrange(len(s1)) if s1[i:].startswith(s2)])

Try this version, which takes smaller slices (resulting in 2x-5x speed
increase when dealing with a large s1 and a small s2):

  def countoverlap(s1, s2):
      L = len(s2)
      return len([1 for i in xrange(len(s1)-L+1) if s1[i:i+L] == s2])

And for a minor extra boost, this version eliminates the list
comprehension:

  def countoverlap(s1, s2):
      L = len(s2)
      cnt = 0
      for i in xrange(len(s1)-L+1):
          if s1[i:i+L] == s2:
              cnt += 1
      return cnt

Finally, if the execution speed of this function is vital to your
application, create a C extension.  String functions like this one are
generally excellent candidates for extensionizing.
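
Before going as far as a C extension, it may be worth trying a variant that lets str.find do the scanning.  This is an untimed sketch, not a measured improvement, and count_overlapping is just an illustrative name.

```python
def count_overlapping(text, pattern):
    """Count occurrences of pattern in text, allowing overlaps."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    count = 0
    position = text.find(pattern)
    while position != -1:
        count += 1
        # Resume one character later so overlapping matches are seen.
        position = text.find(pattern, position + 1)
    return count

print(count_overlapping("aaaa", "aa"))       # 3
print(count_overlapping("abababa", "aba"))   # 3
```

The loop body runs once per match rather than once per character, which should help most when matches are sparse.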

--Ben



Re: How to pop random item from a list?

2006-03-09 Thread Ben Cartwright
flamesrock wrote:
 whats the best way to pop a random item from a list??

  import random
  def popchoice(seq):
      # raises ValueError if seq is empty (randrange(0) is an empty range)
      return seq.pop(random.randrange(len(seq)))

--Ben



Re: advice on this little script

2006-03-08 Thread Ben Cartwright
BartlebyScrivener wrote:
 What about a console beep? How do you add that?

 rpd

Just use ASCII code 007 (BEL/BEEP):

   import sys
   sys.stdout.write('\007')

Or if you're on Windows, use the winsound standard module.

--Ben



Re: Separating elements from a list according to preceding element

2006-03-05 Thread Ben Cartwright
Rob Cowie wrote:
 I wish to derive two lists - each containing either tags to be
 included, or tags to be excluded. My idea was to take an element,
 examine what element precedes it and accordingly, insert it into the
 relevant list. However, I have not been successful.

 Is there a better way that I have not considered?

Maybe.  You could write a couple regexes, one to find the included
tags, and one for the excluded, then run re.findall on them both.

But there's nothing fundamentally wrong with your method.

 If this method is
 suitable, how might I implement it?

  tags = ['tag1', '+', 'tag2', '+', 'tag3', '-', 'tag4']

  include, exclude = [], []
  op = '+'
  for cur in tags:
      if cur in '+-':
          op = cur
      else:
          if op == '+':
              include.append(cur)
          else:
              exclude.append(cur)

--Ben



Re: A simple question

2006-03-04 Thread Ben Cartwright
Tuvas wrote:
 Why is the output list [[0, 1], [0, 1]] and not [[0,
 1], [0, 0]]? And how can I make it work right?

http://www.python.org/doc/faq/programming.html#how-do-i-create-a-multidimensional-list
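
The poster's code isn't quoted here, so the following is only a guess at the usual cause that FAQ entry describes: building the outer list with * copies references to one inner list.  The names shared and independent are made up.

```python
rows = 2
cols = 2

# Multiplying the outer list copies references: both rows are the same list.
shared = [[0] * cols] * rows
shared[0][1] = 1

# A list comprehension builds a separate inner list for every row.
independent = [[0] * cols for _ in range(rows)]
independent[0][1] = 1

print(shared)        # [[0, 1], [0, 1]]
print(independent)   # [[0, 1], [0, 0]]
```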

--Ben



Re: string stripping issues

2006-03-02 Thread Ben Cartwright
orangeDinosaur wrote:
 I am encountering a behavior I can't think of a reason for.  Sometimes,
 when I use the .strip module for strings, it takes away more than what
 I've specified.  For example:

  a = '<TD WIDTH=175><FONT SIZE=2>Hughes. John</FONT></TD>\r\n'

  a.strip('<TD WIDTH=175><FONT SIZE=2>')

 returns:

 'ughes. John</FONT></TD>\r\n'

 However, if I take another string, for example:

  b = '<TD WIDTH=175><FONT SIZE=2>Kim, Dong-Hyun</FONT></TD>\r\n'

  b.strip('<TD WIDTH=175><FONT SIZE=2>')

 returns:

 'Kim, Dong-Hyun</FONT></TD>\r\n'

 I don't understand why in one case it eats up the 'H' but in the next
 case it leaves the 'K' alone.


That method... I do not think it means what you think it means.  The
argument to str.strip is a *set* of characters, e.g.:

  >>> foo = 'abababaXabbaXababa'
  >>> foo.strip('ab')
  'XabbaX'
  >>> foo.strip('aabababaab') # no difference!
  'XabbaX'

For more info, see the string method docs:
http://docs.python.org/lib/string-methods.html
To do what you're trying to do, try this:

  >>> prefix = 'hello '
  >>> bar = 'hello world!'
  >>> if bar.startswith(prefix): bar = bar[:len(prefix)]
  ...
  >>> bar
  'world!'

--Ben



Re: string stripping issues

2006-03-02 Thread Ben Cartwright
Ben Cartwright wrote:
 orangeDinosaur wrote:
  I am encountering a behavior I can't think of a reason for.  Sometimes,
  when I use the .strip module for strings, it takes away more than what
  I've specified.  For example:
 
   a = '<TD WIDTH=175><FONT SIZE=2>Hughes. John</FONT></TD>\r\n'
 
   a.strip('<TD WIDTH=175><FONT SIZE=2>')
 
  returns:
 
  'ughes. John</FONT></TD>\r\n'
 
  However, if I take another string, for example:
 
   b = '<TD WIDTH=175><FONT SIZE=2>Kim, Dong-Hyun</FONT></TD>\r\n'
 
   b.strip('<TD WIDTH=175><FONT SIZE=2>')
 
  returns:
 
  'Kim, Dong-Hyun</FONT></TD>\r\n'
 
  I don't understand why in one case it eats up the 'H' but in the next
  case it leaves the 'K' alone.


 That method... I do not think it means what you think it means.  The
 argument to str.strip is a *set* of characters, e.g.:

foo = 'abababaXabbaXababa'
foo.strip('ab')
   'XabbaX'
foo.strip('aabababaab') # no difference!
   'XabbaX'

 For more info, see the string method docs:
 http://docs.python.org/lib/string-methods.html
 To do what you're trying to do, try this:

prefix = 'hello '
bar = 'hello world!'
if bar.startswith(prefix): bar = bar[:len(prefix)]
   ...
bar
   'world!'


Apologies, that should be:

  >>> prefix = 'hello '
  >>> bar = 'hello world!'
  >>> if bar.startswith(prefix): bar = bar[len(prefix):]
  ...
  >>> bar
  'world!'
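
If you need this in more than one place, the corrected slice wraps up neatly in a helper.  A minimal sketch; remove_prefix is an illustrative name, not something from the standard library of the time.

```python
def remove_prefix(text, prefix):
    """Return text without prefix if it starts with it, else unchanged."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text

print(remove_prefix('hello world!', 'hello '))   # world!
print(remove_prefix('Hughes. John', 'WIDTH'))    # Hughes. John

# For contrast, strip() treats its argument as a set of characters:
print('Hughes. John'.strip('WIDTH'))             # ughes. John
```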

--Ben



Re: in need of some sorting help

2006-03-02 Thread Ben Cartwright
ianaré wrote:
 However, i need the sorting done after the walk, due to the way the
 application works... should have specified that, sorry.


If your desired output is just a sorted list of files, there is no good
reason that you shouldn't be able to sort in place.  Unless your app is
doing something extremely funky, in which case this should do it:

  root = self.path.GetValue()  # wx.TextCtrl input
  filter = self.fileType.GetValue().lower()  # wx.TextCtrl input
  not_type = self.not_type.GetValue()  # wx.CheckBox input

  matched_paths = {}
  for base, dirs, walk_files in os.walk(root):
      main.Update()
      # i only need the part of the filename after the
      # user selected path:
      base = base.replace(root, '')

      matched_paths[base] = []
      for entry in walk_files:
          entry = os.path.join(base, entry)
          if not filter:
              match = True
          else:
              match = filter in entry.lower()
          if not_type:
              match = not match
          if match:
              matched_paths[base].append(entry)

  def tolower(x): return x.lower()
  files = []
  # Combine into flat list, first sorting on base path, then full path
  for base in sorted(matched_paths, key=tolower):
      files.extend(sorted(matched_paths[base], key=tolower))

--Ben



Re: slicing the end of a string in a list

2006-03-02 Thread Ben Cartwright
John Salerno wrote:
 You can probably tell what I'm doing. Read a list of lines from a file,
 and then I want to slice off the '\n' character from each line. But
 after this code runs, the \n is still there. I thought it might have
 something to do with the fact that strings are immutable, but a test
 such as:

 switches[0][:-1]

 does slice off the \n character.

Actually, it creates a new string instance with the \n character
removed, then discards it.  The original switches[0] string hasn't
changed.

  >>> foo = 'Hello world!'
  >>> foo[:-1]
  'Hello world'
  >>> foo
  'Hello world!'

 So I guess the problem lies in the
 assignment or somewhere in there.

Yes.  You are repeatedly assigning a new string instance to line, which
is then never referenced again.  If you want to update the switches
list, then instead of assigning to line inside the loop, you need:

  switches[i] = switches[i][:-1]

 Also, is this the best way to index the list?

No, since the line variable is unused.  This:

  i = 0
  for line in switches:
      line = switches[i][:-1]
      i += 1

Would be better written as:

  for i in range(len(switches)):
      switches[i] = switches[i][:-1]

For most looping scenarios in Python, you shouldn't have to manually
increment a counter variable.

--Ben

PS - actually, you can accomplish all of the above in a single line of
code:
  print [line[:-1] for line in open('C:\\switches.txt')]
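
One caveat with slicing off the last character: if the final line of the file has no trailing newline, [:-1] eats a real character.  A sketch of the difference, using an in-memory stand-in (fake_file is made up) instead of the poster's file:

```python
import io

# Stand-in for the poster's file; the last line deliberately lacks a newline.
fake_file = io.StringIO(u"alpha\nbeta\ngamma")

sliced = [line[:-1] for line in fake_file]
fake_file.seek(0)
stripped = [line.rstrip("\n") for line in fake_file]

print(sliced)     # the final entry has lost its last letter
print(stripped)   # every entry is intact
```

rstrip("\n") only removes newlines that are actually there, so it is the safer default.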



Re: Removing .DS_Store files from mac folders

2006-03-01 Thread Ben Cartwright
David Pratt wrote:
 # Clean mac .DS_Store
  if current_file == '.DS_Store':
  print 'a DS_Store item encountered'
   os.remove(f)
...
 I can't figure why
 remove is not removing.


It looks like your indentation is off.  From what you posted, the
print line is prepended with 9 spaces, while the os.remove line is
prepended with a single tab.  Don't mix tabs and spaces.

Also, shouldn't that be os.remove(current_file)?

--Ben



Re: Removing .DS_Store files from mac folders

2006-03-01 Thread Ben Cartwright
David Pratt wrote:
 Hi Ben. Sorry about the cut and paste job into my email. It is part of a
 larger script. It is actually all tabbed. This will give you a better idea:

   for f in file_names:
   current_file = os.path.basename(f)
   print 'Current File: %s' % current_file

   # Clean mac .DS_Store
   if current_file == '.DS_Store':
   print 'a DS_Store item encountered'
   os.remove(f)


I'm no Mac expert, but could it be that OSX is recreating .DS_Store?
Try putting this above your os.remove call:

  import stat
  print 'Last modified:', os.stat(f)[stat.ST_MTIME]

Then run your script a few times and see if the modified times are
different.

You might also try verifying that you get an exception when attempting
to open the file right after removing it.

--Ben



Re: Removing .DS_Store files from mac folders

2006-03-01 Thread Ben Cartwright
David Pratt wrote:
 OSError: [Errno 2] No such file or directory: '.DS_Store'


Ah.  You didn't mention a traceback earlier, so I assumed the code was
executing but you didn't see the file being removed.


 for f in file_names:
 current_file = os.path.basename(f)
 print 'Current File: %s' % current_file
 
 # Clean mac .DS_Store
 if current_file == '.DS_Store':
 print 'a DS_Store item encountered'
 os.remove(f)


How are you creating file_names?  More importantly, does it contain a
path (either absolute or relative to the current working directory)?
If not, you need an os.path.join, e.g.:

  import os
  for root_path, dir_names, file_names in os.walk('.'):
      # file_names as generated by os.walk contains file
      # names only (no path)
      for f in file_names:
          if f == '.DS_Store':
              full_path = os.path.join(root_path, f)
              os.remove(full_path)
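
If you'd like to try this without touching real folders, here is a sketch that wraps the same loop in a function and runs it on a throwaway tree.  The function name remove_ds_store and the temporary layout are made up for the demonstration.

```python
import os
import shutil
import tempfile

def remove_ds_store(top):
    """Delete every .DS_Store file under top; return the removed paths."""
    removed = []
    for root_path, dir_names, file_names in os.walk(top):
        for name in file_names:
            if name == '.DS_Store':
                # os.walk yields bare names, so rebuild the full path.
                full_path = os.path.join(root_path, name)
                os.remove(full_path)
                removed.append(full_path)
    return removed

# Build a small throwaway tree to try it on.
top = tempfile.mkdtemp()
sub = os.path.join(top, 'sub')
os.mkdir(sub)
for folder in (top, sub):
    open(os.path.join(folder, '.DS_Store'), 'w').close()
open(os.path.join(sub, 'keep.txt'), 'w').close()

removed = remove_ds_store(top)
leftovers = [name for _, _, names in os.walk(top) for name in names]
shutil.rmtree(top)         # clean up the throwaway tree

print(len(removed))        # 2
print(leftovers)           # ['keep.txt']
```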

--Ben



Re: str.count is slow

2006-02-27 Thread Ben Cartwright
[EMAIL PROTECTED] wrote:
 It seems to me that str.count is awfully slow.  Is there some reason
 for this?
 Evidence:

  str.count time test 
 import string
 import time
 import array

 s = string.printable * int(1e5) # 10**7 character string
 a = array.array('c', s)
 u = unicode(s)
 RIGHT_ANSWER = s.count('a')

 def main():
 print 'str:', time_call(s.count, 'a')
 print 'array:  ', time_call(a.count, 'a')
 print 'unicode:', time_call(u.count, 'a')

 def time_call(f, *a):
 start = time.clock()
 assert RIGHT_ANSWER == f(*a)
 return time.clock()-start

 if __name__ == '__main__':
 main()

 ## end 

 On my machine, the output is:

 str: 0.29365715475
 array:   0.448095498171
 unicode: 0.0243757237303

 If a unicode object can count characters so fast, why should an str
 object be ten times slower?  Just curious, really - it's still fast
 enough for me (so far).

 This is with Python 2.4.1 on WinXP.


 Chris Perkins


Your evidence points to some unoptimized code in the underlying C
implementation of Python.  As such, this should probably go to the
python-dev list (http://mail.python.org/mailman/listinfo/python-dev).

The problem is that the C library function memcmp is slow, and
str.count calls it frequently.  See lines 2165+ in stringobject.c
(inside function string_count):

  r = 0;
  while (i < m) {
      if (!memcmp(s+i, sub, n)) {
          r++;
          i += n;
      } else {
          i++;
      }
  }

This could be optimized as:

  r = 0;
  while (i < m) {
      if (s[i] == *sub && !memcmp(s+i, sub, n)) {
          r++;
          i += n;
      } else {
          i++;
      }
  }

This tactic typically avoids most (sometimes all) of the calls to
memcmp.  Other string search functions, including unicode.count,
unicode.index, and str.index, use this tactic, which is why you see
unicode.count performing better than str.count.

The above might be optimized further for cases such as yours, where a
single character appears many times in the string:

  r = 0;
  if (n == 1) {
      /* optimize for a single character */
      while (i < m) {
          if (s[i] == *sub)
              r++;
          i++;
      }
  } else {
      while (i < m) {
          if (s[i] == *sub && !memcmp(s+i, sub, n)) {
              r++;
              i += n;
          } else {
              i++;
          }
      }
  }

Note that there might be some subtle reason why neither of these
optimizations are done that I'm unaware of... in which case a comment
in the C source would help. :-)

--Ben



Re: changing params in while loop

2006-02-27 Thread Ben Cartwright
robin wrote:
 i have this function inside a while-loop, which i'd like to loop
 forever, but i'm not sure about how to change the parameters of my
 function once it is running.
 what is the best way to do that? do i have to use threading or is there
 some simpler way?


Why not just do this inside the function?  What exactly are you trying
to accomplish here?  Threading could work here, but like regexes,
threads are not only tricky to get right but also tricky to know when
to use in the first place.

That being said, here's some example threading code to get you started
(PrinterThread.run is your function; the ThreadSafeStorage instance
holds your parameters):

import threading
import time

class ThreadSafeStorage(object):
    def __init__(self):
        object.__setattr__(self, '_lock', threading.RLock())
    def acquirelock(self):
        object.__getattribute__(self, '_lock').acquire()
    def releaselock(self):
        object.__getattribute__(self, '_lock').release()
    def __getattribute__(self, attr):
        if attr in ('acquirelock', 'releaselock'):
            return object.__getattribute__(self, attr)
        self.acquirelock()
        value = object.__getattribute__(self, attr)
        self.releaselock()
        return value
    def __setattr__(self, attr, value):
        self.acquirelock()
        object.__setattr__(self, attr, value)
        self.releaselock()

class PrinterThread(threading.Thread):
    """Prints the data in shared storage once per second."""
    storage = None
    def run(self):
        while not self.storage.killprinter:
            self.storage.acquirelock()
            print 'message:', self.storage.message
            print 'ticks:', self.storage.ticks
            self.storage.ticks += 1
            self.storage.releaselock()
            time.sleep(1)

data = ThreadSafeStorage()
data.killprinter = False
data.message = 'hello world'
data.ticks = 0
thread = PrinterThread()
thread.storage = data
thread.start()
# do some stuff in the main thread
time.sleep(3)
data.acquirelock()
data.message = 'modified ticks'
data.ticks = 100
data.releaselock()
time.sleep(3)
data.message = 'goodbye world'
time.sleep(1)
# notify printer thread that it needs to die
data.killprinter = True
thread.join()

# output:

message: hello world
ticks: 0
message: hello world
ticks: 1
message: hello world
ticks: 2
message: modified ticks
ticks: 100
message: modified ticks
ticks: 101
message: modified ticks
ticks: 102
message: goodbye world
ticks: 103
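
A shorter variation on the same idea, for comparison.  This is a sketch only, assuming an interpreter that has the with statement and threading.Event.is_set; SharedParams, worker and seen are made-up names.  Instead of locking every attribute access, the worker takes a consistent snapshot of the parameters on each pass, and an Event replaces the kill flag.

```python
import threading
import time

class SharedParams(object):
    """Parameters shared between threads, guarded by a single lock."""
    def __init__(self, **initial):
        self._lock = threading.Lock()
        self._values = dict(initial)

    def update(self, **changes):
        # Apply several changes atomically.
        with self._lock:
            self._values.update(changes)

    def snapshot(self):
        # Return a consistent copy for the worker to read.
        with self._lock:
            return dict(self._values)

def worker(params, stop, seen):
    # Loop until asked to stop, re-reading the parameters on each pass.
    while not stop.is_set():
        seen.append(params.snapshot()['message'])
        stop.wait(0.01)   # sleep briefly, but wake early if stop is set

params = SharedParams(message='hello world')
stop = threading.Event()
seen = []
thread = threading.Thread(target=worker, args=(params, stop, seen))
thread.start()
time.sleep(0.05)
params.update(message='goodbye world')   # change parameters while it runs
time.sleep(0.05)
stop.set()                               # ask the worker to finish
thread.join()
```

The snapshot approach means the worker never sees a half-applied update, at the cost of copying a small dict each time around.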


--Ben



Re: except clause not catching IndexError

2006-02-22 Thread Ben Cartwright
Derek Schuff wrote:
 I have some code like this:
 for line in f:
     toks = line.split()
     try:
         if int(toks[2],16) == qaddrs[i]+0x1000 and toks[0] == "200": #producer write
             prod = int(toks[3], 16)
         elif int(toks[2],16) == qaddrs[i]+0x1002 and toks[0] == "200": #consumer write
             cons = int(toks[3], 16)
         else:
             continue
     except IndexError: #happens if theres a partial line at the end of file
         print "indexerror"
         break

 However, when I run it, it seems that I'm not catching the IndexError:
 Traceback (most recent call last):
   File "/home/dschuff/bin/speeds.py", line 202, in ?
     if int(toks[2],16) == qaddrs[i]+0x1000 and toks[0] == "200": #producer write
 IndexError: list index out of range

 If i change the except IndexError to except Exception, it will catch it (but
 i believe it's still an IndexError).
 this is python 2.3 on Debian sarge.

 any ideas?


Sounds like IndexError has been redefined somewhere, e.g.:
  IndexError = 'something entirely different'
  foo = []
  try:
      foo[42]
  except IndexError: # will not catch the real IndexError; we're shadowing it
      pass

Try adding print IndexError right before your trouble spot, and see
if it outputs exceptions.IndexError.

--Ben



Re: a little more help with python server-side scripting

2006-02-21 Thread Ben Cartwright
John Salerno wrote:
 I contacted my domain host about how Python is implemented on their
 server, and got this response:

 ---
 Hello John,

 Please be informed that the implementation of python in our server is
 through mod_python integration with the apache.

 These are the steps needed for you to be able to run .py script directly
 from browser for your webpage:

 1. Please use the below mentioned path for python:
 #!/usr/bin/env python

 Furthermore, update us with the script path, so that we can set the
 appropriate ownership and permissions of the script on the server.

 If you require any further assistance, feel free to contact us.
 ---

 Unfortunately, I don't completely understand what it is I need to do
 now. Where do I put the path they mentioned? And what do they mean by my
 script path?


The Python tutorial should fill in the blanks
(http://www.python.org/doc/tut/node4.html):
 2.2.2 Executable Python Scripts

 On BSD'ish Unix systems, Python scripts can be made directly executable,
 like shell scripts, by putting the line

   #! /usr/bin/env python

 (assuming that the interpreter is on the user's PATH) at the beginning
 of the script and giving the file an executable mode. The #! must be
 the first two characters of the file. On some platforms, this first line
 must end with a Unix-style line ending (\n), not a Mac OS (\r) or
 Windows (\r\n) line ending. Note that the hash, or pound, character,
 #, is used to start a comment in Python.

This answers your first question.  Put the #! bit at the top of your
.py script.  This way the web server will know how to run the script.

 The script can be given a executable mode, or permission, using the
 chmod command:

   $ chmod +x myscript.py

And this answers your second.  Your host needs to know the path to your
script so they can use chmod to make it executable.

--Ben



Re: Self-identifying functions and macro-ish behavior

2006-02-15 Thread Ben Cartwright
[EMAIL PROTECTED] wrote:
 How do I get some
 sort of macro behavior so I don't have to write the same thing over and
 over again, but which is also not neatly rolled up into a function,
 such as combining the return statements with a printing of self-name?


Decorators: http://www.python.org/peps/pep-0318.html


 My application has a bunch of functions that must do different things,
 then print out their names, and then each call another function before
 returning.  I'd like to have the last function call and the return in
 one statement, because if I forget to manually type it in, things get
 messed up.

 (ok, I'm writing a parser and I keep track of the call level with a tab
 count, which gets printed before any text messages.  So each text
 message has a tab count in accordance with how far down the parser is.
 Each time a grammar rule is entered or returned from, the tab count
 goes up or down.  If I mess up and forget to call tabsup() or tabsdn(),
 the printing gets messed up.  There are a lot of simple cheesy
 production rules, [I'm doing this largely as an exercise for myself,
 which is why I'm doing this parsing manually], so it's error-prone and
 tedious to type tabsup() each time I enter a function, and tabsdn()
 each time I return from a function, which may be from several different
 flow branches.)


def track(func):
    """Decorator to track calls to a set of functions"""
    def wrapper(*args, **kwargs):
        print " "*track.depth + func.__name__, args, kwargs or ""
        track.depth += 1
        result = func(*args, **kwargs)
        track.depth -= 1
        return result
    return wrapper
track.depth = 0


# Then to apply the decorator to a function, e.g.:
def f(x):
    return True
# Add this line somewhere after the function definition:
f = track(f)

# Alternately, if you're using Python 2.4 or newer, just define f as:
@track
def f(x):
    return True


# Test it:
@track
def fact(n):
    """Factorial of n, n! = n*(n-1)*(n-2)*...*3*2"""
    assert n >= 0
    if n < 2:
        return 1
    return n * fact(n-1)
@track
def comb(n, r):
    """Choose r items from n w/out repetition, n!/(r!*(n-r)!)"""
    assert n >= r
    return fact(n) / fact(r) / fact(n-r)
print comb(5, 3)
# Output:

comb (5, 3)
 fact (5,)
  fact (4,)
   fact (3,)
    fact (2,)
     fact (1,)
 fact (3,)
  fact (2,)
   fact (1,)
 fact (2,)
  fact (1,)
10
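
One possible refinement, sketched under the assumption that functools is available (it is not in 2.4): restore the depth in a finally clause so an exception inside a rule can't leave the indentation off by one.  This variant collects lines in a list, track.lines, instead of printing them; that attribute is made up for the example.

```python
import functools

def track(func):
    """Record each call to func, indented by the current call depth."""
    @functools.wraps(func)            # keep func's name and docstring
    def wrapper(*args, **kwargs):
        track.lines.append(' ' * track.depth + func.__name__ + repr(args))
        track.depth += 1
        try:
            return func(*args, **kwargs)
        finally:
            track.depth -= 1          # restored even if func raises
    return wrapper
track.depth = 0
track.lines = []

@track
def fact(n):
    """Factorial of n."""
    if n < 2:
        return 1
    return n * fact(n - 1)

result = fact(3)
print(result)                     # 6
print('\n'.join(track.lines))     # three lines, each indented one level deeper
```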


--Ben



Re: random playing soundfiles according to rating.

2006-02-10 Thread Ben Cartwright
kpp9c wrote:
 I've been looking at some of the suggested approaches and looked a
 little at Michael's bit which works well bisect is a module i
 always struggle with (hee hee)

 I am intrigued by Ben's solution and Ben's distilled my problem quite
 nicely

Thanks!-)  Actually, you should use Michael's solution, not mine.  It
uses the same concept, but it finds the correct subinterval in O(log n)
steps (by using bisect on a cached list of cumulative sums).  My code
takes O(n) steps -- this is a big difference when you're dealing with
thousands of items.
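
Michael's code isn't reproduced in this message, so the following is only a sketch of the idea as I understand it: cache the running totals once, then use bisect to find the subinterval.  The class name WeightedChooser and its attributes are made up.

```python
import bisect
import random

class WeightedChooser(object):
    """Pick keys with probability proportional to their weights."""
    def __init__(self, pairs):
        self.keys = []
        self.cumulative = []      # running totals of the weights
        total = 0
        for key, weight in pairs:
            if weight <= 0:
                continue          # zero-weight items can never be chosen
            total += weight
            self.keys.append(key)
            self.cumulative.append(total)
        self.total = total

    def choose(self):
        if not self.keys:
            return None
        point = random.uniform(0, self.total)
        # First subinterval whose upper bound is greater than point.
        index = bisect.bisect_right(self.cumulative, point)
        # uniform() may return the upper bound itself; clamp for that case.
        return self.keys[min(index, len(self.keys) - 1)]

chooser = WeightedChooser([('foo', 1), ('bar', 2), ('skipme', 0), ('baz', 10)])
counts = {}
for _ in range(10000):
    key = chooser.choose()
    counts[key] = counts.get(key, 0) + 1
```

Building the table costs O(n) once; after that each choice is a binary search.  If the ratings change often, the table has to be rebuilt, which is the trade-off against the simple linear scan.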

 but, welli don't understand what point is doing with
 wieght for key, weight for zlist

This line:
  point = random.uniform(0, sum(weight for key, weight in zlist))
Is shorthand for:
  total = 0
  for key, weight in zlist:
      total += weight
  point = random.uniform(0, total)

 furthermore, it barfs in my
 interpreter... (Python 2.3)

Oops, that's because it uses generator expressions
(http://www.python.org/peps/pep-0289.html), a 2.4 feature.  Try
rewriting it longhand (see above).  The second line of the test code
will have to be changed too, i.e.:
   counts = dict([(key, 0) for key, weight in data])

--Ben



Re: random playing soundfiles according to rating.

2006-02-09 Thread Ben Cartwright
[EMAIL PROTECTED] wrote:
 But i am stuck on how to do a random chooser that works according to my
 idea of choosing according to rating system. It seems to me to be a bit
 different that just choosing a weighted choice like so:

...

 And i am not sure i want to have to go through what will be hundreds of
 sound files and scale their ratings by hand so that they all add up to
 100%. I just want to have a long list that i can add too whenever i
 want, and assign it a grade/rating according to my whims!

Indeed, manually normalizing all those weights would be a downright
sinful waste of time and effort.

The solution (to any problem, really) starts with how you conceptualize
it.  For this problem, consider the interval [0, T), where T is the sum
of all the weights.  This interval is made up of adjacent subintervals,
one for each weight.  Now pick a random point in [0, T).  Determine
which subinterval this point is in, and you're done.

import random
def choose_weighted(zlist):
    point = random.uniform(0, sum(weight for key, weight in zlist))
    for key, weight in zlist: # which subinterval is point in?
        point -= weight
        if point < 0:
            return key
    return None # will only happen if sum of weights = 0

You'll get bogus results if you use negative weights, but that should
be obvious.  Also note that by using random.uniform instead of
random.randrange, floating point weights are handled correctly.

Test it:
  >>> data = (('foo', 1), ('bar', 2), ('skipme', 0), ('baz', 10))
  >>> counts = dict((key, 0) for key, weight in data)
  >>> for i in range(10000):
  ...     counts[choose_weighted(data)] += 1
  ...
  >>> [(key, counts[key]) for key, weight in data]
  [('foo', 749), ('bar', 1513), ('skipme', 0), ('baz', 7738)]
  

--Ben
