Re: Graph library recommendations for large graphs

2009-08-24 Thread Istvan Albert
On Aug 24, 5:37 pm, VanL van.lindb...@gmail.com wrote:

 Can anybody who has worked with large graphs before give a recommendation?


When using large graphs, another limitation may come from the run times
of the various graph algorithms. Most likely you will need to squeeze out
as much performance as possible, and a pure Python implementation has a
lot of overhead.

I've used the LEDA graph library with great success. This is a C++
library with substantial syntax sugar that looks a bit like Python.
I made some Python bindings for it via SWIG and thus got the best
of both worlds, though I'm afraid I have since lost that code.

http://www.algorithmic-solutions.info/leda_guide/Graphs.html

i.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: anyone with genomewide microarray analysis experience ?

2009-08-15 Thread Istvan Albert
On Aug 14, 8:52 am, trias t.gkikopou...@dundee.ac.uk wrote:

 Does anyone have some scripts I could use for this purpose. I work with
 S.cerevisiae

Since the largest chromosome in the yeast genome is only about 1.5
million bp, the easiest way to accomplish your goal is to create a list
of the same size as the chromosome, then populate this list by mapping
each genomic index to a list index.

After doing this your problem simplifies to slicing the list around
the coordinates of interest.

You'll be done in minutes.
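
A minimal sketch of that idea, assuming 1-based genomic coordinates and a
made-up set of (position, value) measurements; the names build_track and
window are hypothetical and only serve the illustration:

```python
def build_track(chrom_length, measurements, fill=0.0):
    """Return a list where index i holds the value at genomic position i + 1."""
    track = [fill] * chrom_length
    for position, value in measurements:
        # 1-based genomic coordinate -> 0-based list index
        track[position - 1] = value
    return track

def window(track, center, flank):
    """Slice the track around a 1-based coordinate, clipped at the chromosome start."""
    start = max(center - 1 - flank, 0)
    return track[start:center + flank]

# toy data: a 20 bp "chromosome" with three measured positions
track = build_track(20, [(5, 1.5), (6, 2.0), (12, 0.7)])
print(window(track, 6, 2))
```

For a real chromosome only chrom_length changes; a list of a million or
so floats fits comfortably in memory.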

i.


Re: Relative Imports, why the hell is it so hard?

2009-03-24 Thread Istvan Albert
On Mar 23, 10:16 am, CinnamonDonkey cinnamondon...@googlemail.com
wrote:

 I'm fairly new to Python so I still have a lot to learn. But I'd like
 to know how to correctly use relative imports.

Relative imports are *fundamentally* broken in python. You will soon
see that code using relative import will break if you attempt to run
the module on its own. Yes, it is mindboggling!

Why is it so, you ask? It is one of those issues that would be trivial
to implement correctly (it is relative to the current file location,
duh!!!) and would be tremendously useful, yet for some reason it is
controversial with those who would need to implement it.

It looks like they think that the expected mode of operation would
greatly confuse the 'rest' of us. So that is the reason you end up
with a broken implementation that is worse than not having it at all.
All I can do is recommend that you avoid relative imports.

The easiest way around the issue is to create a module named
pathfix.py like the one below and import it in all of your modules.
This is the only way to fix this issue in a way that does not come
back and bite you. It is ugly and you are repeating yourself in
multiple places, but it is still better than the alternative.

---

import os, sys

def path_join(*args):
    return os.path.abspath(os.path.join(*args))

# adds base directory to import path to allow relative imports
__currdir = os.path.dirname( __file__ )
__basedir = path_join(__currdir, '..' )
if __basedir not in sys.path:
    sys.path.append( __basedir )




Re: Relative Imports, why the hell is it so hard?

2009-03-24 Thread Istvan Albert
On Mar 24, 3:16 pm, Gabriel Genellina gagsl-...@yahoo.com.ar
wrote:

 Did you know, once a module is imported by the first time

yeah yeah, could we not get sidetracked with details that are not
relevant? What it obviously means is to import it in all of your
modules that need access to relative paths.

 I don't understand, how is this supposed to help relative imports?

That's only because you have not had to deal with the problem that it
solves.
If you need to have a module that can do both:

1. access relative paths (even other packages)
2. be executable on its own (for example a module may execute its own
doctests when it is run directly)

this is the only way to achieve it.

 I'd recommend the opposite - use relative (intra-package) imports when
 possible.

Does it not bother you that a module that uses relative imports cannot
be run on its own anymore? To me that is irritating because it forces
me to change a habit (running the doctests when the module is
executed) that I consider a very good practice. It is extremely handy
to be writing a module, press a key and the module is executed and I
can see the test results. The relative import takes this away from
me. Like I said, it is irritating.

 Blindly inserting directories into sys.path can easily confuse the import
 system

confuse the import system? what the heck does that mean? You either
have a path in the sys.path or not. FWIW it is far cleaner than doing
a relative import that does not work correctly.

Istvan


Re: Relative Imports, why the hell is it so hard?

2009-03-24 Thread Istvan Albert
On Mar 24, 9:35 pm, Maxim Khitrov mkhit...@gmail.com wrote:

 Works perfectly fine with relative imports.

This only demonstrates that you are not aware of what the problem
actually is.

Try using relative imports so that it works when you import the module
itself. Now run the module as a program. The same module that worked
fine when you imported it will raise the exception:

ValueError: Attempted relative import in non-package

when running it on its own.
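
The failure can be reproduced with a short script. The sketch below is
written for a current Python 3, where the error is an ImportError with a
different message than the ValueError quoted above. The package name
demo_pkg and its two files are made up for the illustration. It builds a
throwaway package, runs one module as a plain script, then runs the same
module through the interpreter's -m switch from the parent directory,
which on current versions keeps the package context:

```python
import os
import subprocess
import sys
import tempfile

def run_both_ways():
    """Run a module that uses a relative import as a script and via -m."""
    with tempfile.TemporaryDirectory() as root:
        pkg = os.path.join(root, 'demo_pkg')      # hypothetical package name
        os.mkdir(pkg)
        open(os.path.join(pkg, '__init__.py'), 'w').close()
        with open(os.path.join(pkg, 'helper.py'), 'w') as fp:
            fp.write('VALUE = 42\n')
        with open(os.path.join(pkg, 'main.py'), 'w') as fp:
            fp.write('from . import helper\nprint(helper.VALUE)\n')

        # 1. run the file directly: the relative import has no package to refer to
        as_script = subprocess.run(
            [sys.executable, os.path.join(pkg, 'main.py')],
            capture_output=True, text=True)

        # 2. run it as a module of the package, from the directory that contains it
        as_module = subprocess.run(
            [sys.executable, '-m', 'demo_pkg.main'],
            cwd=root, capture_output=True, text=True)

        return as_script.returncode, as_module.returncode, as_module.stdout.strip()

script_rc, module_rc, module_out = run_both_ways()
print('as script: exit code', script_rc)
print('with -m  : exit code', module_rc, 'output', module_out)
```

Whether having to type the -m form is an acceptable replacement for
pressing a key in the editor is a separate question.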

Istvan


Re: what IDE is the best to write python?

2009-02-03 Thread Istvan Albert
On Feb 2, 12:06 pm, Thorsten Kampe thors...@thorstenkampe.de wrote:

  It makes my eyes bleed

 Ever tried sunglasses?

Sunglasses for bleeding eyes? For pete's sake try bandages.


Re: no sign() function ?

2008-12-22 Thread Istvan Albert

  conclusions ---

try testing on a large number of candidates that are all (or mostly)
positive or all (or mostly) negative and you'll see performance
numbers that are substantially different than the ones you report:

candidates = range(1000)

In general the function sign_1() is expected to be the fastest because
in most cases it will detect the sign with the fewest operations; it only
reaches the rest of the comparisons when it hits the corner cases. Only
if you have lots of +/-0.0 cases will it be slower than the rest, due
to having to call an expensive operation.
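
To make the argument concrete, here is a hypothetical function with that
shape. It is not the sign_1() from the thread, which is not reproduced
here; the choice of math.copysign for the zero case is an assumption made
for this sketch:

```python
import math

def sign(x):
    """Return 1 or -1; the cheap comparisons come first."""
    if x > 0:
        return 1
    if x < 0:
        return -1
    # only +0.0, -0.0 (and NaN) get this far, so the slower call is rare
    return int(math.copysign(1, x))

print(sign(3.5), sign(-2), sign(0.0), sign(-0.0))
```

With mostly positive candidates the first comparison settles nearly every
call, which is why the mix of inputs matters so much for the timings.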

i.


[issue4565] io write() performance very slow

2008-12-06 Thread Istvan Albert

New submission from Istvan Albert [EMAIL PROTECTED]:

The write performance into text files is substantially slower (5x-8x)
than that of python 2.5. This makes python 3.0 unsuited to any
application that needs to write larger amounts of data.

test code follows ---

import time

lo, hi, step = 10**5, 10**6, 10**5

# writes increasingly more lines to a file
for N in range(lo, hi, step):
    fp = open('foodata.txt', 'wt')
    start = time.time()
    for i in range( N ):
        fp.write( '%s\n' % i)
    fp.close()
    stop = time.time()
    print ( "%s\t%s" % (N, stop-start) )

--
components: Interpreter Core
messages: 77132
nosy: ialbert
severity: normal
status: open
title: io write() performance very slow
type: performance
versions: Python 3.0

___
Python tracker [EMAIL PROTECTED]
http://bugs.python.org/issue4565
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue4565] io write() performance very slow

2008-12-06 Thread Istvan Albert

Istvan Albert [EMAIL PROTECTED] added the comment:

Well I would strongly dispute that anyone other than the developers
expected this. The release documentation states:

The net result of the 3.0 generalizations is that Python 3.0 runs the
pystone benchmark around 10% slower than Python 2.5.

There is no indication of an order of magnitude slowdown in read/write.
I believe that this issue is extremely serious! IO is an essential part
of a program, and today we live in the world of gigabytes of data. I am
reading reports of even more severe io slowdowns than what I saw:

http://bugs.python.org/issue4561

Java has had a hard time getting rid of the "it is very slow" stigma
even after getting a JIT compiler, so there is a danger here of a
lasting negative impression.




slow Python 3.0 write performance?

2008-12-05 Thread Istvan Albert
Could someone run the code below on both Python 2.5 and 3.0?

For me (on Windows) it runs over 7 times slower with Python 3.0.

import time

lo, hi, step = 10**5, 10**6, 10**5

# writes increasingly more lines to a file
for N in range(lo, hi, step):
    fp = open('foodata.txt', 'wt')
    start = time.time()
    for i in range( N ):
        fp.write( '%s\n' % i)
    fp.close()
    stop = time.time()
    print ( "%s\t%s" % (N, stop-start) )





Re: slow Python 3.0 write performance?

2008-12-05 Thread Istvan Albert
On Dec 5, 3:06 pm, [EMAIL PROTECTED] wrote:

 It should get faster over time.  It will get faster over a shorter period of
 time if people contribute patches.

I see, thanks for the clarification.

I will make the point though that this makes python 3.0 unsuited for
anyone who has to process data. One could live with slowdowns of say
20-50 percent, to get the goodies that 3.0 offers, but when a process
that takes 1 second suddenly starts to take 10, it makes the
situation untenable.

best,

Istvan


Re: slow Python 3.0 write performance?

2008-12-05 Thread Istvan Albert
On Dec 5, 3:41 pm, Christian Heimes [EMAIL PROTECTED] wrote:

 I've fixed the read() slowness yesterday. You'll get the fix in the next
 release of Python 3.0 in a couple of weeks.

Does this fix speed up the write() function as well?

A previous poster suggested that in this case the slowdown is caused
by the new io code being written in python rather than C.

Istvan


Re: Don't you just love writing this sort of thing :)

2008-12-05 Thread Istvan Albert
On Dec 3, 8:07 pm, Lawrence D'Oliveiro [EMAIL PROTECTED]
central.gen.new_zealand wrote:

snip code

Originally, like many others here I said YIKES! but on a second read,
it is not that bad. It actually grows on you.

After looking at it one more time I found it neat, very concise
without being unreadable.

i.


Re: RELEASED Python 3.0 final

2008-12-04 Thread Istvan Albert
Congratulations on fantastic work!


Re: Python 3 read() function

2008-12-04 Thread Istvan Albert
I can confirm this,

I am getting very slow read performance when reading a smaller 20 MB
file.

 - Python 2.5 takes 0.4 seconds
 - Python 3.0 takes 62 seconds

fname = 'dmel-2R-chromosome-r5.1.fasta'
data = open(fname, 'rt').read()
print ( len(data) )



Re: Python 3 read() function

2008-12-04 Thread Istvan Albert
On Dec 4, 1:31 pm, Terry Reedy [EMAIL PROTECTED] wrote:

 Jerry Hill wrote:

  That's 3 orders of magnitude slower on python3.0!

 Timing of os interaction may depend on os.  I verified above on WinXp
 with 4 meg Pythonxy.chm file.  Eye blink versus 3 secs, duplicated.  I
 think something is wrong that needs fixing in 3.0.1.

 http://bugs.python.org/issue4533

I believe that the slowdowns are even more substantial when opening
the file in text mode.


Re: Python 3 read() function

2008-12-04 Thread Istvan Albert
Turns out write performance is also slow!

The program below takes

 3 seconds on python 2.5
17 seconds on python 3.0

Yes, 17 seconds! Tested many times, in various orders. I believe the
slowdown is not a constant factor but some sort of nonlinear function
(quadratic?); play with N to see it.

===

import time

start = time.time()

N = 10**6
fp = open('testfile.txt', 'wt')
for n in range(N):
    fp.write( '%s\n' % n )
fp.close()

end = time.time()

print (end-start)
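
One way to check the nonlinear guess is to time several values of N and
compare the cost per line: a roughly constant figure means linear
scaling, a growing one means something worse. A sketch along those lines
(the helper name time_writes and the use of temporary files are choices
made for this illustration):

```python
import os
import tempfile
import time

def time_writes(sizes):
    """Write n short lines for each n in sizes; return (n, seconds, seconds per line)."""
    results = []
    for n in sizes:
        fd, path = tempfile.mkstemp(suffix='.txt')
        os.close(fd)
        try:
            start = time.time()
            with open(path, 'wt') as fp:
                for i in range(n):
                    fp.write('%s\n' % i)
            elapsed = time.time() - start
        finally:
            os.remove(path)
        results.append((n, elapsed, elapsed / n))
    return results

for n, elapsed, per_line in time_writes([10**4, 10**5, 10**6]):
    print('%s\t%.4f s\t%.2e s/line' % (n, elapsed, per_line))
```

Run it under both interpreters and compare the last column.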


Re: multiprocessing eats memory

2008-09-26 Thread Istvan Albert
On Sep 26, 4:52 am, redbaron [EMAIL PROTECTED] wrote:

 How could I avoid of storing them? I need something to check does it
 ready or not and retrieve results if ready. I couldn't see the way to
 achieve same result without storing asyncs set.

It all depends on what you are trying to do. The issue that you
originally brought up is that of memory consumption.

When processing data in parallel you will use up as much memory as
the datasets being processed at any given time require. If you need to
reduce memory use then you need to start fewer processes and use some
mechanism to distribute the work among them as they become free (see
the recommendation that uses Queues).


Re: multiprocessing eats memory

2008-09-25 Thread Istvan Albert
On Sep 25, 8:40 am, Max Ivanov [EMAIL PROTECTED] wrote:

 At any time in main process there are shouldn't be no more than two copies of 
 data
 (one original data and one result).

From the looks of it you are storing a lot of references to various
copies of your data via the async set.


Re: Processing in Python

2008-05-20 Thread Istvan Albert
On May 20, 6:13 pm, Diez B. Roggisch [EMAIL PROTECTED] wrote:
 Salvatore DI DI0 schrieb:

  Hello,

  The Processing Graphics language has been implemented in Javascript.

 No, it hasn't. Processing is written in Java.

He meant it has been re-implemented in Javascript:

http://ejohn.org/blog/processingjs/

The closest Python-based equivalent of Processing is NodeBox, but it is
an OS X product:

http://nodebox.net/code/index.php/Home

There is a jython based NodeBox that runs on windows that can be found
here:

http://research.nodebox.net/index.php/NodeBoxDev

i.




Re: ]ANN[ Vellum 0.16: Lots Of Documentation and Watching

2008-05-01 Thread Istvan Albert
On Apr 29, 3:51 am, Zed A. Shaw [EMAIL PROTECTED] wrote:

 You can grab the most recent draft of the book at:

  http://zedshaw.com/projects/vellum/manual-final.pdf

 However, I'm curious to get other people's thoughts.

IMO if you refrained from using swear words in the manual it would
help broaden its reach and acceptance.

i.


Re: Python Success stories

2008-04-23 Thread Istvan Albert
On Apr 23, 2:08 pm, Bob Woodham [EMAIL PROTECTED] wrote:

 x = x++;

 has unspecified behaviour in C.  That is, it is not specified
 whether the value of x after execution of the statement is the
 old value of x or one plus the old value of x.

strictly speaking this is undefined behavior in C, which means that the
result could be anything: old value, old value+1, -2993882, trallalla,
core dump, stack overflow etc...

in Java the behavior is specified, but many might find the result
counterintuitive:

int x = 0;
x = x++;
System.out.println(x);

prints 0. If I recall correctly, x++ evaluates to the old value and
increments x, then the assignment stores that old value back into x,
so the increment is summarily discarded.

i.




Re: Python Success stories

2008-04-22 Thread Istvan Albert
On Apr 22, 6:25 am, azrael [EMAIL PROTECTED] wrote:

 A friend of mine i a proud PERL developer which always keeps making
 jokes on python's cost.

 This hurts. Please give me informations about realy famous
 aplications.

you could show him what Master Yoda said when he compared Python to
Perl

http://www.personal.psu.edu/iua1/pythonvsperl.htm

i.


Re: py3k s***s

2008-04-18 Thread Istvan Albert
On Apr 18, 1:39 am, Sverker Nilsson [EMAIL PROTECTED] wrote:

 Some whine. Some just don't care. Why not whine?

Whining and ranting is actually good for the psyche. It is better to
get it out of your system.

As for your original post, no doubt there are substantial downsides to
introducing Py3K, but as Guido put it, "every language must change or
die".

Of course we could end up with "change and die" as well. I for one am
an optimist; I think there are several substantial improvements to the
language that today may not be apparent to a casual observer, yet
will allow it to evolve into an even more powerful and fun language.

i.




Re: I am worried about Python 3

2008-04-09 Thread Istvan Albert
On Apr 9, 11:53 am, John Nagle [EMAIL PROTECTED] wrote:

 The general consensus is that Python 3.x isn't much of an

there are a number of unfortunate typos in there that interfere with
the message,

instead of "The general consensus is" I think you actually meant "In
my opinion"

i.



Re: Line segments, overlap, and bits

2008-03-27 Thread Istvan Albert
On Mar 26, 5:28 pm, Sean Davis [EMAIL PROTECTED] wrote:
 I am working with genomic data.  Basically, it consists of many tuples
 of (start,end) on a line.  I would like to convert these tuples of
 (start,end) to a string of bits where a bit is 1 if it is covered by
 any of the regions described by the (start,end) tuples and 0 if it is
 not.  I then want to do set operations on multiple bit strings (AND,
 OR, NOT, etc.).  Any suggestions on how to (1) set up the bit string
 and (2) operate on 1 or more of them?  Java has a BitSet class that
 keeps this kind of thing pretty clean and high-level, but I haven't
 seen anything like it for python.

The solution depends on what size of genomes you want to work with.

There is a BitVector class that could probably do what you want, although
there are some scaling issues as it is pure Python.

http://cobweb.ecn.purdue.edu/~kak/dist/BitVector-1.2.html

If you want high speed stuff (implemented in C and PyRex) that works
for large scale genomic data analysis the bx-python package might do
what you need (and even things that you don't yet know that you really
want to do)

http://bx-python.trac.bx.psu.edu/

but of course this one is a lot more complicated
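
For modest sizes there is also a do-it-yourself route: a plain Python
integer can serve as the bit string, since integers are arbitrarily long
and support the bitwise operators. A minimal sketch, assuming 0-based,
half-open (start, end) regions; the helper names to_bits and show are
made up for the illustration:

```python
def to_bits(regions):
    """Pack half-open (start, end) regions into one integer, bit i = position i."""
    bits = 0
    for start, end in regions:
        # a run of (end - start) ones, shifted up to begin at 'start'
        bits |= ((1 << (end - start)) - 1) << start
    return bits

def show(bits, length):
    """Render the bits as a string with position 0 on the left."""
    return format(bits, '0%db' % length)[::-1]

length = 8
a = to_bits([(0, 3), (5, 7)])
b = to_bits([(2, 6)])
mask = (1 << length) - 1        # needed for NOT, since Python ints are unbounded

print(show(a, length))
print(show(b, length))
print(show(a & b, length))      # AND: covered by both
print(show(a | b, length))      # OR: covered by either
print(show(~a & mask, length))  # NOT: not covered by a
```

This is fine for exploration; for whole genomes the packages above are
the better bet.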

i.


Re: Is subprocess.Popen completely broken?

2008-03-27 Thread Istvan Albert
On Mar 27, 10:53 am, Skip Montanaro [EMAIL PROTECTED] wrote:

 Is subprocess.Popen completely broken?

Your lack of faith in Python is somewhat disturbing ...



Re: sympy: what's wrong with this picture?

2008-03-04 Thread Istvan Albert
On Mar 4, 3:13 pm, Mensanator [EMAIL PROTECTED] wrote:

 But what if _I_ wanted to make a repeatable sequence for test
 purposes? Wouldn't factorint() destroy my attempt by reseeding
 on every call?

Would it?

It may just be that you are now itching to see a problem even where
there isn't one.

i.


Re: ANN: Phatch = PHoto bATCH processor and renamer based on PIL

2008-02-21 Thread Istvan Albert
On Feb 18, 9:58 am, SPE - Stani's Python Editor
[EMAIL PROTECTED] wrote:
 I'm pleased to announce the release of Phatch which is a
 powerful batch processor and renamer. Phatch exposes a big part of

This program is fantastic! It has a very accessible user interface and
produces great images.

Thanks!

Istvan

PS. the name Phatch is a bit hard to pronounce (easily confused
with fetch), so it is not easy to talk about it in a live conversation.
You should just call it photo-batch; in the end all the acronym saves
you is four letters and you lose the obvious meaning of what the tool
does. Anyhow, just a suggestion based on first impressions: great
tool, great functionality. We can surely nominate it for the best
Python based tool of 2008 ... so far ;-)


Re: IronPython vs CPython: faster in 1.6 times?

2008-02-05 Thread Istvan Albert
On Feb 5, 12:31 pm, dmitrey [EMAIL PROTECTED] wrote:
 Hi all,
 the url http://torquedev.blogspot.com/2008/02/changes-in-air.html
 (blog of a game developers)
 says IronPython is faster than CPython in 1.6 times.
 Is it really true?

This is the second time that IronPython has piqued my interest
sufficiently to create a toy program to benchmark it and I must say
the results are not encouraging:

$ python bench.py
Elapsed time: 1.10 s

$ ipy bench.py
Elapsed time:65.01 s

and here is the benchmark program:

import time
start = time.time()

def foo(a):
    return a * a

data = {}
for i in xrange( 10**6 ):
    data[i] = foo(i)

print 'Elapsed time:%5.2f s' % ( time.time() - start)


Re: IronPython vs CPython: faster in 1.6 times?

2008-02-05 Thread Istvan Albert
On Feb 5, 4:56 pm, Arnaud Delobelle [EMAIL PROTECTED] wrote:

 Could it be because .NET doesn't have arbitrary length integer types
 and your little benchmark will create lots of integers > 2**32 ?
 What is the result if you replace foo(a) with

 def foo(a): return sqrt(a)

Good observation, in the case above the run times are about the same.

i.




Re: Bizarre behavior with mutable default arguments

2007-12-30 Thread Istvan Albert
On Dec 30, 5:23 am, thebjorn [EMAIL PROTECTED]
wrote:

def age(dob, today=datetime.date.today()):
...

 None of my unit tests caught that one :-)

Interesting example, I can see how it caused some trouble. A quick fix
would be to write it:

def age(dob, today=datetime.date.today ):

and inside the definition invoke it as today() rather than just today.
That way it still keeps the original spirit of the definition.
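
Spelled out, the idea looks like the sketch below. The body of age() is
an assumption, since the original post elided it; only the handling of
the default argument is the point here:

```python
import datetime

def age(dob, today=datetime.date.today):
    """Age in whole years; 'today' is a callable, evaluated at call time."""
    current = today()   # looked up on every call, not once at definition time
    had_birthday = (current.month, current.day) >= (dob.month, dob.day)
    return current.year - dob.year - (0 if had_birthday else 1)

dob = datetime.date(2000, 6, 15)
print(age(dob))

# tests can pass in a fixed "today" without touching the system clock
print(age(dob, today=lambda: datetime.date(2020, 6, 14)))
```

A side benefit is that unit tests can inject any date they like, which
would likely have caught the original problem.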

i.


Re: Bizarre behavior with mutable default arguments

2007-12-30 Thread Istvan Albert
On Dec 29, 11:21 pm, bukzor [EMAIL PROTECTED] wrote:

 The standard library is not affected because

the people who wrote code into it know how python works.

Programming abounds with cases that some people think should work
differently:

a = b = []
a.append(1)

is b empty or not at this point? Get informed, remember the rules, be
happy and move on to write some cool code.
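
For anyone who would rather run it than guess, a short check; the second
pair of names, c and d, is added here only for contrast:

```python
a = b = []          # one list object, two names
a.append(1)
print(b)            # [1]
print(a is b)       # True

c, d = [], []       # two separate list objects
c.append(1)
print(d)            # []
```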

There is little new in what you say. Every so often someone is having
a confusing time with a feature and therefore proposes that the
language be changed to match his/her expectations.

i.


Re: Fate of itertools.dropwhile() and itertools.takewhile()

2007-12-30 Thread Istvan Albert
On Dec 30, 3:29 am, Marc 'BlackJack' Rintsch [EMAIL PROTECTED] wrote:

 One recipe is extracting blocks from text files that are delimited by a
 special start and end line.

Neat solution!

I actually need such functionality every once in a while.

Takewhile + dropwhile to the rescue!
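
The recipe itself is not quoted above, so the following is only a guess
at its general shape: drop lines until the start marker, then take lines
until the end marker. The function name extract_block and the marker
strings are made up for the illustration:

```python
from itertools import dropwhile, takewhile

def extract_block(lines, start, end):
    """Return the lines strictly between the first start marker and the next end marker."""
    after_start = dropwhile(lambda line: line != start, lines)
    next(after_start, None)     # discard the start marker itself
    return list(takewhile(lambda line: line != end, after_start))

lines = ['header', 'BEGIN', 'first', 'second', 'END', 'footer']
print(extract_block(lines, 'BEGIN', 'END'))
```

If the start marker never shows up the result is simply an empty list.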

i.


Re: Bizarre behavior with mutable default arguments

2007-12-30 Thread Istvan Albert
On Dec 30, 11:26 am, George Sakkis [EMAIL PROTECTED] wrote:

 I'm with you on this one; IMHO it's one of the relatively few language
 design missteps of Python, favoring the rare case as the default
 instead of the common one.

George, you pointed out this link in a different thread:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/521877

how would you rewrite the code below if you could not use mutable
default arguments (global variables not accepted)? Maybe there is a
way, but I can't think of it as of now.

---

from itertools import groupby

def blocks(s, start, end):
    def classify(c, ingroup=[0]):
        klass = c==start and 2 or c==end and 3 or ingroup[0]
        ingroup[0] = klass==1 or klass==2
        return klass
    return [tuple(g) for k, g in groupby(s, classify) if k == 1]

print blocks('the {quick} brown {fox} jumped', start='{', end='}')


Re: Bizarre behavior with mutable default arguments

2007-12-30 Thread Istvan Albert
On Dec 30, 3:41 pm, bukzor [EMAIL PROTECTED] wrote:

 No globals, as you specified. BTW, it's silly not to 'allow' globals
 when they're called for, otherwise we wouldn't need the 'global'
 keyword.

okay, now note that you do not actually use the ingroup list for
anything else but getting and setting its first element. So why would
one really need it to be a list? Let's replace it with a variable called
ingroup that is not a list anymore. See it below (run it to see what
happens):

--

from itertools import groupby

def blocks(s, start, end):
    ingroup = 0
    def classify(c):
        klass = c==start and 2 or c==end and 3 or ingroup
        ingroup = klass==1 or klass==2
        return klass
    return [tuple(g) for k, g in groupby(s, classify) if k == 1]

print blocks('the {quick} brown {fox} jumped', start='{', end='}')
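
For those who do not run it: the call fails with an UnboundLocalError,
because assigning to ingroup inside classify() makes the name local to
classify(), so the read on the line before has nothing to read. Python 3
added the nonlocal statement for exactly this situation. A sketch of the
same function written for Python 3, with the and/or chain replaced by
conditional expressions for readability:

```python
from itertools import groupby

def blocks(s, start, end):
    ingroup = False
    def classify(c):
        nonlocal ingroup        # rebind the variable of the enclosing function
        klass = 2 if c == start else 3 if c == end else int(ingroup)
        ingroup = klass in (1, 2)
        return klass
    return [tuple(g) for k, g in groupby(s, classify) if k == 1]

print(blocks('the {quick} brown {fox} jumped', start='{', end='}'))
```

Without nonlocal, the one-element list is the usual way to get a
writable cell that the inner function can reach.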






Re: sqlobject issue/question...

2007-12-29 Thread Istvan Albert
On Dec 28, 11:27 pm, bruce [EMAIL PROTECTED] wrote:

 i'm playing around, researching sqlobject, and i notice that it appears to
 require the use of id in each tbl it handles in the database.

 is there a way to overide this function/behavior...

there better be such a way. An ORM that does not allow you to override
what the primary keys are called would be quite limited. Look at the
sqlmeta data:

http://www.sqlobject.org/SQLObject.html#class-sqlmeta

i.


Re: joining rows

2007-12-29 Thread Istvan Albert
On Dec 29, 10:22 am, Tim Chase [EMAIL PROTECTED] wrote:

 If, however, order matters, you have to do it in a slightly
 buffered manner.

 Can be reduced to a sed one-liner

I think the original version works just as well for both cases. Your
sed version however does need the order you mention. Makes it no less
mind-bending though ... once I saw it I knew I had to try it :-)

i.


Re: joining rows

2007-12-29 Thread Istvan Albert

on a second read ... I see that you mean the case that should only
join consecutive lines with the same key


Re: Bizarre behavior with mutable default arguments

2007-12-29 Thread Istvan Albert
On Dec 29, 12:50 pm, bukzor [EMAIL PROTECTED] wrote:

 Is this functionality intended? It seems very unintuitive. This has
 caused a bug in my programs twice so far, and both times I was
 completely mystified until I realized that the default value was
 changing.

it is only unintuitive when you do not know about it

once you realize how it works and what it does it can actually be very
useful

i.



Re: Bizarre behavior with mutable default arguments

2007-12-29 Thread Istvan Albert
On Dec 29, 1:11 pm, Martin v. Löwis [EMAIL PROTECTED] wrote:

 Google for Python mutable default arguments

and a mere 30 minutes later this thread is already one of the results
that come up


Re: Fate of itertools.dropwhile() and itertools.takewhile()

2007-12-29 Thread Istvan Albert
On Dec 29, 6:10 pm, Raymond Hettinger [EMAIL PROTECTED] wrote:

 These thoughts reflect my own experience with the itertools module.
 It may be that your experience with them has been different.  Please
 let me know what you think.

first off, the itertools module is amazing, thanks for creating it. It
changed the way I think about programming. In fact nowadays I start
all my programs with:

from itertools import *

which may not be the best form, but I got tired of importing every
single function individually or writing out the module name.

Now I never needed the dropwhile() and takewhile() functions, but that
may not mean much. For quite a while I never needed the repeat()
function either. It even looked nonsensical to have an iterator that
simply repeats the same thing over and over. One day I had to solve a
problem that needed repeat(), which made me really understand what it
was for, and I got to marvel at just how neat the solution was.
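
The problem in question is not described here, so the snippet below only
shows the typical situation where repeat() earns its keep: supplying a
constant argument to something that consumes several iterables in
lockstep. The label 'chr1' is a made-up value:

```python
from itertools import repeat

# the same exponent for every base
squares = list(map(pow, range(5), repeat(2)))
print(squares)

# tag every position with the same label
labelled = list(zip(repeat('chr1'), [100, 200, 300]))
print(labelled)
```

Since map() and zip() stop at the shortest input, the endless iterator
causes no trouble.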

i.


Re: Converting old shelve databases to gdbm

2007-12-24 Thread Istvan Albert
On Dec 24, 7:38 pm, [EMAIL PROTECTED] wrote:

 Any tips welcome.

Pickling has a text protocol that should be compatible across Python
versions. Pickle each of your database entries to a different file with
the old version, then read them in with the newer version of the script.
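
A sketch of both halves, written in current Python syntax and using a
plain dict as a stand-in for the opened shelve; the function names and
the entry_NNNNNN.pkl file naming are made up for the illustration.
Protocol 0 is the original text-based pickle protocol:

```python
import os
import pickle
import tempfile

def export_entries(db, outdir):
    """Write each (key, value) pair of a dict-like db to its own pickle file."""
    for index, (key, value) in enumerate(db.items()):
        path = os.path.join(outdir, 'entry_%06d.pkl' % index)
        with open(path, 'wb') as fp:
            pickle.dump((key, value), fp, protocol=0)   # protocol 0: text based

def import_entries(indir):
    """Read every pickle file in indir back into a plain dict."""
    entries = {}
    for name in sorted(os.listdir(indir)):
        with open(os.path.join(indir, name), 'rb') as fp:
            key, value = pickle.load(fp)
        entries[key] = value
    return entries

old_db = {'alpha': [1, 2, 3], 'beta': {'x': 1.5}}   # stand-in for the old shelve
with tempfile.TemporaryDirectory() as outdir:
    export_entries(old_db, outdir)
    restored = import_entries(outdir)
print(restored == old_db)
```

In practice the export half runs under the old interpreter, where the
old database can still be opened, and the import half under the new one,
where the entries are stored into the gdbm-backed shelve.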

i.


Re: 2D Game Development in Python

2007-12-21 Thread Istvan Albert
On Dec 20, 8:16 pm, PatrickMinnesota [EMAIL PROTECTED]
wrote:

 seen all the lists.  I've done my reading.  What I don't have is
 actual testimonials by people who have used a chunk of code to program
 an animated 2D game and had a great experience.

You could use Panda3D to create the game, who cares that it is really
a 3D engine ... just don't rotate the camera ;-)

http://panda3d.etc.cmu.edu/what.php
i.


Re: Understanding memory leak reports

2007-12-21 Thread Istvan Albert
On Dec 21, 1:44 pm, Giampaolo Rodola' [EMAIL PROTECTED] wrote:

 Since the main module is very big (more than 2800 lines of code)

maybe that is the actual problem to begin with,

you should refactor it so it is more modular and trackable, otherwise
this is just one of the many issues that will crop up,

just an opinion.

i.



Re: pypi and easy_install

2007-12-19 Thread Istvan Albert
On Dec 19, 8:07 pm, Giampaolo Rodola' [EMAIL PROTECTED] wrote:

 Could someone point me in the right direction?

 download_url = 'http://code.google.com/p/pyftpdlib/downloads/list',

you'll need to specify the full path to the actual archive, a link
that one could use to download the archive, not just point it to a web
page that contains links

i.


Re: pypi and easy_install

2007-12-19 Thread Istvan Albert
On Dec 19, 9:44 pm, Istvan Albert [EMAIL PROTECTED] wrote:
 On Dec 19, 8:07 pm, Giampaolo Rodola' [EMAIL PROTECTED] wrote:

  download_url = 'http://code.google.com/p/pyftpdlib/downloads/list',

this is from looking at your setup.py here:

http://pyftpdlib.googlecode.com/svn/trunk/setup.py


Re: Is anyone happy with csv module?

2007-12-11 Thread Istvan Albert
On Dec 11, 2:14 pm, massimo s. [EMAIL PROTECTED] wrote:

 dislike more is that it seems working by *rows* instead than by
 *columns*.

you can easily transpose the data to get your columns, for a data file
that looks like this:

 data.txt 
A,B,C
1,2,3
10,20,30
100,200,300

do the following:


import csv
reader = csv.reader( file('data.txt', 'U') )
rows = list(reader)

print rows

cols = zip(*rows)

print cols[0]
print cols[1]
print cols[2]

this will print:

-- Python --

[['A', 'B', 'C'], ['1', '2', '3'], ['10', '20', '30'], ['100', '200',
'300']]
('A', '1', '10', '100')
('B', '2', '20', '200')
('C', '3', '30', '300')



Re: Running unmodified CGI scripts persistently under mod_wsgi.

2007-12-08 Thread Istvan Albert
On Dec 8, 8:26 am, Michael Ströder [EMAIL PROTECTED] wrote:

  But conventional CGI scripts are implemented with the assumption of being
 stateless.

while it might be that some CGI scripts must be executed in a new
python process on each request, common sense and good programming
practices would steer most people away from that sort of approach.

after all such CGI scripts would not execute properly under mod_python
either

i.



Re: Running unmodified CGI scripts persistently under mod_wsgi.

2007-11-27 Thread Istvan Albert
On Nov 25, 1:55 am, Graham Dumpleton [EMAIL PROTECTED]
wrote:

 The other question is whether there is even a demand for this. Do
 people want to be able to take unmodified Python CGI scripts and try
 to run them persistently in this way, or would they be better off
 converting them to proper WSGI applications.

I think CGI will be with us for many decades.

It will be awesome if mod_wsgi can run CGI without invoking python on
each access.
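
For contrast with a CGI script that starts a fresh interpreter per request, a WSGI application is just a callable that a long-lived process invokes for every request. A minimal sketch in Python 3 syntax, exercised without any server; the names application, captured and start_response are illustrative and this says nothing about how mod_wsgi would wrap an existing CGI script:

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Minimal WSGI app: called once per request inside a long-lived process."""
    body = b"Hello from a persistent process\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


# Exercise the callable without a server, using a fake request environment.
environ = {}
setup_testing_defaults(environ)
captured = {}


def start_response(status, headers):
    # Record what the application reported instead of sending it anywhere.
    captured["status"] = status
    captured["headers"] = headers


result = application(environ, start_response)
print(captured["status"], result)
```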

i.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python web frameworks

2007-11-22 Thread Istvan Albert
On Nov 21, 12:15 am, Graham Dumpleton [EMAIL PROTECTED]
wrote:

 I would say that that is now debatable. Overall mod_wsgi is probably a
 better package in terms of what it has to offer. Only thing against
 mod_wsgi at this point is peoples willingness to accept something that
 is new in conjunction with Linux distributions and web hosting
 companies being slow to adopt new packages.

Yes that is to be expected, many people want someone else to pay the
early adopter's costs. Nonetheless mod_wsgi seems like the right
direction to move the python world.

One confounding factor that may slow its adoption could be the need of
running plain old CGI in an efficient way. I'm not sure how that fits
into the WSGI picture.

i.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Populating a dictionary, fast [SOLVED]

2007-11-20 Thread Istvan Albert
On Nov 19, 2:33 pm, Francesc Altet [EMAIL PROTECTED] wrote:

 Just for the record.  I was unable to stop thinking about this, and
 after some investigation, I guess that my rememberings were gathered
 from some benchmarks with code in Pyrex (i.e. pretty near to C speed).

Pretty interesting and quite unexpected.

i.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python web frameworks

2007-11-20 Thread Istvan Albert
On Nov 20, 9:42 am, Diez B. Roggisch [EMAIL PROTECTED] wrote:
  12/7. Django comes with its own little server so that you don't have
  to set up Apache on your desktop to play with it.

 I was rather shocked to learn that django only has this tiny server and does
 not come with a stand-alone server

Alas it is even worse than that, the development server is single
threaded and that can be a big problem when developing sites that make
multiple simultaneous requests (say if one of those requests stalls
for some reason). It is trivial to add multi threading via a mixin
(takes about two lines of code) but it does not seem to be a priority
to do so.
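
To show what such a mixin looks like, here is a sketch using only the standard library in Python 3 naming (the module is called SocketServer in Python 2). This is not Django's own server code; ThreadedWSGIServer is a made-up name for illustration:

```python
from socketserver import ThreadingMixIn
from wsgiref.simple_server import WSGIServer


class ThreadedWSGIServer(ThreadingMixIn, WSGIServer):
    """WSGI server that handles each request in its own thread."""
    # Do not let worker threads block interpreter shutdown.
    daemon_threads = True
```

Such a class can be passed as the server_class argument of wsgiref.simple_server.make_server, so a stalled request no longer blocks the others.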

For large traffic though, Django is better than just about any other
framework because it is built as a multiprocess framework through
mod_python (rather than threaded). So there is no global interpreter
lock contention, thread switching etc. Django was built based on a
real life heavy duty use scenario, and over the years that really
helped with ironing out the problems.

i.






-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-16 Thread Istvan Albert
On Nov 16, 1:18 pm, Michael Bacarella [EMAIL PROTECTED] wrote:

 You're right, it is completely inappropriate for us to be showing our
 dirty laundry to the public.

you are misinterpreting my words on many levels,

(and I of course could have refrained from the chair-monitor jab as
well)

anyhow, it is what it is, I could not reproduce any of the weird
behaviors myself, I got nothing more to add to this discussion
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: gc penalty of 30-40% when manipulating large data structures?

2007-11-16 Thread Istvan Albert
On Nov 16, 10:59 am, Chris Mellon [EMAIL PROTECTED] wrote:

 The GC has a heuristic where it kicks in when (allocations -
 deallocations) exceeds a certain threshold,

As the available ram increases this threshold can be more easily
reached. Ever since I moved to 2Gb ram I stumbled upon issues that
were easily solved by turning the gc off (the truth is that more ram
made me lazier, I'm a little less keen to keep memory consumption down
for occasional jobs, being overly cavalier with generating lists of
1Gb in size...)

One example, when moving from a list size from 1 million to 10 million
I hit this threshold. Nowadays I disable the gc during data
initialization.
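
A minimal sketch of that pattern in Python 3 syntax; build_rows is a hypothetical helper and the row count is arbitrary. The cyclic collector is paused only while the data is built and its previous state is restored afterwards:

```python
import gc


def build_rows(n):
    """Build n small tuples with the cyclic collector paused."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        return [(i, i + 1) for i in range(n)]
    finally:
        # Restore the previous collector state even if building fails.
        if was_enabled:
            gc.enable()


rows = build_rows(100000)
print(len(rows))
```

Reference counting still frees objects while the collector is off; only the search for reference cycles is skipped.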

i.



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-15 Thread Istvan Albert
On Nov 14, 6:26 pm, Steven D'Aprano [EMAIL PROTECTED]
cybersource.com.au wrote:

 On systems with multiple CPUs or 64-bit systems, or both, creating and/or
 deleting a multi-megabyte dictionary in recent versions of Python (2.3,
 2.4, 2.5 at least) takes a LONG time, of the order of 30+ minutes,
 compared to seconds if the system only has a single CPU.

Please don't propagate this nonsense. If you see this then the problem
exists between the chair and monitor.

There is nothing wrong with either creating or deleting
dictionaries.

i.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-15 Thread Istvan Albert
On Nov 15, 4:11 pm, Steven D'Aprano [EMAIL PROTECTED]
cybersource.com.au wrote:

 Unless you're accusing both myself and the original poster of outright
 lying, of faking our results, what's your explanation?

I don't attribute it to malice, I think you're simply measuring
something else. You both must have some system setting that
forces your application to swap to disk.

Do you really believe that you cannot create or delete a large
dictionary with python versions less than 2.5 (on a 64 bit or multi-
cpu system)? That a bug of this magnitude has not been noticed until
someone posted on clp?

 Have you tried
 running our code on a 64-bit or multi-CPU system to see for yourself,

the answer is: yes and yes, I see nothing out of the ordinary.

i.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Populating a dictionary, fast [SOLVED]

2007-11-13 Thread Istvan Albert
On Nov 13, 11:27 am, Francesc Altet [EMAIL PROTECTED] wrote:

 Another possibility is using an indexed column in a table in a DB.
 Lookups there should be much faster than using a dictionary as well.

I would agree with others who state that for a simple key based lookup
nothing beats a dictionary.

But knowing that you are one of the authors of the excellent pytables
module I think you are approaching this problem from the perspective
of reading in a large number of values on each access. In those cases
specialized libraries (like the hdf) can directly load into memory
huge amounts of continuous data at speeds that substantially
outperform all other approaches.

i.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Populating a dictionary, fast

2007-11-12 Thread Istvan Albert
On Nov 12, 12:39 pm, Michael Bacarella [EMAIL PROTECTED] wrote:

 The win32 Python or the cygwin Python?

 What CPU architecture?

it is the win32 version, a dual core laptop with T5220 Core 2

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Populating a dictionary, fast

2007-11-11 Thread Istvan Albert
On Nov 10, 4:56 pm, Michael Bacarella [EMAIL PROTECTED] wrote:

 This would seem to implicate the line id2name[id] = name as being 
 excruciatingly slow.

As others have pointed out there is no way that this takes 45
minutes. Must be something with your system or setup.

Functionally equivalent code runs in about 49 seconds for me!
(it ends up using about twice as much ram as the data on disk)

i.




-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Populating a dictionary, fast

2007-11-11 Thread Istvan Albert
On Nov 11, 11:25 am, Michael Bacarella [EMAIL PROTECTED] wrote:

 I tried your code (with one change, time on feedback lines) and got the
  same terrible
 performance against my data set.

 To prove that my machine is sane, I ran the same against your generated
 sample file and got _excellent_ performance.  Start to finish in under a 
 minute.

One possibility could be that your dataset turns out to be some sort
of pathological worst case for the hashing algorithm in python.
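
To illustrate what a worst case means here (this is not a claim about the actual dataset), a sketch in Python 3 syntax with a made-up key class whose instances all share one hash value, so every insert has to be compared against keys already stored:

```python
class CollidingKey:
    """Hypothetical key type whose instances all share one hash value."""
    eq_calls = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 1  # every key lands in the same probe chain

    def __eq__(self, other):
        CollidingKey.eq_calls += 1
        return self.value == other.value


table = {}
for n in range(500):
    table[CollidingKey(n)] = True

# The comparison count grows much faster than the number of keys.
print(len(table), CollidingKey.eq_calls)
```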

i.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Populating a dictionary, fast

2007-11-11 Thread Istvan Albert
On Nov 11, 11:51 am, Michael Bacarella [EMAIL PROTECTED] wrote:

 and see it take about 45 minutes with this:

 $ cat cache-keys.py
 #!/usr/bin/python
 v = {}
 for line in open('keys.txt'):
     v[long(line.strip())] = True

On my system (windows vista) your code (using your data) runs in:

36 seconds with python 2.4
25 seconds with python 2.5
39 seconds with python 3000


i.


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: NUCULAR fielded text searchable indexing

2007-10-27 Thread Istvan Albert
On Oct 17, 7:20 am, Steve Holden [EMAIL PROTECTED] wrote:

 I still remember Gadfly fondly.

What a great piece of software Gadfly is ... congrats on that Aaron.
For me it was one of the first Python packages that truly stood out
and made me want to learn Python.

i.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: CGI and external JavaScript nightmare

2007-10-22 Thread Istvan Albert
On Oct 18, 5:04 pm, IamIan [EMAIL PROTECTED] wrote:
   The OP's problem is that he suffers from the delusion that people want
   to steal the source code for his CGI script.

 Why is assuming someone may try to get my source CGI delusional?

 I'm on a shared server (Dreamhost). The CGI itself has 755 permissions
 to execute, but what about folder permissions, etc? If you could
 expand on access to the server, and so on that would be great.

- as far as accessing through the web goes this is a matter of
webserver configuration, set up your webserver in such a way that it
will not return the source code for your scripts

- regarding shared webhosting services, you may not be able to deny
access to people who have shared accounts (and shell login) on the
same server as you do. It all depends on how the shared accounts are
set up.

i.



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: CGI and external JavaScript nightmare

2007-10-11 Thread Istvan Albert
On Oct 11, 2:23 am, IamIan [EMAIL PROTECTED] wrote:

 is a very lengthy garbled js file
 at http://pagead2.googlesyndication.com/pagead/show_ads.js

 The first piece of JavaScript works fine and the ads display
 correctly, however the second file throws an unterminated string
 literal js error.

the javascript code above is fine. you must be making some other
error(s)

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Yet another comparison of Python Web Frameworks

2007-10-10 Thread Istvan Albert
On Oct 9, 11:57 pm, [EMAIL PROTECTED] wrote:
 Since you are starting a new project you may want to look into
 something new and different

 http://mdp.cti.depaul.edu/examples

This is actually a neat framework! I'm somewhat of a fan of web
frameworks and I used most major ones and I like to poke around.

Here is a mini review:

Gluon seems to be modeled with the web.py mentality, everything in one
package, very simple but covering all essentials: templating, database,
sessions and server. It reminds me of a content management system, you
start with a working server, then you add your components to it.

It also implements a Zope-like through-the-web interaction, everything
can be modified in the administration interface, templates, databases,
code, media. This sort of functionality is sometimes frowned upon but
can be a very convenient way to make small changes.

Honestly the framework field is a bit crowded so it is an uphill
battle to get a framework accepted, but it is nice to see something
different.

i.

ps. fix the typos in the docs and get people's names right, it makes a
bad impression otherwise

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: NUCULAR fielded text searchable indexing

2007-10-09 Thread Istvan Albert
On Oct 9, 7:26 am, [EMAIL PROTECTED] wrote:

 No, it doesn't stand for anything.

It also reminds me of someone we all know, and I wish it didn't.

As the Latin proverb says: Nomen est omen. Calling your package
docindexer would draw a lot more people. It is hard to justify to a
third party that a project named nucular actually does something
useful.

It looks like a great piece of work otherwise. Congrats.

ps. there is a python project named The Devil Framework, I cringe
every time I hear about it. Nucular is not as bad, but it is close.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: NUCULAR fielded text searchable indexing

2007-10-09 Thread Istvan Albert
On Oct 9, 9:14 am, [EMAIL PROTECTED] wrote:

 a great tradition of tounge-in-cheek package names, like
 Cold fusion, for example.

Cold Fusion is a super cool name. Nobody will ever think of it as
representing something odd or silly.

 too late now.  sorry again,

why would it be late? is the future of your own work not worth the time
it takes to rename it? Compared to all the other work it took ... it
is just a mere inconvenience.

All I can say is please do yourself a favor and rename it. Imagine
yourself in the position of someone who has no idea what the project is
about and try to imagine what kind of thoughts and feelings the name
will conjure. It matters a lot. It is a shame to put yourself at a
disadvantage.

Anyway that's all I can say. FWIW the Devil Framework guys are
sticking to their name, I think this is something that feels too
personal, people who invent a name will have a hard time accepting
that it is bad.

i.


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Yet another comparison of Python Web Frameworks

2007-10-07 Thread Istvan Albert

IMO this is not so much a framework comparison as an
evaluation of the individual components that make up Pylons.

The framework is the sum of all its parts. Programmers should not need
to know that a package named Beaker is used for sessions, Routes
for url mapping, PasteDeploy for whatever. This is the weakness of all
glue-type frameworks i.e. TG and Pylons. It makes them look like they
are duct-taped together.

The more important questions are whether the sessions actually work
properly: i.e. does session data persist through a server restart?
Where is the session data stored: in memory, files, database and so
on.

The choice of templating language should be a non issue. Any half
decent framework should allow you to use any other templating engine
with ease.
... even python as you seem to prefer

i.




-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Yet another comparison of Python Web Frameworks

2007-10-07 Thread Istvan Albert
On Oct 7, 12:24 pm, Michele Simionato [EMAIL PROTECTED]
wrote:

 Here we disagree: I think that a programmer should know what he
 is using.

My point was that they should not *need* to know. Too much information
can be detrimental.

  Where is the session data stored: in memory, files, database and so
  on.

 Of course Beaker has all these features

Well their main page does not say anything about database based
storage:

http://beaker.groovie.org/

but then a Pylons wiki says it does, but I could not find any
information on how to set up beaker with a database ... truth to
be told I was just curious of how they would go about integrating a
database and still keep it pluggable.

Because it would either need to know about the way the server handles
the database connections (which makes it application specific) or it
has to create new database connections in which case it would be
duplicating the functionality leading to all kinds of headaches.

So there, even a simple requirement exposes the fallacy of building
anything complex out of pluggable components.

i.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: setuptools without unexpected downloads

2007-09-27 Thread Istvan Albert
On Sep 26, 2:09 am, Ben Finney [EMAIL PROTECTED] wrote:

 behaviour with a specific invocation of 'setup.py'. But how can I
 disallow this from within the 'setup.py' program, so my users don't
 have to be aware of this unexpected default behaviour?

I don't have the answer for this, but I can tell you that I myself
dislike the auto-download behavior and I wish it worked differently.

I've given up on setuptools/easy-install altogether. It is most
annoying to end up with half a dozen unexpected packages.

The default behavior should be to pop a question with a list of
packages that will be downloaded (and have a flag that bypasses this).
And of course being able to turn off this feature from setup.py.

i.










-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Tracking memory usage and object life time.

2007-09-26 Thread Istvan Albert
On Sep 26, 8:06 am, Berteun Damman [EMAIL PROTECTED] wrote:

 that have been created after I don't need them anymore. I furthermore
 don't really see why there would be references to these larger objects
 left. (I can be mistaken of course).

This could be tricky because you have a graph that (probably) allows
you to walk its nodes, thus even having a single other reference to
any of the nodes could keep the entire graph alive

 The best I now can do is run the whole script several times (from a
 shell script) -- but this also forces Python to reparse the graph
 input again, and do some other stuff it only has to do once. A

you could pickle and save the graph once the initial processing is
done. That way subsequent runs will load substantially faster.
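
A minimal sketch of that idea in Python 3 syntax. The small adjacency mapping is a stand-in for the real graph and the file location is arbitrary; a custom graph object pickles the same way as long as its classes can be imported when loading:

```python
import os
import pickle
import tempfile

# Hypothetical graph as an adjacency mapping (illustration only).
graph = {"a": ["b", "c"], "b": ["c"], "c": []}

path = os.path.join(tempfile.mkdtemp(), "graph.pickle")

# First run: save the parsed graph.
with open(path, "wb") as fp:
    pickle.dump(graph, fp, protocol=pickle.HIGHEST_PROTOCOL)

# Later runs: load it instead of re-parsing the input.
with open(path, "rb") as fp:
    loaded = pickle.load(fp)

print(loaded == graph)
```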

i.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Newbie completely confused

2007-09-24 Thread Istvan Albert
Two comments,

 ...
 self.item3 = float(foo[c]); c+=1
 self.item4 = float(foo[c]); c+=1
 self.item5 = float(foo[c]); c+=1
 self.item6 = float(foo[c]); c+=1
 ...

this here (and your code in general) is mind boggling and not in a
good way,

as for your original question, I don't think that reading in files of
the size you mention can cause any substantial problems, I think the
problem is somewhere else,

you can run the code below to see that the read times are unaffected
by the order of processing

--

import timeit

# make a big file
NUM = 10**5
fp = open('bigfile.txt', 'wt')
longline = ' ABC ' * 60 + '\n'
for count in xrange(NUM):
    fp.write(longline)
fp.close()

setup1 = """
def readLines():
    data = []
    for line in file('bigfile.txt'):
        data.append(line)
    return data
"""

stmt1 = """
data = readLines()
"""

stmt2 = """
data = readLines()
data = readLines()
"""

stmt3 = """
data = file('bigfile.txt').readlines()
"""

def run(setup, stmt, N=5):
    t = timeit.Timer(stmt=stmt, setup=setup)
    msec = 1000 * t.timeit(number=N) / N
    print "%f msec/pass" % msec

if __name__ == '__main__':
    for stmt in (stmt1, stmt2, stmt3):
        run(setup=setup1, stmt=stmt)



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Zope review

2007-09-21 Thread Istvan Albert
On Sep 20, 7:44 pm, Norm [EMAIL PROTECTED] wrote:

 without meaning to start a flame war between the various python web
 tools, I was wondering if anyone had a review of the status of Zope.
 For example, is it being used for new projects or just maintenance?

Zope is heavily used. It is a mature and reliable product. It is also
very complicated and requires (enforces) a particular way of
programming that can feel very burdensome if it does not 'fit your
brain'.

i.






-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Zope review

2007-09-21 Thread Istvan Albert
On Sep 21, 7:04 pm, Sean Tierney [EMAIL PROTECTED] wrote:

 someone could contrast Zope w/ another Python framework like Twisted.

 I've been investing some time in learning Zope/Plone and would love to
 hear someone speak to alternatives.

Twisted is a networking engine, Zope is a web application framework,
Plone is a content management system, there is nothing to compare,
these are different applications altogether, it is not like you'd
replace one with the other

For applications that can be compared see Zope vs Django vs Pylons vs
web.py vs CherryPy. Google these and contrast away.

i

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: I could use some help making this Python code run faster using only Python code.

2007-09-20 Thread Istvan Albert
On Sep 20, 7:13 pm, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

 How come it's not? Then I noticed you don't have brackets in
 the join statement. So I tried without them and got

If memory serves me right newer versions of python will recognize and
optimize string concatenation via the += operator, thus the advice to
use join does not apply.
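
Both spellings build the same string, as the sketch below shows in Python 3 syntax. As far as I know the speedup for += is a CPython implementation detail (it arrived around 2.4) and other interpreters need not have it, so join remains the portable choice:

```python
words = ["spam"] * 1000

# Repeated += concatenation.
joined_plus = ""
for word in words:
    joined_plus += word

# The traditional advice: collect the pieces, join once.
joined_join = "".join(words)

print(joined_plus == joined_join)
```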

i.


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: MemoryError on reading mbox file

2007-09-12 Thread Istvan Albert
On Sep 12, 5:27 am, Christoph Krammer [EMAIL PROTECTED]
wrote:

 string = self._file.read(stop - self._file.tell())
 MemoryError

This line reads an entire message into memory as a string. Is it
possible that you have a huge email in there (hundreds of MB) with
some attachment encoded as text?

Either way, the truth is that many modules in the standard library are
not well equipped to deal with large amounts of data. Many of them
were developed before gigabyte sized files were even possible to store
let alone process. Hopefully P3K will alleviate many of these problems
by its extensive use of generators.

For now I would recommend that you split your mbox file into several
smaller ones (I think all you need is to split at the "From " lines
that start each message) and run your script on these individual files.
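
A rough sketch of such a splitter in Python 3 syntax, assuming the usual mbox convention that each message begins with a line starting with "From " (body lines that begin that way are normally escaped by the mail software). iter_messages is a made-up helper and the sample text is invented; only one message is held in memory at a time:

```python
import io


def iter_messages(lines):
    """Yield one message at a time as a list of lines."""
    current = []
    for line in lines:
        # A "From " line marks the start of the next message.
        if line.startswith("From ") and current:
            yield current
            current = []
        current.append(line)
    if current:
        yield current


# Tiny in-memory stand-in for a real mbox file.
sample = io.StringIO(
    "From alice@example.com Wed Sep 12 05:27:00 2007\n"
    "Subject: one\n\nbody one\n"
    "From bob@example.com Wed Sep 12 05:28:00 2007\n"
    "Subject: two\n\nbody two\n"
)
messages = list(iter_messages(sample))
print(len(messages))
```

Each yielded chunk can then be written to its own file, or processed directly.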

i.








-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Speed of Python

2007-09-07 Thread Istvan Albert
On Sep 7, 12:42 pm, wang frank [EMAIL PROTECTED] wrote:

 Is my conclusion correct that Python is slower than matlab?

There are ways to speed up code execution, but to see substantial
improvement you'll need to use numpy and rework the code to operate
on vectors/matrices rather than building the result one step at a
time. This applies to Octave as well. See the example code at the end
of the message.

With that code computing 1 million logarithms showed the following
tendency

original = 648.972728 msec per pass
optimized = 492.613773 msec per pass
with numpy = 120.578616 msec per pass

The slowness of python in this example mainly comes from the function
call (math.log) as it seems about 30% of the runtime is spent calling
the function.

import timeit

setup = """
import math
from numpy import arange, log
size = 1000
"""

code1 = """
#original code
for i in range(size):
    for j in range(size):
        a = math.log(j+1)
"""

code2 = """
# minor improvements lead to 15% faster speed
from math import log
for i in xrange(size):
    for j in xrange(size):
        a = log(j+1)
"""

code3 = """
# applying via a universal function makes it 5 times faster
for i in xrange(size):
    nums = arange( size )
    a = log( nums + 1)
"""

N = 3
codes = [ code1, code2, code3 ]

for stmt in codes:
    timer = timeit.Timer(stmt=stmt, setup=setup)
    msec  = 1000.0 * timer.timeit(number=N)/N
    print "%f msec per pass" % msec

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Haskell like (c:cs) syntax

2007-08-29 Thread Istvan Albert
On Aug 29, 8:12 am, Ricardo Aráoz [EMAIL PROTECTED] wrote:

  Caution : L[0] and L[1:] are COPIES of the head and tail of the list.

 Sorry, should have written RETURN copies instead of ARE copies.

L[0] does not return a copy, it does what it says, returns the object
stored at index 0.
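
A quick way to check this, sketched with made-up names: indexing hands back the stored object itself, while slicing builds a new list that refers to the same items.

```python
head_obj = ["nested"]
L = [head_obj, 2, 3]

head = L[0]    # the very same object that is stored in the list
tail = L[1:]   # a new list holding references to the same items

print(head is head_obj)
print(tail is L, tail == [2, 3])
```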

i.

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: This bit of code hangs Python Indefinitely

2007-08-08 Thread Istvan Albert
On Aug 8, 9:38 am, brad [EMAIL PROTECTED] wrote:
 The problem is that I have 512 things to add to the queue, but my limit
 is half that... whoops. Shouldn't the interpreter tell me that I'm an
 idiot for trying to do this instead of just hanging? A message such as
 this would be more appropriate:


See the docs, especially the block and timeout parameter for the put
method:

http://docs.python.org/lib/QueueObjects.html
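
A small sketch of those two parameters in Python 3 syntax (the module is named Queue in Python 2). The queue size and the item names are arbitrary:

```python
import queue  # named Queue in Python 2

q = queue.Queue(maxsize=2)
q.put("first")
q.put("second")

try:
    # block=False raises immediately instead of waiting for a free slot.
    q.put("third", block=False)
    overflowed = False
except queue.Full:
    overflowed = True

try:
    # Alternatively wait a bounded time, then give up.
    q.put("third", timeout=0.1)
    timed_out = False
except queue.Full:
    timed_out = True

print(overflowed, timed_out)
```

With the default blocking put and no consumer, the third put would simply wait forever, which is the hang described above.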


Istvan

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Misleading wikipedia article on Python 3?

2007-08-08 Thread Istvan Albert
On Aug 6, 6:49 am, Neil Cerutti [EMAIL PROTECTED] wrote:

 Incidentally, from the second link I find it shocking that the
 keyword parameter file shadows a builtin. It seems to endorse a
 bad practice.

I believe that the file builtin has been removed as well so it won't
shadow anything.

i.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Misleading wikipedia article on Python 3?

2007-08-08 Thread Istvan Albert
On Aug 6, 6:11 am, Paul Rubin http://[EMAIL PROTECTED] wrote:

 Why on earth did they make this change?  It just seems gratuitous.

Having print a function (with parameters) makes it very easy to modify
where the output goes.

Say you want to have some prints go to one particular file, today you
cannot easily do it, you have to either do a regex based search/
replace or fiddle with the sys.stdout etc. both have substantial
drawbacks.

A solution would be writing the code with a logging function to begin
with, alas many times that is out of one's hands. I wished print was a
function a great many times. I bet Guido has had similar experiences,
note the attempt to keep print in its current form but have it print
to a file ... with that crazy syntax, print >>f, ... alas that did not
solve anything

It is time to fix it for good.
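
For the record, a sketch of how this looks with the print function as it ended up in Python 3. The report wrapper and the in-memory log are made up for illustration; the point is that one small function decides where the output goes:

```python
import io

log = io.StringIO()  # stand-in for an open file


def report(*args, out=None):
    """Thin wrapper: one place to decide where prints go."""
    # file=None means sys.stdout, so the default behaves like plain print.
    print(*args, file=out)


report("step", 1, "done", out=log)
print(log.getvalue(), end="")
```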

i.


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Misleading wikipedia article on Python 3?

2007-08-08 Thread Istvan Albert
On Aug 8, 12:08 pm, Paul Boddie [EMAIL PROTECTED] wrote:

 However, for every enthusiast of one approach, there
 will always be an enthusiast for another (see point #6):

True.

For example I for one also like the way the current print adds a
newline, the vast majority of the time that is exactly what I want. I
don't know if this will be kept when print becomes a function (might
be that some default parameters make it work the same way)
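
As it turned out in Python 3, the newline is indeed the default and is controlled by the end parameter. A short sketch, writing into an in-memory buffer so the result can be inspected:

```python
import io

buf = io.StringIO()

print("with newline", file=buf)            # default end="\n"
print("no newline", end="", file=buf)      # suppress the trailing newline

print(repr(buf.getvalue()))
```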

Which reminds me of an experience I had when I originally learned how
to program in Pascal, then moved on to C. At first it felt wrong that
I had to include the \n to print a new line at the end, seemed
unreadable (Pascal had separate functions: write and a writeln for
it). Today I'd consider it appallingly bad design to have separate
functions for such a small difference.

i.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Misleading wikipedia article on Python 3?

2007-08-08 Thread Istvan Albert
On Aug 8, 2:00 pm, Neil Cerutti [EMAIL PROTECTED] wrote:

 I thought, in fact, that open was on more shaky ground. ;)

yeah, that too ...

 I can't find any evidence of that in the PEPs. Do you have a reference?

here is something:

http://svn.python.org/view/python/branches/p3yk/Misc/NEWS?rev=56685&view=auto

look for :

- Removed these Python builtins:
  apply(), callable(), coerce(), file(), reduce(), reload()

i.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: zip() function troubles

2007-07-27 Thread Istvan Albert
On Jul 27, 2:16 am, Terry Reedy [EMAIL PROTECTED] wrote:

 References are not objects.

yes this is a valid objection,

but in all fairness the example in the original post operates on
comparably sized objects and also exhibited unexpected performance
degradation

as it turns out the garbage collection is the culprit, I never used to
have to care about gc (and that's great!), but now that I'm
shuffling 1Gb chunks I have to be more careful.

best,

Istvan

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: zip() function troubles

2007-07-27 Thread Istvan Albert
On Jul 27, 1:24 am, Peter Otten [EMAIL PROTECTED] wrote:

 When you are allocating a lot of objects without releasing them the garbage
 collector kicks in to look for cycles. Try switching it off:

 import gc
 gc.disable()

Yes, this solves the problem I was experiencing. Thanks.

Istvan

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: zip() function troubles

2007-07-27 Thread Istvan Albert
On Jul 27, 2:18 pm, Raymond Hettinger [EMAIL PROTECTED] wrote:

 What was really surprising is that it works
 with no issues up until 1 million items

later editing made the sentence more difficult to read
I should have said: What was really surprising is that zip works
with no issues up until 1 million items

It was the zip function (and the garbage collection that it repeatedly
triggers) that cause the problem

best,

Istvan

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: zip() function troubles

2007-07-26 Thread Istvan Albert
On Jul 26, 9:33 pm, Paul Rubin http://[EMAIL PROTECTED] wrote:

 Do a top or vmstat while that is happening and see if you are
 swapping.  You are allocating 10 million ints and 10 million tuple
 nodes, = 20 million objects.  Although, even at 100 bytes per object
 that would be 1GB which would fit in your machine easily.  Is it
 a 64 bit cpu?

we can safely drop the memory limit as being the cause and think about
something else

if you try it yourself  you'll see that it is very easy to generate 10
million tuples,
on my system it takes 3 (!!!) seconds to do the following:

size = 10**7
data = []
for i in range(10):
    x = [ (0,1) ] * size
    data.append( x )

Now it takes over two minutes to do this:

size = 10**7
a = [ 0 ] * size
b = zip(a,a)

the only explanation I can come up with is that the internal
implementation of zip must have some flaws


-- 
http://mail.python.org/mailman/listinfo/python-list


zip() function troubles

2007-07-26 Thread Istvan Albert
Hello all,

I've been debugging the reason for a major slowdown in a piece of
code ... and it turns out that it was the zip function. In the past
the lists that were zipped were reasonably short, but once the size
exceeded 10 million the zip function slowed to a crawl. Note that
there was memory available to store over 100 million items.

Now I know that zip () wastes lots of memory because it copies the
content of the lists, I had used zip to try to trade memory for speed
(heh!) , and now that everything was replaced with izip it works just
fine.  What was really surprising is that it works with no issues up
until 1 million items, but for say 10 million it pretty much goes
nuts. Does anyone know why? Is there some limit that it reaches, or is
there something about the operating system (Vista in this case) that
makes it behave like so?

I've noticed the same kinds of behavior when trying to create very
long lists that should easily fit into memory, yet above a given
threshold I get inexplicable slowdowns. Now that I think about it, is
this something about the way lists grow when expanding them?

and here is the code:

from itertools import izip

BIGNUM = int(1E7)

# let's make a large list
data = range(BIGNUM)

# this works fine (uses about 200 MB and 4 seconds)
s = 0
for x in data:
    s += x
print s


# this works fine, 4 seconds as well
s = 0
for x1, x2 in izip(data, data):
    s += x1
print s


# this takes over 2 minutes! and uses 600 MB of memory
# the memory usage slowly ticks upwards
s = 0
for x1, x2 in zip(data, data):
    s += x1
print s
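
For anyone trying this on Python 3: the built-in zip there is lazy like izip, and the old list-building behavior has to be requested with list. A scaled-down sketch (the size is reduced so it runs quickly, names are illustrative):

```python
size = 10**5
data = list(range(size))

# Lazy: pairs are produced one at a time, like izip in Python 2.
lazy_total = sum(x1 for x1, x2 in zip(data, data))

# Eager: materialize every pair first, like zip in Python 2.
pairs = list(zip(data, data))
eager_total = sum(x1 for x1, x2 in pairs)

print(lazy_total == eager_total, len(pairs))
```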

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: zip() function troubles

2007-07-26 Thread Istvan Albert
On Jul 26, 7:44 pm, Paul Rubin http://[EMAIL PROTECTED] wrote:
 Istvan Albert [EMAIL PROTECTED] writes:
  exceeded 10 million the zip function slowed to a crawl. Note that
  there was memory available to store over 100 million items.

 How many bytes is that?  Maybe the items (heap-allocated boxed
 integers in your code example) are bigger than you expect.

while I don't have an answer to this

the point that I was trying to make is that I'm fairly certain that it
is not a memory issue (some sort of swapping) because the overall
memory usage with the OS included is about 1Gb (out of available 2Gb)

I tested this on a linux server system with 4Gb of RAM

a = [ 0 ] * 10**7

takes milliseconds, but say the

b = zip(a,a)

will take a very long time to finish:

atlas:~$ time python -c "a = [0] * 10**7"

real    0m0.165s
user    0m0.128s
sys     0m0.036s
atlas:~$ time python -c "a = [0] * 10**7; b = zip(a,a)"

real    0m55.150s
user    0m54.919s
sys     0m0.232s
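
One explanation often given for this kind of slowdown is CPython's
cyclic garbage collector being triggered over and over while millions of
tuples are allocated. That is an assumption here, not something
established in this thread. A minimal sketch for testing it, in Python 3
syntax, where zip is lazy and list(zip(...)) stands in for the old zip;
build_pairs is a hypothetical helper:

```python
import gc
import time


def build_pairs(n, disable_gc=False):
    """Build a list of n 2-tuples and return (pairs, elapsed seconds)."""
    a = [0] * n
    was_enabled = gc.isenabled()
    if disable_gc:
        gc.disable()  # switch off the cyclic collector for the timed part
    try:
        start = time.perf_counter()
        pairs = list(zip(a, a))  # roughly what Python 2's zip(a, a) built
        elapsed = time.perf_counter() - start
    finally:
        # Restore the collector only if it was on when we started.
        if was_enabled:
            gc.enable()
    return pairs, elapsed
```

Calling build_pairs(10**7) and build_pairs(10**7, disable_gc=True) and
comparing the two elapsed times shows whether the collector matters on a
given interpreter version; on recent versions the difference may well be
small.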

Istvan



Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-19 Thread Istvan Albert
On May 19, 3:33 am, Martin v. Löwis [EMAIL PROTECTED] wrote:

 That would be invalid syntax since the third line is an assignment
  with target identifiers separated only by spaces.

 Plus, the identifier starts with a number (even though 6 is not DIGIT
 SIX, but FULLWIDTH DIGIT SIX, it's still of category Nd, and can't
 start an identifier).

Actually both of these issues point to the real problem with this PEP.

I knew about them (note that the colon is also missing), but alas I
couldn't fix them.
My editor could not remove a space or add a colon anymore; it
would immediately change the rest of the characters to something
crazy.

(Of course now someone might feel compelled to state that this is an
editor problem, but I digress. The reality is that features need to
adapt to reality; moreover, had I used a different editor I'd still be
unable to write these characters).
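
The rule cited in the quoted text, that a name cannot start with a
character of category Nd, can be checked with the standard library. A
small sketch in Python 3 (str.isidentifier does not exist in the
Python 2 of this thread; describe is a hypothetical helper and the
sample strings are illustrative only):

```python
import unicodedata

FULLWIDTH_SIX = "\uff16"  # FULLWIDTH DIGIT SIX, not the ASCII digit 6


def describe(name):
    """Return (Unicode category of the first character, is it a valid identifier?)."""
    return unicodedata.category(name[0]), name.isidentifier()


if __name__ == "__main__":
    # ascii() keeps the output printable on terminals without Unicode support.
    for candidate in (FULLWIDTH_SIX + "\uc790\ud68c", "\uc790\ud68c", "T\u00fcrstock"):
        print(ascii(candidate), describe(candidate))
```

The first sample starts with the fullwidth digit and is rejected; the
same Hangul syllables without the leading digit, and a name such as
Türstock, are accepted.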

i.



Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Istvan Albert
On May 17, 2:30 pm, Gregor Horvath [EMAIL PROTECTED] wrote:

 Is there any difference for you in debugging this code snippets?

 class Türstock(object):

Of course there is: how do I type the ü? (I can copy/paste, for
example, but that gets old quick.)

But you're making a strawman argument by using extended ASCII
characters that would work anyhow. How about debugging this (I wonder
whether it will even make it through):

class 6자회담관련론조
   6자회 = 0
   6자회담관련 고귀 명=10


(I don't know what it means, I just copied over some words from a
Japanese news site, but the first thing it did was mess up my editor,
which would not type the colon anymore)

i.


Re: Python Web Programming - looking for examples of solid high-traffic sites

2007-05-17 Thread Istvan Albert
On May 16, 5:04 pm, Victor Kryukov [EMAIL PROTECTED] wrote:

 Our main requirement for tools we're going to use is rock-solid
 stability. As one of our team-members puts it, We want to use tools
 that are stable, has many developer-years and thousands of user-years
 behind them, and that we shouldn't worry about their _versions_. The
 main reason for that is that we want to debug our own bugs, but not
 the bugs in our tools.

I think this is a requirement that is pretty much impossible to
satisfy. Only dead frameworks stay the same.  I have yet to see a
framework that did not have incompatible versions.

Django has a very large user base, great documentation and is deployed
for several online news and media sites. It is fast, it's efficient and
is simple to use. Few modern frameworks (in any language) are
comparable, and I have yet to see one that is better.

http://code.djangoproject.com/wiki/DjangoPoweredSites

i.



Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Istvan Albert
On May 16, 11:09 pm, Gregor Horvath [EMAIL PROTECTED] wrote:
 [EMAIL PROTECTED] schrieb:

  On May 16, 12:54 pm, Gregor Horvath [EMAIL PROTECTED] wrote:
  Istvan Albert schrieb:

  So the solution is to forbid Chinese XP ?

Who said anything like that? It's just an example of surprising and
unexpected difficulties that may arise even when doing trivial things,
and that proponents do not seem to want to admit to.

 Should computer programming only be easy accessible to a small fraction
 of privileged individuals who had the luck to be born in the correct
 countries?

 Should the unfounded and maybe xenophilous fear of loosing power and
 control of a small number of those already privileged be a guide for
 development?

Now that right there is your problem. You are reading a lot more into
this than you should. Losing power, xenophilus(?) fear, privileged
individuals...

Just step back and think about it for a second: it's a PEP and people
have different opinions. It is very unlikely that there is some
generic sinister agenda that one must be subscribed to.

i.





Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Istvan Albert
On May 17, 9:07 am, Martin v. Löwis [EMAIL PROTECTED] wrote:

 up. I interviewed about 20 programmers (none of them Python users), and
 most took the position I might not use it myself, but it surely
 can't hurt having it, and there surely are people who would use it.

Typically when you ask people about esoteric features that seemingly
don't affect them but might be useful to someone, the majority will
say yes. It's simply common courtesy; it's not like they have to do
anything.

At the same time it takes some mental effort to analyze and understand
all the implications of a feature, and without taking that effort
something will always beat nothing.

After the first time that your programmer friends need to fix a trivial
bug in a piece of code that does not display correctly in the terminal
I can assure you that their mellow acceptance will turn to something
entirely different.

i.







Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Istvan Albert
As a non-native English speaker,

On May 13, 11:44 am, Martin v. Löwis [EMAIL PROTECTED] wrote:

 - should non-ASCII identifiers be supported? why?

No. I don't think it adds much, I think it will be a little used
feature (as it should be), every python instructor will start their
class by saying here is a feature that you should stay away from
because you never know where your code ends up.

 - would you use them if it was possible to do so? in what cases?

No. The only possible uses I can think of are intentionally
obfuscating code.

Here is something that just happened and relates to this subject: I
had to help a student run some python code on her laptop, she had
Windows XP that hid the extensions. I wanted to set it up such that
the extension is shown. I don't have XP in front of me but when I do
it takes me 15 seconds to do it. Now her Windows was set up with some
Asian fonts (Chinese or Korean, not sure), looked extremely unfamiliar
and I had no idea what the menu systems were. We spent quite a
bit of time figuring out how to accomplish the task. I had her read me
back the options, but something like "hide extensions" comes out quite
a bit different. A surprisingly tedious and frustrating experience.

Anyway, something to keep in mind. In the end features like this may
end up hurting those it was meant to help.

i.



web development with python - comparison

2007-03-26 Thread Istvan Albert

Here is a comprehensive review of python web apps:

http://jesusphreak.infogami.com/blog/vrp1

Since this comes up every so often in this group.

i.



Re: Graphviz Python Binding for Python 2.5 on Windows?

2007-03-06 Thread Istvan Albert
On Mar 5, 5:16 pm, Alex Li [EMAIL PROTECTED] wrote:

 I tried to avoid.  Any suggestions?

try the networkx package, it includes the pygraphviz module that can
generate dot files:

https://networkx.lanl.gov/wiki

Istvan



Re: Graphviz Python Binding for Python 2.5 on Windows?

2007-03-06 Thread Istvan Albert
On Mar 6, 3:18 pm, Istvan Albert [EMAIL PROTECTED] wrote:

 try the networkx package, it includes the pygraphviz module that can
 generate dot files:

 https://networkx.lanl.gov/wiki

Should've checked it before posting; it seems that nowadays it is
actually a separate package:

https://networkx.lanl.gov/wiki/pygraphviz
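
For reference, a dot file is plain text, so for small graphs it can be
produced without any binding at all. A minimal stdlib-only sketch in
Python 3; to_dot is a hypothetical helper and not part of networkx or
pygraphviz:

```python
def to_dot(edges, name="G"):
    """Render an iterable of (source, target) pairs as Graphviz DOT text.

    Assumes node names contain no double quotes; nothing is escaped here.
    """
    lines = ["digraph %s {" % name]
    for source, target in edges:
        # Quote node names so that spaces and punctuation survive.
        lines.append('    "%s" -> "%s";' % (source, target))
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot([("a", "b"), ("b", "c")]))
```

The resulting text can be saved to a file and rendered with the Graphviz
command-line tools, for example: dot -Tpng graph.dot -o graph.png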



Re: threading and multicores, pros and cons

2007-02-14 Thread Istvan Albert
On Feb 14, 1:33 am, Maric Michaud [EMAIL PROTECTED] wrote:

 At this time, it's not easy to explain him that python
 is not flawed compared to Java, and that he will not
 regret his choice in the future.

Database adaptors such as psycopg do release the GIL while connecting
and exchanging data.  Apache's MPM (multi processing module) can run
mod_python and with that multiple python instances as separate
processes thus avoiding the global lock as well.
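
The first point can be illustrated without a database: blocking calls
that release the GIL let threads overlap. In the sketch below
(Python 3), time.sleep stands in for a database round trip, which is an
assumption made to keep the example self-contained; fake_query and
run_queries are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_query(delay):
    """Pretend to wait on a database; time.sleep releases the GIL while blocked."""
    time.sleep(delay)
    return delay


def run_queries(delays):
    """Run all fake queries in threads and return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        results = list(pool.map(fake_query, delays))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_queries([0.25] * 4)
    print("%d queries in %.2f seconds" % (len(results), elapsed))
```

Four 0.25 second waits finish in well under the one second they would
take back to back, because the threads spend their time blocked rather
than holding the lock. CPU-bound pure-Python work would not overlap this
way.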

 plone install up and running, he will immediately compare it to
 J2EE wonder why he should pay a consultant to make it work properly.

I really doubt that any performance difference will be due to the
global interpreter lock. This is not how things work. You most certainly
have far more substantial bottlenecks in each application.

i.
