Re: [Python-Dev] A wordcode-based Python

2009-05-11 Thread Cesare Di Mauro
Hi Collin

On Mon, May 11, 2009 11:14PM, Collin Winter wrote:
> Hi Cesare,
>
> On Mon, May 11, 2009 at 11:00 AM, Cesare Di Mauro
>  wrote:
>> At the last PyCon3 in Italy I presented a new Python implementation,
>> which you'll find at http://code.google.com/p/wpython/
>
> Good to see some more attention on Python performance! There's quite a
> bit going on in your changes; do you have an
> optimization-by-optimization breakdown, to give an idea about how much
> performance each optimization gives?

I have planned that for the next release, which should come out maybe next
week.

I'll introduce some #define and #if directives in the code, so that
only specific optimizations will be enabled.

> Looking over the slides, I see that you still need to implement
> functionality to make test_trace pass, for example; do you have a
> notion of how much performance it will cost to implement the rest of
> Python's semantics in these areas?

Very little. That's because there are only two tests in test_trace that
don't pass.

I think the reason lies in the changes that I made to the loops.
With my code SETUP_LOOP and POP_BREAK are completely
removed, so the code in settrace will fail to recognize the loop and
the virtual machine crashes.

I'll fix it in the second release that I have planned.

> Also, I checked out wpython at head to run Unladen Swallow's
> benchmarks against it, but it refuses to compile with either gcc 4.0.1
> or 4.3.1 on Linux (fails in Python/ast.c). I can send you the build
> failures off-list, if you're interested.
>
> Thanks,
> Collin Winter

I'm very interested, thanks. That's because I worked only on Windows
machines, so I definitely need to test and fix it to let it run on any other
platform.

Cesare
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] A wordcode-based Python

2009-05-11 Thread Cesare Di Mauro
On Mon, May 11, 2009 10:27PM, Antoine Pitrou wrote:

Hi Antoine

> Hi,
>
>> WPython is a re-implementation of (some parts of) Python, which drops
>> support for bytecode in favour of a wordcode-based model (where a word
>> is 16 bits wide).
>
> This is great!
> Have you planned to port it to the py3k branch? Or, at least, to trunk?

That was my idea too, but first I need to take a close look at which parts
of the code changed between 2.6 and 3.0, because I don't know how much work
this "forward" port requires.

> Some opcode and VM optimizations have gone in after 2.6 was released,
> although
> nothing as invasive as you did.

:-D Interesting.

> About the CISC-y instructions, have you tried merging the fast and const
> arrays in frame objects? That way, you need less opcode space (since e.g.
> BINARY_ADD_FAST_FAST will cater for constants as well as local
> variables).
>
> Regards
>
> Antoine.

It's an excellent idea that deserves exploration.

Running my stats tools against all .py files found in Lib and Tools
folders, I discovered that the maximum index used for fast/locals
is 79, and 1853 for constants.

So if I find a way to easily map locals first and constants following
in the same array, your great idea can be implemented saving
A LOT of opcodes and reducing ceval.c source code.
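
As a rough illustration of "locals first, constants following", the sketch
below models a frame whose local variables and constants share one array, so
that a single index selects either kind of operand. The class
MergedFrameSlots, its methods and the binary_add helper are hypothetical
names invented for this example; they are not taken from wpython or
CPython.

```python
class MergedFrameSlots:
    """Toy model of a frame with locals and constants in a single array.

    Hypothetical layout: indexes 0 .. nlocals-1 are local variables,
    indexes nlocals and above are the code object's constants.
    """

    def __init__(self, nlocals, constants):
        self.nlocals = nlocals
        # Locals start out unset (None here); constants follow them.
        self.slots = [None] * nlocals + list(constants)

    def load(self, index):
        # One lookup path serves both locals and constants.
        return self.slots[index]

    def store_local(self, index, value):
        # Only the local part of the array may be written.
        if not 0 <= index < self.nlocals:
            raise IndexError("slot %d is not a local variable" % index)
        self.slots[index] = value


def binary_add(frame, left, right):
    """Single 'add' operation working on locals and constants alike."""
    return frame.load(left) + frame.load(right)


frame = MergedFrameSlots(nlocals=2, constants=(10, 32))
frame.store_local(0, 5)
local_plus_const = binary_add(frame, 0, 2)   # local 0 + constant 10
const_plus_const = binary_add(frame, 2, 3)   # constant 10 + constant 32
```

With this kind of layout one opcode could cover the local/local,
local/constant and constant/constant cases, which is where the saving in
opcode space would come from.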

I'll work on that after the two releases that I planned.

Thanks for your valuable suggestions!

Cesare



Re: [Python-Dev] using help function in Py3k

2009-05-11 Thread s|s
On Tue, May 5, 2009 at 7:13 PM, Daniel Stutzbach
 wrote:
> On Tue, May 5, 2009 at 5:41 AM, s|s  wrote:
>>
>> LookupError: unknown encoding: uft-8
>
> uft-8?
>
> Looks like a variation of Issue 4540 (or a duplicate?  I can't tell)
>

Yes. It is the same issue. I don't think pydoc should be modified. In
my humble opinion tests should exist in /usr/share or /usr/share/doc.



> --
> Daniel Stutzbach, Ph.D.
> President, Stutzbach Enterprises, LLC



-- 
~preet~


Re: [Python-Dev] albatross backup

2009-05-11 Thread skip
Martin> As for volumes to back up: I think /srv needs regular backup.
Martin> Not sure about any of the others 

Backup of /usr/local/spambayes-corpus would be very helpful.

Skip


Re: [Python-Dev] albatross backup

2009-05-11 Thread Martin v. Löwis
[please ignore this message - I sent it to the wrong mailing list]

Regards,
Martin


[Python-Dev] albatross backup

2009-05-11 Thread Martin v. Löwis
Hi Sean,

Can you please set up backups for albatross?

I gave sudo permissions to the "jafo" user, which has
the key j...@guin.tummy.com authorized.

I think the policy now is that root logins to albatross
are not allowed. So what might work is this:

Create an rsyncbackup user, and give it sudo permission
to run rsync (any command line arguments). Put your backup
pubkey into rsyncbackup's authorized_keys.

Could that actually work?

albatross admins: would that be an acceptable setup?

As for volumes to back up: I think /srv needs regular backup.
Not sure about any of the others (and neither sure what your
current strategy is wrt. volumes on the other machines).
Compared to /srv, everything else is peanuts, anyway.

Regards,
Martin

P.S. I have removed ~root/.ssh/authorized_keys. It only
contained my key, and root logins are disallowed, anyway.

P.P.S. You can stop doing regular backups to bag. I think we
should keep the machine on for a little while, then turn
it off and keep it around for a further while, and then return
it to XS4ALL; making a complete dump before returning it.


Re: [Python-Dev] A wordcode-based Python

2009-05-11 Thread Collin Winter
Hi Cesare,

On Mon, May 11, 2009 at 11:00 AM, Cesare Di Mauro
 wrote:
> At the last PyCon3 in Italy I presented a new Python implementation,
> which you'll find at http://code.google.com/p/wpython/

Good to see some more attention on Python performance! There's quite a
bit going on in your changes; do you have an
optimization-by-optimization breakdown, to give an idea about how much
performance each optimization gives?

Looking over the slides, I see that you still need to implement
functionality to make test_trace pass, for example; do you have a
notion of how much performance it will cost to implement the rest of
Python's semantics in these areas?

Also, I checked out wpython at head to run Unladen Swallow's
benchmarks against it, but it refuses to compile with either gcc 4.0.1
or 4.3.1 on Linux (fails in Python/ast.c). I can send you the build
failures off-list, if you're interested.

Thanks,
Collin Winter


Re: [Python-Dev] A wordcode-based Python

2009-05-11 Thread Antoine Pitrou

Hi,

> WPython is a re-implementation of (some parts of) Python, which drops
> support for bytecode in favour of a wordcode-based model (where a word
> is 16 bits wide).

This is great!
Have you planned to port it to the py3k branch? Or, at least, to trunk?
Some opcode and VM optimizations have gone in after 2.6 was released, although
nothing as invasive as you did.

About the CISC-y instructions, have you tried merging the fast and const arrays
in frame objects? That way, you need less opcode space (since e.g.
BINARY_ADD_FAST_FAST will cater for constants as well as local variables).

Regards

Antoine.




Re: [Python-Dev] py3k, cgi, email, and form-data

2009-05-11 Thread MRAB

Robert Brewer wrote:

> There's a major change in functionality in the cgi module between Python
> 2 and Python 3 which I've just run across: the behavior of
> FieldStorage.read_multi, specifically when an HTTP app accepts a file
> upload within a multipart/form-data payload.
>
> In Python 2, each part would be read in sequence within its own
> FieldStorage instance. This allowed file uploads to be shunted to a
> TemporaryFile (via make_file) as needed:
>
>     klass = self.FieldStorageClass or self.__class__
>     part = klass(self.fp, {}, ib,
>                  environ, keep_blank_values, strict_parsing)
>     # Throw first part away
>     while not part.done:
>         headers = rfc822.Message(self.fp)
>         part = klass(self.fp, headers, ib,
>                      environ, keep_blank_values, strict_parsing)
>         self.list.append(part)
>
> In Python 3 (svn revision 72466), the whole request body is read into
> memory first via fp.read(), and then broken into separate parts in a
> second step:
>
>     klass = self.FieldStorageClass or self.__class__
>     parser = email.parser.FeedParser()
>     # Create bogus content-type header for proper multipart parsing
>     parser.feed('Content-Type: %s; boundary=%s\r\n\r\n' % (self.type, ib))
>     parser.feed(self.fp.read())
>     full_msg = parser.close()
>     # Get subparts
>     msgs = full_msg.get_payload()
>     for msg in msgs:
>         fp = StringIO(msg.get_payload())
>         part = klass(fp, msg, ib, environ, keep_blank_values,
>                      strict_parsing)
>         self.list.append(part)
>
> This makes the cgi module in Python 3 somewhat crippled for handling
> multipart/form-data file uploads of any significant size (and since
> the client is the one determining the size, opens a server up for an
> unexpected Denial of Service vector).
>
> I *think* the FeedParser is designed to accept incremental writes,
> but I haven't yet found a way to do any kind of incremental reads
> from it in order to shunt the fp.read out to a tempfile again.
> I'm secretly hoping Barry has a one-liner fix for this. ;)


I think what it needs is for the email.parser.FeedParser class to have
a feed_from_file() method, supported by the class BufferedSubFile.

The BufferedSubFile class keeps an internal list of lines. Perhaps it
could also have a list of files, so that when the list of lines becomes
empty it can continue by reading lines from the files instead, dropping
a file from the list when it reaches the end, something like this:

[Module feedparser.py]
...
class BufferedSubFile(object):
    ...
    def __init__(self):
        # The last partial line pushed into this object.
        self._partial = ''
        # The list of full, pushed lines, in reverse order
        self._lines = []
        # The list of files.
        self._files = []
        ...

    ...
    def readline(self):
        while not self._lines and self._files:
            data = self._files[0].read(MAX_DATA_SIZE)
            if data:
                self.push(data)
            else:
                del self._files[0]
        if not self._lines:
            if self._closed:
                return ''
            return NeedMoreData
        ...

    def push_file(self, data_file):
        """Push some new data from a file into this object."""
        self._files.append(data_file)

    ...


and then:

...
class FeedParser:
    ...
    def feed(self, data):
        """Push more data into the parser."""
        self._input.push(data)
        self._call_parse()

    def feed_from_file(self, data_file):
        """Push more data from a file into the parser."""
        self._input.push_file(data_file)
        self._call_parse()

    ...
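
The fragments above can be tried out in isolation with a small stand-alone
model. The sketch below is not the real email.feedparser.BufferedSubFile:
the class name FileBackedLineBuffer, the NeedMoreData sentinel defined
here, the simplified push() and the value of MAX_DATA_SIZE are all
assumptions made so that the example runs on its own.

```python
import io

NeedMoreData = object()   # stand-in sentinel, defined locally for this sketch
MAX_DATA_SIZE = 8192      # hypothetical read size; the proposal leaves it open


class FileBackedLineBuffer:
    """Simplified line buffer that can also pull its data from files."""

    def __init__(self):
        self._partial = ''    # last incomplete line pushed so far
        self._lines = []      # complete lines, in reverse order
        self._files = []      # files still to be drained, oldest first
        self._closed = False

    def push(self, data):
        """Split pushed text into lines, keeping a trailing partial line."""
        parts = (self._partial + data).splitlines(True)
        self._partial = ''
        if parts and not parts[-1].endswith('\n'):
            self._partial = parts.pop()
        # Reverse order: pop() then returns the oldest line first.
        self._lines[:0] = parts[::-1]

    def push_file(self, data_file):
        """Queue a file; it is only read when the line list runs empty."""
        self._files.append(data_file)

    def close(self):
        """Flush any partial line and mark the end of input."""
        if self._partial:
            self._lines.insert(0, self._partial)
            self._partial = ''
        self._closed = True

    def readline(self):
        # Refill from the queued files only when no lines are buffered.
        while not self._lines and self._files:
            data = self._files[0].read(MAX_DATA_SIZE)
            if data:
                self.push(data)
            else:
                del self._files[0]
        if not self._lines:
            if self._closed:
                return ''
            return NeedMoreData
        return self._lines.pop()
```

The point of the design is that at most MAX_DATA_SIZE characters are read
from a file per refill, so the buffer itself never needs to hold the whole
upload at once.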


[Python-Dev] A wordcode-based Python

2009-05-11 Thread Cesare Di Mauro
At the last PyCon3 in Italy I presented a new Python implementation,
which you'll find at http://code.google.com/p/wpython/

WPython is a re-implementation of (some parts of) Python, which drops
support for bytecode in favour of a wordcode-based model (where a word
is 16 bits wide).
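
To make the idea of a 16-bit word concrete, here is a minimal sketch that
packs one instruction into two bytes. The layout (opcode in the low byte,
argument in the high byte), the helper names encode_word and decode_word,
and the sample opcode number are assumptions chosen for illustration only;
they are not taken from the WPython sources. For comparison, CPython 2.6
bytecode uses one byte for an opcode plus two more bytes when the opcode
takes an argument.

```python
import struct

SAMPLE_OPCODE = 0x7C   # arbitrary sample value, not a real WPython opcode


def encode_word(opcode, arg=0):
    """Pack an 8-bit opcode and an 8-bit argument into one 16-bit word."""
    if not (0 <= opcode <= 0xFF and 0 <= arg <= 0xFF):
        raise ValueError("opcode and arg must each fit in 8 bits")
    # Assumed layout: low byte = opcode, high byte = argument.
    return struct.pack('<H', opcode | (arg << 8))


def decode_word(word):
    """Unpack a 16-bit word back into its (opcode, arg) pair."""
    (value,) = struct.unpack('<H', word)
    return value & 0xFF, value >> 8


word = encode_word(SAMPLE_OPCODE, 3)
opcode, arg = decode_word(word)
```

Every instruction then has the same size, so the interpreter can fetch the
opcode and a small argument in a single 16-bit read.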

It also implements a hybrid stack-register virtual machine, and adds a
lot of other optimizations.

The slides are available in the download area, and explain the concept of
wordcode, also showing how some optimizations work and comparing them with
the current Python (2.6.1).

Unfortunately I did not have time to run extensive benchmarks with real
code, so I've included some that I made with PyStone, PyBench, and a couple
of simple recursive functions (Fibonacci and Factorial).

This is the first release, and two more are scheduled. The first will
make it possible to select (almost) any optimization at compile time, so
that fine-grained tests become possible.

The second will be a rewrite of the constant folding code (specifically
for tuples, lists and dicts), removing a current "hack" in the Python type
system that makes them "hashable" for the constants dictionary used by
compile.c.

Then I'll start writing some documentation that will explain what parts of
code are related to a specific optimization, so that it'll be easier to
create patches for other Python implementations, if needed.

You'll find a bit more information in the "README FIRST!" file in the
project's repository.

I made a lot of changes to the source of Python 2.6.1, so feel free to ask
me for any information about them.

Cheers
Cesare


Re: [Python-Dev] Switchover: mail.python.org

2009-05-11 Thread Jeroen Ruigrok van der Werven
-On [20090511 14:47], Aahz (a...@pythoncraft.com) wrote:
>On Monday 2009-05-11, mail.python.org will be switched to another machine
>starting roughly at 14:00 UTC.  This should be invisible (expected
>downtime is less than ten minutes).

The headers for the python checkins mails are apparently different now. So
people might want to adjust any filtering.

-- 
Jeroen Ruigrok van der Werven  / asmodai
イェルーン ラウフロック ヴァン デル ウェルヴェン
http://www.in-nomine.org/ | http://www.rangaku.org/ | GPG: 2EAC625B
The reverse side also has a reverse side...


Re: [Python-Dev] .pth files are evil

2009-05-11 Thread P.J. Eby

At 04:42 PM 5/9/2009 +0200, Martin v. Löwis wrote:

>> If you always use --single-version-externally-managed with easy_install,
>> it will stop editing .pth files on installation.
>
> It's --multi-version (-m) that does that.
> --single-version-externally-managed is a "setup.py install" option.
>
> Both have the effect of not editing .pth files, but they do so in
> different ways.  The "setup.py install" option causes it to install in a
> distutils-compatible layout, whereas --multi-version simply drops .egg
> files or directories in the target location and leaves it to the user
> (or the generated script wrappers) to add them to sys.path.

Ah, ok. Is there also an easy_install invocation that unpacks the zip
file into some location of sys.path (which then wouldn't require
editing sys.path)?


No; you'd have to use the -e option to easy_install to download and 
extract a source version of the package; then run that package's 
setup.py, e.g.:


   easy_install -eb /some/tmpdir SomeProject
   cd /some/tmpdir/someproject  # subdir is always lowercased/normalized
   setup.py install --single-version-externally-managed --record=...

I suspect that this is basically what pip is doing under the hood, as 
that would explain why it doesn't support .egg files.


I previously posted code to the distutils-sig that was an .egg 
unpacker with appropriate renaming, though.  It was untested, and 
assumes you already checked for collisions in the target directory, 
and that you're handling any uninstall manifest yourself.  It could 
probably be modified to take a filter function, though, something like:


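import os
from setuptools.archive_util import unpack_archive

def flatten_egg(egg_filename, extract_dir, filter=lambda s, d: d):
    # Name of the flattened metadata directory: "<egg filename>-info"
    eggbase = os.path.basename(egg_filename) + '-info'
    def file_filter(src, dst):
        # Relocate EGG-INFO/ contents into the "-info" directory
        if src.startswith('EGG-INFO/'):
            src = eggbase + src[8:]
            dst = os.path.join(extract_dir, *src.split('/'))
        return filter(src, dst)
    return unpack_archive(egg_filename, extract_dir, file_filter)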

Then you could pass in a None-returning filter function to check and 
accumulate collisions and generate a manifest.  A second run with the 
default filter would do the unpacking.
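
A None-returning filter of that kind might look like the sketch below. The
name make_collision_checker and the two list arguments are hypothetical,
introduced only for this example; the filter simply records every
destination path and notes the ones that already exist on disk.

```python
import os


def make_collision_checker(manifest, collisions):
    """Build a dry-run filter: record paths, extract nothing.

    `manifest` and `collisions` are caller-supplied lists (hypothetical
    names) that the returned filter appends to.
    """
    def check(src, dst):
        manifest.append(dst)
        # lexists() also catches dangling symlinks at the destination.
        if os.path.lexists(dst):
            collisions.append(dst)
        # Returning None asks the unpacker to skip this file.
        return None
    return check
```

It would be passed as the third argument on the first run, e.g.
flatten_egg(egg, target, make_collision_checker(manifest, collisions)), and
the real unpacking would only go ahead if collisions stays empty.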


(This function should work with either .egg files or .egg directories 
as input, btw, since unpack_archive treats a directory input as if it 
were an archive.)


Anyway, if you used "easy_install -mxd /some/tmpdir [specs]" to get 
your target eggs found/built, you could then run this flattening 
function (with appropriate filter functions) over the *.egg contents 
of /some/tmpdir to do the actual installation.


(The reason for using -mxd instead of -Zmaxd or -zmaxd is that we 
don't care whether the eggs are zipped or not, and we leave out the 
-a so that dependencies already present on sys.path aren't copied or 
re-downloaded to the target; only dependencies we don't already have 
will get dropped in /some/tmpdir.)


Of course, the devil of this is in the details; to handle conflicts 
and uninstalls properly you would need to know what namespace 
packages were in the eggs you are installing.  But if you don't care 
about blindly overwriting things (as the distutils does not), then 
it's actually pretty easy to make such an unpacker.


I mainly haven't made one myself because I *do* care about things 
being blindly overwritten.




[Python-Dev] py3k, cgi, email, and form-data

2009-05-11 Thread Robert Brewer
There's a major change in functionality in the cgi module between Python
2 and Python 3 which I've just run across: the behavior of
FieldStorage.read_multi, specifically when an HTTP app accepts a file
upload within a multipart/form-data payload.

In Python 2, each part would be read in sequence within its own
FieldStorage instance. This allowed file uploads to be shunted to a
TemporaryFile (via make_file) as needed:

    klass = self.FieldStorageClass or self.__class__
    part = klass(self.fp, {}, ib,
                 environ, keep_blank_values, strict_parsing)
    # Throw first part away
    while not part.done:
        headers = rfc822.Message(self.fp)
        part = klass(self.fp, headers, ib,
                     environ, keep_blank_values, strict_parsing)
        self.list.append(part)

In Python 3 (svn revision 72466), the whole request body is read into
memory first via fp.read(), and then broken into separate parts in a
second step:

    klass = self.FieldStorageClass or self.__class__
    parser = email.parser.FeedParser()
    # Create bogus content-type header for proper multipart parsing
    parser.feed('Content-Type: %s; boundary=%s\r\n\r\n' % (self.type, ib))
    parser.feed(self.fp.read())
    full_msg = parser.close()
    # Get subparts
    msgs = full_msg.get_payload()
    for msg in msgs:
        fp = StringIO(msg.get_payload())
        part = klass(fp, msg, ib, environ, keep_blank_values,
                     strict_parsing)
        self.list.append(part)

This makes the cgi module in Python 3 somewhat crippled for handling
multipart/form-data file uploads of any significant size (and since
the client is the one determining the size, opens a server up for an
unexpected Denial of Service vector).

I *think* the FeedParser is designed to accept incremental writes,
but I haven't yet found a way to do any kind of incremental reads
from it in order to shunt the fp.read out to a tempfile again.
I'm secretly hoping Barry has a one-liner fix for this. ;)
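
For illustration, the "incremental writes" side could look like the sketch
below, which feeds the parser in fixed-size chunks instead of one
fp.read(). The function parse_in_chunks, its parameters and the sample
body are hypothetical. Note that this only bounds the size of each read:
FeedParser still builds the whole parsed message in memory, so on its own
it does not restore the Python 2 behaviour of spooling uploads to a
temporary file.

```python
import io
from email.parser import FeedParser


def parse_in_chunks(fp, content_type, boundary, chunk_size=8192):
    """Feed a text-mode file object to FeedParser one chunk at a time."""
    parser = FeedParser()
    # Same synthetic header as in the Python 3 code quoted above.
    parser.feed('Content-Type: %s; boundary=%s\r\n\r\n'
                % (content_type, boundary))
    while True:
        chunk = fp.read(chunk_size)
        if not chunk:
            break
        parser.feed(chunk)
    return parser.close()


# Tiny two-field multipart body used as sample input.
sample_body = (
    '--XYZ\r\n'
    'Content-Disposition: form-data; name="a"\r\n'
    '\r\n'
    'hello\r\n'
    '--XYZ\r\n'
    'Content-Disposition: form-data; name="b"\r\n'
    '\r\n'
    'world\r\n'
    '--XYZ--\r\n'
)
message = parse_in_chunks(io.StringIO(sample_body),
                          'multipart/form-data', 'XYZ', chunk_size=16)
parts = message.get_payload()
```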


Robert Brewer
fuman...@aminus.org


[Python-Dev] Switchover: mail.python.org

2009-05-11 Thread Aahz
On Monday 2009-05-11, mail.python.org will be switched to another machine
starting roughly at 14:00 UTC.  This should be invisible (expected
downtime is less than ten minutes).
-- 
Aahz (a...@pythoncraft.com)   <*> http://www.pythoncraft.com/

"It is easier to optimize correct code than to correct optimized code."
--Bill Harlan


Re: [Python-Dev] how GNU stow is complementary rather than alternative to distutils

2009-05-11 Thread Giuseppe Ottaviano
Talking of stow, I'll take advantage of this thread to do some shameless
advertising :)
I recently uploaded to PyPI a piece of software of mine, BPT [1], which does
the same symlinking trick as stow, but is written in Python (with a simple
API) and, more importantly, uses another trick to allow relocation of the
installation directory (it creates a semi-isolated environment, similar to
virtualenv).
I find it very convenient when I have to switch between several
versions of the same packages (for example during development), or
have to deploy on the same machine software that needs different
versions of the dependencies.
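
For readers who have not met the symlinking trick, the sketch below shows
the general idea in its simplest form: every file of a package that lives
in its own directory gets a symbolic link at the same relative path under
a shared target directory. This is not BPT's or stow's actual
implementation; the function name stow and its arguments are assumptions
for this example, and real tools also handle conflicts, whole-directory
links and removal.

```python
import os


def stow(package_dir, target_dir):
    """Symlink every file under package_dir into target_dir.

    Returns the list of links created. Raises FileExistsError if a link
    target is already taken (no conflict handling in this sketch).
    """
    package_dir = os.path.abspath(package_dir)
    target_dir = os.path.abspath(target_dir)
    links = []
    for root, dirs, files in os.walk(package_dir):
        rel = os.path.relpath(root, package_dir)
        dest_root = os.path.normpath(os.path.join(target_dir, rel))
        # Mirror the directory structure with real directories.
        os.makedirs(dest_root, exist_ok=True)
        for name in files:
            link = os.path.join(dest_root, name)
            os.symlink(os.path.join(root, name), link)
            links.append(link)
    return links
```

Switching between versions then amounts to removing one package's links
and creating links for another, without touching the package directories
themselves.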


I am planning to write an integration layer with buildout and
easy_install. It should be very easy, since BPT can directly handle
tarballs (and directories, in trunk) which contain a setup.py.


HTH,
Giuseppe

[1] http://pypi.python.org/pypi/bpt
P.S. I was not aware of stow, I'll add it to the references and see if  
there are any features that I can steal


