Re: [Python-Dev] Unicode charmap decoders slow

2005-10-06 Thread Martin v. Löwis
Hye-Shik Chang wrote:
> If the encoding optimization can be easily done in Walter's approach,
> the fastmap codec would be too expensive a way to achieve the objective,
> because we must maintain not only the fastmap but also the charmap for
> backward compatibility.

IMO, whether a new function is added or whether the existing function
becomes polymorphic (depending on the type of table being passed) is
a minor issue. Clearly, the charmap API needs to stay for backwards
compatibility; in terms of code size or maintenance, I would actually
prefer separate functions.

One issue apparently is people tweaking the existing dictionaries,
with additional entries they think belong there. I don't think we
need to preserve compatibility with that approach in 2.5, but I
also think that breakage should be obvious: the dictionary should
either go away completely at run-time, or be stored under a
different name, so that any attempt at modifying the dictionary
gives an exception instead of having no interesting effect.

I envision a layout of the codec files like this:

decoding_dict = ...
decoding_map, encoding_map = codecs.make_lookup_tables(decoding_dict)

I think it should be possible to build efficient tables in a single
pass over the dictionary, so startup time should be fairly small
(given that the dictionaries are currently built incrementally, anyway,
due to the way dictionary literals work).
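The envisioned single-pass table construction might be sketched like this (a hypothetical `make_lookup_tables`; the name and output types follow the layout above, not any existing codecs API — U+FFFE here marks unmapped byte values):

```python
# Hypothetical sketch of the proposed codecs.make_lookup_tables(): a single
# pass over the decoding dict yields both tables. U+FFFE marks unmapped bytes.
def make_lookup_tables(decoding_dict):
    decoding_map = ''.join(
        chr(decoding_dict.get(byte, 0xFFFE)) for byte in range(256))
    encoding_map = {cp: byte for byte, cp in decoding_dict.items()}
    return decoding_map, encoding_map

# Example: a codec mapping byte 0x80 to U+20AC (EURO SIGN)
dec, enc = make_lookup_tables({0x80: 0x20AC})
assert dec[0x80] == '\u20ac' and enc[0x20AC] == 0x80
```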

Regards,
Martin
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Removing the block stack (was Re: PEP 343 and __with__)

2005-10-06 Thread Phillip J. Eby
At 10:09 PM 10/5/2005 -0700, Neal Norwitz wrote:
>I've also been thinking about avoiding tuple creation when calling
>python functions.  The change I have in mind would probably have to
>wait until p3k, but could yield some speed ups.
>
>Warning:  half baked idea follows.

Yeah, I've been baking that idea for a long time, and it's a bit more 
complex than you've suggested, due to generators, sys._getframe(), and 
tracebacks.


>My thoughts are to dynamically allocate the Python stack memory (e.g.,
>void *stack = malloc(128MB)).  Then all calls within each thread use
>their own stack.  So things would be pushed onto the stack as they are
>currently, but we wouldn't need to create a tuple to pass to a
>method; the arguments could just be used directly.  Basically, this more
>closely simulates the way it currently works in hardware.

Actually, Python/ceval.c already skips creating a tuple when calling Python 
functions with a fixed number of arguments (caller and callee) and no cell 
vars (i.e., not a closure).  It copies them straight from the calling frame 
stack to the callee frame's stack.


>This would mean all the PyArg_ParseTuple()s would have to change.  It
>may be possible to fake it out, but I'm not sure it's worth it which
>is why it would be easier to do this for p3k.

Actually, I've been thinking that replacing the arg tuple with a PyObject* 
array would allow us to skip tuple creation when calling C functions, since 
you could just give the C functions a pointer to the arguments on the 
caller's stack.  That would let us get rid of most remaining tuple 
allocations.  I suppose we'd also need an argcount parameter.  The 
old APIs taking tuples for calls could trivially convert the tuples to an 
array pointer and size, then call the new APIs.

Actually, we'd probably have to have a tp_arraycall slot or something, with 
the existing tp_call forwarding to tp_arraycall in most cases, but 
occasionally the reverse.  The tricky part is making sure you don't end up 
with cases where you call a tuple API that converts to an array that then 
turns it back into a tuple!


>The general idea is to allocate the stack in one big hunk and just
>walk up/down it as functions are called/returned.  This only means
>incrementing or decrementing pointers.  This should allow us to avoid
>a bunch of copying and tuple creation/destruction.  Frames would
>hopefully be the same size which would help.  Note that even though
>there is a free list for frames, there could still be
>PyObject_GC_Resize()s often (or unused memory).  With my idea,
>hopefully there would be better memory locality, which could speed
>things up.

Yeah, unfortunately for your idea, generators would have to copy off bits 
of the stack and then copy them back in, making generators slower.  If it 
weren't for that part, the idea would probably be a good one, as arguments, 
locals, cells, and the block and value stacks could all be handled that 
way, with the compiler treating all operations as base-pointer offsets, 
thereby eliminating lots of more-complex pointer management in ceval.c and 
frameobject.c.

Another possible fix for generators would be of course to give them their 
own stack arena, but then you have the problem of needing to copy overflows 
from one such stack to another - at which point you're basically back to 
having frames.

On the other hand, maybe the good part of this idea is just eliminating all 
the pointer fudging and having the compiler determine stack offsets.  Then, 
the frame object layout would just consist of a big hunk of stack space, 
laid out as a PyObject* array.

The main problem with this concept is that it would change the meaning of 
certain opcodes: right now the offsets of free variables in opcodes restart 
the numbering from zero, but this approach would add the number of locals 
to those offsets.



Re: [Python-Dev] Removing the block stack (was Re: PEP 343 and __with__)

2005-10-06 Thread Martin v. Löwis
Neal Norwitz wrote:
> My thoughts are to dynamically allocate the Python stack memory (e.g.,
> void *stack = malloc(128MB)).  Then all calls within each thread use
> their own stack.  So things would be pushed onto the stack as they are
> currently, but we wouldn't need to create a tuple to pass to a
> method; the arguments could just be used directly.  Basically, this more
> closely simulates the way it currently works in hardware.

One issue with argument tuples on the stack (or some sort of stack) is
that functions may hold onto argument tuples longer:

def foo(*args):
 global last_args
 last_args = args

I considered making true tuple objects (i.e. with ob_type etc.) on
the stack, but this possibility breaks that approach.
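Martin's point can be demonstrated at the Python level — the argument tuple outlives the call, so it cannot live in reusable stack memory:

```python
last_args = None

def foo(*args):
    global last_args
    last_args = args   # the argument tuple escapes the call frame

foo(1, 2, 3)
# If args lived in reusable call-stack memory, this tuple would now be
# invalid; as a real heap object it survives the call:
assert last_args == (1, 2, 3)
```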

Regards,
Martin


Re: [Python-Dev] Unicode charmap decoders slow

2005-10-06 Thread Walter Dörwald
Martin v. Löwis wrote:

> Walter Dörwald wrote:
> 
>> OK, here's a patch that implements this enhancement to 
>> PyUnicode_DecodeCharmap(): http://www.python.org/sf/1313939
> 
> Looks nice!
> 
>> Creating the decoding_map as a string should probably be done by 
>> gencodec.py directly. This way the first import of the codec would be 
>> faster too.
> 
> Hmm. How would you represent the string in source code? As a Unicode
> literal? With \u escapes,

Yes, simply by outputting repr(decoding_string).

> or in a UTF-8 source file?

This might get unreadable if your editor can't detect the coding header.

> Or as a UTF-8
> string, with an explicit decode call?

This is another possibility, but it is unreadable too. We might, however, 
add the real codepoints as comments.

> I like the current dictionary style for being readable, as it also
> adds the Unicode character names into comments.

We could use

decoding_string = (
u"\u009c" # 0x0004 -> U+009C: CONTROL
u"\u0009" # 0x0005 -> U+0009: HORIZONTAL TABULATION
...
)

However, the current approach has the advantage that only those byte 
values that differ from the identity mapping have to be specified.
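For reference, in modern Python the C charmap codec accepts exactly this representation — a unicode string as mapping table, where byte value i selects the character at index i (an identity table is used here purely for illustration):

```python
import codecs

# Identity table: byte i decodes to the character with codepoint i.
# A real codec table would put u'\ufffe' at undefined positions.
table = ''.join(chr(i) for i in range(256))
decoded, consumed = codecs.charmap_decode(b"abc", "strict", table)
assert decoded == "abc" and consumed == 3
```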

Bye,
Walter Dörwald


Re: [Python-Dev] unifying str and unicode

2005-10-06 Thread Stephen J. Turnbull
> "M" == "M.-A. Lemburg" <[EMAIL PROTECTED]> writes:

M> From what I've read on the web about the Python Unicode
M> implementation we have one of the better ones compared to other
M> languages implementations and their choices and design
M> decisions.

Yes, indeed!

Speaking-as-a-card-carrying-member-of-the-loyal-opposition-ly y'rs,

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
   Ask not how you can "do" free software business;
  ask what your business can "do for" free software.


Re: [Python-Dev] Removing the block stack

2005-10-06 Thread Michael Hudson
Neal Norwitz <[EMAIL PROTECTED]> writes:

> On 10/5/05, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
>> At 09:50 AM 10/4/2005 +0100, Michael Hudson wrote:
>> >(anyone still thinking about removing the block stack?).
>>
>> I'm not any more.  My thought was that it would be good for performance, by
>> reducing the memory allocation overhead for frames enough to allow pymalloc
>> to be used instead of the platform malloc.
>
> I did something similar to reduce the frame size to under 256 bytes
> (don't recall if I made a patch or not) and it had no overall effect
> on perf.

Hey, me too!  I also came to the same conclusion.

Cheers,
mwh

-- 
  The ultimate laziness is not using Perl.  That saves you so much
  work you wouldn't believe it if you had never tried it.
-- Erik Naggum, comp.lang.lisp


Re: [Python-Dev] Unicode charmap decoders slow

2005-10-06 Thread Walter Dörwald
Martin v. Löwis wrote:

> Hye-Shik Chang wrote:
> 
>> If the encoding optimization can be easily done in Walter's approach,
>> the fastmap codec would be too expensive a way to achieve the objective,
>> because we must maintain not only the fastmap but also the charmap for
>> backward compatibility.
> 
> IMO, whether a new function is added or whether the existing function
> becomes polymorphic (depending on the type of table being passed) is
> a minor issue. Clearly, the charmap API needs to stay for backwards
> compatibility; in terms of code size or maintenance, I would actually
> prefer separate functions.

OK, I can update the patch accordingly. Any suggestions for the name? 
PyUnicode_DecodeCharmapString?

> One issue apparently is people tweaking the existing dictionaries,
> with additional entries they think belong there. I don't think we
> need to preserve compatibility with that approach in 2.5, but I
> also think that breakage should be obvious: the dictionary should
> either go away completely at run-time, or be stored under a
> different name, so that any attempt at modifying the dictionary
> gives an exception instead of having no interesting effect.

IMHO it should be stored under a different name, because there are 
codecs (cp037, koi8_r, iso8859_11) that reuse existing dictionaries.

Or we could have a function that recreates the dictionary from the string.
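Such a recreation function would be straightforward; a hypothetical sketch (assuming U+FFFE marks unmapped byte values in the string table, as in the patch):

```python
# Hypothetical helper: rebuild the legacy decoding dictionary from the
# string table, for code that still pokes at decoding_map directly.
def make_decoding_map(decoding_string):
    return {byte: ord(ch)
            for byte, ch in enumerate(decoding_string)
            if ord(ch) != 0xFFFE}
```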

> I envision a layout of the codec files like this:
> 
> decoding_dict = ...
> decoding_map, encoding_map = codecs.make_lookup_tables(decoding_dict)

Apart from the names (and the fact that encoding_map is still a 
dictionary), that's what my patch does.

> I think it should be possible to build efficient tables in a single
> pass over the dictionary, so startup time should be fairly small
> (given that the dictionaries are currently built incrementally, anyway,
> due to the way dictionary literals work).

Bye,
Walter Dörwald


Re: [Python-Dev] Unicode charmap decoders slow

2005-10-06 Thread M.-A. Lemburg
Walter Dörwald wrote:
> Martin v. Löwis wrote:
> 
>> Hye-Shik Chang wrote:
>>
>>> If the encoding optimization can be easily done in Walter's approach,
>>> the fastmap codec would be too expensive a way to achieve the objective,
>>> because we must maintain not only the fastmap but also the charmap for
>>> backward compatibility.
>>
>>
>> IMO, whether a new function is added or whether the existing function
>> becomes polymorphic (depending on the type of table being passed) is
>> a minor issue. Clearly, the charmap API needs to stay for backwards
>> compatibility; in terms of code size or maintenance, I would actually
>> prefer separate functions.
> 
> 
> OK, I can update the patch accordingly. Any suggestions for the name?
> PyUnicode_DecodeCharmapString?

No, you can factor this part out into a separate C function
- there's no need to add a completely new entry point just
for this optimization. Later on we can then also add support
for compressed tables to the codec in the same way.

>> One issue apparently is people tweaking the existing dictionaries,
>> with additional entries they think belong there. I don't think we
>> need to preserve compatibility with that approach in 2.5, but I
>> also think that breakage should be obvious: the dictionary should
>> either go away completely at run-time, or be stored under a
>> different name, so that any attempt at modifying the dictionary
>> gives an exception instead of having no interesting effect.
> 
> 
> IMHO it should be stored under a different name, because there are
> codecs (cp037, koi8_r, iso8859_11) that reuse existing dictionaries.

Only koi8_u reuses the dictionary from koi8_r - and it's
easy to recreate the codec from a standard mapping file.

> Or we could have a function that recreates the dictionary from the string.

Actually, I'd prefer that these operations be done by the
codec generator script, so that we don't have additional
startup time. The dictionaries should then no longer be
generated; the string form would be emitted instead. I'd like the comments to stay, though.
This can be done like this (using string concatenation
applied by the compiler):

decoding_charmap = (
u'x' # 0x0000 -> 0x0078 LATIN SMALL LETTER X
u'y' # 0x0001 -> 0x0079 LATIN SMALL LETTER Y
...
)

Either way, monkey patching the codec won't work anymore.
Doesn't really matter, though, as this was never officially
supported.

We've always told people to write their own codecs
if they need to modify an existing one and then hook it into
the system using either a new codec search function or by
adding an appropriate alias.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 06 2005)
>>> Python/Zope Consulting and Support ...http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/


::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 


Re: [Python-Dev] Unicode charmap decoders slow

2005-10-06 Thread M.-A. Lemburg
Hye-Shik Chang wrote:
> On 10/6/05, M.-A. Lemburg <[EMAIL PROTECTED]> wrote:
> 
>>Hye-Shik, could you please provide some timeit figures for
>>the fastmap encoding ?
>>

Thanks for the timings.

> (before applying Walter's patch, charmap decoder)
> 
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10';
> u=unicode(s, e)" "s.decode(e)"
> 100 loops, best of 3: 3.35 msec per loop
> 
> (applied the patch, improved charmap decoder)
> 
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10';
> u=unicode(s, e)" "s.decode(e)"
> 1000 loops, best of 3: 1.11 msec per loop
> 
> (the fastmap decoder)
> 
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10_fc';
> u=unicode(s, e)" "s.decode(e)"
> 1000 loops, best of 3: 1.04 msec per loop
> 
> (utf-8 decoder)
> 
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='utf_8'; u=unicode(s,
> e)" "s.decode(e)"
> 1000 loops, best of 3: 851 usec per loop
> 
> Walter's decoder and the fastmap decoder run in mostly the same way,
> so the performance difference is quite minor.  Perhaps the minor
> difference comes from the wrapper function present in each codec;
> the fastmap codec provides functions usable as Codecs.{en,de}code
> directly.
> 
> (encoding, charmap codec)
> 
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10';
> u=unicode(s, e)" "u.encode(e)"
> 100 loops, best of 3: 3.51 msec per loop
> 
> (encoding, fastmap codec)
> 
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10_fc';
> u=unicode(s, e)" "u.encode(e)"
> 1000 loops, best of 3: 536 usec per loop
> 
> (encoding, utf-8 codec)
> 
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='utf_8'; u=unicode(s,
> e)" "u.encode(e)"
> 1000 loops, best of 3: 1.5 msec per loop

I wonder why the UTF-8 codec is slower than the fastmap
codec in this case.

> If the encoding optimization can be easily done in Walter's approach,
> the fastmap codec would be too expensive a way to achieve the objective,
> because we must maintain not only the fastmap but also the charmap for
> backward compatibility.

Indeed. Let's go with a patched charmap codec then.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 06 2005)
>>> Python/Zope Consulting and Support ...http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/


::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 


Re: [Python-Dev] Python 2.5 and ast-branch

2005-10-06 Thread Nick Coghlan
[Brett]
> To answer Nick's email here, I didn't respond to that initial email
> because it seemed specifically directed at Guido and not me.

Fair enough. I think I was actually misremembering the sequence of events 
leading up to 2.4a1, so the question was less appropriate for Guido than I 
thought :)

[Guido]
> On 10/5/05, Guido van Rossum <[EMAIL PROTECTED]> wrote:
>>Given the total lack of response, I have a different suggestion. Let's
>>*abandon* the AST-branch. We're fooling ourselves believing that we
>>can ever switch to that branch, no matter how theoretically better it
>>is.

[Brett]
> Since the original people who have done the majority of the work
> (Jeremy, Tim, Neal, Nick, logistix, and myself) have fallen so far
> behind this probably is not a bad decision.  Obviously I would like to
> see the work pan out, but since I personally just have not found the
> time to shuttle the branch the rest of the way I really am in no
> position to say much in terms of objecting to its demise.

If we kill the branch for now, then anyone that wants to bring up the idea 
again can write a PEP first, not only to articulate the benefits of switching 
to an AST compiler (Jeremy has a few notes scattered around the web on that 
front), but also to propose a solid migration strategy. We tried the "develop 
in parallel, switch when done" approach; it doesn't seem to have worked due to 
the way it split developer effort between the branches, and both the HEAD and 
ast-branch ended up losing out.

> Maybe I can come up with a new design and get my dissertation out of it.  =)

A strategy that may work out better is to develop something independent of the 
Python core that can:

   1. Produce an ASDL-based AST structure from:
      - Python source code
      - CPython 'AST'
      - CPython bytecode
   2. Parse an ASDL-based AST structure and produce:
      - Python source code
      - CPython 'AST'
      - CPython bytecode

That is, initially develop an enhanced replacement for the compiler package, 
rather than aiming directly to replace the actual CPython compiler.
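Much of this round-tripping later became available through the standard `ast` module; a modern sketch of the source -> AST -> bytecode (and back to source) path, offered only as an illustration of the idea:

```python
import ast

src = "def square(x):\n    return x * x\n"
tree = ast.parse(src)                    # source -> AST
code = compile(tree, "<ast>", "exec")    # AST -> bytecode
ns = {}
exec(code, ns)
assert ns["square"](7) == 49
# AST -> source again (ast.unparse, Python 3.9+)
assert "return x * x" in ast.unparse(tree)
```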

Then the folks who want to do serious bytecode hacking can reverse compile the 
bytecode on the fly ;)

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://boredomandlaziness.blogspot.com


Re: [Python-Dev] Unicode charmap decoders slow

2005-10-06 Thread Hye-Shik Chang
On 10/6/05, M.-A. Lemburg <[EMAIL PROTECTED]> wrote:
> Hye-Shik Chang wrote:
> > (encoding, fastmap codec)
> >
> > % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10_fc';
> > u=unicode(s, e)" "u.encode(e)"
> > 1000 loops, best of 3: 536 usec per loop
> >
> > (encoding, utf-8 codec)
> >
> > % ./python Lib/timeit.py -s "s='a'*53*1024; e='utf_8'; u=unicode(s,
> > e)" "u.encode(e)"
> > 1000 loops, best of 3: 1.5 msec per loop
>
> I wonder why the UTF-8 codec is slower than the fastmap
> codec in this case.

I guess that resizing made the difference.  The fastmap encoder doesn't
resize the output buffer at all in this test case, while the UTF-8 encoder
allocates 4*53*1024 bytes and resizes it down to 53*1024 bytes at the end.

Hye-Shik


[Python-Dev] Lexical analysis and NEWLINE tokens

2005-10-06 Thread Matthew F. Barnes
I posted this question to python-help, but I think I have a better chance
of getting the answer here.

I'm looking for clarification on when NEWLINE tokens are generated during
lexical analysis of Python source code.  In particular, I'm confused about
some of the top-level components in Python's grammar (file_input,
interactive_input, and eval_input).

Section 2.1.7 of the reference manual states that blank lines (lines
consisting only of whitespace and possibly a comment) do not generate
NEWLINE tokens.  This is supported by the definition of a suite, which
does not allow for standalone or consecutive NEWLINE tokens.

suite ::= stmt_list NEWLINE | NEWLINE INDENT statement+ DEDENT

Yet the grammar for top-level components seems to suggest that a parsable
input may consist entirely of a single NEWLINE token, or include
consecutive NEWLINE tokens.

file_input ::= (NEWLINE | statement)*
interactive_input ::= [stmt_list] NEWLINE | compound_stmt NEWLINE
eval_input ::= expression_list NEWLINE*

To me this seems to contradict section 2.1.7, insofar as I don't see how
it's possible to generate such a sequence of tokens.

What kind of input would generate NEWLINE tokens in the top-level
components of the grammar?

Matthew Barnes
[EMAIL PROTECTED]


Re: [Python-Dev] Unicode charmap decoders slow

2005-10-06 Thread Walter Dörwald
M.-A. Lemburg wrote:

> [...]
>> Or we could have a function that recreates the dictionary from the string.
> 
> Actually, I'd prefer that these operations be done by the
> codec generator script, so that we don't have additional
> startup time. The dictionaries should then no longer be
> generated; the string form would be emitted instead. I'd like the comments to stay, though.
> This can be done like this (using string concatenation
> applied by the compiler):
> 
> decoding_charmap = (
> u'x' # 0x0000 -> 0x0078 LATIN SMALL LETTER X
> u'y' # 0x0001 -> 0x0079 LATIN SMALL LETTER Y
> ...
> )

I'd prefer that too.

> Either way, monkey patching the codec won't work anymore.
> Doesn't really matter, though, as this was never officially
> supported.
> 
> We've always told people to write their own codecs
> if they need to modify an existing one and then hook it into
> the system using either a new codec search function or by
> adding an appropriate alias.

OK, so can someone update gencodec.py and recreate the charmap codecs?

BTW, is codecs.make_encoding_map part of the official API, or can we 
change it to expect a string instead of a dictionary?

Bye,
Walter Dörwald


Re: [Python-Dev] Unicode charmap decoders slow

2005-10-06 Thread Tony Nelson
At 8:36 AM +0200 10/5/05, Martin v. Löwis wrote:
>Tony Nelson wrote:
 ...
>> Encoding can be made fast using a simple hash table with external chaining.
>> There are max 256 codepoints to encode, and they will normally be well
>> distributed in their lower 8 bits.  Hash on the low 8 bits (just mask), and
>> chain to an area with 256 entries.  Modest storage, normally short chains,
>> therefore fast encoding.
>
>This is what is currently done: a hash map with 256 keys. You are
>complaining about the performance of that algorithm. The issue of
>external chaining is likely irrelevant: there likely are no collisions,
>even though Python uses open addressing.

I think I'm complaining about the implementation, though on decode, not encode.

In any case, there are likely to be collisions in my scheme.  Over the next
few days I will try to do it myself, but I will need to learn Pyrex, some
of the Python C API, and more about Python to do it.
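The proposed scheme — mask the codepoint to its low 8 bits, chain collisions externally — can be sketched at the Python level (all names here are illustrative, not part of any real codec):

```python
def build_encode_buckets(decoding_map):
    # decoding_map: byte value -> Unicode codepoint, as in the charmap codecs
    buckets = [[] for _ in range(256)]
    for byte, cp in decoding_map.items():
        buckets[cp & 0xFF].append((cp, byte))   # hash = low 8 bits
    return buckets

def encode_char(buckets, cp):
    for candidate, byte in buckets[cp & 0xFF]:  # walk the (normally short) chain
        if candidate == cp:
            return byte
    raise UnicodeError("unmapped codepoint: U+%04X" % cp)

# Two codepoints sharing the low byte 0x91 land in the same chain:
buckets = build_encode_buckets({0x41: 0x0391, 0x42: 0x0491})
assert encode_char(buckets, 0x0391) == 0x41
assert encode_char(buckets, 0x0491) == 0x42
```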


>>>...I suggest instead just /caching/ the translation in C arrays stored
>>>with the codec object.  The cache would be invalidated on any write to the
>>>codec's mapping dictionary, and rebuilt the next time anything was
>>>translated.  This would maintain the present semantics, work with current
>>>codecs, and still provide the desired speed improvement.
>
>That is not implementable. You cannot catch writes to the dictionary.

I should have been more clear.  I am thinking about using a proxy object in
the codec's 'encoding_map' and 'decoding_map' slots that will forward all
the dictionary operations.  The proxy will delete the cache on any call that
changes the dictionary contents.  There are proxy classes and dictproxy
(I don't know how it's implemented yet), so it seems doable, at least as far
as I've gotten so far.
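A minimal sketch of such a proxy (hypothetical: only __setitem__ and __delitem__ are shown; update(), pop(), clear(), etc. would need the same invalidation treatment):

```python
class MapProxy(dict):
    """Forwards dict operations; drops the cached table on mutation."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._cache = None
    def __setitem__(self, key, value):
        self._cache = None            # invalidate on write
        super().__setitem__(key, value)
    def __delitem__(self, key):
        self._cache = None
        super().__delitem__(key)
    def decoding_table(self):
        if self._cache is None:       # rebuilt lazily after invalidation
            self._cache = ''.join(
                chr(self.get(i, 0xFFFE)) for i in range(256))
        return self._cache

proxy = MapProxy({0x00: 0x0041})
assert proxy.decoding_table()[0] == 'A'
proxy[0x01] = 0x0042                  # write invalidates the cache
assert proxy.decoding_table()[1] == 'B'
```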


>> Note that this caching is done by new code added to the existing C
>> functions (which, if I have it right, are in unicodeobject.c).  No
>> architectural changes are made; no existing codecs need to be changed;
>> everything will just work
>
>Please try to implement it. You will find that you cannot. I don't
>see how regenerating/editing the codecs could be avoided.

Will do!

TonyN.


Re: [Python-Dev] Lexical analysis and NEWLINE tokens

2005-10-06 Thread Michael Hudson
"Matthew F. Barnes" <[EMAIL PROTECTED]> writes:

> I posted this question to python-help, but I think I have a better chance
> of getting the answer here.
>
> I'm looking for clarification on when NEWLINE tokens are generated during
> lexical analysis of Python source code.  In particular, I'm confused about
> some of the top-level components in Python's grammar (file_input,
> interactive_input, and eval_input).
>
> Section 2.1.7 of the reference manual states that blank lines (lines
> consisting only of whitespace and possibly a comment) do not generate
> NEWLINE tokens.  This is supported by the definition of a suite, which
> does not allow for standalone or consecutive NEWLINE tokens.
>
> suite ::= stmt_list NEWLINE | NEWLINE INDENT statement+ DEDENT

I don't have the spare brain cells to think about your real problem
(sorry) but something to be aware of is that the pseudo EBNF of the
reference manual is purely descriptive -- it is not actually used in
the parsing of Python code at all.  Among other things this means it
could well just be wrong :/

The real grammar is Grammar/Grammar in the source distribution.

Cheers,
mwh

-- 
  The Internet is full.  Go away.
  -- http://www.disobey.com/devilshat/ds011101.htm


Re: [Python-Dev] PEP 343 and __with__

2005-10-06 Thread Guido van Rossum
Just a quick note. Nick convinced me that adding __with__ (without
losing __enter__ and __exit__!) is a good thing, especially for the
decimal context manager. He's got a complete proposal for PEP changes
which he'll post here. After a brief feedback period I'll approve his
changes and he'll check them into the PEP.

My apologies to Jason for missing the point he was making; thanks to
Nick for getting it and turning it into a productive change proposal.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Lexical analysis and NEWLINE tokens

2005-10-06 Thread Guido van Rossum
I think it is a relic from the distant past, when the lexer did
generate NEWLINE for every blank line. I think the only case where you
can still get a NEWLINE by itself is in interactive mode. This code is
extremely convoluted and may be buggy in end cases; this could explain
why you get a continuation prompt after entering a comment in
interactive mode...

--Guido

On 10/6/05, Michael Hudson <[EMAIL PROTECTED]> wrote:
> "Matthew F. Barnes" <[EMAIL PROTECTED]> writes:
>
> > I posted this question to python-help, but I think I have a better chance
> > of getting the answer here.
> >
> > I'm looking for clarification on when NEWLINE tokens are generated during
> > lexical analysis of Python source code.  In particular, I'm confused about
> > some of the top-level components in Python's grammar (file_input,
> > interactive_input, and eval_input).
> >
> > Section 2.1.7 of the reference manual states that blank lines (lines
> > consisting only of whitespace and possibly a comment) do not generate
> > NEWLINE tokens.  This is supported by the definition of a suite, which
> > does not allow for standalone or consecutive NEWLINE tokens.
> >
> > suite ::= stmt_list NEWLINE | NEWLINE INDENT statement+ DEDENT
>
> I don't have the spare brain cells to think about your real problem
> (sorry) but something to be aware of is that the pseudo EBNF of the
> reference manual is purely descriptive -- it is not actually used in
> the parsing of Python code at all.  Among other things this means it
> could well just be wrong :/
>
> The real grammar is Grammar/Grammar in the source distribution.
>
> Cheers,
> mwh
>
> --
>   The Internet is full.  Go away.
>   -- http://www.disobey.com/devilshat/ds011101.htm


--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Lexical analysis and NEWLINE tokens

2005-10-06 Thread Phillip J. Eby
At 07:36 AM 10/6/2005 -0500, Matthew F. Barnes wrote:
>I posted this question to python-help, but I think I have a better chance
>of getting the answer here.
>
>I'm looking for clarification on when NEWLINE tokens are generated during
>lexical analysis of Python source code.

If you're talking about the "tokenize" module, NEWLINE is only generated 
following a logical line, which is one that contains code.
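The distinction is easy to see directly with the tokenize module (a small illustration using the current module API, not from the original thread): blank and comment-only lines produce NL tokens, while only logical lines that contain code end with NEWLINE.

```python
import io
import tokenize

# One logical line, a blank line, a comment-only line, another logical line.
src = "x = 1\n\n# a comment-only line\ny = 2\n"

toks = [tokenize.tok_name[t.type]
        for t in tokenize.generate_tokens(io.StringIO(src).readline)]

print(toks.count("NEWLINE"))  # 2 -- one per logical (code) line
print(toks.count("NL"))       # 2 -- blank line + comment-only line
```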



Re: [Python-Dev] Python 2.5 and ast-branch

2005-10-06 Thread Neil Schemenauer
Nick Coghlan <[EMAIL PROTECTED]> wrote:
> If we kill the branch for now, then anyone that wants to bring up the idea 
> again can write a PEP first

I still have some (very) small hope that it can be finished.  If we
don't get it done soon then I fear that it will never happen.  I had
hoped that a SoC student would pick up the task or someone would ask
for a grant from the PSF.  Oh well.

> A strategy that may work out better is [...]

Another thought I've had recently is that most of the complexity
seems to be in the CST to AST translator.  Perhaps having a parser
that provided a nicer CST might help.

  Neil



Re: [Python-Dev] Python 2.5 and ast-branch

2005-10-06 Thread Guido van Rossum
On 10/6/05, Neil Schemenauer <[EMAIL PROTECTED]> wrote:
> Nick Coghlan <[EMAIL PROTECTED]> wrote:
> > If we kill the branch for now, then anyone that wants to bring up the idea
> > again can write a PEP first
>
> I still have some (very) small hope that it can be finished.  If we
> don't get it done soon then I fear that it will never happen.  I had
> hoped that a SoC student would pick up the task or someone would ask
> for a grant from the PSF.  Oh well.
>
> > A strategy that may work out better is [...]
>
> Another thought I've had recently is that most of the complexity
> seems to be in the CST to AST translator.  Perhaps having a parser
> that provided a nicer CST might help.

Dream on, Neil... Adding more work won't make it more likely to happen.

The only alternative to abandoning it that I see is to merge it back
into main NOW, using the time that remains us until the 2.5 release to
make it robust. That way, everybody can help out (and it may motivate
more people).

Even if this is a temporary regression (e.g. PEP 342), it might be
worth it -- but only if there are at least two people committed to
help out quickly when there are problems.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


[Python-Dev] Pythonic concurrency

2005-10-06 Thread Bruce Eckel
Jeremy Jones published a blog discussing some of the ideas we've
talked about here:
http://www.oreillynet.com/pub/wlg/8002
Although I hope our conversation isn't done, as he suggests!

At some point when more ideas have been thrown about (and TIJ4 is
done) I hope to summarize what we've talked about in an article.

Bruce Eckel    http://www.BruceEckel.com    mailto:[EMAIL PROTECTED]
Contains electronic books: "Thinking in Java 3e" & "Thinking in C++ 2e"
Web log: http://www.artima.com/weblogs/index.jsp?blogger=beckel
Subscribe to my newsletter:
http://www.mindview.net/Newsletter
My schedule can be found at:
http://www.mindview.net/Calendar





Re: [Python-Dev] Pythonic concurrency

2005-10-06 Thread Paolo Invernizzi
Just to add another 2 cents

http://www.erights.org/talks/promises/paper/tgc05.pdf

---
Paolo Invernizzi


Bruce Eckel wrote:
> Jeremy Jones published a blog discussing some of the ideas we've
> talked about here:
> http://www.oreillynet.com/pub/wlg/8002
> Although I hope our conversation isn't done, as he suggests!
> 
> At some point when more ideas have been thrown about (and TIJ4 is
> done) I hope to summarize what we've talked about in an article.



Re: [Python-Dev] Python 2.5 and ast-branch

2005-10-06 Thread Jeremy Hylton
On 10/6/05, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> On 10/6/05, Neil Schemenauer <[EMAIL PROTECTED]> wrote:
> > Nick Coghlan <[EMAIL PROTECTED]> wrote:
> > > If we kill the branch for now, then anyone that wants to bring up the idea
> > > again can write a PEP first
> >
> > I still have some (very) small hope that it can be finished.  If we
> > don't get it done soon then I fear that it will never happen.  I had
> > hoped that a SoC student would pick up the task or someone would ask
> > for a grant from the PSF.  Oh well.
> >
> > > A strategy that may work out better is [...]
> >
> > Another thought I've had recently is that most of the complexity
> > seems to be in the CST to AST translator.  Perhaps having a parser
> > that provided a nicer CST might help.
>
> Dream on, Neil... Adding more work won't make it more likely to happen.

You're both right.  The CST-to-AST translator is fairly complex; it
would be better to parse directly to an AST.  On the other hand, the
AST translator seems fairly complete and not particularly hard to
write.  I'd love to see a new parser in 2.6.

> The only alternative to abandoning it that I see is to merge it back
> into main NOW, using the time that remains us until the 2.5 release to
> make it robust. That way, everybody can help out (and it may motivate
> more people).
>
> Even if this is a temporary regression (e.g. PEP 342), it might be
> worth it -- but only if there are at least two people committed to
> help out quickly when there are problems.

I'm sorry I didn't respond earlier.  I've been home with a new baby
for the last six weeks and haven't been keeping a close eye on my
email.  (I didn't see Nick's earlier email until his most recent
post.)

It would take a few days of work to get the branch ready to merge to
the head.  There are basic issues like renaming newcompile.c to
compile.c and the like.  I could work on that tomorrow and Monday.

I did do a little work on the ast branch earlier this week.  The
remaining issues feel pretty manageable, so you can certainly count me
as one of the two people committed to help out.  I'll make a point of
keeping a closer eye on python-dev email, in addition to writing some
code.

Jeremy


Re: [Python-Dev] Pythonic concurrency

2005-10-06 Thread Michael Sparks
Hi Bruce,


On Thursday 06 October 2005 18:12, Bruce Eckel wrote:
> Although I hope our conversation isn't done, as he suggests!
...
> At some point when more ideas have been thrown about (and TIJ4 is
> done) I hope to summarize what we've talked about in an article.

I don't know if you saw my previous post[1] to python-dev on this topic, but 
Kamaelia is specifically aimed at making concurrency simple and easy to use. 
Initially we were focussed on using scheduled generators for co-operative 
CSP-style (but with buffers) concurrency.
   [1] http://tinyurl.com/dfnah, http://tinyurl.com/e4jfq

We've tested the system so far on 2 relatively inexperienced programmers
(as well as experienced ones, but the more interesting group is novices). The one
who hadn't done much programming at all (a little bit of VB, pre-university)
actually fared better IMO. This is probably because concurrency became
part of his standard toolbox of approaches.

I've placed the slides I've produced for Euro OSCON on Kamaelia here:
   * http://cerenity.org/KamaeliaEuroOSCON2005.pdf

The corrected URL for the whitepaper based on work now 6 months old (we've 
come quite a way since then!) is here:
   * http://www.bbc.co.uk/rd/pubs/whp/whp113.shtml

Consider a simple server for sending text (generated by a user typing into the 
server) to multiple clients connecting to a server. This is a naturally 
concurrent problem in various ways (user interaction, splitting, listening 
for connections, serving connections, etc). Why is that interesting to us? 
It's effectively a microcosm of how subtitling works. (I work at the BBC)

In Kamaelia this looks like this:

=== start ===
class ConsoleReader(threadedcomponent):
    def run(self):
        while 1:
            line = raw_input(">>> ")
            line = line + "\n"
            self.outqueues["outbox"].put(line)

Backplane("subtitles").activate()
pipeline(
    ConsoleReader(),
    publishTo("subtitles"),
).activate()

def subtitles_protocol():
    return subscribeTo("subtitles")

SimpleServer(subtitles_protocol, 5000).run()
=== end ===

The ConsoleReader is threaded to allow the naive way of reading from the
input, whereas the server, backplane (a named splitter component in
practice), pipelines, publishing, subscribing, splitting, etc. all use
single-threaded co-operative concurrency.

A possible client for this text service might be:

pipeline(
TCPClient("subtitles.rd.bbc.co.uk", 5000),
Ticker(),
).run()

(Though that would be a bit bare, even if it does use pygame :)

The entire system is based around communicating generators, but we also
have threads for blocking operations. (Though the entire network subsystem
is non-blocking)
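The communicating-generators idea can be sketched in a few lines (this is an illustration of the general technique only, not Kamaelia's actual API; producer, consumer, and scheduler are all made-up names here):

```python
def producer(outbox):
    # Put three messages into the shared outbox, yielding control each time.
    for i in range(3):
        outbox.append(i)
        yield

def consumer(inbox, seen):
    # Drain whatever has arrived, then yield control back to the scheduler.
    while True:
        while inbox:
            seen.append(inbox.pop(0))
        yield

def scheduler(tasks, steps=10):
    # Round-robin over the generators; drop each one as it finishes.
    tasks = list(tasks)
    for _ in range(steps):
        for t in list(tasks):
            try:
                next(t)
            except StopIteration:
                tasks.remove(t)

box, seen = [], []
scheduler([producer(box), consumer(box, seen)])
print(seen)  # [0, 1, 2]
```

Blocking operations (like raw_input above) are exactly what this style cannot express, which is why a threaded component type is needed alongside it.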

What I'd be interested in, is hearing how our system doesn't match with
the goals of the hypothetical concurrency system you'd like to see (if it
doesn't). The main reason I'm interested in hearing this, is because the
goals you listed are ones we want to achieve. If you don't think our system
matches it (we don't have process migration as yet, so that's one area)
I'd be interested in hearing what areas you think are deficient.

However, the way we're beginning to refer to the project is to refer to
just the component aspect rather than concurrency - for one simple
reason - we're getting to the stage where we can ignore /most/ concurrency
issues (not all).

If you have any time for feedback, it'd be appreciated. If you don't I hope 
it's useful food for thought! 

Best Regards,


Michael
-- 
Michael Sparks, Senior R&D Engineer, Digital Media Group
[EMAIL PROTECTED], http://kamaelia.sourceforge.net/
British Broadcasting Corporation, Research and Development
Kingswood Warren, Surrey KT20 6NP

This e-mail may contain personal views which are not the views of the BBC.


Re: [Python-Dev] Pythonic concurrency

2005-10-06 Thread Bruce Eckel
This does look quite fascinating, and I know there's a lot of really
interesting work going on at the BBC now -- looks like some really
pioneering stuff going on with respect to TV show distribution over
the internet, new compression formats, etc.

So yes indeed, this is quite high on my list to research. Looks like
people there have been doing some interesting work.

Right now I'm just trying to cast a net, so that people can put in
ideas, for when the Java book is done and I can spend more time on it.

Thursday, October 6, 2005, 1:54:56 PM, Michael Sparks wrote:

> Hi Bruce,
>
> [...]
>
> Michael


Bruce Eckel    http://www.BruceEckel.com    mailto:[EMAIL PROTECTED]
Contains electronic books: "Thinking in Java 3e" & "Thinking in C++ 2e"
Web log: http://www.artima.com/weblogs/index.jsp?blogger=beckel
Subscribe to my newsletter:
http://www.mindview.net/Newsletter
My schedule can be found at:
http://www.mindview.net/Calendar





Re: [Python-Dev] Static builds on Windows (continued)

2005-10-06 Thread Marvin
> Date: Wed, 05 Oct 2005 00:21:20 +0200
> From: "Martin v. Löwis" <[EMAIL PROTECTED]>
> Subject: Re: [Python-Dev] Static builds on Windows (continued)
> Cc: [email protected]
> Message-ID: <[EMAIL PROTECTED]>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> 
> Marvin wrote:
> 
>>I built pythoncore and python. The resulting python.exe worked fine, but did
>>indeed fail when I tried to dynamically load anything (Dialog said: the
>>application terminated abnormally)
> 
> 
> Not sure what you are trying to do here. In your case, dynamic loading 
> simply cannot work. The extension modules all link with python24.dll, 
> which you don't have. It may find some python24.dll, which then gives 
> conflicts with the Python interpreter that is already running.
> 
> So what you really should do is disable dynamic loading entirely. To do
> so, remove dynload_win from your project, and #undef 
> HAVE_DYNAMIC_LOADING in PC/pyconfig.h.
> 
> Not sure if anybody has recently tested whether this configuration
> actually works - if you find that it doesn't, please post your patches
> to sf.net/projects/python.
> 
> If you really want to provide dynamic loading of some kind, you should
> arrange the extension modules to import the symbols from your .exe.
> Linking the exe should generate an import library, and you should link
> the extensions against that.
> 
> HTH,
> Martin
> 

I'll try that when I get back to this and feed back my results.  I figured out
that I can avoid the need for dynamic loading.  I wanted to use some existing
extension modules, but the whole point was to use the existing ones which, as
you point out, are linked against a dll.  So even if I created an .EXE that
exported the symbols, I'd still have to rebuild the extensions.


Re: [Python-Dev] Pythonic concurrency

2005-10-06 Thread Josiah Carlson

Michael Sparks <[EMAIL PROTECTED]> wrote:
> What I'd be interested in, is hearing how our system doesn't match with
> the goals of the hypothetical concurrency system you'd like to see (if it
> doesn't). The main reason I'm interested in hearing this, is because the
> goals you listed are ones we want to achieve. If you don't think our system
> matches it (we don't have process migration as yet, so that's one area)
> I'd be interested in hearing what areas you think are deficient.

I've not used the system you have worked on, so perhaps this is easy,
but the vast majority of concurrency issues can be described as fitting
into one or more of the following task distribution categories.

1. one to many (one producer, many consumers) without duplication (no
consumer has the same data, essentially a distributed queue)
2. one to many (one producer, many consumers) with duplication (the
producer broadcasts to all consumers)
3. many to one (many producers, one consumer)
4. many to many (many producers, many consumers) without duplication (no
consumer has the same data, essentially a distributed queue)
5. many to many (many producers, many consumers) with duplication (all
producers broadcast to all consumers)
6. one to one without duplication

MPI, for example, handles all the above cases with minor work, and
tuple space systems such as Linda can support all of the above with a
bit of work in cases 2 and 5.
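For illustration, cases 1 and 2 map onto standard-library threads and queues roughly like this (a generic sketch, independent of Kamaelia, MPI, or Linda; the function names are made up):

```python
import queue
import threading

def one_to_many_queue():
    # Case 1: one producer, many consumers, no duplication --
    # a single shared queue acts as a distributed work queue.
    q = queue.Queue()
    results, lock = [], threading.Lock()

    def consumer():
        while True:
            item = q.get()
            if item is None:  # sentinel: shut this worker down
                break
            with lock:
                results.append(item)

    workers = [threading.Thread(target=consumer) for _ in range(3)]
    for w in workers:
        w.start()
    for i in range(9):
        q.put(i)
    for _ in workers:          # one sentinel per worker
        q.put(None)
    for w in workers:
        w.join()
    return sorted(results)     # every item consumed exactly once

def one_to_many_broadcast():
    # Case 2: one producer, many consumers, with duplication --
    # give each consumer its own queue and put every item into all of them.
    queues = [queue.Queue() for _ in range(3)]
    for item in ("a", "b"):
        for q in queues:
            q.put(item)
    return [[q.get(), q.get()] for q in queues]

print(one_to_many_queue())      # [0, 1, 2, 3, 4, 5, 6, 7, 8]
print(one_to_many_broadcast())  # [['a', 'b'], ['a', 'b'], ['a', 'b']]
```

The many-to-one and many-to-many cases are the same patterns with more producer threads feeding the same queue(s).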

If Kamaelia is able to handle all of the above mechanisms in both a
blocking and non-blocking fashion, then I would guess it has the basic
requirements for most concurrent applications.  If, however, it is not
able to easily handle all of the above mechanisms, or has issues with
blocking and/or non-blocking semantics on the producer and/or consumer
end, then it is likely that it will have difficulty gaining traction in
certain applications where the unsupported mechanism is common and/or
necessary.

One nice thing about the message queue style (which it seems as though
Kamaelia implements) is that it guarantees that a listener won't receive
the same message twice when broadcasting a message to multiple listeners
(case 2 and 5 above) - something that is a bit more difficult to
guarantee in a tuple space scenario, but which is still possible (which
spurs me to add it into my tuple space implementation before it is
released). Another nice thing is that subscriptions to a queue seem to
be persistent in Kamaelia, which I should also implement.
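The difference is easy to see with a toy Linda-style tuple space (a minimal single-threaded sketch; out/in_/rd follow Linda's usual operation names, not any particular Python implementation):

```python
class TupleSpace:
    def __init__(self):
        self.tuples = []

    def out(self, tup):
        # Publish a tuple into the space.
        self.tuples.append(tup)

    def _match(self, pattern, tup):
        # None in the pattern is a wildcard.
        return (len(pattern) == len(tup) and
                all(p is None or p == v for p, v in zip(pattern, tup)))

    def in_(self, pattern):
        # Destructive read: removes the matching tuple from the space.
        for t in self.tuples:
            if self._match(pattern, t):
                self.tuples.remove(t)
                return t
        return None

    def rd(self, pattern):
        # Non-destructive read: the tuple stays for other readers.
        for t in self.tuples:
            if self._match(pattern, t):
                return t
        return None

ts = TupleSpace()
ts.out(("subtitles", "hello"))
# Two consumers doing in_(): only the first one gets the message...
print(ts.in_(("subtitles", None)))  # ('subtitles', 'hello')
print(ts.in_(("subtitles", None)))  # None
# ...whereas rd() would let every reader see it, but then nothing marks
# it consumed -- which is why exactly-once broadcast needs extra machinery.
```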


 - Josiah



Re: [Python-Dev] Python 2.5 and ast-branch

2005-10-06 Thread Kurt B. Kaiser
Jeremy Hylton <[EMAIL PROTECTED]> writes:

> On 10/6/05, Guido van Rossum <[EMAIL PROTECTED]> wrote:
>> The only alternative to abandoning it that I see is to merge it back
>> into main NOW, using the time that remains us until the 2.5 release to
>> make it robust. That way, everybody can help out (and it may motivate
>> more people).
>>
>> Even if this is a temporary regression (e.g. PEP 342), it might be
>> worth it -- but only if there are at least two people committed to
>> help out quickly when there are problems.
>
> I'm sorry I didn't respond earlier.  I've been home with a new baby
> for the last six weeks and haven't been keeping a close eye on my
> email.  (I didn't see Nick's earlier email until his most recent
> post.)
>
> It would take a few days of work to get the branch ready to merge to
> the head.  There are basic issues like renaming newcompile.c to
> compile.c and the like.  I could work on that tomorrow and Monday.

Unless I'm missing something, we would need to merge HEAD to the AST
branch once more to pick up the changes in MAIN since the last merge,
and then make sure everything in the AST branch is passing the test
suite.  Otherwise we risk having MAIN broken for a while following a
merge.

Finally, we can then merge the diff of HEAD to AST back into MAIN.

If we try to merge the entire AST branch since its inception, we will
re-apply to MAIN those changes made in MAIN which have already been
merged to the AST branch and it will be difficult to sort out all the
conflicts.

If we try to merge the AST branch from its last merge tag to its
head we will miss the work done on AST prior to that merge.

Let me know at [EMAIL PROTECTED] if you want to do this.

-- 
KBK


Re: [Python-Dev] Python 2.5 and ast-branch

2005-10-06 Thread Raymond Hettinger
> Unless I'm missing something, we would need to merge HEAD to the AST
> branch once more to pick up the changes in MAIN since the last merge,
> and then make sure everything in the AST branch is passing the test
> suite.  Otherwise we risk having MAIN broken for awhile following a
> merge.

IMO, merging to the head is a somewhat dangerous strategy that doesn't
have any benefits.  Whether done on the head or in the branch, the same
amount of work needs to be done.

If the stability of the head is disrupted, it may impede other
maintenance efforts because it is harder to test bug fixes when the test
suites are not passing. 



Re: [Python-Dev] Pythonic concurrency

2005-10-06 Thread Michael Sparks
On Thursday 06 October 2005 23:15, Josiah Carlson wrote:
[... 6 specific use cases ...]
> If Kamaelia is able to handle all of the above mechanisms in both a
> blocking and non-blocking fashion, then I would guess it has the basic
> requirements for most concurrent applications.

It can. I can easily knock up examples for each if required :-)

That said, a more interesting example implemented this week (as part of
a rapid prototyping project to look at collaborative community radio)
is a networked audio mixer matrix. It allows multiple sources of audio
to be mixed and sent on to multiple destinations, which may receive
duplicate mixes of each other but may also select different mixes. The same
system also includes point-to-point communications for network control of the mix.

That application covers (I /think/) 1, 2, 3, 4, and 6 on your list of
things as I understand what you mean. 5 is fairly trivial though. (The
largest bottleneck in writing it was my personal misunderstanding of
how to actually mix 16bit signed audio :-)

Regarding blocking & non-blocking, links can be marked as synchronous, which
forces blocking-style behaviour. Since generally we're using generators, we
can't block for real, which is why we throw an exception there. However,
threaded components can & do block. The reason for this is that the
architecture was inspired by noting the similarities between asynchronous
hardware systems/languages and network systems.

> into my tuple space implementation before it is released. 

I'd be interested in hearing more about that BTW. One thing we've found is
that, much as organic systems have a neural system for communication between
things (hence Axon :), you also need the equivalent of a hormonal system.
In the unix shell world, IMO the environment acts as that for pipelines, and
similarly that's why we have an assistant system. (Which has key/value lookup
facilities)

It's a less obvious requirement, but a useful one nonetheless, so I don't
really see a message-passing style as excluding a Linda approach - since
they're orthogonal approaches.

Best Regards,


Michael.
-- 
Michael Sparks, Senior R&D Engineer, Digital Media Group
[EMAIL PROTECTED], http://kamaelia.sourceforge.net/
British Broadcasting Corporation, Research and Development
Kingswood Warren, Surrey KT20 6NP

This e-mail may contain personal views which are not the views of the BBC.


Re: [Python-Dev] Python 2.5 and ast-branch

2005-10-06 Thread Guido van Rossum
[Kurt]
> > Unless I'm missing something, we would need to merge HEAD to the AST
> > branch once more to pick up the changes in MAIN since the last merge,
> > and then make sure everything in the AST branch is passing the test
> > suite.  Otherwise we risk having MAIN broken for awhile following a
> > merge.

[Raymond]
> IMO, merging to the head is a somewhat dangerous strategy that doesn't
> have any benefits.  Whether done on the head or in the branch, the same
> amount of work needs to be done.
>
> If the stability of the head is disrupted, it may impede other
> maintenance efforts because it is harder to test bug fixes when the test
> suites are not passing.

Well, at some point it will HAVE to be merged into the head. The
longer we wait the more painful it will be. If we suffer a week of
instability now, I think that's acceptable, as long as all developers
are suitably alerted, and as long as the AST team works towards
resolving the issues ASAP.

I happen to agree with Kurt that we should first merge the head into
the branch; then the AST team can work on making sure the entire test
suite passes; then they can merge back into the head.

BUT this should only be done with a serious commitment from the AST
team (I think Neil and Jeremy are offering this -- I just don't know
how much time they will have available, realistically).

My main point is, we should EITHER abandon the AST branch, OR force a
quick resolution. I'm willing to suffer a week of instability in head
now, or in a week or two -- but I'm not willing to wait again.

Let's draw a line in the sand. The AST team (which includes whoever
will help) has up to three weeks to get the AST branch into a position
where it passes all the current unit tests merged in from the head.
Then they merge it into the head after which we can accept at most a
week of instability in the head. After that the AST team must remain
available to resolve remaining issues quickly.

How does this sound to the non-AST-branch developers who have to
suffer the inevitable post-merge instability? I think it's now or
never -- waiting longer isn't going to make this thing easier (not
with several more language changes approved: with-statement, extended
import, what else...)

What does the AST team think?

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Python 2.5 and ast-branch

2005-10-06 Thread Phillip J. Eby
At 07:34 PM 10/6/2005 -0700, Guido van Rossum wrote:
>How does this sound to the non-AST-branch developers who have to
>suffer the inevitable post-merge instability? I think it's now or
>never -- waiting longer isn't going to make this thing easier (not
>with several more language changes approved: with-statement, extended
>import, what else...)

Do the AST branch changes affect the interface of the "parser" module?  Or 
do they just add new functionality?



Re: [Python-Dev] Python 2.5 and ast-branch

2005-10-06 Thread Brett Cannon
On 10/6/05, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> [...]
>
> Well, at some point it will HAVE to be merged into the head. The
> longer we wait the more painful it will be. If we suffer a week of
> instability now, I think that's acceptable, as long as all developers
> are suitably alerted, and as long as the AST team works towards
> resolving the issues ASAP.
>
> I happen to agree with Kurt that we should first merge the head into
> the branch; then the AST team can work on making sure the entire test
> suite passes; then they can merge back into the head.
>
> BUT this should only be done with a serious commitment from the AST
> team (I think Neil and Jeremy are offering this -- I just don't know
> how much time they will have available, realistically).
>
> My main point is, we should EITHER abandon the AST branch, OR force a
> quick resolution. I'm willing to suffer a week of instability in head
> now, or in a week or two -- but I'm not willing to wait again.
>
> Let's draw a line in the sand. The AST team (which includes whoever
> will help) has up to three weeks to het the AST branch into a position
> where it passes all the current unit tests merged in from the head.
> Then they merge it into the head after which we can accept at most a
> week of instability in the head. After that the AST team must remain
> available to resolve remaining issues quickly.
>

So basically we have until November 1 to get all tests passing?

For anyone who wants a snapshot of where things stand,
http://www.python.org/sf/1191458 lists the tests that are currently
failing (read the comments to get the current list; count is at 14). 
All AST-related tracker items are under the AST group so filtering to
just AST stuff is easy.

I am willing to guess a couple of those tests will start passing as
soon as http://www.python.org/sf/1246473 is dealt with (this is just
based on looking at some of the failure output seeming to be off by
one).  As of right now the lnotab only has statement granularity
when it really needs expression granularity.  That requires tweaking
all instances where an expression node is created to also take in the
line number of where the expression exists.  This fix is one of the
main reasons I have not touched the AST branch; it is not difficult,
but it is not exactly fun or small either.  =)
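Brett's point about the line-number table can be checked from pure Python. This is a small illustration (assuming a reasonably modern CPython, where `dis.findlinestarts()` decodes the code object's line table), not part of the branch work itself:

```python
import dis

def f(x):
    y = x + 1; z = y * 2   # two statements, one source line
    return z

# findlinestarts() decodes the code object's line-number table; each
# entry maps a bytecode offset to a *line*, so the two statements above
# are indistinguishable -- there is no per-expression granularity.
starts = [(off, line) for off, line in dis.findlinestarts(f.__code__)
          if line is not None]
for off, line in starts:
    print(off, line)

# Every recorded line falls within the function's source span.
assert all(line >= f.__code__.co_firstlineno for _, line in starts)
```

Expression-level (column) information simply is not in this table, which is why the branch needs line numbers attached at expression-node creation time.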

> How does this sound to the non-AST-branch developers who have to
> suffer the inevitable post-merge instability? I think it's now or
> never -- waiting longer isn't going to make this thing easier (not
> with several more language changes approved: with-statement, extended
> import, what else...)
>
> What does the AST team think?
>

Well, I have homework this weekend, a midterm two weeks from tomorrow
(so the preceding weekend will be studying), and October 23 is my
birthday so I will be busy that entire weekend visiting family.  In
other words, Python time is at a premium this month.  But I will try to
squeeze in what time I can.

But I think the three week time frame is reasonable to light the fire
under our asses to get this thing done (especially if it inspires
people to jump in and help out; as always, people interested in
joining in, check out the branch and read Python/compile.txt ).

-Brett
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python 2.5 and ast-branch

2005-10-06 Thread Kurt B. Kaiser
Guido van Rossum <[EMAIL PROTECTED]> writes:

> I happen to agree with Kurt that we should first merge the head into
> the branch; then the AST team can work on making sure the entire
> test suite passes; then they can merge back into the head.

I can be available to do this again.  It would involve freezing the
AST branch for a day.

Once the AST branch is stable, we would need to freeze everything,
merge MAIN to AST one more time to pick up the last few changes in
MAIN, and then merge the AST head back to MAIN.

By doing these merges from MAIN to AST we would have effectively moved
the AST branch point along MAIN to HEAD.  So the final join is HEAD to
AST, conducted from MAIN.  I'll run a local experiment to verify this
concept is workable.

-- 
KBK


Re: [Python-Dev] Pythonic concurrency

2005-10-06 Thread Josiah Carlson

Michael Sparks <[EMAIL PROTECTED]> wrote:
> 
> On Thursday 06 October 2005 23:15, Josiah Carlson wrote:
> [... 6 specific use cases ...]
> > If Kamaelia is able to handle all of the above mechanisms in both a
> > blocking and non-blocking fashion, then I would guess it has the basic
> > requirements for most concurrent applications.
> 
> It can. I can easily knock up examples for each if required :-)

That's cool, I trust you.  One thing I notice as absent from the
Kamaelia page is benchmarks.

On the one hand, benchmarks are technically useless, as one can tend to
benchmark those things that a system does well, and ignore those things
that it does poorly (take, for example how PyLinda's speed test only
ever inserts and removes one tuple at a time...try inserting 100k and
use wildcards to extract those 100k, and you'll note how poorly it
performs; or database benchmarks, etc.).  However, if one's benchmarks
provide examples from real use, then it shows that at least someone has
gotten some X performance from the system.

I'm personally interested in latency and throughput for varying sizes of
data being passed through the system.
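Lacking real Kamaelia numbers here, the kind of measurement meant can be sketched with a plain thread-safe queue standing in for a component link. All names and sizes below are illustrative assumptions, not Kamaelia's API:

```python
import queue
import threading
import time

def measure(payload_size, n_msgs=1000):
    """Push n_msgs byte strings of payload_size through a Queue via a
    consumer thread; return (throughput in MB/s, mean latency in seconds)."""
    q = queue.Queue()
    latencies = []

    def consumer():
        for _ in range(n_msgs):
            sent_at, _data = q.get()
            latencies.append(time.perf_counter() - sent_at)

    t = threading.Thread(target=consumer)
    t.start()
    payload = b"x" * payload_size
    start = time.perf_counter()
    for _ in range(n_msgs):
        q.put((time.perf_counter(), payload))
    t.join()
    elapsed = time.perf_counter() - start
    megabytes = payload_size * n_msgs / 1e6
    return megabytes / elapsed, sum(latencies) / len(latencies)

# Vary the message size to see how throughput and latency trade off.
for size in (64, 4096, 65536):
    tput, lat = measure(size)
    print(f"{size:6d} B: {tput:8.1f} MB/s, {lat * 1e6:7.1f} us mean latency")
```

A benchmark like this run against real component links would answer exactly the latency/throughput question above.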


> That said, a more interesting example implemented this week (as part of
> a rapid prototyping project to look at collaborative community radio)
> implements a networked audio mixer matrix. That allows multiple sources of
> audio to be mixed and sent on to multiple destinations, which may receive
> duplicate mixes of each other, but may also select different mixes. The same
> system also includes point-to-point communications for network control of
> the mix.

Very neat.  How much data?  What kind of throughput?  What kinds of
latencies?

> That application covers ( I /think/ ) 1, 2, 3, 4,  and 6 on your list of
> things as I understand what you mean. 5 is fairly trivial though.

Cool.

> Regarding blocking & non-blocking, links can be marked as synchronous, which
> forces blocking-style behaviour. Since we're generally using generators, we
> can't block for real, which is why we throw an exception there. However,
> threaded components can & do block. The reason for this is that the
> architecture was inspired by noting the similarities between asynchronous
> hardware systems/languages and network systems.

On the client side, I was lazy and used synchronous/blocking sockets to
block on read/write (every client thread gets its own connection,
meaning that tuple puts are never sitting in a queue).  I've also got
server-side timeouts for when you don't want to wait too long for data.
rslt = tplspace.get(PATTERN, timeout=None)
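Josiah's `tplspace.get(PATTERN, timeout=None)` is from his own implementation, which isn't shown here. A minimal sketch of a pattern-matched, timeout-capable get could use a condition variable, with `None` as the wildcard; the class and matching rules below are assumptions, not his actual code:

```python
import threading

class TupleSpace:
    """Minimal tuple space: put() stores tuples, get() blocks until a tuple
    matching the pattern appears or the timeout expires.  A pattern is a
    tuple whose elements are concrete values or None (wildcard)."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def put(self, tup):
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def _match(self, pattern):
        for tup in self._tuples:
            if len(tup) == len(pattern) and all(
                p is None or p == v for p, v in zip(pattern, tup)
            ):
                return tup
        return None

    def get(self, pattern, timeout=None):
        """Remove and return a matching tuple, or None on timeout."""
        with self._cond:
            # wait_for() re-checks the predicate on every notify and
            # honours the overall timeout.
            found = self._cond.wait_for(
                lambda: self._match(pattern) is not None, timeout
            )
            if not found:
                return None
            tup = self._match(pattern)
            self._tuples.remove(tup)
            return tup

space = TupleSpace()
space.put(("job", 1, "payload"))
print(space.get(("job", None, None)))        # prints ('job', 1, 'payload')
print(space.get(("job", None, None), 0.01))  # times out, prints None
```

Server-side timeouts of the sort described above fall out of the same `wait_for` call.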


> > into my tuple space implementation before it is released. 
> 
> I'd be interested in hearing more about that BTW. One thing we've found is
> that, much as organic systems have a neural system for communications between
> things (hence Axon :), you also need the equivalent of a hormonal system.
> In the unix shell world, IMO the environment acts as that for pipelines, and
> similarly that's why we have an assistant system. (Which has key/value lookup
> facilities)

I have two recent posts about the performance and features of a (hacked
together) tuple space system I worked on (for two afternoons) in my blog.
"Feel Lucky" for "Josiah Carlson" in google and you will find it.


> It's a less obvious requirement, but is a useful one nonetheless, so I don't
> really see a message passing style as excluding a Linda approach - since
> they're orthogonal approaches.

Indeed.  For me, the idea of being able to toss a tuple into memory
somewhere and being able to find it later maps into my mind as:
('name', arg1, ...) -> name(arg1, ...), which is, quite literally, an
RPC semantic (which seems a bit more natural to me than subscribing to
the 'name' queue).  With the ability to send to either single or
multiple listeners, you get message passing, broadcast messages, and a
standard job/result queueing semantic. The only thing that it is missing
is a prioritization mechanism (fifo, numeric priority, etc.), which
would get us a job scheduling kernel. Not bad for a "message
passing"/"tuple space"/"IPC" library.  (all of the above described have
direct algorithms for implementation).


 - Josiah



Re: [Python-Dev] Python 2.5 and ast-branch

2005-10-06 Thread Michael Hudson
Guido van Rossum <[EMAIL PROTECTED]> writes:

> How does this sound to the non-AST-branch developers who have to
> suffer the inevitable post-merge instability? I think it's now or
> never -- waiting longer isn't going to make this thing easier (not
> with several more language changes approved: with-statement, extended
> import, what else...)

It sounds OK to me.

Cheers,
mwh

-- 
  To summarise the summary of the summary:- people are a problem.
   -- The Hitch-Hikers Guide to the Galaxy, Episode 12