Re: [Python-Dev] Path object design

2006-11-03 Thread Andrew Dalke
glyph:
> Path manipulation:
>
>  * This is confusing as heck:
>>>> os.path.join("hello", "/world")
>'/world'
>>>> os.path.join("hello", "slash/world")
>'hello/slash/world'
>>>> os.path.join("hello", "slash//world")
>'hello/slash//world'
>Trying to formulate a general rule for what the arguments to os.path.join
> are supposed to be is really hard.  I can't really figure out what it would
> be like on a non-POSIX/non-win32 platform.

Made trickier by the similar yet different behaviour of urlparse.urljoin.

 >>> import urlparse
 >>> urlparse.urljoin("hello", "/world")
 '/world'
 >>> urlparse.urljoin("hello", "slash/world")
 'slash/world'
 >>> urlparse.urljoin("hello", "slash//world")
 'slash//world'
 >>>

It does not make sense to me that these should be different.

   Andrew
   [EMAIL PROTECTED]

[Apologies to glyph for the dup; mixed up the reply-to.  Still getting
used to gmail.]
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Path object design

2006-11-03 Thread Steve Holden
Andrew Dalke wrote:
> glyph:
> 
>>Path manipulation:
>>
>> * This is confusing as heck:
>>   >>> os.path.join("hello", "/world")
>>   '/world'
>>   >>> os.path.join("hello", "slash/world")
>>   'hello/slash/world'
>>   >>> os.path.join("hello", "slash//world")
>>   'hello/slash//world'
>>   Trying to formulate a general rule for what the arguments to os.path.join
>>are supposed to be is really hard.  I can't really figure out what it would
>>be like on a non-POSIX/non-win32 platform.
> 
> 
> Made trickier by the similar yet different behaviour of urlparse.urljoin.
> 
>  >>> import urlparse
>  >>> urlparse.urljoin("hello", "/world")
>  '/world'
>  >>> urlparse.urljoin("hello", "slash/world")
>  'slash/world'
>  >>> urlparse.urljoin("hello", "slash//world")
>  'slash//world'
>  >>>
> 
> It does not make sense to me that these should be different.
> 
Although the last two smell like bugs, the point of urljoin is to make 
an absolute URL from an absolute ("current page") URL and a relative 
(link) one. As we see:

  >>> urljoin("/hello", "slash/world")
'/slash/world'

and

  >>> urljoin("http://localhost/hello", "slash/world")
'http://localhost/slash/world'

but

  >>> urljoin("http://localhost/hello/", "slash/world")
'http://localhost/hello/slash/world'
  >>> urljoin("http://localhost/hello/index.html", "slash/world")
'http://localhost/hello/slash/world'
  >>>

I think we can probably conclude that this is what's supposed to happen. 
In the case of urljoin the first argument is interpreted as referencing 
an existing resource and the second as a link such as might appear in 
that resource.

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd  http://www.holdenweb.com
Skype: holdenweb   http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden



Re: [Python-Dev] Path object design

2006-11-03 Thread Fredrik Lundh
Steve Holden wrote:

> Although the last two smell like bugs, the point of urljoin is to make 
> an absolute URL from an absolute ("current page") URL

also known as a base URL:

 http://www.w3.org/TR/html4/struct/links.html#h-12.4.1

(os.path.join's behaviour is also well-defined, btw; if any component is 
an absolute path, all preceding components are ignored.)
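That rule is easy to check interactively (using posixpath, the POSIX flavour of os.path, so the behaviour is the same on any platform):

```python
import posixpath  # os.path on POSIX systems

# Any absolute component discards everything before it:
assert posixpath.join("hello", "/world") == "/world"
assert posixpath.join("a", "/b", "c") == "/b/c"
assert posixpath.join("a", "b", "/c", "d") == "/c/d"
```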





Re: [Python-Dev] Path object design

2006-11-03 Thread Martin v. Löwis
Andrew Dalke schrieb:
>  >>> import urlparse
>  >>> urlparse.urljoin("hello", "/world")
>  '/world'
>  >>> urlparse.urljoin("hello", "slash/world")
>  'slash/world'
>  >>> urlparse.urljoin("hello", "slash//world")
>  'slash//world'
>  >>>
> 
> It does not make sense to me that these should be different.

Just in case this isn't clear from Steve's and Fredrik's
post: The behaviour of this function is (or should be)
specified by an IETF RFC. If somebody finds that non-intuitive,
that's likely because their mental model of relative URIs
deviates from the RFC's model.

Of course, there is also the chance that the implementation
deviates from the RFC; that would be a bug.

Regards,
Martin


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-03 Thread Scott Dial
Travis Oliphant wrote:
> Paul Moore wrote:
>> Enough of the abstract. As a concrete example, suppose I have a (byte)
>> string in my program containing some binary data - an ID3 header, or a
>> TCP packet, or whatever. It doesn't really matter. Does your proposal
>> offer anything to me in how I might manipulate that data (assuming I'm
>> not using NumPy)? (I'm not insisting that it should, I'm just trying
>> to understand the scope of the PEP).
>>
> 
> What do you mean by "manipulate the data."  The proposal for a 
> data-format object would help you describe that data in a standard way 
> and therefore share that data between several library that would be able 
> to understand the data (because they all use and/or understand the 
> default Python way to handle data-formats).
> 

Perhaps the most relevant thing to pull from this conversation is back 
to what Martin has asked about before: "flexible array members". A TCP 
packet has no defined length (there isn't even a header field in the 
packet for this, so in fairness we can talk about IP packets which do). 
There is no way for me to describe this with the pre-PEP data-formats.

I feel like it is misleading of you to say "it's up to the package to do 
manipulations," because you glossed over the fact that you can't even 
describe this type of data. ISTM that you're only interested in 
describing repetitious fixed-structure arrays. If we are going to have a 
"default Python way to handle data-formats", then don't you feel like 
this falls short of the mark?

I fear that you speak about this in too grandiose terms and are now 
trapped by people asking, "well, can I do this?" I think for a lot of 
folks the answer is: "nope." With respect to the network packets, this 
PEP doesn't do anything to fix the communication barrier. Is this not in 
the scope of "a consistent and standard way to discuss the format of 
binary data" (which is what your PEP's abstract sets out as the task)?

-- 
Scott Dial
[EMAIL PROTECTED]
[EMAIL PROTECTED]


Re: [Python-Dev] Path object design

2006-11-03 Thread Andrew Dalke
Martin:
> Just in case this isn't clear from Steve's and Fredrik's
> post: The behaviour of this function is (or should be)
> specified by an IETF RFC. If somebody finds that non-intuitive,
> that's likely because their mental model of relative URIs
> deviates from the RFC's model.

I didn't realize that urljoin is only supposed to be used
with a base URL, where "base URL" (used in the docstring) has
a specific requirement that it be absolute.

Instead I saw the word "join" and figured it should do roughly
the same thing as os.path.join.


>>> import urlparse
>>> urlparse.urljoin("file:///path/to/hello", "slash/world")
'file:///path/to/slash/world'
>>> urlparse.urljoin("file:///path/to/hello", "/slash/world")
'file:///slash/world'
>>> import os
>>> os.path.join("/path/to/hello", "slash/world")
'/path/to/hello/slash/world'
>>>

It does not.  My intuition, nowadays highly influenced by URLs, is that
with a couple of hypothetical functions for going between filenames and URLs:

os.path.join(absolute_filename, filename)
   ==
file_url_to_filename(urlparse.urljoin(
 filename_to_file_url(absolute_filename),
 filename_to_file_url(filename)))

which is not the case.  os.path.join assumes the base is a directory
name when used in a join: "inserting '/' as needed" while RFC
1808 says

   The last segment of the base URL's path (anything
   following the rightmost slash "/", or the entire path if no
   slash is present) is removed

Is my intuition wrong in thinking those should be the same?

I suspect it is. I've been very glad that when I ask for a directory
name I don't need to check that it ends with a "/".  Urljoin's
behaviour is correct for what it's doing.  os.path.join is better for
what it's doing.  (And about once a year I manually verify the
difference because I get unsure.)
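The disagreement shows up even on bare paths, without any hypothetical URL/filename conversion functions (urljoin is spelled urlparse.urljoin in Python 2, urllib.parse.urljoin in Python 3):

```python
import posixpath
from urllib.parse import urljoin  # urlparse.urljoin in Python 2

base, rel = "/path/to/hello", "slash/world"

# os.path.join treats the base as a directory:
assert posixpath.join(base, rel) == "/path/to/hello/slash/world"

# urljoin replaces the base's last segment:
assert urljoin(base, rel) == "/path/to/slash/world"
```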

I think these should not share the "join" in the name.

If urljoin is not meant for relative base URLs, should it
raise an exception when misused? Hmm, though the RFC
algorithm does not have a failure mode and the result may
be a relative URL.

Consider

>>> urlparse.urljoin("http://blah.com/a/b/c", "..")
'http://blah.com/a/'
>>> urlparse.urljoin("http://blah.com/a/b/c", "../")
'http://blah.com/a/'
>>> urlparse.urljoin("http://blah.com/a/b/c", "../..")
'http://blah.com/'
>>> urlparse.urljoin("http://blah.com/a/b/c", "../../")
'http://blah.com/'
>>> urlparse.urljoin("http://blah.com/a/b/c", "../../..")
'http://blah.com/'
>>> urlparse.urljoin("http://blah.com/a/b/c", "../../../")
'http://blah.com/../'
>>> urlparse.urljoin("http://blah.com/a/b/c", "../../../..")  # What?!
'http://blah.com/'
>>> urlparse.urljoin("http://blah.com/a/b/c", "../../../../")
'http://blah.com/../../'
>>>
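For comparison, RFC 3986 (which later obsoleted RFC 1808's resolution rules) specifies a "remove_dot_segments" step that drops excess ".." segments at the root instead of leaking them into the result. A condensed sketch, applied to the merged path (base path minus its last segment, plus the relative part); trailing lone "." / ".." handling is simplified here:

```python
def remove_dot_segments(path):
    # Rough sketch of RFC 3986 section 5.2.4; not the stdlib algorithm.
    output = []
    for seg in path.split("/"):
        if seg == "..":
            # Pop a real segment if there is one; at the root, discard.
            if output and output[-1] != "":
                output.pop()
        elif seg != ".":
            output.append(seg)
    return "/".join(output)

# Merged path for base "http://blah.com/a/b/c" and "../../../../":
assert remove_dot_segments("/a/b/../../../../") == "/"
assert remove_dot_segments("/a/b/../") == "/a/"
```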


> Of course, there is also the chance that the implementation
> deviates from the RFC; that would be a bug.

The comment in urlparse

# XXX The stuff below is bogus in various ways...

is ever so reassuring.  I suspect there's a bug given the
previous code.  Or I've a bad mental model.  ;)

Andrew
[EMAIL PROTECTED]


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-11-03 Thread Travis Oliphant

>
> Perhaps the most relevant thing to pull from this conversation is back 
> to what Martin has asked about before: "flexible array members". A TCP 
> packet has no defined length (there isn't even a header field in the 
> packet for this, so in fairness we can talk about IP packets which 
> do). There is no way for me to describe this with the pre-PEP 
> data-formats.
>
> I feel like it is misleading of you to say "it's up to the package to 
> do manipulations," because you glossed over the fact that you can't 
> even describe this type of data. ISTM that you're only interested in 
> describing repetitious fixed-structure arrays. 
Yes, that's right.  I'm only interested in describing binary data with a 
fixed length.  Others can help push it farther than that (if they even 
care).

> If we are going to have a "default Python way to handle data-formats", 
> then don't you feel like this falls short of the mark?
Not for me.   We can fix what needs fixing, but not if we can't get out 
of the gate.
>
> I fear that you speak about this in too grandiose terms and are now 
> trapped by people asking, "well, can I do this?" I think for a lot of 
> folks the answer is: "nope." With respect to the network packets, this 
> PEP doesn't do anything to fix the communication barrier.

Yes it could if you were interested in pushing it there.   No, I didn't 
solve that particular problem with the PEP (because I can only solve the 
problems I'm aware of), but I do think the problem could be solved.   We 
have far too many nay-sayers on this list, I think.

Right now, I don't have time to push this further.  My real interest is 
the extended buffer protocol.  I want something that works for that.  
When I do have time again to discuss it again, I might come back and 
push some more. 

But, not now.

-Travis





Re: [Python-Dev] Path object design

2006-11-03 Thread Phillip J. Eby
At 01:56 AM 11/4/2006 +0100, Andrew Dalke wrote:
>os.path.join assumes the base is a directory
>name when used in a join: "inserting '/' as needed" while RFC
>1808 says
>
>The last segment of the base URL's path (anything
>following the rightmost slash "/", or the entire path if no
>slash is present) is removed
>
>Is my intuition wrong in thinking those should be the same?

Yes.  :)

Path combining and URL absolutization(?) are inherently different 
operations with only superficial similarities.  One reason for this is that 
a trailing / on a URL has an actual meaning, whereas in filesystem paths a 
trailing / is an aberration and likely an actual error.

The path combining operation says, "treat the following as a subpath of the 
base path, unless it is absolute".  The URL normalization operation says, 
"treat the following as a subpath of the location the base URL is 
*contained in*".

Because of this, os.path.join assumes a path with a trailing separator is 
equivalent to a path without one, since that is the only reasonable way to 
interpret treating the joined path as a subpath of the base path.
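The two interpretations are easy to see side by side (urljoin under its modern urllib.parse name):

```python
import posixpath
from urllib.parse import urljoin  # urlparse.urljoin in Python 2

# os.path.join treats a base with and without a trailing separator alike:
assert posixpath.join("base", "sub") == "base/sub"
assert posixpath.join("base/", "sub") == "base/sub"

# urljoin does not: without the trailing slash, the last segment of
# the base is what gets replaced, not appended to.
assert urljoin("http://host/base", "sub") == "http://host/sub"
assert urljoin("http://host/base/", "sub") == "http://host/base/sub"
```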

But for a URL join, the path /foo and the path /foo/ are not only 
*different paths* referring to distinct objects, but the operation wants to 
refer to the *container* of the referenced object.  /foo might refer to a 
directory, while /foo/ refers to some default content (e.g. 
index.html).  This is actually why Apache normally redirects you from /foo 
to /foo/ before it serves up the index.html; relative URLs based on a base 
URL of /foo won't work right.

The URL approach is designed to make peer-to-peer linking in a given 
directory convenient.  Instead of referring to './foo.html' (as one would 
have to do with filenames), you can simply refer to 'foo.html'.  But the 
cost of saving those characters in every link is that joining always takes 
place on the parent, never the tail-end.  Thus directory URLs normally end 
in a trailing /, and most tools tend to automatically redirect when 
somebody leaves it off.  (Because otherwise the links would be wrong.)



Re: [Python-Dev] Path object design

2006-11-03 Thread Steve Holden
Phillip J. Eby wrote:
> At 01:56 AM 11/4/2006 +0100, Andrew Dalke wrote:
> 
>>os.path.join assumes the base is a directory
>>name when used in a join: "inserting '/' as needed" while RFC
>>1808 says
>>
>>   The last segment of the base URL's path (anything
>>   following the rightmost slash "/", or the entire path if no
>>   slash is present) is removed
>>
>>Is my intuition wrong in thinking those should be the same?
> 
> 
> Yes.  :)
> 
> Path combining and URL absolutization(?) are inherently different 
> operations with only superficial similarities.  One reason for this is that 
> a trailing / on a URL has an actual meaning, whereas in filesystem paths a 
> trailing / is an aberration and likely an actual error.
> 
> The path combining operation says, "treat the following as a subpath of the 
> base path, unless it is absolute".  The URL normalization operation says, 
> "treat the following as a subpath of the location the base URL is 
> *contained in*".
> 
> Because of this, os.path.join assumes a path with a trailing separator is 
> equivalent to a path without one, since that is the only reasonable way to 
> interpret treating the joined path as a subpath of the base path.
> 
> But for a URL join, the path /foo and the path /foo/ are not only 
> *different paths* referring to distinct objects, but the operation wants to 
> refer to the *container* of the referenced object.  /foo might refer to a 
> directory, while /foo/ refers to some default content (e.g. 
> index.html).  This is actually why Apache normally redirects you from /foo 
> to /foo/ before it serves up the index.html; relative URLs based on a base 
> URL of /foo won't work right.
> 
> The URL approach is designed to make peer-to-peer linking in a given 
> directory convenient.  Instead of referring to './foo.html' (as one would 
> have to do with filenames), you can simply refer to 'foo.html'.  But the 
> cost of saving those characters in every link is that joining always takes 
> place on the parent, never the tail-end.  Thus directory URLs normally end 
> in a trailing /, and most tools tend to automatically redirect when 
> somebody leaves it off.  (Because otherwise the links would be wrong.)
> 
Having said this, Andrew *did* demonstrate quite convincingly that the 
current urljoin has some fairly egregious directory traversal glitches. 
Is it really right to punt obvious gotchas like

 >>>urlparse.urljoin("http://blah.com/a/b/c", "../../../../")

'http://blah.com/../../'

 >>>

to the server?

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd  http://www.holdenweb.com
Skype: holdenweb   http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden



Re: [Python-Dev] Path object design

2006-11-03 Thread Nick Coghlan
Steve Holden wrote:
> Having said this, Andrew *did* demonstrate quite convincingly that the 
> current urljoin has some fairly egregious directory traversal glitches. 
> Is it really right to punt obvious gotchas like
> 
>  >>>urlparse.urljoin("http://blah.com/a/b/c", "../../../../")
> 
> 'http://blah.com/../../'
> 
>  >>>
> 
> to the server?

See Paul Jimenez's thread about replacing urlparse with something better. The 
current module has some serious issues :)

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


[Python-Dev] Status of pairing_heap.py?

2006-11-03 Thread Paul Chiusano
I was looking for a good pairing_heap implementation and came across
one that had apparently been checked in a couple years ago (!). Here
is the full link:

http://svn.python.org/view/sandbox/trunk/collections/pairing_heap.py?rev=40887&view=markup

I was just wondering about the status of this implementation. The api
looks pretty good to me -- it's great that the author decided to have
the insert method return a node reference which can then be passed to
delete and adjust_key. It's a bit of a pain to implement that
functionality, but it's extremely useful for a number of applications.

If that project is still alive, I have a couple api suggestions:

* Add a method which nondestructively yields the top K elements of the
heap. This would work by popping the top k elements of the heap into a
list, then reinserting those elements in reverse order. By reinserting
the sorted elements in reverse order, the top of the heap is
essentially a sorted linked list, so if the exact operation is
repeated again, the removals take constant time rather than amortized
logarithmic time.
  * So, for example: if we have a min heap, the topK method would pop
K elements from the heap, say they are {1, 3, 5, 7}, then do
insert(7), followed by insert(5), ... insert(1).
  * Even better might be if this operation avoided having to allocate
new heap nodes, and just reused the old ones.
 * I'm not sure if adjust_key should throw an exception if the key
adjustment is in the wrong direction. Perhaps it should just fall back
on deleting and reinserting that node?
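The pop-and-reinsert part of the topK suggestion can be sketched with the stdlib heapq module (a hypothetical stand-in, since heapq has no node handles and the constant-time repeat-pop benefit described above is specific to pairing heaps):

```python
import heapq

def top_k(heap, k):
    """Return the k smallest items without permanently removing them."""
    # Pop the k smallest, then push them back in reverse (descending)
    # order, as the suggestion describes.
    popped = [heapq.heappop(heap) for _ in range(min(k, len(heap)))]
    for item in reversed(popped):
        heapq.heappush(heap, item)
    return popped

h = [7, 1, 5, 3]
heapq.heapify(h)
assert top_k(h, 3) == [1, 3, 5]
assert top_k(h, 3) == [1, 3, 5]  # heap contents are unchanged
```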

Paul


Re: [Python-Dev] [Tracker-discuss] Getting Started

2006-11-03 Thread Stefan Seefeld
Brett Cannon wrote:
> On 11/1/06, Stefan Seefeld <[EMAIL PROTECTED]> wrote:

>> Right. Brett, do we need accounts on python.org for this ?
> 
> 
> Yep.  It just requires SSH 2 keys from each of you.  You can then email
> python-dev with those keys and your first.last name and someone there will
> install the keys for you.

My key is at http://www3.sympatico.ca/seefeld/ssh.txt, I'm Stefan Seefeld.

Thanks !

Stefan

-- 

  ...ich hab' noch einen Koffer in Berlin...


Re: [Python-Dev] [Tracker-discuss] Getting Started

2006-11-03 Thread Erik Forsberg
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

"Brett Cannon" <[EMAIL PROTECTED]> writes:

> On 11/1/06, Stefan Seefeld <[EMAIL PROTECTED]> wrote:
>>
>> Brett Cannon wrote:
>> > On 11/1/06, Stefan Seefeld <[EMAIL PROTECTED]> wrote:
>>
>> >> Right. Brett, do we need accounts on python.org for this ?
>> >
>> >
>> > Yep.  It just requires SSH 2 keys from each of you.  You can then email
>> > python-dev with those keys and your first.last name and someone there
>> will
>> > install the keys for you.
>>
>> My key is at http://www3.sympatico.ca/seefeld/ssh.txt, I'm Stefan Seefeld.
>>
>> Thanks !
>
>
> Just to clarify, this is not for pydotorg but the svn.python.org.  The
> admins for our future Roundup instance are going to keep their Roundup code
> in svn so they need commit access.

Now when that's clarified, here's my data:

Public SSH key: http://efod.se/about/ptkey.pub
First.Lastname: erik.forsberg

I'd appreciate if someone with good taste could tell us where in the
tree we should add our code :-).

Thanks,
\EF
- -- 
Erik Forsberg http://efod.se
GPG/PGP Key: 1024D/0BAC89D9
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Processed by Mailcrypt 3.5.8+ 

iD8DBQFFSPSOrJurFAusidkRAucqAKDWdlq6dkI1nNt5caSyJ+gFviSeJACg4gNJ
ItRUEsEI3/4ZN154Znw4jEQ=
=o+Iy
-END PGP SIGNATURE-


Re: [Python-Dev] idea for data-type (data-format) PEP

2006-11-03 Thread Travis Oliphant
Martin v. Löwis wrote:

>Travis Oliphant schrieb:
>
>>>>r_field = PyDict_GetItemString(dtype,'r');
>
>>Actually it should read PyDict_GetItemString(dtype->fields).  The
>>r_field is a tuple (data-type object, offset).  The fields attribute is
>>(currently) a Python dictionary.
>>
>>
>
>Ok. This seems to be missing in the PEP. 
>
Yeah, actually quite a bit is missing, because I wanted to float the 
idea for discussion before "getting the details perfect" (which of 
course they wouldn't be if it was just my input producing them).

>In this code, where is PyArray_GetField coming from?
>
This is a NumPy-specific C-API.  That's why I was confused about why 
you wanted me to show how I would do it. 

But, what you are actually asking is how would another application use 
the data-type information to do the same thing using the data-type 
object and a pointer to memory.  Is that correct?

This is a reasonable thing to request.  And your example is a good one.  
I will use the PEP to explain it.

Ultimately, the code you are asking for will have to have some kind of 
dispatch table for different binary code depending on the actual 
data-types being shared (unless all that is needed is a copy in which 
case just the size of the element area can be used).  In my experience, 
the dispatch table must be present for at least the "simple" 
data-types.  The data-types built up from there can depend on those.

In NumPy, the data-type objects have function pointers to accomplish all 
the things NumPy does quickly.  So, each data-type object in NumPy 
points to a function-pointer table and the NumPy code defers to it to 
actually accomplish the task (much like Python really).

Not all libraries will support working with all data-types.  If they 
don't support it, they just raise an error indicating that it's not 
possible to share that kind of data. 
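A rough illustration of that dispatch idea in Python (hypothetical type codes and handlers; NumPy's real tables are C function-pointer arrays):

```python
import struct

# Hypothetical table mapping a type code to (size, unpack-function).
HANDLERS = {
    "i4": (4, lambda b: struct.unpack("<i", b)[0]),
    "f8": (8, lambda b: struct.unpack("<d", b)[0]),
}

def read_field(buf, offset, typecode):
    """Read one field from a flat memory block via the dispatch table."""
    try:
        size, unpack = HANDLERS[typecode]
    except KeyError:
        # Unsupported formats raise, as described above.
        raise TypeError("cannot share data of type %r" % typecode)
    return unpack(buf[offset:offset + size])

data = struct.pack("<id", 42, 3.5)   # an int32 followed by a float64
assert read_field(data, 0, "i4") == 42
assert read_field(data, 4, "f8") == 3.5
```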

> What does
>it do? If I wanted to write this code from scratch, what
>should I write instead? Since this is all about a flat
>memory block, I'm surprised I need "true" Python objects
>for the field values in there.
>  
>
Well, actually, the block could be "strided" as well. 

So, you would write something that gets the pointer to the memory and 
then gets the extended information (dimensionality, shape, and strides, 
and data-format object).  Then, you would get the offset of the field 
you are interested in from the start of the element (it's stored in the 
data-format representation).

Then do a memory copy from the right place (using the array iterator in 
NumPy you can actually do it without getting the shape and strides 
information first but I'm holding off on that PEP until an N-d array is 
proposed for Python).   I'll write something like that as an example and 
put it in the PEP for the extended buffer protocol.  

-Travis




>  
>
>>But, the other option (especially for code already written) would be to
>>just convert the data-format specification into it's own internal
>>representation.
>>
>>
>
>Ok, so your assumption is that consumers already have their own
>machinery, in which case ease-of-use would be the question how
>difficult it is to convert datatype objects into the internal
>representation.
>
>Regards,
>Martin
>  
>



Re: [Python-Dev] [Tracker-discuss] Getting Started

2006-11-03 Thread Michael Twomey
On 11/1/06, Brett Cannon <[EMAIL PROTECTED]> wrote:
> >
> > Right. Brett, do we need accounts on python.org for this ?
>
> Yep.  It just requires SSH 2 keys from each of you.  You can then email
> python-dev with those keys and your first.last name and someone there will
> install the keys for you.
>

I'll need svn access to svn.python.org too for the roundup tracker.

My key is over at http://translucentcode.org/mick/ssh_key.txt
firstname.lastname: michael.twomey

cheers,
  Michael


[Python-Dev] Feature Request: Py_NewInterpreter to create separate GIL (branch)

2006-11-03 Thread Robert
repeated from c.l.p : "Feature Request: Py_NewInterpreter to create 
separate GIL (branch)"

Daniel Dittmar wrote:
 > robert wrote:
 >> I'd like to use multiple CPU cores for selected time consuming Python
 >> computations (incl. numpy/scipy) in a frictionless manner.
 >>
 >> Interprocess communication is tedious and out of question, so I
 >> thought about simply using more Python interpreter instances
 >> (Py_NewInterpreter) with extra GIL in the same process.
 >
 > If I understand Python/ceval.c, the GIL is really global, not specific
 > to an interpreter instance:
 > static PyThread_type_lock interpreter_lock = 0; /* This is the GIL */
 >

That's the show stopper as of now.
There are only a handful of funcs in ceval.c that use that very global lock. 
The rest uses those funcs around thread states.

Would it be a possibility in a future Python to have the lock separate for 
each Interpreter instance?
Thus: have *interpreter_lock separate in each PyThreadState instance so that 
only threads of the same Interpreter have the same GIL?
Separation between Interpreters seems to be enough. The Interpreter runs 
mainly on the stack. Possibly only very few global C-level resources 
would require individual extra locks.

Sooner or later Python will have to answer the multi-processor question.
A per-interpreter GIL and a nice module for tunneling Python-Objects 
directly between Interpreters inside one process might be the answer at 
the right border-line ? Existing extension code base would remain 
compatible, as far as there is already decent locking on module globals, 
which is the usual case.

Robert


Re: [Python-Dev] The "lazy strings" patch [was: PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom]

2006-11-03 Thread Larry Hastings
On 2006/10/20, Larry Hastings wrote:
> I'm ready to post the patch.
Sheesh!  Where does the time go.


I've finally found the time to re-validate and post the patch.  It's 
SF.net patch #1590352:

http://sourceforge.net/tracker/index.php?func=detail&aid=1590352&group_id=5470&atid=305470

I've attached both the patch itself (against the current 2.6 revision, 
52618) and a lengthy treatise on the patch and its ramifications as I 
understand them.

I've also added one more experimental change: a new string method, 
str.simplify().  All it does is force a lazy concatenation / lazy slice 
to render.  (If the string isn't a lazy string, or it's already been 
rendered, str.simplify() is a no-op.)  The idea is, if you know these 
consarned "lazy slices" are giving you the oft-cited horrible memory 
usage scenario, you can tune your app by forcing the slices to render 
and drop their references.  99% of the time you don't care, and you 
enjoy the minor speedup.  The other 1% of the time, you call .simplify() 
and your code behaves as it did under 2.5.  Is this the right approach?  
I dunno.  So far I like it better than the alternatives.  But I'm open 
to suggestions, on this or any other aspect of the patch.

Cheers,


/larry/


Re: [Python-Dev] Feature Request: Py_NewInterpreter to create separate GIL (branch)

2006-11-03 Thread Brett Cannon
On 11/3/06, Robert <[EMAIL PROTECTED]> wrote:
> repeated from c.l.p : "Feature Request: Py_NewInterpreter to create
> separate GIL (branch)"
>
> Daniel Dittmar wrote:
>  > robert wrote:
>  >> I'd like to use multiple CPU cores for selected time consuming Python
>  >> computations (incl. numpy/scipy) in a frictionless manner.
>  >>
>  >> Interprocess communication is tedious and out of question, so I
>  >> thought about simply using more Python interpreter instances
>  >> (Py_NewInterpreter) with extra GIL in the same process.
>  >
>  > If I understand Python/ceval.c, the GIL is really global, not specific
>  > to an interpreter instance:
>  > static PyThread_type_lock interpreter_lock = 0; /* This is the GIL */
>
> That's the show stopper as of now.
> There are only a handful of funcs in ceval.c that use that very global lock.
> The rest uses those funcs around thread states.
>
> Would it be a possibility in a future Python to have the lock separate for
> each Interpreter instance?
> Thus: have *interpreter_lock separate in each PyThreadState instance so that
> only threads of the same Interpreter have the same GIL?
> Separation between Interpreters seems to be enough. The Interpreter runs
> mainly on the stack. Possibly only very few global C-level resources
> would require individual extra locks.

Right, but that's the trick.  For instance extension modules are shared
between interpreters.  Also look at the sys module and basically anything
that is set by a function call is a process-level setting that would also
need protection.  Then you get into the fun stuff of the possibility of
sharing objects created in one interpreter and then passed to another that
is not necessarily known ahead of time (whether it be directly through C
code or through process-level objects such as an attribute in an extension
module).

It is not as simple, unfortunately, as a few locks.

> Sooner or later Python will have to answer the multi-processor question.
> A per-interpreter GIL and a nice module for tunneling Python-Objects
> directly between Interpreters inside one process might be the answer at
> the right border-line ? Existing extension code base would remain
> compatible, as far as there is already decent locking on module globals,
> which is the usual case.

This is not true (see above).  From my viewpoint the only way for this to
work would be to come up with a way to wrap all access to module objects
in extension modules so that they are not trampled on because of separate
locks per-interpreter, or have to force all extension modules to be coded
so that they are instantiated individually per interpreter.  And of course
deal with all other process-level objects somehow.

The SMP issue for Python will most likely not happen until someone cares
enough to write code to do it and this take on it is no exception.  There
is no simple solution or else someone would have done it by now.

-Brett


Re: [Python-Dev] The "lazy strings" patch [was: PATCH submitted: Speed up + for string concatenation, now as fast as "".join(x) idiom]

2006-11-03 Thread Josiah Carlson

Larry Hastings <[EMAIL PROTECTED]> wrote:
> But I'm open 
> to suggestions, on this or any other aspect of the patch.

As Martin, I, and others have suggested, direct the patch towards Python
3.x unicode text.  Also, don't be surprised if Guido says no...
http://mail.python.org/pipermail/python-3000/2006-August/003334.html

In that message he talks about why view+string or string+view or
view+view should return strings.  Some are not quite applicable in this
case because with your implementation all additions can return a 'view'.
However, he also states the following with regards to strings vs. views
(an earlier variant of the "lazy strings" you propose),
"Because they can have such different performance and memory usage
 characteristics, it's not right to treat them as the same type."
 - GvR

This suggests (at least to me) that unifying the 'lazy string' with the
2.x string is basically out of the question, which brings me back to my
earlier suggestion; make it into a wrapper that could be used with 3.x
bytes, 3.x text, and perhaps others.
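A toy sketch of such a wrapper (entirely hypothetical, and far simpler than the patch itself) with a simplify() in the spirit of Larry's str.simplify():

```python
class LazyConcat:
    """Defers concatenation; renders only when the text is needed."""

    def __init__(self, *parts):
        self._parts = list(parts)

    def __add__(self, other):
        # Record the new part instead of concatenating immediately.
        return LazyConcat(*(self._parts + [other]))

    def simplify(self):
        # Force rendering, dropping references to the parts.
        self._parts = ["".join(map(str, self._parts))]
        return self._parts[0]

    def __str__(self):
        return self.simplify()

s = LazyConcat("lazy", " ") + "strings"
assert str(s) == "lazy strings"
```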

 - Josiah
