Re: [Python-Dev] Status of json (simplejson) in cpython

2011-04-16 Thread Bob Ippolito
On Saturday, April 16, 2011, Antoine Pitrou solip...@pitrou.net wrote:
 On Saturday 16 April 2011 at 17:07 +0200, Xavier Morel wrote:
 On 2011-04-16, at 16:52 , Antoine Pitrou wrote:
  On Saturday 16 April 2011 at 16:42 +0200, Dirkjan Ochtman wrote:
  On Sat, Apr 16, 2011 at 16:19, Antoine Pitrou solip...@pitrou.net wrote:
  What you're proposing doesn't address the question of who is going to
  do the ongoing maintenance. Bob apparently isn't interested in
  maintaining stdlib code, and python-dev members aren't interested in
  maintaining simplejson (assuming it would be at all possible). Since
  both groups of people want to work on separate codebases, I don't see
  how sharing a single codebase would be possible.
 
  From reading this thread, it seems to me like the proposal is that Bob
  maintains a simplejson for both 2.x and 3.x and that the current
  stdlib json is replaced by a (trivially changed) version of
  simplejson.
 
  The thing is, we want to bring our own changes to the json module and
  its tests (and have already done so, although some have been backported
  to simplejson).

 Depending on what those changes are, would it not be possible to apply
 the vast majority of them to simplejson itself?

 Sure, but the thing is, I don't *think* we are interested in backporting
 stuff to simplejson much more than Bob is interested in porting stuff to
 the json module.

I've backported every useful patch (for 2.x) I noticed from json to
simplejson. I'd be happy to apply any that I missed if anyone can
point them out.

 I've contributed a couple of patches myself after they were integrated
 into CPython (they are part of the performance improvements Bob is talking
 about), but that was exceptional. Backporting a patch to another project
 with a different directory structure, slightly different code, etc. is
 tedious and not very rewarding for us Python core developers, when we
 could be doing other work in our limited free time.

That's exactly why I am not interested in stdlib maintenance myself: I
only use 2.x and that's frozen... so I can't maintain the version we
would actually use.

 Also, some types of work would be tedious to backport, for example if we
 refactor the tests to test both the C and Python implementations.

simplejson's test suite has tested both for quite some time.

 Furthermore, now that Python uses Mercurial, it should be possible (or
 even easy) to use a versioned queue (via MQ) for the trivial
 adaptations and the temporary alterations (things which will likely be
 merged back into simplejson but are not yet, stuff like that), should
 it not?

 Perhaps, perhaps not. That would require someone motivated to put it in
 place, ensure that it doesn't get in the way, document it, etc.
 Honestly, I don't think maintaining a single stdlib module should
 require such an amount of logistics.

It certainly shouldn't, especially because neither of them changes very fast.

-bob


Re: [Python-Dev] Status of json (simplejson) in cpython

2011-04-15 Thread Bob Ippolito
On Thu, Apr 14, 2011 at 2:29 PM, Raymond Hettinger
raymond.hettin...@gmail.com wrote:

 On Apr 14, 2011, at 12:22 PM, Sandro Tosi wrote:

 The version of json we have in cpython is simplejson 2.0.9, highly
 patched (both because it was converted to py3k and because of the
 normal flow of issues/bugfixes), while upstream has already released
 2.1.3.

 The two roads have diverged a lot, and since this blocks any further
 update of cpython's json from upstream, I'd like to close this gap.

 Are you proposing updates to the Python 3.3 json module
 to include newer features like use_decimal and changing
 the indent argument from an integer to a string?

https://github.com/simplejson/simplejson/blob/master/CHANGES.txt

 - what are we going to do in the long run?

 If Bob shows no interest in Python 3, then
 the code bases will probably continue to diverge.

I don't have any real interest in Python 3, but if someone contributes
the code to make simplejson work in Python 3 I'm willing to apply the
patches and run the tests against any future changes. The porting work
to make it suitable for the standard library at that point should be
something that can be automated, since it will be moving some files
around and changing the string "simplejson" to "json" in a whole bunch
of places.
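
For illustration, a sketch of the kind of mechanical rename meant here;
the directory layout, helper name, and rewrite strategy are assumptions,
not the actual porting script:

    import os
    import re

    def port_tree(src_dir, dst_dir):
        # Copy a simplejson source tree, rewriting the package name.
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                if not name.endswith('.py'):
                    continue
                path = os.path.join(root, name)
                with open(path) as f:
                    text = f.read()
                # Whole-word replacement, so identifiers that merely
                # contain 'simplejson' are left alone.
                text = re.sub(r'\bsimplejson\b', 'json', text)
                out_path = os.path.join(dst_dir,
                                        os.path.relpath(path, src_dir))
                os.makedirs(os.path.dirname(out_path), exist_ok=True)
                with open(out_path, 'w') as f:
                    f.write(text)

    port_tree('simplejson', 'json')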

 Since the JSON spec is set in stone, the changes
 will mostly be about API (indentation, object conversion, etc)
 and optimization.  I presume the core parsing logic won't
 be changing much.

Actually the core parsing logic is very different (and MUCH faster),
which is why the merge is tricky. There's the potential for it to
change more in the future; there's definitely more room for
optimization, probably not in the pure Python parser but in the C one.

-bob


Re: [Python-Dev] Status of json (simplejson) in cpython

2011-04-15 Thread Bob Ippolito
On Friday, April 15, 2011, Antoine Pitrou solip...@pitrou.net wrote:

  Since the JSON spec is set in stone, the changes
  will mostly be about API (indentation, object conversion, etc)
  and optimization.  I presume the core parsing logic won't
  be changing much.

 Actually the core parsing logic is very different (and MUCH faster),

 Are you talking about the Python logic or the C logic?

Both, actually. IIRC simplejson in pure Python typically beats json
with its C extension.

-bob


Re: [Python-Dev] Status of json (simplejson) in cpython

2011-04-15 Thread Bob Ippolito
On Fri, Apr 15, 2011 at 2:20 PM, Antoine Pitrou solip...@pitrou.net wrote:
 On Friday 15 April 2011 at 14:18 -0700, Bob Ippolito wrote:
 On Friday, April 15, 2011, Antoine Pitrou solip...@pitrou.net wrote:
 
   Since the JSON spec is set in stone, the changes
   will mostly be about API (indentation, object conversion, etc)
   and optimization.  I presume the core parsing logic won't
   be changing much.
 
  Actually the core parsing logic is very different (and MUCH faster),
 
  Are you talking about the Python logic or the C logic?

 Both, actually. IIRC simplejson in pure Python typically beats json
 with its C extension.

 Really? It would be nice to see some concrete benchmarks against both
 repo tips.

Maybe in a few weeks or months when I have time to finish up the
benchmarks that I was working on... but it should be pretty easy for
anyone to show that the version in CPython is very slow (and uses a
lot more memory) in comparison to simplejson.

-bob


Re: [Python-Dev] Status of json (simplejson) in cpython

2011-04-15 Thread Bob Ippolito
On Fri, Apr 15, 2011 at 4:12 PM, Antoine Pitrou solip...@pitrou.net wrote:
 On Fri, 15 Apr 2011 14:27:04 -0700
 Bob Ippolito b...@redivi.com wrote:
 On Fri, Apr 15, 2011 at 2:20 PM, Antoine Pitrou solip...@pitrou.net wrote:
  On Friday 15 April 2011 at 14:18 -0700, Bob Ippolito wrote:
  On Friday, April 15, 2011, Antoine Pitrou solip...@pitrou.net wrote:
  
Since the JSON spec is set in stone, the changes
will mostly be about API (indentation, object conversion, etc)
and optimization.  I presume the core parsing logic won't
be changing much.
  
   Actually the core parsing logic is very different (and MUCH faster),
  
   Are you talking about the Python logic or the C logic?
 
  Both, actually. IIRC simplejson in pure Python typically beats json
  with its C extension.
 
  Really? It would be nice to see some concrete benchmarks against both
  repo tips.

 Maybe in a few weeks or months when I have time to finish up the
 benchmarks that I was working on... but it should be pretty easy for
 anyone to show that the version in CPython is very slow (and uses a
 lot more memory) in comparison to simplejson.

 Well, here's a crude microbenchmark. I'm comparing 2.6+simplejson 2.1.3
 to 3.3+json, so I'm avoiding integers:

 * json.dumps:

 $ python -m timeit -s "from simplejson import dumps, loads; \
    d = dict((str(i), str(i)) for i in range(1000))" \
    "dumps(d)"

 - 2.6+simplejson: 372 usec per loop
 - 3.2+json: 352 usec per loop

 * json.loads:

 $ python -m timeit -s "from simplejson import dumps, loads; \
    d = dict((str(i), str(i)) for i in range(1000)); s = dumps(d)" \
    "loads(s)"

 - 2.6+simplejson: 224 usec per loop
 - 3.2+json: 233 usec per loop


 The runtimes look quite similar.

That's the problem with trivial benchmarks. With more typical data
(for us, anyway) you should see very different results.
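
For what it's worth, a sketch of a less trivial benchmark; the payload
shape below (nested containers, mixed types) is an assumption for
illustration, not our actual data:

    import timeit

    doc = {'users': [{'id': i, 'name': 'user-%d' % i, 'active': i % 2 == 0,
                      'tags': ['alpha', 'beta'], 'score': i * 1.5}
                     for i in range(200)]}

    for mod in ('json', 'simplejson'):
        # Embed the document in the setup so both modules serialize and
        # parse exactly the same input.
        setup = 'import %s as m; doc = %r; s = m.dumps(doc)' % (mod, doc)
        dump_t = min(timeit.repeat('m.dumps(doc)', setup, number=1000))
        load_t = min(timeit.repeat('m.loads(s)', setup, number=1000))
        print('%s: dumps %.3fs  loads %.3fs' % (mod, dump_t, load_t))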

-bob


Re: [Python-Dev] Pickle alternative in stdlib (Was: On breaking modules into packages)

2010-11-04 Thread Bob Ippolito
On Friday, November 5, 2010,  exar...@twistedmatrix.com wrote:
 On 12:21 am, m...@gsites.de wrote:

 On 04.11.2010 17:15, anatoly techtonik wrote:
 pickle is insecure, marshal too.

 If the transport or storage layer is not safe, you should cryptographically
 sign the data anyway::

     import base64
     import hmac
     import pickle

     def pickle_encode(data, key):
         msg = base64.b64encode(pickle.dumps(data, -1))
         sig = base64.b64encode(hmac.new(key, msg).digest())
         return sig + ':' + msg

     def pickle_decode(data, key):
         if data and ':' in data:
             sig, msg = data.split(':', 1)
             if sig == base64.b64encode(hmac.new(key, msg).digest()):
                 return pickle.loads(base64.b64decode(msg))
         raise pickle.UnpicklingError('Wrong or missing signature.')

 Bottle (a web framework) uses a similar approach to store non-string data in 
 client-side cookies. I don't see a (security) problem here.


 Your pickle_decode leaks information about the key.  An attacker will
 eventually (a few seconds to a few minutes, depending on what access they
 have to this system) be able to determine your key and send you arbitrary
 pickles (ie, execute arbitrary code on your system).

 Oops.

 This stuff is hard.  If you're going to mess around with it, make sure you're 
 *serious* (better approach: don't mess around with it).

Specifically you need to use a constant-time signature verification or
else there are possible timing attacks. Sounds like something an hmac
module should provide in the first place.
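
A minimal sketch of that fix, assuming the thread-era MD5 default for
hmac.new(); hmac.compare_digest is the constant-time primitive the hmac
module later grew (in Python 2.7.7/3.3):

    import base64
    import hashlib
    import hmac

    def signature_ok(msg, sig, key):
        expected = base64.b64encode(hmac.new(key, msg, hashlib.md5).digest())
        # Takes time independent of where the inputs first differ, so an
        # attacker can't recover a valid signature byte by byte from timings.
        return hmac.compare_digest(expected, sig)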

But yeah, this stuff is hard, better to just not have a code execution
hole in the first place.

-bob


Re: [Python-Dev] Return from generators in Python 3.2

2010-08-26 Thread Bob Ippolito
On Fri, Aug 27, 2010 at 8:25 AM, Guido van Rossum gu...@python.org wrote:
 On Thu, Aug 26, 2010 at 5:05 PM, Yury Selivanov yseliva...@gmail.com wrote:
 On 2010-08-26, at 8:04 PM, Greg Ewing wrote:
 Even with your proposal, you'd still have to use a 'creepy
 abstraction' every time one of your coroutines calls another.
 That's why PEP 380 deals with 'more than just return'.

 Nope.  In almost any coroutine framework you have a scheduler
 or trampoline object that basically does all the work of calling,
 passing values and propagating exceptions.  And many other things
 that 'yield from' won't help you with (cooperation, deferring to
 process/thread pools, pausing, etc.)  Being a developer of one
 such framework, I can tell you that I can easily live without
 'yield from', but dealing with weird return syntax is a pain.

 That's not my experience. I wrote a trampoline myself (not released
 yet), and found that I had to write a lot more code to deal with the
 absence of yield-from than to deal with returns. In my framework,
 users write 'raise Return(value)' where Return is a subclass of
 StopIteration. The trampoline code that must be written to deal with
 StopIteration can be extended trivially to deal with this. The only
 reason I chose to use a subclass is so that I can diagnose when the
 return value is not used, but I could have chosen to ignore this or
 just diagnose whenever the argument to StopIteration is not None.

A bit off-topic, but...

In my experience the lack of yield from makes certain styles of
programming both very tedious and very costly for performance. One
example would be Genshi, which implements something like pipes or
filters. There are many filters that will do something once (e.g.
insert a doctype) but have O(N) performance because of the function
call overhead of "for x in other_generator: yield x". Nest this a few
times and you'll have 10 function calls for every byte of output (not
an exaggeration in the case of Trac templates). I think that, if
implemented properly, yield from could get rid of most of that
overhead.
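
A sketch of that overhead, and of the PEP 380 version (the filter names
are made up, not Genshi's API):

    def source():
        for chunk in ('<html>', '<body>', 'hi', '</body>', '</html>'):
            yield chunk

    def add_doctype(stream):
        yield '<!DOCTYPE html>'    # the filter's real work happens once...
        for chunk in stream:       # ...but every chunk pays an extra frame
            yield chunk

    def add_doctype_pep380(stream):
        yield '<!DOCTYPE html>'
        yield from stream          # delegation the interpreter can optimize

    # Stacking N re-yielding filters costs N Python calls per chunk; with
    # yield from, the per-chunk hand-off can be short-circuited.
    print(''.join(add_doctype(source())))
    print(''.join(add_doctype_pep380(source())))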

-bob


Re: [Python-Dev] State of json in 2.7

2010-06-22 Thread Bob Ippolito
On Tuesday, June 22, 2010, Brett Cannon br...@python.org wrote:
 [cc'ing Bob on his gmail address; didn't have any other address handy
 so I don't know if this will actually get to him]

 On Tue, Jun 22, 2010 at 09:54, Dirkjan Ochtman dirk...@ochtman.nl wrote:
 It looks like simplejson 2.1.0 and 2.1.1 have been released:

 http://bob.pythonmac.org/archives/2010/03/10/simplejson-210/
 http://bob.pythonmac.org/archives/2010/03/31/simplejson-211/

 It looks like any changes that didn't come from the Python tree didn't
 go into the Python tree, either.

 Has anyone asked Bob why he did this? There might be a logical reason.

I've just been busy. It's not trivial to move patches from one to the
other, so it's not something that has been easy for me to get around
to actually doing. It seems that more often than not when I have had
time to look at something, it didn't line up well with python's
release schedule.

(and speaking of busy I'm en route for a week long honeymoon so don't
expect much else from me on this thread)

-bob


Re: [Python-Dev] Decimal - float comparisons in py3k.

2010-03-23 Thread Bob Ippolito
On Sat, Mar 20, 2010 at 4:38 PM, Mark Dickinson dicki...@gmail.com wrote:
 On Sat, Mar 20, 2010 at 7:56 PM, Guido van Rossum gu...@python.org wrote:
 I propose to reduce all hashes to the hash of a normalized fraction,
 which we can define as a combination of the hashes for the numerator
 and the denominator. Then all we have to do is figure out fairly efficient
 ways to convert floats and decimals to normalized fractions (not
 necessarily Fractions). I may be naive but this seems doable: for a
 float, the denominator is always a power of 2 and removing factors of
 2 from the denominator is easy (just right-shift while the low bit is
 zero). For Decimal, the unnormalized denominator is always a power of
 excessively so. The resulting numerator and denominator may be large
 numbers, but for typical use of Decimal and float they will rarely be
 excessively large, and I'm not too worried about slowing things down
 when they are (everything slows down when you're using really large
 integers anyway).

 I *am* worried about slowing things down for large Decimals:  if you
 can't put Decimal('1e1234567') into a dict or set without waiting for
 an hour for the hash computation to complete (because it's busy
 computing 10**1234567), I consider that a problem.

 But it's solvable!  I've just put a patch on the bug tracker:

 http://bugs.python.org/issue8188

 It demonstrates how hashes can be implemented efficiently and
 compatibly for all numeric types, even large Decimals like the above.
 It needs a little tidying up, but it works.

I was interested in how the implementation worked yesterday,
especially given the lack of explanation in the margins of
numeric_hash3.patch. numeric_hash4.patch has much better comments, but
I didn't see this patch until after I had sufficiently deciphered the
previous patch and wrote most of this:
http://bob.pythonmac.org/archives/2010/03/23/py3k-unified-numeric-hash/

I'm not really qualified to review the patch; what little formal math
training I had has atrophied quite a bit over the years, but as far as
I can tell it seems to work. The results also seem to match the Python
implementations that I created.
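
A sketch of the core trick as I understand it: hash any rational m/n as
m times the modular inverse of n, modulo a fixed prime, so a value like
Decimal('1e1234567') never has to be expanded to a full integer. The
modulus and helper here are illustrative, not the patch's exact code:

    _P = (1 << 61) - 1  # a Mersenne prime keeps the reduction cheap

    def rational_hash(m, n):
        # inverse(n) mod _P via Fermat's little theorem; pow() needs only
        # O(log _P) multiplications, independent of the size of m and n.
        inv = pow(n % _P, _P - 2, _P)
        return (m % _P) * inv % _P

    # Decimal('1e1234567') is 10**1234567 / 1; its hash only needs
    # 10**1234567 mod _P, which pow computes without building the number.
    print(rational_hash(pow(10, 1234567, _P), 1))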

-bob


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-02 Thread Bob Ippolito
On Sun, Jan 31, 2010 at 11:16 AM, Guido van Rossum gu...@python.org wrote:
 Whoa. This thread already exploded. I'm picking this message to
 respond to because it reflects my own view after reading the PEP.

 On Sun, Jan 31, 2010 at 4:13 AM, Hanno Schlichting ha...@hannosch.eu wrote:
 On Sun, Jan 31, 2010 at 1:03 PM, Simon Cross
 hodgestar+python...@gmail.com wrote:
 I don't know whether I in favour of using a single pyr folder or not
 but if a single folder is used I'd definitely prefer the folder to be
 called __pyr__ rather than .pyr.

 Exactly what I would prefer. I worry that having many small
 directories is a fairly poor use of the filesystem. (A quick scan of
 /usr/local/lib/python3.2 on my Linux box reveals 1163 .py files but
 only 57 directories.)

I like this option as well, but why not just name the directory .pyc
instead of __pyr__ or .pyr? That way people probably won't even have
to reconfigure their tools to ignore it :)

-bob


Re: [Python-Dev] Dropping bytes support in json

2009-04-27 Thread Bob Ippolito
On Mon, Apr 27, 2009 at 7:25 AM, Damien Diederen d...@crosstwine.com wrote:

 Antoine Pitrou solip...@pitrou.net writes:
 Hello,

 We're in the process of forward-porting the recent (massive) json
 updates to 3.1, and we are also thinking of dropping remnants of
 support of the bytes type in the json library (in 3.1, again). This
 bytes support almost didn't work at all, but there was a lot of C and
 Python code for it nevertheless. We're also thinking of dropping the
 encoding argument in the various APIs, since it is useless.

 I had a quick look into the module on both branches, and at Antoine's
 latest patch (json_py3k-3).  The current situation on trunk is indeed
 not very pretty in terms of code duplication, and I agree it would be
 nice not to carry that forward.

 I couldn't figure out a way to get rid of it short of multi-#including
 templates and playing with the C preprocessor, however, and have the
 nagging feeling the latter would be frowned upon by the maintainers.

 There is a precedent with xmltok.c/xmltok_impl.c, though, so maybe I'm
 wrong about that.  Should I give it a try, and see how clean the
 result can be made?

 Under the new situation, json would only ever allow str as input, and
 output str as well. By posting here, I want to know whether anybody
 would oppose this (knowing, once again, that bytes support is already
 broken in the current py3k trunk).

 Provided one of the alternatives is dropped, wouldn't it be better to do
 the opposite, i.e., have the decoder take bytes as input, and the
 encoder produce bytes—and layer the str functionality on top of that?  I
 guess the answer depends on how the (most common) lower layers are
 structured, but it would be nice to allow a straight bytes path to/from
 the underlying transport.

 (I'm willing to have a go at the conversion in case somebody is
 interested.)

 Bob, would you have an idea of which lower layers are most commonly used
 with the json module, and whether people are more likely to expect strs
 or bytes in Python 3.x?  Maybe that data could be inferred from some bug
 tracking system?

I don't know what Python 3.x users expect. As far as I know, none of
the lower layers of the json package are used directly. They're
certainly not supposed to be used that way, nor documented as such.

My use case for dumps is typically bytes output because we push it
straight to and from IO. Some people embed JSON in other documents
(e.g. HTML) where you would want it to be text. I'm pretty sure that
the IO case is more common.

-bob


Re: [Python-Dev] Dropping bytes support in json

2009-04-13 Thread Bob Ippolito
On Mon, Apr 13, 2009 at 1:02 PM, Martin v. Löwis mar...@v.loewis.de wrote:
 Yes, there's a TCP connection.  Sorry for not making that clear to begin
 with.

     If so, it doesn't matter what representation these implementations chose
     to use.


 True, I can always convert from bytes to str or vise versa.

 I think you are missing the point. It will not be necessary to convert.
 You can write the JSON into the TCP connection in Python, and it will
 come out as strings just fine in C# and JavaScript. This
 is how middleware works - it abstracts from programming languages, and
 allows for different representations in different languages, in a
 manner invisible to the participating processes.

 At least one of these two needs to work:

 >>> json.dumps({}).encode('utf-16le')  # dumps() returns str
 '{\x00}\x00'

 >>> json.dumps({}, encoding='utf-16le')  # dumps() returns bytes
 '{\x00}\x00'

 In 2.6, the first one works.  The second incorrectly returns '{}'.

 Ok, that might be a bug in the JSON implementation - but you shouldn't
 be using utf-16le, anyway. Use UTF-8 always, and it will work fine.

 The question is: which of them is more appropriate, if what you want
 is bytes. I argue that the second form is better, since it saves you
 an encode invocation.

It's not a bug in dumps, it's a matter of not reading the
documentation. The encoding parameter of dumps decides how byte
strings should be interpreted, not what the output encoding is.

The output of json/simplejson dumps for Python 2.x is either an ASCII
bytestring (default) or a unicode string (when ensure_ascii=False).
This is very practical in 2.x because an ASCII bytestring can be
treated as either text or bytes in most situations, isn't going to get
mangled over any kind of encoding mismatch (as long as it's an ASCII
superset), and skips an encoding step if getting sent over the wire.

 >>> simplejson.dumps(['\x00f\x00o\x00o'], encoding='utf-16be')
 '["foo"]'
 >>> simplejson.dumps(['\x00f\x00o\x00o'], encoding='utf-16be',
 ...                  ensure_ascii=False)
 u'["foo"]'

-bob


Re: [Python-Dev] Dropping bytes support in json

2009-04-10 Thread Bob Ippolito
On Fri, Apr 10, 2009 at 8:38 AM, Stephen J. Turnbull step...@xemacs.org wrote:
 Paul Moore writes:

   On the other hand, further down in the document:
  
   
   3.  Encoding
  
      JSON text SHALL be encoded in Unicode.  The default encoding is
      UTF-8.
  
      Since the first two characters of a JSON text will always be ASCII
      characters [RFC0020], it is possible to determine whether an octet
      stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
      at the pattern of nulls in the first four octets.
   
  
   This is at best confused (in my utterly non-expert opinion :-)) as
   Unicode isn't an encoding...

 The word "encoding" (by itself) does not have a standard definition
 AFAIK.  However, since Unicode *is* a coded character set (plus a
 bunch of hairy usage rules), there's nothing wrong with saying text
 is "encoded in Unicode".  The RFC 2130 and Unicode TR#17 taxonomies are
 annoyingly verbose and pedantic to say the least.

 So what is being said there (in UTR#17 terminology) is

 (1) JSON is *text*, that is, a sequence of characters.
 (2) The abstract repertoire and coded character set are defined by the
    Unicode standard.
 (3) The default transfer encoding syntax is UTF-8.

   That implies that loads can/should also allow bytes as input, applying
   the given algorithm to guess an encoding.

 It's not a guess, unless the data stream is corrupt---or nonconforming.

 But it should not be the JSON package's responsibility to deal with
 corruption or non-conformance (eg, ISO-8859-15-encoded programs).
 That's the whole point of specifying the coded character set in the
 standard the first place.  I think it's a bad idea for any of the core
 JSON API to accept or produce bytes in any language that provides a
 Unicode string type.

 That doesn't mean Python's module shouldn't provide convenience
 functions to read and write JSON serialized as UTF-8 (in fact, that
 *should* be done, IMO) and/or other UTFs (I'm not so happy about
 that).  But those who write programs using them should not report bugs
 until they've checked out and eliminated the possibility of an
 encoding screwup!

The current implementation doesn't do any encoding guesswork and I
have no intention to allow that as a feature. The input must be
unicode, UTF-8 bytes, or an encoding must be specified.

Personally, most of my experience with JSON is as a wire protocol and
thus bytes, so the obvious function to encode json should do that. There
probably should be another function to get unicode output, but nobody
has ever asked for that in the Python 2.x version. They either want
the default behavior (encoding as ASCII str, which can be used as
unicode due to implementation details of Python 2.x) or encoding as a
more compact UTF-8 str (without escaping non-ASCII code points).
Perhaps Python 3 users would ask for unicode output when decoding,
though.
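
The switch between those two behaviors is the ensure_ascii flag; a quick
sketch with the modern json module (simplejson spells it the same way):

    import json

    print(json.dumps(['café']))                      # '["caf\u00e9"]'
    print(json.dumps(['café'], ensure_ascii=False))  # '["café"]'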

-bob


Re: [Python-Dev] Dropping bytes support in json

2009-04-09 Thread Bob Ippolito
On Thu, Apr 9, 2009 at 1:05 PM, Martin v. Löwis mar...@v.loewis.de wrote:
 I can understand that you don't want to spend much time on it. How
 about removing it from 3.1? We could re-add it when long-term support
 becomes more likely.

 I'm speechless.

 It seems that my statement has surprised you, so let me explain:

 I think we should refrain from making design decisions (such as
 API decisions) without Bob's explicit consent, unless we assign
 a new maintainer for the simplejson module (perhaps just for the
 3k branch, which perhaps would be a fork from Bob's code).

 Antoine suggests that Bob did not comment on the issues at hand,
 therefore, we should not proceed with the proposed design. Since
 the 3.1 release is only a few weeks ahead, we have the choice of
 either shipping with the broken version that is currently in the
 3k branch, or drop the module from the 3k branch. I believe our
 users are better served by not having to waste time with a module
 that doesn't quite work, or may change.

Most of the time I have to spend on json/simplejson and these mailing
list discussions is on weekends; I try not to bother with it when I'm busy
doing Actual Work unless there is a bug or some other issue that needs
more immediate attention. I also wasn't aware that I was expected to
comment on those issues. I'm CC'ed on the discussion for issue4136 but
I don't see any unanswered questions directed at me.

I have the issues (issue5723, issue4136) starred in my gmail and I
planned to look at it more closely later, hopefully on Friday or
Saturday.

As far as Python 3 goes, I honestly have not yet familiarized myself
with the changes to the IO infrastructure and what the new idioms are.
At this time, I can't make any educated decisions with regard to how
it should be done because I don't know exactly how bytes are supposed
to work and what the common idioms are for other libraries in the
stdlib that do similar things. Until I figure that out, someone else
is better off making decisions about the Python 3 version. My guess is
that it should work the same way as it does in Python 2.x: take bytes
or unicode input in loads (which means encoding is still relevant). I
also think the output of dumps should be bytes, since it is a
serialization, but I am not sure how other libraries do this in Python
3 because one could argue that it is also text. If other libraries
that do text/text encodings (e.g. binascii, mimelib, ...) use str for
input and output instead of bytes then maybe Antoine's changes are the
right solution and I just don't know better because I'm not up to
speed with how people write Python 3 code.

I'll do my best to find some time to look into Python 3 more closely
soon, but thus far I have not been very motivated to do so because
Python 3 isn't useful for us at work and twiddling syntax isn't a very
interesting problem for me to solve.

-bob


Re: [Python-Dev] json decoder speedups, any time left for 2.6?

2008-09-27 Thread Bob Ippolito
simplejson 2.0.0 is now released, which is about as optimized as I can
be bothered to make it. It's about 4x faster than cPickle for encoding
and just a little slower at decoding, which is good enough for now ;)
The pure Python source is much uglier now (to avoid global lookups,
etc.), but also several times faster than it was.

http://pypi.python.org/pypi/simplejson

One of the optimizations I made probably isn't good for Py3k: it will
return ASCII strings as str objects instead of converting to unicode,
but that shouldn't be too much work to port (though I haven't looked
at the current _json.c port for Py3k).

I also converted over to using Sphinx documentation, which was nice
because I was able to just re-use the docs that were already in Python
trunk after changing the module name around.

All of the work should be easy to merge back into trunk so I'll try
and take care of that quickly after Python 2.6 is released.

On Wed, Sep 24, 2008 at 9:02 AM, Bob Ippolito [EMAIL PROTECTED] wrote:
 http://pypi.python.org/pypi/simplejson

 The _speedups module is optional.

 On Wed, Sep 24, 2008 at 8:42 AM, Alex Martelli [EMAIL PROTECTED] wrote:
 Meanwhile, can you please release (wherever you normally release
 things;-) the pure-Python version as well?  I'd like to play around
 with it in Google App Engine opensource sandboxes (e.g., cfr.
 gae-json-rest -- I'll be delighted to add you to that project if you
 want of course;-) and that requires Python 2.5 and only pure-Python
 add-ons... thanks!

 Alex


 On Wed, Sep 24, 2008 at 8:08 AM, Bob Ippolito [EMAIL PROTECTED] wrote:
 On Wed, Sep 24, 2008 at 6:14 AM, Barry Warsaw [EMAIL PROTECTED] wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 On Sep 24, 2008, at 5:47 AM, Nick Coghlan wrote:

 Bob Ippolito wrote:

 How much time do I
 have left to get this into Python 2.6?

 Zero I'm afraid - with rc1 out, it's release blocker bugs only. Anything
 which can be deferred to the 2.6.1 release without causing any major
 harm is definitely out - and while a 2x speedup is very nice, it isn't
 something to be squeezing in post-rc1.

 Still, that should make for a nice incremental improvement when 2.6.1
 rolls around.

 I concur.

 Ok, no problem. The speedup is about 3x now on the trunk ;) I think
 that further optimization will require some more C hacking, but 2.6.1
 should give me plenty of time to get around to some of that.

 -bob





Re: [Python-Dev] json decoder speedups, any time left for 2.6?

2008-09-24 Thread Bob Ippolito
On Wed, Sep 24, 2008 at 6:14 AM, Barry Warsaw [EMAIL PROTECTED] wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 On Sep 24, 2008, at 5:47 AM, Nick Coghlan wrote:

 Bob Ippolito wrote:

 How much time do I
 have left to get this into Python 2.6?

 Zero I'm afraid - with rc1 out, it's release blocker bugs only. Anything
 which can be deferred to the 2.6.1 release without causing any major
 harm is definitely out - and while a 2x speedup is very nice, it isn't
 something to be squeezing in post-rc1.

 Still, that should make for a nice incremental improvement when 2.6.1
 rolls around.

 I concur.

Ok, no problem. The speedup is about 3x now on the trunk ;) I think
that further optimization will require some more C hacking, but 2.6.1
should give me plenty of time to get around to some of that.

-bob


Re: [Python-Dev] json decoder speedups, any time left for 2.6?

2008-09-24 Thread Bob Ippolito
http://pypi.python.org/pypi/simplejson

The _speedups module is optional.

On Wed, Sep 24, 2008 at 8:42 AM, Alex Martelli [EMAIL PROTECTED] wrote:
 Meanwhile, can you please release (wherever you normally release
 things;-) the pure-Python version as well?  I'd like to play around
 with it in Google App Engine opensource sandboxes (e.g., cfr.
 gae-json-rest -- I'll be delighted to add you to that project if you
 want of course;-) and that requires Python 2.5 and only pure-Python
 add-ons... thanks!

 Alex


 On Wed, Sep 24, 2008 at 8:08 AM, Bob Ippolito [EMAIL PROTECTED] wrote:
 On Wed, Sep 24, 2008 at 6:14 AM, Barry Warsaw [EMAIL PROTECTED] wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 On Sep 24, 2008, at 5:47 AM, Nick Coghlan wrote:

 Bob Ippolito wrote:

 How much time do I
 have left to get this into Python 2.6?

 Zero I'm afraid - with rc1 out, it's release blocker bugs only. Anything
 which can be deferred to the 2.6.1 release without causing any major
 harm is definitely out - and while a 2x speedup is very nice, it isn't
 something to be squeezing in post-rc1.

 Still, that should make for a nice incremental improvement when 2.6.1
 rolls around.

 I concur.

 Ok, no problem. The speedup is about 3x now on the trunk ;) I think
 that further optimization will require some more C hacking, but 2.6.1
 should give me plenty of time to get around to some of that.

 -bob




[Python-Dev] json decoder speedups, any time left for 2.6?

2008-09-23 Thread Bob Ippolito
I'm out of town this week for a conference (ICFP/CUFP in Victoria) and
my hotel's connection has been bad enough that I can't get any
Real Work done, so I've managed to hammer on the json library's
decoding quite a bit instead. I just released simplejson 1.9.3, which
improves decoding performance by about 2x and I've got some more
changes along the way in trunk for 1.9.4 that will increase it even
further (over 3x my original 1.9.2 benchmark perf). How much time do I
have left to get this into Python 2.6?

FWIW the changes are all on the Python side, no C code has been harmed
(yet). The test suite still passes of course.

-bob


Re: [Python-Dev] Should we help pythonmac.org?

2008-08-18 Thread Bob Ippolito
The major difference between the packages on macports and
pythonmac.org is that macports is their own distro of nearly
everything, akin to installing a copy of FreeBSD on top of Mac OS X.
pythonmac.org contains packages that are self-contained and don't have
a whole new set of libraries to install (in the cases where they do
require libraries, they link them in statically for the most part).

These days I don't have a lot of preference, I don't use either :)

On Mon, Aug 18, 2008 at 1:08 PM, Guido van Rossum [EMAIL PROTECTED] wrote:
 Alternatively, I just got mail from Bob Ippolito indicating that he'd
 be happy to hand over the domain to the PSF. It's got quite a bit more
 on it than Python distros, and it's a fairly popular resource for Mac
 users I imagine. However macports.org seems to have more Python stuff,
 and has a more recent version of 2.5. (2.5.2). Perhaps we should link
 to macports.org instead?

 On Mon, Aug 18, 2008 at 9:54 AM, Barry Warsaw [EMAIL PROTECTED] wrote:
 On Aug 18, 2008, at 12:05 PM, Guido van Rossum wrote:

 Does anyone have connections with the owners of pythonmac.org?
 Apparently they are serving up an ancient version of Python 2.5. The
 Google App Engine has a minor issue in 2.5 that's solved in 2.5.1, but
 that is apparently not available from that site. Perhaps we can
 contribute more recent Mac versions, or provide them directly on
 python.org? (The Downloads -> Macintosh page points to pythonmac.org,
 which means lots of App Engine users download this old version.)

 I'd be happy to arrange things with a Mac expert to put the Mac binaries on
 the download site.

 --
 --Guido van Rossum (home page: http://www.python.org/~guido/)



Re: [Python-Dev] [Pydotorg] Should we help pythonmac.org?

2008-08-18 Thread Bob Ippolito
On Mon, Aug 18, 2008 at 3:41 PM, Barry Warsaw [EMAIL PROTECTED] wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 On Aug 18, 2008, at 6:13 PM, Fred Drake wrote:

 On Aug 18, 2008, at 5:42 PM, Steve Holden wrote:

 Someone told me the other day that macports made for difficult installs,
 but not being a Mac user I wasn't in a position to evaluate the advice.

 Not being a Mac user either, I've been using Mac OS X for about a year now
 for most of my development.

 I've got mixed feelings about macports:  It's painful to use, compared to
 things like rpm and apt, but... it might be the best that's available for
 the Mac.

 I'm not going to trust it to give me a usable Python, though, in spite of
 not having had problems with Pythons it provides.  Just 'cause I've gotten
 paranoid.

 I use macports too, mostly for stuff I'm too lazy to build from source.  I'm
 sure there's a Python in there, but like Fred, I don't use it.

 I do agree that we could and probably should maintain any Mac Python content
 on the main python.org site, but also if Bob wants to donate the domain, we
 can just have it forward to www.python.org/allyourmacsarebelongtous

We already do that for the wiki; we could do that for the other parts
of the site just as easily (even without or before a transfer of
ownership) :) I'm happy to pay for the domain and hosting; I just
don't have a lot of spare cycles these days unless I need something at
work.

-bob


Re: [Python-Dev] The docs, reloaded

2007-05-21 Thread Bob Ippolito
On 5/21/07, Martin v. Löwis [EMAIL PROTECTED] wrote:
  I think the people who have responded to my comment read too much into it.
  Nowhere do I think I asked Georg to write an equation typesetter to include
  in the Python documentation toolchain.  I asked that math capability be
  considered.  I have no idea what tools he used to build his new
  documentation set.  I only briefly glanced at a couple of the output pages.
  I think what he has done is marvelous.  However, I don't think the door
  should be shut on equation display.  Is there a route to it based on the
  tools Georg is using?

 I don't think anything in the world can replace TeX for math
 typesetting. So if math typesetting was a requirement (which it
 should not be, for that very reason), then we could not consider
 anything but TeX.

You can use docutils to generate LaTeX output from reST, and you can
put raw LaTeX into the output with ".. raw:: latex". I would imagine
this is sufficient for now.
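
A minimal sketch of that route, using docutils' publish_string with the
latex writer (the document content is illustrative):

    import textwrap
    from docutils.core import publish_string

    rst = textwrap.dedent("""\
        Euler's identity, passed through untouched:

        .. raw:: latex

           \\begin{equation} e^{i\\pi} + 1 = 0 \\end{equation}
        """)

    # The latex writer copies the raw block verbatim into its output.
    print(publish_string(source=rst, writer_name='latex'))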

-bob


Re: [Python-Dev] Adding socket timeout to urllib2

2007-03-06 Thread Bob Ippolito
On 3/6/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

 Guido> Since "idle timeout" is not a commonly understood term it would
 Guido> be even better if it was explained without using it.

 I think it's commonly understood, but it doesn't mean what the socket
 timeout is used for.  It's how long a connection can be idle (the client
 doesn't make a request of the server) before the server will close the
 connection.


What does "idle timeout" have to do with urllib2, or any IO layer for
that matter? I've only seen it as a very high-level, server-only
feature...

-bob


Re: [Python-Dev] Twisted Isn't Specific (was Re: Trial balloon: microthreads library in stdlib)

2007-02-15 Thread Bob Ippolito
On 2/14/07, Greg Ewing [EMAIL PROTECTED] wrote:
 Thomas Wouters wrote:

   *I* don't like the idea of something in the Python installation
   deciding which reactor to use.

 I wouldn't mind if some way were provided of changing
 the reactor if you want. I'd just like to see a long
 term goal of making it unnecessary as far as possible.

  In any case, your idea requires a lot of changes in external, non-Python
  code -- PyGTK simply exposes the GTK mainloop, which couldn't care less
  about Python's idea of a perfect event reactor model.

 On unix at least, I don't think it should be necessary
 to change gtk, only pygtk. If it can find out the file
 descriptor of the connection to the X server, it can
 plug that into the reactor, and then call
 gtk_main_iteration_do() whenever something comes in
 on it.

 A similar strategy ought to work for any X11-based
 toolkit that exposes a function to perform one
 iteration of its main loop.

 Mileage on other platforms may vary.

   The PerfectReactor can be added later, all current reactors
   aliased to it, and no one would have to change a single line
   of code.

 Sure.

 The other side to all this is the client side, i.e. the
 code that installs event callbacks. At the moment there's
 no clear direction to take, so everyone makes their own
 choice -- some use asyncore, some use Twisted, some use
 the gtk event loop, some roll their own, etc.

There is no single PerfectReactor. There are several use cases where
you need to wait on more than one event system, which guarantees at
least two OS threads (and two event loops). In general it's nice to
have a single Python event loop (the reactor) to act on said threads
(e.g. something just sitting on a mutex waiting for messages), but
waiting for IO to occur should *probably* happen on one or more
ancillary threads -- one per event system (e.g. select, GTK,
WaitForMultipleEvents, etc.)

-bob


Re: [Python-Dev] Twisted Isn't Specific (was Re: Trial balloon: microthreads library in stdlib)

2007-02-15 Thread Bob Ippolito
On 2/15/07, Baptiste Carvello [EMAIL PROTECTED] wrote:
  Ah, threads :-( It turns out that you need to invoke GetMessage in the
  context of the thread in which the window was created. In a different
  thread, you won't get any messages.
 
  I'd be interested to hear about other situations where threading would
  cause a problem.  My suspicion is that Windows is the hard one, and as
  I've shown that one is solvable.
 
 
 I've tried something similar on Linux, with gtk an wx.

 You can run the gtk main loop in its own thread, but because gtk is not
 thread safe, you have to grab a mutex every time you run gtk code outside
 the thread the mainloop is running in. So you have to surround your calls
 to the gtk api with calls to gtk.threads_enter and gtk.threads_leave.
 Except for callbacks of course, because they are executed in the main
 thread... Doable, but not fun.

 The same goes for wx. Then all hell breaks loose when you try to use both
 gtk and wx at the same time. That's because on Linux, the wx main loop
 calls the gtk mainloop behind the scenes. As far as I know, that problem
 cannot be solved from Python.

 So yes that strategy can work, but it's no silver bullet.

And it's worse on Windows and Mac OS X where some GUI API calls *must*
happen on a particular thread or they simply don't work.

-bob


Re: [Python-Dev] multiple interpreters and extension modules

2006-12-22 Thread Bob Ippolito
On 12/23/06, Jeremy Kloth [EMAIL PROTECTED] wrote:
 On Friday 22 December 2006 5:02 pm, Josiah Carlson wrote:
  Jeremy Kloth [EMAIL PROTECTED] wrote:
   [[ This may be somewhat c.l.p.-ish but I feel that this crossed into
   CPython development enough to merit posting here ]]
  
   I have received a bug report for 4Suite that involves a
   PyObject_IsInstance check failing for what appears to be the correct
   class, that is, the class names match.  With some investigating, I have
   found that the true problem is with multiple interpreters.  The reason
   for this is that each sub-interpreter has a new copy of any pure Python
   module. The following issues are also true for modules that have been
   reloaded, but I think that is common knowledge.  I mention it only for
   completeness.
 
  If I remember correctly, Python allows you to use multiple interpreters
  in the same process, but it makes no guarantees as to their correctness
  when running.
 
  See this post for further discussion on the issue:
  http://mail.python.org/pipermail/python-list/2004-January/244343.html
 
  You can also search for 'multiple python interpreters single process' in
  google without quotes to hear people lament over the (generally broken)
  multiple Python interpreter support.

 The problem here is that it is mod_python using the multiple interpreters.  We
 have no control over that.  What I'm proposing is fixing the extension module
 support for multiple interpreters with the bonus of adding extension module
 finalization which I've seen brought up here before.

 Fixing this does require support by the extension module author, but if that
 author doesn't feel the need to work in mod_python (if, of course, they load
 module level constants), that is their loss.

 Is 4Suite that different in its use of hybrid Python and C extensions?  There
 is lots of back and forth between the two layers and performance is critical.
 I really don't feel like recoding thousands of lines of Python code into C
 just to get 4Suite to work in mod_python without error.

It's a whole lot more practical to just stop using mod_python and go
for one of the other ways of exposing Python code to the internet. I
bet you can get the same or better performance out of another solution
anyway, and you'd save deployment headaches.

-bob


Re: [Python-Dev] multiple interpreters and extension modules

2006-12-22 Thread Bob Ippolito
On 12/23/06, Jeremy Kloth [EMAIL PROTECTED] wrote:
 On Friday 22 December 2006 7:54 pm, Bob Ippolito wrote:
  It's a whole lot more practical to just stop using mod_python and go
  for one of the other ways of exposing Python code to the internet. I
  bet you can get the same or better performance out of another solution
  anyway, and you'd save deployment headaches.

 I have no control over end-users choice of Python/webserver integration, I
 just end up making it possible to run our software in the environment of
 *their* choice.

 If it is the opinion that it is mod_python that is broken, I'd gladly point
 the users to the location stating that fact/belief.  It would make my life
 easier.

Well, it clearly is broken wrt pure python modules and objects that
persist across requests. I believe that it's also broken with any
extension that uses the PyGILState API due to the way it interacts
with multiple interpreters.

I stopped using mod_python years ago due to the sorts of issues that
you're bringing up here (plus problems compiling, deploying, RAM
bloat, etc.). I don't have any recent experience or references that I
can point you to, but I can definitely say that I have had many good
experiences with the WSGI based solutions (and Twisted, but that's a
different game).

I would at least advise your user that there are several perfectly
good ways to make Python speak HTTP, and mod_python is the only one
with this issue.
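
For reference, the smallest possible WSGI application; any WSGI server,
standalone or behind Apache, can host it without mod_python's
interpreter juggling (wsgiref has been in the stdlib since Python 2.5):

    from wsgiref.simple_server import make_server

    def app(environ, start_response):
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'Hello from WSGI\n']

    make_server('', 8000, app).serve_forever()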

-bob


Re: [Python-Dev] Non-blocking (asynchronous) timer without thread?

2006-12-22 Thread Bob Ippolito
On 12/23/06, Evgeniy Khramtsov [EMAIL PROTECTED] wrote:
 Mike Klaas writes:

  I'm not sure how having python execute code at an arbitrary time would
  _reduce_ race conditions and/or deadlocks. And if you want to make it
  safe by executing code that shares no variables or resources, then it
  is no less safe to use threads, due to the GIL.
 
 Ok. And what about the huge thread overhead? Just try to start 10-50k
 threading timers :)

  If you can write you application in an event-driven way, Twisted might
  be able to do what you are looking for.

 I don't like the idea of Twisted: you want the banana, but get the whole
 gorilla as well :)

Well, you simply can't do what you propose without writing code in the
style of Twisted, or with interpreter modifications or evil stack
slicing such as with stackless or greenlet. If you aren't willing to
choose any of those then you'll have to live without that
functionality or use another language (though I can't think of any
usable ones that actually safely do what you're asking). It should be
relatively efficient to do what you want with a thread pool: one
thread that manages all of the timers, and worker threads to execute
the timer callbacks, as in the sketch below.
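
A minimal sketch of that arrangement (all names illustrative): one
manager thread owns a heap of pending timers and hands due callbacks to
a small worker pool, so 10-50k pending timers cost a heap entry each
rather than a thread each:

    import heapq
    import itertools
    import threading
    import time
    from queue import Queue

    class Timers:
        def __init__(self, workers=4):
            self._heap = []                # (deadline, seq, callback)
            self._seq = itertools.count()  # tie-break equal deadlines
            self._cv = threading.Condition()
            self._due = Queue()
            threading.Thread(target=self._manage, daemon=True).start()
            for _ in range(workers):
                threading.Thread(target=self._work, daemon=True).start()

        def call_later(self, delay, callback):
            with self._cv:
                heapq.heappush(self._heap, (time.monotonic() + delay,
                                            next(self._seq), callback))
                self._cv.notify()          # wake the manager to re-check

        def _manage(self):
            while True:
                with self._cv:
                    while not self._heap or self._heap[0][0] > time.monotonic():
                        wait = (self._heap[0][0] - time.monotonic()
                                if self._heap else None)
                        self._cv.wait(wait)
                    _, _, callback = heapq.heappop(self._heap)
                self._due.put(callback)    # workers run it, not the manager

        def _work(self):
            while True:
                self._due.get()()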

FWIW, Erlang doesn't have that functionality. You can wait on messages
with a timeout, but there are no interrupts. You do have cheap and
isolated processes instead of expensive shared state threads, though.
Writing Erlang/OTP code is actually a lot closer to writing
Twisted-style code than it is to other styles of concurrency (that
you'd find in Python). It's just that Erlang/OTP has better support
for concurrency-oriented programming than Python does (across the
board: syntax, interpreter, convention and libraries).

-bob


Re: [Python-Dev] infinities

2006-11-26 Thread Bob Ippolito
On 11/26/06, tomer filiba [EMAIL PROTECTED] wrote:
 i found several places in my code where i use positive infinity
 (posinf) for various things, i.e.,

 def readline(self, limit = -1):
     if limit < 0:
         limit = 1e10000 # posinf
     chars = []
     while limit > 0:
         ch = self.read(1)
         chars.append(ch)
         if not ch or ch == "\n":
             break
         limit -= 1
     return "".join(chars)

 i like the concept, but i hate the 1e10000 stuff... why not add
 posinf, neginf, and nan to the float type? i find it much more readable as:

 if limit < 0:
     limit = float.posinf

 posinf, neginf and nan are singletons, so there's no problem with
 adding them as members to the type.

sys.maxint makes more sense there. Or you could change it to "while
limit != 0" and set it to -1 (though I probably wouldn't actually do
that)...

There is already a PEP 754 for float constants, which is implemented
in the fpconst module (see CheeseShop). It's not (yet) part of the
stdlib though.
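
A small aside for later readers: modern Pythons can spell these
constants directly, which removes the need for both the 1e10000 trick
and fpconst:

    posinf = float('inf')
    neginf = float('-inf')
    nan = float('nan')

    assert posinf > 10 ** 100 and neginf < -(10 ** 100)
    assert nan != nan  # NaN compares unequal even to itself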

-bob


Re: [Python-Dev] Problem building module against Mac Python 2.4 and Python 2.5

2006-10-15 Thread Bob Ippolito
On 10/15/06, Barry Scott [EMAIL PROTECTED] wrote:
 This may be down to my lack of knowledge of Mac OS X development.

 I want to build my python extension for Python 2.3, 2.4 and 2.5 on
 the same Mac.

 Building against Python 2.3 and Python 2.4 has been working well for a
 long time. But after I installed Python 2.5 it seems that I can no
 longer link against Python 2.4 without changing the symlink
 /Library/Frameworks/Python.framework/Versions/Current to point at the
 one I want to build against.

 The problem did not arise with Python 2.3 and Python 2.4, because
 Python 2.3 is in /System/Library and Python 2.4 is in /Library.
 Telling ld which framework folder to look in allows both to be linked
 against.

 Is there a way to force ld to use a particular version of the Python
 framework, or do I have to change the symlink each time I build
 against a different version?

 This type of problem does not happen on Windows or Unix, by design.

Use an absolute path to the library rather than -framework.

Or use distutils!
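
A sketch of the distutils route (module and file names illustrative):
each interpreter's own distutils links against the right framework, so
the Current symlink never enters into it:

    # setup.py -- build once per interpreter you target, e.g.
    #   /Library/Frameworks/Python.framework/Versions/2.4/bin/python setup.py build
    #   /Library/Frameworks/Python.framework/Versions/2.5/bin/python setup.py build
    from distutils.core import setup, Extension

    setup(
        name='example',
        version='1.0',
        ext_modules=[Extension('example', sources=['example.c'])],
    )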

-bob


Re: [Python-Dev] 2.3.6 for the unicode buffer overrun

2006-10-13 Thread Bob Ippolito
On 10/13/06, Anthony Baxter [EMAIL PROTECTED] wrote:
 On Friday 13 October 2006 16:59, Fredrik Lundh wrote:
  yeah, but *you* are doing it.  if the server did that, Martin and
  other trusted contributors could upload the files as soon as they're
  available, instead of first transferring them to you, and then waiting
  for you to find yet another precious time slot to spend on this release.

 Sure - I get that. There's a couple of reasons for me doing it. First is gpg
 signing the release files, which has to happen on my local machine. There's
 also the variation in who actually builds the releases; at least one of the
 Mac builds was done by Bob I. But there could be ways around this. I don't
 want to have to ensure every builder has scp, and I'd also prefer for it to
 all go live at once. A while back, the Mac installer would follow up some
 time after the Windows and source builds. Every release, I'd get emails
 saying "where's the mac build?!"

With most consumer connections it's a lot faster to download than to
upload. Perhaps it would save you a few minutes if the contributors
uploaded directly to the destination (or to some other fast server)
and you could download and sign it, rather than having to scp it back
up somewhere from your home connection.

To be fair, (thanks to Ronald) the Mac build is entirely automated by
a script with the caveat that you should be a little careful about
what your environment looks like (e.g. don't install fink or macports,
or move them out of the way when building). It downloads all of the
third party dependencies, builds them with some special flags to make
it universal, builds Python, and then wraps it up in an installer
package.

Given any Mac OS X 10.4 machine, the builds could happen
automatically. Apple could probably provide one if someone asked. They
did it for Twisted. Or maybe the Twisted folks could appropriate part
of that machine's time to also build Python.

-bob
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as .join(x) idiom

2006-10-06 Thread Bob Ippolito
On 10/6/06, Fredrik Lundh [EMAIL PROTECTED] wrote:
 Ron Adam wrote:

  I think what may be missing is a larger set of higher level string functions
  that will work with lists of strings directly.  Then lists of strings can be
  thought of as a mutable string type by its use, and then working with 
  substrings
  in lists and using ''.join() will not seem as out of place.

 as important is the observation that you don't necessarily have to join
 string lists; if the data ends up being sent over a wire or written to
 disk, you might as well skip the join step, and work directly from the list.

 (it's no accident that ET has grown tostringlist and fromstringlist
 functions, for example ;-)

The "just make lists" paradigm is used by Erlang too; it's called an
"iolist" there (it's not a type, just a convention). The lists can be
nested though, so concatenating chunks of data for IO is always a
constant time operation even if the chunks are already iolists.
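
A rough Python sketch of the same idea (hedged: the helper below is
hypothetical, and a real version would check the buffer protocol
rather than just str):

    import sys

    def write_iolist(write, data):
        # walk nested lists and emit each chunk; nothing is ever joined
        if isinstance(data, str):
            write(data)
        else:
            for item in data:
                write_iolist(write, item)

    chunks = ['<a>', ['hello', ' ', 'world'], '</a>']  # appends stay O(1)
    write_iolist(sys.stdout.write, chunks)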

-bob
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Caching float(0.0)

2006-09-30 Thread Bob Ippolito
On 9/30/06, Terry Reedy [EMAIL PROTECTED] wrote:

 Nick Coghlan [EMAIL PROTECTED] wrote in message
 news:[EMAIL PROTECTED]
 I suspect the problem would typically stem from floating point values that
 are
 read in from a human-readable file rather than being the result of a
 'calculation' as such:

 For such situations, one could create a translation dict for both common
 float values and for non-numeric missing value indicators.  For instance,
 flotran = {'*': None, '1.0':1.0, '2.0':2.0, '4.0':4.0}
 The details, of course, depend on the specific case.

But of course you have to know that common float values are never
cached and that it may cause you problems. Some users may expect them
to be because common strings and integers are cached.
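
A hedged sketch of the translation-dict idea from the quoted text,
generalized into a small interning cache (the function name is made
up; nothing like this exists in the stdlib):

    _float_cache = {}

    def intern_float(text):
        # reuse a single float object per distinct source string
        try:
            return _float_cache[text]
        except KeyError:
            value = _float_cache[text] = float(text)
            return value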

-bob
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Tix not included in 2.5 for Windows

2006-09-30 Thread Bob Ippolito
On 9/30/06, Scott David Daniels [EMAIL PROTECTED] wrote:
 Christos Georgiou wrote:
  Does anyone know why this happens? I can't find any information pointing to
  this being deliberate.
 
  I just upgraded to 2.5 on Windows (after making sure I can build extensions
  with the freeware VC++ Toolkit 2003) and some of my programs stopped
  operating. I saw in a French forum that someone else had the same problem,
  and what they did was to copy the relevant files from a 2.4.3 installation.
  I did the same, and it seems it works, with only a console message appearing
  as soon as a root window is created:

 Also note: the Os/X universal seems to include a Tix runtime for the
 non-Intel processor, but not for the Intel processor.  This
 makes me think there is a build problem.

Are you sure about that? What file are you referring to specifically?

-bob
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Tix not included in 2.5 for Windows

2006-09-30 Thread Bob Ippolito
On 9/30/06, Scott David Daniels [EMAIL PROTECTED] wrote:
 Bob Ippolito wrote:
  On 9/30/06, Scott David Daniels [EMAIL PROTECTED] wrote:
  Christos Georgiou wrote:
  Does anyone know why this happens? I can't find any information pointing 
  to
  this being deliberate.
  Also note: the Os/X universal seems to include a Tix runtime for the
  non-Intel processor, but not for the Intel processor.  This
  makes me think there is a build problem.
 
  Are you sure about that? What file are you referring to specifically?

 OK, from the 2.5 universal: (hand-typed, I e-mail from another machine)


 === Using Idle ===
 >>> import Tix
 >>> Tix.Tk()

 Traceback (most recent call last):
   File "<pyshell#8>", line 1, in <module>
     Tix.Tk()
   File "/Library/Frameworks/Python.framework/Versions/2.5/
     lib/python2.5/lib-tk/Tix.py", line 210, in __init__
     self.tk.eval('package require Tix')
 TclError: no suitable image found.  Did find:
   /Library/Tcl/Tix8.4/libTix8.4.dylib: mach-o, but wrong architecture.

 === From the command line ===

 >>> import Tix
 >>> Tix.Tk()

 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/Library/Frameworks/Python.framework/Versions/2.5/
     lib/python2.5/lib-tk/Tix.py", line 210, in __init__
     self.tk.eval('package require Tix')
 _tkinter.TclError: no suitable image found.  Did find:
   /Library/Tcl/Tix8.4/libTix8.4.dylib: mach-o, but wrong architecture.

Those files are not distributed with Python.

-bob
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Caching float(0.0)

2006-09-29 Thread Bob Ippolito
On 9/29/06, Greg Ewing [EMAIL PROTECTED] wrote:
 Nick Craig-Wood wrote:

  Is there any reason why float() shouldn't cache the value of 0.0 since
  it is by far and away the most common value?

 1.0 might be another candidate for caching.

 Although the fact that nobody has complained about this
 before suggests that it might not be a frequent enough
 problem to be worth the effort.

My guess is that people do have this problem, they just don't know
where that memory has gone. I know I don't count objects unless I have
a process that's leaking memory or it grows so big that I notice (by
swapping or chance).

That said, I've never noticed this particular issue.. but I deal with
mostly strings. I have had issues with the allocator a few times that
I had to work around, but not this sort of issue.

-bob
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] weakref enhancements

2006-09-28 Thread Bob Ippolito
On 9/28/06, Raymond Hettinger [EMAIL PROTECTED] wrote:
 [Alex Martelli]

 I've had use cases for weakrefs to bound methods (and there IS a
 Cookbook recipe for them),
 
 Weakmethods make some sense (though they raise the question of why bound
 methods are being kept when the underlying object is no longer in use --
 possibly as unintended side-effect of aggressive optimization).

There are *definitely* use cases for keeping bound methods around.

Contrived example:

one_of = set([1,2,3,4]).__contains__
filter(one_of, [2,4,6,8,10])

-bob
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] weakref enhancements

2006-09-28 Thread Bob Ippolito
On 9/28/06, Raymond Hettinger [EMAIL PROTECTED] wrote:
  There are *definitely* use cases for keeping bound methods around.
 
  Contrived example:
 
 one_of = set([1,2,3,4]).__contains__
 filter(one_of, [2,4,6,8,10])

 ISTM, the example shows the (undisputed) utility of regular bound methods.

 How does it show the need for methods bound weakly to the underlying object,
 where the underlying can be deleted while the bound method persists, alive but
 unusable?

It doesn't. I seem to have misinterpreted your "Weakmethods have some
use (...)" sentence. Sorry for the noise.

-bob
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Suggestion for a new built-in - flatten

2006-09-22 Thread Bob Ippolito
On 9/22/06, Josiah Carlson [EMAIL PROTECTED] wrote:

 Michael Foord [EMAIL PROTECTED] wrote:
 
  Hello all,
 
  I have a suggestion for a new Python built in function: 'flatten'.

 This has been brought up many times.  I'm -1 on its inclusion, if only
 because it's a fairly simple 9-line function (at least the trivial
 version I came up with), and not all X-line functions should be in the
 standard library.  Also, while I have had need for such a function in
 the past, I have found that I haven't needed it in a few years.

I think instead of adding a flatten function perhaps we should think
about adding something like Erlang's iolist support. The idea is
that methods like writelines should be able to take nested iterators
and consume any object they find that implements the buffer protocol.

-bob
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Suggestion for a new built-in - flatten

2006-09-22 Thread Bob Ippolito
On 9/22/06, Brian Harring [EMAIL PROTECTED] wrote:
 On Fri, Sep 22, 2006 at 12:05:19PM -0700, Bob Ippolito wrote:
  On 9/22/06, Josiah Carlson [EMAIL PROTECTED] wrote:
  
   Michael Foord [EMAIL PROTECTED] wrote:
   
Hello all,
   
I have a suggestion for a new Python built in function: 'flatten'.
  
   This has been brought up many times.  I'm -1 on its inclusion, if only
   because it's a fairly simple 9-line function (at least the trivial
   version I came up with), and not all X-line functions should be in the
   standard library.  Also, while I have had need for such a function in
   the past, I have found that I haven't needed it in a few years.
 
  I think instead of adding a flatten function perhaps we should think
  about adding something like Erlang's iolist support. The idea is
  that methods like writelines should be able to take nested iterators
  and consume any object they find that implements the buffer protocol.

 Which is no different than just passing in a generator/iterator that
 does flattening.

 Don't much see the point in gumming up the file protocol with this
 special casing; still will have requests for a flattener elsewhere.

 If flattening was added, should definitely be a general obj, not a
 special casing in one method in my opinion.

I disagree, the reason for iolist is performance and convenience; the
required indirection of having to explicitly call a flattener function
removes some optimization potential and makes it less convenient to
use.

While there certainly should be a general mechanism available to
perform the task (easily accessible from C), the user would be better
served by not having to explicitly call itertools.iterbuffers every
time they want to write recursive iterables of stuff.

-bob
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Suggestion for a new built-in - flatten

2006-09-22 Thread Bob Ippolito
On 9/22/06, Josiah Carlson [EMAIL PROTECTED] wrote:

 Bob Ippolito [EMAIL PROTECTED] wrote:
  On 9/22/06, Brian Harring [EMAIL PROTECTED] wrote:
   On Fri, Sep 22, 2006 at 12:05:19PM -0700, Bob Ippolito wrote:
I think instead of adding a flatten function perhaps we should think
about adding something like Erlang's iolist support. The idea is
that methods like writelines should be able to take nested iterators
and consume any object they find that implements the buffer protocol.
  
   Which is no different than just passing in a generator/iterator that
   does flattening.
  
   Don't much see the point in gumming up the file protocol with this
   special casing; still will have requests for a flattener elsewhere.
  
   If flattening was added, should definitely be a general obj, not a
   special casing in one method in my opinion.
 
  I disagree, the reason for iolist is performance and convenience; the
  required indirection of having to explicitly call a flattener function
  removes some optimization potential and makes it less convenient to
  use.

 Sorry Bob, but I disagree.  In the few times where I've needed to 'write
 a list of buffers to a file handle', I find that iterating over the
 buffers to be sufficient.  And honestly, in all of my time dealing
 with socket and file IO, I've never needed to write a list of iterators
 of buffers.  Not to say that YAGNI, but I'd like to see an example where
 1) it was being used in the wild, and 2) where it would be a measurable
 speedup.

The primary use for this is structured data, mostly file formats,
where you can't write the beginning until you have a bunch of
information about the entire structure such as the number of items or
the count of bytes when serialized. An efficient way to do that is
just to build a bunch of nested lists that you can use to calculate
the size (iolist_size(...) in Erlang) instead of having to write a
visitor that constructs a new flat list or writes to StringIO first. I
suppose in the most common case, for performance reasons, you would
want to restrict this to sequences only (as in PySequence_Fast)
because iolist_size(...) should be non-destructive (or else it has to
flatten into a new list anyway).
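
A minimal sketch of that size calculation (hedged: the name is
borrowed from Erlang, and the str check stands in for a real
buffer-protocol check):

    def iolist_size(data):
        # non-destructive: walks nested sequences, consumes no iterators
        if isinstance(data, str):
            return len(data)
        return sum(iolist_size(item) for item in data)

    assert iolist_size(['ab', ['cd', ['e']]]) == 5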

I've definitely done this before in Python, most recently here:
http://svn.red-bean.com/bob/flashticle/trunk/flashticle/

The flatten function in this case is flashticle.util.iter_only, and
it's used in flashticle.actions, flashticle.amf, flashticle.flv,
flashticle.swf, and flashticle.remoting.

-bob
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] python, lipo and the future?

2006-09-17 Thread Bob Ippolito
On 9/17/06, Martin v. Löwis [EMAIL PROTECTED] wrote:
 Josiah Carlson schrieb:
  Martin v. Löwis [EMAIL PROTECTED] wrote:
  Out of curiosity: how do the current universal binaries deal with this
  issue?
 
  If I remember correctly, usually you do two completely independant
  compile runs (optionally on the same machine with different configure or
  macro definitions, then use a packager provided by Apple to merge the
  results for each binary/so to be distributed. Each additional platform
  would just be a new compile run.

Sometimes this is done, but usually people just use CC="cc -arch i386
-arch ppc". Most of the time that Just Works, unless the source
depends on autoconf gunk for endianness related issues.

 It's true that the compiler is invoked twice, however, I very much doubt
 that configure is run twice. Doing so would cause the Makefile being
 regenerated, and the build starting from scratch. It would find the
 object files from the previous run, and either all overwrite them, or
 leave them in place.

 The gcc driver on OSX allows to invoke cc1/as two times, and then
 combines the resulting object files into a single one (not sure whether
 or not by invoking lipo).


That's exactly what it does. The gcc frontend ensures that cc1/as is
invoked exactly as many times as there are -arch flags, and the result
is lipo'ed together. This also means that you get to see a copy of all
warnings and errors for each -arch flag.

-bob
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More tracker demos online

2006-08-05 Thread Bob Ippolito

On Aug 5, 2006, at 4:52 AM, Hernan M Foffani wrote:

 Currently, we have two running tracker demos online:

 Roundup:
 http://efod.se/python-tracker/

 Jira:
 http://jira.python.atlassian.com/secure/Dashboard.jspa


 Is anyone looking at the Google Code Hosting tracker, just for  
 fun? =)  (
 code.google.com/hosting, although performance seems to be an issue  
 for now)

 It's proprietary code, isn't it?
 http://code.google.com/hosting/faq.html#itselfoss
 (Is that why you said "just for fun"?)

So is Jira...

-bob

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-04 Thread Bob Ippolito

On Aug 3, 2006, at 9:34 PM, Josiah Carlson wrote:


 Bob Ippolito [EMAIL PROTECTED] wrote:
 On Aug 3, 2006, at 6:51 PM, Greg Ewing wrote:

 M.-A. Lemburg wrote:

 Perhaps we ought to add an exception to the dict lookup mechanism
 and continue to silence UnicodeErrors ?!

 Seems to be that comparison of unicode and non-unicode
 strings for equality shouldn't raise exceptions in the
 first place.

 Seems like a slightly better idea than having dictionaries suppress
 exceptions. Still not ideal though because sticking non-ASCII strings
 that are supposed to be text and unicode in the same data structures
 is *probably* still an error.

 If/when 'python -U -c import test.testall' runs without unexpected
 error (I doubt it will happen prior to the all strings are unicode
 conversion), then I think that we can say that there aren't any
 use-cases for strings and unicode being in the same dictionary.

 As an alternate idea, rather than attempting to .decode('ascii') when
 strings and unicode compare, why not .decode('latin-1')?  We lose the
 unicode decoding error, but the right thing happens (in my opinion)
 when u'\xa1' and '\xa1' compare.

Well, in this case it would cause different behavior if u'\xa1' and  
'\xa1' compared equal. It'd just be an even more subtle error.

-bob

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Dicts are broken Was: unicode hell/mixing str and unicode asdictionarykeys

2006-08-04 Thread Bob Ippolito

On Aug 4, 2006, at 12:51 PM, Giovanni Bajo wrote:

 Paul Colomiets [EMAIL PROTECTED] wrote:

 Well it's not recommended to mix strings and unicode in
 dictionaries, but if we mix for example integer and float we have the
 same thing. It doesn't raise an exception but still it is not expected
 behavior for me:
 d = { 1.0: 10, 2.0: 20 }
 then if i somewhere later do:
 d[1] = 100
 d[2] = 200
 and I end up with all floats in d.keys(). Maybe this is not the best
 example.

 There is a strong difference. Python is moving towards unifying  
 number types in
 a way (see the true division issue): the idea is that, all in all,  
 user
 shouldn't really care what type a number is, as long as he knows  
 it's a number.
 On the other hand, unicode and str are going to diverge more and more.

Well, not really. True division makes int/int return float instead of  
an int. You really do have to care if you have an int or a float most  
of the time, they're very different semantically.

Unicode and str are eventually going to be the same thing (str would  
ideally end up becoming a synonym of unicode). The difference being  
that there will be some other type to contain bytes.
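
To make the contrast concrete: ints and floats already compare and
hash equal, so mixing them as dict keys merges entries silently
(standard CPython behavior, shown here for the quoted example):

    d = {1.0: 10, 2.0: 20}
    d[1] = 100                 # replaces the 1.0 entry; the key stays 1.0
    d[2] = 200
    print d                    # {1.0: 100, 2.0: 200}
    assert 1 == 1.0 and hash(1) == hash(1.0)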

-bob

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-03 Thread Bob Ippolito

On Aug 3, 2006, at 9:51 AM, M.-A. Lemburg wrote:

 Ralf Schmitt wrote:
 Ralf Schmitt wrote:
 Still trying to port our software. Here's another thing I noticed:

 d = {}
 d[u'm\xe1s'] = 1
 d['m\xe1s'] = 1
 print d

 With python 2.4 I can add those two keys to the dictionary and get:
 $ python2.4 t2.py
 {u'm\xe1s': 1, 'm\xe1s': 1}

 With python 2.5 I get:

 $ python2.5 t2.py
 Traceback (most recent call last):
   File "t2.py", line 3, in <module>
     d['m\xe1s'] = 1
 UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in
 position 1: ordinal not in range(128)

 Is this intended behaviour? I guess this might break lots of  
 programs
 and the way python 2.4 works looks right to me.
 I think it should be possible to mix str/unicode keys in dicts  
 and let
 non-ascii strings compare not-equal to any unicode string.

 Also this behaviour makes your programs break randomly, that is,  
 it will
 break when the string you add hashes to the same value that the  
 unicode
 string has (at least that's what I guess..)

 This is because Unicode and 8-bit string keys only work
 in the same way if and only if they are plain ASCII.

 The reason lies in the hash function used by Unicode: it is
 crafted to make hash(u) == hash(s) for all ASCII s, such
 that s == u.

 For non-ASCII strings, there are no guarantees as to the
 hash value of the strings or whether they match or not.

 This has been like that since Unicode was introduced, so it's
 not new in Python 2.5.

What is new is that the exception raised on u == s after hash  
collision is no longer silently swallowed.

-bob

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-03 Thread Bob Ippolito

On Aug 3, 2006, at 6:51 PM, Greg Ewing wrote:

 M.-A. Lemburg wrote:

 Perhaps we ought to add an exception to the dict lookup mechanism
 and continue to silence UnicodeErrors ?!

 Seems to be that comparison of unicode and non-unicode
 strings for equality shouldn't raise exceptions in the
 first place.

Seems like a slightly better idea than having dictionaries suppress  
exceptions. Still not ideal though because sticking non-ASCII strings  
that are supposed to be text and unicode in the same data structures  
is *probably* still an error.

-bob

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] struct module and coercing floats to integers

2006-08-02 Thread Bob Ippolito

On Jul 28, 2006, at 1:35 PM, Bob Ippolito wrote:

 It seems that the pre-2.5 struct module has some additional
 undocumented behavior[1] that didn't percolate into the new version:
 http://python.org/sf/1530559

 Python 2.4 and previous will coerce floats to integers when necessary
 as such without any kind of complaint:

 $ python2.4 -c "import struct; print repr(struct.pack('H', 0.))"
 '\x00\x00'

 Python 2.5 refuses to coerce float to int:

 $ python2.5 -c "import struct; print repr(struct.pack('H', 0.))"
 Traceback (most recent call last):
   File "<string>", line 1, in <module>
   File "/Users/bob/src/python/Lib/struct.py", line 63, in pack
     return o.pack(*args)
 TypeError: unsupported operand type(s) for &: 'float' and 'long'

 The available options are to:

 1. Reinstate the pre-2.5 weirdness
 2. Reinstate the pre-2.5 weirdness with a DeprecationWarning
 3. Break existing code that relies on undocumented behavior (seems
 more like a bug than lack of specification)

There's a patch in the tracker for 2. It should get applied when the  
trunk freeze is over.

-bob

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] struct module and coercing floats to integers

2006-07-28 Thread Bob Ippolito
It seems that the pre-2.5 struct module has some additional  
undocumented behavior[1] that didn't percolate into the new version:  
http://python.org/sf/1530559

Python 2.4 and previous will coerce floats to integers when necessary  
as such without any kind of complaint:

$ python2.4 -c "import struct; print repr(struct.pack('H', 0.))"
'\x00\x00'

Python 2.5 refuses to coerce float to int:

$ python2.5 -c "import struct; print repr(struct.pack('H', 0.))"
Traceback (most recent call last):
   File "<string>", line 1, in <module>
   File "/Users/bob/src/python/Lib/struct.py", line 63, in pack
     return o.pack(*args)
TypeError: unsupported operand type(s) for &: 'float' and 'long'

The available options are to:

1. Reinstate the pre-2.5 weirdness
2. Reinstate the pre-2.5 weirdness with a DeprecationWarning
3. Break existing code that relies on undocumented behavior (seems  
more like a bug than lack of specification)

Either 2 or 3 seems reasonable to me, with a preference for 3 because  
none of my code depends on old bugs in the struct module :)

As far as precedent goes, the array module *used* to coerce floats  
silently, but it's had a DeprecationWarning since at least Python 2.3  
(but perhaps even earlier). Maybe it's time to promote that warning  
to an exception for Python 2.5?

[1] The pre-2.5 behavior should really be considered a bug; the
documentation says "Return a string containing the values v1, v2, ...
packed according to the given format. The arguments must match the
values required by the format exactly." I wouldn't consider arbitrary
floating point numbers to match the value required by an integer
format exactly. Floats are not in general interchangeable with
integers in Python anyway (e.g. list indexes, etc.).
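
A hedged Python-level sketch of what option 2 amounts to (the real
change belongs in _struct.c; the helper and its warning message here
are illustrative only):

    import warnings

    def _coerce_int_arg(value):
        # accept a float for an integer format, but complain loudly
        if isinstance(value, float):
            warnings.warn("struct: integer argument expected, got float",
                          DeprecationWarning)
            return int(value)
        return value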

-bob

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Release manager pronouncement needed: PEP 302 Fix

2006-07-27 Thread Bob Ippolito

On Jul 27, 2006, at 3:52 AM, Georg Brandl wrote:

 Armin Rigo wrote:
 Hi Phillip,

 On Wed, Jul 26, 2006 at 02:40:27PM -0400, Phillip J. Eby wrote:
 If we don't revert it, there are two ways to fix it.  One is to  
 just change
 PEP 302 so that the behavior is unbroken by definition.  :)  The  
 other is
 to actually go ahead and fix it by adding PathImporter and  
 NullImporter
 types to import.c, along with a factory function on  
 sys.path_hooks to
 create them.  (This would've been the PEP-compliant way to  
 implement the
 need-for-speed patch.)

 So, fix by documentation, fix by fixing, or fix by reverting?   
 Which
 should it be?

 "fix by changing the definition" looks like a bad idea to me.  The
 import logic is already extremely complicated and delicate, any  
 change
 to it is bound to break *some* code somewhere.

 Though beta1 and beta2 shipped with this change nobody reported any  
 bug that
 could be linked to it. sys.path_importer_cache is quite an internal  
 thing and
 most code, even import hooks, shouldn't have to deal with it.

Anyone trying to emulate what imp.find_module does in a PEP 302
compliant way will need to introspect sys.path_importer_cache. I have
some unreleased code based on the PEP 302 spec that does this, and the
way it was originally written would have broken in 2.5 if I had tested
it there.
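
For concreteness, a sketch of the kind of introspection involved
(hedged: this follows the PEP 302 text, not any particular shipped
implementation):

    import sys

    def find_pep302_loader(name, path_entries):
        # consult path_importer_cache first, falling back to path_hooks
        for entry in path_entries:
            importer = sys.path_importer_cache.get(entry)
            if importer is None and entry not in sys.path_importer_cache:
                for hook in sys.path_hooks:
                    try:
                        importer = hook(entry)
                        break
                    except ImportError:
                        continue
                sys.path_importer_cache[entry] = importer
            if importer is not None:
                loader = importer.find_module(name)
                if loader is not None:
                    return loader
        return None  # fall back to the built-in import machinery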

Just because it's obscure doesn't mean we should go change how things  
work in a way that's not consistent with the documentation. The  
documentation should change to match the code or vice versa, though I  
really don't have any strong feelings one way or the other.

-bob

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] JSON implementation in Python 2.6

2006-07-26 Thread Bob Ippolito

On Jul 26, 2006, at 3:18 PM, John J Lee wrote:

 On Wed, 26 Jul 2006, Phillip J. Eby wrote:
 [...]
 Actually, I would see more reason to include JSON in the standard  
 library,
 since it's at least something approaching an internet protocol  
 these days.

 +1

If there's a consensus on that, my simplejson [1] implementation  
could migrate to the stdlib for 2.6.

The API is modeled after marshal and pickle, the code should be PEP 8  
compliant, its test suite has pretty good coverage, it's already used  
by (at least) TurboGears and Django, and it's the implementation  
currently endorsed by json.org.
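
The marshal/pickle-flavored surface in question is essentially
dumps/loads (a minimal usage sketch):

    import simplejson

    s = simplejson.dumps({'spam': [1, 2.5, None, True]})
    assert simplejson.loads(s) == {'spam': [1, 2.5, None, True]}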

The work that would be required would be:

- LaTeX docs (currently reST in docstrings)
- Move the tests around and make them run from the suite rather than  
via nose
- Possible module rename (jsonlib?)

[1] http://undefined.org/python/#simplejson

-bob

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] User's complaints

2006-07-17 Thread Bob Ippolito

On Jul 17, 2006, at 11:25 AM, Armin Rigo wrote:

 Hi Bob,

 On Thu, Jul 13, 2006 at 12:58:08AM -0700, Bob Ippolito wrote:
 @main
 def whatever():
 ...

 It would probably need to be called something else, because main is
 often the name of the main function...

 Ah, but there is theoretically no name clash here :-)

  @main # <- from the built-ins
  def main():   # <- and only then set the global
 ...


 Just-making-a-stupid-point-and-not-endorsing-the-feature-ly yours,

Of course it *works*, but it's still overriding a built-in... Who  
knows when assignment to main will become a SyntaxError like None ;)

-bob

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] User's complaints

2006-07-13 Thread Bob Ippolito

On Jul 13, 2006, at 12:37 AM, Wolfgang Langner wrote:

 On 7/13/06, Jeroen Ruigrok van der Werven [EMAIL PROTECTED] wrote:
 Things that struck me as peculiar is the old:

 if __name__ == "__main__":
     whatever()

 This is so out of tune with the rest of python it becomes a nuisance.

 It is not beautiful but very useful.
 In Python 3000 we can replace it with:

 @main
 def whatever():
 ...

 to mark this function as main function if module executed directly.

It would probably need to be called something else, because main is  
often the name of the main function... but you could write such a  
decorator now if you really wanted to.

def mainfunc(fn):
    if fn.func_globals.get('__name__') == '__main__':
        # ensure the function is in globals
        fn.func_globals[fn.__name__] = fn
        fn()
    return fn

@mainfunc
def main():
    print 'this is in __main__'

-bob

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] User's complaints

2006-07-13 Thread Bob Ippolito

On Jul 13, 2006, at 2:02 AM, Greg Ewing wrote:

 Jeroen Ruigrok van der Werven wrote:

 - Open classes would be nice.

 What do you mean by open classes? Python
 classes already seem pretty open to me, by
 the standards of other languages!

I'm guessing he's talking about being like Ruby or Objective-C where  
you can add methods to any other class in the runtime. Basically we'd  
have that if the built-in classes were mutable, but that just really  
encourages fragile code. The big problem you run into with open  
classes is that you end up depending on two libraries that have a  
different idea of what the foo method on string objects should do.

Adding open classes would make it easier to develop DSLs, but you'd  
only be able to reasonably do one per interpreter (unless you mangled  
the class in a with block or something).

-bob

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] User's complaints

2006-07-13 Thread Bob Ippolito

On Jul 13, 2006, at 5:02 AM, Jeroen Ruigrok van der Werven wrote:

 Hi Bob,

 On 7/13/06, Bob Ippolito [EMAIL PROTECTED] wrote:
 Adding open classes would make it easier to develop DSLs, but you'd
 only be able to reasonably do one per interpreter (unless you mangled
 the class in a with block or something).

 The person whose 'complaints' I was stating says that DSLs (Domain
 Specific Languages for those who, like me, were confused about the
 acronym) are a big part of what he is after and one per interpreter is
 fine by him. He also realises that the application(s) he needs them
 for might be unusual. He doesn't specifically need the builtin types
 to be extendable. It's just nice to be able to define a single class
 in multiple modules. Even C++ allows this to some extent (but not as
 much as he'd like).

 He understands the implications of allowing open classes (import vs.
 no import changes semantics, etc.). Personally, he doesn't care *too*
 much about newbie safety since he's not a newbie. To quote verbatim:
 "give me the big guns" :-)

 And while we're at it, he also stated: [...] add multiple dispatch to
 your list of improvements for Python.

 I hope this clarifies it a bit for other people.

Well, if this person really weren't a newbie then of course they'd  
know how to define a metaclass that can be used to extend a (non- 
built-in) class from another module. They'd probably also know of two  
or three different implementations of multiple dispatch (or  
equivalent, such as generic functions) available, and could probably  
write their own if they had to ;)
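
A hedged sketch of such a metaclass (one common recipe shape; the
names are invented and real variants differ in details):

    class Extend(type):
        # redefining a class with this metaclass merges new attributes
        # into the original instead of creating a replacement class
        registry = {}

        def __new__(meta, name, bases, namespace):
            if name in meta.registry:
                cls = meta.registry[name]
                for key, value in namespace.items():
                    if key not in ('__module__', '__metaclass__'):
                        setattr(cls, key, value)
                return cls
            cls = type.__new__(meta, name, bases, namespace)
            meta.registry[name] = cls
            return cls

    class Foo(object):           # first definition
        __metaclass__ = Extend
        def one(self): return 1

    class Foo(object):           # "reopened": adds two() to the same class
        __metaclass__ = Extend
        def two(self): return 2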

The only valid complaint, really, is that built-in classes are read- 
only. I doubt anyone wants to change that. If they want to write  
things in the style of Ruby, why not just use it?

-bob

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Community buildbots

2006-07-13 Thread Bob Ippolito

On Jul 13, 2006, at 1:53 PM, Giovanni Bajo wrote:

 [EMAIL PROTECTED] wrote:

 (Aside: IMHO, the sooner we can drop old-style classes entirely, the
 better.
 That is one bumpy Python upgrade process that I will be _very_ happy
 to do.

 I think python should have a couple more future imports. "from
 __future__ import new_classes" and "from __future__ import
 unicode_literals" would be really welcome, and would smooth the Py3k
 migration process.

"from __future__ import new_classes" exists, but the syntax is
different:

__metaclass__ = type

-bob

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Restricted execution: what's the threat model?

2006-07-12 Thread Bob Ippolito

On Jul 12, 2006, at 2:23 PM, Jim Jewett wrote:

 Ka-Ping Yee writes:

   A.  The interpreter will not crash no matter what Python code
   it is given to execute.

 Why?

 We don't want it to crash the embedding app (which might be another
 python interpreter), but if the sandboxed interpreter itself crashes,
 is that so bad?  The embedding app should just act as though that
 interpreter exited, possibly with a status code.

When he says "crash", I'd have to imagine that he means a crash of the
segfault variety. Good luck saving the embedding app after that.

   C.  Python programs running in different interpreters embedded
   in the same process cannot access each other's Python objects.

 Note that Brett's assumption of shared extension modules violates this
 -- but I'm not sure why he needs to assume that.  (Because of the
 init-only-once semantics, I'm not even sure it is a good idea to share
 them.)

Well if you don't share them, you can't have them at all other than  
in the main trusted interpreter. C extensions can only be safely  
initialized once and they often cache objects in static variables...  
lots of C modules aren't even safe to use when combined with multiple  
interpreters and threads (e.g. PyGILState API), so I guess that  
perhaps the C API should be refined anyway.

-bob



___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Musings on concurrency and scoping (replacing Javascript)

2006-07-07 Thread Bob Ippolito

On Jul 7, 2006, at 1:08 PM, Guido van Rossum wrote:

 On 7/7/06, Ka-Ping Yee [EMAIL PROTECTED] wrote:
 I've been doing a bunch of Firefox extension programming in  
 Javascript
 and suddenly a few of the recent topics here came together in my head
 in a silent kapow of thoughts.  This is kind of a side note to the
 security discussion, but they're all interconnected: network
 programming, concurrency, lexical scoping, security.

 Hm... I wonder if this style has become so popular in JS because it's
 all they have? I find callback-style programming pretty inscrutable
 pretty soon.

You really don't have any choice without continuations or some built- 
in concurrency primitive. Callbacks are slightly less painful in  
JavaScript because you can define them in-line instead of naming it  
first.

 Client-side web scripting tends to have a callback/continuation-ish
 concurrency style because it has to deal with network transactions
 (which can stall for long periods of time) in a user interface that
 is expected to stay always responsive.  The Firefox API is full of
 listeners/observers, events, and continuation-like things.  So one
 thing to consider is that, when Python is used for these purposes,
 it may be written in a specialized style.

 As i write JavaScript in this style i find i use nested functions
 a lot.  When i want to set up a callback that uses variables in the
 current context, the natural thing to do is to define a new function
 in the local namespace.  And if that function has to also provide a
 callback, then it has another function nested within it and so on.

 function spam() {
     var local_A = do_work();
     do_network_transaction(
         new function(result_1) {
             var local_B = do_work(result_1);
             do_network_transaction(
                 new function(result_2) {
                     do_work(local_A, local_B, result_1, result_2);
                     ...
                 }
             );
         }
     );
 }

 How can you ever keep track of when a '}' must be followed by a ';' ?

"}\n" is the same as "};" as far as the JavaScript spec goes; you can
do either or both.

-bob



___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Musings on concurrency and scoping (replacing Javascript)

2006-07-06 Thread Bob Ippolito

On Jul 6, 2006, at 5:04 PM, Ka-Ping Yee wrote:

 On Thu, 6 Jul 2006, Phillip J. Eby wrote:
 As much as I'd love to have the nested scope feature, I think it's  
 only
 right to point out that the above can be rewritten as something  
 like this
 in Python 2.5:

   def spam():
       local_A = do_work()
       result_1 = yield do_network_transaction()
       local_B = do_work(result_1)
       result_2 = yield do_network_transaction()
       do_work(local_A, local_B, result_1, result_2)
       ...

 All you need is an appropriate trampoline (possibly just a  
 decorator) that
 takes the objects yielded by the function, and uses them up to set up
 callbacks that resume the generator with the returned result.

 Clever!  Could you help me understand what goes on in
 do_network_transaction() when you write it this way?  In the
 Firefox/JavaScript world, the network transaction is fired off
 in another thread, and when it's done it posts an event back
 to the JavaScript thread, which triggers the callback.

 And what happens if you want to supply more than one continuation?
 In my JavaScript code i'm setting up two continuations per step --
 one for success and one for failure, since with a network you never
 know what might happen.

When you have a failure the yield expression raises an exception  
instead of returning a result.
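
A rough sketch of such a trampoline for Python 2.5 generators (hedged:
add_callbacks below is a stand-in for whatever the async result object
really provides, e.g. Twisted's addCallbacks):

    def trampoline(gen_func):
        # drive a generator: send() successes back in, throw() failures in
        def start(*args, **kwargs):
            gen = gen_func(*args, **kwargs)

            def resume(result=None, error=None):
                try:
                    if error is not None:
                        deferred = gen.throw(error)
                    else:
                        deferred = gen.send(result)
                except StopIteration:
                    return
                deferred.add_callbacks(resume,
                                       lambda err: resume(error=err))

            resume()
        return start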

-bob

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] zlib module build failure on Mac OSX 10.4.7

2006-07-01 Thread Bob Ippolito

On Jul 1, 2006, at 10:45 AM, Ronald Oussoren wrote:


 On Jul 1, 2006, at 6:57 PM, [EMAIL PROTECTED] wrote:


 Ronald> Are you sure you're building on a 10.4 box?  Both the
 Ronald> macosx-10.3 thingy and lack of inflateCopy seem to indicate
 Ronald> that you're running on 10.3.

 Well, yeah, pretty sure.  Let's see.  The box with the disk says  
 Mac OS X
 Tiger - Version 10.4 on the spine. The About This Mac popup says
 10.4.7.

 That gets the easy solution out of the way ;-)

   It used to run 10.3 though.  Is there some possibility the update
 from 10.3 to 10.4 had problems?

 Note that the compile log on the buildbot 10.4 box also has 10.3  
 in its
 directory names.  If I remember correctly, it came from Apple with  
 10.4
 installed.

 /me slaps head.

 Having 10.3 in the directory names is intentional, the version in  
 the directory name is the value of MACOSX_DEPLOYMENT_TARGET, with  
 is defaulted to 10.3 in the configure script.

 What  I don't understand yet is why your copy of libz doesn't have  
 inflateCopy. What does /usr/lib/libz.dylib point to on your system?  
 On my 10.4 box it is a symlink that points to libz.1.2.3.dylib and  
 there is an older version of libz (libz.1.1.3.dylib) in /usr/lib as  
 well.

Maybe Skip didn't upgrade to the latest version of Xcode? Perhaps  
he's still got an old SDK?

-bob

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] doc for new restricted execution design for Python

2006-06-28 Thread Bob Ippolito
On Jun 28, 2006, at 10:54 AM, Brett Cannon wrote:

 On 6/28/06, Trent Mick [EMAIL PROTECTED] wrote:
  Brett Cannon wrote:
   Mark (and me a little bit) has been sketching out creating a "Python
   for Mozilla/Firefox" extension for installing an embedded Python
   into an existing Firefox installation on the pyxpcom list:
   http://aspn.activestate.com/ASPN/Mail/Message/pyxpcom/3167613

   The idea is that there be a separate Python interpreter per web
   browser page instance.

  I think there may be scaling issues there. JavaScript isn't doing
  that is it, do you know? As well, that doesn't seem like it would
  translate well to sharing execution between separate chrome windows
  in a non-browser XUL/Mozilla-based app.

 I don't know how JavaScript is doing it yet.  The critical thing for
 me for this month was trying to come up with a security model.

 And if you don't think it is going to scale, how do you think it
 should be done?

Why wouldn't it scale? How much interpreter state is there really
anyway?

-bob
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] xturtle.py a replacement for turtle.py(!?) ATTENTION PLEASE!

2006-06-28 Thread Bob Ippolito

On Jun 28, 2006, at 1:05 PM, Gregor Lingl wrote:

 Martin v. Löwis schrieb:
 Collin Winter wrote:

 While I have no opinion on Gregor's app, and while I fully agree  
 that
 new language features and stdlib modules should generally stay  
 out of
 bug-fix point releases, xturtle doesn't seem to rise to that level
 (and hence, those restrictions).


 It's a stdlib module, even if no other stdlib modules depend on it;
 try import turtle.

 In the specific case, the problem with adding it to 2.5 is that  
 xturtle
 is a huge rewrite, so ideally, the code should be reviewed before  
 being
 added. Given that this is a lot of code, nobody will have the time to
 perform a serious review. It will be hard enough to find somebody to
 review it for 2.6 - often, changes of this size take several years to
 review (primarily because it is so specialized that only few people
 even consider reviewing it).

 Sorry Martin, but to me this seems not to be the right way to  
 manage things.
 We have turtle.py revised in Python2.5b1

 Please try this example (as I  just did) :

 IDLE 1.2b1      ==== No Subprocess ====
 >>> from turtle import *
 >>> begin_fill()
 >>> circle(100,90)  # observe the turtle
 >>> backward(200)
 >>> circle(100,90)
 >>> color("red")
 >>> end_fill()
 IDLE internal error in runcode()
 Traceback (most recent call last):
   File "<pyshell#6>", line 1, in <module>
     end_fill()
   File "C:\Python25\lib\lib-tk\turtle.py", line 724, in end_fill
     def end_fill(): _getpen.end_fill()
 AttributeError: 'function' object has no attribute 'end_fill'
 

 An error occurs, because in line 724 it should read
 def end_fill(): _getpen().end_fill()

File a patch, this is a bug fix and should definitely be appropriate  
for inclusion before the release of Python 2.5!

-bob

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] doc for new restricted execution design for Python

2006-06-25 Thread Bob Ippolito
On Jun 25, 2006, at 1:08 PM, Brett Cannon wrote:

 On 6/24/06, Bob Ippolito [EMAIL PROTECTED] wrote:
  On Jun 24, 2006, at 2:46 AM, Nick Coghlan wrote:
   Brett Cannon wrote:
    Yep.  That API will be used directly in the changes to pymalloc and
    PyMem_*() macros (or at least the basic idea).  It is not *only* for
    extension modules but for the core as well.

     Existing extension modules and existing C code in the Python
     interpreter have no idea of any PyXXX_ calls, so I don't
     understand how new API functions help here.

    The calls get added to pymalloc and PyMem_*() under the hood, so
    that existing extension modules use the memory check automatically
    without a change.  The calls are just there in case some one has
    some random need to do their own malloc but still want to
    participate in the cap.  Plus it helped me think everything through
    by giving everything I would need to change internally an API.

   This confused me a bit, too. It might help if you annotated each of
   the new API's with who the expected callers were:
    - trusted interpreter
    - untrusted interpreter
    - embedding application
    - extension module

  Threading is definitely going to be an issue with multiple
  interpreters (restricted or otherwise)... for example, the PyGILState
  API probably wouldn't work anymore.

 PyGILState won't work because there are multiple interpreters period,
 or because of the introduced distinction of untrusted and trusted
 interpreters?  In other words, is this some new possible breakage, or
 is this an issue with threads that has always existed with multiple
 interpreters?

It's an issue that's always existed with multiple interpreters, but
multiple interpreters aren't really commonly used or tested at the
moment so it's not very surprising.

It would be kinda nice to have an interpreter-per-thread with no GIL
like some of the other languages have, but the C API depends on too
much global state for that...

-bob
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] doc for new restricted execution design for Python

2006-06-24 Thread Bob Ippolito

On Jun 24, 2006, at 2:46 AM, Nick Coghlan wrote:

 Brett Cannon wrote:
 Yep.  That API will be used directly in the changes to pymalloc and
 PyMem_*() macros (or at least the basic idea).  It is not *only* for
 extension modules but for the core as well.

 Existing extension modules and existing C code in the Python  
 interpreter
 have no idea of any PyXXX_ calls, so I don't understand how  
 new API
 functions help here.


 The calls get added to pymalloc and PyMem_*() under the hood, so that
 existing extension modules use the memory check automatically  
 without a
 change.  The calls are just there in case some one has some random  
 need
 to do their own malloc but still want to participate in the cap.   
 Plus
 it helped me think everything through by giving everything I would  
 need
 to change internally an API.

 This confused me a bit, too. It might help if you annotated each of  
 the new
 API's with who the expected callers were:

- trusted interpreter
- untrusted interpreter
- embedding application
- extension module

Threading is definitely going to be an issue with multiple  
interpreters (restricted or otherwise)... for example, the PyGILState  
API probably wouldn't work anymore.

-bob

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PyRange_New() alternative?

2006-06-22 Thread Bob Ippolito

On Jun 22, 2006, at 11:55 AM, Ralf W. Grosse-Kunstleve wrote:

 --- Georg Brandl [EMAIL PROTECTED] wrote:

 Ralf W. Grosse-Kunstleve wrote:
 http://docs.python.org/dev/whatsnew/ports.html says:

   The PyRange_New() function was removed. It was never  
 documented, never
 used
 in the core code, and had dangerously lax error checking.

 I use this function (don't remember how I found it; this was  
 years ago),
 therefore my code doesn't compile with 2.5b1 (it did OK before  
 with 2.5a2).
 Is
 there an alternative spelling for PyRange_New()?

 You can call PyRange_Type with the appropriate parameters.

 Thanks a lot for the hint! However, I cannot find any documentation  
 for
 PyRange_*. I tried this page...

   http://docs.python.org/api/genindex.html

 and google. Did I miss something?

 I am sure I can get this to work with some digging, but I am  
 posting here to
 highlight a communication problem. I feel if a function is removed the
 alternative should be made obvious in the associated documentation; in
 particular if there is no existing documentation for the alternative.

He means something like this:
PyObject_CallFunction((PyObject *)&PyRange_Type, "llli", ...)

-bob

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] unicode imports

2006-06-16 Thread Bob Ippolito

On Jun 16, 2006, at 9:02 AM, Phillip J. Eby wrote:

 At 01:29 AM 6/17/2006 +1000, Nick Coghlan wrote:
 Kristján V. Jónsson wrote:
 A cursory glance at import.c shows that the import mechanism is  
 fairly
 complicated, and riddled with char *path thingies, and manual  
 string
 arithmetic.  Do you have any suggestions on a clean way to  
 unicodify the
 import mechanism?

 Can you install a PEP 302 path hook and importer/loader that can  
 handle path
 entries that are Unicode strings? (I think this would end up being  
 the
 parallel implementation you were talking about, though)

 If the code that traverses sys.path and sys.path_hooks is itself
 unicode-unaware (I don't remember if it is or isn't), then you  
 might be able
 to trick it by poking a Unicode-savvy importer directly into the
 path_importer_cache for affected Unicode paths.

 Actually, you would want to put it in sys.path_hooks, and then  
 instances
 would be placed in path_importer_cache automatically.  If you are  
 adding it
 to the path_hooks after the fact, you should simply clear the
 path_importer_cache.  Simply poking stuff into the  
 path_importer_cache is
 not a recommended approach.


 One issue is that the package and file names still have to be  
 valid Python
 identifiers, which means ASCII. Unicode would be, at best,  
 permitted only in
 the path entries.

 If I understand the problem correctly, the issue is that if you  
 install
 Python itself to a Unicode directory, you'll be unable to import  
 anything
 from the standard library.  This isn't about module names, it's  
 about the
 places on the path where that stuff goes.

There's a similar issue in that if sys.prefix contains a colon,  
Python is also busted:
http://python.org/sf/1507224

Of course, that's not a Windows issue, but it is everywhere else. The  
offending code in that case is Modules/getpath.c, which probably also  
has to change in order to make unicode directories work on Win32  
(though I think there may be a separate win32 implementation of  
getpath).
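
As a sketch of the PEP 302 route discussed above (hedged: a toy
importer for unicode path entries that handles plain .py files only;
error handling is omitted):

    import imp, os, sys

    class UnicodePathImporter(object):
        def __init__(self, path):
            if not isinstance(path, unicode):
                raise ImportError  # decline; let other hooks handle it
            self.path = path

        def find_module(self, fullname):
            name = fullname.rsplit('.', 1)[-1]
            if os.path.exists(os.path.join(self.path, name + '.py')):
                return self
            return None

        def load_module(self, fullname):
            if fullname in sys.modules:
                return sys.modules[fullname]
            filename = os.path.join(self.path,
                                    fullname.rsplit('.', 1)[-1] + '.py')
            mod = sys.modules.setdefault(fullname, imp.new_module(fullname))
            mod.__file__ = filename
            mod.__loader__ = self
            source = open(filename, 'rU').read()
            exec compile(source, filename.encode('utf-8'), 'exec') in mod.__dict__
            return mod

    sys.path_hooks.append(UnicodePathImporter)
    sys.path_importer_cache.clear()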

-bob

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Add pure python PNG writer module to stdlib?

2006-06-10 Thread Bob Ippolito
On Jun 10, 2006, at 4:35 PM, Brett Cannon wrote:

 On 6/10/06, Johann C. Rocholl [EMAIL PROTECTED] wrote:
  I'm working on a simple module to write PNG image files in pure
  python. Adding it to the standard library would be useful for people
  who want to create images on web server installations without gd and
  imlib, or on platforms where the netpbm tools are not easily
  available.

  Does anybody find this idea interesting?

 Yes, although I wouldn't want an interface taking in strings but
 something more like an iterator that returns each row which itself
 contains int triples.  In other words more array-based than string
 based.

Well you could easily make such strings (or a buffer object that could
probably be used in place of a string) with the array module...
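
For instance, one scanline of RGB triples can be built cheaply with
array (a small sketch; 'B' is the unsigned-byte typecode):

    import array

    width = 320
    row = array.array('B', [255, 0, 0] * width)  # one solid red scanline
    data = row.tostring()  # a byte string, usable wherever str is expected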
-bob
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Is implicit underscore assignment buggy?

2006-06-07 Thread Bob Ippolito

On Jun 7, 2006, at 3:41 PM, Aahz wrote:

 On Wed, Jun 07, 2006, Raymond Hettinger wrote:
 Fredrik:

 for users, it's actually quite simple to figure out what's in the _
 variable: it's the most recently *printed* result.  if you cannot  
 see
 it, it's not in there.

 Of course, there's a pattern to it.  The question is whether it is  
 the
 *right* behavior.  Would the underscore assignment be more useful and
 intuitive if it always contained the immediately preceding result,
 even if it was None?  In some cases (such as the regexp example),  
 None
 is a valid and useful possible result of a computation and you may
 want to access that result with _.

 My take is that Fredrik is correct about the current behavior being  
 most
 generally useful even if it is slightly less consistent, as well as  
 being
 undesired in rare circumstances.  Consider that your message is the  
 only
 one I've seen in more than five years of monitoring python-dev and
 c.l.py.

I agree. I've definitely made use of the current behavior, e.g. for  
printing a different representation of _ before doing something else  
with it.

-bob

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] test_struct failure on 64 bit platforms

2006-05-31 Thread Bob Ippolito

On May 31, 2006, at 12:49 AM, Neal Norwitz wrote:

 Bob,

 There are a couple of things I don't understand about the new struct.
 Below is a test that fails.

 $ ./python ./Lib/test/regrtest.py test_tarfile test_struct
 test_tarfile
 /home/pybot/test-trunk/build/Lib/struct.py:63: DeprecationWarning: 'l'
 format requires -2147483648 <= number <= 2147483647
  return o.pack(*args)
 test_struct
 test test_struct failed -- pack('l', -2147483649) did not raise error
 1 test OK.
 1 test failed:
test_struct

 

 I fixed the error message (the min value was off by one before).  I
 think I fixed a few ssize_t issues too.

 The remaining issues I know of are:
  * The warning only appears on 64-bit platforms.
  * The warning doesn't seem correct for 64-bit platforms (l is 8  
 bytes, not 4).
  * test_struct only fails if run after test_tarfile.
  * The msg from test_struct doesn't seem correct for 64-bit platforms.

 I tracked the problem down to trying to write the gzip tar file.  Can
 you fix this?

The warning is correct, and so is the size. Only native formats have  
native sizes; l and i are exactly 4 bytes on all platforms when using  
=, <, >, or !. That's what "std size and alignment" means.
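
Concretely (sizes as documented):

>>> import struct
>>> struct.calcsize('<l')   # explicit byte order forces std size: 4 everywhere
4
>>> struct.calcsize('l')    # native size: 8 on LP64 platforms such as AMD64
8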

It looks like the only thing that's broken here is the test. The  
behavior changed to consistently allow any integer whatsoever to be  
passed to struct for all formats (except q and Q which have always  
done proper range checking). Previously, the range checking was  
inconsistent across platforms (32-bit and 64-bit anyway) and when  
using int vs. long.

Unfortunately I don't have a 64-bit platform easily accessible and I  
have no idea which test it is that's raising the warning. Could you  
isolate it?

-bob



Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64

2006-05-31 Thread Bob Ippolito

On May 31, 2006, at 8:31 AM, Tim Peters wrote:

 I'm afraid a sabbatical year isn't long enough to understand what the
 struct module did or intends to do by way of range checking <0.7
 wink>.

 Is this intended?  This is on a 32-bit Windows box with current trunk:

 >>> from struct import pack as p
 >>> p(">I", 2**32 + 2343)
 C:\Code\python\lib\struct.py:63: DeprecationWarning: 'I' format
 requires 0 <= number <= 4294967295
  return o.pack(*args)
 '\x00\x00\x00\x00'

 The warning makes sense, but the result doesn't make sense to me.  In
 Python 2.4.3, that example raised OverflowError, which seems better
 than throwing away all the bits without an exception.

Throwing away all the bits is a bug, it's supposed to mask with  
0xffffffffL
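
That is, the intended result keeps the low 32 bits rather than
zeroing them:

>>> (2**32 + 2343) & 0xffffffffL
2343L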

-bob



Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64

2006-05-30 Thread Bob Ippolito

On May 29, 2006, at 8:00 PM, Tim Peters wrote:

 [Bob Ippolito]
 ...
 Actually, should this be a FutureWarning or a DeprecationWarning?

 Since it was never documented, UndocumentedBugGoingAwayError ;-)
 Short of that, yes, DeprecationWarning.  FutureWarning is for changes
 in non-exceptional behavior (e.g., if we swapped the meanings of <
 and > in struct format codes, that would rate a FutureWarning
 subclass, like InsaneFutureWarning).

 OK, this behavior is implemented in revision 46537:

 (this is from ./python.exe -Wall)

   >>> import struct

 ...

   >>> struct.pack('>B', -1)
 /Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: struct
 integer wrapping is deprecated
    return o.pack(*args)
 /Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B'
 format requires 0 <= number <= 255
    return o.pack(*args)
 '\xff'

 We certainly don't want to see two deprecation warnings for a single
 deprecated behavior.  I suggest eliminating the struct integer
 wrapping warning, mostly because I had no idea what it _meant_
 before reading the comments in _struct.c  (wrapping is used most
 often in a proxy or delegation context in Python these days).  'B'
 format requires 0 <= number <= 255 is perfectly clear all by itself.

What should it be called instead of wrapping? When it says it's  
wrapping, it means that it's doing x &= (2 ** (8 * n)) - 1 to force a  
number into meeting the expected range.

Reducing it to one warning instead of two is kinda difficult. Is it  
worth the trouble?

-bob



Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64

2006-05-30 Thread Bob Ippolito
On May 30, 2006, at 2:41 AM, Nick Coghlan wrote:

 Bob Ippolito wrote:
 On May 29, 2006, at 8:00 PM, Tim Peters wrote:
 We certainly don't want to see two deprecation warnings for a single
 deprecated behavior.  I suggest eliminating the struct integer
 wrapping warning, mostly because I had no idea what it _meant_
 before reading the comments in _struct.c  (wrapping is used most
 often in a proxy or delegation context in Python these days).  'B'
 format requires 0 <= number <= 255 is perfectly clear all by
 itself.
 What should it be called instead of wrapping? When it says it's
 wrapping, it means that it's doing x &= (2 ** (8 * n)) - 1 to force
 a number into meeting the expected range.

 integer overflow masking perhaps?

Sounds good enough, I'll go ahead and change the wording to that.

 Reducing it to one warning instead of two is kinda difficult. Is  
 it  worth the trouble?

 If there are cases where only one warning or the other triggers, it  
 doesn't seem worth the effort to try and suppress one of them when  
 they both trigger.

It works kinda like this:

def get_ulong(x):
    ulong_mask = (sys.maxint << 1L) | 1
    if is_unsigned and ((unsigned)x) > ulong_mask:
        x &= ulong_mask
        warning('integer overflow masking is deprecated')
    return x

def pack_ubyte(x):
    x = get_ulong(x)
    if not (0 <= x <= 255):
        warning("'B' format requires 0 <= number <= 255")
        x &= 0xff
    return chr(x)

Given the implementation, it will warn twice if sizeof(format) <  
sizeof(long) AND one of the following:
1. Negative numbers are given for an unsigned format
2. Input value is greater than ((sys.maxint << 1) | 1) for an  
unsigned format
3. Input value is not ((-sys.maxint - 1) <= x <= sys.maxint) for a  
signed format

-bob



[Python-Dev] Converting crc32 functions to use unsigned

2006-05-30 Thread Bob Ippolito
It seems that we should convert the crc32 functions in binascii,  
zlib, etc. to deal with unsigned integers. Currently it seems that 32- 
bit and 64-bit platforms are going to have different results for  
these functions.

Should we do the same as the struct module, and do DeprecationWarning  
when the input value is < 0? Do we have a PyArg_ParseTuple format  
code or a converter that would be suitable for this purpose?

None of the unit tests seem to exercise values where 32-bit and 64- 
bit platforms would have differing results, but that's easy enough to  
fix...
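
In the meantime, masking is the portable workaround for callers (a
sketch):

import zlib, binascii

def crc32_unsigned(data, value=0):
    # force the signed 32-bit result into [0, 2**32) on any platform
    return zlib.crc32(data, value) & 0xffffffff

print crc32_unsigned("foobabazr")   # 4023029188 on 32- and 64-bit alike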

-bob




Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64

2006-05-30 Thread Bob Ippolito

On May 30, 2006, at 10:47 AM, Tim Peters wrote:

 [Bob Ippolito]
 What should it be called instead of wrapping?

 I don't know -- I don't know what it's trying to _say_ that isn't
 already said by saying that the input is out of bounds for the format
 code.

The wrapping (now overflow masking) warning happens during conversion  
of PyObject* to long or unsigned long. It has no idea what the  
destination packing format is beyond whether it's signed or unsigned.  
If the packing format happens to be the same size as a long, it can't  
possibly trigger a range warning (unless range checks are moved up  
the stack and all of the function signatures and code get changed to  
accommodate that).

 When it says it's wrapping, it means that it's doing x &= (2 ** (8
 * n)) - 1 to force
 a number into meeting the expected range.

 How is that different from what it does in this case?:

 >>> struct.pack('>B', 256L)
 /Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B' format
 requires 0 <= number <= 255
  return o.pack(*args)
 '\x00'

 That looks like wrapping to me too (256 & (2**(8*1)-1) == 0x00), but
 in this case there is no deprecation warning about wrapping.  Because
 of that, I'm afraid you're drawing distinctions that can't make sense
 to users.

When it says integer wrapping it means that it's wrapping to fit in  
a long or unsigned long. n in this case is always 4 or 8 depending on  
the platform. The format-specific range check is separate. My  
description wasn't very good in the last email.

 Reducing it to one warning instead of two is kinda difficult. Is it
 worth the trouble?

 I don't understand.  Every example you gave that showed a wrapping
 warning also showed a format requires i <= number <= j warning.  Are
 there cases in which a wrapping warning is given but not a format
 requires i <= number <= j warning?  If so, I simply haven't seen one
 (but I haven't tried all possible inputs ;-)).

 Since the implementation appears (to judge from the examples) to
 wrap in every case in which any warning is given (or are there cases
 in which it doesn't?), I don't understand the point of distinguishing
 between wrapping warnings and format requires i <= number <= j
 warnings either.  The latter are crystal clear.

A later email in this thread enumerates exactly which circumstances  
should cause two warnings with the current implementation.

-bob



Re: [Python-Dev] Converting crc32 functions to use unsigned

2006-05-30 Thread Bob Ippolito
On May 30, 2006, at 11:19 AM, Guido van Rossum wrote:

 On 5/30/06, Giovanni Bajo [EMAIL PROTECTED] wrote:
 Bob Ippolito wrote:

  It seems that we should convert the crc32 functions in binascii,
  zlib, etc. to deal with unsigned integers.

 +1!!

 Seems ok, except I don't know what the backwards incompatibilities  
 would be...


I think the only compatibility issues we're going to run into are  
with the struct module in the way of DeprecationWarning. If people  
are depending on specific negative values for these, then their code  
should already be broken on 64-bit platforms. The only potential  
breakage I can see is if they're passing these values to other  
functions written in C that expect PyInt_AsLong(n) to work with the  
values (on 32-bit platforms). I can't think of a use case for that  
beyond the functions themselves and the struct module.

-bob



Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64

2006-05-29 Thread Bob Ippolito
On May 28, 2006, at 5:34 PM, Thomas Wouters wrote:

 On 5/29/06, Bob Ippolito [EMAIL PROTECTED] wrote:
 On May 28, 2006, at 4:31 AM, Thomas Wouters wrote:

 I'm seeing a dubious failure of test_gzip and test_tarfile on my
 AMD64 machine. It's triggered by the recent struct changes, but I'd
 say it's probably caused by a bug/misfeature in zlibmodule:
 zlib.crc32 is the result of a zlib 'crc32' function call, which
 returns an unsigned long. zlib.crc32 turns that unsigned long into a
 (signed) Python int, which means a number beyond 1<<31 goes negative
 on 32-bit systems and other systems with 32-bit longs, but stays
 positive on systems with 64-bit longs:

 (32-bit)
 >>> zlib.crc32("foobabazr")
 -271938108

 (64-bit)
 >>> zlib.crc32("foobabazr")
 4023029188

 The old structmodule coped with that:
 >>> struct.pack("l", -271938108)
 '\xc4\x8d\xca\xef'
 >>> struct.pack("l", 4023029188)
 '\xc4\x8d\xca\xef'

 The new one does not:
 >>> struct.pack("l", -271938108)
 '\xc4\x8d\xca\xef'
 >>> struct.pack("l", 4023029188)
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "Lib/struct.py", line 63, in pack
 return o.pack(*args)
 struct.error: 'l' format requires -2147483647 <= number <= 2147483647

 The structmodule should be fixed (and a test added ;) but I'm also
 wondering if zlib shouldn't be fixed. Now, I'm AMD64-centric, so my
 suggested fix would be to change the PyInt_FromLong() call to
 PyLong_FromUnsignedLong(), making zlib always return positive
 numbers -- it might break some code on 32-bit platforms, but that
 code is already broken on 64-bit platforms. But I guess I'm okay
 with the long being changed into an actual 32-bit signed number on
 64-bit platforms, too.

 The struct module isn't what's broken here. All of the struct types
 have always had well defined bit sizes and alignment if you
 explicitly specify an endian, "I" and "L" are 32-bits everywhere, and
 "Q" is supported on platforms that don't have long long. The only
 thing that's changed is that it actually checks for errors
 consistently now.

 Yes. And that breaks things. I'm certain the behaviour is used in
 real-world code (and I don't mean just the gzip module.) It has
 always, as far as I can remember, accepted 'unsigned' values for the
 signed versions of ints, longs and long-longs (but not chars or
 shorts.) I agree that that's wrong, but I don't think changing struct
 to do the right thing should be done in 2.5. I don't even think it
 should be done in 2.6 -- although 3.0 is fine.

Well, the behavior change is in response to a bug
http://python.org/sf/1229380. If nothing else, we should at least fix
the standard library such that it doesn't depend on struct bugs. This
is the only way to find them :)

Basically the struct module previously only checked for errors if you
don't specify an endian. That's really strange and leads to very
confusing results. The only code that really should be broken by this
additional check is code that existed before Python had a long type
and only signed values were available.

-bob


Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64

2006-05-29 Thread Bob Ippolito
On May 29, 2006, at 3:14 AM, Thomas Wouters wrote:

 On 5/29/06, Bob Ippolito [EMAIL PROTECTED] wrote:
 Well, the behavior change is in response to a bug
 http://python.org/sf/1229380. If nothing else, we should at least
 fix the standard library such that it doesn't depend on struct bugs.
 This is the only way to find them :)

 Feel free to comment how the zlib.crc32/gzip co-operation should be
 fixed. I don't see an obviously correct fix. The trunk is currently
 failing tests it shouldn't fail. Also note that the error isn't with
 feeding signed values to unsigned formats (which is what the bug is
 about) but the other way 'round, although I do believe both should
 be accepted for the time being, while generating a warning.

Well, first I'm going to just correct the modules that are broken
(zlib, gzip, tarfile, binhex and probably one or two others).

 Basically the struct module previously only checked for errors if
 you don't specify an endian. That's really strange and leads to very
 confusing results. The only code that really should be broken by
 this additional check is code that existed before Python had a long
 type and only signed values were available.

 Alas, reality is different. The fundamental difference between types
 in Python and in C causes this, and code using struct is usually
 meant specifically to bridge those two worlds. Furthermore, struct
 is often used to *fix* that issue, by flipping sign bits if
 necessary:

Well, in C you get a compiler warning for stuff like this.

 >>> struct.unpack("l", struct.pack("l", 3221225472))
 (-1073741824,)
 >>> struct.unpack("l", struct.pack("L", 3221225472))
 (-1073741824,)
 >>> struct.unpack("l", struct.pack("l", -1073741824))
 (-1073741824,)
 >>> struct.unpack("l", struct.pack("L", -1073741824))
 (-1073741824,)

 Before this change, you didn't have to check whether the value is
 negative before the struct.unpack/pack dance, regardless of which
 format character you used. This misfeature is used (and many would
 consider it convenient, even Pythonic, for struct to DWIM); breaking
 it suddenly is bad.

struct doesn't really DWIM anyway, since integers are up-converted to
longs and will overflow past what the (old or new) struct module will
accept. Before there was a long type or automatic up-converting, the
sign agnosticism worked... but it doesn't really work correctly these
days.

We have two choices: either fix it to behave consistently broken
everywhere for numbers of every size (modulo every number that comes
in so that it fits), or have it do proper range checking. A compromise
is to do proper range checking as a warning, and do the modulo math
anyway... but is that what we really want?

-bob


Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64

2006-05-29 Thread Bob Ippolito

On May 29, 2006, at 12:44 PM, Guido van Rossum wrote:

 On 5/29/06, Tim Peters [EMAIL PROTECTED] wrote:
 I think we should do as Thomas proposes: plan to make it an error in
 2.6 (or 2.7 if there's a big outcry, which I don't expect) and  
 accept
 it with a warning in 2.5.

 That's what I arrived at, although 2.4.3's checking behavior is
 actually so inconsistent that it needs some defining (what exactly
 are we trying to still accept?  e.g., that -1 doesn't trigger 'I'
 complaints but that -1L does above?  that one's surely a bug).

 No, it reflects that (up to 2.3 I believe) 0xffffffff was -1 but
 0xffffffffL was 4294967295L.

Python 2.3 did a FutureWarning on 0xffffffff but its value was -1.

Anyway, my plan is to make it such that all non-native format codes  
will behave exactly like C casting, but will do a DeprecationWarning  
for input numbers that were initially out of bounds. This behavior  
will be consistent across (python) int and long, and will be easy  
enough to explain in the docs (but still more complicated than  
values not representable by this data type will raise struct.error).

This means that I'm also changing it so that struct.pack will not  
raise OverflowError for some longs, it will always raise struct.error  
or do a warning (as long as the input is int or long).

Pseudocode looks kinda like this:

def wrap_unsigned(x, CTYPE):
    if not (0 <= x <= CTYPE_MAX):
        DeprecationWarning()
        x &= CTYPE_MAX
    return x
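
Where "exactly like C casting" means keeping only the low bits, e.g.
for an unsigned byte:

>>> [x & 0xff for x in (256, -1, 2**70)]   # what (unsigned char)x keeps
[0, 255, 0L]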

Actually, should this be a FutureWarning or a DeprecationWarning?

-bob



Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64

2006-05-29 Thread Bob Ippolito

On May 29, 2006, at 1:18 PM, Bob Ippolito wrote:


 On May 29, 2006, at 12:44 PM, Guido van Rossum wrote:

 On 5/29/06, Tim Peters [EMAIL PROTECTED] wrote:
 I think we should do as Thomas proposes: plan to make it an  
 error in
 2.6 (or 2.7 if there's a big outcry, which I don't expect) and
 accept
 it with a warning in 2.5.

 That's what I arrived at, although 2.4.3's checking behavior is
 actually so inconsistent that it needs some defining (what exactly
 are we trying to still accept?  e.g., that -1 doesn't trigger I
 complaints but that -1L does above?  that one's surely a bug).

 No, it reflects that (up to 2.3 I believe) 0xffffffff was -1 but
 0xffffffffL was 4294967295L.

 Python 2.3 did a FutureWarning on 0xffffffff but its value was -1.

 Anyway, my plan is to make it such that all non-native format codes
 will behave exactly like C casting, but will do a DeprecationWarning
 for input numbers that were initially out of bounds. This behavior
 will be consistent across (python) int and long, and will be easy
 enough to explain in the docs (but still more complicated than
 "values not representable by this data type will raise struct.error").

 This means that I'm also changing it so that struct.pack will not
 raise OverflowError for some longs, it will always raise struct.error
 or do a warning (as long as the input is int or long).

 Pseudocode looks kinda like this:

 def wrap_unsigned(x, CTYPE):
     if not (0 <= x <= CTYPE_MAX):
         DeprecationWarning()
         x &= CTYPE_MAX
     return x

 Actually, should this be a FutureWarning or a DeprecationWarning?

OK, this behavior is implemented in revision 46537:

(this is from ./python.exe -Wall)

  >>> import struct
  >>> struct.pack('B', 256)
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/Users/bob/src/python/Lib/struct.py", line 63, in pack
 return o.pack(*args)
struct.error: ubyte format requires 0 <= number <= 255
  >>> struct.pack('B', -1)
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/Users/bob/src/python/Lib/struct.py", line 63, in pack
 return o.pack(*args)
struct.error: ubyte format requires 0 <= number <= 255
  >>> struct.pack('>B', 256)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B'  
format requires 0 <= number <= 255
   return o.pack(*args)
'\x00'
  >>> struct.pack('>B', -1)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: struct  
integer wrapping is deprecated
   return o.pack(*args)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B'  
format requires 0 <= number <= 255
   return o.pack(*args)
'\xff'
  >>> struct.pack('>B', 256L)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B'  
format requires 0 <= number <= 255
   return o.pack(*args)
'\x00'
  >>> struct.pack('>B', -1L)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: struct  
integer wrapping is deprecated
   return o.pack(*args)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B'  
format requires 0 <= number <= 255
   return o.pack(*args)
'\xff'

In _struct.c, getting rid of the #define PY_STRUCT_WRAPPING 1 will  
turn off this warning+wrapping nonsense and just raise errors for out  
of range values. It'll also enable some additional performance hacks  
(swapping out the host-endian table's pack and unpack functions with  
the faster native versions).

-bob



Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64

2006-05-28 Thread Bob Ippolito

On May 28, 2006, at 4:31 AM, Thomas Wouters wrote:


 I'm seeing a dubious failure of test_gzip and test_tarfile on my  
 AMD64 machine. It's triggered by the recent struct changes, but I'd  
 say it's probably caused by a bug/misfeature in zlibmodule:  
 zlib.crc32 is the result of a zlib 'crc32' function call, which  
 returns an unsigned long. zlib.crc32 turns that unsigned long into  
 a (signed) Python int, which means a number beyond 1<<31 goes  
 negative on 32-bit systems and other systems with 32-bit longs, but  
 stays positive on systems with 64-bit longs:

 (32-bit)
  >>> zlib.crc32("foobabazr")
 -271938108

 (64-bit)
  >>> zlib.crc32("foobabazr")
 4023029188

 The old structmodule coped with that:
  >>> struct.pack("l", -271938108)
 '\xc4\x8d\xca\xef'
  >>> struct.pack("l", 4023029188)
 '\xc4\x8d\xca\xef'

 The new one does not:
  >>> struct.pack("l", -271938108)
 '\xc4\x8d\xca\xef'
  >>> struct.pack("l", 4023029188)
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "Lib/struct.py", line 63, in pack
 return o.pack(*args)
 struct.error: 'l' format requires -2147483647 <= number <= 2147483647

 The structmodule should be fixed (and a test added ;) but I'm also  
 wondering if zlib shouldn't be fixed. Now, I'm AMD64-centric, so my  
 suggested fix would be to change the PyInt_FromLong() call to  
 PyLong_FromUnsignedLong(), making zlib always return positive  
 numbers -- it might break some code on 32-bit platforms, but that  
 code is already broken on 64-bit platforms. But I guess I'm okay  
 with the long being changed into an actual 32-bit signed number on  
 64-bit platforms, too.

The struct module isn't what's broken here. All of the struct types  
have always had well defined bit sizes and alignment if you  
explicitly specify an endian, "I" and "L" are 32-bits everywhere, and  
"Q" is supported on platforms that don't have long long. The only  
thing that's changed is that it actually checks for errors  
consistently now.

-bob



Re: [Python-Dev] [Python-checkins] r46300 - in python/trunk: Lib/socket.py Lib/test/test_socket.py Lib/test/test_struct.py Modules/_struct.c Modules/arraymodule.c Modules/socketmodule.c

2006-05-27 Thread Bob Ippolito
On May 26, 2006, at 4:56 PM, Guido van Rossum wrote:

 On 5/26/06, martin.blais [EMAIL PROTECTED] wrote:
 Log:
 Support for buffer protocol for socket and struct.

 * Added socket.recv_buf() and socket.recvfrom_buf() methods, that  
 use the buffer
   protocol (send and sendto already did).

 * Added struct.pack_to(), that is the corresponding buffer  
 compatible method to
   unpack_from().

 Hm... The file object has a similar method readinto(). Perhaps the
 methods introduced here could follow that lead instead of using two
 different new naming conventions?

(speaking specifically about struct and not socket)

pack_to and unpack_from are named as such because they work with objects
that support the buffer API (not file-like-objects). I couldn't find any
existing convention for objects that manipulate buffers in such a way.
If there is an existing convention then I'd be happy to rename these.

readinto seems to imply that some kind of position is being  
incremented.
Grammatically it only works if it's implemented on all buffer  
objects, but
in this case it's implemented on the Struct type.
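
A usage sketch with the names as they eventually shipped (pack_to was
renamed pack_into before 2.5 final); offsets are explicit arguments
rather than an internal position:

import struct, ctypes

buf = ctypes.create_string_buffer(8)     # any writable buffer object
struct.pack_into('<I', buf, 0, 1234)     # write at an explicit offset...
print struct.unpack_from('<I', buf, 0)   # ...and read back: (1234,)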

-bob



Re: [Python-Dev] SQLite header scan order

2006-05-26 Thread Bob Ippolito

On May 26, 2006, at 8:35 AM, Ronald Oussoren wrote:

 The current version of setup.py looks for the sqlite header files in
 a number of sqlite-specific directories before looking into the
 default inc_dirs. I'd like to revert that order because that would
 make it possible to override the version of sqlite that gets picked
 up. Any objections to that?

+1, the version that ships with Mac OS X 10.4 is pretty old.

-bob



Re: [Python-Dev] Cost-Free Slice into FromString constructors--Long

2006-05-25 Thread Bob Ippolito

On May 25, 2006, at 3:28 PM, Jean-Paul Calderone wrote:

 On Thu, 25 May 2006 15:01:36 +, Runar Petursson  
 [EMAIL PROTECTED] wrote:
 We've been talking this week about ideas for speeding up the  
 parsing of
 Longs coming out of files or network.  The use case is having a  
 large string
 with embedded Long's and parsing them to real longs.  One approach  
 would be
 to use a simple slice:
 long(mystring[x:y])

 an expensive operation in a tight loop.  The proposed solution is  
 to add
 further keyword arguments to Long (such as):

 long(mystring, base=10, start=x, end=y)

 The start/end would allow for negative indexes, as slices do, but  
 otherwise
 simply limit the scope of the parsing.  There are other solutions,  
 using
 buffer-like objects and such, but this seems like a simple win for  
 anyone
 parsing a lot of text.  I implemented it in a branch  runar- 
 longslice-
 branch,
 but it would need to be updated with Tim's latest improvements to  
 long.
 Then you may ask, why not do it for everything else parsing from  
 string--to
 which I say it should.  Thoughts?

 This really seems like a poor option.  Why fix the problem with a  
 hundred special cases instead of a single general solution?

 Hmm, one reason could be that the general solution doesn't work:

   [EMAIL PROTECTED]:~$ python
   Python 2.4.3 (#2, Apr 27 2006, 14:43:58)
   [GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2
   Type "help", "copyright", "credits" or "license" for more  
 information.
   >>> long(buffer('1234', 0, 3))
   Traceback (most recent call last):
 File "<stdin>", line 1, in ?
   ValueError: null byte in argument for long()
   >>> long(buffer('123a', 0, 3))
   Traceback (most recent call last):
 File "<stdin>", line 1, in ?
   ValueError: invalid literal for long(): 123a

One problem with buffer() is that it does a memcpy of the buffer. A  
zero-copy version of buffer (a view on some object that implements  
the buffer API) would be nice.
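
A sketch of what such a view could look like, using memoryview (which
later Pythons, 2.7 and up, added for exactly this); slicing the view
is copy-free, only the final conversion copies the four bytes:

>>> mv = memoryview("xx1234yy")   # a view, not a copy, of the string
>>> long(mv[2:6].tobytes())
1234L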

-bob



Re: [Python-Dev] Google Summer of Code proposal: improvement of long int and adding new types/modules.

2006-04-21 Thread Bob Ippolito

On Apr 21, 2006, at 5:58 PM, Alex Martelli wrote:

 On 4/21/06, Greg Ewing [EMAIL PROTECTED] wrote:
...
 GMP is covered by LGPL, so must any such derivative work

 But the wrapper is just using GMP as a library, so
 it shouldn't be infected with LGPLness, should it?

 If a lawyer for the PSF can confidently assert that gmpy is not a
 derivative work of GMP, I'll have no problem changing gmpy's
 licensing. But I won't make such a call myself: for example, gmpy.c
 #include's gmp.h and uses (==expands) some of the C macros there
 defined -- doesn't that make gmpy.o a derived work of gmp.h?

 I'm quite confident that the concept of derived work would not apply
 if gmpy.so only accessed a gmp.so (or other kinds of dynamic
 libraries), but I fear the connection is stronger than that, so,
 prudently, I'm assuming the derived work status until further
 notice.

Well, we already wrap readline; would this really be any worse?   
Readline is GPL.

-bob



Re: [Python-Dev] elementtree in stdlib

2006-04-05 Thread Bob Ippolito

On Apr 5, 2006, at 9:02 PM, Alex Martelli wrote:


 On Apr 5, 2006, at 8:30 PM, Greg Ewing wrote:

 A while ago there was some discussion about including
 elementtree in the std lib. I can't remember what the
 conclusion about that was, but if it does go ahead,
 I'd like to suggest that it be reorganised a bit.

 I've just started playing with it, and having a
 package called elementtree containing a module
 called ElementTree containing a class called
 ElementTree is just too confusing for words!

 Try the 2.5 alpha 1 just released, and you'll see that the toplevel
 package is now xml.etree.  The module and class are still called
 ElementTree, though.

It would be nice to have new code be PEP 8 compliant..

Specifically:
Modules should have short, lowercase names, without underscores.

-bob



Re: [Python-Dev] Use dlopen() on Darwin/OS X to load extensions?

2006-04-04 Thread Bob Ippolito

On Apr 3, 2006, at 9:01 PM, Neal Norwitz wrote:

 On 4/3/06, Zachary Pincus [EMAIL PROTECTED] wrote:

 Sorry if it's bad form to ask about patches one has submitted -- let
 me know if that sort of discussion should be kept strictly on the
 patch tracker.

 No, it's fine.  Thanks for reminding us about this issue.
 Unfortunately, without an explicit ok from one of the Mac maintainers,
 I don't want to add this myself.  If you can get Bob, Ronald, or Jack
 to say ok, I will apply the patch ASAP.  I have a Mac OS X.4 box and
 can test it, but don't know the suitability of the patch.

The patch has my OK (I gave it a while ago on pythonmac-sig).

-bob



Re: [Python-Dev] towards a stricter definition of sys.executable

2006-03-17 Thread Bob Ippolito

On Mar 17, 2006, at 12:40 AM, Martin v. Löwis wrote:

 Fredrik Lundh wrote:
 I don't think many people embed setup.py scripts, so alternative  
 (e) would probably cause the least problems:

 e) sys.executable contains the full path to the program used  
 to invoke
 this interpreter instance, or None if this could not be  
 determined.

 It seems that you indeed are trying to solve a problem you  
 encountered.
 Can you please explain what the problem is?

 ISTM that the current definition doesn't really cause problems,  
 despite
 potentially being fuzzy. People that start sys.executable typically
 *do* get a Python interpreter - in an embedded interpreter, they just
 don't want to start a new interpreter, as that couldn't work, anyway.

I've seen cases where people want to start worker processes from  
bundled apps (as in py2app/py2exe).  The bootstrap executable  
(sys.executable) is not suitable for this purpose, as it runs a  
specific script.  Forking doesn't quite do the right thing either  
because it's not safe to fork without exec'ing in all cases due to  
state that persists that shouldn't across processes with certain  
platform libraries (in Mac OS X especially).

For py2app, we can bundle a Python interpreter that links to the same  
Python framework and has the same set of modules and extensions that  
the bundled application does, so we can support this use case.  I'd  
definitely like to see something like sys.python_executable become  
standard, and I think I'll go ahead and support it in the next  
release of py2app.

It's possible to degrade gracefully with this approach too:

def get_python_executable():
    python_executable = getattr(sys, 'python_executable', None)
    if python_executable is not None:
        return python_executable
    # sys.frozen only exists in bundled apps, so probe it safely
    if not getattr(sys, 'frozen', False) and sys.executable:
        # launched from a standard interpreter
        return sys.executable
    # frozen without python_executable support
    raise RuntimeError


-bob



Re: [Python-Dev] Problem with module loading on multi-arch?

2006-03-17 Thread Bob Ippolito

On Mar 17, 2006, at 4:38 PM, Neal Becker wrote:

 Martin v. Löwis wrote:

 Neal Becker wrote:
 Sorry, maybe I used confusing terminology.

 A reference is here: http://fedoraproject.org/wiki/Packaging/Python
 This is the current setup.  For example, this is a standard macro  
 used by
 Redhat in RPM SPEC files for python:

 %define python_sitearch %(%{__python} -c "from distutils.sysconfig
 import get_python_lib; print get_python_lib(1)")}

 %define python_sitelib %(%{__python} -c "from distutils.sysconfig
 import get_python_lib; print get_python_lib()")}

 Clearly this practice is widespread.  It would seem that module  
 search
 needs some modification to fully support it.

 Ah. That isn't supported at all, at the moment. Redhat should not be
 using it. Instead, there shouldn't be a difference between  
 sitearch and
 sitelib.


 x86_64 is multiarch.  That means, we allow both i386 and x86_64  
 binaries to
 coexits.  Is the proposal that python should not support this?   
 That would
 be unfortunate.

 I suspect is would not be that difficult to correctly support  
 multiarch
 platforms.  As it is, this usually works - but the example I gave  
 above
 shows where it seems to break.

All the difficult issues supporting multi-arch are going to be with  
distutils, not Python itself.

On OS X it isn't all that hard to support (beyond backwards  
compatibility issues) because you run gcc once with the right options  
and get a single universal binary as output.  It would be a lot more  
invasive if GCC had to be run multiple times and the products had to  
be put in different places.

-bob



Re: [Python-Dev] collections.idset and collections.iddict?

2006-03-06 Thread Bob Ippolito

On Mar 6, 2006, at 4:14 PM, Guido van Rossum wrote:

 On 3/6/06, Raymond Hettinger [EMAIL PROTECTED] wrote:
 [Neil Schemenauer]
 I occasionally need dictionaries or sets that use object identity
 rather than __hash__ to store items.  Would it be appropriate to add
 these to the collections module?

 Why not decorate the objects with a class adding a method:

     def __hash__(self):
         return id(self)

 That would seem to be more Pythonic than creating custom variants  
 of other
 containers.

 I hate to second-guess the OP, but you'd have to override __eq__ too,
 and probably __ne__ and __cmp__ just to be sure. And probably that
 wouldn't do -- since the default __hash__ and __eq__ have the desired
 behavior, the OP is apparently talking about objects that override
 these operations to do something meaningful; overriding them back
 presumably breaks other functionality.

 I wonder if this use case and the frequently requested
 case-insensitive dict don't have some kind of generalization in common
 -- perhaps a dict that takes a key function a la list.sort()?

+1.  I've wanted such a thing a couple times, and there is some  
precedent in the stdlib (e.g. WeakKeyDictionary would be a lot  
shorter with such a base class).
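
A minimal sketch of that generalization (KeyedDict is a hypothetical
name, not a stdlib API):

class KeyedDict(object):
    """Mapping that indexes entries by key(k) instead of k itself."""
    def __init__(self, key):
        self._key, self._data = key, {}
    def __setitem__(self, k, v):
        self._data[self._key(k)] = (k, v)
    def __getitem__(self, k):
        return self._data[self._key(k)][1]
    def __contains__(self, k):
        return self._key(k) in self._data

d = KeyedDict(key=id)          # identity semantics, as the OP wanted
e = KeyedDict(key=str.lower)   # the oft-requested case-insensitive dict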

-bob



Re: [Python-Dev] operator.is*Type

2006-02-22 Thread Bob Ippolito

On Feb 22, 2006, at 4:18 AM, Fuzzyman wrote:

 Raymond Hettinger wrote:
 >>> from operator import isSequenceType, isMappingType
 >>> class anything(object):
 ...     def __getitem__(self, index):
 ...         pass
 ...
 >>> something = anything()
 >>> isMappingType(something)
 True
 >>> isSequenceType(something)
 True

 I suggest we either deprecate these functions as worthless, *or* we
 define the protocols slightly more clearly for user defined classes.

 They are not worthless.  They do a damned good job of differentiating
 anything that CAN be differentiated.

 But as far as I can tell (and I may be wrong), they only work if the
 object is a subclass of a built in type, otherwise they're broken. So
 you'd have to do a type check as well, unless you document that an API
 call *only* works with a builtin type or subclass.

If you really cared, you could check hasattr(something, 'get') and  
hasattr(something, '__getitem__'), which is a pretty good indicator  
that it's a mapping and not a sequence (in a dict-like sense, anyway).
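
Roughly (a heuristic sketch, not a guarantee):

def looks_like_mapping(obj):
    # dict-like objects generally grow a get() method; sequences don't
    return hasattr(obj, 'get') and hasattr(obj, '__getitem__')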

-bob



Re: [Python-Dev] PEP 358 (bytes type) comments

2006-02-22 Thread Bob Ippolito

On Feb 22, 2006, at 1:22 PM, Brett Cannon wrote:

 First off, thanks to Neil for writing this all down.  The whole thread
 of discussion on the bytes type was rather long and thus hard to
 follow.  Nice to finally have it written down in a PEP.

 Anyway, a few comments on the PEP.  One, should the hex() method
 instead be an attribute, implemented as a property?  Seems like static
 data that is entirely based on the value of the bytes object and thus
 is not properly represented by a method.

 Next, why are the __*slice__ methods to be defined?  Docs say they are
 deprecated.

 And for the open-ended questions, I don't think sort() is needed.

sort would be totally useless for bytes.  array.array doesn't have  
sort either.

 Lastly, maybe I am just dense, but it took me a second to realize that
 it will most likely return the ASCII string for __str__() for use in
 something like socket.send(), but it isn't explicitly stated anywhere.
  The chance that someone might think __str__ will somehow
 return the sequence of integers as a string does exist.

That would be a bad idea given that bytes are supposed to make the str  
type go away.  It's probably better to make __str__ return __repr__  
like it does for most types.  If bytes type supports the buffer API  
(one would hope so), functions like socket.send should do the right  
thing as-is.

http://docs.python.org/api/bufferObjects.html

-bob



Re: [Python-Dev] readline compilarion fails on OSX

2006-02-20 Thread Bob Ippolito

On Feb 20, 2006, at 6:48 PM, Guido van Rossum wrote:

 On OSX (10.4.4) the readline module in the svn HEAD fails compilation
 as follows. This is particularly strange since the buildbot is green
 for OSX... What could be up with this?

 building 'readline' extension
-lots of build junk-

In Apple's quest to make our lives harder, they installed BSD libedit  
and symlinked it to readline.  Python doesn't like that.  The  
buildbot might have a real readline installation, or maybe the  
buildbot is skipping those tests.

You'll need to install a real libreadline if you want it to work.

I've also put together a little tarball that'll build readline.so  
statically, and there's pre-built eggs for OS X so the easy_install  
should be quick:
http://python.org/pypi/readline

-bob



Re: [Python-Dev] bytes.from_hex()

2006-02-20 Thread Bob Ippolito

On Feb 20, 2006, at 7:25 PM, Stephen J. Turnbull wrote:

 >>>>> "Martin" == Martin v Löwis [EMAIL PROTECTED] writes:

 Martin> Please do take a look. It is the only way: If you were to
 Martin> embed base64 *bytes* into character data content of an XML
 Martin> element, the resulting XML file might not be well-formed
 Martin> anymore (if the encoding of the XML file is not an ASCII
 Martin> superencoding).

 Excuse me, I've been doing category theory recently.  By embedding I
 mean a map from an intermediate object which is a stream of bytes to
 the corresponding stream of characters.  In the case of UTF-16-coded
 characters, this would necessarily imply a representation change, as
 you say.

 What I advocate for Python is to require that the standard base64
 codec be defined only on bytes, and always produce bytes.  Any
 representation change should be done explicitly.  This is surely
 conformant with RFC 2045's definition and with RFC 3548.

+1

-bob



Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Bob Ippolito
On Feb 19, 2006, at 10:55 AM, Martin v. Löwis wrote:

 Stephen J. Turnbull wrote:
 BTW, what use cases do you have in mind for Unicode -> Unicode
 decoding?

 I think rot13 falls into that category: it is a transformation
 on text, not on bytes.

The current implementation is a transformation on bytes, not text.   
Conceptually though, it's a text<->text transform.

 For other odd cases: base64 goes Unicode->bytes in the *decode*
 direction, not in the encode direction. Some may argue that base64
 is bytes, not text, but in many applications, you can combine base64
 (or uuencode) with arbitrary other text in a single stream. Of course,
 it could be required that you go u.encode("ascii").decode("base64").

I would say that base64 is bytes<->bytes.  Just because those bytes  
happen to be in a subset of ASCII, it's still a serialization meant  
for wire transmission.  Sometimes it ends up in unicode (e.g. in  
XML), but that's the exception not the rule.

-bob



Re: [Python-Dev] New Module: CommandLoop

2006-02-19 Thread Bob Ippolito

On Feb 19, 2006, at 5:03 PM, Raymond Hettinger wrote:

 @cmdloop.aliases('goodbye')
 @cmdloop.shorthelp('say goodbye')
 @cmdloop.usage('goodbye TARGET')

 to just:

 @cmdloop.addspec(aliases=['goodbye'], shorthelp='say goodbye',
  usage='goodbye TARGET')

 leaving the possibility of multiple decorators when one line gets  
 to long:

 @cmdloop.addspec(aliases=['goodbye'], shorthelp='say goodbye')
 @cmdloop.addspec(usage='goodbye TARGET  # where TARGET is a filename
  in the current directory')

 Well, why not support both, and leave it up to the user?

 Having only one method keeps the API simple.  Also, the addspec()  
 approach
 allows the user to choose between single and multiple lines.

 BTW, addspec() could be made completely general by supporting all  
 possible
 keywords at once:

 def addspec(**kwds):
     def decorator(func):
         func.__dict__.update(kwds)
         return func
     return decorator

 With an open definition like that, users can specify new attributes  
 with less
 effort.

Doesn't this discussion belong on c.l.p / python-list?

-bob



Re: [Python-Dev] http://www.python.org/dev/doc/devel still available

2006-02-17 Thread Bob Ippolito

On Feb 16, 2006, at 11:35 AM, Benji York wrote:

 Alexander Schremmer wrote:
 In fact, PHP does it like php.net/functionname which is even  
 shorter, i.e.
 they fallback to the documentation if that path does not exist  
 otherwise.

 Like many things PHP, that seems a bit too magical for my tastes.

Not only does it fall back to documentation, it falls back to a  
search for documentation if there isn't a function of that name.

It's a convenient feature, I'm sure people would use it if it was  
there... even if it was something like http://python.org/doc/name

-bob



Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-17 Thread Bob Ippolito

On Feb 16, 2006, at 9:20 PM, Josiah Carlson wrote:


 Greg Ewing [EMAIL PROTECTED] wrote:

 Josiah Carlson wrote:

 They may not be encodings of _unicode_ data,

 But if they're not encodings of unicode data, what
 business do they have being available through
 someunicodestring.encode(...)?

 I had always presumed that bytes objects are going to be able to be a
 source for encode AND decode, like current non-unicode strings are  
 able
 to be today.  In that sense, if I have a bytes object which is an
 encoding of rot13, hex, uu, etc., or I have a bytes object which I  
 would
 like to be in one of those encodings, I should be able to do  
 b.encode(...)
 or b.decode(...), given that 'b' is a bytes object.

 Are 'encodings' going to become a mechanism to encode and decode
 _unicode_ strings, rather than a mechanism to encode and decode _text
 and data_ strings?  That would seem like a backwards step to me, as  
 the
 email package would need to package their own base-64 encode/decode  
 API
 and implementation, and similarly for any other package which uses any
 one of the encodings already available.

It would be VERY useful to separate the two concepts.  bytes<->bytes  
transforms should be one function pair, and bytes<->text transforms  
should be another.  The current situation is totally insane:

str.decode(codec) -> str or unicode or UnicodeDecodeError or  
ZlibError or TypeError... who knows what else
str.encode(codec) -> str or unicode or UnicodeDecodeError or  
TypeError... probably other exceptions

Granted, unicode.encode(codec) and unicode.decode(codec) are actually  
somewhat sane in that the return type is always a str and the  
exceptions are either UnicodeEncodeError or UnicodeDecodeError.

I think that rot13 is the only conceptually text<->text transform  
(though the current implementation is really bytes<->bytes);  
everything else is either bytes<->text or bytes<->bytes.
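
What the separation looks like in use, with the existing module-level
pairs (a sketch):

import base64, zlib

data = zlib.compress("payload")   # bytes<->bytes: compress/decompress
wire = base64.b64encode(data)     # bytes<->bytes: b64encode/b64decode
text = unicode(wire, "ascii")     # bytes->text only when explicitly asked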

-bob



Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-17 Thread Bob Ippolito

On Feb 17, 2006, at 4:20 PM, Martin v. Löwis wrote:

 Ian Bicking wrote:
 Maybe it isn't worse, but the real alternative is:

   import zlib
   import base64

   base64.b64encode(zlib.compress(s))

 Encodings cover up eclectic interfaces, where those interfaces fit a
 basic pattern -- data in, data out.

 So should I write

 3.1415.encode("sin")

 or would that be

 3.1415.decode("sin")

 What about

 "http://www.python.org".decode("URL")

 It's data in, data out, after all. Who needs functions?

Well, 3.1415.decode("sin") is of course NaN, because  
3.1415.encode("sinh") is not defined for numbers outside of [-1, 1] :)

-bob



Re: [Python-Dev] bytes.from_hex()

2006-02-17 Thread Bob Ippolito

On Feb 17, 2006, at 8:33 PM, Josiah Carlson wrote:


 Greg Ewing [EMAIL PROTECTED] wrote:

 Stephen J. Turnbull wrote:
 >>>>> "Guido" == Guido van Rossum [EMAIL PROTECTED] writes:

 Guido> - b = bytes(t, enc); t = text(b, enc)

 +1  The coding conversion operation has always felt like a  
 constructor
 to me, and in this particular usage that's exactly what it is.  I
 prefer the nomenclature to reflect that.

 This also has the advantage that it completely
 avoids using the verbs "encode" and "decode"
 and the attendant confusion about which direction
 they go in.

 e.g.

    s = text(b, "base64")

 makes it obvious that you're going from the
 binary side to the text side of the base64
 conversion.

 But you aren't always getting *unicode* text from the decoding of  
 bytes,
 and you may be encoding bytes *to* bytes:

 b2 = bytes(b, "base64")
 b3 = bytes(b2, "base64")

 Which direction are we going again?

This is *exactly* why the current set of codecs are INSANE.   
unicode.encode and str.decode should be used *only* for unicode  
codecs.  Byte transforms are entirely different semantically and  
should be some other method pair.

-bob



Re: [Python-Dev] bdist_* to stdlib?

2006-02-15 Thread Bob Ippolito

On Feb 15, 2006, at 4:49 AM, Jan Claeys wrote:

 Op wo, 15-02-2006 te 14:00 +1300, schreef Greg Ewing:
 I'm disappointed that the various Linux distributions
 still don't seem to have caught onto the very simple
 idea of *not* scattering files all over the place when
 installing something.

 MacOSX seems to be the only system so far that has got
 this right -- organising the system so that everything
 related to a given application or library can be kept
 under a single directory, clearly labelled with a
 version number.

 Those directories might be mounted on entirely different hardware  
 (even
 over a network), often with different characteristics (access speed,
 writeability, etc.).

Huh?  What does that have to do with anything?  I've never seen a  
system where /usr/include, /usr/lib, /usr/bin, etc. are not all on  
the same mount.  It's not really any different with OS X either.

-bob



Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-15 Thread Bob Ippolito

On Feb 15, 2006, at 6:35 PM, Aahz wrote:

 On Tue, Feb 14, 2006, Guido van Rossum wrote:

 Anyway, I'm now convinced that bytes should act as an array of ints,
 where the ints are restricted to range(0, 256) but have type int.

 range(0, 255)?

No, Guido was correct.  range(0, 256) is [0, 1, 2, ..., 255].

-bob


