Re: [Python-Dev] Re: marshal / unmarshal

2005-04-12 Thread Michael Hudson
My mail is experiencing random delays of up to a few hours at the
moment.  I wrote this before I saw your comments on my patch.

Tim Peters [EMAIL PROTECTED] writes:

 [Michael Hudson]
 I've just submitted http://python.org/sf/1180995 which adds format
 codes for binary marshalling of floats if version > 1, but it doesn't
 quite have the effect I expected (see below):

  >>> inf = 1e308*1e308
  >>> nan = inf/inf
  >>> marshal.dumps(nan, 2)
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  ValueError: unmarshallable object

 I don't understand.  Does binary marshalling _not_ mean just copying
 the bytes on a 754 platform?

No, it means using _PyFloat_Pack8/Unpack8, like the patch description
says.  Making those functions just fiddle bytes when they can I regard
as a separate project (watch a patch manager near you, though).

 If so, that won't work.

I can tell! <wink>

 Right.  Assuming source and destination boxes both use 754 format, and
 the implementation adjusts endianness if necessary.

 Well, I was assuming marshal would do floats little-endian-wise, as it
 does for integers.

 Then on a big-endian 754 system, loads() will have to reverse the
 bytes in the little-endian marshal bytestring, and dumps() likewise. 

Really?  Even I had worked this out...
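
A minimal sketch of the byte reversal discussed above, written with the standard struct module rather than marshal's C internals. The helper names dump_double_le and load_double_le are hypothetical, and the sketch assumes the native double is an IEEE 754 double: the value is always written least-significant byte first, so only a big-endian box has to swap, on both dump and load.

```python
import struct
import sys

def dump_double_le(x):
    """Return the 8 bytes of x, least-significant byte first."""
    native = struct.pack("=d", x)      # bytes in this machine's byte order
    if sys.byteorder == "big":
        native = native[::-1]          # big-endian box: reverse on dump
    return native

def load_double_le(data):
    """Inverse of dump_double_le."""
    if sys.byteorder == "big":
        data = data[::-1]              # big-endian box: reverse on load
    return struct.unpack("=d", data)[0]

payload = dump_double_le(1.5)
```

On either kind of machine the payload should equal struct.pack("<d", 1.5), which is the point of fixing one byte order for the marshalled form.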

 Heh.  I have a vague half-memory of _some_ box that stored the two
 4-byte words in an IEEE double in one order, but the bytes within
 each word in the opposite order.  It's always something ...

 I recall stories of machines that stored the bytes of long in some
 crazy order like that.  I think Python would already be broken on such
 a system, but, also, don't care.

 Python does very little that depends on internal native byte order,
 and C hides it in the absence of casting abuse.  

This surely does:

PyObject *
PyLong_FromLongLong(PY_LONG_LONG ival)
{
    PY_LONG_LONG bytes = ival;
    int one = 1;
    return _PyLong_FromByteArray(
        (unsigned char *)&bytes,
        SIZEOF_LONG_LONG, IS_LITTLE_ENDIAN, 1);
}

It occurs that in the IEEE case, special values can be detected with
reliability -- by picking the exponent field out by force -- and a
warning emitted or exception raised.  Good idea?  Hard to say, to me.

Cheers,
mwh

Oh, by the way: http://python.org/sf/1181301

-- 
  It is time-consuming to produce high-quality software. However,
  that should not alone be a reason to give up the high standards
  of Python development.  -- Martin von Loewis, python-dev
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Re: marshal / unmarshal

2005-04-12 Thread Tim Peters
...

[mwh]
 I recall stories of machines that stored the bytes of long in some
 crazy order like that.  I think Python would already be broken on such
 a system, but, also, don't care.

[Tim]
 Python does very little that depends on internal native byte order,
 and C hides it in the absence of casting abuse.

[mwh]
 This surely does:

 PyObject *
 PyLong_FromLongLong(PY_LONG_LONG ival)
 {
     PY_LONG_LONG bytes = ival;
     int one = 1;
     return _PyLong_FromByteArray(
         (unsigned char *)&bytes,
         SIZEOF_LONG_LONG, IS_LITTLE_ENDIAN, 1);
 }

Yes, that's 'casting abuse'.  Python does very little of that.  If it
becomes necessary, it's straightforward but long-winded to rewrite the
above in wholly portable C (peel the bytes out of ival,
least-significant first, via shifting and masking 8 times; ival &
0xff is the least-significant byte regardless of memory storage
order; etc).  BTW, the IS_LITTLE_ENDIAN macro also relies on casting
abuse, and more deeply than does the visible cast there.
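
The shift-and-mask idea can be sketched in Python; this is an illustration of the technique, not the portable C rewrite itself, and the function name is hypothetical. Each pass takes ival & 0xFF, which is the least-significant byte whatever the memory layout, then shifts the value down by 8 bits.

```python
def long_long_to_le_bytes(ival, size=8):
    """Peel `size` bytes out of ival, least-significant byte first."""
    out = bytearray()
    for _ in range(size):
        out.append(ival & 0xFF)   # low byte, independent of storage order
        ival >>= 8                # Python's shift is arithmetic, so sign survives
    return bytes(out)

peeled = long_long_to_le_bytes(0x0102030405060708)
```

In C the same loop needs a little more care, since right-shifting a negative signed value is implementation-defined; working on an unsigned copy avoids that.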
 
 It occurs that in the IEEE case, special values can be detected with
 reliability -- by picking the exponent field out by force

Right, that works for NaNs and infinities; signed zeroes are a bit
trickier to detect.
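
A rough sketch of picking the exponent field out by force, assuming the usual IEEE 754 double layout (1 sign bit, 11 exponent bits, 52 fraction bits). The function name classify_double is hypothetical. An all-ones exponent marks a NaN or an infinity; a signed zero needs the sign bit as well, because -0.0 == 0.0 compares true.

```python
import struct

def classify_double(x):
    """Classify x by inspecting the raw bits of its IEEE 754 encoding."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    if exponent == 0x7FF:                 # all-ones exponent: special value
        return "nan" if fraction else "inf"
    if exponent == 0 and fraction == 0:   # zero: only the sign bit differs
        return "-0" if sign else "+0"
    return "finite"

kinds = [classify_double(v) for v in (float("inf"), float("nan"), -0.0, 1.5)]
```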

 -- and a warning emitted or exception raised.  Good idea?  Hard to say, to me.

It's not possible to _create_ a NaN or infinity from finite operands
in 754 without signaling some exceptional condition.  Once you have
one, though, there's generally nothing exceptional about _using_ it. 
Sometimes there is, like +Inf - +Inf or Inf / Inf, but not generally. 
Using a quiet NaN never signals; using a signaling NaN almost always
signals.

So packing a nan or inf shouldn't complain.  On a 754 box, unpacking
one shouldn't complain either.  Unpacking a nan or inf on a non-754
box probably should complain, since there's in general nothing it can
be unpacked _to_ that makes any sense (errors should never pass
silently).
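
The policy above might look roughly like this. It is a sketch only: the native_is_ieee754 flag is a hypothetical stand-in for a real platform check, and the error text is illustrative. Packing never complains, and unpacking complains only when a special value arrives on a box that has nothing to unpack it to.

```python
import struct

def pack_double(x):
    """Packing a nan or inf is not an error."""
    return struct.pack("<d", x)

def unpack_double(data, native_is_ieee754=True):
    """Unpack 8 little-endian IEEE 754 bytes into a float."""
    bits = struct.unpack("<Q", data)[0]
    is_special = ((bits >> 52) & 0x7FF) == 0x7FF   # nan or inf
    if is_special and not native_is_ieee754:
        # Hypothetical non-754 box: no sensible value to unpack to.
        raise ValueError("can't unpack IEEE 754 special value "
                         "on a non-IEEE platform")
    return struct.unpack("<d", data)[0]

inf_back = unpack_double(pack_double(float("inf")))
```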


Re: [Python-Dev] Re: marshal / unmarshal

2005-04-12 Thread Michael Hudson
Tim Peters [EMAIL PROTECTED] writes:

 ...

 [mwh]
 I recall stories of machines that stored the bytes of long in some
 crazy order like that.  I think Python would already be broken on such
 a system, but, also, don't care.

 [Tim]
 Python does very little that depends on internal native byte order,
 and C hides it in the absence of casting abuse.

 [mwh]
 This surely does:

 PyObject *
 PyLong_FromLongLong(PY_LONG_LONG ival)
 {
     PY_LONG_LONG bytes = ival;
     int one = 1;
     return _PyLong_FromByteArray(
         (unsigned char *)&bytes,
         SIZEOF_LONG_LONG, IS_LITTLE_ENDIAN, 1);
 }

 Yes, that's 'casting abuse'.  Python does very little of that.  If it
 becomes necessary, it's straightforward but long-winded to rewrite the
 above in wholly portable C (peel the bytes out of ival,
 least-significant first, via shifting and masking 8 times; ival &
 0xff is the least-significant byte regardless of memory storage
 order; etc).

Not arguing with that.

 BTW, the IS_LITTLE_ENDIAN macro also relies on casting abuse, and
 more deeply than does the visible cast there.

I'd like to claim that was part of my point :)

There is a certain, small level of assumption in Python that
big-endian or little-endian is the only question to ask -- and I
don't think that's a problem!

Even this isn't a big deal: at least if we choose a more
interesting 'probe value' than 1.5, it will just lead to an oddball
box degrading to the non-IEEE code.
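
To illustrate why the probe value matters, here is a sketch with a hypothetical probe whose eight encoded bytes are all different. The constant and the function name are assumptions made for the example, not taken from the patch. 1.5 encodes as 3f f8 followed by six zero bytes, so many scrambled layouts look identical to a plain one; a probe with distinct bytes matches only the two straightforward orders, and anything else falls through to "unknown".

```python
import struct

# Hypothetical probe: its big-endian IEEE 754 bytes are all distinct.
PROBE = 9006104071832581.0
PROBE_BIG = bytes([0x43, 0x3F, 0xFF, 0x01, 0x02, 0x03, 0x04, 0x05])

def detect_double_format(native_bytes):
    """Guess the double format from the native bytes of PROBE."""
    if native_bytes == PROBE_BIG:
        return "IEEE, big-endian"
    if native_bytes == PROBE_BIG[::-1]:
        return "IEEE, little-endian"
    return "unknown"                      # oddball box: use the non-IEEE code

native_format = detect_double_format(struct.pack("@d", PROBE))

# A box that swaps the two 4-byte words is caught by the probe...
word_swapped = PROBE_BIG[4:] + PROBE_BIG[:4]
# ...but 1.5 cannot see a scramble confined to its six zero bytes.
one_and_a_half = struct.pack(">d", 1.5)
```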

 It occurs that in the IEEE case, special values can be detected with
 reliability -- by picking the exponent field out by force

 Right, that works for NaNs and infinities; signed zeroes are a bit
 trickier to detect.

Hmm.  Don't think they're such a big deal.

 -- and a warning emitted or exception raised.  Good idea?  Hard to
 say, to me.

 It's not possible to _create_ a NaN or infinity from finite operands
 in 754 without signaling some exceptional condition.  Once you have
 one, though, there's generally nothing exceptional about _using_ it. 
 Sometimes there is, like +Inf - +Inf or Inf / Inf, but not generally. 
 Using a quiet NaN never signals; using a signaling NaN almost always
 signals.

 So packing a nan or inf shouldn't complain.  On a 754 box, unpacking
 one shouldn't complain either.  Unpacking a nan or inf on a non-754
 box probably should complain, since there's in general nothing it can
 be unpacked _to_ that makes any sense (errors should never pass
 silently).

This sounds like good behaviour to me.  I'll try to update the patch
soon.

Cheers,
mwh

-- 
  BUGS   Never use this function.  This function modifies its first
 argument.   The  identity  of  the delimiting character is
 lost.  This function cannot be used on constant strings.
-- the glibc manpage for strtok(3)


RE: [Python-Dev] args attribute of Exception objects

2005-04-12 Thread Raymond Hettinger
[Sébastien de Menten]
  2) Could this be changed to .args more in line with:
     a) first example: e.args = ('foo', "NameError: name 'foo' is not defined")
     b) second example: e.args = (4, 'foo', "'int' object has no attribute 'foo'",)
     the message of the string can even be retrieved with str(e) so it is
     also redundant.

Something like this ought to be explored at some point.  It would
certainly improve the exception API to be able to get references to the
objects without parsing strings.

The balancing forces are backwards compatibility and a need to keep the
exception mechanism as lightweight as possible.

Please log a feature request on SF.  Note that the idea is only for
making builtin exceptions more informative.  User defined exceptions can
already attach arbitrary objects:

>>> class Boom(Exception):
...     pass
...
>>> x = 10
>>> if x != 5:
...     raise Boom("Value must be a five", x)
...
Traceback (most recent call last):
  File "<pyshell#12>", line 2, in -toplevel-
    raise Boom("Value must be a five", x)
Boom: ('Value must be a five', 10)
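
As a small follow-on sketch (written with the newer "except ... as" syntax, which is an assumption about the reader's Python version), the handler can pull the attached objects straight out of e.args without parsing the message string:

```python
class Boom(Exception):
    pass

x = 10
try:
    if x != 5:
        raise Boom("Value must be a five", x)
except Boom as e:
    # args holds exactly what was passed to the constructor.
    message, offending_value = e.args
```

Here offending_value is the original integer rather than its text form, which is the kind of access the proposal would extend to builtin exceptions.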


Raymond Hettinger