Re: [Python-Dev] Decimal(unicode)

2008-03-28 Thread Greg Ewing
Nick Coghlan wrote:

> What features do you find particularly unfortunate?

Whichever ones are making people think that implementing
it in C is infeasible.

> Just because 
> something isn't particularly amenable to implementation in C, doesn't 
> make it a bad API for a Python library

No, but for something like a number type, which benefits
greatly from speed, making it actively C-hostile doesn't
seem like a good idea.

> (e.g. the dicts to enable/signal 
> the different error traps are a natural interface for Python code

I don't see why there can't be an object with a mapping
interface for this, that stores them internally as a
bit field or whatever is convenient for the C code.
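
[Editorial sketch] One way to read the suggestion above: keep the dict-style interface, but back it with a single integer. This is a minimal illustration, not code from the decimal module; the class name `SignalMap` and the bit assignments are assumptions made up for the example.

```python
import decimal
from collections.abc import MutableMapping

# Hypothetical bit assignments; a real C implementation would choose its own.
_SIGNALS = (
    decimal.Clamped,
    decimal.DivisionByZero,
    decimal.Inexact,
    decimal.InvalidOperation,
    decimal.Overflow,
    decimal.Rounded,
    decimal.Subnormal,
    decimal.Underflow,
)
_BIT = {sig: 1 << i for i, sig in enumerate(_SIGNALS)}


class SignalMap(MutableMapping):
    """Dict-like view of a set of signals stored as one integer."""

    def __init__(self, bits=0):
        self.bits = bits

    def __getitem__(self, signal):
        return bool(self.bits & _BIT[signal])

    def __setitem__(self, signal, enabled):
        if enabled:
            self.bits |= _BIT[signal]
        else:
            self.bits &= ~_BIT[signal]

    def __delitem__(self, signal):
        raise TypeError("signals cannot be removed, only cleared")

    def __iter__(self):
        return iter(_SIGNALS)

    def __len__(self):
        return len(_SIGNALS)


traps = SignalMap()
traps[decimal.DivisionByZero] = True
traps[decimal.Overflow] = True
traps[decimal.Overflow] = False
print(traps[decimal.DivisionByZero], traps[decimal.Overflow], bin(traps.bits))
# True False 0b10
```

Python callers still write `traps[Signal] = True`, while lower-level code could test `bits` directly.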

-- 
Greg

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Decimal(unicode)

2008-03-28 Thread Georg Brandl
Mark Dickinson schrieb:
> On Thu, Mar 27, 2008 at 4:46 AM, Georg Brandl <[EMAIL PROTECTED]> wrote:
> 
> 
> As Nick said, a drop-in replacement in C isn't feasible
> 
> But probably users of decimal won't really care if they have to slightly
> adapt their code if they get the speed increase instead.
> 
> 
> Could you give me an example of the sort of adaptations that might be
> necessary, or the API changes that would be necessary to make a
> drop-in replacement possible?
> 
> Are you just talking about things like removing the context argument
> from __add__ and friends, or is it more serious than this?

One thing Nick already said is the handling of signal traps via a
dict with class keys -- with that, it's necessary to check the dict
all the time. Having e.g. a method to set traps would enable the
implementation to work with a bit field.

There were some other things I can't remember right now, but this was
the most problematic one.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



Re: [Python-Dev] Decimal(unicode)

2008-03-28 Thread Nick Coghlan
Greg Ewing wrote:
> Georg Brandl wrote:
> 
>> As Nick said, a drop-in replacement in C isn't feasible
> 
> Yes, but that appears to be so only because of some
> unfortunate features of the Python version's API.
> 
> Seems to me it would be better to undergo a little
> pain now and get a well-designed C-friendly API.

What features do you find particularly unfortunate? Just because 
something isn't particularly amenable to implementation in C, doesn't 
make it a bad API for a Python library (e.g. the dicts to enable/signal 
the different error traps are a natural interface for Python code, even 
though C code would be inclined to use a bit field for the same thing).
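
[Editorial sketch] For readers unfamiliar with the interface under discussion, the example below shows the dict-style traps and flags on a decimal context, using only the public `decimal` API. The variable names are illustrative.

```python
from decimal import Decimal, DivisionByZero, localcontext

with localcontext() as ctx:
    # Traps: a mapping from signal class to bool; True means "raise".
    ctx.traps[DivisionByZero] = False
    quotient = Decimal(1) / Decimal(0)   # no exception; result is Infinity
    # Flags: same shape, set by the library when a condition occurs.
    saw_div_by_zero = bool(ctx.flags[DivisionByZero])

    ctx.traps[DivisionByZero] = True
    try:
        Decimal(1) / Decimal(0)
        raised = False
    except DivisionByZero:
        raised = True

print(quotient, saw_div_by_zero, raised)
```

The keys are the signal classes themselves, which is the "dict with class keys" design that the thread describes as awkward to mirror in C.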

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


Re: [Python-Dev] Decimal(unicode)

2008-03-27 Thread Mark Dickinson
On Thu, Mar 27, 2008 at 4:46 AM, Georg Brandl <[EMAIL PROTECTED]> wrote:

>
> As Nick said, a drop-in replacement in C isn't feasible
>
> But probably users of decimal won't really care if they have to slightly
> adapt their code if they get the speed increase instead.
>

Could you give me an example of the sort of adaptations that might be
necessary, or the API changes that would be necessary to make a
drop-in replacement possible?

Are you just talking about things like removing the context argument
from __add__ and friends, or is it more serious than this?

Mark


Re: [Python-Dev] Decimal(unicode)

2008-03-27 Thread Greg Ewing
Georg Brandl wrote:

> As Nick said, a drop-in replacement in C isn't feasible

Yes, but that appears to be so only because of some
unfortunate features of the Python version's API.

Seems to me it would be better to undergo a little
pain now and get a well-designed C-friendly API.

-- 
Greg


Re: [Python-Dev] Decimal(unicode)

2008-03-27 Thread Mark Dickinson
On Thu, Mar 27, 2008 at 4:46 AM, Georg Brandl <[EMAIL PROTECTED]> wrote:

> As Nick said, a drop-in replacement in C isn't feasible
>
> But probably users of decimal won't really care if they have to slightly
> adapt their code if they get the speed increase instead.
>
> We had a SOC student working on decimal-c in the past, so it shouldn't be
> totally dead. What about this year's SOC?
>

I worry that rewriting Decimal in C in its entirety would make it
significantly harder to maintain.  The IBM Decimal Specification
hasn't stabilised yet:  there's another update to it expected some
time after IEEE 754r is finally approved, so there are probably
still significant changes to be made to Decimal in the future.

I know that I would have contributed a lot less to Decimal had
it been written in C, simply because it would have taken me
much more time to understand and modify the code.

Mark


Re: [Python-Dev] Decimal(unicode)

2008-03-27 Thread Facundo Batista
2008/3/26, Nick Coghlan <[EMAIL PROTECTED]>:

>  Basically, while it makes a lot of sense to move the *arithmetic* to C
>  (as Mark mentioned in his other post), there's a lot of ancillary stuff
>  related to flags and exceptions and context handling that is much easier
>  to handle in Python.

That's why we think that the most probable future move here is:

1. Code a small core in C.

2. Leave the rest of Decimal in Python.

"Small core" and "rest of Decimal" are moving targets: we (as
python-dev) could decide that __add__ is really worth doing in C,
but not quantize(), for example.

Regards,

-- 
.Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/


Re: [Python-Dev] Decimal(unicode)

2008-03-27 Thread Georg Brandl
Greg Ewing schrieb:
> Nick Coghlan wrote:
>> I believe the list of incompatibilities and kludges and the subsequent 
>> comments in the following file give the gist of the problems:
>> http://svn.python.org/projects/sandbox/trunk/decimal-c/_decimal.c
> 
> It sounds like some aspects of the API weren't thought
> through very well when the Python version was designed.
> 
> The question now is whether to fix the API design, or
> leave it to become entrenched and lose all hope of
> ever having a fully C-coded implementation.

As Nick said, a drop-in replacement in C isn't feasible

But probably users of decimal won't really care if they have to slightly
adapt their code if they get the speed increase instead.

We had a SOC student working on decimal-c in the past, so it shouldn't be
totally dead. What about this year's SOC?

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



Re: [Python-Dev] Decimal(unicode)

2008-03-27 Thread Greg Ewing
Nick Coghlan wrote:
> I believe the list of incompatibilities and kludges and the subsequent 
> comments in the following file give the gist of the problems:
> http://svn.python.org/projects/sandbox/trunk/decimal-c/_decimal.c

It sounds like some aspects of the API weren't thought
through very well when the Python version was designed.

The question now is whether to fix the API design, or
leave it to become entrenched and lose all hope of
ever having a fully C-coded implementation.

-- 
Greg


Re: [Python-Dev] Decimal(unicode)

2008-03-26 Thread Nick Coghlan
Greg Ewing wrote:
> Nick Coghlan wrote:
>> Greg Ewing wrote:
>>
>>> I thought Decimal was going to be replaced by a C
>>> implementation soon anyway.
>> I believe that was found to be more trouble than it was worth.
> 
> That's very disappointing. Was there any discussion of
> the problems that killed it? I don't remember seeing
> any showstoppers being mentioned.

I believe the list of incompatibilities and kludges and the subsequent 
comments in the following file give the gist of the problems:
http://svn.python.org/projects/sandbox/trunk/decimal-c/_decimal.c

Basically, while it makes a lot of sense to move the *arithmetic* to C 
(as Mark mentioned in his other post), there's a lot of ancillary stuff 
related to flags and exceptions and context handling that is much easier 
to handle in Python.

Cheers,
Nick.


-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


Re: [Python-Dev] Decimal(unicode)

2008-03-26 Thread Greg Ewing
Nick Coghlan wrote:

> Yeah, this thread has convinced me that it would be better to start 
> rejecting bytes in int() and float() as well rather than implicitly 
> assuming an ASCII encoding.

I had another thought -- would it be feasible to have
some kind of wrapper object that would make a byte
array containing ascii chars look like a string?
Then cases like this could be handled without having
to copy the data.

-- 
Greg


Re: [Python-Dev] Decimal(unicode)

2008-03-26 Thread Greg Ewing
Nick Coghlan wrote:
> Greg Ewing wrote:
> 
>> I thought Decimal was going to be replaced by a C
>> implementation soon anyway.
> 
> I believe that was found to be more trouble than it was worth.

That's very disappointing. Was there any discussion of
the problems that killed it? I don't remember seeing
any showstoppers being mentioned.

-- 
Greg


Re: [Python-Dev] Decimal(unicode)

2008-03-26 Thread Nick Coghlan
Mark Dickinson wrote:
> On Wed, Mar 26, 2008 at 2:57 AM, Nick Coghlan <[EMAIL PROTECTED]> wrote:
> 
> Greg Ewing wrote:
>  > I thought Decimal was going to be replaced by a C
>  > implementation soon anyway. If so, is it worth going
>  > to much trouble over this?
>  >
> 
> I believe that was found to be more trouble than it was worth. So the
> optimisations focused on various ways of making the Python
> implementation more efficient.
> 
> 
> I think it's still worth considering a hybrid implementation of Decimal:
> C code for the basic integer arithmetic (that is, supply a long int
> replacement whose underlying implementation works in base a
> power of 10), and Python for all the complicated logic (dealing
> with flags, special values, etc.).  This will speed things up in the
> usual cases, and also give everything the right asymptotics for
> those few people using Decimal to do really high precision arithmetic.
> (Right now, addition of two Decimals takes quadratic time.)
> 
> The decimal long integer implementation is already in the sandbox,
> so this probably isn't as much work as it sounds.

Ah, I didn't know that - I guess you're talking about extracting the 
integer arithmetic section from the decimal-c implementation? In that 
case, yes, using a custom type for the guts of the mantissa arithmetic 
instead of trying to get reasonable speed out of a mixture of builtin 
types would be a very good thing (and would obviously eliminate the 
current performance problems in the Py3k version of decimal).

Do you think it would be feasible to get this done for the first beta at 
the beginning of June? (I did have a look at _decimal.c in the sandbox 
to see how much help I could offer, but I have to confess my eyes 
started to glaze over a bit ;)

Or would it be better to pursue a simple C object that just stored a 
sequence of integers and provided methods for fast conversion to/from a 
long for 2.6/3.0 and defer the arithmetic-in-C to 3.1?

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


Re: [Python-Dev] Decimal(unicode)

2008-03-26 Thread Facundo Batista
2008/3/26, Mark Dickinson <[EMAIL PROTECTED]>:

> I think it's still worth considering a hybrid implementation of Decimal:
> C code for the basic integer arithmetic (that is, supply a long int
> replacement whose underlying implementation works in base a
>  power of 10), and Python for all the complicated logic (dealing

I think that this is the way to go, also.

-- 
.Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/


Re: [Python-Dev] Decimal(unicode)

2008-03-26 Thread Mark Dickinson
On Wed, Mar 26, 2008 at 2:57 AM, Nick Coghlan <[EMAIL PROTECTED]> wrote:

> Greg Ewing wrote:
> > I thought Decimal was going to be replaced by a C
> > implementation soon anyway. If so, is it worth going
> > to much trouble over this?
> >
>
> I believe that was found to be more trouble than it was worth. So the
> optimisations focused on various ways of making the Python
> implementation more efficient.
>

I think it's still worth considering a hybrid implementation of Decimal:
C code for the basic integer arithmetic (that is, supply a long int
replacement whose underlying implementation works in base a
power of 10), and Python for all the complicated logic (dealing
with flags, special values, etc.).  This will speed things up in the
usual cases, and also give everything the right asymptotics for
those few people using Decimal to do really high precision arithmetic.
(Right now, addition of two Decimals takes quadratic time.)

The decimal long integer implementation is already in the sandbox,
so this probably isn't as much work as it sounds.  I won't have time
for it until May, though.
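
[Editorial sketch] To illustrate why a base-power-of-10 integer gives linear-time addition, here is a toy version in Python. It is not the sandbox implementation; the limb size of 9 digits and all function names are assumptions chosen for the example.

```python
BASE_DIGITS = 9
BASE = 10 ** BASE_DIGITS


def to_limbs(digits):
    """Split a decimal digit string into little-endian base-10**9 limbs."""
    limbs = []
    for end in range(len(digits), 0, -BASE_DIGITS):
        start = max(0, end - BASE_DIGITS)
        limbs.append(int(digits[start:end]))
    return limbs or [0]


def add_limbs(a, b):
    """Schoolbook addition: one pass with a carry, linear in the limb count."""
    result = []
    carry = 0
    for i in range(max(len(a), len(b))):
        total = carry
        if i < len(a):
            total += a[i]
        if i < len(b):
            total += b[i]
        carry, limb = divmod(total, BASE)
        result.append(limb)
    if carry:
        result.append(carry)
    return result


def to_digits(limbs):
    """Join limbs back into a digit string; only the top limb is unpadded."""
    head = str(limbs[-1])
    tail = "".join(f"{limb:0{BASE_DIGITS}d}" for limb in reversed(limbs[:-1]))
    return head + tail


x, y = "999999999999999999", "1"
print(to_digits(add_limbs(to_limbs(x), to_limbs(y))))
# 1000000000000000000
```

Because each limb holds a fixed number of decimal digits, converting to and from a digit string touches each limb once and needs no base conversion of the whole number.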

Mark


Re: [Python-Dev] Decimal(unicode)

2008-03-26 Thread M.-A. Lemburg
On 2008-03-26 07:11, Martin v. Löwis wrote:
>> For binary representations, we already have the struct module to handle 
>> the parsing, but for byte sequences with embedded ASCII digits it's 
>> reasonably common practice to use strings along with the respective type 
>> constructors.
> 
> Sure, but why can't you write
> 
>  foo = int(bar[start:stop].decode("ascii"))
> 
> then? Explicit is better than implicit.

Agreed.

The whole purpose of Unicode is to store text. Data from a file
isn't text per-se. You have to tell Python that a particular set of
bytes is to be interpreted as text and that only works by explicitly
converting the bytes to text.

Numbers or digits aren't any different in this context.
b"1234" is just a sequence of bytes and could well represent
the binary encoding of an integer, the start of a base64 encoded
image, an SSH key or an audio file.

Don't get fooled by the looks of b"1234". It's really just a
shorter way of writing 0x31 0x32 0x33 0x34.
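
[Editorial sketch] The point above can be made concrete: the same four bytes yield different numbers depending on which interpretation the caller states. Only standard-library calls are used; the variable names are illustrative.

```python
import struct

data = b"1234"

raw_values = list(data)                          # the four byte values
as_text_number = int(data.decode("ascii"))       # bytes -> text -> number
as_big_endian = int.from_bytes(data, "big")      # one 32-bit big-endian integer
(as_little_endian,) = struct.unpack("<I", data)  # same bytes, little-endian

print(raw_values, as_text_number, hex(as_big_endian), hex(as_little_endian))
# [49, 50, 51, 52] 1234 0x31323334 0x34333231
```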

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Decimal(unicode)

2008-03-26 Thread Nick Coghlan
Martin v. Löwis wrote:
>> For binary representations, we already have the struct module to handle 
>> the parsing, but for byte sequences with embedded ASCII digits it's 
>> reasonably common practice to use strings along with the respective type 
>> constructors.
> 
> Sure, but why can't you write
> 
>  foo = int(bar[start:stop].decode("ascii"))
> 
> then? Explicit is better than implicit.

Yeah, this thread has convinced me that it would be better to start 
rejecting bytes in int() and float() as well rather than implicitly 
assuming an ASCII encoding.

If we decide the fast path for ASCII is still important (e.g. to solve 
3.0's current speed problems in decimal), then it would be better to add 
separate methods to int to expose the old 2.x str->int and int->str 
optimisations (e.g. an int.from_ascii class method and an int.to_ascii 
instance method).
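
[Editorial sketch] The methods named above are a proposal in this message, not an existing API. The functions below are hypothetical stand-ins that only show the intended interface; they go through str, so they say nothing about the speed a dedicated implementation might achieve.

```python
def int_from_ascii(data: bytes) -> int:
    """Hypothetical stand-in for the proposed int.from_ascii class method."""
    # Decoding as ASCII rejects any byte >= 0x80 with UnicodeDecodeError;
    # int() then rejects anything that is not a valid integer literal.
    return int(data.decode("ascii"))


def int_to_ascii(value: int) -> bytes:
    """Hypothetical stand-in for the proposed int.to_ascii instance method."""
    return str(value).encode("ascii")


print(int_from_ascii(b"-42"), int_to_ascii(1234))
# -42 b'1234'
```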

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


Re: [Python-Dev] Decimal(unicode)

2008-03-25 Thread Nick Coghlan
Greg Ewing wrote:
> I thought Decimal was going to be replaced by a C
> implementation soon anyway. If so, is it worth going
> to much trouble over this?
> 

I believe that was found to be more trouble than it was worth. So the 
optimisations focused on various ways of making the Python 
implementation more efficient.

One of those ways was to store the mantissa as a string in order to gain 
the benefit of the fast str->int and int->str conversions. The 3.0 
version no longer has that benefit, and it shows.

It looks like it may be necessary to switch to a custom object for the 
mantissa storage in order to get those fast conversions back.
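
[Editorial sketch] As a rough picture of what "store the mantissa as a string" means, here is a toy class. It is not the decimal module's internal layout; `TinyDecimal` and its attributes are invented for illustration and handle only non-negative values.

```python
class TinyDecimal:
    """Toy number: value = int(coefficient) * 10**exponent (non-negative only)."""

    def __init__(self, coefficient: str, exponent: int):
        self.coefficient = coefficient   # digits kept as text, e.g. "213"
        self.exponent = exponent

    def __mul__(self, other):
        # Arithmetic round-trips through int, so the str -> int and
        # int -> str conversions sit on the hot path.
        product = int(self.coefficient) * int(other.coefficient)
        return TinyDecimal(str(product), self.exponent + other.exponent)

    def __str__(self):
        # With digits stored as text, rendering is slicing, not arithmetic.
        digits, exp = self.coefficient, self.exponent
        if exp >= 0:
            return digits + "0" * exp
        digits = digits.rjust(-exp + 1, "0")
        return digits[:exp] + "." + digits[exp:]


print(TinyDecimal("213", -2) * TinyDecimal("2", 0))
# 4.26
```

The design only pays off when the string-to-integer conversions are cheap, which is the property the message says was lost in 3.0.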

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


Re: [Python-Dev] Decimal(unicode)

2008-03-25 Thread Martin v. Löwis
> For binary representations, we already have the struct module to handle 
> the parsing, but for byte sequences with embedded ASCII digits it's 
> reasonably common practice to use strings along with the respective type 
> constructors.

Sure, but why can't you write

 foo = int(bar[start:stop].decode("ascii"))

then? Explicit is better than implicit.
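
[Editorial sketch] Applied to a wire-protocol frame, the explicit form might look like this. The frame layout is hypothetical (a 2-byte tag, four ASCII digits, one checksum byte) and exists only to make the slice meaningful.

```python
# Hypothetical frame layout: 2-byte tag, 4 ASCII digits, 1-byte checksum.
frame = b"\x02T0023\xff"
start, stop = 2, 6

field = frame[start:stop]            # still bytes: b"0023"
value = int(field.decode("ascii"))   # explicit bytes -> text -> int

print(field, value)
# b'0023' 23
```

Only the digit field is decoded; the checksum byte 0xff is not valid ASCII and stays as raw bytes.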

Regards,
Martin


Re: [Python-Dev] Decimal(unicode)

2008-03-25 Thread Greg Ewing
I thought Decimal was going to be replaced by a C
implementation soon anyway. If so, is it worth going
to much trouble over this?

-- 
Greg


Re: [Python-Dev] Decimal(unicode)

2008-03-25 Thread Facundo Batista
2008/3/25, Alex Martelli <[EMAIL PROTECTED]>:

> >  Since we have some strong use cases at least for the bytes->int case,
>  >  consistency then suggests that the other numeric types should all accept
>  >  bytes as well (interpreting them as ASCII encoded strings).
>
> +1 -- it seems very practical as well as consistent, and I see no downsides.

Mmm... Py3k-ish speaking:

"2.13" is a unicode string that holds four characters, two point one
three, which if converted to Decimal, gives me, well, Decimal("2.13").

b"2.13" is not a string of digits anymore, but a stream of 4
bytes that represents the binary number 0x322e3133...

So, what I find difficult to know is how you can unambiguously express a
collection of digits (inherent to strings) through bytes (without
mixing pre-3k concepts).

Regards,

-- 
.Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/


Re: [Python-Dev] Decimal(unicode)

2008-03-25 Thread Nick Coghlan
Greg Ewing wrote:
> Terry Reedy wrote:
>> The purpose of type constructors is to construct instances from reasonable 
>> inputs.  I think all number constructors should accept bytes
> 
> What should bytes as input to a number constructor
> mean, though?
> 
> People seem to be assuming it should be interpreted
> as ASCII-encoded characters.
> 
> But an equally plausible interpretation might be
> that it's some binary representation of a number.

The difference is that there are some hardware control protocols which 
it makes sense to treat as sequences of bytes, which also contain 
numbers as ASCII digits which need to be processed. It's also the case 
that the permitted characters when passing a *string* to a numeric 
constructor are themselves an ASCII subset.

For binary representations, we already have the struct module to handle 
the parsing, but for byte sequences with embedded ASCII digits it's 
reasonably common practice to use strings along with the respective type 
constructors.

However, Mark found another problem when he attempted to speed up the 
Py3k version of decimal by storing the mantissa as a bytes object 
instead of a unicode string: there is currently no efficient way to 
serialise a number into a byte sequence. So storing the mantissa as a 
bytes object is actually currently slower than storing it as a string, 
as you have to convert the number to a string before you can store it in 
a bytes object. That still leaves us with the problem that decimal is 
about 25% slower in 3.0 than it is in 2.6, due to the fact that the 
unicode->int conversion is much slower than the corresponding 2.x 
str->int conversion.

Ugly problem :P

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


Re: [Python-Dev] Decimal(unicode)

2008-03-25 Thread Greg Ewing
Terry Reedy wrote:
> The purpose of type constructors is to construct instances from reasonable 
> inputs.  I think all number constructors should accept bytes

What should bytes as input to a number constructor
mean, though?

People seem to be assuming it should be interpreted
as ASCII-encoded characters.

But an equally plausible interpretation might be
that it's some binary representation of a number.

-- 
Greg



Re: [Python-Dev] Decimal(unicode)

2008-03-25 Thread Greg Ewing
Nick Coghlan wrote:
> Since we have some strong use cases at least for the bytes->int case, 
> consistency then suggests that the other numeric types should all accept 
> bytes as well (interpreting them as ASCII encoded strings).

How far should this go? Is conversion to numbers really
so special, or should bytes be acceptable in any context
requiring a string, with an implicit encoding of ascii?

-- 
Greg



Re: [Python-Dev] Decimal(unicode)

2008-03-25 Thread Eric Smith
Martin v. Löwis wrote:
>> I'd call this a bug.  The change is an accident, a side-effect of the fact
>> that in 2.5.1 the coefficient (mantissa) of a Decimal was stored as a
>> tuple, and in 2.5.2 it's stored as a string (which greatly improves 
>> efficiency).
>> Clearly in 2.5.2 the mantissa is being stored as a unicode instance in the
>> second case;  it should be explicitly coerced to str in Decimal.__new__.
>>
>> If others agree that it's a bug, I'll fix it.
> 
> If people agree it's a bug, please do fix it.

It looks like a bug to me.


Re: [Python-Dev] Decimal(unicode)

2008-03-25 Thread Martin v. Löwis
> I'd call this a bug.  The change is an accident, a side-effect of the fact
> that in 2.5.1 the coefficient (mantissa) of a Decimal was stored as a
> tuple, and in 2.5.2 it's stored as a string (which greatly improves 
> efficiency).
> Clearly in 2.5.2 the mantissa is being stored as a unicode instance in the
> second case;  it should be explicitly coerced to str in Decimal.__new__.
> 
> If others agree that it's a bug, I'll fix it.

If people agree it's a bug, please do fix it.

Regards,
Martin


Re: [Python-Dev] Decimal(unicode)

2008-03-25 Thread Terry Reedy

"Mark Dickinson" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
| On Tue, Mar 25, 2008 at 11:29 AM, Nick Coghlan <[EMAIL PROTECTED]> wrote:
|
| > The isinstance(value, str) check in Py3k is too restrictive - it needs
| > to accept bytes instances as well.
| >
|
| Hmm. There's not a lot of consistency here:
|
| >>> int(b'1')
| 1
| >>> float(b'1')
| 1.0
| >>> complex(b'1')
| Traceback (most recent call last):
|   File "<stdin>", line 1, in <module>
| TypeError: complex() argument must be a string or a number
| >>> from fractions import Fraction
| >>> Fraction(b'1')
| Traceback (most recent call last):
|   File "<stdin>", line 1, in <module>
|   File "/Users/dickinsm/python_source/py3k/Lib/fractions.py", line 98, in __new__
|     numerator = numerator.__index__()
| AttributeError: 'bytes' object has no attribute '__index__'
|
| So int and float accept bytes, while complex, Decimal and Fraction do
| not...

The purpose of type constructors is to construct instances from reasonable 
inputs.  I think all number constructors should accept bytes and so the 
latter three should be changed.

tjr





Re: [Python-Dev] Decimal(unicode)

2008-03-25 Thread Alex Martelli
On Tue, Mar 25, 2008 at 9:43 AM, Nick Coghlan <[EMAIL PROTECTED]> wrote:
   ...
>  Since we have some strong use cases at least for the bytes->int case,
>  consistency then suggests that the other numeric types should all accept
>  bytes as well (interpreting them as ASCII encoded strings).

+1 -- it seems very practical as well as consistent, and I see no downsides.


Alex


Re: [Python-Dev] Decimal(unicode)

2008-03-25 Thread Nick Coghlan
Facundo Batista wrote:
> 2008/3/25, Mark Dickinson <[EMAIL PROTECTED]>:
> 
>> So int and float accept bytes, while complex, Decimal and Fraction do
>> not...
> 
> I'm -1 on accepting bytes as input for Decimal; I don't see a use case,
> and I think that conceptually there's no reason to do it.
> 
> Of course, I can be wrong, ;)

I was thinking converting directly from bytes would be significantly 
quicker than going through Unicode (e.g. for numbers read from a file), 
but that may not actually be the case (it'll definitely be faster 
because there is less data copying and movement involved, but the speed 
difference may be less dramatic than I first thought). So while the 
internal storage of the mantissa definitely needs to be changed to a 
bytes object in order to retain Mark's hard-won performance 
improvements, the case of whether or not to accept bytes is far less clear.

The way I see it either complex, Decimal and Fraction all need to be 
updated to accept bytes objects, or else int and float need to be 
updated to reject them.

It *definitely* needs to be possible to convert bytes objects to 
integers as if they were ASCII strings - otherwise a lot of wire 
protocol processing would become a nightmare. Indeed, the proposed 
change to Decimal to have it store the mantissa as a bytes object in 
Py3k assumes that it will still be possible to convert that value 
directly to a long object.

Since we have some strong use cases at least for the bytes->int case, 
consistency then suggests that the other numeric types should all accept 
bytes as well (interpreting them as ASCII encoded strings).

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


Re: [Python-Dev] Decimal(unicode)

2008-03-25 Thread Mark Dickinson
On Tue, Mar 25, 2008 at 11:57 AM, Facundo Batista <[EMAIL PROTECTED]>
wrote:

> 2008/3/25, Mark Dickinson <[EMAIL PROTECTED]>:
>
> > So int and float accept bytes, while complex, Decimal and Fraction do
> > not...
>
> I'm -1 on accepting bytes as input for Decimal; I don't see a use case,
> and I think that conceptually there's no reason to do it.
>

I've opened

http://bugs.python.org/issue2483

to keep track of this.

Mark


Re: [Python-Dev] Decimal(unicode)

2008-03-25 Thread Facundo Batista
2008/3/25, Mark Dickinson <[EMAIL PROTECTED]>:

> So int and float accept bytes, while complex, Decimal and Fraction do
> not...

I'm -1 on accepting bytes as input for Decimal: I don't see a use
case, and I think that conceptually there's no reason to do it.

Of course, I could be wrong ;)

Regards,

-- 
.Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/


Re: [Python-Dev] Decimal(unicode)

2008-03-25 Thread Mark Dickinson
On Tue, Mar 25, 2008 at 11:29 AM, Nick Coghlan <[EMAIL PROTECTED]> wrote:

> The isinstance(value, str) check in Py3k is too restrictive - it needs
> to accept bytes instances as well.
>

Hmm. There's not a lot of consistency here:

>>> int(b'1')
1
>>> float(b'1')
1.0
>>> complex(b'1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: complex() argument must be a string or a number
>>> from fractions import Fraction
>>> Fraction(b'1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/dickinsm/python_source/py3k/Lib/fractions.py", line 98, in __new__
    numerator = numerator.__index__()
AttributeError: 'bytes' object has no attribute '__index__'

So int and float accept bytes, while complex, Decimal and Fraction do
not...
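
The same comparison can be scripted rather than typed at the prompt. This is a rough sketch: the helper accepts_bytes is a hypothetical name, and which constructors accept bytes depends on the interpreter version, so no particular table of results is assumed beyond what the session above shows for int and float.

```python
from decimal import Decimal
from fractions import Fraction


def accepts_bytes(ctor, sample=b"1"):
    """Return True if ctor(sample) succeeds for a bytes argument."""
    try:
        ctor(sample)
    except (TypeError, ValueError, AttributeError):
        # Different types reject bytes with different exception classes.
        return False
    return True


results = {
    ctor.__name__: accepts_bytes(ctor)
    for ctor in (int, float, complex, Decimal, Fraction)
}

for name, ok in sorted(results.items()):
    print("%-8s %s" % (name, "accepts bytes" if ok else "rejects bytes"))
```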

Mark


Re: [Python-Dev] Decimal(unicode)

2008-03-25 Thread Facundo Batista
2008/3/25, Nick Coghlan <[EMAIL PROTECTED]>:

>
>  Anyway, +1 on coercing the mantissa to a str() instance in 2.5.
>

I don't know about 2.5, but I'm sure about 2.6.


>  To fix this, decimal probably needs to grow something like the following
>  near the top of the module:
>
>  try:
>      _bytes = bytes
>  except NameError: # 2.5 or earlier
>      _bytes = str
>
>  and then use _bytes instead of str as appropriate throughout the rest of
>  the module.

+1, I updated the bug created by Oleg.


>  The isinstance(value, str) check in Py3k is too restrictive - it needs
>  to accept bytes instances as well.

Why? Numbers in string form should just be strings, IMHO, not bytes...
do you have a use case for this?

Thanks!

-- 
.Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/


Re: [Python-Dev] Decimal(unicode)

2008-03-25 Thread Mark Dickinson
On Tue, Mar 25, 2008 at 11:29 AM, Nick Coghlan <[EMAIL PROTECTED]> wrote:

> I thought that might be what happened, but I couldn't remember if that
> optimisation was a 2.6 only change or not (I suspect it was included in
> 2.5 as a prereq to the spec compliance updates).
>

Exactly.


> Anyway, +1 on coercing the mantissa to a str() instance in 2.5.
>
> This does raise an interesting point though - currently Decimal in Py3k
> is storing the mantissa as a Unicode instance instead of a bytes
> instance. The performance implications of that are horrendous since
> PyLong_FromUnicode does a malloc, encodes the string into the malloced
> buffer, then invokes PyLong_FromString on the result.
>

Urk!  Yes, this definitely needs to be looked at.


> The following is also a problem in Py3k:
> [...]
>
> The isinstance(value, str) check in Py3k is too restrictive - it needs
> to accept bytes instances as well.
>

I don't understand this. Why does Decimal.__new__ need to accept
bytes instances?  Isn't it just supposed to be creating a Decimal
from a string?  Or is the idea that it should accept ASCII strings
that are masquerading as bytes instances, for reasons of
convenience/efficiency/something else?

Unicode/bytes/str and encoding issues frighten me much more than
floating-point arithmetic ever did, so I expect I'm missing many
things here.

Mark


Re: [Python-Dev] Decimal(unicode)

2008-03-25 Thread Oleg Broytmann
On Tue, Mar 25, 2008 at 10:47:42AM -0400, Mark Dickinson wrote:
> On Tue, Mar 25, 2008 at 9:46 AM, Oleg Broytmann <[EMAIL PROTECTED]> wrote:
> >In 2.5.2 it prints
> >
> >  <type 'str'>
> >  <type 'unicode'>
> >
> >Why the change? Is it a bug or a feature? Shouldn't .to_eng_string()
> >  always return a str?
> 
> I'd call this a bug.  The change is an accident, a side-effect of the fact
> that in 2.5.1 the coefficient (mantissa) of a Decimal was stored as a
> tuple, and in 2.5.2 it's stored as a string (which greatly improves 
> efficiency).
> Clearly in 2.5.2 the mantissa is being stored as a unicode instance in the
> second case;  it should be explicitly coerced to str in Decimal.__new__.
> 
> If others agree that it's a bug, I'll fix it.

http://bugs.python.org/issue2482

Oleg.
-- 
 Oleg Broytmann            http://phd.pp.ru/            [EMAIL PROTECTED]
   Programmers don't die, they just GOSUB without RETURN.


Re: [Python-Dev] Decimal(unicode)

2008-03-25 Thread Nick Coghlan
Mark Dickinson wrote:
> On Tue, Mar 25, 2008 at 9:46 AM, Oleg Broytmann <[EMAIL PROTECTED]> wrote:
>>In 2.5.2 it prints
>>
>>  <type 'str'>
>>  <type 'unicode'>
>>
>>Why the change? Is it a bug or a feature? Shouldn't .to_eng_string()
>>  always return a str?
> 
> I'd call this a bug.  The change is an accident, a side-effect of the fact
> that in 2.5.1 the coefficient (mantissa) of a Decimal was stored as a
> tuple, and in 2.5.2 it's stored as a string (which greatly improves 
> efficiency).
> Clearly in 2.5.2 the mantissa is being stored as a unicode instance in the
> second case;  it should be explicitly coerced to str in Decimal.__new__.
> 
> If others agree that it's a bug, I'll fix it.

I thought that might be what happened, but I couldn't remember if that 
optimisation was a 2.6 only change or not (I suspect it was included in 
2.5 as a prereq to the spec compliance updates).

Anyway, +1 on coercing the mantissa to a str() instance in 2.5.

This does raise an interesting point though - currently Decimal in Py3k 
is storing the mantissa as a Unicode instance instead of a bytes 
instance. The performance implications of that are horrendous since 
PyLong_FromUnicode does a malloc, encodes the string into the malloced 
buffer, then invokes PyLong_FromString on the result.
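
A rough way to measure the conversion cost from Python, sketched under the assumption of a Python 3 interpreter where int() accepts both str and bytes. The variable names are illustrative, and the timings depend heavily on the interpreter version, so this shows only the shape of the measurement and assumes no particular result.

```python
import timeit

# A 50-digit coefficient, stored once as text and once as ASCII bytes.
digits_text = "1234567890" * 5
digits_bytes = digits_text.encode("ascii")

# Time the coefficient -> integer conversion for each storage choice.
t_text = timeit.timeit(lambda: int(digits_text), number=20000)
t_bytes = timeit.timeit(lambda: int(digits_bytes), number=20000)

print("int(str):   %.4f s" % t_text)
print("int(bytes): %.4f s" % t_bytes)
```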

To fix this, decimal probably needs to grow something like the following 
near the top of the module:

try:
   _bytes = bytes
except NameError: # 2.5 or earlier
   _bytes = str

and then use _bytes instead of str as appropriate throughout the rest of 
the module.

The following is also a problem in Py3k:

 >>> from decimal import Decimal as d
 >>> d(1)
Decimal('1')
 >>> d('1')
Decimal('1')
 >>> d(b'1')
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/home/ncoghlan/devel/py3k/Lib/decimal.py", line 659, in __new__
     raise TypeError("Cannot convert %r to Decimal" % value)
TypeError: Cannot convert b'1' to Decimal

The isinstance(value, str) check in Py3k is too restrictive - it needs 
to accept bytes instances as well.
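
One possible shape for a relaxed check, written as a standalone wrapper rather than a change to decimal itself. The function decimal_from_text is hypothetical and not part of the decimal module; it assumes bytes input should be treated as ASCII-encoded text, as suggested above.

```python
from decimal import Decimal


def decimal_from_text(value):
    """Build a Decimal from str, or from bytes interpreted as ASCII."""
    if isinstance(value, (bytes, bytearray)):
        # Raises UnicodeDecodeError for non-ASCII input rather than guessing.
        value = bytes(value).decode("ascii")
    if not isinstance(value, str):
        raise TypeError("Cannot convert %r to Decimal" % (value,))
    return Decimal(value)


from_bytes = decimal_from_text(b"1")
from_str = decimal_from_text("1")
```

Decoding at the boundary keeps the stored coefficient a single type regardless of what the caller passed in.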

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


Re: [Python-Dev] Decimal(unicode)

2008-03-25 Thread Facundo Batista
2008/3/25, Mark Dickinson <[EMAIL PROTECTED]>:

>
> I'd call this a bug.  The change is an accident, a side-effect of the fact
>  that in 2.5.1 the coefficient (mantissa) of a Decimal was stored as a
>  tuple, and in 2.5.2 it's stored as a string (which greatly improves 
> efficiency).
>  Clearly in 2.5.2 the mantissa is being stored as a unicode instance in the
>  second case;  it should be explicitly coerced to str in Decimal.__new__.
>
>  If others agree that it's a bug, I'll fix it.
>

+1

-- 
.Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/


Re: [Python-Dev] Decimal(unicode)

2008-03-25 Thread Mark Dickinson
On Tue, Mar 25, 2008 at 9:46 AM, Oleg Broytmann <[EMAIL PROTECTED]> wrote:
>In 2.5.2 it prints
>
>  <type 'str'>
>  <type 'unicode'>
>
>Why the change? Is it a bug or a feature? Shouldn't .to_eng_string()
>  always return a str?

I'd call this a bug.  The change is an accident, a side-effect of the fact
that in 2.5.1 the coefficient (mantissa) of a Decimal was stored as a
tuple, and in 2.5.2 it's stored as a string (which greatly improves
efficiency).
Clearly in 2.5.2 the mantissa is being stored as a unicode instance in the
second case;  it should be explicitly coerced to str in Decimal.__new__.

If others agree that it's a bug, I'll fix it.

Mark


[Python-Dev] Decimal(unicode)

2008-03-25 Thread Oleg Broytmann
Hello. In Python 2.5.1 the code

import decimal

for d in '123', u'123':
    x = decimal.Decimal(d)
    print type(x.to_eng_string())

prints

<type 'str'>
<type 'str'>

   In 2.5.2 it prints

<type 'str'>
<type 'unicode'>

   Why the change? Is it a bug or a feature? Shouldn't .to_eng_string()
always return a str?
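
The expectation can be written as a small regression check. This sketch is for Python 3, where '123' and u'123' are the same type, so it does not reproduce the 2.5.2 behaviour; it only shows the shape of the test, and the chosen inputs are illustrative.

```python
import decimal

# Whatever the constructor was given, to_eng_string() should return
# exactly the native str type, not a subclass or another string type.
inputs = ("123", 123, decimal.Decimal("123"), (0, (1, 2, 3), 0))

result_types = []
for source in inputs:
    text = decimal.Decimal(source).to_eng_string()
    result_types.append(type(text))
    assert type(text) is str, "unexpected type for input %r" % (source,)
```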

Oleg.
-- 
 Oleg Broytmann            http://phd.pp.ru/            [EMAIL PROTECTED]
   Programmers don't die, they just GOSUB without RETURN.