[issue8048] doctest assumes sys.displayhook hasn't been touched

2010-03-03 Thread Noam Raphael

New submission from Noam Raphael noamr...@gmail.com:

Hello,

This bug is the cause of a bug reported about DreamPie: 
https://bugs.launchpad.net/bugs/530969

DreamPie (http://dreampie.sourceforge.net) changes sys.displayhook so that 
values are sent to the parent process instead of being printed to stdout. 
This causes doctest to fail when run from DreamPie, because doctest 
implicitly assumes that sys.displayhook writes the values it receives to 
sys.stdout; that assumption is the reason doctest replaces sys.stdout with 
its own file-like object, which is ready to receive the printed values.

The solution is simply to replace sys.displayhook with a function that will do 
the expected thing, just like sys.stdout is replaced. The patch I attach does 
exactly this.
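For illustration, here is a minimal sketch of the idea (not the attached patch; `run_captured` is a hypothetical helper): like doctest, it swaps in its own sys.stdout, and additionally installs its own displayhook, so a host application's modified sys.displayhook cannot divert the output.

```python
import io
import sys

def run_captured(code):
    """Run `code`, returning what a doctest-style harness would capture."""
    save_stdout, save_hook = sys.stdout, sys.displayhook
    buf = io.StringIO()

    def hook(value):
        # Mimic the default displayhook: ignore None, print the repr.
        if value is not None:
            print(repr(value), file=buf)

    sys.stdout, sys.displayhook = buf, hook
    try:
        # 'single' mode makes expression statements call sys.displayhook,
        # just as at the interactive prompt or inside doctest.
        exec(compile(code, '<doctest>', 'single'))
    finally:
        sys.stdout, sys.displayhook = save_stdout, save_hook
    return buf.getvalue()
```

With this in place, captured output is the same whether or not the embedding application replaced sys.displayhook beforehand.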

Thanks,
Noam

--
components: Library (Lib)
files: doctest.py.diff
keywords: patch
messages: 100334
nosy: noam
severity: normal
status: open
title: doctest assumes sys.displayhook hasn't been touched
type: behavior
Added file: http://bugs.python.org/file16421/doctest.py.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8048
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue7260] SyntaxError with a not-existing offset for unicode code

2009-11-03 Thread Noam Raphael

New submission from Noam Raphael noamr...@gmail.com:

Hello,

This is from the current svn:

$ ./python
Python 3.2a0 (py3k:76104, Nov  4 2009, 08:49:44) 
[GCC 4.4.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> try:
...     eval(u'שלום')
... except SyntaxError as e:
...     e
... 
SyntaxError('invalid syntax', ('<string>', 1, 11, u'שלום'))

As you can see, the offset (11) refers to a non-existing character, as
the code contains only 7 characters.
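As a small illustration (not from the report; `syntax_error_info` is a hypothetical helper), the fields involved are the SyntaxError attributes filename, lineno, offset and text; the exact offsets reported vary by Python version:

```python
def syntax_error_info(code):
    """Evaluate `code` and return the SyntaxError position fields, if any."""
    try:
        eval(code)
    except SyntaxError as e:
        # filename is '<string>' for eval()'d code; offset is 1-based.
        return (e.filename, e.lineno, e.offset, e.text)
    return None
```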

Thanks,
Noam

--
components: Interpreter Core
messages: 94879
nosy: noam
severity: normal
status: open
title: SyntaxError with a not-existing offset for unicode code
versions: Python 3.2

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue7260
___



[issue1580] Use shorter float repr when possible

2009-03-02 Thread Noam Raphael

Noam Raphael noamr...@gmail.com added the comment:

Do you mean msg58966?

I'm sorry, I still don't understand what the problem is with returning
f_15(x) if eval(f_15(x)) == x, and otherwise returning f_17(x). You said
(msg69232) that you don't care if float(repr(x)) == x isn't
cross-platform. Obviously, the simple method will preserve eval(repr(x))
== x, no matter what rounding bugs are present on the platform.
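The scheme described can be sketched as follows (an illustration, not the issue's patch; the format strings stand in for f_15 and f_17, and float() stands in for eval() in the round-trip check):

```python
def simple_repr(x):
    # 15 significant digits when they round-trip; otherwise 17 digits,
    # which always round-trip for IEEE-754 doubles -- so
    # eval(simple_repr(x)) == x holds regardless of platform quirks.
    f_15 = '%.15g' % x
    if float(f_15) == x:
        return f_15
    return '%.17g' % x
```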

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___



[issue1580] Use shorter float repr when possible

2009-03-01 Thread Noam Raphael

Noam Raphael noamr...@gmail.com added the comment:

I'm sorry, but it seems to me that the conclusion of the discussion in
2008 is that the algorithm should simply use the system's
binary-to-decimal routine, and if the result is like 123.456, round it
to 15 digits after the 0, check if the result evaluates to the original
value, and if so, return the rounded result. This would satisfy most
people, and has no need for complex rounding algorithms. Am I mistaken?

If I implement this, will anybody be interested?

Noam

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___



[issue3028] tokenize module: normal lines, not logical

2008-06-02 Thread Noam Raphael

New submission from Noam Raphael [EMAIL PROTECTED]:

Hello,

The documentation of the tokenize module says: "The line passed is the
*logical* line; continuation lines are included."

Some background: the tokenize module splits a Python source into tokens,
and reports where each token begins and ends, in the format
(row, offset). This note in the documentation made me think that
continuation lines are treated as one line, and left me puzzling over
how to find the offset of the token in the original string. The
truth is that the row number is simply the index of the line as returned
by the readline function, and it's very simple to reconstruct the string
offset.

I suggest that this be changed to something like: "The line passed
is the index of the string returned by the readline function, plus 1.
That is, the first string returned is called line 1, the second is
called line 2, and so on."
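The behaviour described can be seen directly (a small illustration using the Python 3 tokenize API): the row in a token's (row, col) position counts the strings returned by readline, starting from 1, even inside a continuation line.

```python
import io
import tokenize

# A two-line source with a parenthesized continuation.
source = "x = (1 +\n     2)\n"
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# The token '2' sits on the second string returned by readline,
# so its row is 2, not part of some "logical line 1".
two = next(t for t in tokens if t.string == '2')
row, col = two.start
```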

Thanks,
Noam

--
assignee: georg.brandl
components: Documentation
messages: 67635
nosy: georg.brandl, noam
severity: normal
status: open
title: tokenize module: normal lines, not logical
versions: Python 2.5

___
Python tracker [EMAIL PROTECTED]
http://bugs.python.org/issue3028
___



[issue979658] Improve HTML documentation of a directory

2008-01-05 Thread Noam Raphael

Noam Raphael added the comment:

I just wanted to say that I'm not going to bother too much with this
right now - personally, I will just use epydoc when I want to create
HTML documentation. Of course, you can still do whatever you like with
the patch.

Good luck,
Noam

--
nosy: +noam


Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue979658




[issue1580] Use shorter float repr when possible

2007-12-18 Thread Noam Raphael

Noam Raphael added the comment:

I think that we can give up float(repr(x)) == x across different
platforms, since we don't guarantee something more basic: We don't
guarantee that the same program doing only floating point operations
will produce the same results across different 754 platforms, because
in the compilation process we rely on the system's decimal to binary
conversion. In other words, using the current repr(), one can pass a
value x from platform A to platform B and be sure to get the same value.
But if one has a Python function f, one can't be sure that f(x) on
platform A will produce the same value as f(x) on platform B. So the
cross-platform repr() doesn't really matter.

I like eval(repr(x)) == x because it means that repr(x) captures all
the information about x, not because it lets me pass x from one
platform to another. For communication, I use other methods.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__



[issue1580] Use shorter float repr when possible

2007-12-18 Thread Noam Raphael

Noam Raphael added the comment:

2007/12/18, Raymond Hettinger [EMAIL PROTECTED]:
 The 17 digit representation is useful in that it suggests where the
 problem lies.  In contrast, showing two numbers with reprs of different
 lengths will strongly suggest that the shorter one is exactly
 represented.  Currently, that is a useful suggestion, 10.25 shows as
 10.25 while 10.21 shows as 10.210000000000001 (indicating that the
 latter is not exactly represented).  If you start showing
 1.1000000000000001 as 1.1, then you've lost both benefits.

Currently, repr(1.3) == '1.3', suggesting that it is exactly
represented, which isn't true. I think that unless you use an
algorithm that will truncate zeros only if the decimal representation
is exact, the suggested algorithm is less confusing than the current
one, in that it doesn't suggest that 1.3 is exactly stored and 1.1
isn't.
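The point about 1.3 can be checked directly (an illustration, assuming a Python version whose Decimal accepts a float, i.e. 2.7+/3.2+): the decimal module exposes the binary value actually stored.

```python
from decimal import Decimal

# 1.3 is not exactly representable as a binary double, even when
# repr() shows it as '1.3'; Decimal reveals the stored value.
stored = Decimal(1.3)
```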

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__



[issue1580] Use shorter float repr when possible

2007-12-18 Thread Noam Raphael

Noam Raphael added the comment:

About the educational problem: if someone is puzzled by 1.1*3 !=
3.3, you could always use '%.50f' % 1.1 instead of repr(1.1). I don't
think that trying to teach people that floating points don't always do
what they expect is a good reason to print uninteresting
and visually distracting digits when you don't have to.

About the compatibility problem: I don't see why it should matter to
the NumPy people if the repr() of some floats is made shorter. Anyway,
we can ask them, using a PEP or just the mailing list.

About the benefit: If I have data which contains floats, I'm usually
interested about their (physical) value, not about their last bits.
That's why str(f) does what it does. I like repr(x) to be one-to-one,
as I explained in the previous message, but if it can be made more
readable, why not make it so?

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__



[issue1580] Use shorter float repr when possible

2007-12-17 Thread Noam Raphael

Noam Raphael added the comment:

Ok, I think I have a solution!

We don't always need the shortest decimal representation. We just
want that, for most floats which have a nice decimal representation,
that representation will be used. 

Why not do something like that:

def newrepr(f):
    r = str(f)
    if eval(r) == f:
        return r
    else:
        return repr(f)

Or, in more words:

1. Calculate the decimal representation of f with 17 precision digits,
s1, using the system's routines.
2. Create a new string, s2, by rounding the resulting string to 12
precision digits.
3. Convert the resulting rounded string to a new double, g, using the
system's routines.
4. If f==g, return s2. Otherwise, return s1.
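The four steps can be expressed as code (a sketch only; format strings stand in for the system's binary/decimal routines, and float() performs the decimal-to-binary conversion of step 3):

```python
def newrepr_explicit(f):
    s1 = '%.17g' % f      # step 1: 17 significant digits, always round-trips
    s2 = '%.12g' % f      # step 2: the 12-digit rounding
    g = float(s2)         # step 3: decimal -> binary again
    return s2 if g == f else s1   # step 4
```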

It will take some more time than the current repr(), because of the
additional decimal to binary conversion, but we already said that if
speed is extremely important one can use '%.17f' % f. It will
obviously preserve the eval(repr(f)) == f property. And it will return a
short representation for almost any float that has a short representation.

This algorithm I will be glad to implement.

What do you think?

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__



[issue1580] Use shorter float repr when possible

2007-12-13 Thread Noam Raphael

Noam Raphael added the comment:

2007/12/13, Guido van Rossum [EMAIL PROTECTED]:

  Ok, so if I understand correctly, the ideal thing would be to
  implement decimal to binary conversion by ourselves. This would make
  str - float conversion do the same thing on all platforms, and would
  make repr(1.1)=='1.1'. This would also allow us to define exactly how
  floats operate, with regard to infinities and NaNs. All this is for
  IEEE-754 platforms -- for the rare platforms which don't support it,
  the current state remains.

 Does doubledigits.c not work for non-754 platforms?

No. It may be a kind of an oops, but currently it just won't compile
on platforms which it doesn't recognize, and it only recognizes 754
platforms.

  2. Keep the binary to shortest decimal routine and use it only when we
  know that the system's decimal to binary routine is correctly rounding
  (we can check - perhaps Microsoft has changed theirs?)

 Tim says you can't check (test) for this -- you have to prove it from
 source, or trust the vendor's documentation. I would have no idea
 where to find this documented.

The program for testing floating point compatibility is in
http://www.cant.ua.ac.be/ieeecc754.html

To run it, on my computer, I used:
./configure -target Conversions -platform IntelPentium_cpp
make
./IeeeCC754 -d -r n -n x Conversion/testsets/d2bconvd
less ieee.log

This tests only doubles, round to nearest, and ignores flags which
should be raised to signal inexact conversion. You can use any file in
Conversions/testsets/d2b* - I chose this one pretty randomly.

It turns out that even on my gcc 4.1.3 it finds a few floats not
correctly rounded. :(

Anyway, it can be used to test other platforms. If not by the
executable itself, we can pretty easily write a python program which
uses the test data.

I don't know what exactly the errors with gcc 4.1.3 mean - is there a
problem with the algorithm of glibc, or perhaps the testing program
didn't set some flag?

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__



[issue1580] Use shorter float repr when possible

2007-12-12 Thread Noam Raphael

Noam Raphael added the comment:

Ok, so if I understand correctly, the ideal thing would be to
implement decimal to binary conversion by ourselves. This would make
str -> float conversion do the same thing on all platforms, and would
make repr(1.1)=='1.1'. This would also allow us to define exactly how
floats operate, with regard to infinities and NaNs. All this is for
IEEE-754 platforms -- for the rare platforms which don't support it,
the current state remains.

However, I don't think I'm going, in the near future, to add a decimal
to binary implementation -- the Tcl code looks very nice, but it's
quite complicated and I don't want to fiddle with it right now.

If nobody is going to implement the correctly rounding decimal to
binary conversion, then I see three options:
1. Revert to previous situation
2. Keep the binary to shortest decimal routine and use it only when we
know that the system's decimal to binary routine is correctly rounding
(we can check - perhaps Microsoft has changed theirs?)
3. Keep the binary to shortest decimal routine and drop repr(f) == f
(I don't like that option).

If options 2 or 3 are chosen, we can check the 1e5 bug.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Noam Raphael

Noam Raphael added the comment:

The Tcl code can be found here:
http://tcl.cvs.sourceforge.net/tcl/tcl/generic/tclStrToD.c?view=markup

What Tim says gives another reason for using that code - it means that
currently, the compilation of the same source code on two platforms can
result in a code which does different things.

Just to make sure - IEEE does require that operations on doubles will do
the same thing on different platforms, right?

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Noam Raphael

Noam Raphael added the comment:

I think that for str(), the current method is better - using the new
repr() method will make str(1.1*3) == '3.3000000000000003', instead of
'3.3'. (The repr is right - you can check, and 1.1*3 != 3.3. But for
str() purposes it's fine.)

But I actually think that we should also use Tcl's decimal to binary
conversion - otherwise, a .pyc file created by python compiled with
Microsoft will cause a different behaviour from a .pyc file created by
python compiled with Gnu, which is quite strange.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Noam Raphael

Noam Raphael added the comment:

If I think about it some more, why not get rid of all the float
platform-dependencies and define how +inf, -inf and nan behave?

I think that it means:
* inf and -inf are legitimate floats just like any other float.
Perhaps there should be a builtin Inf, or at least math.inf.
* nan is an object of type float, which behaves like None, that is:
nan == nan is true, but nan < nan and nan > 3 will raise an
exception. Mathematical operations which used to return nan will raise
an exception (division by zero does this already, but inf + -inf
will do that too, instead of returning nan.) Again, there should be a
builtin NaN, or math.nan. The reason for having a special nan object
is compatibility with IEEE floats - I want to be able to pass around
IEEE floats easily even if they happen to be nan.

This is basically what Tcl did, if I understand correctly - see item 6
in http://www.tcl.tk/cgi-bin/tct/tip/132.html .

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Noam Raphael

Noam Raphael added the comment:

That's right, but the standard also defines that 0.0/0 -> nan, and
1.0/0 -> inf, but instead we raise an exception. It's just that in
Python, every object is expected to be equal to itself. Otherwise, how
can I check if a number is nan?
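For reference, here is the status quo the message argues about (an illustration, not part of the original post): under IEEE-754 semantics a NaN is not equal to itself, so the idiomatic check today is math.isnan (or x != x).

```python
import math

# A NaN compares unequal to everything, including itself.
nan = float('nan')
```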

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Noam Raphael

Noam Raphael added the comment:

If I understand correctly, there are two main concerns: speed and
portability. I think that neither is that terrible.

How about this:
* For IEEE-754 hardware, we implement decimal/binary conversions, and
define the exact behaviour of floats.
* For non-IEEE-754 hardware, we keep the current method of relying on
the system libraries.

About speed, perhaps it's not such a big problem, since decimal/binary
conversions are usually related to I/O, and this is relatively slow
anyway. I think that usually a program does relatively few
decimal/binary conversions.
About portability, I think (from some quick research I just made) that
the S/390 supports IEEE-754. This leaves VAX and Cray users, who will
have to live with a non-perfect floating-point behaviour.

If I am correct, it will let 99.9% of the users get a deterministic
floating-point behaviour, where eval(repr(f)) == f and
repr(1.1)=='1.1', with a speed penalty they won't notice.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Noam Raphael

Noam Raphael added the comment:

If I were in that situation I would prefer to store the binary
representation. But if someone really needs to store decimal floats,
we can add a method fast_repr which always calculates 17 decimal
digits.

Decimal to binary conversion, in any case, shouldn't be slower than it
is now, since on Gnu it is done anyway, and I don't think that our
implementation should be much slower.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__



[issue1580] Use shorter float repr when possible

2007-12-10 Thread Noam Raphael

Noam Raphael added the comment:

I don't know, for me it works fine, even after downloading a fresh SVN
copy. On what platform does it happen?

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__



[issue1580] Use shorter float repr when possible

2007-12-10 Thread Noam Raphael

Noam Raphael added the comment:

I also use linux on x86. I think that byte order would cause different
results (the repr of a random float shouldn't be 1.0.)
Does the test case run ok? Because if it does, it's really strange.

--
versions:  -Python 2.6

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__



[issue1580] Use shorter float repr when possible

2007-12-10 Thread Noam Raphael

Noam Raphael added the comment:

Oh, this is sad. Now I know why Tcl implemented a decimal to
binary routine as well.

Perhaps we can simply use both their routines? If I am not mistaken,
their only real dependency is on a library which allows arbitrarily long
integers, called tommath, from which they use a few basic functions.
We can use the functions from longobject.c instead. It will probably
be somewhat slower, since longobject.c wasn't created to allow
in-place operations, but I don't think it should be that bad -- we are
mostly talking about compile time.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__



Re: Why keep identity-based equality comparison?

2006-01-14 Thread Noam Raphael
Mike Meyer wrote:
 [EMAIL PROTECTED] writes:
 
try:
    return a == b
except TypeError:
    return a is b
 
 
 This isn't easy. It's an ugly hack you have to use every time you
 want to iterate through a heterogeneous set doing equality tests.

I wouldn't define this as an ugly hack. These are four simple lines, 
which state clearly and precisely what you mean, and always work. I have 
seen ugly hacks in my life, and they don't look like this.
 
 You're replacing false with an emphatic false, requiring *all*
 containers to change for the worse to deal with it.
 
I don't see how they change for the worse if they have exactly the same 
functionality and a few added lines of implementation.
 
Also, Mike said that you'll need an idlist object too - and I think
he's right and that there's nothing wrong with it.
 
 
 Except that we now need four versions of internal data structures,
 instead of two: list, tuple, idlist, idtuple; set, idset, frozenset,
 frozenidset, and so on. What's wrong with this is that it's ugly.

Again, "ugly" is a personal definition. I may call this "explicitness". 
By the way, what's the "and so on" - I think that these are the only 
built-in containers.
 
 
Note that while you
can easily define the current == behaviour using the proposed
behaviour, you can't define the proposed behaviour using the current
behaviour.
 
 
 Yes you can, and it's even easy. All you have to do is use custom
 classes that raise an exception if they don't

You can't create a general container with my proposed == behaviour. 
That's what I meant.
 
 
Also note that using the current behaviour, you can't easily
treat objects that do define a meaningful value comparison, by
identity.
 
 
 Yes you can. Just use the is operator.

Sorry, I wasn't clear enough. By "treating" I meant how containers treat 
the objects they contain. For example, you can't easily map a value to a 
specific instance of a list - dict only lets you map a value to a 
specific *value* of a list. Another example - you can't search for a 
specific list object in another list.
 
 Note that this behavior also has the *highly* peculiar behavior that a
 doesn't necessarily equal a by default.

Again, "peculiar" is your aesthetic sense. I would like to hear 
objections based on use cases that are objectively made more difficult. 
Anyway, I don't see why someone should even try checking if a==a, and 
if someone does, the exception can say "this type doesn't support value 
comparison; use the is operator".
 
 I will point out why your example usages aren't really useful if
 you'll repeat your post with newlines.
 
Here they are:

* Things like Decimal("3.0") == 3.0 will make more sense (raise an
exception which explains that decimals should not be compared to
floats, instead of returning False).
* You won't be able to use objects as keys, expecting them to be
compared by value, and causing a bug when they don't. I recently wrote
a sort-of OCR program, which contains a mapping from a numarray array
of bits to a character (the array is the pixel-image of the char).
Everything seemed to work, but the program didn't recognize any
characters. I discovered that the reason was that arrays are hashed
according to their identity, which is a thing I had to guess. If
default == operator were not defined, I would simply get a TypeError
immediately.
* It is more forward compatible - when it is discovered that two types
can sensibly be compared, the comparison can be defined, without
changing an existing behaviour which doesn't raise an exception.

The third example applies to the Decimal==float use case, and for every 
type that currently has the default identity-based comparison and that 
may benefit from a value-based comparison. Take the class

class Circle(object):
    def __init__(self, center, radius):
        self.center = center
        self.radius = radius

Currently, it's equal only to itself. You may decide to define an 
equality operator which checks whether both the center and the radius 
are the same, but since you already have a default equality operator, 
that change would break backwards-compatibility.
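For concreteness, here is a sketch of the value-based equality one might later want to add to Circle - exactly the kind of change described above as breaking code that relied on the default identity comparison:

```python
class Circle(object):
    def __init__(self, center, radius):
        self.center = center
        self.radius = radius

    def __eq__(self, other):
        # Value-based comparison: equal center and radius mean equal circles.
        if not isinstance(other, Circle):
            return NotImplemented
        return (self.center, self.radius) == (other.center, other.radius)

    def __hash__(self):
        # Keep hashing consistent with the new equality.
        return hash((self.center, self.radius))
```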

Noam
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Why keep identity-based equality comparison?

2006-01-14 Thread Noam Raphael
Mike Meyer wrote:
 Noam Raphael [EMAIL PROTECTED] writes:
 
Also note that using the current behaviour, you can't easily
treat objects that do define a meaningful value comparison, by
identity.

Yes you can. Just use the is operator.

Sorry, I wasn't clear enough. In treating I meant how containers
treat the objects they contain. For example, you can't easily map a
value to a specific instance of a list - dict only lets you map a
value to a specific *value* of a list.
 
 
 Wrong. All you have to do is create a list type that uses identity
 instead of value for equality testing. This is easier than mapping an
 exception to false.
 
You're suggesting a workaround, which requires me to subclass everything 
that I want to lookup by identity (and don't think it's simple - I will 
have to wrap a lot of methods that return a list to return a list with a 
modified == operator).

I'm suggesting the use of another container class: iddict instead of 
dict. That's all.
I don't think that mapping an exception to false is so hard (certainly 
simpler than subclassing a list in that way), and the average user won't 
have to do it, anyway - it's the list implementation that will do it.
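A hedged sketch of the proposed "iddict" - a mapping keyed by object identity instead of value. The name and interface are hypothetical; the thread only names the idea:

```python
class IdDict:
    """A dict-like container that looks objects up by identity."""

    def __init__(self):
        self._data = {}

    def __setitem__(self, key, value):
        # Store the key itself too, keeping it alive: id() values can be
        # reused after an object is garbage-collected.
        self._data[id(key)] = (key, value)

    def __getitem__(self, key):
        return self._data[id(key)][1]

    def __contains__(self, key):
        return id(key) in self._data
```

Two equal-valued but distinct lists are then distinct keys, which is the behaviour the message wants from containers.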
 
Another example - you can't
search for a specific list object in another list.
 
 
 Your proposed == behavior doesn't change that at all.

It does - *use idlist*.
 
 
 I will point out why your example usages aren't really useful if
 you'll repeat your post with newlines.

Here they are:
* Things like Decimal("3.0") == 3.0 will make more sense (raise an
exception which explains that decimals should not be compared to
floats, instead of returning False).
 
 
 While I agree that Decimal("3.0") == 3.0 returning false doesn't make
 sense, having it raise an exception doesn't make any more sense. This
 should be fixed, but changing == doesn't fix it.
 
No, it can't be fixed your way. It was decided on purpose that Decimal 
shouldn't be comparable to float, to prevent precision errors. I'm 
saying that raising an exception will make it clearer.
 
* You won't be able to use objects as keys, expecting them to be
compared by value, and causing a bug when they don't. I recently wrote
a sort-of OCR program, which contains a mapping from a numarray array
of bits to a character (the array is the pixel-image of the char).
Everything seemed to work, but the program didn't recognize any
characters. I discovered that the reason was that arrays are hashed
according to their identity, which is a thing I had to guess. If
default == operator were not defined, I would simply get a TypeError
immediately.
 
 
 This isn't a use case. You don't get correct code with either version
 of '=='. While there is some merit to doing things that make errors
 easier to find, Python in general rejects the idea of adding
 boilerplate to do so. Your proposal would generate lots of boilerplate
 for many practical situations.
 
I would say that there's a lot of merit to doing things that make errors 
easier to find. That's what exceptions are for.

Please say what those practical situations are - that's what I want.
(I understand: you think that added containers and a try...except from 
time to time aren't worth it. I think they are. Do you have any other 
practical situations?)
 
* It is more forward compatible - when it is discovered that two types
can sensibly be compared, the comparison can be defined, without
changing an existing behaviour which doesn't raise an exception.
 
 
 Sorry, but that doesn't fly. If you have code that relies on the
 exception being raised when two types are compared, changing it to
 suddenly return a boolean will break that code.
 
You are right, but that's the case for every added language feature (if 
you add a method, you break code that relies on an AttributeError...)
You are right that I'm suggesting a try...except when testing if a list 
contains an object, but a case when you have a list with floats and 
Decimals, and you rely on Decimal("3.0") in list1 to find only 
Decimals, seems to me a little bit far-fetched. If you have another 
example, please say it.

Noam


Using distutils 2.4 for python 2.3

2005-09-23 Thread Noam Raphael
Hello,

I want to distribute a package. It's compatible with Python 2.3.
Is there a way to use the distutils 2.4 feature package_data, while 
keeping the distribution compatible with Python 2.3?

Thanks,
Noam Raphael


Re: Using distutils 2.4 for python 2.3

2005-09-23 Thread Noam Raphael
Fredrik Lundh wrote:
 
 you can enable new metadata fields in older versions by assigning to
 the DistributionMetadata structure:
 
     try:
         from distutils.dist import DistributionMetadata
         DistributionMetadata.package_data = None
     except:
         pass
 
     setup(
         ...
         package_data=...
     )
 
 /F 

I tried this, but it made python2.4 behave like python2.3, and not 
install the package_data files.

Did I do something wrong?


Re: How about pure virtual methods?

2004-12-31 Thread Noam Raphael
Thanks for your suggestion, but it has several problems which the added 
class solves:

* This is very long code just to write "you must implement this 
method". Having a standard way to say that is better.
* You can instantiate the base class, which doesn't make sense.
* You must use testing to check whether a concrete class which you 
derived from the base class really implemented all the abstract methods. 
Testing is a good thing, but it seems to me that when the code specifies 
exactly what should happen, and it doesn't make sense for it not to 
happen, there's no point in having a separate test for it.

About the possibility of implementing only a subset of the interface: 
You are perfectly welcome to implement any part of the interface as you 
like. Functions which use only what you've implemented should work fine 
with your classes. But you can't claim your classes to be instances of 
the base class - as I see it, subclasses, even in Python, guarantee to 
behave like their base classes.
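As it happens, what this thread asks for was later added to Python as the abc module; a sketch of the requested behaviour (assuming Python 2.6+/3.x, which postdates this 2004 discussion): the abstract base class cannot be instantiated, and subclasses must implement every abstract method before they can be.

```python
from abc import ABC, abstractmethod

class Base(ABC):
    @abstractmethod
    def method(self):
        """Subclasses must implement this."""

class Concrete(Base):
    def method(self):
        return 42
```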

Have a good day,
Noam


Re: How about pure virtual methods?

2004-12-25 Thread Noam Raphael
Mike Meyer wrote:
That's what DbC languages are for. You write the contracts first, then
the code to fullfill them. And get exceptions when the implementation
doesn't do what the contract claims it does.
mike
Can you give me a name of one of them? This is a very interesting thing 
- I should learn one of those sometime. However, I'm pretty sure that 
programming in them is hell, or at least, takes a very long time.

Noam
--
http://mail.python.org/mailman/listinfo/python-list


Re: How about pure virtual methods?

2004-12-25 Thread Noam Raphael
Thank you very much for this answer! I learned from you about unit 
tests, and you convinced me that testing oriented programming is a 
great way to program.

You made me understand that indeed, proper unit testing solves my 
practical problem - how to make sure that all the methods which should 
be implemented were implemented. However, I'm still convinced that this 
feature should be added to Python, for what may be called aesthetic 
reasons - I came to think that it fills a gap in Python's logic, and 
is not really an additional, optional feature. And, of course, there are 
practical advantages to adding it.

The reason why this feature is missing, is that Python supports building 
a class hierarchy. And, even in this dynamically-typed language, the 
fact that B is a subclass of A means that B is supposed to implement the 
interface of A. If you want to arrange in a class hierarchy a set of 
classes, which all implement the same interface but don't have a common 
concrete class, you reach the concept of an abstract class, which 
can't be instantiated. And the basestring class is exactly that.

The current Python doesn't really support this concept. You can write in 
the __new__ of such a class something like if cls == MyAbstractClass: 
raise TypeError, but I consider this as a patch - for example, if you 
have a subclass of this class which is abstract too, you'll have to 
write this exception code again. Before introducing another problem, let 
me quote Alex:

... If you WANT the method in the ABC, for documentation
purposes, well then, that's not duplication of code, it's documentation,
which IS fine (just like it's quite OK to have some of the same info in
a Tutorial document, in a Reference one, AND in a class's docstring!).
If you don't want to have the duplication your unit tests become easier:
you just getattr from the class (don't even have to bother instantiating
it, ain't it great!), and check the result with inspect.
That's actually right - listing a method which should be implemented by 
subclasses, in the class definition is mainly a matter of 
*documentation*. I like the idea that good documentation can be derived 
from my documented code automatically, and even if I provide an external 
documentation, the idea that the code should explain itself is a good 
one. The problem is that the current convention is not good 
documentation:

  def frambozzle(self):
 ''' must make the instance frambozzled '''
 raise NotImplementedError
The basic problem is that, if you take this basic structure, it already 
means another thing: This is a method, which takes no arguments and 
raises a NotImplementedError. This may mean, by convention, that this 
method must be implemented by subclasses, but it may also mean that this 
method *may* be implemented by subclasses. I claim that a declaration 
that a method must be implemented by subclass is simply not a method, 
and since Python's logic does lead to this kind of thing, it should 
supply this object (I think it should be the class abstract). Two of 
Python's principles are "explicit is better than implicit", and "there 
should be (only?) one obvious way to do it". Well, I think that this:

@abstract
def frambozzle(self):
    """Must make the instance frambozzled."""
    pass
is better than the previous example, and from
def frambozzle(self):
    raise NotImplementedError, ("You must implement this method, and "
                                "it must make the instance frambozzled")

and from
def frambozzle(self):
    """Must make the instance frambozzled.

    PURE VIRTUAL
    """
    pass
and from maybe other possible conventions. Note also that making this 
explicit will help you write your tests, even if Python would allow 
instantiation of classes which contain abstract methods - you will be 
able to simply test assert not isinstance(MyClass.frambozzle, abstract).
(I don't like the solution of not mentioning the method at all, which 
makes the test equally simple, because it doesn't document what the 
method should do in the class definition, and I do like in-code 
documentation.)

To summarize, I think that this feature should be added to Python 
because currently, there's no proper way to write some code which fits 
the Python way. As a bonus, it will help you find errors even when 
your unit tests are not sufficient.

I plan to raise this issue in python-dev. If you have any additional 
comments, please post them here. (I will probably be able to reply only 
by the weekend.)

Have a good day,
Noam
--
http://mail.python.org/mailman/listinfo/python-list


Re: Non blocking read from stdin on windows.

2004-12-25 Thread Noam Raphael
You can always have a thread which continually reads stdin and stores it 
in a string, or better, in a cStringIO.StringIO object. Then in the main 
thread, you can check whether something new has arrived. This, of course 
will work on all platforms.
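A minimal sketch of that approach (the function name is mine), using a 
queue rather than a plain string so the main thread never blocks:

```python
import queue
import sys
import threading

# Sketch of the thread-based approach: a background daemon thread
# blocks on stdin and pushes complete lines into a queue, which the
# main thread can poll without ever blocking.
def start_stdin_reader():
    q = queue.Queue()
    def reader():
        for line in sys.stdin:   # blocks in this thread only
            q.put(line)
    threading.Thread(target=reader, daemon=True).start()
    return q

# Main-loop usage:
#     try:
#         line = q.get_nowait()
#     except queue.Empty:
#         pass  # nothing new has arrived yet; carry on
```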

I hope this helped a bit,
Noam
--
http://mail.python.org/mailman/listinfo/python-list


Re: How about pure virtual methods?

2004-12-23 Thread Noam Raphael
Jp Calderone wrote:
  This lets you avoid duplicate test code as well as easily test
new concrete implementations.  It's an ideal approach for frameworks
which mandate application-level implementations of a particular 
interface and want to ease the application developer's task.

  Jp
It's a great way for sharing tests between different subclasses of a 
class. Thank you for teaching me.
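For the record, the pattern looks roughly like this (the stack classes 
are invented for illustration): the interface tests live in a mixin, and 
each concrete implementation gets a tiny TestCase that only supplies a 
factory.

```python
import unittest

# The shared-tests pattern, with invented example classes: the interface
# tests live in a mixin (deliberately NOT a TestCase, so it is never
# collected and run on its own), and each concrete implementation gets a
# small TestCase that only supplies a factory method.

class StackInterfaceTests:
    """Tests that every stack implementation must pass."""

    def make_stack(self):
        raise NotImplementedError  # concrete TestCases override this

    def test_push_pop(self):
        s = self.make_stack()
        s.push(1)
        s.push(2)
        self.assertEqual(s.pop(), 2)
        self.assertEqual(s.pop(), 1)

class ListStack:
    """One concrete implementation of the stack interface."""
    def __init__(self):
        self._items = []
    def push(self, item):
        self._items.append(item)
    def pop(self):
        return self._items.pop()

class ListStackTests(StackInterfaceTests, unittest.TestCase):
    def make_stack(self):
        return ListStack()
```

Adding a second implementation then costs only another three-line 
TestCase, and it automatically inherits the whole interface suite.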

However, I'm not sure if this solves my practical problem - testing 
whether all abstract methods were implemented. I think that usually, you 
can't write a test which checks whether an abstract method did what it 
should have, since different implementations do different things. I 
don't even know how you can test whether an abstract method was 
implemented - should you run it and see if it raises a 
NotImplementedError? But with what arguments? And even if you find a way 
to test whether a method was implemented, I still think that the 
duplication of code isn't very nice - you have both in your class 
definition and in your test suite a section which says only method 
so-and-so should be implemented.

I think that making abstract methods a different object really makes 
sense - they are just something else. Functions (and methods) define 
what the computer should do. Abstract methods define what the 
*programmer* should do.

Again, thanks for enlightening me.
Noam
--
http://mail.python.org/mailman/listinfo/python-list


Re: How about pure virtual methods?

2004-12-21 Thread Noam Raphael
Thank you all, especially Alex for your enlightening discussion, and 
Scott for your implementation. I'm sorry that I can't be involved in a 
daily manner - but I did read all of the posts in this thread. They 
helped me understand the situation better, and convinced me that indeed 
this feature is needed. Let's see if I can convince you too.

First, the actual situation in which I stood, which made me think, I 
would like to declare a method as not implemented, so that subclasses 
would have to implement it.

I wrote a system in which objects had to interact between themselves. In 
my design, all those objects had to implement a few methods for the 
interaction to work. So I wrote a base class for all those objects, with 
a few methods which the subclasses had to implement. I think it's good, 
for *me*, to have an explicit list of what should be implemented, so 
that when (in a function) I expect to get an object of this kind I know 
what I may and may not do with it.

Then, I wrote the classes themselves. And I wrote unit tests for them. 
(Ok, I lie. I didn't. But I should have!) Afterwards, I decided that I 
needed all my objects of that kind to supply another method. So I added 
another raise NotImplementedError method to the base class. But what 
about the unit tests? They would have still reported a success - where 
of course they shouldn't have; my classes, in this stage, didn't do what 
they were expected to do. This problem might arise even when not 
changing the interface at all - it's quite easy to write a class which, 
by mistake, doesn't implement the whole interface. Its successful unit 
tests may check every single line of code of that class, but a complete 
method was simply forgotten, and you wouldn't notice it until you try 
the class in the larger framework (and, as I understand, the point of 
unit testing is to test the class on its own, before integrating it).

Ok. This was the practical reason why this is needed. Please note that I 
didn't use isinstance even once - all my functions used the 
*interface* of the objects they got. I needed the extra checking for 
myself - if someone wanted to implement a class that wouldn't inherit 
from my base class, but would nevertheless implement the required 
interface, he was free to do it, and it would have worked fine with the 
framework I wrote.

Now for the theoretical reason why this is needed. My reasoning is 
based on the existence of isinstance in Python. Well, what is the 
purpose of isinstance? I claim that it doesn't test if an object *is* of 
a given type. If that would have been its purpose, it would have checked 
whether type(obj) == something. Rather, it checks whether an object's 
type is a subclass of a given type. Why should we want such a function? A subclass 
may do a completely different thing from what the original class did! 
The answer is that a subclass is guaranteed to have the same *interface* 
as the base class. And that's what matters.

So I conclude that a subclass, in Python, must implement the interface 
of its parent class. Usually, this is obvious - there's no way for a 
subclass not to implement the interface of its parent class, simply 
because it can only override methods, but can't remove methods. But what 
shall we do if the some methods in the base class consist *only* of an 
interface? Can we implement only a part of the interface, and claim that 
instances of that class are instances of the original class, in the 
isinstance fashion? My answer is no. The whole point of isinstance 
is to check whether an instance implements an interface. If it doesn't - 
what is the meaning of the True that isinstance returns? So we should 
simply not allow instances of such classes.

You might say that abstract classes at the base of the hierarchy are 
not Pythonic. But they are in Python already - the class basestring is 
exactly that. It is an uninstantiable class, which is there only so that 
you would be able to do isinstance(x, basestring). Classes with 
notimplemented methods would behave in exactly the same way - you 
wouldn't be able to instantiate them, just to subclass them (and to 
check, using isinstance, whether they implement the required protocol, 
which I agree that wouldn't be Pythonic, probably).

Ok. This is why I think this feature fits Python like a glove to a hand. 
Please post your comments on this! I apologize now - I may not be able 
to reply in the next few days. But I will read them at the end, and I 
will try to answer.

Have a good day,
Noam
--
http://mail.python.org/mailman/listinfo/python-list


Re: How about pure virtual methods?

2004-12-21 Thread Noam Raphael
My long post gives all the philosophy, but I'll give here the short answers.
Mike Meyer wrote:
+0
Python doesn't use classes for typing. As Alex Martelli puts it,
Python uses protocols. So the client expecting a concrete subclass of
your abstract class may get an instantiation of a class that doesn't
inherit from the abstract class at all.
That's right - this mechanism is useful mostly for he who implements 
that class, to make sure that he implemented all that is needed to be 
assigned the title a subclass of that class.

Or maybe the subclass is only going to use a subset of the features of
the abstract class, and the author knows that some deferred methods
won't be invoked. The correct behavior in this case would be to allow
the subclass to be instantiated, and then get a runtime error if one
of the features the author thought he could skip was actually called.
I disagree - my reasoning is that a subclass must implement the complete 
interface of its base class (see my long post). The author may implement 
a class which defines only a part of the interface, and give it to the 
function, and it may work and be great. But it must not be called an 
instance of the abstract class.

Finally, in a sufficiently complex class hierarchy, this still leaves
you wandering through the hierarchy trying to find the appropriate
parent class that tagged this method as unimplemented, and then
figuring out which class should have implemented it - as possibly a
parent of the class whose instantiation failed is the subclass that
should have made this method concrete.
You are right - but I needed this for a class hierarchy of only two 
levels (the base abstract class and the concrete subclasses), so there 
were not many classes to blame for a missing method.
   mike
I hope this seems reasonable,
Noam
--
http://mail.python.org/mailman/listinfo/python-list


Re: How about pure virtual methods?

2004-12-21 Thread Noam Raphael
Scott David Daniels wrote:
class Abstract(object):
'''A class to stick anywhere in an inheritance chain'''
__metaclass__ = MustImplement
def notimplemented(method):
'''A decorator for those who prefer the parameters declared.'''
return NotImplemented
I just wanted to say that I thought of notimplemented as a class that 
would save a reference to the function it got in the constructor. That 
way, pydoc and its friends would be able to find the arguments the 
method was expected to get, and its documentation string.

But it's a great implementation.
Noam
Oh, and another thing - maybe abstract is a better name than 
notimplemented? notimplemented might suggest a method which doesn't 
have to be implemented - and raises NotImplementedError when it is 
called. What do you think?
--
http://mail.python.org/mailman/listinfo/python-list


How about pure virtual methods?

2004-12-18 Thread Noam Raphael
Hello,
I thought about a new Python feature. Please tell me what you think 
about it.

Say you want to write a base class with some unimplemented methods, that 
subclasses must implement (or maybe even just declare an interface, with 
no methods implemented). Right now, you don't really have a way to do 
it. You can leave the methods with a pass, or raise a 
NotImplementedError, but even in the best solution that I know of, 
there's no way to check whether a subclass has implemented all the required 
methods without running it and testing if it works. Another problem with 
the existing solutions is that raising NotImplementedError usually means 
"this method might be implemented some time", and not "you must 
implement this method when you subclass me".

What I suggest is a new class, called notimplemented (you may suggest a 
better name). It would get a function in its constructor, and would just 
save a reference to it. The trick is that when a new type (a subclass of 
the default type object) is created, it will go over all its members and 
check to see if any of them is a notimplemented instance. If that is the 
case, it would not allow an instantiation of itself.

What I want is that if I have this module:
==
class BaseClass(object):
    def __init__(self):
        ...
    @notimplemented
    def save_data(self, filename):
        """This method should save the internal state of the class to
        a file named filename."""
        pass

class RealClass(BaseClass):
    def save_data(self, filename):
        open(filename, 'w').write(self.data)
==
then if I try to instantiate BaseClass I would get an exception, but 
instantiating RealClass will be ok.

Well, what do you say?
Noam Raphael
--
http://mail.python.org/mailman/listinfo/python-list