Re: [Python-Dev] Python 3.x and bytes

2011-05-19 Thread Stefan Behnel

Greg Ewing, 19.05.2011 00:02:

Georg Brandl wrote:


We do have

bytes.fromhex('deadbeef')


But again, there is a run-time overhead to this.


Well, yes, but it's negligible if you assign it to a suitable variable first.
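
A minimal sketch of that point, assuming a hypothetical MAGIC constant and
has_magic() helper (neither name is from the thread): the bytes.fromhex()
call runs once, when the module is imported, and every later use only pays
for a name lookup.

```python
# Hypothetical names for illustration; only bytes.fromhex() is from the thread.
MAGIC = bytes.fromhex('deadbeef')  # parsed once, when the module is imported


def has_magic(data):
    """Return True if data starts with the four magic bytes."""
    return data.startswith(MAGIC)


print(MAGIC)                       # b'\xde\xad\xbe\xef'
print(has_magic(MAGIC + b'rest'))  # True
```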

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [RELEASED] Python 3.2.1 rc 1

2011-05-19 Thread Hagen Fürstenau
 3.2.1b1 was already merged back.  (And 3.2.1rc1 will also be merged back
 soon, since there will be a 3.2.1rc2.)

Thanks for the clarification! :-)

Cheers,
Hagen



Re: [Python-Dev] Python 3.x and bytes

2011-05-19 Thread Xavier Morel
On 2011-05-19, at 07:28 , Georg Brandl wrote:
 On 19.05.2011 00:39, Greg Ewing wrote:
 Ethan Furman wrote:
 
 some_var[3] == b'd'
 
 1) a check to see if the bytes instance is length 1
 2) a check to see if
   i) the other object is an int, and
   ii) 0 <= other_obj < 256
 3) if 1 and 2, make the comparison instead of returning NotImplemented?
 
 It might seem convenient, but I'd worry that it would lead to
 even more confusion in other ways. If someone sees that
 
some_var[3] == b'd'
 
 is true, and that
 
some_var[3] == 100
 
 is also true, they might expect to be able to do things
 like
 
n = b'd' + 1
 
 and get 101... or maybe b'e'...
 
 Maybe they should :)

But why wouldn't they expect `b'de' + 1` to work as well in this case? If a 
1-byte bytes is equivalent to an integer, why not an arbitrary one as well?
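
To make the proposal under discussion concrete, here is a rough sketch of
the suggested comparison, written as a hypothetical bytes subclass named
OneByteEq (an illustration only, not how CPython would implement it). It
also shows that equality with an int does not bring arithmetic with it.

```python
class OneByteEq(bytes):
    """Hypothetical bytes subclass implementing the proposed comparison."""

    def __eq__(self, other):
        # Length-1 bytes compared against an int in range(256).
        if len(self) == 1 and isinstance(other, int) and 0 <= other < 256:
            return self[0] == other
        return super().__eq__(other)

    def __ne__(self, other):
        # Keep != consistent with the overridden ==.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    # Defining __eq__ clears the inherited hash, so restore it explicitly.
    # Note that hash(OneByteEq(b'd')) still differs from hash(100).
    __hash__ = bytes.__hash__


d = OneByteEq(b'd')
print(d == 100)                 # True
print(d == b'd')                # True
print(OneByteEq(b'de') == 100)  # False
try:
    d + 1
except TypeError as exc:
    print('still no arithmetic:', exc)
```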


Re: [Python-Dev] Python 3.x and bytes

2011-05-19 Thread Nick Coghlan
On Thu, May 19, 2011 at 5:10 AM, Eric Smith e...@trueblade.com wrote:
 On 05/18/2011 12:16 PM, Stephen J. Turnbull wrote:
 Robert Collins writes:

   It's probably too late to change, but please don't try to argue that
   it's correct: the continued confusion of folk running into this is
   evidence that confusion *is happening*. Treat that as evidence and
   think about how to fix it going forward.

 Sorry, Rob, but you're just wrong here, and Nick is right.  It's
 possible to improve Python 3, but not to fix it in this respect.
 The Python 3 solution is correct, the Python 2 approach is not.
 There's no way to avoid discontinuity and confusion here.

 I don't think there's any connection between the way 2.x confused text
 strings and binary data (which certainly needed addressing) with the way
 that 3.x returns a different type for byte_str[i] than it does for
 byte_str[i:i+1]. I think it's the latter that's confusing to people.
 There's no particular requirement for different types that's needed to
 fix the byte/str problem.

It's a mental model problem. People try to think of bytes as
equivalent to 2.x str and that's just wrong, wrong, wrong. It's far
closer to array.array('c'). Strings are basically *unique* in
returning a length 1 instance of themselves for indexing operations.
For every other sequence type, including tuples, lists and arrays,
slicing returns a new instance of the same type, while indexing will
typically return something different.
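
A small sketch of that observation. The helper name index_and_slice_types
is made up for illustration, and since Python 3's array module has no 'c'
typecode, the sketch assumes 'B' (unsigned bytes) as the closest stand-in.

```python
from array import array


def index_and_slice_types(seq):
    """Return the type names of seq, seq[0] and seq[0:1]."""
    return type(seq).__name__, type(seq[0]).__name__, type(seq[0:1]).__name__


samples = [b'abc', bytearray(b'abc'), array('B', b'abc'),
           [97, 98, 99], (97, 98, 99), 'abc']
for seq in samples:
    # Only str reports the same type for the item and for the slice.
    print('%-10s item: %-4s slice: %s' % index_and_slice_types(seq))
```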

Now, we definitely didn't *help* matters by keeping so many of the
default behaviours of bytes() and bytearray() coupled to ASCII-encoded
text, but that was a matter of practicality beating purity: there
really *are* a lot of wire protocols out there that are ASCII based.
In hindsight, perhaps we should have gone further in breaking things
to try to make the point about the mental model shift more forcefully.
(However, that idea carries with it its own problems).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Python 3.x and bytes

2011-05-19 Thread Stephen J. Turnbull
Robert Collins writes:

  That's separate to the implementation issues I have mentioned in this
  thread and previous.

Oops, sorry.

Nevertheless, I personally think that b'a'[0] == 97 is a good idea,
and consistent with everything else in Python.  It's Unicode (str)
that is weird: str is surprising when first encountered by a C or
Lisp programmer, but not enough to cause a heart attack given
how weird natural language is.  But I don't see why that weirdness (an
element of LIST of TYPE is a LIST of TYPE, hey, young man, you're very
smart but *it's turtles all the way down!*) should be replicated
elsewhere.

If you want your bytes object to behave like a str, it's very easy to
get that (.decode('latin1')), and nobody has yet demonstrated that
this is too time-inefficient for real work, given the other overhead
imposed by Python.  The space inefficiency could be dealt with as Greg
points out (by internally having a Unicode representation using 1 byte
instead of 2 or 4).  But if you want your bytes object to *be* a
string, then you're confused.  It isn't (any more).  Even if it's just
a matter of flipping one bit in the type field, a
str-with-unibyte-representation is not equal to a bytes object with the
same bytes.
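
A short sketch of the .decode('latin1') route mentioned above (variable
names are illustrative): Latin-1 maps every byte value to the code point
with the same number, so the round trip is lossless, yet the resulting str
is a different kind of object from the bytes it came from.

```python
raw = bytes(range(256))        # every possible byte value
text = raw.decode('latin-1')   # str with one code point per byte

print(text.encode('latin-1') == raw)  # True: lossless round trip
print(type(raw[65]), type(text[65]))  # an int versus a length-1 str
print(raw[65:68], text[65:68])        # b'ABC' ABC
print(b'ABC' == 'ABC')                # False: bytes never equal str
```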

For example, you write:

  urlparse converting bytes to 'str' to operate on them is at best a
  kludge - you're forcing 5 times the storage (the original bytes + 4
  bytes-per-byte when it's decoded into unicode) to work on something
  which is defined as a BNF *that uses ascii*.

Indeed it (RFC 3986) does *use* ASCII.  But I think there is confusion
in your words.  This is what the RFC says about that use of ASCII:

   2.  Characters

   The URI syntax provides a method of encoding data, presumably for the
   sake of identifying a resource, as a sequence of characters.  [...]

   The ABNF notation defines its terminal values to be non-negative
   integers (codepoints) based on the US-ASCII coded character set
   [ASCII].  Because a URI is a sequence of characters, we must invert
   that relation in order to understand the URI syntax.  Therefore, the
   integer values used by the ABNF must be mapped back to their
   corresponding characters via US-ASCII in order to complete the syntax
   rules.

Ie, ASCII is *irrelevant* to (the modern definition of) URLs except as
it is a convenient and familiar way to refer to a certain familiar and
rather small set of *characters*.  There are reasons for this (that
I'm not going to rehash here), and they are the *same* reasons why
Python 3's behavior is correct IMHO (modulo the issue about the type
of a list element, which I discuss above).

It is true that one might like there to be a literal that expresses
`ord(bytes-object-of-length-one)', ie, something like o'a' == 97.
(This is different from Greg's x'6465616462656566' == b'deadbeef',
which I don't think helps solve the confusion problem although it
would definitely be convenient.)
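
Neither literal exists, but both values can be spelled with current
built-ins. A brief sketch (the variable names are illustrative):

```python
code_of_a = ord(b'a')                           # what o'a' would denote
hex_bytes = bytes.fromhex('6465616462656566')   # what x'...' would denote

print(code_of_a)               # 97
print(b'a'[0] == code_of_a)    # True
print(hex_bytes)               # b'deadbeef'
```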


Re: [Python-Dev] Python 3.x and bytes

2011-05-19 Thread Xavier Morel
On 2011-05-19, at 09:49 , Nick Coghlan wrote:
 On Thu, May 19, 2011 at 5:10 AM, Eric Smith e...@trueblade.com wrote:
 On 05/18/2011 12:16 PM, Stephen J. Turnbull wrote:
 Robert Collins writes:
 
   It's probably too late to change, but please don't try to argue that
   it's correct: the continued confusion of folk running into this is
   evidence that confusion *is happening*. Treat that as evidence and
   think about how to fix it going forward.
 
 Sorry, Rob, but you're just wrong here, and Nick is right.  It's
 possible to improve Python 3, but not to fix it in this respect.
 The Python 3 solution is correct, the Python 2 approach is not.
 There's no way to avoid discontinuity and confusion here.
 
 I don't think there's any connection between the way 2.x confused text
 strings and binary data (which certainly needed addressing) with the way
 that 3.x returns a different type for byte_str[i] than it does for
 byte_str[i:i+1]. I think it's the latter that's confusing to people.
 There's no particular requirement for different types that's needed to
 fix the byte/str problem.
 
 It's a mental model problem. People try to think of bytes as
 equivalent to 2.x str and that's just wrong, wrong, wrong. It's far
 closer to array.array('c'). Strings are basically *unique* in
 returning a length 1 instance of themselves for indexing operations.
 For every other sequence type, including tuples, lists and arrays,
 slicing returns a new instance of the same type, while indexing will
 typically return something different.
 
 Now, we definitely didn't *help* matters by keeping so many of the
 default behaviours of bytes() and bytearray() coupled to ASCII-encoded
 text, but that was a matter of practicality beating purity: there
 really *are* a lot of wire protocols out there that are ASCII based.
 In hindsight, perhaps we should have gone further in breaking things
 to try to make the point about the mental model shift more forcefully.
 (However, that idea carries with it its own problems).

For what it's worth, Erlang's approach to the subject is — in my
opinion — excellent:
binaries (whose literals are called bit syntax there) are quite
distinct from strings in both syntax and API, but you can put
chunks of strings within binaries (the bit syntax acts as a container,
in which you can put a literal or non-literal string). This
simultaneously impresses upon the user that binaries are *not* strings
and that they can still easily create binaries from strings.



Re: [Python-Dev] Python 3.x and bytes

2011-05-19 Thread Stefan Behnel

Xavier Morel, 19.05.2011 09:41:

On 2011-05-19, at 07:28 , Georg Brandl wrote:

On 19.05.2011 00:39, Greg Ewing wrote:

If someone sees that

some_var[3] == b'd'

is true, and that

some_var[3] == 100

is also true, they might expect to be able to do things
like

n = b'd' + 1

and get 101... or maybe b'e'...


Maybe they should :)


But why wouldn't they expect `b'de' + 1` to work as well in this case? If a 
1-byte bytes is equivalent to an integer, why not an arbitrary one as well?


The result of this must obviously be b"de1".

Stefan



Re: [Python-Dev] Python 3.x and bytes

2011-05-19 Thread Nick Coghlan
OK, summarising the thread so far from my point of view.

1. There are some aspects of the behavior of bytes() objects that
tempt people to think of them as string-like objects (primarily the
b'' literals and their use in repr(), along with the fact that they
fill roles that were filled by str in its arbitrary binary data
incarnation in Python 2.x). The mental model this creates in the
reader is incorrect, as bytes() are far closer to array.array('c') in
their underlying behaviour (and deliberately so - cf. PEP 358, 3112,
3137).

One proposal for addressing this is to add an x'deadbeef' literal and
use that in repr() rather than the bytestring. Another would be to
escape all characters, even printable ASCII, in the bytes()
representation. Both of these are undesirable, as they miss the
original purpose of this behaviour: making it easier to work with the
many ASCII based wire protocols that are in widespread use.

To be honest, I don't think there is a lot we can do here except to
further emphasise in the documentation and elsewhere that *bytes is
not a string type* (regardless of any API similarities retained to
ease transition from the 2.x series). For example, if we have any
lingering references to "byte strings" they should be replaced with
"byte sequences" or "bytes objects" (depending on context, as the
former phrasing also encompasses bytearray objects).

2. As a concrete usability issue, it is awkward to programmatically
check the value of a specific byte when working with an ASCII based
protocol:

  data[i] == b'a'      # Intuitive, but always False due to type mismatch
  data[i:i+1] == b'a'  # Works, but clumsy
  data[i] == b'a'[0]   # Ditto (but at least susceptible to compiler
                       # const-expression optimisation)
  data[i] == ord('a')  # Clumsy and slow
  data[i] == 97        # Hard to read
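
For reference, a runnable version of the five spellings, assuming a sample
value data = b'abc' and i = 0 (both chosen here for illustration):

```python
data = b'abc'
i = 0

checks = {
    "data[i] == b'a'": data[i] == b'a',          # int compared with bytes
    "data[i:i+1] == b'a'": data[i:i+1] == b'a',  # bytes compared with bytes
    "data[i] == b'a'[0]": data[i] == b'a'[0],    # int compared with int
    "data[i] == ord('a')": data[i] == ord('a'),  # int compared with int
    "data[i] == 97": data[i] == 97,              # int compared with int
}
for expression, result in checks.items():
    print('%-22s %s' % (expression, result))
```

Only the first spelling prints False; the other four print True.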

Proposals to address this include:
- introduce a character literal to allow c'a' as an alternative to ord('a')
  - Potentially workable, but leaves the intuitive answer above
    silently producing an unexpected answer
- allow 1-element byte sequences to compare equal to the corresponding
  integer values.
  - would require reworking of bytes.__hash__ to use the hash of the
    contained element when the data length is exactly 1
  - transitivity of equality would recommend also supporting
    equivalences such as b'a' == 97.0
  - backwards compatibility concerns arise due to introduction of
    new key collisions in dictionaries and sets and other value based
    containers
  - yet more string-like behaviour in a type that is *not* a string
    (further reinforcing the mistaken impression from point 1)
  - One thing that *isn't* a concern from my point of view is the
    fact that we have ample precedent in decimal.Decimal for supporting
    implicit coercion in comparison operations while disallowing them in
    arithmetic operations (Decimal(1) == 1.0 is allowed, but
    Decimal(1) + 1.0 will raise TypeError).
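
A brief sketch of the two facts the list relies on: the Decimal precedent,
and the rule that objects comparing equal must hash equal, which is why
b'a' == 97 would drag bytes.__hash__ and dictionary keys into the change.
The dictionary below is an illustrative example of today's behaviour.

```python
from decimal import Decimal

# Comparison with a float is allowed, arithmetic with one is not.
print(Decimal(1) == 1.0)  # True
try:
    Decimal(1) + 1.0
except TypeError as exc:
    print('TypeError:', exc)

# Equal numbers already share a hash, so they collide as keys.
print(97 == 97.0, hash(97) == hash(97.0))  # True True

# Today b'a' and 97 are unequal, so they stay distinct keys.
table = {b'a': 'bytes key', 97: 'int key'}
print(len(table))  # 2
```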

For point 2, I'm personally +0 on the idea of having 1-element bytes
and bytearray objects delegate hashing and comparison operations to
the corresponding integer object. We have the power to make the
obvious code correct code, so let's do that. However, the implications
of the additional key collisions in value based containers may need to
be explored further.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] [Python-checkins] cpython: Skip some tests in the absence of multiprocessing.

2011-05-19 Thread Nick Coghlan
On Thu, May 19, 2011 at 2:51 AM, Éric Araujo mer...@netwok.org wrote:
 Isn’t support.import_module or somesuch useful for this kind of checks?

You have to restructure your tests into the appropriate files for that
to work, as support.import_module() throws SkipTest if the module
isn't available.
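
A rough sketch of why the restructuring is needed. The helper
import_or_skip below is a hypothetical stand-in written with importlib and
unittest.SkipTest; it is not the real test.support code.

```python
import importlib
import unittest


def import_or_skip(name):
    """Hypothetical stand-in: import a module or raise unittest.SkipTest."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise unittest.SkipTest(str(exc))


# Called at module level, a missing dependency aborts loading of the whole
# file, so tests that need the module have to live in a file of their own.
json = import_or_skip('json')


class RoundTripTest(unittest.TestCase):
    def test_round_trip(self):
        self.assertEqual(json.loads(json.dumps([1, 2])), [1, 2])
```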

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Don't set local variable in a list comprehension or generator

2011-05-19 Thread Nick Coghlan
On Thu, May 19, 2011 at 7:34 AM, Victor Stinner
victor.stin...@haypocalc.com wrote:
 But it is slower whereas I read somewhere than generators are faster
 than loops.

Are you sure it wasn't that generator expressions can be faster than
list comprehensions (if the memory savings are significant)?

Or that a reduction function with a generator expression can be faster
than a module-level explicit loop (due to the replacement of
dict-based variable assignment with fast locals in the generator and C
looping in the reduction function)?

In general, as long as both are using fast locals and looping in
Python, I would expect inline looping code to be faster than the
equivalent generator (but often harder to maintain due to lack of
reusability).
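
The two shapes being compared can be sketched as follows (function names
are illustrative). Both run inside functions, so both use fast locals; the
timings are printed rather than asserted because they depend on the
interpreter and the machine.

```python
import timeit


def total_with_genexp(n):
    # Loop body runs in the generator's frame; sum() does the iteration in C.
    return sum(x * x for x in range(n))


def total_with_loop(n):
    # Inline loop: 'total' and 'x' are fast locals of this function.
    total = 0
    for x in range(n):
        total += x * x
    return total


print(total_with_genexp(1000) == total_with_loop(1000))  # True
for func in (total_with_genexp, total_with_loop):
    seconds = timeit.timeit(lambda: func(1000), number=200)
    print('%-18s %.4f s' % (func.__name__, seconds))
```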

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Python 3.x and bytes

2011-05-19 Thread Łukasz Langa
Message written by Stefan Behnel on 2011-05-19, at 10:37:

 But why wouldn't they expect `b'de' + 1` to work as well in this case? If
 a 1-byte bytes is equivalent to an integer, why not an arbitrary one as well?
 
 The result of this must obviously be b"de1".

I hope you're joking. At best, the result should be b"de\x01". But I don't
think such construct should be allowed. Just like you can't do `[1, 2, 3] + 4`. 
I wouldn't ever expect that a single byte behaves like a sequence of bytes. In 
the case of bytes b'a' is obviously still a sequence of bytes, just happening 
to store a single one. Indexing should return a byte so I'm not surprised it 
returns a number. Slicing on the other hand returns a sub-sequence.

However inconvenient, I find the current behaviour logical and predictable. A 
shortcut for b'a'[0] would obviously be nice but that's for python-ideas.
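
The current behaviour described above can be checked with a small sketch
(try_add is a helper invented for this illustration):

```python
def try_add(left, right):
    """Return left + right, or the TypeError message if that is refused."""
    try:
        return left + right
    except TypeError as exc:
        return 'TypeError: %s' % exc


print(try_add([1, 2, 3], 4))    # refused: a list plus an int
print(try_add(b'de', 1))        # refused: bytes plus an int
print(try_add(b'de', b'\x01'))  # b'de\x01': sequence plus sequence
print(b'de'[0], b'de'[0:1])     # 100 b'd': indexing versus slicing
```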

-- 
Best regards,
Łukasz Langa
Senior Systems Architecture Engineer

IT Infrastructure Department
Grupa Allegro Sp. z o.o.


Re: [Python-Dev] Python 3.x and bytes

2011-05-19 Thread Stefan Behnel

Łukasz Langa, 19.05.2011 11:25:

Message written by Stefan Behnel on 2011-05-19, at 10:37:


But why wouldn't they expect `b'de' + 1` to work as well in this case? If a
1-byte bytes is equivalent to an integer, why not an arbitrary one as well?


The result of this must obviously be b"de1".


I hope you're joking.


I obviously was. My point is that expectations and obvious behaviour 
may not be obvious to everyone.


Nick summed it up very nicely IMHO.

Stefan



Re: [Python-Dev] Python 3.x and bytes

2011-05-19 Thread Xavier Morel
On 2011-05-19, at 11:25 , Łukasz Langa wrote:
 Message written by Stefan Behnel on 2011-05-19, at 10:37:
 
 But why wouldn't they expect `b'de' + 1` to work as well in this case? If
 a 1-byte bytes is equivalent to an integer, why not an arbitrary one as
 well?
 
 The result of this must obviously be b"de1".
 I hope you're joking. At best, the result should be b"de\x01".

Actually, if `b'd'+1` returns `b'e'` an equivalent behavior should be that 
`b'de'+1` returns `b'df'`.
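
One way to read that suggestion is to treat the whole byte sequence as a
single big-endian integer. A sketch under that assumption, with a
hypothetical helper named increment_bytes:

```python
def increment_bytes(value):
    """Hypothetical helper: read value as a big-endian integer and add 1."""
    number = int.from_bytes(value, 'big') + 1
    # to_bytes raises OverflowError if the result needs more bytes than
    # the input had, for example when every input byte is 0xff.
    return number.to_bytes(len(value), 'big')


print(increment_bytes(b'd'))   # b'e'
print(increment_bytes(b'de'))  # b'df'
```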



Re: [Python-Dev] Don't set local variable in a list comprehension or generator

2011-05-19 Thread Victor Stinner
On Thursday 19 May 2011 at 10:47 +1200, Greg Ewing wrote:
 Victor Stinner wrote:
 
 squares = (x*x for x in range(1))
 
 What bytecode would you optimise that into?

I suppose that you have the current value of range(1) on the stack:
DUP_TOP; BINARY_MULTIPLY; gives you the square. You don't need the x
variable (LOAD_FAST/STORE_FAST).

Full example using a function (instead of loop, so I need to load x):
---
import dis, opcode, struct

def f(x): return x*x

def patch_bytecode(f, bytecode):
    # Rebuild f's code object around the new bytecode, keeping all other
    # fields (positional code-type arguments as in CPython 3.2).
    fcode = f.__code__
    code_type = type(f.__code__)
    new_code = code_type(
        fcode.co_argcount,
        fcode.co_kwonlyargcount,
        fcode.co_nlocals,
        fcode.co_stacksize,
        fcode.co_flags,
        bytecode,
        fcode.co_consts,
        fcode.co_names,
        fcode.co_varnames,
        fcode.co_filename,
        fcode.co_name,
        fcode.co_firstlineno,
        fcode.co_lnotab,
    )
    f.__code__ = new_code

print("Original:")
print("f(4) = %s" % f(4))
dis.dis(f)
print()

LOAD_FAST = opcode.opmap['LOAD_FAST']
DUP_TOP = opcode.opmap['DUP_TOP']
BINARY_MULTIPLY = opcode.opmap['BINARY_MULTIPLY']
RETURN_VALUE = opcode.opmap['RETURN_VALUE']

# One byte per opcode; LOAD_FAST takes a 16-bit argument (the index of x).
bytecode = struct.pack(
    '=BHBBB',
    LOAD_FAST, 0,
    DUP_TOP,
    BINARY_MULTIPLY,
    RETURN_VALUE)

print("Patched:")
patch_bytecode(f, bytecode)
print("f(4) patched = %s" % f(4))
dis.dis(f)
---

Output:
---
$ python3 square.py 
Original:
f(4) = 16
  3           0 LOAD_FAST                0 (x)
              3 LOAD_FAST                0 (x)
              6 BINARY_MULTIPLY
              7 RETURN_VALUE

Patched:
f(4) patched = 16
  3           0 LOAD_FAST                0 (x)
              3 DUP_TOP
              4 BINARY_MULTIPLY
              5 RETURN_VALUE
---
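
The listing above reflects the CPython bytecode of the time. Opcode names
and instruction layout are implementation details that change between
releases, so here is a version-neutral sketch for inspecting what the
running interpreter generates for the same function (the helper name
opnames is made up for illustration, and no expected output is given):

```python
import dis


def f(x):
    return x * x


def opnames(func):
    """Return the opcode names the running interpreter compiled func into."""
    return [instruction.opname for instruction in dis.get_instructions(func)]


print(f(4))        # 16
print(opnames(f))  # opcode names vary with the interpreter version
```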

Victor



Re: [Python-Dev] Python 3.x and bytes

2011-05-19 Thread Antoine Pitrou
On Thu, 19 May 2011 17:49:47 +1000
Nick Coghlan ncogh...@gmail.com wrote:
 
 It's a mental model problem. People try to think of bytes as
 equivalent to 2.x str and that's just wrong, wrong, wrong. It's far
 closer to array.array('c'). Strings are basically *unique* in
 returning a length 1 instance of themselves for indexing operations.
 For every other sequence type, including tuples, lists and arrays,
 slicing returns a new instance of the same type, while indexing will
 typically return something different.
 
 Now, we definitely didn't *help* matters by keeping so many of the
 default behaviours of bytes() and bytearray() coupled to ASCII-encoded
 text, but that was a matter of practicality beating purity: there
 really *are* a lot of wire protocols out there that are ASCII based.

I think practicality beating purity should have been extended to
__getitem__ as well. I have almost never had a use for treating a
bytestring as a sequence of integers, while treating a bytestring as a
sequence of one-byte strings is *very* common.

(and, as you say, if you want a sequence of integers you can already
use array.array() which gives you more flexibility as to the width and
signedness of integers)

Regards

Antoine.




Re: [Python-Dev] Don't set local variable in a list comprehension or generator

2011-05-19 Thread Victor Stinner
On Wednesday 18 May 2011 at 21:44 -0400, Terry Reedy wrote:
 On 5/18/2011 5:34 PM, Victor Stinner wrote:
 
 You initial example gave me the impression that the issue has something 
 to do with join in particular, or even comprehensions in particular. It 
 is really about for loops.
 
 >>> dis('for x in range(3): y = x*x')
 ...
      13 FOR_ITER                16 (to 32)
      16 STORE_NAME               1 (x)
      19 LOAD_NAME                1 (x)
      22 LOAD_NAME                1 (x)
      25 BINARY_MULTIPLY
      26 STORE_NAME               2 (y)
 ...

Yeah, STORE_NAME; LOAD_NAME; LOAD_NAME can be replaced by a single
opcode: DUP_TOP. But the user expects x to be defined outside the loop:

>>> for x in range(3): y = x*x
...
>>> x
2

Well, it is possible to detect if x is used or not after the loop, but
it is a little more complex to optimize than list
comprehension/generator :-)
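
The difference in visibility can be sketched as follows (function names
are illustrative): a for statement leaves its loop variable bound
afterwards, while the variable of a Python 3 comprehension does not show
up in the enclosing function's namespace.

```python
def loop_variable_after_loop():
    for x in range(3):
        y = x * x
    return x  # the loop variable outlives the loop


def comprehension_variable_visible():
    squares = [x * x for x in range(3)]
    return 'x' in locals()  # the comprehension's x is not bound out here


print(loop_variable_after_loop())        # 2
print(comprehension_variable_visible())  # False
```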

 .. you cannot get that with Python code without a much smarter optimizer.

Yes, I would like to write a smarter optimizer. But I first asked if it
would be accepted to avoid the temporary loop variable because it changes
the Python language: the user can expect a loop variable using
introspection or a debugger. That's why I suggested to only enable the
optimization if Python is running in optimized mode (python -O or python
-OO).

Victor



Re: [Python-Dev] Python 3.x and bytes

2011-05-19 Thread Nick Coghlan
On Thu, May 19, 2011 at 6:43 PM, Nick Coghlan ncogh...@gmail.com wrote:
 For point 2, I'm personally +0 on the idea of having 1-element bytes
 and bytearray objects delegate hashing and comparison operations to
 the corresponding integer object. We have the power to make the
 obvious code correct code, so let's do that. However, the implications
 of the additional key collisions in value based containers may need to
 be explored further.

On further reflection, the key collision and semantics blurring
problems mean I am at best -0 on this particular solution to the
problem (and heading fairly rapidly in the direction of -1).

Best to just go with b'a'[0] and let the optimiser sort it out (PyPy
should handle it automatically, CPython would need work).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Python 3.x and bytes

2011-05-19 Thread Michael Foord

On 19/05/2011 10:25, Łukasz Langa wrote:

Message written by Stefan Behnel on 2011-05-19, at 10:37:


But why wouldn't they expect `b'de' + 1` to work as well in this case? If a
1-byte bytes is equivalent to an integer, why not an arbitrary one as well?

The result of this must obviously be b"de1".

I hope you're joking. At best, the result should be b"de\x01".
The behaviour Stefan suggests is what some weakly typed languages like 
perl (and possibly php?) do, which masks errors and is rightly abhorred 
by Python programmers (although semantically not *so* different from 1 + 
1.0 == 2.0). I think it's safe to say that Stefan was joking.


Michael


  But I don't think such construct should be allowed. Just like you can't do 
`[1, 2, 3] + 4`. I wouldn't ever expect that a single byte behaves like a 
sequence of bytes. In the case of bytes b'a' is obviously still a sequence of 
bytes, just happening to store a single one. Indexing should return a byte so 
I'm not surprised it returns a number. Slicing on the other hand returns a 
sub-sequence.

However inconvenient, I find the current behaviour logical and predictable. A 
shortcut for b'a'[0] would obviously be nice but that's for python-ideas.




--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html



[Python-Dev] packaging landed in stdlib

2011-05-19 Thread Tarek Ziadé
Hey

I've pushed packaging in stdlib. There are a few buildbot errors
we're fixing right now.

We will continue our work there directly from now on.

The next big commit will be for the documentation.

Cheers
Tarek
-- 
Tarek Ziadé | http://ziade.org


Re: [Python-Dev] Don't set local variable in a list comprehension or generator

2011-05-19 Thread Greg Ewing

Victor Stinner wrote:


I suppose that you have the current value of range(1) on the stack:
DUP_TOP; BINARY_MULTIPLY; gives you the square. You don't need the x
variable (LOAD_FAST/STORE_FAST).


That seems far too special-purpose to be worth it to me.

--
Greg


[Python-Dev] looking for a contact at Google on the Blogger team

2011-05-19 Thread Doug Hellmann
Several of the PSF blogs hosted on Google's Blogger platform are experiencing 
issues as fallout from the recent maintenance problems they had. We have 
already had to recreate at least one of the translations for Python Insider in 
order to be able to publish to it, and now we can't edit posts on Python 
Insider itself.

Can anyone put me in contact with someone at Google from the Blogger team? I 
would at least like to know whether the bX-qpvq7q problem is being worked on, 
so I can decide whether to take a hiatus or start moving us to another 
platform. There are a lot of posts about the error on the support forums, but 
no obvious response from Google.

Thanks,
Doug

--
Doug Hellmann
Communications Director
Python Software Foundation
http://python.org/psf/



Re: [Python-Dev] [RELEASED] Python 3.2.1 rc 1

2011-05-19 Thread Tres Seaver

On 05/18/2011 10:46 PM, anatoly techtonik wrote:
 On Wed, May 18, 2011 at 10:37 PM, Georg Brandl g.bra...@gmx.net wrote:
 On 18.05.2011 21:09, Martin v. Löwis wrote:
 On 18.05.2011 20:39, Hagen Fürstenau wrote:
 On behalf of the Python development team, I am pleased to announce the
 first release candidate of Python 3.2.1.

 Shouldn't there be a tag v3.2.1rc1 in the hg repo?

 http://hg.python.org/releasing/3.2.1/

 Regards,
 Martin

 P.S. "Shouldn't" makes it sound as if there was a mistake.

 To clarify: once the final is done, the repo Martin mentioned will be
 merged back to main and then vanish.
 
 Can't this work be done in the branch of main repo, so that everybody
 can track the progress in place? Is there any picture of the process
 similar to http://nvie.com/posts/a-successful-git-branching-model/ ?

Note that in that writeup, 'release-*' (and 'hotfix-*') branches are not
shown as pushed to the 'origin' repository.


Tres.
-- 
===
Tres Seaver  +1 540-429-0999  tsea...@palladion.com
Palladion Software   "Excellence by Design"   http://palladion.com


Re: [Python-Dev] packaging landed in stdlib

2011-05-19 Thread Tarek Ziadé
On Thu, May 19, 2011 at 1:35 PM, Tarek Ziadé ziade.ta...@gmail.com wrote:
 Hey

 I've pushed packaging in stdlib. There are a few buildbots errors
 we're fixing right now.

FYI.

there are still some failures we're fixing. Thanks for your patience
and thanks to the folks that are helping me on this :)

I expect the bbots to be back on track later today

Cheers
Tarek
-- 
Tarek Ziadé | http://ziade.org


Re: [Python-Dev] Python 3.x and bytes

2011-05-19 Thread Ethan Furman

Nick Coghlan wrote:
OK, summarising the thread so far from my point of view. 


[snip]


To be honest, I don't think there is a lot we can do here except to
further emphasise in the documentation and elsewhere that *bytes is
not a string type* (regardless of any API similarities retained to
ease transition from the 2.x series). For example, if we have any
lingering references to "byte strings" they should be replaced with
"byte sequences" or "bytes objects" (depending on context, as the
former phrasing also encompasses bytearray objects).


I think this would be a big help.


2. As a concrete usability issue, it is awkward to programmatically
check the value of a specific byte when working with an ASCII based
protocol:

  data[i] == b'a' # Intuitive, but always False due to type mismatch
  data[i:i+1] == b'a'  # Works, but clumsy
  data[i] == b'a'[0]  # Ditto (but at least susceptible to compiler
const-expression optimisation)
  data[i] == ord('a') # Clumsy and slow
  data[i] == 97 # Hard to read

Proposals to address this include:
- introduce a character literal to allow c'a' as an alternative to ord('a')
Potentially workable, but leaves the intuitive answer above
silently producing an unexpected answer


[snip]


For point 2, I'm personally +0 on the idea of having 1-element bytes
and bytearray objects delegate hashing and comparison operations to
the corresponding integer object. We have the power to make the
obvious code correct code, so let's do that. However, the implications
of the additional key collisions in value based containers may need to
be explored further.


Nick Coghlan also wrote:
 On further reflection, the key collision and semantics blurring
 problems mean I am at best -0 on this particular solution to the
 problem (and heading fairly rapidly in the direction of -1).

Last thought I have for a possible 'solution' -- when a bytes object is 
tested for equality against an int raise TypeError.  Precedent being 
sum() raising a TypeError when passed a list of strings because 
performance is so poor.  Reason here being that the intuitive behavior 
will never work and will always produce silent bugs.
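
For reference, a minimal sketch of the precedent mentioned above next to the current silent behaviour. It assumes CPython 3.x; the variable names are only illustrative.

```python
words = ['a', 'b', 'c']

# sum() refuses to concatenate strings and raises instead.
try:
    sum(words, '')
except TypeError as exc:
    sum_error = exc
else:
    sum_error = None

# The supported spelling for joining strings.
joined = ''.join(words)

# By contrast, comparing a bytes object with an int raises nothing;
# it just evaluates to False.
silent_result = (b'a' == 97)
```

Here sum_error ends up holding a TypeError, while silent_result is simply False.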


~Ethan~



Re: [Python-Dev] Python 3.x and bytes

2011-05-19 Thread Guido van Rossum
On Thu, May 19, 2011 at 1:43 AM, Nick Coghlan ncogh...@gmail.com wrote:
 OK, summarising the thread so far from my point of view.

 1. There are some aspects of the behavior of bytes() objects that
 tempt people to think of them as string-like objects (primarily the
 b'' literals and their use in repr(), along with the fact that they
 fill roles that were filled by str in its arbitrary binary data
 incarnation in Python 2.x). The mental model this creates in the
 reader is incorrect, as bytes() are far closer to array.array('c') in
 their underlying behaviour (and deliberately so - cf. PEP 358, 3112,
 3137).

I think most of this wrong mental model is actually due to people
not having completely internalized the Python 3 way.

 One proposal for addressing this is to add a x'deadbeef' literal and
 using that in repr() rather than the bytestring. Another would be to
 escape all characters, even printable ASCII, in the bytes()
 representation. Both of these are undesirable, as they miss the
 original purpose of this behaviour: making it easier to work with the
 many ASCII based wire protocols that are in widespread use.

Indeed, -1 on both.

 To be honest, I don't think there is a lot we can do here except to
 further emphasise in the documentation and elsewhere that *bytes is
 not a string type* (regardless of any API similarities retained to
 ease transition from the 2.x series). For example, if we have any
 lingering references to byte strings they should be replaced with
 byte sequences or bytes objects (depending on context, as the
 former phrasing also encompasses bytearray objects).

+1

 2. As a concrete usability issue, it is awkward to programmatically
 check the value of a specific byte when working with an ASCII based
 protocol:

  data[i] == b'a' # Intuitive, but always False due to type mismatch
  data[i:i+1] == b'a'  # Works, but clumsy
  data[i] == b'a'[0]  # Ditto (but at least susceptible to compiler
 const-expression optimisation)
  data[i] == ord('a') # Clumsy and slow
  data[i] == 97 # Hard to read

 Proposals to address this include:
 - introduce a character literal to allow c'a' as an alternative to ord('a')

-1; the result is not a *character* but an integer. I'm personally
favoring using b'a'[0] and possibly hiding this in a constant
definition.
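
A minimal sketch of what hiding b'a'[0] in a constant definition could look like. The protocol, the tag names and the first_tag() helper are hypothetical and exist only for illustration.

```python
# Hypothetical one-byte tags for an imaginary ASCII-based protocol.
# Indexing a bytes literal yields an int, so each constant is a plain integer.
TAG_CONNECT = b'c'[0]
TAG_JOIN = b'j'[0]


def first_tag(data):
    """Return a readable name for the first byte of data (illustrative only)."""
    tag = data[0]  # an int in Python 3
    if tag == TAG_CONNECT:
        return 'connect'
    if tag == TAG_JOIN:
        return 'join'
    return 'unknown'


data = b'cHELLO'
assert data[0] != b'c'         # int vs bytes: never equal
assert data[0:1] == b'c'       # slicing keeps the bytes type
assert data[0] == TAG_CONNECT  # int compared with int
```

The constants keep the mnemonic (the letter stays visible in the source) while the comparison itself is int against int.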

 Potentially workable, but leaves the intuitive answer above
 silently producing an unexpected answer

I'm not convinced that that problem is any worse than other
comparison-related problems. E.g. b'a' == 'a' also always returns
False (most likely it'll be disguised by at least one operand being a
variable of course.)

 - allow 1-element byte sequences to compare equal to the corresponding
 integer values.
    - would require reworking of bytes.__hash__ to use the hash of the
 contained element when the data length is exactly 1
    - transitivity of equality would recommend also supporting
 equivalences such as b'a' == 97.0
    - backwards compatibility concerns arise due to introduction of
 new key collisions in dictionaries and sets and other value based
 containers
    - yet more string-like behaviour in a type that is *not* a string
 (further reinforcing the mistaken impression from point 1)
    - One thing that *isn't* a concern from my point of view is the
 fact that we have ample precedent in decimal.Decimal for supporting
 implicit coercion in comparison operations while disallowing them in
 arithmetic operations (Decimal(1) == 1.0 is allowed, but
 Decimal(1) + 1.0 will raise TypeError).
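
The Decimal precedent can be checked with a short sketch. This assumes Python 3.2 or later and the default decimal context; the variable name is illustrative.

```python
from decimal import Decimal

# Mixed comparison is allowed and compares by value.
assert Decimal(1) == 1.0

# Equal values also hash equal, so they collide as dictionary keys.
assert hash(Decimal(1)) == hash(1.0)

# Mixed arithmetic is rejected.
try:
    Decimal(1) + 1.0
except TypeError as exc:
    arithmetic_error = exc
else:
    arithmetic_error = None
```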

 For point 2, I'm personally +0 on the idea of having 1-element bytes
 and bytearray objects delegate hashing and comparison operations to
 the corresponding integer object. We have the power to make the
 obvious code correct code, so let's do that. However, the implications
 of the additional key collisions in value based containers may need to
 be explored further.

My gut feeling about this is that this will probably introduce some
confusing or unintended side effect elsewhere, and I am -1 on this
change.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Python 3.x and bytes

2011-05-19 Thread Guido van Rossum
On Thu, May 19, 2011 at 10:50 AM, Ethan Furman et...@stoneleaf.us wrote:
 Last thought I have for a possible 'solution' -- when a bytes object is
 tested for equality against an int raise TypeError.  Precedent being sum()
 raising a TypeError when passed a list of strings because performance is so
 poor.  Reason here being that the intuitive behavior will never work and
 will always produce silent bugs.

Not the same thing at all. The == operator is special, and should not
raise exceptions; too many things would start randomly failing (e.g.
membership tests for a dict that has both ints and bytes as keys, or
for a list containing a variety of types).
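
A rough sketch of the failure mode described above. Strict is a hypothetical class standing in for a bytes type whose == raises when compared with an int; nothing like it exists in the standard library.

```python
class Strict:
    """Hypothetical type whose == refuses to compare with ints."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if isinstance(other, int):
            raise TypeError("cannot compare Strict with int")
        return isinstance(other, Strict) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


mixed = [Strict(b'a'), 97]

# 97 really is in the list, but the scan compares it with Strict(b'a')
# first, and that comparison raises.
try:
    found = 97 in mixed
except TypeError:
    found = 'membership test raised'

# With real bytes the comparison is just False and the scan moves on.
assert 97 in [b'a', 97]
```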

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Don't set local variable in a list comprehension or generator

2011-05-19 Thread Guido van Rossum
On Wed, May 18, 2011 at 2:34 PM, Victor Stinner
victor.stin...@haypocalc.com wrote:
 On Wednesday 18 May 2011 at 16:19 +0200, Nadeem Vawda wrote:
 I'm not sure why you would encounter code like that in the first place.

 Well, I found the STORE_FAST/LOAD_FAST issue while trying to optimize
 the 'this' module, which reimplements rot13 using a dict in Python 3:

 d = {}
 for c in (65, 97):
    for i in range(26):
        d[chr(i+c)] = chr((i+13) % 26 + c)

 I tried:

 d = {chr(i+c): chr((i+13) % 26 + c)
     for i in range(26)
     for c in (65, 97)}

 But it is slower, whereas I read somewhere that generators are faster
 than loops.

I'm curious where you read that. The explicit loop should be faster or
equally fast *except* when you can avoid a loop in bytecode by
applying map() to a built-in function. However map() with a lambda is
significantly slower. Maybe what you recall actually (correctly) said
that a comprehension is faster than map+lambda?
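
One way to measure this locally is sketched below. The three function names and the sample text are made up for illustration, and no particular timing result is implied; numbers vary by interpreter version and machine.

```python
import timeit

text = 'abcdefghijklmnopqrstuvwxyz' * 100


def with_map_builtin():
    # map() applied directly to a built-in: no Python-level loop body.
    return list(map(ord, text))


def with_comprehension():
    return [ord(c) for c in text]


def with_map_lambda():
    # map() with a lambda adds a Python function call per element.
    return list(map(lambda c: ord(c), text))


variants = (with_map_builtin, with_comprehension, with_map_lambda)

# All three must produce the same list before timing means anything.
assert all(f() == variants[0]() for f in variants)

if __name__ == '__main__':
    for f in variants:
        print(f.__name__, timeit.timeit(f, number=1000))
```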

 By the way, (c for c in ...) is slower than [c for c
 in ...]. I suppose that a generator is slower because it exits and re-enters
 PyEval_EvalFrameEx() at each step, whereas [c for c ...] uses
 BUILD_LIST in a dummy (but fast) loop.

Did you test this in Python 2 or 3? In 2 the genexpr is definitely
slower than the comprehension; in 3 I'm not sure there's much
difference any more.

 (c for c in ...) and [c for c in ...] is stupid, but I used a simplified
 example to explain the problem. A more realistic example would be:

   squares = (x*x for x in range(1))

 You don't really need the x variable, you just want the square.
 Another example is the syntax using an if to filter the data set:

   (x for x in ... if condition(x))

  I heard about optimization in the AST tree instead of working on the
  bytecode. What is the status of this project?

 Are you referring to issue11549? There was some related discussion [1] on
 python-dev about six weeks ago, but I haven't seen anything on the topic
 since then.

 Ah yes, it looks to be this issue. I didn't know that there was an
 issue.

Hm, probably.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Python 3.x and bytes

2011-05-19 Thread Glyph Lefkowitz

On May 19, 2011, at 1:43 PM, Guido van Rossum wrote:

 -1; the result is not a *character* but an integer.

Well, really the result ought to be an octet, but I suppose adding an 'octet' 
type is beyond the scope of even this sprawling discussion :).

 I'm personally favoring using b'a'[0] and possibly hiding this in a constant 
 definition.

As someone who spends a frankly unfortunate amount of time handling protocols 
where things like this are necessary, I agree with this recommendation.  In 
protocols where one needs to compare network data with one-byte type 
identifiers or packet prefixes, more (documented) constants and less 
inscrutable junk like

if p == 'c':
   ...
elif p == 'j':
   ...
elif p == 'J': # for compatibility
   ...

would definitely be a good thing.  Of course, I realize that this sort of 
programmer will most likely replace those constants with 99, 106, 74 rather than take 
a moment to document what they mean, but at least they'll have to pause for a 
moment and realize that they have now lost _all_ mnemonics...

In fact, I feel like I would want to push in the opposite direction: don't 
treat one-byte bytes slices less like integers; I wish I could more easily 
treat n-byte sequences _more_ like integers! :).  More protocols have 2-byte or 
4-byte network-endian packed integers embedded in them than have individual tag 
bytes that I want to examine.  For the typical ASCII-ish protocol where you 
want to look at command names and CRLF-separated messages, you'd never want to 
look at an individual octet; stringish operations like split() will give you 
what you want.
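
As a rough illustration of both points, the sketch below unpacks network-endian integers with struct and int.from_bytes (available since Python 3.2), then handles the ASCII part with split(). The frame layout is invented for the example and does not correspond to any real protocol.

```python
import struct

# Hypothetical frame: 2-byte payload length, 4-byte sequence number,
# then a CRLF-terminated ASCII command line.
payload = b'HELLO arg1 arg2\r\n'
frame = struct.pack('!HI', len(payload), 1000) + payload

# '!' selects network (big-endian) byte order with no padding.
length, sequence = struct.unpack('!HI', frame[:6])

# int.from_bytes treats an n-byte slice as one big-endian integer.
assert int.from_bytes(frame[:2], 'big') == length

# String-like operations on the ASCII part; no single octet is inspected.
line = frame[6:].split(b'\r\n')[0]
command, *args = line.split()
```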



Re: [Python-Dev] packaging landed in stdlib

2011-05-19 Thread Georg Brandl
On 19.05.2011 13:35, Tarek Ziadé wrote:
 Hey
 
 I've pushed packaging in stdlib. There are a few buildbots errors
 we're fixing right now.
 
 We will continue our work in there directly from now on.

Rock on!

Georg




Re: [Python-Dev] Python 3.x and bytes

2011-05-19 Thread Georg Brandl
On 19.05.2011 10:37, Stefan Behnel wrote:
 Xavier Morel, 19.05.2011 09:41:
 On 2011-05-19, at 07:28 , Georg Brandl wrote:
 On 19.05.2011 00:39, Greg Ewing wrote:
 If someone sees that
 
 some_var[3] == b'd'
 
 is true, and that
 
 some_var[3] == 100
 
 is also true, they might expect to be able to do things like
 
 n = b'd' + 1
 
 and get 101... or maybe b'e'...
 
 Maybe they should :)
 
 But why wouldn't they expect `b'de' + 1` to work as well in this case? If
 a 1-byte bytes is equivalent to an integer, why not an arbitrary one as
 well?
 
 The result of this must obviously be bde1.

To clarify my original one-liner: if bytes objects (but only one-char bytes
objects) equal integers, you should rightly expect to treat them as integers.

This is obviously *not* desirable from a strong-typing POV.

Georg



Re: [Python-Dev] Python 3.x and bytes

2011-05-19 Thread Terry Reedy

On 5/19/2011 3:49 AM, Nick Coghlan wrote:


It's a mental model problem. People try to think of bytes as
equivalent to 2.x str and that's just wrong, wrong, wrong. It's far
closer to array.array('c').


Or like C char arrays


Strings are basically *unique* in
returning a length 1 instance of themselves for indexing operations.


I still remember having to work that out and get used to it.
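
A quick sketch of that difference, assuming Python 3; the variable names are only illustrative.

```python
text = 'abc'
data = b'abc'
items = ['a', 'b', 'c']

# Indexing a str gives another str of length 1, and so on indefinitely.
assert type(text[0]) is str
assert text[0][0][0] == 'a'

# Indexing bytes gives an int; only slicing gives bytes back.
assert type(data[0]) is int
assert data[0] == 97
assert type(data[0:1]) is bytes

# Indexing a list gives the element, not a one-element list.
assert type(items[0]) is not list
```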

--
Terry Jan Reedy



Re: [Python-Dev] Don't set local variable in a list comprehension or generator

2011-05-19 Thread skip
On 5/18/2011 10:19 AM, Nadeem Vawda wrote:

 I'm not sure why you would encounter code like that in the first place.
 Surely any code of the form:
 
 ''.join(c for c in my_string)
 
 would just return my_string? Or am I missing something?

You might more-or-less legitimately encounter it if the generator expression
originally contained a condition which got removed.

Skip


Re: [Python-Dev] [Python-checkins] cpython: Issue #12120, Issue #12119: tests were missing a sys.dont_write_bytecode check

2011-05-19 Thread Victor Stinner
Python 3.3 is not supposed to create .pyc files in the same directory
as the .py files, so I don't understand the following code.

On Thursday 19 May 2011 at 19:56 +0200, tarek.ziade wrote:
 http://hg.python.org/cpython/rev/9d1fb6a9104b
 changeset:   70207:9d1fb6a9104b
 user:        Tarek Ziade ta...@ziade.org
 date:        Thu May 19 19:56:12 2011 +0200
 summary:
   Issue #12120, Issue #12119: tests were missing a sys.dont_write_bytecode check
 
 files:
   Lib/distutils/tests/test_build_py.py         |  3 ++-
   Lib/packaging/tests/test_command_build_py.py |  3 ++-
   Misc/NEWS                                    |  3 +++
   3 files changed, 7 insertions(+), 2 deletions(-)
 
 
 diff --git a/Lib/distutils/tests/test_build_py.py b/Lib/distutils/tests/test_build_py.py
 --- a/Lib/distutils/tests/test_build_py.py
 +++ b/Lib/distutils/tests/test_build_py.py
 @@ -58,7 +58,8 @@
          pkgdest = os.path.join(destination, "pkg")
          files = os.listdir(pkgdest)
          self.assertTrue("__init__.py" in files)
 -        self.assertTrue("__init__.pyc" in files)
 +        if not sys.dont_write_bytecode:
 +            self.assertTrue("__init__.pyc" in files)
          self.assertTrue("README.txt" in files)
  
      def test_empty_package_dir (self):
 diff --git a/Lib/packaging/tests/test_command_build_py.py b/Lib/packaging/tests/test_command_build_py.py
 --- a/Lib/packaging/tests/test_command_build_py.py
 +++ b/Lib/packaging/tests/test_command_build_py.py
 @@ -61,7 +61,8 @@
          pkgdest = os.path.join(destination, "pkg")
          files = os.listdir(pkgdest)
          self.assertIn("__init__.py", files)
 -        self.assertIn("__init__.pyc", files)
 +        if not sys.dont_write_bytecode:
 +            self.assertIn("__init__.pyc", files)
          self.assertIn("README.txt", files)
  
      def test_empty_package_dir(self):
 diff --git a/Misc/NEWS b/Misc/NEWS
 --- a/Misc/NEWS
 +++ b/Misc/NEWS
 @@ -153,6 +153,9 @@
  Library
  ---
  
 +- Issue #12120, #12119: skip a test in packaging and distutils
 +  if sys.dont_write_bytecode is set to True.
 +
  - Issue #12065: connect_ex() on an SSL socket now returns the original errno
    when the socket's timeout expires (it used to return None).
  
 
 ___
 Python-checkins mailing list
 python-check...@python.org
 http://mail.python.org/mailman/listinfo/python-checkins




Re: [Python-Dev] Python 3.x and bytes

2011-05-19 Thread Ethan Furman

Nick Coghlan wrote:

On Thu, May 19, 2011 at 6:43 PM, Nick Coghlan ncogh...@gmail.com wrote:

For point 2, I'm personally +0 on the idea of having 1-element bytes
and bytearray objects delegate hashing and comparison operations to
the corresponding integer object. We have the power to make the
obvious code correct code, so let's do that. However, the implications
of the additional key collisions in value based containers may need to
be explored further.


Several folk have said that objects that compare equal must hash equal...

Why?  It's an honest question.  Here's what I have tried:

--> class Wierd():
...     def __init__(self, value):
...         self.value = value
...     def __eq__(self, other):
...         return self.value == other
...     def __hash__(self):
...         return hash((self.value + 13) ** 3)
...
--> one = Wierd(1)
--> two = Wierd(2)
--> three = Wierd(3)
--> one
<Wierd object at 0x00BFE710>
--> one == 1
True
--> one == 2
False
--> two == 2
True
--> three == 3
True
--> d = dict()
--> d[one] = '1'
--> d[two] = '2'
--> d[three] = '3'
--> d
{<Wierd object at 0x00BFE710>: '1',
 <Wierd object at 0x00BFE870>: '3',
 <Wierd object at 0x00BFE830>: '2'}
--> d[1] = '1.0'
--> d[2] = '2.0'
--> d[3] = '3.0'
--> d
{<Wierd object at 0x00BFE870>: '3',
 1: '1.0',
 2: '2.0',
 3: '3.0',
 <Wierd object at 0x00BFE830>: '2',
 <Wierd object at 0x00BFE710>: '1'}
--> d[2]
'2.0'
--> d[two]
'2'

This behavior matches what I was imagining for having
b'a' == 97.  They compare equal, yet remain distinct objects
for all other purposes.

If anybody has a link to, or an explanation of, why equal values must have 
equal hashes, I'm all ears.  My apologies in advance if this is an 
incredibly naive question.


~Ethan~


Re: [Python-Dev] Python 3.x and bytes

2011-05-19 Thread Benjamin Peterson
2011/5/19 Ethan Furman et...@stoneleaf.us:
 If anybody has a link to or an explanation why equal values must be equal
 hashes I'm all ears.  My apologies in advance if this is an incredibly naive
 question.

https://secure.wikimedia.org/wikipedia/en/wiki/Hash_table


-- 
Regards,
Benjamin


Re: [Python-Dev] Python 3.x and bytes

2011-05-19 Thread Raymond Hettinger

On May 19, 2011, at 7:40 PM, Ethan Furman wrote:

 Several folk have said that objects that compare equal must hash equal...

And so do the docs:
http://docs.python.org/dev/reference/datamodel.html#object.__hash__
"the only required property is that objects which compare equal have the same 
hash value."
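
A small sketch of what goes wrong when that property is violated, reusing the Wierd class posted earlier in the thread. It assumes CPython's dict, which compares stored hash values before it ever calls __eq__.

```python
class Wierd:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return self.value == other

    def __hash__(self):
        # Deliberately different from hash(self.value).
        return hash((self.value + 13) ** 3)


one = Wierd(1)
d = {one: 'stored'}

assert one == 1              # equal according to ==
assert hash(one) != hash(1)  # but the hashes differ

# The lookup for 1 never reaches the entry for `one`, so the dict
# behaves as if the "equal" key were absent.
assert 1 not in d
assert d.get(1) is None
assert d[one] == 'stored'
```

So the two objects compare equal yet cannot stand in for each other as dictionary keys or set members, which is the inconsistency the documented rule is meant to prevent.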


Raymond
