Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Raymond Hettinger

On Aug 14, 2014, at 10:50 PM, Nick Coghlan  wrote:

> Key points in the proposal:
> 
> * deprecate passing integers to bytes() and bytearray()

I'm opposed to removing this part of the API.  It has proven useful
and the alternative isn't very nice.   Declaring the size of fixed length
arrays is not a new concept and is widely adopted in other languages.
One principal use case for the bytearray is creating and manipulating
binary data.  Initializing to zero is a common operation and should remain
part of the core API (consider why we now have list.copy() even though
copying with a slice remains possible and efficient).

I and my clients have taken advantage of this feature and it reads nicely.
The proposed deprecation would break our code and not actually make
anything better.

Another thought is that the core devs should be very reluctant to deprecate
anything we don't have to while the 2 to 3 transition is still in progress.   
Every new deprecation of APIs that existed in Python 2.7 just adds another
obstacle to converting code.  Individually, the differences are trivial.  
Collectively, they present a good reason to never migrate code to Python 3.


Raymond


 

___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 4000 to explicitly declare we won't be doing a Py3k style compatibility break again?

2014-08-17 Thread Nick Coghlan
On 17 August 2014 15:34, Nick Coghlan  wrote:
> On 17 August 2014 15:08, Guido van Rossum  wrote:
>> I think this would be a great topic for a blog post. Once you've written it
>> I can even bless it by Tweeting about it. :-)
>
> Sounds like a plan - I'll try to put together something coherent this week :)

OK, make that "this afternoon":
http://www.curiousefficiency.org/posts/2014/08/python-4000.html :)

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Nick Coghlan
On 17 August 2014 18:13, Raymond Hettinger  wrote:
>
> On Aug 14, 2014, at 10:50 PM, Nick Coghlan  wrote:
>
> Key points in the proposal:
>
> * deprecate passing integers to bytes() and bytearray()
>
>
> I'm opposed to removing this part of the API.  It has proven useful
> and the alternative isn't very nice.   Declaring the size of fixed length
> arrays is not a new concept and is widely adopted in other languages.
> One principal use case for the bytearray is creating and manipulating
> binary data.  Initializing to zero is a common operation and should remain
> part of the core API (consider why we now have list.copy() even though
> copying with a slice remains possible and efficient).

That's why the PEP proposes adding a "zeros" method, based on the name
of the corresponding NumPy construct.

The status quo has some very ugly failure modes when an integer is
passed unexpectedly, and tries to create a large buffer, rather than
throwing a type error.
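
A minimal sketch of the failure mode described above, assuming Python 3.
The function name to_wire is hypothetical and is not taken from the PEP.

```python
def to_wire(payload):
    # Intended for iterables of ints or for existing bytes objects.
    return bytes(payload)

# An iterable of ints gives the expected one-byte result.
from_list = to_wire([5])

# An int passed by mistake does not raise; it silently yields five NUL bytes.
# A large value (a file size, say) would allocate a buffer of that size.
from_int = to_wire(5)

# A str, by contrast, is rejected with a TypeError.
try:
    to_wire("5")
except TypeError:
    str_rejected = True
else:
    str_rejected = False
```

Only the str case is caught at the call site; the int case produces a
plausible-looking value that may travel a long way before anyone notices.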

> I and my clients have taken advantage of this feature and it reads nicely.

If I see "bytearray(10)" there is nothing there that suggests "this
creates an array of length 10 and initialises it to zero" to me. I'd
be more inclined to guess it would be equivalent to "bytearray([10])".

"bytearray.zeros(10)", on the other hand, is relatively clear,
independently of user expectations.
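
For comparison, a short sketch of the two spellings as they behave today in
Python 3. The module-level zeros() function is a hypothetical stand-in for
the proposed bytearray.zeros(), which does not exist in current releases.

```python
ten_zeros = bytearray(10)    # integer argument: ten zero bytes
one_byte = bytearray([10])   # iterable argument: one byte with value 10

def zeros(count):
    """Hypothetical stand-in for the proposed bytearray.zeros(count)."""
    return bytearray(count)

explicit = zeros(10)
```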

> The proposed deprecation would break our code and not actually make
> anything better.
>
> Another thought is that the core devs should be very reluctant to deprecate
> anything we don't have to while the 2 to 3 transition is still in progress.
> Every new deprecation of APIs that existed in Python 2.7 just adds another
> obstacle to converting code.  Individually, the differences are trivial.
> Collectively, they present a good reason to never migrate code to Python 3.

This is actually one of the inconsistencies between the Python 2 and 3
binary APIs:

Python 2.7.5 (default, Jun 25 2014, 10:19:55)
[GCC 4.8.2 20131212 (Red Hat 4.8.2-7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> bytes(10)
'10'
>>> bytearray(10)
bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')

Users wanting well-behaved binary sequences in Python 2.7 would be
well advised to use the "future" module to get a full backport of the
actual Python 3 bytes type, rather than the approximation that is the
8-bit str in Python 2. And once they do that, they'll be able to track
the evolution of the Python 3 binary sequence behaviour without any
further trouble.

That said, I don't really mind how long the deprecation cycle is. I'd
be fine with fully supporting both in 3.5 (2015), deprecating the main
constructor in favour of the explicit zeros() method in 3.6 (2017) and
dropping the legacy behaviour in 3.7 (2018).

Regards,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] [Python-checkins] cpython (merge 3.4 -> default): Issue #22165: Fixed test_undecodable_filename on non-UTF-8 locales.

2014-08-17 Thread Senthil Kumaran
This change is okay and not harmful, but I think it might still not fix
the encoding issue that we encountered on Mac.

[localhost cpython]$ hg log -l 1
changeset:   92128:7cdc941d5180
tag: tip
parent:  92126:3153a400b739
parent:  92127:a894b629bbea
user:Serhiy Storchaka 
date:Sun Aug 17 12:21:06 2014 +0300
description:
Issue #22165: Fixed test_undecodable_filename on non-UTF-8 locales.


[localhost cpython]$ ./python.exe -m test.regrtest test_httpservers
[1/1] test_httpservers
test test_httpservers failed -- Traceback (most recent call last):
  File "/Users/skumaran/python/cpython/Lib/test/test_httpservers.py", line
283, in test_undecodable_filename
.encode(enc, 'surrogateescape'), body)
AssertionError: b'href="%40test_5809_tmp%ED%B3%A7w%ED%B3%B0.txt"' not found
in b'http://www.w3.org/TR/html4/strict.dtd";>\n\n\n\nDirectory listing for
tmpj54lc8m1/\n\n\nDirectory listing for
tmpj54lc8m1/\n\n\n@test_5809_tmp%E7w%F0.txt\ntest\n\n\n\n\n'

1 test failed:
test_httpservers

The underlying problem seems to be a difference in how os.listdir(), which
uses the C API, and os.fsdecode() represent the decoded characters. Ref:
http://bugs.python.org/issue22165#msg225428
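
A small sketch of how the two percent-encoded forms in the assertion above
can arise. It decodes explicitly with UTF-8 instead of calling os.fsdecode()
so that it does not depend on the local filesystem encoding, and it does not
claim to reproduce the actual code path that fails on the Mac.

```python
from urllib.parse import quote

raw = b"\xe7w\xf0"  # undecodable bytes, as in the failing file name above

# surrogateescape maps each undecodable byte 0xNN to the lone surrogate U+DCNN.
name = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes.
restored = name.encode("utf-8", "surrogateescape")

# Quoting the str with surrogatepass encodes the surrogates themselves ...
quoted_from_str = quote(name, errors="surrogatepass")

# ... while quoting the raw bytes percent-encodes the original byte values.
quoted_from_bytes = quote(raw)
```

The two quoted strings differ, which matches the mismatch between the
expected href and the one found in the directory listing.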





On Sun, Aug 17, 2014 at 2:52 PM, serhiy.storchaka <
[email protected]> wrote:

> http://hg.python.org/cpython/rev/7cdc941d5180
> changeset:   92128:7cdc941d5180
> parent:  92126:3153a400b739
> parent:  92127:a894b629bbea
> user:Serhiy Storchaka 
> date:Sun Aug 17 12:21:06 2014 +0300
> summary:
>   Issue #22165: Fixed test_undecodable_filename on non-UTF-8 locales.
>
> files:
>   Lib/test/test_httpservers.py |  5 +++--
>   1 files changed, 3 insertions(+), 2 deletions(-)
>
>
> diff --git a/Lib/test/test_httpservers.py b/Lib/test/test_httpservers.py
> --- a/Lib/test/test_httpservers.py
> +++ b/Lib/test/test_httpservers.py
> @@ -272,6 +272,7 @@
>  @unittest.skipUnless(support.TESTFN_UNDECODABLE,
>   'need support.TESTFN_UNDECODABLE')
>  def test_undecodable_filename(self):
> +enc = sys.getfilesystemencoding()
>  filename = os.fsdecode(support.TESTFN_UNDECODABLE) + '.txt'
>  with open(os.path.join(self.tempdir, filename), 'wb') as f:
>  f.write(support.TESTFN_UNDECODABLE)
> @@ -279,9 +280,9 @@
>  body = self.check_status_and_reason(response, 200)
>  quotedname = urllib.parse.quote(filename, errors='surrogatepass')
>  self.assertIn(('href="%s"' % quotedname)
> -  .encode('utf-8', 'surrogateescape'), body)
> +  .encode(enc, 'surrogateescape'), body)
>  self.assertIn(('>%s<' % html.escape(filename))
> -  .encode('utf-8', 'surrogateescape'), body)
> +  .encode(enc, 'surrogateescape'), body)
>  response = self.request(self.tempdir_name + '/' + quotedname)
>  self.check_status_and_reason(response, 200,
>   data=support.TESTFN_UNDECODABLE)
>
> --
> Repository URL: http://hg.python.org/cpython
>
> ___
> Python-checkins mailing list
> [email protected]
> https://mail.python.org/mailman/listinfo/python-checkins
>
>


Re: [Python-Dev] PEP 4000 to explicitly declare we won't be doing a Py3k style compatibility break again?

2014-08-17 Thread francis

On 08/17/2014 03:28 AM, Nick Coghlan wrote:

> I've seen a few people on python-ideas express the assumption that
> there will be another Py3k style compatibility break for Python 4.0.
>
> I've also had people express the concern that "you broke compatibility
> in a major way once, how do we know you won't do it again?".



Why not just allow those changes that can be automatically changed by
a tool/script applied on the code (a la go, 2to3, 3.Ato3.B, ...)?


Re: [Python-Dev] PEP 4000 to explicitly declare we won't be doing a Py3k style compatibility break again?

2014-08-17 Thread Barry Warsaw
On Aug 16, 2014, at 07:43 PM, Guido van Rossum wrote:

>(Don't understand this to mean that we should never deprecate things.
>Deprecations will happen, they are necessary for the evolution of any
>programming language. But they won't ever hurt in the way that Python 3
>hurt.)

It would be useful to explore what causes the most pain in the 2->3
transition.  IMHO, it's not the deprecations or changes such as print ->
print().  It's the bytes/str split - a fundamental change to core and common
data types.  The question then is whether you foresee any similar looming
pervasive change? [*]

-Barry

[*] I was going to add a joke about mandatory static type checking, but
sometimes jokes are blown up into apocalyptic prophecy around here. ;)


[Python-Dev] "embedded NUL character" exceptions

2014-08-17 Thread Serhiy Storchaka
Currently, most functions that accept a string argument and pass it to a
C function as a NUL-terminated string reject strings with an embedded NUL
character and raise TypeError. ValueError looks more appropriate here,
because the argument type is correct (str); only its value is wrong. But
this is a backward-incompatible change.
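
A minimal sketch for checking which exception a given interpreter raises.
The helper name nul_error_type is hypothetical. It accepts either exception
type, because the answer depends on the Python version in use.

```python
def nul_error_type(path):
    """Return the exception type raised for a path with an embedded NUL."""
    try:
        with open(path):
            pass
    except (TypeError, ValueError) as exc:
        return type(exc)
    except OSError:
        # The path reached the operating system, so the NUL was not rejected.
        return None
    return None

kind = nul_error_type("spam\0eggs")
```

Code that has to run on versions before and after such a change would need
to catch both types, as the helper does.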


I think that we should get rid of this legacy inconsistency sooner or 
later. Why not fix it right now? I have opened an issue on the tracker 
[1], but this issue requires broader discussion.


[1] http://bugs.python.org/issue22215



Re: [Python-Dev] "embedded NUL character" exceptions

2014-08-17 Thread Guido van Rossum
Sounds good to me.


On Sun, Aug 17, 2014 at 7:47 AM, Serhiy Storchaka 
wrote:

> Currently, most functions that accept a string argument and pass it to a
> C function as a NUL-terminated string reject strings with an embedded NUL
> character and raise TypeError. ValueError looks more appropriate here,
> because the argument type is correct (str); only its value is wrong. But
> this is a backward-incompatible change.
>
> I think that we should get rid of this legacy inconsistency sooner or
> later. Why not fix it right now? I have opened an issue on the tracker [1],
> but this issue requires broader discussion.
>
> [1] http://bugs.python.org/issue22215
>
> ___
> Python-Dev mailing list
> [email protected]
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/
> guido%40python.org
>



-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Raymond Hettinger

On Aug 17, 2014, at 1:41 AM, Nick Coghlan  wrote:

> If I see "bytearray(10)" there is nothing there that suggests "this
> creates an array of length 10 and initialises it to zero" to me. I'd
> be more inclined to guess it would be equivalent to "bytearray([10])".
> 
> "bytearray.zeros(10)", on the other hand, is relatively clear,
> independently of user expectations.

Zeros would have been great but that should have been done originally.
The time to get API design right is at inception.
Now, you're just breaking code and invalidating any published examples.

>> 
>> Another thought is that the core devs should be very reluctant to deprecate
>> anything we don't have to while the 2 to 3 transition is still in progress.
>> Every new deprecation of APIs that existed in Python 2.7 just adds another
>> obstacle to converting code.  Individually, the differences are trivial.
>> Collectively, they present a good reason to never migrate code to Python 3.
> 
> This is actually one of the inconsistencies between the Python 2 and 3
> binary APIs:

However, bytearray(n) is the same in both Python 2 and Python 3.
Changing it in Python 3 increases the gulf between the two.

The further we let Python 3 diverge from Python 2, the less likely that
people will convert their code and the harder you make it to write code
that runs under both.
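
A sketch of zero-filled construction that reads the same on both lines,
assuming Python 2.6+ or Python 3. The helper names are hypothetical.

```python
def zero_buffer(count):
    # bytearray(count) yields count zero bytes on Python 2.6+ and Python 3.
    return bytearray(count)

def zero_bytes(count):
    # Repetition avoids bytes(count), which is str(count) on Python 2.
    return b"\x00" * count

buf = zero_buffer(4)
blob = zero_bytes(4)
```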

FWIW, I've been teaching Python full time for three years.  I cover the
use of bytearray(n) in my classes and not a single person out of 3000+
engineers has had a problem with it.   I seriously question the PEP's
assertion that there is a real problem to be solved (i.e. that people
are baffled by bytearray(bufsiz)) and that the problem is sufficiently
painful to warrant the headaches that go along with API changes.

The other proposal to add bytearray.byte(3) should probably be named
bytearray.from_byte(3) for clarity.  That said, I question whether there is
actually a use case for this.   I have never seen code that has a
need to create a byte array of length one from a single integer.
For the most part, the API will be easiest to learn if it matches what
we do for lists and for array.array.

Sorry Nick, but I think you're making the API worse instead of better.
This API isn't perfect but it isn't flat-out broken either.   There is some
unfortunate asymmetry between bytes() and bytearray() in Python 2,
but that ship has sailed.  The current API for Python 3 is pretty good
(though there is still a tension between wanting to be like lists and like
strings both at the same time).


Raymond


P.S.  The most important problem in the Python world now is getting
Python 2 users to adopt Python 3.  The core devs need to develop
a strong distaste for anything that makes that problem harder.







Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Donald Stufft

> On Aug 17, 2014, at 1:07 PM, Raymond Hettinger  
> wrote:
> 
> 
> On Aug 17, 2014, at 1:41 AM, Nick Coghlan wrote:
> 
>> If I see "bytearray(10)" there is nothing there that suggests "this
>> creates an array of length 10 and initialises it to zero" to me. I'd
>> be more inclined to guess it would be equivalent to "bytearray([10])".
>> 
>> "bytearray.zeros(10)", on the other hand, is relatively clear,
>> independently of user expectations.
> 
> Zeros would have been great but that should have been done originally.
> The time to get API design right is at inception.
> Now, you're just breaking code and invalidating any published examples.
> 
>>> 
>>> Another thought is that the core devs should be very reluctant to deprecate
>>> anything we don't have to while the 2 to 3 transition is still in progress.
>>> Every new deprecation of APIs that existed in Python 2.7 just adds another
>>> obstacle to converting code.  Individually, the differences are trivial.
>>> Collectively, they present a good reason to never migrate code to Python 3.
>> 
>> This is actually one of the inconsistencies between the Python 2 and 3
>> binary APIs:
> 
> However, bytearray(n) is the same in both Python 2 and Python 3.
> Changing it in Python 3 increases the gulf between the two.
> 
> The further we let Python 3 diverge from Python 2, the less likely that
> people will convert their code and the harder you make it to write code
> that runs under both.
> 
> FWIW, I've been teaching Python full time for three years.  I cover the
> use of bytearray(n) in my classes and not a single person out of 3000+
> engineers has had a problem with it.   I seriously question the PEP's
> assertion that there is a real problem to be solved (i.e. that people
> are baffled by bytearray(bufsiz)) and that the problem is sufficiently
> painful to warrant the headaches that go along with API changes.
> 
> The other proposal to add bytearray.byte(3) should probably be named
> bytearray.from_byte(3) for clarity.  That said, I question whether there is
> actually a use case for this.   I have never seen code that has a
> need to create a byte array of length one from a single integer.
> For the most part, the API will be easiest to learn if it matches what
> we do for lists and for array.array.
> 
> Sorry Nick, but I think you're making the API worse instead of better.
> This API isn't perfect but it isn't flat-out broken either.   There is some
> unfortunate asymmetry between bytes() and bytearray() in Python 2,
> but that ship has sailed.  The current API for Python 3 is pretty good
> (though there is still a tension between wanting to be like lists and like
> strings both at the same time).
> 
> 
> Raymond
> 
> 
> P.S.  The most important problem in the Python world now is getting
> Python 2 users to adopt Python 3.  The core devs need to develop
> a strong distaste for anything that makes that problem harder.
> 

For the record I’ve had all of the problems that Nick states and I’m
+1 on this change.

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Ethan Furman

On 08/17/2014 10:16 AM, Donald Stufft wrote:


> For the record I’ve had all of the problems that Nick states and I’m
> +1 on this change.


I've had many of the problems Nick states and I'm also +1.

--
~Ethan~


[Python-Dev] Fwd: PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Ian Cordasco
On Aug 17, 2014 12:17 PM, "Donald Stufft"  wrote:
>> On Aug 17, 2014, at 1:07 PM, Raymond Hettinger  
>> wrote:
>>
>>
>> On Aug 17, 2014, at 1:41 AM, Nick Coghlan  wrote:
>>
>>> If I see "bytearray(10)" there is nothing there that suggests "this
>>> creates an array of length 10 and initialises it to zero" to me. I'd
>>> be more inclined to guess it would be equivalent to "bytearray([10])".
>>>
>>> "bytearray.zeros(10)", on the other hand, is relatively clear,
>>> independently of user expectations.
>>
>>
>> Zeros would have been great but that should have been done originally.
>> The time to get API design right is at inception.
>> Now, you're just breaking code and invalidating any published examples.
>>

>>>> Another thought is that the core devs should be very reluctant to deprecate
>>>> anything we don't have to while the 2 to 3 transition is still in progress.
>>>> Every new deprecation of APIs that existed in Python 2.7 just adds another
>>>> obstacle to converting code.  Individually, the differences are trivial.
>>>> Collectively, they present a good reason to never migrate code to Python 3.
>>>
>>>
>>> This is actually one of the inconsistencies between the Python 2 and 3
>>> binary APIs:
>>
>>
>> However, bytearray(n) is the same in both Python 2 and Python 3.
>> Changing it in Python 3 increases the gulf between the two.
>>
>> The further we let Python 3 diverge from Python 2, the less likely that
>> people will convert their code and the harder you make it to write code
>> that runs under both.
>>
>> FWIW, I've been teaching Python full time for three years.  I cover the
>> use of bytearray(n) in my classes and not a single person out of 3000+
>> engineers has had a problem with it.   I seriously question the PEP's
>> assertion that there is a real problem to be solved (i.e. that people
>> are baffled by bytearray(bufsiz)) and that the problem is sufficiently
>> painful to warrant the headaches that go along with API changes.
>>
>> The other proposal to add bytearray.byte(3) should probably be named
>> bytearray.from_byte(3) for clarity.  That said, I question whether there is
>> actually a use case for this.   I have never seen code that has a
>> need to create a byte array of length one from a single integer.
>> For the most part, the API will be easiest to learn if it matches what
>> we do for lists and for array.array.
>>
>> Sorry Nick, but I think you're making the API worse instead of better.
>> This API isn't perfect but it isn't flat-out broken either.   There is some
>> unfortunate asymmetry between bytes() and bytearray() in Python 2,
>> but that ship has sailed.  The current API for Python 3 is pretty good
>> (though there is still a tension between wanting to be like lists and like
>> strings both at the same time).
>>
>>
>> Raymond
>>
>>
>> P.S.  The most important problem in the Python world now is getting
>> Python 2 users to adopt Python 3.  The core devs need to develop
>> a strong distaste for anything that makes that problem harder.
>>
>
> For the record I’ve had all of the problems that Nick states and I’m
> +1 on this change.

I've run into these problems as well, but I'm swayed by Raymond's
argument regarding bytearray's constructor. I wouldn't be averse to
adding zeros (for some parity between bytes and bytearray) to that
but I'm not sure deprecating the behaviour of bytearray's constructor
is necessary.

(Whilst on my phone I only replied to Donald, so I'm forwarding this
to the list.)


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Raymond Hettinger

On Aug 17, 2014, at 11:33 AM, Ethan Furman  wrote:

> I've had many of the problems Nick states and I'm also +1.

There are two code snippets below which were taken from the standard library.
Are you saying that:
1) you don't understand the code (as the pep suggests)
2) you are willing to break that code and everything like it
3) and it would be more elegantly expressed as:  
charmap = bytearray.zeros(256)
and
mapping = bytearray.zeros(256)

At work, I have network engineers creating IPv4 headers and other structures
with bytearrays initialized to zeros.  Do you really want to break all their 
code?
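
A sketch of the kind of code meant here, assuming a minimal 20-byte IPv4
header without options. The field values and the documentation-range
addresses are illustrative assumptions, and the checksum is left at zero.

```python
import struct

header = bytearray(20)  # every field starts out as zero

# Byte 0: version 4 and IHL 5; byte 1: DSCP/ECN; bytes 2-3: total length.
struct.pack_into("!BBH", header, 0, (4 << 4) | 5, 0, len(header))

# Byte 8: TTL; byte 9: protocol (17 is UDP).
struct.pack_into("!BB", header, 8, 64, 17)

# Bytes 12-15: source address; bytes 16-19: destination address.
header[12:16] = bytes([192, 0, 2, 1])
header[16:20] = bytes([192, 0, 2, 2])
```

Fields that are never written, such as the identification and checksum
words, stay zero because the buffer was zero-initialized.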
Nowhere else in Python do we create buffers that way.  Code like
"msg, who = s.recvfrom(256)" is the norm.

Also, it is unclear if you're saying that you have an actual use case for this
part of the proposal?

   ba = bytearray.byte(65)

And that the code would be better, clearer, and faster than the currently 
working form?

   ba = bytearray([65])

Does there really need to be a special case for constructing a single byte?
To me, that is akin to proposing "list.from_int(65)" as an important special
case to replace "[65]".
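
For reference, a sketch of the spellings that already build a single byte
from an integer, assuming Python 3. No proposed method is used here.

```python
value = 65

as_bytearray = bytearray([value])   # mutable, one element
as_bytes = bytes([value])           # immutable, one element
via_int = value.to_bytes(1, "big")  # int method, Python 3.2+
```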

If you must muck with the ever changing bytes() API, then please 
leave the bytearray() API alone.  I think we should show some respect
for code that is currently working and is cleanly expressible in both
Python 2 and Python 3.  We aren't winning users with API churn.

FWIW, I'm guessing that the differing viewpoints in the thread stem
mainly from the proponents' experiences with bytes() rather than
from experience with bytearray() which doesn't seem to have any
usage problems in the wild.  I've never seen a developer say they
didn't understand what "buf = bytearray(1024)" means.   That is
not an actual problem that needs solving (or breaking).

What may be an actual problem is code like "char = bytes(1024)"
though I'm unclear what a user might have actually been trying
to do with code like that.


Raymond


--- excerpts from Lib/sre_compile.py ---

    charmap = bytearray(256)
    for op, av in charset:
        while True:
            try:
                if op is LITERAL:
                    charmap[fixup(av)] = 1
                elif op is RANGE:
                    for i in range(fixup(av[0]), fixup(av[1])+1):
                        charmap[i] = 1
                elif op is NEGATE:
                    out.append((op, av))
                else:
                    tail.append((op, av))

...

    charmap = bytes(charmap) # should be hashable

    comps = {}
    mapping = bytearray(256)
    block = 0
    data = bytearray()
    for i in range(0, 65536, 256):
        chunk = charmap[i: i + 256]
        if chunk in comps:
            mapping[i // 256] = comps[chunk]
        else:
            mapping[i // 256] = comps[chunk] = block
            block += 1
            data += chunk
    data = _mk_bitmap(data)
    data[0:0] = [block] + _bytes_to_codes(mapping)
    out.append((BIGCHARSET, data))
    out += tail
    return out


Re: [Python-Dev] Fwd: PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Barry Warsaw
I think the biggest API "problem" is that default iteration returns integers
instead of bytes.  That's a real pain.

I'm not sure .iterbytes() is the best name for spelling iteration over bytes
instead of integers though.  Given that we can't change __iter__(), I
personally would perhaps prefer a simple .bytes property over which if you
iterated you would receive bytes, e.g.

>>> data = bytes([1, 2, 3])
>>> for i in data:
...  print(i)
... 
1
2
3
>>> for b in data.bytes:
...   print(b)
... 
b'\x01'
b'\x02'
b'\x03'

There are no backward compatibility issues with this of course.
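
Until something like that exists, the same effect can be had today with
slicing. This is a minimal sketch assuming Python 3; the generator name
iter_bytes is hypothetical.

```python
def iter_bytes(data):
    """Yield each element of data as a length-1 object of the same type."""
    for index in range(len(data)):
        # Slicing keeps the container type; indexing would give an int.
        yield data[index:index + 1]

data = bytes([1, 2, 3])
as_ints = list(data)
as_bytes = list(iter_bytes(data))
```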

As for the single-int-ctor forms, they're inconvenient and arguably "wrong",
but I think we can live with it.  OTOH, I don't see any harm by adding the
.zeros() alternative constructor.  I'd probably want to spell the .byte()
alternative constructor .from_int() but I also don't think the status quo (or
.byte()) is that much of a usability problem.

The API churn problem comes about when you start wanting to deprecate the
single-int-ctor form.  *If* that part gets adopted, it should have a really
long deprecation cycle, IMO.

Cheers,
-Barry


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Donald Stufft

> On Aug 17, 2014, at 5:19 PM, Raymond Hettinger  
> wrote:
> 
> 
> On Aug 17, 2014, at 11:33 AM, Ethan Furman wrote:
> 
>> I've had many of the problems Nick states and I'm also +1.
> 
> There are two code snippets below which were taken from the standard library.
> Are you saying that:
> 1) you don't understand the code (as the pep suggests)
> 2) you are willing to break that code and everything like it
> 3) and it would be more elegantly expressed as:  
> charmap = bytearray.zeros(256)
> and
> mapping = bytearray.zeros(256)
> 
> At work, I have network engineers creating IPv4 headers and other structures
> with bytearrays initialized to zeros.  Do you really want to break all their 
> code?
> Nowhere else in Python do we create buffers that way.  Code like
> "msg, who = s.recvfrom(256)" is the norm.
> 
> Also, it is unclear if you're saying that you have an actual use case for this
> part of the proposal?
> 
>ba = bytearray.byte(65)
> 
> And that the code would be better, clearer, and faster than the currently 
> working form?
> 
>ba = bytearray([65])
> 
> Does there really need to be a special case for constructing a single byte?
> To me, that is akin to proposing "list.from_int(65)" as an important special
> case to replace "[65]".
> 
> If you must muck with the ever changing bytes() API, then please 
> leave the bytearray() API alone.  I think we should show some respect
> for code that is currently working and is cleanly expressible in both
> Python 2 and Python 3.  We aren't winning users with API churn.
> 
> FWIW, I'm guessing that the differing viewpoints in the thread stem
> mainly from the proponents' experiences with bytes() rather than
> from experience with bytearray() which doesn't seem to have any
> usage problems in the wild.  I've never seen a developer say they
> didn't understand what "buf = bytearray(1024)" means.   That is
> not an actual problem that needs solving (or breaking).
> 
> What may be an actual problem is code like "char = bytes(1024)"
> though I'm unclear what a user might have actually been trying
> to do with code like that.

I think this is probably correct. I generally don’t think that bytes(1024)
makes much sense at all, especially not as a default constructor. Most likely
it exists to be similar to bytearray().

I don't have a specific problem with bytearray(1024), though I do think it's
more elegantly and clearly described as bytearray.zeros(1024), but not by much.

I find bytes.byte()/bytearray.byte() to be needed as long as there isn't a simple way
to iterate over a bytes or bytearray in a way that yields bytes or bytearrays
instead of integers. To be honest I can't think of a time when I'd actually
*want* to iterate over a bytes/bytearray as integers. Although I realize there
is unlikely to be a reasonable method to change that now. If iterbytes is added
I'm not sure where I'd personally use either bytes.byte() or bytearray.byte().

In general though I think that overloading a single constructor method to do
something conceptually different based on the type of the parameter leads to
these kinds of confusing scenarios and that having differently named constructors
for the different concepts is far clearer.
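
A sketch of the named-constructor style, using a hypothetical bytearray
subclass called Buffer. The method names zeros and from_int echo spellings
suggested in this thread; neither exists on the builtin types.

```python
class Buffer(bytearray):
    """Hypothetical subclass with one named constructor per concept."""

    @classmethod
    def zeros(cls, count):
        # count zero bytes; the name states the intent at the call site.
        return cls(count)

    @classmethod
    def from_int(cls, value):
        # A single byte holding value (0-255).
        return cls([value])

blank = Buffer.zeros(3)
letter = Buffer.from_int(65)
```

Each call names what it builds, so a reader does not have to know how the
plain constructor dispatches on its argument type.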

So given all that, I am:

* +1 for some method of iterating over both types as bytes instead of
  integers.
* +1 on adding .zeros to both types as an alternative and preferred method of
  creating a zero filled instance and deprecating the original method[1].
* -0 on adding .byte to both types as an alternative method of creating a
  single byte instance.
* -1 On changing the meaning of bytearray(1024).
* +/-0 on changing the meaning of bytes(1024), I think that bytes(1024) is
  likely to *not* be what someone wants and that what they really want is
  bytes([N]). I also think that the number one reason for someone to be doing
  bytes(N) is because they were attempting to iterate over a bytes or bytearray
  object and they got an integer. I also think that it's bad that this changes
  from 2.x to 3.x and I wish it hadn't. However I can't decide if it's worth
  reverting this at this time or not.

[1] By deprecating I mean, raise a deprecation warning, or something but my
thoughts on actually removing the other methods are listed explicitly.
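
For reference, a minimal sketch of the behaviours weighed in the list above,
assuming current Python 3 semantics (the variable names are illustrative only):

```python
data = b"abc"

# Iterating over (or indexing) a bytes object yields integers, not bytes.
items = list(data)
first = data[0]

# Passing an integer creates a zero-filled object of that length.
zero_filled = bytes(3)
buffer = bytearray(3)

# Passing a one-element iterable creates a single byte with that value.
single = bytes([first])

# The trap: feeding an indexed value straight back into bytes() silently
# produces `first` NUL bytes instead of the original single byte.
surprise = bytes(first)
```

The last line is the failure mode discussed in this thread: no exception is
raised, the result is simply not what the caller meant.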

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



Re: [Python-Dev] Fwd: PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Markus Unterwaditzer
On Sun, Aug 17, 2014 at 05:41:10PM -0400, Barry Warsaw wrote:
> I think the biggest API "problem" is that default iteration returns integers
> instead of bytes.  That's a real pain.

I agree, this behavior required some helper functions while porting Werkzeug to
Python 3 AFAIK.

> 
> I'm not sure .iterbytes() is the best name for spelling iteration over bytes
> instead of integers though.  Given that we can't change __iter__(), I
> personally would perhaps prefer a simple .bytes property over which if you
> iterated you would receive bytes, e.g.

I'd rather be for a .bytes() method, to match the .values(), and .keys()
methods on dictionaries.

-- Markus


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Antoine Pitrou


On 17/08/2014 13:07, Raymond Hettinger wrote:

> FWIW, I've been teaching Python full time for three years.  I cover the
> use of bytearray(n) in my classes and not a single person out of 3000+
> engineers have had a problem with it.


This is less about bytearray() than bytes(), IMO. bytearray() is 
sufficiently specialized that only experienced people will encounter it.


And while preallocating a bytearray of a certain size makes sense, it's 
completely pointless for a bytes object.


Regards

Antoine.




Re: [Python-Dev] Fwd: PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Nick Coghlan
On 18 Aug 2014 08:04, "Markus Unterwaditzer" wrote:
>
> On Sun, Aug 17, 2014 at 05:41:10PM -0400, Barry Warsaw wrote:
> > I think the biggest API "problem" is that default iteration returns
> > integers instead of bytes.  That's a real pain.
>
> I agree, this behavior required some helper functions while porting
> Werkzeug to Python 3 AFAIK.
>
> >
> > I'm not sure .iterbytes() is the best name for spelling iteration over
> > bytes instead of integers though.  Given that we can't change __iter__(),
> > I personally would perhaps prefer a simple .bytes property over which if
> > you iterated you would receive bytes, e.g.
>
> I'd rather be for a .bytes() method, to match the .values(), and .keys()
> methods on dictionaries.

Calling it bytes is too confusing:

for x in bytes(data):
   ...

for x in bytes(data).bytes()

When referring to bytes, which bytes do you mean, the builtin or the method?

iterbytes() isn't especially attractive as a method name, but it's far more
explicit about its purpose.

Cheers,
Nick.

>
> -- Markus


Re: [Python-Dev] Fwd: PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Barry Warsaw
On Aug 18, 2014, at 08:48 AM, Nick Coghlan wrote:

>Calling it bytes is too confusing:
>
>for x in bytes(data):
>   ...
>
>for x in bytes(data).bytes()
>
>When referring to bytes, which bytes do you mean, the builtin or the method?
>
>iterbytes() isn't especially attractive as a method name, but it's far more
>explicit about its purpose.

I don't know.  How often do you really instantiate the bytes object there in
the for loop?

-Barry





Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Nick Coghlan
On 18 Aug 2014 03:07, "Raymond Hettinger" wrote:
>
>
> On Aug 17, 2014, at 1:41 AM, Nick Coghlan  wrote:
>
>> If I see "bytearray(10)" there is nothing there that suggests "this
>> creates an array of length 10 and initialises it to zero" to me. I'd
>> be more inclined to guess it would be equivalent to "bytearray([10])".
>>
>> "bytearray.zeros(10)", on the other hand, is relatively clear,
>> independently of user expectations.
>
>
> Zeros would have been great but that should have been done originally.
> The time to get API design right is at inception.
> Now, you're just breaking code and invalidating any published examples.

I'm fine with postponing the deprecation elements indefinitely (or just
deprecating bytes(int) and leaving bytearray(int) alone).

>
>>>
>>> Another thought is that the core devs should be very reluctant to
>>> deprecate anything we don't have to while the 2 to 3 transition is still
>>> in progress.  Every new deprecation of APIs that existed in Python 2.7
>>> just adds another obstacle to converting code.  Individually, the
>>> differences are trivial.  Collectively, they present a good reason to
>>> never migrate code to Python 3.
>>
>>
>> This is actually one of the inconsistencies between the Python 2 and 3
>> binary APIs:
>
>
> However, bytearray(n) is the same in both Python 2 and Python 3.
> Changing it in Python 3 increases the gulf between the two.
>
> The further we let Python 3 diverge from Python 2, the less likely that
> people will convert their code and the harder you make it to write code
> that runs under both.
>
> FWIW, I've been teaching Python full time for three years.  I cover the
> use of bytearray(n) in my classes and not a single person out of 3000+
> engineers have had a problem with it.   I seriously question the PEP's
> assertion that there is a real problem to be solved (i.e. that people
> are baffled by bytearray(bufsiz)) and that the problem is sufficiently
> painful to warrant the headaches that go along with API changes.

Yes, I'd expect engineers and networking folks to be fine with it. It isn't
how this mode of the constructor *works* that worries me, it's how it
*fails* (i.e. silently producing unexpected data rather than a type error).

Purely deprecating the bytes case and leaving bytearray alone would likely
address my concerns.

>
> The other proposal to add bytearray.byte(3) should probably be named
> bytearray.from_byte(3) for clarity.  That said, I question whether there
> is actually a use case for this.   I have never seen code that has a
> need to create a byte array of length one from a single integer.
> For the most part, the API will be easiest to learn if it matches what
> we do for lists and for array.array.

This part of the proposal came from a few things:

* many of the bytes and bytearray methods only accept bytes-like objects,
but iteration and indexing produce integers
* to mitigate the impact of the above, some (but not all) bytes and
bytearray methods now accept integers in addition to bytes-like objects
* ord() in Python 3 is only documented as accepting length 1 strings, but
also accepts length 1 bytes-like objects

Adding bytes.byte() makes it practical to document the binary half of ord's
behaviour, and eliminates any temptation to expand the "also accepts
integers" behaviour out to more types.

bytes.byte() thus becomes the binary equivalent of chr(), just as Python 2
had both chr() and unichr().

I don't recall ever needing chr() in a real program either, but I still
consider it an important part of clearly articulating the data model.
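
A small sketch of the symmetry described above. bytes.byte() is only a
proposal in this thread, so byte_from_int below is a hypothetical stand-in
built on bytes([value]), not an existing API:

```python
def byte_from_int(value):
    """Hypothetical stand-in for the proposed bytes.byte(value)."""
    if not 0 <= value <= 255:
        raise ValueError("byte must be in range(0, 256)")
    return bytes([value])


# Text side: chr() and ord() are inverses for length-1 strings.
text_round_trip = chr(ord("a"))

# Binary side: ord() also accepts a length-1 bytes object, and the
# stand-in plays the role that chr() plays for text.
code = ord(b"a")
binary_round_trip = byte_from_int(code)
```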

> Sorry Nick, but I think you're making the API worse instead of better.
> This API isn't perfect but it isn't flat-out broken either.   There is
> some unfortunate asymmetry between bytes() and bytearray() in Python 2,
> but that ship has sailed.  The current API for Python 3 is pretty good
> (though there is still a tension between wanting to be like lists and like
> strings both at the same time).

Yes. It didn't help that the docs previously expected readers to infer the
behaviour of the binary sequence methods from the string documentation -
while the new docs could still use some refinement, I've at least addressed
that part of the problem.

Cheers,
Nick.


Re: [Python-Dev] Fwd: PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Nick Coghlan
On 18 Aug 2014 08:55, "Barry Warsaw"  wrote:
>
> On Aug 18, 2014, at 08:48 AM, Nick Coghlan wrote:
>
> >Calling it bytes is too confusing:
> >
> >for x in bytes(data):
> >   ...
> >
> >for x in bytes(data).bytes()
> >
> >When referring to bytes, which bytes do you mean, the builtin or the
> >method?
> >
> >iterbytes() isn't especially attractive as a method name, but it's far
> >more explicit about its purpose.
>
> I don't know.  How often do you really instantiate the bytes object there
> in the for loop?

I'm talking more generally - do you *really* want to be explaining that
"bytes" behaves like a tuple of integers, while "bytes.bytes" behaves like
a tuple of bytes?

Namespaces are great and all, but using the same name for two different
concepts is still inherently confusing.

Cheers,
Nick.

>
> -Barry


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Antoine Pitrou


On 16/08/2014 01:17, Nick Coghlan wrote:

> * Deprecate passing single integer values to ``bytes`` and ``bytearray``

I'm neutral. Ideally we wouldn't have made that mistake at the beginning.

> * Add ``bytes.zeros`` and ``bytearray.zeros`` alternative constructors
> * Add ``bytes.byte`` and ``bytearray.byte`` alternative constructors
> * Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and
>   ``memoryview.iterbytes`` alternative iterators


+0.5. "iterbytes" isn't really great as a name.

Regards

Antoine.




Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Raymond Hettinger

On Aug 17, 2014, at 4:08 PM, Nick Coghlan  wrote:

> Purely deprecating the bytes case and leaving bytearray alone would likely 
> address my concerns.

That is good progress.  Thanks :-)

Would a warning for the bytes case suffice, or do you need an actual deprecation?

> bytes.byte() thus becomes the binary equivalent of chr(), just as Python 2 
> had both chr() and unichr().
> 
> I don't recall ever needing chr() in a real program either, but I still 
> consider it an important part of clearly articulating the data model.
> 
> 


"I don't recall having ever needed this" greatly weakens the premise that
this is needed :-)

The APIs have been around since 2.6 and AFAICT there has been zero
demonstrated need for a special case for a single byte.  We already have a
perfectly good spelling:

   NUL = bytes([0])

The Zen tells us we really don't need a second way to do it (actually a
third since you can also write b'\x00') and it suggests that this special
case isn't special enough.

I encourage restraint against adding an unneeded class method that has no
parallel elsewhere.  Right now, the learning curve is mitigated because
bytes is very str-like and because bytearray is list-like (i.e. the method
names have been used elsewhere and likely already learned before
encountering bytes() or bytearray()).  Putting in a new, rarely used funky
method adds to the learning burden.

If you do press forward with adding it (and I don't see why), then as an
alternate constructor, the name should be from_int() or some such to avoid
ambiguity and to make clear that it is a class method.

> iterbytes() isn't especially attractive as a method name, but it's far more
> explicit about its purpose.

I concur.  In this case, explicitness matters.


Raymond




Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Nick Coghlan
On 18 Aug 2014 09:41, "Raymond Hettinger" wrote:
>
> I encourage restraint against adding an unneeded class method that has no
> parallel elsewhere.  Right now, the learning curve is mitigated because
> bytes is very str-like and because bytearray is list-like (i.e. the method
> names have been used elsewhere and likely already learned before
> encountering bytes() or bytearray()).  Putting in new, rarely used funky
> method adds to the learning burden.
>
> If you do press forward with adding it (and I don't see why), then as an
> alternate constructor, the name should be from_int() or some such to avoid
> ambiguity and to make clear that it is a class method.

If I remember the sequence of events correctly, I thought of
map(bytes.byte, data) first, and then Guido suggested a dedicated
iterbytes() method later.

The step I hadn't taken (until now) was realising that the new
memoryview(data).iterbytes() capability actually combines with the existing
(bytes([b]) for b in data) to make the original bytes.byte idea unnecessary.
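
As a sketch of that combination using only what exists today (iterbytes() is
still a proposal, so it is not used here), either of the following yields
length-1 bytes objects instead of integers:

```python
data = b"abc"

# Rebuild a one-byte object from each integer produced by iteration.
as_bytes = [bytes([b]) for b in data]

# Equivalent result via slicing: a length-1 slice of bytes is itself bytes.
sliced = [data[i:i + 1] for i in range(len(data))]
```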

Cheers,
Nick.


Re: [Python-Dev] Fwd: PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Barry Warsaw
On Aug 18, 2014, at 09:12 AM, Nick Coghlan wrote:

>I'm talking more generally - do you *really* want to be explaining that
>"bytes" behaves like a tuple of integers, while "bytes.bytes" behaves like
>a tuple of bytes?

I would explain it differently though, using concrete examples.

data = bytes(...)
for i in data: # iterate over data as integers
for i in data.bytes: # iterate over data as bytes

But whatever.  I just wish there was something better than iterbytes.

-Barry




Re: [Python-Dev] Fwd: PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Nick Coghlan
On 18 Aug 2014 09:57, "Barry Warsaw"  wrote:
>
> On Aug 18, 2014, at 09:12 AM, Nick Coghlan wrote:
>
> >I'm talking more generally - do you *really* want to be explaining that
> >"bytes" behaves like a tuple of integers, while "bytes.bytes" behaves
> >like a tuple of bytes?
>
> I would explain it differently though, using concrete examples.
>
> data = bytes(...)
> for i in data: # iterate over data as integers
> for i in data.bytes: # iterate over data as bytes
>
> But whatever.  I just wish there was something better than iterbytes.

There's actually another aspect to your idea, independent of the naming:
exposing a view rather than just an iterator. I'm going to have to look at
the implications for memoryview, but it may be a good way to go (and would
align with the iterator -> view changes in dict).

Cheers,
Nick.

>
> -Barry


Re: [Python-Dev] Fwd: PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Barry Warsaw
On Aug 18, 2014, at 10:08 AM, Nick Coghlan wrote:

>There's actually another aspect to your idea, independent of the naming:
>exposing a view rather than just an iterator. I'm going to have to look at
>the implications for memoryview, but it may be a good way to go (and would
>align with the iterator -> view changes in dict).

Yep!  Maybe that will inspire a better spelling. :)

Cheers,
-Barry




Re: [Python-Dev] Fwd: PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Guido van Rossum
On Sun, Aug 17, 2014 at 5:22 PM, Barry Warsaw  wrote:

> On Aug 18, 2014, at 10:08 AM, Nick Coghlan wrote:
>
> >There's actually another aspect to your idea, independent of the naming:
> >exposing a view rather than just an iterator. I'm going to have to look at
> >the implications for memoryview, but it may be a good way to go (and would
> >align with the iterator -> view changes in dict).
>
> Yep!  Maybe that will inspire a better spelling. :)
>

+1. It's just as much about b[i] as it is about "for c in b", so a view
sounds right. (The view would have to be mutable for bytearrays and for
writable memoryviews.)
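
To make the view idea concrete, here is a rough sketch in pure Python.
BytesView is a hypothetical name, not part of any proposal text, and a real
implementation would presumably live in C and support slicing:

```python
class BytesView:
    """Hypothetical view that exposes elements as length-1 bytes objects."""

    def __init__(self, data):
        # data may be bytes, bytearray or a memoryview of unsigned bytes.
        self._data = data

    def __len__(self):
        return len(self._data)

    def __getitem__(self, index):
        if isinstance(index, slice):
            raise TypeError("slicing is not supported in this sketch")
        if index < 0:
            index += len(self._data)
        if not 0 <= index < len(self._data):
            raise IndexError("index out of range")
        # A length-1 slice converted to bytes, whatever the underlying type.
        return bytes(self._data[index:index + 1])

    def __setitem__(self, index, value):
        # Only succeeds when the underlying object is mutable
        # (bytearray or a writable memoryview).
        if len(value) != 1:
            raise ValueError("expected a bytes-like object of length 1")
        self._data[index] = value[0]

    def __iter__(self):
        for i in range(len(self._data)):
            yield self[i]


view = BytesView(bytearray(b"abc"))
before = list(view)
view[1] = b"x"
after = list(view)
```

Assignment through the view works here because the underlying object is a
bytearray; wrapping an immutable bytes object would raise TypeError on write.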

On the rest, it's sounding more and more as if we will just need to live
with both bytes(1000) and bytearray(1000). A warning sounds worse than a
deprecation to me.

bytes.zeros(n) sounds fine to me; I value similar interfaces for bytes and
bytearray pretty highly.

I'm lukewarm on bytes.byte(c); but bytes([c]) does bother me because a size
one list is (or at least feels) more expensive to allocate than a size one
bytes object. So, okay.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 4000 to explicitly declare we won't be doing a Py3k style compatibility break again?

2014-08-17 Thread Guido van Rossum
On Sun, Aug 17, 2014 at 6:29 AM, Barry Warsaw  wrote:

> On Aug 16, 2014, at 07:43 PM, Guido van Rossum wrote:
>
> >(Don't understand this to mean that we should never deprecate things.
> >Deprecations will happen, they are necessary for the evolution of any
> >programming language. But they won't ever hurt in the way that Python 3
> >hurt.)
>
> It would be useful to explore what causes the most pain in the 2->3
> transition?  IMHO, it's not the deprecations or changes such as print ->
> print().  It's the bytes/str split - a fundamental change to core and
> common data types.  The question then is whether you foresee any similar
> looming pervasive change? [*]
>

I'm unsure about what's the single biggest pain moving to Python 3. In the
past I would have said that it's for sure the bytes/str split (which is both
the biggest pain and the biggest payoff).

But if I look carefully into the soul of teams that are still on 2.7 (I
know a few... :-), I think the real reason is that Python 3 changes so many
different things, you have to actually understand your code to port it
(unlike with minor version transitions, where the changes usually spike in
one specific area, and you can leave the rest to normal attrition and
periodic maintenance).

> -Barry
>
> [*] I was going to add a joke about mandatory static type checking, but
> sometimes jokes are blown up into apocalyptic prophesy around here. ;)
>

Heh. :-)

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 4000 to explicitly declare we won't be doing a Py3k style compatibility break again?

2014-08-17 Thread Donald Stufft
On Sun, Aug 17, 2014, at 09:02 PM, Guido van Rossum wrote:
> On Sun, Aug 17, 2014 at 6:29 AM, Barry Warsaw  wrote:
>> On Aug 16, 2014, at 07:43 PM, Guido van Rossum wrote:
>>
>> >(Don't understand this to mean that we should never deprecate things.
>> >Deprecations will happen, they are necessary for the evolution of any
>> >programming language. But they won't ever hurt in the way that Python 3
>> >hurt.)
>>
>> It would be useful to explore what causes the most pain in the 2->3
>> transition?  IMHO, it's not the deprecations or changes such as print ->
>> print().  It's the bytes/str split - a fundamental change to core and
>> common data types.  The question then is whether you foresee any similar
>> looming pervasive change? [*]
>
> I'm unsure about what's the single biggest pain moving to Python 3. In the 
> past I would have said that it's for sure the bytes/str split (which both the 
> biggest pain and the biggest payoff).
>  
> But if I look carefully into the soul of teams that are still on 2.7 (I know 
> a few... :-), I think the real reason is that Python 3 changes so many 
> different things, you have to actually understand your code to port it 
> (unlike with minor version transitions, where the changes usually spike in 
> one specific area, and you can leave the rest to normal attrition and 
> periodic maintenance).
>  

In my experience bytes/str is the single biggest change that causes the
most problems. Most of the other changes can be mechanically transformed
and/or papered over using helpers like six. The bytes/str change is the
main one that requires understanding code and where it requires a
serious untangling of things in code bases where str/bytes are freely
used interchangeably. Often this requires making a decision about
what *should* be bytes or str as well, which requires having some deep
knowledge about the APIs in question too.


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Antoine Pitrou


On 17/08/2014 19:41, Raymond Hettinger wrote:

> The APIs have been around since 2.6 and AFAICT there have been zero
> demonstrated need for a special case for a single byte.  We already have
> a perfectly good spelling:
>
>    NUL = bytes([0])

That is actually a very cumbersome spelling. Why should I first create a
one-element list in order to create a one-byte bytes object?

> The Zen tells us we really don't need a second way to do it (actually a
> third since you can also write b'\x00') and it suggests that this special
> case isn't special enough.


b'\x00' is obviously the right way to do it in this case, but we're 
concerned about the non-constant case.


The reason to instantiate bytes from non-constant integer comes from the 
unfortunate indexing and iteration behaviour of bytes objects.
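
For the non-constant case, the spellings that exist today can be sketched as
follows (value stands in for an integer computed at runtime):

```python
import struct

value = 0x41  # illustrative runtime value in range(256)

# Wrap the integer in a throwaway one-element sequence.
via_list = bytes([value])
via_tuple = bytes((value,))

# Ask the integer for its own one-byte representation.
via_int = value.to_bytes(1, "big")

# Pack it as a single unsigned byte.
via_struct = struct.pack("B", value)
```

All four produce the same one-byte object; none of them reads as directly as
a dedicated constructor would.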


Regards

Antoine.




Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Ethan Furman

On 08/17/2014 02:19 PM, Raymond Hettinger wrote:
> On Aug 17, 2014, at 11:33 AM, Ethan Furman wrote:
>>
>> I've had many of the problems Nick states and I'm also +1.
>
> There are two code snippets below which were taken from the standard library.

[...]

My issues are with 'bytes', not 'bytearray'.  'bytearray(10)' actually makes
sense.  I certainly have no problem with bytearray and bytes not being
exactly the same.

My primary issues with bytes are not being able to do b'abc'[2] == b'c', and
not being able to do x = b'abc'[2]; y = bytes(x); assert y == b'c'.

And because of the backwards compatibility issues I would deprecate, because
we have a new 'better' way, but not remove, the current functionality.


I pretty much agree exactly with what Donald Stufft said about it.

--
~Ethan~


Re: [Python-Dev] Fwd: PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Antoine Pitrou

On 17/08/2014 20:08, Nick Coghlan wrote:

> On 18 Aug 2014 09:57, "Barry Warsaw" wrote:
> >
> > On Aug 18, 2014, at 09:12 AM, Nick Coghlan wrote:
> >
> > >I'm talking more generally - do you *really* want to be explaining that
> > >"bytes" behaves like a tuple of integers, while "bytes.bytes" behaves
> > >like a tuple of bytes?
> >
> > I would explain it differently though, using concrete examples.
> >
> > data = bytes(...)
> > for i in data: # iterate over data as integers
> > for i in data.bytes: # iterate over data as bytes
> >
> > But whatever.  I just wish there was something better than iterbytes.
>
> There's actually another aspect to your idea, independent of the naming:
> exposing a view rather than just an iterator.


So that view would actually be the bytes object done right? Funny :-)
Will it have lazy slicing?

Regards

Antoine.




Re: [Python-Dev] Fwd: PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Donald Stufft
from __future__ import bytesdoneright? :D

-- 
  Donald Stufft
  [email protected]

On Sun, Aug 17, 2014, at 09:40 PM, Antoine Pitrou wrote:
> On 17/08/2014 20:08, Nick Coghlan wrote:
> >
> > On 18 Aug 2014 09:57, "Barry Warsaw" wrote:
> >  >
> >  > On Aug 18, 2014, at 09:12 AM, Nick Coghlan wrote:
> >  >
> >  > >I'm talking more generally - do you *really* want to be explaining that
> >  > >"bytes" behaves like a tuple of integers, while "bytes.bytes"
> >  > >behaves like a tuple of bytes?
> >  >
> >  > I would explain it differently though, using concrete examples.
> >  >
> >  > data = bytes(...)
> >  > for i in data: # iterate over data as integers
> >  > for i in data.bytes: # iterate over data as bytes
> >  >
> >  > But whatever.  I just wish there was something better than iterbytes.
> >
> > There's actually another aspect to your idea, independent of the naming:
> > exposing a view rather than just an iterator.
> 
> So that view would actually be the bytes object done right? Funny :-)
> Will it have lazy slicing?
> 
> Regards
> 
> Antoine.
> 
> 


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Ethan Furman

On 08/17/2014 04:08 PM, Nick Coghlan wrote:

> I'm fine with postponing the deprecation elements indefinitely (or just
> deprecating bytes(int) and leaving bytearray(int) alone).


+1 on both pieces.

--
~Ethan~


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Ian Cordasco
On Sun, Aug 17, 2014 at 8:52 PM, Ethan Furman  wrote:
> On 08/17/2014 04:08 PM, Nick Coghlan wrote:
>>
>>
>> I'm fine with postponing the deprecation elements indefinitely (or just
>> deprecating bytes(int) and leaving
>> bytearray(int) alone).
>
>
> +1 on both pieces.

Perhaps postpone the deprecation to Python 4000 ;)


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Alex Gaynor
Donald Stufft writes:

> 
> 
> 
> For the record I’ve had all of the problems that Nick states and I’m
> +1 on this change.
> 
> 
> ---
> Donald Stufft
> PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
> 

I've hit basically every problem everyone here has stated, and in no uncertain
terms am I completely opposed to deprecating anything. The Python 2 to 3
migration is already hard enough, and already proceeding far too slowly for
many of our tastes. Making that migration even more complex would drive me to
the point of giving up.

Alex



Re: [Python-Dev] Fwd: PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Chris McDonough

On 08/17/2014 09:40 PM, Antoine Pitrou wrote:
> On 17/08/2014 20:08, Nick Coghlan wrote:
> >
> > On 18 Aug 2014 09:57, "Barry Warsaw" wrote:
> > >
> > > On Aug 18, 2014, at 09:12 AM, Nick Coghlan wrote:
> > >
> > > >I'm talking more generally - do you *really* want to be explaining
> > > >that "bytes" behaves like a tuple of integers, while "bytes.bytes"
> > > >behaves like a tuple of bytes?
> > >
> > > I would explain it differently though, using concrete examples.
> > >
> > > data = bytes(...)
> > > for i in data: # iterate over data as integers
> > > for i in data.bytes: # iterate over data as bytes
> > >
> > > But whatever.  I just wish there was something better than iterbytes.
> >
> > There's actually another aspect to your idea, independent of the naming:
> > exposing a view rather than just an iterator.
>
> So that view would actually be the bytes object done right? Funny :-)
> Will it have lazy slicing?


bytes.sorry()? ;-)

- C




Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Devin Jeanpierre
On Sun, Aug 17, 2014 at 7:14 PM, Alex Gaynor  wrote:
> I've hit basically every problem everyone here has stated, and in no uncertain
> terms am I completely opposed to deprecating anything. The Python 2 to 3
> migration is already hard enough, and already proceeding far too slowly for
> many of our tastes. Making that migration even more complex would drive me to
> the point of giving up.

Could you elaborate what problems you are thinking this will cause for you?

It seems to me that avoiding a bug-prone API is not particularly
complex, and moving it back to its 2.x semantics or making it not work
entirely, rather than making it work differently, would make porting
applications easier. If, during porting to 3.x, you find a deprecation
warning for bytes(n), then rather than being annoying code churny
extra changes, this is actually a bug that's been identified. So it's
helpful even during the deprecation period.
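
For anyone following along, a short sketch of the behaviour in question as it stands in Python 3. The variable names are only illustrative. In Python 2, bytes is an alias for str, so bytes(3) gives '3'; in Python 3 the same call gives three zero bytes.

```python
n = 3

# Python 3: an integer argument means "this many zero bytes".
zero_filled = bytes(n)

# What code ported from Python 2 usually meant: the decimal digits of n.
digits = str(n).encode("ascii")

# What some callers expect instead: a single byte with the value n.
single_byte = bytes([n])
```

All three results are valid bytes objects, which is why the mistake tends to pass silently until the data is inspected.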

-- Devin


Re: [Python-Dev] PEP 4000 to explicitly declare we won't be doing a Py3k style compatibility break again?

2014-08-17 Thread Nick Coghlan
On 18 August 2014 11:14, Donald Stufft  wrote:
> On Sun, Aug 17, 2014, at 09:02 PM, Guido van Rossum wrote:
>> I'm unsure about what's the single biggest pain moving to Python 3. In the 
>> past I would have said that it's for sure the bytes/str split (which is both
>> the biggest pain and the biggest payoff).
>>
>> But if I look carefully into the soul of teams that are still on 2.7 (I know 
>> a few... :-), I think the real reason is that Python 3 changes so many 
>> different things, you have to actually understand your code to port it 
>> (unlike with minor version transitions, where the changes usually spike in 
>> one specific area, and you can leave the rest to normal attrition and 
>> periodic maintenance).
>>
>
> In my experience bytes/str is the single biggest change that causes the
> most problems. Most of the other changes can be mechanically transformed
> and/or papered over using helpers like six. The bytes/str change is the
> main one that requires understanding code and where it requires a
> serious untangling of things in code bases where str/bytes are freely
> used interchangeably. Oftentimes this requires making a decision about
> what *should* be bytes or str as well which requires having some deep
> knowledge about the APIs in question too.

It's certainly the one that has caused the most churn in CPython and
the standard library - the ripples still haven't entirely settled on
that front :)
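
As a small illustration of the untangling Donald describes (a sketch only, with a hypothetical helper name, not code from any particular project): Python 3 refuses to mix the two types implicitly, so each value has to be assigned to one side of the split, typically by decoding at the I/O boundary.

```python
def ensure_text(value, encoding="utf-8"):
    """Hypothetical helper: decode bytes to str, pass str through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Python 2 would coerce here for ASCII data; Python 3 raises TypeError.
try:
    b"abc" + "def"
except TypeError:
    mixing_fails = True
else:
    mixing_fails = False

# Once the decision "this is text" has been made, the concatenation is fine.
joined = ensure_text(b"abc") + "def"
```

The helper is the easy part; deciding which values are text and which encoding applies is the part that requires knowing the APIs involved.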

I think Guido's right that there's also a "death of a thousand cuts"
aspect for large existing code bases, though, especially those that
are lacking comprehensive test suites. By definition, existing large
Python 2 applications are OK with the restrictions imposed by Python
2, and we're deliberately not forcing the issue by halting Python 2
maintenance. That's where Steve Dower's idea of being able to
progressively declare a code base "Python 3 compatible" on a file by
file basis and have some means of programmatically enforcing that is
interesting - it opens the door to "opportunistic and incremental"
porting, where modules are progressively updated to run on both, until
an application reaches a point where it can switch to Python 3 and
leave Python 2 behind.
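
A very rough sketch of what per-file enforcement could look like. This is an assumption for illustration, not Steve's actual proposal: the function names are hypothetical, the check must run under a Python 3 interpreter, and it only verifies that a module's source compiles as Python 3 syntax, which says nothing about semantic differences such as bytes/str handling.

```python
def parses_as_python3(source, filename="<string>"):
    """Return True if the source text compiles under the running (Python 3) interpreter."""
    try:
        compile(source, filename, "exec")
    except SyntaxError:
        return False
    return True


def unported_modules(sources):
    """Given a dict of module name -> source text, list the modules that fail the check."""
    return sorted(
        name for name, text in sources.items()
        if not parses_as_python3(text, name)
    )


example_sources = {
    "ported": "print('hello')\n",
    "legacy": "print 'hello'\n",  # Python 2 print statement
}
still_to_port = unported_modules(example_sources)
```

A check like this could run in CI so that a module, once ported, cannot quietly regress while the rest of the code base is still on Python 2.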

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia