Re: io.open vs. codecs.open

2015-03-06 Thread Albert-Jan Roskam


- Original Message -

 From: Steven D'Aprano steve+comp.lang.pyt...@pearwood.info
 To: python-list@python.org
 Cc: 
 Sent: Wednesday, March 4, 2015 8:56 PM
 Subject: Re: io.open vs. codecs.open
 
 Albert-Jan Roskam wrote:
 
  Hi,
 
  Is there a (use case) difference between codecs.open and io.open? What is
  the difference? A small difference that I just discovered is that
  codecs.open(somefile).read() returns a bytestring if no encoding is
  specified*), but a unicode string if an encoding is specified. io.open
  always returns a unicode string.
 
 What version of Python are you using?


Python 2.7 and 3.4.


 In Python 3, io.open is used as the built-in open. I believe this is
 guaranteed, and not just an implementation detail.
 
 The signatures and capabilities are quite different:
 
 codecs.open:
 
 open(filename, mode='rb', encoding=None, errors='strict', buffering=1)
 
 io.open:
 
 open(file, mode='r', buffering=-1, encoding=None,
  errors=None, newline=None, closefd=True, opener=None)

Thanks. I didn't realize that closefd was also available in Python 2. I had
only seen it in Python 3 open().

 
 io.open does *not* always produce Unicode strings. If you pass 'rb' as the
 mode, the file is opened in binary mode, not text mode, and the read()
 method will return bytes.



So, in recent versions of Python 2 (2.6 and 2.7.x), I can basically ditch both
codecs.open() and the built-in open() in favour of io.open()?
Given that the Python 2 built-in open() has no encoding parameter, it is only
really safe for use with binary data (binary mode).


 As usual, help() in the interactive interpreter is your friend.
 help(codecs.open) and help(io.open) will explain the many differences
 between them, including that codecs.open always opens the file in binary
 mode.
 
 As for use-cases, I think that codecs.open is mostly a left-over from the
 Python 2 days when the built-in open had a much simpler interface and fewer
 capabilities. In Python 2, built-in open doesn't take an encoding argument,
 so if you wanted to use something other than binary mode or the default
 encoding, you were supposed to use codecs.open.
 
 In Python 2.6, the io module was added to Python 2 to aid in porting to
 Python 3. The docs say:
 
 New in version 2.6.
 
 The io module provides the Python interfaces to stream handling.
 Under Python 2.x, this is proposed as an alternative to the
 built-in file object, but in Python 3.x it is the default
 interface to access files and streams.
 
 https://docs.python.org/2/library/io.html
 
 
 To summarise:
 
 * In Python 2, if you want to supply an encoding to open, use codecs.open
 (before 2.6) or io.open (2.6 and later);
 
 * If you want the enhanced capabilities of Python 3 open, use io.open;
 
 * In Python 3, io.open is the same thing as built-in open;
 
 * And codecs.open is (I think) mostly there for backwards compatibility.
 
 
 
 
 -- 
 Steven
 
 -- 
 https://mail.python.org/mailman/listinfo/python-list
 


Re: io.open vs. codecs.open

2015-03-04 Thread Steven D'Aprano
Albert-Jan Roskam wrote:

 Hi,
 
 Is there a (use case) difference between codecs.open and io.open? What is
 the difference? A small difference that I just discovered is that
 codecs.open(somefile).read() returns a bytestring if no encoding is
 specified*), but a unicode string if an encoding is specified. io.open
 always returns a unicode string.

What version of Python are you using?

In Python 3, io.open is used as the built-in open. I believe this is
guaranteed, and not just an implementation detail.

The signatures and capabilities are quite different:

codecs.open:

open(filename, mode='rb', encoding=None, errors='strict', buffering=1)

io.open:

open(file, mode='r', buffering=-1, encoding=None,
 errors=None, newline=None, closefd=True, opener=None)

io.open does *not* always produce Unicode strings. If you pass 'rb' as the
mode, the file is opened in binary mode, not text mode, and the read()
method will return bytes.
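
A minimal sketch of that distinction, assuming Python 3 (it should also run on 2.6+) and a throwaway temporary file whose contents are made up for illustration:

```python
import io
import os
import tempfile

# Hypothetical sample data: a few UTF-8 encoded bytes in a temp file.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(u"caf\u00e9\n".encode("utf-8"))

# Text mode: read() returns a Unicode string (str on Python 3, unicode on 2).
with io.open(path, "r", encoding="utf-8") as f:
    text = f.read()

# Binary mode: read() returns bytes; no decoding takes place.
with io.open(path, "rb") as f:
    raw = f.read()

os.remove(path)
print(type(text), type(raw))
```

The text-mode result is the decoded string, while the binary-mode result is the five encoded bytes of "café" plus the newline.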

As usual, help() in the interactive interpreter is your friend.
help(codecs.open) and help(io.open) will explain the many differences
between them, including that codecs.open always opens the file in binary
mode.
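
One practical consequence of codecs.open always using binary mode is that it does no newline translation, whereas io.open in text mode does by default. A small sketch, assuming a temporary file with Windows-style line endings (the data is invented for the example):

```python
import codecs
import io
import os
import tempfile

# Hypothetical sample data with "\r\n" line endings.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"one\r\ntwo\r\n")

# codecs.open opens the underlying file in binary mode, so "\r\n" survives.
with codecs.open(path, "r", encoding="ascii") as f:
    via_codecs = f.read()

# io.open in text mode translates line endings to "\n" (newline=None).
with io.open(path, "r", encoding="ascii") as f:
    via_io = f.read()

os.remove(path)
print(repr(via_codecs))
print(repr(via_io))
```

Passing newline='' to io.open would switch the translation off, if the original line endings matter.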

As for use-cases, I think that codecs.open is mostly a left-over from the
Python 2 days when the built-in open had a much simpler interface and fewer
capabilities. In Python 2, built-in open doesn't take an encoding argument,
so if you wanted to use something other than binary mode or the default
encoding, you were supposed to use codecs.open.

In Python 2.6, the io module was added to Python 2 to aid in porting to
Python 3. The docs say:

New in version 2.6.

The io module provides the Python interfaces to stream handling.
Under Python 2.x, this is proposed as an alternative to the
built-in file object, but in Python 3.x it is the default
interface to access files and streams.

https://docs.python.org/2/library/io.html


To summarise:

* In Python 2, if you want to supply an encoding to open, use codecs.open
(before 2.6) or io.open (2.6 and later);

* If you want the enhanced capabilities of Python 3 open, use io.open;

* In Python 3, io.open is the same thing as built-in open;

* And codecs.open is (I think) mostly there for backwards compatibility.
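
As an illustration of the summary, here is a sketch of code that supplies an encoding and behaves the same way on Python 2.6+ and Python 3 by going through io.open. The helper name read_text and the sample data are hypothetical, not something from the standard library:

```python
import io
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Hypothetical helper: return the decoded contents of a file."""
    with io.open(path, "r", encoding=encoding) as f:
        return f.read()


# Write some Latin-1 text, then read it back with the same encoding.
fd, path = tempfile.mkstemp()
os.close(fd)
with io.open(path, "w", encoding="latin-1") as f:
    f.write(u"\u00e5ngstr\u00f6m")

result = read_text(path, encoding="latin-1")
size_on_disk = os.path.getsize(path)  # one byte per character in Latin-1
os.remove(path)
print(repr(result), size_on_disk)
```

On Python 2 the u"" prefix matters, because io.open in text mode only accepts unicode objects for writing.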




-- 
Steven



Re: io.open vs. codecs.open

2015-03-04 Thread Mark Lawrence

On 04/03/2015 19:56, Steven D'Aprano wrote:

Albert-Jan Roskam wrote:


Hi,

Is there a (use case) difference between codecs.open and io.open? What is
the difference? A small difference that I just discovered is that
codecs.open(somefile).read() returns a bytestring if no encoding is
specified*), but a unicode string if an encoding is specified. io.open
always returns a unicode string.


What version of Python are you using?

In Python 3, io.open is used as the built-in open. I believe this is
guaranteed, and not just an implementation detail.

The signatures and capabilities are quite different:

codecs.open:

open(filename, mode='rb', encoding=None, errors='strict', buffering=1)

io.open:

open(file, mode='r', buffering=-1, encoding=None,
  errors=None, newline=None, closefd=True, opener=None)

io.open does *not* always produce Unicode strings. If you pass 'rb' as the
mode, the file is opened in binary mode, not text mode, and the read()
method will return bytes.

As usual, help() in the interactive interpreter is your friend.
help(codecs.open) and help(io.open) will explain the many differences
between them, including that codecs.open always opens the file in binary
mode.

As for use-cases, I think that codecs.open is mostly a left-over from the
Python 2 days when the built-in open had a much simpler interface and fewer
capabilities. In Python 2, built-in open doesn't take an encoding argument,
so if you wanted to use something other than binary mode or the default
encoding, you were supposed to use codecs.open.

In Python 2.6, the io module was added to Python 2 to aid in porting to
Python 3. The docs say:

 New in version 2.6.

 The io module provides the Python interfaces to stream handling.
 Under Python 2.x, this is proposed as an alternative to the
 built-in file object, but in Python 3.x it is the default
 interface to access files and streams.

https://docs.python.org/2/library/io.html


To summarise:

* In Python 2, if you want to supply an encoding to open, use codecs.open
(before 2.6) or io.open (2.6 and later);

* If you want the enhanced capabilities of Python 3 open, use io.open;

* In Python 3, io.open is the same thing as built-in open;

* And codecs.open is (I think) mostly there for backwards compatibility.



See http://bugs.python.org/issue8796, "Deprecate codecs.open()".

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Re: io.open vs. codecs.open

2015-03-04 Thread random832
On Wed, Mar 4, 2015, at 07:12, Albert-Jan Roskam wrote:
 Hi,
 
 Is there a (use case) difference between codecs.open and io.open? What is
 the difference?
 A small difference that I just discovered is that
 codecs.open(somefile).read() returns a bytestring if no encoding is
 specified*), but a unicode string if an encoding is specified. io.open
 always returns a unicode string.

I think this is a historical accident. Originally, in Python 2, built-in
open only returned byte strings. Later, codecs was added, and then io
was added after that. Python 3 changed the built-in functions to use the
same classes as io, and now io.open and built-in open are the same. In
new Python 3 code, you should probably always use built-in open. Use
binary mode (a mode string with 'b' in it) if you want byte strings.
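
The claim that the two are the same can be checked directly. This sketch is Python 3 only, since the builtins module has that name only there:

```python
import builtins
import io
import sys

# On Python 3 the built-in open and io.open are one and the same object.
same = io.open is builtins.open
print(sys.version_info[:2], same)
```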


io.open vs. codecs.open

2015-03-04 Thread Albert-Jan Roskam
Hi,

Is there a (use case) difference between codecs.open and io.open? What is the 
difference?
A small difference that I just discovered is that codecs.open(somefile).read() 
returns a bytestring if no encoding is specified*), but a unicode string if an 
encoding is specified. io.open always returns a unicode string.

*) I had never tried that before. I would have expected that encoding would 
default to e.g. locale.getpreferredencoding().
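
For comparison, io.open in text mode does fall back to the locale's preferred encoding when none is given, as far as I know. A rough sketch to check this on a given machine; the reported names can differ in spelling, so both are normalised through codecs.lookup before comparing:

```python
import codecs
import io
import locale
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

# Open in text mode without an encoding and see what io.open picked.
with io.open(path, "w") as f:
    used = f.encoding

os.remove(path)

preferred = locale.getpreferredencoding(False)
# Normalise aliases such as "UTF-8" and "utf-8" to one canonical name.
same = codecs.lookup(used).name == codecs.lookup(preferred).name
print(used, preferred, same)
```

Relying on this default makes the result depend on the machine's locale, which is an argument for always passing encoding explicitly.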


Thank you!

Regards,

Albert-Jan


~~

All right, but apart from the sanitation, the medicine, education, wine, public
order, irrigation, roads, a fresh water system, and public health, what have the
Romans ever done for us?

~~