On Thu, Jan 29, 2015 at 02:42:31AM -0500, Robert Simmons wrote:
> On Thu, Jan 29, 2015 at 2:29 AM, Roland Smith <rsm...@xs4all.nl> wrote:
> > On Thu, Jan 29, 2015 at 01:38:21AM -0500, Robert Simmons wrote:
> >> I'm having a unicode problem on FreeBSD lang/python34 that does not
> >> appear on MacOS X. I've condensed the problem to one single line to
> >> enter in the interpreter:
> >>
> >> FreeBSD:
> >> Python 3.4.2 (default, Jan 28 2015, 22:23:57)
> >> [GCC 4.2.1 Compatible FreeBSD Clang 3.4.1 (tags/RELEASE_34/dot1-final 208032)] on freebsd10
> >> Type "help", "copyright", "credits" or "license" for more information.
> >> >>> b'\xc3\xa2'.decode('utf-8')
> >> '\xe2'
> >>
> >> MacOS X:
> >> Python 3.4.2 (default, Oct 19 2014, 17:55:38)
> >> [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.54)] on darwin
> >> Type "help", "copyright", "credits" or "license" for more information.
> >> >>> b'\xc3\xa2'.decode('utf-8')
> >> 'â'
> >>
> >> Why is Python on FreeBSD incorrectly decoding this?
> >
> > Works fine here (FreeBSD 10.1-STABLE #0 r276653 amd64):
> >
> >     Python 3.4.2 (default, Nov  4 2014, 19:34:48)
> >     [GCC 4.2.1 Compatible FreeBSD Clang 3.4.1 (tags/RELEASE_34/dot1-final 208032)] on freebsd10
> >     Type "help", "copyright", "credits" or "license" for more information.
> >     >>> b'\xc3\xa2'.decode('utf-8')
> >     'â'

(please don't top-post)

> What is the output from print(sys.stdout.encoding) on your system?

    Python 3.4.2 (default, Nov  4 2014, 19:34:48)
    [GCC 4.2.1 Compatible FreeBSD Clang 3.4.1 (tags/RELEASE_34/dot1-final 208032)] on freebsd10
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import sys
    >>> print(sys.stdout.encoding)
    UTF-8
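
As far as I can tell, the decode on your machine is correct and only the
display differs: when the interactive prompt cannot encode the repr of a
value in sys.stdout.encoding, it falls back to the 'backslashreplace' error
handler. A minimal sketch of that behaviour follows; the helper name show()
is made up for illustration and is not part of Python.

```python
import locale
import sys

# Decoding does not depend on the locale: the result is one code point, U+00E2.
decoded = b'\xc3\xa2'.decode('utf-8')
assert len(decoded) == 1 and ord(decoded) == 0xE2


def show(text, encoding):
    """Approximate what the prompt displays for `text` when stdout uses `encoding`.

    Hypothetical helper: it encodes repr(text) with 'backslashreplace',
    which is the fallback the interactive display hook uses.
    """
    return repr(text).encode(encoding, errors='backslashreplace').decode(encoding)


print(show(decoded, 'ascii'))   # what an ASCII-only stdout would display
print(show(decoded, 'utf-8'))   # what a UTF-8 stdout would display

# The values that decide which of the two you actually get:
print(sys.stdout.encoding, locale.getpreferredencoding(False))
```

If the last line reports US-ASCII or similar on your system, the '\xe2' you
saw is just the escaped form of the same string.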

> And, can you explain how to change that on mine so that it is UTF-8?
> Mine is a default fresh install, btw.

In /etc/login.conf, I set LC_ALL=en_US.UTF-8 in the default login class:

    default:\
            :passwd_format=sha512:\
            :copyright=/etc/COPYRIGHT:\
            :welcome=/etc/motd:\
            :setenv=MAIL=/var/mail/$,BLOCKSIZE=K,LC_ALL=en_US.UTF-8:\
            :path=/sbin /bin /usr/sbin /usr/bin /usr/games /usr/local/sbin /usr/local/bin
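
Note that a change to /etc/login.conf only takes effect after the capability
database is rebuilt ("cap_mkdb /etc/login.conf") and you log in again. To
check what a new session ended up with, something like the sketch below
should do. The function name effective_ctype() is made up for illustration;
it assumes the usual precedence LC_ALL, then LC_CTYPE, then LANG, with "C"
as the fallback when none is set.

```python
import locale
import os


def effective_ctype(env):
    """Return (variable, value) for the setting that selects the character-type locale.

    Hypothetical helper. Empty values are treated as unset; the "C"
    fallback is an assumption about the system default.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return name, value
    return None, "C"


if __name__ == "__main__":
    name, value = effective_ctype(os.environ)
    print("character-type locale comes from", name, "=", value)
    print("preferred encoding:", locale.getpreferredencoding(False))
```

With the login class above, the first line should name LC_ALL with the value
en_US.UTF-8.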

And I use a Unicode-aware X terminal (rxvt-unicode).

In case you're not using X11, the new vt(4) device uses UTF-8, but the old
sc(4) doesn't support it at all, AFAIK.

Roland
-- 
R.F.Smith                                   http://rsmith.home.xs4all.nl/
[plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
pgp: 5753 3324 1661 B0FE 8D93  FCED 40F6 D5DC A38A 33E0 (keyID: A38A33E0)
