On Fri, Mar 18, 2011 at 10:09:26PM +0100, Anders Sneckenborg wrote:
> C:\temp\slask4>be new "Svenska tecken åäö"
> Created bug with ID 6be/5c3
> C:\temp\slask4>be list
> ERROR:
> 'ascii' codec can't decode byte 0xe5 in position 15: ordinal not in
> range(128)
> You should set a locale that supports unicode, e.g.
>   export LANG=en_US.utf8
> See http://docs.python.org/library/locale.html for details
> 
> Is it not possible to use Swedish (and other) characters?

It works for me:

  tmp $ mkdir z
  tmp $ cd z
  z $ git init
  Initialized empty Git repository in /tmp/z/.git/
  z $ be --full-version
  1.0.0
  revision: 1e0967ab82d8541413e1dfe4d2e78f1008aa9c5b
  date: 2011-02-24
  committer: W. Trevor King
  storage version: Bugs Everywhere Directory v1.4
  z $ be init
  Using git for revision control.
  BE repository initialized.
  z $ be new "Svenska tecken åäö"
  Created bug with ID 82d/3ca
  z $ be list
  82d/3ca:om: Svenska tecken åäö

Unicode encoding is a bit tricky though, so I may have mixed something
up.  It is also possible that your environment is not configured
correctly.
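
The byte in the error message is at least consistent with the summary
having been stored in a Windows code page and later decoded as ASCII.
A minimal sketch, assuming cp1252 (the code page is a guess, not
something confirmed from the report):

```python
# Assumption: the summary was encoded as cp1252 before being stored.
summary = u"Svenska tecken \u00e5\u00e4\u00f6"
raw = summary.encode("cp1252")

# The first non-ASCII character sits at index 15 and encodes to 0xe5.
first_non_ascii = raw[15:16]

# Decoding those bytes as ASCII fails at that same byte.
try:
    raw.decode("ascii")
    message = None
except UnicodeDecodeError as error:
    message = str(error)
```

Decoding the same bytes with the encoding they were written in
(raw.decode("cp1252")) round-trips to the original summary.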

There used to be a BugDir-wide configuration setting to override your
system encoding, but it was removed because it was unclear which
part of BE should be responsible for reading it (see bug bea/e30).
There is currently no BE-specific way to configure the default
encoding, as it is really a system-level issue.

The encoding for command-line IO is determined by
libbe.util.encoding.get_encoding(), which uses
locale.getpreferredencoding() or, if that is not set,
sys.getdefaultencoding().  The locale method depends on the LANG
environment variable [1].  On my system:

  $ echo $LANG
  en_US.UTF-8
  $ python
  Python 2.6.6 (r266:84292, Mar 16 2011, 22:37:38) 
  [GCC 4.4.4] on linux2
  Type "help", "copyright", "credits" or "license" for more information.
  >>> import locale
  >>> import sys
  >>> locale.getpreferredencoding()
  'UTF-8'
  >>> sys.getdefaultencoding()
  'ascii'

But if I unset LANG:

  $ LANG='' python -c 'import locale; print locale.getpreferredencoding()'
  ANSI_X3.4-1968
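
The fallback order described above can be sketched as follows.  This
is not the actual libbe code; get_encoding_sketch is a hypothetical
stand-in that only mirrors the "preferred encoding, else default
encoding" rule:

```python
import locale
import sys


def get_encoding_sketch():
    """Hypothetical stand-in for libbe.util.encoding.get_encoding().

    Prefer the locale's encoding; fall back to the interpreter default
    when the locale does not report one.
    """
    return locale.getpreferredencoding() or sys.getdefaultencoding()


encoding = get_encoding_sketch()
```

With LANG unset this returns whatever the C locale reports, which is
why the ASCII alias shows up in the session above.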

Locale strings are different on Windows [2], so while I use
en_US.UTF-8, for Swedish on Windows you would use something like
swedish_sweden.1252 or sve_swe.1252, set with [3]:

  C:\temp\slask4>set LANG=swedish_sweden.1252

I'm not sure this will work, as I have no real experience with
encodings on Windows.  You might also try PYTHONIOENCODING [4], but
preliminary tests on my system show that it does not affect
get_encoding().
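
One way to check the PYTHONIOENCODING behaviour on a given machine is
to compare what a child interpreter reports with and without the
variable.  A rough sketch for a current Python 3 (the session above is
Python 2.6); probe and CHILD are illustrative names, not part of BE:

```python
import codecs
import os
import subprocess
import sys

# One-liner run in the child: report stdout's encoding and the
# locale's preferred encoding, one per line.
CHILD = (
    "import locale, sys; "
    "print(sys.stdout.encoding); "
    "print(locale.getpreferredencoding())"
)


def probe(io_encoding=None):
    """Return normalized (stdout encoding, preferred encoding) of a child."""
    env = dict(os.environ)
    env.pop("PYTHONIOENCODING", None)
    if io_encoding is not None:
        env["PYTHONIOENCODING"] = io_encoding
    out = subprocess.check_output([sys.executable, "-c", CHILD], env=env)
    stdout_enc, preferred = out.decode("ascii").split()
    # Normalize aliases such as 'latin-1' so they compare equal.
    return codecs.lookup(stdout_enc).name, codecs.lookup(preferred).name


baseline = probe()
overridden = probe("latin-1")
```

If overridden[1] equals baseline[1], the variable changed only the
standard streams and not the value the locale module reports.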

If none of the above environment variables help, it looks like there
is some more elaborate code in bzrlib.osutils and mercurial.encoding
that we can look to for inspiration.

[1]: http://docs.python.org/library/locale.html
[2]: http://msdn.microsoft.com/en-us/library/hzz3tw78
[3]: http://en.wikipedia.org/wiki/Environment_variable#DOS_and_Windows
[4]: http://docs.python.org/using/cmdline.html#envvar-PYTHONIOENCODING

-- 
This email may be signed or encrypted with GPG (http://www.gnupg.org).
The GPG signature (if present) will be attached as 'signature.asc'.
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy

My public key is at http://www.physics.drexel.edu/~wking/pubkey.txt
