Re: Unicode issue with Python v3.3

2013-04-19 Thread Νίκος Γκρ33κ
Hello Cameron,

Did you receive my mail from yesterday?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Unicode issue with Python v3.3

2013-04-17 Thread nagia . retsina
On Sunday, 14 April 2013 12:28:32 PM UTC+3, Cameron Simpson wrote:
 [...]
Cameron,

can you help, please, or tell me what else I need to try?
Hello


Re: Unicode issue with Python v3.3

2013-04-17 Thread Chris Angelico
On Wed, Apr 17, 2013 at 4:56 PM,  nagia.rets...@gmail.com wrote:
 can you help please or tell me what else i need to try?

You need to try trimming quoted text in replies, not double-spacing,
and paying for help.

ChrisA


Re: Unicode issue with Python v3.3

2013-04-17 Thread Cameron Simpson
On 14Apr2013 04:22, nagia.rets...@gmail.com nagia.rets...@gmail.com wrote:
|  | Cameron would it be too much to ask to provide you with root
|  | access to my VPS server so you can have a look there too?
|  | i can pay you if you like if you wait a few days to gather some money.
|  
|  I really do not recommend that:
|- it is nuts to blithely allow a stranger root access to your system
|- you won't learn anything about CGI scripts
[...]
| I insist that you will make the most of this if you access the VPS yourself.
| it runs CentOS 6.4
| Please accept, i trust you.

Very well.

Let's take this off list to personal email (note that the reply-to
on this message is just myself, not the list/group).

We can return here after sorting CGI issues, should there be any further python
specific issues.

Reply to this message. I will email you my ssh public key. Please make me an
_ordinary_ user account called cameron and send me the ssh details of your 
VPS.
-- 
Cameron Simpson c...@zip.com.au

TeX: When you pronounce it correctly to your computer, the terminal may
 become slightly moist. - D. E. Knuth.


Re: Unicode issue with Python v3.3

2013-04-17 Thread Νίκος Γκρ33κ
On Thursday, 18 April 2013 2:00:48 AM UTC+3, Cameron Simpson wrote:

 Reply to this message. I will email you my ssh public key. Please make me an 
 _ordinary_ user account called cameron and send me the ssh details of your
 VPS.

Thank you very much, Cameron. I appreciate all your help, and I am willing to 
open a free lifetime premium account for you at my web hosting service as a 
token of appreciation.

I have just mailed you the login credentials.


Re: Unicode issue with Python v3.3

2013-04-15 Thread Νίκος Γκρ33κ
Hello, can you still help me please?


Re: Unicode issue with Python v3.3

2013-04-14 Thread nagia . retsina
On Wednesday, 10 April 2013 12:10:13 AM UTC+3, Νίκος Γκρ33κ wrote:
 Hello, I am still trying to convert the code from Python 2.6 to 3.3.
 
 Everything is set up except for that Unicode error, which you can see if you 
 go to http://superhost.gr
 
 Can anyone help with this?
 I even tried replacing print() with sys.stdout.buffer(), but I still get the 
 same Unicode issue.
 
 I don't know what to try anymore.

root@nikos [/home/nikos/public_html/foo-py]# pwd
/home/nikos/public_html/foo-py
root@nikos [/home/nikos/public_html/foo-py]# cat foo.py 
#!/bin/sh
exec 2>>/home/nikos/cgi.err.out
echo "$0 $*" >&2
id >&2
env | sort >&2
set -x
exec /full/path/to/foo-py ${1+"$@"}

root@nikos [/home/nikos/public_html/foo-py]# python3 foo.py 
  File "foo.py", line 2
    exec 2>>/home/nikos/cgi.err.out
         ^
SyntaxError: invalid syntax
root@nikos [/home/nikos/public_html/foo-py]# 

As for the tail -f of the error_log:

root@nikos [/home/nikos/public_html]# touch /var/log/httpd/error_log
root@nikos [/home/nikos/public_html]# tail -f /var/log/httpd/error_log

and it is empty, even when at the exact same time I run 'python3 metrites.py' 
from another interactive prompt, when it is supposed to give a live feed of the 
error messages.

Cameron, would it be too much to ask if I provided you with root access to my 
VPS server so you can have a look there too?

I can pay you if you like, if you wait a few days for me to gather some money.


Re: Unicode issue with Python v3.3

2013-04-14 Thread Cameron Simpson
On 13Apr2013 23:00, nagia.rets...@gmail.com nagia.rets...@gmail.com wrote:
| root@nikos [/home/nikos/public_html/foo-py]# pwd
| /home/nikos/public_html/foo-py
| root@nikos [/home/nikos/public_html/foo-py]# cat foo.py 
| #!/bin/sh
| exec 2>>/home/nikos/cgi.err.out
| echo "$0 $*" >&2
| id >&2
| env | sort >&2
| set -x
| exec /full/path/to/foo-py ${1+"$@"}
| 
| root@nikos [/home/nikos/public_html/foo-py]# python3 foo.py 
|   File "foo.py", line 2
|     exec 2>>/home/nikos/cgi.err.out
|          ^
| SyntaxError: invalid syntax

That is because foo.py isn't a python script anymore, it is a shell script.
Its purpose is to divert stderr to a file and to recite various
things about the environment to that file in addition to any error
messages.

Just run it directly:

  ./foo.py

The #! line should cause it to be run by the shell.

I also recommend you try to do all this as your normal user account.
Root is for administration, such as stopping/starting apache and
so on, not for test-running scripts from the command line; consider:
if the script has bugs, as root it can do an awful lot of damage.

| root@nikos [/home/nikos/public_html/foo-py]# 
| As far as thr tail -f of the error_log:
| root@nikos [/home/nikos/public_html]# touch /var/log/httpd/error_log

That won't do you much good; apache has not opened it, and so it
will not be writing to it. It was writing to a file of that name,
but you removed that file. Apache probably still has its hooks in the old
file (which now has no name).

Restarting apache should open (or create if missing) this file for you.
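
The point about a removed log file can be reproduced with a small Python sketch. This is only an illustration of the POSIX behaviour, assuming a Unix-like system; the temporary file is a stand-in for error_log and nothing here touches apache itself.

```python
import os
import tempfile

def demo_unlinked_log():
    """Show that an open file survives removal of its name (POSIX only)."""
    fd, path = tempfile.mkstemp(suffix=".log")  # stand-in for error_log
    log = os.fdopen(fd, "w")
    log.write("before removal\n")
    log.flush()

    os.unlink(path)                     # like: rm /var/log/httpd/error_log
    name_exists = os.path.exists(path)  # the name is gone ...

    log.write("after removal\n")        # ... but the open handle still works
    log.flush()
    size = os.fstat(log.fileno()).st_size  # data keeps accumulating, unseen
    log.close()
    return name_exists, size

if __name__ == "__main__":
    print(demo_unlinked_log())
```

A fresh touch of the same path creates a different, empty file, which is why tail -f on it shows nothing until the server reopens its log.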

| root@nikos [/home/nikos/public_html]# tail -f /var/log/httpd/error_log
| and its empty even when at the exact same time i run 'python3
| metrites.py' from another interactive prompt when it supposed to
| give live feed of the error messages.

No, _apache_ writes to that file. So only when you visit the web
page will stuff appear there.

If you just run things from the command line, error messages will appear on 
your terminal. Or, after this line of the wrapper script:

  exec 2>>/home/nikos/cgi.err.out

the error messages will appear in cgi.err.out.

| Cameron would it be too much to ask to provide you with root
| access to my VPS server so you can have a look there too?
| i can pay you if you like if you wait a few days to gather some money.

I really do not recommend that:

  - it is nuts to blithely allow a stranger root access to your system
  - you won't learn anything about CGI scripts

What you need for further debugging of your python issues is access
to the error messages from the CGI script. That is the purpose of
the wrapper script.

Get the wrapper running on the command line and then test it via the browser.

Cheers,
-- 
Cameron Simpson c...@zip.com.au

Lord grant me the serenity to accept the things I can not change,
 the courage to change the things that I can,
and the wisdom to hide the bodies of those people I had to kill
 because they pissed me off.
- Jeffrey Papen jpa...@asucla.ucla.edu


Re: Unicode issue with Python v3.3

2013-04-14 Thread nagia . retsina
On Sunday, 14 April 2013 12:28:32 PM UTC+3, Cameron Simpson wrote:
 [...]

Well, I trust you because you, along with Lele, are the only ones helping me 
here.

I tried what you said:

root@nikos [/home/nikos/public_html/cgi-bin]# service httpd restart
root@nikos [/home/nikos/public_html/cgi-bin]# python3 metrites.py 
root@nikos [/home/nikos/public_html]# cd foo-py/
root@nikos [/home/nikos/public_html/foo-py]# ls
./  ../  foo.py*
root@nikos [/home/nikos/public_html/foo-py]# ./foo.py 
root@nikos [/home/nikos/public_html/foo-py]# cd ..
root@nikos [/home/nikos/public_html]# cat cgi.err.out 
root@nikos [/home/nikos/public_html/cgi-bin]# cat /var/log/httpd/error_log 
root@nikos [/home/nikos/public_html/cgi-bin]# 

and I have run the script from the browser, but I still see nothing.

I insist that you will make the most of this if you access the VPS yourself.
it runs CentOS 6.4

Please accept, i trust you.


Re: Unicode issue with Python v3.3

2013-04-13 Thread Cameron Simpson
On 12Apr2013 21:50, nagia.rets...@gmail.com nagia.rets...@gmail.com wrote:
| Okay, after that was corrected, I then tried the plain solution and I got this
| response back from the shell:
| 
| Traceback (most recent call last):
|   File "metrites.py", line 213, in <module>
|     htmldata = f.read()
|   File "/root/.local/lib/python2.7/lib/python3.3/encodings/iso8859_7.py", line 23, in decode
|     return codecs.charmap_decode(input,self.errors,decoding_table)[0]
| UnicodeDecodeError: 'charmap' codec can't decode byte 0xae in position 47: character maps to <undefined>
| 
| then I switched to:
| 
|   with open('/home/nikos/www/' + page, encoding='utf-8') as f:
|       htmldata = f.read()
| 
| and I got no error at all, just a clean run *from the shell*!

Ok, so you need to specify utf-8 to decode the file. Good.
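
For reference, the difference between the two codecs can be sketched as follows. This is a minimal illustration, not the code from metrites.py: the sample text and the temporary file are made up, and the only thing taken from the traceback above is that byte 0xae has no mapping in iso8859_7.

```python
import os
import tempfile

def read_text(path, encoding="utf-8"):
    """Read a whole file as str, decoding with an explicit encoding."""
    with open(path, encoding=encoding) as f:
        return f.read()

def demo():
    text = "μετρητή"  # hypothetical sample; 'ή' encodes to bytes 0xce 0xae
    fd, path = tempfile.mkstemp(suffix=".html")
    with os.fdopen(fd, "wb") as f:
        f.write(text.encode("utf-8"))  # the file on disk holds UTF-8 bytes
    try:
        try:
            read_text(path, encoding="iso8859_7")
            wrong_codec_failed = False
        except UnicodeDecodeError:
            wrong_codec_failed = True  # 0xae is undefined in ISO-8859-7
        return read_text(path), wrong_codec_failed
    finally:
        os.remove(path)

if __name__ == "__main__":
    print(demo())
```

Passing encoding= explicitly removes the dependence on whatever default the process happens to inherit.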

| But I get an internal server error when I try to run the webpage from the
| browser (Chrome).

That is standard for a CGI script that does not complete successfully.

| So, can you tell me please where I can find the apache error log, so I can
| display it here?

That depends on the install. Have a look in /var/log/apache or similar.
Otherwise you need to find the httpd.conf for the apache and look
for its log configuration settings.

| Is the Apache error_log always better than running 'python3 metrites.py',
| because even if the python script has no error, apache will also display more
| web-related things?

The error log is where error messages from CGI scripts go, along with other
error messages.
It is very useful when testing CGI scripts.

Of course, it's best to work out as much as possible from the command
line first; you have much more direct control and access to errors
there. That only gets you so far though; the environment the CGI
script runs in is not the same as your command line, and some
different behaviour can come from this.

BTW, are you sure python3 is running your CGI script?
Also, the CGI script may not be running as you, but as the apache user.
In that case, it may fail if it does not have permission to access various
files owned by you.
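
One way to check this from inside the script is sketched below. It is an assumption-laden illustration: the function name is invented, it relies on the Unix-only pwd module, and the path in the usage part is only an example taken from earlier in this thread.

```python
import os
import pwd
import sys

def describe_process_user(paths=()):
    """Return the user name of this process and (readable, writable) per path."""
    uid = os.getuid()  # os.access() below also checks against the real uid
    try:
        user = pwd.getpwuid(uid).pw_name
    except KeyError:   # uid with no passwd entry
        user = str(uid)
    access = {p: (os.access(p, os.R_OK), os.access(p, os.W_OK)) for p in paths}
    return user, access

if __name__ == "__main__":
    user, access = describe_process_user(["/home/nikos/www"])
    print("running as:", user, file=sys.stderr)
    for path, (readable, writable) in access.items():
        print(path, "readable:", readable, "writable:", writable,
              file=sys.stderr)
```

Run from the command line it reports your own account; run as a CGI it reports whichever user the web server uses, and the two can then be compared.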

Anyway, you need to see the error messages to work this out.

If you can't find the error log you can divert stderr at the
start of your python program:

  sys.stderr = open('/home/nikos/cgi.err.out', 'a')

and watch that in a shell:

  tail -f cgi.err.out
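
A slightly fuller sketch of the same idea follows. The helper name is made up for illustration; note that it only redirects Python-level writes to sys.stderr (which includes uncaught tracebacks), not output from child processes.

```python
import sys

def divert_stderr(path):
    """Send everything written to sys.stderr to `path`, appending."""
    # buffering=1 is line-buffered, so `tail -f` sees each line promptly.
    log = open(path, "a", buffering=1, encoding="utf-8")
    sys.stderr = log
    return log

if __name__ == "__main__":
    divert_stderr("/home/nikos/cgi.err.out")
    print("CGI script starting", file=sys.stderr)
```

The file must be writable by whichever user the web server runs the script as, otherwise the open() itself will fail.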

Cheers,
-- 
Cameron Simpson c...@zip.com.au

If you 'aint falling off, you ar'nt going hard enough.  - Fred Gassit


Re: Unicode issue with Python v3.3

2013-04-13 Thread nagia . retsina
On Saturday, 13 April 2013 1:28:07 PM UTC+3, Cameron Simpson wrote:
 [...]

root@macgyver [/home/nikos/public_html/cgi-bin]# ls ../cgi.err.out 
../cgi.err.out
root@macgyver [/home/nikos/public_html/cgi-bin]# cat ../cgi.err.out 
root@macgyver [/home/nikos/public_html/cgi-bin]# 

Also, I found the error log and tried to view it, but it was empty. I then 
removed it and ran the script both from the shell and the browser, but it 
did not reappear.

root@macgyver [/home/nikos/public_html/cgi-bin]# cat /var/log/httpd/error_log
cat: /var/log/httpd/error_log: No such file or directory
root@macgyver [/home/nikos/public_html/cgi-bin]# 

Maybe something is wrong with my environment?
Should we check the Apache and CGI environment somehow, and also make sure, as 
you say, that *I* run the CGI scripts and not user 'Apache'?

Tell me what commands I should issue, please, and I will show you the output.

Thank you, Cameron, for helping me. Somehow the script doesn't seem to be the 
issue on my VPS.


Re: Unicode issue with Python v3.3

2013-04-13 Thread Chris Angelico
On Sun, Apr 14, 2013 at 12:16 AM,  nagia.rets...@gmail.com wrote:
 Also i have foudn the error log and i tried to view it but it was empty and 
 then i removed it and then run the script both from shell and broswer but it 
 didnt reappeared.

 root@macgyver [/home/nikos/public_html/cgi-bin]# cat /var/log/httpd/error_log
 cat: /var/log/httpd/error_log: No such file or directory
 root@macgyver [/home/nikos/public_html/cgi-bin]#

https://www.google.com/search?q=log+file+rotation

ChrisA


Re: Unicode issue with Python v3.3

2013-04-13 Thread Cameron Simpson
On 13Apr2013 07:16, nagia.rets...@gmail.com nagia.rets...@gmail.com wrote:
| root@macgyver [/home/nikos/public_html/cgi-bin]# ls ../cgi.err.out 
| ../cgi.err.out

I prefer ls -ld myself.

| root@macgyver [/home/nikos/public_html/cgi-bin]# cat ../cgi.err.out 
| 
| Also i have foudn the error log and i tried to view it but it was
| empty and then i removed it and then run the script both from shell
| and broswer but it didnt reappeared.

Never remove it. It is only created by the web server at startup or log 
rotation time. So now you need to restart the apache to get it back.

Just open a spare terminal and run:

  tail -f /var/log/httpd/error_log

| Should we check the Apache and CGI enviroment somehow and also
| make sure as you say that *I* run the CGI scripts and not user
| 'Apache' ?

Well, it is helpful to know. If the CGI script tries to write any data to files
and it runs as a different user, it will need different permissions on the files.

| Tell me what commands i should issues please and i will display the output to
| you.

I would be tempted to wrap the CGI script in a shell script.
Suppose your script is named foo.py. 

You can move the script to foo-py and make a shell script called foo.py 
looking like this:

  #!/bin/sh
  exec 2>>/home/nikos/cgi.err.out
  echo "$0 $*" >&2
  id >&2
  env | sort >&2
  set -x
  exec /full/path/to/foo-py ${1+"$@"}

and make sure it, like the original, is readable and executable:

  chmod a+rx foo.py foo-py

Make sure cgi.err.out is publicly writable (in case the apache is
not running the CGIs as you):

  chmod a+w cgi.err.out

Then:

  tail -f cgi.err.out

in a spare window.

Then try the script.

It should transcribe information about the script's user and
environment and also catch errors.

This should help in debugging.
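
If a shell wrapper is not convenient, roughly the same diagnostics can be produced from Python. The sketch below is an assumed equivalent, not part of the wrapper above: the function name is invented, and it only mirrors the echo, id and env | sort steps.

```python
import os
import sys

def dump_cgi_context(out=None):
    """Write argv, user/group ids and the sorted environment to `out`."""
    if out is None:
        out = sys.stderr  # looked up at call time, so a diverted stderr is used
    print(" ".join(sys.argv), file=out)
    print("uid=%d gid=%d" % (os.getuid(), os.getgid()), file=out)
    for name in sorted(os.environ):
        print("%s=%s" % (name, os.environ[name]), file=out)

if __name__ == "__main__":
    dump_cgi_context()
```

Called at the very top of the CGI script, after stderr has been pointed at a file, it records what the web server actually passed in.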

Cheers,
-- 

I die. I have a terrible fever in my head and it gets hotter and hotter until
my head is a fire, a forge, a star. I set the world on fire and all die. O the
embarrassment.  - Joe Haldeman, _A !Tangled Web_


Re: Unicode issue with Python v3.3

2013-04-12 Thread nagia . retsina
Someone HEELP ME!!


Re: Unicode issue with Python v3.3

2013-04-12 Thread Chris Angelico
On Fri, Apr 12, 2013 at 10:50 PM,  nagia.rets...@gmail.com wrote:
 Someone HEELP ME!!

http://youtu.be/VxMYwjp8t0o

ChrisA


Re: Unicode issue with Python v3.3

2013-04-12 Thread nagia . retsina
On Friday, 12 April 2013 4:14:39 PM UTC+3, Chris Angelico wrote:
 On Fri, Apr 12, 2013 at 10:50 PM,  nagia.rets...@gmail.com wrote:
  Someone HEELP ME!!
 
 http://youtu.be/VxMYwjp8t0o
 
 ChrisA


Well, instead of being a smartass it would be nice if you could actually help 
for once.


Re: Unicode issue with Python v3.3

2013-04-12 Thread Chris Angelico
On Fri, Apr 12, 2013 at 11:18 PM,  nagia.rets...@gmail.com wrote:
 On Friday, 12 April 2013 4:14:39 PM UTC+3, Chris Angelico wrote:
 On Fri, Apr 12, 2013 at 10:50 PM,  nagia.rets...@gmail.com wrote:

  Someone HEELP ME!!

 http://youtu.be/VxMYwjp8t0o

 ChrisA


 Well, instead of being a smartass it would be nice if you could actually help 
 for once.

Yeah, I'm done with that. Your whining ran through my patience a few
posts ago. But you should feel special; I clipped that just for you.

ChrisA


Re: Unicode issue with Python v3.3

2013-04-12 Thread rusi
On Apr 12, 6:18 pm, nagia.rets...@gmail.com wrote:
 On Friday, 12 April 2013 4:14:39 PM UTC+3, Chris Angelico wrote:

  On Fri, Apr 12, 2013 at 10:50 PM,  nagia.rets...@gmail.com wrote:

   Someone HEELP ME!!

 http://youtu.be/VxMYwjp8t0o

  ChrisA

 Well, instead of being a smartass it would be nice if you could actually help 
 for once.

Interesting!
Among the things which you dont seem to know is the meaning of the
word 'once'.


Re: Unicode issue with Python v3.3

2013-04-12 Thread nagia . retsina
On Friday, 12 April 2013 4:29:51 PM UTC+3, rusi wrote:
 On Apr 12, 6:18 pm, nagia.rets...@gmail.com wrote:
  On Friday, 12 April 2013 4:14:39 PM UTC+3, Chris Angelico wrote:
   On Fri, Apr 12, 2013 at 10:50 PM,  nagia.rets...@gmail.com wrote:
    Someone HEELP ME!!
 
  http://youtu.be/VxMYwjp8t0o
 
   ChrisA
 
  Well, instead of being a smartass it would be nice if you could actually 
  help for once.
 
 Interesting!
 Among the things which you dont seem to know is the meaning of the
 word 'once'.

Same applies for you too. Stop being smartasses.


Re: Unicode issue with Python v3.3

2013-04-12 Thread Ian Kelly
On Fri, Apr 12, 2013 at 8:36 AM,  nagia.rets...@gmail.com wrote:
 On Friday, 12 April 2013 4:29:51 PM UTC+3, rusi wrote:
 On Apr 12, 6:18 pm, nagia.rets...@gmail.com wrote:
  Well, instead of being a smartass it would be nice if you could actually 
  help for once.

 Interesting!

 Among the things which you dont seem to know is the meaning of the
 word 'once'.

 Same applies for you too. Stop being smartasses.

Please keep in mind that this is a community of volunteers.  Nobody
here is being paid for their time to help you fix your website, and if
you manage to irritate us in the process, we're likely to just walk
away from it.

I looked over the code that you have provided us with, and based on
that I could not see any reason why the html would be in the form of a
bytes instead of a str.  Since nobody else here seems to have any
further insight into the problem either, you're just going to have to
find a way to debug the code.  If you cannot do that on your own,
then I suggest that you find a contractor who can, hire them, and
grant them the access they need to do a real debugging session.

I would also recommend that in the future you should stop deploying
untested code to your production website.  Set up a development
environment for yourself, make the changes there, and only deploy when
you know that everything is working.


Re: Unicode issue with Python v3.3

2013-04-12 Thread Roy Smith
In article mailman.533.1365792239.3114.python-l...@python.org,
 Ian Kelly ian.g.ke...@gmail.com wrote:

 I would also recommend that in the future you should stop deploying
 untested code to your production website.  Set up a development
 environment for yourself, make the changes there, and only deploy when
 you know that everything is working.

But that takes all the fun out of it :-)


Re: Unicode issue with Python v3.3

2013-04-12 Thread nagia . retsina
On Friday, 12 April 2013 9:37:29 PM UTC+3, Ian wrote:
 On Fri, Apr 12, 2013 at 8:36 AM,  nagia.rets...@gmail.com wrote:
  On Friday, 12 April 2013 4:29:51 PM UTC+3, rusi wrote:
  On Apr 12, 6:18 pm, nagia.rets...@gmail.com wrote:
   Well, instead of being a smartass it would be nice if you could actually 
   help for once.
 
  Interesting!
 
  Among the things which you dont seem to know is the meaning of the
  word 'once'.
 
  Same applies for you too. Stop being smartasses.
 
 Please keep in mind that this is a community of volunteers.  Nobody
 here is being paid for their time to help you fix your website, and if
 you manage to irritate us in the process, we're likely to just walk
 away from it.
 
 I looked over the code that you have provided us with, and based on
 that I could not see any reason why the html would be in the form of a
 bytes instead of a str.  Since nobody else here seems to have any
 further insight into the problem either, you're just going to have to
 find a way to debug the code.  If you cannot do that on your own,
 then I suggest that you find a contractor who can, hire them, and
 grant them the access they need to do a real debugging session.
 
 I would also recommend that in the future you should stop deploying
 untested code to your production website.  Set up a development
 environment for yourself, make the changes there, and only deploy when
 you know that everything is working.

I agree with what you say, except for the claim that I try to irritate people.
Look at the thread and you will see who irritated whom first.


Re: Unicode issue with Python v3.3

2013-04-12 Thread Cameron Simpson
On 11Apr2013 09:55, Nikos nagia.rets...@gmail.com wrote:
| On Thursday, 11 April 2013 1:45:22 PM UTC+3, Cameron Simpson wrote:
|  On 10Apr2013 21:50, nagia.rets...@gmail.com nagia.rets...@gmail.com wrote:
|  | the doctype is coming from the attempt of script metrites.py to open and
|  | read the 'index.html' file.
|  | But i don't know how to try to open it as a byte file instead of a text
|  | file.

Lele Gaifax showed one way:

  from codecs import open
  with open('index.html', encoding='utf-8') as f:
      content = f.read()

But a plain open() should also do:

  with open('index.html') as f:
      content = f.read()

if you're not taking tight control of the file encoding.

The point here is to get _text_ (i.e. str) data from the file, not bytes.

If the text turns out to be incorrectly decoded (i.e. incorrectly
reading the file bytes and assembling them into text strings) because
the default encoding is wrong, then you may need to reach for Lele's
more verbose open() example to select the correct encoding.

But first ignore that and get text (str) instead of bytes.
If you're already getting text from the file, something later is
making bytes and handing it to print().
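
The difference between the two ways of opening a file can be checked with a small sketch like the one below. The file name demo.html and its contents are made up for illustration; they are not the files from this thread. It also assumes Python 3, where the built-in open() accepts an encoding argument directly.

```python
import os
import tempfile

# Hypothetical sample file, created here so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "demo.html")
with open(path, "w", encoding="utf-8") as f:
    f.write("<p>υλικό</p>")

# Text mode: read() returns str, decoded using the given encoding.
with open(path, encoding="utf-8") as f:
    as_text = f.read()

# Binary mode: read() returns the raw, undecoded bytes.
with open(path, "rb") as f:
    as_bytes = f.read()

print(type(as_text).__name__)
print(type(as_bytes).__name__)
```

If the second form (mode "rb") appears anywhere in the script, that is one place where bytes could be coming from.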

Another approach to try is to use
  sys.stdout.write()
instead of
  print()

The print() function will take _anything_ and write text of some form.
The write() function will throw an exception if it gets the wrong type of data.

If sys.stdout is opened in binary mode then write() will require
bytes as data; strings will need to be explicitly turned into bytes
via .encode() in order to not raise an exception.

If sys.stdout is open in text mode, write() will require str data.
The sys.stdout file itself will transcribe to bytes for you.

If you take that route, at least you will not have confusion about
str versus bytes.

For an HTML output page I would advocate arranging that sys.stdout
is in text mode; that way you can do the natural thing and .write()
str data and lovely UTF-8 bytes will come out the other end.

If the above test (using .write() instead of print()) shows it to
be in binary mode we can fix that. But you need to find out.
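
A minimal sketch of that test, using in-memory streams as stand-ins for sys.stdout so that it behaves the same however it is run (the stand-in names are illustrative only):

```python
import io

# Stand-ins for sys.stdout in its two possible modes.
binary_out = io.BytesIO()                                     # binary-mode stream
text_out = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")   # text-mode stream

text_out.write("υλικό\n")                     # str into a text stream: accepted
binary_out.write("υλικό\n".encode("utf-8"))   # bytes into a binary stream: accepted

# Now hand each stream the wrong type and record what happens.
errors = []
for stream, data in ((text_out, b"raw bytes"), (binary_out, "a str")):
    try:
        stream.write(data)
    except TypeError as exc:
        errors.append(type(exc).__name__)

print(errors)
```

Unlike print(), both streams refuse the wrong type outright, which is what makes write() useful for finding out where bytes are sneaking in.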

You will want access to the error messages from the CGI environment;
do you have access to the web server's error_log? You can tail that
in a terminal while you reload the page to see what's going on.

| This works in the shell, but doesn't work on my website:
| 
| $ cat utf8.txt
| υλικό!Πρόκειται γ

Ok, so your terminal is using UTF-8 as its output coding. (And so
is your mail posting program, since we see it unmangled on my screen
here.)

| $ python3
| Python 3.2.3 (default, Oct 19 2012, 20:10:41)
| [GCC 4.6.3] on linux2
| Type "help", "copyright", "credits" or "license" for more information.
| >>> data = open('utf8.txt').read()
| >>> print(data)
| υλικό!Πρόκειται γ

Likewise.

However, in an exciting twist, I seem to recall that Python invoked
interactively with a terminal as output will have the default terminal
encoding in place on sys.stdout. Producing what you expect. _However_,
python invoked in a batch environment where stdout is not a terminal
(such as in the CGI environment producing your web page), that is
_not_ necessarily the case.
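
One way to check this, and to force UTF-8 if needed, is sketched below. It assumes Python 3; the helper name utf8_wrapper is made up for this example, and an in-memory buffer stands in for the real stdout. Setting the PYTHONIOENCODING environment variable for the CGI process is another possible route.

```python
import io
import sys

# Report which encoding print() would use right now; write it to stderr so
# that under CGI it ends up in the server's error log, not in the page.
print(getattr(sys.stdout, "encoding", None), file=sys.stderr)

def utf8_wrapper(binary_stream):
    """Return a text stream that always encodes to UTF-8 (hypothetical helper)."""
    return io.TextIOWrapper(binary_stream, encoding="utf-8", line_buffering=True)

# In a real script this would be: sys.stdout = utf8_wrapper(sys.stdout.buffer)
# Here an in-memory buffer stands in for stdout so the result can be inspected.
raw = io.BytesIO()
out = utf8_wrapper(raw)
out.write("υλικό")
out.flush()
print(raw.getvalue())
```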

| >>> print(data.encode('utf-8'))
| b'\xcf\x85\xce\xbb\xce\xb9\xce\xba\xcf\x8c!\xce\xa0\xcf\x81\xcf\x8c\xce\xba\xce\xb5\xce\xb9\xcf\x84\xce\xb1\xce\xb9 \xce\xb3\n'
| 
| See, the last line is what I am getting on my website.

The above line takes your Unicode text in data and transcribes
it to bytes using UTF-8 as the encoding. print() then receives
that bytes object and prints its str() representation, b'...'.
That str is itself Unicode, and when print() passes it to sys.stdout,
_that_ transcribes the Unicode b'...' string as bytes to your
terminal, using UTF-8 going by the previous examples above. But
since all those characters are in the ASCII range (code points 0 to 127) the
byte sequence will be the same if it uses ASCII or ISO8859-1 or
almost anything else:-)
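
The effect can be reproduced without a terminal. This sketch uses repr(), which for a bytes object produces the same text that print() shows:

```python
data = "υλικό"
encoded = data.encode("utf-8")

# For bytes, str() and repr() give the same b'...' text; repr() is used here
# because it does not trigger a BytesWarning when Python runs with -b.
shown = repr(encoded)
print(shown)   # b'\xcf\x85\xce\xbb\xce\xb9\xce\xba\xcf\x8c'

# Every character of that text is plain ASCII, so the output encoding
# makes no difference to the bytes that reach the terminal or browser.
print(all(ord(ch) < 128 for ch in shown))
```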

As you can see, there's a lot of encoding/decoding going on behind
the scenes even in this superficially simple example.

| If i remove
| the encode('utf-8') part in metrites.py, the webpage will not show
| anything at all...

Ah, but data will be being output. The print() function _will_ be
writing data out in some form.  I suggest you remove the .encode()
and then examine the _source_ text of the web page, not its visible
form.

So: remove .encode(), reload the web page, view page source
(this depends on your browser; it is Ctrl-U in Firefox, or Cmd-U in
Firefox on a Mac).

I think a lot of the issue you have in this thread is that your
page is too complex. Make another page to do the same thing, and
start with nothing. Add stuff to it a single item at a time until
the page behaves incorrectly. Then you will know the exact item of
code that introduced the issue. And then that single item can be
examined in detail for the decode/encode issues.

The other issue in the thread is that people losing 

Re: Unicode issue with Python v3.3

2013-04-12 Thread nagia . retsina

Re: Unicode issue with Python v3.3

2013-04-11 Thread nagia . retsina
Since we now know the problem, maybe we can tell metrites.py to open index.html 
using UTF-8 encoding rather than as binary, don't you think?


Re: Unicode issue with Python v3.3

2013-04-11 Thread Steven D'Aprano
On Thu, 11 Apr 2013 00:13:46 -0700, nagia.retsina wrote:

 Since we now know the problem, maybe we can tell metrites.py to open
 index.html using UTF-8 encoding rather than as binary, don't you think?

What makes you think it is UTF-8?

Last time you tried decoding content as UTF-8, you got an error that it 
wasn't a legal UTF-8 file. 


Where does index.html come from? Whatever program generates that, you 
need to find out what encoding it is using.
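
If the program that wrote the file cannot be asked, a rough check is to see which candidate encodings decode the raw bytes at all. The sketch below is only that: the function name and the candidate list (UTF-8 plus two Greek single-byte encodings) are assumptions for illustration. A successful decode does not prove the encoding is right, because single-byte encodings accept almost any input.

```python
def first_working_encoding(raw, candidates=("utf-8", "iso8859-7", "cp1253")):
    """Return the first candidate encoding that decodes raw without error."""
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None

# Two made-up samples of the same Greek word in different encodings.
utf8_bytes = "Αριθμός".encode("utf-8")
greek_bytes = "Αριθμός".encode("iso8859-7")

utf8_guess = first_working_encoding(utf8_bytes)
greek_guess = first_working_encoding(greek_bytes)
print(utf8_guess, greek_guess)
```

UTF-8 is tried first because it is strict: text in a legacy Greek encoding almost never happens to be valid UTF-8.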



-- 
Steven


Re: Unicode issue with Python v3.3

2013-04-11 Thread Steven D'Aprano
On Thu, 11 Apr 2013 07:50:19 +, Steven D'Aprano wrote:

 On Thu, 11 Apr 2013 00:13:46 -0700, nagia.retsina wrote:
 
 Since we now know the problem, maybe we can tell metrites.py to open
 index.html using UTF-8 encoding rather than as binary, don't you think?
 
 What makes you think it is UTF-8?
 
 Last time you tried decoding content as UTF-8, you got an error that it
 wasn't a legal UTF-8 file.

Oops, sorry, correction. It wasn't a legal UTF-8 string. It was an 
environment variable that was causing the decoding error, since it 
contained illegal bytes for a UTF-8 string.
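
The failing value is not shown in this message, so the bytes below are a hypothetical stand-in. The sketch only illustrates what "illegal bytes for a UTF-8 string" means, and how the surrogateescape error handler lets such bytes survive a round trip:

```python
raw = b"caf\xe9"  # hypothetical value: Latin-1 bytes, not valid UTF-8

try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    print("strict decode failed:", exc.reason)

# surrogateescape maps each undecodable byte to a lone surrogate code point,
# so the original bytes can be recovered later.
text = raw.decode("utf-8", errors="surrogateescape")
print(ascii(text))

round_trip = text.encode("utf-8", errors="surrogateescape")
print(round_trip == raw)
```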


 Where does index.html come from? Whatever program generates that, you
 need to find out what encoding it is using.


-- 
Steven


Re: Unicode issue with Python v3.3

2013-04-11 Thread nagia . retsina
On Thursday, 11 April 2013 11:20:47 AM UTC+3, Steven D'Aprano wrote:
 On Thu, 11 Apr 2013 07:50:19 +, Steven D'Aprano wrote:
 
  On Thu, 11 Apr 2013 00:13:46 -0700, nagia.retsina wrote:
  
  Since we now know the problem, maybe we can tell metrites.py to open
  index.html using UTF-8 encoding rather than as binary, don't you think?
  
  What makes you think it is UTF-8?
  
  Last time you tried decoding content as UTF-8, you got an error that it
  wasn't a legal UTF-8 file.
 
 Oops, sorry, correction. It wasn't a legal UTF-8 string. It was an 
 environment variable that was causing the decoding error, since it 
 contained illegal bytes for a UTF-8 string.
 
  Where does index.html come from? Whatever program generates that, you
  need to find out what encoding it is using.

Hello Steven, index.html was hand-coded by me using HTML + CSS.

metrites.py tries to open that script, so we must tell it to open it as UTF-8 text 
and not as a binary file.

How can we do that?


Re: Unicode issue with Python v3.3

2013-04-11 Thread Lele Gaifax
nagia.rets...@gmail.com writes:

 metrites.py tries to open that script so we must tell it to open as
 utf-8 text and not as a binary file.

One way is the following:

    from codecs import open

    with open('index.html', encoding='utf-8') as f:
        content = f.read()

ciao, lele.
-- 
nickname: Lele Gaifax | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas | comincerò ad aver paura di chi mi copia.
l...@metapensiero.it  | -- Fortunato Depero, 1929.



Re: Unicode issue with Python v3.3

2013-04-11 Thread Cameron Simpson
On 10Apr2013 21:50, nagia.rets...@gmail.com nagia.rets...@gmail.com wrote:
| Firstly, thank you for taking a look into the code.
| the doctype is coming from the attempt of script metrites.py to open and read the 'index.html' file.
| But I don't know how to try to open it as a byte file instead of a text file.

I think you've got it backwards. It looks like metrites.py has
opened the file as bytes instead of as text (probably utf8, but
that remains to be seen). Because it has opened it in binary mode
you're getting bytes when you read from the file.

Can you show the relevant code that opens the files and reads from
it, and the print statement that is putting it back out?

You probably need to ensure that metrites.py is opening it as text,
with the correct encoding.  Note that the encoding is nothing to
do with your _output_. It is the encoding of the data in the file
you are reading, and that is dictated by the editor used to make
the file.

Anyway, code first. What does it look like?

Cheers,
-- 
Cameron Simpson c...@zip.com.au

Six trillion RFID tags is four orders of magnitude bigger than any electronic 
item ever made.
- overheard by WIRED at the Intelligent Printing conference Oct2006


Re: Unicode issue with Python v3.3

2013-04-11 Thread nagia . retsina
Of course, here is what it looks like:

    if page.endswith('.html'):
        f = open( "/home/nikos/www/" + page, encoding="utf-8" )
        htmldata = f.read()
        htmldata = htmldata % (quote, music)

        counter = ''' <center>
                      <a href="mailto:supp...@superhost.gr"> <img src="/data/images/mail.png"> </a>
                      <table border=2 cellpadding=2 bgcolor=black>
                          <td><font color=lime>Αριθμός Επισκεπτών</td>
                          <td><a href="http://superhost.gr/?show=log&page=%s"><font color=yellow> %d </td>
                      </table><br>
                  ''' % (page, data[0])

        template = htmldata + counter
        print( template )


Re: Unicode issue with Python v3.3

2013-04-11 Thread Nikos
On Thursday, 11 April 2013 1:45:22 PM UTC+3, Cameron Simpson wrote:
 On 10Apr2013 21:50, nagia.rets...@gmail.com nagia.rets...@gmail.com wrote:
 | Firstly, thank you for taking a look into the code.
 | the doctype is coming from the attempt of script metrites.py to open and read the 'index.html' file.
 | But I don't know how to try to open it as a byte file instead of a text file.
 
 I think you've got it backwards. It looks like metrites.py has
 opened the file as bytes instead of as text (probably utf8, but
 that remains to be seen). Because it has opened it in binary mode
 you're getting bytes when you read from the file.
 
 Can you show the relevant code that opens the files and reads from
 it, and the print statement that is putting it back out?
 
 You probably need to ensure that metrites.py is opening it as text,
 with the correct encoding.  Note that the encoding is nothing to
 do with your _output_. It is the encoding of the data in the file
 you are reading, and that is dictated by the editor used to make
 the file.


This works in the shell, but doesn't work on my website:

$ cat utf8.txt
υλικό!Πρόκειται γ
$ python3
Python 3.2.3 (default, Oct 19 2012, 20:10:41)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> data = open('utf8.txt').read()
>>> print(data)
υλικό!Πρόκειται γ

>>> print(data.encode('utf-8'))
b'\xcf\x85\xce\xbb\xce\xb9\xce\xba\xcf\x8c!\xce\xa0\xcf\x81\xcf\x8c\xce\xba\xce\xb5\xce\xb9\xcf\x84\xce\xb1\xce\xb9 \xce\xb3\n'

See, the last line is what I am getting on my website. If I remove the 
encode('utf-8') part in metrites.py, the webpage will not show anything at 
all...


Re: People in the python community [was Re: Unicode issue with Python v3.3]

2013-04-11 Thread Michael Torrie
On 04/10/2013 10:50 AM, Νίκος Γκρ33κ wrote:
 I am not sure I follow you. How did my topic change?! Is this
 possible?

This is a mailing list/nntp newsgroup.  The subject line can be changed
arbitrarily by anyone replying to another message.  Normally this is
done to indicate a natural progression of the conversation in a new
direction.  In this case, Steven D'Aprano wrote a reply that did not
answer your pleas, but instead made some observations, and so he changed
the subject line to reflect that.

If you read your messages using a threaded message display, this will
make more sense to you.  But if you use Gmail's (or Google's) broken
conversation view, then this information about who is responding to whom
does get lost--actually in conversation view a lot of information about
the message flow is lost; it really is unfortunate that this way of
communicating has become so widespread.

 How about the code I posted at pastebin.com? Did anyone by any chance
 have a look at it?

 It's only a single thing I am missing for the encoding and then the
 script will load properly with python 3.3

I'm truly sorry, but I simply do not have the time to do so.


Re: Unicode issue with Python v3.3

2013-04-11 Thread nagia . retsina
Well, can somebody else propose something, please?

I have pasted the whole script and even the relevant snippet that is perhaps 
causing this encoding confusion in 3.3


Re: Unicode issue with Python v3.3

2013-04-11 Thread alex23
On Apr 12, 2:36 pm, nagia.rets...@gmail.com wrote:
 Well, can somebody else propose something, please?

Pay for a professional.


Re: Unicode issue with Python v3.3

2013-04-10 Thread rusi
On Apr 10, 10:06 am, rusi rustompm...@gmail.com wrote:
 An interesting case of two threads:

 On Apr 10, 9:46 am, Chris Angelico ros...@gmail.com wrote:

  On Wed, Apr 10, 2013 at 2:25 PM, Steven D'Aprano
   Obviously you know what the problem is much better than the Python
   interpreter.

  I just went to the page and it started playing sound. Between that and
  this arrogant refusal to believe either the interpreter or the people
  who are freely donating time to assist, I'm done. No more looking at
  Nikos's home page to try to figure out his problems. Have fun, Nikos.

  ChrisA

 Some swans are black
 Some homo sapiens have negative IQ

Hmm I see some cut-paste goofup on my part.
I was meaning to juxtapose this thread where we put up with inordinate
amount of nonsense from OP
along with the recent thread in which a newcomer who thinks he has
found a bug in pdb is made fun of.

Then thought better of it and deleted the stuff.
However I did not do a good delete-job so I better now say what I
avoided saying:

If those who habitually post rubbish are given much of our time and
effort,
whereas newcomers and first-timers are treated rudely, the list begins
to smell like a club of old farts.


Re: Unicode issue with Python v3.3

2013-04-10 Thread Antoine Pitrou
rusi rustompmody at gmail.com writes:
 
 Hmm I see some cut-paste goofup on my part.
 I was meaning to juxtapose this thread where we put up with inordinate
 amount of nonsense from OP
 along with the recent thread in which a newcomer who thinks he has
 found a bug in pdb is made fun of.
 
 Then thought better of it and deleted the stuff.
 However I did not do a good delete-job so I better now say what I
 avoided saying:
 
 If those who habitually post rubbish are given much of our time and
 effort,
 whereas newcomers and first-timers are treated rudely, the list begins
 to smell like a club of old farts.

+1. If you think you have something intelligent to say to jmfauth,
you might as well start a private discussion with him.

As far as I'm concerned, python-list is *already* a club of old
farts. Many regular posters are more interested in being right on the
Internet rather than helping people out.

(this is where the StackOverflow mechanics probably work better, sadly)

Regards

Antoine.




Re: Unicode issue with Python v3.3

2013-04-10 Thread nagia . retsina
On Wednesday, 10 April 2013 7:25:21 AM UTC+3, Steven D'Aprano wrote:

 What does os.environ['REMOTE_ADDR'] give? Until you answer that question, 
 you won't make any progress.

I insist, Steven.

Look at what 'python3 metrites.py' gives me

<!-- The above is a description of an error in a Python program, formatted
     for a Web browser because the 'cgitb' module was enabled.  In case you
     are not reading this in a Web browser, here is the original traceback:

Traceback (most recent call last):
  File "metrites.py", line 34, in <module>
    userinfo = os.environ['HTTP_USER_AGENT']
  File "/root/.local/lib/python2.7/lib/python3.3/os.py", line 669, in __getitem__
    value = self._data[self.encodekey(key)]
KeyError: b'HTTP_USER_AGENT'

-->



Re: Unicode issue with Python v3.3

2013-04-10 Thread Νίκος Γκρ33κ
Here is the whole code for metrites.py in case someone wants to take a look.

Everything is correct after altering it to meet python 3.3, everything apart 
from the weird unicode error thing.

http://pastebin.com/5Mpjx5Fd

please take a look.
Thank you. 


Re: Unicode issue with Python v3.3

2013-04-10 Thread Steven D'Aprano
On Tue, 09 Apr 2013 23:04:35 -0700, rusi wrote:

 Hmm I see some cut-paste goofup on my part. I was meaning to juxtapose
 this thread where we put up with inordinate amount of nonsense from OP
 along with the recent thread in which a newcomer who thinks he has found
 a bug in pdb is made fun of.

Curious. Is this making fun of the newcomer?

  If you are able to supply more details, we might be able to
  follow up on the registration problem.  And,  as someone else
  suggested, you could post the details of the pdb problem here.
  Note, there are already a number of currently open issues with
  pdb reported on the bug tracker. If you haven't already, you
  could search for pdb and see if your problem has been reported.
  Thanks for bringing the problem(s) up!


Or perhaps this is making fun of them?

  Post the 10-line program here, so others can verify whether it is a bug.


I think it is quite unfair of you to mischaracterise the entire community 
response in this way. One person made a light-hearted, silly, unhelpful 
response. (As sarcasm, I'm afraid it missed the target.) Two people made 
good, sensible responses -- and you were not either of them.

If you want to be helpful, how about leading by example and taking on 
some of the less coherent newbie questions, instead of just bitching that 
others don't? It's easy, and a pleasure, to give good answers to well-
written, carefully thought out questions. It's much harder to do the same 
for those questions which are... shall we say... less optimal. We could 
do with a few more people who make an effort to be helpful and friendly, 
instead of scolds who just tell us off when we stumble.



 Then thought better of it and deleted the stuff. However I did not do a
 good delete-job so I better now say what I avoided saying:
 
 If those who habitually post rubbish are given much of our time and
 effort,
 whereas newcomers and first-timers are treated rudely, the list begins
 to smell like a club of old farts.


It's often the newcomers who are posting rubbish. Should we ignore them 
for posting rubbish, or welcome them for being newcomers?



-- 
Steven


People in the python community [was Re: Unicode issue with Python v3.3]

2013-04-10 Thread Steven D'Aprano
On Wed, 10 Apr 2013 08:28:55 +, Steven D'Aprano wrote:

 If you want to be helpful, how about leading by example and taking on
 some of the less coherent newbie questions
[...]


On that note, I think I'll take the opportunity to give thanks to Peter 
Otten, who (if I remember correctly) has been here for longer than I 
have, and I've been here for a long time. In all that time, I don't think 
I've ever seen him snap at or be rude to anyone, not even those who 
deserved it, and he doesn't shy away from answering even the most poorly 
written questions.


Peter, I don't know how you do it, but you're doing a fantastic job.



-- 
Steven


Re: People in the python community [was Re: Unicode issue with Python v3.3]

2013-04-10 Thread Mark Lawrence

On 10/04/2013 09:34, Steven D'Aprano wrote:


On that note, I think I'll take the opportunity to give thanks to Peter
Otten, who (if I remember correctly) has been here for longer than I
have, and I've been here for a long time. In all that time, I don't think
I've ever seen him snap at or be rude to anyone, not even those who
deserved it, and he doesn't shy away from answering even the most poorly
written questions.


Peter, I don't know how you do it, but you're doing a fantastic job.



Seconded.  For those who don't know Peter is always responding to 
queries on the tutor mailing list as well.  Definite case of the 
patience of a saint.


--
If you're using GoogleCrap™ please read this 
http://wiki.python.org/moin/GoogleGroupsPython.


Mark Lawrence



Re: People in the python community [was Re: Unicode issue with Python v3.3]

2013-04-10 Thread Νίκος Γκρ33κ
 os.environ['HTTP_USER_AGENT'] is only set when running from browser.

so i faked it by using:

userinfo = os.environ.get('HTTP_USER_AGENT', 'some default')

but the encoding issues are still there.
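
For reference, the difference between the two lookups can be sketched as follows. The default string is just the placeholder used above, and the variable is removed first to simulate running from a shell rather than under the web server:

```python
import os

# Simulate a shell run, where the web server has not set the variable.
os.environ.pop("HTTP_USER_AGENT", None)

try:
    os.environ["HTTP_USER_AGENT"]
except KeyError:
    print("plain indexing raises KeyError outside the web server")

# .get() returns the fallback instead of raising.
userinfo = os.environ.get("HTTP_USER_AGENT", "some default")
print(userinfo)
```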


Re: People in the python community [was Re: Unicode issue with Python v3.3]

2013-04-10 Thread Νίκος Γκρ33κ
Thank you, I just altered it but I still get the same encoding issues.

Please, it's only a matter of a simple alteration that I am not able to see.

When you have the time, please take a look.

Thank you!


Re: People in the python community [was Re: Unicode issue with Python v3.3]

2013-04-10 Thread Peter Otten
Steven D'Aprano wrote:

 On Wed, 10 Apr 2013 08:28:55 +, Steven D'Aprano wrote:
 
 If you want to be helpful, how about leading by example and taking on
 some of the less coherent newbie questions
 [...]
 
 
 On that note, I think I'll take the opportunity to give thanks to Peter
 Otten, who (if I remember correctly) has been here for longer than I
 have, and I've been here for a long time. In all that time, I don't think
 I've ever seen him snap at or be rude to anyone, not even those who
 deserved it, and he doesn't shy away from answering even the most poorly
 written questions.
 
 
 Peter, I don't know how you do it, but you're doing a fantastic job.

Thank you :)



Re: People in the python community [was Re: Unicode issue with Python v3.3]

2013-04-10 Thread Peter Otten
Mark Lawrence wrote:

 On 10/04/2013 09:34, Steven D'Aprano wrote:

 On that note, I think I'll take the opportunity to give thanks to Peter
 Otten, who (if I remember correctly) has been here for longer than I
 have, and I've been here for a long time. In all that time, I don't think
 I've ever seen him snap at or be rude to anyone, not even those who
 deserved it, and he doesn't shy away from answering even the most poorly
 written questions.


 Peter, I don't know how you do it, but you're doing a fantastic job.

 
 Seconded.  For those who don't know Peter is always responding to
 queries on the tutor mailing list as well.  Definite case of the
 patience of a saint.

You're invited as a speaker to my funeral ;)



Re: People in the python community [was Re: Unicode issue with Python v3.3]

2013-04-10 Thread Νίκος Γκρ33κ
Anyone please?


Re: People in the python community [was Re: Unicode issue with Python v3.3]

2013-04-10 Thread Mark Lawrence

On 10/04/2013 15:43, Νίκος Γκρ33κ wrote:

Anyone please?



I have already shown my support for Peter Otten on this thread.  Are you 
asking for more people to do so?


--
If you're using GoogleCrap™ please read this 
http://wiki.python.org/moin/GoogleGroupsPython.


Mark Lawrence



Re: People in the python community [was Re: Unicode issue with Python v3.3]

2013-04-10 Thread Chris Angelico
On Thu, Apr 11, 2013 at 1:15 AM, Mark Lawrence breamore...@yahoo.co.uk wrote:
 On 10/04/2013 15:43, Νίκος Γκρ33κ wrote:

 Anyone please?


 I have already shown my support for Peter Otten on this thread.  Are you
 asking for more people to do so?

Sure, I can! He's one of the people who keeps this list/ng productive
and helpful. People can come here with Python problems and get Python
solutions.

(I wouldn't normally "me too" a thread, but hey, with that opening!)

ChrisA


Re: People in the python community [was Re: Unicode issue with Python v3.3]

2013-04-10 Thread Νίκος Γκρ33κ
I am not sure I follow you.
How did my topic change?! Is this possible?

How about the code I posted at pastebin.com?
Did anyone by any chance have a look at it?

It's only a single thing I am missing for the encoding and then the script will 
load properly with python 3.3


Re: Unicode issue with Python v3.3

2013-04-10 Thread Nobody
On Wed, 10 Apr 2013 00:23:46 -0700, nagia.retsina wrote:

 Look at what 'python3 metrites.py' gives me

   File "/root/.local/lib/python2.7/lib/python3.3/os.py", line 669, ...
                                ^^^           ^^^




Re: Unicode issue with Python v3.3

2013-04-10 Thread Νίκος Γκρ33κ
On Wednesday, 10 April 2013 9:08:38 PM UTC+3, Nobody wrote:
 On Wed, 10 Apr 2013 00:23:46 -0700, nagia.retsina wrote:
 
  Look at what 'python3 metrites.py' gives me
 
    File "/root/.local/lib/python2.7/lib/python3.3/os.py", line 669, ...
                                 ^^^           ^^^

Yes, I see it in the traceback but I don't know what it means.
Please explain it to me.
Thank you.


Re: Unicode issue with Python v3.3

2013-04-10 Thread Ian Kelly
On Wed, Apr 10, 2013 at 12:25 PM, Νίκος Γκρ33κ nikos.gr...@gmail.com wrote:
 On Wednesday, 10 April 2013 9:08:38 PM UTC+3, Nobody wrote:
 On Wed, 10 Apr 2013 00:23:46 -0700, nagia.retsina wrote:

  Look at what 'python3 metrites.py' gives me

    File "/root/.local/lib/python2.7/lib/python3.3/os.py", line 669, ...
                                 ^^^           ^^^

 Yes, I see it in the traceback but I don't know what it means.
 Please explain it to me.
 Thank you.

It means that there is something very strange about the way that your
Python 3.3 is installed, as the libraries appear to be installed under
your Python 2.7 library directory.
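
A quick way to see where a given interpreter actually lives, and where it loads its standard library from, is a sketch like this one. Run it with the same python3 that runs the script; the paths printed will of course differ per machine:

```python
import os
import sys

# Which interpreter is running, and where did it find its standard library?
print(sys.version_info[:2])   # (major, minor) of the running interpreter
print(sys.executable)         # path of the interpreter binary
print(sys.prefix)             # installation prefix
print(os.__file__)            # location of the os module that was imported
```

If the version printed and the directory names in the last line disagree, as they do in the traceback quoted above, the installation layout deserves a closer look.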


Re: Unicode issue with Python v3.3

2013-04-10 Thread Arnaud Delobelle
On 10 April 2013 09:28, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 On Tue, 09 Apr 2013 23:04:35 -0700, rusi wrote:
[...]
 I think it is quite unfair of you to mischaracterise the entire community
 response in this way. One person made a light-hearted, silly, unhelpful
 response. (As sarcasm, I'm afraid it missed the target.) Two people made
 good, sensible responses -- and you were not either of them.

Enough already with the thought police.

It was me who made the silly reply to the guy who was ranting about
everything being broken, giving us nothing to go on, ending his
message with an edifying and, in my judgement, largely rhetorical
"Suggestions?".  So I gave him some silly suggestions (*not* intended
to be sarcasm), and I'm not apologising for it.  At least I'm not
presuming to take the moral high ground at every half-opportunity.

Recently I gave a very quick reply to someone who was wondering why he
couldn't get the docstring from his descriptor - I didn't have the
time to expand because two of my kids had jumped on my knees almost as
soon as I'd got on the computer.  I decided to post the reply anyway
as I thought it would give the OP something to get started on and
nobody else seemed to have replied so far - but I got remonstrated for
not being complete enough in my reply!  What is that about?

AFAIK, this is not Python Customer Service, but a place for people who
are interested in Python to discuss problems and *freely* exchange
thoughts about the language and its ecosystem.  Over the years I've
posted the occasional silly message but I think my record is
overwhelmingly that I've tried to be helpful, and when I've needed
some help myself, I've got some great advice.  My first question on
this list was answered by Alex Martelli and nowadays I get most
excellent and concise tips from Peter Otten - thanks, Peter! If
there's one person on this list I don't want to offend, it's you!

So here's to lots more good and bad humour on this list, and the
occasional slightly un-pc remark even!

Cheers,

-- 
Arnaud
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Unicode issue with Python v3.3

2013-04-10 Thread Cameron Simpson
On 10Apr2013 01:06, Νίκος Γκρ33κ nikos.gr...@gmail.com wrote:
| Here is the whole code for metrites.py in case someone wants to take a look.
| 
| Everything is correct after altering it to meet Python 3.3,
| everything apart from the weird Unicode error thing.
| 
| http://pastebin.com/5Mpjx5Fd
| 
| please take a look.

From looking at the HTML source of the page:

  http://superhost.gr/

I see near the start:

  b'<!DOCTYPE html

I'd say you have a bytes object that you've fed to print().
In python2, str is effectively bytes.
In python3, str is a sequence of Unicode code points, and bytes are
arrays of small integers.
If you feed a bytes object to print() it will print a string representation
of it, starting with b'
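
The difference can be seen in a few lines. This is only a sketch, assuming
Python 3; the variable names are made up for illustration:

```python
page = b"<!DOCTYPE html>"          # bytes, e.g. read from a file opened with "rb"

# print() falls back to the repr of a bytes object, so the b'...' wrapper
# ends up in the output.
as_printed = str(page)

# Decoding first yields a real str, which prints without the wrapper.
decoded = page.decode("utf-8")

print(as_printed)   # b'<!DOCTYPE html>'
print(decoded)      # <!DOCTYPE html>
```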

The question is: where did the bytes object come from? A cursory
glance through your pastebin code doesn't show me anything very
obvious.

I'd start by asking: where does the string <!DOCTYPE come from?
Wherever that is, it seems to be bytes rather than str.
Start with that.

Cheers,
-- 
Cameron Simpson c...@zip.com.au

You don't have to live on the edge, but you have to know where it is.
- Scott Lilliott, c...@swl.msd.ray.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Unicode issue with Python v3.3

2013-04-10 Thread nagia . retsina
Firstly, thank you for taking a look at the code.

The doctype comes from the script metrites.py trying to open and read
the 'index.html' file.

But I don't know how to open it as a bytes file instead of a text file.
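
For reference, the mode passed to open() decides whether reading gives bytes
or str. A minimal sketch, assuming Python 3; the temporary file only stands
in for index.html so that the example is self-contained:

```python
import os
import tempfile

# Hypothetical stand-in for index.html, written as UTF-8 bytes.
path = os.path.join(tempfile.mkdtemp(), "index.html")
with open(path, "wb") as f:
    f.write("<!DOCTYPE html>\n<p>Καλημέρα</p>\n".encode("utf-8"))

with open(path, "rb") as f:                # binary mode: read() returns bytes
    raw = f.read()

with open(path, encoding="utf-8") as f:    # text mode: read() returns str
    text = f.read()

print(type(raw).__name__, type(text).__name__)
```

Printing raw would show the b'...' form, while printing text shows the page
itself.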
-- 
http://mail.python.org/mailman/listinfo/python-list


Unicode issue with Python v3.3

2013-04-09 Thread Νίκος Γκρ33κ
Hello, I am still trying to convert the code from Python 2.6 to 3.3

Everything is set up except for the Unicode error that you can see if you go to
http://superhost.gr

Can anyone help with this?
I even tried replacing print() with sys.stdout.buffer() but I still get the
same Unicode issue.

I don't know what to try anymore.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Unicode issue with Python v3.3

2013-04-09 Thread Ian Kelly
On Tue, Apr 9, 2013 at 3:10 PM, Νίκος Γκρ33κ nikos.gr...@gmail.com wrote:
 Hello, I am still trying to convert the code from Python 2.6 to 3.3

 Everything is set up except for the Unicode error that you can see if you go to
 http://superhost.gr

 Can anyone help with this?
 I even tried replacing print() with sys.stdout.buffer() but I still get the
 same Unicode issue.

 I don't know what to try anymore.

It seems to be failing on the line:

host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]

So the obvious question to ask is: what are the contents of
os.environ['REMOTE_ADDR'] when this line is reached?
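
A cautious version of that lookup might look like the sketch below. It
assumes Python 3.3 or later, and the helper name remote_host is made up for
illustration; it is not taken from metrites.py:

```python
import os
import socket


def remote_host(environ=os.environ):
    """Return the client's host name, its address, or None if unknown."""
    addr = environ.get("REMOTE_ADDR")
    if not addr:
        # Not running under a web server, or the variable was not passed on.
        return None
    try:
        return socket.gethostbyaddr(addr)[0]
    except OSError:
        # socket.herror and socket.gaierror are OSError subclasses;
        # fall back to the bare address when the reverse lookup fails.
        return addr


print(repr(os.environ.get("REMOTE_ADDR")), remote_host())
```

Logging the repr of the variable, as in the last line, answers the question
directly.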

And why are you still trying to solve these sorts of problems on your
production website?  Do you not have a development or staging
environment?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Unicode issue with Python v3.3

2013-04-09 Thread nagia . retsina
On Wednesday, 10 April 2013 12:34:25 AM UTC+3, Ian wrote:
 On Tue, Apr 9, 2013 at 3:10 PM, Νίκος Γκρ33κ nikos.gr...@gmail.com wrote:
  Hello, I am still trying to convert the code from Python 2.6 to 3.3
 
  Everything is set up except for the Unicode error that you can see if you go
  to http://superhost.gr
 
  Can anyone help with this?
  I even tried replacing print() with sys.stdout.buffer() but I still get the
  same Unicode issue.
 
  I don't know what to try anymore.
 
 It seems to be failing on the line:
 
 host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]
 
 So the obvious question to ask is: what are the contents of
 os.environ['REMOTE_ADDR'] when this line is reached?
 
 And why are you still trying to solve these sorts of problems on your
 production website?  Do you not have a development or staging
 environment?

No, forget this line; this is not the problem.
No, I don't have a testing environment; I altered all the code from 2.6 to 3.3
in the live environment.

I strongly believe there is something going wrong with the print() calls. Those
are causing the Unicode issues, much like these changes from:

quote = random.choice( list( open( "/home/nikos/www/data/private/quotes.txt" ) ) )

to:

quote = random.choice( list( open( "/home/nikos/www/data/private/quotes.txt",
encoding="utf-8" ) ) )

in order for the open() to work.
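
The open() change described above can be tried in isolation. A minimal
sketch, assuming Python 3; the temporary file and its two sample lines are
invented and only stand in for quotes.txt:

```python
import os
import random
import tempfile

# Hypothetical stand-in for quotes.txt, written as UTF-8.
path = os.path.join(tempfile.mkdtemp(), "quotes.txt")
lines = ["Γνῶθι σεαυτόν\n", "Μηδὲν ἄγαν\n"]
with open(path, "w", encoding="utf-8") as f:
    f.writelines(lines)

# Naming the encoding makes the read independent of the locale that the
# web server happens to run the script under.
with open(path, encoding="utf-8") as f:
    quote = random.choice(list(f))

print(quote.strip())
```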
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Unicode issue with Python v3.3

2013-04-09 Thread Steven D'Aprano
On Tue, 09 Apr 2013 20:16:12 -0700, nagia.retsina wrote:

 On Wednesday, 10 April 2013 12:34:25 AM UTC+3, Ian wrote:
 On Tue, Apr 9, 2013 at 3:10 PM, Νίκος Γκρ33κ nikos.gr...@gmail.com
 wrote:
 
  Hello, iam still trying to alter the code form python 2.6 = 3.3
 
  Everyrging its setup except that unicode error that you can see if
  you go to http://superhost.gr
 
  Can anyone help with this?
 
  I even tried to change print() with sys.stdout.buffer() but still i
  get the same unicode issue.
 
  I don't know what to try anymore.
 
 
 
 It seems to be failing on the line:
 
 host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]
 
 So the obvious question to ask is: what are the contents of
 
 os.environ['REMOTE_ADDR'] when this line is reached?
[...]

 No forget this line. this is not the problem. No i don't have  a testing
 enviroment, i altered all the code form 2.6 to 3.3 in the live
 enviromtnt.
 
 i strongly believe there is somethign goind wrong with the prints().


Obviously you know what the problem is much better than the Python 
interpreter.

I suggest you open a bug report:

Errors printing bytes are wrongly claimed to be socket errors

and see what happens.

Or, you can listen to people who actually know what they are talking 
about, and look at the actual error, which has NOTHING to do with print.

What does os.environ['REMOTE_ADDR'] give? Until you answer that question, 
you won't make any progress.



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Unicode issue with Python v3.3

2013-04-09 Thread Chris Angelico
On Wed, Apr 10, 2013 at 2:25 PM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 On Tue, 09 Apr 2013 20:16:12 -0700, nagia.retsina wrote:

 On Wednesday, 10 April 2013 12:34:25 AM UTC+3, Ian wrote:
 On Tue, Apr 9, 2013 at 3:10 PM, Νίκος Γκρ33κ nikos.gr...@gmail.com
 wrote:

  Hello, iam still trying to alter the code form python 2.6 = 3.3
 
  Everyrging its setup except that unicode error that you can see if
  you go to http://superhost.gr
 
  Can anyone help with this?

  I even tried to change print() with sys.stdout.buffer() but still i
  get the same unicode issue.
 
  I don't know what to try anymore.



 It seems to be failing on the line:

 host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]

 So the obvious question to ask is: what are the contents of

 os.environ['REMOTE_ADDR'] when this line is reached?
 [...]

 No forget this line. this is not the problem. No i don't have  a testing
 enviroment, i altered all the code form 2.6 to 3.3 in the live
 enviromtnt.

 i strongly believe there is somethign goind wrong with the prints().


 Obviously you know what the problem is much better than the Python
 interpreter.

I just went to the page and it started playing sound. Between that and
this arrogant refusal to believe either the interpreter or the people
who are freely donating time to assist, I'm done. No more looking at
Nikos's home page to try to figure out his problems. Have fun, Nikos.

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Unicode issue with Python v3.3

2013-04-09 Thread rusi
An interesting case of two threads:

On Apr 10, 9:46 am, Chris Angelico ros...@gmail.com wrote:
 On Wed, Apr 10, 2013 at 2:25 PM, Steven D'Aprano

  Obviously you know what the problem is much better than the Python
  interpreter.

 I just went to the page and it started playing sound. Between that and
 this arrogant refusal to believe either the interpreter or the people
 who are freely donating time to assist, I'm done. No more looking at
 Nikos's home page to try to figure out his problems. Have fun, Nikos.

 ChrisA

Some swans are black
Some homo sapiens have negative IQ
-- 
http://mail.python.org/mailman/listinfo/python-list


[issue6077] Unicode issue with tempfile on Windows

2009-11-29 Thread Amaury Forgeot d'Arc

Amaury Forgeot d'Arc amaur...@gmail.com added the comment:

Fixed with r76593 (py3k) and r76594 (release31-maint)

--
resolution: accepted -> fixed
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6077
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue6077] Unicode issue with tempfile on Windows

2009-11-20 Thread Antoine Pitrou

Changes by Antoine Pitrou pit...@free.fr:


--
components: +IO -Library (Lib)
priority:  -> normal
stage:  -> patch review
versions: +Python 3.1, Python 3.2 -Python 3.0

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6077
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue6077] Unicode issue with tempfile on Windows

2009-11-20 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

The patch looks ok to me.

--
assignee:  -> amaury.forgeotdarc
nosy: +pitrou
resolution:  -> accepted
stage: patch review -> commit review

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6077
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



Re: unicode issue

2009-10-06 Thread Gabriel Genellina
On Thu, 01 Oct 2009 12:10:58 -0300, Walter Dörwald wal...@livinglogic.de
wrote:

On 01.10.09 16:09, Hyuga wrote:

On Sep 30, 3:34 am, gentlestone tibor.b...@hotmail.com wrote:



_MAP = {
    # LATIN
    u'À': 'A', u'Á': 'A', u'Â': 'A', u'Ã': 'A', u'Ä': 'A', u'Å': 'A',
    u'Æ': 'AE', u'Ç':'C', [...long table...]
}

def downcode(name):
    """
    >>> downcode(u"Žabovitá zmiešaná kaša")
    u'Zabovita zmiesana kasa'
    """
    for key, value in _MAP.iteritems():
        name = name.replace(key, value)
    return name


import unicodedata

def downcode(name):
    return unicodedata.normalize("NFD", name)\
        .encode("ascii", "ignore")\
        .decode("ascii")


This article [1] shows a mixed technique, decomposing characters when such  
info is available in the Unicode tables, and also allowing for a custom  
mapping when not.
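
A rough sketch of such a mixed technique follows. It assumes Python 3, and
both the tiny CUSTOM_MAP and the function body are illustrative guesses, not
the code from the article:

```python
import unicodedata

# Characters with no useful Unicode decomposition get a hand-written entry.
CUSTOM_MAP = {"ß": "ss", "Æ": "AE", "æ": "ae", "Ø": "O", "ø": "o",
              "Ł": "L", "ł": "l"}


def downcode(name):
    out = []
    for ch in name:
        if ch in CUSTOM_MAP:
            out.append(CUSTOM_MAP[ch])
            continue
        # Split the character into base + combining marks, drop the marks.
        decomposed = unicodedata.normalize("NFD", ch)
        out.append("".join(c for c in decomposed
                           if not unicodedata.combining(c)))
    return "".join(out)


print(downcode("Žabovitá zmiešaná kaša"))
print(downcode("Straße"))
```

Characters that are neither in the custom map nor decomposable (Greek or
Cyrillic letters, for instance) pass through unchanged in this sketch.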


[1] http://effbot.org/zone/unicode-convert.htm

--
Gabriel Genellina

--
http://mail.python.org/mailman/listinfo/python-list


Re: unicode issue

2009-10-01 Thread gentlestone
save in utf-8 the coding declaration also has to be utf-8

OK, I understand, but what's the problem? Unfortunately, it seems that
the Python interactive mode doesn't have Unicode support. It recognizes
the latin-1 encoding only.

So I have 2 options for writing the doctest:
1. Replace native characters with their escaped representation, like
u"\u017dabovit\xe1 zmie\u0161an\xe1 ka\u0161a" instead of u"Žabovitá
zmiešaná kaša"
2. Declare the latin-1 encoding while the file is saved in utf-8

The first is bad because doctest is a great documentation tool and it
is probably the main reason I use Python. And something like
u"\u017dabovit\xe1 zmie\u0161an\xe1 ka\u0161a" is not the best
documentation style. But the tests work.

The second is bad because the declaration is incorrect, and if I use
it in a Django model declaration, for example, I get bad data in the
application.

So what is the solution? Back to Java? :-)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: unicode issue

2009-10-01 Thread Dave Angel

gentlestone wrote:

save in utf-8 the coding declaration also has to be utf-8



ok, I understand, but what's the problem? Unfortunately seems to be
the Python interactive
mode doesn't have unicode support. It recognize the latin-1 encoding
only.

So I have 2 options, how to write doctest:
1. Replace native charaters with their encoded representation like
u\u017dabovit\xe1 zmie\u0161an\xe1 ka\u0161a instead of uŽabovitá
zmiešaná kaša
2. Use latin-1 encoding, where the file is saved in utf-8

The first is bad because doctest is a great documenttion tool and it
is propably the main reason I use python. And something like
u\u017dabovit\xe1 zmie\u0161an\xe1 ka\u0161a is not a best
documentation style. But the tests work.

The second is bad, because the declaration is incorrect and if I use
it in Django model declaration for example I got bad data in the
application.

So what is the solution? Back to Java? :-)

  
Wait -- don't give up yet.  Since I'm one of the ones who (partially) 
steered you wrong, let me try to help.


Key variable here is how your text editor behaves.  Since I've never 
taken my (programming) text editor out of ASCII mode before this week, 
it took some experimenting (and more importantly a message from Piet on 
this thread) to make sense of things.  I think I now know how to make my 
own editor (Komodo IDE) behave in this environment, and you probably can 
do as well or better.  In fact, judging from your messages, you probably 
are doing much better on the editor front.


When I tried this morning to re-open that test file from yesterday, many 
of the characters were all messed up.  I was okay as long as the project 
was still open, but not today.  The editor itself apparently looks to 
that encoding declaration when it's deciding how to interpret the bytes 
on disk.


So I did the following, using Komodo IDE.  I created a new file in the 
project.  Before saving it, I used 
Edit->CurrentFileSettings->Properties->Encoding to set it to UTF-8.  
*NOW* I pasted the stuff from your email message.  And added the

#-*- coding: utf-8 -*-

as the second line of the file.   Notice it's *NOT* latin-1.

At this point I save and run the file, and it seems to work fine.

My guess is that I could set these as default settings in Komodo, if I 
were doing UTF-8 very often, and it would become painless.  I know I 
have certain stuff in my python template, and could add that encoding 
line as well.



Anyway, that gets us to the step of running the doctest.  The trick here 
seems to be that we need to define the docstring as a Unicode docstring 
to have it interpreted correctly.  Try adding the u in front of the 
triple quote as follows:


def downcode(name):
    u"""
    >>> downcode(u"Žabovitá zmiešaná kaša")
    u'Zabovita zmiesana kasa'
    """
    for key, value in _MAP.iteritems():
        name = name.replace(key, value)
    return name

Now, if the doctest passes, we seem to be in good shape.

There's another problem, that hopefully somebody else can help with.  
That's if doctest needs to report an error.  When I deliberately changed 
the expect string I get an error like the following.


UnicodeEncodeError: 'ascii' codec can't encode character u'\u017d' in position 150: ordinal not in range(128)

I get a similar error if running the -v option on doctest.   (Note that 
I do *NOT* get the error when running inside Komodo.  And what I've read 
implies that the same would be true if running inside IDLE.)  The 
problem is similar to the one you'd have doing a simple:


   print u"\u017d"

I think these are avoided if  sys.stdout.encoding (and maybe 
sys.stderr.encoding) are set to utf-8.  On my system they're set to 
None, which says to use the system default encoding.  On my system 
that would be ASCII, so I get the error.  But perhaps yours is already 
something better.
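
A quick way to see what a given system uses is to print the encodings of the
standard streams. A minimal sketch, assuming Python 3 (the attribute exists
in Python 2.6 as well); the helper name stream_encodings is made up:

```python
import sys


def stream_encodings():
    """Report the encoding attached to stdout and stderr, if any."""
    return {
        "stdout": getattr(sys.stdout, "encoding", None),
        "stderr": getattr(sys.stderr, "encoding", None),
    }


print(stream_encodings())
```

The PYTHONIOENCODING environment variable, available since Python 2.6, is
another way to override these values without touching the script.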


I found links:  
http://drj11.wordpress.com/2007/05/14/python-how-is-sysstdoutencoding-chosen/

http://wiki.python.org/moin/PrintFails

http://lists.macromates.com/textmate/2008-June/025735.html
  which indicate you may want to try:  


set LC_CTYPE=en_GB.utf-8 python

at the command prompt before running python.  This could be system specific;  
it didn't work for me on XP.

The workaround that works for me (so far) is:

if __name__ == "__main__":
    import sys, codecs
    sys.stdout = codecs.getwriter('utf8')(sys.stdout)

    print u"Žabovitá zmiešaná kaša"
    import doctest
    doctest.testmod()

The codecs line tells python that stdout should use utf-8.  That doesn't make 
the characters look good on my console, but at least it avoids the errors.  I'm 
guessing that on my system I should use latin1 here instead of utf8.  But I 
don't want to confuse things.
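
What the codecs wrapper does can be seen without touching the real stdout.
A minimal sketch, assuming Python 3; the BytesIO object only stands in for a
byte-oriented output stream:

```python
import codecs
import io

raw = io.BytesIO()                          # stand-in for a byte stream
writer = codecs.getwriter("utf-8")(raw)     # same wrapping as in the workaround
writer.write("Žabovitá zmiešaná kaša\n")    # text goes in ...

encoded = raw.getvalue()                    # ... UTF-8 bytes come out
print(encoded)
```

Whether those bytes then look right depends on the console decoding them as
UTF-8 too, which matches the observation above.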


HTH

DaveA

--
http://mail.python.org/mailman/listinfo/python-list


Re: unicode issue

2009-10-01 Thread Hyuga
On Sep 30, 3:34 am, gentlestone tibor.b...@hotmail.com wrote:
 Why don't work this code on Python 2.6? Or how can I do this job?

 _MAP = {
     # LATIN
     u'À': 'A', u'Á': 'A', u'Â': 'A', u'Ã': 'A', u'Ä': 'A', u'Å': 'A',
 u'Æ': 'AE', u'Ç':'C',
     u'È': 'E', u'É': 'E', u'Ê': 'E', u'Ë': 'E', u'Ì': 'I', u'Í': 'I',
 u'Î': 'I',
     u'Ï': 'I', u'Ð': 'D', u'Ñ': 'N', u'Ò': 'O', u'Ó': 'O', u'Ô': 'O',
 u'Õ': 'O', u'Ö':'O',
     u'Ő': 'O', u'Ø': 'O', u'Ù': 'U', u'Ú': 'U', u'Û': 'U', u'Ü': 'U',
 u'Ű': 'U',
     u'Ý': 'Y', u'Þ': 'TH', u'ß': 'ss', u'à':'a', u'á':'a', u'â': 'a',
 u'ã': 'a', u'ä':'a',
     u'å': 'a', u'æ': 'ae', u'ç': 'c', u'è': 'e', u'é': 'e', u'ê': 'e',
 u'ë': 'e',
     u'ì': 'i', u'í': 'i', u'î': 'i', u'ï': 'i', u'ð': 'd', u'ñ': 'n',
 u'ò': 'o', u'ó':'o',
     u'ô': 'o', u'õ': 'o', u'ö': 'o', u'ő': 'o', u'ø': 'o', u'ù': 'u',
 u'ú': 'u',
     u'û': 'u', u'ü': 'u', u'ű': 'u', u'ý': 'y', u'þ': 'th', u'ÿ': 'y',
     # LATIN_SYMBOLS
     u'©':'(c)',
     # GREEK
     u'α':'a', u'β':'b', u'γ':'g', u'δ':'d', u'ε':'e', u'ζ':'z',
 u'η':'h', u'θ':'8',
     u'ι':'i', u'κ':'k', u'λ':'l', u'μ':'m', u'ν':'n', u'ξ':'3',
 u'ο':'o', u'π':'p',
     u'ρ':'r', u'σ':'s', u'τ':'t', u'υ':'y', u'φ':'f', u'χ':'x',
 u'ψ':'ps', u'ω':'w',
     u'ά':'a', u'έ':'e', u'ί':'i', u'ό':'o', u'ύ':'y', u'ή':'h',
 u'ώ':'w', u'ς':'s',
     u'ϊ':'i', u'ΰ':'y', u'ϋ':'y', u'ΐ':'i',
     u'Α':'A', u'Β':'B', u'Γ':'G', u'Δ':'D', u'Ε':'E', u'Ζ':'Z',
 u'Η':'H', u'Θ':'8',
     u'Ι':'I', u'Κ':'K', u'Λ':'L', u'Μ':'M', u'Ν':'N', u'Ξ':'3',
 u'Ο':'O', u'Π':'P',
     u'Ρ':'R', u'Σ':'S', u'Τ':'T', u'Υ':'Y', u'Φ':'F', u'Χ':'X',
 u'Ψ':'PS', u'Ω':'W',
     u'Ά':'A', u'Έ':'E', u'Ί':'I', u'Ό':'O', u'Ύ':'Y', u'Ή':'H',
 u'Ώ':'W', u'Ϊ':'I', u'Ϋ':'Y',
     # TURKISH
     u'ş':'s', u'Ş':'S', u'ı':'i', u'İ':'I', u'ç':'c', u'Ç':'C',
 u'ü':'u', u'Ü':'U',
     u'ö':'o', u'Ö':'O', u'ğ':'g', u'Ğ':'G',
     # RUSSIAN
     u'а':'a', u'б':'b', u'в':'v', u'г':'g', u'д':'d', u'е':'e',
 u'ё':'yo', u'ж':'zh',
     u'з':'z', u'и':'i', u'й':'j', u'к':'k', u'л':'l', u'м':'m',
 u'н':'n', u'о':'o',
     u'п':'p', u'р':'r', u'с':'s', u'т':'t', u'у':'u', u'ф':'f',
 u'х':'h', u'ц':'c',
     u'ч':'ch', u'ш':'sh', u'щ':'sh', u'ъ':'', u'ы':'y', u'ь':'',
 u'э':'e', u'ю':'yu', u'я':'ya',
     u'А':'A', u'Б':'B', u'В':'V', u'Г':'G', u'Д':'D', u'Е':'E',
 u'Ё':'Yo', u'Ж':'Zh',
     u'З':'Z', u'И':'I', u'Й':'J', u'К':'K', u'Л':'L', u'М':'M',
 u'Н':'N', u'О':'O',
     u'П':'P', u'Р':'R', u'С':'S', u'Т':'T', u'У':'U', u'Ф':'F',
 u'Х':'H', u'Ц':'C',
     u'Ч':'Ch', u'Ш':'Sh', u'Щ':'Sh', u'Ъ':'', u'Ы':'Y', u'Ь':'',
 u'Э':'E', u'Ю':'Yu', u'Я':'Ya',
     # UKRAINIAN
     u'Є':'Ye', u'І':'I', u'Ї':'Yi', u'Ґ':'G', u'є':'ye', u'і':'i',
 u'ї':'yi', u'ґ':'g',
     # CZECH
     u'č':'c', u'ď':'d', u'ě':'e', u'ň':'n', u'ř':'r', u'š':'s',
 u'ť':'t', u'ů':'u',
     u'ž':'z', u'Č':'C', u'Ď':'D', u'Ě':'E', u'Ň':'N', u'Ř':'R',
 u'Š':'S', u'Ť':'T', u'Ů':'U', u'Ž':'Z',
     # POLISH
     u'ą':'a', u'ć':'c', u'ę':'e', u'ł':'l', u'ń':'n', u'ó':'o',
 u'ś':'s', u'ź':'z',
     u'ż':'z', u'Ą':'A', u'Ć':'C', u'Ę':'e', u'Ł':'L', u'Ń':'N',
 u'Ó':'o', u'Ś':'S',
     u'Ź':'Z', u'Ż':'Z',
     # LATVIAN
     u'ā':'a', u'č':'c', u'ē':'e', u'ģ':'g', u'ī':'i', u'ķ':'k',
 u'ļ':'l', u'ņ':'n',
     u'š':'s', u'ū':'u', u'ž':'z', u'Ā':'A', u'Č':'C', u'Ē':'E',
 u'Ģ':'G', u'Ī':'i',
     u'Ķ':'k', u'Ļ':'L', u'Ņ':'N', u'Š':'S', u'Ū':'u', u'Ž':'Z'

 }

 def downcode(name):
     
      downcode(uŽabovitá zmiešaná kaša)
     u'Zabovita zmiesana kasa'
     
     for key, value in _MAP.iteritems():
         name = name.replace(key, value)
     return name

Though C Python is pretty optimized under the hood for this sort of
single-character replacement, this still seems pretty inefficient
since you're calling replace for every character you want to map.  I
think that a better approach might be something like:

def downcode(name):
    return ''.join(_MAP.get(c, c) for c in name)

Or using string.translate:

import string
def downcode(name):
    table = string.maketrans(
        'ÀÁÂÃÄÅ...',
        'AA...')
    return name.translate(table)
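
For Unicode strings, translate() takes a mapping keyed by code point rather
than two equal-length strings. A minimal sketch of that variant, assuming
Python 3; _SAMPLE_MAP is a made-up excerpt in the spirit of _MAP:

```python
# Keys must be single characters; values may be strings of any length.
_SAMPLE_MAP = {"Ž": "Z", "á": "a", "š": "s", "ß": "ss", "Æ": "AE"}

# str.maketrans() turns the dict into a table keyed by code point.
_TABLE = str.maketrans(_SAMPLE_MAP)


def downcode(name):
    # One pass over the string instead of one replace() call per key.
    return name.translate(_TABLE)


print(downcode("Žabovitá zmiešaná kaša"))
```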
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: unicode issue

2009-10-01 Thread Walter Dörwald
On 01.10.09 16:09, Hyuga wrote:
 On Sep 30, 3:34 am, gentlestone tibor.b...@hotmail.com wrote:
 Why don't work this code on Python 2.6? Or how can I do this job?

 _MAP = {
 # LATIN
 u'À': 'A', u'Á': 'A', u'Â': 'A', u'Ã': 'A', u'Ä': 'A', u'Å': 'A',
 u'Æ': 'AE', u'Ç':'C',
 u'È': 'E', u'É': 'E', u'Ê': 'E', u'Ë': 'E', u'Ì': 'I', u'Í': 'I',
 u'Î': 'I',
 u'Ï': 'I', u'Ð': 'D', u'Ñ': 'N', u'Ò': 'O', u'Ó': 'O', u'Ô': 'O',
 u'Õ': 'O', u'Ö':'O',
 u'Ő': 'O', u'Ø': 'O', u'Ù': 'U', u'Ú': 'U', u'Û': 'U', u'Ü': 'U',
 u'Ű': 'U',
 u'Ý': 'Y', u'Þ': 'TH', u'ß': 'ss', u'à':'a', u'á':'a', u'â': 'a',
 u'ã': 'a', u'ä':'a',
 u'å': 'a', u'æ': 'ae', u'ç': 'c', u'è': 'e', u'é': 'e', u'ê': 'e',
 u'ë': 'e',
 u'ì': 'i', u'í': 'i', u'î': 'i', u'ï': 'i', u'ð': 'd', u'ñ': 'n',
 u'ò': 'o', u'ó':'o',
 u'ô': 'o', u'õ': 'o', u'ö': 'o', u'ő': 'o', u'ø': 'o', u'ù': 'u',
 u'ú': 'u',
 u'û': 'u', u'ü': 'u', u'ű': 'u', u'ý': 'y', u'þ': 'th', u'ÿ': 'y',
 # LATIN_SYMBOLS
 u'©':'(c)',
 # GREEK
 u'α':'a', u'β':'b', u'γ':'g', u'δ':'d', u'ε':'e', u'ζ':'z',
 u'η':'h', u'θ':'8',
 u'ι':'i', u'κ':'k', u'λ':'l', u'μ':'m', u'ν':'n', u'ξ':'3',
 u'ο':'o', u'π':'p',
 u'ρ':'r', u'σ':'s', u'τ':'t', u'υ':'y', u'φ':'f', u'χ':'x',
 u'ψ':'ps', u'ω':'w',
 u'ά':'a', u'έ':'e', u'ί':'i', u'ό':'o', u'ύ':'y', u'ή':'h',
 u'ώ':'w', u'ς':'s',
 u'ϊ':'i', u'ΰ':'y', u'ϋ':'y', u'ΐ':'i',
 u'Α':'A', u'Β':'B', u'Γ':'G', u'Δ':'D', u'Ε':'E', u'Ζ':'Z',
 u'Η':'H', u'Θ':'8',
 u'Ι':'I', u'Κ':'K', u'Λ':'L', u'Μ':'M', u'Ν':'N', u'Ξ':'3',
 u'Ο':'O', u'Π':'P',
 u'Ρ':'R', u'Σ':'S', u'Τ':'T', u'Υ':'Y', u'Φ':'F', u'Χ':'X',
 u'Ψ':'PS', u'Ω':'W',
 u'Ά':'A', u'Έ':'E', u'Ί':'I', u'Ό':'O', u'Ύ':'Y', u'Ή':'H',
 u'Ώ':'W', u'Ϊ':'I', u'Ϋ':'Y',
 # TURKISH
 u'ş':'s', u'Ş':'S', u'ı':'i', u'İ':'I', u'ç':'c', u'Ç':'C',
 u'ü':'u', u'Ü':'U',
 u'ö':'o', u'Ö':'O', u'ğ':'g', u'Ğ':'G',
 # RUSSIAN
 u'а':'a', u'б':'b', u'в':'v', u'г':'g', u'д':'d', u'е':'e',
 u'ё':'yo', u'ж':'zh',
 u'з':'z', u'и':'i', u'й':'j', u'к':'k', u'л':'l', u'м':'m',
 u'н':'n', u'о':'o',
 u'п':'p', u'р':'r', u'с':'s', u'т':'t', u'у':'u', u'ф':'f',
 u'х':'h', u'ц':'c',
 u'ч':'ch', u'ш':'sh', u'щ':'sh', u'ъ':'', u'ы':'y', u'ь':'',
 u'э':'e', u'ю':'yu', u'я':'ya',
 u'А':'A', u'Б':'B', u'В':'V', u'Г':'G', u'Д':'D', u'Е':'E',
 u'Ё':'Yo', u'Ж':'Zh',
 u'З':'Z', u'И':'I', u'Й':'J', u'К':'K', u'Л':'L', u'М':'M',
 u'Н':'N', u'О':'O',
 u'П':'P', u'Р':'R', u'С':'S', u'Т':'T', u'У':'U', u'Ф':'F',
 u'Х':'H', u'Ц':'C',
 u'Ч':'Ch', u'Ш':'Sh', u'Щ':'Sh', u'Ъ':'', u'Ы':'Y', u'Ь':'',
 u'Э':'E', u'Ю':'Yu', u'Я':'Ya',
 # UKRAINIAN
 u'Є':'Ye', u'І':'I', u'Ї':'Yi', u'Ґ':'G', u'є':'ye', u'і':'i',
 u'ї':'yi', u'ґ':'g',
 # CZECH
 u'č':'c', u'ď':'d', u'ě':'e', u'ň':'n', u'ř':'r', u'š':'s',
 u'ť':'t', u'ů':'u',
 u'ž':'z', u'Č':'C', u'Ď':'D', u'Ě':'E', u'Ň':'N', u'Ř':'R',
 u'Š':'S', u'Ť':'T', u'Ů':'U', u'Ž':'Z',
 # POLISH
 u'ą':'a', u'ć':'c', u'ę':'e', u'ł':'l', u'ń':'n', u'ó':'o',
 u'ś':'s', u'ź':'z',
 u'ż':'z', u'Ą':'A', u'Ć':'C', u'Ę':'e', u'Ł':'L', u'Ń':'N',
 u'Ó':'o', u'Ś':'S',
 u'Ź':'Z', u'Ż':'Z',
 # LATVIAN
 u'ā':'a', u'č':'c', u'ē':'e', u'ģ':'g', u'ī':'i', u'ķ':'k',
 u'ļ':'l', u'ņ':'n',
 u'š':'s', u'ū':'u', u'ž':'z', u'Ā':'A', u'Č':'C', u'Ē':'E',
 u'Ģ':'G', u'Ī':'i',
 u'Ķ':'k', u'Ļ':'L', u'Ņ':'N', u'Š':'S', u'Ū':'u', u'Ž':'Z'

 }

 def downcode(name):
 
  downcode(uŽabovitá zmiešaná kaša)
 u'Zabovita zmiesana kasa'
 
 for key, value in _MAP.iteritems():
 name = name.replace(key, value)
 return name
 
 Though C Python is pretty optimized under the hood for this sort of
 single-character replacement, this still seems pretty inefficient
 since you're calling replace for every character you want to map.  I
 think that a better approach might be something like:
 
 def downcode(name):
 return ''.join(_MAP.get(c, c) for c in name)
 
 Or using string.translate:
 
 import string
 def downcode(name):
 table = string.maketrans(
 'ÀÁÂÃÄÅ...',
 'AA...')
 return name.translate(table)

Or even simpler:

import unicodedata

def downcode(name):
    return unicodedata.normalize("NFD", name)\
        .encode("ascii", "ignore")\
        .decode("ascii")
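
The same idea as a runnable sketch, assuming Python 3, where the u prefix is
no longer needed:

```python
import unicodedata


def downcode(name):
    # NFD splits "Ž" into "Z" plus a combining caron; the ASCII encoder
    # with errors="ignore" then drops everything that is not ASCII.
    decomposed = unicodedata.normalize("NFD", name)
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(downcode("Žabovitá zmiešaná kaša"))   # Zabovita zmiesana kasa
print(downcode("Straße"))                   # Strae
```

The second call shows the limit of the approach: a character without a
decomposition, such as ß, disappears entirely.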

Servus,
   Walter
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: unicode issue

2009-10-01 Thread Rami Chowdhury
On Thu, 01 Oct 2009 08:10:58 -0700, Walter Dörwald wal...@livinglogic.de  
wrote:



On 01.10.09 16:09, Hyuga wrote:

On Sep 30, 3:34 am, gentlestone tibor.b...@hotmail.com wrote:

Why don't work this code on Python 2.6? Or how can I do this job?

[snip _MAP]

def downcode(name):

 downcode(uŽabovitá zmiešaná kaša)
u'Zabovita zmiesana kasa'

for key, value in _MAP.iteritems():
name = name.replace(key, value)
return name


Though C Python is pretty optimized under the hood for this sort of
single-character replacement, this still seems pretty inefficient
since you're calling replace for every character you want to map.  I
think that a better approach might be something like:

def downcode(name):
return ''.join(_MAP.get(c, c) for c in name)

Or using string.translate:

import string
def downcode(name):
table = string.maketrans(
'ÀÁÂÃÄÅ...',
'AA...')
return name.translate(table)


Or even simpler:

import unicodedata

def downcode(name):
   return unicodedata.normalize(NFD, name)\
  .encode(ascii, ignore)\
  .decode(ascii)

Servus,
   Walter


As I understand it, the ignore argument to str.encode *removes* the  
undecodable characters, rather than replacing them with an ASCII  
approximation. Is that correct? If so, wouldn't that rather defeat the  
purpose?


--
Rami Chowdhury
Never attribute to malice that which can be attributed to stupidity --  
Hanlon's Razor

408-597-7068 (US) / 07875-841-046 (UK) / 0189-245544 (BD)
--
http://mail.python.org/mailman/listinfo/python-list


Re: unicode issue

2009-10-01 Thread Walter Dörwald
On 01.10.09 17:50, Rami Chowdhury wrote:
 On Thu, 01 Oct 2009 08:10:58 -0700, Walter Dörwald
 wal...@livinglogic.de wrote:
 
 On 01.10.09 16:09, Hyuga wrote:
 On Sep 30, 3:34 am, gentlestone tibor.b...@hotmail.com wrote:
 Why don't work this code on Python 2.6? Or how can I do this job?

 [snip _MAP]

 def downcode(name):
 
  downcode(uŽabovitá zmiešaná kaša)
 u'Zabovita zmiesana kasa'
 
 for key, value in _MAP.iteritems():
 name = name.replace(key, value)
 return name

 Though C Python is pretty optimized under the hood for this sort of
 single-character replacement, this still seems pretty inefficient
 since you're calling replace for every character you want to map.  I
 think that a better approach might be something like:

 def downcode(name):
 return ''.join(_MAP.get(c, c) for c in name)

 Or using string.translate:

 import string
 def downcode(name):
 table = string.maketrans(
 'ÀÁÂÃÄÅ...',
 'AA...')
 return name.translate(table)

 Or even simpler:

 import unicodedata

 def downcode(name):
return unicodedata.normalize(NFD, name)\
   .encode(ascii, ignore)\
   .decode(ascii)

 Servus,
Walter
 
 As I understand it, the ignore argument to str.encode *removes* the
 undecodable characters, rather than replacing them with an ASCII
 approximation. Is that correct? If so, wouldn't that rather defeat the
 purpose?

Yes, but any accented characters have been split into the base character
and the combining accent via normalize() before, so only the accent gets
removed. Of course non-decomposable characters will be removed
completely, but it would be possible to replace

   .encode("ascii", "ignore").decode("ascii")

with something like this:

   u"".join(c for c in name if unicodedata.category(c) != "Mn")
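
Spelled out in full, the category-based variant might look like the sketch
below. It assumes Python 3, and the function name strip_marks is made up for
illustration:

```python
import unicodedata


def strip_marks(name):
    # Decompose first, then keep every character that is not a
    # nonspacing mark (category "Mn"), i.e. drop only the accents.
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(c for c in decomposed
                   if unicodedata.category(c) != "Mn")


print(strip_marks("Žabovitá zmiešaná kaša"))   # Zabovita zmiesana kasa
print(strip_marks("Łódź"))                     # Łodz
```

Unlike the encode/decode version, non-decomposable characters such as Ł
survive here instead of being dropped, so they can still be handled by a
custom mapping afterwards.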

Servus,
   Walter
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: unicode issue

2009-10-01 Thread Peter Otten
Rami Chowdhury wrote:

 On Thu, 01 Oct 2009 08:10:58 -0700, Walter Dörwald wal...@livinglogic.de
 wrote:
 
 On 01.10.09 16:09, Hyuga wrote:
 On Sep 30, 3:34 am, gentlestone tibor.b...@hotmail.com wrote:
 Why don't work this code on Python 2.6? Or how can I do this job?

 [snip _MAP]

 def downcode(name):
 
  downcode(uŽabovitá zmiešaná kaša)
 u'Zabovita zmiesana kasa'
 
 for key, value in _MAP.iteritems():
 name = name.replace(key, value)
 return name

 Though C Python is pretty optimized under the hood for this sort of
 single-character replacement, this still seems pretty inefficient
 since you're calling replace for every character you want to map.  I
 think that a better approach might be something like:

 def downcode(name):
 return ''.join(_MAP.get(c, c) for c in name)

 Or using string.translate:

 import string
 def downcode(name):
 table = string.maketrans(
 'ÀÁÂÃÄÅ...',
 'AA...')
 return name.translate(table)

 Or even simpler:

 import unicodedata

 def downcode(name):
return unicodedata.normalize(NFD, name)\
   .encode(ascii, ignore)\
   .decode(ascii)

 Servus,
Walter
 
 As I understand it, the ignore argument to str.encode *removes* the
 undecodable characters, rather than replacing them with an ASCII
 approximation. Is that correct? If so, wouldn't that rather defeat the
 purpose?

You didn't take the normalization step into your consideration. Example:

>>> import unicodedata
>>> s = u"Ä"
>>> unicodedata.normalize("NFD", s)
u'A\u0308'
>>> _.encode("ascii", "ignore")
'A'



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: unicode issue

2009-10-01 Thread Rami Chowdhury
On Thu, 01 Oct 2009 09:03:38 -0700, Walter Dörwald wal...@livinglogic.de
wrote:

> Yes, but any accented characters have been split into the base character
> and the combining accent via normalize() before, so only the accent gets
> removed. Of course non-decomposable characters will be removed
> completely, but it would be possible to replace
>
>    .encode("ascii", "ignore").decode("ascii")
>
> with something like this:
>
>    u"".join(c for c in name if unicodedata.category(c) != "Mn")
>
> Servus,
>    Walter


Thank you for the clarification!
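
A minimal Python 3 sketch of the category-based variant quoted above, assuming the intent is to keep every character except the combining marks (category "Mn"). The name remove_marks is hypothetical.

```python
import unicodedata


def remove_marks(name):
    """Drop combining marks but keep characters that do not decompose.

    Hypothetical helper; assumes the goal is to filter out category "Mn".
    """
    decomposed = unicodedata.normalize("NFD", name)
    # Keep everything that is NOT a nonspacing mark.
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")


# U+0141 has no canonical decomposition, so it survives unchanged,
# while the accents on the other letters are removed.
result = remove_marks("\u0141\u00f3d\u017a")
print(ascii(result))
```

Unlike the encode/ignore version, the result may still contain non-ASCII characters, so a fallback table is still needed if pure ASCII is required.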

--
Rami Chowdhury
Never attribute to malice that which can be attributed to stupidity --  
Hanlon's Razor

408-597-7068 (US) / 07875-841-046 (UK) / 0189-245544 (BD)


unicode issue

2009-09-30 Thread gentlestone
Why doesn't this code work on Python 2.6? Or how can I do this job?

_MAP = {
# LATIN
u'À': 'A', u'Á': 'A', u'Â': 'A', u'Ã': 'A', u'Ä': 'A', u'Å': 'A',
u'Æ': 'AE', u'Ç':'C',
u'È': 'E', u'É': 'E', u'Ê': 'E', u'Ë': 'E', u'Ì': 'I', u'Í': 'I',
u'Î': 'I',
u'Ï': 'I', u'Ð': 'D', u'Ñ': 'N', u'Ò': 'O', u'Ó': 'O', u'Ô': 'O',
u'Õ': 'O', u'Ö':'O',
u'Ő': 'O', u'Ø': 'O', u'Ù': 'U', u'Ú': 'U', u'Û': 'U', u'Ü': 'U',
u'Ű': 'U',
u'Ý': 'Y', u'Þ': 'TH', u'ß': 'ss', u'à':'a', u'á':'a', u'â': 'a',
u'ã': 'a', u'ä':'a',
u'å': 'a', u'æ': 'ae', u'ç': 'c', u'è': 'e', u'é': 'e', u'ê': 'e',
u'ë': 'e',
u'ì': 'i', u'í': 'i', u'î': 'i', u'ï': 'i', u'ð': 'd', u'ñ': 'n',
u'ò': 'o', u'ó':'o',
u'ô': 'o', u'õ': 'o', u'ö': 'o', u'ő': 'o', u'ø': 'o', u'ù': 'u',
u'ú': 'u',
u'û': 'u', u'ü': 'u', u'ű': 'u', u'ý': 'y', u'þ': 'th', u'ÿ': 'y',
# LATIN_SYMBOLS
u'©':'(c)',
# GREEK
u'α':'a', u'β':'b', u'γ':'g', u'δ':'d', u'ε':'e', u'ζ':'z',
u'η':'h', u'θ':'8',
u'ι':'i', u'κ':'k', u'λ':'l', u'μ':'m', u'ν':'n', u'ξ':'3',
u'ο':'o', u'π':'p',
u'ρ':'r', u'σ':'s', u'τ':'t', u'υ':'y', u'φ':'f', u'χ':'x',
u'ψ':'ps', u'ω':'w',
u'ά':'a', u'έ':'e', u'ί':'i', u'ό':'o', u'ύ':'y', u'ή':'h',
u'ώ':'w', u'ς':'s',
u'ϊ':'i', u'ΰ':'y', u'ϋ':'y', u'ΐ':'i',
u'Α':'A', u'Β':'B', u'Γ':'G', u'Δ':'D', u'Ε':'E', u'Ζ':'Z',
u'Η':'H', u'Θ':'8',
u'Ι':'I', u'Κ':'K', u'Λ':'L', u'Μ':'M', u'Ν':'N', u'Ξ':'3',
u'Ο':'O', u'Π':'P',
u'Ρ':'R', u'Σ':'S', u'Τ':'T', u'Υ':'Y', u'Φ':'F', u'Χ':'X',
u'Ψ':'PS', u'Ω':'W',
u'Ά':'A', u'Έ':'E', u'Ί':'I', u'Ό':'O', u'Ύ':'Y', u'Ή':'H',
u'Ώ':'W', u'Ϊ':'I', u'Ϋ':'Y',
# TURKISH
u'ş':'s', u'Ş':'S', u'ı':'i', u'İ':'I', u'ç':'c', u'Ç':'C',
u'ü':'u', u'Ü':'U',
u'ö':'o', u'Ö':'O', u'ğ':'g', u'Ğ':'G',
# RUSSIAN
u'а':'a', u'б':'b', u'в':'v', u'г':'g', u'д':'d', u'е':'e',
u'ё':'yo', u'ж':'zh',
u'з':'z', u'и':'i', u'й':'j', u'к':'k', u'л':'l', u'м':'m',
u'н':'n', u'о':'o',
u'п':'p', u'р':'r', u'с':'s', u'т':'t', u'у':'u', u'ф':'f',
u'х':'h', u'ц':'c',
u'ч':'ch', u'ш':'sh', u'щ':'sh', u'ъ':'', u'ы':'y', u'ь':'',
u'э':'e', u'ю':'yu', u'я':'ya',
u'А':'A', u'Б':'B', u'В':'V', u'Г':'G', u'Д':'D', u'Е':'E',
u'Ё':'Yo', u'Ж':'Zh',
u'З':'Z', u'И':'I', u'Й':'J', u'К':'K', u'Л':'L', u'М':'M',
u'Н':'N', u'О':'O',
u'П':'P', u'Р':'R', u'С':'S', u'Т':'T', u'У':'U', u'Ф':'F',
u'Х':'H', u'Ц':'C',
u'Ч':'Ch', u'Ш':'Sh', u'Щ':'Sh', u'Ъ':'', u'Ы':'Y', u'Ь':'',
u'Э':'E', u'Ю':'Yu', u'Я':'Ya',
# UKRAINIAN
u'Є':'Ye', u'І':'I', u'Ї':'Yi', u'Ґ':'G', u'є':'ye', u'і':'i',
u'ї':'yi', u'ґ':'g',
# CZECH
u'č':'c', u'ď':'d', u'ě':'e', u'ň':'n', u'ř':'r', u'š':'s',
u'ť':'t', u'ů':'u',
u'ž':'z', u'Č':'C', u'Ď':'D', u'Ě':'E', u'Ň':'N', u'Ř':'R',
u'Š':'S', u'Ť':'T', u'Ů':'U', u'Ž':'Z',
# POLISH
u'ą':'a', u'ć':'c', u'ę':'e', u'ł':'l', u'ń':'n', u'ó':'o',
u'ś':'s', u'ź':'z',
u'ż':'z', u'Ą':'A', u'Ć':'C', u'Ę':'E', u'Ł':'L', u'Ń':'N',
u'Ó':'O', u'Ś':'S',
u'Ź':'Z', u'Ż':'Z',
# LATVIAN
u'ā':'a', u'č':'c', u'ē':'e', u'ģ':'g', u'ī':'i', u'ķ':'k',
u'ļ':'l', u'ņ':'n',
u'š':'s', u'ū':'u', u'ž':'z', u'Ā':'A', u'Č':'C', u'Ē':'E',
u'Ģ':'G', u'Ī':'I',
u'Ķ':'K', u'Ļ':'L', u'Ņ':'N', u'Š':'S', u'Ū':'U', u'Ž':'Z'
}

def downcode(name):
    """
    >>> downcode(u"Žabovitá zmiešaná kaša")
    u'Zabovita zmiesana kasa'
    """
    for key, value in _MAP.iteritems():
        name = name.replace(key, value)
    return name
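
As a rough Python 3 sketch of the same idea without one replace() call per key: str.translate accepts a dict keyed by code point, so the whole table can be applied in a single pass. _MAP_EXCERPT below is a tiny stand-in for the full _MAP, and the escapes are used only to keep the sample ASCII-safe.

```python
# A tiny excerpt standing in for the full _MAP from the post (assumption).
_MAP_EXCERPT = {
    "\u017d": "Z",   # Z with caron
    "\u00e1": "a",   # a with acute
    "\u0161": "s",   # s with caron
    "\u00c6": "AE",  # a single character may map to several
}

# str.translate wants code points (integers) as keys.
_TABLE = {ord(key): value for key, value in _MAP_EXCERPT.items()}


def downcode(name):
    """Apply the whole mapping in one pass over the string."""
    return name.translate(_TABLE)


print(downcode("\u017dabovit\u00e1 zmie\u0161an\u00e1 ka\u0161a"))
```

This only works because every key in the table is a single character; multi-character keys would still need replace() or a regular expression.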


Re: unicode issue

2009-09-30 Thread Andre Engels
On Wed, Sep 30, 2009 at 9:34 AM, gentlestone tibor.b...@hotmail.com wrote:
> Why doesn't this code work on Python 2.6? Or how can I do this job?

Please be more specific than "it doesn't work":
* What exactly are you doing
* What were you expecting the result of that to be
* What is the actual result?

-- 
André Engels, andreeng...@gmail.com


Re: unicode issue

2009-09-30 Thread gentlestone
On 30. Sep., 09:41 h., Andre Engels andreeng...@gmail.com wrote:
> On Wed, Sep 30, 2009 at 9:34 AM, gentlestone tibor.b...@hotmail.com wrote:
>> Why doesn't this code work on Python 2.6? Or how can I do this job?
>
> Please be more specific than "it doesn't work":
> * What exactly are you doing
> * What were you expecting the result of that to be
> * What is the actual result?
>
> --
> André Engels, andreeng...@gmail.com

* What exactly are you doing
replace non-ascii characters - see doctest documentation

* What were you expecting the result of that to be
see doctest documentation

* What is the actual result?
the actual result is unchanged name


Re: unicode issue

2009-09-30 Thread Andre Engels
I get the feeling that the problem is with the Python interactive
mode. It does not have full unicode support, so u"Žabovitá zmiešaná
kaša" is changed to u'\x8eabovit\xe1 zmie\x9aan\xe1 ka\x9aa'. If you
call your code from another program, it might work correctly.
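
The escapes quoted above are consistent with the console sending cp1250 bytes that were then taken one byte per character. That code page is an assumption (the thread never names it), but the effect can be reproduced in Python 3 roughly like this:

```python
# The doctest string from the original post, written with escapes.
text = "\u017dabovit\u00e1 zmie\u0161an\u00e1 ka\u0161a"

# Assumption: the console encoded keyboard input as cp1250.
console_bytes = text.encode("cp1250")

# Treating each byte as one code point is what latin-1 decoding does.
misread = console_bytes.decode("latin-1")

# ascii() shows the escapes instead of trying to print the characters.
print(ascii(misread))
```

None of the resulting characters are keys in _MAP, which would explain why downcode() returned the name unchanged.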


-- 
André Engels, andreeng...@gmail.com


Re: unicode issue

2009-09-30 Thread gentlestone
On 30. Sep., 10:35 h., Andre Engels andreeng...@gmail.com wrote:
> I get the feeling that the problem is with the Python interactive
> mode. It does not have full unicode support, so u"Žabovitá zmiešaná
> kaša" is changed to u'\x8eabovit\xe1 zmie\x9aan\xe1 ka\x9aa'. If you
> call your code from another program, it might work correctly.
>
> --
> André Engels, andreeng...@gmail.com

thx a lot

I spent 2 days of my life because of this

so it seems doctests are unusable for non-English users in Python


Re: unicode issue

2009-09-30 Thread gentlestone
On 30. Sep., 10:43 h., gentlestone tibor.b...@hotmail.com wrote:
> On 30. Sep., 10:35 h., Andre Engels andreeng...@gmail.com wrote:
>
>> I get the feeling that the problem is with the Python interactive
>> mode. It does not have full unicode support, so u"Žabovitá zmiešaná
>> kaša" is changed to u'\x8eabovit\xe1 zmie\x9aan\xe1 ka\x9aa'. If you
>> call your code from another program, it might work correctly.
>>
>> --
>> André Engels, andreeng...@gmail.com
>
> thx a lot
>
> I spent 2 days of my life because of this
>
> so it seems doctests are unusable for non-English users in Python

yes, you are right, now it works:

def slugify(name):
    """
    >>> slugify(u'\u017dabovit\xe1 zmie\u0161an\xe1 ka\u0161a s.r.o')
    u'zabovita-zmiesana-kasa-sro'
    """
    for key, value in _MAP.iteritems():
        name = name.replace(key, value)
    return defaultfilters.slugify(name)


Re: unicode issue

2009-09-30 Thread Dave Angel

gentlestone wrote:

> Why doesn't this code work on Python 2.6? Or how can I do this job?

> [snip _MAP]

> def downcode(name):
>     """
>     >>> downcode(u"Žabovitá zmiešaná kaša")
>     u'Zabovita zmiesana kasa'
>     """
>     for key, value in _MAP.iteritems():
>         name = name.replace(key, value)
>     return name

Works for me:

rrr = downcode(u"Žabovitá zmiešaná kaša")
print repr(rrr)
print rrr

prints out:

u'Zabovita zmiesana kasa'
Zabovita zmiesana kasa

I did have to add an encoding declaration as line 2 of the file:

#-*- coding: latin-1 -*-

and I had to convince my editor (Komodo) to save the file in utf-8.

DaveA



Re: unicode issue

2009-09-30 Thread gentlestone
On 30. Sep., 11:45 h., Dave Angel da...@dejaviewphoto.com wrote:
> gentlestone wrote:
>> Why doesn't this code work on Python 2.6? Or how can I do this job?

>> [snip _MAP]

>> def downcode(name):
>>     """
>>     >>> downcode(u"Žabovitá zmiešaná kaša")
>>     u'Zabovita zmiesana kasa'
>>     """
>>     for key, value in _MAP.iteritems():
>>         name = name.replace(key, value)
>>     return name
>
> Works for me:
>
> rrr = downcode(u"Žabovitá zmiešaná kaša")
> print repr(rrr)
> print rrr
>
> prints out:
>
> u'Zabovita zmiesana kasa'
> Zabovita zmiesana kasa
>
> I did have to add an encoding declaration as line 2 of the file:
>
> #-*- coding: latin-1 -*-
>
> and I had to convince my editor (Komodo) to save the file in utf-8.
>
> DaveA

great, thank you all, I changed utf-8 to latin-1 in the header and it
works for me too

how much time I could have saved if I had just asked in this forum


Re: unicode issue

2009-09-30 Thread saeed.gnu
I recommend using UTF-8 encoding (especially on GNU/Linux) and then
writing this on the second line:
#-*- coding: utf-8 -*-


Re: unicode issue

2009-09-30 Thread Mark Tolonen


Dave Angel da...@dejaviewphoto.com wrote in message 
news:4ac328d4.3060...@dejaviewphoto.com...

> gentlestone wrote:
>
>> Why doesn't this code work on Python 2.6? Or how can I do this job?

>> [snip _MAP]

>> def downcode(name):
>>     """
>>     >>> downcode(u"Žabovitá zmiešaná kaša")
>>     u'Zabovita zmiesana kasa'
>>     """
>>     for key, value in _MAP.iteritems():
>>         name = name.replace(key, value)
>>     return name
>
> Works for me:
>
> rrr = downcode(u"Žabovitá zmiešaná kaša")
> print repr(rrr)
> print rrr
>
> prints out:
>
> u'Zabovita zmiesana kasa'
> Zabovita zmiesana kasa
>
> I did have to add an encoding declaration as line 2 of the file:
>
> #-*- coding: latin-1 -*-
>
> and I had to convince my editor (Komodo) to save the file in utf-8.

Why declare latin-1 and save in utf-8?  I'm not sure how you got that to work
because those encodings aren't equivalent.  I get:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "testit.py", line 1
SyntaxError: encoding problem: utf-8

In fact, some of the characters in the above code don't map to latin-1.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u0150' in
position 309: ordinal not in range(256)

>>> import unicodedata as ud
>>> ud.name(u'\u0150')
'LATIN CAPITAL LETTER O WITH DOUBLE ACUTE'
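
A small Python 3 sketch of the same check, for finding which characters in a table cannot be written to a latin-1 file. The helper name not_in_latin1 and the sample string are made up for illustration.

```python
import unicodedata


def not_in_latin1(chars):
    """Return the characters that the latin-1 codec cannot encode."""
    rejected = []
    for char in chars:
        try:
            char.encode("latin-1")
        except UnicodeEncodeError:
            rejected.append(char)
    return rejected


# A few keys of the kind found in _MAP: U+00C0 and U+00FC fit in latin-1,
# while U+0150, U+03B1 and U+017D do not.
sample = "\u00c0\u0150\u03b1\u017d\u00fc"
for char in not_in_latin1(sample):
    print("U+%04X %s" % (ord(char), unicodedata.name(char)))
```

Anything this reports would be lost or rejected if the source file were really saved as latin-1.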


-Mark





Re: unicode issue

2009-09-30 Thread Piet van Oostrum
>>>>> Dave Angel da...@dejaviewphoto.com (DA) wrote:

>DA> Works for me:

>DA> rrr = downcode(u"Žabovitá zmiešaná kaša")
>DA> print repr(rrr)
>DA> print rrr

>DA> prints out:

>DA> u'Zabovita zmiesana kasa'
>DA> Zabovita zmiesana kasa

>DA> I did have to add an encoding declaration as line 2 of the file:

>DA> #-*- coding: latin-1 -*-

>DA> and I had to convince my editor (Komodo) to save the file in utf-8.

*Seems to work*.
If you save in utf-8, the coding declaration also has to be utf-8.
Besides, many of these characters won't be representable in latin-1.
The reason it worked is that these characters were translated into
sequences of two or more bytes, and replace did work with these. But it's
dangerous, as they are then no longer the unicode characters they were
intended to be.
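
To illustrate why the mismatch only seems to work, here is a rough Python 3 simulation, assuming a file saved as UTF-8 but read as latin-1. The helper misread_as_latin1 is hypothetical and merely mimics that mismatch.

```python
def misread_as_latin1(text):
    """Simulate UTF-8 bytes on disk being decoded as latin-1."""
    return text.encode("utf-8").decode("latin-1")


# Both the dictionary key and the literal in the doctest are mangled
# in the same way, so replace() still finds a match.
key = misread_as_latin1("\u017d")              # now two characters
literal = misread_as_latin1("\u017dabovit\u00e1")
print(len(key), ascii(literal.replace(key, "Z")))

# A correctly decoded string coming from elsewhere does not match the
# mangled key, so nothing is replaced.
proper = "\u017dabovit\u00e1"
print(ascii(proper.replace(key, "Z")))
```

So the doctest passes, but any real Unicode input (from a database, a form, or a correctly declared module) would go through the function untouched.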
-- 
Piet van Oostrum p...@vanoostrum.org
WWW: http://pietvanoostrum.com/
PGP key: [8DAE142BE17999C4]


Re: unicode issue

2009-09-30 Thread Dave Angel

Piet van Oostrum wrote:

> >>>>> Dave Angel da...@dejaviewphoto.com (DA) wrote:
>
> >DA> Works for me:
>
> >DA> rrr = downcode(u"Žabovitá zmiešaná kaša")
> >DA> print repr(rrr)
> >DA> print rrr
>
> >DA> prints out:
>
> >DA> u'Zabovita zmiesana kasa'
> >DA> Zabovita zmiesana kasa
>
> >DA> I did have to add an encoding declaration as line 2 of the file:
>
> >DA> #-*- coding: latin-1 -*-
>
> >DA> and I had to convince my editor (Komodo) to save the file in utf-8.
>
> *Seems to work*.
> If you save in utf-8, the coding declaration also has to be utf-8.
> Besides, many of these characters won't be representable in latin-1.
> The reason it worked is that these characters were translated into
> sequences of two or more bytes, and replace did work with these. But it's
> dangerous, as they are then no longer the unicode characters they were
> intended to be.
  
Thanks for the correction. What I meant by "works for me" is that the
single example in the docstring translated okay. But I do have a lot to
learn about using Unicode in sources, and I want to learn.

So tell me, how were we supposed to guess what encoding the original
message used? I originally had the mailing list message (in Thunderbird
email). When I copied (copy/paste) to Komodo IDE (text editor), it
wouldn't let me save because the file type was ASCII. So I randomly
chose latin-1 for file type, and it seemed to like it.


At that point I expected and got errors from Python because I had no 
coding declaration. I used latin-1, and still had problems, though I 
forget what they were. Only when I changed the file encoding type again, 
to utf-8, did the errors go away. I agree that they should agree, but I 
don't know how to reconcile the copy/paste boundary, the file type 
(without BOM, which is another variable), the coding declaration, and 
the stdout implicit ASCII encoding. I understand a bunch of it, but not 
enough to be able to safely walk through the choices.


Is this all written up in one place, to where an experienced programmer 
can make sense of it? I've nibbled at the edges (even wrote a UTF-8 
encoder/decoder a dozen years ago).


DaveA


Re: unicode issue

2009-09-30 Thread Piet van Oostrum
>>>>> Dave Angel da...@ieee.org (DA) wrote:
[snip]
>DA> Thanks for the correction. What I meant by "works for me" is that the
>DA> single example in the docstring translated okay. But I do have a lot to
>DA> learn about using Unicode in sources, and I want to learn.

>DA> So tell me, how were we supposed to guess what encoding the original
>DA> message used? I originally had the mailing list message (in Thunderbird
>DA> email). When I copied (copy/paste) to Komodo IDE (text editor), it wouldn't
>DA> let me save because the file type was ASCII. So I randomly chose latin-1
>DA> for file type, and it seemed to like it.

You can see the encoding of the message in its headers. But it is not
important, as the Unicode characters you see are what it is about. You
just copy and paste them into your Python file. The Python file does not
have to use the same encoding as the message from which you pasted. The
editor will do the proper conversion. (If it doesn't, throw it away
immediately.) Only for the Python file must you choose an encoding that
can encode all the characters that are in the file. In this case utf-8
is the only reasonable choice, but if there are only latin-1 characters
in the file then of course latin-1 (iso-8859-1) will also be good.

Any decent editor will only allow you to save in an encoding that can
encode all the characters in the file, otherwise you will lose some
characters.

Because Python must also know which encoding you used, and this cannot
be deduced from the file contents alone, you need the coding
declaration. And it must be the same as the encoding in which the file
is saved, otherwise Python will see something different from what you saw
in your editor. Sooner or later this will give you a big headache.
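
The effect of a matching versus a mismatched declaration can be sketched in Python 3 by compiling source bytes directly, since compile() honours the coding line when it is given bytes. TEMPLATE and load are made-up names for this illustration.

```python
# A two-line "module": a coding declaration plus one string literal.
# The literal holds U+017D (Z with caron), written here as an escape.
TEMPLATE = '# -*- coding: {declared} -*-\ns = "\u017d"\n'


def load(saved_as, declared):
    """Encode the source as the editor would, then let Python decode it
    according to the coding declaration, and return the literal's value."""
    raw = TEMPLATE.format(declared=declared).encode(saved_as)
    namespace = {}
    exec(compile(raw, "<demo>", "exec"), namespace)
    return namespace["s"]


# Declaration matches the bytes on disk: one character, as intended.
print(ascii(load("utf-8", "utf-8")))

# Saved as utf-8 but declared latin-1: two unrelated characters.
print(ascii(load("utf-8", "latin-1")))
```

The second case raises no error at all, which is what makes the mismatch so easy to miss.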

>DA> At that point I expected and got errors from Python because I had no coding
>DA> declaration. I used latin-1, and still had problems, though I forget what
>DA> they were. Only when I changed the file encoding type again, to utf-8, did
>DA> the errors go away. I agree that they should agree, but I don't know how to
>DA> reconcile the copy/paste boundary, the file type (without BOM, which is
>DA> another variable), the coding declaration, and the stdout implicit ASCII
>DA> encoding. I understand a bunch of it, but not enough to be able to safely
>DA> walk through the choices.

>DA> Is this all written up in one place, to where an experienced programmer can
>DA> make sense of it? I've nibbled at the edges (even wrote a UTF-8
>DA> encoder/decoder a dozen years ago).

I don't know a place. Usually utf-8 is a safe bet but in some cases can
be overkill. And then in your Python input/output (read/write) you may
have to use a different encoding if the programs that you have to
communicate with expect something different.
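
As a hedged illustration of that last point: the encoding used for file I/O is chosen per file and is independent of the source file's coding declaration. The helper write_and_inspect is hypothetical; io.open is used because it takes an encoding argument on both Python 2.6+ and Python 3.

```python
import io
import os
import tempfile


def write_and_inspect(text, encoding):
    """Write text with the given encoding, then return the raw bytes on
    disk together with the text decoded back from them."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with io.open(path, "w", encoding=encoding) as handle:
            handle.write(text)
        with io.open(path, "rb") as handle:
            raw = handle.read()
        with io.open(path, "r", encoding=encoding) as handle:
            decoded = handle.read()
    finally:
        os.remove(path)
    return raw, decoded


# The same character becomes different bytes depending on what the
# receiving program expects.
for encoding in ("utf-8", "cp1250"):
    print(encoding, write_and_inspect("\u017d", encoding))
```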
-- 
Piet van Oostrum p...@vanoostrum.org
WWW: http://pietvanoostrum.com/
PGP key: [8DAE142BE17999C4]


Re: unicode issue

2009-09-30 Thread Dave Angel

Piet van Oostrum wrote:

Dave Angel da...@ieee.org (DA) wrote:


[snip]
  



I know what I was missing.  The copy/paste must be doing it in pure
Unicode.  And the in-memory version of the source text is in Unicode.
So the text editor's encoding affects how that Unicode is encoded into
8-bit bytes for the file (and how it will be reloaded next time).  OK,
that seems to make sense.


I know that the clipboard has type tags, but I haven't looked at them in 
so long that I forget what they look like.  For text, is it just ASCII 
and Unicode?  Or are there other possible encodings that the source and 
sink negotiate?


Thanks for the clear explanation.

DaveA


[issue6077] Unicode issue with tempfile on Windows

2009-05-27 Thread Amaury Forgeot d'Arc

Amaury Forgeot d'Arc amaur...@gmail.com added the comment:

File descriptors wrapped by the new IO module should be opened in binary
mode.

The attached patch changes TemporaryFile and NamedTemporaryFile to
always call os.open() in binary mode; the mode is really used by the
io.open() function.

mkstemp() returns a raw file descriptor and was not changed.

--
keywords: +needs review, patch
nosy: +amaury.forgeotdarc
Added file: http://bugs.python.org/file14092/tempfile.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6077
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue6077] Unicode issue with tempfile on Windows

2009-05-26 Thread Ugra Dániel

New submission from Ugra Dániel daniel.u...@gmail.com:

Opening a file with tempfile.TemporaryFile using "wt+" mode, then
reading the content back, causes reading to stop (without any exception)
when it encounters the byte 0x1a (aka Ctrl+Z) on Windows, even though
UTF-16 encoding is used. When using the built-in open with the same
parameters (plus a file name, of course) everything works as expected.
On Linux this issue does not exist.
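
The attached UnicodeTest.py is not reproduced here, so the following is only a guess at a minimal round-trip check. It assumes that a character such as U+011A, whose UTF-16 encoding contains the byte 0x1A, is enough to trigger the truncation on an affected build; on a fixed interpreter the text comes back intact.

```python
import tempfile


def roundtrip(text, encoding="utf-16"):
    """Write text to a temporary text-mode file and read it back."""
    with tempfile.TemporaryFile("w+t", encoding=encoding) as handle:
        handle.write(text)
        handle.seek(0)
        return handle.read()


# U+011A is encoded as the bytes 1A 01 in little-endian UTF-16, so the
# file contains a raw Ctrl+Z byte even though the text has no U+001A.
sample = "abc\u011adef"
print(roundtrip(sample) == sample)
```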

--
components: Library (Lib)
files: UnicodeTest.py
messages: 88151
nosy: daniel.ugra
severity: normal
status: open
title: Unicode issue with tempfile on Windows
type: behavior
versions: Python 3.0
Added file: http://bugs.python.org/file14032/UnicodeTest.py




[issue6077] Unicode issue with tempfile on Windows

2009-05-21 Thread Ezio Melotti

Changes by Ezio Melotti ezio.melo...@gmail.com:


--
nosy: +ezio.melotti




Unicode issue on Windows cmd line

2009-02-11 Thread jeffg
Having an issue on Windows cmd.
> Python.exe
>>> a = u'\xf0'
>>> print a

This gives a unicode error.

Works fine in IDLE, PythonWin, and my Macbook but I need to run this
from a windows batch.

Character should look like this ð.

Please help!


Re: Unicode issue on Windows cmd line

2009-02-11 Thread Albert Hopkins
On Wed, 2009-02-11 at 10:35 -0800, jeffg wrote:
> Having an issue on Windows cmd.
> > Python.exe
> >>> a = u'\xf0'
> >>> print a
>
> This gives a unicode error.
>
> Works fine in IDLE, PythonWin, and my Macbook but I need to run this
> from a windows batch.
>
> Character should look like this ð.
>
> Please help!

You forgot to paste the error.




Re: Unicode issue on Windows cmd line

2009-02-11 Thread jeffg
On Feb 11, 2:35 pm, Albert Hopkins mar...@letterboxes.org wrote:
 On Wed, 2009-02-11 at 10:35 -0800, jeffg wrote:
  Having issue on Windows cmd.
   Python.exe
  a = u'\xf0'
  print a

  This gives a unicode error.

  Works fine in IDLE, PythonWin, and my Macbook but I need to run this
  from a windows batch.

  Character should look like this ð.

  Please help!

 You forgot to paste the error.

The error looks like this:
  File "<stdin>", line 1, in <module>
  File "C:\python25\lib\encodings\cp437.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\xf0' in
position 0: character maps to <undefined>


Running Python 2.5.4 on Windows XP


Re: Unicode issue on Windows cmd line

2009-02-11 Thread Benjamin Kaplan
On Wed, Feb 11, 2009 at 2:50 PM, jeffg jeffgem...@gmail.com wrote:

 On Feb 11, 2:35 pm, Albert Hopkins mar...@letterboxes.org wrote:
  On Wed, 2009-02-11 at 10:35 -0800, jeffg wrote:
   Having issue on Windows cmd.
Python.exe
   a = u'\xf0'
   print a
 
   This gives a unicode error.
 
   Works fine in IDLE, PythonWin, and my Macbook but I need to run this
   from a windows batch.
 
   Character should look like this ð.
 
   Please help!
 
  You forgot to paste the error.

 The error looks like this:
   File "<stdin>", line 1, in <module>
   File "C:\python25\lib\encodings\cp437.py", line 12, in encode
     return codecs.charmap_encode(input,errors,encoding_map)
 UnicodeEncodeError: 'charmap' codec can't encode character u'\xf0' in
 position 0: character maps to <undefined>


 Running Python 2.5.4 on Windows XP



That isn't a Python problem, it's a Windows problem. For compatibility
reasons, Microsoft never added Unicode support to cmd.  When you do print
u'', Python tries to convert the characters to the console encoding (the
really old cp437, not even the Windows standard cp1252) and fails when the
character has no mapping there. AFAIK, you'll have to use the chcp command
to switch to a code page that has the character and then print
u'\xf0'.encode(the_encoding) to get it to display. There isn't any way
around it; we've tried.
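
To make the encoding step concrete, here is a small sketch (Python 3
syntax, unlike the Python 2.5 session in this thread) that tries to encode
the character with both code pages. The helper name try_encode is
hypothetical, and the snippet only exercises Python's codecs; it says
nothing about what a given console font will actually draw.

```python
def try_encode(text, codec):
    """Return the encoded bytes, or None if the codec has no mapping."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None


eth = u"\xf0"  # LATIN SMALL LETTER ETH

for codec in ("cp437", "cp1252"):
    encoded = try_encode(eth, codec)
    if encoded is None:
        print("%s: no mapping for %s" % (codec, ascii(eth)))
    else:
        print("%s: %r" % (codec, encoded))
```

The cp437 attempt fails in the same way as the traceback quoted above,
while cp1252 has a single byte for the character.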




Re: Unicode issue on Windows cmd line

2009-02-11 Thread Karen Tracey
On Wed, Feb 11, 2009 at 2:50 PM, jeffg jeffgem...@gmail.com wrote:

 On Feb 11, 2:35 pm, Albert Hopkins mar...@letterboxes.org wrote:
  On Wed, 2009-02-11 at 10:35 -0800, jeffg wrote:
   Having issue on Windows cmd.
Python.exe
   a = u'\xf0'
   print a
 
   This gives a unicode error.
 
   Works fine in IDLE, PythonWin, and my Macbook but I need to run this
   from a windows batch.
 
   Character should look like this ð.
 
   Please help!
 
  You forgot to paste the error.

 The error looks like this:
   File "<stdin>", line 1, in <module>
   File "C:\python25\lib\encodings\cp437.py", line 12, in encode
     return codecs.charmap_encode(input,errors,encoding_map)
 UnicodeEncodeError: 'charmap' codec can't encode character u'\xf0' in
 position 0: character maps to <undefined>


 Running Python 2.5.4 on Windows XP


First, you may need to change your command prompt Properties->Font to use
Lucida Console rather than raster fonts.  Then you'll need to change the
code page using chcp to something that has a mapping for the character you
want. E.g.:

D:\>chcp
Active code page: 437

D:\>python
Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)]
on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> a = u'\xf0'
>>> print a
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\bin\Python2.5.2\lib\encodings\cp437.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\xf0' in
position 0: character maps to <undefined>
>>> quit()

D:\>chcp 1252
Active code page: 1252

D:\>python
Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)]
on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> a = u'\xf0'
>>> print a
ð
>>> quit()

D:\>

(Just changing the code page is enough to avoid the UnicodeEncodeError, but
with raster fonts that character displays as three horizontal bars.)

Karen


Re: Unicode issue on Windows cmd line

2009-02-11 Thread Martin v. Löwis
 Having issue on Windows cmd.
 Python.exe
 a = u'\xf0'
 print a
 
 This gives a unicode error.
 
 Works fine in IDLE, PythonWin, and my Macbook but I need to run this
 from a windows batch.
 
 Character should look like this ð.
 
 Please help!

Well, your terminal just cannot display this character by default; you
need to use a different terminal program, or reconfigure your terminal.

For example, do

chcp 1252

and select Lucida Console as the terminal font, then try again.

Of course, this will cause *different* characters to become
non-displayable.
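
A rough way to see that trade-off, sketched in Python 3 syntax: check a few
characters against both code pages. The helper name displayable and the
choice of sample characters are only for illustration, and "displayable"
here just means that the codec has a mapping, not that the font can render
it.

```python
def displayable(ch, codec):
    """True if the codec can encode the character."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


samples = [
    u"\xf0",    # LATIN SMALL LETTER ETH
    u"\u03b1",  # GREEK SMALL LETTER ALPHA
    u"\u2592",  # MEDIUM SHADE (block graphic)
]

for ch in samples:
    print("%-10s cp437=%-5s cp1252=%s" % (
        ascii(ch), displayable(ch, "cp437"), displayable(ch, "cp1252")))
```

The eth is only available in cp1252, while the Greek alpha and the shade
block are only available in cp437.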

Regards,
Martin


Re: Unicode issue on Windows cmd line

2009-02-11 Thread Benjamin Kaplan
On Wed, Feb 11, 2009 at 3:57 PM, Martin v. Löwis mar...@v.loewis.de wrote:

  Having issue on Windows cmd.
  Python.exe
  a = u'\xf0'
  print a
 
  This gives a unicode error.
 
  Works fine in IDLE, PythonWin, and my Macbook but I need to run this
  from a windows batch.
 
  Character should look like this ð.
 
  Please help!

 Well, your terminal just cannot display this character by default; you
 need to use a different terminal program, or reconfigure your terminal.

 For example, do

 chcp 1252

 and select Lucida Console as the terminal font, then try again.

 Of course, this will cause *different* characters to become
 non-displayable.


Well,



 Regards,
 Martin


Re: Unicode issue on Windows cmd line

2009-02-11 Thread Benjamin Kaplan
On Wed, Feb 11, 2009 at 4:10 PM, Benjamin Kaplan
benjamin.kap...@case.edu wrote:



 On Wed, Feb 11, 2009 at 3:57 PM, Martin v. Löwis mar...@v.loewis.de wrote:

  Having issue on Windows cmd.
  Python.exe
  a = u'\xf0'
  print a
 
  This gives a unicode error.
 
  Works fine in IDLE, PythonWin, and my Macbook but I need to run this
  from a windows batch.
 
  Character should look like this ð.
 
  Please help!

 Well, your terminal just cannot display this character by default; you
 need to use a different terminal program, or reconfigure your terminal.

 For example, do

 chcp 1252

 and select Lucida Console as the terminal font, then try again.

 Of course, this will cause *different* characters to become
 non-displayable.


 Well,


Whoops. Didn't mean to hit send there. I was going to say, you can't have
everything when Microsoft is only willing to break the programs that average
people are going to use on a daily basis. I mean, why would they do
something nice for the international community at the expense of breaking
some 20-year-old batch scripts? Those were the only things that still worked
when Vista first came out.




 Regards,
 Martin

