Re: [python] Reading Czech text from Excel

Pavel Kosina Mon, 29 May 2006 08:08:08 -0700

Aha, so it is probably (certainly) already a unicode string. An example:

# -*- coding: cp1250 -*-
x = u"Žluťoučký kůň pěl ďábelské ódy."
# print unicode(x, "cp852")  # raises TypeError: decoding Unicode is not supported
print x.encode("cp852")      # prints correctly

So you have to encode it. The error you mentioned earlier most likely came up when you tried to convert a unicode character (one that was in Excel) to your encoding, which had no matching equivalent for it. That should be solvable with the additional errors parameter of .encode():

encode( [encoding[,errors]]): Return an encoded version of the string. Default encoding is the current default string encoding. errors may be given to set a different error handling scheme. The default for errors is 'strict', meaning that encoding errors raise a UnicodeError. Other possible values are 'ignore', 'replace', 'xmlcharrefreplace', 'backslashreplace' and any other name registered via codecs.register_error. For a list of possible encodings, see section 4.9.2. New in version 2.0. Changed in version 2.3: Support for 'xmlcharrefreplace' and 'backslashreplace' and other error handling schemes added.
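
To see what the individual error handlers do, here is a minimal sketch. It is written so that it also runs on Python 3; the sample string and the choice of the euro sign (which cp852 has no slot for) are my own assumptions for illustration, not something taken from the Excel sheet in question.

```python
# Sample text: the Czech letters exist in cp852, the euro sign does not.
text = u"kůň 5 \u20ac"

# The default handler ('strict') refuses to encode the euro sign.
strict_failed = False
try:
    text.encode("cp852")
except UnicodeEncodeError:
    strict_failed = True

# The other handlers trade the missing character for something else.
ignored = text.encode("cp852", "ignore")             # character dropped
replaced = text.encode("cp852", "replace")           # character becomes '?'
escaped = text.encode("cp852", "backslashreplace")   # character becomes \u20ac
xml = text.encode("cp852", "xmlcharrefreplace")      # character becomes &#8364;

for name, data in [("ignore", ignored), ("replace", replaced),
                   ("backslashreplace", escaped), ("xmlcharrefreplace", xml)]:
    # Decode again only to display the result readably.
    print(name + ": " + data.decode("cp852"))
```

'replace' is probably the most practical choice for console output, because the question mark at least shows where a character was lost.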

Some characters will simply be missing, but most of the text is better than nothing. Good luck, Czech is often a real pain ;-)

-- 
geon
Pavel Kosina

Martin Jedlička wrote:

Yeah, thanks... I tried this, but now it tells me:
TypeError: decoding Unicode is not supported


Martin

Pavel Kosina wrote:

Martin Jedlička wrote:

Hello,
  I am working with Excel through win32com and have a problem with Czech text.
When the text in Excel is in Czech, reading it raises this error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\u010c' in position 0: ordinal not in range(128)
I read the article about Czech at http://www.py.cz/UnicodeEncodeError, but
I don't know how to work with Unicode once the loaded text is in some
variable. How should I work with that Czech text?

In general: first you have to find out (by trial and error if need be)
which encoding the text in the variable is stored in. Then you convert it
to all-embracing unicode, roughly like this:

x=unicode(tvuj_text, "utf-8")
or
x=unicode(tvuj_text,"cp1250")
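
The trial-and-error approach can be sketched roughly as follows. The helper name decode_guess and the list of candidate encodings are hypothetical, picked only for illustration; raw.decode(enc) does the same job as unicode(raw, enc) and also works on Python 3. This only makes sense for byte strings. A value that is already unicode needs no decoding.

```python
def decode_guess(raw, candidates=("utf-8", "cp1250", "cp852")):
    """Return (text, encoding) for the first candidate that decodes raw cleanly.

    utf-8 goes first because it is strict: random 8-bit data rarely passes
    as valid utf-8, whereas single-byte code pages accept almost anything.
    """
    for enc in candidates:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings fit")


# Example input: bytes produced from a known string, so the result can be checked.
sample = u"Žluťoučký kůň".encode("cp1250")
decoded, used = decode_guess(sample)
print(used)
```

A decode that succeeds is not proof that the guess was right, so it is worth checking the result by eye on text that contains accented letters.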

After that, printing and displaying should work. If you are going to save
it to a file, it is better to convert it to some plainer encoding, for
example back to utf-8 or cp1250. When I saved directly in unicode, I
sometimes could not open the file in editors at all.
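
For the file case, one possible sketch is to let io.open do the encoding, so the file on disk holds plain utf-8 bytes that editors can read. The file name and the temporary directory are hypothetical, used only to keep the example self-contained.

```python
import io
import os
import tempfile

text = u"Žluťoučký kůň pěl ďábelské ódy."

# Hypothetical target file in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), "ukazka.txt")

# Unicode text goes in; io.open encodes it to utf-8 while writing.
with io.open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Reading with the same encoding gives the unicode text back.
with io.open(path, "r", encoding="utf-8") as f:
    read_back = f.read()

print(read_back == text)
```

Replacing "utf-8" with "cp1250" works the same way, provided every character in the text exists in that code page.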

For Excel under XP, my guess would be utf-8...

_______________________________________________
Python mailing list
[email protected]
http://www.py.cz/mailman/listinfo/python
