subject:"Special chars with HTMLParser"

Re: Special chars with HTMLParser

2009-08-07 Thread Stefan Behnel

Fafounet wrote:
> I am parsing a web page with special chars such as &#xE9; (which stands for é). I know I can get the unicode character é from unicode('\xe9', 'iso-8859-1'), but with those extra characters I don't know what to do. I tried to implement handle_charref within HTMLParser without success.

Special chars with HTMLParser

2009-08-05 Thread Fafounet

Hello, I am parsing a web page with special chars such as &#xE9; (which stands for é). I know I can get the unicode character é from unicode('\xe9', 'iso-8859-1'), but with those extra characters I don't know what to do. I tried to implement handle_charref within HTMLParser without success. Furthermore, if I ...

Re: Special chars with HTMLParser

2009-08-05 Thread Piet van Oostrum

Fafounet fafou...@gmail.com wrote:
> Hello, I am parsing a web page with special chars such as &#xE9; (which stands for é). I know I can get the unicode character é from unicode('\xe9', 'iso-8859-1'), but with those extra characters I don't know what to do. I tried to implement handle_charref ...

Re: Special chars with HTMLParser

2009-08-05 Thread Fafounet

Thank you, now I can get the correct character. Now when I have the string ab&#xE9;cd I can get ab, then é thanks to your function, and then cd. But how is it possible to know that cd is still part of the same word? Fabien
> The character references indicate Unicode ordinals, not iso-8859-1 characters.
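The point that a character reference is a Unicode ordinal can be sketched as follows. This is a minimal illustration, not the function posted in the thread: it targets Python 3's html.parser rather than the Python 2 HTMLParser module used in 2009, and the class name CharRefParser and its pieces attribute are hypothetical.

```python
from html.parser import HTMLParser


class CharRefParser(HTMLParser):
    """Collects text and decoded character references in arrival order."""

    def __init__(self):
        # convert_charrefs=False keeps handle_charref() in play; with the
        # Python 3 default (True) the parser decodes references itself.
        super().__init__(convert_charrefs=False)
        self.pieces = []  # hypothetical attribute, for illustration only

    def handle_data(self, data):
        self.pieces.append(data)

    def handle_charref(self, name):
        # name is "xE9" for &#xE9; and "233" for &#233;
        if name[0] in "xX":
            codepoint = int(name[1:], 16)
        else:
            codepoint = int(name)
        # The number is a Unicode ordinal, so chr() maps it directly.
        self.pieces.append(chr(codepoint))


parser = CharRefParser()
parser.feed("ab&#xE9;cd")
parser.close()
print("".join(parser.pieces))  # abécd
```

No iso-8859-1 decoding step is involved: the hexadecimal value E9 is used as the code point itself.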

Re: Special chars with HTMLParser

2009-08-05 Thread Piet van Oostrum

Fafounet fafou...@gmail.com wrote:
> Thank you, now I can get the correct character. Now when I have the string ab&#xE9;cd I can get ab, then é thanks to your function, and then cd. But how is it possible to know that cd is still part of the same word?
That depends on your definition of `word'.
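One possible definition, assumed here purely for illustration, is that a word is a run of non-whitespace characters. Under that assumption, the parser can append every text chunk and decoded reference to one buffer and split only at the end, so ab, é and cd stay together. The sketch below uses Python 3's html.parser; WordCollector, buffer and words() are hypothetical names, not something from the thread.

```python
from html.parser import HTMLParser


class WordCollector(HTMLParser):
    """Buffers all text so that words are split only on whitespace."""

    def __init__(self):
        # Disable automatic decoding so handle_charref() is called.
        super().__init__(convert_charrefs=False)
        self.buffer = []

    def handle_data(self, data):
        self.buffer.append(data)

    def handle_charref(self, name):
        # "xE9" is hexadecimal, "233" is decimal; both are Unicode ordinals.
        if name[0] in "xX":
            self.buffer.append(chr(int(name[1:], 16)))
        else:
            self.buffer.append(chr(int(name)))

    def words(self):
        # Assumed definition: a word is a whitespace-delimited run.
        return "".join(self.buffer).split()


collector = WordCollector()
collector.feed("<p>ab&#xE9;cd efg</p>")
collector.close()
print(collector.words())  # ['abécd', 'efg']
```

A different definition of word, for example one where tags or punctuation end a word, would need the split to happen in the tag handlers or with a regular expression instead.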

5 matches
