Dear Jaganadh,
I also tried running html2text.py directly on each page separately. The result:

{{{
$ python html2text.py index1.htm

Traceback (most recent call last):
  File "../aaronsw-html2text-d9bf7d6/html2text.py", line 488, in <module>
    data = data.decode(encoding)
  File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x88 in position 11366: invalid start byte

}}}

Here index1.htm is the page fetched from
http://www.hindu.com/nic/kye/index1.htm
whereas with index2.htm, fetched from
http://www.hindu.com/nic/kye/index2.htm
the same command works fine.


I also tried importing the module with "from html2text import *" and calling
the function directly. The result is the same.
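
The traceback says byte 0x88 cannot start a UTF-8 sequence, which suggests that index1.htm is not UTF-8 encoded at all. Below is a minimal sketch of one possible workaround: read the file as raw bytes and try a short list of encodings in turn. The candidate list (utf-8, cp1252, latin-1) is only an assumption about what the page might use, and decode_with_fallback and read_html are hypothetical helper names, not part of html2text.

```python
import io


def decode_with_fallback(raw, encodings=("utf-8", "cp1252", "latin-1")):
    """Return (text, encoding_used) for the first codec that decodes raw cleanly."""
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            # This codec rejected some byte; move on to the next candidate.
            continue
    # Reached only if every candidate failed (cannot happen while latin-1,
    # which accepts any byte, is in the list). Decode lossily as a last resort.
    return raw.decode("utf-8", "replace"), "utf-8 (lossy)"


def read_html(path):
    """Read a file as bytes so that decoding stays under our control."""
    with io.open(path, "rb") as handle:
        return decode_with_fallback(handle.read())


# A stray 0x88 byte, like the one reported in the traceback, is rejected by
# the utf-8 codec, so the next candidate in the list is tried.
sample = b"<p>caf\x88</p>"
text, used = decode_with_fallback(sample)
print(used)
```

The decoded text could then be passed to the conversion function imported from html2text.py, rather than letting the script decode the file itself. Note that a fallback only guarantees that decoding succeeds; if the guessed encoding is wrong, some characters may still come out incorrect.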

Thanks,
Nikunj



On Sun, Apr 17, 2011 at 9:21 PM, JAGANADH G <jagana...@gmail.com> wrote:

>
>
> On Sun, Apr 17, 2011 at 9:13 PM, Nikunj Badjatya <nikunjbadja...@gmail.com> wrote:
>
>>
>> Tried with the change.
>> {{{
>> ...
>> ...
>> -  myunistr = smart_str(fetch)
>>
>> + myunistr = smart_str(fetch.read())
>> ...
>> ...
>> }}}
>>
>> Output:
>>
>> {{{
>> Traceback (most recent call last):
>>   File "html2text.py", line 447, in <module>
>>     data = open(arg, 'r').read().decode(encoding)
>>   File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in decode
>>     return codecs.utf_8_decode(input, errors, True)
>> UnicodeDecodeError: 'utf8' codec can't decode byte 0x88 in position 11366: invalid start byte
>> }}}
>>
>> This is the same error as before.
>>
>>
>
>
> I think the error is coming from this line:
> os.system('python2.6 html2text.py main.html > main.txt')
>
> Instead of calling os.system, try importing the relevant function from
> html2text.py into your program.
>
>
>
> --
> **********************************
> JAGANADH G
> http://jaganadhg.freeflux.net/blog
> *ILUGCBE*
> http://ilugcbe.techstud.org
>
>
_______________________________________________
BangPypers mailing list
BangPypers@python.org
http://mail.python.org/mailman/listinfo/bangpypers
