Bugs item #1574672, was opened at 2006-10-10 19:12
Message generated for change (Comment added) made by duncanwebb
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=446895&aid=1574672&group_id=46652

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: other
Group: 1.x svn
Status: Open
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Richard van Paasen (rvpaasen)
Assigned to: Nobody/Anonymous (nobody)
Summary: Headlines crashes

Initial Comment:
The headlines plugin crashes on my box with:

Traceback (most recent call last):
  File "/home/freevotest/freevo/src/main.py", line 321, in eventhandler
    app.eventhandler(event)
  File "/home/freevotest/freevo/src/menu.py", line 605, in eventhandler
    action( arg=arg, menuw=self )
  File "/home/freevotest/freevo/src/plugins/headlines.py", line 221, in getheadlines
    description = util.htmlenties2txt(description)
  File "/home/freevotest/freevo/src/util/misc.py", line 405, in htmlenties2txt
    string = string.replace(entity, replacement)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xad in position 0: ordinal not in range(128)


It does this for the RSS feed:
  http://tweakers.net/feeds/mixed.xml

It does not always crash (it did today and two days ago), so the
encoding of characters may be buggy.



----------------------------------------------------------------------

>Comment By: Duncan Webb (duncanwebb)
Date: 2006-11-27 18:08

Message:
Logged In: YES 
user_id=104395
Originator: NO

I figured it out.

I've applied the change to rel-1-6 at r8668 and rel-1 at r8667.

I'm just wondering if we should be using the Unicode or String call, or
are these calls always latin-1 and never latin-2, etc?
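
On the latin-1 versus latin-2 question: as far as I can tell from the
htmlentitydefs documentation, entitydefs is always expressed in ISO Latin-1,
so the decode("latin-1") call should not depend on the feed's own encoding.
An alternative that sidesteps byte encodings entirely is to go through
name2codepoint. The following is only a sketch, written for Python 3 (where
the module is html.entities and unichr is chr); the function name
entities_to_text and the regular expression are illustrative assumptions,
not Freevo's actual code.

```python
import re
from html.entities import name2codepoint  # Python 2: htmlentitydefs

# Hypothetical pattern for named entities such as &eacute; or &shy;
ENTITY_RE = re.compile(r"&(\w+);")


def entities_to_text(text):
    """Replace known named HTML entities with their Unicode characters."""
    def substitute(match):
        name = match.group(1)
        if name in name2codepoint:
            # Python 2 equivalent: unichr(name2codepoint[name])
            return chr(name2codepoint[name])
        # Leave unknown entities untouched
        return match.group(0)
    return ENTITY_RE.sub(substitute, text)


converted = entities_to_text("caf&eacute; &shy; &bogus;")
```

Because the lookup yields code points rather than encoded bytes, no
implicit ASCII conversion can occur during the replacement.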

----------------------------------------------------------------------

Comment By: Michael Droettboom (mdboom)
Date: 2006-11-27 18:03

Message:
Logged In: YES 
user_id=119312
Originator: NO

Yes.  Sorry.  The line to replace is the one where it crashes in the
traceback in the original post:

string = string.replace(entity, replacement)

should be:

string = string.replace(entity, replacement.decode("latin-1"))


----------------------------------------------------------------------

Comment By: Duncan Webb (duncanwebb)
Date: 2006-11-27 17:59

Message:
Logged In: YES 
user_id=104395
Originator: NO

I've made some changes to misc.py, so line 405 is no longer the same line,
if you see what I mean. :-)

What would help is either the original line or the revision number of
misc.py.

Is this information available to you?

----------------------------------------------------------------------

Comment By: Michael Droettboom (mdboom)
Date: 2006-11-27 16:40

Message:
Logged In: YES 
user_id=119312
Originator: NO

This is crashing because the value of "replacement" (which comes from
Python stdlib's htmlentitydefs) is encoded in "latin-1", but the implicit
conversion to unicode here assumes 'ascii', and it chokes on bytes above
127.

See http://www.python.org/doc/lib/module-htmlentitydefs.html

(Sorry, I don't have access to the source to provide a patch here,
but...)

An easy fix for this is to replace line 405 in src/util/misc.py with

string = string.replace(entity, replacement.decode("latin-1"))
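
To illustrate the failure and the fix, here is a small sketch. It is written
for Python 3 and only emulates the Python 2 behaviour by decoding bytes
explicitly; REPLACEMENT and replace_entity are made-up names for this demo,
and the byte 0xAD is taken from the error message in the original report.

```python
# Sketch only: emulates Python 2's implicit str -> unicode conversion
# using explicit bytes.decode() calls in Python 3.

# Assumed stand-in for the Latin-1 encoded replacement value
# (byte 0xAD, as reported in the traceback).
REPLACEMENT = b"\xad"


def replace_entity(text, entity, replacement_bytes, encoding):
    """Decode the replacement bytes with `encoding`, then substitute."""
    return text.replace(entity, replacement_bytes.decode(encoding))


text = "soft&shy;hyphen"

# What the implicit conversion effectively did: decode as ASCII and fail.
try:
    replace_entity(text, "&shy;", REPLACEMENT, "ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# The proposed fix: decode the replacement as Latin-1 first.
fixed = replace_entity(text, "&shy;", REPLACEMENT, "latin-1")
```

With the ASCII codec the decode raises UnicodeDecodeError, while the Latin-1
decode maps the byte to U+00AD (soft hyphen) and the replacement goes through.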

----------------------------------------------------------------------

Comment By: Duncan Webb (duncanwebb)
Date: 2006-10-10 21:15

Message:
Logged In: YES 
user_id=104395

I've added this:
HEADLINES_LOCATIONS = [
    (u'Tweakers.net', 'http://tweakers.net/feeds/mixed.xml'),
]
And sure enough I can reproduce the error and will try to
find out what is causing this.

----------------------------------------------------------------------

Comment By: Richard van Paasen (rvpaasen)
Date: 2006-10-10 20:58

Message:
Logged In: YES 
user_id=182311

The u"" trick does not work. The problem is probably located
in the content of the RSS feed. Did you try out the feed
http://tweakers.net/feeds/mixed.xml ? If you do that now,
you'll see what I mean.


----------------------------------------------------------------------

Comment By: Duncan Webb (duncanwebb)
Date: 2006-10-10 20:41

Message:
Logged In: YES 
user_id=104395

I think that there is a simple fix for this. Can you try
putting the title into unicode, so that you have
(u"title", "link"),

eg, before:
HEADLINES_LOCATIONS = [
    ("BBC Front Page",
     "http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml"),
]
after:
HEADLINES_LOCATIONS = [
    (u"BBC Front Page",
     "http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml"),
]

This works for TV and Radio channels so I guess that it will
work for headlines too.

----------------------------------------------------------------------


_______________________________________________
Freevo-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/freevo-devel
