The "type error: function takes exactly 5 arguments (1 given)" crash I
was getting a couple weeks ago has returned today, and I think I've
narrowed down the data that is causing it to happen.
As before, the crash occurs in some internal TG functions for KID
processing (none of which I have ever changed at all) toward the
latter part of the call to my "get_feed_data" function in my
FeedControllers module, ending with the following lines (full set of
lines posted previously in this thread so I won't repeat them here):
...
\parser.py", line 227, in _coalesce
text += to_unicode(value, encoding)
File "c:\python24\lib\site-packages\kid-0.9.5-py2.4.egg\kid\parser.py", line 204, in to_unicode
return unicode(value, encoding)
type error: function takes exactly 5 arguments (1 given)
Processing the following snippet of XML seems to cause the crash.
It's a portion of an RSS news feed, and when I put some debug
conditional processing to skip over this snippet then the crash does
not happen.
I am guessing that the offending character is the "&#151;" character
reference following the word "education".
<item>
<title>Review finds nutrition education failing (AP)</title>
<link>http://us.rd.yahoo.com/dailynews/rss/health/*http://news.yahoo.com/s/ap/20070704/ap_on_he_me/failing_to_fight_fat</link>
<guid isPermaLink="false">ap/20070704/failing_to_fight_fat</guid>
<pubDate>Wed, 04 Jul 2007 21:06:50 GMT</pubDate>
<description>AP - The federal government will spend more than $1
billion this year on nutrition education &#151; fresh carrot and
celery snacks, videos of dancing fruit, hundreds of hours of lively
lessons about how great you will feel if you eat well.</description>
</item>
I notice that in a browser the "&#151;" reference is displayed as a
long dash.
Can anyone please suggest a work-around I can add to my
"get_feed_data" function, so that when an external RSS feed I process
happens to contain a character like this, the function can recover and
proceed without crashing?
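One possible work-around, offered only as a sketch: rewrite numeric
character references in the 128-159 range to the code points that
Windows-1252 assigns to them before the feed text reaches the template.
This assumes the "&#151;" reference really is what triggers the crash,
which is not confirmed. The helper name fix_windows_charrefs is
hypothetical, and the code uses Python 3 syntax; on Python 2.4 the
decode step would presumably be chr(code).decode("cp1252") instead.

```python
import re

# Matches decimal numeric character references such as "&#151;".
_CHARREF = re.compile(r"&#(\d+);")


def fix_windows_charrefs(xml_text):
    """Rewrite &#128;..&#159; to the code points Windows-1252 gives them.

    Hypothetical helper: references outside 128-159 are left untouched.
    """
    def _replace(match):
        code = int(match.group(1))
        if 128 <= code <= 159:
            try:
                ch = bytes([code]).decode("cp1252")
            except UnicodeDecodeError:
                # A few positions (e.g. 129) are unassigned in cp1252;
                # this sketch simply drops those references.
                return ""
            return "&#%d;" % ord(ch)
        return match.group(0)

    return _CHARREF.sub(_replace, xml_text)


snippet = "nutrition education &#151; fresh carrot and celery snacks"
print(fix_windows_charrefs(snippet))
```

The function would be applied to each feed's raw text (or to each
description) inside "get_feed_data" before any further processing.
Wrapping the per-item processing in a try/except that logs and skips a
bad item would be a complementary safeguard.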
Thanks much in advance for any help. This problem is beyond the outer
edge of my Python expertise, but I hope the solution can help me
advance that a bit and also make my project perform much more
reliably.
Researching this a little, I found the following discussion of
character codes 128-159, a range that includes 151, so maybe it has
something to do with this problem?
http://www.cs.tut.fi/~jkorpela/chars.html#win
"In the Windows character set, some positions in the range 128 - 159
are assigned to printable characters, such as "smart quotes", em dash,
en dash, and trademark symbol. Thus, the character repertoire is
larger than ISO Latin 1. The use of octets in the range 128 - 159 in
any data to be processed by a program that expects ISO 8859-1 encoded
data is an error which might cause just anything. They might for
example get ignored, or be processed in a manner which looks
meaningful, or be interpreted as control characters. See my document
On the use of some MS Windows characters in HTML for a discussion of
the problems of using these characters."
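The difference described in that quote can be checked directly. The
short sketch below (Python 3 syntax, which is an assumption, since the
traceback above comes from Python 2.4) decodes the single byte 151 both
ways: as ISO 8859-1 it lands on a control character, while as
Windows-1252 it is the em dash that browsers display.

```python
import unicodedata

raw = bytes([151])  # the octet behind "&#151;"

as_latin1 = raw.decode("latin-1")  # ISO 8859-1 interpretation
as_cp1252 = raw.decode("cp1252")   # Windows-1252 interpretation

# Under ISO 8859-1 the result is U+0097, a C1 control character.
print(hex(ord(as_latin1)), unicodedata.category(as_latin1))

# Under Windows-1252 the result is U+2014, the em dash.
print(hex(ord(as_cp1252)), unicodedata.name(as_cp1252))
```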