I'm seeing the same problem with the BBC and Sci-Fi (http://www.scifi.com/handheld/).
----- Original Message -----
From: "John Albrecht" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, August 14, 2002 5:00 PM
Subject: Other bugs ive noticed in plucker desktop

> I'm getting the same problem with max depth on Wired and BBC News, but I'm
> also getting a problem with BBC recipes.
> I won't starve, as I've got all the recipes plucked using the old PyPlucker,
> but I can't seem to pluck any of the BBC recipe pages using Plucker Desktop
> and the new PyPlucker.
>
> PyPlucker crashes out with the following message:
>
> Processing http://www.bbc.co.uk/food/recipes/print/.....brown_7685.html...
> Retrieved ok.
> Error: Unknown error parsing document
> http://www.bbc.co.uk/food/recipes/print/1/T/triplechocolatebrown_7685.html:
> Traceback (innermost last):
>   File "C:\Program Files\Plucker\PyPlucker\Parser.py", line 27, in generic_parser
>     parser = TextParser.StructuredHTMLParser (url, data, headers, config, attributes)
>   File "C:\Program Files\Plucker\PyPlucker\TextParser.py", line 875, in __init__
>     self.feed (text)
>   File "C:\Program Files\Plucker\Python\lib\sgmllib.py", line 83, in feed
>     self.goahead(0)
>   File "C:\Program Files\Plucker\Python\lib\sgmllib.py", line 113, in goahead
>     k = self.parse_starttag(i)
>   File "C:\Program Files\Plucker\Python\lib\sgmllib.py", line 258, in parse_starttag
>     self.finish_starttag(tag, attrs)
>   File "C:\Program Files\Plucker\Python\lib\sgmllib.py", line 292, in finish_starttag
>     self.handle_starttag(tag, method, attrs)
>   File "C:\Program Files\Plucker\PyPlucker\TextParser.py", line 958, in handle_starttag
>     sgmllib.SGMLParser.handle_starttag(self, tag, method, attrs)
>   File "C:\Program Files\Plucker\Python\lib\sgmllib.py", line 332, in handle_starttag
>     method(attrs)
>   File "C:\Program Files\Plucker\PyPlucker\TextParser.py", line 1128, in do_meta
>     ctype, parameters = parse_http_header_value(data[1][1])
> IndexError: list index out of range
> Parsing failed.
>
> The other thing I've noticed in Plucker Desktop: when plucking a site from a
> website URL, the URL pattern filter, stay on host, and stay on domain options
> don't seem to do anything.
> (Wired News again, plucking from a website URL, pluck level 5, stay on host
> set.)
>
> Having said that, Plucker is great, and betas are beta for a reason :), keep
> up the good work.
>
> John
>
> PS. Running Windows 2000, Plucker Desktop 1.2.0
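
For what it's worth, the traceback ends in do_meta at data[1][1], which looks like it takes the second attribute of the <meta> tag by position. That is only my reading of the traceback, and I have not checked the TextParser.py source. If it is right, a <meta> tag with a single attribute (or with its attributes in a different order) would raise exactly this IndexError. Here is a rough sketch of looking the attributes up by name instead. The function name parse_content_type is made up for illustration and is not part of Plucker, and the (name, value) attribute list is an assumption about what the handler receives.

```python
def parse_content_type(attrs):
    """Return (content_type, params) for a Content-Type <meta> tag, else None.

    `attrs` is assumed to be a list of (name, value) pairs, the shape an
    SGML/HTML parser normally hands to a tag handler. Hypothetical helper,
    not Plucker's actual code.
    """
    # Look attributes up by name so order and count do not matter.
    attr_map = {name.lower(): value for name, value in attrs}

    # Only <meta http-equiv="Content-Type" ...> is of interest here.
    if (attr_map.get("http-equiv") or "").lower() != "content-type":
        return None

    content = attr_map.get("content")
    if not content:
        return None

    # Split "text/html; charset=iso-8859-1" into the type and its parameters.
    parts = [part.strip() for part in content.split(";")]
    ctype = parts[0].lower()
    params = {}
    for part in parts[1:]:
        if "=" in part:
            key, value = part.split("=", 1)
            params[key.strip().lower()] = value.strip().strip('"')
    return ctype, params
```

With this shape, a tag such as <meta name="keywords"> simply returns None instead of blowing up, so the parser can skip it and carry on with the rest of the page.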

