I'm getting the same max depth problem with Wired and BBC News, but I'm also getting a problem with BBC recipes. I won't starve, as I've got all the recipes plucked using the old PyPlucker, but I can't seem to pluck any of the BBC recipes pages using Plucker Desktop and the new PyPlucker.
PyPlucker crashes out with the following message:

Processing http://www.bbc.co.uk/food/recipes/print/.....brown_7685.html...
Retrieved ok.
Error: Unknown error parsing document http://www.bbc.co.uk/food/recipes/print/1/T/triplechocolatebrown_7685.html:
Traceback (innermost last):
  File "C:\Program Files\Plucker\PyPlucker\Parser.py", line 27, in generic_parser
    parser = TextParser.StructuredHTMLParser (url, data, headers, config, attributes)
  File "C:\Program Files\Plucker\PyPlucker\TextParser.py", line 875, in __init__
    self.feed (text)
  File "C:\Program Files\Plucker\Python\lib\sgmllib.py", line 83, in feed
    self.goahead(0)
  File "C:\Program Files\Plucker\Python\lib\sgmllib.py", line 113, in goahead
    k = self.parse_starttag(i)
  File "C:\Program Files\Plucker\Python\lib\sgmllib.py", line 258, in parse_starttag
    self.finish_starttag(tag, attrs)
  File "C:\Program Files\Plucker\Python\lib\sgmllib.py", line 292, in finish_starttag
    self.handle_starttag(tag, method, attrs)
  File "C:\Program Files\Plucker\PyPlucker\TextParser.py", line 958, in handle_starttag
    sgmllib.SGMLParser.handle_starttag(self, tag, method, attrs)
  File "C:\Program Files\Plucker\Python\lib\sgmllib.py", line 332, in handle_starttag
    method(attrs)
  File "C:\Program Files\Plucker\PyPlucker\TextParser.py", line 1128, in do_meta
    ctype, parameters = parse_http_header_value(data[1][1])
IndexError: list index out of range
Parsing failed.

The other thing I've noticed in Plucker Desktop: when plucking a site from a website URL, the URL pattern filter, stay on host, and stay on domain don't seem to do anything (Wired News again, plucking from a website URL, pluck level 5, stay on host set).

Having said that, Plucker is great, and betas are beta for a reason :). Keep up the good work.

John

PS. Running Windows 2000, Plucker Desktop 1.2.0.
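
A guess at the crash, going only by the last frame of the traceback: do_meta seems to read the second attribute of a <meta> tag by position (data[1][1]), so a tag with fewer than two attributes would raise the IndexError. I have not checked this against the page source or the real PyPlucker code. Below is a minimal sketch of looking the attribute up by name instead. The names safe_meta_content_type and split_header_value are made up for illustration and are not PyPlucker functions; the attribute list is assumed to be the (name, value) pairs that sgmllib hands to tag handlers.

```python
def split_header_value(value):
    """Split 'text/html; charset=x' into ('text/html', [('charset', 'x')]).

    Hypothetical helper for illustration only; it is not PyPlucker's
    parse_http_header_value.
    """
    parts = [part.strip() for part in value.split(";")]
    ctype = parts[0].lower()
    params = []
    for part in parts[1:]:
        if "=" in part:
            key, val = part.split("=", 1)
            params.append((key.strip().lower(), val.strip().strip('"')))
    return ctype, params


def safe_meta_content_type(attrs):
    """Return (ctype, params) for a Content-Type <meta> tag, else None.

    attrs is assumed to be a list of (name, value) pairs. Attributes are
    looked up by name, so a short or reordered list cannot raise IndexError.
    """
    lookup = {name.lower(): value for name, value in attrs if value is not None}
    if lookup.get("http-equiv", "").lower() != "content-type":
        return None
    content = lookup.get("content")
    if content is None:
        return None
    return split_header_value(content)


# A <meta> tag with a single attribute: positional access would fail here.
one_attribute = [("http-equiv", "Content-Type")]
result_short = safe_meta_content_type(one_attribute)

# A complete tag, with the attributes in the less common order.
full_tag = [("content", "text/html; charset=iso-8859-1"),
            ("http-equiv", "Content-Type")]
result_full = safe_meta_content_type(full_tag)
```

With this approach a <meta> tag that is not a Content-Type declaration, or that is missing its content attribute, is simply skipped rather than aborting the whole parse.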