Problem with certain site

Verlorene Seele Thu, 13 Feb 2003 13:29:47 -0800

Hello!

I would like to make the following site browsable offline, but I get
errors (using the Windows Plucker Desktop 1.2.0.3):
http://animexx.4players.de/news.phtml


I know this page does _not_ conform to W3C standards; I tested it with
http://validator.w3.org/check?uri=http%3A%2F%2Fwww.animexx.de%2Fnews.phtml

Is it possible to run the Python parser in a "compatibility" mode
for this web page?

I would appreciate any suggestions on how to handle this case.
RTFM: If the manual says anything about this, please tell me and I will
read it (a small hint is enough, e.g. a word to search for).

Thank you for your time, greetings from Germany,
Stephen D. Leedle AKA Verlorene Seele

These are the error messages which are produced:

> Initializing Plucker spidering engine...
>  
> -----------------------------------------------------------
> Updating channel: animexx...
> -----------------------------------------------------------
> Pluckerdir is 'H:\Program Files\Plucker'...
> ZLib compression turned on
> Using exclusion list H:\Program Files\Plucker\exclusionlist.txt
> Using exclusion list H:\Program Files\Plucker\exclusionlist.txt
> ---- 0 collected, 1 to do ----
> Processing http://animexx.4players.de/news.phtml...
>   Retrieved ok.
> Error:  Unknown error parsing document http://animexx.4players.de/news.phtml:
> Traceback (most recent call last):
>   File "H:\Program Files\Plucker\PyPlucker\Parser.py", line 27, in generic_parser
>     parser = TextParser.StructuredHTMLParser (url, data, headers, config, attributes)
>   File "H:\Program Files\Plucker\PyPlucker\TextParser.py", line 896, in __init__
>     self.feed (text)
>   File "H:\Program Files\Plucker\python\lib\sgmllib.py", line 91, in feed
>     self.goahead(0)
>   File "H:\Program Files\Plucker\python\lib\sgmllib.py", line 121, in goahead
>     k = self.parse_starttag(i)
>   File "H:\Program Files\Plucker\python\lib\sgmllib.py", line 311, in parse_starttag
>     self.finish_starttag(tag, attrs)
>   File "H:\Program Files\Plucker\python\lib\sgmllib.py", line 349, in finish_starttag
>     self.handle_starttag(tag, method, attrs)
>   File "H:\Program Files\Plucker\PyPlucker\TextParser.py", line 986, in handle_starttag
>     sgmllib.SGMLParser.handle_starttag(self, tag, method, attrs)
>   File "H:\Program Files\Plucker\python\lib\sgmllib.py", line 385, in handle_starttag
>     method(attrs)
>   File "H:\Program Files\Plucker\PyPlucker\TextParser.py", line 1394, in start_font
>     self._doc.set_forecolor (rgb)
>   File "H:\Program Files\Plucker\PyPlucker\TextParser.py", line 484, in set_forecolor
>     elif value[0] == '#':
> IndexError: string index out of range
>   Parsing failed.
> ---- all 0 pages retrieved and parsed ----
> Writing out collected data...
> Writing document 'animexx' to file H:\Program Files\Plucker\channels/animexx/animexx.pdb
> Traceback (most recent call last):
>   File "H:\Program Files\Plucker\PyPlucker\Spider.py", line 1577, in ?
>     sys.exit(realmain())
>   File "H:\Program Files\Plucker\PyPlucker\Spider.py", line 1569, in realmain
>     retval = main (config, exclusion_lists)
>   File "H:\Program Files\Plucker\PyPlucker\Spider.py", line 1084, in main
>     mapping = writer.write (verbose=verbosity, alias_list=alias_list)
>   File "H:\Program Files\Plucker\PyPlucker\Writer.py", line 518, in write
>     result = Writer.write (self, verbose, alias_list=alias_list)
>   File "H:\Program Files\Plucker\PyPlucker\Writer.py", line 337, in write
>     raise RuntimeError("The collection process failed to generate a 'home' document")
> RuntimeError: The collection process failed to generate a 'home' document
> Installing channel output to destinations...
> Setting new due dates...
> Tasks completed for all channels.
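
Reading the first traceback: the last frame is start_font calling
set_forecolor, which fails on value[0] with "string index out of range".
That suggests the color value was an empty string, as it would be for a
tag like <font color="">, although I have not confirmed that the page
contains one. The second traceback (no 'home' document) looks like a
consequence of the first, since 0 pages were parsed. Below is a minimal
sketch of a color parser that tolerates empty or unusable values. The
name parse_font_color is hypothetical and is not part of PyPlucker; it
only illustrates the kind of guard that would avoid the IndexError.

```python
import re

# Hypothetical helper for illustration only; not PyPlucker code.
# Accepts six hex digits with an optional leading '#'.
_HEX_COLOR = re.compile(r'^#?([0-9a-fA-F]{6})$')


def parse_font_color(value):
    """Return an (r, g, b) tuple, or None if the value is empty or unusable."""
    if value is None:
        return None
    value = value.strip()
    if not value:
        # An empty attribute such as <font color=""> ends up here.
        # Indexing value[0] at this point would raise IndexError.
        return None
    match = _HEX_COLOR.match(value)
    if match is None:
        # Named colors and malformed values are simply ignored in this sketch.
        return None
    digits = match.group(1)
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))
```

With a guard like this, a caller could skip setting the foreground color
whenever None comes back, so one bad tag would not abort parsing of the
whole page.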

_______________________________________________
plucker-list mailing list
[EMAIL PROTECTED]
http://lists.rubberchicken.org/mailman/listinfo/plucker-list
