[EMAIL PROTECTED] wrote:
Today's Topics:
1. Plucker 1.2 Parser Error (Michael A. Lees)
--__--__--
Message: 1
Date: Sat, 30 Aug 2003 23:16:49 -0300
From: "Michael A. Lees" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: Plucker 1.2 Parser Error
Reply-To: [EMAIL PROTECTED]
Friday, August 29, 2003, 20:23
Dear Friends,
I've been trying to convert a set of HTML documents on my PC, but I keep getting a strange error at the end. There are about 50 HTML files, all reachable from an index page. Since I did not know the correct link depth between them, I guessed 10, and the final message says "3165 parsed".
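Rather than guessing the link depth, one option is to measure it. The following is a minimal sketch, not part of Plucker: it assumes the pages are available as a dictionary mapping a relative file name to its HTML text, and the names `LinkCollector` and `link_depths` are hypothetical. It walks the links breadth-first from the index and records how many hops each page is from the start.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def link_depths(pages, start):
    """Return {url: hops from start} for every page reachable from start.

    pages is an assumed in-memory mapping of relative URL -> HTML text.
    Links that point outside this mapping are ignored.
    """
    depths = {start: 0}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        parser = LinkCollector()
        parser.feed(pages[url])
        for href in parser.links:
            # Resolve relative links and drop any "#fragment" part,
            # so that page.html#a and page.html#b count as one page.
            target, _fragment = urldefrag(urljoin(url, href))
            if target in pages and target not in depths:
                depths[target] = depths[url] + 1
                queue.append(target)
    return depths


# Tiny made-up example set, for illustration only.
pages = {
    "index.html": '<a href="ch1.html">1</a> <a href="ch1.html#intro">intro</a>',
    "ch1.html": '<a href="ch2.html">next</a> <a href="index.html">home</a>',
    "ch2.html": '<a href="http://example.com/">external</a>',
}
depths = link_depths(pages, "index.html")
print(max(depths.values()))
```

The largest value in `depths` is the smallest depth setting that still reaches every local page. Whether Plucker counts depth in exactly the same way is an assumption here.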
The problem seems to occur when the parsed documents are being written out to the PDB file: a Python exception is raised and the PDB is never created.
I am using Plucker Desktop 1.2.01, and I have never had this problem before, even with very large sets of documents.
Can anyone help? The whole error message is as follows:
--- start of error log ---
---- all 3165 pages retrieved and parsed ----
Writing out collected data...
Writing document 'Thinking in Java, 3rd Ed.' to file C:\Arquivos de programas\Plucker\channels/ThinkinginJava3rdEd/ThinkinginJava3rdEd.pdb
Traceback (most recent call last):
  File "C:\Arquivos de programas\Plucker\PyPlucker\Spider.py", line 1532, in ?
    sys.exit(realmain())
  File "C:\Arquivos de programas\Plucker\PyPlucker\Spider.py", line 1524, in realmain
    retval = main (config, exclusion_lists)
  File "C:\Arquivos de programas\Plucker\PyPlucker\Spider.py", line 1046, in main
    mapping = writer.write (verbose=verbosity, alias_list=alias_list)
  File "C:\Arquivos de programas\Plucker\PyPlucker\Writer.py", line 518, in write
    result = Writer.write (self, verbose, alias_list=alias_list)
  File "C:\Arquivos de programas\Plucker\PyPlucker\Writer.py", line 310, in write
    self._mapper = Mapper(self._collection, alias_list.as_dict())
  File "C:\Arquivos de programas\Plucker\PyPlucker\Writer.py", line 102, in __init__
    self._get_id_for_doc(doc)
  File "C:\Arquivos de programas\Plucker\PyPlucker\Writer.py", line 112, in _get_id_for_doc
    id = self._url_to_id_mapping.get(doc.get_url())
AttributeError: 'None' object has no attribute 'get_url'
Installing channel output to destinations...
Setting channels new due date
Tasks completed for all channels.
--- end of error log ---
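The last frame of the traceback shows `get_url()` being called on `None`, which suggests that at least one entry in the collection of parsed documents is empty. The sketch below illustrates that failure mode and a defensive guard. It is not the PyPlucker code: `Doc`, `build_id_mapping`, the dictionary-shaped collection, and the file names are all hypothetical stand-ins, written for a current Python 3.

```python
class Doc:
    """Hypothetical stand-in for a parsed document (not a PyPlucker class)."""

    def __init__(self, url):
        self._url = url

    def get_url(self):
        return self._url


def build_id_mapping(collection, first_id=1):
    """Assign a numeric id to each document URL, skipping empty entries.

    collection is assumed to map a key to a Doc, or to None when the
    page could not be turned into a document.
    Returns (url -> id mapping, list of skipped keys).
    """
    mapping = {}
    skipped = []
    for key, doc in collection.items():
        if doc is None:
            # Without this check, doc.get_url() raises AttributeError,
            # which is the kind of failure shown in the log above.
            skipped.append(key)
            continue
        url = doc.get_url()
        if url not in mapping:
            mapping[url] = first_id + len(mapping)
    return mapping, skipped


# Made-up collection with one entry that failed to parse.
collection = {
    "index.html": Doc("index.html"),
    "chapter1.html": Doc("chapter1.html"),
    "broken.html": None,
}
mapping, skipped = build_id_mapping(collection)
print("mapped:", mapping)
print("skipped:", skipped)
```

Reporting the skipped keys, rather than silently dropping them, would point to the page that produced the empty entry, which is probably the more useful thing to track down in the source HTML.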
I have tried other sites and documents, and they convert OK. I also tried cleaning the HTML files with HTML Tidy (this worked before, when converting HTML documents generated by PowerPoint), but it made no difference here.
Sincerely,
Michael A. Lees [EMAIL PROTECTED]
--__--__--
_______________________________________________ plucker-list mailing list [EMAIL PROTECTED] http://lists.rubberchicken.org/mailman/listinfo/plucker-list
End of plucker-list Digest

