Friday, August 29, 2003, 20:23

Dear Friends,

I've been trying to convert a series of HTML documents on my PC, but I
keep getting a strange error at the end. There are actually about 50
HTML files, starting from the index, but since I did not know the
correct link depth between them I just guessed 10, and the final
message says "3165 parsed".
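
In case it helps to show what I mean by link depth, here is a minimal
sketch of how it could be measured with a breadth-first walk from the
index. The link graph, the file names and the function link_depths are
made up for illustration; this is not how Plucker itself counts depth,
which I do not know.

```python
from collections import deque

def link_depths(links, start):
    """Return the smallest number of clicks needed to reach each page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            # The first visit in a breadth-first walk is the shortest path.
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Hypothetical link graph: page -> pages it links to.
links = {
    "index.html": ["ch1.html", "ch2.html"],
    "ch1.html": ["ch1a.html", "index.html"],
    "ch2.html": ["ch1.html"],
    "ch1a.html": [],
}

depths = link_depths(links, "index.html")
max_depth = max(depths.values())
print(depths, max_depth)
```

The largest value would be the smallest depth setting that still
reaches every page, so a guess far above it may pull in more pages
than intended.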

The problem seems to happen when the parsed documents are being
converted to the PDB file. Something goes wrong inside Python and the
PDB is not created.

I am using Plucker Desktop 1.2.01, and I've never had such a problem,
even with very large numbers of documents.

Can anyone help? The whole error message is as follows:

--- start of error log ---

---- all 3165 pages retrieved and parsed ----
Writing out collected data...
Writing document 'Thinking in Java, 3rd Ed.' to file C:\Arquivos de programas\Plucker\channels/ThinkinginJava3rdEd/ThinkinginJava3rdEd.pdb
Traceback (most recent call last):
  File "C:\Arquivos de programas\Plucker\PyPlucker\Spider.py", line 1532, in ?
    sys.exit(realmain())
  File "C:\Arquivos de programas\Plucker\PyPlucker\Spider.py", line 1524, in realmain
    retval = main (config, exclusion_lists)
  File "C:\Arquivos de programas\Plucker\PyPlucker\Spider.py", line 1046, in main
    mapping = writer.write (verbose=verbosity, alias_list=alias_list)
  File "C:\Arquivos de programas\Plucker\PyPlucker\Writer.py", line 518, in write
    result = Writer.write (self, verbose, alias_list=alias_list)
  File "C:\Arquivos de programas\Plucker\PyPlucker\Writer.py", line 310, in write
    self._mapper = Mapper(self._collection, alias_list.as_dict())
  File "C:\Arquivos de programas\Plucker\PyPlucker\Writer.py", line 102, in __init__
    self._get_id_for_doc(doc)
  File "C:\Arquivos de programas\Plucker\PyPlucker\Writer.py", line 112, in _get_id_for_doc
    id = self._url_to_id_mapping.get(doc.get_url())
AttributeError: 'None' object has no attribute 'get_url'
Installing channel output to destinations...
Setting channels new due date
Tasks completed for all channels.

--- end of error log ---
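
As far as I can tell from the traceback, the writer is handed a
document that is None and then calls get_url() on it. The sketch below
reproduces that kind of failure and shows one possible guard. The Doc
class and the assign_ids function are hypothetical stand-ins of my
own, not Plucker's actual code, and I am only assuming that one of the
collected entries is None.

```python
class Doc:
    """Hypothetical stand-in for a parsed document (not Plucker's class)."""

    def __init__(self, url):
        self._url = url

    def get_url(self):
        return self._url


def assign_ids(docs):
    """Map each document URL to a numeric id, skipping None entries."""
    url_to_id = {}
    skipped = 0
    for doc in docs:
        if doc is None:
            # Assumed guard: count the missing document instead of crashing.
            skipped += 1
            continue
        url = doc.get_url()
        if url not in url_to_id:
            url_to_id[url] = len(url_to_id) + 1
    return url_to_id, skipped


# One entry is None, as the traceback suggests happened in my case.
docs = [Doc("index.html"), None, Doc("chapter1.html")]

# Without a guard, calling get_url() on the None entry fails.
try:
    docs[1].get_url()
    failed = False
except AttributeError:
    failed = True

mapping, skipped = assign_ids(docs)
print(failed, mapping, skipped)
```

A guard like this would only hide the symptom; why a document ends up
as None in the first place is what I cannot work out.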

I have tried other sites and documents, and they convert OK. I also
tried cleaning the HTML files with HTML Tidy (this worked before when
converting HTML documents generated by PowerPoint), but it made no
difference here.

Sincerely,

Michael A. Lees
[EMAIL PROTECTED]

_______________________________________________
plucker-list mailing list
[EMAIL PROTECTED]
http://lists.rubberchicken.org/mailman/listinfo/plucker-list
