Hi,

For about a week I have had the following problem: while plucking my
default database from pages on the web, the spider program on the PC
produces reproducible errors, which stop the spidering before the Palm
database is built.
My system in detail:

Plucker Version: "Plucker-1.1.11SR1.exe"
OS: W2K SP2
Internet-Connection: dial up ISDN
Simultaneously running programs using LAN/Internet/TCP/IP: none ;-)

The error persists even after a fresh boot. In log #1 I see 70 or so
errors in the status display of the dial-up connection; in log #2 I see 0.
When I try to access those pages manually (with a browser), I don't
experience any difficulties. When I close the connection, redial and start
runsync again, it crashes at the same address, with the same error, day
after day :-( It drives me mad! ;-)

I occasionally had this problem before (with older versions of Plucker; it
just froze back then rather than crashing), but usually, after a reboot, I
was able to build the database.

I can't figure out the problem by myself, so do you have a clue? Is it a
Python problem?

I would like to propose the following error-handling approach: set up a
timer before fetching a document. If the timer expires, or a socket error or
something similar happens, just ignore *THIS* document and continue with the
others... My Palm mainly serves as a newsreader, and Plucker has been the
best solution so far for this purpose, but its glory fades with this naughty
behaviour...
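
To make the proposal concrete, here is a rough sketch of the
skip-and-continue idea. It is written against modern Python 3 and
urllib.request, not against Plucker's actual Retriever code, and the names
fetch_document and fetch_all are made up for illustration. Note that the
timeout here applies to each blocking socket operation; it is not a
wall-clock timer for the whole document.

```python
import urllib.request


def fetch_document(url, timeout=30.0):
    """Fetch one URL (hypothetical helper, not part of Plucker).

    The timeout is handed to the underlying socket, so a stalled
    connect or read raises an error instead of hanging forever.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_all(urls, fetch=fetch_document):
    """Fetch every URL, skipping a document on network failure.

    Returns (documents, skipped): a dict mapping url -> bytes, and a
    list of (url, error) pairs for the documents that were ignored.
    """
    documents = {}
    skipped = []
    for url in urls:
        try:
            documents[url] = fetch(url)
        except OSError as err:
            # In Python 3, OSError covers socket errors, socket
            # timeouts, connection resets (such as winsock 10054)
            # and urllib's URLError. Record it and move on.
            skipped.append((url, err))
    return documents, skipped
```

With something like this, one bad page would end up in the skipped list
and the remaining pages would still make it into the database.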


Thanks for helping and happy hunting,
Christoph Munkelt

Log #1 (using the German ISP "Planet Interkom" for dial-up)

Processing http://www.heise.de/pda/newsticker/m21083.html.
           160 collected, 113 still to do
  Retrieved ok
Processing http://www.heise.de/pda/newsticker/m21082.html.
           161 collected, 112 still to do
Traceback (innermost last):
  File ".\PyPlucker\Spider.py", line 1090, in ?
    retval = main (config, exclusion_lists)
  File ".\PyPlucker\Spider.py", line 717, in main
    spider.process_all(verbose=verbosity)
  File ".\PyPlucker\Spider.py", line 365, in process_all
    self.process (verbose)
  File ".\PyPlucker\Spider.py", line 424, in process
    post_data=post_data)
  File ".\PyPlucker\Retriever.py", line 269, in retrieve
    result = self._retrieve (url, alias_list, post_data)
  File ".\PyPlucker\Retriever.py", line 228, in _retrieve
    contents = webdoc.read ()
  File "D:\Programme\Apps\Plucker\Python\lib\socket.py", line 106, in read
    new = self._sock.recv(self._rbufsize)
socket.error: (10054, 'winsock error')
Error executing PyPlucker. Error: 1

Log #2 (using a local university dial-in, same database, just 2 minutes
later)

Processing http://www.faz.net/IN/INtemplates/faznet.....B-0008C7F31E1E}.
           190 collected, 102 still to do
Traceback (innermost last):
  File ".\PyPlucker\Spider.py", line 1090, in ?
    retval = main (config, exclusion_lists)
  File ".\PyPlucker\Spider.py", line 717, in main
    spider.process_all(verbose=verbosity)
  File ".\PyPlucker\Spider.py", line 365, in process_all
    self.process (verbose)
  File ".\PyPlucker\Spider.py", line 424, in process
    post_data=post_data)
  File ".\PyPlucker\Retriever.py", line 269, in retrieve
    result = self._retrieve (url, alias_list, post_data)
  File ".\PyPlucker\Retriever.py", line 228, in _retrieve
    contents = webdoc.read ()
  File "D:\Programme\Apps\Plucker\Python\lib\socket.py", line 106, in read
    new = self._sock.recv(self._rbufsize)
socket.error: (10054, 'winsock error')
Error executing PyPlucker. Error: 1
