Hi,
for about a week I have had the following problem: while plucking my
default database from pages on the web, the spider program on the PC
produces reproducible errors, which stop the spidering before the
Palm database is built.
My system in detail:
Plucker Version: "Plucker-1.1.11SR1.exe"
OS: W2K SP2
Internet connection: dial-up ISDN
Other programs running at the same time that use LAN/Internet/TCP/IP: none ;-)
The error persists even after a fresh boot. For log #1 the status display
of the dial-up connection shows 70 or so errors; for log #2 it shows 0.
When I access those pages manually (with a browser) I don't experience any
difficulties. When I close the connection, redial and start runsync again,
it crashes at the same address, with the same error, day after day :-(
It drives me mad! ;-)
I occasionally had this problem before (with older versions of Plucker;
back then it just froze rather than crashing), but usually, after a reboot,
I was able to build the database.
I can't figure out the problem by myself, so do you have a clue? Is it a
Python problem?
I would like to propose the following error handling approach: set up a
timer before getting a document. When the timer expires, or a socket error
or something similar happens, just ignore *THIS* document and continue with
the others... My Palm mainly serves me as a newsreader, and Plucker has been
the best solution so far for this purpose, but its glory fades with this
naughty behaviour...
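To make the proposal concrete, here is a minimal sketch in present-day Python. It is not Plucker's actual code: the names fetch_document and fetch_all are hypothetical, and the 30-second timeout is an arbitrary assumption. Note that the timeout passed to urlopen bounds each blocking socket operation rather than the whole download, so it only approximates a per-document timer.

```python
import http.client
import urllib.request


def fetch_document(url, timeout=30.0):
    """Fetch one URL (hypothetical helper, not part of Plucker).

    The timeout limits each blocking socket operation, not the
    total time spent on the document.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_all(urls, fetch=fetch_document, timeout=30.0):
    """Fetch every URL, skipping documents that fail instead of aborting.

    Returns (documents, failures): a dict mapping url -> content and a
    list of (url, error message) pairs for the skipped documents.
    """
    documents = {}
    failures = []
    for url in urls:
        try:
            documents[url] = fetch(url, timeout)
        except (OSError, http.client.HTTPException) as exc:
            # OSError covers socket errors such as a connection reset
            # (Winsock error 10054), timeouts, and urllib's URLError.
            # HTTPException covers protocol-level failures such as an
            # incomplete read.
            failures.append((url, str(exc)))
            continue  # ignore *this* document, carry on with the rest
    return documents, failures
```

With something like this, a single reset connection would end up in the failures list and the remaining pages would still be collected, so the database could be built from whatever was retrieved.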
Thanks for helping and happy hunting,
Christoph Munkelt
Log #1 (using the German ISP "Planet Interkom" for dial-up)
Processing http://www.heise.de/pda/newsticker/m21083.html.
160 collected, 113 still to do
Retrieved ok
Processing http://www.heise.de/pda/newsticker/m21082.html.
161 collected, 112 still to do
Traceback (innermost last):
  File ".\PyPlucker\Spider.py", line 1090, in ?
    retval = main (config, exclusion_lists)
  File ".\PyPlucker\Spider.py", line 717, in main
    spider.process_all(verbose=verbosity)
  File ".\PyPlucker\Spider.py", line 365, in process_all
    self.process (verbose)
  File ".\PyPlucker\Spider.py", line 424, in process
    post_data=post_data)
  File ".\PyPlucker\Retriever.py", line 269, in retrieve
    result = self._retrieve (url, alias_list, post_data)
  File ".\PyPlucker\Retriever.py", line 228, in _retrieve
    contents = webdoc.read ()
  File "D:\Programme\Apps\Plucker\Python\lib\socket.py", line 106, in read
    new = self._sock.recv(self._rbufsize)
socket.error: (10054, 'winsock error')
Error executing PyPlucker. Error: 1
Log #2 (using a local university dial-in, same database, just 2 minutes
later)
Processing http://www.faz.net/IN/INtemplates/faznet.....B-0008C7F31E1E}.
190 collected, 102 still to do
Traceback (innermost last):
  File ".\PyPlucker\Spider.py", line 1090, in ?
    retval = main (config, exclusion_lists)
  File ".\PyPlucker\Spider.py", line 717, in main
    spider.process_all(verbose=verbosity)
  File ".\PyPlucker\Spider.py", line 365, in process_all
    self.process (verbose)
  File ".\PyPlucker\Spider.py", line 424, in process
    post_data=post_data)
  File ".\PyPlucker\Retriever.py", line 269, in retrieve
    result = self._retrieve (url, alias_list, post_data)
  File ".\PyPlucker\Retriever.py", line 228, in _retrieve
    contents = webdoc.read ()
  File "D:\Programme\Apps\Plucker\Python\lib\socket.py", line 106, in read
    new = self._sock.recv(self._rbufsize)
socket.error: (10054, 'winsock error')
Error executing PyPlucker. Error: 1