Errm... does anyone have a solution to
my problem? :)
----- Original Message -----
From: Benedict Tang
Sent: Wednesday, April 17, 2002 3:02 PM
Subject: [scoop] Sitescooper doesn't update sites

Some of my sites don't seem to get updated. I've tried passing -fullrefresh
and deleting the cache folders, but nothing seems to work.

I've added the -debug parameter and generated a log, which is attached.

It can't be due to my proxy server, as I have had this problem for more than
a week already, and the sites show as updated when I check them in IE.

Even with -fullrefresh it doesn't work for me. Strangely, after I run the
batch file again, folders are created in tmp\cache and tmp\page_cache_dir for
the respective sites, but none in tmp\txt itself. Can someone help me see
what's wrong? Thanks!
debug: site title set to: "MRT train times", file base: MRT_train_times
debug: tmp dir: C:\Software\Palm\SiteScooper\sitescooper-3.1.2\tmp\txt\MRT_train_times.tmp, output dir: C:\Software\Palm\SiteScooper\sitescooper-3.1.2\tmp\txt\MRT_train_times
debug: site title set to: "Catholic Encyclopedia", file base: Catholic_Encyclopedia
debug: tmp dir: C:\Software\Palm\SiteScooper\sitescooper-3.1.2\tmp\txt\Catholic_Encyclopedia.tmp, output dir: C:\Software\Palm\SiteScooper\sitescooper-3.1.2\tmp\txt\Catholic_Encyclopedia
debug: site title set to: "Office of Readings", file base: Office_of_Readings
debug: tmp dir: C:\Software\Palm\SiteScooper\sitescooper-3.1.2\tmp\txt\Office_of_Readings.tmp, output dir: C:\Software\Palm\SiteScooper\sitescooper-3.1.2\tmp\txt\Office_of_Readings
debug: site title set to: "Zenit News Agency", file base: Zenit_News_Agency
debug: tmp dir: C:\Software\Palm\SiteScooper\sitescooper-3.1.2\tmp\txt\Zenit_News_Agency.tmp, output dir: C:\Software\Palm\SiteScooper\sitescooper-3.1.2\tmp\txt\Zenit_News_Agency
SITE START: now scooping site "sites\mrt_train_times.site".
debug: tmp dir: C:\Software\Palm\SiteScooper\sitescooper-3.1.2\tmp\txt\MRT_train_times.tmp, output dir: C:\Software\Palm\SiteScooper\sitescooper-3.1.2\tmp\txt\MRT_train_times, now: Sat Apr 6 12:55:50 2002
debug: minpages: levels=0
Reading level-2 front page: http://www.smrtcorp.com/smrt/train_arrivals.htm
debug: adding url-handler to queue: train_arrivals.htm
debug: URL-handler queue: train_arrivals.htm
debug: waiting for queue to empty: URL-handler queue: train_arrivals.htm
debug: url-handler train_arrivals.htm: running state 2
debug: finishing GET: http://www.smrtcorp.com/smrt/train_arrivals.htm
debug: last-modified time: 1013070087 (Apr 6 2002)
debug: oldest link seen at www.smrtcorp.com http://www.smrtcorp.com/smrt/train_arrivals.htm: modtime=1013070087 (Apr 6 2002)
debug: table item <td valign=top width=1 height="589"> omitted
debug: table item <td valign=top width="31" height="210"> omitted
debug: table item <td width="31" height="10"> omitted
debug: table item <td width="2%" height="30"> omitted
debug: table item <td width="23%" height="30"> omitted
debug: table item <td width="2%" height="30"> omitted
debug: table item <td width="26%" height="30"> omitted
debug: table item <td width="2%" height="30"> omitted
debug: table item <td width="22%" height="30"> omitted
debug: table item <td width="2%" height="30"> omitted
debug: table item <td width="21%" height="30"> omitted
debug: table item <td width="2%" height="29"> omitted
debug: table item <td width="23%" height="29"> omitted
debug: table item <td width="2%" height="29"> omitted
debug: table item <td width="26%" height="29"> omitted
debug: table item <td width="2%" height="29"> omitted
debug: table item <td width="22%" height="29"> omitted
debug: table item <td width="2%" height="29"> omitted
debug: table item <td width="21%" height="29"> omitted
debug: table item <td width="2%" height="28"> omitted
debug: table item <td width="23%" height="28"> omitted
debug: table item <td width="2%" height="28"> omitted
debug: table item <td width="26%" height="28"> omitted
debug: table item <td width="2%" height="28"> omitted
debug: table item <td width="22%" height="28"> omitted
debug: table item <td width="2%" height="28"> omitted
debug: table item <td width="21%" height="28"> omitted
debug: table item <td width="2%" height="31"> omitted
debug: table item <td width="23%" height="31"> omitted
debug: table item <td width="2%" height="31"> omitted
debug: table item <td width="26%" height="31"> omitted
debug: table item <td width="2%" height="31"> omitted
debug: table item <td width="22%" height="31"> omitted
debug: table item <td width="2%" height="31"> omitted
debug: table item <td width="21%" height="31"> omitted
debug: table item <td width="2%" height="26"> omitted
debug: table item <td width="23%" height="26"> omitted
debug: table item <td width="2%" height="26"> omitted
debug: table item <td width="26%" height="26"> omitted
debug: table item <td width="2%" height="26"> omitted
debug: table item <td width="22%" height="26"> omitted
debug: table item <td width="2%" height="26"> omitted
debug: table item <td width="21%" height="26"> omitted
debug: table item <td width="2%" height="27"> omitted
debug: table item <td width="23%" height="27"> omitted
debug: table item <td width="2%" height="27"> omitted
debug: table item <td width="26%" height="27"> omitted
debug: table item <td width="2%" height="27"> omitted
debug: table item <td width="22%" height="27"> omitted
debug: table item <td width="2%" height="27"> omitted
debug: table item <td width="21%" height="27"> omitted
debug: table item <td width="2%" height="28"> omitted
debug: table item <td width="23%" height="28"> omitted
debug: table item <td width="2%" height="28"> omitted
debug: table item <td width="26%" height="28"> omitted
debug: table item <td width="2%" height="28"> omitted
debug: table item <td width="22%" height="28"> omitted
debug: table item <td width="2%" height="28"> omitted
debug: table item <td width="21%" height="28"> omitted
debug: table item <td width="2%" height="26"> omitted
debug: table item <td width="23%" height="26"> omitted
debug: table item <td width="2%" height="26"> omitted
debug: table item <td width="26%" height="26"> omitted
debug: table item <td width="2%" height="26"> omitted
debug: table item <td width="22%" height="26"> omitted
debug: table item <td width="2%" height="26"> omitted
debug: table item <td width="21%" height="26"> omitted
debug: table item <td width="2%" height="31"> omitted
debug: table item <td width="23%" height="31"> omitted
debug: table item <td width="2%" height="31"> omitted
debug: table item <td width="26%" height="31"> omitted
debug: table item <td width="2%" height="31"> omitted
debug: table item <td width="22%" height="31"> omitted
debug: table item <td width="2%" height="31"> omitted
debug: table item <td width="21%" height="31"> omitted
debug: table item <td width="2%" height="27"> omitted
debug: table item <td width="23%" height="27"> omitted
debug: table item <td width="2%" height="27"> omitted
debug: table item <td width="26%" height="27"> omitted
debug: table item <td width="2%" height="27"> omitted
debug: table item <td width="22%" height="27"> omitted
debug: table item <td width="2%" height="27"> omitted
debug: table item <td width="21%" height="27"> omitted
debug: table item <td width="2%" height="26"> omitted
debug: table item <td width="23%" height="26"> omitted
debug: table item <td width="2%" height="26"> omitted
debug: table item <td width="26%" height="26"> omitted
debug: table item <td width="2%" height="26"> omitted
debug: table item <td width="22%" height="26"> omitted
debug: table item <td width="2%" height="26"> omitted
debug: table item <td width="21%" height="26"> omitted
debug: table item <td width="2%" height="31"> omitted
debug: table item <td width="23%" height="31"> omitted
debug: table item <td width="2%" height="31"> omitted
debug: table item <td width="26%" height="31"> omitted
debug: table item <td width="2%" height="31"> omitted
debug: table item <td width="22%" height="31"> omitted
debug: table item <td width="2%" height="31"> omitted
debug: table item <td width="21%" height="31"> omitted
debug: table item <td width="2%" height="31"> omitted
debug: table item <td width="23%" height="31"> omitted
debug: table item <td width="2%" height="31"> omitted
debug: table item <td width="26%" height="31"> omitted
debug: table item <td width="2%" height="31"> omitted
debug: table item <td width="22%" height="31"> omitted
debug: table item <td width="2%" height="31"> omitted
debug: table item <td width="21%" height="31"> omitted
-refresh is on, not looking for differences
debug: cache singleton C:\Software\Palm\SiteScooper\sitescooper-3.1.2\tmp\page_cache_dir\smrtcorp\com_smrt_train_arrivals_htm-0-0: ref count now 8
Printing: http://www.smrtcorp.com/smrt/train_arrivals.htm
debug: story written, 0.693359375 K, limit 500 K
debug: stories found so far: 1
debug: still one of our httpclients in queue: URL-handler queue: train_arrivals.htm
debug: url-handler train_arrivals.htm: running state 4
debug: url-handler done: train_arrivals.htm
debug: queue now empty of our httpclients: URL-handler queue:
debug: stories found: 1 minimum: 2
MRT train times: no new stories, ignoring.
debug: (Not setting already-seen age cache since no links were followed)
SITE END: done scooping site "sites\mrt_train_times.site".
SITE START: now scooping site "sites\cathen.site".
debug: tmp dir: C:\Software\Palm\SiteScooper\sitescooper-3.1.2\tmp\txt\Catholic_Encyclopedia.tmp, output dir: C:\Software\Palm\SiteScooper\sitescooper-3.1.2\tmp\txt\Catholic_Encyclopedia, now: Sat Apr 6 12:55:54 2002
debug: minpages: levels=1
Reading level-3 front page: http://www.newadvent.org/cathen/
debug: adding url-handler to queue: cathen/
debug: URL-handler queue: cathen/
debug: waiting for queue to empty: URL-handler queue: cathen/
debug: url-handler cathen/: running state 2
debug: finishing GET: http://www.newadvent.org/cathen/
debug: last-modified time: 1018082405 (Apr 6 2002)
debug: oldest link seen at www.newadvent.org http://www.newadvent.org/cathen/: modtime=1018082405 (Apr 6 2002)
debug: cache singleton C:\Software\Palm\SiteScooper\sitescooper-3.1.2\tmp\page_cache_dir\newadvent\org_cathen_-0-0: ref count now 2
debug: re-adding stripped closing tag: </blockquote>
Printing: http://www.newadvent.org/cathen/
debug: story written, 6.580078125 K, limit 0 K
debug: stories found so far: 1
debug: still one of our httpclients in queue: URL-handler queue: cathen/
debug: url-handler cathen/: running state 4
debug: url-handler done: cathen/
debug: queue now empty of our httpclients: URL-handler queue:
debug: stories found: 1 minimum: 3
Catholic Encyclopedia: no new stories, ignoring.
debug: (Not setting already-seen age cache since no links were followed)
SITE END: done scooping site "sites\cathen.site".
SITE START: now scooping site "sites\office_of_readings.site".
debug: tmp dir: C:\Software\Palm\SiteScooper\sitescooper-3.1.2\tmp\txt\Office_of_Readings.tmp, output dir: C:\Software\Palm\SiteScooper\sitescooper-3.1.2\tmp\txt\Office_of_Readings, now: Sat Apr 6 12:56:03 2002
debug: minpages: levels=1
Reading level-3 front page: http://www.universalis.com/cgi-bin/display/800/readings.htm
debug: adding url-handler to queue: 800/readings.htm
debug: URL-handler queue: 800/readings.htm
debug: waiting for queue to empty: URL-handler queue: 800/readings.htm
debug: url-handler 800/readings.htm: running state 2
debug: finishing GET: http://www.universalis.com/cgi-bin/display/800/readings.htm
debug: last-modified time: 1018022400 (Apr 6 2002)
debug: oldest link seen at www.universalis.com http://www.universalis.com/cgi-bin/display/800/readings.htm: modtime=1018022400 (Apr 6 2002)
debug: table item <td width=25% align=right valign=top bgcolor=#eeee88> omitted
debug: cache singleton C:\Software\Palm\SiteScooper\sitescooper-3.1.2\tmp\page_cache_dir\universalis\com_cgi-bin_display_800_readings_htm-0-0: ref count now 7
Printing: http://www.universalis.com/cgi-bin/display/800/readings.htm
debug: story written, 13.279296875 K, limit 0 K
debug: stories found so far: 1
debug: still one of our httpclients in queue: URL-handler queue: 800/readings.htm
debug: url-handler 800/readings.htm: running state 4
debug: url-handler done: 800/readings.htm
debug: queue now empty of our httpclients: URL-handler queue:
debug: stories found: 1 minimum: 3
Office of Readings: no new stories, ignoring.
debug: (Not setting already-seen age cache since no links were followed)
SITE END: done scooping site "sites\office_of_readings.site".
SITE START: now scooping site "sites\zenit.site".
debug: tmp dir: C:\Software\Palm\SiteScooper\sitescooper-3.1.2\tmp\txt\Zenit_News_Agency.tmp, output dir: C:\Software\Palm\SiteScooper\sitescooper-3.1.2\tmp\txt\Zenit_News_Agency, now: Sat Apr 6 12:56:16 2002
debug: minpages: levels=1
Reading level-3 front page: http://www.zenit.org/english/english.phtml
debug: adding url-handler to queue: english.phtml
debug: URL-handler queue: english.phtml
debug: waiting for queue to empty: URL-handler queue: english.phtml
debug: url-handler english.phtml: running state 2
debug: finishing GET: http://www.zenit.org/english/english.phtml
debug: last-modified time: not provided
debug: oldest link seen at www.zenit.org http://www.zenit.org/english/english.phtml: modtime=1018097781 (Apr 6 2002)
debug: table item <td height="28" width="30%"> omitted
-refresh is on, not looking for differences
debug: cache singleton C:\Software\Palm\SiteScooper\sitescooper-3.1.2\tmp\page_cache_dir\zenit\org_english_english_phtml-0-0: ref count now 5
Printing: http://www.zenit.org/english/english.phtml
debug: story written, 5.091796875 K, limit 0 K
debug: stories found so far: 1
debug: still one of our httpclients in queue: URL-handler queue: english.phtml
debug: url-handler english.phtml: running state 4
debug: url-handler done: english.phtml
debug: queue now empty of our httpclients: URL-handler queue:
debug: stories found: 1 minimum: 3
Zenit News Agency: no new stories, ignoring.
debug: (Not setting already-seen age cache since no links were followed)
SITE END: done scooping site "sites\zenit.site".
Finished!
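
Every site in the log above ends the same way: a "stories found: 1 minimum: N" line, followed by "no new stories, ignoring." To compare those counts per site without scrolling through the table-item noise, here is a minimal Python sketch that pulls the summary lines out of a saved debug log. It is not part of Sitescooper. The function name summarize and the sample string are assumptions made for illustration; only the two message formats are taken from the log itself.

```python
import re

# Both message formats are copied from the debug log above.
# Group 1: site file named in a "SITE START" line.
# Groups 2 and 3: the counts in a "stories found: N minimum: M" line.
EVENT = re.compile(
    r'SITE START: now scooping site "([^"]+)"'
    r"|stories found: (\d+) minimum: (\d+)"
)


def summarize(log_text):
    """Return (site, found, minimum) tuples in the order they appear.

    Matching is done on the whole text rather than line by line, so it
    works whether the log has one entry per line or has been flattened.
    """
    results = []
    current_site = None
    for match in EVENT.finditer(log_text):
        if match.group(1) is not None:
            current_site = match.group(1)
        else:
            results.append(
                (current_site, int(match.group(2)), int(match.group(3)))
            )
    return results


if __name__ == "__main__":
    # Hypothetical two-line excerpt in the same format as the log above.
    sample = (
        'SITE START: now scooping site "sites\\zenit.site".\n'
        "debug: stories found: 1 minimum: 3\n"
    )
    for site, found, minimum in summarize(sample):
        print(f"{site}: found {found}, minimum {minimum}")
```

Run against the full log, this would list each .site file next to its found and minimum counts, which makes it easy to see that all four sites stop at one story, below their minimum.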
<<attachment: scoop_test.bat>>
cathen.site
Description: Binary data
mrt_train_times.site
Description: Binary data
office_of_readings.site
Description: Binary data
zenit.site
Description: Binary data
