Hi,
I downloaded Plucker and am running it on a machine with Red Hat 7.1,
kernel version 2.4.2-2, and what appears to be Python 1.5 (although I
seem to have both python and python1.5 binaries on my system). The
plucker-build routine seems to fail because it can't fetch any pages
that are linked from the top-level page on the same site. For example,
with www.smh.com.au I can't get any pages of the form
www.smh.com.au/anything/blah.html (see the error messages below). This
looks like it should be a relatively straightforward error. I've tried
both v1.0 and v1.1 and get the same error with each. Any tips?
Incidentally, it looks like it would be very nice software (if it
worked properly for me!); it did a nice job on the Wired front page anyway.
Cheers
Duncan
> plucker-build --db-name="Sydney Morning Herald" -f SMH \
    --home-url="http://www.smh.com.au" -M 2 --noimages
Working for pluckerdir /nfs/burgundy/h1/duncan/.plucker
Processing http://www.smh.com.au/.
0 collected, 0 still to do
Moved to '://www.smh.com.au/'
Retrieved ok
Processing /text/.
1 collected, 120 still to do
Retrieved failed: 404 -- [Errno 2] No such file or directory: '/text/'
Processing /news/newsAlert.html.
1 collected, 119 still to do
Retrieved failed: 404 -- [Errno 2] No such file or directory: '/news/newsAlert.html'
Processing /media/wtc/smhwtc.html.
1 collected, 118 still to do
Retrieved failed: 404 -- [Errno 2] No such file or directory: '/media/wtc/smhwtc.html'
.
.
.
Processing /news/0105/01/pageone/index.html.
3 collected, 115 still to do
Retrieved failed: 404 -- [Errno 2] No such file or directory: '/news/0105/01/pageone/index.html'
Processing /news/0105/01/national/index.html.
3 collected, 114 still to do
Retrieved failed: 404 -- [Errno 2] No such file or directory: '/news/0105/01/national/index.html'
.
.
.
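
To make the expected behaviour concrete, here is a minimal sketch of how
the site-relative links in the log above would normally be resolved
against the home URL before being fetched. This is not Plucker's actual
code: it uses the standard urllib.parse module from current Python 3
(not the Python 1.5 on the machine above), and the names resolve, base
and links are made up for illustration. The "Moved to
'://www.smh.com.au/'" line suggests the scheme may be getting lost
somewhere, which would leave nothing valid to resolve against, but that
is only a guess from the output.

```python
from urllib.parse import urljoin, urlsplit

# Home URL and a few of the links from the log (illustrative values)
base = "http://www.smh.com.au/"
links = ["/text/", "/news/newsAlert.html", "/media/wtc/smhwtc.html"]


def resolve(base_url, link):
    """Return an absolute URL for a link found on the page at base_url."""
    return urljoin(base_url, link)


# Each site-relative link should become a full http:// URL
for link in links:
    print(resolve(base, link))

# A base like the one in the "Moved to" line has no scheme at all,
# so a fetcher could not tell that it is meant to be an http URL.
broken_base = "://www.smh.com.au/"
print(repr(urlsplit(broken_base).scheme))
```

If the links came out as full http://www.smh.com.au/... URLs like this,
the fetch would presumably go to the web server rather than fail with
"No such file or directory".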