Re: [CODE4LIB] wget archiving for dummies

Alex Armstrong Mon, 06 Oct 2014 00:49:44 -0700

I wanted a quick-and-dirty solution to archiving our old LibGuides site a few months ago.

wget was my first port of call also. I don't have good notes as to what went wrong, but I ended up using httrack:

http://www.httrack.com/


It basically worked out of the box.

HTH,
Alex


On 10/06/2014 09:44 AM, Eric Phetteplace wrote:

Hey C4L,

If I wanted to archive a Wordpress site, how would I do so?

More elaborate: our library recently got a "donation" of a remote Wordpress
site, sitting one directory below the root of a domain. I can tell from a
cursory look it's a Wordpress site. We've never archived a website before
and I don't need to do anything fancy, just download a workable copy as it
presently exists. I've heard this can be as simple as:

wget -m $PATH_TO_SITE_ROOT

but that's not working as planned. Wget's convert links feature doesn't
seem to be quite so simple; if I download the site, disable my network
connection, then host locally, some 20 resources aren't available. Mostly
images which are under the same directory. Possibly loaded via AJAX. Advice?
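
For reference, a fuller invocation than plain -m might look like the sketch
below. It is only a sketch: the URL is a hypothetical placeholder, and the
mirror_site wrapper and RUN variable are names made up for illustration. The
wget options themselves are standard ones. Note that wget does not execute
JavaScript, so anything loaded via AJAX may still be missed even with these
flags.

```shell
#!/bin/sh
# Hypothetical site root; replace with the real URL. The trailing slash
# matters, since --no-parent treats the last path component as a directory.
SITE_URL="${SITE_URL:-http://example.org/blog/}"

mirror_site() {
  # By default only print the command; set RUN=1 to actually download.
  if [ "${RUN:-0}" = "1" ]; then runner=""; else runner="echo"; fi

  # --mirror            recursive download with timestamping (same as -m)
  # --page-requisites   also fetch images, CSS, etc. that the pages reference
  # --convert-links     rewrite links so the copy browses locally
  # --adjust-extension  save pages with an .html suffix where needed
  # --no-parent         stay at or below the starting directory
  $runner wget \
    --mirror \
    --page-requisites \
    --convert-links \
    --adjust-extension \
    --no-parent \
    "$1"
}

mirror_site "$SITE_URL"
```

Run as-is, this only echoes the command so it can be inspected first; with
RUN=1 it performs the download. --no-parent seems relevant here because the
site sits one directory below the domain root.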

(Anticipated) pertinent advice: I shouldn't be doing this at all, we should
outsource to Archive-It or similar, who actually know what they're doing.
Yes/no?

Best,
Eric
