On 13 Nov 2004 at 15:53, Gary Griswold wrote:

> A university in a remote location has a 33kbps UUCP connection to the
> Internet. Because their connection to the internet includes one step that
> is UUCP, they are unable to use HTTP, but use accmail services, such as
> www4mail, agora, emailweb, pagegetter. For curriculum content purposes they
> would like to obtain a local copy of some specific websites. Getting pages
> one at a time is very slow, because of their 33kbps connection. If they had
> a local copy of specific websites they wished to use in course content, they
> would be able to obtain a good response.

I tried to do something similar a few years ago while working at a university in a country with very poor telecom infrastructure. We had access to the web, but in practice it was unusable, especially in the rainy season. ACCMAIL methods were the only reliable way to get web pages with important content, but it is not possible to reconstruct entire usable websites that way.

Solution: Find an external collaborator with a shell account. Use the shell account to grab websites, then package them for email to the remote university. Most of it can be done with a Unix shell script:

1. Use wget with flags -rkp to download a browsable website, e.g.:

       wget -rkp http://domain.org/

2. Archive the website with tar + gzip (or zip), e.g.:

       zip -r domain.org.zip domain.org

3. Use split with flag -b to break the archive into email-able chunks, each with an identifiable name and sequence suffix, e.g.:

       split -b 32k domain.org.zip domain.org_20041116_

4. Use mpack to email each chunk separately as an attachment.

5. Reassemble the chunks at the remote university. The method used depends on who (or what) receives the emails. Ideally they should be filtered to a local script, but a human being can also do it.

The main problems are those which would be problems for any other ACCMAIL method -- cookies, broken dynamic content, browser sniffing, etc. However, on the whole, wget does a good job.

We never tried to automate the process, e.g. to accept email requests for new websites or to refresh existing sites, but it would not have been very difficult.

I would recommend that you inform the owners or administrators of your target websites. Tell them what you are doing and why. Some of them might package their website for you.

--
szs `at` szs `dot` net

----------------------------------------------------------------------
To contribute to the discussion, email to [EMAIL PROTECTED]
To unsubscribe, email to the *admin* address [EMAIL PROTECTED] with UNSUBSCRIBE ACCMAIL as the message body.
To get the latest version of the ACCMAIL FAQ, send a blank email to accmail.faq.en `AT` szs.net (replacing `AT` with @ to form a proper email address).
----------------------------------------------------------------------
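
For anyone who wants to try the five steps from the reply above, they could be sketched as a handful of shell functions along the following lines. This is only an illustration, not the script we actually used: the function names (fetch_site, pack_site, send_chunks, join_chunks), the recipient address and the name of the reassembled archive are all made up for the example, and it uses tar + gzip rather than zip. It assumes wget, tar, split and mpack are installed on the shell account, and tar and cat on the receiving side.

```shell
#!/bin/sh
# Sketch of steps 1-5. All function names, file names and addresses here are
# illustrative assumptions, not part of the original setup.

# Step 1 (shell account): download a browsable copy of a site.
# -r recurse, -k convert links for local viewing, -p fetch page requisites.
fetch_site() {
    wget -rkp "$1"
}

# Steps 2-3 (shell account): archive the mirrored directory, then split the
# archive into 32 KB chunks named <site>_<date>_aa, <site>_<date>_ab, ...
# Prints the chunk prefix on success.
pack_site() {
    site=$1
    stamp=${2:-$(date +%Y%m%d)}
    prefix="${site}_${stamp}_"
    tar -czf "${site}.tar.gz" "$site" || return 1
    split -b 32k "${site}.tar.gz" "$prefix" || return 1
    printf '%s\n' "$prefix"
}

# Step 4 (shell account): mail each chunk as a separate attachment, using the
# chunk's file name as the subject so the receiver can see the sequence.
send_chunks() {
    prefix=$1
    recipient=$2
    for chunk in "$prefix"*; do
        mpack -s "$chunk" "$chunk" "$recipient" || return 1
    done
}

# Step 5 (remote university): once every attachment has been saved into one
# directory, concatenate the chunks in suffix order and unpack the result.
# The output name in $2 must not begin with the chunk prefix.
join_chunks() {
    prefix=$1
    archive=$2
    cat "$prefix"* > "$archive" || return 1
    tar -xzf "$archive"
}
```

On the shell account this might be used as follows (the address is a placeholder):

    fetch_site http://domain.org/
    prefix=$(pack_site domain.org)
    send_chunks "$prefix" librarian@university.example

and at the receiving end, in the directory holding the saved attachments:

    join_chunks domain.org_20041116_ site.tar.gz

With split's default two-letter suffixes there is room for 676 chunks, roughly 21 MB at 32 KB each; a larger archive would need a longer suffix (split -a 3) or bigger chunks.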