DM Smith wrote:
I think our goal is to work with publishers to obtain their permission to distribute their work. Doing what we can to protect their interests is in everyone's best interest.

There is a FAQ at Archive.org:

How can I remove my site's pages from the Wayback Machine?
http://www.archive.org/about/faqs.php#2

Also:

The Oakland Archive Policy
http://www2.sims.berkeley.edu/research/conferences/aps/removal-policy.html

It appears that the following lines in a robots.txt file would keep a file from being archived and remove copies from the archive:

    User-agent: ia_archiver
    Disallow: /path/file.zip

If for some reason it would be good to block all robots from indexing a file, the format for that would be:

    User-agent: *
    Disallow: /path/file.zip

See: http://pageresource.com/zine/robotstxt.htm

As to why zip files in the archive may not work, see:

Broken/truncated .zip files in Wayback Machine
http://www.archive.org/iathreads/post-view.php?id=9151

Also of interest:

What is the Wayback Machine's Copyright Policy?
http://www.archive.org/about/faqs.php#20

Internet Archive's Terms of Use, Privacy Policy, and Copyright Policy
http://www.archive.org/about/terms.php

Looking at the Terms of Use, it is more understandable why they are comfortable archiving what they do (for noncommercial, noninfringing or fair use: scholarship and research). Now the question is, if they are correct under those terms to archive almost anything, is there a need to interfere with the process?

Jerry
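P.S. For anyone who wants to sanity-check the two robots.txt variants above before publishing them, here is a rough sketch using Python's standard urllib.robotparser module. The host example.org, the can_fetch helper, and the "Googlebot" agent name are only placeholders for illustration. It shows how a standards-following parser reads the rules; it says nothing about what the Internet Archive's crawler actually does with copies it already holds.

```python
from urllib.robotparser import RobotFileParser


def can_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Variant 1: block only the Internet Archive's crawler from one file.
ARCHIVER_ONLY = """\
User-agent: ia_archiver
Disallow: /path/file.zip
"""

# Variant 2: block every robot from the same file.
ALL_ROBOTS = """\
User-agent: *
Disallow: /path/file.zip
"""

# Hypothetical location of the file; substitute the real site.
URL = "http://example.org/path/file.zip"

if __name__ == "__main__":
    for name, rules in (("archiver only", ARCHIVER_ONLY), ("all robots", ALL_ROBOTS)):
        for agent in ("ia_archiver", "Googlebot"):
            print(name, agent, can_fetch(rules, agent, URL))
```

With the first variant only ia_archiver should be refused, and other files under /path/ stay fetchable; with the second variant every agent is refused for that one file.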
_______________________________________________
sword-devel mailing list: [email protected]
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page
