DM Smith wrote:

I think our goal is to work with publishers to obtain their permission  
to distribute their work. Doing what we can to protect their interests  
is in everyone's best interest.
  

There is a FAQ at Archive.org:

How can I remove my site's pages from the Wayback Machine?
http://www.archive.org/about/faqs.php#2

Also:
The Oakland Archive Policy
http://www2.sims.berkeley.edu/research/conferences/aps/removal-policy.html

It appears that lines like the following in a robots.txt file would keep a file from being archived and would also remove existing copies from the archive:

User-agent: ia_archiver
Disallow: /path/file.zip

If for some reason it would be better to block all robots from indexing a file, the format would be:
User-agent: *
Disallow: /path/file.zip

See:
http://pageresource.com/zine/robotstxt.htm
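
For anyone who wants to check rules like the two above before putting them on a server, here is a minimal sketch using Python's standard urllib.robotparser module. It only shows how that parser reads the rules; it says nothing about how the Internet Archive's crawler actually behaves. The host name www.example.org and the "Googlebot" agent name are my own placeholders, not anything from the FAQ.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Rule set 1: block only the Internet Archive's crawler.
ARCHIVER_ONLY = """\
User-agent: ia_archiver
Disallow: /path/file.zip
"""

# Rule set 2: block every robot.
ALL_ROBOTS = """\
User-agent: *
Disallow: /path/file.zip
"""

# Placeholder URL (assumption); substitute the real location of the file.
URL = "http://www.example.org/path/file.zip"

if __name__ == "__main__":
    for name, rules in (("archiver only", ARCHIVER_ONLY), ("all robots", ALL_ROBOTS)):
        for agent in ("ia_archiver", "Googlebot"):
            print(name, agent, is_allowed(rules, agent, URL))
```

With the first rule set only ia_archiver is turned away; with the second, every agent is. Other files under /path/ stay fetchable in both cases, since the Disallow line names one file.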

As to why zip files in the archive may not work, see:
Broken/truncated .zip files in Wayback Machine
http://www.archive.org/iathreads/post-view.php?id=9151

Also of interest:
What is the Wayback Machine's Copyright Policy?
http://www.archive.org/about/faqs.php#20

Internet Archive's Terms of Use, Privacy Policy, and Copyright Policy
http://www.archive.org/about/terms.php

Looking at the Terms of Use, it is easier to understand why they are comfortable archiving what they do (for noncommercial, noninfringing or fair use, such as scholarship and research).

Now the question is: if they are correct, under those terms, to archive almost anything, is there any need to interfere with the process?

Jerry








_______________________________________________
sword-devel mailing list: [email protected]
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page
