The value of a set of archives is often unrelated to its age, or even to
whether the particular mailing list is still in existence.  For example,
the EDI-L mailing list moved from a listserv mailing list hosted at the
Univ. of California to Yahoo for various reasons.  The Yahoo portion
obviously can't be archived here, but the older messages at
http://www.mail-archive.com/edi-l%40listserv.ucop.edu/ are still
referred to and treasured, especially those things I write about re:
international standards used in e-commerce.

William J. Kammerer
Novannet, LLC.

----- Original Message -----
From: "Jeff Breidenbach" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Sunday, 23 June, 2002 02:56 PM
Subject: Re: [Gossip] More Missing messages OR ....Does everything have
to be done at once?



> As for technical solutions:
>
> * Archives that have not received a new message over a certain period
>   of time could be targeted for deletion.  I am sure there are archives
>   that are no longer used, so they could be removed.  The key is to
>   determine what is the proper period.

I'm already de-indexing them from the list of lists after
six months of inactivity. Deletion of defunct lists is probably
quite reasonable.
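
The six-month inactivity rule can be sketched roughly as below. This is
only an illustration: the function name, the list names, and the dates
are all hypothetical, and it assumes the date of the newest message in
each archive is already known.

```python
from datetime import datetime, timedelta

# Assumption: "six months" is approximated as 182 days.
SIX_MONTHS = timedelta(days=182)

def find_inactive(last_post, now, idle=SIX_MONTHS):
    """Return the names of lists whose newest message is older than `idle`."""
    return sorted(name for name, ts in last_post.items() if now - ts > idle)

# Hypothetical lists and dates, for illustration only.
last_post = {
    "list-a": datetime(2001, 3, 1),    # quiet for over a year
    "list-b": datetime(2002, 6, 20),   # recently active
}

stale = find_inactive(last_post, now=datetime(2002, 6, 24))
print(stale)  # ['list-a']
```

The same check could drive either de-indexing or deletion; only the
threshold would differ.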

> * Related to the previous one is to delete archives that have not
>   been accessed over a certain period of time.  Let usage determine what
>   should and should not stay.  Robot hits should be excluded.  Some
>   heuristics may need to be employed since some robots do not play
>   nice (like address harvesters).

Very hard.

  * Robot identification is hard
  * Robot traffic is high
  * Apache logs are enormously large (so I don't keep a long
    history)
  * I've turned off the "atime" records in the filesystem
    for improved performance.
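
To show why this is hard, here is a naive sketch of counting non-robot
hits per archive from Apache "combined" format log lines. Everything in
it is an assumption for illustration: the robot hint words, the sample
log lines, and the idea that the first path component names the archive.
A robot that sends a browser-like User-Agent (as address harvesters
often do) passes straight through this filter.

```python
import re
from collections import Counter

# Matches the request, status, size, referer and user-agent fields of
# an Apache "combined" log line.
LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Assumption: a crude list of substrings that well-behaved robots tend
# to put in their User-Agent. Badly behaved robots will not match.
ROBOT_HINTS = ("bot", "crawler", "spider", "slurp")

def human_hits(lines):
    """Count hits per archive, skipping requests that look like robots."""
    hits = Counter()
    for line in lines:
        m = LINE.search(line)
        if not m:
            continue  # not a parsable GET/HEAD request
        agent = m.group("agent").lower()
        if any(hint in agent for hint in ROBOT_HINTS):
            continue  # self-identified robot
        parts = m.group("path").split("/")
        if len(parts) > 1 and parts[1]:
            hits[parts[1]] += 1  # first path component = archive name
    return hits

# Hypothetical log lines, for illustration only.
sample = [
    '192.0.2.1 - - [23/Jun/2002:14:56:00 -0700] "GET /list-a@example.org/msg00001.html HTTP/1.0" 200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    '192.0.2.2 - - [23/Jun/2002:14:57:00 -0700] "GET /list-a@example.org/msg00002.html HTTP/1.0" 200 4096 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
]

print(human_hits(sample))
```

Even with a better filter, this only works if the logs are kept long
enough to cover the whole access period in question.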

> * Remove archives that are just duplicates of a list's "official"
>   archives (this would actually affect me :-)  For example,
>   I see that there are several cygwin.com lists archived at
>   mail-archive.com, but cygwin.com keeps their own set of archives
>   at <http://cygwin.com/lists.html>.

Hard to identify. But banning all of YahooGroups was one step
in this direction.

> What is the space limitation of your current hosting provider?

Current hosting provider is donating co-location service, but I can't
swap out to a bigger machine. I have about 1.5 terabits there at the
moment.

-Jeff




_______________________________________________
Gossip mailing list
[EMAIL PROTECTED]
http://jab.org/cgi-bin/mailman/listinfo/gossip




