Mike Alberghini wrote:
>
>The archive directories contain each months mail in three formats:
>
>1. a plaintext file: 2004-November.txt
>2. a gzipped file: 2004-November.txt.gz
>3. a directory: 2004-November - contains individual HTML messages.
>
>The web archive uses the files in the directory, and links to the gzipped
>file. Does anything use the plaintext file? It seems like it's wasting a
>ton of diskspace having the same file gzipped and unzipped in the same space.

How the .txt file is used depends on the setting of
GZIP_ARCHIVE_TXT_FILES in mm_cfg.py.

If this is set to Yes, the .txt file only exists temporarily while the
archiver unzips the .txt.gz and appends the .txt into a new .txt.gz.
With this setting, there are no permanent .txt files, but this is a very
inefficient process (see comments in Defaults.py).

If GZIP_ARCHIVE_TXT_FILES is No, then the archive is accumulated in the
.txt file and is gzipped by a nightly cron job. In this case, the .txt
files for prior months can be deleted, provided no new messages ever
arrive for that month. This can't always be guaranteed, as a message
could be delayed in transit or have a bad date. In general though, old
.txt files can be deleted, and if a "late" message did arrive and cause
loss of the .txt.gz information, the archive could be rebuilt from the
<list>.mbox/<list>.mbox file with bin/arch.

>So, first off, can I delete the year-month.txt files without causing harm?

Generally, yes, after the month is over.

>Second, once the current month is over, can I prevent the non-zipped files
>from ever existing?

You can set GZIP_ARCHIVE_TXT_FILES = Yes in mm_cfg.py if you're willing
to live with the additional processing to unzip/rezip the .txt.gz file
for each message.

>Finally, is there a way to prevent the archiving of
>attachments?

If you don't want to use content filtering to keep them off the list
entirely, then I think it would require a somewhat tricky hack. You
could modify the code in Mailman/Handlers/Scrubber.py, but this would
also affect digests - that's where it gets tricky.

--
Mark Sapiro <[EMAIL PROTECTED]>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
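
As a rough illustration of the cleanup discussed above (GZIP_ARCHIVE_TXT_FILES set to No, with old .txt files removed by hand), here is a minimal sketch that only lists candidates and deletes nothing. It is not part of Mailman. The function name stale_txt_files and the idea of passing the archive directory on the command line are assumptions made for this example. The file naming (2004-November.txt next to 2004-November.txt.gz) is taken from the post above. A file is reported only if it belongs to a month before the current one and a gzipped copy already exists beside it.

```python
import datetime
import os
import re
import sys

# Monthly archive names as described in the post, e.g. "2004-November.txt".
MONTHLY_TXT = re.compile(r"^(\d{4})-([A-Z][a-z]+)\.txt$")
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def stale_txt_files(archive_dir, today=None):
    """Return names (sorted by name) of prior-month .txt files that
    already have a .txt.gz copy in the same directory.

    Hypothetical helper for illustration; it never deletes anything.
    """
    today = today or datetime.date.today()
    current = (today.year, today.month)
    candidates = []
    for name in sorted(os.listdir(archive_dir)):
        match = MONTHLY_TXT.match(name)
        if not match or match.group(2) not in MONTHS:
            continue  # not a monthly plaintext archive file
        period = (int(match.group(1)), MONTHS.index(match.group(2)) + 1)
        has_gz = os.path.isfile(os.path.join(archive_dir, name + ".gz"))
        # Skip the current month (still accumulating) and any file
        # without a gzipped copy to fall back on.
        if period < current and has_gz:
            candidates.append(name)
    return candidates


if __name__ == "__main__" and len(sys.argv) == 2:
    # Print candidates only; review the list before removing anything.
    for candidate in stale_txt_files(sys.argv[1]):
        print(candidate)
```

Run it with the list's archive directory as the only argument and review the output before deleting. The caveat from the reply still applies: a late or misdated message can recreate an old month's .txt file, in which case the archive may need to be rebuilt from the mbox with bin/arch.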
