Re: [Mailman-Users] Corrupted archives ...

2009-08-13 Thread Mark Sapiro
Glenn Sieb wrote:

Mark Sapiro said the following on 8/12/09 10:05 AM:

 As Terri suggests, you could run bin/cleanarch as an additional
 test/correction on the listname.mbox, since there may be unescaped
 "From " lines in message bodies that didn't confuse Mutt or that you
 didn't notice with Mutt, and then run bin/arch --wipe to rebuild the
 archive. But also be aware, as Terri says, that this may renumber
 messages and break saved links to archived messages.
   

*nods* This is an instance where I may have to go through manually with
vi and fix this email-by-email. :sigh:

It will take forever, considering there are 55k or so messages in the
archive.


If as you imply below, you've already run bin/arch --wipe in the recent
past, then you've already renumbered the archive, so don't worry
about doing it again.


 An alternative alternative is to just remove 2009-August/,
 2009-August.txt and 2009-August.txt.gz (if any) from
 archives/private/listname/ and then run bin/arch (without --wipe) with
 input just consisting of the Aug. 2009 portion of listname.mbox.
   
Ooh. Let me try that one.
 But the real questions are how did this happen; do the 128 messages
 all have Mon Aug 10 18:53:40 EDT 2009 timestamps or do they have
 different timestamps, and what may have been done at that/those times?
   
It was probably one of the times I ran arch --wipe.

And yes, they all have the same timestamp in the archives.

Let me try re-running the arch command with the 2009-August* files
removed

Odd. I had to manually create the 2009-August directory, but the problem
is still there. :-/

(I did bin/arch (listname))


I meant do

bin/arch (listname) /path/to/edited/mbox/containing/only/2009August.

However, if you've actually done bin/arch --wipe (listname) and wound
up with those strange no-subject messages in the current month, there
is either a problem with bin/arch or with the listname.mbox.

What happens if you run

 bin/cleanarch < /path/to/listname.mbox > /dev/null

-- 
Mark Sapiro m...@msapiro.net        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

--
Mailman-Users mailing list
Mailman-Users@python.org
http://mail.python.org/mailman/listinfo/mailman-users
Mailman FAQ: http://wiki.list.org/x/AgA3
Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/
Unsubscribe: http://mail.python.org/mailman/options/mailman-users/archive%40jab.org

Security Policy: http://wiki.list.org/x/QIA9


Re: [Mailman-Users] Corrupted archives ...

2009-08-12 Thread Mark Sapiro
Glenn Sieb wrote:

I'm running mailman-2.1.12, with the htdig patches on FreeBSD 7.0

I have a list with archives that are about 10 years old. The archive
mbox size is 175M.

I was alerted by a subscriber that the August 2009 archives list 128 No
subject emails that look funny.

So I looked.. sure enough they're there. And they look something like
this when I click on a single email listed in the archives:

No subject

Mon Aug 10 18:53:40 EDT 2009

* Previous message: [Redacted] Blah...
* Next message: No subject
* Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Tue, 14 Dec 1999 23:27:19 PST
X-Originating-IP: [63.11.227.157]
From: redacted redacted_at...
To: redacted
Date: Tue, 14 Dec 1999 23:27:19 PST
Mailing-List: contact redacted
X-Mailing-List: redacted
Precedence: bulk
List-Help: http://www.example.com/redacted/info.html,
  mailto:redacted at example.com
List-Unsubscribe: mailto:redacted-unsubscribe at example.com
List-Archive: http://www.example.com/redacted/
Reply-To: redacted
Subject: [Redacted] Redacted
MIME-Version: 1.0
Content-Type: text/plain; format=flowed
Content-Transfer-Encoding: 7bit
Status: RO
Content-Length: 7352
Lines: 174

(body of email starts here)

From Redacted redacted at u... Wed Dec 15 00:40:19 1999
Delivered-To: redacted
Received: (listserv 1.291); by f7; 15 Dec 1999 08:43:59 -
Delivered-To: redacted
Date: 15 Dec 99 03:44:15 EST
From: Redacted redacted at u...
To: redacted
X-Mailing-List: redacted
Precedence: bulk
List-Help: http://www.example.com/redacted/info.html,
  mailto:redacted at example.com
List-Unsubscribe: mailto:redacted-unsubscribe at example.com
List-Archive: http://www.example.com/redacted/
Reply-To: redacted
Subject: [Redacted] RedactedMIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII

Content-Transfer-Encoding: quoted-printable

(body of email starts here...)

(another email starts here, as above...)

(end of example)

Everything looks fine if I use mutt -f listname.mbox in the private
archives directory for the list.

Has anyone had problems like this? My GoogleFu is failing me, or at
least isn't showing me anything like this.


Do you see these Dec. 1999 messages when you look with Mutt?

There is a problem with a Debian patch, but the symptom is somewhat
different, and you're on FreeBSD anyway, so I don't think this is it.

It looks like someone or some script ran bin/arch on Mon Aug 10
18:53:40 EDT 2009 (and possibly at other times) with some spurious
input, but I'm not sure what that input would be. The puzzling part is
the Previous/Next/Sorted header which only appears in the periodic
index files.

As Terri suggests, you could run bin/cleanarch as an additional
test/correction on the listname.mbox, since there may be unescaped
"From " lines in message bodies that didn't confuse Mutt or that you
didn't notice with Mutt, and then run bin/arch --wipe to rebuild the
archive. But also be aware, as Terri says, that this may renumber
messages and break saved links to archived messages.

An alternative alternative is to just remove 2009-August/,
2009-August.txt and 2009-August.txt.gz (if any) from
archives/private/listname/ and then run bin/arch (without --wipe) with
input just consisting of the Aug. 2009 portion of listname.mbox.
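
For illustration only, here is one way the single-month input file could
be produced with Python's standard mailbox module instead of a text
editor. This is a sketch, not part of Mailman: the function name
extract_month and the output path are made up for the example, and it
selects messages by their Date: header, which is an assumption (the
archiver may file a message under a different date, and messages with a
missing or unparseable Date: are simply skipped here).

```python
import email.utils
import mailbox


def extract_month(src_path, dst_path, year, month):
    """Copy messages whose Date: header falls in (year, month) to dst_path.

    Returns the number of messages copied. The source mbox is only read.
    """
    src = mailbox.mbox(src_path, create=False)
    dst = mailbox.mbox(dst_path)  # created if it does not exist yet
    copied = 0
    try:
        for msg in src:
            date_header = msg.get("Date")
            if not date_header:
                continue  # no Date: header, cannot classify
            parsed = email.utils.parsedate_tz(str(date_header))
            if parsed is None:
                continue  # unparseable date, skip rather than guess
            if (parsed[0], parsed[1]) == (year, month):
                dst.add(msg)
                copied += 1
        dst.flush()
    finally:
        dst.close()
        src.close()
    return copied


# Hypothetical usage (paths are placeholders):
# extract_month("listname.mbox", "/tmp/aug2009.mbox", 2009, 8)
```

The resulting file would then be given to bin/arch as the mbox argument
after the list name. Work on a copy of listname.mbox if in doubt.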

But the real questions are how did this happen; do the 128 messages
all have Mon Aug 10 18:53:40 EDT 2009 timestamps or do they have
different timestamps, and what may have been done at that/those times?

-- 
Mark Sapiro m...@msapiro.net        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan



Re: [Mailman-Users] Corrupted archives ...

2009-08-12 Thread Glenn Sieb
Mark Sapiro said the following on 8/12/09 10:05 AM:
 Do you see these Dec. 1999 messages when you look with Mutt?
   

*doublechecking* Yes. They look fine.

 It looks like someone or some script ran bin/arch on Mon Aug 10
 18:53:40 EDT 2009 (and possibly at other times) with some spurious
 input, but I'm not sure what that input would be. The puzzling part is
 the Previous/Next/Sorted header which only appears in the periodic
 index files.
   
Yup. My archives are indexed automagically by Month-Year...

 As Terri suggests, you could run bin/cleanarch as an additional
 test/correction on the listname.mbox, since there may be unescaped
 "From " lines in message bodies that didn't confuse Mutt or that you
 didn't notice with Mutt, and then run bin/arch --wipe to rebuild the
 archive. But also be aware, as Terri says, that this may renumber
 messages and break saved links to archived messages.
   

*nods* This is an instance where I may have to go through manually with
vi and fix this email-by-email. :sigh:

It will take forever, considering there are 55k or so messages in the
archive.
 An alternative alternative is to just remove 2009-August/,
 2009-August.txt and 2009-August.txt.gz (if any) from
 archives/private/listname/ and then run bin/arch (without --wipe) with
 input just consisting of the Aug. 2009 portion of listname.mbox.
   
Ooh. Let me try that one.
 But the real questions are how did this happen; do the 128 messages
 all have Mon Aug 10 18:53:40 EDT 2009 timestamps or do they have
 different timestamps, and what may have been done at that/those times?
   
It was probably one of the times I ran arch --wipe.

And yes, they all have the same timestamp in the archives.

Let me try re-running the arch command with the 2009-August* files
removed

Odd. I had to manually create the 2009-August directory, but the problem
is still there. :-/

(I did bin/arch (listname))

Thanks, Mark!
--Glenn




[Mailman-Users] Corrupted archives ...

2009-08-10 Thread Glenn Sieb
Hullo,

I'm running mailman-2.1.12, with the htdig patches on FreeBSD 7.0

I have a list with archives that are about 10 years old. The archive
mbox size is 175M.

I was alerted by a subscriber that the August 2009 archives list 128 No
subject emails that look funny.

So I looked.. sure enough they're there. And they look something like
this when I click on a single email listed in the archives:

No subject

Mon Aug 10 18:53:40 EDT 2009

* Previous message: [Redacted] Blah...
* Next message: No subject
* Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Tue, 14 Dec 1999 23:27:19 PST
X-Originating-IP: [63.11.227.157]
From: redacted redacted_at...
To: redacted
Date: Tue, 14 Dec 1999 23:27:19 PST
Mailing-List: contact redacted
X-Mailing-List: redacted
Precedence: bulk
List-Help: http://www.example.com/redacted/info.html,
  mailto:redacted at example.com
List-Unsubscribe: mailto:redacted-unsubscribe at example.com
List-Archive: http://www.example.com/redacted/
Reply-To: redacted
Subject: [Redacted] Redacted
MIME-Version: 1.0
Content-Type: text/plain; format=flowed
Content-Transfer-Encoding: 7bit
Status: RO
Content-Length: 7352
Lines: 174

(body of email starts here)

From Redacted redacted at u... Wed Dec 15 00:40:19 1999
Delivered-To: redacted
Received: (listserv 1.291); by f7; 15 Dec 1999 08:43:59 -
Delivered-To: redacted
Date: 15 Dec 99 03:44:15 EST
From: Redacted redacted at u...
To: redacted
X-Mailing-List: redacted
Precedence: bulk
List-Help: http://www.example.com/redacted/info.html,
  mailto:redacted at example.com
List-Unsubscribe: mailto:redacted-unsubscribe at example.com
List-Archive: http://www.example.com/redacted/
Reply-To: redacted
Subject: [Redacted] RedactedMIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII

Content-Transfer-Encoding: quoted-printable

(body of email starts here...)

(another email starts here, as above...)

(end of example)

Everything looks fine if I use mutt -f listname.mbox in the private
archives directory for the list.

Has anyone had problems like this? My GoogleFu is failing me, or at
least isn't showing me anything like this.

Thanks in advance!
--Glenn


Re: [Mailman-Users] Corrupted archives ...

2009-08-10 Thread Terri Oda

Glenn Sieb wrote:

I have a list with archives that are about 10 years old. The archive
mbox size is 175M.

I was alerted by a subscriber that the August 2009 archives list 128 No
subject emails that look funny.

[snip]

From: redacted redacted_at...
To: redacted
Date: Tue, 14 Dec 1999 23:27:19 PST

(body of email starts here)


From Redacted redacted at u... Wed Dec 15 00:40:19 1999

Delivered-To: redacted


Have you tried running bin/cleanarch and then rerunning bin/arch to 
regenerate the messages?  It's possible what you're seeing could be 
caused by messed up From lines in your old mbox file (used by the 
archiver to determine the start of messages).  Mutt may just have a more 
forgiving parser.
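
To get a feel for whether the mbox has this problem before changing
anything, a read-only scan along these lines may help. It is only a
sketch: it lists lines that begin with "From " but do not look like a
full separator line (address followed by a ctime-style date). The
pattern STRICT_FROM and the function names are invented for this
example and are not the test bin/cleanarch itself applies, so treat any
hits as candidates to inspect, not as a definitive list.

```python
import re

# Assumed shape of a well-formed separator, e.g.
#   From user@example.com Wed Dec 15 00:40:19 1999
STRICT_FROM = re.compile(
    r"^From \S+\s+"
    r"(Mon|Tue|Wed|Thu|Fri|Sat|Sun) "
    r"(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) "
    r"[ \d]\d \d\d:\d\d:\d\d \d{4}\s*$"
)


def suspicious_from_lines(lines):
    """Yield (line_number, text) for 'From ' lines failing the strict test."""
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        if text.startswith("From ") and not STRICT_FROM.match(text):
            yield number, text


def scan_mbox(path):
    """Return the suspicious lines found in the mbox file at path."""
    # latin-1 maps every byte to a character, so decoding never fails.
    with open(path, "r", encoding="latin-1") as handle:
        return list(suspicious_from_lines(handle))


# Hypothetical usage (path is a placeholder):
# for number, text in scan_mbox("listname.mbox"):
#     print(number, text)
```

Lines already escaped as ">From " are not reported, since they no
longer start with "From ".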


Be warned, though, if you regenerate the entire archive, then the links 
in your archive will change (i.e. old posts that people have linked will 
no longer be in the same spot).


 Terri
