Re: [Mailman-Users] Blank Characters Removed from Subject: Line

2007-06-29 Thread Barry Finkel
Barry Finkel writes:

  I am running Mailman 2.1.9.  I have a list where one posting has a
  Subject: line:
  
   Change in Procedure for Computers on list with possible Antivirus 
  Problems 
  
  The next posting in the thread has:
  
   Change in Procedure for Computers on list with possible 
  AntivirusProblems 

and Stephen J. Turnbull [EMAIL PROTECTED] replied:

What is happening, I guess, is that Mailman is folding that header to
keep it within some number of characters, maybe 76 or so.  RFC 2822
specifies that this may be done by inserting a linebreak (CRLF) before
whitespace.  The RFC implies that the right thing to do in that case
is to remove the CRLF only, but some MUAs also remove a space.  I
suspect that is what is happening in this case.

Can you post a copy of the raw header as received by Mailman and as
sent by Mailman?

Below are pieces of two messages.  I have the original message
from the archives of the sender followed by the relevant lines of
the list .mbox file (including line numbers).

===
-----Original Message-----
From: ...
Sent: Tuesday, June 26, 2007 2:30 PM
To: ...
Subject: RE: Change in Procedure for Computers on list with possible
AntivirusProblems

Not a question ...
===
 184331 Subject: RE: Change in Procedure for Computers on list with possible
 184332 AntivirusProblems
 184333 Date: Tue, 26 Jun 2007 14:29:52 -0500
 184342 From: ...
 184343 To: ...

 184358 
 184359 Not a question ...

===
===
-----Original Message-----
From: ...
Sent: Tuesday, June 26, 2007 3:50 PM
To: ...
Cc: ...
Subject: RE: Change in Procedure for Computers on list
withpossibleAntivirusProblems

Hi ...
===
 184735 Subject: RE: Change in Procedure for Computers on list
 184736 withpossibleAntivirusProblems
 184737 Date: Tue, 26 Jun 2007 15:50:20 -0500
 184747 From: ...
 184748 To: ...
 184751 Cc: ...

 184765 
 184766 Hi ...
===
===

In both cases, I do not see that Mailman has removed any blanks from
the Subject: line.
--
Barry S. Finkel
Computing and Information Systems Division
Argonne National Laboratory  Phone:+1 (630) 252-7277
9700 South Cass Avenue   Facsimile:+1 (630) 252-4601
Building 222, Room D209  Internet: [EMAIL PROTECTED]
Argonne, IL   60439-4828 IBMMAIL:  I1004994

--
Mailman-Users mailing list
Mailman-Users@python.org
http://mail.python.org/mailman/listinfo/mailman-users
Mailman FAQ: http://www.python.org/cgi-bin/faqw-mm.py
Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/
Unsubscribe: 
http://mail.python.org/mailman/options/mailman-users/archive%40jab.org

Security Policy: 
http://www.python.org/cgi-bin/faqw-mm.py?req=show&file=faq01.027.htp


Re: [Mailman-Users] Blank Characters Removed from Subject: Line

2007-06-29 Thread Mark Sapiro
Barry Finkel wrote:
 
 Below are pieces of two messages.  I have the original message
 from the archives of the sender followed by the relevant lines of
 the list .mbox file (including line numbers).
 
 ===
 -----Original Message-----
 From: ...
 Sent: Tuesday, June 26, 2007 2:30 PM
 To: ...
 Subject: RE: Change in Procedure for Computers on list with possible
 AntivirusProblems


This is some rendering of the Subject:, but it is not the actual
Subject: header. If it were, the Subject: of the outgoing message would
be simply

Subject: RE: Change in Procedure for Computers on list with possible

since a header continuation must begin with at least one whitespace
character. You need to get the 'message source' from the sender.
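The continuation rule can be illustrated with a minimal Python sketch. The function name split_fields is hypothetical and the field-name check is deliberately simplified; this is not how Mailman or the Python email library parses headers, only a model of the rule that a continuation line must start with SP or TAB.

```python
def split_fields(physical_lines):
    """Group physical header lines into logical header fields.

    A line that starts with SP or TAB continues the previous field.
    Any other line must start a new field; lines that do neither are
    returned separately as strays. Simplified: a real parser validates
    field names far more strictly than "contains a colon".
    """
    fields, strays = [], []
    for line in physical_lines:
        if fields and line[:1] in (" ", "\t"):
            fields[-1] += "\r\n" + line      # continuation of the last field
        elif ":" in line:
            fields.append(line)              # start of a new field
        else:
            strays.append(line)              # not part of any header field
    return fields, strays


first = "Subject: RE: Change in Procedure for Computers on list with possible"

# The text as rendered in the sender's archive: no leading whitespace.
as_rendered = split_fields([first, "AntivirusProblems"])

# The same text as a genuinely folded header: second line starts with TAB.
as_folded = split_fields([first, "\tAntivirusProblems"])

print(as_rendered)
print(as_folded)
```

In the rendered form the second line is not attached to the Subject: field at all, which is why that text cannot be the actual header.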

 Not a question ...
 ===
  184331 Subject: RE: Change in Procedure for Computers on list with possible
  184332 AntivirusProblems


And here we see Mailman has sent the post with the subject folded with a
tab as the whitespace character.

  184333 Date: Tue, 26 Jun 2007 14:29:52 -0500
  184342 From: ...
  184343 To: ...
 
  184358 
  184359 Not a question ...
 
 ===
 ===
 -----Original Message-----
 From: ...
 Sent: Tuesday, June 26, 2007 3:50 PM
 To: ...
 Cc: ...
 Subject: RE: Change in Procedure for Computers on list
 withpossibleAntivirusProblems


And someone's MUA has dropped the tab in unfolding (and this has
happened more than once).

I think it would be better if Mailman folded using SP rather than
TAB, since with TAB a standards-compliant unfolding would leave a
tab in the middle of the subject, which may be worse than dropping it.

But, the fact remains that there are many commonly used MUAs that drop a
whitespace character in unfolding and there's not much we can do about that.
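The difference between the two unfolding behaviours can be sketched in a few lines of Python. Both function names are made up for this illustration, and unfold_lossy is only a guess at what a non-compliant MUA does, not code taken from any real mail client.

```python
import re


def unfold_compliant(raw_header):
    """Remove only a line break that is immediately followed by SP or TAB.

    The whitespace character itself stays in the unfolded value.
    """
    return re.sub(r"\r?\n(?=[ \t])", "", raw_header)


def unfold_lossy(raw_header):
    """Hypothetical model of a non-compliant MUA.

    Removes the line break and also the whitespace character after it.
    """
    return re.sub(r"\r?\n[ \t]", "", raw_header)


folded_with_tab = (
    "Subject: RE: Change in Procedure for Computers on list with possible"
    "\r\n\tAntivirusProblems"
)

print(repr(unfold_compliant(folded_with_tab)))
print(repr(unfold_lossy(folded_with_tab)))
```

With a TAB fold, the compliant result carries a literal tab inside the subject, while the lossy result runs the two words together.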

-- 
Mark Sapiro [EMAIL PROTECTED]        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan



Re: [Mailman-Users] Blank Characters Removed from Subject: Line

2007-06-29 Thread Stephen J. Turnbull
Mark Sapiro writes:

  But, the fact remains that there are many commonly used MUAs that drop a
  whitespace character in unfolding and there's not much we can do about that.

I wonder if they're better with RFC 2047.  That is, suppose we
rendered

Subject: Pretend this is a long field

as

Subject: Pretend this is =?US-ASCII?Q?a=20?=
 =?US-ASCII?Q?long=20field?=

Of course, that would be just unbearably ugly if your MUA doesn't do
MIME headers.

Maybe the best course would be to use two spaces at the beginning of a
folded physical line.
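For anyone who wants to see how a MIME-aware reader treats the encoded form, here is a small Python sketch using the standard email package with the modern policy. The message body is made up; only the Subject: value comes from the example above. It shows one parser's behaviour and says nothing about how any particular MUA would display the header.

```python
from email import message_from_string
from email.policy import default

# The folded, RFC 2047 encoded Subject: from the example above,
# wrapped in a tiny made-up message so the parser has something to read.
raw_message = (
    "Subject: Pretend this is =?US-ASCII?Q?a=20?=\n"
    " =?US-ASCII?Q?long=20field?=\n"
    "\n"
    "body\n"
)

message = message_from_string(raw_message, policy=default)
decoded_subject = str(message["Subject"])

# repr() makes every whitespace character in the result visible.
print(repr(decoded_subject))
```

The spaces that matter are carried inside the encoded words as =20, so they do not depend on how the reader treats the whitespace at the fold.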


Re: [Mailman-Users] Blank Characters Removed from Subject: Line

2007-06-28 Thread Barry Finkel
Barry Finkel writes:

  I am running Mailman 2.1.9.  I have a list where one posting has a
  Subject: line:
  
   Change in Procedure for Computers on list with possible Antivirus 
  Problems 
  
  The next posting in the thread has:
  
   Change in Procedure for Computers on list with possible 
  AntivirusProblems 

Stephen J. Turnbull [EMAIL PROTECTED] replied:

What is happening, I guess, is that Mailman is folding that header to
keep it within some number of characters, maybe 76 or so.  RFC 2822
specifies that this may be done by inserting a linebreak (CRLF) before
whitespace.  The RFC implies that the right thing to do in that case
is to remove the CRLF only, but some MUAs also remove a space.  I
suspect that is what is happening in this case.

Can you post a copy of the raw header as received by Mailman and as
sent by Mailman?

I am not sure I have this information.  What I see (and posted) from
the list .mbox file - is that what was received by Mailman or what was
sent?

--
Barry S. Finkel
Computing and Information Systems Division
Argonne National Laboratory  Phone:+1 (630) 252-7277
9700 South Cass Avenue   Facsimile:+1 (630) 252-4601
Building 222, Room D209  Internet: [EMAIL PROTECTED]
Argonne, IL   60439-4828 IBMMAIL:  I1004994



Re: [Mailman-Users] Blank Characters Removed from Subject: Line

2007-06-28 Thread Mark Sapiro
Barry Finkel wrote:

Stephen J. Turnbull [EMAIL PROTECTED] replied:

Can you post a copy of the raw header as received by Mailman and as
sent by Mailman?

I am not sure I have this information.  What I see (and posted) from
the list .mbox file - is that what was received by Mailman or what was
sent?


It's essentially the message as sent by Mailman to the list, but
without the msg_header and msg_footer if any.

To get the incoming message, you'd have to go to the poster's sent
messages or get a Cc or Bcc from the poster.
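To look at the Subject: headers in a list mbox file with the fold whitespace made visible, something like the following Python sketch may help. The helper name raw_subjects, the sample address and the temporary file are all made up for the demonstration; point the helper at a copy of the real mbox instead.

```python
import mailbox
import os
import tempfile

# Made-up sample data: one message whose Subject: is folded with a TAB.
SAMPLE_MBOX = (
    "From sender@example.com Tue Jun 26 14:29:52 2007\n"
    "Subject: RE: Change in Procedure for Computers on list with possible\n"
    "\tAntivirusProblems\n"
    "\n"
    "Not a question ...\n"
)


def raw_subjects(mbox_path):
    """Return the Subject: value of every message, folding left intact."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        return [message["Subject"] for message in box]
    finally:
        box.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.mbox")
    with open(path, "w", newline="\n") as handle:
        handle.write(SAMPLE_MBOX)
    subjects = raw_subjects(path)

# repr() shows whether the fold uses a tab (\t) or a space.
for subject in subjects:
    print(repr(subject))
```

Viewing the values through repr() avoids the problem of a pager or mail client hiding the difference between a tab and a space.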

-- 
Mark Sapiro [EMAIL PROTECTED]   The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan



[Mailman-Users] Blank Characters Removed from Subject: Line

2007-06-27 Thread Stephen J. Turnbull
Barry Finkel writes:

  I am running Mailman 2.1.9.  I have a list where one posting has a
  Subject: line:
  
   Change in Procedure for Computers on list with possible Antivirus 
  Problems 
  
  The next posting in the thread has:
  
   Change in Procedure for Computers on list with possible 
  AntivirusProblems 

What is happening, I guess, is that Mailman is folding that header to
keep it within some number of characters, maybe 76 or so.  RFC 2822
specifies that this may be done by inserting a linebreak (CRLF) before
whitespace.  The RFC implies that the right thing to do in that case
is to remove the CRLF only, but some MUAs also remove a space.  I
suspect that is what is happening in this case.

Can you post a copy of the raw header as received by Mailman and as
sent by Mailman?




Re: [Mailman-Users] Blank Characters Removed from Subject: Line

2007-06-27 Thread Mark Sapiro
Barry Finkel wrote:

I am running Mailman 2.1.9.  I have a list where one posting has a
Subject: line:

 Change in Procedure for Computers on list with possible Antivirus 
 Problems 

The next posting in the thread has:

 Change in Procedure for Computers on list with possible AntivirusProblems 

A subsequent posting (not treated in the list archives as the same
thread) has

 Change in Procedure for Computers on list with possibleAntivirusProblems 


First of all, threading in Mailman's pipermail archive is not based on
Subject: at all. It is based on In-Reply-To: and References: headers
so if a reply is not added to the thread, it is because the replier's
MUA didn't add an In-Reply-To: or References: header, or it added one
or both of these referencing an off-list post not in the archive.

The Subject: issue you observe has nothing to do with whether or not a
reply is properly threaded in the archive.
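As a rough illustration of threading by reference headers, here is a minimal Python sketch. It is not pipermail's code; the function name find_parent, the order in which candidates are tried, and the example Message-IDs are all assumptions made for the example. The only point is that Subject: is never consulted.

```python
def find_parent(headers, archived_ids):
    """Pick the archived message this one replies to, or None.

    headers:      dict that may contain "In-Reply-To" and "References".
    archived_ids: set of Message-IDs already present in the archive.
    The candidate order (In-Reply-To first, then References from newest
    to oldest) is an assumption of this sketch.
    """
    candidates = headers.get("In-Reply-To", "").split()
    candidates += reversed(headers.get("References", "").split())
    for message_id in candidates:
        if message_id in archived_ids:
            return message_id
    return None  # nothing usable: the message starts a new thread


archive = {"<original@example.org>"}

# Mangled subject, but a valid In-Reply-To: still threaded.
threaded = find_parent(
    {
        "Subject": "RE: Change in Procedure for Computers on list "
                   "withpossibleAntivirusProblems",
        "In-Reply-To": "<original@example.org>",
    },
    archive,
)

# Intact subject, but no reference headers: starts a new thread.
unthreaded = find_parent(
    {"Subject": "RE: Change in Procedure for Computers on list "
                "with possible Antivirus Problems"},
    archive,
)

# References only an off-list post that is not in the archive.
off_list = find_parent({"In-Reply-To": "<private@example.org>"}, archive)

print(threaded, unthreaded, off_list)
```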


The next posting in the thread has:

 Change in Procedure for Computers on list withpossibleAntivirusProblems 

The final posting in this thread has the same Subject: line as
immediately above.  I am not subscribed to this list and cannot
post to it, so I do not know if a subsequent posting to this thread
will remove another blank character in the Subject: line.


It may or may not.


What I see in the list mbox file are these lines (with line numbers):

--
184232 Subject: Change in Procedure for Computers on list with possible Antivirus
184233 Problems
--
184331 Subject: RE: Change in Procedure for Computers on list with possible
184332 AntivirusProblems
--
184456 Subject: Re: Change in Procedure for Computers on list with
184457 possibleAntivirusProblems
--
184566 Subject: RE: Change in Procedure for Computers on list
184567 withpossibleAntivirusProblems
--
184735 Subject: RE: Change in Procedure for Computers on list
184736 withpossibleAntivirusProblems
--

Note that the original subject is split into two lines.

What might be causing this?  Is this a problem with Mailman, or is it
a problem with the sender's Mail User Agent (probably Outlook), or
a problem with the sender's mail system:


All of the above, or at least the first two.


As the mbox file has the blanks removed, I have to believe that
it is not Mailman that is removing the blanks.


The basic issue revolves around the rules for folding and unfolding
long header lines. The original standard was RFC 822
http://www.faqs.org/rfcs/rfc822.html, sec 3.1.1. The current
recommendation is RFC 2822 http://www.faqs.org/rfcs/rfc2822.html,
sec 2.2.3.

While careful reading of these two standards shows they are almost the
same with respect to folding and exactly the same with respect to
unfolding, the RFC 822 rules can result in the insertion of extra
white space (the opposite of what you see here). Further, many MUAs and
other mail processing software (such as the Python email library used
by Mailman) don't follow the rules exactly, perhaps because in trying
to compensate for too much white space they remove too much. Also, the
rules really work best with structured headers where white space
occurs between syntactic fields, and not so well with free form text
headers like Subject:.

Aside: I just read Stephen's reply in which he says "The RFC implies
that the right thing to do in that case is to remove the CRLF only,
but some MUAs also remove a space." And, I add, "or a tab." This
whitespace removal in unfolding is the crux of the issue.

Part of the problem is Mailman will unfold and refold the header in the
process of adding the subject_prefix. This process will lengthen the
header, perhaps causing it to be folded when it wasn't before or
folded in a different place. Also, Mailman tends to fold with CRLF
followed by TAB, and MUAs tend to remove the TAB in unfolding.

Also, if a subject is folded and the white space is then removed in
unfolding, the joined 'word' becomes long, so the next fold will tend
to fall at the SP preceding that long 'word', and then that SP can be
lost in unfolding as well.
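The cascade described above can be simulated with a short Python sketch. Everything here is an assumption made for illustration: the 76-character limit (the "76 or so" guessed at elsewhere in the thread), a single greedy fold that writes CRLF TAB in place of the space, an MUA that drops the TAB on every unfold, and the helper names. It is not Mailman's folding algorithm.

```python
WIDTH = 76  # assumed line-length limit


def fold_once(header, width=WIDTH):
    """Fold at the last space that keeps the first line within width.

    The space is replaced by CRLF TAB. Headers that already fit, or
    that contain no usable space, are returned unchanged.
    """
    if len(header) <= width:
        return header
    cut = header.rfind(" ", 0, width + 1)
    if cut == -1:
        return header
    return header[:cut] + "\r\n\t" + header[cut + 1:]


def lossy_unfold(folded):
    """Hypothetical non-compliant MUA: drops the line break and the TAB."""
    return folded.replace("\r\n\t", "")


def as_reply(header):
    """Add a single "RE: " after "Subject: " if it is not there yet."""
    prefix = "Subject: "
    text = header[len(prefix):]
    if not text.upper().startswith("RE: "):
        text = "RE: " + text
    return prefix + text


header = ("Subject: Change in Procedure for Computers on list "
          "with possible Antivirus Problems")
history = []
for _ in range(3):
    header = lossy_unfold(fold_once(header))  # one trip through list and MUA
    history.append(header)
    header = as_reply(header)                 # the next poster replies

for seen in history:
    print(seen)
```

Under these assumptions each round trip loses the space just before the growing run-together 'word', one space at a time.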

Mailman could behave in a completely RFC compliant manner (it doesn't),
and there would still be the problem because MUAs don't behave in a
completely compliant manner.

Also note that the actual removal of whitespace is done by the MUA, not
Mailman, but that doesn't let Mailman completely off the hook, because
in some cases, Mailman may replace SP with TAB and the MUA may be
more likely to remove TAB.

See the (multiple) threads with subject "Subject Lines Wrapped After
Commas, (Like This?)" starting at
http://mail.python.org/pipermail/mailman-users/2007-May/057117.html
for a different but related discussion.

-- 
Mark Sapiro [EMAIL PROTECTED]   The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
