It looks like Yahoo! Groups has started sending out messages containing multiple HEAD elements in the text/html MIME subpart of its emails.
Unfortunately, this seems to break MHonArc, probably because it expects only one HEAD element and uses it to determine X-Body-of-Message et al.

I've been reading through the W3C DTD for HTML 4.01, and it's not clear to me whether HTML documents are allowed one HEAD element or more than one:

    http://www.w3.org/TR/html401/struct/global.html#edef-HEAD

If more than one is allowed, it's a bug in MHonArc. But if it's Yahoo!'s fault, MHonArc is in the clear and will just have to work around it. ;)

Does anyone know for sure how this should be handled? I have included a patch below to correct this, but I'm curious whether multiple HEAD tags are legal or not.

Thanks,
Chris

P.S. You can see the (icky) message at

    http://www.mallorn.com/~lindsey/multi-head-ick.txt

http://www.bonvivantnursery.com/              Bon Vivant Nursery
http://www.hort.net/gallery/                  4023 online plant photos and growing!
http://www.hort.net/gallery/date/2006-07-26/  The latest additions

diff -rc MHonArc-2.6.16/lib/mhtxthtml.pl MHonArc-2.6.16-headfix/lib/mhtxthtml.pl
*** MHonArc-2.6.16/lib/mhtxthtml.pl	Sun May  1 19:04:39 2005
--- MHonArc-2.6.16-headfix/lib/mhtxthtml.pl	Tue Nov 14 18:02:20 2006
***************
*** 186,194 ****
      $base =~ s|(.*/).*|$1|;
  
      ## Strip out certain elements/tags to support proper inclusion:
!     ## some browsers are forgiving about dublicating header tags, but
      ## we try to do things right.  It also help minimize XSS exploits.
!     $$data =~ s|<head\s*>[\s\S]*</head\s*>||io;
      1 while ($$data =~ s|<!doctype\s[^>]*>||gio);
      1 while ($$data =~ s|</?html\b[^>]*>||gio);
      1 while ($$data =~ s|</?x-html\b[^>]*>||gio);
--- 186,194 ----
      $base =~ s|(.*/).*|$1|;
  
      ## Strip out certain elements/tags to support proper inclusion:
!     ## some browsers are forgiving about duplicating header tags, but
      ## we try to do things right.  It also help minimize XSS exploits.
!     $$data =~ s|<head\s*>[\s\S]*?</head\s*>||gio;
      1 while ($$data =~ s|<!doctype\s[^>]*>||gio);
      1 while ($$data =~ s|</?html\b[^>]*>||gio);
      1 while ($$data =~ s|</?x-html\b[^>]*>||gio);
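
For anyone who wants to see the difference between the two patterns without running MHonArc, here is a minimal standalone sketch. The sample string is made up for illustration (it is not the actual Yahoo! message) and assumes the second HEAD element appears after some body content. The variable names are mine, not MHonArc's.

```perl
use strict;
use warnings;

# Hypothetical body with two HEAD elements; not the real Yahoo! message.
my $html = '<head><title>one</title></head><p>first part</p>'
         . '<head><style>p { color: red }</style></head><p>second part</p>';

# Original pattern: greedy match, applied once.  It runs from the first
# <head> to the LAST </head>, so the text between the two is lost too.
(my $greedy = $html) =~ s|<head\s*>[\s\S]*</head\s*>||io;

# Patched pattern: non-greedy match with /g.  Each HEAD element is
# removed on its own and the text between them survives.
(my $patched = $html) =~ s|<head\s*>[\s\S]*?</head\s*>||gio;

print "greedy:  $greedy\n";
print "patched: $patched\n";
```

With this input the greedy version keeps only the second paragraph, while the patched version keeps both. That would be consistent with the missing body text I'm seeing, though I haven't confirmed it is the only cause.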