Since HTML treats newlines as ordinary whitespace rather than line breaks, I'd probably do this:

1) replace <br( /)?> tags with something unusual, like the bell character, chr(7).
2) remove HTML
3) remove *all* newlines, maybe with several replace() calls instead of a
single rereplace() -- efficiency tests may be required.
4) replace the bell with newlines
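
A rough sketch of those four steps, reusing the "string" variable from the code quoted below. It is untested against real list mail. The '<br\s*/?>' pattern is my own assumption: it is slightly broader than <br( /)?> and assumes the CFMX regex engine, which understands \s.

```cfml
<!--- 1) Mark every <br>, <br/> or <br /> with the bell character --->
<cfset string = rereplacenocase(string, '<br\s*/?>', chr(7), 'all')>
<!--- 2) Remove all remaining HTML tags --->
<cfset string = rereplace(string, '<[^>]+>', '', 'all')>
<!--- 3) Remove every literal carriage return and line feed --->
<cfset string = replace(string, chr(13), '', 'all')>
<cfset string = replace(string, chr(10), '', 'all')>
<!--- 4) Turn the bell markers back into newlines, then trim --->
<cfset string = trim(replace(string, chr(7), chr(10), 'all'))>
```

If words on adjacent source lines end up run together, step 3 could swap each newline for a space instead of an empty string.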

I'm not sure if that's *better*, but that's what I'd do.

--Ben "Ninja" Doom

Michael Dinowitz wrote:
> Currently, these lists block mail that does not have a text portion.
> This means that HTML only email does not get through. I've decided to
> try out some code to take those emails that are HTML only and strip
> out the HTML while trying to retain some of the line formatting. I
> came up with this:
> <!--- Replace all breaks with a single carriage return --->
> <cfset string=replacenocase(string, '<br>', chr(10), 'all')>
> <!--- Remove all HTML tags --->
> <cfset string=rereplace(string, '<[^>]+>', '', 'all')>
> <!--- If there are 3 or more newlines in a row, turn them into 2
> newlines. Some newlines have spaces after them. Finally, trim the
> string --->
> <cfset string=trim(rereplace(string, '(?:(\n|\r){2}\s*){3,}', '\1\1', 'all'))>
> 
> Anyone see a problem here or a way to do it better?
> 
> Thanks
> 


Archive: http://www.houseoffusion.com/groups/RegEx/message.cfm/messageid:1049