it should. I would do:
1. Replace every run of whitespace with a single space
2. Replace <p> with CR/LF/CR/LF and <br> with CR/LF (maybe handle <td>
too?)
3. Remove the remaining tags (you may need to decode HTML entities too)
4. When you display: replace CR/LF with <br>
Something like:
<cfscript>
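// str is assumed to hold the HTML body of the message
crlf = chr(13) & chr(10);
// 1. collapse every run of whitespace (line breaks included) into one space
str = REReplace(str,"[[:space:]]+"," ","ALL");
// 2. put a blank line before each <p> and a line break before each <br>
str = REReplaceNoCase(str,"(<p[[:space:]>/])","#crlf##crlf#\1","ALL");
str = REReplaceNoCase(str,"(<br[[:space:]>/])","#crlf#\1","ALL");
// 3. strip every tag, including the <p> and <br> tags kept above
str = REReplace(str,"<[^>]+>","","ALL");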
</cfscript>
<cfoutput>#Replace(str,crlf,"<br/>","ALL")#</cfoutput>
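If you also want to decode the HTML codes mentioned in step 3, a minimal sketch could look like the block below. It assumes str has already had its tags stripped, so a decoded "<" is not mistaken for the start of a tag, and it only covers a handful of common named entities. The list is my own pick, not a complete one.

```cfm
<cfscript>
// Assumes str already went through the tag-stripping step.
// &amp; is decoded last so that "&amp;lt;" ends up as the
// literal text "&lt;" instead of being decoded twice into "<".
str = Replace(str, "&nbsp;", " ", "ALL");
str = Replace(str, "&lt;", "<", "ALL");
str = Replace(str, "&gt;", ">", "ALL");
str = Replace(str, "&quot;", '"', "ALL");
str = Replace(str, "&amp;", "&", "ALL");
</cfscript>
```

If the text is only ever shown inside an HTML page you can probably skip this, because the browser decodes the entities anyway. It matters more when the plain text is used elsewhere, for example in a search index or a text-only mail.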
Pascal
> -----Original Message-----
> From: Cedric Villat [mailto:[EMAIL PROTECTED]
> Sent: Thursday, 26 February 2004 16:41
> To: CF-Talk
> Subject: Parsing HTML Email
>
> So I'm using CFPOP to retrieve email messages and I'm storing
> the HTML version of the emails in the database. I'm then
> parsing the email on display, but I want to remove any sort
> of HTML formatting. I've tried 2 methods, and neither seems
> bulletproof.
>
> First try, I remove all HTML tags from the message using the
> StripTags udf.
> Then I convert all Chr(10)&Chr(13) to <br> for display. This
> seems ok, but the StripTags tag seems to remove things
> BETWEEN tags, so <div>Hello</div> would actually be removed
> from the message. Not ideal.
>
> So my second try was to use a RegEx to remove everything
> contained in a < and > and also change the carriage returns
> to <br>'s. Again, this seems ok, but the output doesn't
> always come out as desired or the way it *should* look in plain text.
>
> Is there an easier way to get the plain text version of my
> stored message?
>
> Cedric

