Title: Re: =20 line endings after Remove HTML
The =20 stuff comes from messages using Quoted Printable encoding. I almost never get them any more, but I have a script that fixes them. There can be stuff also like =3D for hyphens, and so on. Dan Crevier wrote it years ago using a scripting addition that didn’t make it to OS X. I added the vanilla AppleScript subroutine to do the decoding. It has a lot of options I’ve turned off so the script does nothing but decode QP and remove extra spaces (it could be much shorter if the unused code were deleted). This should fix all of them:
(*
Clean message
Dan Crevier 2/25/98
<mailto:[EMAIL PROTECTED]>
A configurable script to clean messages. Run this script with a message window open to
optionally:
• Clean subjects
• Reform paragraphs
• Fix URLs
• Remove the forwarding header
• Remove extra spaces
• Remove quoting
• Decode quoted-printable
• Convert tabs to spaces
The options are described in more detail below. To turn the options on and off, change
the property value to true or false.
You can save multiple copies of the script with different sets of options
Thanks to Guy Kawasaki and David Cortright
*)
(* cleanSubject
If true, subjects will have extraneous Fw:, (fwd), etc removed
and leading and trailing spaces will be removed
*)
property cleanSubject : false
(* reformParagraphs
If true, paragraphs will be reformed. This only works when paragraphs
are separated by blank lines.
If removeQuoting is not turned on, it will not look good on quoted lines
*)
property reformParagraphs : false
(* fixURLs
If true:
• Adds <>'s to URL's
• Adds ftp:// to words beginning with ftp. (probable ftp site addresses)
• Adds http:// to words beginning with www. (probable web site addresses)
• Adds mailto: to words that contain @'s (probable email addresses)
*)
property fixURLs : false
(* removeForwardHeader
If true, it removes the text before "---------- Forwarded message ----------"
*)
property removeForwardHeader : false
(* removeExtraSpaces
If true, it removes extra spaces between words
*)
property removeExtraSpaces : true
(* removeQuoting
If true, it removes >'s from the beginning of the line
*)
property removeQuoting : false
(* decodeQuotedPrintable
If true, decodes quoted printable text (such as =3D )
Requires Dan Crevier's DecodeQP OSAX (included)
*)
property decodeQuotedPrintable : true
(* tabsToSpaces
If true, converts tabs to 5 spaces
Note: the spaces may be removed by the removeExtraSpaces, or removeQuoting options
*)
property tabsToSpaces : false
on run {}
if not CheckMessageWindow() then return
tell application "Microsoft Entourage"
-- ••�clean subject
if cleanSubject then
set theSubject to ""
try
set theSubject to the subject of front window
on error
-- Can't get subject of outgoing message
end try
if theSubject = "" then
try
set theSubject to the subject of the displayed message of the front window
end try
end if
if theSubject ≠ "" then -- No subject, forget it!
set theSubject to my SearchReplace(theSubject, " (fwd)", "")
set theSubject to my SearchReplace(theSubject, "Fw: ", "")
set theSubject to my SearchReplace(theSubject, "FW: ", "")
if theSubject does not start with "Re: " then set theSubject to my SearchReplace(theSubject, "Re: ", "")
repeat while theSubject starts with "Re: Re: "
set theSubject to my SearchReplace(theSubject, "Re: Re: ", "Re: ")
end repeat
set theSubject to my SearchReplace(theSubject, "Re[2]: ", "")
set theSubject to my SearchReplace(theSubject, "Fwd[2]:", "")
-- strip leading and trailing spaces
repeat while theSubject begins with " "
set theSubject to text 2 thru -1 of theSubject
end repeat
repeat while theSubject ends with " "
set theSubject to text 1 thru -2 of theSubject
end repeat
set the subject of the front window to theSubject
end if
end if
-- ••�clean text
set theText to the content of the front window
-- ••�clean forward header
if removeForwardHeader then
-- remove signature
set headerEnd to offset of "----------" in theText
if (headerEnd is not 0) and (headerEnd is not greater than ((length of theText) / 2)) then
set theText to text (headerEnd + 10) thru -1 of theText
-- remove old headers
set headerEnd to offset of (return & return) in theText
if headerEnd is not 0 then
set theText to text (headerEnd + 3) thru -1 of theText
end if
end if
end if
-- ••�Decode quoted-printable
if decodeQuotedPrintable then
try
set theText to my decodeqp(theText)
on error
display dialog "Quoted-printable could not be decoded because the DecodeQP " & ¬
"scripting addition is not installed. See script for details." buttons {"Okay"} ¬
with icon stop
end try
end if
-- •• tabs to spaces
if tabsToSpaces then
set theText to my SearchReplace(theText, tab, " ")
end if
-- •• remove quoting / reform paragraphs
if removeQuoting or reformParagraphs then
set lastWasText to false
set theParagraphs to the paragraphs of theText
set theNewText to ""
repeat with theLine in theParagraphs
set lineLength to length of theLine
set startChar to 1
if removeQuoting then
repeat while startChar ≤ lineLength
set theChar to character startChar of theLine
if theChar is not " " and theChar is not ">" then
exit repeat
end if
set startChar to startChar + 1
end repeat
end if
if (startChar ≥ length of theLine) then
set theNewText to theNewText & return
if reformParagraphs and lastWasText then
set lastWasText to false
set theNewText to theNewText & return
end if
else
set theNewText to theNewText & ((text startChar thru -1 of theLine) as string) & " "
if not reformParagraphs then
set theNewText to theNewText & return
end if
set lastWasText to true
end if
end repeat
set theText to theNewText
end if
-- •• remove extra spaces/fix URLs
if removeExtraSpaces or fixURLs then
set theLines to the paragraphs of theText
set oldDelimiter to AppleScript's text item delimiters
set AppleScript's text item delimiters to space
set newText to ""
repeat with theLine in theLines
set newLine to ""
set theWords to the text items of theLine
repeat with theWord in theWords
if (length of theWord is greater than 0) or (not removeExtraSpaces) then
if fixURLs then
set theWord to my ProcessURL(theWord)
end if
set newLine to newLine & theWord & " "
end if
end repeat
-- strip trailing space
if (length of newLine) is greater than 1 then
set newLine to (text 1 thru -2 of newLine)
end if
set newText to newText & newLine & return
end repeat
set theText to newText
end if
set the content of the front window to theText
end tell
end run
-- routine to do a search and replace on text
-- routine enhanced by David Cortright
on SearchReplace(mainString, searchString, replaceString)
set newString to mainString
set lSearchString to length of searchString
set lReplaceString to length of replaceString
set processedOffset to 0
considering case
repeat
set lNewString to length of newString
set unprocessedString to text (processedOffset + 1) thru lNewString of newString
set foundOffset to (offset of searchString in unprocessedString) + processedOffset
if foundOffset = processedOffset then exit repeat
if foundOffset is 1 then
set newString to replaceString & (text (lSearchString + 1) thru lNewString of newString)
set processedOffset to lReplaceString
else if ((foundOffset + lSearchString - 1) = lNewString) then
set newString to (text 1 thru (foundOffset - 1) of newString) & replaceString
exit repeat
else
set newString to ¬
(text 1 thru (foundOffset - 1) of newString) & replaceString & ¬
(text (foundOffset + lSearchString) thru lNewString of newString)
set processedOffset to foundOffset + lReplaceString - 1
end if
if newString does not contain searchString then exit repeat
end repeat
return newString
end considering
end SearchReplace
-- given a string, it tries to "fix" it, if it's a URL
on ProcessURL(inString)
if inString ends with "." then
if length of inString > 1 then
return ProcessURL(text 1 thru -2 of inString) & "."
end if
end if
-- just needs <>
if inString begins with "ftp:" or inString begins with "http:" or inString begins with "mailto:" then
return "<" & inString & ">"
end if
-- check for bad ftp URLs
if inString begins with "ftp." then
return "<ftp://" & inString & ">"
end if
-- check for bad http URLs
if inString begins with "www." then
return "<http://" & inString & ">"
end if
-- check for bad mailto URLs
if inString is not "@" and inString does not start with "<" and inString contains "@" then
return "<mailto:" & inString & ">"
end if
return inString
end ProcessURL
-- make sure a outgoing message window is frontmost
on CheckMessageWindow()
tell application "Microsoft Entourage"
set msgClass to the class of the front window
if msgClass is not draft window and msgClass is not message window then
display dialog "This script is designed to work with a message window frontmost." buttons {"Okay"} with icon stop
return false
end if
end tell
return true
end CheckMessageWindow
on decodeqp(aString)
set HexChars to "0123456789ABCDEF"
set oldDelims to AppleScript's text item delimiters
set AppleScript's text item delimiters to {"="}
set workList to text items of aString
set returnValue to ""
set firstOne to true
repeat with s in workList
if firstOne then
set firstOne to false
set returnValue to s
else
if character 1 of s is return then set returnValue to returnValue & s
try
set decNum to 0
set pair to characters 1 thru 2 of s
if item 1 of pair is not in HexChars or item 2 of pair is not in HexChars then
set decNum to 0
else
set decNum to ((offset of (item 1 of pair) in HexChars) - 1) * 16 + ((offset of (item 2 of pair) in HexChars) - 1)
end if
if decNum ≠ 0 then
set returnValue to returnValue & (ASCII character decNum)
if (length of s) ≥ 3 then
set returnValue to returnValue & text 3 thru -1 of s
else
set returnValue to returnValue & s
end if
end if
on error errMsg number errNum from errFrom partial result errResult to errTo
-- on error code goes here
display dialog errMsg & "Number:" & errNum
return
end try
end if
end repeat
set AppleScript's text item delimiters to oldDelims
return returnValue
end decodeqp
On or near 12/5/02 10:22 AM, Gary Lists at [EMAIL PROTECTED] observed:
> Allen (since I use your script to remove HTML),
>
> is there an edit I can make to eliminate the =20 line endings on some mail
> after removing the HTML formatting via AS?
>
> Or some other setting/change I can make?
>
> Gary
>
>
--
Microsoft MVP for Entourage/OE/Word (MVPs are volunteers)
Allen Watson <[EMAIL PROTECTED]> Entourage FAQ site: <http://www.entourage.mvps.org/>
AppleScripts for Outlook Express and Entourage:
<http:[EMAIL PROTECTED]/Scripts/>
Entourage Help Pages: <http://www.entourage.mvps.org/>
