Richard, a couple of additional notes:

On 8/4/08 11:25 AM, "Richard Gaskin" <[EMAIL PROTECTED]> wrote:

> I put that into this function:
> 
> function RegexMethod pHtml
>    put "" into newString
>    put "(?U)<.*>" into regEx
>    return replaceText(pHtml,regEx,newString)
> end RegexMethod
> 
> ...and then ran it on the HTML source for this page:
> 
> <http://mail.runrev.com/pipermail/use-revolution/2008-August/113074.html>
> 
> It catches just about everything except for the mailto near the top:
> 
>   <A 
> HREF="mailto:use-revolution%40lists.runrev.com?Subject=Getting%20the%20text%20
> content%20of%20a%20HTML%20page&In-Reply-To=f99b52860808031334l44f6cd1by6ed2444
> fb32560ac%40mail.gmail.com"
>         TITLE="Getting the text content of a HTML page">
> 
> Presumably this is because that tag is broken onto two lines.
> 

Try this variation for the regEx:

   put "(?Us)<.*>" into regEx

The 's' option lets '.' match end-of-line characters as well, so a tag
that is broken across lines is still matched.
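
For reference, here is a sketch of your function with that option folded in. The handler name RegexMethodMultiline is made up for illustration; the pattern and the replaceText call are the ones already shown above.

```livecode
function RegexMethodMultiline pHtml
   -- (?U) makes the quantifiers ungreedy, so each match stops at the first ">"
   -- (?s) lets "." match line endings, so a tag split across lines still matches
   put "(?Us)<.*>" into regEx
   return replaceText(pHtml, regEx, empty)
end RegexMethodMultiline
```

A quick check from the message box, using a tag split over two lines:

```livecode
put RegexMethodMultiline("<A" & return & "HREF='x'>link</A>")
-- puts: link
```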

> This function takes care of that, and this far benchmarks about an order
> of magnitude faster:
> 
> 
> function HtmlTextMethod pHtml
>    put the properties of the templateField into tSaveProps
>    set the htmlText of the templateField to pHtml
>    get the text of the templateField
>    set the properties of the templateField to tSaveProps
>    return it
> end HtmlTextMethod

One caution with this technique: the documentation notes that Rev's
htmlText supports only a subset of HTML tags.  In today's world of XML,
programmers create their own tags, so some may be left behind in the
text.  Perhaps add a catch line or two:

if it contains "<" then
   answer "There may be an extra tag or two remaining in the text"
   answer "Please inspect the result to be sure"
end if
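
If those catch lines go inside the function, one detail matters: the answer command puts the name of the clicked button into "it", so a later "return it" would no longer return the text. A minimal sketch that keeps the text in its own variable instead (the handler name HtmlTextMethodChecked and the variable tText are made up for illustration):

```livecode
function HtmlTextMethodChecked pHtml
   put the properties of the templateField into tSaveProps
   set the htmlText of the templateField to pHtml
   -- store the text in a variable, because "answer" below overwrites "it"
   put the text of the templateField into tText
   set the properties of the templateField to tSaveProps
   if tText contains "<" then
      answer "There may be an extra tag or two remaining in the text." & return & \
            "Please inspect the result to be sure."
   end if
   return tText
end HtmlTextMethodChecked
```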

Of course, the killer in this exercise is when the text we want has
something like "Tip: solve for x > y, then add the point to your graph".

Fun, games, and work in between.

Jim Ault
Las Vegas

