Hi, Graham,

[EMAIL PROTECTED] wrote:
> 
> Has anyone got a function that strips out all the html from
> a page leaving just the text behind?
> 

Given the following:

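    load-text-only: func [
        "Return the text of an HTML page with the tags removed."
        where [file! url!]
        /local text
    ] [
        text: make string! 10000
        ; load/markup returns a block of tag! and string! values;
        ; keeping only the strings drops the markup
        foreach item load/markup where [
            if string? item [
                append text item
            ]
        ]
        text
    ]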

and a %test.html file containing:

    <html>
    <head>
    <title>Test Page</title>
    </head>
    <body>
    <h1>Test Page</h1>
    <p>Here is a paragraph.</p>
    <p>Here is another one</p>
    <blockquote>Common sense is seldom both.</blockquote>
    </body>
    </html>

you can say:

    >> load-text-only %test.html
    == {

    Test Page


    Test Page
    Here is a paragraph.
    Here is another one
    Common sense is seldom both.



    }

Dealing with the surplus whitespace is "left as an exercise for
the reader"  ;-)
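
For anyone who would rather skip the exercise, here is one possible
sketch rather than a definitive answer. The name squeeze-lines and the
local word stop are made up for illustration; the function assumes its
input is a string such as the one load-text-only returns, trims each
line, and drops the lines that end up empty:

    squeeze-lines: func [
        "Return a new string with each line trimmed and blank lines dropped."
        text [string!]
        /local out line stop
    ] [
        out: make string! length? text
        while [not tail? text] [
            ; stop marks the end of the current line (or of the string)
            stop: any [find text newline  tail text]
            ; copy the line so the caller's string is left untouched
            line: trim copy/part text stop
            if not empty? line [
                append out line
                append out newline
            ]
            ; step past the newline unless we are already at the end
            text: either tail? stop [stop] [next stop]
        ]
        out
    ]

With the %test.html above, something like

    >> print squeeze-lines load-text-only %test.html

should leave just the five lines of text. If line breaks do not matter
at all, trim/lines on a copy of the string collapses everything onto a
single line instead.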

-jn-
-- 
; Joel Neely  [EMAIL PROTECTED]  901-263-4460  38017/HKA/9677
REBOL []  foreach [order string]  sort/skip reduce [ true "!"
false  head reverse "rekcah"  none "REBOL "  prin "Just " "another "
] 2 [prin string] print ""