Hi, Bo:

I have a question about this e-mail. As you can see below, the line breaks are
not right. The lines that have a lone double quote (") on them probably each
had a NEWLINE of some sort on the previous line. This causes an error when the
script is run, so what is really supposed to be between the two quote marks?

I figured out how to get it to run OK but, for my purposes, the retained text
needs much better formatting, close to how the page's author wanted to format
it. An optional argument for the maximum line length would also help. If
someone knows how and is willing to do this, I would dearly like to see an
improved version. Anyone who has ever used a good HTML stripper, such as the
excellent HTTX (Amiga), knows how useful one can be when called by other
programs.
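
For anyone who hits the same error, here is a minimal sketch of one possible
repair. It assumes each broken quote pair originally held a single newline,
which is only a guess; ^/ is REBOL's escape for a newline inside a string. The
sample string and the two-entry table are made up for illustration and are not
Bo's full table.

```rebol
REBOL [Title: "Newline escape sketch"]

; Assumption: every quote pair that was split across two lines in the
; mailed script originally contained exactly one newline character.
; "^/" is REBOL's string escape for newline.
sample: copy "<P>first line<BR>second line"   ; hypothetical input

; Same search/replace pairing style as the mailed script, two entries only.
foreach [srch rplc] [
    "<P>"   "^/"
    "<BR>"  "^/"
][replace/all sample srch rplc]

print sample
```

If the guess is right, the same substitution would apply to each split pair in
the quoted table.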

Thanks for the seed, Bo!

-- 

                ---===///||| Donald Dalley |||\\\===---
                     The World of AmiBroker Support
                  http://webhome.idirect.com/~ddalley
                          UIN/ICQ#: 65203020


On 10-Nov-00, [EMAIL PROTECTED] wrote:

> Bit by our own list...here it is in plain text!

> -Bo

> --Striptags--

> REBOL [
>     Title:  "HTML Tag Stripper"
>     Date:   20-Jul-1999
>     Author: "Bohdan Lechnowsky"
>     Email:  [EMAIL PROTECTED]
>     Purpose: {
>         To strip off HTML tags leaving only text behind
>     }
> ]

> striptags: func [page /local text end] [

>     multi-replace: func [
>         {Replaces multiple items in a file}
>         pg  [series!] {The series to replace items in}
>         blk [block!] {A block of search and replace elements}
>     ][foreach [srch rplc] blk [replace/all pg srch rplc]]

>     ;table of tags and more suitable ASCII characters
>     page: multi-replace trim/lines page [
>         "<TITLE>"    "TITLE: "
>         "</TITLE>"   "
> "
>         "  "         " "
>         "<TD>"       "|"
>         "</TD>"      "|"
>         "||" "|"
>         "<TR>"       " "
>         "</TR>"      "
> "
>         "<TABLE"    "
> <"
>         "</TABLE>"   "
> "
>         "<P>"        "
> "
>         "<LI>"       "
> • "
>         "<BR>"       "
> "
>         "&nbsp;"     " "
>         "&gt;"       ">"
>         "&lt;"       "<"
>         "&copy;"     "(c)"
>         "&amp;"      "&"
>         "&quot;"     {"}
>         "</H1>"      "
> "
>         "</H2>"      "
> "
>         "</H3>"      "
> "
>         "</H4>"      "
> "
>         "</H5>"      "
> "
>         "</H6>"      "
> "
>         "<HR"        "
> ----------
> <"
>     ]
>     text: copy ""

>     append page "<"
>     append text copy/part page find page "<"
>     while [page: find/tail page ">"] [
>         if (first page) <> #"<" [
>             if found? end: find page "<" [
>                 append text copy/part page end
>             ]
>         ]
>     ]
>     return append text "
> "
> ]

-- 
To unsubscribe from this list, please send an email to
[EMAIL PROTECTED] with "unsubscribe" in the 
subject, without the quotes.
