Hi!
Belated happy new year to you all!
I'm working on a news script and needed a way to strip HTML tags from
its input, so I conjured up this little function:
detag: func [
    "Removes HTML tags from a string, file or url; leaves special characters intact, except &nbsp;."
    source [string! file! url!] "String, file or url to detag."
    'target [word!] "Word that is set to the result."
    /custom block [block!] {Block of extra search/replace pairs, e.g. ["&Aacute;" "Á"].}
    /local tag string list
][
    ; copy the literal block, otherwise /custom pairs pile up between calls
    list: copy ["<br>" " " "</p>" "^/" "&nbsp;" " "]
    if custom [append list block]
    ; read files and urls; copy strings so the caller's data stays untouched
    string: either string? source [copy source] [read source]
    set target string
    ; list holds search/replace pairs
    foreach [search replacement] list [
        replace/all string search replacement
    ]
    ; strip the first remaining tag until none are left
    while [
        parse string [to "<" copy tag thru ">" to end]
    ][
        remove/part find string tag length? tag
    ]
    string
]
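For example, assuming detag has been loaded into a REBOL 2 console as
defined above, a call might look like this. The sample string and the word
"result" are made up for illustration:

```rebol
; made-up sample input
html: {<b>Hello</b><br>brave &amp; new <i>world</i>}

; target is a literal argument, so the word result is passed unquoted;
; the /custom block is given after the normal arguments
detag/custom html result ["&amp;" "&"]

print result    ; prints: Hello brave & new world
```

The /custom pairs are replaced before the tags are stripped, so a pair like
"&lt;" "<" can turn escaped text into something that looks like a tag and
gets removed as well.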
Feel free to optimise all this. I bloated it a little because of <br>,
</p> and &nbsp;, so I added a /custom refinement. Use it to replace any
other special characters. I wanted to add a refinement that did that
automatically, but I was too lazy to figure out a really efficient way to
handle the special characters (pattern: "&", any letter, acute/circ/etc.,
";").
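One possible way to do the automatic handling is a lookup table plus a
parse rule that matches "&", a run of letters and ";". This is only a rough
sketch for REBOL 2, not part of detag: the names entity-table, letter and
decode-entities are made up here, and the table lists just a few entities.

```rebol
; Hypothetical lookup table: entity name followed by its replacement.
; Only a handful of entries; acute/circ/etc. names would be added the same way.
entity-table: [
    "amp" "&" "lt" "<" "gt" ">" "quot" {"} "nbsp" " "
]

letter: charset [#"a" - #"z" #"A" - #"Z"]

decode-entities: func [
    "Replaces known named entities in a string (modifies and returns it)."
    string [string!]
    /local start finish name value
][
    parse/all string [
        any [
            start: "&" copy name some letter ";" finish: (
                ; case-sensitive lookup, so "Aacute" and "aacute" can differ
                either value: select/case entity-table name [
                    ; change returns the position just past the new text
                    start: change/part start value finish
                ][
                    ; unknown entity: leave it alone and carry on after it
                    start: finish
                ]
            ) :start
            | skip
        ]
    ]
    string
]

print decode-entities copy "a &lt; b &amp;amp; &foo;"
; prints: a < b &amp; &foo;
```

Continuing after the replaced text, instead of rescanning it, keeps
"&amp;amp;" from being decoded twice. It is probably safest to run this after
the tags have been stripped, so a decoded "<" is not mistaken for the start
of a tag.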
Have fun, keep coding!
Regards,
Rachid