Eric:
Thanks so much, it works perfectly. I'm very happy to have seen this, as I
was just about to start having a Perl program parse all of the stuff before
it got to the Rebol program. I'm much happier having it all in one.
One note though,
Your syntax:
>You can use it like this:
>
> page: utf-iso read <some url>
doesn't seem to work for me; it seems to be looking for a string or something like that.
I just did:
page: read to-url <urlhere>
page2: utf-iso page
Works great.
Thanks again. (Note: I will likely be unsubscribing from this list, so any
follow-ups should be CC'd to me.)
MirclMax
[EMAIL PROTECTED]
Quoth Eric Long at 05:30 PM 1/24/2001 +0900:
>Hello MirclMax,
>
> >A page I used to grab and parse through Rebol using a
> >page: read to-url <urlhere>
> >command .. seems to have switched from sending its content as ISO-8859-1
> >(Latin1) and seems to be doing it as UTF-8 now.
>
> >I really need the content now stored in "page" to be ISO-8859-1. So, can
> >anyone tell me a way to force the "read" to pull it down as ISO-8859-1?
> >Barring that, does anyone have any functions to convert "page" to it? (I
> >would actually need another function to go in the reverse direction I think)
>
>This is something I've wanted to have a whack at for quite a while, so
>I threw something together. It seems to work OK, but I haven't tested it
>on illegal or broken UTF-8 to see what it does in such cases. Anything that
>can't be expressed as ISO-8859-1 is converted to "?", but you can easily
>modify it to substitute some other string, or ignore such characters.
>
>You can use it like this:
>
> page: utf-iso read <some url>
>
>Cheers,
>Eric
>
>utf-iso: func [
> {convert a string from UTF-8 encoding to ISO-8859-1}
> s [string!]
> /local res normal skipn skipped stretch one iso
>] compose [
> normal: (make bitset! [#"^(0)" - #"^(7F)"])
> iso: (make bitset! [#"^(C2)" - #"^(C3)"])
> skipn: (make bitset! [#"^(80)" - #"^(FF)"])
> skipped: (make bitset! [#"^(80)" - #"^(BF)"])
> res: copy ""
> parse/all s [
> any [
> copy stretch some normal (append res stretch) |
> copy one iso copy stretch skipped
> (append res to-char
> (first one) - #"^(C0)" * #"^(40)" +
> ((first stretch)- #"^(80)")) |
> skipn any skipped (append res "?") |
> some skipped (append res "!")
> ]
> ]
> res
>]
>
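For anyone following the arithmetic in the two-byte branch of utf-iso above, here is a small worked check. It is only a sketch: the names b1, b2 and code are made up for illustration, and it uses plain integers rather than the char! math in the function, on the assumption that the two give the same result for lead bytes C2 and C3.

```rebol
; UTF-8 encodes U+00E9 (e with acute accent) as the two bytes C3 A9.
b1: to-integer #"^(C3)"   ; lead byte, 195
b2: to-integer #"^(A9)"   ; continuation byte, 169

; Same formula as in utf-iso, with C0 = 192, 40 = 64 and 80 = 128.
; REBOL evaluates operators left to right, so the subtraction must be
; parenthesized before the multiplication.
code: (b1 - 192) * 64 + (b2 - 128)

print code   ; 233, which is E9 hex, the ISO-8859-1 code for the same letter
```

Lead byte C2 contributes 2 * 64 = 128 and C3 contributes 3 * 64 = 192, so the two together cover exactly 80 through FF hex, the upper half of ISO-8859-1.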
>--
>To unsubscribe from this list, please send an email to
>[EMAIL PROTECTED] with "unsubscribe" in the
>subject, without the quotes.