Re: [twsocket] Unknown encode used on som html page

Olivier Sannier Sun, 11 Apr 2010 23:58:01 -0700

Xavier Mor-Mur wrote:

I am writing an SMTP client to send HTML email from my application.
The HTML is normally a document saved in HTML format. If it contains images, their tags have the SRC encoded in a way I can't figure out. In the example at the end you can see HTML generated using OpenOffice 3.1. There is only one image, and the document name is "Sin título 1.html", which is used to name the inserted images.
I don't know if this happens with MS-Word or other word processors.
<IMG SRC="Sin%20t%C3%ADtulo%201_html_m82e68f1.jpg" ....
The image on disk is "Sin título 1_html_m82e68f1.jpg".
Whitespace is replaced by %20,
but í is replaced by %C3%AD, whereas in UTF-8 it is %ED.

No, you are wrong: %C3%AD is the UTF-8 encoding of that accented i. %ED is its representation in ISO-8859-1.
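
To see the two encodings side by side, here is a small sketch. It is written in Python rather than Delphi purely for illustration, and it uses the standard library's urllib.parse.quote, not anything from ICS; the variable names are made up for this example.

```python
from urllib.parse import quote

ch = "í"  # U+00ED, LATIN SMALL LETTER I WITH ACUTE

# The same character becomes two bytes in UTF-8 and one byte in ISO-8859-1.
utf8_bytes = ch.encode("utf-8")          # two bytes: C3 AD
latin1_bytes = ch.encode("iso-8859-1")   # one byte: ED

# Percent-encoding simply writes each of those bytes as %XX.
utf8_escaped = quote(ch, encoding="utf-8")
latin1_escaped = quote(ch, encoding="iso-8859-1")

print(utf8_bytes.hex(), utf8_escaped)
print(latin1_bytes.hex(), latin1_escaped)
```

So the two-byte form in the IMG tag is what you would expect from a page saved as UTF-8.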

Special characters are encoded using 2 bytes, and I need to know how to decode them to access the image and embed it into the email.
UrlDecode doesn't work, as it returns "Ã-" (where this - is a virtual dash).

Yes, it works: it returns a UTF-8 encoded string. You have to decode it further to get the ISO-8859-1 value you are looking for.

UTF8Decode is there for this.
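
The two steps can be sketched as follows. This is Python, not Delphi, and only an illustration of the idea under that assumption: unquote_to_bytes stands in for the percent-decoding step and bytes.decode("utf-8") for the UTF-8 decoding step. It is not the ICS UrlDecode or the RTL UTF8Decode themselves.

```python
from urllib.parse import unquote_to_bytes

# The SRC value from the quoted IMG tag.
src = "Sin%20t%C3%ADtulo%201_html_m82e68f1.jpg"

# Step 1: undo the percent-encoding. The result is still raw UTF-8 bytes.
raw = unquote_to_bytes(src)

# Reading those bytes as ISO-8859-1 gives the garbage described above:
# C3 becomes "Ã" and AD becomes a soft hyphen (U+00AD).
wrong = raw.decode("iso-8859-1")

# Step 2: decode the bytes as UTF-8 to recover the real file name.
right = raw.decode("utf-8")

print(repr(wrong))
print(repr(right))
```

The soft hyphen in the wrong result is most likely the "virtual dash" mentioned in the question.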
--
To unsubscribe or change your settings for TWSocket mailing list
please goto http://lists.elists.org/cgi-bin/mailman/listinfo/twsocket
Visit our website at http://www.overbyte.be
