From: Matthias Buercher <[EMAIL PROTECTED]>
Date: Sun, 12 Feb 2006 10:25:54 +0100
two utf16 questions:
(1) with
defineencoding(encodings.utf16)
i can define the string as utf16.
but how can i define it as bigendian or littleendian?
(2) given a binarystream and i know that i have to read a utf16
string with a given character length, what would be the proper method
to read this string? the bytelength can be bigger than
2*characterlength. i thought to read first 2*characterlength bytes and
then test if the character length is reached, else read further chunks
until the string has its length.
You can't set a string to have an endian property. However, as Boris
correctly mentioned, binarystreams and memoryblocks can have endian
properties.
The simplest way to read a binary stream is usually to read it all at
once. I don't know whether RB also makes that the fastest way.
dim s as string
s = mybinarystream.read( mybinarystream.length )
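If reading the whole stream is not an option, the second question (a
UTF-16 string of known character length whose byte length can exceed
2*characterlength) can be handled by reading one 16-bit unit at a time
and pulling in one extra unit whenever a high surrogate turns up. The
following is only a language-neutral sketch in Python, not RB code. It
assumes that "character length" means Unicode code points, and the
function name read_utf16_chars is made up for illustration.

```python
import io
import struct

def read_utf16_chars(stream, char_count, little_endian=True):
    """Read char_count code points of UTF-16 text from a binary stream.

    Assumption: the stored length counts code points, so a character
    outside the BMP occupies two 16-bit units (a surrogate pair).
    """
    fmt = "<H" if little_endian else ">H"
    raw = bytearray()
    for _ in range(char_count):
        unit = stream.read(2)
        if len(unit) < 2:
            raise EOFError("stream ended inside the string")
        raw += unit
        (value,) = struct.unpack(fmt, unit)
        # A high surrogate (0xD800-0xDBFF) must be followed by a low one,
        # so read one more unit without counting another character.
        if 0xD800 <= value <= 0xDBFF:
            low = stream.read(2)
            if len(low) < 2:
                raise EOFError("stream ended inside a surrogate pair")
            raw += low
    return raw.decode("utf-16-le" if little_endian else "utf-16-be")

# Usage: three characters, the middle one needing a surrogate pair.
data = "a\U0001D11Eb".encode("utf-16-le") + b"XX"
stream = io.BytesIO(data)
text = read_utf16_chars(stream, 3)
```

Reading unit by unit avoids the "read 2*characterlength, then top up"
bookkeeping, at the cost of more small reads. On a buffered stream that
is usually acceptable.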
If you want to then swap the endianness, use a memoryblock, or a
plugin such as my own.
With my plugin you'd do this:
ed = s.ElfData
ed.UTF=16
ed = ed.ConvertTo(16, ed.BigEndian=false)
s = ed.ToString
That's it, you've swapped the endianness, and you did it through an
optimised method. .ConvertTo won't do any more work than necessary,
so here it only swaps bytes and does nothing else.
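For comparison, a manual swap amounts to exchanging the two bytes of
every 16-bit unit. Surrogate pairs need no special handling, because
each unit is swapped on its own. This is a language-neutral sketch in
Python, not RB code and not the plugin's implementation. The function
name swap_utf16_byte_order is hypothetical.

```python
def swap_utf16_byte_order(data: bytes) -> bytes:
    """Return a copy of UTF-16 data with the bytes of each unit exchanged."""
    if len(data) % 2:
        raise ValueError("UTF-16 data must have an even number of bytes")
    swapped = bytearray(len(data))
    swapped[0::2] = data[1::2]  # old odd-positioned bytes move to even slots
    swapped[1::2] = data[0::2]  # old even-positioned bytes move to odd slots
    return bytes(swapped)

# Usage: big-endian in, little-endian out.
little = swap_utf16_byte_order("hi".encode("utf-16-be"))
```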
If you wanted to convert it into UTF-8, you could just do this:
ed = ed.ConvertToUTF8
s = ed.ToString
That's also a lot faster than swapping it manually. My encoding
converter is at least 2x faster than RB's encoding converter,
although RB might just be going through Apple's UTF converter so it's
not necessarily a comparison of my coding skills against theirs.
Also, if you know that the first 4 characters of your string are
always ASCII (as in an XML file, an HTML file or even a human-editable
config file), then you can use my .EncodingXMLGuess method.
ed = s.ElfData
ed.EncodingXMLGuess
ed = ed.ConvertToUTF8
s = ed.ToString
Those four lines of code will convert a string containing any UTF
(even UTF-32) into UTF-8, and do it perfectly reliably, as long as
the first 4 characters of the string are ASCII! (EncodingXMLGuess
also checks for a BOM, if one is present.)
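To show why four ASCII characters are enough, here is a sketch of the
general technique (the same idea as the encoding autodetection described
in the XML specification): check for a BOM first, then look at which of
the first four bytes are zero. This is Python for illustration only. It
is not how EncodingXMLGuess is actually implemented, which I am not
reproducing here, and the function name guess_utf is made up.

```python
def guess_utf(data: bytes) -> str:
    """Guess the UTF of data that starts with a BOM or four ASCII characters."""
    # Longer BOMs first: the UTF-32LE BOM begins with the UTF-16LE BOM.
    boms = [
        (b"\xff\xfe\x00\x00", "utf-32-le"),
        (b"\x00\x00\xfe\xff", "utf-32-be"),
        (b"\xef\xbb\xbf", "utf-8"),
        (b"\xff\xfe", "utf-16-le"),
        (b"\xfe\xff", "utf-16-be"),
    ]
    for bom, name in boms:
        if data.startswith(bom):
            return name
    # Without a BOM, ASCII characters leave a distinctive pattern of zero
    # bytes: True marks a zero byte among the first four.
    zeros = tuple(b == 0 for b in data[:4])
    patterns = {
        (True, True, True, False): "utf-32-be",
        (False, True, True, True): "utf-32-le",
        (True, False, True, False): "utf-16-be",
        (False, True, False, True): "utf-16-le",
        (False, False, False, False): "utf-8",
    }
    return patterns.get(zeros, "unknown")

# Usage: an XML declaration stored as little-endian UTF-16 without a BOM.
encoding = guess_utf("<?xml version='1.0'?>".encode("utf-16-le"))
```

One known ambiguity in this sketch: the bytes FF FE 00 00 are read as a
UTF-32LE BOM, although they could also be a UTF-16LE BOM followed by a
NUL character.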
--
http://elfdata.com/plugin/