Graeme Geldenhuys wrote:
On Tue, Feb 3, 2009 at 9:02 AM, Vincent Snijders
<vsnijd...@vodafonevast.nl> wrote:
I am a Lazarus developer, and I don't think I said it like that.

I wasn't pointing fingers at you, Vincent. :-) I summarized what a few
people have said.

LoadFromFile in an LCL control, you need to make sure they are valid UTF-8
strings. And honestly, only you can make sure that they are, because you
know the initial encoding.

The problem is as follows. Even though I am a long-time developer,
I often have no clue what encoding a file is in when I look at the
file using the Nautilus file manager. I often open a file in my
preferred text editor, check whether it displays correctly, then look
in the status bar for the encoding the editor detected (at least my
editor does that nicely).


The LCL does not have this feature. It can only handle UTF-8, period.

So even though you are using something as simple as the TMemo in LCL,
and LCL always wants UTF-8, how do you know what encoding to convert
from to UTF-8?

If you don't know, you cannot process it. Simple.

Suppose I give you various text files, each using one of the
following schemes: UTF-16, UTF-16BE, UTF-16LE, UTF-32, and whatever
else I can find. If you load each file into a TStringList and then call
UTF8Decode on each line, will it display correctly in the TMemo?


For each of these encodings, you would first have to translate it to UTF-8 before you give it to the LCL. Note that it is not wise to load UTF-16* and UTF-32 encoded files into a byte-indexed ansistring.
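
As a rough illustration of "translate it to UTF-8 first", here is a minimal sketch in Python (not Pascal, and not an LCL or RTL API). It assumes the file starts with a byte order mark; the names detect_bom and to_utf8 are made up for this example. Without a BOM the caller still has to supply the encoding, which is exactly the point made above.

```python
import codecs

# Checked in order: the UTF-32 LE BOM (FF FE 00 00) begins with the
# UTF-16 LE BOM (FF FE), so the 4-byte marks must be tested first.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(raw):
    """Return (encoding, bom_length), or (None, 0) when no BOM is present."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name, len(bom)
    return None, 0


def to_utf8(raw, fallback=None):
    """Convert raw file bytes to BOM-less UTF-8 for a UTF-8-only consumer."""
    encoding, skip = detect_bom(raw)
    if encoding is None:
        # No BOM: the bytes alone do not say what they are.
        if fallback is None:
            raise ValueError("no BOM found; the caller must supply the encoding")
        encoding = fallback
    # Decode from the source encoding, then re-encode as UTF-8.
    return raw[skip:].decode(encoding).encode("utf-8")
```

The BOM check is only a heuristic: a file without a BOM, or in a legacy 8-bit code page, falls through to the fallback that the caller provides.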

Now what if the memo content is changed and then saved? How does the
TMemo know which encoding to use? I would prefer the same encoding as
before, not necessarily UTF-8. So if the file was originally UTF-32,
I don't want it to be UTF-8 afterwards.

If you want it to be the same, then you have to convert it back. You know what it was in the first place, because you translated it to UTF-8 before giving it to the LCL.
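
The way back can be sketched the same way, again in Python rather than Pascal and under the assumption that the application remembered the original encoding and BOM when it loaded the file. The helper name from_utf8 is hypothetical.

```python
import codecs


def from_utf8(utf8_bytes, encoding, bom=b""):
    """Re-encode edited UTF-8 text into the file's original encoding."""
    text = utf8_bytes.decode("utf-8")
    # Strict encoding: raises UnicodeEncodeError if the edited text
    # contains a character the original encoding cannot represent.
    return bom + text.encode(encoding)


# Usage: a UTF-32 BE file is converted to UTF-8 for editing, changed,
# and written back as UTF-32 BE with its original BOM.
original = codecs.BOM_UTF32_BE + "na\u00efve".encode("utf-32-be")
as_utf8 = original[len(codecs.BOM_UTF32_BE):].decode("utf-32-be").encode("utf-8")
edited = as_utf8 + b"!"
saved = from_utf8(edited, "utf-32-be", codecs.BOM_UTF32_BE)
```

For the UTF-16 and UTF-32 family the round trip is lossless. For a legacy 8-bit encoding the conversion back can fail if the user typed a character that encoding does not contain, so the application has to decide what to do in that case.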

If TStringList.LoadFromFile(...) took an encoding parameter, it
could store that encoding option internally, so if you call
.SaveToFile(somefile.txt) later, it could use the same encoding as
used in LoadFromFile(), otherwise default to something like UTF-8 if
no encoding was specified anywhere.

Maybe. I leave that suggestion to RTL developers. See also Marco's mail.
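
To make the suggestion concrete, a loader that remembers the encoding it was given and reuses it on save could look roughly like the sketch below. It is written in Python, not Pascal; the class EncodedLines and its methods are hypothetical and do not correspond to any existing RTL or LCL API. BOM handling is left out to keep it short.

```python
class EncodedLines:
    """Hypothetical string-list wrapper that remembers its file encoding."""

    def __init__(self):
        self.lines = []
        self.encoding = None  # set by load_from_file

    def load_from_file(self, path, encoding=None):
        # Remember the caller's choice; None means "not specified".
        self.encoding = encoding
        with open(path, "r", encoding=encoding or "utf-8") as f:
            self.lines = [line.rstrip("\n") for line in f]

    def save_to_file(self, path, encoding=None):
        # Priority: explicit argument, then the encoding used on load,
        # then UTF-8 as the default.
        chosen = encoding or self.encoding or "utf-8"
        with open(path, "w", encoding=chosen, newline="\n") as f:
            for line in self.lines:
                f.write(line + "\n")
```

In memory the lines are ordinary Unicode strings; only load_from_file and save_to_file deal with bytes, which is the same division of labour as converting to UTF-8 before handing text to the LCL.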

Vincent
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
