On Sun, 25 Nov 2007 15:17:54 +0100
"Felipe Monteiro de Carvalho" <[EMAIL PROTECTED]> wrote:
> On Nov 25, 2007 1:42 PM, Mattias Gaertner <[EMAIL PROTECTED]>
> wrote:
> > Thanks.
> > TStringlist is too slow and the conversion should not be part of
> > codetools. The widgetsets have the best access to the system libs,
> > so they have the best encoding converters. I can do that part.
>
> Isn't AnsiToUtf8 enough? This way we don't depend on any external
> libs.
AnsiToUtf8 can be used to convert from the system encoding to UTF-8.
And depending on external libs is no problem if the widgetset already
depends on them.
> A simple loop using ReadLn and WriteLn should be enough.
I guess not, but that's an implementation detail.
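For the simple case (system ANSI encoding in, UTF-8 out) such a loop
would look roughly like the sketch below (file names are placeholders;
it ignores BOMs and any source encoding other than the system one):

program AnsiFileToUtf8;
{$mode objfpc}{$H+}
uses
  SysUtils;
var
  SrcFile, DstFile: TextFile;
  Line: string;
begin
  // Read each line in the system ANSI encoding, write it back as UTF-8.
  AssignFile(SrcFile, 'input.txt');
  AssignFile(DstFile, 'output.txt');
  Reset(SrcFile);
  Rewrite(DstFile);
  try
    while not Eof(SrcFile) do
    begin
      ReadLn(SrcFile, Line);
      WriteLn(DstFile, AnsiToUtf8(Line)); // system encoding -> UTF-8
    end;
  finally
    CloseFile(SrcFile);
    CloseFile(DstFile);
  end;
end.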
> > The difficult part is to find out the encoding of a text file.
> > Your proposal of the //&encoding comment has the advantage of
> > being a comment. Because this is a Lazarus feature, I suggest
> > using IDE-directive-style comments: {%encoding xxx}.
>
> The best option would be to use the standard mechanism that any
> editor should recognize: the BOM.
The BOM only exists for the UTF encodings.
And there are plenty of UTF-8 files without a BOM.
See below.
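Detecting the BOM itself would be easy enough; roughly like this
sketch (untested, checks only the UTF-8 BOM EF BB BF). The hard part
remains the files that carry no BOM at all:

// Sketch: does the file start with the UTF-8 BOM (EF BB BF)?
function HasUtf8Bom(const FileName: string): Boolean;
var
  F: file;                    // untyped file, record size 1 byte
  Buf: array[0..2] of Byte;
  Count: LongInt;
begin
  Result := False;
  AssignFile(F, FileName);
  Reset(F, 1);
  try
    BlockRead(F, Buf, SizeOf(Buf), Count);
    Result := (Count = SizeOf(Buf)) and
      (Buf[0] = $EF) and (Buf[1] = $BB) and (Buf[2] = $BF);
  finally
    CloseFile(F);
  end;
end;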
> But we can't use that until the widestring manager is improved.
>
> Using a comment will cause trouble when opening the file in other
> editors that are prepared to handle the standard.
That's a normal problem for all text editors and for all text files
without encoding info.
If Lazarus starts adding a BOM to new UTF-8 files, then FPC should
not convert the string constants.
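Writing the BOM for new files would be the easy part; a sketch (the
helper name is made up, needs the Classes unit):

// Sketch: save an already UTF-8 encoded string with a leading BOM.
procedure SaveUtf8WithBom(const FileName, Utf8Text: string);
const
  Utf8Bom: array[0..2] of Byte = ($EF, $BB, $BF);
var
  FS: TFileStream;
begin
  FS := TFileStream.Create(FileName, fmCreate);
  try
    FS.WriteBuffer(Utf8Bom, SizeOf(Utf8Bom));
    if Utf8Text <> '' then
      FS.WriteBuffer(Utf8Text[1], Length(Utf8Text));
  finally
    FS.Free;
  end;
end;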
BTW, what about the file functions under Windows?
Mattias