On Sun, 25 Nov 2007 15:17:54 +0100 "Felipe Monteiro de Carvalho" <[EMAIL PROTECTED]> wrote:
> On Nov 25, 2007 1:42 PM, Mattias Gaertner <[EMAIL PROTECTED]> wrote:
> > Thanks.
> > TStringlist is too slow and the conversion should not be part of
> > codetools. The widgetsets have the best access to the system libs,
> > so they have the best encoding converters. I can do that part.
>
> Isn't AnsiToUtf8 enough? This way we don't depend on any external
> libs.

AnsiToUtf8 can be used to convert from the system encoding to UTF-8.
And there is no problem with depending on external libs if the
widgetset already depends on them.

> A simple loop using ReadLn and WriteLn should be enough.

I guess not, but that's an implementation detail.

> > The difficult part is to find out the encoding of a text file.
> > Your proposal of the //&encoding comment has the advantage of
> > being a comment. Because this is a Lazarus feature I suggest to use
> > IDE directive style comments: {%encoding xxx}.
>
> The best option would be using the standard way which any editor
> should recognize: the BOM.

A BOM exists only for the UTF encodings, and there are plenty of UTF-8
files without a BOM. See below.

> But we can't use that until the widestring manager is improved.
>
> Using a comment will cause trouble opening the file in other editors
> which are prepared to handle the standard.

That's a normal problem for all text editors and all text files
without encoding info.

If Lazarus starts to add the BOM for new UTF-8 files, then FPC should
not convert the string constants.

BTW, what about file functions under Windows?

Mattias

_________________________________________________________________
To unsubscribe: mail [EMAIL PROTECTED] with "unsubscribe" as the Subject
Archives at http://www.lazarus.freepascal.org/mailarchives
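[Editor's example] The "simple loop using ReadLn and WriteLn" mentioned in the thread can be sketched roughly as follows. This is only an illustration, not code from the discussion: the file names are hypothetical, and it assumes line-by-line processing is acceptable. AnsiToUtf8 is the RTL function referred to above; it converts from the system ANSI encoding to UTF-8.

```pascal
program ConvertToUtf8;

{$mode objfpc}{$H+}

uses
  SysUtils;

var
  Src, Dst: TextFile;
  Line: string;
begin
  // Hypothetical file names, for illustration only.
  AssignFile(Src, 'input.txt');   // source in the system ANSI encoding
  AssignFile(Dst, 'output.txt');  // destination in UTF-8
  Reset(Src);
  Rewrite(Dst);
  while not Eof(Src) do
  begin
    ReadLn(Src, Line);
    // AnsiToUtf8: system encoding -> UTF-8 (RTL function).
    WriteLn(Dst, AnsiToUtf8(Line));
  end;
  CloseFile(Src);
  CloseFile(Dst);
end.
```

As Mattias notes, this is probably too simplistic for the IDE's needs (it fixes the line-ending style and cannot detect the source encoding), but it shows why the approach is attractive: no external libraries are required.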
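[Editor's example] On the BOM question: a sketch of what BOM sniffing looks like, to illustrate Mattias's point that it only helps for UTF encodings and that a missing BOM (the common case for UTF-8 files) tells you nothing. The function name is hypothetical. Note the UTF-32LE check must come before UTF-16LE, because their BOMs share the first two bytes.

```pascal
{ Returns the encoding name indicated by a BOM at the start of Buf,
  or '' when no BOM is present (e.g. most UTF-8 files). }
function DetectBOM(const Buf: string): string;
begin
  Result := '';
  if (Length(Buf) >= 3) and (Buf[1] = #$EF) and (Buf[2] = #$BB)
     and (Buf[3] = #$BF) then
    Result := 'utf-8'
  else if (Length(Buf) >= 4) and (Buf[1] = #$FF) and (Buf[2] = #$FE)
     and (Buf[3] = #0) and (Buf[4] = #0) then
    Result := 'utf-32le'   // must be tested before utf-16le
  else if (Length(Buf) >= 4) and (Buf[1] = #0) and (Buf[2] = #0)
     and (Buf[3] = #$FE) and (Buf[4] = #$FF) then
    Result := 'utf-32be'
  else if (Length(Buf) >= 2) and (Buf[1] = #$FF) and (Buf[2] = #$FE) then
    Result := 'utf-16le'
  else if (Length(Buf) >= 2) and (Buf[1] = #$FE) and (Buf[2] = #$FF) then
    Result := 'utf-16be';
end;
```

An empty result leaves the hard case unresolved, which is exactly why the thread discusses a fallback such as an `{%encoding xxx}` IDE directive comment.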