I've been keeping up with this topic for a while now and I haven't read any suggestions similar to how I envision encoding support.
I think it's best to keep ANSI strings intact. I also think it's best to create a string encoding class factory that people can draw upon for conversions. I don't think seamless conversion between types is required at present. Looking at all the different technologies, I think it would be wise for FPC to simply support encoding ANSI strings to match the popular ones. A "featured" encoding system would include the character sets used by Internet-related apps:

  'UTF-8', 'UTF-16', 'UTF-16BE', 'UTF-16LE',
  'ISO-8859-1', 'ISO-8859-2', 'ISO-8859-3', 'ISO-8859-4', 'ISO-8859-5', 'ISO-8859-6', 'ISO-8859-7', 'ISO-8859-8', 'ISO-8859-8_1', 'ISO-8859-9', 'ISO-8859-10', 'ISO-8859-11', 'ISO-8859-12', 'ISO-8859-13', 'ISO-8859-14', 'ISO-8859-15', 'ISO-8859-16',
  'ISO-2022-KR', 'ISO-2022-JP', 'ISO-2022-CN', 'csISO_IR_111',
  'Windows-874', 'Windows-1250', 'Windows-1251', 'Windows-1252', 'Windows-1253', 'Windows-1254', 'Windows-1255', 'Windows-1256', 'Windows-1257', 'Windows-1258',
  'EUC-KR', 'EUC-JP', 'EUC-TW', 'TIS-620', 'UHC', 'JOHAB', 'TCVN', 'VPS', 'CP-866', 'ARMSCII-8', 'USASCII', 'VISCII', 'HZ', 'GBK', 'Big5', 'Big5_HKSCS', 'GB2312', 'GB18030', 'KOI8-R', 'KOI8-U',
  'IBM-850', 'IBM-852', 'IBM-855', 'IBM-857', 'IBM-864', 'IBM-862',
  'MacCE', 'MacRoman', 'MacRomanian', 'MacTurkish', 'MacIcelandic', 'Shift-JIS', 'MacCyrillic', 'MacCroatian', 'MacDevanagari', 'MacGurmukhi', 'MacGujarati'

Going between some or most of these, via streams or memory, would be the ideal.

I'm thinking: borrow the design from the image class factory system. Using the fpImage class system I can go from PNG to JPG using nothing but the file extensions to grab the classes and create converter instances.

In Internet app development we have blocks of text whose encoding we already know. Take, for example, an MP3 music file with an ID3 tag holding a generic string. It would be declared as ANSI, UTF8 or UTF16.
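To make the factory idea a bit more concrete, here is a rough sketch of how such a registry might look. Everything in it is hypothetical: TStringCodec, TLatin1Codec, TUtf8Codec, TStringCodecFactory, RegisterCodec and GetHandler are names invented for illustration, nothing like them exists in the RTL today, and looking codecs up by name string is just one possible choice. Only two encodings are registered, and each codec only decodes to UTF-8.

```pascal
program CodecFactorySketch;

{$mode objfpc}{$H+}

uses
  SysUtils;

type
  // Hypothetical base class: one descendant per supported encoding.
  TStringCodec = class
  public
    // Convert bytes in this codec's encoding to UTF-8 bytes.
    function Decode(const Raw: AnsiString): AnsiString; virtual; abstract;
  end;
  TStringCodecClass = class of TStringCodec;

  // ISO-8859-1: every byte is the code point of the same value.
  TLatin1Codec = class(TStringCodec)
  public
    function Decode(const Raw: AnsiString): AnsiString; override;
  end;

  // UTF-8 input is passed through unchanged (no validation in this sketch).
  TUtf8Codec = class(TStringCodec)
  public
    function Decode(const Raw: AnsiString): AnsiString; override;
  end;

  TCodecEntry = record
    Name: string;
    CodecClass: TStringCodecClass;
  end;

  // Hypothetical registry mapping encoding names to codec classes.
  TStringCodecFactory = class
  private
    FEntries: array of TCodecEntry;
  public
    procedure RegisterCodec(const AName: string; AClass: TStringCodecClass);
    // Returns a new codec instance, or nil if the encoding is not supported.
    function GetHandler(const AName: string): TStringCodec;
  end;

function TLatin1Codec.Decode(const Raw: AnsiString): AnsiString;
var
  I: Integer;
  B: Byte;
begin
  Result := '';
  for I := 1 to Length(Raw) do
  begin
    B := Ord(Raw[I]);
    if B < $80 then
      Result := Result + Chr(B)  // ASCII range is identical in UTF-8
    else
      // Code points $80..$FF need a two-byte UTF-8 sequence.
      Result := Result + Chr($C0 or (B shr 6)) + Chr($80 or (B and $3F));
  end;
end;

function TUtf8Codec.Decode(const Raw: AnsiString): AnsiString;
begin
  Result := Raw;
end;

procedure TStringCodecFactory.RegisterCodec(const AName: string;
  AClass: TStringCodecClass);
begin
  SetLength(FEntries, Length(FEntries) + 1);
  FEntries[High(FEntries)].Name := AName;
  FEntries[High(FEntries)].CodecClass := AClass;
end;

function TStringCodecFactory.GetHandler(const AName: string): TStringCodec;
var
  I: Integer;
begin
  Result := nil;
  for I := 0 to High(FEntries) do
    if SameText(FEntries[I].Name, AName) then
      Exit(FEntries[I].CodecClass.Create);
end;

var
  Factory: TStringCodecFactory;
  Codec: TStringCodec;
  Utf8: AnsiString;
begin
  Factory := TStringCodecFactory.Create;
  try
    Factory.RegisterCodec('ISO-8859-1', TLatin1Codec);
    Factory.RegisterCodec('UTF-8', TUtf8Codec);

    Codec := Factory.GetHandler('ISO-8859-1');
    if Codec = nil then
      WriteLn('encoding not supported')
    else
      try
        // The Latin-1 byte $E9 (e with acute) becomes the bytes $C3 $A9.
        Utf8 := Codec.Decode('caf'#$E9);
        WriteLn('UTF-8 length: ', Length(Utf8));
      finally
        Codec.Free;
      end;
  finally
    Factory.Free;
  end;
end.
```

Adding another encoding would then mean writing one more TStringCodec descendant and registering it, without touching existing codecs.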
  Codec := StringCodecFactory.getHandler(ANSI)
  Codec := StringCodecFactory.getHandler(UTF8)
  Codec := StringCodecFactory.getHandler(UTF16)
  Codec := StringCodecFactory.getHandler(UTF32)

  Codec.Pos(..)
  Codec.PosEx()
  Codec.Replace()
  Codec.Copy()
  Codec.Delete()
  Codec.Read(ContentType, Stream/string) overload;
  Codec.Write(ContentType, Stream/string) overload;
  Codec.AsString
  Codec.Encode() // e.g. ANSI or UTF8
  Codec.Decode() // e.g. ISO-8859-1 will remove all the =20 and = for word-wrapping etc.

Ideally, at design time, I could "case" all 3 types and just reference the desired class for Pos, Replace, PosEx. The benefits of isolating each encoding type would be ease of debugging and ease of growth, teams could target specific methods, and if the factory returns no encoding method, the encoding is simply not supported.

I'm at the point where I have extremely disparate codecs for various forms of communication. A system like this would be my ideal.

_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel