Re: [fpc-devel] TStringField, String and UnicodeString and UTF8String
So IMHO there must be:
1. allocated space in the record buffer of size 4*TFieldDef.Size+1
2. a redefined meaning of the Size property (as number of bytes, not characters), with fielddefs created with Size*4

Yes, those are the possible solutions. The good thing about the second option is that a user can do that on his own if he wants to use UTF-8: just create persistent fields with a field size of 4 * the number of characters. I'm not sure if we have to change this. It's a problem the programmer has to deal with, I think...

I see these possible problems/disadvantages here:
1. In many cases (dynamically built or ad-hoc queries) creating persistent fields is not very effective (or is complicated).
2. Allocation of space in the record buffer is based on TFieldDef objects (see TCustomBufDataset.GetFieldSize), and TFieldDef objects are created by the TSQLConnectors in the AddFieldDefs method, so setting Size in a persistent field does not solve the whole problem, because each SQLConnector must set Size as well.
3. In TStringField, Size is also used to determine the default DisplayWidth (for TDBGrid) and in Delphi also for setting MaxLength in TDBEdit (so here we can see that Size is used as the maximum number of characters rather than bytes).
4. Incompatibility with Delphi (if we reclassify Size as number of bytes, not characters).

So I would prefer the 1st way (increase the buffer size; maybe, if we support only the BMP, 3*Size+1 will be sufficient). So Size remains the character length.

Hm, according to http://docwiki.embarcadero.com/VCL/XE/en/DB.TStringField.Size Size is the number of characters, but according to http://docwiki.embarcadero.com/VCL/en/DB.TFieldDef.Size Size is the number of bytes in the underlying database.

Yes, that's indeed the problem. But there's also the .DataSize property, so we could use that.

Yes, but DataSize is defined only in TField, not in the TFieldDef class. If we add a DataSize property to the TFieldDef class as well and overload the TFieldDef.Create method with an additional DataSize parameter, then the SQLConnector could specify both pieces of information: character size (for display purposes) and byte size (for buffer purposes).

Maybe... if the pressure on the bugtracker gets too high, I'll bow and change this. (I think 25% of all existing db bugs are related to this and people who do not understand anything about encodings.)

But TField is created from TFieldDef and TField.Size=TFieldDef.Size ... so isn't it curious?

Note that when you want to use UTF-16 (or 32) you have to use TWideStringFields.

So TWideStringField is not an encoding-agnostic field (is it designed to always be UTF-16 encoded)?

No. It's designed to contain an array of two-byte records. In fact you can use it to store UCS-2 data, but not UTF-16. Same story as with ansi/UTF-8.

Yes, I understand now.

Laco.
___
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
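To make the buffer arithmetic discussed above concrete, here is a minimal sketch. The helper name Utf8BufferBytes is hypothetical and not part of fcl-db; it only restates the two formulas from the thread (4*Size+1 for full Unicode, 3*Size+1 if only the BMP is supported), with the extra byte assumed to be the terminating #0.

```pascal
program Utf8BufferSize;
{$mode objfpc}{$H+}

// Hypothetical helper, not part of fcl-db: number of bytes needed to
// hold CharCount characters encoded as UTF-8 plus a terminating #0.
function Utf8BufferBytes(CharCount: Integer; BmpOnly: Boolean): Integer;
begin
  if BmpOnly then
    Result := 3 * CharCount + 1   // a BMP code point needs at most 3 bytes
  else
    Result := 4 * CharCount + 1;  // any code point needs at most 4 bytes
end;

begin
  // A field declared with Size = 10 characters
  WriteLn('full Unicode: ', Utf8BufferBytes(10, False));  // 41
  WriteLn('BMP only:     ', Utf8BufferBytes(10, True));   // 31
end.
```

With this split, Size could stay the character count (for DisplayWidth and MaxLength) while only the buffer allocation uses the byte count.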
Re: [fpc-devel] TStringField, String and UnicodeString and UTF8String
So this is the answer I have been looking for: in Lazarus TStringField MUST hold UTF-8 encoded strings.

Not entirely true. You could also choose to bind the fields to some Lazarus components manually, not using the db-components.

IMHO most GUI database applications use controls like TDBGrid or TDBEdit, so they should display correct values by default without extra coding (or at least provide some standardized support ...)

(Tedit.Text := convertFunc(StringField.Text)) Or you can add a hook so that the .text property always does a conversion to UTF-8. The first option can be used if you use a mediator or view. The second option I wouldn't use.

Rofl. You mean that Microsoft SQL Server can't handle unicode completely?

Not completely, only UCS-2 (no UTF-8).

SQL Server provides non-UNICODE datatypes - char, varchar, text ie: TStringField

Yes, but the ODBC driver returns data in the ANSI codepage (there is no possibility to force it to return UTF-8). This I can fix with a patch in TODBCConnection LoadField like this (so I convert to UTF-8 in the connector method, when the driver is unable to return UTF-8):

  begin
    Res:=SQLGetData(ODBCCursor.FSTMTHandle, FieldDef.Index+1, SQL_C_CHAR, buffer, FieldDef.Size, @StrLenOrInd);
  + if CharSet='ANSI' then //hack for Microsoft SQL Server
  +   StrPLCopy(buffer, UTF8Encode(PChar(buffer)), FieldDef.Size);
  end;

and UNICODE (UCS-2) datatypes - nchar, nvarchar, ntext ie: TWideStringField.

Yes, in this case the ODBC driver returns data in UCS-2, and this data is written into the WideString buffer, which seems correct, but in DBGrid ? is displayed instead of characters with diacritical marks (IMHO because the widestringmanager on Windows converts WideString to an ANSI string, not a UTF-8 string).

This can be fixed by using the OnGetText method of the field:

  aText:=UTF8Encode(Sender.AsString);

Which is not user friendly, because it requires hacking in user code in every TWideStringField in every TSQLQuery. It can also be fixed in fields.inc:

  function TWideStringField.GetAsString: string;
  begin
  +{$IFDEF WINDOWS}
  +  Result := UTF8Encode(GetAsWideString);
  +{$ELSE}
     Result := GetAsWideString;
  +{$ENDIF}
  end;

So what is the expected encoding of data written into TWideStringField ... or is there a way to get correct results in DBGrid without the above mentioned workarounds?

SQL Server ODBC driver supports AutoTranslate, see: http://msdn.microsoft.com/en-us/library/ms130822.aspx

  SQL Server char, varchar, or text data sent to a client SQL_C_CHAR variable is converted from character to Unicode using the server ACP, then converted from Unicode to character using the client ACP.

This is what you use when you set the encoding when you connect to the client. The solution to all your problems. As explained three times, in this message alone. In fact it's simple: incoming data=outgoing data. If you need UTF-8 encoding for the outgoing data (direct access to Lazarus controls) you have to select UTF-8 at the input.

Yes, but as I wrote, such a possibility does not exist with Microsoft SQL Server (and also, I think, Access). (It seems that Microsoft does not like UTF-8 and prefers UTF-16 (UCS-2).)

And, luckily, you can instruct the Database-server which encoding to use when it's communicating with the outer world. So your problem is solved.

When it is possible, then yes.

Now, if you also choose UTF-8 as the Database-server field encoding (the encoding the data is stored in) there's no conversion necessary at all.

Yes, if the DB supports UTF-8.

-Laco.
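Both workarounds above come down to one explicit conversion from the two-byte WideString data to UTF-8 before the text reaches a Lazarus control. A minimal sketch of just that step follows; no database is involved, and SampleWideValue is a hypothetical stand-in for what a TWideStringField would return.

```pascal
program WideFieldToUtf8;
{$mode objfpc}{$H+}

// Hypothetical stand-in for the value a TWideStringField would hand out;
// this sketch does not touch any dataset.
function SampleWideValue: WideString;
begin
  // 'c' with caron (U+010D), built from its code point so the example
  // does not depend on the encoding of the source file
  Result := WideChar($010D);
end;

var
  W: WideString;
  U: UTF8String;
begin
  W := SampleWideValue;
  U := UTF8Encode(W);   // explicit two-byte -> UTF-8 conversion
  WriteLn('two-byte units: ', Length(W));  // 1
  WriteLn('UTF-8 bytes:    ', Length(U));  // 2
end.
```

The point of doing the conversion explicitly is that it does not depend on what the platform's widestring manager picks as the target codepage, which is the problem described for Windows above.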
Re: [fpc-devel] String and UnicodeString and UTF8String
Sven Barth wrote:

Widestring will also grind the application to a halt due to being COM based on Windows.

How that?

WideString on Windows has no reference counting, thus every time a WideString is assigned it needs to be copied.

I'm not so sure of that. AFAIR the field exists, but it's unused or reserved for shared memory management. Of course the requirement that a BSTR has to reside in shared memory discourages the use of exactly that type for string handling inside an application. I only wanted to prevent the introduction of another UTF16String type, in addition to WideString, BSTR (WinAPI) and UnicodeString (Delphi). Conversion-wise WideString/BSTR and (other) UTF-16 strings are equivalent.

Nearly all Windows API functions only allow single-byte encodings or UTF-16. The only functions that I'm aware of that can use UTF-8 encoding are the console input/output API (if the codepage is set to UTF-8) [and also the file I/O APIs, but they don't assume any encoding]. And the conversion functions of course (MBCStoWStr...).

DoDi
Re: [fpc-devel] String and UnicodeString and UTF8String
On 14-1-2011 13:21, Hans-Peter Diettrich wrote:

Sven Barth wrote:

Widestring will also grind the application to a halt due to being COM based on Windows.

How that?

WideString on Windows has no reference counting, thus every time a WideString is assigned it needs to be copied.

I'm not so sure of that. AFAIR the field exists, but it's unused or reserved for shared memory management.

Yes, if you use the set of memory allocators I mentioned the field *will* be used. COM marshalling. No COM, no count. Simple as that. It is unused because the memory manager doesn't use it. COM is not implemented unless you use a COM-based memory manager. No COM, no reference count.
Bug dwarf3 / fpc / gdb Re: dwarf3 and others too [Re: [fpc-devel] Dwarf3 and the encoding of classes]
On 11/01/2011 09:28, Joost van der Sluis wrote:

Same for variants and the enumerations. Your input is valuable, though. Maybe better to open bug-reports for the variants and enumerations. So it won't be forgotten.

I'll be happy to open reports; I wasn't sure, since I cannot tell if an issue is fpc or gdb.

More: array types, but similar for strings

  type
    TDynIntArray = Array of Integer;
    TStatIntArray = Array [5..9] of Integer;
  var
    VarDynIntArray: TDynIntArray;       // named type
    VarDynIntArrayA: Array of Integer;  // anonymous type = hence the A postfix of the var-name
    VarStatIntArray: TStatIntArray;
    VarStatIntArrayA: Array [5..9] of Integer;

dynamic arrays

  ptype VarDynIntArray     ~type = array [0..1] of LongInt\n   #GOOD
  ptype VarDynIntArrayA    ~type = array [0..1] of LongInt\n
  whatis VarDynIntArray    ~type = TDynIntArray\n   #GOOD
  whatis VarDynIntArrayA   ~type = array [0..-4220246888] of LongInt\n
  # Interesting difference in the range but does not mean it's wrong
  # IMHO it would be nice if the type resolved too
  ptype TDynIntArray       ^error,msg=Cannot resolve DW_OP_push_object_address for a missing object
  whatis TDynIntArray      ^error,msg=Cannot resolve DW_OP_push_object_address for a missing object
  whatis @VarDynIntArray   ~type = ^TDynIntArray\n   #GOOD
  ptype @VarDynIntArray    ~type = ^ ^error,msg=Cannot resolve DW_OP_push_object_address for a missing object

static arrays

  ptype VarStatIntArray    ~type = array [5..9] of LongInt\n   #GOOD
  ptype @VarStatIntArray   ~type = ^LongInt\n
  # where is the array gone ? = but probably a gdb issue
  # Are static arrays pointers? / The same for the DynArray is ok, but for static?
  ptype VarStatIntArray^   ~type = LongInt\n
  * 2)
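For anyone who wants to try the gdb commands above, the declarations can be wrapped into a small test program along these lines. This is only a sketch: the program name, the values and the array length of 2 (chosen to match the [0..1] range in the output above) are assumptions, not taken from the original test. Compile with dwarf debug info (-gw) and break on the marked line.

```pascal
program DwarfArrayTest;
{$mode objfpc}{$H+}

type
  TDynIntArray  = array of Integer;
  TStatIntArray = array [5..9] of Integer;

var
  VarDynIntArray:   TDynIntArray;             // named type
  VarDynIntArrayA:  array of Integer;         // anonymous type
  VarStatIntArray:  TStatIntArray;
  VarStatIntArrayA: array [5..9] of Integer;
  i: Integer;

begin
  // two elements each, so gdb should report the range [0..1]
  SetLength(VarDynIntArray, 2);
  SetLength(VarDynIntArrayA, 2);
  VarDynIntArray[0]  := 1;
  VarDynIntArray[1]  := 2;
  VarDynIntArrayA[0] := 3;
  VarDynIntArrayA[1] := 4;
  for i := 5 to 9 do
  begin
    VarStatIntArray[i]  := i;
    VarStatIntArrayA[i] := i * 10;
  end;
  // set the breakpoint here, then run the ptype / whatis commands
  WriteLn(Length(VarDynIntArray), ' ', VarStatIntArray[5], ' ',
          VarDynIntArrayA[1], ' ', VarStatIntArrayA[9]);
end.
```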
Re: [fpc-devel] Attn Joost: the ampersand in dwarf for var-param
Been too quick... Found the answer:

  Revision: 16683
  Author: joost
  Date: 14:49:20, 02 January 2011
  Message:
  * Dwarf: Hide the implicit pointer from a function-parameter which is passed by reference, and dereference the (hidden) pointer in the DW_AT_location block. This solves problems with function parameters defined as 'var'
  Modified : /trunk/compiler/dbgdwarf.pas

But it seems that now, even in dwarf-3, objects are treated as pointers again? (I can happily live with that. I just want to know.)

On 14/01/2011 20:56, Martin wrote:

I just noticed a (good) change in dwarf 2, fpc trunk (or I believe I noticed). In dwarf2, var params (params by ref) were encoded with an ampersand:

  procedure a(var Foo: TObject):
  ptype Foo     type = TFOO = class : public TOBJECT
  whatis Foo    type = TFoo

Today I looked at it, and the ampersands are gone? In Dwarf they were never there; in dwarf the var-param always behaved like a normal var (you would never notice the extra pointer layer) = seems dwarf caught up?

Martin
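A small program to reproduce the var-param case might look like the sketch below. TFoo, its Value field and the program name are made up for illustration; only the shape of procedure a comes from the message. Compile with -gw, break inside a, and compare the ptype Foo / whatis Foo output before and after revision 16683.

```pascal
program DwarfVarParam;
{$mode objfpc}{$H+}

type
  // Hypothetical class, only here so the parameter has a class type
  TFoo = class
    Value: Integer;
  end;

// Foo is passed by reference, so the compiler passes a hidden pointer
procedure a(var Foo: TFoo);
begin
  Foo.Value := Foo.Value + 1;   // set the breakpoint here
end;

var
  F: TFoo;
begin
  F := TFoo.Create;
  F.Value := 41;
  a(F);
  WriteLn(F.Value);   // 42
  F.Free;
end.
```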
Re: Bug dwarf3 / fpc / gdb Re: dwarf3 and others too [Re: [fpc-devel] Dwarf3 and the encoding of classes]
On Fri, 2011-01-14 at 18:47 +, Martin wrote:

On 11/01/2011 09:28, Joost van der Sluis wrote:

Same for variants and the enumerations. Your input is valuable, though. Maybe better to open bug-reports for the variants and enumerations. So it won't be forgotten.

I'll be happy to open reports; I wasn't sure, since I cannot tell if an issue is fpc or gdb.

The problem now is that debug problems aren't reported and no one feels responsible for them, because it could be someone else's problem. Just report them; I (we) can always look at whether it is a gdb or fpc issue, and ask on the gdb lists for help. It's not as if you just add bugs to the bug-tracker without any investigation or background info. ;)

More: array types, but similar for strings

  type
    TDynIntArray = Array of Integer;
    TStatIntArray = Array [5..9] of Integer;
  var
    VarDynIntArray: TDynIntArray;       // named type
    VarDynIntArrayA: Array of Integer;  // anonymous type = hence the A postfix of the var-name
    VarStatIntArray: TStatIntArray;
    VarStatIntArrayA: Array [5..9] of Integer;

dynamic arrays

Are fully implemented in Dwarf-3 and the gdb I've sent. I don't think it is a good use of my time to make the Dwarf-2 info give nice messages that it won't work. ;) It works with some luck in some cases. I don't think I'll spend time on that. But please open a bug-report. Maybe I or someone else will.

Joost.
Re: [fpc-devel] Attn Joost: the ampersand in dwarf for var-param
On Fri, 2011-01-14 at 21:27 +, Martin wrote:

Been too quick... Found the answer

I thought you'd noticed this earlier. I also wrote it in one of the mails last week?

But it seems that now, even in dwarf-3, objects are treated as pointers again? (I can happily live with that. I just want to know.)

Yes, that was the (final?) conclusion from the discussion with you and Jonas. It is also why I said you were too fast with your tests, because I still had to change this.

Note that there will be more changes. Effectively, the Dwarf-2 code for writing object-info is now used for Dwarf-3 as well. But there were some additions in the old Dwarf-3 code that I have to re-implement (in such a way that as much code as possible is shared between Dwarf-2 and Dwarf-3).

Joost.
Re: [fpc-devel] Attn Joost: the ampersand in dwarf for var-param
On 14/01/2011 22:16, Joost van der Sluis wrote:

On Fri, 2011-01-14 at 21:27 +, Martin wrote:

Been too quick... Found the answer

I thought you'd noticed this earlier. I also wrote it in one of the mails last week?

Seems I have missed that, sorry. Anyway, that is good news...

But it seems that now, even in dwarf-3, objects are treated as pointers again? (I can happily live with that. I just want to know.)

Yes, that was the (final?) conclusion from the discussion with you and Jonas. It is also why I said you were too fast with your tests, because I still had to change this.

That is also good news. Especially since I found a better way to distinguish between

  var
    a: TObject;
    b: ^TObject;

in order to display the correct data in the IDE. And the new way (once implemented) will also work if the debug info is mixed dwarf/stabs.

  ptype a^
  ptype b^

will return the correct result (note the ^ in front of TObject):

  a^: ~type = TObject = class :
  b^: ~type = ^TObject = class : .

And that is the same for all kinds of debug info :) And with luck, that will work with the Mac 6.3.50 as well (because 6.3.50 cannot do whatis TObject = and the current IDE wants that...)

Note that there will be more changes. Effectively, the Dwarf-2 code for writing object-info is now used for Dwarf-3 as well. But there were some additions in the old Dwarf-3 code that I have to re-implement (in such a way that as much code as possible is shared between Dwarf-2 and Dwarf-3).

I will start filing reports...

Have you seen the RunGdbmi app next to my test? It allows you to run a set of gdb instructions against the same application, with different debuggers[1], different compilers[1], and different gw/gs settings, and see all the results at once.

[1] assuming you use gdblist.txt and fpclist.txt files

The advantage is that you see the actual gdb output, instead of what the IDE does with it.

Martin
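The two declarations being distinguished above can be put into a tiny test program like the following sketch (the program name is made up). Built with debug info, it gives gdb one class-typed variable and one pointer to such a variable, so the output of ptype a^ and ptype b^ can be compared directly.

```pascal
program DwarfObjPtr;
{$mode objfpc}{$H+}

var
  a: TObject;    // class instance (itself an implicit pointer)
  b: ^TObject;   // explicit pointer to a class-typed variable

begin
  a := TObject.Create;
  b := @a;
  // set the breakpoint here, then try: ptype a^  and  ptype b^
  WriteLn(b^ = a);   // TRUE: b points at the variable holding the instance
  a.Free;
end.
```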
[fpc-devel] Can someone test, pleease - possible problem with dbg info
In lazarus /debugger/test/Gdbmi is a program TestGdbmi.

If I insert a breakpoint at line 480 (either before start, or during run), then GDB reports:

  TCmdLineDebugger.SendCmdLn -exec-continue
  TCmdLineDebugger.ReadLn ^error,msg=Warning:\nCannot insert breakpoint 9.\nError accessing memory address 0x18: Input/output error.\n
  TGDBMIDebugger.ProcessResult Error: ,msg=Warning:\nCannot insert breakpoint 9.\nError accessing memory address 0x18: Input/output error.\n
  TCmdLineDebugger.ReadLn (gdb)

Can anyone try if they get the same?

w32 fpc 2.4.2 lazarus uptodate0/9/31 dwarf -gw

Oh, and the problem goes away if I use the external linker -Xe

Martin
Re: [fpc-devel] Can someone test, pleease - possible problem with dbg info
15.01.2011 8:38, Martin wrote:

In lazarus /debugger/test/Gdbmi is a program TestGdbmi.

If I insert a breakpoint at line 480 (either before start, or during run), then GDB reports:

  TCmdLineDebugger.SendCmdLn -exec-continue
  TCmdLineDebugger.ReadLn ^error,msg=Warning:\nCannot insert breakpoint 9.\nError accessing memory address 0x18: Input/output error.\n
  TGDBMIDebugger.ProcessResult Error: ,msg=Warning:\nCannot insert breakpoint 9.\nError accessing memory address 0x18: Input/output error.\n
  TCmdLineDebugger.ReadLn (gdb)

Can anyone try if they get the same?

w32 fpc 2.4.2 lazarus uptodate0/9/31 dwarf -gw

Oh, and the problem goes away if I use the external linker -Xe

You have a known problem. You placed a breakpoint at a location which was stripped away by the linker. But debug info for this location is generated and not stripped by the compiler. There is already a bug report regarding that.

Best regards,
Paul Ishenin
Re: [fpc-devel] Can someone test, pleease - possible problem with dbg info
On 15/01/2011 01:50, Paul Ishenin wrote:

15.01.2011 8:38, Martin wrote:

In lazarus /debugger/test/Gdbmi is a program TestGdbmi.

If I insert a breakpoint at line 480 (either before start, or during run), then GDB reports:

  TCmdLineDebugger.SendCmdLn -exec-continue
  TCmdLineDebugger.ReadLn ^error,msg=Warning:\nCannot insert breakpoint 9.\nError accessing memory address 0x18: Input/output error.\n
  TGDBMIDebugger.ProcessResult Error: ,msg=Warning:\nCannot insert breakpoint 9.\nError accessing memory address 0x18: Input/output error.\n
  TCmdLineDebugger.ReadLn (gdb)

Can anyone try if they get the same?

w32 fpc 2.4.2 lazarus uptodate0/9/31 dwarf -gw

Oh, and the problem goes away if I use the external linker -Xe

You have a known problem. You placed a breakpoint at a location which was stripped away by the linker. But debug info for this location is generated and not stripped by the compiler. There is already a bug report regarding that.

OK, because it showed the blue dots for the lines... thanks