Re: [fpc-devel] TStringField, String and UnicodeString and UTF8String

2011-01-14 Thread LacaK


  

So IMHO there must be either:
1. space allocated in the record buffer of size 4*TFieldDef.Size+1, or
2. a redefined meaning of the Size property (number of bytes, not
characters), with fielddefs created with Size*4



Yes, those are the possible solutions. The good thing about the second
option is that a user can do that on his own if he wants to use UTF-8:
just create persistent fields with a field size of 4 times the number of
characters. I'm not sure if we have to change this. It's a problem the
programmer has to deal with, I think...

  

I see these possible problems/disadvantages:
1. In many cases (dynamically built or ad-hoc queries) creating 
persistent fields is not very practical (or is complicated).
2. Allocation of space in the record buffer is based on TFieldDef objects 
(see TCustomBufDataset.GetFieldSize), and TFieldDef objects are created by 
TSQLConnectors in the AddFieldDefs method, so setting Size in a persistent 
field does not solve the whole problem, because each SQLConnector must set 
Size as well.
3. In TStringField, Size is also used to determine the default DisplayWidth 
(for TDBGrid), and in Delphi also to set MaxLength in TDBEdit
(so here we can see that Size is used as the maximum number of characters 
rather than bytes).
4. Incompatibility with Delphi (if we reclassify Size as a number of bytes, 
not characters).


So I would prefer the 1st way (increase the buffer size; if we 
support only the BMP, then 3*Size+1 may be sufficient).

So Size remains the length in characters.
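
The arithmetic behind the 4*Size+1 and 3*Size+1 figures can be sketched as a small standalone program. This is only an illustration, not code from fcl-db: WorstCaseUtf8Bytes is a hypothetical helper, and the sample strings are spelled as explicit UTF-8 byte sequences so that the result does not depend on the source file encoding.

```pascal
program Utf8BufSize;
{$mode objfpc}{$H+}

// Hypothetical helper (not part of fcl-db): worst-case number of bytes
// needed to store CharCount characters as UTF-8 plus a terminating #0.
function WorstCaseUtf8Bytes(CharCount: Integer; BmpOnly: Boolean): Integer;
begin
  if BmpOnly then
    Result := 3 * CharCount + 1   // U+0000..U+FFFF take at most 3 bytes each
  else
    Result := 4 * CharCount + 1;  // U+10000..U+10FFFF take 4 bytes each
end;

const
  // UTF-8 byte sequences written out explicitly.
  LatinA: string   = 'a';                // U+0061
  AAcute: string   = #$C3#$A1;           // U+00E1
  EuroSign: string = #$E2#$82#$AC;       // U+20AC, still inside the BMP
  GClef: string    = #$F0#$9D#$84#$9E;   // U+1D11E, outside the BMP

begin
  // Length counts bytes here, not characters.
  WriteLn('U+0061  : ', Length(LatinA), ' byte(s)');    // 1
  WriteLn('U+00E1  : ', Length(AAcute), ' byte(s)');    // 2
  WriteLn('U+20AC  : ', Length(EuroSign), ' byte(s)');  // 3
  WriteLn('U+1D11E : ', Length(GClef), ' byte(s)');     // 4

  // Buffer sizes for a field declared with Size = 10 characters.
  WriteLn('Size=10, BMP only   : ', WorstCaseUtf8Bytes(10, True));   // 31
  WriteLn('Size=10, all planes : ', WorstCaseUtf8Bytes(10, False));  // 41
end.
```

With this approach Size keeps its meaning as a character count, and only the buffer allocation grows.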


hm, according to
http://docwiki.embarcadero.com/VCL/XE/en/DB.TStringField.Size Size is
the number of characters,
but according to
http://docwiki.embarcadero.com/VCL/en/DB.TFieldDef.Size Size is the number
of bytes in the underlying database



Yes, that's indeed the problem. But there's also the .DataSize property,
so we could use that.

  

Yes, but DataSize is defined only in TField, not in the TFieldDef class.
If we add a DataSize property to the TFieldDef class as well and overload 
the TFieldDef.Create method with an additional DataSize parameter, then 
the SQLConnector could specify both pieces of information: character size 
(for display purposes) and byte size (for buffer purposes).



Maybe... if the pressure on the bugtracker gets too high, I'll bow and
change this. (I think 25% of all existing db bugs are related to this
and people who do not understand anything about encodings.)

  

but TField is created from TFieldDef and
TField.Size=TFieldDef.Size ... so isn't it curious ?


Note that when you want to use UTF-16 (or 32) you have to use
TWideStringFields.

  
  

So TWideStringField is not an encoding-agnostic field (is it designed to
always be UTF-16 encoded)?



No. It's designed to contain an array of two-byte records. In fact you
can use it to store UCS-2 data, but not UTF-16. Same story as with
ansi/UTF-8.
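
The difference between UCS-2 and UTF-16 mentioned here only shows up for code points above U+FFFF, which UTF-16 stores as a surrogate pair of two 16-bit units. A minimal sketch of that calculation follows; Utf16Units and ToSurrogates are hypothetical helper names chosen for this example, not RTL functions.

```pascal
program SurrogateDemo;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Number of 16-bit code units UTF-16 needs for one code point.
function Utf16Units(CodePoint: Cardinal): Integer;
begin
  if CodePoint < $10000 then
    Result := 1   // inside the BMP: one unit, same as UCS-2
  else
    Result := 2;  // outside the BMP: a surrogate pair
end;

// Split a code point above U+FFFF into its high and low surrogate.
procedure ToSurrogates(CodePoint: Cardinal; out HighSur, LowSur: Word);
begin
  Dec(CodePoint, $10000);
  HighSur := Word($D800 + (CodePoint shr 10));
  LowSur  := Word($DC00 + (CodePoint and $3FF));
end;

var
  H, L: Word;
begin
  WriteLn('U+00E1  needs ', Utf16Units($00E1), ' unit(s)');   // 1
  WriteLn('U+1D11E needs ', Utf16Units($1D11E), ' unit(s)');  // 2

  ToSurrogates($1D11E, H, L);
  // U+1D11E is stored as the pair D834 DD1E
  WriteLn('U+1D11E = ', IntToHex(LongInt(H), 4), ' ', IntToHex(LongInt(L), 4));
end.
```

A field sized in two-byte records for N characters therefore has room for N BMP characters, but a character outside the BMP consumes two of those records.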

  

Yes I understand now.

Laco.

___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel


Re: [fpc-devel] TStringField, String and UnicodeString and UTF8String

2011-01-14 Thread LacaK



So this is the answer I was looking for:
In Lazarus TStringField MUST hold UTF-8 encoded strings.



Not entirely true. You could also choose to bind the fields to some
Lazarus-components manually, not using the db-components.

IMHO most GUI database applications use controls like TDBGrid or TDBEdit,
so they should display correct values by default without extra coding 
(or at least provide some standardized support ... )




 (Tedit.Text :=
convertFunc(StringField.Text)) Or you can add a hook so that the .text
property always does a conversion to UTF-8. The first option can be used if
you use a mediator or view. The second option I wouldn't use.

  
Rofl. You mean that Microsoft SQL Server can't handle unicode
completely? 
  

Not completely: only UCS-2 (no UTF-8)

SQL Server provides non-UNICODE datatypes - char, varchar, text 



ie: TStringField
  
Yes, but the ODBC driver returns data in the ANSI codepage (there is no way 
to force it to return UTF-8).

This I can fix with a patch to TODBCConnection.LoadField like this
(so I convert to UTF-8 in the connector method when the driver is unable to 
return UTF-8):

   begin
     Res:=SQLGetData(ODBCCursor.FSTMTHandle, FieldDef.Index+1, SQL_C_CHAR, buffer, FieldDef.Size, @StrLenOrInd);
+    if CharSet='ANSI' then //hack for Microsoft SQL Server
+      StrPLCopy(buffer, UTF8Encode(PChar(buffer)), FieldDef.Size);
   end;

  

 and UNICODE (UCS-2) datatypes - nchar, nvarchar, ntext



ie: TWideStringField.
  
Yes, in this case the ODBC driver returns data in UCS-2. This data is 
written into a WideString buffer, which seems correct, but DBGrid 
displays ? instead of characters with diacritical marks (IMHO because 
the widestringmanager on Windows converts WideString to an ANSI string, 
not a UTF-8 string).
This can be fixed by using the OnGetText event of the field: 
aText:=UTF8Encode(Sender.AsString);
That is not user-friendly, because it requires hacking user code for 
every TWideStringField in every TSQLQuery.

It can also be fixed in fields.inc:
 function TWideStringField.GetAsString: string;
 begin
+{$IFDEF WINDOWS}
+  Result := UTF8Encode(GetAsWideString);
+{$ELSE}
   Result := GetAsWideString;
+{$ENDIF}
 end;
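
The effect of the UTF8Encode call used in both workarounds can be seen in isolation with a short program. This is a sketch independent of fcl-db and ODBC; the WideString is built from code units and merely stands in for data an nvarchar column might deliver.

```pascal
program WideToUtf8;
{$mode objfpc}{$H+}

var
  W: WideString;
  U: UTF8String;
begin
  // Two UCS-2 code units: 'a' (U+0061) and a with acute (U+00E1).
  W := WideChar($0061);
  W := W + WideChar($00E1);

  // Explicit conversion to UTF-8, independent of the ANSI codepage.
  U := UTF8Encode(W);

  WriteLn('UTF-16 code units: ', Length(W));  // 2
  WriteLn('UTF-8 bytes      : ', Length(U));  // 3
end.
```

An implicit WideString to string assignment goes through the widestring manager instead, which is where the conversion to the ANSI codepage described above comes from.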

So what is the expected encoding of data written into TWideStringField 
... or is there a way to get correct results in DBGrid without the 
above-mentioned workarounds?


  

 SQL Server ODBC driver supports AutoTranslate, see:
http://msdn.microsoft.com/en-us/library/ms130822.aspx
 SQL Server char, varchar, or text data sent to a client SQL_C_CHAR
variable is converted from character to Unicode using the server ACP,
then converted from Unicode to character using the client ACP.



This is what you use when you set the encoding when you connect to the
client. The solution to all your problems. As explained three times, in
this message alone.

In fact it's simple: incoming data=outgoing data.

If you need UTF-8 encoding for the outgoing data (direct access to
Lazarus controls) you have to select UTF-8 at the input.
Yes, but as I wrote, no such possibility exists with Microsoft SQL 
Server (nor, I think, with Access)

(it seems, that Microsoft does not like UTF-8 and prefers UTF-16 (UCS-2))


And, luckily, you can instruct the Database-server which encoding to use
when it's communicating with the outer world. So your problem is solved.
  

When it is possible, then yes.


Now, if you also choose UTF-8 as the Database-server field encoding (the
encoding the data is stored in) there's no conversion necessary at all.
  

Yes, if the DB supports UTF-8

-Laco.



Re: [fpc-devel] String and UnicodeString and UTF8String

2011-01-14 Thread Hans-Peter Diettrich

Sven Barth schrieb:


Widestring will also grind the application to a halt due to being COM
based on Windows.


How that?




WideString on Windows has no reference counting, thus every time a 
WideString is assigned it needs to be copied.


I'm not so sure of that. AFAIR the field exists, but it's unused or 
reserved for shared memory management.


Of course the requirement that a BSTR has to reside in shared memory 
discourages the use of exactly that type for string handling inside an 
application.


I only wanted to prevent the introduction of another UTF16String type, 
in addition to WideString, BSTR (WinAPI) and UnicodeString (Delphi). 
Conversion-wise WideString/BSTR and (other) UTF-16 strings are equivalent.



Nearly all Windows API functions only allow single byte encodings or 
UTF-16. The only functions that I'm aware of that can use UTF-8 
encoding are the console input/output APIs (if the codepage is set to 
UTF-8) [and also file I/O APIs, but they don't assume any encoding].


And the conversion functions of course (MBCStoWStr...).

DoDi



Re: [fpc-devel] String and UnicodeString and UTF8String

2011-01-14 Thread Thaddy

On 14-1-2011 13:21, Hans-Peter Diettrich wrote:

Sven Barth schrieb:


Widestring will also grind the application to a halt due to being COM
based on Windows.


How that?




WideString on Windows has no reference counting, thus every time a 
WideString is assigned it needs to be copied.


I'm not so sure of that. AFAIR the field exists, but it's unused or 
reserved for shared memory management.



Yes, if you use the set of memory allocators I mentioned, the field 
*will* be used: COM marshalling.
No COM, no count. Simple as that. It is unused because the memory 
manager doesn't use it.
COM is not implemented unless you use a COM-based memory manager. No 
COM, no reference count.






Bug dwarf3 / fpc / gdb Re: dwarf3 and others too [Re: [fpc-devel] Dwarf3 and the encoding of classes]

2011-01-14 Thread Martin

On 11/01/2011 09:28, Joost van der Sluis wrote:


Same for variants and the enumerations. Your input is valuable, though.
Maybe better to open bug-reports for the variants and enumerations. So
it won't be forgotten.
I'll be happy to open reports. I wasn't sure, since I cannot tell whether 
an issue is in fpc or gdb


More:
array types, but similar for strings

  type
    TDynIntArray = Array of Integer;
    TStatIntArray = Array [5..9] of Integer;
  var
    VarDynIntArray: TDynIntArray;       // named type
    VarDynIntArrayA: Array of Integer;  // anonymous type = hence the A postfix of the var-name

    VarStatIntArray: TStatIntArray;
    VarStatIntArrayA: Array [5..9] of Integer;

 dynamic arrays

ptype VarDynIntArray     ~type = array [0..1] of LongInt\n   #GOOD
ptype VarDynIntArrayA    ~type = array [0..1] of LongInt\n
whatis VarDynIntArray    ~type = TDynIntArray\n  #GOOD
whatis VarDynIntArrayA   ~type = array [0..-4220246888] of LongInt\n
# Interesting difference in the range but does not mean it's wrong

# IMHO it would be nice if the type resolved too
ptype TDynIntArray    ^error,msg=Cannot resolve DW_OP_push_object_address for a missing object
whatis TDynIntArray   ^error,msg=Cannot resolve DW_OP_push_object_address for a missing object


whatis @VarDynIntArray   ~type = ^TDynIntArray\n #GOOD
ptype @VarDynIntArray
~type = ^
^error,msg=Cannot resolve DW_OP_push_object_address for a missing object

 static arrays
ptype VarStatIntArray   ~type = array [5..9] of LongInt\n #GOOD
ptype @VarStatIntArray  ~type = ^LongInt\n   # where has the array gone? = but probably a gdb issue


# Are static arrays pointers?
# The same for the DynArray is ok, but for static?
ptype VarStatIntArray^  ~type = LongInt\n

* 2)



Re: [fpc-devel] Attn Joost: the ampersand in dwarf for var-param

2011-01-14 Thread Martin

Been too quick...
Found the answer

Revision: 16683
Author: joost
Date: 14:49:20, 02 January 2011
Message:
 * Dwarf: Hide the implicit pointer from a function-parameter which is passed
   by reference, and dereference the (hidden) pointer in the DW_AT_location
   block. This solves problems with function parameters defined as 'var'

Modified : /trunk/compiler/dbgdwarf.pas

--
But it seems that now, even in dwarf-3  objects are treated as pointer 
again? (I can happily live with that. I just want to know)



On 14/01/2011 20:56, Martin wrote:
I just noticed a (good) change in dwarf 2, fpc trunk (or I believe I 
noticed)


in dwarf2, var params (params by ref) were encoded with an ampersand

procedure a(var Foo: TObject):

ptype Foo
type = TFOO = class : public TOBJECT

whatis Foo
type = TFoo

Today, I looked at it, and the ampersands are gone?

In Dwarf they were never there, in dwarf the var-param always behaved 
like a normal var (you would never notice the extra pointer layer) = 
seems dwarf caught up?


Martin


Re: Bug dwarf3 / fpc / gdb Re: dwarf3 and others too [Re: [fpc-devel] Dwarf3 and the encoding of classes]

2011-01-14 Thread Joost van der Sluis
On Fri, 2011-01-14 at 18:47 +, Martin wrote:
 On 11/01/2011 09:28, Joost van der Sluis wrote:
 
  Same for variants and the enumerations. Your input is valuable, though.
  Maybe better to open bug-reports for the variants and enumerations. So
  it won't be forgotten.
 I'll be happy to open reports. I wasn't sure, since I cannot tell whether 
 an issue is in fpc or gdb

The problem now is that debug problems aren't reported and no one feels
responsible for them, because it could be someone else's
problem. 

Just report them, I (we) can always look if it is a gdb or fpc issue.
And ask on the gdb lists for help.

It's not as if you just add bugs to the bug-tracker, without any
investigation or background-info. ;)

 More:
 array types, but similar for strings
 
type
  TDynIntArray = Array of Integer;
  TStatIntArray = Array [5..9] of Integer;
var
  VarDynIntArray: TDynIntArray;       // named type
  VarDynIntArrayA: Array of Integer;  // anonymous type = hence the A postfix of the var-name
  VarStatIntArray: TStatIntArray;
  VarStatIntArrayA: Array [5..9] of Integer;
 
  dynamic arrays

Are fully implemented in Dwarf-3 and the gdb I've sent. I don't think it
is a good use of my time to let the Dwarf-2 info give nice messages
that it won't work. ;)

It works with some luck in some cases. I don't think I'll spend time on
that. But please open a bug-report. Maybe I or someone else will.

Joost.



Re: [fpc-devel] Attn Joost: the ampersand in dwarf for var-param

2011-01-14 Thread Joost van der Sluis
On Fri, 2011-01-14 at 21:27 +, Martin wrote:
 Been too quick...
 Found the answer

I thought you'd noticed this earlier. I also wrote it in one of the mails
last week?

 But it seems that now, even in dwarf-3  objects are treated as pointer 
 again? (I can happily live with that. I just want to know)

Yes, that was the (final?) conclusion from the discussion with you and
Jonas. It is also why I said you were too fast with your tests, because
I still had to change this.

Note that there will be more changes. Now effectively the Dwarf-2 code
for writing object-info is used for Dwarf 3 also. But there were some
additions in the old Dwarf-3 code that I have to re-implement. (In
such a way that as much code as possible is shared between Dwarf-2 and
Dwarf-3)

Joost.





Re: [fpc-devel] Attn Joost: the ampersand in dwarf for var-param

2011-01-14 Thread Martin

On 14/01/2011 22:16, Joost van der Sluis wrote:

On Fri, 2011-01-14 at 21:27 +, Martin wrote:

Been too quick...
Found the answer

I thought you'd noticed this earlier. I also wrote it in one of the mails
last week?

Seems I have missed that, sorry.
Anyway, that is good news...


But it seems that now, even in dwarf-3  objects are treated as pointer
again? (I can happily live with that. I just want to know)

Yes, that was the (final?) conclusion from the discussion with you and
Jonas. It is also why I said you were too fast with your tests, because
I still had to change this.

That is also good news.

Especially since I found a better way to distinguish between
var
  a: TObject;
  b: ^TObject;
in order to display the correct data in the IDE.
And the new way (once implemented) will also work if the debug info is 
mixed dwarf/stabs


ptype a^
ptype b^

will return the correct type (note the ^ in front of TObject)
a^:  ~type = TObject = class : 
b^:  ~type = ^TObject = class : .

And that is the same for all kinds of debug info :)

And with luck, that will work with the Mac 6.3.50 as well (because 
6.3.50 cannot do whatis TObject = and current IDE wants that...)





Note that there will be more changes. Now effectively the Dwarf-2 code
for writing object-info is used for Dwarf 3 also. But there were some
additions in the old Dwarf-3 code that I have to re-implement. (In
such a way that as much code as possible is shared between Dwarf-2 and
Dwarf-3)

I will start filling in reports...

Have you seen the RunGdbmi app next to my test? It allows you to run a 
set of gdb instructions against the same application, with different 
debuggers[1], different compilers[1], and different gw/gs settings, and 
see all the results at once.


[1] assuming you use gdblist.txt and fpclist.txt files

The advantage is, you see the actual gdb output, instead of what the IDE 
does with it


Martin


[fpc-devel] Can someone test, pleease - possible problem with dbg info

2011-01-14 Thread Martin

In lazarus /debugger/test/Gdbmi is a program TestGdbmi

If I insert a breakpoint at line 480 (either before start, or during 
run), then GDB reports:

 TCmdLineDebugger.SendCmdLn -exec-continue
 TCmdLineDebugger.ReadLn ^error,msg=Warning:\nCannot insert breakpoint 9.\nError accessing memory address 0x18: Input/output error.\n
 TGDBMIDebugger.ProcessResult Error: ,msg=Warning:\nCannot insert breakpoint 9.\nError accessing memory address 0x18: Input/output error.\n
 TCmdLineDebugger.ReadLn (gdb) 


Can anyone try if they get the same?
w32
fpc 2.4.2
lazarus uptodate0/9/31
dwarf -gw

Oh, and the problem goes away, if I use the external linker -Xe

Martin


Re: [fpc-devel] Can someone test, pleease - possible problem with dbg info

2011-01-14 Thread Paul Ishenin

15.01.2011 8:38, Martin wrote:

In lazarus /debugger/test/Gdbmi is a program TestGdbmi

If I insert a breakpoint at line 480 (either before start, or during 
run), then GDB reports:

 TCmdLineDebugger.SendCmdLn -exec-continue
 TCmdLineDebugger.ReadLn ^error,msg=Warning:\nCannot insert breakpoint 9.\nError accessing memory address 0x18: Input/output error.\n
 TGDBMIDebugger.ProcessResult Error: ,msg=Warning:\nCannot insert breakpoint 9.\nError accessing memory address 0x18: Input/output error.\n
 TCmdLineDebugger.ReadLn (gdb) 


Can anyone try if they get the same?
w32
fpc 2.4.2
lazarus uptodate0/9/31
dwarf -gw

Oh, and the problem goes away, if I use the external linker -Xe
You have a known problem. You placed a breakpoint at a location which 
was stripped away by the linker. But debug info for this location is 
generated and not stripped by the compiler. There is already a bug 
report regarding that.


Best regards,
Paul Ishenin


Re: [fpc-devel] Can someone test, pleease - possible problem with dbg info

2011-01-14 Thread Martin

On 15/01/2011 01:50, Paul Ishenin wrote:

15.01.2011 8:38, Martin wrote:

In lazarus /debugger/test/Gdbmi is a program TestGdbmi

If I insert a breakpoint at line 480 (either before start, or during 
run), then GDB reports:

 TCmdLineDebugger.SendCmdLn -exec-continue
 TCmdLineDebugger.ReadLn ^error,msg=Warning:\nCannot insert breakpoint 9.\nError accessing memory address 0x18: Input/output error.\n
 TGDBMIDebugger.ProcessResult Error: ,msg=Warning:\nCannot insert breakpoint 9.\nError accessing memory address 0x18: Input/output error.\n
 TCmdLineDebugger.ReadLn (gdb) 


Can anyone try if they get the same?
w32
fpc 2.4.2
lazarus uptodate0/9/31
dwarf -gw

Oh, and the problem goes away, if I use the external linker -Xe
You have a known problem. You placed a breakpoint at a location 
which was stripped away by the linker. But debug info for this location 
is generated and not stripped by the compiler. There is already a bug 
report regarding that.

ok, because it showed the blue dots for the lines...
thanks