In our previous episode, Hans-Peter Diettrich said:
> > Please explain what you mean by "unicode" and what by "ansi" in your
> > statement. Without nuancing that, your statement is pretty much meaningless.
>
> AFAIR Delphi changed the string type to Unicode (UTF-16) in D2009, i.e.
> D2007 wa
In our previous episode, Jonas Maebe said:
> > I know that Florian and you wanted to see the default string as something
> > of a dialect mode, but I never saw a way to do that practically.
>
> How about this: a new language feature is added to the compiler that
> enables defining a type alia
On 10/11/2011 10:11 PM, Hans-Peter Diettrich wrote:
(a) Sorry, but this does not answer the question I tried to ask
(Difference between a possible type called RawByteString and a basic
"new string" variable that happens to be set to the Encoding ID
"RawByte").
When I have a variable of type
On 11.10.2011 23:09, Hans-Peter Diettrich wrote:
Marco van de Voort wrote:
Note that many places that are runtime typed (like tstringlist.loadfromfile)
get an encoding parameter, so that the loading code can convert the encoding
of the file (in encoding parameter) to whatever stringtype tstr
On 10/11/2011 09:37 PM, Hans-Peter Diettrich wrote:
Why implement the upper/lower translation N times, when afterwards the
N encodings have to be converted into the Result encoding? Where the
encoding conversions already exist...
Obviously, the dedicated upper/lower translation done in a certa
On 10/11/2011 09:37 PM, Hans-Peter Diettrich wrote:
IMO, calling ToLower with a string that is set to the encoding
"RawByte" does not make sense and should generate an exception.
Nope.
A new string consists of a record that contains the encoding ID, element
size, reference count, length and
The last answer was to
On 10/11/2011 09:37 PM, Hans-Peter Diettrich wrote:
When a string is assigned to a RawByteString, both point to the
original string, which has a valid (non-raw) encoding.
-Michael
___
fpc-devel maillist - fpc-devel@lists.
12.10.2011 16:03, Michael Schnell wrote:
I suppose a variable of the type "String" is pre-loaded with the
predefined "System" encoding ID.
If you mean "AnsiString" then it is loaded with encoding 0 which means
default system codepage. It will get the real encoding number after the
first assig
On 12 Oct 2011, at 09:56, Marco van de Voort wrote:
If it was just one class it would work. But essentially it is all OOP.
(e.g. tcomponent and tcontrol have string properties, and thus the whole of
lazarus),
Lazarus doesn't have to change anything. They are free to follow the
path you pro
On 10/12/2011 10:09 AM, Paul Ishenin wrote:
12.10.2011 16:03, Michael Schnell wrote:
I suppose a variable of the type "String" is pre-loaded with the
predefined "System" encoding ID.
If you mean "AnsiString" then it is loaded with encoding 0 which means
default system codepage. It will get th
On 12 Oct 2011, at 10:13, Jonas Maebe wrote:
That would indeed require some ifdefs to keep the code compilable
also by Delphi. No solution will be completely free.
Well, an alternative could be to add a global directive such as
{$modeswitch duplicate_all_string_based_code}
whereby anything
On 12.10.2011 10:24, Michael Schnell wrote:
On 10/12/2011 10:09 AM, Paul Ishenin wrote:
12.10.2011 16:03, Michael Schnell wrote:
I suppose a variable of the type "String" is pre-loaded with the
predefined "System" encoding ID.
If you mean "AnsiString" then it is loaded with encoding 0 which
On 10/12/2011 10:35 AM, Sven Barth wrote:
No. In Delphi "String = UnicodeString", but AnsiString still exists as
a one-byte (or multi-byte) string type (the "new string type" or "code
page aware string type").
Sorry, but I don't understand.
According to the "TAnsiRec", such a "New String" no
On Wednesday 12 October 2011 09.50:33 Marco van de Voort wrote:
>
> Undecided. But I'm very strongly against utf16 default on unix. I don't do
> much GUI on unix, and it would be insane to have a string type that is
> totally different from all other string types that I touch.
Do I understand it
On 12.10.2011 10:59, Martin Schreiber wrote:
On Wednesday 12 October 2011 09.50:33 Marco van de Voort wrote:
Undecided. But I'm very strongly against utf16 default on unix. I don't do
much GUI on unix, and it would be insane to have a string type that is
totally different from all other strin
1) Why is UTF8String made incompatible with AnsiString(CP_UTF8)
( UTF8String = type AnsiString(CP_UTF8); )? Why not an alias?
2) Same question about RawByteString
3) Why is UnicodeString a separate type? Should it be
AnsiString(CP_UTF16)? If not, what is AnsiString(CP_UTF16)?
4) If now ansistri
On Wednesday 12 October 2011 11.13:45 Sven Barth wrote:
> On 12.10.2011 10:59, Martin Schreiber wrote:
> > On Wednesday 12 October 2011 09.50:33 Marco van de Voort wrote:
> >> Undecided. But I'm very strongly against utf16 default on unix. I don't
> >> do much GUI on unix, and it would be insane
On 12.10.2011 11:33, Alex Shishkin wrote:
1) Why is UTF8String made incompatible with AnsiString(CP_UTF8)
( UTF8String = type AnsiString(CP_UTF8); )? Why not an alias?
2) Same question about RawByteString
RawByteString is special, because any other string can be assigned to it
WITHOUT conversio
Michael Schnell wrote:
When I have a variable of type AnsiString, and assign a string to it,
then its encoding is reported as 1252 (my system codepage). On Paul's
machine it will have a different encoding, I assume?
Via personal consulting ( :) ) I learned that the multiple new Pascal -
s
Paul Ishenin wrote:
12.10.2011 16:03, Michael Schnell wrote:
I suppose a variable of the type "String" is pre-loaded with the
predefined "System" encoding ID.
If you mean "AnsiString" then it is loaded with encoding 0 which means
default system codepage. It will get the real encoding number
Sven Barth wrote:
On 11.10.2011 23:09, Hans-Peter Diettrich wrote:
In short, see it as if text now has a mandatory encoding attached. If the
runtime doesn't know the type, then it is not text, but binary, and you
should treat it as such.
Can you give an example, how the runtime can not
12.10.2011 13:45, Sven Barth wrote:
On 12.10.2011 11:33, Alex Shishkin wrote:
1) Why is UTF8String made incompatible with AnsiString(CP_UTF8)
( UTF8String = type AnsiString(CP_UTF8); )? Why not an alias?
2) Same question about RawByteString
RawByteString is special, because any other string can
My proposed changes to spstring.
1) If a string is defined w/o explicit encoding (e.g. just "string", in
H+ modeswitch or "ansistring") it is treated as RawByteString.
2) In unicode Delphi mode the encoding of all string constant values is
forced to UTF16; the source encoding can be any. String variables
Marco van de Voort wrote:
In our previous episode, Hans-Peter Diettrich said:
Please explain what you mean by "unicode" and what by "ansi" in your
statement. Without nuancing that, your statement is pretty much meaningless.
AFAIR Delphi changed the string type to Unicode (UTF-16) in D2009, i.
Michael Schnell wrote:
I understand that some day (when the official release comes up) "String"
will be a new String type and thus ANSIString obsolete and just an alias.
No. "string" is an alias (generic type), all other string types are
distinct types.
So target encoding ID "0" means tha
Michael Schnell wrote:
On 10/12/2011 10:35 AM, Sven Barth wrote:
No. In Delphi "String = UnicodeString", but AnsiString still exists as
a one-byte (or multi-byte) string type (the "new string type" or "code
page aware string type").
Sorry, but I don't understand.
According to the "TAnsiRe
Alex Shishkin wrote:
1) Why is UTF8String made incompatible with AnsiString(CP_UTF8)
( UTF8String = type AnsiString(CP_UTF8); )? Why not an alias?
An alias allows assigning strings of *any* encoding, with possibly fatal
consequences. A strict UTF8String type allows for implicit conversion,
whe
Alex Shishkin wrote:
But the question remains: what is
AnsiString(CP_NONE)?
CP_NONE is not a codepage, it's listed under Clipping Capabilities.
If it is special too, why not just alias and if not what is it?
CP_ACP (0) is the placeholder for the system Ansi CodePage.
DoDi
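
A minimal sketch of how the code page carried by a string could be
inspected, assuming FPC 3.0 or later (or Delphi 2009 or later), where
StringCodePage is available in the System unit. The program and variable
names are made up for illustration; the numbers printed depend on the
compiler, its settings and the system code page, so none are claimed here.

```pascal
program CodePageSketch;
{$ifdef FPC}{$mode delphi}{$endif}
{$apptype console}

var
  A: AnsiString;      // declared without an explicit code page
  U: UTF8String;      // declared with the UTF-8 code page
  R: RawByteString;   // accepts other one-byte strings without conversion
begin
  A := 'abc';
  U := 'abc';
  R := U;             // R is expected to keep the code page of U

  // StringCodePage reports the code page stored with the string data,
  // which may differ from the one the variable was declared with.
  WriteLn('AnsiString    : ', StringCodePage(A));
  WriteLn('UTF8String    : ', StringCodePage(U));
  WriteLn('RawByteString : ', StringCodePage(R));
end.
```

Comparing the first line of output with the system code page shows whether
the placeholder 0 or a concrete code page number is stored after an
assignment on a given compiler.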
On 10/12/2011 01:53 PM, Hans-Peter Diettrich wrote:
All AnsiString types have an element size of 1, UnicodeString has 2
and UCS4String has 4 bytes per element.
Disregarding whether or not this makes sense: what technology enforces
this (e.g. Compiler Magic or RTL) ?
-Michael
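
The element sizes quoted above can be checked with a short program. This
is only a sketch, assuming FPC 3.0 or later (or Delphi 2009 or later),
where StringElementSize exists for both AnsiString and UnicodeString; it
shows what the compiler and RTL report and does not settle which of the
two enforces the sizes.

```pascal
program ElementSizeSketch;
{$ifdef FPC}{$mode delphi}{$endif}
{$apptype console}

var
  A: AnsiString;
  W: UnicodeString;
begin
  A := 'abc';
  W := 'abc';

  // Sizes of the character types, known to the compiler at compile time.
  WriteLn('SizeOf(AnsiChar) : ', SizeOf(AnsiChar));
  WriteLn('SizeOf(WideChar) : ', SizeOf(WideChar));
  WriteLn('SizeOf(UCS4Char) : ', SizeOf(UCS4Char));

  // Element sizes stored with the string data, read at run time.
  WriteLn('element size of A: ', StringElementSize(A));
  WriteLn('element size of W: ', StringElementSize(W));
end.
```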
On 12.10.2011 10:50, Michael Schnell wrote:
On 10/12/2011 10:35 AM, Sven Barth wrote:
No. In Delphi "String = UnicodeString", but AnsiString still exists as
a one-byte (or multi-byte) string type (the "new string type" or "code
page aware string type").
Sorry, but I don't understand.
Accord
On 10/12/2011 01:45 PM, Hans-Peter Diettrich wrote:
Michael Schnell wrote:
So target encoding ID "0" means that " := " will preserve the
encoding of the source and set the target appropriately without doing
a conversion.
No. Codepage 0 stands for the system encoding, formerly "native"
stri
On 12.10.2011 14:07, Michael Schnell wrote:
On 10/12/2011 01:53 PM, Hans-Peter Diettrich wrote:
All AnsiString types have an element size of 1, UnicodeString has 2
and UCS4String has 4 bytes per element.
Disregarding whether or not this makes sense: what technology enforces
this (e.g. Compil
On 10/12/2011 12:13 PM, Hans-Peter Diettrich wrote:
Delphi allows for RawByteStrings with encoding 0. When assigned to an
AnsiString, the string encoding still is zero, both variables seem to
point to the same string data.
The pointing to the data array (managed by "lazy copy and reference
On 10/12/2011 12:09 PM, Hans-Peter Diettrich wrote:
Seemingly (other than I assumed) a " := " between new strings does
not preserve the encoding, but performs an encoding conversion to the
target's encoding ID.
Right.
As I now understand: Exception: Target encoding ID = 0, source encoding
On 10/12/2011 12:09 PM, Hans-Peter Diettrich wrote:
Right, the new string types are *strict* types,
That does make sense regarding Pascal's general "strict type" paradigm.
-Michael
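
A small sketch of what "strict" string types look like in source,
assuming FPC 3.0 or later (or Delphi 2009 or later). The type name
Cp1252String is hypothetical and 1252 is only an example code page; an
assignment between two differently declared types is expected to trigger
a conversion, which can be observed through StringCodePage. No output is
claimed, since it depends on compiler and platform.

```pascal
program StrictTypeSketch;
{$ifdef FPC}{$mode delphi}{$endif}
{$apptype console}

type
  // Hypothetical distinct string type bound to code page 1252.
  Cp1252String = type AnsiString(1252);

var
  U: UTF8String;
  W: Cp1252String;
begin
  U := 'abc';
  W := U;   // declared code pages differ: a conversion is expected here

  WriteLn('source code page : ', StringCodePage(U));
  WriteLn('target code page : ', StringCodePage(W));
end.
```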
On 12.10.2011 14:16, Sven Barth wrote:
On 12.10.2011 14:16, Michael Schnell wrote:
On 10/12/2011 12:13 PM, Hans-Peter Diettrich wrote:
Delphi allows for RawByteStrings with encoding 0. When assigned to an
AnsiString, the string encoding still is zero, both variables seem to
point to the sa
On 12/10/2011 11:47, Martin Schreiber wrote:
> idea. Have a look at Firemonkey and you know what I mean. ;-)
For those unfamiliar with Firemonkey, would you mind explaining further.
...but over all, I do agree with your statement, that FPC shouldn't
follow Delphi blindly. Delphi and VCL is Windo
On 10/12/2011 02:09 PM, Sven Barth wrote:
Basically both, as both rely on and use the fact that "AnsiString[i] =
AnsiChar" and "SizeOf(AnsiChar) = 1" and also "UnicodeString[i] =
UnicodeChar" and "SizeOf(UnicodeChar) = 2".
Yep.
But what I wanted to ask is what happens, if I disregard this, e
On 10/12/2011 02:17 PM, Sven Barth wrote:
The pointing to the data array (managed by "lazy copy and reference
counting" features) is independent from the encoding ID, that is part of
the string management record and not of the data array.
Wrong. Both reference counting and code page are part o
On 12.10.2011 14:17, Graeme Geldenhuys wrote:
As for your statement regarding "do we need Unicode support everywhere?"
Well, with Delphi 2009's Unicode support, the Delphi language now
supports Unicode too. Thus unit names, class names, property names,
variable names etc can all contain Unicode
On 12.10.2011 14:26, Michael Schnell wrote:
On 10/12/2011 02:17 PM, Sven Barth wrote:
The pointing to the data array (managed by "lazy copy and reference
counting" features) is independent from the encoding ID, that is part of
the string management record and not of the data array.
Wrong. B
On 12.10.2011 14:24, Michael Schnell wrote:
On 10/12/2011 02:09 PM, Sven Barth wrote:
Basically both, as both rely on and use the fact that "AnsiString[i] =
AnsiChar" and "SizeOf(AnsiChar) = 1" and also "UnicodeString[i] =
UnicodeChar" and "SizeOf(UnicodeChar) = 2".
Yep.
But what I wanted t
On 12 Oct 2011, at 14:17, Graeme Geldenhuys wrote:
eg: UTF-8 as native string type under *nix systems, and
UTF-16 under Windows. Why must some platforms get a speed penalty and
others not, when you force only one encoding on all platforms?
The reason for doing so would be to make code more ea
On Wednesday 12 October 2011 14.17:57 Graeme Geldenhuys wrote:
>For those unfamiliar with Firemonkey, would you mind explaining further.
Read here for example:
https://forums.embarcadero.com/forum.jspa?forumID=380
> As for your statement regarding "do we need Unicode support everywhere?"
> Well
On 10/12/2011 02:28 PM, Sven Barth wrote:
There will be a conversion.
Meaning:
- when it is a var parameter, an error message is issued.
- when it is a value parameter: conversion is called
- type cast will do a conversion
- assignment will do a conversion (at least if the target encoding
On 12.10.2011 14:40, Michael Schnell wrote:
On 10/12/2011 02:28 PM, Sven Barth wrote:
There will be a conversion.
Meaning:
- when it is a var parameter, an error message is issued.
- when it is a value parameter: conversion is called
- type cast will do a conversion
- assignment will do a co
On 10/12/2011 11:45 AM, Sven Barth wrote:
RawByteString is special, because any other string can be assigned to
it WITHOUT conversion, but the code page of the assigned string will
be kept, so one can still check which code page the original string had.
Ooops, so there is not encoding ID "RA
On 12.10.2011 14:47, Michael Schnell wrote:
On 10/12/2011 11:45 AM, Sven Barth wrote:
RawByteString is special, because any other string can be assigned to
it WITHOUT conversion, but the code page of the assigned string will
be kept, so one can still check which code page the original string
12.10.2011 16:34, Hans-Peter Diettrich wrote:
Alex Shishkin wrote:
1) Why is UTF8String made incompatible with AnsiString(CP_UTF8)
( UTF8String = type AnsiString(CP_UTF8); )? Why not an alias?
An alias allows assigning strings of *any* encoding, with possibly fatal
consequences. A strict UTF8St
On 12.10.2011 14:16, Michael Schnell wrote:
On 10/12/2011 12:13 PM, Hans-Peter Diettrich wrote:
Delphi allows for RawByteStrings with encoding 0. When assigned to an
AnsiString, the string encoding still is zero, both variables seem to
point to the same string data.
The pointing to the data
On 12.10.2011 14:42, Alex Shishkin wrote:
12.10.2011 16:34, Hans-Peter Diettrich wrote:
Alex Shishkin wrote:
1) Why is UTF8String made incompatible with AnsiString(CP_UTF8)
( UTF8String = type AnsiString(CP_UTF8); )? Why not an alias?
An alias allows assigning strings of *any* encoding, with
On 10/12/2011 11:33 AM, Alex Shishkin wrote:
...
While I feel that something like this might be a(nother) decent way to
handle Unicode strings, obviously different decisions have been taken
because
1) Pascal is supposed to be a strictly typed language. This asks for
being able to use distinct
12.10.2011 16:52, Michael Schnell wrote:
On 10/12/2011 11:33 AM, Alex Shishkin wrote:
...
While I feel that something like this might be a(nother) decent way to
handle Unicode strings, obviously different decisions have been taken
because
1) Pascal is supposed to be a strictly typed language. T
On 12/10/2011 14:38, Martin Schreiber wrote:
>
> Read here for example:
Thanks for the link.
> Is this desirable? What is the benefit of non ASCII Pascal identifiers at the
> expense of performance and simplicity?
No idea if it is desirable - probably not, when it is a global open
source proj
In fact this seems to avoid the "nice" feature of the current
(Delphi-like) Implementation.
Here seemingly inconsistent (or rather "intersexual") strings are
provided for: perfectly encoded and distinguishable data that suffers
from being locked in a variable that happens to be of a type with
On 10/12/2011 02:39 PM, Sven Barth wrote:
I'd say yes (the only point I'm really unsure about is "var"-arguments).
I did this list according to what I expect regarding different
numerical types (like integer and real) or two really different string
types (like short string and long string)
On 12.10.2011 15:35, Michael Schnell wrote:
On 10/12/2011 02:39 PM, Sven Barth wrote:
I'd say yes (the only point I'm really unsure about is "var"-arguments).
I did this list according to what I expect regarding different
numerical types (like integer and real) or two really different str
On 10/12/2011 04:23 PM, Sven Barth wrote:
There was some discussion about how to handle var parameters, but I
don't remember the outcome anymore. AFAIK Delphi issues a compile
error (I don't know for sure though).
Options are:
- compiler error
- compiler warning
- runtime exception
- con
On 12.10.2011 16:46, Michael Schnell wrote:
On 10/12/2011 04:23 PM, Sven Barth wrote:
There was some discussion about how to handle var parameters, but I
don't remember the outcome anymore. AFAIK Delphi issues a compile
error (I don't know for sure though).
Options are:
- compiler error
- c
Thanks !
-Michael
On Wednesday 12 October 2011 14.32:38 Jonas Maebe wrote:
> On 12 Oct 2011, at 14:17, Graeme Geldenhuys wrote:
> > eg: UTF-8 as native string type under *nix systems, and
> > UTF-16 under Windows. Why must some platforms get a speed penalty and
> > others not, when you force only one encoding on all
Michael Schnell wrote:
On 10/12/2011 02:28 PM, Sven Barth wrote:
There will be a conversion.
Meaning:
- when it is a var parameter, an error message is issued.
- when it is a value parameter: conversion is called
- type cast will do a conversion
Correct, so far.
- assignment will do
Alex Shishkin wrote:
12.10.2011 16:34, Hans-Peter Diettrich wrote:
Alex Shishkin wrote:
1) Why is UTF8String made incompatible with AnsiString(CP_UTF8)
( UTF8String = type AnsiString(CP_UTF8); )? Why not an alias?
An alias allows assigning strings of *any* encoding, with possibly fatal
conse
Michael Schnell wrote:
In fact the var parameter case is most interesting regarding new strings.
While in the other cases the system can decide at runtime what to do
(with respect to the encoding ID(s)), with a var parameter the type
names might be used to generate an error message at com
Sven Barth wrote:
http://svn.freepascal.org/cgi-bin/viewvc.cgi/trunk/rtl/inc/astrings.inc?revision=19444&view=markup
I don't understand the use of encoding 0 and CP_NONE in FPC. Can
somebody explain?
Furthermore I suspect that the implementation of
Function Pos(Const Substr : RawByteStri
Graeme Geldenhuys wrote:
On 12/10/2011 11:47, Martin Schreiber wrote:
idea. Have a look at Firemonkey and you know what I mean. ;-)
For those unfamiliar with Firemonkey, would you mind explaining further.
...but over all, I do agree with your statement, that FPC shouldn't
follow Delphi bli
Martin Schreiber wrote:
Well, with Delphi 2009's Unicode support, the Delphi language now
supports Unicode too. Thus unit names, class names, property names,
variable names etc can all contain Unicode text in their names. So yes,
Unicode is required throughout the Object Pascal language, and F
13.10.2011 9:13, Hans-Peter Diettrich wrote:
Sven Barth wrote:
http://svn.freepascal.org/cgi-bin/viewvc.cgi/trunk/rtl/inc/astrings.inc?revision=19444&view=markup
I don't understand the use of encoding 0 and CP_NONE in FPC. Can
somebody explain?
DefaultSystemCodepage is used When the pati
Paul Ishenin wrote:
13.10.2011 9:13, Hans-Peter Diettrich wrote:
Sven Barth wrote:
http://svn.freepascal.org/cgi-bin/viewvc.cgi/trunk/rtl/inc/astrings.inc?revision=19444&view=markup
I don't understand the use of encoding 0 and CP_NONE in FPC. Can
somebody explain?
DefaultSystemCodep
13.10.2011 14:57, Hans-Peter Diettrich wrote:
Paul Ishenin wrote:
13.10.2011 9:13, Hans-Peter Diettrich wrote:
Sven Barth wrote:
http://svn.freepascal.org/cgi-bin/viewvc.cgi/trunk/rtl/inc/astrings.inc?revision=19444&view=markup
I don't understand the use of encoding 0 and CP_NONE in F