On 2021-03-08 21:36, Martin Frb via fpc-pascal wrote:
.
.
In the example the index access should have returned a single
codeunit, which was known to be a complete codepoint.
As far as I understand, the unexpected part was that the Unicode
string did not contain the content of the string constant, because the
assignment had caused an encoding conversion to be inserted.
That conversion caused the need for a widestring manager.
Maybe, to help work out when, where, and for what notes/warnings
should/could be produced, those implicit conversions can be broken
down into groups.
I can think of 2 groups already.
1) Conversion due to explicitly declared different encodings.
AnAnsiString := SomeWideString;
AnAsciiString := AnUtf8String; // declared as "type AnsiString(CP_ASCII);"
                               // and "type AnsiString(CP_UTF8);"
Do you mean a compile-time warning? The trouble is that the compiler
wouldn't know whether a real widestring manager would get included in
the final binary when such conversions are encountered. And remember
that the final binary may be compiled at a different time from the
moment when the unit containing such conversions is compiled. In other
words, compile-time warnings would be rather difficult to implement. It
might be possible to raise an error at run time when such conversions are
invoked, but note that technically the conversion may not lead to
incorrect results if the string doesn't contain characters beyond
US-ASCII. In other words, a run-time error might be appropriate only if
the conversion encounters a character it cannot handle. However, adding
such a check would probably slow down processing even for cases when the
strings don't contain any problematic characters.
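
To illustrate the kind of check meant here, a minimal sketch follows. It
assumes FPC 3.0 or later; the type names TAsciiString and TUtf8String and
the helper IsPureAscii are hypothetical and not part of the RTL. The
helper has to scan every byte before the assignment, which is the extra
work that would be paid even for strings with nothing problematic in them.

```pascal
program ConvCheckSketch;
{$mode objfpc}{$H+}

type
  // Hypothetical names for the two declarations quoted above.
  TAsciiString = type AnsiString(CP_ASCII);
  TUtf8String  = type AnsiString(CP_UTF8);

// Hypothetical helper: True if S contains only 7-bit (US-ASCII) bytes.
// RawByteString is used so that passing S triggers no conversion itself.
function IsPureAscii(const S: RawByteString): Boolean;
var
  i: SizeInt;
begin
  Result := True;
  for i := 1 to Length(S) do
    if Ord(S[i]) > 127 then
      Exit(False);
end;

var
  Src: TUtf8String;
  Dst: TAsciiString;
begin
  Src := 'plain text';
  if IsPureAscii(Src) then
    Dst := Src   // implicit CP_UTF8 -> CP_ASCII conversion, harmless here
  else
    WriteLn('conversion would need a real widestring manager');
  WriteLn(Dst);
end.
```

This only shows where such a test would sit and what it costs; it is not
how the RTL conversion routines are actually written.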
2) Conversion where at least one string is not explicitly declared for
a certain codepage.
This should include indirection via $codepage
No, this is not the case. $codepage defines the source file encoding.
The compiler translates the string constants declared this way to a
UTF-16 constant stored within the compiled binary. Specifying $codepage
has no implications on runtime conversions by itself.
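
A minimal sketch of that distinction, assuming FPC 3.x; the program and
variable names are made up. The directive only tells the compiler how to
read the literal in the source file. Whether a conversion happens at run
time depends on the declared types of the variables involved.

```pascal
program CodepageSketch;
{$mode objfpc}{$H+}
{$codepage utf8}   // the source file itself is saved as UTF-8

var
  U: UnicodeString;
  A: AnsiString;
begin
  // The literal is decoded from UTF-8 by the compiler at compile time,
  // so this assignment needs no conversion at run time.
  U := 'Grüße';
  // UnicodeString -> AnsiString: an implicit conversion at run time,
  // carried out by whichever widestring manager is installed.
  A := U;
  WriteLn(Length(U), ' ', Length(A));
end.
```

The second assignment would involve the same run-time conversion with or
without the $codepage directive.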
Then maybe as a first step, a note/warning could be given, if a
constant string is assigned to a variable, and a change of encoding is
needed for this.
- "constant string" here would be any string that does not have a
directly, explicitly declared encoding.
- This could be given, even if the presence/absence of a widestring
manager is not known. Because
Because what?
Obviously, knowing the presence/absence of a widestring manager would
allow the warnings to be refined.
But I guess that comes at a higher price, as each unit when compiled
could only set flags in the ppu (including forwarding flags from used
units).
Compiling the final program would then read which warning flags are
present, and whether any unit flagged the inclusion of a widestring
manager.
Yes, this would indeed be the only possibility.
Tomas