Re: [Development] QStringLiteral is broken(ish) on MSVC (compiler bug?)

2019-03-14 Thread Thiago Macieira
On Thursday, 14 March 2019 13:54:29 PDT NIkolai Marchenko wrote:
> I've posted about this issue (I think) on slack a bit earlier, see
> https://cpplang.slack.com/archives/C29936TQC/p154989901601

For those who can't read it, the suggestion was to use the /utf-8 option to
the compiler (with qmake, CONFIG += utf8_source). But a quick test does not
show correct results. For

  char16_t text1[] = u"" "\u0102";

It produces, without /utf-8 (see https://msvc.godbolt.org/z/EvtKzq):

?text1@@3PA_SA DB '?', 00H, 00H, 00H; text1

And with /utf-8:

?text1@@3PA_SA DB 0c4H, 00H, 01aH, ' ', 00H, 00H; text1

Those two values make no sense. U+0102 is neither 0x003f (question mark) nor 
0x00c4 0x201a ("Ä‚"). This is a clear compiler bug. An interpretation of the 
C++11 standard could say that the translation is correct for the no-/utf-8 
build, but with /utf-8 or /execution-charset:utf-8 it should have produced the 
correct result.
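
For illustration only, here is a minimal sketch that reproduces both byte
sequences. It assumes the compiler first converts the unprefixed literal to
the narrow execution character set (Windows-1252 without /utf-8, UTF-8 with
it) and then widens each resulting byte as if it were Windows-1252. That is
a guess at the mechanism, not something confirmed in this thread, and every
function name below is hypothetical.

```cpp
#include <cassert>
#include <string>

// Encode one Unicode code point as UTF-8 bytes.
std::string toUtf8(char32_t cp)
{
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Windows-1252 differs from Latin-1 only in 0x80..0x9F. Unassigned slots are
// mapped to the C1 control of the same value (an assumption of this sketch).
static const char16_t kCp1252High[32] = {
    0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
    0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178,
};

// Widen a single Windows-1252 byte to a UTF-16 code unit.
char16_t cp1252ToUtf16(unsigned char byte)
{
    return (byte >= 0x80 && byte <= 0x9F) ? kCp1252High[byte - 0x80]
                                          : static_cast<char16_t>(byte);
}

// Narrow a code point to Windows-1252, substituting '?' when it does not fit.
char narrowToCp1252(char32_t cp)
{
    if (cp < 0x80 || (cp >= 0xA0 && cp <= 0xFF))
        return static_cast<char>(cp);
    for (int i = 0; i < 32; ++i)
        if (kCp1252High[i] == cp)
            return static_cast<char>(0x80 + i);
    return '?';
}

// Widen every byte of a narrow string as if it were Windows-1252.
std::u16string widenAsCp1252(const std::string &bytes)
{
    std::u16string out;
    for (char c : bytes)
        out += cp1252ToUtf16(static_cast<unsigned char>(c));
    return out;
}
```

Under these assumptions, narrowing U+0102 to Windows-1252 and widening it
again gives 0x003f, while widenAsCp1252(toUtf8(0x0102)) gives 0x00c4 0x201a,
the same code units as in the two listings above.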

C++11 2.14.5 [lex.string]/13 (now 5.13.5/12 [1]) says:

"If one string-literal has no encoding-prefix, it is treated as a string-
literal of the same encoding-prefix as the other operand."

In table 9:
  u"a" "b"   is the same as   u"ab"
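
The quoted rule can be turned into a compile-time check. The following is a
sketch with hypothetical variable names, not code from Qt. On a compiler
that follows the rule the assertions hold. Going by the listings above, the
affected MSVC builds would be expected to trip them.

```cpp
#include <cassert>

// Adjacent string literals are concatenated, and the operand without an
// encoding-prefix is treated as if it carried the u prefix of its neighbour.
constexpr char16_t concatenated[] = u"" "\u0102";
constexpr char16_t direct[]       = u"\u0102";

// Both arrays should hold one code unit plus the terminating null.
static_assert(sizeof(concatenated) == sizeof(direct),
              "concatenation changed the number of code units");
static_assert(concatenated[0] == 0x0102,
              "U+0102 was not encoded as a single UTF-16 code unit");
static_assert(concatenated[1] == 0,
              "missing terminating null");
```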

[1] http://eel.is/c++draft/lex.string#12
-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel System Software Products



___
Development mailing list
Development@qt-project.org
https://lists.qt-project.org/listinfo/development


Re: [Development] QStringLiteral is broken(ish) on MSVC (compiler bug?)

2019-03-14 Thread NIkolai Marchenko
I've posted about this issue (I think) on slack a bit earlier, see
https://cpplang.slack.com/archives/C29936TQC/p154989901601



Re: [Development] QStringLiteral is broken(ish) on MSVC (compiler bug?)

2019-03-14 Thread Matthew Woehlke
...forgot to mention: the previous mail was meant to include a complete test
case, but I accidentally applied the work-around and saved the file before
sending the message. So, to see the problem, build the attached source
(with VC++), but remove the 'u' prefix from the string literal passed to
QStringLiteral.

-- 
Matthew


[Development] QStringLiteral is broken(ish) on MSVC (compiler bug?)

2019-03-14 Thread Matthew Woehlke
While working on some modernization of my application — in particular,
converting some UTF-8 literals to use QStringLiteral — I noticed a
concerning compiler warning:

  warning C4566: character represented by universal-character-name
  '\u' cannot be represented in the current code page (1252)

After doing some testing, it turns out that, given code like
QStringLiteral("\u269E \U0001f387 \u269F"), MSVC is indeed butchering
the string.

Further investigation shows that the problem seems to be with the
implementation of QStringLiteral. In particular, it appears that the
preprocessor initially sees just the raw string literal without the 'u'
prefix, butchers it, then later "promotes" it to a UTF-16 literal, but
by then the damage has been done.

While this absolutely feels like a compiler bug, it's an *awful* big
gotcha that probably should be documented. Also, is there anything that
Qt can do to work around it? (I know these sorts of macro expansions can
be tricksy...)

Note: the *local* work-around is apparently to include the 'u'
prefix on my own literal; doubling it (`uu"stuff"`) seems to be okay.
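
A sketch of why the work-around is harmless on a conforming compiler. It
assumes the macro does nothing more than put an empty u"" literal in front
of its argument. SKETCH_UNICODE_LITERAL and the other names below are
hypothetical stand-ins, not Qt's actual definitions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the prefixing step: prepend an empty UTF-16
// literal and let the compiler concatenate the two.
#define SKETCH_UNICODE_LITERAL(str) u"" str

// Expands to u"" "..." : the form reported as broken with MSVC.
constexpr char16_t plain[]    = SKETCH_UNICODE_LITERAL("\u269E \U0001f387 \u269F");
// Expands to u"" u"..." : the local work-around with an explicit prefix.
constexpr char16_t prefixed[] = SKETCH_UNICODE_LITERAL(u"\u269E \U0001f387 \u269F");

// Compare two equally sized arrays one code unit at a time.
template <std::size_t N>
constexpr bool sameUnits(const char16_t (&a)[N], const char16_t (&b)[N],
                         std::size_t i = 0)
{
    return i == N ? true : (a[i] == b[i] && sameUnits(a, b, i + 1));
}

// Six code units (U+1F387 needs a surrogate pair) plus the terminating null.
static_assert(sizeof(plain) == 7 * sizeof(char16_t), "unexpected length");
static_assert(sameUnits(plain, prefixed), "the two spellings differ");
static_assert(plain[2] == 0xD83C && plain[3] == 0xDF87,
              "U+1F387 was not encoded as a surrogate pair");
```

Where the assertions hold, the explicit prefix changes nothing. Where the
compiler mangles the unprefixed form, only the `plain` array is affected.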

-- 
Matthew
#include <QApplication>
#include <QLabel>

int main(int argc, char** argv)
{
  QApplication app{argc, argv};
  QLabel label;
  label.setText(QStringLiteral(u"\u269E \U0001f387 \u269F"));
  label.setAlignment(Qt::AlignCenter);

  auto f = label.font();
  f.setPixelSize(256);
  label.setFont(f);

  label.show();
  return app.exec();
}