[PATCH] D157547: Arm64EC entry/exit thunks, consolidated.

2023-10-09 Thread Jacek Caban via Phabricator via cfe-commits
jacek added a comment.

In D157547#4653477 , @efriedma wrote:

> How important is that particular pattern?  I think most patterns involving 
> assembly should be covered by some combination of naked functions, and 
> functions defined in separate assembly files, both of which avoid weird 
> questions about what the compiler should do when assembly tries to define a 
> function symbol.

Thank for suggestions. We can indeed change that to naked functions (or 
separated assembly file when that's not suitable). We use full assembly 
declaration on other targets for compatibility reasons (which may be worth 
revisiting if that's still a problem and it's definitely not a concern in 
Arm64EC case).

It may be nice to output some form of error instead of assert crash in those 
cases, but I guess that's a secondary concern at this point.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157547/new/

https://reviews.llvm.org/D157547

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D157547: Arm64EC entry/exit thunks, consolidated.

2023-10-09 Thread Jacek Caban via Phabricator via cfe-commits
jacek added a comment.

It works nice with my testing so far, thanks!

I noticed one problem with code that defines symbols in assembly. It conflicts 
with anti-dependency symbols emitted by codegen and hits an asserts in assembly 
printer. This commit fixes it for me:
https://github.com/cjacek/llvm-project/commit/879b1b6bf99b1e506ae04b8db0b44eb1b2205f19

Here is a test case:

  extern void asmfunc(void);
  __asm__( ".globl \"#asmfunc\"\n"
   "\t.section .text,\"xr\",discard,\"#asmfunc\"\n"
   "\"#asmfunc\":\n\t"
   "nop; ret\n\t" );
  void test(void) { asmfunc(); }


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157547/new/

https://reviews.llvm.org/D157547

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D157547: Arm64EC entry/exit thunks, consolidated.

2023-09-25 Thread Jacek Caban via Phabricator via cfe-commits
jacek added a comment.

In D157547#4650643 , @efriedma wrote:

> Seems to be working for simple cases, but I'm not sure this is actually 
> working properly (I'm still seeing LNK1000 crashes).

I don't know if it's related to crashes that you're seeing, but those 
anti-dependency symbols look different than what MSVC produces. With the patch, 
I get:

$ cat extfunc.c
extern void extfunc(void);
void func(void) { extfunc(); }
$ clang -target arm64ec-windows -c extcall.c -o extcall.o
$ llvm-readobj --symbols extcall.o
...

  Symbol {
Name: #extfunc
Value: 0
Section: IMAGE_SYM_UNDEFINED (0)
BaseType: Null (0x0)
ComplexType: Null (0x0)
StorageClass: WeakExternal (0x69)
AuxSymbolCount: 1
AuxWeakExternal {
  Linked: .weak.#extfunc.default.#func (40)
  Search: AntiDependency (0x4)
}
  }

...

  Symbol {
Name: extfunc
Value: 0
Section: IMAGE_SYM_UNDEFINED (0)
BaseType: Null (0x0)
ComplexType: Null (0x0)
StorageClass: WeakExternal (0x69)
AuxSymbolCount: 1
AuxWeakExternal {
  Linked: #extfunc$exit_thunk (43)
  Search: AntiDependency (0x4)
}
  }

While an equivalent object file produced by MSVC has:

  Symbol {
Name: #extfunc
Value: 0
Section: IMAGE_SYM_UNDEFINED (0)
BaseType: Null (0x0)
ComplexType: Function (0x2)
StorageClass: WeakExternal (0x69)
AuxSymbolCount: 1
AuxWeakExternal {
  Linked: #extfunc$exit_thunk (14)
  Search: AntiDependency (0x4)
}
  }
  Symbol {
Name: extfunc
Value: 0
Section: IMAGE_SYM_UNDEFINED (0)
BaseType: Null (0x0)
ComplexType: Function (0x2)
StorageClass: WeakExternal (0x69)
AuxSymbolCount: 1
AuxWeakExternal {
  Linked: #extfunc (19)
  Search: AntiDependency (0x4)
}
  }

It's the mangled symbol that's linked to guest exit thunk, not unmangled one. 
Also there is no .weak.*.default.* involved.

BTW, I hope to get my WIP lld-link branch into usable state very soon and plan 
to publish it then. I'm able to link real-world code now, but it still requires 
a few hacks that I need to fix to make it work out of the box. Hopefully it 
will make testing and debugging such problems easier.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157547/new/

https://reviews.llvm.org/D157547

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D157547: Arm64EC entry/exit thunks, consolidated.

2023-08-25 Thread Jacek Caban via Phabricator via cfe-commits
jacek added a comment.

I published an initial version of my attempt at documenting this stuff here: 
https://wiki.winehq.org/ARM64ECToolchain.

It's not yet complete, but I hope it's already useful. I plan to extend it in 
the future. If some clarification or more information would be useful, please 
let me know and I will try to improve it or fill the gaps.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157547/new/

https://reviews.llvm.org/D157547

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D157547: Arm64EC entry/exit thunks, consolidated.

2023-08-17 Thread Jacek Caban via Phabricator via cfe-commits
jacek added inline comments.



Comment at: llvm/lib/Target/AArch64/AArch64MCInstLower.cpp:51-54
+// For ARM64EC, symbol lookup in the MSVC linker has limited awareness
+// of ARM64EC mangling ("#"/"$$h"). So object files need to refer to both
+// the mangled and unmangled names of ARM64EC symbols, even if they aren't
+// actually used by any relocations. Emit the necessary references here.

efriedma wrote:
> jacek wrote:
> > I think that mangled weak symbol should link to another kind exit thunk, 
> > not to unmangled symbol directly.
> > 
> > When an extern code symbol is resolved by an unmangled name, it means that 
> > we have ARM64EC->X64 call and we need to use an exit thunk. Linker doesn't 
> > need a special logic for that: on MSVC it seems to be achieved by pointing 
> > the antidependency symbol to yet another exit thunk. That other exit thunk 
> > loads the target exit thunk and X64 (unmangled) symbol into x10, x11 and 
> > uses __os_arm64x_dispatch_icall to call emulator. I guess we may worry 
> > about that later, but in that case maybe it's better not to emit mangled 
> > antidependency symbol at all for now?
> From what I recall, I wrote the code here primarily to deal with issues 
> trying to take the address of a function; even if the function is defined in 
> arm64ec code, the address is the unmangled symbol.  So there's some weirdness 
> there involving symbol lookup even before there's any actual x64 code 
> involved.
> 
> If you have a better idea of how external symbol references are supposed to 
> be emitted, I'd appreciate a brief writeup, since all the information I have 
> comes from trying to read dumpbin dumps of MSVC output.
Yes, when EC code takes an address of a function, it needs to use unmangled 
symbols, because mangled symbols may point to a thunk (in cases when the 
implementation is in X64 or __declspec(hybrid_patchable) is used) and we still 
want an address of the real implementation. Here is an example how MSVC sets it 
up:

$ cat test.c
extern int otherfunc(void);
int myfunc(void) { return otherfunc(); }
$ llvm-readobj --symbols test.obj
...
  Symbol {
Name: #otherfunc$exit_thunk
Value: 0
Section: .wowthk$aa (4)
BaseType: Null (0x0)
ComplexType: Function (0x2)
StorageClass: External (0x2)
AuxSymbolCount: 0
  }
...
  Symbol {
Name: #otherfunc
Value: 0
Section: IMAGE_SYM_UNDEFINED (0)
BaseType: Null (0x0)
ComplexType: Function (0x2)
StorageClass: WeakExternal (0x69)
AuxSymbolCount: 1
AuxWeakExternal {
  Linked: #otherfunc$exit_thunk (14)
  Search: AntiDependency (0x4)
}
  }
  Symbol {
Name: #myfunc
Value: 0
Section: .text$mn (3)
BaseType: Null (0x0)
ComplexType: Function (0x2)
StorageClass: External (0x2)
AuxSymbolCount: 0
  }
  Symbol {
Name: otherfunc
Value: 0
Section: IMAGE_SYM_UNDEFINED (0)
BaseType: Null (0x0)
ComplexType: Function (0x2)
StorageClass: WeakExternal (0x69)
AuxSymbolCount: 1
AuxWeakExternal {
  Linked: #otherfunc (19)
  Search: AntiDependency (0x4)
}
  }
  Symbol {
Name: myfunc
Value: 0
Section: IMAGE_SYM_UNDEFINED (0)
BaseType: Null (0x0)
ComplexType: Function (0x2)
StorageClass: WeakExternal (0x69)
AuxSymbolCount: 1
AuxWeakExternal {
  Linked: #myfunc (21)
  Search: AntiDependency (0x4)
}
  }

In this example myfunc links to #myfunc (and similarly otherfunc->#otherfunc). 
That's enough because even if we pass a pointer of #myfunc to actual X64 code 
which will try to call this address, emulator will have a chance to figure it 
out and use entry thunk as needed.

However, #otherfunc points to #otherfunc$exit_thunk (which then references 
otherfunc and $iexit_thunk$cdecl$i8$v in its implementation). If otherfunc is 
implemented in ARM64EC, #otherfunc symbol will be replaced by the object file 
implementing it. If it will not be replaced, it means that symbol is resolved 
to X64 implementation and the linked thunk will be used to call it.

My information are based on experimentation with MSVC as well, so those are 
only my best guesses. I mostly experimented with it in a context of linker 
support and I have those aspects of linker working in my tree. I will work on 
more comprehensive description of those things.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157547/new/

https://reviews.llvm.org/D157547

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D157547: Arm64EC entry/exit thunks, consolidated.

2023-08-16 Thread Jacek Caban via Phabricator via cfe-commits
jacek added inline comments.



Comment at: llvm/lib/Target/AArch64/AArch64MCInstLower.cpp:51-54
+// For ARM64EC, symbol lookup in the MSVC linker has limited awareness
+// of ARM64EC mangling ("#"/"$$h"). So object files need to refer to both
+// the mangled and unmangled names of ARM64EC symbols, even if they aren't
+// actually used by any relocations. Emit the necessary references here.

I think that mangled weak symbol should link to another kind exit thunk, not to 
unmangled symbol directly.

When an extern code symbol is resolved by an unmangled name, it means that we 
have ARM64EC->X64 call and we need to use an exit thunk. Linker doesn't need a 
special logic for that: on MSVC it seems to be achieved by pointing the 
antidependency symbol to yet another exit thunk. That other exit thunk loads 
the target exit thunk and X64 (unmangled) symbol into x10, x11 and uses 
__os_arm64x_dispatch_icall to call emulator. I guess we may worry about that 
later, but in that case maybe it's better not to emit mangled antidependency 
symbol at all for now?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157547/new/

https://reviews.llvm.org/D157547

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D61670: [RFC] [MinGW] Allow opting out from .refptr stubs

2023-07-25 Thread Jacek Caban via Phabricator via cfe-commits
jacek added a comment.

`-fno-autoimport` sounds good to me. For linker implications, using 
`--disable-auto-import` in this case would seem consistent to me.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61670/new/

https://reviews.llvm.org/D61670

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78508: [Clang] Allow long as size_t printf argument on 32-bit Windows platforms.

2020-05-15 Thread Jacek Caban via Phabricator via cfe-commits
jacek added a comment.

I don't mind dropping the patch, we can mitigate the problem in Wine. My 
understanding is that we could use "pedantic" logic similar to NSInteger in 
checkFormatExpr, please let me know if you'd like something like that.

I still think that using casts in such cases is not the right pattern. For an 
example, if developer knows that SIZE_T is a type that will always be 
compatible with %z, the code:
printf("%zd", size);
is not wrong. Proposed casts would change all such cases to:
printf("%zd", (size_t)size);
Now imagine that I made a mistake and changed type of size to a type of 
different size. In the first variant, clang will issue a warning. In the second 
case, the cast will hide the warning, but I'm probably interested in changing 
the format instead.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78508/new/

https://reviews.llvm.org/D78508



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78508: [Clang] Allow long as size_t printf argument on 32-bit Windows platforms.

2020-04-20 Thread Jacek Caban via Phabricator via cfe-commits
jacek created this revision.
jacek added a reviewer: clang.
Herald added subscribers: cfe-commits, mstorsjo.
Herald added a project: clang.

On 32-bit Windows platform, int and long is mixed all over the place in 
platform headers. The most notable example is SIZE_T, which is unsigned long, 
unlike size_t, which is unsigned int. %I, %z and %j formats do the right thing 
for those types, but clang issues a warning, leaving no good way to print 
SIZE_T (other than disabling warnings or adding useless casts).

GCC supports only %I for mingw targets, but allows both int and long to be used.

This is causing problems when using clang to build Wine PE files.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D78508

Files:
  clang/lib/AST/FormatString.cpp
  clang/lib/AST/PrintfFormatString.cpp
  clang/test/Sema/format-strings-ms.c


Index: clang/test/Sema/format-strings-ms.c
===
--- clang/test/Sema/format-strings-ms.c
+++ clang/test/Sema/format-strings-ms.c
@@ -21,17 +21,35 @@
 void signed_test() {
   short val = 30;
   printf("val = %I64d\n", val); // expected-warning{{format specifies type 
'__int64' (aka 'long long') but the argument has type 'short'}}
+  printf("val = %Id\n", val);
   long long bigval = 30;
   printf("val = %I32d\n", bigval); // expected-warning{{format specifies type 
'__int32' (aka 'int') but the argument has type 'long long'}}
   printf("val = %Id\n", bigval); // expected-warning{{format specifies type 
'__int32' (aka 'int') but the argument has type 'long long'}}
+  int ival = 30;
+  printf("%Id", ival);
+  printf("%zd", ival);
+  printf("%td", ival);
+  long lval = 30;
+  printf("%Id", lval);
+  printf("%zd", lval);
+  printf("%td", lval);
 }
 
 void unsigned_test() {
   unsigned short val = 30;
   printf("val = %I64u\n", val); // expected-warning{{format specifies type 
'unsigned __int64' (aka 'unsigned long long') but the argument has type 
'unsigned short'}}
+  printf("val = %Iu\n", val);
   unsigned long long bigval = 30;
   printf("val = %I32u\n", bigval); // expected-warning{{format specifies type 
'unsigned __int32' (aka 'unsigned int') but the argument has type 'unsigned 
long long'}}
   printf("val = %Iu\n", bigval); // expected-warning{{format specifies type 
'unsigned __int32' (aka 'unsigned int') but the argument has type 'unsigned 
long long'}}
+  unsigned int ival = 30;
+  printf("%Iu", ival);
+  printf("%zu", ival);
+  printf("%tu", ival);
+  unsigned long lval = 30;
+  printf("%Iu", lval);
+  printf("%zu", lval);
+  printf("%tu", lval);
 }
 
 void w_test(wchar_t c, wchar_t *s) {
Index: clang/lib/AST/PrintfFormatString.cpp
===
--- clang/lib/AST/PrintfFormatString.cpp
+++ clang/lib/AST/PrintfFormatString.cpp
@@ -527,8 +527,8 @@
 return ArgType::makeSizeT(ArgType(Ctx.getSignedSizeType(), "ssize_t"));
   case LengthModifier::AsInt3264:
 return Ctx.getTargetInfo().getTriple().isArch64Bit()
-   ? ArgType(Ctx.LongLongTy, "__int64")
-   : ArgType(Ctx.IntTy, "__int32");
+   ? ArgType::makeSizeT(ArgType(Ctx.LongLongTy, "__int64"))
+   : ArgType::makeSizeT(ArgType(Ctx.IntTy, "__int32"));
   case LengthModifier::AsPtrDiff:
 return ArgType::makePtrdiffT(
 ArgType(Ctx.getPointerDiffType(), "ptrdiff_t"));
@@ -562,8 +562,10 @@
 return ArgType::makeSizeT(ArgType(Ctx.getSizeType(), "size_t"));
   case LengthModifier::AsInt3264:
 return Ctx.getTargetInfo().getTriple().isArch64Bit()
-   ? ArgType(Ctx.UnsignedLongLongTy, "unsigned __int64")
-   : ArgType(Ctx.UnsignedIntTy, "unsigned __int32");
+   ? ArgType::makeSizeT(
+ ArgType(Ctx.UnsignedLongLongTy, "unsigned __int64"))
+   : ArgType::makeSizeT(
+ ArgType(Ctx.UnsignedIntTy, "unsigned __int32"));
   case LengthModifier::AsPtrDiff:
 return ArgType::makePtrdiffT(
 ArgType(Ctx.getUnsignedPointerDiffType(), "unsigned ptrdiff_t"));
Index: clang/lib/AST/FormatString.cpp
===
--- clang/lib/AST/FormatString.cpp
+++ clang/lib/AST/FormatString.cpp
@@ -401,8 +401,21 @@
   case BuiltinType::UInt:
 return T == C.IntTy ? Match : NoMatch;
   case BuiltinType::Long:
+// Win32 platform mixes long and int for size_t-like purposes.
+// ssize_t is int, but platform type SSIZE_T is long.
+if ((isSizeT() || isPtrdiffT()) &&
+C.getTargetInfo().getTriple().isOSMSVCRT() &&
+C.getTargetInfo().getTriple().isArch32Bit())
+  return Match;
 return T == C.UnsignedLongTy ? Match : NoMatch;
   case BuiltinType::ULong:
+// Win32 platform mixes long and int for size_t-like