Hello,

Apologies for the confusion. There was a misunderstanding on our side about the
earlier patch: we had observed Cygwin tests failing in our arm64 environment,
specifically the pthread tests.
Please drop this patch. I will share a new patch that addresses the
pthread test failures on AArch64.

Thanks,
Thirumalai Nagalingam

-----Original Message-----
From: Jeremy Drake <cyg...@jdrake.com> 
Sent: 26 June 2025 00:08
To: Thirumalai Nagalingam <thirumalai.nagalin...@multicorewareinc.com>
Cc: cygwin-patches@cygwin.com
Subject: Re: [PATCH] Aarch64: Fix register load order in `ldp` in commit f4ba145

On Wed, 25 Jun 2025, Thirumalai Nagalingam wrote:

> -      ldp     x0, x10, [x19, #16]  // x0 = stackaddr, x10 = stackbase \n\
> +      ldp     x10, x0, [x19, #24]  // x0 = stackaddr, x10 = stackbase \n\

I am very confused about this.

The struct layout:
struct pthread_wrapper_arg
{
  LPTHREAD_START_ROUTINE func; // +0
  PVOID arg;                   // +8
  PBYTE stackaddr;             // +16
  PBYTE stackbase;             // +24
  PBYTE stacklimit;            // +32
  ULONG guardsize;             // +40
};

Below, you have
           ldp     x19, x0, [x19]       // x19 = func, x0 = arg            \n\
           blr     x19                  // call thread function            \n"

If this works (and it'd be really very obvious if it didn't), ldp loads 64 bits
at the address given and puts them in the first register, then loads 64 bits at
address+8 and puts them in the second register.  So wouldn't this really be

+      ldp     x10, x0, [x19, #24]  // x10 = stackbase, x0 = stacklimit \n\

?

So now you're freeing stacklimit instead of stackbase?  I don't think that's
right.
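
For what it's worth, the offset reasoning above can be checked with a small C
sketch. This is only a model under stated assumptions, not the Cygwin code:
`wrapper_arg_model` is a hypothetical stand-in for struct pthread_wrapper_arg
in which every pointer field is a 64-bit integer and guardsize is 32 bits, and
`ldp_model` is a hypothetical helper that imitates a 64-bit
`ldp rt1, rt2, [base, #imm]` (first register from base+imm, second from
base+imm+8).

```c
#include <assert.h>   /* static_assert */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>   /* uint64_t, uint32_t */
#include <string.h>   /* memcpy */

/* Hypothetical stand-in for struct pthread_wrapper_arg.  Pointer fields are
   modeled as 64-bit integers so the offsets match a 64-bit target no matter
   which host compiles this. */
struct wrapper_arg_model
{
  uint64_t func;        /* +0  */
  uint64_t arg;         /* +8  */
  uint64_t stackaddr;   /* +16 */
  uint64_t stackbase;   /* +24 */
  uint64_t stacklimit;  /* +32 */
  uint32_t guardsize;   /* +40 */
};

/* Compile-time check of the offsets quoted in the struct layout. */
static_assert (offsetof (struct wrapper_arg_model, stackaddr) == 16,
               "stackaddr is expected at +16");
static_assert (offsetof (struct wrapper_arg_model, stackbase) == 24,
               "stackbase is expected at +24");
static_assert (offsetof (struct wrapper_arg_model, stacklimit) == 32,
               "stacklimit is expected at +32");

/* Model of a 64-bit `ldp rt1, rt2, [base, #imm]`:
   rt1 receives the 8 bytes at base+imm, rt2 the 8 bytes at base+imm+8. */
static void
ldp_model (const void *base, size_t imm, uint64_t *rt1, uint64_t *rt2)
{
  const unsigned char *p = (const unsigned char *) base + imm;
  memcpy (rt1, p, sizeof *rt1);
  memcpy (rt2, p + 8, sizeof *rt2);
}
```

In this model, `ldp_model (&w, 16, &x0, &x10)` yields x0 = stackaddr and
x10 = stackbase, while `ldp_model (&w, 24, &x10, &x0)` yields x10 = stackbase
and x0 = stacklimit, which is the concern raised above.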
