Bug#1068586: ghc: Broken on arm{el,hf} because of time_t transition

2024-04-09 Thread Adrian Bunk
[ Steve added, for the symbol list question ]

On Tue, Apr 09, 2024 at 09:44:43PM +0300, Ilias Tsitsimpis wrote:
> On Tue, Apr 09, 2024 at 08:53PM, Adrian Bunk wrote:
> > On Tue, Apr 09, 2024 at 07:23:29PM +0300, Ilias Tsitsimpis wrote:
> > > I believe it's not. haskell-hourglass used to work on arm{el,hf} just
> > > before the time64 transition.
> > 
> > before this transition x32 was the only 32bit architecture with 64bit 
> > time_t.
> 
> Aha! Didn't know that, thanks for flagging this. I am surprised that we
> hadn't noticed this earlier (as we see now, GHC is completely broken).

I wouldn't call it "completely broken".

I'm too lazy to get exact numbers, but the arm{el,hf} situation is more 
like 1000 packages built (a large part with tests) and 10 failed.

>...
> > > We need a way to identify every package that is broken.
> > 
> > $ nm -D /usr/lib/ghc/lib/arm-linux-ghc-9.4.7/libHStime-1.12.2-ghc9.4.7.so | grep gettimeofday
> >  U gettimeofday@GLIBC_2.4
> > ...
> > 
> > $ nm -D /usr/lib/ghc/lib/arm-linux-ghc-9.4.7/libHSdirectory-1.3.7.1-ghc9.4.7.so | grep utimensat
> > ...
> >  U utimensat@GLIBC_2.6
> > 
> > $ nm -D /usr/lib/ghc/lib/arm-linux-ghc-9.4.7/libHStime-1.12.2-ghc9.4.7.so | grep localtime
> >  U __localtime64_r@GLIBC_2.34
> 
> Thank you! Can we maybe run this against all Haskell libraries and
> identify the ones that are currently broken? Maybe we have such a script
> already for the time64 transition.

Steve, do you have a list of all bad symbols for the time_t transition?

With this list it should be possible to just install all currently 
installable Haskell packages on a porterbox and do something like

$ for s in gettimeofday utimensat localtime localtime_r; do \
    for f in /usr/lib/ghc/lib/arm-linux-ghc-9.4.7/*.so \
             /usr/lib/haskell-packages/ghc/lib/arm-linux-ghc-9.4.7/*.so; do \
      nm -D "$f" | grep "$s@" && echo "$f"; \
    done; \
  done
 U gettimeofday@GLIBC_2.4
/usr/lib/ghc/lib/arm-linux-ghc-9.4.7/libHStime-1.12.2-ghc9.4.7.so
 U utimensat@GLIBC_2.6
/usr/lib/ghc/lib/arm-linux-ghc-9.4.7/libHSdirectory-1.3.7.1-ghc9.4.7.so
 U utimensat@GLIBC_2.6
/usr/lib/ghc/lib/arm-linux-ghc-9.4.7/libHSunix-2.7.3-ghc9.4.7.so
$

That last one is likely
libraries/unix/System/Posix/Files/Common.hsc:foreign import ccall unsafe "utimensat"

> > But the last one is the broken localtime_r one, is anything going wrong 
> > with the ccall from TimeZone.hs to cbits there?
> > 
> > get_current_timezone_seconds() returning long is wrong,
> > and might contribute to that bug:
> > 
> >   foreign import ccall unsafe "HsTime.h get_current_timezone_seconds"
> >       get_current_timezone_seconds ::
> >           CTime -> Ptr CInt -> Ptr CString -> IO CLong
> >   ...
> >   getTimeZoneCTime ctime =
> >   ...
> >       secs <- get_current_timezone_seconds ctime pdst pcname
> > 
> > I don't know Haskell, but is this declaring ctime as CLong,
> > and then passing it instead of a CTime?
> 
> I believe it returns the timezone offset in seconds (e.g., 3600 for a
> +1 hour timezone), so CLong should be large enough to hold this value.
> My guess right now is that it fails because of the bogus CTime value
> that we pass; we need to test this.

Yes, I suspect that this function with
  CTime -> Ptr CInt -> Ptr CString -> IO CLong
gets called as
  CLong -> Ptr CInt -> Ptr CString -> IO CLong
but I would be surprised if Haskell did not catch something like that
at compile time.

> Ilias

cu
Adrian



Bug#1068586: ghc: Broken on arm{el,hf} because of time_t transition

2024-04-09 Thread Ilias Tsitsimpis
On Tue, Apr 09, 2024 at 08:53PM, Adrian Bunk wrote:
> On Tue, Apr 09, 2024 at 07:23:29PM +0300, Ilias Tsitsimpis wrote:
> > I believe it's not. haskell-hourglass used to work on arm{el,hf} just
> > before the time64 transition.
> 
> before this transition x32 was the only 32bit architecture with 64bit time_t.

Aha! Didn't know that, thanks for flagging this. I am surprised that we
hadn't noticed this earlier (as we see now, GHC is completely broken).

> > The fix here is to backport (at least) the following patches which
> > change these calls to use the capi calling convention:
> > 
> >   * https://gitlab.haskell.org/ghc/packages/time/-/commit/d52314edb138b6ecd7e888c588f83917b0ee2c29
> >   * https://gitlab.haskell.org/ghc/packages/directory/-/commit/f6b288bd96fba5a955d1f73663eb52c1859ee765
> 
> This localtime_r issue I mentioned is not covered in the above commit in 
> time.

Hmm true. My guess here is that GHC gets a completely bogus CTime value
back from clock_gettime() (we proved that the way we call
clock_gettime() is currently broken) and when it passes this CTime value
to localtime_r() it fails. I haven't verified any of this, just a guess.

> > We need a way to identify every package that is broken.
> 
> $ nm -D /usr/lib/ghc/lib/arm-linux-ghc-9.4.7/libHStime-1.12.2-ghc9.4.7.so | grep gettimeofday
>  U gettimeofday@GLIBC_2.4
> ...
> 
> $ nm -D /usr/lib/ghc/lib/arm-linux-ghc-9.4.7/libHSdirectory-1.3.7.1-ghc9.4.7.so | grep utimensat
> ...
>  U utimensat@GLIBC_2.6
> 
> $ nm -D /usr/lib/ghc/lib/arm-linux-ghc-9.4.7/libHStime-1.12.2-ghc9.4.7.so | grep localtime
>  U __localtime64_r@GLIBC_2.34

Thank you! Can we maybe run this against all Haskell libraries and
identify the ones that are currently broken? Maybe we have such a script
already for the time64 transition.

> But the last one is the broken localtime_r one, is anything going wrong 
> with the ccall from TimeZone.hs to cbits there?
> 
> get_current_timezone_seconds() returning long is wrong,
> and might contribute to that bug:
> 
>   foreign import ccall unsafe "HsTime.h get_current_timezone_seconds"
>       get_current_timezone_seconds ::
>           CTime -> Ptr CInt -> Ptr CString -> IO CLong
>   ...
>   getTimeZoneCTime ctime =
>   ...
>       secs <- get_current_timezone_seconds ctime pdst pcname
> 
> I don't know Haskell, but is this declaring ctime as CLong,
> and then passing it instead of a CTime?

I believe it returns the timezone offset in seconds (e.g., 3600 for a
+1 hour timezone), so CLong should be large enough to hold this value.
My guess right now is that it fails because of the bogus CTime value
that we pass; we need to test this.

-- 
Ilias



Bug#1068586: ghc: Broken on arm{el,hf} because of time_t transition

2024-04-09 Thread Adrian Bunk
On Tue, Apr 09, 2024 at 07:23:29PM +0300, Ilias Tsitsimpis wrote:
> Hi Adrian,

Hi Ilias,

> On Tue, Apr 09, 2024 at 12:56PM, Adrian Bunk wrote:
> > FTR, in haskell-hourglass this is #1001686 which already failed on x32 
> > for this reason.
> 
> I believe it's not. haskell-hourglass used to work on arm{el,hf} just
> before the time64 transition.

before this transition x32 was the only 32bit architecture with 64bit time_t.

> > My reading of the code is that this happens when 
> > libraries/time/lib/cbits/HsTime.c returns 0x8000 due to 
> > localtime_r() called in this function returning an error.
> > 
> > The "localtime_r failed" is from
> > libraries/time/lib/Data/Time/LocalTime/Internal/TimeZone.hs
>...
> The fix here is to backport (at least) the following patches which
> change these calls to use the capi calling convention:
> 
>   * https://gitlab.haskell.org/ghc/packages/time/-/commit/d52314edb138b6ecd7e888c588f83917b0ee2c29
>   * https://gitlab.haskell.org/ghc/packages/directory/-/commit/f6b288bd96fba5a955d1f73663eb52c1859ee765

This localtime_r issue I mentioned is not covered in the above commit in 
time.

>...
> We need a way to identify every package that is broken.

$ nm -D /usr/lib/ghc/lib/arm-linux-ghc-9.4.7/libHStime-1.12.2-ghc9.4.7.so | grep gettimeofday
 U gettimeofday@GLIBC_2.4
...

$ nm -D /usr/lib/ghc/lib/arm-linux-ghc-9.4.7/libHSdirectory-1.3.7.1-ghc9.4.7.so | grep utimensat
...
 U utimensat@GLIBC_2.6

$ nm -D /usr/lib/ghc/lib/arm-linux-ghc-9.4.7/libHStime-1.12.2-ghc9.4.7.so | grep localtime
 U __localtime64_r@GLIBC_2.34


But the last one is the broken localtime_r one, is anything going wrong 
with the ccall from TimeZone.hs to cbits there?

get_current_timezone_seconds() returning long is wrong,
and might contribute to that bug:

  foreign import ccall unsafe "HsTime.h get_current_timezone_seconds"
      get_current_timezone_seconds ::
          CTime -> Ptr CInt -> Ptr CString -> IO CLong
  ...
  getTimeZoneCTime ctime =
  ...
      secs <- get_current_timezone_seconds ctime pdst pcname

I don't know Haskell, but is this declaring ctime as CLong,
and then passing it instead of a CTime?


> Ilias

cu
Adrian



Bug#1068586: ghc: Broken on arm{el,hf} because of time_t transition

2024-04-09 Thread Ilias Tsitsimpis
Hi Adrian,

On Tue, Apr 09, 2024 at 12:56PM, Adrian Bunk wrote:
> FTR, in haskell-hourglass this is #1001686 which already failed on x32 
> for this reason.

I believe it's not. haskell-hourglass used to work on arm{el,hf} just
before the time64 transition.

> My reading of the code is that this happens when 
> libraries/time/lib/cbits/HsTime.c returns 0x8000 due to 
> localtime_r() called in this function returning an error.
> 
> The "localtime_r failed" is from
> libraries/time/lib/Data/Time/LocalTime/Internal/TimeZone.hs

The problem is that GHC uses the ccall calling convention to call
clock_gettime() [1]. GHC assumes time_t to be 64 bits wide, but ends up
calling the old 32-bit variant of clock_gettime() rather than the new
__clock_gettime64().

More information about GHC's FFI calling conventions is in [2], and
there is an upstream pull request about this in [3].

[1] https://gitlab.haskell.org/ghc/packages/time/-/blob/baab563ee2ce547f7b7f7e7069ed09db2d406941/lib/Data/Time/Clock/Internal/CTimespec.hsc#L30
[2] https://www.haskell.org/ghc/blog/20210709-capi-usage.html
[3] https://github.com/haskell/directory/pull/145

The fix here is to backport (at least) the following patches which
change these calls to use the capi calling convention:

  * https://gitlab.haskell.org/ghc/packages/time/-/commit/d52314edb138b6ecd7e888c588f83917b0ee2c29
  * https://gitlab.haskell.org/ghc/packages/directory/-/commit/f6b288bd96fba5a955d1f73663eb52c1859ee765

Other Haskell libraries may have the same bug as GHC if they call C
functions directly using the ccall calling convention. One example is
haskell-hourglass, which needs to be patched as well:

  * https://github.com/vincenthz/hs-hourglass/blob/36bd2e6d5d0eb316532f13285d1c533d6da297ef/Data/Hourglass/Internal/Unix.hs#L82
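
A rough way to find such call sites in an unpacked source tree is to
grep for ccall imports. This is only a sketch: the function name
find_ccall_imports is made up for illustration, it relies on GNU grep's
--include option, and it lists every ccall import, so each hit still
has to be checked by hand for arguments or results that depend on
time_t.

```shell
#!/bin/sh
# Hypothetical helper (not part of any Debian tooling): list every
# "foreign import ccall" line under the given source directories.
find_ccall_imports() {
    # -r recurse, -n show line numbers, -E extended regex;
    # --include limits the search to Haskell and hsc2hs sources (GNU grep).
    grep -rnE --include='*.hs' --include='*.hsc' \
        'foreign[[:space:]]+import[[:space:]]+ccall' "$@"
}
```

It could be run as "find_ccall_imports libraries/time libraries/unix"
from the top of a GHC source tree. An import whose C name sits on the
following line is still reported, but only the "foreign import" line is
shown.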

We need a way to identify every package that is broken.
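
One possible shape for such a check, based on the nm -D output quoted
elsewhere in this thread, is sketched below. The symbol list is an
assumption (only names mentioned in this thread, certainly incomplete),
and the function names has_legacy_time_syms and scan_time32 are
invented here.

```shell
#!/bin/sh
# Assumed, incomplete list of symbols whose unsuffixed glibc variants
# still use a 32-bit time_t on arm{el,hf}.
LEGACY_SYMS='gettimeofday|utimensat|clock_gettime|localtime|localtime_r'

# Read "nm -D" output on stdin; succeed if it contains an undefined
# reference to one of the legacy symbols (for example
# " U gettimeofday@GLIBC_2.4"). The 64-bit variants such as
# __localtime64_r do not match.
has_legacy_time_syms() {
    grep -Eq "(^|[[:space:]])U[[:space:]]+(${LEGACY_SYMS})@"
}

# Print every shared object in the given directories that still
# imports one of the legacy symbols.
scan_time32() {
    for dir in "$@"; do
        for f in "$dir"/*.so; do
            [ -e "$f" ] || continue   # skip directories without .so files
            if nm -D "$f" 2>/dev/null | has_legacy_time_syms; then
                printf '%s\n' "$f"
            fi
        done
    done
}
```

On a system with the Haskell libraries installed this could be called
as "scan_time32 /usr/lib/ghc/lib/arm-linux-ghc-9.4.7
/usr/lib/haskell-packages/ghc/lib/arm-linux-ghc-9.4.7", and the
resulting file names could be passed to dpkg -S to get package names.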

-- 
Ilias



Bug#1068586: ghc: Broken on arm{el,hf} because of time_t transition

2024-04-09 Thread Adrian Bunk
On Sun, Apr 07, 2024 at 05:18:06PM +0300, Ilias Tsitsimpis wrote:
> Package: ghc
> Version: 9.4.7-3
> Severity: grave
> Justification: renders package unusable
> 
> I recently uploaded a new version of GHC to unstable, in order to fix
> #1068179. As a result, GHC got rebuilt taking into account the new size
> for time_t on arm{el,hf}. This is evident from the build logs, where
> we now see:
> 
>   
>   https://buildd.debian.org/status/fetch.php?pkg=ghc&arch=armel&ver=9.4.7-4&stamp=1712410679&raw=0
> 
>   checking Haskell type for time_t... Int64
> 
> whereas previously we had:
> 
>   
>   https://buildd.debian.org/status/fetch.php?pkg=ghc&arch=armel&ver=9.4.7-3&stamp=1708366014&raw=0
> 
>   checking Haskell type for time_t... Int32
> 
> 
> After this change, a number of Haskell packages have started to FTBFS:
> 
> * https://buildd.debian.org/status/fetch.php?pkg=haskell-filestore&arch=armel&ver=0.6.5-3%2Bb2&stamp=1712457355&raw=0
> * https://buildd.debian.org/status/fetch.php?pkg=haskell-fold-debounce&arch=armel&ver=0.2.0.11-1%2Bb2&stamp=1712466208&raw=0
> * https://buildd.debian.org/status/fetch.php?pkg=haskell-hourglass&arch=armel&ver=0.2.12-5%2Bb2&stamp=1712462130&raw=0

FTR, in haskell-hourglass this is #1001686 which already failed on x32 
for this reason.

> Looking into this, I see that (at least) the getPOSIXTime method is
> broken on arm{el,hf}.
>...

https://buildd.debian.org/status/fetch.php?pkg=haskell-lambdahack&arch=armhf&ver=0.11.0.0-4%2Bb1&stamp=1712562043&raw=0
https://buildd.debian.org/status/fetch.php?pkg=haskell-hledger-lib&arch=armhf&ver=1.30-1%2Bb1&stamp=1712566894&raw=0

  Exception: user error (localtime_r failed)

My reading of the code is that this happens when 
libraries/time/lib/cbits/HsTime.c returns 0x8000 due to 
localtime_r() called in this function returning an error.

The "localtime_r failed" is from
libraries/time/lib/Data/Time/LocalTime/Internal/TimeZone.hs

> Ilias

cu
Adrian



Bug#1068586: ghc: Broken on arm{el,hf} because of time_t transition

2024-04-07 Thread Ilias Tsitsimpis
Package: ghc
Version: 9.4.7-3
Severity: grave
Justification: renders package unusable

I recently uploaded a new version of GHC to unstable, in order to fix
#1068179. As a result, GHC got rebuilt taking into account the new size
for time_t on arm{el,hf}. This is evident from the build logs, where
we now see:

  
  https://buildd.debian.org/status/fetch.php?pkg=ghc&arch=armel&ver=9.4.7-4&stamp=1712410679&raw=0

  checking Haskell type for time_t... Int64

whereas previously we had:

  
  https://buildd.debian.org/status/fetch.php?pkg=ghc&arch=armel&ver=9.4.7-3&stamp=1708366014&raw=0

  checking Haskell type for time_t... Int32


After this change, a number of Haskell packages have started to FTBFS:

* https://buildd.debian.org/status/fetch.php?pkg=haskell-filestore&arch=armel&ver=0.6.5-3%2Bb2&stamp=1712457355&raw=0
* https://buildd.debian.org/status/fetch.php?pkg=haskell-fold-debounce&arch=armel&ver=0.2.0.11-1%2Bb2&stamp=1712466208&raw=0
* https://buildd.debian.org/status/fetch.php?pkg=haskell-hourglass&arch=armel&ver=0.2.12-5%2Bb2&stamp=1712462130&raw=0

Looking into this, I see that (at least) the getPOSIXTime method is
broken on arm{el,hf}. Compiling the following program on armel:

  $ cat Time.hs
  import Data.Time.Clock.POSIX

  main = do
    t <- getPOSIXTime
    print t

  $ ghc -o time Time.hs
  $ ./time
  3590884976642664859s

whereas on an amd64 system it returns:

  $ ./time
  1712499127.06215219s
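
One possible reading of the armel value, not verified on real hardware:
it is what would result if a struct timespec with a 32-bit tv_sec
followed by a 32-bit tv_nsec were read back as a single little-endian
64-bit seconds field. Splitting the number into its two 32-bit halves
with shell arithmetic gives a plausible timestamp and a plausible
nanosecond count. The function name split_u64 is made up for this
illustration, and 64-bit shell arithmetic is assumed.

```shell
#!/bin/sh
# Hypothetical helper: print the low and high 32-bit halves of a
# non-negative 64-bit integer (assumes 64-bit shell arithmetic).
split_u64() {
    v=$1
    low=$(( v & 0xFFFFFFFF ))           # would be tv_sec in a 32-bit timespec
    high=$(( (v >> 32) & 0xFFFFFFFF ))  # would be tv_nsec
    printf '%s %s\n' "$low" "$high"
}

split_u64 3590884976642664859
# prints: 1712499099 836068060
```

The low half, 1712499099, is within a minute of the amd64 result above
(1712499127), and the high half is below 10^9, as a nanosecond field
would be.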

This bug blocks the time_t transition (#1036884).

-- 
Ilias