Bug#1068586: ghc: Broken on arm{el,hf} because of time_t transition
[ Steve added, for the symbol list question ]

On Tue, Apr 09, 2024 at 09:44:43PM +0300, Ilias Tsitsimpis wrote:
> On Tue, Apr 09, 2024 at 08:53PM, Adrian Bunk wrote:
> > On Tue, Apr 09, 2024 at 07:23:29PM +0300, Ilias Tsitsimpis wrote:
> > > I believe it's not. haskell-hourglass used to work on arm{el,hf} just
> > > before the time64 transition.
> >
> > before this transition x32 was the only 32bit architecture with 64bit
> > time_t.
>
> Aha! Didn't know that, thanks for flagging this. I am surprised that we
> hadn't noticed this earlier (as we see now, GHC is completely broken).

I wouldn't call it "completely broken".

I'm too lazy to get exact numbers, but the arm{el,hf} situation is more
like 1000 packages built (a large part with tests) and 10 failed.

>...
> > > We need a way to identify every package that is broken.
> >
> > $ nm -D /usr/lib/ghc/lib/arm-linux-ghc-9.4.7/libHStime-1.12.2-ghc9.4.7.so |
> > grep gettimeofday
> >          U gettimeofday@GLIBC_2.4
> > ...
> >
> > $ nm -D
> > /usr/lib/ghc/lib/arm-linux-ghc-9.4.7/libHSdirectory-1.3.7.1-ghc9.4.7.so |
> > grep utimensat
> > ...
> >          U utimensat@GLIBC_2.6
> >
> > $ nm -D /usr/lib/ghc/lib/arm-linux-ghc-9.4.7/libHStime-1.12.2-ghc9.4.7.so |
> > grep localtime
> >          U __localtime64_r@GLIBC_2.34
>
> Thank you! Can we maybe run this against all Haskell libraries and
> identify the ones that are currently broken? Maybe we have such a script
> already for the time64 transition.

Steve, do you have a list of all bad symbols for the time_t transition?

With this list it should be possible to just install all currently
installable Haskell packages on a porterbox and do something like

$ for s in gettimeofday utimensat localtime localtime_r; do for f in /usr/lib/ghc/lib/arm-linux-ghc-9.4.7/*.so /usr/lib/haskell-packages/ghc/lib/arm-linux-ghc-9.4.7/*.so; do nm -D $f | grep $s@ && echo $f; done; done
         U gettimeofday@GLIBC_2.4
/usr/lib/ghc/lib/arm-linux-ghc-9.4.7/libHStime-1.12.2-ghc9.4.7.so
         U utimensat@GLIBC_2.6
/usr/lib/ghc/lib/arm-linux-ghc-9.4.7/libHSdirectory-1.3.7.1-ghc9.4.7.so
         U utimensat@GLIBC_2.6
/usr/lib/ghc/lib/arm-linux-ghc-9.4.7/libHSunix-2.7.3-ghc9.4.7.so
$

That last one is likely
libraries/unix/System/Posix/Files/Common.hsc:foreign import ccall unsafe "utimensat"

> > But the last one is the broken localtime_r one, is anything going wrong
> > with the ccall from TimeZone.hs to cbits there?
> >
> > get_current_timezone_seconds() returning long is wrong,
> > and might contribute to that bug:
> >
> > foreign import ccall unsafe "HsTime.h get_current_timezone_seconds"
> >   get_current_timezone_seconds ::
> >     CTime -> Ptr CInt -> Ptr CString -> IO CLong
> > ...
> > getTimeZoneCTime ctime =
> > ...
> >     secs <- get_current_timezone_seconds ctime pdst pcname
> >
> > I don't know Haskell, but is this declaring ctime as CLong,
> > and then passing it instead of a CTime?
>
> I believe it returns the timezone in seconds (i.e., 3600 for +1 hour
> timezone), so CLong should be large enough to hold this value. My guess
> right now is that it fails due to the bogus CTime value that we pass, we
> need to test this.

Yes, I suspect that this function with
  CTime -> Ptr CInt -> Ptr CString -> IO CLong
gets called as
  CLong -> Ptr CInt -> Ptr CString -> IO CLong
but I'm surprised if Haskell does not catch something like that at
compile time.

> Ilias

cu
Adrian
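The one-line loop in this message could be wrapped into a small reusable script. What follows is only a sketch: the names filter_legacy_syms and scan_dir are made up for illustration, and the symbol list holds just the names that came up in this thread (it is not the complete list of affected symbols for the time_t transition that the message asks Steve for).

```shell
#!/bin/sh
# Assumption: only symbols mentioned in this thread; extend once a full list exists.
LEGACY_SYMS='gettimeofday utimensat localtime localtime_r clock_gettime'

# Read `nm -D` output on stdin and print the undefined-symbol lines that
# reference one of the legacy names. __localtime64_r and similar time64
# variants do not match, because the name must follow "U " directly.
filter_legacy_syms() {
    pattern=$(printf '%s' "$LEGACY_SYMS" | tr ' ' '|')
    grep -E "(^|[[:space:]])U ($pattern)@"
}

# Print each shared object in directory $1 that still references a legacy
# symbol, followed by the matching nm lines.
scan_dir() {
    for f in "$1"/*.so; do
        [ -e "$f" ] || continue
        hits=$(nm -D "$f" 2>/dev/null | filter_legacy_syms) || continue
        printf '%s\n%s\n' "$f" "$hits"
    done
}
```

On a porterbox this might be run as `scan_dir /usr/lib/ghc/lib/arm-linux-ghc-9.4.7`, and again for the haskell-packages directory named in the loop.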
Bug#1068586: ghc: Broken on arm{el,hf} because of time_t transition
On Tue, Apr 09, 2024 at 08:53PM, Adrian Bunk wrote:
> On Tue, Apr 09, 2024 at 07:23:29PM +0300, Ilias Tsitsimpis wrote:
> > I believe it's not. haskell-hourglass used to work on arm{el,hf} just
> > before the time64 transition.
>
> before this transition x32 was the only 32bit architecture with 64bit time_t.

Aha! Didn't know that, thanks for flagging this. I am surprised that we
hadn't noticed this earlier (as we see now, GHC is completely broken).

> > The fix here is to backport (at least) the following patches which
> > change these calls to use the capi calling convention:
> >
> >   *
> > https://gitlab.haskell.org/ghc/packages/time/-/commit/d52314edb138b6ecd7e888c588f83917b0ee2c29
> >   *
> > https://gitlab.haskell.org/ghc/packages/directory/-/commit/f6b288bd96fba5a955d1f73663eb52c1859ee765
>
> This localtime_r issue I mentioned is not covered in the above commit in
> time.

Hmm true. My guess here is that GHC gets a completely bogus CTime value
back from clock_gettime() (we proved that the way we call
clock_gettime() is currently broken) and when it passes this CTime value
to localtime_r() it fails. I haven't verified any of this, just a guess.

> > We need a way to identify every package that is broken.
>
> $ nm -D /usr/lib/ghc/lib/arm-linux-ghc-9.4.7/libHStime-1.12.2-ghc9.4.7.so |
> grep gettimeofday
>          U gettimeofday@GLIBC_2.4
> ...
>
> $ nm -D
> /usr/lib/ghc/lib/arm-linux-ghc-9.4.7/libHSdirectory-1.3.7.1-ghc9.4.7.so |
> grep utimensat
> ...
>          U utimensat@GLIBC_2.6
>
> $ nm -D /usr/lib/ghc/lib/arm-linux-ghc-9.4.7/libHStime-1.12.2-ghc9.4.7.so |
> grep localtime
>          U __localtime64_r@GLIBC_2.34

Thank you! Can we maybe run this against all Haskell libraries and
identify the ones that are currently broken? Maybe we have such a script
already for the time64 transition.

> But the last one is the broken localtime_r one, is anything going wrong
> with the ccall from TimeZone.hs to cbits there?
>
> get_current_timezone_seconds() returning long is wrong,
> and might contribute to that bug:
>
> foreign import ccall unsafe "HsTime.h get_current_timezone_seconds"
>   get_current_timezone_seconds ::
>     CTime -> Ptr CInt -> Ptr CString -> IO CLong
> ...
> getTimeZoneCTime ctime =
> ...
>     secs <- get_current_timezone_seconds ctime pdst pcname
>
> I don't know Haskell, but is this declaring ctime as CLong,
> and then passing it instead of a CTime?

I believe it returns the timezone in seconds (i.e., 3600 for +1 hour
timezone), so CLong should be large enough to hold this value. My guess
right now is that it fails due to the bogus CTime value that we pass, we
need to test this.

-- 
Ilias
Bug#1068586: ghc: Broken on arm{el,hf} because of time_t transition
On Tue, Apr 09, 2024 at 07:23:29PM +0300, Ilias Tsitsimpis wrote:
> Hi Adrian,

Hi Ilias,

> On Tue, Apr 09, 2024 at 12:56PM, Adrian Bunk wrote:
> > FTR, in haskell-hourglass this is #1001686 which already failed on x32
> > for this reason.
>
> I believe it's not. haskell-hourglass used to work on arm{el,hf} just
> before the time64 transition.

before this transition x32 was the only 32bit architecture with 64bit
time_t.

> > My reading of the code is that this happens when
> > libraries/time/lib/cbits/HsTime.c returns 0x8000 due to
> > localtime_r() called in this function returning an error.
> >
> > The "localtime_r failed" is from
> > libraries/time/lib/Data/Time/LocalTime/Internal/TimeZone.hs
>...
> The fix here is to backport (at least) the following patches which
> change these calls to use the capi calling convention:
>
>   *
> https://gitlab.haskell.org/ghc/packages/time/-/commit/d52314edb138b6ecd7e888c588f83917b0ee2c29
>   *
> https://gitlab.haskell.org/ghc/packages/directory/-/commit/f6b288bd96fba5a955d1f73663eb52c1859ee765

This localtime_r issue I mentioned is not covered in the above commit in
time.

>...
> We need a way to identify every package that is broken.

$ nm -D /usr/lib/ghc/lib/arm-linux-ghc-9.4.7/libHStime-1.12.2-ghc9.4.7.so | grep gettimeofday
         U gettimeofday@GLIBC_2.4
...

$ nm -D /usr/lib/ghc/lib/arm-linux-ghc-9.4.7/libHSdirectory-1.3.7.1-ghc9.4.7.so | grep utimensat
...
         U utimensat@GLIBC_2.6

$ nm -D /usr/lib/ghc/lib/arm-linux-ghc-9.4.7/libHStime-1.12.2-ghc9.4.7.so | grep localtime
         U __localtime64_r@GLIBC_2.34

But the last one is the broken localtime_r one, is anything going wrong
with the ccall from TimeZone.hs to cbits there?

get_current_timezone_seconds() returning long is wrong,
and might contribute to that bug:

foreign import ccall unsafe "HsTime.h get_current_timezone_seconds"
  get_current_timezone_seconds ::
    CTime -> Ptr CInt -> Ptr CString -> IO CLong
...
getTimeZoneCTime ctime =
...
    secs <- get_current_timezone_seconds ctime pdst pcname

I don't know Haskell, but is this declaring ctime as CLong,
and then passing it instead of a CTime?

> Ilias

cu
Adrian
Bug#1068586: ghc: Broken on arm{el,hf} because of time_t transition
Hi Adrian,

On Tue, Apr 09, 2024 at 12:56PM, Adrian Bunk wrote:
> FTR, in haskell-hourglass this is #1001686 which already failed on x32
> for this reason.

I believe it's not. haskell-hourglass used to work on arm{el,hf} just
before the time64 transition.

> My reading of the code is that this happens when
> libraries/time/lib/cbits/HsTime.c returns 0x8000 due to
> localtime_r() called in this function returning an error.
>
> The "localtime_r failed" is from
> libraries/time/lib/Data/Time/LocalTime/Internal/TimeZone.hs

The problem is that GHC uses the ccall calling convention in order to
call clock_gettime() [1]. GHC assumes time_t to be 64-bits, but ends up
calling the old 32-bits variant of clock_gettime(), and not the new one
__clock_gettime64(). Here is more information about GHC's FFI calling
conventions[2]. Here is also an upstream issue about this[3].

[1] https://gitlab.haskell.org/ghc/packages/time/-/blob/baab563ee2ce547f7b7f7e7069ed09db2d406941/lib/Data/Time/Clock/Internal/CTimespec.hsc#L30
[2] https://www.haskell.org/ghc/blog/20210709-capi-usage.html
[3] https://github.com/haskell/directory/pull/145

The fix here is to backport (at least) the following patches which
change these calls to use the capi calling convention:

  * https://gitlab.haskell.org/ghc/packages/time/-/commit/d52314edb138b6ecd7e888c588f83917b0ee2c29
  * https://gitlab.haskell.org/ghc/packages/directory/-/commit/f6b288bd96fba5a955d1f73663eb52c1859ee765

Other Haskell libraries may have the same bug as GHC if they are calling
directly the C functions using the ccall calling convention. An example
is haskell-hourglass, which needs to be patched as well:

  * https://github.com/vincenthz/hs-hourglass/blob/36bd2e6d5d0eb316532f13285d1c533d6da297ef/Data/Hourglass/Internal/Unix.hs#L82

We need a way to identify every package that is broken.

-- 
Ilias
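One possible source-level way to look for candidates is to grep unpacked package sources for `foreign import ccall` declarations that name a time-related C function. This is a rough sketch, not an established tool: the function name find_ccall_time_imports is hypothetical, the list of C functions contains only those mentioned in this thread, GNU grep is assumed, and declarations whose C name sits on a different line than `foreign import ccall` will be missed.

```shell
#!/bin/sh
# Assumption: not exhaustive, only C functions named in this thread.
TIME_FUNCS='clock_gettime|gettimeofday|utimensat|localtime_r'

# $1: root of an unpacked source tree. Prints file:line:text for every
# single-line ccall import of one of the functions above, with or without
# a header prefix such as "time.h clock_gettime".
find_ccall_time_imports() {
    grep -rnE --include='*.hs' --include='*.hsc' \
        "foreign import ccall.*\"([^\"]* )?($TIME_FUNCS)\"" "$1"
}
```

A hit only flags a declaration worth reviewing; whether the package actually misbehaves still has to be checked on arm{el,hf}.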
Bug#1068586: ghc: Broken on arm{el,hf} because of time_t transition
On Sun, Apr 07, 2024 at 05:18:06PM +0300, Ilias Tsitsimpis wrote:
> Package: ghc
> Version: 9.4.7-3
> Severity: grave
> Justification: renders package unusable
>
> I recently uploaded a new version of GHC to unstable, in order to fix
> #1068179. As a result, GHC got rebuilt taking into account the new size
> for time_t on arm{el,hf}. This is evident from the build logs, where
> we now see:
>
>
> https://buildd.debian.org/status/fetch.php?pkg=ghc&arch=armel&ver=9.4.7-4&stamp=1712410679&raw=0
>
>   checking Haskell type for time_t... Int64
>
> whereas previously we had:
>
>
> https://buildd.debian.org/status/fetch.php?pkg=ghc&arch=armel&ver=9.4.7-3&stamp=1708366014&raw=0
>
>   checking Haskell type for time_t... Int32
>
>
> After this change, a number of Haskell packages have started to FTBFS:
>
>   *
> https://buildd.debian.org/status/fetch.php?pkg=haskell-filestore&arch=armel&ver=0.6.5-3%2Bb2&stamp=1712457355&raw=0
>   *
> https://buildd.debian.org/status/fetch.php?pkg=haskell-fold-debounce&arch=armel&ver=0.2.0.11-1%2Bb2&stamp=1712466208&raw=0
>   *
> https://buildd.debian.org/status/fetch.php?pkg=haskell-hourglass&arch=armel&ver=0.2.12-5%2Bb2&stamp=1712462130&raw=0

FTR, in haskell-hourglass this is #1001686 which already failed on x32
for this reason.

> Looking into this, I see that (at least) the getPOSIXTime method is
> broken on arm{el,hf}.
>...

https://buildd.debian.org/status/fetch.php?pkg=haskell-lambdahack&arch=armhf&ver=0.11.0.0-4%2Bb1&stamp=1712562043&raw=0
https://buildd.debian.org/status/fetch.php?pkg=haskell-hledger-lib&arch=armhf&ver=1.30-1%2Bb1&stamp=1712566894&raw=0

      Exception: user error (localtime_r failed)

My reading of the code is that this happens when
libraries/time/lib/cbits/HsTime.c returns 0x8000 due to
localtime_r() called in this function returning an error.

The "localtime_r failed" is from
libraries/time/lib/Data/Time/LocalTime/Internal/TimeZone.hs

> Ilias

cu
Adrian
Bug#1068586: ghc: Broken on arm{el,hf} because of time_t transition
Package: ghc
Version: 9.4.7-3
Severity: grave
Justification: renders package unusable

I recently uploaded a new version of GHC to unstable, in order to fix
#1068179. As a result, GHC got rebuilt taking into account the new size
for time_t on arm{el,hf}. This is evident from the build logs, where
we now see:

  https://buildd.debian.org/status/fetch.php?pkg=ghc&arch=armel&ver=9.4.7-4&stamp=1712410679&raw=0

  checking Haskell type for time_t... Int64

whereas previously we had:

  https://buildd.debian.org/status/fetch.php?pkg=ghc&arch=armel&ver=9.4.7-3&stamp=1708366014&raw=0

  checking Haskell type for time_t... Int32


After this change, a number of Haskell packages have started to FTBFS:

  * https://buildd.debian.org/status/fetch.php?pkg=haskell-filestore&arch=armel&ver=0.6.5-3%2Bb2&stamp=1712457355&raw=0
  * https://buildd.debian.org/status/fetch.php?pkg=haskell-fold-debounce&arch=armel&ver=0.2.0.11-1%2Bb2&stamp=1712466208&raw=0
  * https://buildd.debian.org/status/fetch.php?pkg=haskell-hourglass&arch=armel&ver=0.2.12-5%2Bb2&stamp=1712462130&raw=0

Looking into this, I see that (at least) the getPOSIXTime method is
broken on arm{el,hf}. Compiling the following program on armel:

  $ cat Time.hs
  import Data.Time.Clock.POSIX

  main = do
    t <- getPOSIXTime
    print t

  $ ghc -o time Time.hs
  $ ./time
  3590884976642664859s

whereas on an amd64 system it returns:

  $ ./time
  1712499127.06215219s

This bug blocks the time_t transition (#1036884).

-- 
Ilias
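The armel/amd64 comparison in this report can be turned into a rough smoke test by checking the program's output against `date +%s`. This is a sketch under assumptions: plausible_posix_time is a made-up helper name, the 60-second default tolerance is arbitrary, and the input is expected in the `<seconds>[.<fraction>]s` form that the `./time` program prints.

```shell
#!/bin/sh
# $1: value printed by the test program, e.g. "1712499127.06215219s"
# $2: allowed difference from the system clock in seconds (default 60)
# Returns 0 if plausible, 1 if implausible, 2 if the input is not numeric.
plausible_posix_time() {
    secs=${1%s}          # drop the trailing "s"
    secs=${secs%%.*}     # keep only the integer part
    case $secs in
        ''|*[!0-9]*) return 2 ;;
    esac
    # More than 12 digits cannot be a sane epoch value; rejecting it here
    # also keeps the shell arithmetic below from overflowing.
    [ "${#secs}" -le 12 ] || return 1
    now=$(date +%s)
    tol=${2:-60}
    diff=$((secs - now))
    [ "$diff" -ge 0 ] || diff=$((0 - diff))
    [ "$diff" -le "$tol" ]
}
```

For instance, `plausible_posix_time "$(./time)"` would be expected to fail for the armel output quoted in this report, since that value has 19 digits.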