Re: New Windows I/O manager in GHC 8.12
Thanks Simon, cheers :)

On Mon, Jul 20, 2020, 15:28 Simon Peyton Jones wrote:

> Tamar, I salute you! This is a big piece of work – thank you!
>
> Simon
RE: New Windows I/O manager in GHC 8.12
Tamar, I salute you! This is a big piece of work – thank you!

Simon

From: ghc-devs On Behalf Of Phyx
Sent: 17 July 2020 16:04
To: ghc-devs@haskell.org Devs
Subject: New Windows I/O manager in GHC 8.12

Hi All,

In case you've missed it, about 150 or so commits were committed to
master yesterday. These commits add WinIO (Windows I/O) to GHC. This is
a new I/O manager that is designed for the native Windows I/O subsystem
instead of relying on the broken posix-ish compatibility layer that MIO
used.

This is one of 3 big patches I have been working on for years now.

So before I continue on why WinIO was made I'll add a TL;DR:

WinIO adds an internal API break compared to previous GHC releases.
That is, the internal code was modified to support a completely
asynchronous I/O system.

What this means is that we have to keep track of the file pointer
offset, which previously was done by the C runtime. This is because in
async I/O you cannot assume the offset to be at any given location.

What does this mean for you? Very little, if you did not use internal
GHC I/O code, in particular if you haven't used Buffer, BufferedIO and
RawIO. If you have, you will have to explicitly add support for
GHC 8.12+.

Because FDs are a Unix concept and don't behave as you would expect on
Windows, the new I/O manager also uses HANDLE instead of FD. This means
that any library that has used the internal GHC Fd type won't work with
WinIO. Luckily the number of libraries that have seems quite low. If
you can, please stick to the external Handle interface for I/O
functions.

The boot libraries have been updated, and in particular process
*requires* the version that is shipped with GHC. Please respect the
version bounds here! I will be writing a migration guide for those that
need to migrate code. The amount of work is usually trivial as Base
provides shims to do most of the common things you would have used Fd
for.

Also if I may make a plea to GHC developers..
Do not add non-trivial implementations in the external exposed modules
(e.g. System.xxx, Data.xxx) but rather add them to internal modules
(GHC.xxx) and re-export them from the external modules. This allows us
to avoid import cycles inside the internal modules :)

--

So why WinIO? Over the years a number of hard to fix issues popped up
on Windows, including proper Unicode console I/O, cooked inputs, and
the ability to cancel I/O requests. This also allows libraries like
Brick to work on Windows without re-inventing the wheel or having to
hide their I/O from the I/O manager.

In order to attempt to do some of these with MIO, layer upon layer of
hacks was added. This means that things sometimes worked, but when
they didn't, the failure was rather unpredictable. Some of the issues
were simply unfixable with MIO. I will be making some posts about how
WinIO works (and also archiving them on the wiki, don't worry :)) but
for now some highlights:

WinIO is 3 years of work, first started by Joey Hess, then picked up
by Mikhail Glushenkov before landing at my feet. While the majority
has been rewritten, their work did provide a great jumping off point,
so thanks! Also thanks to Ben and AndreasK for helping me get it over
the line. As you can imagine I was exhausted by this point :).

Some stats: ~8000 new lines and ~1100 removed ones spread over 130+
commits (sorry, this was the smallest we could get it while not losing
some historical context) and with over 153 files changed, not counting
the changes to boot libraries.

It fixes #18307, #17035, #16917, #15366, #14530, #13516, #13396,
#13359, #12873, #12869, #11394, #10542, #10484, #10477, #9940, #7593,
#7353, #5797, #5305, #4471, #3937, #3081, #12117, #2408, #10956, #2189
(but only on native Windows consoles, so no msys shells) and #806,
which is 14 years old!

WinIO is a dynamic choice, so you can switch between I/O managers
using the RTS flag --io-manager=[native|posix].

On non-Windows native is the same as posix.

The chosen async interface for this implementation is Completion
Ports.

The I/O manager uses a new interface added in Windows Vista called
GetQueuedCompletionStatusEx, which allows us to service multiple
request interrupts in one go.

Some highlights:

* Drops Windows Vista support.
  Vista is out of extended support as of 2017. The new minimum is
  Windows 7. This allows us to use much more efficient OS provided
  abstractions.
* Replace Events and Monitor locks with much faster and more efficient
  Condition Variables and SlimReaderWriterLocks.
* Change GHC's Buffer and I/O structs to support asynchronous
  operation by not relying on the OS managing File Offset.
* Implement a new command line flag +RTS --io-manager=[native|posix]
  to control which I/O manager is used.
* Implement a new Console I/O interface supporting much faster
  reads/writes and Unicode output correctly. Also supports things like
  cooked input etc.
* In new I/O manager if the user still has their code-page set to OEM,
  then we use UTF-8
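
The offset tracking described in the TL;DR and in the Buffer bullet
can be illustrated with a small sketch. This is not WinIO code: it
uses POSIX pread/pwrite purely as a stand-in for Windows overlapped
I/O, and the names tracked_file, tracked_read, tracked_write and
demo_tracked_io are made up for this example. The only point is that
the caller carries the offset with every request instead of relying on
a file pointer kept by the OS or the C runtime.

```c
#define _XOPEN_SOURCE 700   /* expose pread, pwrite and mkstemp */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical buffer-side record: we own the offset, not the OS. */
typedef struct {
    int   fd;      /* underlying descriptor */
    off_t offset;  /* where the next request should start */
} tracked_file;

/* Read up to len bytes at the tracked offset and advance the offset by
 * the number of bytes actually read. Returns that count, 0 at end of
 * file, or -1 on error (offset left untouched). */
static ssize_t tracked_read(tracked_file *f, void *buf, size_t len)
{
    assert(f != NULL);
    ssize_t n = pread(f->fd, buf, len, f->offset);
    if (n > 0)
        f->offset += n;
    return n;
}

/* Same idea for writes: the request names its own position. */
static ssize_t tracked_write(tracked_file *f, const void *buf, size_t len)
{
    assert(f != NULL);
    ssize_t n = pwrite(f->fd, buf, len, f->offset);
    if (n > 0)
        f->offset += n;
    return n;
}

/* Round trip through a scratch file. Returns 0 if data and offsets
 * behave as expected, 1 otherwise. */
static int demo_tracked_io(void)
{
    char path[] = "/tmp/tracked-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return 1;
    unlink(path);                    /* file goes away once closed */

    tracked_file w = { fd, 0 };
    tracked_file r = { fd, 0 };      /* independent reader position */
    char buf[5];

    int ok = tracked_write(&w, "hello", 5) == 5
          && tracked_write(&w, "world", 5) == 5
          && w.offset == 10
          && tracked_read(&r, buf, sizeof buf) == 5
          && memcmp(buf, "hello", 5) == 0
          && tracked_read(&r, buf, sizeof buf) == 5
          && memcmp(buf, "world", 5) == 0
          && r.offset == 10
          && tracked_read(&r, buf, sizeof buf) == 0;  /* end of file */

    close(fd);
    return ok ? 0 : 1;
}
```

Note that the reader and the writer share one descriptor but keep
separate positions, which is exactly what a single OS-managed file
pointer cannot give you once requests complete out of order.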
Re: HEAD doesn't build. Totally stalled.
The revert MR is here:
https://gitlab.haskell.org/ghc/ghc/-/merge_requests/3714

It's kind of ironic that it's stuck in CI limbo, whereas the initial MR
wasn't.

> I'm surprised gitlab presubmit merge did not detect the build breakage.

So am I!

As laid out, I believe a better solution is to have a mapping of
symbols to potential carrying libraries, and have GHC know about that,
when the linker tries to link arbitrary objects and encounters those
symbols.

Another strategy that Tamar employed to great success on the Windows
side is to just increase the set of libraries GHC tries to load by
default, and thus get rid of the annoying list of symbols in the RTS.

I hope the above MR will pass now (after another rebase), and I can
find some time to implement a better solution soon.

Cheers,
 Moritz

On Mon, Jul 20, 2020 at 4:28 PM Sergei Trofimovich wrote:
Re: HEAD doesn't build. Totally stalled.
On Fri, 17 Jul 2020 10:45:37 +0800
Moritz Angermann wrote:

> Well, we actually *do* test for __SSP__ in HEAD:
> https://github.com/ghc/ghc/blob/master/rts/RtsSymbols.c#L1170
> Which currently lists:
> #if !defined(mingw32_HOST_OS) && !defined(DYNAMIC) &&
> (defined(_FORTIFY_SOURCE) || defined(__SSP__))

I believe it's https://gitlab.haskell.org/ghc/ghc/-/issues/18442

It breaks for me as well.

It triggers if one has a gcc compiler with either of 2 properties:

1. gcc is built with --enable-default-ssp (sets __SSP__ for all
   compilations)
2. gcc defaults to _FORTIFY_SOURCE

Note that presence or absence of __stack_chk_guard is indicated by
neither of these and instead is present when gcc is built with
--enable-libssp (use gcc's __stack_* functions instead of gcc's direct
TLS instructions with one glibc fallback.)

Gentoo does both [1.] and [2.] by default. I believe Debian does at
least [2.] by default. I'm surprised gitlab presubmit merge did not
detect the build breakage.

What do macros [1.] and [2.] mean for glibc-linux:

- _FORTIFY_SOURCE only affects glibc headers to change memcpy() calls
  to memcpy_chk() to add overflow checks. It does not affect symbol
  exports available by libc. __stack_* symbols are always present.
  Parts of libc or other libraries we link ghc with could already call
  __stack_* functions as they could already be built with
  _FORTIFY_SOURCE, regardless of how ghc is being built: with
  _FORTIFY_SOURCE or without.

- __SSP__ indicates code generation of stack canary placement by gcc
  (-fstack-protector-* options, or default override with gcc's
  --enable-default-ssp)

If target is not a gcc's libssp target (a.k.a. --disable-libssp, a
default for all linux-glibc targets) then gcc never uses -lssp and
uses gcc's builtin instructions instead of __stack_chk_guard helpers.
In this mode __stack_chk_guard is not present in any libraries
installed by gcc or glibc.
The only symbol provided by glibc is __stack_chk_fail (which arguably
should not be exposed at all as it's an unusual contract between
glibc/gcc: https://gcc.gnu.org/PR93509)

--enable-libssp for gcc does bring in __stack_chk_guard. Library is
present and could use __stack_chk_guard in libraries ghc depends on
regardless of -fstack-protector-* options used to build ghc. I believe
--enable-libssp is used only on mingw.

What I'm trying to say is that presence of __stack_chk_guard is
orthogonal to either the __SSP__ define or _FORTIFY_SOURCE that ghc
uses today. It's rather a function of how the gcc toolchain was built:
--enable-libssp or not.

> But this seems to still be ill conceived. And while Simon is the only
> one I'm aware of, for whom this breaks we need to find a better
> solution. As such, we will revert the commits.
>
> Why do we do all this symbol nonsense in the RTS to begin with? It
> has to do with our static linker we have in GHC. Loading arbitrary
> archives, means we need to be able to resolve all kinds of symbols
> that objects might refer to. For regular dependencies this will work
> if the dependencies are listed in the package configuration file, the
> linker will know which dependencies to link. This gets a bit annoying
> for libraries that the compiler will automagically provide. libgcc,
> libssp, librt, ...
>
> The solution so far was simply to have the RTS depend on these
> symbols, and keep a list of them around. That way when the linker
> built the RTS we'd get it to link all these symbols into the RTS, and
> we could refer to them in the linker. Essentially looking them up in
> the linked binary (ghc, or iserv).
>
> This is a rather tricky problem, and almost all solutions we came up
> with are annoying in one or more dimensions. After some discussion on
> IRC last night, we'll go forward trying the following solution:
>
> We'll keep a file in the lib folder (similar to the settings,
> llvm-targets, ...)
> that is essentially a lookup table of Symbol -> [Library]. If we
> encounter an unknown symbol, and we have it in our lookup table, we
> will try to load the named libraries, hoping for them to contain the
> symbol we are looking for. If everything fails we'll bail.
>
> For the example symbols that prompted this issue: (which are emitted
> when stack smashing protector hardening is enabled, which seems to be
> the default on most linux distributions today, which is likely why I
> couldn't reproduce this easily.)
>
> [("__stack_chk_guard", ["ssp"])]
>
> would tell the compiler to try to locate (through the usual library
> location means) the library called "ssp", if it encounters the symbol
> "__stack_chk_guard".
>
> Isn't this what the dynamic linker is supposed to solve? Why do we
> have to do all this on our own? Can't we just use the dynamic linker?
> Yes, and no. Yes we can use the dynamic linker, and we even do. But
> not all platforms have a working, or usable linker. iOS for example
> has a working dynamic linker, but user programs can't use it. muslc
> reports "Dynamic loading n
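
To make the quoted Symbol -> [Library] table a bit more concrete, here
is a rough sketch in C. It is not the GHC linker and not the proposed
file format: the names symbol_entry, libs_for_symbol, resolve_via_table
and fake_load are invented for illustration, the table holds only the
one entry given in the mail, and the actual loading step is left to a
caller-supplied callback (fake_load merely pretends that "ssp" carries
__stack_chk_guard).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LIBS 4

/* One row of the hypothetical table: a symbol and the libraries that
 * might carry it, as a NULL-terminated list. */
typedef struct {
    const char *symbol;
    const char *libs[MAX_LIBS];
} symbol_entry;

/* Mirrors the example from the mail: [("__stack_chk_guard", ["ssp"])] */
static const symbol_entry symbol_table[] = {
    { "__stack_chk_guard", { "ssp", NULL } },
};

/* Return the candidate libraries for a symbol, or NULL if the symbol
 * is not in the table. */
static const char *const *libs_for_symbol(const char *symbol)
{
    size_t n = sizeof symbol_table / sizeof symbol_table[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(symbol_table[i].symbol, symbol) == 0)
            return symbol_table[i].libs;
    return NULL;
}

/* Callback that tries to load one library and find the symbol in it.
 * Returns the symbol's address, or NULL if that did not work. */
typedef void *(*load_fn)(const char *lib, const char *symbol);

/* For an unknown symbol: consult the table, try each named library in
 * order, and give up (NULL) if nothing provides the symbol. */
static void *resolve_via_table(const char *symbol, load_fn try_load)
{
    assert(symbol != NULL && try_load != NULL);
    const char *const *libs = libs_for_symbol(symbol);
    if (libs == NULL)
        return NULL;                 /* not in the table: bail */
    for (; *libs != NULL; libs++) {
        void *addr = try_load(*libs, symbol);
        if (addr != NULL)
            return addr;
    }
    return NULL;                     /* every candidate failed: bail */
}

/* Stand-in loader for the sketch; a real one would locate the library
 * through the usual library search means and look the symbol up. */
static int fake_guard;
static void *fake_load(const char *lib, const char *symbol)
{
    if (strcmp(lib, "ssp") == 0 && strcmp(symbol, "__stack_chk_guard") == 0)
        return &fake_guard;
    return NULL;
}
```

With this shape, resolve_via_table("__stack_chk_guard", fake_load)
yields a non-NULL address, while a symbol missing from the table falls
straight through to the "bail" case without touching any library.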
RE: Unmerged Patch: 3358
Matthew

It looks from https://gitlab.haskell.org/ghc/ghc/-/merge_requests/3358
as if it was blocked on something to do with 'text'. Is that unblocked
now? The MR also says "Fast forward merge is not possible".

So it sounds as if the steps are:

* Check that the change to text, whatever that is, has been done, and
  fix the text submodule commit on the MR
* Rebase
* Assign to Marge.

If you get stuck with that, do yell.

Simon

| -----Original Message-----
| From: ghc-devs On Behalf Of Matthew Pickering
| Sent: 20 July 2020 08:46
| To: GHC developers
| Subject: Unmerged Patch: 3358
|
| Hi,
|
| My patch 3358 needs to get merged before the 8.12 fork.
|
| When I finished it (in May), it passed CI and after this point I
| lacked time to work on it further. Now I have asked 4 times for this
| patch to get merged and it is still open.
|
| The GHC proposal for this patch already took an extortionate amount of
| time to get accepted. Please can we close this chapter by merging the
| patch.
|
| Cheers,
|
| Matt

_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Unmerged Patch: 3358
Hi,

My patch 3358 needs to get merged before the 8.12 fork.

When I finished it (in May), it passed CI and after this point I
lacked time to work on it further. Now I have asked 4 times for this
patch to get merged and it is still open.

The GHC proposal for this patch already took an extortionate amount of
time to get accepted. Please can we close this chapter by merging the
patch.

Cheers,

Matt