Tamar, I salute you!  This is a big piece of work – thank you!

Simon

From: ghc-devs <ghc-devs-boun...@haskell.org> On Behalf Of Phyx
Sent: 17 July 2020 16:04
To: ghc-devs@haskell.org Devs <ghc-devs@haskell.org>
Subject: New Windows I/O manager in GHC 8.12

Hi All,

In case you've missed it, about 150 commits landed on master yesterday.
These commits add WinIO (Windows I/O) to GHC.  This is a new I/O manager
designed for the native Windows I/O subsystem instead of relying on the
broken POSIX-ish compatibility layer that MIO used.

This is one of 3 big patches I have been working on for years now.

So before I continue on why WinIO was made, here is a TL;DR:

WinIO adds an internal API break compared to previous GHC releases.  That is,
the internal code was modified to support a completely asynchronous I/O system.

What this means is that we have to keep track of the file pointer offset which
previously was done by the C runtime.  This is because in async I/O you cannot
assume the offset to be at any given location.
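
To illustrate the offset-tracking point above, here is a minimal sketch in
plain Haskell.  It is not the WinIO code: Device, readAt, TrackedHandle,
openTracked and readNext are hypothetical names made up for this example, and
the "device" is just an in-memory string.  The only point is that every read
names an absolute offset and the handle, not the OS, advances it.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- | Hypothetical stand-in for a file opened for asynchronous I/O: it has no
-- "current position", so every read must say where it starts.
newtype Device = Device String

-- | Positioned read: up to @n@ characters starting at an absolute offset.
readAt :: Device -> Int -> Int -> String
readAt (Device contents) offset n = take n (drop offset contents)

-- | A handle that owns the offset the C runtime used to track for us.
data TrackedHandle = TrackedHandle
  { thDevice :: Device
  , thOffset :: IORef Int
  }

openTracked :: String -> IO TrackedHandle
openTracked contents = TrackedHandle (Device contents) <$> newIORef 0

-- | Sequential read: issue a positioned read at our own offset, then advance
-- the offset by the number of characters actually returned.
readNext :: TrackedHandle -> Int -> IO String
readNext h n = do
  off <- readIORef (thOffset h)
  let chunk = readAt (thDevice h) off n
  writeIORef (thOffset h) (off + length chunk)
  pure chunk

-- | Three sequential reads over the same handle.
demo :: IO [String]
demo = do
  h <- openTracked "hello, world"
  a <- readNext h 5
  b <- readNext h 2
  c <- readNext h 100
  pure [a, b, c]
```

Running demo yields the three chunks "hello", ", " and "world"; the last read
asks for more than is left and simply returns what remains.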

What does this mean for you? Very little, if you did not use internal GHC I/O
code, in particular Buffer, BufferIO and RawIO. If you have, you will need
to explicitly add support for GHC 8.12+.

Because FDs are a Unix concept and don't behave as you would expect on Windows,
the new I/O manager also uses HANDLE instead of FD. This means that any library
that has used the internal GHC Fd type won't work with WinIO. Luckily the
number of libraries that have seems quite low. If you can, please stick to the
external Handle interface for I/O functions.
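
As a small illustration of "sticking to the Handle interface", the sketch
below writes, seeks and reads using only System.IO, with no Fd and no GHC.IO
internals.  The function name roundTrip and the temp-file template are made up
for the example; getTemporaryDirectory and removeFile come from the directory
package that ships with GHC.

```haskell
import System.Directory (getTemporaryDirectory, removeFile)
import System.IO

-- | Write two lines, seek back to the start and read the first line again,
-- using only the portable 'Handle' API.
roundTrip :: IO String
roundTrip = do
  tmpDir <- getTemporaryDirectory
  -- openTempFile returns a handle opened for both reading and writing.
  (path, h) <- openTempFile tmpDir "handle-demo.txt"
  hPutStr h "first line\nsecond line\n"
  -- Seeking goes through the Handle, so the I/O manager stays in charge of
  -- buffering and of the file offset.
  hSeek h AbsoluteSeek 0
  firstLine <- hGetLine h
  hClose h
  removeFile path
  pure firstLine
```

Code written this way should not need to care which I/O manager is underneath.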

The boot libraries have been updated, and in particular process *requires* the
version that is shipped with GHC.  Please respect the version bounds here!  I
will be writing a migration guide for those that need to migrate code.  The
amount of work is usually trivial as base provides shims to do most of the
common things you would have used Fd for.

Also, if I may make a plea to GHC developers: do not add non-trivial
implementations in the externally exposed modules (e.g. System.xxx, Data.xxx)
but rather add them to internal modules (GHC.xxx) and re-export them from the
external modules.  This allows us to avoid import cycles inside the internal
modules :)

--

So why WinIO? Over the years a number of hard-to-fix issues popped up on
Windows, including proper Unicode console I/O, cooked inputs, and the ability
to cancel I/O requests. WinIO also allows libraries like Brick to work on
Windows without re-inventing the wheel or having to hide their I/O from the
I/O manager.

In order to attempt some of these with MIO, layer upon layer of hacks was
added.  This means that things sometimes worked, but when they didn't, the
failure was rather unpredictable.  Some of the issues were simply unfixable
with MIO.  I will be making some posts about how WinIO works (and also
archiving them on the wiki, don't worry :)) but for now some highlights:

WinIO is 3 years of work, first started by Joey Hess, then picked up by Mikhail
Glushenkov before landing at my feet.  While the majority has been rewritten,
their work did provide a great jumping-off point, so thanks!  Also thanks to
Ben and AndreasK for helping me get it over the line. As you can imagine I was
exhausted by this point :).

Some stats: ~8000 new lines and ~1100 removed ones spread over 130+ commits 
(sorry this was the smallest we could get it while not losing some historical 
context) and with over 153 files changed not counting the changes to boot 
libraries.

It fixes #18307, #17035, #16917, #15366, #14530, #13516, #13396, #13359,
#12873, #12869, #11394, #10542, #10484, #10477, #9940, #7593, #7353, #5797,
#5305, #4471, #3937, #3081, #12117, #2408, #10956, #2189
(but only on native Windows consoles, so no msys shells) and #806, which is 14
years old!

WinIO is a dynamic choice, so you can switch between I/O managers at run time
using the RTS flag +RTS --io-manager=[native|posix].

On non-Windows platforms, native is the same as posix.

The asynchronous interface chosen for this implementation is I/O Completion
Ports.

The I/O manager uses an API added in Windows Vista called
GetQueuedCompletionStatusEx, which allows us to service multiple completed
requests in one go.
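
To give a feel for "servicing multiple requests in one go", here is a toy
model in plain Haskell.  It does not call the Windows API and is not how the
I/O manager is written: Completion, CompletionQueue, newQueue and dequeueBatch
are hypothetical names, and unlike the real call this sketch never blocks
waiting for completions.  It only shows the batching idea: one call removes up
to a fixed number of finished requests.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef)

-- | Hypothetical completion record: which request finished and how many
-- bytes it transferred.
data Completion = Completion { reqId :: Int, bytes :: Int }
  deriving (Eq, Show)

-- | Finished requests waiting to be picked up by the event loop.
newtype CompletionQueue = CompletionQueue (IORef [Completion])

newQueue :: [Completion] -> IO CompletionQueue
newQueue = fmap CompletionQueue . newIORef

-- | Remove up to @maxEntries@ completions in a single atomic step, instead
-- of making one call per finished request.
dequeueBatch :: CompletionQueue -> Int -> IO [Completion]
dequeueBatch (CompletionQueue ref) maxEntries =
  atomicModifyIORef' ref $ \pending ->
    let (batch, rest) = splitAt maxEntries pending
    in (rest, batch)

-- | Five finished requests serviced in two calls with a batch size of three.
demo :: IO ([Int], [Int])
demo = do
  q <- newQueue [Completion i (i * 512) | i <- [1 .. 5]]
  firstBatch <- dequeueBatch q 3
  secondBatch <- dequeueBatch q 3
  pure (map reqId firstBatch, map reqId secondBatch)
```

With five pending completions and a batch size of three, the first call
returns requests 1 to 3 and the second returns the remaining two.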

Some highlights:

* Drops Windows Vista support.
  Vista is out of extended support as of 2017. The new minimum is Windows 7.
  This allows us to use much more efficient OS-provided abstractions.

* Replace Events and Monitor locks with much faster and more efficient
  Condition Variables and Slim Reader/Writer (SRW) locks.
* Change GHC's Buffer and I/O structs to support asynchronous operation by not
  relying on the OS managing the file offset.
* Implement a new command line flag +RTS --io-manager=[native|posix] to control
  which I/O manager is used.
* Implement a new Console I/O interface supporting much faster reads/writes and
  correct Unicode output.  Also supports things like cooked input etc.
* In the new I/O manager, if the user still has their code page set to OEM,
  then we use UTF-8 by default. This allows Unicode to work correctly out of
  the box.
* Add an Atomic Exchange PrimOp and implement atomic Ptr exchanges.
* Flush event logs eagerly so as not to rely on finalizers running.
* A lot of refactoring and more use of hsc2hs to share constants.
* Aborting with Ctrl+C should be a bit more reliable.
* Add a new IOPort primitive that should only be used for these I/O operations.
  Essentially an IOPort is based on an MVar with the following major
  differences:
  - Does not allow multiple pending writes. If the port is full, a second write
    is just discarded.
  - There is no deadlock avoidance guarantee. If you block on an IOPort and
    your Haskell application does not have any work left to do, the whole
    application is stalled.  In the threaded RTS we just continue idling; in
    the non-threaded RTS the scheduler is blocked.

* Support various optimizations in the Windows I/O manager, such as skipping
  I/O Completion if the request finished synchronously.
* The I/O manager is now agnostic to the handle type, i.e. there is no
  socket-specific code in the manager.  This is now all pushed to the network
  library, completely decoupling the two.
* Unified threaded and non-threaded I/O code. The only major differences are
  where the event loop is driven from and that the non-threaded RTS will always
  use a single OS thread to service requests. We cannot use more as there are
  no RTS locks to make concurrent modifications safe.
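
To make the IOPort write semantics from the list above a bit more concrete,
here is a rough emulation on top of an ordinary MVar.  This is not the new
primitive and not its API: Port, newPort, writePort and readPort are
hypothetical names for this sketch.  It only models the "a second write to a
full port is discarded" rule.

```haskell
import Control.Concurrent.MVar (MVar, newEmptyMVar, takeMVar, tryPutMVar)

-- | Hypothetical single-slot port, emulated with an MVar.
newtype Port a = Port (MVar a)

newPort :: IO (Port a)
newPort = Port <$> newEmptyMVar

-- | Non-blocking write.  Returns False, and drops the value, if the port
-- already holds one; a plain putMVar would block here instead.
writePort :: Port a -> a -> IO Bool
writePort (Port var) = tryPutMVar var

-- | Blocking read that empties the port.
readPort :: Port a -> IO a
readPort (Port var) = takeMVar var

-- | The second write hits a full port and is discarded.
demo :: IO (Bool, Bool, String)
demo = do
  p <- newPort
  ok1 <- writePort p "first"
  ok2 <- writePort p "second"
  v <- readPort p
  pure (ok1, ok2, v)
```

Here demo returns (True, False, "first").  The second difference cannot be
shown this way: blocking forever on an MVar normally gets you a
BlockedIndefinitelyOnMVar exception, whereas a blocked IOPort read gives no
such guarantee and the application simply stalls.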

Cheers,
Tamar