On 30.01.2013 00:43, Brian Anderson wrote:
On 01/29/2013 04:47 AM, Michael Neumann wrote:
On 29.01.2013 03:01, Brian Anderson wrote:
On 01/28/2013 05:29 PM, Graydon Hoare wrote:
On 13-01-28 04:56 PM, Brian Anderson wrote:

I think libuv is doing too much here. For example, if I don't want to
remove the socket from the event queue but just disable the callback,
that is not possible. I'd prefer if I could just tell libuv that I am
interested in event X (on Windows: I/O completion, on UNIX: I/O
availability).
Yet the optimization you suggest has to do with recycling the buffer,
not listening for one kind of event vs. another.

In general I'm not interested in trying to "get underneath" the
abstraction uv is providing. It's providing an IOCP-oriented interface,
I would like to code to that and make the rust IO library not have to
worry when it's on windows vs. unix. That's the point of the abstraction uv provides, and it's valuable. If it means bouncing off epoll a few too many times (or reallocating a buffer a few too many times), I'm not too
concerned. Those should both be O(1) operations.

Is it possible to do this optimization later or do we need to plan for this ahead of time? I would prefer to use the uv API as it's presented
to start with.
The optimization to use a caller-provided buffer should (a) not be
necessary to get us started and (b) be equally possible on either
platform, unix or windows, _so long as_ we're actually sleeping a task
during its period of interest in IO (either the pre-readiness sleep or a
post-issue, pre-completion sleep). In other words, if we're simulating
sync IO, then we can use a task-local buffer. If we're _not_ simulating
sync IO (I sure hope we do!) then we should let uv allocate and free
dynamic buffers as it needs them.

But I really hope we wind up structuring it so it simulates sync IO.
We're providing a task abstraction. Users _want_ the sync IO abstraction
the same way they want the sequential control flow abstraction.

Presenting the scheduler-originating I/O as synchronous is what I intend. I am not sure we can guarantee that a task is actually waiting for I/O at the moment the I/O event it is interested in occurs. A task may block on some other unrelated event while the event loop is doing I/O. Pseudocode:

// Assume we're doing I/O reads using something portlike
let port = IOPort::connect();
while port.recv() {
    // Block on a different port, while uv continues doing I/O on our behalf
    let intermediate_value = some_other_port.recv();
}

This is why I'm imagining that the scheduler will sometimes need to buffer.

I don't think so. Let me explain.

In any case, this is only a problem (and one which can be solved) iff we want to be able to treat I/O like a port and want to wait for either one to resume our task. And I assume we want this, so that we can listen on an I/O socket AND, for example, for incoming messages at the same time.

The kernel provides a way to do (task-local) blocking I/O operations. There is no way for the task to return from a read() call unless data comes in, or on EOF (or any other error condition). This behaves basically like a blocking POSIX read() call, except that it is converted into an asynchronous read by libuv under the hood. To expose I/O as a port, we have to start
a new task:

  let fd = open(...);
  let (po, ch) = streams::pipe();
  do task::spawn {
    loop {
      let buf: ~[u8] = vec::from_fn(1000, || 0);
      let nread = fd.read(buf, 1000);
      if nread > 0 {
        ch.send(Data(buf))
      }
      else if nread == 0 {
        ch.send(EOF);
        break  // nothing more to read
      }
      else {
        ch.send(Error);
        break  // stop reading after an error
      }
    }
  }

Yes, a single call to 'read' will not return until some I/O arrives, but after 'read' returns I/O continues to arrive and that I/O needs to be stored somewhere if the task doesn't immediately block in another call to 'read' on that same fd. Taking the above example:

loop {
    // This will block until data arrives, at which point the task will be
    // context-switched in and the data returned.
    let nread = fd.read(buf, 1000);

    // This will put the task to sleep waiting on a message on cmd_port
    let command = cmd_port.recv();
}

Until data arrives on cmd_port the task cannot be scheduled. While the task is asleep the I/O loop can't be blocked since other tasks are using it too. So in the meantime uv continues to receive data from the open fd and it needs to live somewhere until the task calls 'read' again on the same fd. Perhaps there's something I don't understand about the uv API here, but I think that once we start reading uv is going to continually provide us with data whether we are ready for it or not.
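
The buffering described here can be sketched in present-day Rust as a small per-stream queue. This is only an illustration under stated assumptions: `StreamBuffer`, `on_data` and `read` are hypothetical names, not part of uv or of any Rust runtime, and a real scheduler would deschedule the task instead of returning 0 when nothing is buffered.

```rust
use std::collections::VecDeque;

/// Hypothetical per-stream state owned by the scheduler (not a uv type).
pub struct StreamBuffer {
    pending: VecDeque<u8>,
}

impl StreamBuffer {
    pub fn new() -> Self {
        StreamBuffer { pending: VecDeque::new() }
    }

    /// Called from the event loop whenever data arrives, whether or not
    /// the owning task is currently blocked in `read`.
    pub fn on_data(&mut self, chunk: &[u8]) {
        self.pending.extend(chunk.iter().copied());
    }

    /// Called when the task issues a read: copies up to `dst.len()`
    /// buffered bytes into `dst` and returns how many were copied.
    /// Returns 0 when nothing is buffered (a real scheduler would put
    /// the task to sleep here instead).
    pub fn read(&mut self, dst: &mut [u8]) -> usize {
        let n = dst.len().min(self.pending.len());
        for (slot, byte) in dst.iter_mut().zip(self.pending.drain(..n)) {
            *slot = byte;
        }
        n
    }
}
```

Data that arrives while the task is asleep on `cmd_port` accumulates in `pending` and is handed over by the next `read` call.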


  // now we can treat `po` as a Port and call select() on it
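
For readers on current Rust, the same "wrap blocking I/O in a dedicated worker and expose it as a port" idea can be sketched with the standard library. The assumptions are loud ones: OS threads stand in for tasks, `std::sync::mpsc` stands in for Port/Chan, and `IoMsg` and `read_as_port` are hypothetical names chosen for this example.

```rust
use std::io::Read;
use std::sync::mpsc::{channel, Receiver};
use std::thread;

/// Messages delivered over the channel, mirroring Data / EOF / Error above.
#[derive(Debug, PartialEq)]
pub enum IoMsg {
    Data(Vec<u8>),
    Eof,
    Error,
}

/// Spawn a worker that performs blocking reads on `src` and forwards each
/// result over a channel. The caller picks `buf_size`, which should be
/// non-zero (a zero-sized buffer makes the first read look like EOF).
pub fn read_as_port<R: Read + Send + 'static>(mut src: R, buf_size: usize) -> Receiver<IoMsg> {
    let (ch, po) = channel();
    thread::spawn(move || loop {
        let mut buf = vec![0u8; buf_size];
        match src.read(&mut buf) {
            Ok(0) => {
                // End of input: report it and stop the worker.
                let _ = ch.send(IoMsg::Eof);
                break;
            }
            Ok(n) => {
                buf.truncate(n);
                // Stop if the receiving side has gone away.
                if ch.send(IoMsg::Data(buf)).is_err() {
                    break;
                }
            }
            Err(_) => {
                let _ = ch.send(IoMsg::Error);
                break;
            }
        }
    });
    po
}
```

The returned `Receiver` can then be waited on alongside other channels, which is the point of wrapping I/O this way.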


But I don't think channel I/O will be used that often.

Note that one big advantage is that we can specify the buffer size ourselves!
If we let libuv create a buffer for us, how would it know the
buffer size? The alloc_cb you provide to libuv upon uv_read_start() will get
a suggested_size parameter passed, but this is 64k by default, and libuv
cannot know what kind of I/O protocol you are handling. When I do
line-oriented I/O, I do not need a full 64k buffer allocated for every
read, which in the worst case would hold only one byte in the case
of a very slow sender (one byte each second). And is 64k enough
for receiving a very large packet? We clearly want a way to tell the I/O
system how large we expect the packet arriving over I/O to be,
otherwise this is completely useless IMHO.

I am not sure how to do this given the aforementioned issues: when we receive the alloc_cb, the task may not have been able to communicate the size of the next buffer to the event loop. Imagine this sequence of events:

* task issues fd.read(buf, 1000);
* alloc_cb arrives. great, 'read' already told us the size of the next buffer.
* task wakes up and starts handling data
* task goes to sleep for some other reason
* alloc_cb arrives. how big is the buffer supposed to be?
* task wakes up and issues fd.read(buf, 1000). What can we do with '1000'? we already missed the underlying read event
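
The sequence above can be modelled with a few lines of Rust. This is a sketch under assumptions: `ReadRequest`, `request` and `alloc_size` are hypothetical names, and `alloc_size` merely stands in for the decision an allocation callback has to make; it does not call libuv.

```rust
/// Hypothetical state shared between a task's read() call and the event
/// loop's allocation callback.
pub struct ReadRequest {
    requested: Option<usize>,
}

impl ReadRequest {
    pub fn new() -> Self {
        ReadRequest { requested: None }
    }

    /// Task side: `fd.read(buf, size)` records the size it wants.
    pub fn request(&mut self, size: usize) {
        self.requested = Some(size);
    }

    /// Event-loop side: consume the pending request if there is one,
    /// otherwise fall back to the size suggested by the I/O library.
    pub fn alloc_size(&mut self, suggested_size: usize) -> usize {
        self.requested.take().unwrap_or(suggested_size)
    }
}
```

When the second allocation request arrives while the task is asleep, no size has been recorded, so the only option in this model is the library's suggestion (64k in the discussion above).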


We would still have one separate iotask per scheduler. This is a native
thread and runs the I/O loop. There is no way to do that inside the
scheduler as we would block any task while waiting for I/O.

I have something different in mind. There is no iotask. The scheduler is the event loop and I/O callbacks are interleaved with running tasks. There will be no thread synchronization required to pass data from the event loop to a task - only context switches to schedule and deschedule the task.

I don't anticipate problems with blocking the scheduler - when an external event requires the scheduler to wake up it will create an async_cb to run some scheduler code.


The callbacks like on_read_cb would simply notify the scheduler
that the task that was responsible for doing this read operation
can now resume. As the scheduler lives in another thread
(the thread in which all tasks of that scheduler live)
and might be active, we need to do some locking here.
When the scheduler gets activated next time, either by
issuing a blocking I/O operation, giving up by using task::yield
or by waiting for a message on a port, or when sending a message
blocks, the scheduler can decide which task to schedule next
and consider those for which I/O has arrived as well.

One thing to consider is that we'd need a way to return the number
of bytes written to the buffer to the calling task of read().
We should store this in the same manner as the pointer to the buffer
and the buffer_size in the stream_t handle. This is safe, as one I/O
object is always exclusively used by one task.

I am not familiar enough with the mechanics of the uv API to understand this point. I think, though, that it assumes the synchronous code will handle I/O events as they arrive, and that we will be passing the uv handles from the async code to the sync code (so uv won't be overwriting a handle while a Rust task is in possession of it). For the reasons I mentioned before, I don't see how this is possible (in the general case), since the scheduler may need to buffer data until it can be acted on by the task.

We can call this field
last_nread for example, and when the scheduler reactivates a
task blocked on a read I/O, we would simply return this field as number
of read bytes.
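
A minimal sketch of this bookkeeping in current Rust, assuming a hypothetical `StreamHandle` type that stands in for the per-stream handle state (it is not the libuv struct), might look like this:

```rust
/// Hypothetical per-stream state: the task's buffer plus the byte count
/// recorded by the most recent read completion.
pub struct StreamHandle {
    buf: Vec<u8>,
    last_nread: usize,
}

impl StreamHandle {
    pub fn new(buffer_size: usize) -> Self {
        StreamHandle { buf: vec![0; buffer_size], last_nread: 0 }
    }

    /// Stands in for the read callback: copy the data into the task's
    /// buffer (truncating to its size) and remember how much was written.
    pub fn on_read(&mut self, data: &[u8]) {
        let n = data.len().min(self.buf.len());
        self.buf[..n].copy_from_slice(&data[..n]);
        self.last_nread = n;
    }

    /// What the scheduler would hand back from read() when it reactivates
    /// the task: the recorded count and the filled part of the buffer.
    pub fn resume_read(&self) -> (usize, &[u8]) {
        (self.last_nread, &self.buf[..self.last_nread])
    }
}
```

The sketch relies on the stated invariant that one I/O object is used by exactly one task, so no locking is shown.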

In short:

  * A task can either block on exactly *one* I/O object
  * or on a channel/port.
  * Each I/O object belongs exclusively to one task
  * I/O and Port/Chan are two different things
  * I/O is "lower" than Port/Chan, but can be easily
     wrapped into a Port/Chan abstraction (see code above)
  * When a task blocks on an I/O event, it blocks until
     this I/O event arrives.
  * A task can only ever block on *one* I/O event
  * For Channel I/O (I/O over Port/Chan) a separate task
     is needed for each connection object.
  * We have one iotask per scheduler.

I mostly agree with these points but some are at odds with my previous statements.

Please forgive me for not responding to the remaining points in detail. We can discuss more later.

I think I now understand how you want to implement the scheduler.

The scheduler will schedule tasks as long as they can communicate with each other via channels, i.e. as long as there is at least one task that is not blocked on a channel. If all tasks are blocked on a channel, the scheduler will enter its I/O loop (uv_run_loop).
It will only awake from it if I/O arrives (or uv_async_send is called from
a different scheduler). This is optimal in terms of performance. Only when tasks communicate across a scheduler boundary is it a little bit more expensive, as it will (potentially) wake the receiving scheduler from its I/O loop. But I think this is negligible, and it models nicely how processors in an SMP system wake each other up using IPIs (inter-processor interrupts). Except for inter-scheduler communication, no locks are needed at all, assuming one scheduler corresponds to
one native thread.
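
As a rough model of that loop in current Rust (a sketch only: `Scheduler`, `Step` and the method names are hypothetical, tasks are reduced to integer ids, and a queue of completed I/O events stands in for what the uv loop would deliver):

```rust
use std::collections::VecDeque;

pub type TaskId = usize;

/// What a single scheduler step did.
#[derive(Debug, PartialEq)]
pub enum Step {
    /// A runnable task was scheduled.
    RanTask(TaskId),
    /// No task was runnable; the I/O loop was entered and woke this many tasks.
    EnteredIoLoop(usize),
    /// Nothing runnable and no I/O pending.
    Idle,
}

pub struct Scheduler {
    runnable: VecDeque<TaskId>,
    io_ready: VecDeque<TaskId>,
}

impl Scheduler {
    pub fn new() -> Self {
        Scheduler { runnable: VecDeque::new(), io_ready: VecDeque::new() }
    }

    /// A task became runnable, e.g. a message arrived on its port.
    pub fn make_runnable(&mut self, task: TaskId) {
        self.runnable.push_back(task);
    }

    /// Simulates an I/O completion for a task blocked on I/O.
    pub fn io_completed(&mut self, task: TaskId) {
        self.io_ready.push_back(task);
    }

    /// Prefer runnable tasks; only when none are left, enter the I/O
    /// loop and move every task whose I/O has completed to the run queue.
    pub fn step(&mut self) -> Step {
        if let Some(task) = self.runnable.pop_front() {
            return Step::RanTask(task);
        }
        if self.io_ready.is_empty() {
            return Step::Idle;
        }
        let woken = self.io_ready.len();
        self.runnable.append(&mut self.io_ready);
        Step::EnteredIoLoop(woken)
    }
}
```

Because everything runs on one native thread in this model, no locks appear; cross-scheduler wakeups are left out.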

I think this system is easier to implement than what I proposed, and it performs better. As I/O can be performed asynchronously depending on the OS (on Windows), you need to allocate buffers,
but this is no problem IMHO.

Regards,

  Michael
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev