Re: [Libguestfs] [libnbd PATCH 2/2] rust: Use mio::poll instead of requiring epoll

2023-09-04 Thread Tage Johansson



On 9/1/2023 3:41 PM, Eric Blake wrote:

On Fri, Sep 01, 2023 at 08:35:58AM +, Tage Johansson wrote:

On 8/30/2023 10:11 PM, Eric Blake wrote:

CI shows our async handle fails to build on FreeBSD and MacOS (where
epoll() is not available as a syscall, and therefore not available as
a Rust crate).  We can instead accomplish the same level-probing
effects by doing a zero-timeout poll with mio (which defers under the
hood to epoll on Linux, and kqueue on BSD).


The problem with mio is that it uses edge-triggered notifications. According
to this page:


Draining readiness
Once a readiness event is received, the corresponding operation must be
performed repeatedly until it returns variant
std::io::error::ErrorKind::WouldBlock. Unless this is done, there is no
guarantee that another readiness event will be delivered, even if
further data is received for the event source.

Would that still be true even if we deregister the interest after our
one-shot poll, and reregister it immediately beforehand?  In other
words, even if registration sets up edge triggers for future
notifications, the fact that we are only doing a one-shot query seems
like it would be sufficient to use our one-shot as a level probe
(assuming that the registration starts by learning the current state
of the fd before waiting for any future edges).

I agree that once we get a notification after a .poll(), the normal
assumption is that we must loop and consume all data before calling
.poll again (because the .poll would block until the next edge - but
if we didn't consume to the blocking point, there is no next edge).
But since we are NOT looping, nor doing any consumption work, our
timeout of 0 is trying to behave as an instant level check, and if
adding a reregister/deregister wrapper around the .poll makes it more
reliable (by wiping out all prior events and re-priming the
registration to start with the current level), that may be all we need.



It should work. I tried to do that and it works on my machine, at least. 
Although it might be a bit ugly, it may do as a short-term solution. I 
will send your patch with these modifications soon as a reply to this message.
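
Roughly, the probe then looks like this (just a sketch, assuming mio 0.8's
Registry::register/deregister and SourceFd behave as documented; the real
loop would reuse one Poll instance and only refresh the registration):

    use std::os::unix::io::RawFd;
    use std::time::Duration;

    use mio::unix::SourceFd;
    use mio::{Events, Interest, Poll, Token};

    // Register immediately before a zero-timeout poll and deregister right
    // after, so the edge-triggered backend reports the fd's current state.
    fn probe_fd(fd: RawFd) -> std::io::Result<(bool, bool)> {
        let mut poll = Poll::new()?;
        let mut events = Events::with_capacity(2);
        poll.registry().register(
            &mut SourceFd(&fd),
            Token(0),
            Interest::READABLE | Interest::WRITABLE,
        )?;
        poll.poll(&mut events, Some(Duration::ZERO))?;
        let (mut readable, mut writable) = (false, false);
        for event in &events {
            readable |= event.is_readable();
            writable |= event.is_writable();
        }
        // Drop the registration so the next probe starts from a clean slate.
        poll.registry().deregister(&mut SourceFd(&fd))?;
        Ok((readable, writable))
    }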




The motivation behind using epoll in addition to tokio in the first place
was to check if the fd is readable/writable without necessarily knowing if the
last read/write operation returned [ErrorKind::WouldBlock]. And it doesn't
seem that mio fulfills that requirement.

Do we need to expand the libnbd API itself to make it more obvious
whether our state machine completed because it hit an EWOULDBLOCK
scenario?  In general, once you kick the state machine (by sending a
new command, or calling one of the two aio_notify calls), the state
machine tries to then process as much data as possible until it blocks
again, rather than stopping after just processing a single
transaction.  That is, if libnbd isn't already designed to work with
edge triggers, how much harder would it be to modify it (perhaps by
adding a new API) so that it DOES work nicely with edge triggers?



This would be the cleanest and best solution, I think: add two methods 
to the handle, aio_read_blocked() and aio_write_blocked(), which return 
true iff the last read/write operation blocked. They could be backed by 
two bool fields on the handle struct, false by default, set when a 
read/write operation blocks, and reset to false whenever a read/write 
operation completes without blocking. It is very important that 
aio_read_blocked() returns true only if the last read operation blocked 
(and likewise for aio_write_blocked()).
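
To make the idea concrete, the bookkeeping could look something like this
(purely a hypothetical sketch; neither aio_read_blocked() nor
aio_write_blocked() exists in libnbd today, and the real implementation
would live in the C library rather than in the Rust bindings):

    /// Hypothetical flags backing the proposed aio_read_blocked() /
    /// aio_write_blocked() calls; illustrative only, not libnbd code.
    #[derive(Default)]
    struct BlockedFlags {
        read_blocked: bool,
        write_blocked: bool,
    }

    impl BlockedFlags {
        /// Record the outcome of every read attempt on the socket.
        fn record_read(&mut self, hit_ewouldblock: bool) {
            self.read_blocked = hit_ewouldblock;
        }

        /// Record the outcome of every write attempt on the socket.
        fn record_write(&mut self, hit_ewouldblock: bool) {
            self.write_blocked = hit_ewouldblock;
        }

        /// Proposed aio_read_blocked(): true iff the last read blocked.
        fn aio_read_blocked(&self) -> bool {
            self.read_blocked
        }

        /// Proposed aio_write_blocked(): true iff the last write blocked.
        fn aio_write_blocked(&self) -> bool {
            self.write_blocked
        }
    }

The async handle's polling task could then clear tokio's readiness for a
direction only when the last operation in that direction actually blocked,
instead of doing a separate level-triggered probe.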




If we can solve _that_ problem, then the Rust async handle wouldn't
need to use epoll, mio, or anything else; tokio's edge trigger
behavior would already be sufficient on its own.



Yes, this is true. It would be more efficient, easier to understand, and 
require fewer dependencies.



--

Best regards,

Tage


___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs



Re: [Libguestfs] [libnbd PATCH 2/2] rust: Use mio::poll instead of requiring epoll

2023-09-01 Thread Eric Blake
On Fri, Sep 01, 2023 at 08:35:58AM +, Tage Johansson wrote:
> 
> On 8/30/2023 10:11 PM, Eric Blake wrote:
> > CI shows our async handle fails to build on FreeBSD and MacOS (where
> > epoll() is not available as a syscall, and therefore not available as
> > a Rust crate).  We can instead accomplish the same level-probing
> > effects by doing a zero-timeout poll with mio (which defers under the
> > hood to epoll on Linux, and kqueue on BSD).
> 
> 
> The problem with mio is that it uses edge-triggered notifications. According
> to this page:
> 
> > Draining readiness
> > Once a readiness event is received, the corresponding operation must be
> > performed repeatedly until it returns variant
> > std::io::error::ErrorKind::WouldBlock. Unless this is done, there is no
> > guarantee that another readiness event will be delivered, even if
> > further data is received for the event source.

Would that still be true even if we deregister the interest after our
one-shot poll, and reregister it immediately beforehand?  In other
words, even if registration sets up edge triggers for future
notifications, the fact that we are only doing a one-shot query seems
like it would be sufficient to use our one-shot as a level probe
(assuming that the registration starts by learning the current state
of the fd before waiting for any future edges).

I agree that once we get a notification after a .poll(), the normal
assumption is that we must loop and consume all data before calling
.poll again (because the .poll would block until the next edge - but
if we didn't consume to the blocking point, there is no next edge).
But since we are NOT looping, nor doing any consumption work, our
timeout of 0 is trying to behave as an instant level check, and if
adding a reregister/deregister wrapper around the .poll makes it more
reliable (by wiping out all prior events and re-priming the
registration to start with the current level), that may be all we need.

> 
> The motivation behind using epoll in addition to tokio in the first place
> was to check if the fd is readable/writable without necessarily knowing if the
> last read/write operation returned [ErrorKind::WouldBlock]. And it doesn't
> seem that mio fulfills that requirement.

Do we need to expand the libnbd API itself to make it more obvious
whether our state machine completed because it hit an EWOULDBLOCK
scenario?  In general, once you kick the state machine (by sending a
new command, or calling one of the two aio_notify calls), the state
machine tries to then process as much data as possible until it blocks
again, rather than stopping after just processing a single
transaction.  That is, if libnbd isn't already designed to work with
edge triggers, how much harder would it be to modify it (perhaps by
adding a new API) so that it DOES work nicely with edge triggers?

If we can solve _that_ problem, then the Rust async handle wouldn't
need to use epoll, mio, or anything else; tokio's edge trigger
behavior would already be sufficient on its own.

> 
> 
> I tried using the polling crate, which uses epoll and kqueue under the
> hood, but for some reason the disconnect call fails, at least on my Linux
> machine. I have no idea what the problem is.
> 
> 
> Another idea would be to create our own abstraction on top of epoll and
> kqueue. But I am unable to test any kqueue implementation; I don't think
> I can even compile it on my machine.

If there is already a solution out there that works, let's favor that
instead of reimplementing a portability wrapper ourselves.

> 
> 
> It would of course also be possible to disable the Rust bindings (or at
> least the asynchronous API) on BSD and MacOS, but I don't think that
> would be a good solution.

Yeah, crippling the BSD build is less desirable, even if doing it in
the short term to get green CI is tempting.

You mentioned that the disconnect API was hanging.  Is it because we
have not set up tokio to inform us when it detects that a socket has
closed?  Admittedly, I haven't reproduced the hang you mentioned yet;
is it something that was easy to reproduce (for example, just remove
all the epoll code without replacement, then run 'make check')?  If I
can reproduce the hang, maybe I can learn from strace where things are
getting stuck?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs



Re: [Libguestfs] [libnbd PATCH 2/2] rust: Use mio::poll instead of requiring epoll

2023-09-01 Thread Tage Johansson


On 8/30/2023 10:11 PM, Eric Blake wrote:

CI shows our async handle fails to build on FreeBSD and MacOS (where
epoll() is not available as a syscall, and therefore not available as
a Rust crate).  We can instead accomplish the same level-probing
effects by doing a zero-timeout poll with mio (which defers under the
hood to epoll on Linux, and kqueue on BSD).



The problem with mio is that it uses edge-triggered notifications. 
According to this page:



Draining readiness
Once a readiness event is received, the corresponding operation must 
be performed repeatedly until it returns variant 
std::io::error::ErrorKind::WouldBlock. Unless this is done, there is 
no guarantee that another readiness event will be delivered, even if 
further data is received for the event source.
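
In code, the draining that the documentation demands looks roughly like
this (a generic sketch over std::io::Read, not libnbd code; libnbd's state
machine does the actual reads internally, so the bindings cannot easily
tell whether this point was ever reached):

    use std::io::{ErrorKind, Read};

    // Keep reading until WouldBlock; only then is the edge "re-armed"
    // and another readiness event guaranteed by mio.
    fn drain<R: Read>(src: &mut R, sink: &mut Vec<u8>) -> std::io::Result<()> {
        let mut buf = [0u8; 4096];
        loop {
            match src.read(&mut buf) {
                Ok(0) => return Ok(()), // peer closed the connection
                Ok(n) => sink.extend_from_slice(&buf[..n]),
                Err(e) if e.kind() == ErrorKind::WouldBlock => return Ok(()),
                Err(e) if e.kind() == ErrorKind::Interrupted => continue,
                Err(e) => return Err(e),
            }
        }
    }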


The motivation behind using epoll in addition to tokio in the first 
place was to check if the fd is readable/writable without necessarily 
knowing if the last read/write operation returned 
[ErrorKind::WouldBlock]. And it doesn't seem that mio fulfills that 
requirement.



I tried using the polling crate, which uses epoll and kqueue under the 
hood, but for some reason the disconnect call fails, at least on my 
Linux machine. I have no idea what the problem is.



Another idea would be to create our own abstraction on top of epoll and 
kqueue. But I am unable to test any kqueue implementation; I don't think 
I can even compile it on my machine.



It would of course also be possible to disable the Rust bindings (or at 
least the asynchronous API) on BSD and MacOS, but I don't think that 
would be a good solution.



--

Best regards,

Tage




Fixes: 223a9965 ("rust: async: Create an async friendly handle type")
CC: Tage Johansson
Signed-off-by: Eric Blake
---
  rust/Cargo.toml  |  2 +-
  rust/src/async_handle.rs | 46 +---
  2 files changed, 30 insertions(+), 18 deletions(-)

diff --git a/rust/Cargo.toml b/rust/Cargo.toml
index 0879b34c..391c80e9 100644
--- a/rust/Cargo.toml
+++ b/rust/Cargo.toml
@@ -49,7 +49,7 @@ thiserror = "1.0.40"
  log = { version = "0.4.19", optional = true }
  libc = "0.2.147"
  tokio = { optional = true, version = "1.29.1", default-features = false, features = ["rt", "sync", "net"] }
-epoll = "4.3.3"
+mio = "0.8.0"

  [features]
  default = ["log", "tokio"]
diff --git a/rust/src/async_handle.rs b/rust/src/async_handle.rs
index ab28b910..35b1c0d2 100644
--- a/rust/src/async_handle.rs
+++ b/rust/src/async_handle.rs
@@ -35,9 +35,11 @@ use crate::sys;
  use crate::Handle;
  use crate::{Error, FatalErrorKind, Result};
  use crate::{AIO_DIRECTION_BOTH, AIO_DIRECTION_READ, AIO_DIRECTION_WRITE};
-use epoll::Events;
+use mio::unix::SourceFd;
+use mio::{Events, Interest as MioInterest, Poll, Token};
  use std::sync::Arc;
  use std::sync::Mutex;
+use std::time::Duration;
  use tokio::io::{unix::AsyncFd, Interest, Ready as IoReady};
  use tokio::sync::Notify;
  use tokio::task;
@@ -176,12 +178,12 @@ async fn polling_task(handle_data: &HandleData) -> Result<(), FatalErrorKind> {
  } = handle_data;
  let fd = handle.aio_get_fd().map_err(Error::to_fatal)?;
  let tokio_fd = AsyncFd::new(fd)?;
-let epfd = epoll::create(false)?;
-epoll::ctl(
-epfd,
-epoll::ControlOptions::EPOLL_CTL_ADD,
-fd,
-epoll::Event::new(Events::EPOLLIN | Events::EPOLLOUT, 42),
+let mut events = Events::with_capacity(1);
+let mut poll = Poll::new()?;
+poll.registry().register(
+ &mut SourceFd(&fd),
+Token(0),
+MioInterest::READABLE | MioInterest::WRITABLE,
  )?;

  // The following loop does approximately the following things:
@@ -248,19 +250,29 @@ async fn polling_task(handle_data: &HandleData) -> Result<(), FatalErrorKind> {
  }
  drop(pending_cmds_lock);

-// Use epoll to check the current read/write availability on the fd.
+// Use mio poll to check the current read/write availability on the fd.
  // This is needed because Tokio supports only edge-triggered
  // notifications but Libnbd requires level-triggered notifications.
-let mut revent = epoll::Event { data: 0, events: 0 };
  // Setting timeout to 0 means that it will return immediately.
-epoll::wait(epfd, 0, std::slice::from_mut(&mut revent))?;
-let revents = Events::from_bits(revent.events).unwrap();
-if !revents.contains(Events::EPOLLIN) {
-ready_guard.clear_ready_matching(IoReady::READABLE);
-}
-if !revents.contains(Events::EPOLLOUT) {
-ready_guard.clear_ready_matching(IoReady::WRITABLE);
-}
+// mio states that it is OS-dependent on whether a single event
+// can be both readable and writable, but we survive just fine
+// if we only see one direction even when both are available.
+match poll.poll(&mut events, 

[Libguestfs] [libnbd PATCH 2/2] rust: Use mio::poll instead of requiring epoll

2023-08-30 Thread Eric Blake
CI shows our async handle fails to build on FreeBSD and MacOS (where
epoll() is not available as a syscall, and therefore not available as
a Rust crate).  We can instead accomplish the same level-probing
effects by doing a zero-timeout poll with mio (which defers under the
hood to epoll on Linux, and kqueue on BSD).

Fixes: 223a9965 ("rust: async: Create an async friendly handle type")
CC: Tage Johansson 
Signed-off-by: Eric Blake 
---
 rust/Cargo.toml  |  2 +-
 rust/src/async_handle.rs | 46 +---
 2 files changed, 30 insertions(+), 18 deletions(-)

diff --git a/rust/Cargo.toml b/rust/Cargo.toml
index 0879b34c..391c80e9 100644
--- a/rust/Cargo.toml
+++ b/rust/Cargo.toml
@@ -49,7 +49,7 @@ thiserror = "1.0.40"
 log = { version = "0.4.19", optional = true }
 libc = "0.2.147"
 tokio = { optional = true, version = "1.29.1", default-features = false, features = ["rt", "sync", "net"] }
-epoll = "4.3.3"
+mio = "0.8.0"

 [features]
 default = ["log", "tokio"]
diff --git a/rust/src/async_handle.rs b/rust/src/async_handle.rs
index ab28b910..35b1c0d2 100644
--- a/rust/src/async_handle.rs
+++ b/rust/src/async_handle.rs
@@ -35,9 +35,11 @@ use crate::sys;
 use crate::Handle;
 use crate::{Error, FatalErrorKind, Result};
 use crate::{AIO_DIRECTION_BOTH, AIO_DIRECTION_READ, AIO_DIRECTION_WRITE};
-use epoll::Events;
+use mio::unix::SourceFd;
+use mio::{Events, Interest as MioInterest, Poll, Token};
 use std::sync::Arc;
 use std::sync::Mutex;
+use std::time::Duration;
 use tokio::io::{unix::AsyncFd, Interest, Ready as IoReady};
 use tokio::sync::Notify;
 use tokio::task;
@@ -176,12 +178,12 @@ async fn polling_task(handle_data: &HandleData) -> Result<(), FatalErrorKind> {
 } = handle_data;
 let fd = handle.aio_get_fd().map_err(Error::to_fatal)?;
 let tokio_fd = AsyncFd::new(fd)?;
-let epfd = epoll::create(false)?;
-epoll::ctl(
-epfd,
-epoll::ControlOptions::EPOLL_CTL_ADD,
-fd,
-epoll::Event::new(Events::EPOLLIN | Events::EPOLLOUT, 42),
+let mut events = Events::with_capacity(1);
+let mut poll = Poll::new()?;
+poll.registry().register(
+ &mut SourceFd(&fd),
+Token(0),
+MioInterest::READABLE | MioInterest::WRITABLE,
 )?;

 // The following loop does approximately the following things:
@@ -248,19 +250,29 @@ async fn polling_task(handle_data: &HandleData) -> Result<(), FatalErrorKind> {
 }
 drop(pending_cmds_lock);

-// Use epoll to check the current read/write availability on the fd.
+// Use mio poll to check the current read/write availability on the fd.
 // This is needed because Tokio supports only edge-triggered
 // notifications but Libnbd requires level-triggered notifications.
-let mut revent = epoll::Event { data: 0, events: 0 };
 // Setting timeout to 0 means that it will return immediately.
-epoll::wait(epfd, 0, std::slice::from_mut(&mut revent))?;
-let revents = Events::from_bits(revent.events).unwrap();
-if !revents.contains(Events::EPOLLIN) {
-ready_guard.clear_ready_matching(IoReady::READABLE);
-}
-if !revents.contains(Events::EPOLLOUT) {
-ready_guard.clear_ready_matching(IoReady::WRITABLE);
-}
+// mio states that it is OS-dependent on whether a single event
+// can be both readable and writable, but we survive just fine
+// if we only see one direction even when both are available.
+match poll.poll(&mut events, Some(Duration::ZERO)) {
+Ok(_) => {
+for event in &events {
+if !event.is_readable() {
+ready_guard.clear_ready_matching(IoReady::READABLE);
+}
+if !event.is_writable() {
+ready_guard.clear_ready_matching(IoReady::WRITABLE);
+}
+}
+}
+Err(_) => {
+ready_guard.clear_ready_matching(IoReady::READABLE);
+ready_guard.clear_ready_matching(IoReady::WRITABLE);
+}
+};
 ready_guard.retain_ready();
 }
 }
-- 
2.41.0

___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs