On Thu, Feb 05, 2026 at 07:53:33AM +0000, [email protected] wrote:
Hi,

I'm experiencing a bug where SSH sessions over vsock take 2-20+ seconds
to establish due to poll() not signaling POLLIN when data is available.
The bug does NOT occur on the first connection after VM boot, but affects
all subsequent connections.

* Summary

- vsock poll() fails to return POLLIN when data is in the receive buffer
- sshd-session's ppoll() times out every ~20ms instead of waking on data
- First SSH connection after guest boot works instantly
- All subsequent connections experience 2-20+ second delays
- Non-PTY commands (ssh -T ... 'echo test') work instantly
- TCP connections to the same VM work instantly

Mm, so I'm not sure whether it's related to the kernel, the user-space proxy, etc. It would be nice to replicate this without ssh.

I tried with 6.18 on both guest and host and I'm not able to reproduce it.

Can you try to write a simple reproducer without ssh involved?

Thanks,
Stefano

* Environment

Host:
- OS: Arch Linux
- Kernel: 6.18.2-arch2-1
- QEMU: system package (latest)

Guest:
- OS: Debian trixie
- Kernel: 6.17.13+deb13-amd64 (also tested on 6.12.57, same issue)
- OpenSSH: 10.0p2

QEMU command (relevant parts):
 qemu-system-x86_64 -enable-kvm -smp 8 \
   -object memory-backend-memfd,id=mem,size=20G,share=on \
   -machine memory-backend=mem \
   -device vhost-vsock-pci,guest-cid=5 \
   ...

Connection method: ssh user@vsock/5 (via systemd-ssh-proxy)

* Symptoms

Interactive SSH (PTY) - SLOW:
 $ time ssh user@vsock/5
 # Takes 2-20+ seconds before shell prompt appears

Non-interactive SSH - FAST:
 $ time ssh user@vsock/5 'echo test'
 test
 real    0m0.156s

TCP to same VM - FAST:
 $ time ssh -p 33594 [email protected]
 # Instant

* Key observation: First connection after boot is fast

After guest reboot:
 $ ssh user@vsock/5      # INSTANT (< 1 second)
 $ exit
 $ ssh user@vsock/5      # SLOW (2-20 seconds)
 $ ssh user@vsock/5      # SLOW
 ...

This suggests the bug involves state that accumulates or isn't properly
cleaned up between connections.

** bpftrace evidence

Using syscall tracepoints on guest during slow connection:

 === MINIMAL VSOCK DIAGNOSTIC ===
 [   29 ms] sshd-session: ppoll() duration=19 ms ret=1
            ^^^ 20ms TIMEOUT pattern detected!
 [   50 ms] sshd-session: ppoll() duration=20 ms ret=1
            ^^^ 20ms TIMEOUT pattern detected!
 [   70 ms] sshd-session: ppoll() duration=18 ms ret=1
            ^^^ 20ms TIMEOUT pattern detected!
 ... (continues for ~2 seconds) ...

 [ 5000 ms] --- 5s stats: ppoll=455, timeouts=103, recv=0 (0 bytes) ---

 [19432 ms] sshd: recvmsg() = 308 bytes [4 µs]
 [19442 ms] sshd-session: recvmsg() = 308 bytes [4 µs]

Pattern analysis:
- ppoll() returns ret=1 (1 fd ready) but takes ~20ms (the full timeout)
- The ready fd is the PTY, NOT the vsock socket
- recv=0 during the timeout phase: vsock data not being read
- recvmsg() finally succeeds after ~19 seconds
- When recvmsg() runs, it completes in 4 microseconds (data WAS there)

This proves: data is sitting in the vsock receive buffer, but poll()
is not returning POLLIN, so sshd doesn't know to read it.
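For contrast, this is the semantics every other transport provides. A quick self-contained check using an AF_UNIX socketpair as a stand-in (not vsock, so it only demonstrates the expected behavior, not the bug):

```python
import select
import socket

def poll_reports_data(data: bytes = b"prompt") -> bool:
    """Expected semantics: once data is queued, poll() must report POLLIN
    immediately, with a 0 ms timeout. This is what vsock appears to violate."""
    a, b = socket.socketpair()           # AF_UNIX pair as a stand-in transport
    try:
        p = select.poll()
        p.register(b.fileno(), select.POLLIN)
        a.sendall(data)                  # enqueue data into b's receive buffer
        events = p.poll(0)               # 0 ms timeout: no waiting allowed
        return any(ev & select.POLLIN for _, ev in events)
    finally:
        a.close()
        b.close()

if __name__ == "__main__":
    print(poll_reports_data())           # expected: True
```

The same check pointed at a vsock fd during a slow connection should come back empty even though a subsequent recvmsg() succeeds.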

* 30-second summary from bpftrace

 Total ppoll calls: 488
 Timeouts (20ms pattern): 103
 Successful recvmsg: 6 (984 bytes)
 Timeout rate: 21%
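The timeout count above can be rederived from the raw trace lines with a quick parse. This is a hypothetical helper matching the log format the bpftrace script prints (it only sees the >10 ms calls that get printed, not all 488 ppoll calls):

```python
import re

# Matches lines like: "[   29 ms] sshd-session: ppoll() duration=19 ms ret=1"
LINE = re.compile(r"ppoll\(\) duration=(\d+) ms ret=(\d+)")

def summarize(lines):
    """Count printed ppoll lines and 18-25 ms 'timeout pattern' hits."""
    total = timeouts = 0
    for line in lines:
        m = LINE.search(line)
        if not m:
            continue
        total += 1
        if 18 <= int(m.group(1)) <= 25:   # same window the bpftrace script uses
            timeouts += 1
    return total, timeouts

sample = [
    "[   29 ms] sshd-session: ppoll() duration=19 ms ret=1",
    "[   50 ms] sshd-session: ppoll() duration=20 ms ret=1",
    "[   70 ms] sshd-session: ppoll() duration=18 ms ret=1",
    "[   71 ms] sshd-session: ppoll() duration=11 ms ret=1",
]
print(summarize(sample))  # → (4, 3)
```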

* Why PTY-specific?

PTY sessions require bidirectional traffic:
1. Server sends shell prompt → client must receive it
2. Client sends keypress → server must receive it
3. Server sends echo → client must receive it

Each exchange relies on poll() waking on POLLIN. The bug causes poll()
to miss the wakeup, forcing sshd to wait for its 20ms timeout fallback.

Non-PTY commands complete a single request-response-exit cycle and finish
before the bug has a chance to manifest significantly.

* Additional context

I previously encountered the identical issue on WSL2's Hyper-V vsock
implementation, suggesting this may be a fundamental issue with how
vsock transports handle poll/wakeup semantics, not specific to virtio.

* Hypothesis

Based on the evidence, this appears to be a lost wakeup race condition:
1. Host sends packet to guest
2. Packet is enqueued to socket's rx_queue
3. sk_data_ready() is called but poll waiters aren't properly woken
4. vsock_poll() returns 0 (no POLLIN) despite data being available
5. ppoll() times out after 20ms, sshd retries
6. Eventually succeeds through timeout-based retry

The "first connection works" pattern suggests the race involves state left
over from previous connections: possibly worker threads, interrupt handlers,
or virtqueue state that isn't properly reset.
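A single-threaded toy model of that ordering (pure simulation, no vsock involved): if the wakeup fires between the queue check and the waiter arming itself, the wakeup is lost and the only recovery is the 20 ms timeout; re-checking the queue after arming avoids the stall.

```python
import threading
import time

def wait_for_data(rx, wakeup, deliver, recheck_after_arming):
    """Toy model of the hypothesized race. deliver() is invoked synchronously
    at the worst possible moment, so the run is deterministic."""
    ready = bool(rx)                # 1. poll samples the receive queue: empty
    deliver()                       # 2. packet enqueued; sk_data_ready() fires NOW
    wakeup.clear()                  # 3. waiter arms its waitqueue too late
    if not ready:
        if recheck_after_arming and rx:
            return 0.0              # fixed ordering: re-check queue after arming
        t0 = time.monotonic()
        wakeup.wait(timeout=0.02)   # lost wakeup: sleeps the full 20 ms fallback
        return time.monotonic() - t0
    return 0.0

def run(recheck):
    rx, wakeup = [], threading.Event()
    def deliver():
        rx.append(b"pkt")
        wakeup.set()                # wakeup issued before the waiter armed
    return wait_for_data(rx, wakeup, deliver, recheck)
```

run(False) burns the full 20 ms timeout; run(True) returns immediately. This is only an illustration of the ordering, not a claim about where in vsock_poll()/virtio_transport_recv_pkt() the window actually is.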

* Reproducer

1. Start QEMU VM with vhost-vsock-pci device
2. Boot guest, ensure sshd is running
3. From host: ssh user@vsock/<CID>  # First connection is fast
4. Exit and reconnect: ssh user@vsock/<CID>  # Now slow
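Toward Stefano's request for an ssh-free reproducer, here is an untested sketch. Assumptions: Python >= 3.7 with socket.AF_VSOCK on both ends, guest CID 5 as in the QEMU command above, and port 1234 chosen arbitrarily:

```python
import socket
import time

PORT = 1234  # arbitrary

def guest_echo_server():
    """Run inside the guest: echo every chunk back to the host."""
    srv = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
    srv.bind((socket.VMADDR_CID_ANY, PORT))
    srv.listen(1)
    while True:
        conn, _ = srv.accept()
        with conn:
            while chunk := conn.recv(4096):
                conn.sendall(chunk)

def host_ping(cid=5, rounds=50):
    """Run on the host: time small request/response round trips,
    mimicking the PTY keypress/echo traffic pattern."""
    s = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
    s.connect((cid, PORT))
    with s:
        for i in range(rounds):
            t0 = time.monotonic()
            s.sendall(b"x")
            s.recv(1)
            rtt_ms = (time.monotonic() - t0) * 1000
            print(f"round {i}: {rtt_ms:.1f} ms")  # ~20 ms rounds = the bug
```

If the same state is involved, running host_ping() once, closing, and running it again on a fresh connection should mirror the fast-then-slow pattern seen with ssh.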

* Request

Could someone familiar with the vsock/virtio poll implementation
review the wakeup path? Specifically:
- virtio_transport_recv_pkt() -> sk_data_ready() path
- vsock_poll() -> poll_wait() registration timing
- Any state that persists between connections

Happy to provide additional traces or test patches.

Thanks,
[Your Name]

---
bpftrace script used (runs on guest):

#!/usr/bin/env bpftrace
BEGIN {
   @start = nsecs;
   printf("=== MINIMAL VSOCK DIAGNOSTIC ===\n");
}
tracepoint:syscalls:sys_enter_ppoll {
   if (comm == "sshd-session" || comm == "sshd") {
       @ppoll_enter[tid] = nsecs;
       @ppoll_count++;
   }
}
tracepoint:syscalls:sys_exit_ppoll {
   if (@ppoll_enter[tid]) {
       $ms = (nsecs - @start) / 1000000;
       $dur = (nsecs - @ppoll_enter[tid]) / 1000000;
       if ($dur > 10) {
           printf("[%5lld ms] %s: ppoll() duration=%lld ms ret=%d\n",
                  $ms, comm, $dur, args->ret);
           if ($dur >= 18 && $dur <= 25) {
               printf("           ^^^ 20ms TIMEOUT pattern detected!\n");
               @timeout_count++;
           }
       }
       delete(@ppoll_enter[tid]);
   }
}
tracepoint:syscalls:sys_exit_recvmsg {
   if (comm == "sshd-session" || comm == "sshd") {
       if (args->ret > 0) {
           $ms = (nsecs - @start) / 1000000;
           printf("[%5lld ms] %s: recvmsg() = %lld bytes\n",
                  $ms, comm, args->ret);
           @recv_count++;
           @recv_bytes += args->ret;
       }
   }
}
interval:s:5 {
   printf("\n[%5lld ms] --- 5s stats: ppoll=%d, timeouts=%d, recv=%d (%d bytes) ---\n\n",
          (nsecs - @start) / 1000000, @ppoll_count, @timeout_count,
          @recv_count, @recv_bytes);
}
