Thanks for your help, Frank.

On Tue, Apr 21, 2009 at 7:57 PM, Frank Sorenson <[email protected]> wrote:
>
> The manpage for read(2) shows:
>   EAGAIN  Non-blocking I/O has been selected using O_NONBLOCK and no
>           data was immediately available for reading.
>
> Can you show us the output of:
>   readlink /proc/`pidof ssh-agent`/fd/160
> (change 160 to whatever fd is giving the EAGAIN)
>
> Or even just:
>   ls /proc/`pidof ssh-agent`/fd

How about both :)

r...@chub:~# ls /proc/29019/fd/
0    106  114  122  130  139  147  155  19   27   35   43   51   6    68   76   84   92
1    107  115  123  131  14   148  156  2    28   36   44   52   60   69   77   85   93
10   108  116  124  132  140  149  157  20   29   37   45   53   61   7    78   86   94
100  109  117  125  133  141  15   158  21   3    38   46   54   62   70   79   87   95
101  11   118  126  134  142  150  159  22   30   39   47   55   63   71   8    88   96
102  110  119  127  135  143  151  16   23   31   4    48   56   64   72   80   89   97
103  111  12   128  136  144  152  160  24   32   40   49   57   65   73   81   9    98
104  112  120  129  137  145  153  17   25   33   41   5    58   66   74   82   90   99
105  113  121  13   138  146  154  18   26   34   42   50   59   67   75   83   91

this is when the strace shows:

read(160, 0xbfe1452a, 1024) = -1 EAGAIN (Resource temporarily unavailable)
read(160, 0xbfe1452a, 1024) = -1 EAGAIN (Resource temporarily unavailable)

readlink for 160 shows:

r...@chub:~# readlink /proc/29019/fd/160
socket:[6380248]

I believe this should map to:

b...@chub:~$ netstat -anp | grep 6380248
unix  3  [ ]  STREAM  CONNECTED  6380248  -  /tmp/keyring-gNQ6hA/ssh

> With so many ssh connections, I'd be curious to see what your entropy
> pool looks like.  Do you have any remaining in
> /proc/sys/kernel/random/entropy_avail or has the pool been exhausted?

I have plenty of entropy available, it only goes down slightly during
the whole process.

Another clue to the puzzle.  I have 1300 or so machines in a DC in Hong
Kong, only available through a jump server in the same DC.  If I'm
running my agent on my local machine, through the jump server, and
connect to all the machines, connections time out, agent locks up, etc.
However, if I copy my keys to the jump box, and run the agent from
there, no connections fail, and all connections complete very quickly.
I assume that this is because connections open and close quickly enough
that whatever limit I'm hitting isn't reached (netstat snapshots every
second show around 200 max concurrent connections).

--Bob

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
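
For anyone who wants to repeat the two checks from this thread without
typing readlink by hand, here is a minimal sketch, assuming Linux with
/proc mounted and Python 3. The helper names socket_fds and
read_nonblocking are made up for illustration and are not part of
ssh-agent or any other tool. The first maps each fd of a process to its
socket inode (the number netstat prints); the second shows what the
EAGAIN in the strace means: a non-blocking read with no data waiting,
which is not an error by itself.

```python
import os
import re
import socket

# Matches the link target /proc/<pid>/fd/<n> reports for a socket.
SOCKET_LINK = re.compile(r"^socket:\[(\d+)\]$")


def socket_fds(pid):
    """Return {fd: inode} for every socket fd listed in /proc/<pid>/fd."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # fd was closed between listdir and readlink
        match = SOCKET_LINK.match(target)
        if match:
            result[int(name)] = int(match.group(1))
    return result


def read_nonblocking(sock, size=1024):
    """Return received bytes, or None if the read would block (EAGAIN)."""
    try:
        return sock.recv(size)
    except BlockingIOError:
        # Python raises BlockingIOError for EAGAIN/EWOULDBLOCK.
        return None


if __name__ == "__main__":
    a, b = socket.socketpair()      # connected AF_UNIX stream sockets
    a.setblocking(False)            # same effect as O_NONBLOCK on the fd
    print(read_nonblocking(a))      # nothing has been written yet
    b.sendall(b"ping")
    print(read_nonblocking(a))      # data is now available
    print(socket_fds(os.getpid()))  # includes the fds of a and b
    a.close()
    b.close()
```

Reading another user's /proc/<pid>/fd needs the same privileges as the
readlink above (root, or the owner of the process).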
