On Wed, 19 Dec 2001, Mathieu Nantel wrote:

> Good day,
>
> I've got a test install of Kerberos up and running. All of the tools seem
> to work, beside RCP, which is misbehaving between 2 of my hosts. When
> attempting "rcp file1 destination_server:", the command freezes and never
> terminates and the file doesn't get copied. However, when using truss to
> run it, I.E. : "truss rcp file1 destination_server:", it works perfectly.
>
> Now, I don't think truss changes the behavior of whatever you run under
> it.

- Yes, it does, particularly with subtle timing issues, which is what I
suspect this problem is. The Heisenberg principle applies to computers:
if you observe something, you change its behavior.

> I don't really want to write an unelegant wrapper script that will
> truss all the rcp commands.
>
> Anyone has an idea why I am seeing this behavior? The boxes are running
> Solaris 2.6, and I am using MIT Kerberos 1.2.2.
>

- I have heard vague references to a bug in Solaris select() in which
you can get a "false positive" (i.e. select() returns saying you have
something to read, but when you do the read() call you get 0 bytes).

- Running something under truss would probably hide this bug quite
well since the extra overhead would allow the buffer to fill by the
time you read it.

- We've had a similar unreproducible problem with rsh connections
between Linux clients and Solaris hosts. I've tweaked the reads in
krlogind.c to retry if they get 0. However, since I can't reproduce
the bug reliably, I have no idea whether that will fix the problem
or not.
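
- For illustration, a minimal sketch of that kind of retry. This is not
the actual krlogind.c change; the helper name read_with_retry, the
retry count, and the timeouts are all assumptions. It waits in select(),
reads, and treats a 0-byte read as a possible false positive a few
times before giving up and reporting it as end-of-file.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from krlogind.c): read from fd, but retry a
 * bounded number of times when read() returns 0 right after select()
 * reported the descriptor as readable.
 *
 * Returns the number of bytes read (> 0), 0 if nothing arrived after
 * all attempts (treat as end-of-file), or -1 on error with errno set.
 */
static ssize_t read_with_retry(int fd, void *buf, size_t len, int max_retries)
{
    int attempt;

    for (attempt = 0; attempt <= max_retries; attempt++) {
        fd_set rfds;
        struct timeval wait_tv;
        struct timeval pause_tv;
        ssize_t n;
        int ready;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        wait_tv.tv_sec = 0;
        wait_tv.tv_usec = 100000;       /* assumed: wait up to 100 ms */

        ready = select(fd + 1, &rfds, NULL, NULL, &wait_tv);
        if (ready < 0) {
            if (errno == EINTR)
                continue;               /* interrupted by a signal */
            return -1;
        }
        if (ready == 0)
            continue;                   /* timed out, nothing readable */

        n = read(fd, buf, len);
        if (n > 0)
            return n;                   /* real data */
        if (n < 0 && errno != EINTR)
            return -1;                  /* genuine read error */

        /* n == 0 (or EINTR): possibly a false positive. Pause briefly
         * so the buffer has a chance to fill, then try again. */
        pause_tv.tv_sec = 0;
        pause_tv.tv_usec = 10000;       /* assumed: 10 ms pause */
        select(0, NULL, NULL, NULL, &pause_tv);
    }
    return 0;
}
```

The retry has to be bounded because a 0-byte read is also how a real
end-of-file shows up; retrying forever would just turn the hang into a
busy loop once the peer has closed the connection.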

- Booker C. Bense
