Re: [networking-discuss] strange rcp issue with snv_92

zhihui Chen Tue, 15 Jul 2008 07:42:11 -0700

Using snoop, I have confirmed that the local machine sends the "dropped
connection" message to the other end, and the dtrace output shows that rcp
sends "dropped connection" with the following stack:


fd=4
      libc.so.1`write
      rcp`deswrite+0x81
      rcp`desrcpwrite+0x23
      rcp`error+0x50
      rcp`sink+0x669
      rcp`tolocal+0x3f3
      rcp`main+0x839
      rcp`_start+0x7a

After checking the source code of rcp.c, I found that this stack is
generated by the following code in the function sink():
                j = desrcpread(rem, cp, amt);
                if (j <= 0) {
                    int sverrno = errno;

                    /*
                     * Connection to supplier lost.
                     * Truncate file to correspond
                     * to amount already transferred.
                     *
                     * Note that we must call ftruncate()
                     * before any call to error() (which
                     * might result in a SIGPIPE and
                     * sudden death before we have a chance
                     * to correct the file's size).
                     */
                    size = lseek(ofd, 0, SEEK_CUR);
                    if ((ftruncate(ofd, size)  == -1) &&
                        (errno != EINVAL) &&
                        (errno != EACCES))
#define     TRUNCERR    "rcp: can't truncate %s: %s\n"
                        error(TRUNCERR, np,
                            strerror(errno));
                    error("rcp: %s\n",
                        j ? strerror(sverrno) :
                        "dropped connection");
                    (void) close(ofd);
                    exit(1);

The following truss output shows that the last read tried to read 1132
bytes but returned 0 bytes. Also, truss shows the buffer for the last read
as the address 0x080D6844, not as content like the other reads. I am not
sure what caused this.

brk(0x080D71D8)                                 = 0
read(4, "\0\0\0\0\0\0\0\0\0\0\0\0".., 65536)    = 17520
read(4, "\0\0\0\0\0\0\0\0\0\0\0\0".., 48016)    = 7300
read(4, "\0\0\0\0\0\0\0\0\0\0\0\0".., 40716)    = 8760
read(4, "\0\0\0\0\0\0\0\0\0\0\0\0".., 31956)    = 10220
read(4, "\0\0\0\0\0\0\0\0\0\0\0\0".., 21736)    = 11680
read(4, "\0\0\0\0\0\0\0\0\0\0\0\0".., 10056)    = 7464
read(4, "\0\0\0\0\0\0\0\0\0\0\0\0".., 2592)     = 1460
read(4, 0x080D6844, 1132)                       = 0
llseek(5, 0, SEEK_CUR)                          = 0
fcntl(5, F_FREESP64, 0x08027B44)                = 0
write(4, "01 r c p :   d r o p p e".., 25)      = 25
rcp: dropped connection
write(2, " r c p :   d r o p p e d".., 24)      = 24
close(5)                                        = 0
_exit(1)

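For reference, each request size in the trace above (65536, 48016, 40716,
...) is the previous request minus the bytes returned, so the reader is
filling one 64K block, and the final read(4, ..., 1132) = 0 means
end-of-stream arrived with 1132 bytes of that block still outstanding. Below
is a minimal sketch of such a loop. It is not the actual rcp.c code;
read_full and its eof flag are names made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from rcp.c): keep calling read() until
 * `amt` bytes have arrived, the peer closes the stream, or an error occurs.
 *
 * Returns the number of bytes actually stored in buf, or -1 on error.
 * *eof is set to 1 if read() returned 0 (end of stream) before `amt`
 * bytes were received, otherwise 0.
 */
ssize_t
read_full(int fd, char *buf, size_t amt, int *eof)
{
    size_t got = 0;

    assert(buf != NULL && eof != NULL);
    *eof = 0;
    while (got < amt) {
        /* Ask only for what is still missing, as in the truss trace. */
        ssize_t j = read(fd, buf + got, amt - got);

        if (j < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return (-1);        /* real error: errno says why */
        }
        if (j == 0) {
            *eof = 1;           /* peer closed: the "dropped connection" case */
            break;
        }
        got += (size_t)j;       /* short read: loop for the rest */
    }
    return ((ssize_t)got);
}
```

With a pipe whose writer sends 10 bytes and then closes, a call such as
read_full(fd, buf, 20, &eof) returns 10 with eof set to 1. That is the same
situation in which sink() sees j == 0 and reports "dropped connection".
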

2008/7/15, Brian Utterback <[EMAIL PROTECTED]>:
>
> I just re-read what I wrote, and I meant to say "when one was *not*
> received (unlikely)" Oops.
>
> Anyway, the datastream is getting an EOF; that's what the truss shows.  Why
> is the question on the table now, which leads to who closed the datastream?
> The snoop will show that. Be sure to snoop the client end first and look for
> a FIN packet.
>
> zhihui Chen wrote:
>
>> Also, I have tried the test on this machine (with snv_92) to many
>> different systems (including snv_67, snv_91, s10u5); this problem always
>> exists.
>>
>> 2008/7/15, zhihui Chen <[EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]>>:
>>
>>    I can try snoop to check. But I don't think this problem is caused by
>>    the other end. The other end is a machine with S10U5. I have
>>    tried this test on other machines with snv_91 or a previous build, with
>>    no problem.
>>        2008/7/15, Brian Utterback <[EMAIL PROTECTED]
>>    <mailto:[EMAIL PROTECTED]>>:
>>
>>        I don't know what is going on, but to me it looks like the
>>        problem is at the other end. This shows that rcp is getting a
>>        premature EOF indication on the data stream. Either the local OS
>>        is erroneously delivering an EOF when one was received
>>        (unlikely) or the other end sent it. You could use snoop to
>>        check which it is. You could use truss and snoop on the other
>>        end to help you determine why. A snoop would confirm that the
>>        data stream was actually closed by the other host, since
>>        firewalls have been known to do this kind of thing.
>>
>>
>>        zhihui Chen wrote:
>>
>>            Hello all, I have met a strange rcp issue with snv_92. When
>>            I copy a file from a remote machine to the local one through
>>            rcp, the result depends on the file size. If the file size
>>            is <= 8k, then rcp is OK, like the following:
>>            intel6# rcp irperf:`pwd`/test8k .
>>            intel6# ls -l test8k
>>            -rw-r--r--   1 root     root        8192 Jul 14 23:50 test8k
>>            If the file size is > 8k, then rcp does not work, like the
>>            following:
>>
>>            intel6# rcp irperf:`pwd`/test10k .
>>            rcp: dropped connection
>>            intel6# ls -l test10k
>>            -rw-r--r--   1 root     root           0 Jul 14 23:51 test10k
>>
>>            But if I add "truss" before rcp, then rcp works, like the
>>            following:
>>            intel6# truss rcp irperf:`pwd`/test10k .
>>            .....
>>            read(4, "\0\0\0\0\0\0\0\0\0\0\0\0".., 10240)    = 7300
>>            read(4, "\0\0\0\0\0\0\0\0\0\0\0\0".., 2940)     = 2920
>>            read(4, "\0\0\0\0\0\0\0\0\0\0\0\0".., 20)       = 20
>>            write(5, "\0\0\0\0\0\0\0\0\0\0\0\0".., 10240)   = 10240
>>            fcntl(5, F_FREESP64, 0x08027B44)                = 0
>>            close(5)                                        = 0
>>            read(4, "\0", 1)                                = 1
>>            write(4, "\0", 1)                               = 1
>>            read(4, 0x08027C30, 1)                          = 0
>>            close(4)                                        = 0
>>            _exit(0)
>>            intel6# ls -l test10k
>>            -rw-r--r--   1 root     root       10240 Jul 14 23:53 test10k
>>
>>            If the file becomes larger, then "truss rcp" does not
>>            work either, like the following:
>>            intel6# truss rcp irperf:`pwd`/test100k .
>>            .......
>>            read(4, "\0\0\0\0\0\0\0\0\0\0\0\0".., 65536)    = 13140
>>            read(4, "\0\0\0\0\0\0\0\0\0\0\0\0".., 52396)    = 4380
>>            read(4, "\0\0\0\0\0\0\0\0\0\0\0\0".., 48016)    = 7300
>>            read(4, "\0\0\0\0\0\0\0\0\0\0\0\0".., 40716)    = 8760
>>            read(4, "\0\0\0\0\0\0\0\0\0\0\0\0".., 31956)    = 10220
>>            read(4, "\0\0\0\0\0\0\0\0\0\0\0\0".., 21736)    = 11680
>>            read(4, "\0\0\0\0\0\0\0\0\0\0\0\0".., 10056)    = 1624
>>            read(4, "\0\0\0\0\0\0\0\0\0\0\0\0".., 8432)     = 7300
>>            read(4, "\0\0\0\0\0\0\0\0\0\0\0\0".., 1132)     = 328
>>            read(4, 0x080D698C, 804)                        = 0
>>            llseek(5, 0, SEEK_CUR)                          = 0
>>            fcntl(5, F_FREESP64, 0x08027B44)                = 0
>>            write(4, "01 r c p :   d r o p p e".., 25)      = 25
>>            rcp: dropped connection
>>            write(2, " r c p :   d r o p p e d".., 24)      = 24
>>            close(5)                                        = 0
>>            _exit(1)
>>            intel6# ls -l test100k
>>            -rw-r--r--   1 root     root           0 Jul 14 23:45 test100k
>>            Has anyone met a similar issue, and how can it be solved?
>>             -----
>>            zhihui
>>            Intel OpenSolaris Team
>>
>>
>>
>>  ------------------------------------------------------------------------
>>
>>            _______________________________________________
>>            networking-discuss mailing list
>>            [email protected]
>>            <mailto:[email protected]>
>>
>>
>>        --        blu
>>
>>        There are two rules in life:
>>        Rule 1- Don't tell people everything you know
>>
>>  ----------------------------------------------------------------------
>>        Brian Utterback - Solaris RPE, Sun Microsystems, Inc.
>>        Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom
>>
>>
>>
>>
>>    --    zhihui
>>    Intel OpenSolaris Team
>>
>>
>>
>> --
>> zhihui
>> Intel OpenSolaris Team
>>
>
> --
> blu
>
> There are two rules in life:
> Rule 1- Don't tell people everything you know
> ----------------------------------------------------------------------
> Brian Utterback - Solaris RPE, Sun Microsystems, Inc.
> Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom
>



-- 
zhihui
Intel OpenSolaris Team

_______________________________________________
networking-discuss mailing list
[email protected]
