I apologize, I'm not a DTrace wizard (yet). So it's probably worth my asking:
do you have a DTrace command you would recommend to track down this problem?

Brian Utterback wrote:
> Hmm. Sounds like something is not flushing all the buffers. In that case, 
> I would suggest using dtrace instead of truss. It shouldn't perturb the 
> timing much.
> 
> Daniel Morrison wrote:
>> You're correct - I am in 2 of the '3' cases.
>>
>> Running the truss->tst under BOTH cron and manually - works fine.
>>
>> If I run the code directly from cron->tst (not cron->truss->tst) - I 
>> fail to get the output. (see plain tst.log)
>>
>> Since I get data cron->truss->tst - I have no way of duplicating the 
>> failure with a truss log - because it always works running under truss.
>>
>> It's ONLY the case where I run the code from cron->tst directly.
>>
>> I've spent a week on this... going crazy in the process ;-)
>>
>> Is cron doing something 'different' that would mangle the use of 
>> stdin/stdout ?
>>
>> Thanks again...
>>
>> Brian Utterback wrote:
>>> Okay, having looked at the files, I am afraid I don't understand the 
>>> problem. It looks to me like you are getting the same data written to 
>>> the log file in both cases. Can you please explain what is actually 
>>> wrong here?
>>>
>>> Daniel Morrison wrote:
>>>> Brian,
>>>>
>>>> Thanks for your response. I have taken the time to carefully 
>>>> document the problem, which hopefully is simple to follow and will help...
>>>>
>>>> I have tried to make everything as simple as possible!!
>>>>
>>>> This is the return I am expecting... On the remote server the ls 
>>>> shows (as user w001)
>>>> $ ls -1
>>>> Mapper
>>>> admin
>>>> bin
>>>> common
>>>> dev
>>>> docs
>>>> etc
>>>> help
>>>> lib
>>>> log
>>>> net
>>>> nfs
>>>> nis
>>>> opt
>>>> proc
>>>> tmp
>>>> usr
>>>> var
>>>>
>>>> I have attached tst.zip, which contains:
>>>> -rwxr-xr-x   1 dem      usr         9784 Feb  5 12:21 tst
>>>> -rw-rw-rw-   1 dem      usr         1173 Feb  5 12:21 tst.c
>>>> -rw-r--r--   1 root     root          52 Feb  5 12:44 tst.log
>>>> -rw-r--r--   1 root     root          52 Feb  5 12:34 tst.logCRON
>>>> -rw-r--r--   1 root     root         189 Feb  5 12:36 tst.logTRUSS_CRON
>>>> -rw-r--r--   1 root     root         189 Feb  5 12:37 tst.logTRUSS_MANUAL
>>>> -rw-r--r--   1 dem      usr         4736 Feb  5 12:21 tst.o
>>>> -rwxrwxr--   1 dem      usr           19 Feb  5 12:45 tst.sh
>>>> -rwxrwxrwx   1 dem      usr          212 Feb  5 11:26 tst_make.sh
>>>> -rw-r--r--   1 root     root       69620 Feb  5 12:36 tst_truss.outCRON
>>>> -rw-r--r--   1 root     root       70773 Feb  5 12:37 tst_truss.outMANUAL
>>>> -rwxrwxr--   1 dem      usr           56 Feb  5 11:23 tst_truss.sh
>>>>
>>>> The cron entry is:
>>>> 0,10,20,30,40,50 7-20 * * * /tmp/tst.sh
>>>>
>>>> To test with truss - I just substituted tst_truss.sh for tst.sh
>>>>
>>>> Basically...
>>>>
>>>> - running tst.sh from cron = fails to capture the remote ls
>>>> - running tst.sh from root (with the default minimum env) - works fine
>>>> - running tst_truss.sh from cron works also
>>>>
>>>> Other than it being run from cron - there is no other difference.
>>>>
>>>>
>>>> *** SIDE NOTE: I found popen2() source on the net. I substituted the 
>>>> command and an fdopen() - and it works fine under cron.
>>>>
>>>>
>>>> Any other questions/comments - I will be happy to help!
>>>>
>>>> Thanks for looking at this.
>>>>
>>>> Thanks,
>>>> Dan
>>>>
>>>>
>>>>
>>>> Brian Utterback wrote:
>>>>> I think you need to show a little more of the truss. If I am 
>>>>> interpreting this correctly, process 15957 is tst, 15982 is the 
>>>>> reading process of rsh, and 15992 is the writing process of rsh. 
>>>>> So, what I see 15992 doing is trying to read from the stdin from 
>>>>> the popen in tst, but since popen had a mode of "r", there is no 
>>>>> stdin, so it got an EOF (i.e. 0) on the first read. Since there is 
>>>>> nothing to send, it then called shutdown() to shut down the outgoing 
>>>>> half of the connection.
>>>>>
>>>>> The process 15982 got a buffer of data from the remote system, and 
>>>>> then wrote it to stdout, as it should. Shortly after that, it got 
>>>>> an EOF on the stdout from the remote connection, so it killed the 
>>>>> write process and exited, as it should.
>>>>>
>>>>> We also see process 15957 read the buffer of data. Since that is 
>>>>> where the truss ends, there is no way to tell what it did with it 
>>>>> after that.
>>>>>
>>>>> So, either the tst process got the buffer but then didn't print it, 
>>>>> or 15957 was not the tst process. It might be the su process. In 
>>>>> any case, I can't tell from just a snippet of the code and a 
>>>>> snippet of the truss.
>>>>>
>>>>> Daniel Morrison wrote:
>>>>>> solaris 10 Sparc (10/08) Studio 12 compiler
>>>>>>
>>>>>> tst.c (snippet)
>>>>>>
>>>>>> strcpy(syscmd, "/sbin/su w001 -c \"/usr/bin/rsh machB /usr/bin/ls\"");
>>>>>> pptr = popen(syscmd, "r");
>>>>>> while (fgets(line,sizeof(line),pptr) != NULLP)
>>>>>> fprintf(logptr, "%s\n", line);
>>>>>> pclose(pptr);
>>>>>>
>>>>>> - When tst is run from root - (w/no env) works fine
>>>>>> - When tst is run from background - (w/no env) works fine
>>>>>> - When tst is run from cron - fails to read any data
>>>>>>
>>>>>> created machB:/tmp/tst.sh to test su & rsh...
>>>>>> #!/bin/sh
>>>>>> /usr/bin/ls > /tmp/ls.capture
>>>>>>
>>>>>> - substituted /usr/bin/ls with /tmp/tst.sh in rsh (above)
>>>>>> - output to log fine, so, there isn't anything wrong with env or 
>>>>>> permissions, etc
>>>>>>
>>>>>> next...
>>>>>> - ran truss on tst from root
>>>>>> - ran truss on tst from cron
>>>>>>
>>>>>> Listings are essentially the same, except ENOTTY in various places
>>>>>>
>>>>>> But - the KEY is - the truss (from cron) shows that a buffer of data 
>>>>>> DOES come back from the remote 'ls' command - but the socket is 
>>>>>> shut down before the buffer can be read!
>>>>>>
>>>>>> 15992: read(0, 0xFFBF35DC, 51200) = 0
>>>>>> 15982: read(4, " M a p p e r\n a d m i n".., 51200) = 83
>>>>>> 15982: write(1, " M a p p e r\n a d m i n".., 83) = 83
>>>>>> 15992: shutdown(4, SHUT_WR, SOV_DEFAULT) = 0
>>>>>> 15957: read(3, " M a p p e r\n a d m i n".., 5120) = 83
>>>>>> 15982: pollsys(0xFFBF3398, 1, 0x00000000, 0x00000000) = 1
>>>>>> 15982: read(4, 0xFFBF35DC, 51200) = 0
>>>>>> 15982: kill(15992, SIGKILL) = 0
>>>>>> 15982: _exit(0)
>>>>>>
>>>>>> Mapper\nadmin\n... is the CORRECT buffer of file names on the 
>>>>>> remote directory. So everything succeeds - except being able to 
>>>>>> 'read' the contents of the buffer.
>>>>>>
>>>>>> I see the write(1, ...) which should be a write to stdout, but it 
>>>>>> is followed by the shutdown().
>>>>>>
>>>>>> I have tried a fread() - instead of fgets(), but no difference.
>>>>>>
>>>>>> So - the question is - how do I prevent the socket from closing early 
>>>>>> in this scenario? I've tried everything I can think of. (I have 
>>>>>> tried the -n option for rsh, no joy).
>>>>>>
>>>>>> Appreciate any insight into what is triggering this problem.
>>>>>>
>>>>>> Thx!
>>>>>> Dan
>>>>>
>>>
> 
_______________________________________________
opensolaris-code mailing list
opensolaris-code@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/opensolaris-code
