I apologize, I'm not a DTrace wizard (yet), so it's probably worth my asking: do you have a DTrace command you would recommend to track down this problem?
Brian Utterback wrote:
> Hmm. Sounds like something is not flushing all the buffers. In that case,
> I would suggest using dtrace instead of truss. It shouldn't perturb the
> timing much.
>
> Daniel Morrison wrote:
>> You're correct - I am in 2 of the '3' cases.
>>
>> Running the truss->tst under BOTH cron and manually - works fine.
>>
>> If I run the code directly from cron->tst (not cron->truss->tst) - I
>> fail to get the output. (see plain tst.log)
>>
>> Since I get data cron->truss->tst - I have no way of duplicating the
>> failure with a truss log - because it always works running under truss.
>>
>> It's ONLY the case where I run the code from cron->tst directly.
>>
>> I've spent a week on this... going crazy in the process ;-)
>>
>> Is cron doing something 'different' that would mangle the use of
>> stdin/stdout?
>>
>> Thanks again...
>>
>> Brian Utterback wrote:
>>> Okay, having looked at the files, I am afraid I don't understand the
>>> problem. It looks to me like you are getting the same data written to
>>> the log file in both cases. Can you please explain what is actually
>>> wrong here?
>>>
>>> Daniel Morrison wrote:
>>>> Brian,
>>>>
>>>> Thanks for your response. I have taken the time to carefully
>>>> document this, which hopefully is simple to follow and will help...
>>>>
>>>> I have tried to make everything as simple as possible!!
>>>>
>>>> This is the return I am expecting...
>>>> On the remote server the ls shows (as user w001)
>>>> $ ls -1
>>>> Mapper
>>>> admin
>>>> bin
>>>> common
>>>> dev
>>>> docs
>>>> etc
>>>> help
>>>> lib
>>>> log
>>>> net
>>>> nfs
>>>> nis
>>>> opt
>>>> proc
>>>> tmp
>>>> usr
>>>> var
>>>>
>>>> I have attached tst.zip, which contains:
>>>> -rwxr-xr-x 1 dem  usr   9784 Feb 5 12:21 tst
>>>> -rw-rw-rw- 1 dem  usr   1173 Feb 5 12:21 tst.c
>>>> -rw-r--r-- 1 root root    52 Feb 5 12:44 tst.log
>>>> -rw-r--r-- 1 root root    52 Feb 5 12:34 tst.logCRON
>>>> -rw-r--r-- 1 root root   189 Feb 5 12:36 tst.logTRUSS_CRON
>>>> -rw-r--r-- 1 root root   189 Feb 5 12:37 tst.logTRUSS_MANUAL
>>>> -rw-r--r-- 1 dem  usr   4736 Feb 5 12:21 tst.o
>>>> -rwxrwxr-- 1 dem  usr     19 Feb 5 12:45 tst.sh
>>>> -rwxrwxrwx 1 dem  usr    212 Feb 5 11:26 tst_make.sh
>>>> -rw-r--r-- 1 root root 69620 Feb 5 12:36 tst_truss.outCRON
>>>> -rw-r--r-- 1 root root 70773 Feb 5 12:37 tst_truss.outMANUAL
>>>> -rwxrwxr-- 1 dem  usr     56 Feb 5 11:23 tst_truss.sh
>>>>
>>>> The cron entry is:
>>>> 0,10,20,30,40,50 7-20 * * * /tmp/tst.sh
>>>>
>>>> To test with truss - I just substituted tst_truss.sh for tst.sh
>>>>
>>>> Basically...
>>>>
>>>> - running tst.sh from cron = fails to capture the remote ls
>>>> - running tst.sh from root (with the default minimum env) - works fine
>>>> - running tst_truss.sh from cron works also
>>>>
>>>> Other than it being run from cron - there is no other difference.
>>>>
>>>>
>>>> *** SIDE NOTE: I found popen2() source on the net. I substituted the
>>>> command and an fdopen() - and it works fine under cron.
>>>>
>>>>
>>>> Any other questions/comments - I will be happy to help!
>>>>
>>>> Thanks for looking at this.
>>>>
>>>> Thanks,
>>>> Dan
>>>>
>>>>
>>>>
>>>> Brian Utterback wrote:
>>>>> I think you need to show a little more of the truss. If I am
>>>>> interpreting this correctly, process 15957 is tst, 15982 is the
>>>>> reading process of rsh, and 15992 is the writing process of rsh.
>>>>> So, what I see 15992 doing is trying to read from the stdin from
>>>>> the popen in tst, but since popen had a mode of "r", there is no
>>>>> stdin, so it got an EOF (i.e. 0) on the first read. Since there is
>>>>> nothing to send, it then called shutdown to shut down the outgoing
>>>>> half of the connection.
>>>>>
>>>>> The process 15982 got a buffer of data from the remote system, and
>>>>> then wrote it to stdout, as it should. Shortly after that, it got
>>>>> an EOF on the stdout from the remote connection, so it killed the
>>>>> write process and exited, as it should.
>>>>>
>>>>> We also see process 15957 read the buffer of data. Since that is
>>>>> where the truss ends, there is no way to tell what it did with it
>>>>> after that.
>>>>>
>>>>> So, either the tst process got the buffer but then didn't print it,
>>>>> or 15957 was not the tst process. It might be the su process. In
>>>>> any case, I can't tell from just a snippet of the code and a
>>>>> snippet of the truss.
>>>>>
>>>>> Daniel Morrison wrote:
>>>>>> solaris 10 Sparc (10/08) Studio 12 compiler
>>>>>>
>>>>>> tst.c (snippet)
>>>>>>
>>>>>> strcpy(syscmd, "/sbin/su w001 -c \"/usr/bin/rsh machB /usr/bin/ls\"");
>>>>>> pptr = popen(syscmd, "r");
>>>>>> while (fgets(line,sizeof(line),pptr) != NULLP)
>>>>>>     fprintf(logptr, "%s\n", line);
>>>>>> pclose(pptr);
>>>>>>
>>>>>> - When tst is run from root - (w/no env) works fine
>>>>>> - When tst is run from background - (w/no env) works fine
>>>>>> - When tst is run from cron - fails to read any data
>>>>>>
>>>>>> created machB:/tmp/tst.sh to test su & rsh...
>>>>>> #!/bin/sh
>>>>>> /usr/bin/ls > /tmp/ls.capture
>>>>>>
>>>>>> - substituted /usr/bin/ls with /tmp/tst.sh in rsh (above)
>>>>>> - output to log fine, so, there isn't anything wrong with env or
>>>>>> permissions, etc
>>>>>>
>>>>>> next...
>>>>>> - ran truss on tst from root
>>>>>> - ran truss on tst from cron
>>>>>>
>>>>>> Listings are essentially the same, except ENOTTY in various places
>>>>>>
>>>>>> But - the KEY is - the truss (from cron) shows a buffer of data
>>>>>> coming back from the remote 'ls' command WORKS - but the socket is
>>>>>> shutdown before the buffer can be read!
>>>>>>
>>>>>> 15992: read(0, 0xFFBF35DC, 51200) = 0
>>>>>> 15982: read(4, " M a p p e r\n a d m i n".., 51200) = 83
>>>>>> 15982: write(1, " M a p p e r\n a d m i n".., 83) = 83
>>>>>> 15992: shutdown(4, SHUT_WR, SOV_DEFAULT) = 0
>>>>>> 15957: read(3, " M a p p e r\n a d m i n".., 5120) = 83
>>>>>> 15982: pollsys(0xFFBF3398, 1, 0x00000000, 0x00000000) = 1
>>>>>> 15982: read(4, 0xFFBF35DC, 51200) = 0
>>>>>> 15982: kill(15992, SIGKILL) = 0
>>>>>> 15982: _exit(0)
>>>>>>
>>>>>> Mapper\nadmin\n... is the CORRECT buffer of file names on the
>>>>>> remote directory. So everything succeeds - except being able to
>>>>>> 'read' the contents of the buffer.
>>>>>>
>>>>>> I see the write(1, ...) which should be a write to stdout, but it
>>>>>> is followed by the shutdown().
>>>>>>
>>>>>> I have tried a fread() - instead of fgets(), but no difference.
>>>>>>
>>>>>> So - the question is - how do I prevent socket from closing early
>>>>>> in this scenario. I've tried everything I can think of. (I have
>>>>>> tried -n option for rsh, no joy).
>>>>>>
>>>>>> Appreciate any insight into what is triggering this problem.
>>>>>>
>>>>>> Thx!
>>>>>> Dan
>>>>>
>>>
>
_______________________________________________
opensolaris-code mailing list
opensolaris-code@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/opensolaris-code