[ksh93-integration-discuss] ksh93 issue - race condition between lseek and write -CR 7007665

sriman narayana bhavanam - Sun Microsystems - Bangalore India Thu, 03 Feb 2011 09:06:37 -0800

Dear ksh93 experts,

I am working on ksh93 CR 7007665 reported recently by dmittry(dmitry.mikhailiche...@oracle.com). Dmittry also did a lot of initialinvestigation and pointed us to the root cause of this issue. I would tolike to discuss with you the details of this CR. I appreciate any helpfor its resolution.

The root cause of the issue lies in the lack of synchronization amongparent/child ksh instances while performing I/O to file descriptors thatshare common file offset. The issue is becoming visible on systems withlarge number of processors. Specifically, in our case, it is happeningduring the execution of the following eval command in the shell script(source of the shell script is given at the bottom of this email).

eval "copy | copy" < 4496648b.execute.out >hydra_actual_filtered 2>&1

The expected result of this command is to copy the contents of"4496648b.execute.out" to "hydra_actual_filtered". But, in fewiteractions, the output produced in "hydra_actual_filtered" isdifferent from the content of input file.

Here is the detailed evaluation so far:

The ksh process processes the I/O redirection operators - >, < and 2>&1prior to processing the argument to eval command. After I/O redirectionoperators are processed - stdin, stdout and stderr of main ksh processare setup as below:


               0 - 4496648b.execute.out
               1 - hydra_actual_filtered
               2 - hydra_actual_filtered

Now, ksh evaluates the argument to eval and finds out that a pipelineneeds to be setup. So, it creates a pipe, spawns a child process toexecute the first command in the pipeline (The second command in thepipeline is executed by the parent ksh process itself).

Now, both parent and child do a series of dup & close operations toget the file descriptors set up as required for the pipeline.Finally, file descriptors are set up as below for the child andparent



               0 - 4496648b.execute.out                      < 0 - pipe

1 - pipe > 1 -hydra_actual_filtered2 - hydra_actual_filtered 2 -hydra_actual_filtered


               child                                        parent
               ------                                       --------

Among the above descriptors list, 2 of child and 1& 2 of parent sharethe same file offset.

Now, the child (as part of executing first instance of copy function)starts reading input from stdin (i.e. 4496648b.execute.out) and writesit to stdout (i.e. pipe). The parent reads whatever input it has gotover stdin (i.e. pipe ) and writes it to stdout (i.e.hydra_actual_filtered).

The problem is happening when the child is about to exit. As part ofexit, setbuf is called over each descriptor stream (_sfcleanup ->setbuf->sfsk->lseek). When setbuf is called for desccriptor 2 in child,there is an lseek attempted in that routine. This lseek by child overdesc 2 is overlapping with write by parent over desc 1 and it isresulting in a race condition



The following seems to be the sequence of events taking place :


Let us assume that file offset is 10 before the start of event 1

              event1 1 : child starts lseek over 2  and computes new offset
                          - lseek(2, 0, SEEK_CUR)
                             Here, new computed offet is 10

event 2 : parent initiates a write and writes 10 bytes to desc 1

event 3 : parent updates offset as part of write and write returns.The new offset after write is 20


              event 4 : child updates the computed offset and lseek returns
                               Here, new offet is 10

event 5 : parent initiates another write of 10 bytes.

In fact, write by parent in event 5 should start writing at offset 20.However, because of the intervening lseek, offset got changed and itstarts writing again at offset 10, thereby overwriting the previouscontents.

I think this is a general problem that can happen anywhere whenever wehave multiple processes with open descriptors sharing the file offset.To address this issue, it looks like we need to protect the entry/exitinto/out of I/O wrapper routines sfwr, sfrd, sfsk etc in libast. Itlooks like there is some skelton code implemented to handle thissynchronization. It can be seen from following lines in these routines.



      SFMTXDECL(f);

      SFMTXENTER(f,-1);

But, this is not fully implented. With vt_threaded defined as 0 invtthread.h, the sfmutex routine which is called from SFMUTEXENTER macrosimply returns 0 :


 vtthread.h :

   37 /* ast doesn't do threads yet */
   38 #if _PACKAGE_ast && !defined(vt_threaded)
   39 #define vt_threaded     0
   40 #endif

We are thinking that it may be a risky solution to implement mutexlocking with PTHREAD_PROCESS_SHARED attribute for interprocesssynchronization.Also thinking on the lines of avoding the lseek calls to avoid the racecondition.

We appreciate any help on the resolution of this issue. Please let meknow if any more information is required.

Looking forward for your inputs.


Regards,
-Sriman


Shell script
--------------
#!/bin/sh

gen_file() {

echo"1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901"echo"1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901"echo"1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901"echo"1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901"echo"1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901"echo"1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901"

}

copy() {
   while read line; do
       echo $line
   done
}

gen_file > hydra_pass_filtered
gen_file > 4496648b.execute.out

n=0
while [ $n -lt 1000 ]; do
   eval "copy | copy" < 4496648b.execute.out > hydra_actual_filtered 2>&1

diff hydra_pass_filtered hydra_actual_filtered > hydra_diff_output|| exit 5

   n=`expr $n + 1`
done



_______________________________________________
ksh93-integration-discuss mailing list
ksh93-integration-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/ksh93-integration-discuss

[ksh93-integration-discuss] ksh93 issue - race condition between lseek and write -CR 7007665

Reply via email to