Hi all,

I maintain a little package with some text and binary utilities here:

  https://sourceforge.net/projects/drmtools/

In the last release the "execinput" program was modified so that instead of running just one command at a time it can run N at a time. For example (a silly one: calculate the md5sum of every file in a directory and store each in an ".md5" file, using 20 parallel processes):

ls -1 | extract -fmt 'md5sum [1,] > [1,].md5 ' | execinput -t 20

This uses fork() and wait() and works fine on Linux.
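
The general shape of that approach, as a minimal sketch rather than the actual execinput code: run_limited and its parameters are made-up names, and handing each command to /bin/sh is an assumption made only for the example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not taken from execinput): run each command through
// /bin/sh while keeping at most max_procs children alive at once.
// Returns how many commands exited with status 0, or -1 if fork() fails.
static int run_limited(const char *const *commands, int count, int max_procs)
{
    int running = 0;
    int succeeded = 0;
    int status;

    for (int i = 0; i < count; i++) {
        // Every slot is busy: block until any one child exits.
        if (running >= max_procs && wait(&status) > 0) {
            running--;
            if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
                succeeded++;
        }

        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            return -1;
        }
        if (pid == 0) {
            // Child: replace this process with the shell running the command.
            execl("/bin/sh", "sh", "-c", commands[i], (char *)NULL);
            _exit(127);             // reached only if execl() itself failed
        }
        running++;
    }

    // No more input: collect the children that are still running.
    while (running > 0 && wait(&status) > 0) {
        running--;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            succeeded++;
    }
    return succeeded;
}
```

The parent only blocks in wait() when all max_procs slots are in use, so up to that many commands overlap.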

The problem is that Windows has no strict equivalent of those functions. Below is a first pass at getting something similar to work. It just reads from stdin and tries to create a thread in which it does a system() call. At the moment it has several issues, none of which I understand very well. It was compiled with 32-bit Mingw64 on a 32-bit Windows 7 system (if that is the right way to say that) with:
  gcc -g -O0 -o c_spawn_n c_spawn_n.c
I am not sure which release of Mingw64 this is, but it was installed around 8/29/2018.

For the tests described below it was given the task of running "DIR onefile" for each file and appending the results to a single file, like so:

 ls -1  | extract -fmt 'dir [1,] >> test_out.log' >many_dir.txt
 head -3 many_dir.txt
dir #c_spawn_n.c# >> test_out.log
dir accudate.1 >> test_out.log
dir accudate.c >> test_out.log
./c_spawn_n <many_dir.txt 2>oops3.txt
# The preceding was run in MSYS; in cmd.exe, run it without the leading "./"

Issues:


1.  Compiler warnings:

c_spawn_n.c: In function 'main':
c_spawn_n.c:47:13: warning: passing argument 3 of 'CreateThread' from incompatible pointer type [-Wincompatible-pointer-types]
             RunCommand,             // thread function name
             ^~~~~~~~~~
In file included from C:/progs/msys32/mingw32/i686-w64-mingw32/include/winbase.h:29:0,
                 from C:/progs/msys32/mingw32/i686-w64-mingw32/include/windows.h:70,
                 from c_spawn_n.c:7:
C:/progs/msys32/mingw32/i686-w64-mingw32/include/processthreadsapi.h:163:28: note: expected 'LPTHREAD_START_ROUTINE' but argument is of type 'DWORD (__attribute__((__stdcall__)) * (*)(void *))(void *)'
   WINBASEAPI HANDLE WINAPI CreateThread (LPSECURITY_ATTRIBUTES lpThreadAttributes, SIZE_T dwStackSize, LPTHREAD_START_ROUTINE lpStartAddress, LPVOID lpParameter, DWORD dwCreationFlags, LPDWORD lpThreadId);
                            ^~~~~~~~~~~~
c_spawn_n.c: In function 'RunCommand':
c_spawn_n.c:70:12: warning: return makes pointer from integer without a cast [-Wint-conversion]
     return (retval);
            ^

2.  When run, procs is shown as -1; I am not sure how that happens.
3.  I cannot tell whether this is actually starting multiple processes or just running them sequentially.
4.  There are now about 100 cmd.exe processes showing up in Windows Task Manager, so something is hanging somewhere. It was run both in "MSYS2 Mingw 32" and in a W7 cmd shell. These processes are not visible in "ps -ef" in the former.
5.  I am pretty sure I am not checking the status of the thread creation correctly, and not passing the system() exit status back correctly (or at all).
6.  During the test this message is printed a variable number of times:

The process cannot access the file because it is being used by another process.

I think this means that several processes appending to test_out.log at the same time (>> test_out.log) collide, and that there is some implicit file locking.
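
If the guess in item 6 is right, one possible workaround is for each command to write to its own file (for example "dir [1,] > [1,].log" in place of ">> test_out.log") and to merge the pieces after every command has exited. A minimal sketch of the merge step follows; append_file, merge_logs and the file names are hypothetical and not part of the test program.

```c
#include <stdio.h>

// Hypothetical helper: append the contents of src_path to the open stream dst.
// Returns 0 on success, -1 if src_path cannot be read or a write fails.
static int append_file(FILE *dst, const char *src_path)
{
    char buf[4096];
    size_t n;
    FILE *src = fopen(src_path, "rb");

    if (src == NULL)
        return -1;
    while ((n = fread(buf, 1, sizeof buf, src)) > 0) {
        if (fwrite(buf, 1, n, dst) != n) {
            fclose(src);
            return -1;
        }
    }
    int failed = ferror(src);
    fclose(src);
    return failed ? -1 : 0;
}

// Hypothetical helper: concatenate count per-command output files into
// dst_path, in the order given. Returns 0 on success, -1 on the first error.
static int merge_logs(const char *dst_path, const char *const *parts, int count)
{
    FILE *dst = fopen(dst_path, "wb");
    int rc = 0;

    if (dst == NULL)
        return -1;
    for (int i = 0; i < count && rc == 0; i++)
        rc = append_file(dst, parts[i]);
    if (fclose(dst) != 0)
        rc = -1;
    return rc;
}
```

Because only one process ever has the merged file open, the outcome does not depend on how the shell shares a file opened for appending.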


Here is the small test program's code:


// c_spawn_n.c
// TEST program to run up to 20 commands in parallel as subprocesses.
// Read commands from stdin.
//
//

#include <windows.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <process.h>
#include <unistd.h>

// prototypes
LPTHREAD_START_ROUTINE RunCommand( LPVOID lpParam ); // runs thread

int     max_threads=20;
int     procs=0;                    // Number of processes running

int main(int argc, char *argv[]){
#define MAXSTRING 256
   char *command = malloc(MAXSTRING);
   char **ThrCommands = calloc(sizeof(char *), max_threads);
   HANDLE *ThrHandles = (HANDLE *) calloc(sizeof(HANDLE),max_threads);
   DWORD  *ThreadIds = (DWORD *) calloc(sizeof(DWORD),max_threads);
   int   firstempty=0;
   int   procs=0;
   DWORD retVal;
   while(1){
        if(NULL == fgets(command,MAXSTRING,stdin))break;

        if(procs){
            firstempty = WaitForMultipleObjects(
                 max_threads, ThrHandles, FALSE, INFINITE);
                procs--;
                GetExitCodeThread(ThrHandles[firstempty], &retVal);
fprintf(stderr,"DEBUG thread %d exited with status %ul\n",
firstempty,retVal);
        }

fprintf(stderr,"DEBUG firstempty %d command >%s< len %d\n",
firstempty,command,strlen(command));
        ThrCommands[firstempty] = strdup(command); // save it
        ThrHandles[firstempty] = CreateThread(
            NULL,                   // default security attributes
            0,                      // use default stack size
            RunCommand,             // thread function name
            ThrCommands[firstempty],// argument to thread function
            0,                      // use default creation flags
            ThreadIds + firstempty); // returns the thread identifier
        procs++;
    }
    // wait for threads to exit
    while(procs){
            firstempty = WaitForMultipleObjects(
                max_threads, ThrHandles, FALSE, INFINITE);
                procs--;
                GetExitCodeThread(ThrHandles[firstempty], &retVal);
fprintf(stderr,"DEBUG CLEANUP thread %d exited with status %ul\n",
firstempty,retVal);
    }
}

// run a command, return its exit status
LPTHREAD_START_ROUTINE  RunCommand(  LPVOID lpParam  ){
        char *command = (char *) lpParam;
        int retval = system(command);
// output from next command never appears in console (DOS or MSYS)
//fprintf(stderr,"DEBUG RunCommand status %d with command %s\n",retval,command);
    return (retval);
}


Thanks,

David Mathog
Manager, Sequence Analysis Facility, Biology Division, Caltech

