The multi-processor package and IO redirection.

Edward d'Auvergne Mon, 19 Sep 2011 06:23:07 -0700

>> The second problem is IO redirection.  This occurs in a number of
>> places in relax.  These include:
>>
>> 1)  The --log command line flag which causes STDOUT and STDERR to be
>> sent to file.
>>
>> 2)  The --tee command line flag which causes STDOUT and STDERR to be
>> sent both to file and to the terminal.
>>
>> 3)  The test suite.  The STDOUT and STDERR streams are caught and only
>> sent to STDERR if there is an error or failure in a test.
>>
>> 4)  The relax controller (part of the GUI).  This is a window to which
>> STDOUT and STDERR are directed.  In the test-suite mode, the streams
>> also output to the terminal.
>>
>> 5)  The multi-processor package.  There are two parts.  The first is
>> essentially a pre-filter which prepends certain tokens to the IO
>> stream (i.e. the 'M  |', 'M  E|', and 'S 1|' text).  I cannot see how
>> we can do this as 4) is always set up after 5).  So I am considering
>> removing this part.  It will make it more difficult with debugging,
>> but I can see no other way.
>>
>> 6)  The second part for the multi-processor package, which is
>> currently non-functional, is the catching of the IO streams of the
>> slave processes to send back to the master.  I will try to mimic the
>> relax controller code here and store all slave text as a list with
>> flags specifying whether it is STDOUT or STDERR.  Then the list can be
>> returned to the master at which point the text can be sent to the two
>> streams.
>>
>> The problem is that at each point here, sys.stdout and sys.stderr are
>> replaced and the order in which this happens is impossible to change.
>> Well 4) will always be last.
>
> I think the problem here is that the whole idea of replacing the std io
> streams multiple times is inflexible and painful.


This is true.  Python, like many programming languages, does not handle
IO streams very nicely.  It would be good if you could set up some
processing pipe for the IO stream, but this looks very complicated to
implement.


> So I have come up with a
> strawman multiplexed IO implementation. The idea is that you replace stdout
> and stderr once and then insert IO elements to deal with the need to block
> the output of IO streams, record them, create tees, etc. What do you
> think? Should we open a couple of new mail threads to discuss these things?

Let's see how it goes.  But we may need a few new threads.  For your
code below, it looks like it could be the start of a solution.  The
question is, how do we design this so that the multi-processor package
is not dependent on relax?  Should the Multiplex_IO object be part of
relax, or the multi package?  I think that the main program should
have full control over the IO streams, and that the multi-processor
package should use what is available to it.  One problem here is that
a program could change the IO streams mid-operation, for example if I
were to implement a function in the GUI which activates logging.  A
Python program, as is the case with relax, could have many, many places
where IO streams are manipulated.  So it would be more beneficial to
have an IO stream manipulation framework within the main program.  The
multi-processor only needs to prepend text and capture slave IO
streams.
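
To make the "replace once, then plug elements in and out" idea
concrete, here is a minimal sketch.  It is not the Multiplex_IO code
itself and not anything currently in relax.  The StreamMux class and
the demo() function are hypothetical names, and the StringIO objects
only stand in for the terminal and a log file:

```python
import io
import sys


class StreamMux:
    """Hypothetical single replacement for sys.stdout (or sys.stderr)."""

    def __init__(self, *sinks):
        # A sink is any object with a write() method (terminal, file, buffer).
        self.sinks = list(sinks)

    def add(self, sink):
        self.sinks.append(sink)

    def remove(self, sink):
        self.sinks.remove(sink)

    def write(self, text):
        # Fan the text out to every sink currently attached.
        for sink in self.sinks:
            sink.write(text)

    def flush(self):
        for sink in self.sinks:
            if hasattr(sink, 'flush'):
                sink.flush()


def demo():
    terminal = io.StringIO()   # stands in for the real terminal
    log = io.StringIO()        # stands in for a --log or --tee file
    mux = StreamMux(terminal)

    old_stdout = sys.stdout
    sys.stdout = mux           # the stream is replaced exactly once
    try:
        print('before logging')
        mux.add(log)           # e.g. logging switched on mid-run from the GUI
        print('with logging')
        mux.remove(log)
        print('after logging')
    finally:
        sys.stdout = old_stdout
    return terminal.getvalue(), log.getvalue()
```

With something like this, the main program keeps sole ownership of
sys.stdout and sys.stderr, and turning logging on or off later only
adds or removes a sink rather than replacing the stream again.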

Maybe an alternative would be to capture and store the IO streams only
on the slaves?  This could be a fabric-specific implementation.  For
example the uni-processor would not touch the streams.  It would not
look as nice, but this would solve all the issues.  The slave IO
streams are captured, the 'S 1 |', 'S 2 |', etc. text would be
prepended, the text sent back to the master via the memo objects, and
then the master can send it to whatever sys.stdout and sys.stderr are
currently set to by the calling Python program.  As I mentioned
before, the slave's STDOUT and STDERR order can be preserved by
storing it all in a list whereby the IO stream is identified by a
flag, as is done in the relax controller GUI window.  For example if
you have in a slave:

sys.stdout.write('test\n')
sys.stderr.write('fail\n')

Then the object passed to the master via the memo could look like:

io = [
  ['S 1 | test\n', 0],
  ['S 1 | fail\n', 1]
]

Then on the master:

for line, stream in io:
    if stream == 0:
        sys.stdout.write(line)
    else:
        sys.stderr.write(line)

This means that the two slave streams are merged, and then split again
by the master, preserving their order.  What do you think the overhead
of such an operation would be?  My guess would be only a few
milliseconds per slave process, as most of the work would be on the
master.  I would guess that the overhead should be similar to the
current prependIO code.  Touching only the IO streams of the slaves
created by the multi-processor framework would not clash with any IO
manipulations the main program might ever want to do.
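
Putting the slave and master halves together, a rough sketch of the
capture could look like the following.  All of the names here
(SlaveCapture, run_on_slave, replay_on_master, job, demo) are made up
for illustration and are not part of the multi package, and the
transport of the list via the memo objects is left out entirely:

```python
import io
import sys

STDOUT, STDERR = 0, 1


class SlaveCapture:
    """Hypothetical file-like object appending writes to a shared, ordered list."""

    def __init__(self, store, flag, prefix):
        self.store = store      # list shared by the STDOUT and STDERR captures
        self.flag = flag        # 0 for STDOUT, 1 for STDERR
        self.prefix = prefix    # e.g. 'S 1 | '
        self.pending = ''       # text not yet terminated by a newline

    def write(self, text):
        # Store only complete lines, so that print() (which writes the
        # newline separately) still yields one prefixed entry per line.
        # Trailing text without a newline stays in pending in this sketch.
        self.pending += text
        while '\n' in self.pending:
            line, self.pending = self.pending.split('\n', 1)
            self.store.append([self.prefix + line + '\n', self.flag])

    def flush(self):
        pass


def run_on_slave(rank, func):
    """Run func with both streams captured; return the list for the memo."""
    store = []
    prefix = 'S %i | ' % rank
    old_out, old_err = sys.stdout, sys.stderr
    sys.stdout = SlaveCapture(store, STDOUT, prefix)
    sys.stderr = SlaveCapture(store, STDERR, prefix)
    try:
        func()
    finally:
        sys.stdout, sys.stderr = old_out, old_err
    return store


def replay_on_master(store):
    """Send each line to whatever sys.stdout and sys.stderr currently are."""
    for line, flag in store:
        if flag == STDOUT:
            sys.stdout.write(line)
        else:
            sys.stderr.write(line)


def job():
    sys.stdout.write('test\n')
    sys.stderr.write('fail\n')
    print('done')


def demo():
    store = run_on_slave(1, job)
    # Stand-ins for the streams the main program has set up on the master.
    out, err = io.StringIO(), io.StringIO()
    old_out, old_err = sys.stdout, sys.stderr
    sys.stdout, sys.stderr = out, err
    try:
        replay_on_master(store)
    finally:
        sys.stdout, sys.stderr = old_out, old_err
    return store, out.getvalue(), err.getvalue()
```

Because both captures append to the same list, the interleaving of the
two streams on the slave is kept, and the master never needs to know
how the main program has set up its own sys.stdout and sys.stderr.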

Regards,

Edward

_______________________________________________
relax (http://nmr-relax.com)

This is the relax-devel mailing list
[email protected]

To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-devel
