To Phil and Pedro:

I have now (commit 096527c) implemented named-semaphore variants (both
POSIX and Windows) of the 3-semaphore approach that was previously
only implemented for POSIX unnamed semaphores.  This should complete
my planned wxwidgets work (except for the test of unnamed-semaphore
capability mentioned below), and the rest will likely be up to you
guys.

As far as I can tell on POSIX platforms the named semaphores variant
of the three-semaphores approach gives identical plotted
results to the unnamed semaphores variant.

I obviously could not test the Windows variant of the
three-semaphores approach that I implemented, so I am leaving
that up to you. Use the cmake option -DPLPLOT_WX_DEBUG_OUTPUT=ON
to get useful debug information about the header and plbuf results of
transmitBytes and receiveBytes from both the -dev wxwidgets side of
the IPC and the wxPLViewer side of the IPC. Use the cmake option
-DPL_WXWIDGETS_IPC3=ON to exercise the 3-semaphore approach on any
platform.

Unnamed POSIX semaphores are known not to work on Mac OS X (and likely
a number of other proprietary Unices) and Windows.  Thus, for the
-DPL_WXWIDGETS_IPC3=ON case I plan to implement a CMake test that will
determine whether PL_HAVE_UNNAMED_POSIX_SEMAPHORES is OFF or ON for a
given platform without need for user input.  However, as an interim
measure until I get that test implemented, our build system sets
PL_HAVE_UNNAMED_POSIX_SEMAPHORES to OFF by default, and knowledgeable
users who _know_ their platform (e.g., Linux and likely the *BSD
variants) supports unnamed semaphores can try such semaphores by
setting -DPL_HAVE_UNNAMED_POSIX_SEMAPHORES=ON.

I did a large number of timing tests for the various POSIX variants
which I refer to below as

* "IPC3 unnamed semaphores" (i.e., -DPL_WXWIDGETS_IPC3=ON
   -DPL_HAVE_UNNAMED_POSIX_SEMAPHORES=ON).

* "IPC3 named semaphores" (i.e., -DPL_WXWIDGETS_IPC3=ON
   -DPL_HAVE_UNNAMED_POSIX_SEMAPHORES=OFF).

* "IPC single mutex" (i.e., -DPL_WXWIDGETS_IPC3=OFF which effectively uses
   the same code that Phil implemented before I made the additions
   to that code for the above two cases).
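
For convenience, the three configurations above can be captured in a
small shell helper.  This is only a sketch: the function name
ipc_cmake_flags and the short variant names (unnamed, named, mutex) are
made up for illustration, and the source path in the usage comment is a
placeholder.

```shell
# Hypothetical helper: print the cmake options for one of the three
# IPC variants named in this message.
ipc_cmake_flags() {
  case "$1" in
    unnamed) echo "-DPL_WXWIDGETS_IPC3=ON -DPL_HAVE_UNNAMED_POSIX_SEMAPHORES=ON" ;;
    named)   echo "-DPL_WXWIDGETS_IPC3=ON -DPL_HAVE_UNNAMED_POSIX_SEMAPHORES=OFF" ;;
    mutex)   echo "-DPL_WXWIDGETS_IPC3=OFF" ;;
    *)       echo "unknown variant: $1" >&2; return 1 ;;
  esac
}

# Usage (placeholder source path), adding the debug option if wanted:
#   cmake $(ipc_cmake_flags named) -DPLPLOT_WX_DEBUG_OUTPUT=ON /path/to/plplot/source
```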

One apparent result is "IPC3 unnamed semaphores" and "IPC3 named
semaphores" are the same speed within the very large replication noise
(see below) with just one example where "IPC3 unnamed semaphores"
appears to be significantly faster (by a factor of 1.7).  But take
this result with a huge grain of salt, because it doesn't make much
sense that only one example is affected by some timing difference
between these two cases.  In any case, I believe these two cases
should be essentially identical in speed (i.e., the time taken to
access the semaphores should be negligible compared to the time
required for memcpy to transmit header and plbuf data through the
shared memory on the transmitBytes side and extract header and plbuf
data from the shared memory on the receiveBytes side).  Also, I have
proved (for example 8 which has a very large plbuf filled with plot
directives) that generation of that plbuf data and transfer of those
data from -dev wxwidgets to wxPLViewer takes less than 0.8 seconds. So
assuming transfer takes less than generation (and I assume
substantially less), variants on how that transfer is done should
hardly affect the timing of example 8, and the effect of such variants
on other examples should be much less.

Another apparent result is "IPC single mutex" is significantly (factor
of two) faster _and_ slower than the other two, depending on which
example you use for the comparison, with the faster results
substantially outnumbering the slower results.  But there doesn't seem
to be any pattern to which examples are faster and which slower, and
in any case I think the memcpy time should be nearly identical in all cases
(e.g., for the "IPC single mutex" case where a circular buffer is used
you still have to copy bytes to that buffer on the transmitting side
and copy those bytes from that buffer on the receiving side just as in
the 3-semaphores approach with or without named semaphores).
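
To make per-example comparisons like the ones above easier to
reproduce, here is a rough sketch of how two timing files could be
compared.  It assumes each file has the layout produced by the timing
loop described later in this message (a line holding the example
number followed by a "real<TAB>seconds" line); the function name
timing_ratio is hypothetical.

```shell
# Hypothetical helper: for each example present in both files, print
#   example  time_in_file1  time_in_file2  ratio(file2/file1)
timing_ratio() {
  awk '
    /^[0-9]+$/ { n = $1; next }            # remember the example number
    $1 == "real" {
      if (FNR == NR) {
        base[n] = $2                        # first file: store the time
      } else if ((n in base) && base[n] + 0 > 0) {
        printf "%s %.3f %.3f %.2f\n", n, base[n], $2, $2 / base[n]
      }
    }
  ' "$1" "$2"
}

# Usage (file names are placeholders):
#   timing_ratio IPC3_named.txt IPC_single_mutex.txt
```

Given the replication noise discussed below, ratios from a single pair
of runs should not be over-interpreted.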

I now provide more details on how I got these timing results in case
someone can figure out why they are so unreliable.

My timing results were obtained using the time command with its output
reformatted to a single line giving the real time interval, using

export TIMEFORMAT=$'real\t%3R'

(After all tests were done I restored the normal 3-line time format by

export TIMEFORMAT=$'\nreal\t%3lR\nuser\t%3lU\nsys\t%3lS'

.) After building the test_c_wxwidgets target (to ensure all
dependencies are built), I then collected time information for -dev
wxwidgets on most of our C examples (with exceptions 08, 17, 20, 25,
31, 32, 33, and 34) typically using something like the following:

(for N in $(seq --format='%02.0f' 0 30 | grep -vE '08|17|20|25'); do echo $N; \
(time examples/c/x${N}c -dev wxwidgets >&/dev/null); sleep 5; done) >| \
../IPC3_timing_ON_OFF_alt5.txt 2>&1

Note these time commands only measure the time taken by the -dev
wxwidgets side of the IPC, and do not measure the time required by the
other side of the IPC (wxPLViewer). The purpose of the sleep command
is to allow time for the wxPLViewer command to finish so that it
(usually except in those cases where wxPLViewer takes longer than 5
seconds) does not interfere with the time measurement of subsequent
examples.

Unfortunately, the fundamental result I have from such tests is these
results are not reliable! One issue is if I run the above command
multiple times I get substantially different results for all examples.
For example, you can easily get time variations of 20 per cent for the
same standard example from one run to the next if using the -np option
for the examples/c/x${N}c commands, and it is even worse without that
option (as above) because it is hard to be exactly consistent with how
you run through the pages of an example by hitting the enter key by
hand. In other words, both with and without the -np option, the cpu
time (or more likely the memory or some other resource) consumed by
wxPLViewer is likely interfering with the timing of the
examples/c/x${N}c command in an inconsistent way.
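
One way to put a number on this replication noise is to repeat the
same example several times and summarize the "real" lines.  The
following is a minimal sketch, assuming the single-line TIMEFORMAT
given above; timing_stats is a hypothetical helper, not part of
PLplot.

```shell
# Hypothetical helper: read "real<TAB>seconds" lines on stdin and print
# the sample count, mean, and sample standard deviation.
timing_stats() {
  awk '
    $1 == "real" { n++; sum += $2; sumsq += $2 * $2 }
    END {
      if (n < 2) { print "need at least two samples"; exit 1 }
      mean = sum / n
      var = (sumsq - n * mean * mean) / (n - 1)
      if (var < 0) var = 0                  # guard against rounding
      sd = sqrt(var)
      pct = (mean > 0) ? 100 * sd / mean : 0
      printf "n=%d mean=%.3f sd=%.3f (%.1f%% of mean)\n", n, mean, sd, pct
    }'
}

# Usage in bash (5 repeats of one example with -np):
#   for i in 1 2 3 4 5; do
#     (time examples/c/x00c -dev wxwidgets -np >/dev/null 2>&1); sleep 5
#   done 2>&1 | timing_stats
```

If the standard deviation is of the order of 20 per cent of the mean,
then differences between the IPC variants smaller than that cannot be
distinguished from noise.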

It should be possible to reduce this problem by a very large degree by
making wxPLViewer a lot more efficient.  A particularly egregious
result is example 8 (which is why it was skipped above).  As discussed
before, it takes something like 3 seconds per page for wxPLViewer to
render the 10 pages of example 8 _once that example is completely done
with all IPC finished_.  And during that 3 seconds per page wxPLViewer
consumes virtually no cpu. In comparison, xcairo and wxwidgets
generate and display all 10 pages of that example in less than 3
seconds, so we are discussing at least an order of magnitude
discrepancy (and virtually all of it idle time for the wxPLViewer
case) in time required to render all 10 pages of the example.

That huge disparity is a puzzle to me since on Linux the wxwidgets
library is essentially a wrapper for a subset of the GTK+ library
suite which is accessed more directly by xcairo.  Therefore, you would
expect -dev xcairo and wxPLViewer (invoked by -dev wxwidgets) to take
roughly the same amount of time to complete for example 8.  Anyhow,
example 8 has lots (probably thousands) of plfill calls so I am
wondering if wxPLViewer is translating plfill into some generic
wxwidgets library call that is extraordinarily inefficient, and there
might be an alternative wxwidgets library API to use to make fills
that would be much more efficient than what we are using now?  But the
inefficiency of wxPLViewer after an example has completed may have
nothing to do with that because a major clue is that virtually no CPU time
is consumed by (independent, i.e., after IPC is finished) wxPLViewer.
Instead it appears to be spending the vast majority of its time
waiting for some non-IPC event.  So perhaps the issue is the event
setup that is used for wxPLViewer is inefficient for some reason. (For
example, it is possible one of the types of events that are handled
may currently be firing essentially continuously.) Anyhow, a careful
review of that event handling code should be done with debug printout
each time an event fires to make sure there are no continuously firing
events.
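
If such debug printout were added, the resulting log could be
summarized along the following lines.  This is purely a sketch under
an assumed log format: it supposes that each fired event prints one
line of the form "EVENT <name>", which is not something wxPLViewer
does today, and count_events is a hypothetical helper.

```shell
# Hypothetical helper: count how often each event name appears in a
# debug log read from stdin, most frequent first.  Assumes one line per
# fired event of the form "EVENT <name>"; other lines are ignored.
count_events() {
  awk '$1 == "EVENT" { c[$2]++ } END { for (e in c) print c[e], e }' |
    sort -rn
}

# Usage (the log file name is a placeholder):
#   count_events < wxPLViewer_debug.log
```

An event whose count keeps growing while wxPLViewer appears idle would
be the first candidate to look at.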

I am also concerned with the reliability of any timing results for
the -np option which simply shows blank screens in all cases. Probably
the "IPC3 unnamed semaphores", "IPC3 named semaphores", and "IPC
single mutex" approaches are implemented correctly because there are no obvious
run-time errors such as segfaults, and for the two IPC3 cases with the
-DPLPLOT_WX_DEBUG_OUTPUT=ON cmake option you get printouts verifying
the data that is sent is received properly. Nevertheless, there is
currently no visual plot rendering evidence to support (or refute)
that conclusion for the -np case.

The solution to this issue is for wxPLViewer to display plot results
immediately as the parts of plbuf are acquired to verify the rendering
is done correctly. (This "immediate" approach is already done for most
of the other PLplot interactive devices including the old version of
wxwidgets.) That change would likely allow the -locate option to work
correctly for example 1 and should also solve the example 17 issue
where only the final result is displayed rather than properly plotting
the intermediate results that are required to get to that final
result.

In sum, I believe we need to make the two improvements above (solve
the wxPLViewer order of magnitude inefficiency issue and display
wxPLViewer plots immediately for each part of plbuf that is received)
before we can get believable timing results.  Therefore it is much too
soon to judge any of the "IPC3 unnamed semaphores", "IPC3 named
semaphores", and "IPC single mutex" approaches based on any apparent
timing issues that are occurring now, and once we get more reliable
timing my guess is we will see no important difference in timing
results between the three different IPC methods that are currently
implemented.

For now (except to implement a CMake test that would determine
PL_HAVE_UNNAMED_POSIX_SEMAPHORES without user input), I believe I am
done with this project, and I request you follow up with these steps:

* Build and test wxwidgets on Windows using -DPL_WXWIDGETS_IPC3=ON
   -DPL_HAVE_UNNAMED_POSIX_SEMAPHORES=OFF to
   verify that this variant of the three-semaphores approach is working
   correctly on Windows. I am fairly confident it will work (or at
   least any issues will be small typographical ones) because I copied
   the syntax for initializing, posting,  waiting for, and destroying the
   3 semaphores for the Windows case following how that was done for
   the Windows variant of the "IPC single mutex" case.

* Evaluate the code clarity of the "IPC3 unnamed semaphores" and "IPC3
   named semaphores" approaches compared to the "IPC single mutex"
   approach.  My opinion is the three-semaphores approach (whether
   using the unnamed POSIX, named POSIX, or named Windows semaphores
   variants) has much improved code clarity compared to the "IPC single
   mutex" approach.  But it will be interesting to see if your
   independent assessment of that question also supports that
   conclusion.  :-)

* Get

examples/c/x01c -locate -dev wxwidgets

   to work for all three approaches.  The fundamental issue here is the
   wxPLViewer end as currently designed refuses to render the plot for
   partial plbuf results before -dev wxwidgets finishes sending the
   complete plbuf.  So at least part of the fix here is to redesign
   wxPLViewer to allow rendering of partial plbuf results.  (See
   discussion above concerning two other benefits of that approach.)

Alan
__________________________
Alan W. Irwin

Astronomical research affiliation with Department of Physics and Astronomy,
University of Victoria (astrowww.phys.uvic.ca).

Programming affiliations with the FreeEOS equation-of-state
implementation for stellar interiors (freeeos.sf.net); the Time
Ephemerides project (timeephem.sf.net); PLplot scientific plotting
software package (plplot.sf.net); the libLASi project
(unifont.org/lasi); the Loads of Linux Links project (loll.sf.net);
and the Linux Brochure Project (lbproject.sf.net).
__________________________

Linux-powered Science
__________________________

_______________________________________________
Plplot-devel mailing list
Plplot-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/plplot-devel
