Hi Arjen:

Thanks very much for running the Python-C comparisons for both
MinGW/MSYS and (later) MSVC on 32-bit Windows.  More below in context.

On 2013-10-19 14:08-0000 Arjen Markus wrote:

>> From: Alan W. Irwin [mailto:[email protected]]
>> So the next step in trying to figure out the cause of these problems should be to do
>> some additional platform checks for 32-bit Python.
>> Arjen is moving ahead with those for both the MinGW/MSYS and MSVC Microsoft
>> Windows cases, and I am hoping Andrew will be able to check the 32-bit Python
>> case on Linux.
>>
>
> I have used the MinGW 32-bits platform on Windows to do the same test. I see minor differences
> in the following examples:
> 00 03 04 05 06 08 09 11 12 15 16 17 18 20 21 22 23 25 26 27 29 and 33.
>
> So that list is almost the same as yours, except for example 19 - that is completely clean.

Although Wine has a pretty good track record as a Windows
test platform, I was concerned that I might have stumbled over some
issue that was due to a Wine bug. But your MinGW/MSYS results
(and later MSVC results) on Microsoft Windows show that this problem,
which first turned up with Wine, is a widespread issue on 32-bit
Windows.  And the "jury is still out" on whether this is also an issue
with 32-bit Linux.

>
> I did not run example 14 yet and there is no example x14a.
>
> The examples that show significant differences are: 17 and 25 - the PostScript files show extra
> lines or completely different lines in these two cases.
>
> I also noted that many differences occur with lines like:
>
> 1349 220 M (Python) -- 1350 220 M (C)
> ...  1800 M (Python) -- ... 1799 M (C)
>
> I have not checked what PLplot command is responsible for them, but they occur in many examples,
> so I guess they have to do with the frames.

Those are PostScript commands generated by the ps device driver.  That
driver issues alias commands like

/M {moveto} def

at the top of the file, so the first "M" command you see above simply
means move the pen to coordinate 1349 220.  So differences like the above
mean that C and Python (on all 32-bit Windows platforms we have tested
between us) have slightly different views of the overall coordinate
transformations that yield the above positions.  But the puzzle is
that all those transformations are done in our C library and the "psc"
C device in both cases, and they depend only on the input coordinates
entered, for example, in the plenv call for x00.  And I also proved
with the gdb debugging tool that those input coordinates were
identical in that particular case. So in the 32-bit case we have
proved that our core C library, libplplotd, and/or the psc device
gives different answers with the _same_ input data in the C and Python
cases for at least one example, and that might be the cause of all the
above issues.
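
To list every such moveto difference between a C and a Python result in one
pass, something like the following Python sketch might help.  It assumes only
what is stated above: that a moveto appears in the file as "x y M" with
integer coordinates.  The function names and the sample strings are made up
for illustration and are not part of PLplot.

```python
import re

# "x y M" with integer coordinates; M is the ps driver's alias for moveto.
# The lookbehind keeps the match from starting in the middle of a token.
MOVETO = re.compile(r"(?<![\w.\-])(-?\d+)\s+(-?\d+)\s+M\b")


def moveto_coords(ps_text):
    """Return the (x, y) pair of every 'x y M' command, in file order."""
    return [(int(x), int(y)) for x, y in MOVETO.findall(ps_text)]


def differing_movetos(c_text, python_text):
    """Return (index, c_xy, python_xy) for each moveto whose coordinates differ.

    Movetos are paired by position, so this is only meaningful when both
    files contain the same sequence of commands.
    """
    pairs = zip(moveto_coords(c_text), moveto_coords(python_text))
    return [(i, c_xy, py_xy) for i, (c_xy, py_xy) in enumerate(pairs) if c_xy != py_xy]


# Made-up sample text modelled on the first difference quoted above.
c_sample = "1350 220 M\n500 600 M\n"
python_sample = "1349 220 M\n500 600 M\n"

for index, c_xy, py_xy in differing_movetos(c_sample, python_sample):
    print(index, "C:", c_xy, "Python:", py_xy)
```

Because the pairing is positional, this only suits examples where the
differences are small coordinate shifts.  For examples 17 and 25, where
lines are added or completely different, an ordinary diff is the better
tool.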

My working hypothesis to explain this unusual result is some memory
management issue (e.g., an uninitialized variable) in our core C
library and/or our ps device that generates some unpredictability in
the results.  A further hypothesis is that Python has a very large
memory footprint, so it leaves non-zero bit patterns scattered
over a wide memory space, and the uninitialized variable picks up one of
those non-zero bit patterns and so takes a different value than when
any other language is being used to run the examples.
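
As a toy model of that hypothesis (not PLplot code, and not a claim about
where such a bug would be), the Python sketch below mimics a routine that
reads a 4-byte value it never wrote.  The routine name, the "offset" field,
and the factor of 10 are all invented for illustration.  The point is only
that identical input can give different output depending on what was left
in memory beforehand.

```python
import struct


def scaled_x(memory, x_input):
    """Hypothetical buggy routine: uses an 'offset' it never initialized.

    The offset is read straight from whatever bytes are already in
    `memory`, which stands in for stack or heap space in a C program.
    """
    (offset,) = struct.unpack_from("<i", memory, 0)  # stale contents
    return x_input * 10 + offset


clean = bytearray(4)                        # memory nobody has touched: all zero bits
dirty = bytearray(struct.pack("<i", -1))    # memory with a non-zero pattern left behind

print(scaled_x(clean, 135))  # 1350
print(scaled_x(dirty, 135))  # 1349
```

In real C code, reading an uninitialized variable is undefined behaviour, so
the actual effect could be anything from an off-by-one coordinate to a crash.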

In a later e-mail concerning the MSVC case you stated the following
results:

<quote>
- Most of the Python examples produce the same results as the corresponding C examples.

- The ones that do not simply crash at some point. These are: 08, 09, 11, 15, 16, 20, 21 and 22.
   I have not looked into the possible cause of this. It may be a single function that is causing this.
   It may be a set of functions.

   Anyway, this means that the mystery is only larger: The Python installation was _exactly_ the
   same under Windows/MSVC as under Windows/MinGW.
</quote>

I think these results are also consistent with the working hypothesis
that there is a lurking memory management issue in our core C library
and/or the ps device driver code for the 32-bit case.

So Arjen, if you have access to a static or dynamic memory debugging
tool for Windows (see a partial list of such tools for all operating
systems at http://en.wikipedia.org/wiki/Memory_debugging), I suggest
you run it on x00c (our simplest 2D plot example written in C) to see
if it spots some memory management issue (uninitialized variable or
whatever) in libplplotd, our core C library.  And, of course, if
MinGW or MSVC is giving any warning messages at all concerning
the compilation of our core C library, we should try to address
those warnings.

The only memory debugging tool I have access to is valgrind (a dynamic
analysis tool) which can only be run on Linux (and Mac OS X). That
gives totally clean results for x00c for the 64-bit Linux case where
Python and C results are identical.  For the record, here are those
results:

software@raven> valgrind examples/c/x00c -dev psc -o test.ps
==19591== Memcheck, a memory error detector
==19591== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==19591== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==19591== Command: examples/c/x00c -dev psc -o test.ps
==19591== 
==19591== 
==19591== HEAP SUMMARY:
==19591==     in use at exit: 0 bytes in 0 blocks
==19591==   total heap usage: 457 allocs, 457 frees, 132,045 bytes allocated
==19591== 
==19591== All heap blocks were freed -- no leaks are possible
==19591== 
==19591== For counts of detected and suppressed errors, rerun with: -v
==19591== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 4)

However, now that our combined results show there is a widespread
32-bit issue for the Windows case (triggered by Python, but I think
that is the result of the large Python memory footprint and has nothing to
do with bad Python code or bad code interfacing Python with our core C
library), we need to follow up with extensive testing for the 32-bit Linux
case.

Therefore, I strongly encourage Andrew (or someone else here with
access to 32-bit Linux) to compare Python and C results and, if those
are different, to do valgrind runs to help us get to the bottom of these
issues.

Alan
__________________________
Alan W. Irwin

Astronomical research affiliation with Department of Physics and Astronomy,
University of Victoria (astrowww.phys.uvic.ca).

Programming affiliations with the FreeEOS equation-of-state
implementation for stellar interiors (freeeos.sf.net); the Time
Ephemerides project (timeephem.sf.net); PLplot scientific plotting
software package (plplot.sf.net); the libLASi project
(unifont.org/lasi); the Loads of Linux Links project (loll.sf.net);
and the Linux Brochure Project (lbproject.sf.net).
__________________________

Linux-powered Science
__________________________

_______________________________________________
Plplot-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/plplot-devel
