Re: [Vote] Release apr-util 1.3.11

Rainer Jung Thu, 28 Apr 2011 02:25:14 -0700

Hi Stefan,

On 28.04.2011 00:34, Stefan Fritsch wrote:

On Tuesday 26 April 2011, Rainer Jung wrote:

+1 although there are still two problems on Solaris 10 for
test_reslist, but not a regression.


I built and made check on the following platforms:

- Solaris 8 + 10, Sparc
- SuSE Linux Enterprise 10 32 and 64 Bit
- RedHat Enterprise Linux 5, 64 Bit

Using all combinations of:

apr 1.3.12 / 1.4.2
expat builtin / 2.0.1
dso disable / enable
Berkeley DB 4.8.30 5.0.26 5.1.19
sqlite 3.7.2
mysql 6.0.2 (only Solaris)
oracle 10.2.0.4.0 (only Solaris)

All builds succeeded, all make check runs were fine, except for two
cases on Solaris 10 (this time not Niagara, but instead old sun4u -
V240 with 2 CPUs).

I reran the tests and couldn't reproduce the problem, so it is not
deterministic. Out of 48 build combinations on Solaris 10, only
three had a problem. This is similar to 1.3.10, but it is not
always the same combinations. As with 1.3.10, the problem happens
on Solaris 10 but not on Solaris 8.

Details on Solaris 10 test failures

- only in testreslist
- two types of failures:
    - twice crashes (segmentation fault)
    - once non-terminating loop
- Crashes do not seem to be related to the apr version used (one
for 1.3 and one for 1.4)


I also get nondeterministic test failures on the Debian build machines,
mostly hangs in testreslist. It happens on mipsel and sparc much more
often than on the other architectures, and some architectures had no
failure at all. Which compiler are you using? If you are using gcc, it
could be a gcc bug.


On Sparc I use gcc 4.1.2. All builds are 32 Bit.

Concerning the hangs (unterminated loops in my case), I did some more investigation for 1.3.10 and confirmed using GDB that there actually was a cycle in the cleanups:


(gdb) print c
$1 = (cleanup_t *) 0x38558
(gdb) print *c
$2 = {next = 0x38558, data = 0x38558, plain_cleanup_fn = 0x38710, child_cleanup_fn = 0x38798}


so c == c->next and thus apr_pool_cleanup_kill looped.

I didn't check whether that was still true for 1.3.11. I don't know why c == c->next.
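
To illustrate why a self-referencing entry makes the walk hang, here is a minimal sketch in C. It is not the APR source: the struct is a simplified stand-in modelled on the fields visible in the GDB output, and bounded_find, has_cycle, dummy_cleanup and self_loop_demo are hypothetical names made up for this example. It assumes the kill routine walks the list via next until it finds a matching (data, function) pair or reaches NULL; with c == c->next and no match, neither ever happens.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the cleanup entry seen in GDB (not APR's type). */
typedef struct cleanup_t {
    struct cleanup_t *next;
    const void *data;
    int (*plain_cleanup_fn)(void *);
} cleanup_t;

/* Hypothetical cleanup function, used only to have something to compare. */
int dummy_cleanup(void *data) { (void)data; return 0; }

/* Walk the list looking for a (data, fn) pair, as a kill routine would.
 * Returns 1 if found, 0 if the end was reached, -1 if max_steps entries
 * were visited without either result (the unbounded walk would hang). */
int bounded_find(const cleanup_t *head, const void *data,
                 int (*fn)(void *), size_t max_steps)
{
    const cleanup_t *c = head;
    size_t steps = 0;

    while (c != NULL) {
        if (steps++ == max_steps)
            return -1;
        if (c->data == data && c->plain_cleanup_fn == fn)
            return 1;
        c = c->next;
    }
    return 0;
}

/* Floyd's cycle detection: returns 1 if following next never reaches NULL. */
int has_cycle(const cleanup_t *head)
{
    const cleanup_t *slow = head, *fast = head;

    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast)
            return 1;
    }
    return 0;
}

/* Rebuild the shape from the GDB session: next and data both point at c. */
void self_loop_demo(void)
{
    cleanup_t c = { NULL, NULL, dummy_cleanup };
    int other = 0;

    c.next = &c;
    c.data = &c;

    assert(has_cycle(&c) == 1);
    /* Searching for data that is not in the list never terminates on its own. */
    assert(bounded_find(&c, &other, dummy_cleanup, 1000) == -1);
    /* Searching for the entry itself still succeeds on the first step. */
    assert(bounded_find(&c, &c, dummy_cleanup, 1000) == 1);
}
```

A check like has_cycle could be dropped into a debug build or called from GDB to catch the corrupted list closer to the point where it is created, but that is only a suggestion, not something I have tried.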

Concerning gcc: I use the same gcc for building on Solaris 8 and on Solaris 10, even the same binary gcc files. I never observed a problem on the single CPU Sparc 8 system, but did observe problems on Solaris 10 for 1.3.10 and for 1.3.11. Apart from the OS version, the other major difference is concurrency in hardware (I used a Niagara CPU with 6 or 8 cores and 4 times the number of strands when testing 1.3.10, and a more traditional 2 CPU Sparc V240 when testing 1.3.11).

I hope I have some time to check older versions, like 1.3.9 etc., and maybe also older apr (pool) versions, to see whether I can narrow down the reason. Unfortunately, so far I could only reproduce the two problems (unterminated loop, crash) when doing the testing as part of the mass building, which takes time (a couple of hours). When running testall after building, even in loops, I could not reproduce the problems ...


Regards,

Rainer
