Hi,
I've got a really inexplicable problem with a java application I've been
developing under linux, but wish to run under linux, solaris and NT.
My systems are:
linux(debian1.3.1/kernel2.0.33/libc5.4.33) jdk1.1.3/gcc2.7.2.1
Solaris2.5.1/ultrasparc jdk1.1.5/gcc2.7.2.1
NT4.0(sp3/dualppro) jdk1.1.4/MSVC++
The application is a client server system with the server written in C/C++
and running well under linux and solaris with extensive use of pthreads.
The client is written almost exclusively in java1.1 with the swing1.0.1
toolkit for the GUI. The network communication is based on tcp sockets
with an application level messaging protocol involving a custom message
packet 'packing/unpacking' library written in C and java. The client uses
a small part of the C library through the JNI interface for packing data
into the message packets, and unpacking the responses into java class
variables. The pack library is simply a 'C' clone of the perl 'pack'
command.
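
To make the packing step concrete, here is a minimal pure-Java sketch of the
kind of perl-style 'pack'/'unpack' described above. It is not the actual
library (which is C behind JNI): the class name PackSketch and the supported
template subset ('C', 'n', 'N') are assumptions chosen for illustration only.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

// Hypothetical illustration only; not the real pack library.
final class PackSketch {

    // Packs one value per template character, using a small subset of perl's
    // pack codes: 'C' = unsigned 8-bit, 'n' = unsigned 16-bit big-endian,
    // 'N' = unsigned 32-bit big-endian.
    static byte[] pack(String template, long[] values) {
        if (template.length() != values.length) {
            throw new IllegalArgumentException("one value per template character expected");
        }
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        for (int i = 0; i < template.length(); i++) {
            long v = values[i];
            switch (template.charAt(i)) {
                case 'N':
                    // write(int) stores only the low-order 8 bits of its argument
                    buf.write((int) (v >>> 24));
                    buf.write((int) (v >>> 16));
                    buf.write((int) (v >>> 8));
                    buf.write((int) v);
                    break;
                case 'n':
                    buf.write((int) (v >>> 8));
                    buf.write((int) v);
                    break;
                case 'C':
                    buf.write((int) v);
                    break;
                default:
                    throw new IllegalArgumentException("unsupported code: " + template.charAt(i));
            }
        }
        return buf.toByteArray();
    }

    // Reverses pack(): reads one unsigned value per template character.
    static long[] unpack(String template, byte[] data) {
        ByteBuffer in = ByteBuffer.wrap(data); // big-endian by default
        long[] values = new long[template.length()];
        for (int i = 0; i < template.length(); i++) {
            switch (template.charAt(i)) {
                case 'N': values[i] = in.getInt() & 0xFFFFFFFFL; break;
                case 'n': values[i] = in.getShort() & 0xFFFF; break;
                case 'C': values[i] = in.get() & 0xFF; break;
                default:
                    throw new IllegalArgumentException("unsupported code: " + template.charAt(i));
            }
        }
        return values;
    }

    public static void main(String[] args) {
        byte[] packet = pack("CnN", new long[] {1, 258, 16909060L});
        System.out.println("packed " + packet.length + " bytes");
    }
}
```

The sketch uses java.nio, which is newer than java1.1, so it only shows the
idea. A round trip such as unpack("CnN", pack("CnN", values)) should return
the original values.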
Essentially the problem is that, although everything runs fine under
linux, under solaris or NT the java runtime environment crashes with
illegal memory access errors almost immediately after the creation of the
first message (i.e. after the first call to the native library). The crash
does not occur during the native call, but sometime later, normally during
an innocuous String.equals(String) call (at least according to the
stacktrace).
An interesting observation is that if I run a very simple, single threaded
test client under solaris, I can get it to pass a few thousand messages
before it crashes, but a multithreaded test client, or the full GUI client
(also multithreaded) has the almost-immediate crash.
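
The single-threaded versus multithreaded comparison could be driven by a small
harness along these lines. This is only a sketch: the class StressSketch and
its run() method are hypothetical names, the task passed in stands for
whatever call packs and sends one message, and it uses modern Java syntax
rather than java1.1.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stress harness; not part of the real client.
final class StressSketch {

    // Runs `task` `iterations` times on each of `threads` threads and returns
    // the number of calls that threw a RuntimeException.
    static int run(int threads, int iterations, Runnable task) {
        AtomicInteger failures = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < iterations; i++) {
                    try {
                        task.run();
                    } catch (RuntimeException e) {
                        failures.incrementAndGet();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait for every worker to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                break;
            }
        }
        return failures.get();
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        int failed = run(4, 1000, calls::incrementAndGet);
        System.out.println(calls.get() + " calls, " + failed + " failures");
    }
}
```

Calling run(1, n, task) and then run(4, n, task) with the same task gives a
repeatable way to compare the two cases. A crash inside the runtime itself
would of course kill the process rather than show up in the failure count.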
My first assumption is that I must have some strange memory bug in the C
library. What confuses matters is that the C-only test programs run fine
on all three OSes and the java clients run fine under Linux.
It's only java under solaris and NT that is a problem.
I've also recompiled all classes and JNI code under the relevant OS to
ensure that there were no unexpected bytecode differences due to different
OS or jdk release. I've not managed to reproduce the crash under jdb because
I cannot seem to get the jdb environment to find my .so library of native
code.
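
A small check like the following can at least show whether a given runtime
can find the library at all. It is a sketch under assumptions: the class name
LoadCheck and the default library name "pack" are placeholders, not the real
names. As far as I know the java.library.path property only exists in JDKs
newer than 1.1; under jdk1.1 on Unix the search path comes from the
LD_LIBRARY_PATH environment variable instead.

```java
// Hypothetical helper; "pack" below is a placeholder library name.
final class LoadCheck {

    // Tries to load the named native library and reports whether it worked.
    static boolean tryLoad(String name) {
        try {
            System.loadLibrary(name);
            return true;
        } catch (UnsatisfiedLinkError e) {
            // Thrown when the library cannot be found or cannot be linked
            System.err.println("could not load '" + name + "': " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "pack";
        // Platform file name the runtime will look for, e.g. lib<name>.so on Unix
        System.err.println("looking for: " + System.mapLibraryName(name));
        System.err.println("java.library.path = " + System.getProperty("java.library.path"));
        System.err.println("loaded: " + tryLoad(name));
    }
}
```

Running this once with plain java and once from inside jdb should show
whether the two environments search for the library in the same places.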
I cannot think of any sensible approach to tackling this problem, and
wondered if anyone out there has seen anything remotely similar before, or
has any ideas on what I can do to try to track the bug down.
Cheers, Craig
------
"No one can make you feel inferior without your consent."
-- Eleanor Roosevelt
======================================================================
Craig Taverner ------====== Email:[EMAIL PROTECTED]
ComOpt AB ------======== Tel: +46-42-212580
Michael Löfmans Gata 6 ------========== Fax: +46-42-210585
SE-254 38 Helsingborg ------======== Cell: +46-708-212598
Sweden ------====== http://www.comopt.com
======================================================================