Ruth Cao wrote:
Hi Gregory,
Thanks so much for the detailed analysis. I have finally reproduced this
failure on my side; setting up the environment took a lot of time :-(
I modified the findError method when fixing Harmony-2387, where
hysock_connect_with_timeout has a bug. I added this entry because the
getsockopt method returns an error code of zero to indicate success.
Now it seems that this scheme does not apply everywhere. Could you please
revert this change? I'll provide a new patch for Harmony-2387. Sorry
for any inconvenience caused.
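A minimal sketch of the getsockopt convention mentioned above, assuming a
POSIX socket API. The function name and the SKETCH_* codes are hypothetical
and are not the real hyport identifiers. It queries SO_ERROR and translates
zero to success at the call site, instead of teaching a shared error-mapping
function that 0 means success:

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical result codes for this sketch; not the real HYPORT_* values. */
#define SKETCH_SUCCESS 0
#define SKETCH_ERROR_OPFAILED (-1)

/*
 * Hypothetical helper: report whether a socket has a pending error,
 * for example after a non-blocking connect has completed.
 */
static int sketch_pending_socket_error(int fd)
{
    int so_error = 0;
    socklen_t len = sizeof so_error;

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &so_error, &len) != 0) {
        return SKETCH_ERROR_OPFAILED; /* the query itself failed */
    }
    assert(len == sizeof so_error);

    /* Here, and only here, a value of zero means "no pending error". */
    return so_error == 0 ? SKETCH_SUCCESS : SKETCH_ERROR_OPFAILED;
}
```

Keeping the zero check local leaves a shared mapping function free to treat
an unexpected 0 as a failure for callers where 0 is not a valid result.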
Besides, I wonder why this failure does not happen on IBMVME/Linux,
and sometimes cannot even be reproduced on DRLVM/Linux32. I'll also take a
closer look. Thanks a lot.
I think that since free is called on an uninitialized pointer, it is just
pure luck that this pointer is sometimes NULL, so free doesn't
complain. For me this crash is most reliably reproducible on x86_64; the
stack layout is probably different on this architecture, so addrinfo
doesn't land on a stack area with NULL values in it.
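The point about luck can be made concrete with a small sketch. The struct
below is a hypothetical stand-in for hyaddrinfo_struct (its real field layout
is not shown in this thread), and the helper names are made up for
illustration. Standard C guarantees that free(NULL) does nothing, so
initializing the pointer before the lookup makes the cleanup path safe
regardless of what happened to be on the stack:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for hyaddrinfo_struct; the real layout may differ. */
typedef struct sketch_addrinfo {
    void *addr_info; /* owned heap pointer, or NULL when empty */
    int length;      /* number of entries behind addr_info */
} sketch_addrinfo;

/* Put the handle into a known empty state before any lookup is attempted. */
static void sketch_addrinfo_init(sketch_addrinfo *handle)
{
    assert(handle != NULL);
    handle->addr_info = NULL;
    handle->length = 0;
}

/* Release whatever the handle owns; safe on an empty handle and safe to repeat. */
static void sketch_addrinfo_free(sketch_addrinfo *handle)
{
    assert(handle != NULL);
    free(handle->addr_info); /* free(NULL) is a no-op in standard C */
    handle->addr_info = NULL;
    handle->length = 0;
}
```

Without the init call, addr_info holds whatever bytes were left on the stack,
and passing that value to free is undefined behavior.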
Gregory Shimansky wrote:
Vladimir Ivanov wrote:
Hello everybody,
in case someone missed the CC notification: the classlib tests now
crash or hang on the Linux boxes when run over DRLVM.
Notifications were sent ~12 hours ago.
Failed tests:
Linux x86_64 (hang up):
TEST-org.apache.harmony.archive.tests.java.util.jar.JarExecTest.xml
Linux x86 (trying to reproduce):
TEST-org.apache.harmony.archive.tests.java.util.jar.JarFileTest.xml
TEST-org.apache.harmony.security.tests.PolicyEntryTest.xml
TEST-org.apache.harmony.security.tests.java.security.cert.CertificateFactory4Test.xml
I've found the reason for the crash of
org.apache.harmony.archive.tests.java.util.jar.JarExecTest. It is
actually the commit in revision 514596. Most likely the other tests
fail for the same reason. The sequence that leads to the crash looks
like this:
1. Java calls Java_java_net_InetAddress_getHostByNameImpl with a host
name "jcltest.apache.org".
2. It calls hysock_getaddrinfo with this name and an uninitialized
hyaddrinfo_struct addrinfo variable.
3. Function hysock_getaddrinfo calls the system function getaddrinfo,
which returns a nonzero value, which means an error.
4. In this case hysock_getaddrinfo reads errno and records it in
errorCode. But errorCode turns out to be 0. Looking at the man page for
getaddrinfo, I see that it sets errno to a specific value only in the
EAI_SYSTEM case. In other cases the state of errno is not specified.
5. Function hysock_getaddrinfo records an error with errorCode 0 using
findError. Since the change in 514596, errorCode 0 means
HYPORT_SUCCESS, so it is treated as no error. Before that change
findError would have returned HYPORT_ERROR_SOCKET_OPFAILED.
6. Since hysock_getaddrinfo returned HYPORT_SUCCESS, which is 0, the
function Java_java_net_InetAddress_getHostByNameImpl continues to work
with the uninitialized addrinfo variable.
7. When Java_java_net_InetAddress_getHostByNameImpl calls
hysock_freeaddrinfo, free is called on the uninitialized pointer, which
leads to a crash.
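The failure mode in steps 3 to 7 can be sketched as follows. This is a
minimal illustration, not the Harmony code: sketch_getaddrinfo,
sketch_freeaddrinfo and the SKETCH_* codes are hypothetical names. It decides
success from the return value of getaddrinfo rather than from errno, and it
clears the result pointer on every path so that the cleanup call is safe
after a failed lookup:

```c
#include <assert.h>
#include <netdb.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical result codes for this sketch; not the real HYPORT_* values. */
#define SKETCH_SUCCESS 0
#define SKETCH_ERROR_OPFAILED (-1)

/* Hypothetical wrapper playing the role that hysock_getaddrinfo has above. */
static int sketch_getaddrinfo(const char *name, const struct addrinfo *hints,
                              struct addrinfo **result)
{
    int rc;

    assert(name != NULL && result != NULL);
    *result = NULL; /* never hand back an uninitialized pointer */

    rc = getaddrinfo(name, NULL, hints, result);
    if (rc == 0) {
        return SKETCH_SUCCESS;
    }

    /*
     * rc is an EAI_* code. errno is only meaningful when rc == EAI_SYSTEM,
     * so errno == 0 must not be read as success on this path.
     */
    *result = NULL;
    return SKETCH_ERROR_OPFAILED;
}

/* Hypothetical cleanup helper: tolerates a handle that was never filled in. */
static void sketch_freeaddrinfo(struct addrinfo **result)
{
    assert(result != NULL);
    if (*result != NULL) {
        freeaddrinfo(*result);
        *result = NULL;
    }
}
```

With AI_NUMERICHOST set in the hints, a non-numeric name fails without
touching the network, which is a convenient way to exercise the error path.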
--
Gregory