Hi Gregory,

Thanks so much for the detailed analysis. I have finally reproduced this failure on my side; it took a while because setting up the environment takes a lot of time :-(

I modified the findError method while fixing Harmony-2387, where hysock_connect_with_timeout has a bug. I added this entry because getsockopt returns an error code of zero to indicate success.
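For context, a minimal sketch of the kind of check where a zero error code legitimately means success. It assumes the connect-with-timeout path reads the pending socket error with SO_ERROR, which this message does not state; sketch_pending_socket_error is a hypothetical helper, not a portlib function.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper, not part of the Harmony portlib.
 * Returns 0 when the socket has no pending error, the pending errno
 * value (positive) when it has one, or -1 if getsockopt() itself failed. */
int sketch_pending_socket_error(int fd)
{
    int so_error = 0;
    socklen_t len = sizeof so_error;

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &so_error, &len) != 0)
        return -1;      /* the query failed; errno describes why */

    return so_error;    /* 0 here really does mean "no error" */
}
```

In this case the zero comes from a call that succeeded, so mapping it to success is reasonable. That does not hold for an errno value read after a call that has already reported failure by other means.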

Now it seems that this scheme does not apply everywhere. Could you please revert this change? I'll provide a new patch for Harmony-2387. Sorry for any inconvenience caused.

Besides, I wonder why this failure does not happen on IBMVME/Linux, and why it sometimes cannot even be reproduced on DRLVM/Linux32. I'll take a closer look at that as well. Thanks a lot.

Gregory Shimansky wrote:
Vladimir Ivanov wrote:
Hello everybody,
in case someone missed the CC notification: the classlib tests now
crash or hang on the Linux boxes when run over DRLVM.
Notifications were sent ~12 hours ago.
Failed tests:
Linux x86_64 (hang up):
TEST-org.apache.harmony.archive.tests.java.util.jar.JarExecTest.xml

Linux x86 (trying to reproduce):
TEST-org.apache.harmony.archive.tests.java.util.jar.JarFileTest.xml
TEST-org.apache.harmony.security.tests.PolicyEntryTest.xml
TEST-org.apache.harmony.security.tests.java.security.cert.CertificateFactory4Test.xml


I've found the reason for the crash of org.apache.harmony.archive.tests.java.util.jar.JarExecTest. It is actually the commit in revision 514596. Most likely the other tests fail for the same reason. The sequence that leads to the crash looks like this:

1. Java calls Java_java_net_InetAddress_getHostByNameImpl with the host name "jcltest.apache.org".
2. It calls hysock_getaddrinfo with this name and an uninitialized hyaddrinfo_struct addrinfo variable.
3. hysock_getaddrinfo calls the system function getaddrinfo, which returns a nonzero value, indicating an error.
4. In this case hysock_getaddrinfo reads errno and records it in errorCode, but errorCode turns out to be 0. According to the man page for getaddrinfo, errno is set to a specific value only in the EAI_SYSTEM case; in all other cases the state of errno is unspecified.
5. hysock_getaddrinfo records an error with errorCode 0 using findError. Since the change in 514596, errorCode 0 maps to HYPORT_SUCCESS, so it is treated as no error. Before that change, findError would have returned HYPORT_ERROR_SOCKET_OPFAILED.
6. Because hysock_getaddrinfo returned HYPORT_SUCCESS, which is 0, Java_java_net_InetAddress_getHostByNameImpl continues to work with the uninitialized addrinfo variable.
7. When Java_java_net_InetAddress_getHostByNameImpl calls hysock_freeaddrinfo, free is called on an uninitialized pointer, which leads to a crash.
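The steps above suggest two defensive measures: treat any nonzero return from getaddrinfo as a failure regardless of errno, and initialize the result pointer so that a later free is harmless. The following is only a sketch of that idea, not the actual Harmony fix; all sketch_* names and SKETCH_* values are hypothetical stand-ins for the portlib functions and HYPORT_* codes.

```c
/* Request the POSIX declarations if no feature level is set yet. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <errno.h>
#include <netdb.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical status codes; the values are illustrative only. */
#define SKETCH_SUCCESS                0
#define SKETCH_ERROR_SOCKET_OPFAILED (-1)
#define SKETCH_ERROR_SYSTEM          (-2)

/* Map a getaddrinfo() result to a status code. errno is consulted only
 * for EAI_SYSTEM; any other nonzero result is a failure no matter what
 * errno happens to contain. */
int sketch_map_gai_result(int gai_rc, int saved_errno)
{
    if (gai_rc == 0)
        return SKETCH_SUCCESS;
    if (gai_rc == EAI_SYSTEM && saved_errno != 0)
        return SKETCH_ERROR_SYSTEM;
    return SKETCH_ERROR_SOCKET_OPFAILED;
}

/* Resolve a host name. On failure *out is left NULL, so the caller
 * never sees an uninitialized pointer. */
int sketch_resolve(const char *name, struct addrinfo **out)
{
    struct addrinfo hints;
    int rc;
    int status;

    *out = NULL;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;

    errno = 0;
    rc = getaddrinfo(name, NULL, &hints, out);
    status = sketch_map_gai_result(rc, errno);
    if (status != SKETCH_SUCCESS)
        *out = NULL;
    return status;
}

/* Free a result list; a NULL list is ignored. */
void sketch_free(struct addrinfo *res)
{
    if (res != NULL)
        freeaddrinfo(res);
}
```

With this shape, a failed lookup such as EAI_NONAME with errno equal to 0 is still reported as an error, and calling sketch_free after a failed sketch_resolve does nothing.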



--
Regards,

Ruth Cao
China Software Development Lab, IBM

