Hi Gregory,
Thanks so much for the detailed analysis. I have finally reproduced this
failure on my side; setting up the environment took a lot of time :-(
I modified the findError method when fixing Harmony-2387, a bug in
hysock_connect_with_timeout. I added this entry because getsockopt
reports an error code of zero to indicate success. Now it seems that
this scheme does not apply everywhere. Could you please revert this
change? I'll provide a new patch for Harmony-2387. Sorry for any
inconvenience caused.
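For illustration only, here is one way the zero check could stay local to the caller instead of living in the shared error mapping. This is a sketch, not the actual Harmony-2387 patch: it assumes the zero in question is the SO_ERROR value read through getsockopt, and every name and constant below (sketch_find_error, sketch_connect_status, SKETCH_*) is hypothetical rather than taken from the portlib sources.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical stand-ins for the portlib constants; values are illustrative. */
#define SKETCH_SUCCESS 0
#define SKETCH_ERROR_OPFAILED (-1)

/*
 * Shared mapping from an OS error code to a portable one. In this sketch
 * 0 is deliberately NOT mapped to success: any code without a specific
 * mapping, including 0, falls through to the generic failure value.
 */
static int sketch_find_error(int os_error)
{
    switch (os_error) {
    /* specific errno mappings would go here */
    default:
        return SKETCH_ERROR_OPFAILED;
    }
}

/*
 * Ask the socket for its pending error. SO_ERROR yields 0 when nothing
 * went wrong, so that case is handled here rather than in the mapping.
 */
static int sketch_connect_status(int fd)
{
    int so_error = 0;
    socklen_t len = sizeof(so_error);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &so_error, &len) != 0) {
        return sketch_find_error(errno);   /* getsockopt itself failed */
    }
    if (so_error == 0) {
        return SKETCH_SUCCESS;             /* no pending socket error */
    }
    return sketch_find_error(so_error);
}
```

With this split, a caller that passes a genuine failure with an unset errno still gets a failure code back from the mapping, while the connect path keeps its "zero means success" reading.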
Besides, I wonder why this failure does not happen on IBMVME/Linux, and
why it sometimes cannot even be reproduced on DRLVM/Linux32. I'll also
take a closer look. Thanks a lot.
Gregory Shimansky wrote:
Vladimir Ivanov wrote:
Hello everybody,
in case someone missed the CC notification: the classlib tests now
crash or hang on the Linux boxes when run over DRLVM.
Notifications were sent ~12 hours ago.
Failed tests:
Linux x86_64 (hang up):
TEST-org.apache.harmony.archive.tests.java.util.jar.JarExecTest.xml
Linux x86 (trying to reproduce):
TEST-org.apache.harmony.archive.tests.java.util.jar.JarFileTest.xml
TEST-org.apache.harmony.security.tests.PolicyEntryTest.xml
TEST-org.apache.harmony.security.tests.java.security.cert.CertificateFactory4Test.xml
I've found the reason for the crash of
org.apache.harmony.archive.tests.java.util.jar.JarExecTest. The cause
is actually the commit in revision 514596. Most likely the other tests
fail for the same reason. The sequence that leads to the crash looks
like this:
1. Java calls Java_java_net_InetAddress_getHostByNameImpl with a host
name "jcltest.apache.org".
2. It calls hysock_getaddrinfo with this name and an uninitialized
hyaddrinfo_struct addrinfo variable.
3. Function hysock_getaddrinfo calls the system function getaddrinfo,
which returns a nonzero value, which means an error.
4. In this case hysock_getaddrinfo reads errno and records it in
errorCode. But errorCode turns out to be 0. Looking at the man page for
getaddrinfo, I see that it sets errno to a specific value only in the
EAI_SYSTEM case. In other cases the state of errno is not specified.
5. Function hysock_getaddrinfo records an error with errorCode 0 using
findError. Since the change in 514596, errorCode 0 maps to
HYPORT_SUCCESS, so it is treated as no error. Before that change
findError would have returned HYPORT_ERROR_SOCKET_OPFAILED.
6. Since hysock_getaddrinfo returned HYPORT_SUCCESS, which is 0, the
function Java_java_net_InetAddress_getHostByNameImpl continues to work
with the uninitialized addrinfo variable.
7. When Java_java_net_InetAddress_getHostByNameImpl calls
hysock_freeaddrinfo, free is called on an uninitialized pointer, which
leads to a crash.
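The sequence above can be sketched in plain C to show where two defensive checks would break the chain: deciding success from the return value of getaddrinfo rather than from errno, and initializing the result pointer before the call. This is a minimal sketch under stated assumptions, not the Harmony code: sketch_getaddrinfo, sketch_freeaddrinfo and the SKETCH_* constants are hypothetical names, and AI_NUMERICHOST is used only so that the example does not depend on DNS.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Hypothetical stand-ins for the portlib constants; values are illustrative. */
#define SKETCH_SUCCESS 0
#define SKETCH_ERROR_OPFAILED (-1)

/*
 * Resolve a numeric host string. AI_NUMERICHOST keeps the lookup local,
 * so no DNS query is made. On failure *result is left NULL.
 */
static int sketch_getaddrinfo(const char *host, struct addrinfo **result)
{
    struct addrinfo hints = {0};
    int rc;

    *result = NULL;                 /* never hand back an uninitialized pointer */
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST;

    rc = getaddrinfo(host, NULL, &hints, result);
    if (rc != 0) {
        /* rc is an EAI_* code; errno is only meaningful for EAI_SYSTEM,
         * so the failure is reported without consulting errno at all. */
        *result = NULL;
        return SKETCH_ERROR_OPFAILED;
    }
    return SKETCH_SUCCESS;
}

/* Free a result list if there is one; safe to call after a failed lookup. */
static void sketch_freeaddrinfo(struct addrinfo **result)
{
    if (*result != NULL) {
        freeaddrinfo(*result);
        *result = NULL;
    }
}
```

Calling sketch_getaddrinfo with a string that is not a numeric address returns the failure code and leaves the pointer NULL, so a later sketch_freeaddrinfo is a no-op instead of a free on garbage.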
--
Regards,
Ruth Cao
China Software Development Lab, IBM