non-JRMP server at remote endpoint
----------------------------------

                 Key: PAXEXAM-174
                 URL: http://issues.ops4j.org/browse/PAXEXAM-174
             Project: Pax Exam
          Issue Type: Bug
    Affects Versions: 1.2.0
         Environment: Linux, Java 1.6_u16
            Reporter: Bartosz Kowalewski
            Assignee: Toni Menzel


We see the following problem when running Pax Exam tests on our build machine 
(which runs Linux). It does not happen on every run, and it does not occur on 
our Windows laptops.

{code}
org.ops4j.pax.exam.spi.container.TestContainerException: Cannot get the remote bundle context
        at org.ops4j.pax.exam.rbc.client.RemoteBundleContextClient.getRemoteBundleContext(RemoteBundleContextClient.java:324)
        at org.ops4j.pax.exam.rbc.client.RemoteBundleContextClient.waitForState(RemoteBundleContextClient.java:265)
        at org.ops4j.pax.exam.container.def.internal.PaxRunnerTestContainer.waitForState(PaxRunnerTestContainer.java:317)
        at org.ops4j.pax.exam.container.def.internal.PaxRunnerTestContainer.start(PaxRunnerTestContainer.java:272)
        at org.ops4j.pax.exam.junit.internal.JUnit4TestMethod.invoke(JUnit4TestMethod.java:142)
        at org.junit.internal.runners.MethodRoadie.runTestMethod(MethodRoadie.java:105)
        at org.junit.internal.runners.MethodRoadie$2.run(MethodRoadie.java:86)
        at org.ops4j.pax.exam.junit.internal.JUnit4MethodRoadie.runBeforesThenTestThenAfters(JUnit4MethodRoadie.java:60)
        at org.junit.internal.runners.MethodRoadie.runTest(MethodRoadie.java:84)
        at org.junit.internal.runners.MethodRoadie.run(MethodRoadie.java:49)
        at org.ops4j.pax.exam.junit.JUnit4TestRunner.invokeTestMethod(JUnit4TestRunner.java:246)
        at org.ops4j.pax.exam.junit.JUnit4TestRunner.runMethods(JUnit4TestRunner.java:196)
        at org.ops4j.pax.exam.junit.JUnit4TestRunner$2.run(JUnit4TestRunner.java:186)
        at org.junit.internal.runners.ClassRoadie.runUnprotected(ClassRoadie.java:34)
        at org.junit.internal.runners.ClassRoadie.runProtected(ClassRoadie.java:44)
        at org.ops4j.pax.exam.junit.JUnit4TestRunner.run(JUnit4TestRunner.java:182)
        at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:62)
        at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.executeTestSet(AbstractDirectoryTestSuite.java:140)
        at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.execute(AbstractDirectoryTestSuite.java:165)
        at org.apache.maven.surefire.Surefire.run(Surefire.java:107)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.maven.surefire.booter.SurefireBooter.runSuitesInProcess(SurefireBooter.java:289)
        at org.apache.maven.surefire.booter.SurefireBooter.main(SurefireBooter.java:1005)
Caused by: java.rmi.ConnectIOException: non-JRMP server at remote endpoint
        at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:230)
        at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:184)
        at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:322)
        at sun.rmi.registry.RegistryImpl_Stub.lookup(Unknown Source)
        at org.ops4j.pax.exam.rbc.client.RemoteBundleContextClient.getRemoteBundleContext(RemoteBundleContextClient.java:302)
        ... 25 more
{code}

This might be a timing issue, for example something related to the OS releasing 
socket resources. Our first hypothesis was that running several tests (test 
methods) in a row causes the problem, with the second Pax Exam test being 
influenced by the first one. However, we then reproduced the issue in the first 
test method of a series, so it cannot be caused by resources left behind by 
earlier tests.

This issue is observed on a CI server, so we also checked whether other builds 
running on the same machine interfere with Pax Exam. We customized Pax Exam to 
change the port used for RMI communication between the OSGi container and the 
process that launches the test, so that a port other than 1099 was used. We 
still observed the issue.
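
For illustration only, one way to pick a port that no other build on the 
machine currently holds is to let the OS assign one. This is a sketch, not what 
our customization does: the class name FreePortFinder and the method name 
findFreePort are hypothetical, it assumes Java 7 or later 
(try-with-resources), and another process can still grab the port between the 
probe and the moment the RMI registry binds to it.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical helper, not part of Pax Exam.
public class FreePortFinder {

    /**
     * Asks the OS for an unused TCP port by binding to port 0,
     * then releases it and returns the port number.
     */
    public static int findFreePort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }
}
```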

I finally applied a _brute force_ workaround, but I'm not sure why Pax Exam 
behaves differently once it is in place.
I hooked in a custom org.ops4j.pax.exam.spi.container.TestContainerFactory 
implementation using Pax Exam's SPI mechanisms; it uses a slightly modified 
version of org.ops4j.pax.exam.container.def.internal.PaxRunnerTestContainer. 
After reading the Pax Exam source code and analyzing our log files, I came to 
the conclusion that there might be a synchronization and timing issue inside 
Pax Exam. The test container waits for a change in the state of the system 
bundle. This mechanism works over RMI: the container connects to an RMI 
registry that is started by the Pax Exam activator. It seems that sometimes the 
server side of the RMI connection is not yet ready to handle requests when the 
waiting starts. I added a sleep to PaxRunnerTestContainer: 

{code}
        long startedAt = System.currentTimeMillis();
        URLUtils.resetURLStreamHandlerFactory();
        Run.start( m_javaRunner, m_arguments.getArguments() );
        LOG.info( "Test container (Pax Runner " + Info.getPaxRunnerVersion() + ") started in "
            + ( System.currentTimeMillis() - startedAt ) + " millis" );

        // XXX yah :-/
        // Added workaround: pause before the first RMI lookup so that the
        // registry inside the container has time to start.
        try
        {
            Thread.sleep( 15000 );
        }
        catch ( InterruptedException e1 )
        {
            // ignore
        }

        LOG.info( "Wait for test container to finish its initialization "
            + ( m_startTimeout == WAIT_FOREVER ? "without timing out" : "for " + m_startTimeout + " millis" ) );
        try
        {
            waitForState( SYSTEM_BUNDLE, Bundle.ACTIVE, m_startTimeout );
        }
        catch ( TimeoutException e )
        {
            throw new TimeoutException( "Test container did not initialize in the expected time of "
                + m_startTimeout + " millis" );
        }
{code} 

With this change I have not reproduced the issue, even though I ran the build 
5 times. Of course:
- it's a brute force workaround that only attempts to show where the problem 
  might be
- I'm not _100%_ sure that this timing issue was the source of our problems
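
A less brute-force variant of the fixed sleep might poll the RMI registry until 
the lookup succeeds or a deadline passes. The following is only a sketch under 
assumptions, not the Pax Exam implementation: the class name RmiLookupRetry, 
both method names, the 500 ms pause, and the choice of which exceptions count 
as "not ready yet" are hypothetical, and it assumes Java 8 or later (lambdas, 
multi-catch), which is newer than the Java 1.6 listed in the environment.

```java
import java.rmi.ConnectException;
import java.rmi.ConnectIOException;
import java.rmi.NotBoundException;
import java.rmi.Remote;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Pax Exam.
public class RmiLookupRetry {

    /**
     * Runs the call repeatedly until it succeeds or timeoutMillis has elapsed.
     * Only failures that suggest the server side is not up yet are retried;
     * any other exception is propagated immediately.
     */
    public static <T> T retry(Callable<T> call, long timeoutMillis, long pauseMillis)
            throws Exception {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        Exception last;
        do {
            try {
                return call.call();
            } catch (ConnectException | ConnectIOException | NotBoundException e) {
                // Assumption: these mean "registry or binding not ready yet".
                last = e;
            }
            Thread.sleep(pauseMillis);
        } while (System.currentTimeMillis() < deadline);

        TimeoutException timeout =
                new TimeoutException("RMI endpoint not ready after " + timeoutMillis + " millis");
        timeout.initCause(last);
        throw timeout;
    }

    /** Looks up a name in the registry at host:port, retrying while it is not ready. */
    public static Remote lookupWithRetry(String host, int port, String name, long timeoutMillis)
            throws Exception {
        return retry(() -> {
            Registry registry = LocateRegistry.getRegistry(host, port);
            return registry.lookup(name);
        }, timeoutMillis, 500);
    }
}
```

Such a helper would be called in place of the fixed Thread.sleep, with the 
host, port and binding name that the Pax Exam activator actually uses; those 
values are not shown here.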




